ADVANCES IN IMAGING AND ELECTRON PHYSICS VOLUME 120
EDITOR-IN-CHIEF
PETER W. HAWKES CEMES-CNRS Toulouse, France
ASSOCIATE EDITORS
BENJAMIN KAZAN Xerox Corporation Palo Alto Research Center Palo Alto, California
TOM MULVEY
Department of Electronic Engineering and Applied Physics Aston University Birmingham, United Kingdom
Advances in Imaging and Electron Physics
EDITED BY
PETER W. HAWKES CEMES-CNRS Toulouse, France
VOLUME 120
ACADEMIC PRESS An Elsevier Science Imprint
San Diego San Francisco New York Boston London Sydney Tokyo
This book is printed on acid-free paper.
Copyright 2002, Elsevier Science (USA) All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher. The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher's consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per-copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-2002 chapters are as shown on the title pages: If no fee code appears on the title page, the copy fee is the same as for current chapters. 1076-5670/02 $35.00 Explicit permission from Academic Press is not required to reproduce a maximum of two figures or tables from an Academic Press chapter in another scientific or research publication provided that the material has not been credited to another source and that full credit to the Academic Press chapter is given.
Academic Press An Elsevier Science Imprint 525 B Street, Suite 1900, San Diego, California 92101-4495, USA http://www.academicpress.com
Academic Press Harcourt Place, 32 Jamestown Road, London NW1 7BY, UK
International Standard Serial Number: 1076-5670
International Standard Book Number: 0-12-014762-9
PRINTED IN THE UNITED STATES OF AMERICA
02 03 04 05 06 MB 9 8 7 6 5 4 3 2 1
Dedicated in gratitude to
Peter W. Hawkes
for twenty years of distinguished editorial achievement

Advances in Electronics and Electron Physics, first published in 1948, subsequently combined with Advances in Image Pick-up and Display and Advances in Optical and Electron Microscopy, and ultimately titled Advances in Imaging and Electron Physics, could not possibly have come into more capable hands than those of Peter Hawkes. I met Peter during my very first brief excursion beyond the iron curtain, at the 10th International Congress on Electron Microscopy in Hamburg in 1982, the year he took over the editing of this important serial. My short trip, barely a week in length, cost me more than my year's salary. But what it showed me was that the names in electron microscopy also had faces, and that one of them belonged to Peter, whom I knew from his papers and publications. We spent some time talking about the lectures we had just heard, exploring our common interest in electron optics and our common hobby of electron-optical aberrations, for which Peter was the true guru and I just a humble beginner. This helped me to recognize the importance of meeting people and talking to them, something I had hardly known during my country's isolation. Two years later in Budapest at the EUREM meeting, we were already talking like old friends. Since then we have exchanged many letters and reprints; and, thanks to e-mail, we also have been able to write two joint papers. During various official and unofficial gatherings it was always a great pleasure to meet him, whether for lunch, dinner, or a cup of coffee; he is such excellent company. I have always admired the extent of Peter's knowledge, because his interests are not limited to aberrations in electron optics but extend to image processing, and because he is so conversant with the history of electron optics and microscopy. The time he spent editing the many volumes of AIEP did not prevent him from publishing more than 100 papers of his own during the past twenty years; the
citations number well over 1000. His most important single contribution has been the book Principles of Electron Optics, co-authored with E. Kasper and published by Academic Press in 1989, which is an invaluable starting point for any research in the field. He has evaluated numerous PhD theses; those reviews alone, if published, could make a special volume. Such enormous scholarly activity would suggest that he must spend all of his time behind a desk, sorting piles of papers, but nothing could be further from the truth. Peter is a respected leader and organizer whose name appears in the proceedings of countless conferences on electron microscopy and particle optics. He also served as the first President of the European Society of Electron Microscopy and now represents the French EM society in that body.

BOHUMILA LENCOVÁ
I have known Peter from his earliest PhD research at the Cavendish Laboratory, Cambridge, under the supervision of V. E. Cosslett, of happy memory, to the present day. He is an outstanding researcher and a brilliant scientific editor in addition to being an active physicist, a scientific globe-trotter, and a man of parts. National and scientific barriers do not seem to exist in his mind. His courteous nature, paired with his strong sense of humor, indeed of fun, allows him to remain on friendly terms with those who might disagree with him. Due to his great editorial and diplomatic skills, he has been able to take on a wide range of scientific contributions from all over the world. Peter has undoubtedly raised the standards of electron optical publication worldwide by his careful attention to detail and scientific accuracy, as well as providing electron microscopists with an extraordinarily broad range of electron-optical literature. One of his greatest achievements was to take on Advances in Imaging and Electron Physics Volume 96, "The Growth of Electron Microscopy," to which all members of IFSEM were invited to contribute. Previous attempts by IFSEM to organize the production of such a volume had failed because of the complexities of their membership and the difficulties in many countries in producing a professional text in English. Peter Hawkes suggested that Academic Press could undertake such a task and his offer was accepted. I agreed to act as editor of this volume. It was indeed an enormous task, brilliantly handled by Academic Press, under Peter's constant guidance and his diplomatic manner. His work with the volume did not prevent Peter from his extensive collaboration with E. Kasper in the definitive books on electron optics or in the final "polishing up" of the well-known books of Ludwig Reimer.

TOM MULVEY
CONTENTS
CONTRIBUTORS ix
PREFACE xi
FUTURE CONTRIBUTIONS xiii
A Review of Image Segmentation Techniques Integrating Region and Boundary Information
X. CUFÍ, X. MUÑOZ, J. FREIXENET, AND J. MARTÍ
I. Introduction 1
II. Embedded Integration 6
III. Postprocessing Integration 17
IV. Summary 31
V. Conclusions and Further Work 35
References 36

Mirror Corrector for Low-Voltage Electron Microscopes
P. HARTEL, D. PREIKSZAS, R. SPEHR, H. MÜLLER, AND H. ROSE
I. Introduction 42
II. General Considerations 44
III. The Spectromicroscope "SMART" 52
IV. Mechanical Design of the Mirror Corrector 72
V. Testing of the Mirror Corrector 84
VI. Conclusion 128
Appendix: Addition of Refractive Powers in the Two-Lens System 130
References 132

Characterization of Texture in Scanning Electron Microscope Images
J. L. LADAGA AND R. D. BONETTO
I. Introduction 136
II. The Variogram as a Surface Characterization Tool 136
III. Variogram Use for Texture Characterization of Digital Images 146
IV. Two Examples of Application in SEM Images 174
V. Conclusions 183
Appendix I: Correlation between Fourier Power Spectrum Maximum and Variogram Characteristic Minimum 183
Appendix II: Theoretical Example to Show the Correlation between the Fourier Power Spectrum Maximum and the Variogram Characteristic Minimum 186
References 189

Degradation Identification and Model Parameter Estimation in Discontinuity-Adaptive Visual Reconstruction
A. TONAZZINI AND L. BEDINI
I. Introduction 194
II. Fully Bayesian Approach to Unsupervised Blind Restoration 202
III. The MAP-ML Method 206
IV. MAP Estimation of the Image Field 208
V. ML Estimation of the Degradation Parameters 215
VI. ML Estimation of the Model Parameters 217
VII. The Overall Architecture for the Fully Blind and Unsupervised Restoration 227
VIII. Adaptive Smoothing and Edge Tracking 231
IX. Experimental Results: The Blind Restoration Subcase 238
X. Experimental Results: The Unsupervised Restoration Subcase 247
XI. Experimental Results: The Fully Unsupervised Blind Restoration Case 270
XII. Conclusions 279
References 280

INDEX 285
CONTRIBUTORS
Numbers in parentheses indicate the pages on which the authors' contribution begins.
L. BEDINI (193), Institute for the Elaboration of Information, Area della Ricerca CNR di Pisa, I-56124 Pisa, Italy
R. D. BONETTO (135), Center of Research and Development in Catalytic Processes, National Scientific and Technical Research Council, Universidad Nacional de La Plata, 1900 La Plata, Argentina
X. CUFÍ (1), Computer Vision and Robotics Group, Department EIA-IIiA, University of Girona, 17071 Girona, Spain
J. FREIXENET (1), Computer Vision and Robotics Group, Department EIA-IIiA, University of Girona, 17071 Girona, Spain
P. HARTEL (41), Darmstadt University of Technology, Institute of Applied Physics, D-64289 Darmstadt, Germany
J. L. LADAGA (135), Laser Laboratory, Department of Physics, Faculty of Engineering, Universidad de Buenos Aires, 1063 Buenos Aires, Argentina
J. MARTÍ (1), Computer Vision and Robotics Group, Department EIA-IIiA, University of Girona, 17071 Girona, Spain
H. MÜLLER (41), Darmstadt University of Technology, Institute of Applied Physics, D-64289 Darmstadt, Germany
X. MUÑOZ (1), Computer Vision and Robotics Group, Department EIA-IIiA, University of Girona, 17071 Girona, Spain
D. PREIKSZAS (41), Darmstadt University of Technology, Institute of Applied Physics, D-64289 Darmstadt, Germany
H. ROSE (41), Darmstadt University of Technology, Institute of Applied Physics, D-64289 Darmstadt, Germany
R. SPEHR (41), Darmstadt University of Technology, Institute of Applied Physics, D-64289 Darmstadt, Germany
A. TONAZZINI (193), Institute for the Elaboration of Information, Area della Ricerca CNR di Pisa, I-56124 Pisa, Italy
PREFACE
The contributions to this volume form a good balance between imaging and electron physics, with chapters on segmentation, texture analysis of scanning electron microscope images, blind image restoration, and an extremely thorough account of a new aberration corrector for the scanning electron microscope.

We begin with a careful discussion, from the Computer Vision and Robotics Group of the University of Girona, of the integration of boundary and region information into segmentation techniques. The authors make a clear distinction between embedded integration and postprocessing. This is not usually a high priority, and an analysis of the problems and advantages of such an approach is therefore all the more welcome.

The second, very long contribution describes the ambitious mirror aberration corrector that is currently being developed by a consortium of German organizations with government support, namely, the universities of Darmstadt, Clausthal, and Würzburg, the Fritz-Haber Institute in Berlin, and LEO Elektronenmikroskopie. Although correctors have now been developed for high-resolution transmission and scanning transmission electron microscopes, there is no satisfactory corrector for direct imaging in the low-voltage domain. The design described at length here is an attempt to remedy this omission, at the cost of a relatively complex electron path between source, specimen, and detector. Two families of microscopes are considered: low-energy electron microscopes operating at energies below 15 keV (LEEM) and scanning electron microscopes; if the specimen is illuminated with photons, the microscope becomes a photoemission electron microscope (PEEM). The ultimate aim of the project is to incorporate such a corrector into a combined PEEM-LEEM known as SMART, a "SpectroMicroscope for All Relevant Techniques". The authors take us carefully through all aspects of the instrumentation, discussing both the optics and the mechanical requirements. A particularly interesting section is devoted to testing the device and to troubleshooting, from which the reader can assess the difficulty of putting this design into practice and its chances of joining the growing number of corrected instruments.

The third contribution, by J. L. Ladaga and R. D. Bonetto from Buenos Aires and La Plata, is again concerned with the scanning electron microscope. Interest in digital processing of the SEM image arose soon after the instrument became commercially available, though the first attempts could not of course be fully digital, and SEM image processing has now reached a high degree
of sophistication, and some such tools are routinely supplied with these instruments. Here the theme is texture characterization, and the preferred tool is the variogram, from which the fractal dimension can be deduced. The authors present the basic ideas and their own contributions to this approach very clearly and conclude with several examples showing the power of the technique.

We conclude with a very complete account, by A. Tonazzini and L. Bedini from Pisa, of ways of identifying image degradation and of restoring degraded images based on blind restoration. The authors introduce us to the Bayesian approach to restoration and explain in great detail how fully blind and unsupervised restoration can be achieved. The authors' own contribution is fully described and placed in the context of other attempts to solve the difficulties that arise. This account is on the scale of a short monograph and brings out very clearly the practical merits of their methods.

It only remains for me to thank all the authors for the trouble they have taken to make their surveys complete and accessible to non-specialists. As usual, I conclude with a list of forthcoming articles, many of which will be published in the course of 2002.
FUTURE CONTRIBUTIONS
T. Aach
Lapped transforms

G. Abbate
New developments in liquid-crystal-based photonic devices

S. Ando
Gradient operators and edge and corner detection

A. Arneodo, N. Decoster, P. Kestener and S. Roux
A wavelet-based method for multifractal image analysis

M. Barnabei and L. Montefusco
Algebraic aspects of signal and image processing

C. Beeli
Structure and microscopy of quasicrystals

I. Bloch
Fuzzy distance measures in image processing

G. Borgefors
Distance transforms

B. L. Breton, D. McMullan and K. C. A. Smith (Eds)
Sir Charles Oatley and the scanning electron microscope

A. Carini, G. L. Sicuranza and E. Mumolo
V-vector algebra and Volterra filters

Y. Cho
Scanning nonlinear dielectric microscopy

E. R. Davies
Mean, median and mode filters

H. Delingette
Surface reconstruction based on simplex meshes

A. Diaspro
Two-photon excitation in microscopy

R. G. Forbes
Liquid metal ion sources

E. Förster and F. N. Chukhovsky
X-ray optics

A. Fox
The critical-voltage effect

L. Frank and I. Müllerová
Scanning low-energy electron microscopy

M. Freeman and G. M. Steeves
Ultrafast scanning tunneling microscopy

A. Garcia
Sampling theory

L. Godo and V. Torra
Aggregation operators

P. W. Hawkes
Electron optics and electron microscopy: conference proceedings and abstracts as source material

M. I. Herrera
The development of electron microscopy in Spain

J. S. Hesthaven
Higher-order accuracy computational methods for time-domain electromagnetics

K. Ishizuka
Contrast transfer and crystal images

I. P. Jones
ALCHEMI

W. S. Kerwin and J. Prince
The kriging update model

B. Kessler
Orthogonal multiwavelets

A. Khursheed (vol. 122)
Add-on lens attachments for the scanning electron microscope

G. Kögel
Positron microscopy

W. Krakow
Sideband imaging

N. Krueger
The application of statistical and deterministic regularities in biological and artificial vision systems

B. Lahme
Karhunen-Loève decomposition

C. L. Matson
Back-propagation through turbid media

P. G. Merli, M. Vittori Antisari and G. Calestani, eds (vol. 123)
Aspects of Electron Microscopy

S. Mikoshiba and F. L. Curzon
Plasma displays

M. A. O'Keefe
Electron image simulation

N. Papamarkos and A. Kesidis
The inverse Hough transform

M. G. A. Paris and G. d'Ariano
Quantum tomography

C. Passow
Geometric methods of treating energy transport phenomena

E. Petajan
HDTV

F. A. Ponce
Nitride semiconductors for high-brightness blue and green light emission

T.-C. Poon
Scanning optical holography

H. de Raedt, K. F. L. Michielsen and J. Th. M. De Hosson
Aspects of mathematical morphology

E. Rau
Energy analysers for electron microscopes

H. Rauch
The wave-particle dualism

R. de Ridder
Neural networks in nonlinear image processing

D. Saad, R. Vicente and A. Kabashima
Error-correcting codes

O. Scherzer
Regularization techniques

G. Schmahl
X-ray microscopy

S. Shirai
CRT gun design methods

T. Soma
Focus-deflection systems and their applications

I. Talmon
Study of complex fluids by transmission electron microscopy

M. Tonouchi
Terahertz radiation imaging

N. M. Towghi
lp-norm optimal filters

T. Tsutsui and Z. Dechun
Organic electroluminescence, materials and devices

Y. Uchikawa
Electron gun optics

D. van Dyck
Very high resolution electron microscopy

J. S. Walker
Tree-adapted wavelet shrinkage

C. D. Wright and E. W. Hill
Magnetic force microscopy

F. Yang and M. Paindavoine
Pre-filtering for pattern recognition using wavelet transforms and neural networks

M. Yeadon
Instrumentation for surface studies

S. Zaefferer
Computer-aided crystallographic analysis in TEM
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 120
A Review of Image Segmentation Techniques Integrating Region and Boundary Information
X. CUFÍ, X. MUÑOZ, J. FREIXENET, AND J. MARTÍ
Computer Vision and Robotics Group, EIA-IIiA, University of Girona, 17071 Girona, Spain
I. Introduction 1
   A. Integration Techniques: Embedded versus Postprocessing 4
   B. Related Work 5
II. Embedded Integration 6
   A. Guidance of Seed Placement 7
   B. Control of Growing Criterion 10
      1. Integration in Split-and-Merge Algorithms 10
      2. Integration in Region-Growing Algorithms 12
   C. Fuzzy Logic 14
III. Postprocessing Integration 17
   A. Oversegmentation 18
   B. Boundary Refinement 21
      1. Boundary Refinement by Snakes 23
   C. Selection 27
IV. Summary 31
   A. Disadvantages of Both Strategies 33
V. Conclusions and Further Work 35
References 36

I. INTRODUCTION
One of the first and most important operations in image analysis and computer vision is segmentation (Haralick and Shapiro, 1992-1993; Rosenfeld and Kak, 1982). The aim of image segmentation is the domain-independent partition of the image into a set of regions which are visually distinct and uniform with respect to some property, such as gray level, texture, or color. Segmentation can be considered the first step and key issue in object recognition, scene understanding, and image understanding. Its application areas range from industrial quality control to medicine, robot navigation, geophysical exploration, and military applications. In all these areas, the quality of the final results depends largely on the quality of the segmentation. The problem of segmentation has been, and still is, an important research field, and many segmentation methods have been proposed in the literature (Fu and Mui, 1981; Haralick and Shapiro, 1985; Nevatia, 1986;
Pal and Pal, 1993; Riseman and Arbib, 1977; Zucker, 1977). In general, segmentation methods are based on two basic properties of the pixels in relation to their local neighborhood: discontinuity and similarity. Methods based on some discontinuity property of the pixels are called boundary-based methods, whereas methods based on some similarity property are called region-based methods. More specifically:

• The boundary approach uses the postulate that abrupt changes occur in the features of the pixels (e.g., abrupt changes in gray values) at the boundary between two regions. To find these positions, one can choose between two basic approaches: first- and second-order differentiation. In the first case, a gradient mask (Roberts, 1965, and Sobel, 1970, are well-known examples) is convolved with the image to obtain the gradient vector ∇f associated with each pixel. Edges are the places where the magnitude of the gradient vector ‖∇f‖ is a local maximum along the direction of the gradient vector φ(∇f). For this purpose, the local value of the gradient magnitude must be compared with the values of the gradient estimated along this orientation at unit distance on either side of the pixel. After this process of nonmaxima suppression takes place, the values of the gradient vectors that remain are thresholded, and only pixels with a gradient magnitude exceeding the threshold are considered as edge pixels (Petrou and Bosdogianni, 1999). In the second-order derivative class, optimal edges (maxima of gradient magnitude) are found by searching for places where the second derivative is zero. The isotropic generalization of the second derivative to two dimensions is the Laplacian (Prewitt, 1970). However, when gradient operators are applied to an image, the zeros rarely fall exactly on a pixel. It is possible to isolate these zeros by finding zero crossings: places where one pixel is positive and a neighbor is negative (or vice versa). Ideally, edges of images should correspond to boundaries of homogeneous objects and object surfaces.

• The region approach tries to isolate areas of images that are homogeneous according to a given set of characteristics. Candidate areas may be grown, shrunk, merged, split, created, or destroyed during the segmentation process. There are two typical region-based segmentation algorithms: region-growing and split-and-merge algorithms. Region growing (Adams and Bischof, 1994; Zucker, 1976) is one of the simplest and most popular algorithms; it starts by choosing a starting point or seed pixel. The region then grows by adding neighboring pixels that are similar according to a certain homogeneity criterion, increasing its size step by step (a minimal sketch of this mechanism follows this list). Typical split-and-merge techniques (Chen and Pavlidis, 1980; Fukada, 1980) consist of two basic steps. First, the whole image is considered as one region. If this region does not comply with a homogeneity criterion, the region is split into four quadrants, and each quadrant is tested in the same way until every square region created in this way contains homogeneous pixels. Next, in a second step, all adjacent regions with similar attributes may be merged upon compliance with other criteria.
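To make the region-growing mechanism just described concrete, the following minimal sketch grows a single region over a grayscale image. It illustrates the generic algorithm, not any particular published method; the 4-connectivity, the use of the running region mean, and the tolerance value are all choices made only for this example.

```python
from collections import deque

import numpy as np


def region_grow(image, seed, tol=10.0):
    """Grow one region from `seed` over a 2-D grayscale array.

    A 4-connected neighbor is aggregated while its intensity differs
    from the current region mean by less than `tol` (the homogeneity
    criterion); the mean is updated after every aggregation.
    """
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    total, count = float(image[seed]), 1
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(float(image[nr, nc]) - total / count) < tol:
                    mask[nr, nc] = True
                    total += float(image[nr, nc])
                    count += 1
                    frontier.append((nr, nc))
    return mask


# Toy usage: the region stops exactly at the edges of a bright square.
img = np.zeros((64, 64))
img[16:48, 16:48] = 200.0
print(region_grow(img, seed=(32, 32)).sum())  # 1024 pixels
```

The hard threshold `tol` is precisely the kind of ill-defined parameter criticized below; the integration techniques surveyed in this article aim at replacing or constraining such thresholds with edge information.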
Unfortunately, both techniques, boundary-based and region-based, often fail to produce accurate segmentation, although the locations where they fail are not necessarily identical. On the one hand, in boundary-based methods, if an image is noisy or if its region attributes differ by only a small amount between regions (characteristics very common in natural scenes), edge detection may result in spurious and broken edges. This occurs mainly because such methods rely entirely on the local information available in the image; very few pixels are used to detect the desired features. Edge-linking techniques can be employed to bridge short gaps in such a region boundary, although doing so is generally considered an extremely difficult task. On the other hand, region-based methods always provide closed-contour regions and make use of relatively large neighborhoods in order to obtain sufficient information to allow the algorithm to decide whether to aggregate a pixel into a region. Consequently, the region approach tends to sacrifice resolution and detail in the image to gain a sample large enough for the calculation of useful statistics for local properties. This sacrifice can result in segmentation errors at the boundaries of the regions and in failure to distinguish regions that would be small in comparison with the block size used. Further, in the absence of a priori information, reasonable starting seed points and stopping criteria are often difficult to choose. Finally, both approaches sometimes suffer from a lack of knowledge because they rely on the use of ill-defined hard thresholds that may lead to wrong decisions (Salotti and Garbay, 1992).

In the task of segmentation of some complex pictures, such as outdoor and natural images, it is often difficult to obtain satisfactory results by using only one approach to image segmentation. Taking into account the complementary nature of the edge-based and region-based information, it is possible to alleviate the problems related to each of them considered separately. The tendency toward the integration of several techniques seems to be the best way to produce better results. The difficulty in achieving this lies in the fact that even though the two approaches yield complementary information, they involve conflicting and incommensurate objectives. Thus, as observed by Pavlidis and Liow (1990), although integration has long been a desirable goal, achieving it is a nontrivial task.

In the 1990s, numerous techniques for integrating region and boundary information were proposed. One of the principal characteristics that permits classification of these approaches is the time of fusion: embedded in the region detection, or after both processes (Falah et al., 1994):
• Embedded integration can be described as integration through the definition of new parameters or a new decision criterion for the region-based segmentation. First, the edge information is extracted; second, this information is then used within the segmentation algorithm, which is mainly based on regions. For example, edge information can be used to define the seed points from which regions are grown.

• Postprocessing integration is performed after the image has been processed by the two approaches (boundary-based and region-based techniques). Edge information and region information are extracted independently in a preliminary step and then integrated.
Although many surveys on image segmentation have been published, none, as stated previously, focuses specifically on the integration of region and boundary information. To overcome this deficiency, this article discusses the most current and most relevant segmentation techniques that integrate region and boundary information. The remainder of this article is structured as follows: a discussion of embedded and postprocessing strategies and the related work concludes the Introduction. Section II defines and classifies the different approaches to embedded integration, whereas Section III analyzes the proposals for the postprocessing strategy. Section IV summarizes the advantages and disadvantages of the various approaches. Finally, the results of our study are summarized in Section V.
A. Integration Techniques: Embedded versus Postprocessing

Many cooperative methods have been developed, all with the common objective of improving the segmentation by using integration. However, the fusion of boundary information and region information has been attempted through many different approaches, resulting in a set of techniques with very disparate tendencies. As many authors have proposed (Falah et al., 1994; Le Moigne and Tilton, 1995), one of the main characteristics that allows classification of the integration techniques is the time of fusion. This concept refers to the moment during the segmentation process when the integration of the dual sets of information is performed. This property allows us to distinguish two basic groups among the integration proposals: embedded and postprocessing.

The techniques based on embedded integration start with the extraction of the edge map. This information is then used in the region-detection algorithm, in which the boundary information is combined with the region information to carry out the segmentation of the image. A basic scheme of this method is shown in Figure 1a. The additional information contributed by the edge detection can be employed in the definition of new parameters or new decision criteria.
FIGURE 1. Strategy schemes for region and boundary integration according to the time of fusion: (a) embedded integration; (b) postprocessing integration.
The aim of this integration strategy is to use boundary information as the means of avoiding many of the common problems of region-based techniques. Conversely, the techniques based on postprocessing integration extract edge information and region information independently, as depicted in the scheme of Figure 1b. This preliminary step results in two segmented images obtained by the classical techniques of the two approaches, so they are likely to exhibit the typical faults generated by the use of a single isolated method. An a posteriori fusion process then tries to exploit the dual information in order to modify, or refine, the initial segmentation obtained by a single technique. The aim of this strategy is the improvement of the initial results and the production of a more accurate segmentation.

In the following sections (Sections II and III), we describe several key approaches that we have classified as embedded or postprocessing. Within the embedded methods, we differentiate between those that use boundary information for seed-placement purposes and those that use this information to establish an appropriate decision criterion. Within the postprocessing methods, we differentiate three approaches: oversegmentation, boundary refinement, and selection evaluation. We discuss each of these approaches in depth and, in some cases, emphasize aspects related to the implementation of the methods (region-growing or split-and-merge) or to the use of fuzzy logic, which has been considered in a number of proposals.

B. Related Work
Brief mention of the integration of region and boundary information for segmentation can be found in the introductory sections of several papers. As a
first reference, Pavlidis and Liow (1990) introduced some earlier papers that emphasized the integration of such information. In 1994, Falah et al. identified two basic strategies for achieving the integration of the dual information, boundaries and regions. The first strategy (postprocessing) is described as the use of the edge information to control or refine a region segmentation process. The second strategy (embedded) is to integrate edge detection and region extraction in the same process. The classification proposed by Falah et al. has been adopted by us and is discussed in this article.

Le Moigne and Tilton (1995), considering the general case of data fusion, identified two levels of fusion: pixel and symbol. In a pixel-level integration between edges and regions, the decision for integration is made individually for each pixel, whereas symbol-level integration is made on the basis of selected features, which simplifies the problem. In the same paper, these authors discussed embedded and postprocessing strategies and presented important arguments on the supposed superiority of the postprocessing strategy. They argued that the a posteriori fusion yields a more general approach because, for the initial task, it can employ any type of boundary and region segmentation.

A different viewpoint regarding the integration of edge and region information for segmentation consists of the use of dynamic contours (snakes). In this sense, Chan et al. (1996) reviewed different approaches, pointing out that integration is the way to decrease the limitations of traditional deformable contours.
II. EMBEDDED INTEGRATION
The embedded integration strategy consists of using the edge information, previously extracted, within a region segmentation algorithm. It is well known that in most region-based segmentation algorithms, the manner in which initial regions are formed and the criteria for growing them are set a priori. Hence, the resulting segmentation inevitably depends on the choice of initial region growth points (Kittler and Illingworth, 1985), whereas the region's shape depends on the particular growth process chosen (Kohler, 1981). Some proposals try to use boundary information in order to avoid these problems. According to the manner in which this information is used, it is possible to distinguish two tendencies:

1. Guidance of seed placement: Edge information is used as a guide to choose the most suitable position to place the seed (or seeds) of the region-growing process.
2. Control of growing criterion: Edge information is included in the definition of the decision criterion which controls the growth of the region.
A. Guidance of Seed Placement

In 1992 Benois and Barba presented a segmentation technique that combined contour detection and a split-and-merge procedure of region growing. In this technique, the boundary information is used to choose the growing centers. More specifically, the original idea of the method is the placement of the seeds on the skeleton of nonclosed regions obtained by edge detection. The technique starts with contour detection and extraction, according to the algorithm proposed in Moulet and Barba (1988), which finds the most evident frontiers of homogeneous regions. The contours obtained as a result of this overall procedure are of high quality, but they are not always closed. Then, a region-growing procedure is used to close these regions and to obtain a more precise segmentation. Hence, to obtain a uniformly spread speed of region growing constrained by the original contours, the growing centers should be chosen as far as possible from these contours. To do so, the algorithm chooses them on the skeleton defined by the set of the original contours. The skeleton is computed by the Rosenfeld method of local maxima distance. Finally, the region-growing process is realized in the following steps: a splitting process divides an initial image into homogeneous rectangular blocks, and a merging process then groups these blocks around growing centers to obtain the final segments.

A similar work was proposed by Sinclair (1999), who presented an interesting integrated segmentation algorithm. First, the Voronoi image generated from the edge image is used to derive the placement of the seeds. The intensity at each point in a Voronoi image is the distance to the closest edge. Second, the peaks in the Voronoi image, reflecting the farthest points from the contours, are used as seed points for region growing. In the growth, two criteria are used in order to attach unassigned pixels: the difference in color between the candidate pixel and the boundary member pixel must be less than a set threshold, and the difference in color between the candidate and the mean color of the region must be less than a second, larger threshold. In this sense, these criteria take into account local and global region information for the aggregation of a new pixel to a region, which can be especially interesting for blurred regions. From another integration aspect, edges recovered from the image act as hard barriers through which regions are not allowed to grow. Figure 2 shows the images generated during the segmentation process, including the Voronoi image, which guides the placement of the region-growing centers.

FIGURE 2. The Sinclair (1999) approach using the Voronoi image. (a) Original image. (b) Edges extracted from the original color image. (c) Voronoi image computed from the edge image. (d) Final segmentation.
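The Voronoi-image idea is easy to emulate with a Euclidean distance transform, which assigns to every pixel its distance to the closest edge pixel. The sketch below approximates only Sinclair's seed-placement step; the peak-selection rule (local maxima above a fixed fraction of the global maximum) is an assumption of this example, not a detail taken from the paper.

```python
import numpy as np
from scipy import ndimage


def seeds_from_edges(edge_map, rel_height=0.5):
    """Place region-growing seeds at the peaks of the 'Voronoi image'.

    `edge_map` is a boolean image of detected contours. The distance
    transform plays the role of the Voronoi image; seeds are its local
    maxima whose height is at least `rel_height` times the global one.
    """
    # Distance from every pixel to the closest edge pixel.
    dist = ndimage.distance_transform_edt(~edge_map)
    # A pixel is a local maximum if it equals the max of its 3x3 block.
    local_max = dist == ndimage.maximum_filter(dist, size=3)
    peaks = local_max & (dist >= rel_height * dist.max())
    return np.argwhere(peaks), dist


# Toy usage: seed clusters appear deep inside the rectangle outlined
# by the edges and in the background, i.e., one cluster per region.
edges = np.zeros((64, 64), dtype=bool)
edges[10, 10:50] = edges[50, 10:50] = True
edges[10:51, 10] = edges[10:51, 50] = True
seeds, voronoi_image = seeds_from_edges(edges)
```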
Moghaddamzadeh and Bourbakis (1997) proposed an algorithm that uses edge detection to guide initialization of an a posteriori region-growing process. Actually, this work is not specifically oriented to the placement of the seeds for the a posteriori growing process, but is focused on establishing a specific order for the growing processes. As is well known, one disadvantage of the region-growing and merging processes is their inherently sequential nature. Hence, the final segmentation results depend on the order in which regions are grown or merged. The objective of this proposal is to simulate the order by which we humans separate segments from each other in an image; that is, from large to small. To achieve this, an edge-detection technique is applied to the image to separate large and crisp segments from the rest. The threshold of the edge-detection algorithm is fixed low enough to detect even the weakest edge pixels in order to separate regions from each other. Next, the regions obtained (considering a region as an area closed by edges) are sequentially expanded, starting from the largest segment and finishing with the smallest. Expanding a segment refers to merging adjacent pixels with the segment on the basis of some conditions. Two fuzzy techniques are then proposed to expand the large segments and/or to find the smaller ones.

Another proposal, which uses the edge information to initialize the seeds of an a posteriori region growing, has been presented by Cufí et al. (2000). Like the proposal of Moghaddamzadeh and Bourbakis, Cufí et al.'s proposal takes into account seed placement as well as the order by which the regions start the growing process. However, Moghaddamzadeh and Bourbakis give priority to the largest regions, whereas Cufí et al. prefer a concurrent growing, giving the same opportunities to all regions. The basic scheme of their technique is shown in Figure 3. The technique begins by detecting the main contours of the image following the edge extraction algorithm discussed in Cufí and Casals (1996). For each extracted contour, the algorithm places a set of growing centers at each side of and along it. It is assumed that the whole set of seeds on one side of the contour belongs to the same region.
FIGURE 3. Scheme of the segmentation technique proposed by Cufí et al. (2000). The method is composed of four basic steps: (1) main contour detection, (2) analysis of the seeds, (3) adjustment of the homogeneity criterion, and (4) concurrent region growing.
Then, these seeds are used as samples of the corresponding regions and analyzed in the chromatic space in order to establish appropriate criteria for the posterior growing processes. The goal is to know a priori some characteristics of the regions, with the aim of adjusting the homogeneity criterion to each region's characteristics. Finally, the seeds simultaneously start a concurrent growth using the criterion established for each region, which is based on clustering analysis and convex hull construction.
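The following fragment illustrates the spirit of the seed-analysis step: the seeds placed along one side of a contour are treated as color samples of their region, and a region-specific acceptance test is derived from them. The per-channel mean/standard-deviation test used here is only a stand-in for the clustering and convex-hull analysis of the original method, and the factor k is arbitrary.

```python
import numpy as np


def criterion_from_seeds(seed_colors, k=2.5):
    """Build a per-region homogeneity test from seed color samples.

    Accepts a candidate color lying within `k` standard deviations of
    the seed mean in every channel (a simplified stand-in for the
    clustering/convex-hull analysis of Cufí et al.).
    """
    mean = seed_colors.mean(axis=0)
    std = seed_colors.std(axis=0) + 1e-6   # avoid division issues
    def accept(color):
        return bool(np.all(np.abs(color - mean) <= k * std))
    return accept


# Seeds sampled on one side of a contour define that region's test:
samples = np.array([[120, 60, 40], [118, 63, 38], [124, 58, 43]], float)
accept = criterion_from_seeds(samples)
print(accept(np.array([121, 61, 41])))   # True: fits the region
print(accept(np.array([30, 90, 200])))   # False: foreign color
```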
B. Control of Growing Criterion Another way to carry out the integration from an embedded strategy is to incorporate the edge information into the growing criterion of a region-based segmentation algorithm. Thus, the edge information is included in the definition of the decision criterion that controls the growth of the region. As discussed in the Introduction, region-growing and split-and-merge algorithms are the typical region-based segmentation algorithms. Although both share the essential concept of homogeneity, the way they carry out the segmentation process is different in the decisions taken. For this reason, and ' to facilitate the analysis of the surveyed algorithms, we present these two types of approaches in separate subsections.
1. Integration in Split-and-Merge Algorithms Bonnin and his colleagues (1989) proposed a region extraction based on a split-and-merge algorithm controlled by edge detection. The method incorporates boundary information into the homogeneity criterion of the regions to guide the region-detection procedure. The criterion to decide the split of a region takes into account edge and intensity characteristics. More specifically, if there is no edge point on the patch and if the intensity homogeneity constraints are satisfied, the region is stored; otherwise, the patch is divided into four subpatches, and the process is recursively repeated. The homogeneity intensity criterion is necessary because of possible failures of the edge detector. After the split phase, the contours are thinned and chained into edges relative to the boundaries of the initial regions. Later, a final merging process takes into account edge information in order to solve possible oversegmentation problems. In this last step, two adjacent initial regions are merged only if there are no edges on the common boundary. The general structure of the method is depicted in Figure 4, where it can be observed that edge information guides the split-and-merge procedure in both steps of the algorithm: first, to decide the split of a region, and second, in the final merging phase, to solve the possible oversegmentation.
A REVIEW OF INTEGRATED IMAGE SEGMENTATION TECHNIQUES
FIGURE 4. Scheme of the segmentation technique proposed by Bonnin et al. (1989). The edge information guides the split-and-merge procedure in both steps of the algorithm: first, to decide the split of a region, and second, in the final merging phase, to solve the possible oversegmentation.
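A schematic rendering of the split criterion just described is given below. A patch is accepted as an initial region only when it contains no edge point and passes an intensity homogeneity test; the variance threshold stands in for whatever homogeneity measure the original implementation used, and the image side is assumed to be a power of two.

```python
import numpy as np


def split(image, edge_map, r0=0, c0=0, size=None,
          min_size=8, var_thresh=100.0):
    """Edge-controlled quadtree splitting in the spirit of Bonnin et al.

    A patch is stored when it holds no edge point AND its intensity
    variance is low; otherwise it is divided into four subpatches and
    the test is repeated recursively.
    """
    if size is None:
        size = image.shape[0]            # square, power-of-two side assumed
    patch = image[r0:r0 + size, c0:c0 + size]
    edges = edge_map[r0:r0 + size, c0:c0 + size]
    if size <= min_size or (not edges.any() and patch.var() < var_thresh):
        return [(r0, c0, size)]          # accepted as an initial region
    half = size // 2
    regions = []
    for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
        regions += split(image, edge_map, r0 + dr, c0 + dc, half,
                         min_size, var_thresh)
    return regions
```

The subsequent merge phase would then join two adjacent regions only if, as the text states, no edge pixels lie on their common boundary.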
The split-and-merge algorithm cooperating with an edge extractor was also proposed in the work of Buvry, Zagrouba et al. (1994). The proposed algorithm follows the basic idea introduced by Bonnin, considering the edge segmentation in the merging step. However, a rule-based system was added to improve the initial segmentation. A scheme of the proposed algorithm is illustrated in Figure 5. These authors argued that the split-and-merge segmentation algorithm creates many horizontal or vertical boundaries without any physical meaning. To solve this problem, the authors defined a rule-based system dealing with this type of boundary. Specifically, the gradient mean of each boundary is used to decide whether the boundary has a physical reality.
FIGURE 5. Segmentation technique proposed by Buvry, Zagrouba et al. (1994). Edge information is used to guide the split-and-merge region segmentation. Finally, a set of rules improve the initial segmentation by removing boundaries without corresponding edge information. Prewitt op., Prewitt operator.
In 1997, Buvry, Senard et al. reviewed the work presented in Buvry, Zagrouba et al.'s publication (1994) and proposed a new hierarchical region-detection algorithm for stereovision applications that takes the gradient image into account. The method yields a hierarchical coarse-to-fine segmentation in which each region is validated by exploiting the gradient information. At each level of the segmentation process, a threshold is computed and the gradient image is binarized according to this threshold. Each closed area is labeled by applying a classical coloring process and defines a new region. Edge information is also used to determine whether the split process is finished or the next partition must be computed. To do so, a gradient histogram of all pixels belonging to the region is calculated and its characteristics (mean, maximum, and entropy) are analyzed.

A proposal for enriching segmentation by an irregular pyramidal structure by using edge information can be found in the work of Bertolino and Montanvert (1996). In this algorithm, a graph of adjacent regions is computed and modified according to the edge map obtained from the original image. Each graph edge* is weighted with a pair of values (r, c), which represent, respectively, the number of region elements and contour elements in the common boundary of the two regions. The algorithm then goes through the graph and at each graph edge decides whether to forbid or favor the fusion between adjacent regions (a small sketch of this weighting appears at the end of this subsection).

*To avoid confusion, we designate as a graph edge an edge that joins two nodes in a graph.

The use of edge information in a split-and-merge algorithm need not be reduced to only the decision criterion. In this sense, Gevers and Smeulders presented, in 1997, a new technique that extends the possibilities of this integration. Their proposal uses edge information to decide how the partition of the region should be made or, in other words, where to split the region. The idea is to adjust this decision to the boundary information and to split the region following the edges contained in it. In reference to previous works, these authors affirmed that although the quad-tree scheme is simple to implement and computationally efficient, its major drawback is that the image tessellation process is unable to adapt the tessellation grid to the underlying structure of the image. For this reason they proposed to employ the incremental Delaunay triangulation, capable of forming grid edges of arbitrary orientation and position. The tessellation grid, defined by the Delaunay triangulation, is adjusted to the semantics of the image data. In the splitting phase, if a global similarity criterion is not satisfied, pixels lying on image boundaries are determined by using local difference measures and are used as new vertices to locally refine the tessellation grid.
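A sketch of the graph-edge weighting used by Bertolino and Montanvert might look as follows; the merge-decision cutoff is an assumption of this example, not a value from the paper.

```python
import numpy as np


def boundary_weight(labels, edge_map, a, b):
    """Weight the graph edge between regions `a` and `b` as (r, c).

    r counts the 4-adjacent pixel pairs straddling the two regions
    (region elements of the common boundary); c counts how many of
    those pairs touch a detected contour pixel.
    """
    r = c = 0
    h, w = labels.shape
    for i in range(h):
        for j in range(w):
            for ni, nj in ((i + 1, j), (i, j + 1)):
                if (ni < h and nj < w
                        and {int(labels[i, j]), int(labels[ni, nj])} == {a, b}):
                    r += 1
                    c += bool(edge_map[i, j] or edge_map[ni, nj])
    return r, c


def favor_fusion(labels, edge_map, a, b, max_contour_frac=0.3):
    # Forbid the fusion when a large fraction of the common boundary
    # coincides with detected contours (the 0.3 cutoff is assumed).
    r, c = boundary_weight(labels, edge_map, a, b)
    return r > 0 and c / r < max_contour_frac
```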
2. Integration in Region-Growing Algorithms

One of the first integrations of edge information into a region-growing algorithm can be found in the work of Xiaohan et al. (1992), in which edge
information was included in the decision criterion. A classic region-growing algorithm generally takes into account only the contrast between the current pixel and the region in order to decide whether to merge them. Xiaohan et al. proposed a region-growing technique that also includes the gradient information in the homogeneity criterion used to make this decision. The proposed combination of region and gradient information can be expressed in the following formula:
x(i, j) = |X_av - f(i, j)|
z(i, j) = (1 - φ)x(i, j) + φG(i, j)    (1)
where X_av is the average gray value of the region, which is updated pixel by pixel. The contrast of the current pixel with respect to the region is denoted by x(i, j). The parameter φ controls the weight of the gradient G(i, j). Finally, the sum of the local and the global contrast is the final homogeneity measure, z(i, j). Following this expression, the proposed algorithm can be described in only two steps:

Step 1. If z(i, j) is less than a given threshold τ, then the current pixel is merged into the region.
Step 2. Else, the local maximum of the gradients in a small neighborhood of the current pixel is searched along the direction of region growing. The procedure stops at the pixel with the local gradient maximum.

The first step of the algorithm describes the growing of the region guided by the proposed homogeneity criterion. The second tries to avoid the typical error of region-based segmentation techniques, that is, the inaccuracy of the detected boundaries, by putting the result of the segmentation in coincidence with the edge map.

A similar integration proposal was suggested by Falah et al. in 1994. In this work the gradient information is included in the decision criterion to restrict the growth of regions: at each iteration, only pixels having low gradient values (below a certain threshold) are allowed to be aggregated into the growing region. Another interesting aspect of this work is the choice of the seeds for the region-growing process. This selection uses the redundancy between the results obtained by several region segmentations (with different thresholds and different directions of image scanning), with the aim of placing the seeds in a position in which they have a high degree of certainty of belonging to a homogeneous region.
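Dropping Eq. (1) into a region-growing loop is straightforward; the region-growing sketch given in the Introduction needs only its acceptance test replaced by the combined measure. The values of φ and τ below are arbitrary.

```python
def combined_contrast(pixel_value, region_mean, gradient, phi=0.4):
    """z(i, j) of Eq. (1): a weighted sum of the local contrast against
    the region mean and the gradient magnitude at the pixel."""
    x = abs(pixel_value - region_mean)       # contrast term x(i, j)
    return (1.0 - phi) * x + phi * gradient  # homogeneity measure z(i, j)


def merge_pixel(pixel_value, region_mean, gradient, tau=15.0, phi=0.4):
    # Step 1 of the algorithm: merge the pixel while z(i, j) < tau.
    return combined_contrast(pixel_value, region_mean, gradient, phi) < tau
```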
In 1992 Salotti and Garbay developed a theoretical framework for an integrated segmentation system. The core of the problem of traditional segmentation methods, as noted by these authors, relates to the autarchy of the methods and to the schedule of conditions that are defined with a priori assumptions. To solve this problem, major directives to control each decision are presented: to accumulate local information before taking difficult decisions; to use processes exploiting complementary information to cooperate successfully; to defer difficult decisions until more information is available; and, finally, to enable easy context switches to ensure an opportunistic cooperation. The main idea of these directives is that each decision must be strongly controlled. This implies that a massive collaboration must be carried out and that the segmentation task should not necessarily be achieved before the beginning of the high-level process. Finally, all these principles are used in a segmentation system with a region-growing process as the main module. Pixels that seem difficult to classify, because there is insufficient information for a sure decision, are given to an edge-detection unit that must respond whether or not they correspond to an edge.

The same directives were followed in a posterior work (Bellet et al., 1994), which presents an edge-following technique that uses region-based information to compute adaptive thresholds. In situations in which it is difficult to follow the high gradient, complementary information is requested and successfully obtained through the emergence of regions on both sides of the edge. A child edge process is then created with a threshold adapted to lower gradient values. Moreover, these authors introduce the adaptability of the aggregation criterion to the region's characteristics: several types of regions are distinguished and defined, the region-growing method dynamically identifies the type of the analyzed region, and a specific adapted criterion is used.
C. Fuzzy Logic A current trend in segmentation techniques that deserves special attention is the use of fuzzy logic (Bezdek et al., 1999). The role of fuzzy sets in segmentation techniques is becoming more important (Lambert and Carron, 1999; Pham and Prince, 1999), and the integration techniques are in the mainstream of this tendency. In this sense, we want to emphasize the growing interest of researchers to incorporate fuzzy logic methods into integrated segmentation. This interest was mainly prompted because these two integration methods are developed from complementary approaches and do not share a common measure. Hence, fuzzy logic offers the possibility to solve this problem, as it is especially suited to carry out the fusion of information of a diverse nature (Kong and Kosko, 1992; Moghaddamzadeh and Bourbakis, 1997). In the case of embedded integration of edge information into a region-growing procedure (Krishnan et al., 1994; Steudel and Glesner, 1999), the fuzzy rule-based homogeneity criterion offers several advantages in contrast to ordinary feature aggregation methods. Among these advantages is its short development time as a result of the existing set of tools and methodologies for the development of fuzzy rule-based systems. An existing rule-based system can
A REVIEW OF INTEGRATED IMAGE SEGMENTATION TECHNIQUES
15
easily be modified or extended to meet the specific requirements of a certain application. Furthermore, it does not require a full knowledge of the process and it is intuitive to understand because of its human-like semantics. In addition, it is possible to include such linguistic concepts as shape, size, and color, which are difficult to handle when one is using most other mathematical methods. A key work in using fuzzy logic was by Steudel and Glesner (1999), in which the segmentation is carried out on the basis of a region-growing algorithm that uses a fuzzy rule-based system for the evaluation of the homogeneity criterion. These authors affirmed that there are several negative aspects of using only the intensity difference for segmentation: 9 Oversegmentation of the image 9 Annoying false contours 9 Contours that are not sufficiently smooth Therefore, new features are introduced into the rule base of the fuzzy rulebased system which result in a better and more robust partitioning of the image while maintaining a small and compact rule base. The proposed homogeneity criterion is composed of a set of four fuzzy rules. The main criterion is the difference between the average intensity A of a region Rj and the pixel in under investigation. The corresponding fuzzy rule is as follows: RI"
R1: IF DIFFERENCE IS SMALL THEN HOMOGENEOUS ELSE NOT_HOMOGENEOUS
Another important feature for the segmentation of regions is the gradient at
the position of the pixel to be merged. A new pixel may be merged into a region R_j when the gradient at that location is low. Conversely, when the gradient is too high, the pixel definitely does not belong to the region and should not be merged. In terms of a fuzzy rule:
R2: IF GRADIENT IS LOW THEN PROBABLY HOMOGENEOUS ELSE NOT_HOMOGENEOUS
With this rule, an adjacent pixel i_n satisfies the premise of rule R2 with a degree of μ_LOW(GRADIENT(i_n)). The two remaining rules refer to the size and the shape of regions, in order to avoid very small regions and to favor compact regions with smooth contours. A complete scheme of this proposal is shown in Figure 6.
FIGURE 6. Fuzzy segmentation technique by Steudel and Glesner (1999). The method is composed of a set of fuzzy rules corresponding to the main properties of the regions: intensity, gradient, shape, and size. The combined result of these rules indicates the desirability of aggregating a new pixel into the region. H, homogeneous; NH, not homogeneous; PH, probably homogeneous; PNH, probably not homogeneous. (Reprinted from Pattern Recognition, vol. 32, no. 11, A. Steudel and M. Glesner, Fuzzy Segmented Image Coding Using Orthonormal Bases and Derivative Chain Coding, page 1830, © 1999, with permission from Elsevier Science.)
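To make the mechanics of such a rule base concrete, the following is a minimal sketch of a fuzzy homogeneity test in the spirit of rules R1 and R2 above. The membership functions, their breakpoints, the min operator used to combine the rules, and the defuzzification threshold are illustrative assumptions of this sketch, not the parameters published by Steudel and Glesner.

```python
import numpy as np

def mu_down(x, lo, hi):
    """Membership degree that is 1 below lo, 0 above hi, and linear in
    between (used here for both SMALL difference and LOW gradient)."""
    return float(np.clip((hi - x) / (hi - lo), 0.0, 1.0))

def fuzzy_homogeneity(pixel, region_mean, gradient,
                      diff_hi=30.0, grad_hi=80.0):
    """Degree in [0, 1] to which a candidate pixel is HOMOGENEOUS with a
    region, combining rule R1 (intensity difference) and rule R2
    (gradient).  The breakpoints diff_hi and grad_hi are illustrative."""
    mu_small = mu_down(abs(pixel - region_mean), 0.0, diff_hi)  # rule R1
    mu_low = mu_down(gradient, 0.0, grad_hi)                    # rule R2
    # Conjunctive aggregation of the two rules (min operator); the full
    # method also has shape and size rules, omitted here.
    return min(mu_small, mu_low)

# A pixel is merged when the aggregated degree exceeds a threshold:
if fuzzy_homogeneity(pixel=120, region_mean=112, gradient=15) > 0.5:
    print("merge pixel into region")
```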
Krishnan et al. (1994) described a boundary extraction algorithm based on the integration of fuzzy rule-based region growing and fuzzy rule-based edge detection. The properties of homogeneity and edge information of each candidate along the search directions are evaluated and compared with the properties of the seed. The fuzzy output values of the edge detection and a similarity measure of the candidate pixel are used to decide the test for the boundary pixel. This proposal was applied to colonoscopic images for the identification of closed boundaries of the intestinal lumen, to facilitate the diagnosis of colon abnormalities. Another proposal for the integration of boundary information into the region-growing process was presented by Gambotto (1993), in which edge information is used to stop the growing process. The algorithm starts with the gradient image and an initial seed that must be located inside the connected region. Then, pixels that are adjacent to the region are iteratively merged if they satisfy a similarity criterion. A second criterion is used to stop this growth. The criteria assume that the gradient takes a high value over a large part of the region boundary. Thus, growth termination is based on the average gradient, F(n), computed over the region boundary following the expression
F(n) = Σ G(k, l) / P(n)
(2)
where P(n) is the perimeter of the region R(n), G(k, l) is the modulus of the gradient at the pixels (k, l) on the region boundary, and the sum runs over the boundary pixels. The iterative growing process is continued until the maximum of the global contrast function, F, is detected. Gambotto points out that the cooperation between region growing and contour detection is desirable because the assumption of homogeneous regions is usually too restrictive. With this approach, the class of regions that can be characterized is wider than the class characterized by smooth gray-level variations alone.
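A minimal sketch of this stopping rule follows, assuming the region is available as a boolean mask and the gradient modulus as a precomputed array; extracting the boundary by morphological erosion is an implementation choice of this sketch, not part of the original formulation.

```python
import numpy as np
from scipy.ndimage import binary_erosion

def boundary_contrast(region_mask, gradient):
    """F(n) of Eq. (2): average gradient modulus over the region
    boundary.  region_mask is a boolean array; gradient holds the
    gradient modulus of the image."""
    # Boundary pixels: region pixels removed by a one-pixel erosion.
    boundary = region_mask & ~binary_erosion(region_mask)
    perimeter = boundary.sum()
    return gradient[boundary].sum() / perimeter if perimeter else 0.0

# Growing is iterated while F(n) keeps increasing; the region kept is
# the one at the maximum of the global contrast function F, e.g.:
# best_n = max(range(len(masks)),
#              key=lambda n: boundary_contrast(masks[n], grad))
```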
III. POSTPROCESSING INTEGRATION

In contrast to the works analyzed to this point, which follow an embedded strategy, the postprocessing strategy carries out the integration a posteriori to the segmentation of the image by region-based and boundary-based algorithms. Region information and edge information are extracted in a preliminary step and then integrated. Postprocessing integration is based on fusing the results of single segmentation methods, attempting to combine the map of regions (generally with thick and inaccurate boundaries) and the map of edge outputs (generally fine and sharp, but dislocated), with the aim of providing an accurate and meaningful segmentation. Most researchers agree on differentiating embedded methods from postprocessing methods. We have identified different approaches for performing postprocessing tasks:
1. Oversegmentation: This approach consists of using a segmentation method with parameters specifically fixed to obtain an oversegmented result. Additional information from other segmentation techniques is then used to eliminate false boundaries that do not correspond to regions.
2. Boundary refinement: This approach considers the region segmentation result as a first approximation, with regions well defined but with inaccurate boundaries. Information from edge detection is used to refine the region boundaries and to obtain a more precise result.
3. Selection evaluation: In this approach, edge information is used to evaluate the quality of different region-based segmentation results, with the aim of choosing the best one. This third set of techniques deals with the difficulty of establishing adequate stopping criteria and thresholds in region segmentation.

A. Oversegmentation
The oversegmentation approach has emerged because of the difficulty of establishing an adequate homogeneity criterion for region growing. As Pavlidis and Liow (1990) suggested, the major reason that region growing produces false boundaries is that the definition of region uniformity is too strict, as when the definition insists on approximately constant brightness while, in reality, brightness may vary linearly within a region. It is very difficult to find uniformity criteria that match these requirements exactly and do not generate false boundaries. Summarizing, these authors argued that the results can be significantly improved if all region boundaries qualified as edges are checked, rather than attempting to fine-tune the uniformity criteria. A basic scheme is shown in Figure 7. A first proposal can be found in the work of Monga et al. (Gagalowicz and Monga, 1986; Wrobel and Monga, 1987). The algorithm starts with a region-growing or a split-and-merge procedure in which the parameters have been set so that an oversegmented image results. Then the region-merging process is controlled by edge information, which helps to remove the false contours generated by the region segmentation. Every initial boundary is checked by analyzing its coherence with the edge map: real boundaries must have high gradient values, while low values correspond to false contours. According to this assumption, two adjacent regions are merged if the average gradient on their boundary is lower than a fixed threshold. In 1992, Kong and Kosko included fuzzy logic in the algorithm proposed by Monga et al. As Monga et al. did, Kong and Kosko computed gradient information, which they called the high-frequency characteristic h, to eliminate false contours:
h = |high-frequency components along the boundary| / (length of the boundary)
(3)
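The merging step shared by these oversegmentation methods reduces to measuring the edge support along the common boundary of two adjacent regions. The following sketch uses the average gradient modulus along the shared boundary, in the spirit of Monga et al.'s criterion; the label-image representation, the sampling of the gradient at one pixel of each boundary pair, and the threshold are illustrative assumptions.

```python
import numpy as np

def mean_boundary_gradient(labels, gradient, r1, r2):
    """Average gradient modulus over the common boundary of regions r1
    and r2 in a label image (4-connectivity).  The gradient is sampled
    at the first pixel of each boundary pair, a simplification."""
    total, count = 0.0, 0
    # Horizontal and vertical neighbor pairs via array slicing.
    for a, b, g in ((labels[:, :-1], labels[:, 1:], gradient[:, :-1]),
                    (labels[:-1, :], labels[1:, :], gradient[:-1, :])):
        on_boundary = ((a == r1) & (b == r2)) | ((a == r2) & (b == r1))
        total += g[on_boundary].sum()
        count += on_boundary.sum()
    return total / count if count else 0.0

# Two adjacent regions are merged when the average gradient on their
# common boundary stays below a fixed threshold (a false contour):
# if mean_boundary_gradient(labels, grad, i, j) < T: merge(i, j)
```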
FIGURE 7. Scheme of postprocessing integration method based on oversegmentation. First, thresholds are set to obtain an initial oversegmented result. Second, complementary information allows removal of false boundaries.
For any boundary, if the high-frequency information h is SMALL, the Kong and Kosko algorithm concludes that the boundary is a false contour and that it can be eliminated. Another interesting work was presented by Pavlidis and Liow (1990). The proposed algorithm shares the basic strategy of the previously described works, but these authors included criteria in the merging decision in order to eliminate the false boundaries that result from the data structure used. Starting from an oversegmented image, region boundaries are eliminated or modified on the basis of criteria that integrate contrast, boundary smoothness, and the variation of the image gradient along the boundary, together with a final criterion that penalizes the presence of artifacts reflecting the data structure used during the segmentation. For each boundary, a merit function is computed of the form

f1(contrast) + β f2(segmentation artifacts)
(4)
where boundaries with low values of that sum are candidates for elimination. Finally, the proposed algorithm ends with a contour refinement step using snakes, which produces smoother contours. Saber et al. (1997) proposed a segmentation algorithm which uses a split-and-merge process to carry out the fusion of spatial edge information and the regions resulting from adaptive Bayesian color segmentation. The image is first segmented on the basis of color information only. Second, spatial edge locations are determined by using the magnitude of the gradient of the three-channel image vector field, computed as described by Lee and Cok (1991). As a way to enforce the consistency of the color segmentation map with the color edge locations, a split-and-merge procedure is proposed. In the first phase,
color segments that have at least one edge segment within their boundary are split into multiple regions. The splitting is accomplished by first thresholding the gradient result and then labeling all contiguous regions therein. Next, the merging criterion favors combining two regions if there is no significant edge between the region boundaries. A flowchart of the method is depicted in Figure 8. Using the same basic idea of starting from an oversegmented image, some authors have developed techniques that begin with edge detection to obtain the oversegmented result. The intention is the same as before, except that in these cases region information allows differentiation between true and false contours. Following this strategy, Philipp and Zamperoni (1996) proposed to start with a high-resolution edge extractor and then, according to the texture characteristics of the extracted regions, to decide whether to suppress or prolong an edge. Derivative edge detectors, when employed at a high resolution, give long, rather
FIGURE 8. Flowchart of the method proposed by Saber et al. (1997). First, an initial segmentation map is computed. Second, region labels are optimized by split-and-merge procedures to enforce consistency with the edge map. GRF, Gibbs random field. (Reprinted from Image and Vision Computing, vol. 15, no. 10, E. Saber, A. M. Tekalp and G. Bozdagi, Fusion of Color and Edge Information for Improved Segmentation and Edge Linking, page 770, © 1997, with permission from Elsevier Science.)
isolated, and well-localized contours in nontextured areas, and numerous, short, close-spaced contours in textured areas. The former correspond to true edges in the image, because they are well localized and thin, so they must be preserved, and prolonged if possible. The latter must be suppressed if they are inside a textured region, but preserved and prolonged if they represent a piece of border. The feature used in this algorithm is the distance between the textures on either side of the edge. As a way to obtain texture information, two seeds are put on either side of the edge, and each starts a recursive growing until N representative pixels are gathered. If the distance between the textures is small, the edge is considered false and the regions are merged. Otherwise, the contour is preserved and prolonged in order to maximize the distance on either side of the edge. In 1997, Fjørtoft et al. presented another technique based on oversegmentation from edge detection, which was demonstrated on synthetic aperture radar (SAR) images. These authors discussed the key role of the threshold value used to extract the possible edges from an edge strength map. The chosen threshold is related to the probability of false alarm (i.e., the probability of detecting an edge in a zone of constant reflectivity). As a way to detect all significant edges, a low threshold is set, accepting the detection of numerous false edges as well. The oversegmentation result provides, as these authors suggested, a good starting point for the merging process, which eliminates false edges by merging regions. The merging step uses a likelihood ratio (LR) criterion to decide the homogeneity between adjacent regions and the consequent elimination of their boundary. That is, the LR is related to the probability that the two regions have the same reflectivity.
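Both of the validation schemes above ultimately compare statistics gathered on the two sides of a candidate contour. A minimal sketch in that spirit follows, using a crude mean/standard-deviation descriptor of the gray levels as the "texture" of each side; the descriptor, the Euclidean distance, and the threshold are illustrative assumptions, far simpler than the measures actually used by Philipp and Zamperoni or by Fjørtoft et al.

```python
import numpy as np

def side_features(image, points):
    """Simple texture descriptor (mean, std) of the gray levels
    gathered from a set of (row, col) pixel coordinates collected on
    one side of a candidate edge."""
    vals = np.array([image[y, x] for y, x in points], dtype=float)
    return np.array([vals.mean(), vals.std()])

def edge_is_false(image, left_pts, right_pts, thresh=10.0):
    """Declare the edge false (its two sides mergeable) when the
    distance between the side textures is small."""
    d = np.linalg.norm(side_features(image, left_pts)
                       - side_features(image, right_pts))
    return d < thresh
```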
B. Boundary Refinement

As described previously, region-based segmentation yields a good detection of true regions although, as is well known, its sensitivity to noise causes the boundary of the extracted region to be highly irregular. The boundary refinement approach, which we also call result refinement, considers region-based segmentation as a first approximation to segmentation. Typically, a region-growing procedure is used to obtain an initial estimate of a target region, which is then combined with salient edge information to achieve a more accurate representation of the target boundary. As in the oversegmentation proposals, edge information permits the refinement of an initial result. An interesting example of boundary refinement can be found in the work of Haddon and Boyce (1990), who proposed a segmentation algorithm consisting of two stages: after an initial region segmentation, an a posteriori refinement of the generated regions is performed by means of a relaxation algorithm that uses the edge information to ensure local consistency of the labeling.
Nevertheless, the main characteristic of this work is the postulate that a co-occurrence matrix may be employed as a feature space, with clusters within the matrix being identified with the regions and boundaries of an image. This postulate is proven for nearest-neighbor co-occurrence matrices derived from images whose regions satisfy Gaussian statistics: regions yield clusters on the main diagonal, and boundaries yield clusters off the main diagonal. In 1993, Chu and Aggarwal presented an algorithm which integrates multiple region segmentations and edge maps. The proposed algorithm allows multiple input maps and applies user-selected weights to the various information sources. The first step consists of transforming all inputs to edge maps. Second, a maximum likelihood estimator provides initial solutions for the edge positions and strengths from the multiple inputs. Third, an iterative procedure is used to smooth the resulting edge patterns. Finally, regions are merged to ensure that every region has the required properties. The strength of this proposal is that the solution is a negotiated result of all the input maps rather than a selection among them. Several years later, Nair and Aggarwal (1996) made this initial proposal more sophisticated by stating the boundary refinement problem as a classification problem. Every point s on the region boundary must find its new location as a selection from a set of candidate edge element locations ζ = {z_j, j = 0, . . . , n}, where z_0 = s. Using the Bayes decision rule, the algorithm chooses z_j as the new location if
p(s|z_j)P(z_j) > p(s|z_k)P(z_k)    for all k ≠ j
(5)
where p(s|z_j) represents the conditional density function of s given z_j, and P(z_j) is the a priori probability of z_j. The a priori probability of each candidate location z_j is estimated from the proximity of the salient edge segment to which z_j belongs to the boundary of the target region. Finally, the proposed algorithm tries to restore boundary segments by incorporating small parts of the target missed in the region segmentation; that is, for each edge pixel at the site of a break in the boundary, the algorithm tries to determine whether the pixel is part of a salient edge. If it is, the complete edge segment can be incorporated into the boundary. A scheme of this proposal is indicated in Figure 9.
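In code, the relocation rule of Eq. (5) is a straightforward arg-max over the candidate locations. In the sketch below, the proximity-based priors P(z_j) are taken as given, and the Gaussian fall-off assumed for the conditional density p(s|z_j) is an illustrative choice, not the density used by Nair and Aggarwal.

```python
import numpy as np

def relocate(s, candidates, priors, sigma=2.0):
    """Pick the candidate z_j maximizing p(s|z_j) P(z_j), Eq. (5).
    s and the candidates are 2D points; priors[j] encodes the
    proximity-based a priori probability P(z_j)."""
    def likelihood(z):
        # Assumed Gaussian fall-off of p(s|z) with the distance s-z.
        return np.exp(-np.sum((np.asarray(s) - np.asarray(z)) ** 2)
                      / (2.0 * sigma ** 2))
    scores = [likelihood(z) * p for z, p in zip(candidates, priors)]
    return candidates[int(np.argmax(scores))]
```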
FIGURE 9. The general flow of the target segmentation paradigm proposed by Nair and Aggarwal (1996). Boundary refinement from edge information is stated as a classification problem.

A recent proposal for the boundary refinement approach was put forward by Sato et al. (2000). The objective of these authors is to obtain an accurate segmentation of three-dimensional medical images for clinical applications. The proposed technique takes into account the gradients of the boundary and its neighborhood, and applies the gradient magnitude, based on a Sobel operator, for boundary improvement. The algorithm starts with successive steps of thresholding and ordinary region growing, which yield a first segmentation of the region of interest. The highest gradient magnitude is expected at the boundary, so a growing process starts to find this optimal boundary. For each voxel (three-dimensional pixel) at a boundary, the neighbors of the voxel outside the region are inspected by calculating their gradient magnitudes. If a neighboring voxel has a greater gradient magnitude than that of the boundary voxel, it is assigned to the next boundary region. This process is repeated recursively until no further boundary region can be created.

1. Boundary Refinement by Snakes
Although the aforementioned proposals have contributed interesting results and new ideas, the most common way to refine the boundary consists of the integration of region information with dynamic contours, also called snakes. The concept of the snake was introduced by Kass et al. (1987). A snake can be defined as an energy-minimizing spline guided by internal constraint forces and influenced by image forces. The image forces guide the snake toward salient image features such as lines, edges, and subjective contours. If we represent the position of a snake parametrically by v(s) = (x(s), y(s)), its energy functional can be expressed as

E_snake = ∫ [E_int(v(s)) + E_ext(v(s))] ds
(6)
where E_int represents the internal energy of the spline, due to its elasticity and rigidity properties, and E_ext gives rise to the external constraint forces.
The internal forces impose a smoothness constraint, while the external energy guides the snake toward image characteristics such as edges. Unlike most other techniques for finding salient contours, the snake model is active: it is always minimizing its energy functional and therefore exhibits dynamic behavior. Because of the way the contour appears to slither while minimizing its energy, it is called a snake. The snake method is known to solve boundary refinement problems by locating the object boundary from an initial estimate. However, snakes do not try to solve the entire problem of finding salient image contours. A high gray-level gradient in the image may be due to object boundaries as well as to noise and object textures, so the optimization function may have many local optima. Consequently, active contours are, in general, sensitive to initial conditions, and they are effective only when the initial position of the contour in the image is sufficiently close to the real boundary. For this reason, active contours rely on other mechanisms to place them somewhere near the desired contour. In early approaches to dynamic contours, an expert was responsible for placing the snake close to the intended contour; energy minimization carries it the rest of the way, and the snake deforms itself into conformity with the nearest salient contour. Region segmentation, however, can solve the initialization problem of snakes: the proposed integrated methods use the region segmentation result as the initial contour of the snake. In this case, the segmentation process is typically divided into two steps (see Fig. 10).
FIGURE 10. Block diagram of an integration proposal using snakes. The region-based segmentation result is used to initialize the position of the dynamic contour. Next, energy minimization permits extraction of the accurate boundary of the target object.
First, a region growing from a seed point is performed in the target region, and its output is used as the initial contour of the dynamic contour model. Second, the initial contour is modified on the basis of energy minimization. Different works that combine region detection and dynamic contours can be found in the literature. In the work of Chan et al. (1996), the greedy algorithm proposed by Williams and Shah (1992) is used to find the minimum-energy contour. This algorithm searches for the position of minimum energy by adjusting each point on the contour, during each iteration, to the lower-energy position among its eight local neighbors. The result, although not always optimal, is comparable to that obtained by variational calculus methods and dynamic programming, and the method is faster. Similar proposals can be found in the works of Vérard et al. (1996) and Jang et al. (1997). Curiously, the results of all these techniques have been demonstrated on magnetic resonance imaging (MRI) images, but this is not a simple coincidence. Accurate segmentation is critical for diagnosis in medical images, yet it is very difficult to extract a contour that exactly matches the target region in MRI images. Integrated methods seem to be a valid solution for achieving an accurate and consistent detection. In the sense of making maximum use of information and cooperation, there exists a set of techniques that do not limit region information to the initialization of the snakes. Information supplied by region segmentation is also included in the snake itself, more specifically in its energy functional. This functional typically has two components: an internal energy component that applies shape constraints to the model, and an external energy derived from the data to which the model is being applied. In this approach, a term derived from region information is added to the external part of the energy functional. As a result, points on the contour are allowed to expand or contract according to the fit between the contour and the region information.
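Whatever terms enter the functional, the minimization itself is often the greedy scheme of Williams and Shah mentioned above. The following is a minimal sketch of one greedy pass over a closed contour, with a single continuity term standing in for the internal energy and an arbitrary weight; the full method also includes curvature and normalization terms.

```python
import numpy as np

def greedy_step(contour, ext_energy, alpha=1.0):
    """One pass of a greedy snake update.  contour: list of (row, col)
    integer points on a closed curve; ext_energy: 2D array that is low
    on edges (e.g., the negated gradient magnitude)."""
    new = list(contour)
    h, w = ext_energy.shape
    for i, (y, x) in enumerate(contour):
        prev = np.array(new[i - 1])  # closed contour: i=0 uses the last point
        best, best_e = (y, x), np.inf
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                cy, cx = y + dy, x + dx
                if not (0 <= cy < h and 0 <= cx < w):
                    continue
                e_cont = np.sum((np.array([cy, cx]) - prev) ** 2)  # internal
                e = alpha * e_cont + ext_energy[cy, cx]            # total
                if e < best_e:
                    best, best_e = (cy, cx), e
        new[i] = best
    return new

# Iterate greedy_step until few points move; the initial contour comes
# from the region-growing result, as in Figure 10.
```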
An exemplary work on these integration methods was developed by Ivins (1996) and Ivins and Porrill (1995). In their implementation of the snake, the energy functional E is specified as

E = (α/2) ∫ |v_s(s)|² ds + (β/2) ∫ |v_ss(s)|² ds - ρ ∬_R G(I(x, y)) dx dy
(7)
The first two terms in Eq. (7) correspond, respectively, to the tension energy and the stiffness energy of the contour model, and together they compose the internal energy. The third term is the external energy derived from the image data. G is a goodness functional that returns a measure of the likelihood that the pixel, indexed by x and y in the image, is part of the region of interest. R is the interior of the contour, and α, β, and ρ are parameters used to weight the three energy terms. Thus, as the energy is minimized, the contour deforms to enclose as many pixels with positive goodness as possible
while excluding those with negative goodness. The seed region serves two purposes: it is used as the initial configuration of the model, and it is used to construct a statistical model of the attributes (e.g., intensity, color, texture) of the data composing the region as a whole, from which the goodness functional is derived. This implementation has been a posteriori revised and modified by Alexander and Buxton (1997) to provide an effective solution to the problem of tracking the boundaries of country lanes in sequences of images from a camera mounted on an autonomous vehicle. Another remarkable work, which is constantly evolving, was carried out by Chakraborty et al. (1994), who applied snakes in biomedical image analysis. The proposal uses a Fourier parameterization to define the dynamic contour: it expresses a curve in terms of an orthonormal basis, which, for most practical situations, is constrained to a limited number of harmonics. The curve is thus represented by a set of corresponding Fourier coefficients

p = (a0, c0, a1, b1, c1, d1, . . .)
(8)
The objective function used is a function of the conditional probability P(p | I_g, I_r), that is, the probability of obtaining the contour p given the region-classified image I_r and the image I_g of the scalar magnitude of the gray-level gradient. The function is the sum of three terms:
M(p, I_g, I_r) = M_prior(p) + M_gradient(I_g, p) + M_region(I_r, p)
(9)
The first (prior) term biases the boundary toward a particular distribution of shapes generated from prior experience, while the second term, M_gradient(I_g, p) (Eq. (10)), depends on the coincidence of the parameterized boundary with the image edges appearing as coherent features in the scalar gradient of the gray levels,
M_gradient(I_g, p) = ∮_{C_p} I_g[x(p, t), y(p, t)] dt
(10)
such that the likelihood that p represents the true boundary is proportional to the sum of the gradient values over all points of C_p. Finally, the term M_region(I_r, p) (Eq. (11)) measures the goodness of match of the contour with the perimeter of the segmented interior of the object. This term rewards the boundary that contains as much of the inside region and as little of the outside as possible. The function is evaluated by integrating over the area A_p bounded by the contour p, as expressed in
M_region(I_r, p) = ∬_{A_p} I_r(x, y) dA
(11)
FIGURE 11. Flow diagram for game-theoretic integration of region-based segmentation and boundary finding proposed by Chakraborty and Duncan (1999). The outputs of each of the modules feed back to each other after every decision-making step. The algorithm stops when none of the modules can improve their positions unilaterally. (Reprinted from IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 1, A. Chakraborty and J. S. Duncan, Game-Theoretic Integration for Image Segmentation, page 16, © 1999 IEEE.)
where pixels inside and outside A_p are set equal to +1 and -1, respectively. Noting that the area integral must be evaluated many times, Chakraborty et al. described an alternative and faster integration method based on Green's theorem. A recent proposal of Chakraborty and Duncan (1999) emphasized the necessity of integration. In this work, a method is proposed to integrate region segmentation and snakes by using game theory, in an effort to form a unified approach. The novelty of the method is that it is a bidirectional framework, whereby both computational modules improve their results through mutual information sharing. It consists of allowing the region and boundary modules to assume the roles of individual players who are trying to optimize their individual cost functions within a game-theoretic framework. The flow of information is restricted to passing only the results of the decisions between the modules. Thus, for any one of the modules, the results of the decisions of the other module are used as priors, and the players try to minimize their cost functions at each turn. The flow diagram for game-theoretic integration is shown in Figure 11. These authors affirm that this approach makes it unnecessary to construct a giant objective function and optimize all the parameters simultaneously.

C. Selection
In the absence of object or scene models or ground truth data, it is critical to have a criterion that enables evaluation of the quality of a segmentation. In
this sense, a set of proposals have used edge information to define an evaluation function that qualifies the quality of a region-based segmentation. The purpose is to obtain different results by changing the parameters and thresholds of a region segmentation algorithm, and then to use the evaluation function to choose the best result. This strategy permits the solution of traditional problems of region segmentation, such as the definition of an adequate stopping criterion or the setting of appropriate thresholds. In 1986, Fua and Hanson developed an algorithm (published in 1987) that used edge information to evaluate region segmentation. In their proposal, high-level domain knowledge and edge-based techniques were used to select the best segmentation from a series of region-based segmented images. From this pioneering proposal, the majority of methods based on the selection approach have been developed, as stated in the following. In 1995, Le Moigne and Tilton proposed choosing a stopping criterion for a region-growing procedure. This criterion is adjusted locally to select the segmentation level that provides the best local match between edge features and region segmentation contours. Figure 12 shows a basic scheme of this proposal. The desired refined segmentation is defined as the region segmentation with minimum-length boundaries that includes all the edges extracted by the Canny edge detector, and for which all contours include some edge pixels. The iteration of
FIGURE 12. Outline of the edge/region integration algorithm proposed by Le Moigne and Tilton (1995). Edge information is used to decide the best region-growing iteration, which provides the best local match of edge features and region boundaries. (Reprinted from IEEE Transactions on GeoScience and Remote Sensing, vol. 33, no. 3, J. Le Moigne and J. C. Tilton, Refining Image Segmentation by Integration of Edge and Region Data, page 606, © 1995 IEEE.)
the region-growing process which minimizes the Hausdorff distance is chosen as the best iteration. The Hausdorff distance measures the distance between two binary images, the edge pixels obtained through the Canny detector, A, and the boundaries of the regions obtained through the region growing, B, and is computed as

H(A, B) = (1/2) [ max_{a∈A} min_{b∈B} d(a, b) + max_{b∈B} min_{a∈A} d(a, b) ]
(12)
where d(a, b) is the point-to-point Euclidean distance. In summary, the distance is computed by finding, for each edge pixel, the closest region boundary pixel and, for each region boundary pixel, the closest edge pixel, and then computing the maxima and minima expressed in Eq. (12). Hojjatoleslami and Kittler (1998) presented a region-based segmentation which uses gradient information to specify the boundary of a region. The method starts with a growing process which is stopped by using the maximum possible size N of a region. Then, a reverse check on the relevant measurements is applied to detect the region boundary. Contrast and gradient are used as sequential discontinuity measurements derived from the region-growing process, whose locally highest values identify the external boundary and the highest-gradient boundary of each region, respectively. Contrast is defined as the difference between the average gray level of the region and the average of the current boundary, and it is calculated continuously. The maximum contrast corresponds to the point where the process has started to grow into the background. Finally, the last maximum gradient measure before the maximum contrast point specifies the best boundary for the region. Siebert (1997) developed an interesting, simple, and fast integration technique in which edge information is used to adjust the criterion function of a region-growing segmentation. For each seed, the algorithm creates a whole family of segmentation results (with different criterion functions) and then, on the basis of the local quality of the region's contour, picks the best one. As a way to measure the segmentation quality, a metric that evaluates the strength of a contour is proposed. The contour strength cs(R) of a region R is defined as the contrast between the two sides of the boundary. More formally, the contour strength is expressed as the sum of the absolute differences between each pixel on the contour of a region and those pixels in the four-neighborhood of the contour points that are not part of the region. Calculating this parameter requires a contour-following task as well as several differences between integer numbers; as the author remarks, these operations are computationally inexpensive. Furthermore, slightly improved results, at higher computational cost, can be expected if the contour strength is based on the gradient at each contour pixel rather than on the intensity difference.
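A minimal sketch of this contour-strength measure follows, assuming the region is given as a boolean mask; the normalization by the number of contour transitions is added here so that regions of different sizes are comparable, whereas the formulation described above simply sums the differences.

```python
import numpy as np

def contour_strength(image, mask):
    """cs(R): mean absolute intensity difference across the region
    contour.  mask is boolean; contour pixels are region pixels with
    at least one 4-neighbor outside the region."""
    total, count = 0.0, 0
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    for y, x in zip(ys, xs):
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                total += abs(float(image[y, x]) - float(image[ny, nx]))
                count += 1
    return total / count if count else 0.0

# Among the family of results grown from one seed, the region that
# maximizes cs(R) is selected as the best segmentation.
```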
A similar methodology can be found in the work of Revol-Muller et al. (2000), in which a region-growing algorithm is proposed for the segmentation of three-dimensional medical images. As in the work described previously, the method consists of generating a region-growing sequence by increasing the criterion function at each step. An evaluation function estimates the quality of each segmented region and permits determination of the optimal threshold. This method is illustrated schematically in Figure 13. These authors proposed different parameters, based on either boundary or region criteria, to be used as the evaluation function. Three choices based on boundary criteria are proposed: (1) the sum of the contrasts of all transition couples (two neighboring pixels located on either side of the boundary are called a transition couple), normalized by the total number of transition couples; (2) the sum of the standard deviations of all members of the boundary and their neighboring pixels not belonging to the segmented region, normalized by the total number of pixels belonging to the boundary; and (3) the sum of the transition levels of all transition couples, normalized by the total number of transition couples. Three alternative choices based on region criteria are proposed: (1) entropy, (2) intercluster variance, and (3) the inverse distance between the gray-level function of the original image and the mean of the region and its complement. Tests on three-dimensional magnetic resonance images demonstrated that the proposed algorithm achieves better results than manual thresholding.
FIGURE 13. Scheme of the method proposed by Revol-Muller et al. (2000). A sequence of segmented regions is obtained by increasing the homogeneity threshold. Then, the evaluation function determines the optimal threshold automatically.
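The selection loop itself is then simple: sweep the homogeneity threshold, evaluate each resulting region, and keep the maximum. The sketch below reuses the contour_strength function from the previous sketch as the evaluation function and assumes a hypothetical grow_region callable; both stand in for the boundary- and region-based criteria enumerated above.

```python
def select_best(image, seed, thresholds, grow_region):
    """Selection loop in the style of Figure 13.  grow_region is an
    assumed callable returning the boolean region mask produced with a
    given homogeneity threshold; contour_strength is the evaluation
    function from the previous sketch."""
    masks = [grow_region(image, seed, t) for t in thresholds]
    scores = [contour_strength(image, m) for m in masks]
    best = max(range(len(masks)), key=scores.__getitem__)
    return thresholds[best], masks[best]
```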
More ideas about the integration of different methods can be found in the work of Hibbard (1998), in which snakes are used to evaluate the quality of a segmentation result. The proposal is based on an iterative region-growing approach, in which at each stage the region of interest grows following a deterministic criterion function based on a hierarchical classifier operating on texture features. At each stage, the optimal contour is determined by using snakes. The optimal choice is the one that best satisfies the three conditions of the objective function proposed in 1994 by Chakraborty et al. (see Section III.B and Eq. (9)). This function is used as a quality measure of the current segmentation and allows the best segmentation to be chosen among the iterations of the growing process. Finally, the resulting contour corresponds to the maximum over all the iteratively computed contours.
IV. SUMMARY

The review of different segmentation proposals that integrate edge information and region information has permitted the identification of different strategies and methods for fusing such information. The aim of this summary is to point out the advantages and disadvantages of these approaches, as well as to remark upon new and interesting ideas that perhaps have not been properly exploited. To provide an overview of the presented methods, Table 1 summarizes the different ways of carrying out the integration of edge and region information. The first column distinguishes the strategy according to the time of fusion: embedded or postprocessing. The second column identifies the approach used to carry out the segmentation. The next two columns describe the problem that the approach tries to solve and the corresponding objective. The last column summarizes the procedure used to perform the segmentation task. As described in Section I.A, embedded integration and postprocessing integration use different principles to perform the task of segmentation. Embedded integration is based on the design of a complex, or a superior, algorithm which uses region and edge information to avoid errors in segmentation. Conversely, the postprocessing strategy accepts faults in the elemental segmentation algorithms, but an a posteriori integration module tries to correct them. The key words that allow characterization and comparison of both strategies are as follows:

• Single algorithm and avoidance of errors, for embedded integration
• Multiple algorithms and correction of errors, for postprocessing integration

These two essential characteristics cause the strategies to exhibit outstanding differences. The first aspect to analyze is the complexity of both
TABLE 1
SUMMARY OF APPROACHES TO IMAGE SEGMENTATION INTEGRATING REGION AND BOUNDARY INFORMATION

Embedded integration: Seed placement
  Problem to solve: The resulting region-based segmentation inevitably depends on the choice of the initial region growth points.
  Objective: Choice of reasonable starting points for region-based segmentation.
  Procedure: Edge information is used to choose a seed (or several seeds) inside the region from which to start the growth.

Embedded integration: Decision criterion
  Problem to solve: The obtained region's shape depends on the particular growth decision criterion chosen.
  Objective: To take into account edge information, together or not with color information, which can be used to decide on the homogeneity of a region.
  Procedure: A region is not homogeneous when there are edges inside it; for this reason, a region cannot grow beyond an edge.

Postprocessing integration: Oversegmentation
  Problem to solve: Uniformity criteria are too strict and generate false boundaries in segmentation.
  Objective: To remove false boundaries that do not coincide with additional information.
  Procedure: Thresholds are set to obtain a first oversegmented result. Next, boundaries that do not exist in the segmentation from a complementary approach are removed.

Postprocessing integration: Boundary refinement
  Problem to solve: Region-based segmentation generates errors at boundaries, and the resulting boundary is highly irregular.
  Objective: To refine the result of region-based segmentation by using edge information, arriving at a more accurate representation.
  Procedure: A region-based segmentation is used to obtain an initial estimate of the region. Next, the optimal boundary coinciding with edges is searched for. This process is generally carried out by using snakes.

Postprocessing integration: Selection
  Problem to solve: There is no criterion enabling the evaluation of the quality of a segmentation.
  Objective: To use edge information to carry out this evaluation in order to choose the best segmentation from a set of results.
  Procedure: The quality of a region segmentation is measured as the correspondence of the boundary with the edge information.
strategies. Embedded integration produces, in general, a more complex algorithm because, as follows from its definition, it endeavors not to commit errors or take wrong decisions. In contrast, the postprocessing strategy can be viewed as a set of many simple algorithms working in parallel and producing many wrong segmentation results; the solution of these problems is deferred to an a posteriori fusion module that refines the results. Postprocessing complexity is therefore lower, because the quantity of information to process decreases: only the results are taken into consideration. Another aspect to analyze is the independence of these integration strategies with respect to their implementation in the segmentation algorithm. In this sense, the embedded strategy is strongly dependent, because it typically implies the design of a new algorithm which incorporates the integration internally. Hence, any change in the integration procedure will imply modification of the algorithm. On the contrary, the postprocessing strategy produces a more general approach because it is independent of the choice of the algorithms used to segment the image. The fusion of the information takes into account only the results of the segmentation algorithms, so the way they are obtained is not important, and it is possible to use any established algorithms. Some researchers (Le Moigne and Tilton, 1995) indicate that postprocessing integration can also be viewed in a general data management framework, in which all incoming data are processed on-line upon acquisition, producing basic features such as edges and regions. However, it is necessary to remark that the independence attributed to the postprocessing strategy is not complete, and this is the weak point of the approach. It is true that it is independent of the chosen methods, but if the results achieved by these algorithms are very poor, postprocessing fails. It is undeniable that a posteriori fusion needs to work on a relatively good set of segmentation results. Therefore, the final segmentation will inevitably depend, to a larger or lesser extent, on the initial results of the segmentation. For example, an inappropriate selection of seeds in a region-growing algorithm is an initial fault that will be carried over into the totality of the segmentation process; a posteriori integration of edge information may not be able to overcome an error of this magnitude.
A. Disadvantages of Both Strategies

Once a set of key proposals integrating edge information and region information has been reviewed, it can be stated that it is not feasible to determine which is the best. This is so for two reasons. First, there is no generally accepted methodology in the field of computer vision which elucidates how to evaluate segmentation algorithms (Ng and Lee, 1996). Second, comparing
different segmentation algorithms with each other is difficult, mainly because they differ in the properties and objectives they try to satisfy and in the image domain in which they work. In this sense, it is well known that no method is available for all images, since the requirements related to the images to be segmented differ (e.g., the requirements for three-dimensional medical image analysis are very different from those for the outdoor color images analyzed by a road-following system). Regarding the weak points of the approaches, a serious difficulty appears when, as is usual, the most significant edges in the image must be obtained. This is not a trivial task: for example, the gradient map presents difficulties regarding the choice of an adequate threshold to achieve a reliable binarization. In this sense, the embedded proposals that use the gradient map directly as boundary information have an important advantage. Another weak point to take into account is the lack of attention that, in general, the reviewed works devote to texture. Without this property, it is not possible to discern whether a high-magnitude gradient corresponds to a boundary between regions or is the response to a textured region. Regrettably, texture is generally forgotten in the different proposals of embedded integration. As a consequence, the algorithms are not adapted to segment heavily textured areas, which results in an oversegmentation of these regions. Segmentation techniques based on postprocessing integration also suffer from some deficiencies. Those based on starting from an oversegmented image must solve a nontrivial problem: what should the threshold be to obtain an oversegmented result? It is well known that images have different characteristics, so this threshold cannot be a fixed value. An adequate threshold for one image may not be effective for others, and as a result boundaries can be irrecoverably lost. An initial mistake in such algorithms can be a serious handicap for the a posteriori fusion, which would yield an undersegmented result. As described in Section III.B, the aim of the boundary refinement approaches is to obtain reliable, smooth boundaries. To achieve this, the cooperation between region-based segmentation and snakes, which is the most usual technique, is a good choice. However, it should be stressed that the objective of these algorithms is, generally, not to segment a whole image but to segment individual objects in an image. Furthermore, these algorithms bear a deficiency that is shared with the third set of postprocessing methods: their exclusive attention to the boundary. Result refinement is restricted to the region boundary, so it is not possible to correct other mistakes inside the region. The same problem is found in the selection approach, in which the boundary-based quality measure of a segmentation considers only the external boundary and not any inner contour lines caused by holes. For this reason, the regions extracted may contain holes. In summary, all these weak points of postprocessing integration reaffirm the previous assertion about
the necessity of having good initial segmentation results and the incapacity of the postprocessing strategy to correct some initial mistakes.

V. CONCLUSIONS AND FURTHER WORK

In this article we reviewed some key segmentation techniques integrating region information and boundary information. Special emphasis was given to the strategy used to carry out the integration process. In this sense, a classification of cooperative segmentation techniques was proposed, and several algorithms were described with the aim of pointing out their strengths and weaknesses. The lack of special treatment of textured images was noticed, and this remains one of the great problems of segmentation (Deng et al., 1999). If an image mainly contains homogeneous color regions, traditional methods of segmentation working in color spaces can be sufficient to achieve reasonable results. However, some real images "suffer" from texture, for example, images corresponding to natural scenes, which show considerable variety of color and texture. Hence, texture undoubtedly has a pivotal role to play in image segmentation, and new and promising research has started on the integration of color and texture (Mirmehdi and Petrou, 2000). The intention of integrating complementary information from the image may be taken further: it seems reasonable to think that a considerable improvement in segmentation could result from the fusion of color, texture, and boundary information. Concerning the strategies for the integration of edge information and region information, it is obvious that there are still methods to be explored. In this sense, a hybrid strategy between embedded and postprocessing integration may be a solution for some of the previously mentioned weak points. A basic scheme of such an idea is presented in Figure 14, where an algorithm based on an embedded strategy produces an initial result, which is then a posteriori refined
FIGURE 14. Hybrid strategy scheme for region and boundary integration.
by a postprocessing fusion with boundary information. More specifically, the first step of this new proposal consists of the extraction of edge and region information; an embedded algorithm permits the adequate placement of the seeds in optimal positions, so that a result with regions free of holes can be obtained. Then, an a posteriori fusion with boundary information refines the segmentation, improving the resulting boundaries. This proposal combines both strategies in a hybrid scheme that uses the integration of information in all steps of the segmentation, with the aim of obtaining a better segmentation result. Segmentation techniques, in general, still require considerable improvement. The surveyed techniques still present some faults, and no perfect segmentation algorithm exists, although such an algorithm would be vital for the advancement of computer vision and its applications. Nevertheless, the integration of region and boundary information has allowed the improvement of previous results, and current work in this field of research has generated numerous proposals. This current interest permits us to foresee that further work on and improvement of segmentation will be focused on the integration of algorithms and information.
ACKNOWLEDGMENTS

This work has been partially developed thanks to the support of the Departament d'Universitats, Recerca i Societat de la Informació de la Generalitat de Catalunya.
REFERENCES

Adams, R., and Bischof, L. (1994). Seeded region growing. IEEE Trans. Pattern Anal. Machine Intell. 16(6), 641-647.
Alexander, D. C., and Buxton, B. F. (1997). Implementational improvements for active region models, in British Machine Vision Conference, Colchester, UK, 1997.
Bellet, F., Salotti, M., and Garbay, C. (1994). Low level vision as the opportunist scheduling of incremental edge and region detection processes, in International Conference on Pattern Recognition, Jerusalem, Israel, September 1994, Vol. A, pp. 517-519.
Benois, J., and Barba, D. (1992). Image segmentation by region-contour cooperation for image coding, in International Conference on Pattern Recognition, The Hague, Netherlands, August 1992, Vol. C, pp. 331-334.
Bertolino, P., and Montanvert, A. (1996). Coopération régions-contours multirésolution en segmentation d'image, in Congrès AFCET de Reconnaissance des Formes et Intelligence Artificielle, Rennes, France, January 1996, Vol. 1, pp. 299-307.
Bezdek, J. C., Keller, J., Krisnapuram, R., and Pal, N. R. (1999). Fuzzy Models and Algorithms for Pattern Recognition and Image Processing. Boston, MA: Kluwer Academic.
Bonnin, P., Blanc Talon, J., Hayot, J. C., and Zavidovique, B. (1989). A new edge point/region cooperative segmentation deduced from a 3D scene reconstruction application, in SPIE Applications of Digital Image Processing XII, Vol. 1153, pp. 579-591.
Buvry, M., Senard, J., and Krey, C. (1997). Hierarchical region detection based on the gradient image, in Scandinavian Conference on Image Analysis, Lappeenranta, Finland, June 1997, Vol. 2, pp. 717-724.
Buvry, M., Zagrouba, E., and Krey, C. J. (1994). A rule-based system for region segmentation improvement in stereovision, in Society of Photo-Optical Instrumentation Engineers, San Jose, California, February 1994, Vol. 2182, Image and Video Processing II, pp. 357-367.
Chakraborty, A., and Duncan, J. S. (1999). Game-theoretic integration for image segmentation. IEEE Trans. Pattern Anal. Machine Intell. 21(1), 12-30.
Chakraborty, A., Staib, L. H., and Duncan, J. S. (1994). Deformable boundary finding influenced by region homogeneity, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, Washington, June 1994, Vol. 94, pp. 624-627.
Chan, F. H. Y., Lam, F. K., Poon, P. W. F., Zhu, H., and Chan, K. H. (1996). Object boundary location by region and contour deformation. IEE Proc. Vision Image Signal Processing 143(6), 353-360.
Chen, P. C., and Pavlidis, T. (1980). Image segmentation as an estimation problem. Comput. Graphics Image Processing 12, 153-172.
Chu, C. C., and Aggarwal, J. K. (1993). The integration of image segmentation maps using region and edge information. IEEE Trans. Pattern Anal. Machine Intell. 15(12), 1241-1252.
Cufí, X., and Casals, A. (1996). A criterion for circumscribed contour determination for image segmentation, in IAPR Workshop on Machine Vision Applications, Tokyo, Japan, November 1996, pp. 309-312.
Cufí, X., Muñoz, X., Freixenet, J., and Martí, J. (2000). A concurrent region growing algorithm guided by circumscribed contours, in International Conference on Pattern Recognition, Barcelona, Spain, September 2000, Vol. I, pp. 432-435.
Deng, Y., Manjunath, B. S., and Shin, H. (1999). Color image segmentation, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Ft. Collins, Colorado, June 1999, Vol. 2, pp. 446-451.
Falah, R. K., Bolon, P., and Cocquerez, J. P. (1994). A region-region and region-edge cooperative approach of image segmentation, in International Conference on Image Processing, Austin, Texas, October 1994, Vol. 3, pp. 470-474.
Fjørtoft, R., Cabada, J., Lopès, A., Marthon, P., and Cubero-Castan, E. (1997). Complementary edge detection and region growing for SAR image segmentation, in Conference of the Norwegian Society for Image Processing and Pattern Recognition, Tromsø, Norway, May 1997, Vol. 1, pp. 70-72.
Fu, K. S., and Mui, J. K. (1981). A survey on image segmentation. Pattern Recogn. 13, 3-16.
Fua, P., and Hanson, A. J. (1987). Using generic geometric models for intelligent shape extraction, in National Conference on Artificial Intelligence, Seattle, Washington, July 1987, pp. 706-711.
Fukada, Y. (1980). Spatial clustering procedures for region analysis. Pattern Recogn. 12, 395-403.
Gagalowicz, A., and Monga, O. (1986). A new approach for image segmentation, in International Conference on Pattern Recognition, Paris, France, November 1986, pp. 265-267.
Gambotto, J. P. (1993). A new approach to combining region growing and edge detection. Pattern Recogn. Lett. 14, 869-875.
Gevers, T., and Smeulders, A. W. M. (1997). Combining region growing and edge detection through guided Delaunay image subdivision, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Puerto Rico, June 1997, pp. 1021-1026.
Haddon, J. F., and Boyce, J. F. (1990).
Image segmentation by unifying region and boundary information. IEEE Trans. Pattern Anal. Machine Intell. 12(10), 929-948.
Haralick, R. M., and Shapiro, L. G. (1992-1993). Computer and Robot Vision, Vols. 1 and 2. Reading, MA: Addison-Wesley.
Haralick, R. M., and Shapiro, L. G. (1985). Image segmentation techniques. Comput. Vision Graphics Image Processing 29, 100-132.
Hibbard, L. S. (1998). Maximum a posteriori segmentation for medical visualization, in IEEE Workshop on Biomedical Image Analysis, Santa Barbara, California, June 1998, pp. 93-102.
Hojjatoleslami, S. A., and Kittler, J. (1998). Region growing: a new approach. IEEE Trans. Image Processing 7(7), 1079-1084.
Ivins, J. (1996). Statistical snakes: active region models. Doctoral thesis, University of Sheffield, UK.
Ivins, J., and Porrill, J. (1995). Active-region models for segmenting textures and colors. Image Vision Comput. 13(5), 431-438.
Jang, D. P., Lee, D. S., and Kim, S. I. (1997). Contour detection of hippocampus using dynamic contour model and region growing, in International Conference of the IEEE Engineering in Medicine and Biology Society, Chicago, Illinois, October 1997, pp. 763-766.
Kass, M., Witkin, A., and Terzopoulos, D. (1987). Snakes: active contour models, in International Conference on Computer Vision, London, UK, June 1987, pp. 259-268.
Kittler, J., and Illingworth, J. (1985). On threshold selection using clustering criteria. IEEE Trans. Syst. Man Cybernet. 15, 652-655.
Kohler, R. (1981). A segmentation system based on thresholding. Comput. Vision Graphics Image Processing 15, 319-338.
Kong, S. G., and Kosko, B. (1992). Image coding with fuzzy image segmentation, in International Conference on Fuzzy Systems, San Diego, California, March 1992, Vol. I, pp. 213-220.
Krishnan, S. M., Tan, C. S., and Chan, K. L. (1994). Closed-boundary extraction of large intestinal lumen, in International Conference of the IEEE Engineering in Medicine and Biology Society, Baltimore, Maryland, November 1994.
Lambert, P., and Carron, T. (1999). Symbolic fusion of luminance-hue-chroma features for region segmentation. Pattern Recogn. 32(11), 1857-1872.
Lee, H. C., and Cok, D. R. (1991). Detecting boundaries in a vector field. IEEE Trans. Signal Processing 39, 1181-1194.
Le Moigne, J., and Tilton, J. C. (1995). Refining image segmentation by integration of edge and region data. IEEE Trans. Geosci. Remote Sensing 33(3), 605-615.
Mirmehdi, M., and Petrou, M. (2000). Segmentation of color textures. IEEE Trans. Pattern Anal. Machine Intell. 22(2), 142-159.
Moghaddamzadeh, A., and Bourbakis, N. (1997). A fuzzy region growing approach for segmentation of color images. Pattern Recogn. 30(6), 867-881.
Moulet, D., and Barba, D. (1988). Temporal following of spatial segmentation in image sequences, in European Signal Processing Conference, Grenoble, France, September 1988, pp. 39-42.
Nair, D., and Aggarwal, J. K. (1996). A focused target segmentation paradigm, in European Conference on Computer Vision, Cambridge, UK, April 1996, Vol. A, pp. 579-588.
Nevatia, R. (1986). Image segmentation, in Handbook of Pattern Recognition and Image Processing, Vol. 86, pp. 215-231.
Ng, W. S., and Lee, C. K. (1996). Comment on using the uniformity measure for the performance measure in image segmentation. IEEE Trans. Pattern Anal. Machine Intell. 18(9), 933-934.
Pal, N. R., and Pal, S. K. (1993). A review on image segmentation techniques. Pattern Recogn. 26(9), 1277-1294.
Pavlidis, T., and Liow, Y. (1990). Integrating region growing and edge detection. IEEE Trans. Pattern Anal. Machine Intell. 12(3), 225-233.
Petrou, M., and Bosdogianni, P. (1999). Image Processing: The Fundamentals. New York: Wiley.
A REVIEW OF INTEGRATED IMAGE SEGMENTATION TECHNIQUES
39
Pham, D. L., and Prince, J. L. (1999). An adaptive fuzzy c-means algorithm for image segmentation in the presence of intensity inhomogeneities. Pattern Recogn. Lett. 21(1), 57-68. Philipp, S., and Zamperoni, P. (1996). Segmentation and contour closing of textured and non-textured images using distances between textures, in International Conference on Image Processing, Lausanne, Switzerland, September 1996, Vol. C. pp. 125-128. Prewitt, J. (1970). Object enhancement and extraction, in Picture Processing and Psychopictorics, edited by B. Lipkin and A. Roselfed. New York: Academic Press, pp. 75-149. Revol-Muller, C., Peyrin, E, Odet, C., and Carillon, Y. (2000). Automated 3D region growing algorithm governed by an evaluation function, in International Conference on Image Processing, Vancouver, Canada. September 2000, Vol. III. pp. 440-443. Riseman, E. M., and Arbib, M. A. (1977). Computational techniques in the visual segmentation of static scenes. Comput. Graphics Image Processing 6(3), 221-276. Roberts, L. G. (1965). Machine perception of three-dimensional solids, in Optical and ElectroOptical Information Processing, edited by J. Tippet, D. Berkowitz, L. Clapp, C. Koester, and A. Vanderburgh. Cambridge, MA: MIT Press, pp. 159-197. Rosenfeld, A., and Kak, A. (1982). Digital Picture Processing, 2nd ed., Vol. 2. Orlando, FL: Academic Press. Saber, E., Tekalp, A. M., and Bozdagi, G. (1997). Fusion of color and edge information for improved segmentation and edge linking. Image Vision Comput. 15(10), 769-780. Salotti, M., and Garbay, C. (1992). A new paradigm for segmentation, in International Conference on Pattern Recognition. The Hague, Netherlands, August, 1992, Vol. C. pp. 611-614. Sato, M., Lakare, S., Wan, M., Kaufman, A., and Nakajima, M. (2000). A gradient magnitude based region growing algorithm for accurate segmentation, in International Conference on Image Processing, Vancouver, Canada, September 2000, Vol. III. pp. 448-451. Siebert, A. (1997). Dynamic region growing, in Vision Interface, Kelowna, Canada, May 1997. Sinclair, D. (1999). Voronoi seeded colour image segmentation. Technical Report No. 3, AT&T Laboratories, Cambridge, UK. Sobel, I. E. (1970). Camera models and machine perception. Doctoral thesis, Electrical Engineering Department, Stanford University, Stanford, CA. Steudel, A., and Glesner, M. (1999). Fuzzy segmented image coding using orthonormal bases and derivative chain coding. Pattern Recogn. 32(11), 1827-1841. V6rard, L., Fadili, J., Ruan, S., and Bloyet, D. (1996). 3D MRI segmentation of brain structures, in International Conference of the IEEE Engineering in Medicine and Biology Society, Amsterdam, Netherlands, November 1996. pp. 1081-1082. Williams, D. J., and Shah, M. (1992). A fast algorithm for active contours and curvature estimation. Comput. Vision Graphics Image Processing. Image Understanding 1(55), 14-26. Wrobel, B., and Monga, O. (1987). Segmentation d'images naturelles: coop6ration entre un d6tecteur, contour et un d6tecteur-r6gion, in Symposium on Signal and Image Processing, Nice, France, June 1987. Xiaohan, Y., Yla-Jaaski, J., Huttunen, O., Vehkomaki, T., Sipild, O., and Katila, T. (1992). Image segmentation combining region growing and edge detection, in International Conference on Pattern Recognition, The Hague, Netherlands, August 1992, Vol. C. pp. 481-484. Zucker, S. W. (1976). Region growing: childhood and adolescence. Comput. Graphics Image Processing 5, 382-399. Zucker, S. W. (1977). 
Algorithms for image segmentation, in Digital Image Processing and Analysis. pp. 169-183.
This Page Intentionally Left Blank
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 120
Mirror Corrector for Low-Voltage Electron Microscopes

P. HARTEL,¹ D. PREIKSZAS,² R. SPEHR, H. MÜLLER,¹ AND H. ROSE

Darmstadt University of Technology, Institute of Applied Physics, D-64289 Darmstadt, Germany
I. Introduction
II. General Considerations
   A. Necessity for and Usefulness of the Correction
   B. Chromatic and Spherical Aberration Correction with an Electron Mirror
   C. Requirements for the Beam Separator
III. The Spectromicroscope "SMART"
   A. Overall Construction
   B. Experimental Possibilities
      1. Photon Illumination
      2. Electron Illumination
   C. Calculation of Performance
      1. Electron Mirror
      2. Beam Separator
      3. Determination of Resolution
      4. Accuracy and Stability Requirements
IV. Mechanical Design of the Mirror Corrector
   A. Beam Separator
      1. Pole Plates and Coils
      2. Framework of the Beam Separator
   B. Field Lenses and Electron Mirror
   C. Multipoles
      1. Electric-Magnetic Multipole Elements
      2. Additional Magnetic Deflection Elements
V. Testing of the Mirror Corrector
   A. Measurement Arrangement
   B. Improvement of Magnetic Shielding
   C. Field Lenses
   D. Beam Separator
   E. Determination of the Chromatic and Spherical Aberration Coefficients
   F. The Electron Mirror
VI. Conclusion
Appendix: Addition of Refractive Powers in the Two-Lens System
References
¹Current address: CEOS GmbH, D-69126 Heidelberg, Germany.
²Current address: LEO Elektronenmikroskopie GmbH, D-73446 Oberkochen, Germany.
I. INTRODUCTION
In the twenty-first century, ever-increasing demands are being placed on the spatial resolution of modern electron microscopes in connection with spectral information. Many users are interested in the atomic composition of complex nonperiodic structures such as grain boundaries. Moreover, they would like information on the chemical binding states of the elements. One goal, for example, is the differentiation of diamond and graphite. This can be achieved only by the insertion of an imaging energy filter and an adequate monochromatic or monochromatized source. The spatial resolution of conventional electron microscopes is mainly limited by the chromatic and spherical aberrations of the objective lens. In 1936 Scherzer showed that, under the assumptions of (1) static, (2) axially symmetric electromagnetic fields, (3) a space-charge-free beam region, and (4) an axial velocity component of the electrons that does not change its sign, off-axis or too-slow electrons are always deflected too strongly (Scherzer, 1936). Both aberrations can be minimized at a given focal length. Numerous theoretical investigations have shown that relinquishing one of these four conditions suffices for one to obtain a system free of chromatic and spherical aberrations (Scherzer, 1936). Thus Zworykin et al. proposed in 1945 a system for the correction of chromatic and/or spherical aberration with the aid of an electron mirror. Scherzer, in 1947, presented a nonround lens system, a space-charge device, and a high-frequency lens, each of which in principle enabled the correction of both aberrations. However, the experimental implementation of these proposals turned out to be extremely difficult in practice. It is true that several groups succeeded in demonstrating the correction of aberration, but the resolution, as in the example of the Darmstadt corrector (Bernhard, 1980), could not be improved. The investigations foundered, either as a result of inadequate electrical and mechanical stability or as a result of inadequate adjustment possibilities and procedures. An improvement of resolution with the aid of a corrector was first shown by Zach and Haider in 1995. On the basis of theoretical investigations by Rose (1971), Zach (1989) developed a new concept for a corrected low-voltage scanning electron microscope. In this case, an electrostatic immersion lens served as the objective lens, which permitted the detection of backscattered and secondary electrons inside the objective lens. The spherical and chromatic aberrations could be optimally adjusted by a combination of electrostatic and magnetic quadrupoles and octopoles. The resolution was improved from 6 nm to better than 2.5 nm at a primary electron energy of 1 keV at the specimen (Zach and Haider, 1995). Two years later, Haider, at the European Molecular Biology Laboratory in Heidelberg, successfully corrected the spherical aberration of a transmission
electron microscope (TEM). The possibility of correcting the spherical aberration by means of hexapoles was first mentioned by Beck (1979). A feasible corrector, consisting of two hexapoles and transfer lenses, was proposed by Rose (1981) and later further improved (Rose, 1990). With the aid of this corrector, the point resolution limit of a commercial 200-keV transmission electron microscope was improved from 2.6 to 1.2 Å (Haider, Rose, et al., 1998; Haider, Uhlemann, et al., 1998). In 2001, Krivanek succeeded in correcting the spherical aberration of a scanning transmission electron microscope with the aid of a quadrupole-octopole arrangement (Dellby et al., 2001). With the microscope operating at an accelerating voltage of 100 kV, a point resolution of 1.23 Å could be demonstrated. Up to now, there existed no working corrector for direct imaging in the low-voltage region. This gap was closed with an electron mirror (Preikszas and Rose, 1997; Rose and Preikszas, 1992, 1995). Unlike the quadrupole-octopole corrector, this device allows the transfer of a large image field. At the same time, the technological risk of slowing low-energy electrons is low. This correction principle is put forward in Section II. Previously Rempfer et al. (1997) were able to demonstrate the correction of chromatic and spherical aberrations by a hyperbolic electron mirror purely and simply on an electron-optical bench. An improvement in resolution with an electron mirror has, however, not yet been achieved. One of the main problems in correcting with an electron mirror is the separation of the incoming electrons from those reflected by means of a magnetic beam separator that must not limit the resolution. The design of a beam separator that satisfies these requirements has been pursued by the working group of Rose for several years (Degenhardt, 1992; Müller et al., 1999; Rose and Preikszas, 1992). Preikszas has put forward a promising concept for a corrected, direct-imaging low-voltage electron microscope with an energy filter (Preikszas, 1995), which is described in detail in Fink et al. (1997). Within the framework of the project entitled "Highest Resolution Electron Spectromicroscopy with Tunable XUV Radiation," which is financially supported by the German Federal Ministry of Education and Research, this design is at present being realized in collaboration with the University of Würzburg, the Technological University of Clausthal, the Fritz Haber Institute in Berlin, and the company LEO Elektronenmikroskopie GmbH. The microscope should be valuable for applications in surface physics. A spatial resolution down to 1 nm and an energy resolution of at best 0.1 eV for electrons from around 0 to 2000 eV is being striven for. The specimen can be illuminated with electrons or with photons. The numerous modes are consolidated in Fink et al. (1997) and described in Section III. A further enormous advantage of the aberration correction lies in the increase of the electron yield. For a given spatial resolution, compared with the aperture size of a conventional electron microscope,
the aperture size of the new microscope can be made appreciably larger, which results in a fourfold increase in intensity. The beam separator and the electron mirror have already been constructed and assembled. The mechanical setup is described in Section IV. The testing of the components was carried out in a conventional scanning electron microscope (see Section V). The important steps for the calculation of the electron optics are summarized in Section III.C.
II. GENERAL CONSIDERATIONS
A. Necessity for and Usefulness of the Correction

The resolving power of electron microscopes with adequate mechanical and electrical stability is limited by imaging aberrations. In a rotationally symmetric objective lens, the axial chromatic aberration of first order and degree and the spherical aberration of third order limit the attainable resolution of the microscope. This terminology arises from the fact that the radial deviation from the ideal image point in the Gaussian image plane is expanded in a power series according to the slope of the electrons with respect to the optic axis, their distances from the axis, and the relative energy spread κ = ΔE/Eo from the initial energy Eo. The sum of the exponents of slope and axial distance is designated as order, while the degree denotes the power of the energy or chromatic term. The coefficients of the power series are called aberration coefficients. The starting point of the aberration expansion is thus a series of linearly independent solutions of the linearized equations of motion along the axis. These solutions are characterized as fundamental trajectories or paraxial rays. The initial conditions are usually so chosen that, in an image plane, the four solutions at two perpendicular cross sections indicate either an initial slope (axial rays) or an initial deflection (field rays). The Gaussian image planes are defined as the intersections of the axial ray with the optic axis. The action of the chromatic aberration is clarified in Figure 1. The chromatic aberration represents an energy-dependent defocusing. As a consequence of the Scherzer (1936) theorem, in static, space-charge-free round lenses, too-slow electrons are always focused in front of the Gaussian image plane, and too-fast electrons always behind it. This leads to an unsharpness of the image in a fixed plane, depending on the energy distribution of the electrons. Each point of the object plane is imaged in a disk with radius rc = αo κ Cc, where αo is the maximum aperture angle, κ is a suitably chosen measure of the relative half-width of the energy distribution, and Cc denotes the chromatic aberration coefficient. The power series expansion is carried out with respect to the variables in the object plane.
FIGURE 1. The chromatic aberration of round lenses acts in such a way that electrons with higher energy (E > En) than the nominal energy are focused more weakly and electrons with lower energy (E < En) are focused more strongly. Consequently, a point on the object is imaged as a spot--the chromatic aberration disk.
As shown in Figure 2, off-axial rays, according to the prescription of the Scherzer theorem, are always more strongly refracted than paraxial rays. In the Gaussian image plane, therefore, an image of a point results in an aberration disk of radius rs = αo³ Cs, where Cs denotes the spherical aberration coefficient. The adverse consequences of spherical aberration can be diminished by a suitable defocusing Δf. In practice, the effective radius of the aberration disk is about half the size of that in the Gaussian image plane. The attainable resolution of a microscope is additionally limited by diffraction at the smallest aperture, in practice by the objective aperture. A self-luminous point appears in the image as an Airy disk of radius rd = 0.61λ/αo, where λ is the de Broglie wavelength of the electron. In the nonrelativistic case
FIGURE 2. Effect of spherical aberration. Off-axial rays in round lenses are always more strongly refracted than paraxial rays, which define the Gaussian image plane. The size of the spherical aberration disk can be significantly reduced by the choice of a suitable defocus Δf.
the relation λ/nm = √(1.5 eV/Eo) holds. As a rule of thumb, the quadratic mean of the aberration disk and the diffraction disk yields a good upper estimate of the attainable resolution

d = \sqrt{r_d^2 + r_c^2 + \left(\tfrac{1}{2} r_s\right)^2} = \sqrt{\frac{0.61^2 \lambda^2}{\alpha_o^2} + \alpha_o^2 \kappa^2 C_c^2 + \tfrac{1}{4}\, \alpha_o^6 C_s^2}    (1)
with an uncorrected electron microscope. An objective that is suitable for a low-energy electron microscope must possess at least one electrostatic immersion lens. A significant improvement can be achieved by superimposing a magnetic field. This acts in such a way that the electrons inside the lens stay closer to the axis (Preikszas and Rose, 1995). In Preikszas (1995) an electrostatic objective is compared with a combined electrostatic and magnetic objective. For an initial energy Eo = 100 eV, the object-side aberration coefficient Cc drops from 75 to 24 μm and Cs from 250 to 84 μm. Figure 3 shows the resolution limit d of the electromagnetic objective, for an initial energy of the electrons of 100 eV, as a function of the aperture angle α. An energy spread ΔEo = 0.5 eV was chosen. The best resolution of 4.3 nm is obtained at an aperture angle of 23 mrad. The resolution is principally limited by diffraction and chromatic aberration. The spherical aberration first becomes dominant at angles greater than 50 mrad.

FIGURE 3. Resolution limit d as a function of object-side aperture angle α for an initial energy of the electrons of Eo = 100 eV with an energy width ΔEo = 0.5 eV. For the relevant chromatic and spherical aberration coefficients, the values Cc = 24 μm and Cs = 84 μm were assumed. (Curves shown: diffraction limit, chromatic aberration, spherical aberration, and their quadratic mean.)
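Equation (1) and the parameters just quoted are easy to check numerically. The following sketch is our own illustration, not part of the original text; in particular, the choice κ = ΔEo/Eo as the measure of the relative energy width is an assumption, but with it the quoted optimum of about 4.3 nm near an aperture angle of roughly 23 mrad is reproduced.

```python
import numpy as np

# Contributions to Eq. (1) for the combined objective: E0 = 100 eV,
# dE0 = 0.5 eV, Cc = 24 um, Cs = 84 um (all lengths in nm below).
E0, dE0 = 100.0, 0.5
Cc, Cs = 24e3, 84e3
lam = np.sqrt(1.5 / E0)                 # de Broglie wavelength in nm
kappa = dE0 / E0                        # relative energy width (assumed measure)

alpha = np.linspace(5e-3, 60e-3, 2000)  # aperture angle in rad
r_d = 0.61 * lam / alpha                # diffraction disk
r_c = alpha * kappa * Cc                # chromatic aberration disk
r_s = 0.5 * alpha**3 * Cs               # spherical disk, halved by defocusing
d = np.sqrt(r_d**2 + r_c**2 + r_s**2)   # quadratic mean of Eq. (1)

i = d.argmin()
print(f"best d = {d[i]:.1f} nm at alpha = {1e3 * alpha[i]:.0f} mrad")
# prints approximately: best d = 4.3 nm at alpha = 24 mrad
```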
If one strives for an improved resolution d′ = 1 nm, the aperture angle must be increased, since the diffraction limit must clearly be raised. Under the assumption that all contributions have equal values, r′d = r′c = r′s = d′/√3, the aperture angle must be at least α′ = 129 mrad. Beyond this angle, the intended resolution can be obtained if the aberration coefficients take the values

C_c' = \frac{d'}{\sqrt{3}\,\kappa\,\alpha'} = 0.89\ \mu\text{m} \qquad\text{and}\qquad C_s' = \frac{d'}{\sqrt{3}\,\alpha'^3} = 0.27\ \mu\text{m}    (2)
These values lie around two orders of magnitude below those of this objective lens. Such a reduction of the aberration coefficients by an optimization of the lens geometry is impossible for a fixed focal length. This means that without correction no appreciable improvement can be achieved. If one wants to gain a factor g in resolution, d′ = d/g, assumed to be diffraction limited, the aperture angle must be increased to α′ = gα. Since

\frac{C_c'}{C_c} = \frac{r_c'}{r_c}\,\frac{\alpha}{\alpha'} = \frac{1}{g^2} \qquad\text{and}\qquad \frac{C_s'}{C_s} = \frac{r_s'}{r_s}\,\frac{\alpha^3}{\alpha'^3} = \frac{1}{g^4}    (3)
the chromatic aberration coefficient must be reduced by g² and the spherical aberration coefficient by g⁴. The second important objective of correction, which is less demanding, is raising the electron yield for a given resolution d′ = d. The intensity in a homogeneous beam increases quadratically with the aperture angle. This makes it possible to record micrographs that under normal conditions would be unusable because of inadequate signal-to-noise ratios or excessive exposure times. In this application, the requirements on the accuracy of aberration elimination are reduced by a factor g, since for a fixed resolution d′ = d, r′c = rc and r′s = rs. The great advantage of this application of a corrector is that the stability of the basic microscope need not be improved.
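The numbers in Eqs. (2) and (3) can be verified with the same kind of arithmetic. The sketch below is again our own check, with κ = 0.005 assumed as in the previous example.

```python
import numpy as np

d1 = 1.0                       # target resolution d' in nm
lam = np.sqrt(1.5 / 100.0)     # wavelength at E0 = 100 eV, in nm
kappa = 0.005                  # relative energy width (assumption, as above)

# Equal contributions r_d' = r_c' = r_s' = d'/sqrt(3):
alpha1 = 0.61 * lam * np.sqrt(3.0) / d1        # from r_d' = 0.61*lam/alpha'
Cc1 = d1 / (np.sqrt(3.0) * kappa * alpha1)     # from r_c' = alpha'*kappa*Cc'
Cs1 = d1 / (np.sqrt(3.0) * alpha1**3)          # from r_s' = alpha'^3*Cs'
print(f"alpha' = {1e3 * alpha1:.0f} mrad, "
      f"Cc' = {Cc1 / 1e3:.2f} um, Cs' = {Cs1 / 1e3:.2f} um")
# approximately: alpha' = 129 mrad, Cc' = 0.89 um, Cs' = 0.27 um

# Scaling of Eq. (3): gaining a factor g in a diffraction-limited resolution
g = 4.3                        # e.g. from 4.3 nm down to 1 nm
print(f"Cc must shrink by g^2 = {g**2:.0f}, Cs by g^4 = {g**4:.0f}")
```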
B. Chromatic and Spherical Aberration Correction with an Electron Mirror

With an electron mirror, chromatic and spherical aberrations of a round lens can be compensated for. Although the mirror itself is rotationally symmetric, it does not fall foul of the implications of the Scherzer theorem, since the axial velocity of the electrons changes its sign. The mathematical basis for the inapplicability of the Scherzer theorem is the following: For an electron that enters the mirror along the axis, the electric potential on the axis Φ(z) must assume the value zero at the point of reversal zt, if the gauge of the potential is selected so that with the vanishing velocity v = 0 of an electron with nominal energy, the potential also takes on the value zero. In the derivation of the positive-definite integral expression for the chromatic and spherical aberration coefficients, the continuity of 1/Φ(z) is required, which is not true at the reversal point.
The light-optical analogue of the electron mirror explains the principle of the correction. One can consider the equipotential surface φ(r) = φn = 0, at which the kinetic energy of an electron with nominal energy En vanishes, as the surface of a light-optical mirror. This provides a good description of the actual imaging behavior in the neighborhood of this equipotential surface, since the electrical field strength, the negative gradient of the potential, is perpendicular to the surface. The light-optical law of reflection is approximately fulfilled.
FIGURE 4. Principle of chromatic and spherical aberration correction with an electron mirror. (a) Ideal mirror; (b) chromatic correction; (c) correction of spherical aberration. (Reprinted from Journal of Electron Spectroscopy and Related Phenomena, 84, No 1-3, Fink et al., SMART: a planned ultrahigh-resolution spectromicroscope for BESSY II, pp. 231-250, 1997, with permission from Elsevier Science)
For the ideal imaging of a point on the optic axis in the image plane zi, the equipotential, as shown in Figure 4a, must be a sector of a spherical surface with the midpoint on the optic axis in the plane zi. If one wants to compensate for the chromatic aberration of a round lens, faster electrons must be focused more strongly and slower electrons more weakly. Electrons of higher energy penetrate deeper into the mirror and are reflected at another equipotential φ < φn. If these, as shown in Figure 4b, exhibit a greater curvature than the equipotential φ = φn, the electron trajectories cut the optic axis in front of the Gaussian image plane zi. Electrons of lower energy penetrate less deeply into the mirror and must, in order to be more weakly focused, encounter an equipotential of weaker curvature. Under these conditions, the mirror possesses a negative chromatic aberration coefficient. For correction of the spherical aberration, one must take care that the off-axial rays are more weakly refracted. This is the case if the curvature of the equipotential surface φ = φn becomes smaller with increasing distance from the axis, as can be seen in Figure 4c. The form of the reversing equipotential surfaces is enormously affected by the geometry of the mirror electrodes. In Figure 5 a suitable four-electrode configuration is shown that meets the
FIGURE 5. Schematic arrangement of an electron mirror. By means of the three freely choosable voltages U0, U1, and U2, the focal length, the chromatic aberration, and the spherical aberration can be set. The electrodes in the neighborhood of the reversal point of the electrons are so shaped that the adjustment range for the chromatic and spherical aberration coefficients is predominantly negative.
preceding requirements. For a fixed column voltage Uc, three freely chosen potentials U0, U1, and U2 are available, with which the focal length f as well as the chromatic and spherical aberration coefficients Cc and Cs can be adjusted over a broad range. As already explained, with the aid of an electron mirror the chromatic and the spherical aberrations of a round lens system can, in principle, be compensated for. The insertion of a mirror for correction presumes that the incoming electron bundle can be separated from the reflected bundle by a so-called beam separator. For this purpose, magnetic fields are necessary since the direction of the Lorentz force F = qv × B, unlike the action of an electric field, depends on the sign of the velocity. The electron-optical requirements for a beam separator that, in conjunction with an electron mirror, forms a thoroughly reliable corrector are clarified in Section II.C. Figure 6 shows, schematically, the overall construction of electron microscopes with a mirror corrector. An important application of mirror correctors
FIGURE 6. Schematic construction of a mirror corrector. So that the incoming electrons entering the mirror can be separated from those reflected, a magnetic beam separator is required. The corrector can be incorporated in a direct-imaging low-energy electron microscope (LEEM) or in a scanning electron microscope (SEM).
is the low-energy region with electron energies below 15 keV; in this region the technological expense for insulation and for voltage supplies for the mirror electrodes is still small. One can correct direct-imaging low-energy electron microscopes (LEEMs) as well as scanning electron microscopes (SEMs). With an SEM a smaller scanning spot is produced, which is no longer limited by chromatic and spherical aberrations. For this purpose, the mirror must be placed in the ray path before the probe-forming objective lens. This means that the electron bundle emitted from the source is precompensated at the mirror after a single passage through the beam separator, so that after a further passage through the beam separator the objective lens will form a smaller probe. In a direct-imaging microscope, the ray path is the reverse of that in a scanning microscope, as the reciprocity theorem confirms. The electrons reflected or emitted at the specimen will be imaged by the objective lens with aberrations, which will be compensated for after a single deflection through the beam separator by the mirror. The chromatic and spherical aberration-free image, after a second passage through the beam separator, is magnified by the projector and imaged onto the detector. The sample can be illuminated either with electrons through the beam separator or with photons from the side. When photon illumination is used, one speaks of a photoemission electron microscope (PEEM). A transmission illumination is likewise possible. The mirror corrector built within the framework of this project is designated for incorporation in the "SMART" (spectromicroscope for all relevant techniques). The SMART is a combined PEEM/LEEM. Its construction and the experimental possibilities are described more precisely in Section III. The trial and testing of the two key components of the corrector--namely, beam separator and electron mirror--were, however, carried out in an SEM.
C. Requirements for the Beam Separator

The favorable correction of chromatic and spherical aberrations of a round lens system with a rotationally symmetric electron mirror is bought at the price that one must incorporate a nonround element in the beam path for beam-steering purposes. The deflection introduced by the beam separator must be sufficiently aberration free that the correction effected by the mirror is not significantly impaired. The simplest method for minimizing the effect of the intrinsic aberrations of a given beam separator on the entire system is the choice of a suitable intermediate magnification M in the beam separator, since the influence of individual aberrations of the beam separator on the aberrations of the system as a whole scales with different powers of the intermediate magnification. A further possibility may be to keep the deflection angle φ as small as possible
by a one-time pass through the beam separator. This is, however, scarcely realizable in a corrected SEM. The minimum bending angle is then determined by the boundary condition that electron source and specimen chamber be placed close together. With a corrected LEEM only bending angles of more than 60° are possible, if projector and electron source are not located at the same place. From manufacturing considerations it is favorable, as is adopted in Figure 6, to have an angle of 90°. Furthermore it is an advantage both for the construction and for the calculation to have a purely magnetic system with midsection symmetry. The optic axis is then torsion free and remains in the symmetry plane. Such a beam separator can be realized by using two plane-parallel plates of high-permeability material, in which an assembly of symmetric coils is embedded. Figure 7 shows the general view of a section of one of the two pole-piece plates. The engraving geometry coincides with that given in Müller et al. (1999). The imaging properties of the beam separator are determined principally by the position of the coil grooves and their orientation with respect to the optic axis. For insertion in connection with the electron mirror, it is advantageous that the beam separator, as in a transfer lens system, should at least image one chosen entrance plane 1:1 into an exit plane. If the field trajectories outside the beam separator are, additionally, parallel to the axis, then any plane is imaged 1:1. This corresponds to an 8f arrangement of four round lenses (see Fig. 12). To obtain a comparable image quality, one must take care that the beam separator has no second-order geometric aberrations. This is achieved through the double-symmetric construction of the beam separator relative to the diagonal S1 and the two angle bisectors S2. If the principal rays are point symmetric or axially symmetric relative to these symmetry planes, then all aberrations of second order vanish. In Figure 7 the axial bundle is shown in the midplane of the beam separator. The intersections of the axial rays with the optic axis give the positions of the intermediate images. The mirror must compensate for the chromatic aberration of the objective lens. It is therefore advantageous--so that combination aberrations are avoided--if the beam separator introduces no dispersion. This means that electrons with different energies entering the beam separator along the optic axis leave it at the same place and with equal slope. This is possible only if the deflecting magnetic field contains regions of opposite sign.
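The 1:1 transfer property invoked above can be made concrete in the thin-lens picture. The following sketch, an illustration of the 8f analogy rather than a model of the actual separator, multiplies first-order transfer matrices for four identical round lenses of focal length f and obtains the unit matrix, so that every entrance plane is imaged 1:1 and upright into the exit plane.

```python
import numpy as np

f = 1.0  # focal length, arbitrary units
L = np.array([[1.0, 0.0], [-1.0 / f, 1.0]])   # thin round lens
def D(d):                                     # field-free drift of length d
    return np.array([[1.0, d], [0.0, 1.0]])

cell = D(f) @ L @ D(2 * f) @ L @ D(f)         # one 4f telescope: equals -1
M = cell @ cell                               # 8f arrangement of four lenses
print(np.round(cell))                         # [[-1, 0], [0, -1]]
print(np.round(M))                            # [[ 1, 0], [0,  1]] -> 1:1 imaging
```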
III. THE SPECTROMICROSCOPE "SMART"

The corrected spectromicroscope "SMART" with tunable extreme ultraviolet (XUV) radiation was built in collaboration with several universities, the Fritz
FIGURE 7. Construction of the beam separator. In two plane-parallel pole plates, coils are embedded that create regions of oppositely directed magnetic fields (shown as dotted or crosshatched areas). The areas are chosen so that the entrance plane of the beam separator is imaged without dispersion with unit magnification into its exit plane. The locally symmetric arrangement of the coil triplets relative to the symmetry planes S1 and S2 cancels the geometric aberrations of second order.
Haber Institute, and industry. The essential construction of the apparatus is described in Section III.A. In Section III.B, the multiplicity of the possible operational modes of the combined PEEM/LEEM is introduced. The method for calculation of the mirror corrector and the objective lens is set out in Section III.C. The attainable spatial and energy resolution is unique in comparison with that of existing instruments (see Section III.C.3). There are several important constraints for the construction of the instrument, which are grouped in Section III.C.4.
A. Overall Construction
The schematic overall view of the construction of the complete ultra-high-vacuum spectromicroscope "SMART" (Fink et al., 1997) is shown in Figure 8. This equipment is intended for operation on the synchrotron BESSY II in Berlin Adlershof. It is supplied with photons by means of the PM-6 beam line at the undulator U49/1 (period length, 49 mm). The beam line is equipped with a plane grating monochromator and several refocusing mirrors. Photons in an energy range Eph from around 20 to 2000 eV are available. The spectral resolution Eph/ΔEph amounts to some 10⁴ at a photon energy of 400 eV. The photons that pass through the energy selection slit are focused to a spot of 10 μm × 25 μm in size by the X-ray mirrors (PEEM operation). Additionally, conventional UV lamps with photon energies of up to 7 eV can be mounted at the specimen chamber. Alternatively, the specimen can be illuminated with electrons from a field-emission gun with an energy width of 0.9 eV through the beam separator and the objective lens (LEEM operation). With the condenser lens system between source and beam separator it is possible to switch from normal illumination of a specimen region of adjustable size to convergent illumination. The electrons released from or reflected at the specimen are accelerated by the electromagnetic immersion objective to the nominal energy En = 15 keV. They are imaged with a mode-dependent magnification between 18 and 26 into the entrance plane of the beam separator. The field lens L1 just in front of the intermediate image plane influences the field trajectories in such a way that these run parallel to the axis in the intermediate image plane at the edge of the beam separator. The various modes of the objective lens are determined by the initial energy Eo of the electrons leaving the specimen (from 0 to 2000 eV) and the electric field strength at the specimen surface (between 0 and 7 kV/mm). For field-free operation, the microscope can be operated at initial energies Eo larger than 500 eV. With an electromagnetic double-deflection element, the optic axes of the electron illumination and of the imaged electron bundle can be aligned separately. For a single deflection of 90°, the beam separator images its entrance plane 1:1 into its exit plane, without introducing aberrations of second order or dispersion of first or second degree. The electron mirror, a tetrode structure, in conjunction with the field lens L2 allows the mirror-side exit plane of the beam separator to be imaged again onto itself with a magnification of unity. With the help of the three mirror potentials the (negative) chromatic and spherical aberration coefficients of the electron mirror can be adjusted so that the corresponding aberrations of the objective lens are corrected in every mode of operation. As a result, even at such low initial energies of the electrons, a point resolution of 1 nm is possible. Alternatively, the corrector can be used to increase the intensity yield significantly by
FIGURE 8. Construction of the corrected spectromicroscope for all relevant techniques (SMART).
using a larger aperture at less demanding requirements in spatial resolution. Two magnetic octopoles and electrostatic dodecapoles are placed between the mirror and the field lens. They serve as double-deflection elements for separate control of the electrons entering and leaving the mirror. They can also be employed as complex stigmators to compensate for residual aberrations and alignment errors. After the electrons pass through the beam separator again, they encounter the transfer optics with integrated aperture diaphragms and image field apertures. The transfer optics consists of five electrostatic einzel lenses which transfer the image and diffraction planes of the objective lens into two fixed planes in front of the energy filter. The intermediate magnification can be varied between 2 and 45. At a magnification of 2, the image and diffraction planes can, in addition, be exchanged in order to enable photoelectron diffraction (PED) or low-energy electron diffraction (LEED). The Ω-shaped energy filter is designed as a purely magnetic imaging filter, whose image aberrations of second order are completely corrected by the hexapoles H1-H6 and the dodecapole (Lanio, 1986). A filter of this type has already been built at the Fritz Haber Institute in Berlin (Rose and Krahl, 1995). The fixed entrance plane and the diffraction entry plane are imaged by the filter with unit magnification into the achromatic image plane just behind the exit of the dipole P4 and into the energy selection plane, respectively. The dispersion of the filter amounts to 35 μm/eV at the nominal energy of 15 keV. With a slit width of 3.5 μm, an energy resolution of δE = 0.1 eV can be attained. With the three-lens electrostatic projector, on the one hand, the achromatic image plane of the energy filter can be imaged onto the detector with a magnification between 20 and 150 (energy-filtered image). On the other hand, the energy selection plane can be imaged with magnifications from 15 to 60 (imaging of spectra). Spectra of an energy range up to 35 eV of selected-area object regions can be investigated with an energy resolution of 0.1 eV. The camera system includes a slow-scan charge-coupled-device (CCD) camera and a video-frequency-operated CCD camera with a preconnected channel plate. The complete equipment is mounted on a vibration-proof table. The objective chamber is connected to a preparation chamber by means of a specimen-transfer system. The construction and testing of individual components has been distributed among various research groups as follows:

BESSY GmbH: synchrotron, undulator, X-ray optics
Fritz Haber Institute, Berlin: transfer optics, energy filter, projector; preparation chamber, table
LEO Elektronenmikroskopie GmbH: objective lens, electron source, high-voltage equipment
Clausthal University of Technology: specimen stage, camera system
Darmstadt University of Technology: beam separator, electron mirror
University of Würzburg: project coordination, X-ray optics, specimen chamber
B. Experimental Possibilities
The multiplicity of the various measuring possibilities with the spectromicroscope in surface physics can be divided into two major groups: experiments with photon illumination (Section III.B.1) and those with electron illumination (Section III.B.2). In each group there are three subgroups. With the transfer optics, one can switch between direct imaging (PEEM/LEEM) and micrographs of (energy-filtered) diffraction patterns (PED/LEED). The projector permits the switchover to spectrum images (PES/EELS: photoelectron spectroscopy/electron energy-loss spectroscopy). Any particularly interesting region of the specimen may be selected with the image field aperture. In what follows, the various direct-imaging modes are described. A detailed presentation can be found, for example, in Fink et al. (1997) or Henzler and Göpel (1991). An electron of starting (kinetic) energy Eo at the surface of the specimen always has the nominal energy En = 15 keV in the grounded part of the microscope column. The nominal energy is prescribed by the setting of the energy filter.

1. Photon Illumination
Different kinds of chemical information (up to the clarification of the binding states) can be gained in the imaging mode (PEEM). An energy-level scheme is shown in Figure 9. For spectroscopy, several micrographs at different photon energies and/or micrographs at different settings of the microscope are necessary. If only the photon energy is varied, then the electrons used for imaging are those that are excited from different initial states of the specimen into a narrow range about the starting energy Eo. This mode is advantageous for the SMART, since the objective lens, and hence the corrector, does not need to be readjusted to another starting energy. The attainable energy resolution is mainly limited by the energy filter to dE ≈ δE = 0.1 eV, since the energy distribution of the illuminating photons is significantly sharper. Electrons can also be excited from a constant initial state of the specimen. In this case, the photon energy hν and the starting energy Eo are altered simultaneously while the binding energy Eb = hν − Eo is kept constant. In this case the excitation of the objective lens and of the corrector depends on the initial energy. Therefore, at high resolution, measurements with constant initial conditions are possible only if the alignment of the corrector can be completely automated. The same holds for spectroscopy in the classical sense,
FIGURE 9. Energy-level scheme for spectromicroscopy. Electrons that leave the specimen with an initial energy Eo relative to the vacuum level Evac are accelerated through a potential difference Φc − Φobj between the specimen and the microscope column to the nominal energy En = 15 keV. The effective energy width dE is determined by the energy widths arising from emission (ΔE), energy filtering (δE), and instabilities and parasitic perturbations. With illumination of the specimen with photons of energy Eph = hν, the electrons used for imaging are either emitted directly (core or valence electrons) or released by secondary processes (Auger electrons, secondary electrons). The Fermi level is denoted by EF. (Reprinted from Journal of Electron Spectroscopy and Related Phenomena, 84, No 1-3, Fink et al., SMART: a planned ultrahigh-resolution spectromicroscope for BESSY II, pp. 231-250, 1997, with permission from Elsevier Science)
in which one investigates, at constant photon energy, the spectrum of initial energies of the specimen. The spectra evaluated in this way are, however, dependent on the density of states in the initial and final states. With very low photon energies, local variations in the work function Evac − EF can be measured with very high accuracy. This furnishes information
about chemical reactions or diffusion at the surface of solids, for example. With medium photon energies, valence electrons are excited. The excitation of core electrons can create chemical contrast by three mechanisms. In the first method, photoelectrons are used directly for imaging. The photon energy must purposely be chosen so large that the initial energy of the electrons lies between 30 and 100 eV. In this energy region the secondary electron background is comparatively small; the effective cross section for the excitation of core electrons is still large enough and the initial energy for electron yield in the SMART is favorable. The second possibility is to image Auger electrons. Both methods are characterized by high surface sensitivity, since electrons with such low starting energies reach the surface without inelastic scattering processes only within a depth of a few monolayers. The third method makes use of the circumstance that below the threshold energy of an inner shell excitation, the secondary electron yield is considerably lower than that above the threshold energy. In this case, secondary electrons with energies of a few electron volts are imaged. This method is, however, not particularly surface sensitive.
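The two spectroscopy strategies described above differ only in which quantity is held fixed during a photon-energy sweep. The following lines are a hypothetical helper written for illustration only; they make the bookkeeping Eb = hν − Eo explicit.

```python
# Illustrative helper, not from the original text: enumerate (photon energy,
# derived quantity) pairs for the two photon-illumination spectroscopy modes.

def constant_final_state(hv_values, E0):
    """Image always at starting energy E0; the probed binding energy varies."""
    return [(hv, hv - E0) for hv in hv_values]   # (hv, Eb) with Eb = hv - E0

def constant_initial_state(hv_values, Eb):
    """Keep the binding energy Eb fixed; the starting energy must follow hv."""
    return [(hv, hv - Eb) for hv in hv_values]   # (hv, E0) with E0 = hv - Eb

# Example: sweep the photon energy from 320 to 340 eV in 2 eV steps
print(constant_final_state(range(320, 341, 2), E0=50.0))
print(constant_initial_state(range(320, 341, 2), Eb=285.0))
```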
2. Electron Illumination

For recording images with elastically scattered electrons, the electrons originating from the source are accelerated to the nominal energy En. The kinetic energy of the electrons at the specimen amounts to Eo. By means of the energy filter, the energy width of the electrons that contribute to the image can be reduced to δE = 0.1 eV. In this case the resolution is not limited by the energy width ΔE ≈ 1 eV of the electron source. If one chooses the potential difference Φc − Φobj between microscope column and specimen so that the corresponding energy is somewhat larger than the kinetic energy En of the illuminating electrons, the electrons return at an equipotential surface just above the specimen. This method is known as mirror electron microscopy (MEM). It allows the investigation of topography as well as the electric and magnetic fields at the surface of not-too-rough specimens. Images of inelastically scattered electrons with an energy loss El can be obtained if the illuminating electrons are accelerated to an energy En + El. This is possible as the dispersions of first and second degree of the beam separator vanish and the electron-optical requirements for the illuminating system are less severe.
C. Calculation of Performance

In this section the most important basic ideas for the calculation of the mirror corrector and the objective lens for the SMART are summarized. The starting
point of the trajectory calculations is the Lorentz force

\frac{d}{dt}(m\mathbf{v}) = -e\,(\mathbf{E} + \mathbf{v} \times \mathbf{B})    (4)
with relativistic mass m, charge −e, and velocity v = ṙ of an electron. The static electromagnetic fields in the beam region, assumed charge free and current free, can be obtained from the scalar potentials φ and ψ:

\mathbf{E}(\mathbf{r}) = -\operatorname{grad} \varphi(\mathbf{r}) \qquad\text{and}\qquad \mathbf{B}(\mathbf{r}) = -\operatorname{grad} \psi(\mathbf{r})    (5)
The electric potential is normally so gauged that with vanishing velocity v of an electron with nominal energy, the potential on the optic axis Φ(z) := φ(x = 0, y = 0, z) is zero. In general, the optic axis z is a possible particle trajectory. The x and y axes are frequently chosen to form the moving trihedral. For the solution of the equation of motion one proceeds in the following way: The potential on the optic axis is expanded with respect to the lateral coordinates x and y. Usually the complex notation w = x + iy and w̄ = x − iy is used. The equations of motion are solved iteratively by successive approximation. The starting point is the fundamental solution of the linearized equation of motion. As initial conditions, one usually chooses the aperture angles α and β and the positions γ and δ in the xz or yz section of an image plane. The additional chromatic parameter κ describes the relative energy deviation of an electron from the nominal energy En. The nonlinear equation of motion is solved as a power series in the initial conditions. In the practical procedure, the methods differ considerably for the beam separator with midplane symmetry and the rotationally symmetric electron mirror.
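For a rotationally symmetric potential, the structure of this expansion can be verified symbolically. The sketch below is our own check, written for the ordinary axisymmetric case without the additional coordinate h of the mirror treatment; it truncates the standard series and confirms that the axisymmetric Laplacian vanishes up to the truncation order.

```python
import sympy as sp

r, z = sp.symbols("r z", positive=True)
Phi = sp.Function("Phi")   # potential on the optic axis
N = 3                      # truncation order of the series

# phi(r, z) = sum_n (-1)^n / (n!)^2 * (r/2)^(2n) * Phi^(2n)(z)
phi = sum((-1) ** n / sp.factorial(n) ** 2 * (r / 2) ** (2 * n)
          * Phi(z).diff(z, 2 * n) for n in range(N + 1))

# Axisymmetric Laplacian; everything cancels except one term of order r^(2N)
residual = sp.simplify(phi.diff(r, 2) + phi.diff(r) / r + phi.diff(z, 2))
print(residual)   # a single term proportional to r**6 * Phi^(8) for N = 3
```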
1. Electron Mirror

In contrast to the conventional method, in which the time t is replaced by z along the optic axis, with a mirror the time dependency must be retained at least in the vicinity of the reversal points. A perturbation method for calculating the properties of electron mirrors is described in detail in Preikszas (1995), Preikszas and Rose (1997), and Rose and Preikszas (1995). It has proved useful to consider the electron motion relative to a reference electron with x(t) = 0, y(t) = 0, and z(t) = ζ(t) that enters the mirror along the optic axis with nominal energy En. The optic axis thus corresponds to the symmetry axis of the electron mirror. The position of an arbitrary electron is then given by

\mathbf{r}(t) = (x(t),\; y(t),\; \zeta(t) + h(t))^{T}    (6)
The expansion of the potential and the linearization of the equation of motion must be performed three-dimensionally with respect to the small quantities x, y, and h. If one applies the Laplace equation Δφ = 0, the potential expansion
takes on the form

\varphi(w, \bar{w}, h; z) = \sum_{n,m=0}^{\infty} \frac{\Phi^{[2n+m]}(z)}{n!^{2}\, m!} \left(-\frac{w\bar{w}}{4}\right)^{n} h^{m}    (7)
where Φ^[k] denotes the kth derivative of Φ with respect to z. Because of the rotational symmetry, no multipole strengths appear. The expansion of the magnetic potential ψ is obtained by replacing the electric potential Φ on the axis by its magnetic counterpart Ψ(z) := ψ(x = 0, y = 0, z). With the presence of magnetic fields it is advantageous to change to a new coordinate system, which is rotated by the Larmor angle χ, using the transformed lateral coordinate u := w e^{iχ}. The equations of motion with separated linear components have the structure

\ddot{u} + (a\Phi'' + b\Psi'^{2})\, u = F_{u}
\ddot{h} - 2(a\Phi'' + c\Phi'^{2})\, h + d\kappa\Phi' = F_{h}    (8)
in which the functions a, b, c, and d depend only on the potential Φ and on elementary constants. On the right-hand side, F_u and F_h contain all the nonlinear components. In the nonrelativistic limiting case the functions c and d vanish. The solution of the linearized equation of motion can be written as

u^{(1)} = \mu u_{\mu} + \nu u_{\nu}
h^{(1)} = \sigma h_{\sigma} + v h_{v} + \kappa h_{\kappa}    (9)
with the complex constants μ and ν as well as the real constants σ and v. The linearly independent solutions can be chosen to be symmetric (u_μ, h_σ) or antisymmetric (u_ν, h_v) relative to the reversal point of the reference electron. The solution h_κ of the inhomogeneous component of the differential equation does not appear in the nonrelativistic case. The parameter κ = κ(μ, ν, σ, v) is determined by the energy conservation theorem and the initial conditions. If one demands that a particular electron and the reference electron reverse their direction of motion with respect to the optic axis at the same time, then the antisymmetric component v h_v disappears, so that v = 0 holds. The exact solutions

u = \sum_{r=1}^{\infty} u^{(r)} \qquad\text{and}\qquad h = \sum_{r=1}^{\infty} h^{(r)}    (10)
of the equations of motion can be determined iteratively as power series expansions with respect to the parameters μ, μ̄, ν, ν̄, σ, and κ. Each step of the iteration for u^{(r)} and h^{(r)} is achieved by the insertion of all foregoing solutions \sum_{k=1}^{r-1} u^{(k)} and \sum_{k=1}^{r-1} h^{(k)} into the system of differential equations (8).
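The iteration scheme can be illustrated on a one-dimensional toy problem. In the sketch below, which is a schematic stand-in and not the actual mirror calculation, a constant focusing term and a weak cubic perturbation replace the true linear terms and F_u of Eq. (8); each pass solves the linear equation with the nonlinear right-hand side evaluated on all foregoing solutions, and the shrinking updates illustrate the convergence of the power series for a sufficiently weak perturbation.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy stand-in for Eq. (8): u'' + k^2 u = eps * u^3 (illustrative values)
k, eps = 1.0, 0.01
t = np.linspace(0.0, 6.0, 601)

def solve_linearized(u_foregoing):
    """Solve u'' + k^2 u = eps * u_foregoing(t)^3 with fixed initial conditions."""
    rhs = lambda s: eps * np.interp(s, t, u_foregoing) ** 3
    sol = solve_ivp(lambda s, y: [y[1], -k**2 * y[0] + rhs(s)],
                    (t[0], t[-1]), [1.0, 0.0], t_eval=t, rtol=1e-10, atol=1e-12)
    return sol.y[0]

u = solve_linearized(np.zeros_like(t))      # first order: the linearized solution
for _ in range(4):                          # insert all foregoing terms into F
    u_next = solve_linearized(u)
    print("size of newest correction:", np.abs(u_next - u).max())
    u = u_next
```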
The time-dependent coefficients of both power series can be transformed outside the mirror--in a field-free region--by projection into an arbitrary fixed plane z. The function h is then eliminated and one obtains the conventional time-independent representation. The solution ū(z) is then a power series
\bar{u}(z) = c_{\omega}(\omega, \bar{\omega}, \rho, \bar{\rho}, \kappa)\, \bar{u}_{\omega}(z) + c_{\rho}(\omega, \bar{\omega}, \rho, \bar{\rho}, \kappa)\, \bar{u}_{\rho}(z)    (11)
in the small quantities ω = α + iβ, ω̄, ρ = γ + iδ, ρ̄, and κ, with the axial ray ū_ω(z) and the field ray ū_ρ(z). The coefficients c_ω and c_ρ have the form
c_{\omega} = \sum_{a,b,c,d,k \ge 0} \omega^{a} \bar{\omega}^{b} \rho^{c} \bar{\rho}^{d} \kappa^{k}\, D_{\omega^{a} \bar{\omega}^{b} \rho^{c} \bar{\rho}^{d} \kappa^{k}}
c_{\rho} = \sum_{a,b,c,d,k \ge 0} \omega^{a} \bar{\omega}^{b} \rho^{c} \bar{\rho}^{d} \kappa^{k}\, C_{\omega^{a} \bar{\omega}^{b} \rho^{c} \bar{\rho}^{d} \kappa^{k}}    (12)
with the complex aberration coefficients D... and C..., whereas for rotationally symmetric systems such as an electron mirror the possible exponents are restricted by the condition a − b + c − d = 1. As already defined in Section II, one classifies the exponent k as the degree, and the sum of the remaining exponents as the order, of an aberration. The total sum of the exponents is called the rank. For the electron mirror, all aberrations up to the fifth rank were calculated. The aberration expansion given by Eq. (12) is not limited to the mirror, but can be carried out, in principle, in this way for each electron-optical system. It is possible to combine the aberration coefficients of successive optical subsystems. This procedure can also account for mechanical misalignment of the individual components (Preikszas, 1995). Finally, one arrives at an aberration expansion of the total system. In an image plane zi, only the axial aberration coefficients C... are of interest, since there the axial trajectory ū_ω(zi) vanishes. In a conventional electron microscope consisting of only round lenses, the chromatic aberration coefficient Cc = −C_ωκ and the spherical aberration coefficient Cs = C_ω²ω̄ produce the largest contributions to the axial series expansion

c_{\rho} = \rho + \Delta\rho = \rho + \cdots - \omega\kappa\, C_c + \cdots + \omega^{2}\bar{\omega}\, C_s + \cdots    (13)
The negative sign of the chromatic aberration coefficient is consistent with that chosen by Scherzer. By replacing ω with α + iβ and ρ with γ + iδ, rearranging, and introducing a new terminology for the expansion coefficients, one obtains an intuitively clearer notation for the chromatic and for the spherical aberration coefficients. Whereas for measuring the spherical aberration coefficient the deviation Δρ = ω²ω̄ Cs from the ideal image point can be readily obtained directly, the
chromatic aberration coefficient is more readily obtained from the defocusing

\Delta f := -\Delta z = \kappa C_c + \kappa^{2} K_c + \cdots \approx -\Delta\bar{u}/\omega \qquad\text{with}\qquad K_c = -C_{\omega\kappa^{2}}    (14)

as a function of the electron energy. The division by ω takes into account that the axial rays enter, in linear approximation, with the slope ω at the image point. The sign of the defocus is thus so chosen that a stronger excitation of a lens leads to a positive defocus Δf. The round component of chromatic and spherical aberrations at a preselected magnification and focal length can be corrected with a tetrode mirror.
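Equation (14) suggests a direct recipe for measuring Cc: record the defocus as a function of the relative energy deviation and fit a low-order polynomial in κ. The sketch below works on synthetic data; the numerical values of Cc and Kc are invented for illustration and are not taken from the text.

```python
import numpy as np

# Synthetic "measurement" of Eq. (14): df = kappa*Cc + kappa^2*Kc + noise.
Cc_true, Kc_true = 24e3, -5e5          # nm (illustrative values, not from the text)
rng = np.random.default_rng(0)
kappa = np.linspace(-5e-3, 5e-3, 11)   # relative energy deviation
df = kappa * Cc_true + kappa**2 * Kc_true + rng.normal(scale=5.0, size=kappa.size)

Kc_fit, Cc_fit, _ = np.polyfit(kappa, df, 2)   # quadratic fit in kappa
print(f"fitted Cc = {Cc_fit:.0f} nm, Kc = {Kc_fit:.0f} nm")
```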
For the aberration coefficients Cc and Cs one can derive integral expressions. In the nonrelativistic, purely electrostatic case these are given by

[Equation (15): a pair of lengthy integral expressions for Cc and Cs, evaluated along the reference trajectory between the image plane and the reversal point. Their integrands contain the axial potential Φ together with its derivatives up to Φ^[4], the column potential Φc, and the paraxial trajectories u_μ and h_σ; see the references cited in Section III.C.1 for the full expressions.]
where the second index e on a trajectory refers to the value at the time τe in the image plane ze = ζ(τe). It is further assumed that the entry and exit planes of the mirror coincide. The time τt fixes the time at the reversal point of the reference electron, whereas the potential of the microscope column is defined by Φc. One recognizes, in the structure of the integral expressions, that the values of the chromatic and spherical aberration coefficients are characterized by the third and/or fourth derivatives of the potential Φ on the optic axis and hence can be adjusted independently. This is the mathematical background for the intuitive explanation, given in Section II, of the correction possibilities with an electron mirror. Moreover, the focal length can be chosen independently, since it is determined by the second derivative of the potential, as is clear from the term aΦ″ of the linearized equation of motion (8). For a particular electron mirror, the derivatives of the potential on the axis are determined by the voltages on the individual electrodes. With the three freely choosable voltages on the tetrode mirror, in principle, the chromatic and spherical aberration coefficients can be adjusted at a constant focal length. The usable range of values is determined by the geometry of the electrodes. Figure 10 shows the variation region of the basic tetrode mirror for the SMART. The aberration coefficients relate to the image plane at the entrance plane of the beam separator. The values of the objective lens in all operational modes of the SMART as well as those of the scanning electron microscope ZEISS
FIGURE 10. Range of variation for the chromatic and spherical aberration coefficients for the tetrode mirror. The influence of the mirror-side field lens has been included. The magnification is unity and the image plane always lies near the edge of the beam separator. An entry of the form 20/1 characterizes the mirror calibration, with which the aberrations of the SMART objective for an electric field of 1 kV/mm at the specimen and an initial energy of 20 eV are compensated for. A cross marks the necessary calibration for the DSM objective at a working distance of 4 mm. (Reprinted from Journal of Electron Spectroscopy and Related Phenomena, 84, No 1-3, Fink et al., SMART: a planned ultrahigh-resolution spectromicroscope for BESSY II, pp. 231-250, 1997, with permission from Elsevier Science)
DSM 960 used for testing can be compensated for. The necessary values of the aberration coefficients to be set at the mirror for correction are shown in Figure 10. For the calculation of the magnetic and electric potentials of the different objective lenses, a charge-density method is used. Uniformly charged rings, positioned inside the electrodes, serve as charge-density elements. The equipotential surfaces generated by the charges deviate from the prescribed
geometry by less than 20 nm, which is well below the attainable finishing accuracy (Preikszas et al., 2000).
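The following is a minimal sketch of such a ring-charge calculation; the two-electrode geometry, ring positions, and potentials below are illustrative assumptions, not the actual SMART electrode data.

# A minimal sketch of the charge-density (ring-charge) method described above.
# Geometry, ring positions, and potentials are illustrative placeholders.
import numpy as np
from scipy.special import ellipk

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)

def ring_potential(rho, z, a, z0, q=1.0):
    """Potential of a uniformly charged ring (radius a, plane z0, charge q)
    at the axisymmetric field point (rho, z)."""
    m = 4 * a * rho / ((a + rho)**2 + (z - z0)**2)   # elliptic parameter m = k^2
    return q / (4 * np.pi * EPS0) * (2 / np.pi) * ellipk(m) \
        / np.sqrt((a + rho)**2 + (z - z0)**2)

# Hypothetical two-electrode geometry: rings placed just inside each electrode,
# collocation points on the electrode surfaces where the potential is prescribed.
rings  = [(4.9e-3, z) for z in np.linspace(-10e-3, -2e-3, 8)] + \
         [(4.9e-3, z) for z in np.linspace(2e-3, 10e-3, 8)]
colloc = [(5.0e-3, z) for z in np.linspace(-10e-3, -2e-3, 8)] + \
         [(5.0e-3, z) for z in np.linspace(2e-3, 10e-3, 8)]
V_target = np.array([0.0] * 8 + [10e3] * 8)          # electrode potentials (V)

# Influence matrix: potential at collocation point j due to unit charge on ring i.
A = np.array([[ring_potential(rj, zj, ai, zi) for (ai, zi) in rings]
              for (rj, zj) in colloc])
q = np.linalg.lstsq(A, V_target, rcond=None)[0]       # ring charges

# The axial potential Phi(z), whose derivatives enter Eq. (15):
z_axis = np.linspace(-15e-3, 15e-3, 301)
phi = sum(qi / (4 * np.pi * EPS0) / np.hypot(ai, z_axis - zi)
          for qi, (ai, zi) in zip(q, rings))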
2. Beam Separator

For the calculation of the beam separator, the time t can be replaced by the coordinate z along the curved optic axis. A comprehensive exposition of this general procedure may be found in Kahl (1999) and Rose (in press). The basic procedures for the calculation of the beam separator are described in Müller et al. (1999) and Preikszas (1995). Since the beam separator operates purely magnetically, the electrostatic potential on the optic axis remains constant. The magnetic potential, because of the midplane symmetry, fulfills the condition ψ(x, y, z) = −ψ(x, −y, z). Here the xz plane denotes the midplane of the beam separator. The coordinate system is chosen as the moving trihedral along the optic axis z. The optic axis is thereby defined as the trajectory of an electron that starts with nominal energy, without slope or axial deviation, in the entrance plane of the beam separator. The expansion of the scalar magnetic potential, in complex notation, has the form

ψ(w, w̄; z) = Σ_{λ=0, μ=0}^∞ b_λμ(z) w^λ w̄^μ   (16)
The coefficients b_λμ are functions of the multipole strengths Ψ_n(z) = Ψ_nc(z) + iΨ_ns(z) and their derivatives. They are determined uniquely by a recursion formula, based on the Laplace equation Δψ = 0, and by the distribution of the multipole strengths along the optic axis. In systems with midplane symmetry, only the imaginary parts Ψ_ns contribute to the series expansion. The equation of motion with respect to the curvilinear local coordinate system about the optic axis can be written as

x″ + (a Ψ_1s² + b Ψ_2s) x − c κ Ψ_1s = F_x
y″ − b Ψ_2s y = F_y   (17)
where a, b, and c are constants. All nonlinear terms are collected in F_x and F_y. The course of the fundamental rays is determined by the dipole strength Ψ_1s = Ψ_1s(z) and the quadrupole strength Ψ_2s = Ψ_2s(z). From the structure of the equation of motion, one can deduce that in the y direction focusing can be achieved by quadrupole fields only. The focusing in the x direction is already provided by the bending dipole field, whose action is weakened or strengthened by the quadrupole fields. Quadrupole fields arise in systems consisting of only homogeneous sector magnets if the current-carrying grooves in the pole-piece material are not crossed orthogonally. The strength of the quadrupole field is proportional to the cotangent of the cutting angle and to the first derivative of the
magnetic field strength along the z axis (Rose, 1978; Rose and Krahl, 1995). The solution of the linearized equation of motion has the form

x^(1) = α x_α(z) + γ x_γ(z) + κ x_κ(z)
y^(1) = β y_β(z) + δ y_δ(z)   (18)
with the axial trajectories x_α and y_β, the field trajectories x_γ and y_δ, and the dispersive trajectory x_κ, which appears only in the xz section. The exact solution can be obtained by iteration, as in the case of the mirror. The initial conditions α, β, γ, and δ in Eq. (18) are replaced by power series with respect to the initial conditions and the relative energy deviation κ. The coefficients of the series vary with the coordinate z and represent the aberration coefficients of the system:

x(z) = c_α(z) x_α(z) + c_γ(z) x_γ(z) + κ x_κ(z)
y(z) = c_β(z) y_β(z) + c_δ(z) y_δ(z)   (19)
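To make the structure of Eqs. (17) and (18) concrete, the following minimal sketch integrates the linearized equation of motion (F_x = F_y = 0) numerically to generate the fundamental rays; the constants a, b, c and the hard-edge strength profiles are placeholder assumptions, not the actual SMART beam-separator data.

# A minimal sketch: fundamental rays of Eq. (18) from the linearized Eq. (17).
# The constants and field profiles below are illustrative placeholders.
import numpy as np
from scipy.integrate import solve_ivp

a, b, c = 1.0, 1.0, 1.0                          # assumed constants of Eq. (17)
psi1s = lambda z: 1.0 if 0 <= z <= 1 else 0.0    # dipole strength (hard-edge model)
psi2s = lambda z: 0.2 if 0 <= z <= 1 else 0.0    # quadrupole strength

def rhs(z, u, kappa):
    x, xp, y, yp = u
    xpp = -(a * psi1s(z)**2 + b * psi2s(z)) * x + c * kappa * psi1s(z)
    ypp = b * psi2s(z) * y        # quadrupole defocuses here while focusing in x
    return [xp, xpp, yp, ypp]

def ray(x0, xp0, y0, yp0, kappa=0.0, z_end=2.0):
    return solve_ivp(rhs, (0.0, z_end), [x0, xp0, y0, yp0],
                     args=(kappa,), max_step=1e-3, dense_output=True)

x_alpha = ray(0, 1, 0, 0)         # axial ray in x (unit slope)
x_gamma = ray(1, 0, 0, 0)         # field ray in x (unit offset)
x_kappa = ray(0, 0, 0, 0, 1.0)    # dispersive ray (unit relative energy deviation)
y_beta  = ray(0, 0, 0, 1)         # axial ray in y
y_delta = ray(0, 0, 1, 0)         # field ray in y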
Because of the midplane symmetry, the sum of the exponents of β and δ must be even in the xz section and odd in the yz section. Through a suitable linear combination, the solution outside the beam separator takes the same form (11) as for the mirror, with aberration coefficients according to the representation given by Eq. (12). The fundamental rays and the dispersive ray of the beam separator with a pole-piece spacing of 7 mm, as shown in Figure 7, are displayed in Figure 11 (top) along the straightened optic axis. The axial rays x_α and y_β are point-symmetric with respect to the plane S1 and axially symmetric about the plane S2, whereas the symmetry conditions for the field trajectories x_γ and y_δ are the opposite. The linear contribution to the dispersion is already compensated for after a deflection of 45°. The doubly symmetric course of the fundamental trajectories and of the multipole strengths about the symmetry planes S1 and S2 has the consequence that all geometric aberration coefficients of second order and the dispersion coefficients of first and second degree vanish identically outside the beam separator. In the middle section of Figure 11, a few second-rank fundamental rays are shown as a function of the z coordinate; each is obtained as the product of an aberration coefficient C_... with the corresponding field ray. The lower part of the figure contains the fundamental rays of first order and first degree. One recognizes that the chromatic aberrations of magnification, represented by x_γκ and y_δκ, vanish in the exit plane of the beam separator. By contrast, different values for the chromatic aberration in the xz and yz sections remain (given by x_ακ and y_βκ). This unround component of the chromatic aberration cannot be corrected by the rotationally symmetric mirror. From the table in Figure 12 it is evident, however, that the unround component amounts to less than 1% of the chromatic aberration of the objective lens of the
FIGURE 11. Fundamental trajectories and selected aberration rays of second rank of the beam separator along the straightened optic axis. The arc length of the optic axis is in units of the curvature radius R = 18.55 mm. (H. Müller, D. Preikszas, and H. Rose (1999). A beam separator with small aberrations. Journal of Electron Microscopy 48(3), 191-204, by permission of Oxford University Press.)

SMART with respect to the same plane. The values of the coefficients of the spherical aberration of the beam separator are indeed different in the two sections, but even in the yz section the value lies significantly below a thousandth of the spherical aberration of the objective lens. All aberration coefficients of the beam separator up to the third order were evaluated by employing the model of an infinitely long groove of infinite depth along the straightened axis and were taken into account in the determination of the resolving power. The imaging quality of the beam separator at a simple 90° deflection is comparable to that of an 8f arrangement of round lenses. This arrangement is shown in Figure 12 (top) together with the schematic path of the paraxial rays. The front-focal plane of the first lens is transferred with unit magnification
[Figure 12, top: schematic 8f arrangement of four round lenses of focal length f with the paraxial ray path; the symmetry planes S1 and S2 are marked.]

                   chromatic aberrations      spherical aberrations
                   x_ακ        y_βκ           x_ααᾱ       x_αββ̄ = y_βαᾱ   y_βββ̄
beam separator     -83         -300           71          -380            5900
8f arrangement     -110        -110           830         830             830
objective lens     -5 × 10^4   -5 × 10^4      2 × 10^7    2 × 10^7        2 × 10^7
FIGURE 12. Comparison of the optical characteristics of the beam separator with an 8f arrangement of four magnetic round lenses. For the bore diameter and the pole-piece separation, a value of 10 mm was assumed. For comparison purposes the corresponding values for the objective lens of the SMART, referred to the intermediate image in front of the beam separator, are shown. All data are expressed in millimeters. (H. Müller, D. Preikszas, and H. Rose (1999). A beam separator with small aberrations. Journal of Electron Microscopy 48(3), 191-204, by permission of Oxford University Press.)
into the back-focal plane of the fourth lens. This can be deduced from the comparison of the chromatic and spherical aberration coefficients: the values in the xz section are somewhat more favorable for the beam separator, whereas in the yz section the coefficients of the transfer lens system are somewhat lower. The beam separator thus acts in a microscope almost as a transfer lens system in which the electrons are deflected by 90° without dispersion of first or second degree being introduced. The difference lies in the unround contribution of some aberration coefficients of the beam separator. Moreover, in systems with a straight optic axis, the dispersion coefficients of arbitrary degree vanish. Before the pole plates of the beam separator were manufactured, the validity of the model of infinitely deep grooves was checked and the influence of the curved, two-dimensional groove contour of the pole-piece plates was examined.
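For a rough feel for the numbers in Figure 12, the tabulated coefficients can be converted into third-order aberration-disk radii at the intermediate image; the aperture angle and relative energy deviation assumed below are typical test-bed values, not data from the figure.

# A hedged illustration of the comparison in Figure 12: disk radii
# r_chrom ~ |C| * omega * kappa and r_spher ~ |C| * omega**3 at the
# intermediate image, using the worst-case (yz-section) coefficients.
omega = 0.3e-3          # aperture angle at the intermediate image (rad), assumed
kappa = 1.0 / 15000.0   # relative energy deviation, ~1 eV at 15 keV, assumed

coeffs_mm = {            # values (mm) read from the table in Figure 12
    "beam separator": {"C_chrom": 300,  "C_spher": 5900},
    "8f arrangement": {"C_chrom": 110,  "C_spher": 830},
    "objective lens": {"C_chrom": 5e4,  "C_spher": 2e7},
}
for name, cf in coeffs_mm.items():
    r_c = cf["C_chrom"] * 1e-3 * omega * kappa   # metres
    r_s = cf["C_spher"] * 1e-3 * omega**3
    print(f"{name:15s} chromatic {r_c*1e9:8.2f} nm, spherical {r_s*1e9:8.3f} nm")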
The course of the magnetic potential caused by a coil in a straight groove of finite depth very quickly approaches that of the model of an infinitely deep groove (Müller et al., 1999). This applies even if one specifies that the coil closes off the upper side of the groove, so that groove depth and coil height coincide. For a pole-piece separation and a groove width of 7 mm, no relevant deviation from the model of a groove of infinite depth can be detected even at a groove depth of 4 mm. From this it follows that the electron optics of the beam separator is defined mainly by the edges of the grooves. One must hence be especially careful in their construction and finishing. A pleasing side effect is that the exact position of each coil winding has no influence on the imaging properties. An electrical fine adjustment of the potential distribution can be attained on a scale of a few hundredths of a millimeter by thin adjustment coils that are flush with the pole-piece surface. The semianalytical field calculation, which accounts for the detailed path of the grooves in the pole-piece material, is based on the application of virtual dipole layers on the surfaces of both pole-piece plates. The charge density between two grooves is determined by the prescribed radius of curvature R = 18.55 mm in the homogeneous field region. The course of the charge density above a groove is determined by the assumed groove geometry. Because of the high permeability (μ_r ≈ 200,000) of the pole-piece material, its surfaces are, to a good approximation, equipotential surfaces of the magnetic scalar potential ψ. These additional conditions can be fulfilled by successive mirroring of each of the virtual dipole layers at the opposite inner face of the pole-plate arrangement. For the numerical evaluation, the convergence of the infinite series can be accelerated by an extrapolation procedure; furthermore, one integration over the two-dimensional charge-density distribution can be carried out analytically by making use of the Stokes integral theorem. The exact groove geometry, defined by straight lines and circular segments, was approximated by trapezia. The positions of the grooves evaluated in this way deviate, in the region of the intersection points with the optic axis, by less than 10 μm in position and a maximum of 4 mrad in angle from those determined under the assumption of infinitely long grooves. While the lateral deviation lies within the order of magnitude of the manufacturing accuracy and the effect of the adjustment coils, the angular deviation is more critical to judge, particularly since the focusing characteristic of the beam separator in the yz section is determined only by this angle. This result shows that a more accurate field calculation procedure is necessary if no special quadrupole windings are provided for the correction of imperfectly determined edge positions and orientations. The calculated values of the multipole strengths and of the fundamental rays, and hence the aberration coefficients, remain nearly unchanged. The deviations of the fundamental rays
from those calculated with the simple model (shown in Fig. 11) lie well below the thickness of the lines. The good agreement between the numerical results of the two largely independent methods confirms their reliability.
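The successive mirroring and series acceleration just described can be illustrated by a toy calculation in which a single point dipole between two ideally permeable plates stands in for the virtual dipole layers; the geometry values and the omitted physical prefactors are simplifying assumptions.

# A sketch of the successive-mirroring idea with series acceleration
# (Aitken's delta-squared process). Geometry values are illustrative.
import numpy as np

h = 7e-3             # plate separation (m), as in the beam separator
z0 = 2e-3            # dipole height above the lower plate, assumed
x, z = 5e-3, 3.5e-3  # field point, assumed

def dipole_potential(x, dz):
    """Scalar potential of a unit z-oriented point dipole at lateral
    distance x, vertical distance dz (SI factors omitted for brevity)."""
    r = np.hypot(x, dz)
    return dz / r**3

def partial_sums(N):
    """Partial sums over image shells |m| <= n of the mirror-image series."""
    s, total = [], 0.0
    for n in range(N):
        for m in ([0] if n == 0 else [n, -n]):
            for zi in (2 * m * h + z0, 2 * m * h - z0):
                total += dipole_potential(x, z - zi)
        s.append(total)
    return np.array(s)

def aitken(s):
    """One sweep of Aitken's delta-squared extrapolation."""
    return s[:-2] - (np.diff(s)[:-1])**2 / np.diff(s, 2)

s = partial_sums(12)
print("plain partial sum :", s[-1])
print("Aitken accelerated:", aitken(s)[-1])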
3. Determination of Resolution

Two procedures were adopted to determine the resolution. First, a good and simple approximation is to assume that the area of the total image spot corresponds to the Gaussian sum of the areas of the diffraction disk and of all aberration disks:
r² = (0.61 λ/ω_max)² + Σ |C_{ω^a ω̄^b ρ^c ρ̄^d κ^k}|² ω_max^{2(a+b)} ρ_max^{2(c+d)} κ_max^{2k}   (20)

(the sum extending over a + b + k > 0 with a, b, c, d, k ≥ 0)
where ω_max denotes the aperture opening angle, ρ_max the size of the image field, and κ_max the maximum relative energy deviation from the nominal energy. Through the condition a + b + k > 0, all terms of the aberration expansion that lead to a pure distortion of the image are excluded. This method of determining the resolution was used to incorporate the individual components of the system into a total system that takes account of tilt and displacement. This procedure furnished the adjustment tolerances given in Section III.C.4. Second, a more accurate method to predict the resolution involves point diagrams (Müller et al., 1999). For this purpose, one creates a statistical ensemble of electrons in an initial image plane with the initial conditions |ω| ≤ ω_max, |ρ| ≤ ρ_max, and |κ| ≤ κ_max. In the point diagram, one enters, for each electron, the lateral deviation from its ideal Gaussian image point caused by the aberration coefficients. The resulting point cloud gives, on the one hand, a good visual impression of the effect of the imaging aberrations; on the other hand, point and edge resolutions are easily defined from the size of the regions in which a certain portion of the points come to lie. In this method, the edge resolution was defined as the rise distance from 15% to 85% in intensity of the scanning spot. In the calculation of point diagrams, the effect of parasitic aberrations can easily be simulated by the insertion or variation of individual aberration coefficients. Thus the effect of a disturbance of the intrinsic symmetry of the beam separator is estimated by assuming that some of the symmetry-corrected aberration coefficients in the exit plane of the beam separator do not vanish but contribute a certain percentage of their maximum value. A comparison of the edge resolution of an ideal beam separator with one having 10% asymmetry is shown in Table 3 (see page 112). The correction of the chromatic and spherical aberrations of the objective lens by the mirror corrector permits, in the various modes and with a fixed aperture, resolutions between 1 and 1.5 nm at an energy width of 1 eV determined
by the energy filter. If one increases the aperture angle by a factor of 4, which corresponds to an increase in intensity of 16 times, a resolution limit smaller than 5 nm can be attained with an initial energy of 100 eV at the specimen. The imaging energy filter allows one to obtain an energy resolution down to 0.1 eV. With the present best PEEM/LEEM, the SpeLEEM (spectroscopic photoemission and low-energy electron microscope) at Clausthal, edge resolution limits of 10 nm in the LEEM mode and 22 nm in the PEEM mode have been demonstrated by employing the 15%/85% criterion (Schmidt et al., 2000). The hemispherical energy filter exhibits an energy resolution of 0.5 eV. The reduced resolution in PEEM operation is mainly caused by the long exposure times resulting from the unfavorable signal-to-noise ratio. Pure photoemission microscopes achieve spatial resolution limits between 20 and 130 nm (Anders et al., 1999; De Stasio et al., 1999; Watts et al., 1997; Ziethen et al., 1998).
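The point-diagram procedure and the 15%/85% edge criterion described above can be sketched numerically as follows; the acceptance bounds and the two axial aberration coefficients are placeholder assumptions, and only the round axial aberrations are included.

# A minimal sketch of the point-diagram method: random initial conditions are
# propagated through an assumed aberration expansion and the 15%-85% edge-rise
# criterion is evaluated. All coefficients below are placeholders.
import numpy as np

rng = np.random.default_rng(0)
N = 20000
omega_max, kappa_max = 0.3e-3, 0.5 / 15000   # assumed aperture and energy bounds

# uniformly filled aperture disk and uniform energy spread
phi = rng.uniform(0, 2 * np.pi, N)
omega = omega_max * np.sqrt(rng.uniform(0, 1, N)) * np.exp(1j * phi)
kappa = rng.uniform(-kappa_max, kappa_max, N)

C_c, C_s = -0.05, 20.0   # assumed chromatic (m) and spherical (m) coefficients

# lateral deviation from the Gaussian image point (axial aberrations only):
# chromatic C_c*omega*kappa plus spherical C_s*omega**2*conj(omega)
dr = C_c * omega * kappa + C_s * omega * np.abs(omega)**2
x = dr.real

# edge resolution: 15%-85% rise of the integrated (edge-spread) distribution
xs = np.sort(x)
edge_15_85 = xs[int(0.85 * N)] - xs[int(0.15 * N)]
print(f"edge resolution ~ {edge_15_85 * 1e9:.1f} nm")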
4. Accuracy and Stability Requirements

The high spatial resolution of 1 nm can be attained only if (1) the complete apparatus and the state of adjustment during the exposure of an image are adequately stable, (2) the individual components are adequately prealigned mechanically and finely adjusted by the deflecting elements, and (3) the internal fine finish and accuracy of each component are adequate to fulfill the high electron-optical specifications. Overall tolerances for the corrector and the objective lens were estimated under the assumption that any deviation from the ideal state should at worst result in an additional unsharpness of 0.25 nm in the image. Bearing in mind the deflection elements already in the apparatus, the individual components (objective lens, field lens, and electron mirror) must be positioned mechanically relative to the beam separator along the axis to within Δz = 2 mm. The requirements on the lateral displacement for the mirror and the objective lens are moderate at Δρ = 1.4 mm, whereas the axis of the field lenses must not be displaced by more than Δρ = 0.25 mm from the optic axis of the beam separator. A tilt of the objective lens and mirror of about Θ = 30 mrad is permissible, whereas for the field lens twice as much tilt is allowable. The mechanical stability requirements after the alignment of the system with the aid of the deflection elements are some three orders of magnitude more severe. They lie (a) for lateral displacement at 0.1 μm, (b) for displacement along the axis at between 7 and 85 μm, and (c) for tilt at about 0.1 μrad. The finishing accuracy of the beam separator edges should lie clearly below the tolerance limit that can be balanced with the aid of the adjustment coils on top of the grooves. This means that the permissible deviation in position must
lie below 20 μm and in angle below 1 mrad. The more accurately the edges are finished, the simpler the handling and adjustment of the beam separator become. So that local variations in the dipole field can be avoided, a surface roughness of less than 10 μm and an equally good plane parallelism, with the best possible constant plate separation of 7 mm, should be aimed at. The internal accuracy of the field lenses lies at 0.1 mm and is hence uncritical. The demands on the objective lens and the electron mirror are clearly higher. Deviations that do not harm the rotational symmetry (e.g., a wrong electrode separation) can easily be compensated for by a slightly altered excitation and are tolerable up to 0.1 mm. The rotational symmetry inside the individual components must, however, be ensured to an accuracy at least an order of magnitude better. Current and voltage variations in rotationally symmetric systems have, in general, a defocusing effect that provokes an unsharpness of the image in a fixed plane. Moreover, the aberration coefficients in the aberration expansion (12) depend on the excitation. In the beam separator and in the deflection elements, electrical instabilities lead mainly to variations in the exit positions and slopes of the electron trajectories, which propagate and accumulate as combination aberrations for the complete system. An additional unsharpness in the image remains below 0.25 nm if the power supplies for the objective lens and the mirror maintain a relative stability of 10^-6. The relative demands on the coil currents of the beam separator lie at about 10^-5, whereas for the supply to the deflection elements, stabilities of 2 × 10^-6 are needed. For the voltage supplies of the field lenses, a relative stability of 10^-4 is adequate. Because of the large number (about 100) of current and voltage supplies for exciting the deflection elements, an in-house modular construction incorporating computer control was mandatory on the grounds of both space and cost. Because of the total length of the straightened optic axis of some 2 m, attention must be paid to good magnetic shielding. Time-varying stray magnetic fields have to be reduced to |B_stray| < 8 × 10^-11 T.
IV. MECHANICAL DESIGN OF THE MIRROR CORRECTOR
From the electron-optical calculations there follow relatively high demands on the mechanical accuracy of the individual components of the mirror corrector. They are listed in Section III.C.4. In the construction, all tolerances must remain well below the upper limits given there. For the application of the instrument in surface physics, the ultra-high-vacuum capability of the equipment must be guaranteed. For this, the choice of material is limited to metals with sufficiently low vapor pressure and to special ceramics. With all mechanical connections, care should be taken that no enclosed volumes arise. Moreover, the equipment
must be bakeable to a temperature of at least 150°C in order to achieve the desired pressure of less than 10^-8 Pa at the specimen. Attention must be paid not only to the steadiness of the temperature but also to the thermal expansion coefficients of the different materials involved. The electron optics places severe demands on the magnetic properties of the materials. In the immediate vicinity of the beam, with the exception of the pole pieces, magnetized or magnetizable materials are to be avoided. This, in particular, limits the choice of steels considerably. The pole-piece materials should be those with high and homogeneous permeability as well as low coercivity. Finally, the complete equipment must be screened against stray magnetic fields by a double-walled shielding. Close to the electron beam, for the avoidance of charging-up effects, only conducting materials should be used. Materials with nonconducting oxides (e.g., aluminum) are excluded.

A. Beam Separator

The beam separator is set in a stable framework of stainless steel in precision bearings. The framework serves, at the same time, as a vacuum chamber and defines the mechanical interface for the other electron-optical components by means of suitable fittings. The small vacuum box between the pole plates is made from oxygen-free copper and is connected to the framework with bellows.

1. Pole Plates and Coils

The pole plates of the beam separator are made of VACOPERM 100. This is, like Mu-metal, a nickel-iron alloy with a nickel content of some 72-83%. Compared with Mu-metal, or the frequently employed PERMENORM, VACOPERM 100 exhibits a higher permeability together with a lower coercivity (see Table 1). Its lower saturation induction is not important at a magnetic flux density of 20 mT in the beam separator.

TABLE 1
MATERIAL PROPERTIES OF SPECIAL HIGH-PERMEABILITY MATERIALS^a

Material          Relative            Static             Saturation
                  permeability (μ_r)  coercivity (A/cm)  induction (T)
VACOPERM 100      200,000             0.015              0.78
Mu-metal          60,000              0.020              0.80
PERMENORM 5000    15,000              0.030              1.60

^a The data are taken from the brochure FS-M7 of the company Vacuumschmelze GmbH (Hanau, Germany).

After milling and boring, it is necessary,
in order to safeguard the homogeneity of the magnetic properties, to carry out a final heat treatment for several hours at a temperature of 1100°C in a hydrogen atmosphere at low pressure. At such high temperatures, a heat-induced distortion in the range of several tenths of a millimeter is unavoidable; this deviation must be compensated for by subsequent spark erosion and grinding. This procedure guarantees good magnetic properties and a mechanical accuracy of a hundredth of a millimeter.
FIGURE 13. Plan view of a pole plate of the beam separator together with a cross section through the groove of the field-generating coil. The cross section has been enlarged by a factor of 2. (Reprinted from Journal of Electron Spectroscopy and Related Phenomena, 84, No. 1-3, Fink et al., SMART: a planned ultrahigh-resolution spectromicroscope for BESSY II, pp. 231-250, 1997, with permission from Elsevier Science.)
FIGURE 14. Photographs of the beam separator. (Top) Pole plates with yokes and pins. (Bottom left) Pole plate with fitted main coils. (Bottom right) Sets of main coils (top) and flat adjustment coils (bottom).
Each pole plate of the beam separator is constructed with screwed-on "islands" between which the field-producing coils are embedded (see Figs. 13 and 14). With spark erosion one achieves an accuracy of the edge run of an "island" of 3-5 μm. Individual components are accurately placed, within 10 μm, in the baseplate by dowel pins. The separation of the two pole plates is determined by four yokes, while the relative position of the two plates to each other is ensured by four dowels to an accuracy of 10 μm. The surface roughness after grinding of the mounted pole plates is below 2.5 μm. The plane parallelism is adequately determined by the yoke height of (7 ± 0.005) mm.
The coils were manufactured with special forming and press tools and are set in high-temperature epoxy resin. The wire of the main coils has a rectangular cross section. In this way the accurate winding of multilayer, freestanding, concave-convex coils is feasible. Moreover, the filling factor of 90% clearly exceeds the values that can be achieved with round wire. The single-layer adjustment coils consist of two partial coils in order to influence the center of gravity of the current as well as the total current. All materials used for the coils sustain a temperature of at least 200°C.
FIGURE 15. Overall view of the beam separator with built-in pole pieces. The frame also serves as a vacuum chamber. A vacuum box of copper, which is fastened with membrane bellows to the outer frame, is located between the pole plates.
2. Framework of the Beam Separator

The framework of the beam separator establishes the connection of the surrounding components with the beam separator (see Fig. 15). The mechanical stability results from four welded plates of nonmagnetizable stainless steel with a thickness of 30 mm, which are shown in Figure 16. In the side walls,
FIGURE 16. Photographs of the framework of the beam separator. (Upper photograph, foreground) The mirror-side flange. (Lower photograph) The copper vacuum box with membrane bellows and upper pump supports as well as the outer surfaces of the beam separator.
flange inserts are fixed, fitted to each other with a precision of ±0.1 mm in all three spatial directions. The flange inserts are precision-turned components with fittings for holding the field lenses and for coupling up the other electron-optical components. They form, at the same time, the separation wall between the evacuated beam region and the surroundings. Toward the outside they present a standard ultra-high-vacuum DN 150 CF flange. The flange inserts thus define the position of the ideal optic axis. The beam separator must be positioned relative to this with an accuracy of ±0.1 mm. This can be achieved by displacing the beam separator on the reference surfaces, whose heights are guaranteed to ±0.02 mm. The fittings for centering the field lens and its outside diameter are finished to ±0.02 mm. A stainless-steel tube with a wall thickness of 10 mm binds the electron mirror and the multipoles directly to the mirror-side flange insert (see Fig. 17). Thus the position of the mirror relative to the optic axis is defined to an accuracy of about 0.1 mm. The electron source and the transfer lens system are adjusted by means of centering devices in the outer flange, while for the actual positioning of the objective lens suitable calipers are available. Between the pole plates of the beam separator there is a vacuum box with a height of 6.6 mm. The beam separator, with a pole-plate separation of 7 mm, can therefore be adjusted to a small extent in height. The vacuum box consists of two symmetric halves. In both halves, in the region of the optic axis and in a straight connecting line of the oppositely placed flanges, channels of a depth of 2 mm and a width of 16 mm are milled. The two halves were hard-soldered together (with nonmagnetic silver solder), with membrane bellows in the corners and two pump supports in the center of the vacuum box. The membrane bellows balance the different thermal expansions of the stainless-steel frame and the copper vacuum box during bakeout. The endpieces of the bellows are screwed, vacuum tight, onto the flange inserts with special aluminum seals. The upper pump support is also used for fixing the vacuum box, in order to relieve the (soft) copper box and the bellows mechanically.
B. Field Lenses and Electron Mirror
For the electrostatic field lenses and the tetrode mirror, one can choose between two well-known manufacturing procedures. In the first method, the electrodes are shrunk onto ceramic tubes, while in the second method, two electrodes are insulated from each other by three ceramic spheres. The second method seemed technologically simpler to carry out. Moreover, it permits a slightly smaller construction height for a lens. This is especially significant for the field lenses, since the center of the lens should lie as close as possible to the intermediate image (near the edge of the beam separator).
FIGURE 17. General view of the tetrode mirror, multipoles, and a field lens together with magnetic screening and the vacuum chamber made of Mu-metal. Between the multipoles and the field lens is a direct entry to the optic axis. Here, for instance, a test specimen can be introduced. The high voltage is led in straight lines to the mirror electrodes over suitably placed flanges.
A well-proven material for the electrodes is titanium. It poses no problems of magnetic inclusions, and most titanium oxides are electrically conducting. The precision spheres made of aluminum oxide are pore free. The deviations from nominal diameter and from sphericity lie, for spheres up to a diameter of 25 mm, below 3 μm. For the approximate selection of the sphere size, a rule of thumb can be applied: the diameter in millimeters should be at least the maximum potential difference in kilovolts between the two electrodes. In this case, the
voltage drop per unit length on the spherical surface lies below 1.3 kV/mm, if one assumes a quarter of the sphere's girth as the insulation path. Care should be taken that the spheres are hidden from the optic axis to avoid charging effects. The manufacturing of a lens (or of the electron mirror) proceeds in several steps. First, the electrodes are premachined and the sockets for the spheres attached. Second, the lens is assembled with hardened steel spheres of comparable accuracy. With a hydraulic press, the final seating of the spheres is pressed into the sockets. In this way, the relative position of the electrodes is fixed. After the pressing, all the electrodes of a lens are put, in one setting, in a lathe for final machining of one side. Finally, the back face of each electrode is individually finished. The use of special feeds enables the finishing of a lens to be carried out within the machining accuracy (better than 10 μm) of a conventional lathe. In the field lenses* (electrostatic einzel lenses), a maximum voltage of 10 kV on the central electrode is adequate for all modes of the SMART. Correspondingly, spheres with a diameter of 10 mm were used. The maximum local field strength on the electrode surfaces is less than 10 kV/mm. This avoids any electric breakdown problems during the operation of the lens. The design of the lens is not symmetric with respect to the midplane of the lens, as one may deduce from Figures 17 and 18. In this way, the center of the lens is only 13 mm away from the beam separator's edge and, at the same time, the spheres are hidden from the beam. The electrode geometry of the tetrode mirror was determined in such a way that the chromatic and spherical aberrations of the objective lens can be simultaneously corrected in all operational modes without exceeding a maximum local field strength of 10 kV/mm. The spheres were so dimensioned that a voltage of 20 kV can be applied to the reversing electrode, while between the other electrodes the potential difference remains less than 15 kV. As an option for the alignment of the optic axis, the reversing electrode contains a small transmission hole. Behind it, a Faraday cup is mounted for measuring the electron intensity. The optic axis is adjusted by two multipole elements, whose construction is described in Section IV.C.1. The direct fixing of the mirror and the multipoles at the side wall of the beam separator has two advantages: The position of the mirror is independent of the accuracy of the vacuum chamber. This is very favorable, since with a welded Mu-metal construction the best tolerance that can be guaranteed is around ±1 mm. Moreover, the vacuum chamber and the mirror are connected mechanically only by means of the framework of the beam separator. Hence, the intrinsic stability of the electron-optical corrector system is increased.

*The field lens was calculated and designed by S. Planck as part of her diploma thesis.
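The quoted 1.3 kV/mm follows directly from the rule of thumb above; the short check below uses the field-lens values just given (10-mm spheres, 10 kV).

# A quick check of the sphere-sizing rule of thumb: the diameter d (mm) should
# be at least the maximum potential difference U (kV), with a quarter of the
# sphere's girth taken as the insulation path.
import math

d_mm = 10.0        # sphere diameter chosen for the field lens
U_kV = 10.0        # maximum potential difference between the two electrodes

path_mm = math.pi * d_mm / 4    # quarter of the girth, ~7.85 mm
drop = U_kV / path_mm           # voltage drop per unit length
print(f"{drop:.2f} kV/mm")      # ~1.27 kV/mm, below the 1.3 kV/mm quoted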
LOW-VOLTAGE ELECTRON MICROSCOPES
81
FIGURE 18. Field lens and electron mirror. Both elements are based on the same principle. The insulation and positioning of one electrode relative to another is ensured by three high-precision ceramic spheres.
C. Multipoles

1. Electric-Magnetic Multipole Elements

Between the mirror and the beam separator, two electric-magnetic multipole elements are situated with a separation of 100 mm. Figure 17 shows a longitudinal cross section through the multipole arrangement. Both multipoles consist of an electrostatic dodecapole and a magnetic octopole. In addition to their role as a double-deflection element (for aligning separately the optic axes for incoming electrons and those reflected by the mirror), they can serve as stigmators for the compensation of residual aberrations. With the dodecapole, quadrupole and hexapole fields of any desired azimuthal orientation can be produced, as well as, with a slight loss of quality, octopole fields. The magnetic octopole should be used exclusively for the generation of dipole and quadrupole fields. Since the necessary magnetic field strength for the stigmators is relatively low, pole pieces can be dispensed with. Figure 19 shows the construction of a multipole element. The electrostatic multipole consists of 12 molybdenum wires led through two ceramic holders. The holders are mounted on titanium tubes that at the same time determine the effective field length. The wires are surrounded by a winding support of bronze on which eight adjustable coils are wound. The effective length of the magnetic fields is set by an external cylinder and two disks of high-permeability material. The effective length of both the electric and the magnetic field lies around 25 mm. The connection of the dodecapole is established by attached contacts, while the coils and their leads are screwed into a ceramic ring (see Fig. 19). The two multipoles are fastened to the baseplate of the mirror. The mirror-side multipole is centered on a cylinder of aluminum that carries the second multipole. Generally, with screw connections in ultra-high-vacuum devices, care must be taken over bakeout capability. Therefore, nuts and threads of different nonmagnetic materials were chosen. In the immediate vicinity of the electron beam, molybdenum, titanium, and bronze were used exclusively.
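How the 12 electrodes synthesize a multipole field of chosen order and azimuth can be sketched as follows; the voltage amplitudes are arbitrary illustrative numbers, not operating values.

# A sketch of multipole synthesis with a 12-electrode dodecapole: electrode i
# at azimuth theta_i is driven with V_i proportional to cos(m*(theta_i - theta_0)).
import numpy as np

theta = 2 * np.pi * np.arange(12) / 12   # electrode azimuths

def electrode_voltages(m, amplitude, theta_0=0.0):
    """Voltages generating a 2m-pole (1 = dipole, 2 = quadrupole,
    3 = hexapole, 4 = octopole) rotated by theta_0."""
    return amplitude * np.cos(m * (theta - theta_0))

V_quad = electrode_voltages(2, 100.0, np.pi / 8)  # quadrupole, rotated 22.5 deg
V_hex  = electrode_voltages(3, 50.0)              # hexapole
V_oct  = electrode_voltages(4, 20.0)              # octopole: only 3 electrodes per
                                                  # period, hence the slight loss
                                                  # of quality mentioned above
V_total = V_quad + V_hex + V_oct                  # superimposed excitations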
2. Additional Magnetic Deflection Elements Additional deflection elements were needed for testing the individual components of the mirror corrector in a conventional SEM. The microscope was split between the aperture and the objective lens, in order to insert adapter flanges (see also Section V.A). The deflection elements provide a fine adjustment of the optic axes of the corrector and the microscope under test, as well as the intentional deflection of the illuminating electron beam for characterizing the electron-optical properties of the corrector.
FIGURE 19. Electric-magnetic multipole elements. (Left) The multipoles as finally mounted and connected. (Top right) The combined electric dodecapole with a magnetic octopole superimposed. (Bottom right) The assembled (mirror-side) multipole, hidden behind the aluminum cylinder. The arrows indicate where the individual components are situated in the final assembly.

The double-deflection element shown in Figure 20 is mounted on the adapter flange between the electron source of the SEM and the framework of the beam separator. A further single-deflection element of the same type is located between the beam separator and the objective lens of the test microscope. The construction of the deflecting elements needs to satisfy only high-vacuum requirements. This greatly simplifies the design. Magnetic deflection elements were chosen, since the mechanical outlay is easier in comparison with that needed for electrical elements. Moreover, the electrical requirements are simpler, since facing coils can be connected in series. The coil bobbins are attached to brass tubes of 7-mm outside diameter. A bobbin consists of two anchor-shaped bearing plates, whose separation is fixed by soldered rods. The coils
FIGURE 20. Magnetic double-deflection element. In the test bed, the deflection element is located between the aperture of the modified SEM and the beam separator.
are wound with 28 turns of lacquered copper wire. The desired geometry was held to better than 0.2 mm.
V. TESTING OF THE MIRROR CORRECTOR
A. Measurement Arrangement

The various components of the SMART must be tested individually because of the complexity of the system as a whole. Testing the mirror corrector, consisting of the beam separator and the tetrode mirror, in the finished apparatus was impossible in view of the time available. The construction of
a similar system with an electron source, an objective lens, and a projector system also seemed unreasonable. Furthermore, the beam separator had to be investigated separately. In the first test phase, the imaging properties were to be characterized for a 90° deflection through one quadrant of the beam separator. In the second phase the tetrode mirror was to be attached; for this purpose two quadrants of the beam separator must be used. For both phases of the test, a suitable electron-optical bench must be available. Since only one quadrant of the beam separator was to be tested, a test bed such as that of a direct-imaging LEEM was not possible, since for the electron illumination and the imaging, one quadrant of the beam separator is needed for each function. If, instead, one illuminates the specimen with photons from an ultraviolet lamp (PEEM), one is forced into time-consuming ultra-high-vacuum technology and has additionally to contend with intensity problems. There remained two sensible possibilities for analyzing the image quality of both parts of the mirror corrector: the components to be investigated could be integrated in a TEM or in an SEM. The schematic construction of a TEM is shown in Figure 21. The condenser is operated so that the specimen is illuminated normally or slightly convergently. The objective lens and the projector lenses image the transmitted electrons with adjustable magnification in the observation plane. New electron-optical components can be tested by inserting them in place of a lens
FIGURE 21. Schematic diagram of a transmission electron microscope. The condenser lens system provides an almost-uniform illumination of the specimen. The objective lens and the projector lens system image the exit plane of the specimen with variable magnification in the observation plane.
behind the specimen or by placing the components in an intermediate plane. The latter corresponds to the position of the mirror corrector in the SMART. For testing, on the one hand, one can make use of the illuminating system as usual to illuminate, for example, a copper mesh. This would then be imaged through the component to be tested and magnified by the projector. The distortions of the image thus obtained allow one to draw conclusions about the imaging properties of the component being characterized. On the other hand, by removing the specimen, one can set up the optical system as a whole in such a way that an image of the source appears in the observation plane. By arranging double-deflection elements in front of and possibly behind the component, one can generate aberration figures; the positional deviation of the image of the source can be analyzed in terms of displacement and tilt. Finally, diffractograms of amorphous specimens can be used to analyze the aberrations. If an SEM, the construction of which is shown in Figure 22, is used, the individual components are inserted in the beam path between the illuminating system and the objective lens. The aberrations in the components to be tested lead to a distorted scanning spot at the specimen. On the one hand, the imaging properties can be determined by the achievable point or edge resolution of the modified equipment compared with theoretical predictions. On the other
FIGURE 22. Schematic diagram of the unmodified SEM. In front of the aperture plane, the zoom condenser produces a demagnified intermediate image of the source, with variable magnification. This image is further demagnified by the objective lens and imaged on the specimen as a scanning probe. The scanning coils tilt the illuminating beam in such a way that a square object region can be scanned distortion free. The detector signal from each raster point is displayed on a monitor whose deflector coils are synchronized with the displacement of the probe.
hand, aberration figures can again be obtained by inserting a double-deflection element in front of the new components. The quality of the scanning remains unaffected, since the scanning coils lie in the path of rays behind the new component, close to the objective lens.

We used an available commercial SEM. The beam separator was integrated into the microscope without difficulty by means of two adapter flanges. A suitable TEM was not available to us. Building a TEM as a test bed from individual components from various pieces of apparatus seemed unreasonable, since the complete peripheral equipment, such as the vacuum system, current and voltage supplies, and control system, would have had to be built from scratch. A big advantage of the SEM is that it usually operates with electrons at the nominal energy E_n = 15 keV used in the SMART, so that standard test specimens can be employed. A disadvantage of the SEM is that it does not transfer a large field of view. The size of the transferred field of view of the components to be tested can be checked only sequentially with the aid of additional deflecting elements. This disadvantage is, however, more than compensated for by the advantages. The arrangement of the electron-optical elements in the scanning electron microscope ZEISS DSM 960 placed at our disposal is shown in Figure 22. The microscope is a purely magnetic system. The electrons emitted by the thermionic cathode are accelerated to a selectable nominal energy between 1 and 30 keV. The two condenser lenses produce a demagnified intermediate image of the source at a selectable magnification. The diameter of the electron beam used to form the scanning spot by means of the objective lens is limited by a manually operated aperture mechanism, which is also used to center the beam onto the objective lens. Two pairs of crossed coils serve as scanning elements, so that scanning can take place about a point, preferably the coma-free point of the objective lens. In this section of the column there are also situated two quadrupole windings, rotated by 45° relative to each other, that serve as stigmators. The secondary electrons emitted from the specimen and/or backscattered electrons are recorded by a side-mounted detector. This consists of a gridlike collector whose bias voltage can be varied in order to distinguish between secondary and backscattered electrons. The electrons that pass through the grating are accelerated and strike a scintillator with an attached photomultiplier. As shown in Figure 23, the unmodified microscope reaches a resolution limit of 14 nm at an accelerating voltage of 10 kV and a working distance (WD) of 4 mm. The WD is defined as the separation of the specimen from the front pole-piece face of the objective lens. The resolution achieved is in agreement with theoretical calculations for an aperture with a diameter of 40 μm. As the intermediate image of the source is located 66 mm in front of the aperture plane, the aperture angle with respect to the intermediate image
FIGURE 23. The optimum resolution of the scanning electron microscope DSM 960 is 14 nm at an accelerating voltage of 10 kV and a working distance (WD) of 4 mm. The image at the top is taken at a magnification of 100,000. To extract the intensity profile below, we averaged the intensity values inside the box, indicated above, along the vertical direction.
amounts to 0.3 mrad. If one further assumes a full width at half maximum of the energy distribution of ΔE = 3 eV, the diameters of the scanning spot are found to be d_70% = 10 nm and d_90% = 17 nm. The indices denote the percentage of the electrons that are focused into a circle of the given diameter. As a first step, the framework of the beam separator was mounted between the specimen chamber (with objective lens and scanning coils) and the illuminating system including the aperture (see Fig. 24, without the electron mirror on the left). The modified peripheral equipment (vacuum system, water cooling, wiring) and most notably the first field lens were tested. In the beginning, the lengthened microscope column was very sensitive to stray magnetic fields with a frequency of 50 Hz. On the one hand, this was caused by the absence
FIGURE 24. Arrangement for testing the mirror. Two electric-magnetic multipole elements are located between the tetrode mirror and the upper field lens. They serve, on the one hand, as double-deflection elements for the independent adjustment of the optic axis of incoming and outgoing electrons and, on the other hand, as complex stigmators for the compensation of residual aberrations in the system as a whole. By switching off the beam separator, one can operate the scanning microscope in straight transmission. The resulting additional drift length then leads to an increased diameter of the electron bundle in the objective lens.

of magnetic shielding in the region of the adapter flanges; on the other hand, the pole pieces of the beam separator were found to have an unexpectedly low shielding factor. The measurements and the necessary changes to the construction are discussed in Section V.B. After the successful test of the first field lens, the beam separator was added to the system as shown in Figure 25. Initially, the imaging properties of the different field lenses No. 1 to No. 4 differed considerably because of technological difficulties, which had to be solved (see Section V.C). Additional deflecting elements in the region of the adapter flanges were attached to the
FIGURE 25. Test bed for the characterization of the beam separator. The beam separator together with two electrostatic field lenses is integrated into the SEM by means of adapter flanges between the aperture and the objective lens. A double-deflection element is needed for aligning the optic axis with the axes of the upper field lens and of the beam separator. In order for the electrons to strike the objective lens centrally, a further deflection element behind the lower field lens is necessary.
microscope for alignment of the optic axes of all imaging elements and for the recording of aberration figures. The characterization of the electron-optical properties of the beam separator and their improvement toward the theoretical predictions are summarized in Section V.D. In addition, the chromatic and spherical aberrations of the system without the mirror were measured, as described in Section V.E. Finally, the complete mirror corrector was installed in the test bed. The arrangement can be seen in Figure 24. The tetrode mirror and the two electric-magnetic multipole elements (working as stigmators and as a double-deflection element) are assembled sideways at the framework of the beam separator. In this setup the simultaneous correction of chromatic and spherical aberrations was proven beyond any doubt. The theoretical resolution limit of 4.5 nm, with
the intermediate image of the source positioned about 135 mm in front of the edge of the beam separator and an aperture angle of 0.3 mrad with respect to this plane, has not been reached so far. The results obtained with the electron mirror are presented in Section V.F.
B. Improvement of Magnetic Shielding

The SEM with the built-in beam separator was extremely sensitive to stray magnetic fields, both in straight transmission through the deactivated beam separator and in the first tests of a 90° deflection. The initial presumption that this was due to insufficient screening in the region of the field lenses over a length of some 50 mm could not be confirmed. A further improvement of the shielding beyond that provided for the test bed was necessary, since for high-resolution micrographs the stray magnetic field was not sufficiently reduced. The measured deflection of the electron beam, with an amplitude A = 200 nm in the object plane, was so large that it could not be explained by the four 5-mm-long gaps at which no material of high permeability was present. The strength B of the magnetic flux required to produce so large a ray deviation can be estimated geometrically with the aid of Figure 26. A deviation A in the specimen plane corresponds to a (virtual) displacement of the intermediate image by Δ = M · A, in which the magnification of the objective lens
FIGURE 26. Sketch of the geometric construction used to estimate the influence of stray magnetic fields in the test microscope. The four regions without screening in the neighborhood of the beam separator are combined into one region. The beam separator itself is omitted for clarity.
amounts to M = 24 in the case of a simple 90° deflection through the beam separator. If one assumes that a homogeneous magnetic field B over a length l of 20 mm at a distance d = 130 mm from the intermediate image is responsible for the displacement of the electron bundle, one obtains for small deviation angles, α = l/r = Δ/d, a necessary strength of the magnetic flux density on the axis of

|B| = √(2U_a m/e) · M A/(d l) = 8.3 μT   (21)
to produce the measured deflection A. This relation results from equating the Lorentz and centripetal forces,

|F| = e v |B| = m v²/r   (22)
and from the nonrelativistic energy relation E_n = eU_a = (m/2)v². The measurement of the amplitude of the stray field in the vicinity of the microscope yielded values ranging from 0.1 to 0.4 μT. This shows that the observed displacement of the electron probe in the specimen plane by several hundred nanometers cannot be explained by the air gaps in the screening alone. However, the strong influence of the perturbations became plausible after measurement of the falloff of stray fields between the pole pieces of the beam separator. To create a well-defined stray magnetic field, we placed a large coil 80 cm underneath the beam separator and supplied it with an adjustable alternating voltage at a frequency of ν = 50 Hz. The measurements were performed with a small pickup coil. The induced voltage was amplified and displayed on an oscilloscope or measured with a digital voltmeter. Curve (d) in Figure 27 shows the amplitude of the field created by the excitation coil with the beam separator removed. If the beam separator is brought into the field (a), one can see that the field is strongly damped up to a few millimeters in front of the edge, as in the case of measurements along the symmetry axis of a shielding cylinder (e). For the beam separator, however, in contrast to a cylindrical opening, there is a strong edge increase by a factor of 2. Thereafter, the stray field decreases slowly over several plateaus right into the center of the beam separator. The plateaus thereby reflect the inner structure of the beam separator: they correspond to regions between two grooves in the surface of the pole-piece plates. With a cylinder, there are no such edge effects. In the latter case the value of the magnetic flux density at the edge amounts to just one third of the maximum value outside and decreases nearly exponentially to below the detectable level. The beam separator therefore screens the stray fields insufficiently. Its screening properties can, however, be significantly improved by simple
[Figure 27 legend: (a) beam separator alone; (b) beam separator with ring; (c) beam separator with ring and side sheets; (d) stray field without beam separator; (e) screening cylinder.]
FIGURE 27. Decline of the vertical magnetic field for a frequency of 50 Hz in the midsection of the beam separator. The magnetic field is generated by a coil 80 cm below the beam separator. The damping can be improved considerably by attaching a ring of Mu-metal at the front face as well as by sealing the side flanges with Mu-metal. For comparison, the magnetic field without the presence of the high-permeability material is shown, as well as the field reduction along the central axis of a cylinder of Mu-metal with a wall thickness of 1.5 mm and a diameter of 96 mm.
methods. If a Mu-metal ring (inner diameter 94 mm, outer diameter 112 mm, thickness 7 mm) is pressed on the side surface of the beam separator on which the measurement is performed (b), then the magnetic field at the beam axis is reduced by a factor of 2. A further improvement of the screening can be obtained by covering the remaining side surfaces with Mu-metal sheets (c). The Mu-metal rings, which were originally fixed at a distance of 1 mm from the beam separator edge, are now pressed onto the side surfaces with springs. This ensures an improvement of the screening by more than a factor of 2.
This measurement cannot be brought into agreement with the estimate obtained by the method of the magnetic circuit (see, for example, Joos, 1989). While the measurement yields an attenuation of the stray field by a factor of 5-10 with respect to the external field, according to the method of the magnetic circuit (apart from small deviations in the edge area) the magnetic flux density in the air gap B_a should be constant, attenuated by the factor

B_a/B_o = (1/μ_r) · (A_y + A_a)/(A_y + A_a/μ_r) ≈ 10^-4   (23)
with respect to the external magnetic flux density B0. In this case, the permeability μr of the pole-piece plate was taken to be 50,000. The cross-sectional area of the four yokes amounted to Ay = 169 cm² and that of the air gap to Aa = 615 cm². The method of the magnetic circuit fails on account of the special three-dimensional structure of the beam separator, with its unfavorable width-to-height ratio of the air gap, and the sensitivity of high-permeability material to the skin effect even at low frequencies (see, for example, Joos, 1989). The penetration depth t, at which an external homogeneous alternating field is reduced by a factor 1/e in a plane plate of conductivity σ and permeability μrμ0, amounts in the present case to

$$t = \frac{1}{\sqrt{\pi \nu \mu_r \mu_0 \sigma}} = 0.14\ \mathrm{mm} \tag{24}$$

where we have assumed a conductivity of 5 m/(Ω mm²) at a frequency of ν = 50 Hz. This means that the total magnetic flux generated by stray fields is transported along the surface of the pole plates. In this connection, the measured edge increase is understandable. A refinement of the method of the magnetic circuit, taking into account the skin effect through changed cross sections (surface × penetration depth) and small air gaps between pole plates and yokes, does reduce the discrepancy between theory and measurement. However, all this does not yet succeed in explaining the measured results.
The measurements shown in Figure 28 demonstrate the influence of the width-to-height ratio of the air gap between two yokes in a simple model system. The massive pole plates were replaced by 1.5-mm-thick Mu-metal sheets. Every two yokes were fastened back to back. One obvious difference between the beam separator (a) and the model system (b-d) is the missing edge structure. This is due to the small thickness of the sheets compared with that of the massive pole plates. All measurements show, however, the formation of a plateau in the field strength starting at a depth of 40 mm. The height of the plateau sinks rapidly with decreasing yoke separation d. Therefore, an effective improvement of the screening effect of the beam separator can be achieved by
FIGURE 28. Decrease of the vertical magnetic field at a frequency of 50 Hz for different yoke distances in a simple model system. This system consists of the four yokes of a beam separator and two Mu-metal sheets of thickness 1.5 mm. The magnetic field was induced by a coil placed 80 cm below the sheets. Between the two sheets, the field strength falls to a plateau, whose height decreases with decreasing yoke separation. In the beam separator the field strength again sinks to a plateau, if one neglects the influence of the grooves. The absence of edge enhancement in the model system can be explained by the smaller thickness of the sheets compared with that of the pole plates.
a modification of the yokes. The long gap along the side faces in the region of the edges can be closed off in the neighborhood of the optic axis. This is possible without covering the field-producing coils. Such modified yokes are required if the corrector is installed in the SMART. For the test bed we decided to cover the beam separator with an additional U-shaped Mu-metal sheet with a thickness of 1.5 mm. The screening plate is shown in Figure 33. It is fastened with two screws on the flange
(right), which is not used for the testing. Without heat treatment the screening factor was 2, but it increased to 5 after heating in a vacuum oven. The performance of the screening sheet and the application of Mu-metal rings onto the side faces of the beam separator are cumulative and adequate for the test bench. The total improvement obtained is shown in Figure 29. The upper image was taken without any shielding in the region of the beam separator. It shows an astigmatic distortion of the probe of about 1 μm caused by the static magnetic flux of the earth's magnetic field. The flux passes from the pole plates of the beam separator to the iron circuit of the objective lens. This corresponds to the construction of a magnetic cylinder lens. Magnetic cylinder lenses image astigmatically (with some exceptions which require a special shape of the field and special specimen positions). A temporary solution for the problem is to compensate for the (integral) magnetic flux with a coil wound over the adapter flange with an outer diameter of 200 mm (see Fig. 24). The result achieved with an excitation of 10 ampere-turns is shown in the central image of Figure 29. The resolution limit amounts to 200 nm. With another coil on the upper surface of the beam separator, no further improvement of the resolution was achieved. Therefore a nearly seamless junction of additional screening elements and the side surfaces of the beam separator is mandatory for the SMART. At the same time this also damps out dynamic stray fields. However, the simple improvements of the screening described in this section are sufficient for the test equipment. At high magnification, as shown in Figure 29, gold clusters with a full half-width of 40 nm are still visible. The theoretically predicted resolution limit for the extended (about 380-mm) column lies at 20 nm for the aperture in use (diameter, 40 μm).
C. Field Lenses
Field lens No. 1, built as a prototype, was tested before the beam separator was completely assembled. For this, a setup as shown in Figure 24 was used. Instead of the mirror, a blind flange was fitted. For the investigation of the field lens, the objective lens was switched off and the electron bundle was focused instead by the (lower) field lens. The performance of the field lens is documented in the upper photograph in Figure 30. Because of the inadequate screening over a length of some 50 mm above and below the beam separator, the influence of stray magnetic fields of a frequency of 50 Hz is visible as a wavy distortion of the copper mesh: for the chosen exposure time, mains synchronization occurs at the start of the sweep of each line. Each line of the image thus begins at the same mains phase, which leads to an almost identical displacement of the scanning spot in
FIGURE 29. Images taken with the beam separator in straight transmission. (Top image) Owing to inadequate screening around the lower field lens, the image is unsharp. (Center image) Compensation for the static magnetic flux between the beam separator and the objective lens by means of the compensation coil (see Fig. 24) reduces the unsharpness considerably. (Bottom image) The theoretical resolution limit was reached for the first time with the aid of all the additional screening measures in operation.
FIGURE 30. Imaging characteristics of the field lens with switched-off beam separator. The influence of the stray magnetic fields of frequency 50 Hz is reduced by an order of magnitude through the improvement of the magnetic shielding. The electron probe is formed with the lower field lens instead of the objective lens.

the object plane by the stray field. The scan time for a row of pixels is around 60 ms. The attainable point resolution of the field lens at higher magnification can only be estimated, at around 800 nm, on account of the strong stray field. The position of the field lens is unfavorable for an SEM, with a very large distance of 160 mm to the specimen and 425 mm from the intermediate image. For an aperture diameter of 70 μm, a best resolution of only 300 nm can be attained. On the basis of these results, three further lenses of the same type were put into service.
The lower image in Figure 30 was taken with the measuring equipment shown in Figure 24 in straight transmission. The beam separator, electron mirror, and objective lens were switched off. The recording time corresponded to that of the upper image, while the magnification was eight times higher. Above and below the beam separator, screening cylinders were built in. The beam separator was enclosed in a U-shaped Mu-metal sheet. The bright image regions indicate the crossing point of a structured, orthogonal copper grid; the support film appears dark. Since the scanning coils lie in the region of the magnetic field of the objective lens (which is switched off, but usually causes Larmor rotation), the copper strips seem not to intersect at right angles; the image is slightly sheared. Influences of stray fields of frequency 50 Hz are no longer detectable in this micrograph. At the start of the tests of the simple 90° deflection, field lens No. 1 was mounted in the lower position and lens No. 2 in the upper position of the beam separator. The optic axes of the field lenses determine the position of the beam separator. Deviations of the beam separator from the ideal position of up to ±0.1 mm are permissible according to theoretical calculations. A suitable displacement of the beam separator belongs to the basic adjustment of the system. For this, the aperture is adjusted so that the upper field lens is irradiated centrally by the electron bundle. A well-tried test method for the adjustment is that of wobbling. For this, one superimposes on the direct voltage of several kilovolts an alternating voltage component of some 100 V with a frequency of about 1 Hz. During the periodic defocusing due to the alternating voltage, the image may be unsharp; however, it should not drift in position. Thus, it is guaranteed that the central trajectory of the electron bundle in the lens coincides with an unrefracted central beam. In a second step, the lower lens can be centered by a horizontal displacement of the beam separator. After this procedure, the scanning spot and thus the image position exhibited a displacement of 244 μm, depending on whether the electrons were focused by the upper or the lower field lens. This means that despite the wobbling, at least one of the field lenses deflects the beam (assuming a length of 172 mm to the specimen plane) by an angle of 1.4 mrad. A beam tilt of this magnitude is tolerable in the SMART, where the excitation of the field lenses remains more or less constant. However, since the tetrode mirror, with appreciably higher requirements on accuracy, was completed by the same procedure, an explanation for the beam tilt was necessary. The behavior of both field lenses was investigated by the so-called through-focus method. For a fixed image position, one lowers the refractive power of one of the lenses while raising the refractive power of the other lens. The measurement of the image shift as a function of the voltage on the relevant central electrodes is shown in Figure 31. This figure shows clearly that field lens No. 1 is responsible for the tilt of the beam. The angle of deflection is proportional to
FIGURE 31. Image shift on through-focusing from upper to lower field lens as a function of the voltage on the corresponding middle electrode. While field lens No. 1 exhibits a large linear contribution of 50 μm/kV, the initial increase at field lens No. 2 remains less than 2.5 μm/kV.
the voltage applied at the central electrode. This behavior is incompatible with an ideal, but displaced, round lens. In that case the beam tilt would be proportional to the product of the displacement and the refractive power. The refractive power of a thin electrostatic einzel lens is, in the nonrelativistic case, given by

$$\frac{1}{f} = \frac{3}{16}\int \left(\frac{\Phi'(z)}{\Phi(z)}\right)^{2} dz \tag{25}$$

where Φ(z) denotes the electric potential on the axis. For small voltages U on the central electrode, the refractive power can be estimated as

$$\frac{1}{f} = \frac{3\, l_{\mathrm{eff}}}{16\, d^{2}} \left(\frac{U}{U_a}\right)^{2} \tag{26}$$
where Ua = 15 kV is the accelerating voltage of the electrons, leff is a length of the order of magnitude of the lens extension, and d is the electrode separation. This formula holds under the assumption |U| ≪ Φ(z) ≈ Ua and the simple approximation |Ez(z)| = |−Φ′(z)| ≈ |U|/d for the field strength on the axis between two electrodes. Hence, for a thin electrostatic einzel lens irradiated noncentrally, the induced beam tilt increases quadratically with the voltage on the middle electrode. The reason for the imperfect behavior of field
FIGURE 32. Effect of the lateral displacement of the individual electrodes of a field lens, shown for field lens No. 3. For clarity, the displacements are exaggerated. As a result of the displacement of the two front electrodes in opposite directions, influence charging arises. The strength of the resulting dipole fields is proportional to the voltage −U on the central electrode. Because of the dipole field, an incoming electron acquires an additional radial velocity component that causes a deviation Δ of the image, increasing linearly with the strength of the dipole field and hence with the voltage on the middle electrode.
lens No. 1 must be a departure from rotational symmetry, which can cause charging at the electrodes. In this case, the induced charge increases linearly with the voltage U on the central electrode. The effect of the influence charging is clarified in Figure 32 for field lens No. 3. The displacement shown occurred by chance in a plane section after a first remachining. Through the contrary displacement of the front and central electrodes, influence charging occurred at the electrodes in such a way that the relationship between the mechanical deviation from rotational symmetry and the tilt of the beam was obvious. In other cases, on account of noncollinear displacements, such a simple directional correlation is not possible. After the determination of the beam tilting, the mechanical precision of all field lenses was measured in the lathe with clock gauges. The results are shown in Table 2 together with the corresponding beam tilting. Field lenses No. 1 and No. 3 could be successfully remachined. The bores in the electrodes now fluctuate by about 5 μm, which corresponds to the accuracy of the lathe in use. In the case of field lens No. 2, a beam tilt is no longer detectable. For field lens No. 4, it seemed that the pressed seatings were inadequate. A hint may be that, depending on the tightening torque of the screws, two different positions were taken up, which appeared both in the mechanical displacement and in the beam tilting. It therefore made no sense to remachine field lens No. 4.
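The diagnostic logic can be made concrete with a small sketch. For an ideal but laterally displaced einzel lens, the beam tilt is the displacement times the refractive power of Eq. (26) and therefore grows quadratically with U, whereas the measured tilt of field lens No. 1 grew linearly with U; all numbers below (lens length, gap, displacement) are purely illustrative:

```python
import numpy as np

# illustrative values only (not the actual lens geometry)
l_eff = 10e-3    # m, effective lens length
d     = 3e-3     # m, electrode separation
Ua    = 15e3     # V, accelerating voltage
rho   = 0.1e-3   # m, assumed lateral displacement of the lens

U = np.linspace(0.0, 2e3, 5)                      # central-electrode voltage (V)
k = 3.0 * l_eff / (16.0 * d**2) * (U / Ua)**2     # Eq. (26), refractive power in 1/m

# ideal displaced round lens: tilt = rho * k, i.e. proportional to U^2;
# induced dipole charges instead give a tilt that is linear in U
for u, tilt in zip(U, rho * k):
    print(f"U = {u/1e3:4.2f} kV -> tilt = {tilt*1e6:6.2f} µrad")
```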
TABLE 2
SUMMARY OF MECHANICAL ACCURACY AND INDUCED BEAM TILT OF THE FIELD LENSES

                  Initial state*               Final state*
Field lens    ψ (μrad/kV)   ρmax (μm)     ψ (μrad/kV)   ρmax (μm)
No. 1             315          150             <7            <5
No. 2             <14          <10             <7            <5
No. 3             145          150             <7            <5
No. 4              38          ≈20            ≈27           ≈20

* The beam tilt ψ is measured for an excitation of the lens of 1 kV, whereas ρmax gives the maximum lateral displacement of two electrode bores relative to each other.
On the basis of the experience gathered with the field lenses, the tetrode mirror was likewise reworked. The bores of the mirror electrodes now vary with a precision of better than 6 μm relative to each other. This result is highly satisfactory, since the dimensions of the mirror are considerably larger than those of the field lenses and the mirror has twice as many ball seatings.
D. Beam Separator

The test of a single 90° deflection was performed with the setup shown in Figure 33, in which the electron-optical elements were arranged as in Figure 25. The beam separator is integrated into a conventional SEM at a position between aperture and objective lens. Ideally, at a 90° deflection, it images its entrance plane 1:1 into its exit plane. Thus the beam separator acts optically (like an 8f arrangement of round lenses) as an element of zero length, which does, however, introduce aberrations. Because of the adapter flanges, the effective length of the column is increased by 103 mm over that of the unmodified microscope. The testing of the beam separator involves two tasks: the development of a suitable adjustment strategy and the estimation of the aberrations. For both tasks it is necessary to operate the microscope in various modes; these are summarized in Figure 34. In measuring mode (A) the microscope is operated in the usual way. The requirements on the beam separator are more severe than later in the SMART, since the extension of the electron bundle inside the beam separator is, on average, three times as large. The cause of this is the different position of the intermediate image. In the test equipment the intermediate image of the source lies 135 mm in front of the edge of the beam separator, while the objective lens of the SMART images the specimen
FIGURE 33. Test bed for the beam separator. The beam separator is located behind a screening sheet of Mu-metal. The position of the individual electron-optical elements is shown in Figure 25.

into the entrance plane of the beam separator. The beam separator, with its residual aberrations, does not reduce the resolution in the test microscope, so in any case a successful operation of the uncorrected SMART is guaranteed. Measuring method (B) makes the residual aberrations of the beam separator visible. In this method, the probe is formed by the upper field lens and the beam separator. Since the objective lens is switched off, the aberrations of the beam separator are transferred without demagnification to the object plane.
FIGURE 34. Overview of the different characterization modes. The beam separator is omitted, since it images its entry plane 1:1 into its exit plane. For forming a scanning probe, the intermediate image of the source situated behind the condenser is imaged by either the objective lens (A) or the upper field lens (B) into the specimen plane. In mode (C) the upper field lens creates an additional intermediate image in the plane of the deflector element O, which is imaged by the objective lens onto the specimen.

Measuring method (C), with an additional intermediate image of the source in the plane of the deflector element O, is used for the determination of the spherical aberration of the objective lens. With the deflection element O, the illuminating beam can be tilted about the intermediate image without any displacement. For the basic adjustment of the system, the position of the aperture and hence the direction and position of the electron bundle entering the beam separator
are chosen in such a way that in mode (B) the upper field lens is irradiated centrally. The current Is through the series-connected main coils of the beam separator is then finely adjusted so that the electrons in the specimen plane are stigmatically focused. The beam separator is then adjusted horizontally until, in this mode, the center of the lower field lens is struck. This can be checked by applying a voltage to the middle electrode of the lower field lens, whereupon the image position should not change. Finally, one can ensure that, in mode (A) with the deflection element O, the electron bundle passes through the center of the objective lens. For this purpose, a wobbler is available for the coil current of the objective. The basic adjustment of the system just described does not guarantee that the optic axis of the beam separator coincides with the central trajectory of the electron bundle. This must be considered separately. A good criterion is the vanishing linear dispersion coefficient of the beam separator. This can be checked by varying either the electron energy En or the current Is in the main coils of the beam separator. Under the premise that the magnetic flux density B in the beam separator is proportional to the coil current Is and that relativistic effects can be neglected, Eq. (22) gives Is ∝ B ∝ √Ua. For adequately small variations, relative current and energy variations are therefore equivalent within a factor of 2:

$$2\,\frac{dI_s}{I_s} = \frac{dE_n}{E_n} = \frac{dU_a}{U_a} \tag{27}$$
In the initial investigations, the beam separator showed a residual dispersion for small variations of the main current. The results for large energy variations in measuring mode (A) were, however, more promising. The two upper images in Figure 35 show the image shift caused by an energy increase from 10 to 12 keV. The main current of the beam separator remained unaltered at a nominal energy setting of 10 keV. The microscope control system merely increased the excitation of the objective lens appropriately; this had no effect on the image shift, as tested by the wobbler. The measured image shift of 28 μm is larger by a factor of 2 than that calculated for an ideal beam separator, but smaller by a factor of 20 than that for a beam deflection caused by a homogeneous magnetic field of the same strength (radius of curvature R = 18.55 mm). By refocusing and changing the aperture position (different from the basic adjustment), we were able to improve the imaging quality (lower image). The resolution was limited by the energy-dependent residual aberrations of the beam separator. The relative energy deviation κ amounted in this case to 0.2. In the normal operation of an SEM, the maximum relative energy deviation lies at around 2·10⁻⁴, as set by the energy width of the source. The beam separator transfers electron bundles of even much larger energy deviations. As Figure 36 shows, electrons with energies between 8 and 15 keV pass through the beam separator, although this is set up for a nominal energy
FIGURE 35. Measurement of the residual dispersion of the beam separator. For all images the coil current is matched to an accelerating voltage of 10 kV. If one raises the electron energy to 12 keV (middle image), the image is displaced by about 28 μm. This is 20 times less than the displacement expected for a homogeneous magnetic field of the same strength. By a subsequent adjustment of the aperture position and the focus, one obtains a sharp image (lower image).
of 10 keV. In the process, the beam tube, only a few millimeters in diameter in the region of the objective lens, was not struck. The image displacement was compensated for by a displacement of the specimen stage. A further investigation of the image displacement for small variations of the current applied to the coils of the beam separator showed that the dispersion
FIGURE 36. Transfer of the electron bundle through the beam separator for extreme energy deviations. The image shift caused by dispersion was compensated for. Furthermore, the focus and aperture position were matched to the changed electron energy.
can be influenced by the aperture position. For a more accurate analysis of this dependence, a double-deflection element was inserted behind the aperture. This also allowed the separate setting of the beam tilt and displacement (these are coupled on moving the aperture). By equal and opposite excitation of the two deflector elements B1 and B2 (see Figs. 25 and 34) it is possible, starting from
the basic adjustment just described, to displace the electron beam entering the beam separator. An upward deflection of the electron bundle by 0.2 mm led to a dispersion-free passage of the beam separator for current variations ΔIs/Is of two parts per thousand. This means that the beam separator images free from dispersion of first degree (the linear dispersion coefficient in the aberration series). This gives a criterion for an adequate vertical adjustment of the beam separator. In accordance with it, the seatings for the pole plates were lowered by 0.2 mm. Thus, the optic axis of the upper field lens and the entrance position for a "dispersion-free" passage (in the previous sense of the word) through the beam separator coincide. The method of the basic adjustment remains unchanged. Measuring method (B) allows the determination of the dispersion coefficients of higher degree of the beam separator. One measures the image shift in the object plane as a function of the electron energy, leaving the excitation of the beam separator unchanged with Is = 739.1 mA; this corresponds to a nominal energy of En = 15 keV. The measurement shown in Figure 37 does not agree
FIGURE 37. Displacement of the image as a function of electron energy in measurement mode (B) for the determination of the residual dispersion of the beam separator. The excitation of the beam separator remains constant at a nominal energy of En = 15 keV. The measured values can be brought into agreement with the theoretically predicted values if one allows for a residual linear dispersion coefficient of −1 mm and a residual quadratic dispersion coefficient of 10 mm. These values correspond to 1% and 10%, respectively, of the maximum values of the coefficients arising within the beam separator.
with the theoretical values for the ideal beam separator without dispersion of first or second degree. If one assumes, however, that the linear dispersion coefficient is −1 mm and the quadratic coefficient is 10 mm, the calculated values lie within the error limits of the measurement. The sizes of the two coefficients correspond to 1% and 10% of their maximum of 100 mm inside the beam separator. The linear coefficient can be further reduced through a more exact incidence in the micrometer range. The nonvanishing quadratic component is probably caused by a lack of symmetry due to imperfect machining. A residual dispersion of second degree of such magnitude is, however, permissible for the beam separator attached to the SMART. In this connection, it may be noted that the previously mentioned adjustment coils for the (internal) fine-tuning of the beam separator were not used for any measurement. After the basic adjustment described previously, in connection with a "dispersion-free" passage through the beam separator, we were ready to characterize the beam separator in terms of the achievable edge resolution in measuring modes (A) and (B). As shown in Figure 38, in measurement mode (A), at a nominal energy of 15 keV, an edge resolution limit of 9.5 nm was obtained in the xz section, the midsection of the beam separator. This value corresponds to about the resolution limit of the unmodified microscope. According to theoretical calculations, a resolution of 7 nm in the xz section and 11 nm in the yz section can be reached in this mode with an aperture diameter of 40 μm. The experimental result shows that the residual aberration of the first tested quadrant of the beam separator does not limit the resolution of the SEM. This result guarantees that the beam separator can be incorporated into the (uncorrected) SMART without loss of resolution. The cross fringes visible in the image were caused by a beating fault in the turbomolecular pump. This mechanical disturbance was removed by changing the pump. Thereafter the mechanical construction was stable enough to reach resolutions of less than 10 nm. The residual aberrations of the beam separator can be determined only in measurement mode (B). With a large aperture (φ = 200 μm) the geometric aberrations produce their strongest effect. The choice of a smaller aperture is not sensible: although the attainable resolution with an aperture of diameter 40 μm lies below 100 nm, on account of the absence of demagnification of the probe by the objective lens, accompanied by a magnification of the intermediate image by a factor of 1.6 through the upper field lens, the residual aberrations can no longer be detected, for two reasons. On the one hand, the displacement of the scanning spot due to the stray magnetic fields is enlarged by a factor of 40. This makes an unambiguous determination of the resolving power impossible. On the other hand, the size of the probe is no longer limited by aberrations but by the geometric size of the image of the source. The limitation through the source size itself lies, at the maximum demagnification of the source by the condenser, at about 150 nm in imaging mode (B). Moreover, with stronger excitation of the
FIGURE 38. The resolution of the scanning electron microscope DSM 960 with the beam separator in measuring mode (A) amounts to 9.5 nm in the xz section, at an acceleration voltage of 15 kV. This resolution is comparable to that of the unmodified apparatus. The intensity profile shown in the lower half of the figure is obtained by averaging the intensity values in the box indicated above, along the vertical direction.
condenser, the emitted intensity per solid angle becomes increasingly lower, so that the signal-to-noise ratio becomes unfavorable and the contrast is reduced. By contrast, with a large aperture the signal-to-noise ratio is high: the total intensity in the probe increases quadratically with the aperture diameter. The attainable resolution limit of more than 400 nm can then be discerned without difficulty.
FIGURE 39. Edge resolution in measuring mode (B) in the x direction (parallel to the slit between the pole pieces). The resolution demonstrated in the first quadrant of the beam separator lies close to the theoretically expected value of 490 nm for the ideal device.
Figure 39 shows the determination of the edge resolution of the first quadrant of the beam separator in the xz section (the midsection of the beam separator). For comparison with the theoretical predictions, a criterion for the edge resolution was applied by which the signal between the two plateaus, in front of and behind an edge of the copper mesh, rises from 15% of the height difference to 85%. The resolution is close to the theoretical limit for the chosen aperture. For the energy width of the source, a value of 3 eV was assumed.
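The 15%-85% criterion is straightforward to apply to an averaged line profile; a minimal sketch follows (the logistic profile below is synthetic, shaped only to resemble the measurement of Figure 38):

```python
import numpy as np

def edge_resolution(x, intensity, lo=0.15, hi=0.85):
    """Distance over which a monotonically rising edge profile climbs from
    15% to 85% of the step between its two plateaus."""
    i_min, i_max = intensity.min(), intensity.max()
    x_lo = np.interp(i_min + lo * (i_max - i_min), intensity, x)
    x_hi = np.interp(i_min + hi * (i_max - i_min), intensity, x)
    return abs(x_hi - x_lo)

x = np.linspace(0.0, 150.0, 301)                      # position in nm
profile = 170 + 80 / (1 + np.exp(-(x - 75) / 2.74))   # synthetic edge profile
print(f"edge resolution = {edge_resolution(x, profile):.1f} nm")  # ~9.5 nm
```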
TABLE 3
EDGE RESOLUTION LIMIT ATTAINED BY IMAGING WITH THE UPPER FIELD LENS AND ONE QUADRANT OF THE BEAM SEPARATOR IN COMPARISON WITH THEORETICAL PREDICTIONS*

                                           Edge resolution limit (nm)
Properties of the beam separator         xz section        yz section
Theory        Ideal                          490              1450
              10% asymmetry                 1600              2200
Experiment    1st quadrant                   550              1500
              2nd quadrant                   650              1700
              3rd quadrant                   550              1500
              4th quadrant                   700              1600

* The midsection of the beam separator is denoted as the xz section. The aperture in use has a diameter of 200 μm. An asymmetry of 10% at the beam separator means that for each symmetry-corrected aberration coefficient, 10% of its maximum fails to cancel at the exit plane owing to imperfections.
The results for all quadrants of the beam separator in both sections are summarized in Table 3. The measured resolutions closely approximate the theoretical predictions for an ideal beam separator, and they deviate widely from the values for a symmetry violation of 10%. In the case of quadrants 2, 3, and 4, however, a tilt of the incident beam of a few milliradians was needed. The tilting was performed about the center of the upper field lens. The influence of the violation of the intrinsic symmetry of the beam separator on the edge resolution is included in the calculations in the following way: one assumes that of each symmetry-corrected aberration coefficient a certain portion, for example, 10%, of its maximum value is not canceled. The attained edge resolution for the xz section is at the same time a proof of the quality of the field lens, since its aberrations already limit the resolution to 450 nm. On the basis of the dispersion measurements and the proof of the edge resolution in measuring mode (B) with a large aperture, a successful insertion of the beam separator in the corrected SMART is to be expected. The values for the edge resolution provide a good upper estimate of all the geometric aberrations. Their values, without application of the flat adjustment coils of the beam separator, are clearly less than those of the beam separator with a disturbance of its symmetry by 10%. Earlier (Müller et al., 1999) the reduction of all symmetry-corrected aberration coefficients to 2% of their maximum value was claimed. This criterion cannot be verified from measurements of the edge resolution. It is, moreover, a very severe criterion that need not be
fulfilled for all coefficients. Therefore, with the aid of the adjustment coils, any necessary fine adjustment in the completed SMART should be possible without problems.

E. Determination of the Chromatic and Spherical Aberration Coefficients
For the tests of the electron mirror, a method of measuring the chromatic and spherical aberrations must be available. With the aid of the equipment with a single 90° deflection through the beam separator shown in Figure 25, a method for measuring both aberrations was developed. A further possibility for the measurement of the chromatic aberration was used in the equipment with the mirror (see Fig. 24). The determination of the chromatic aberration follows directly from the defocus of the scanning probe as a function of the electron energy. The chromatic aberration coefficient Cc is, according to Eq. (14), the linear coefficient of the series expansion of Δf with respect to the relative deviation κ:

$$\Delta f = -\Delta z = \kappa\, C_c + \kappa^{2} K_c + \cdots, \qquad \text{with} \quad \kappa = \frac{\Delta E}{E_n} = \frac{\Delta U_a}{U_a} \tag{28}$$
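Once Δf has been measured at a few energy offsets (methods for this follow below), Cc and Kc drop out of a simple least-squares fit of Eq. (28); a minimal sketch with invented numbers of the right order of magnitude:

```python
import numpy as np

Ua  = 15e3                                                     # V
dUa = np.array([-60.0, -40.0, -20.0, 20.0, 40.0, 60.0])        # V
df  = np.array([-0.084, -0.056, -0.028, 0.028, 0.057, 0.085])  # mm (synthetic)

kappa = dUa / Ua
# Eq. (28) through the origin: design matrix with columns kappa and kappa^2
A = np.column_stack([kappa, kappa**2])
(Cc, Kc), *_ = np.linalg.lstsq(A, df, rcond=None)
print(f"Cc = {Cc:.1f} mm")   # slope at kappa = 0, here about 21 mm
```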
For measuring the defocus there are, in principle, several methods available. The most straightforward would be to move the specimen stage in height after an alteration of the electron energy in such a way that the image is once more sharp. Unfortunately, with the specimen stage currently available, the height cannot be recorded with sufficient accuracy. A further possibility is the calibration of the excitation of the objective lens. This was performed with the aid of various high-precision methods that have an accuracy of ±0.5 μm. The mechanical tolerance lies clearly below the depth of focus of the objective lens of several micrometers. The calibration is shown in Figure 40. The slopes Δf/ΔI_OL = (0.0177 ± 0.0003) m/A are identical within the frame of measuring accuracy for two different (optical) distances between the intermediate image of the source and the specimen. The separation of the intermediate image from the center of the objective lens amounts to 573 mm for straight transmission and, with the use of the beam separator and the mirror, to 293 mm. In the setup for testing the beam separator (see Fig. 25), another method was applied. The defocus Δf induced by an altered electron energy can be calculated from the excitation of a further lens which is used to refocus the image, the excitation of the probe-forming lens and the position of the specimen plane remaining constant. For a positive chromatic aberration coefficient and an increase in electron energy, one refocuses the image with the aid of a field lens. From the voltage UFL on the middle electrode, the refractive power
FIGURE 40. Relationship between the focal length of the objective lens and its coil current, relative to a WD of 4 mm, for various separations of the intermediate image from the objective lens. For all cases, the defocus Δf depends linearly on the variation ΔI_OL of the lens current with approximately the same slope.
k of the lens can be calculated with an accuracy of a few percent by the method set out in Section III.C. From the refractive power and the geometric positions of the lenses in the microscope, one obtains the relevant defocus Δf by utilizing the matrix method described in the Appendix. For a positive chromatic aberration coefficient and a lowered energy, it is necessary to preexcite the refocusing lens weakly with a voltage ŪFL; the excitation of the lens which focuses the scanning probe into the fixed image plane is changed only slightly. From the voltage UFL on the central electrode needed for refocusing after reduction of the accelerating voltage Ua, one obtains the defocus as the difference Δf = Δf(UFL) − Δf(ŪFL). In mode (A), in which the objective lens forms the raster spot, both field lenses can be used for refocusing. From Eq. (33) and the distances a = 112 mm, b = 173.5 mm, and c = 12 mm for the upper field lens and a = 139 mm, b = 146.5 mm, and c = 12 mm for the lower lens, respectively, the defocus is found to be

$$\Delta f_1/\mu\mathrm{m} = 22\,\frac{k}{1 - 0.066\,k} \qquad \text{and} \qquad \Delta f_2/\mu\mathrm{m} = 34\,\frac{k}{1 - 0.068\,k}$$

where the refractive power k of the relevant lens is measured in m⁻¹.
In method (B), the lower field lens is used for refocusing. With the relevant equation (34) and the lengths a = 112 mm, b = 27 mm, and c = 162.5 mm, one obtains for the defocus

$$\Delta f/\mathrm{mm} = 26.4\,\frac{K}{1 + 0.16\,K}$$

with K given in m⁻¹. The measurement carried out in mode (A) is shown in Figure 41. It serves for the determination of the chromatic aberration coefficients of the system consisting of the beam separator and the objective lens. The value Cc = (20.8 ± 0.7) mm obtained from both field lenses lies between the theoretical values for the chromatic aberration coefficients, Cc^(theo) = (19.1 ∓ 3.9) mm for the xz section and the yz section, respectively. The asymmetry is produced by the beam separator. In this measurement it was not possible to focus sharply on one of the sections separately. The value of the voltage applied to the middle
FIGURE 41. Defocusing Δf introduced by varying the accelerating voltage, 14.8 kV + ΔUa, in measuring mode (A). The defocus is determined by exciting a distinct field lens for a constant WD according to Eq. (28). The slopes of the regression lines directly represent the total chromatic aberration coefficient Cc of the beam separator and the objective lens: Cc = 21.5 mm on refocusing with the upper field lens and Cc = 20.1 mm with the lower field lens. They agree, within the framework of measurement accuracy, with the theoretical values for the xz and yz sections, respectively.
electrode of the field lens corresponded to an average value of the defocus at which the image showed the best resolution. The determination of the chromatic aberration coefficients for the individual sections of the beam separator succeeded in method (B). In this case the chromatic aberration of the beam separator predominates. For the measurements shown in Figure 42 a copper mesh was used as the specimen, whose bars were oriented along the principal axes of the beam separator. This allowed independent focusing in each section. Instead of the previously used designation Cc for the round component of the chromatic aberration, the quantities Cακ and Cβκ from the aberration expansion were used for the linear chromatic aberration coefficients in the xz and yz sections. In this connection, it may be noted that in a rotationally symmetric system Cc = −Cακ = −Cβκ holds, since the aberration coefficients Cα and Cβ are related to Δz = −Δf. The measured chromatic aberration coefficient Cακ = −3.4 m in the xz section corresponds exactly with the theoretical prediction, while the coefficient in the yz section, with −Cβκ = 10.4 m, deviates slightly from the theoretical value −Cβκ^(theo) = 11.6 m.
FIGURE 42. Determination of the defocusing Δf as a function of the deviation ΔUa from the nominal Ua = 14.8 kV by excitation of the lower field lens in measurement mode (B). The (negative) slope of the regression lines can be interpreted directly as the chromatic aberration coefficients Cακ and Cβκ of the system of upper field lens and beam separator in the xz and yz sections, respectively. These values agree well with the theoretical values Cακ^(theo) = −3.4 m and Cβκ^(theo) = −11.6 m.
Both measurements show that the chromatic aberration coefficients can be determined with sufficient accuracy. The results are in good agreement with the calculated values for the beam separator in combination with electrostatic and magnetic round lenses. To determine the spherical aberration coefficient of an SEM, one can measure the image shift as a function of the beam tilt α (or β) induced in an intermediate image of the source. In this case, the central trajectory of the electron bundle no longer coincides with the optic axis, but with the axial ray of slope α. The maximum aperture angle is denoted by α0. In the beam, there are then electrons that are inclined to the central trajectory at the (complex) angle θ with |θ| ≤ α0. Real and imaginary parts of the angle θ are defined as the angles enclosed by the z axis and the projection of the electron trajectory onto the xz or yz section, respectively. From the aberration expansion (12), one obtains the image shift if one replaces the complex angle ω with α + θ or iβ + θ, respectively. The image shift is given by the power series with respect to α and β, obtained by collecting all terms containing equal powers of α and β. Besides the direct contribution of the corresponding aberration coefficients (e.g., Cααα for the third-order coefficient), additional terms appear; for the trajectories deviating from the central ray, these additional terms contain powers of θ and aberration coefficients of higher order. These lead to an unsharpness of the image that depends on the beam tilt α. The size of the resulting aberration disk can be reduced if the maximum aperture angle α0 is chosen as small as possible. The optimum size is limited by the required signal-to-noise ratio. In our test equipment, good results were obtained with an aperture diameter of 20 μm. During a measurement, it is necessary to work at different magnifications and with different object details. The magnification is chosen so that the deviation can be clearly detected. The detail observed with the beam tilt switched on and off must be sufficiently large so that, despite increasing blurring of the image with increased tilt, the detail can be distinguished in the image and, at the same time, its displacement can be measured. A polynomial of third order is fitted to the measured values. The third-order coefficient gives directly the spherical aberration coefficient in the relevant section. The linear component represents the defocus. It can be removed by refocusing. Quadratic terms vanish for round lenses and for the ideal beam separator for symmetry reasons. The presence of quadratic contributions thus reveals a lack of symmetry. As a way to check the measuring method, the spherical aberration of the objective lens alone was determined. For this, one uses measuring method (C) in which, with the aid of the upper field lens, an additional intermediate image is placed in the plane of the deflecting element O. The fine focusing of the upper field lens was carried out by wobbling the deflector with a very small amplitude. The corresponding beam tilt (with respect to the intermediate
FIGURE 43. Image shift as a function of the beam tilt about the intermediate image in the plane of the deflector element O, at a WD of 4 mm. On the horizontal axis, the object-side tilt angle α = −α′·M is shown. This angle is related to the aperture-side angle α′ by means of the magnification M = 8.4 of the objective lens. The nominal energy is En = 14.8 keV. The third-order coefficient of the polynomial fitted to the measured curve gives the spherical aberration coefficient Cs. The theoretical value is Cs^(theo) = 25.3 mm.
image) amounted to 0.5 mrad. If the image in the object plane does not move, then the field lens is correctly excited. Finally, the focus of the objective lens is precisely adjusted. The deflection element O thus produces a beam tilt about the intermediate image needed for the measurement of the spherical aberration. The measurement is shown in Figure 43. The fitted polynomial of third order for the deflection y_o in the object plane takes the form

$$y_o/\mu\mathrm{m} = -0.02 - 5.6\,\alpha + 231.1\,\alpha^{2} + 23{,}700\,\alpha^{3}$$
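Extracting Cs from such a tilt series is a plain cubic fit. A minimal sketch, with synthetic data shaped to resemble the measurement of Figure 43 (the noise level and sampling are invented):

```python
import numpy as np

# object-side tilt angles (rad) and image shifts (µm), synthetic data
alpha = np.linspace(-0.1, 0.1, 9)
shift = -0.02 - 5.6*alpha + 231.1*alpha**2 + 23_700*alpha**3
shift += np.random.default_rng(0).normal(scale=0.3, size=alpha.size)

c3, c2, c1, c0 = np.polyfit(alpha, shift, 3)   # highest power first
print(f"Cs ~ {c3/1e3:.1f} mm")                 # cubic term, µm -> mm
print(f"linear (defocus) term = {c1:.1f} µm/rad, quadratic term = {c2:.1f} µm")
```

A significant quadratic term would, as noted above, signal a broken symmetry.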
The coefficients up to the second order are smaller by orders of magnitude than that of the third order. From this, the value of the spherical aberration coefficient, Cs = (23.7 ± 3) mm, agrees with the calculated value Cs^(theo) = 25.3 mm. The measured value depends critically on the calibration of the tilt angle. The calibration of the deflector O by measuring the image displacement in mode (B) is certainly accurate to 1%, but the magnification of the objective lens alters significantly with the WD, which is not known very accurately, so that the specimen-side angle can be determined at best to within about 5%. In the system with the beam separator and the objective lens, measurement mode (A) was used to determine the total spherical aberration. The beam tilt for this is performed with the double-deflecting element B. The two elements,
FIGURE 44. Image shift in method (A) with respect to the x and y directions, induced by a beam tilt in the same section about the intermediate image behind the condenser, at a nominal energy of En = 14.8 keV. The magnification of the objective lens amounts to M = 24 for a WD of 4 mm. The object-side angle shown on the horizontal axis is proportional to the intermediate-image-side angle with the slope −M. The third-order coefficients of the fitted polynomial for each section give the spherical aberration coefficients Cααα and Cβββ, respectively.
which are arranged between the aperture and the upper field lens, are coupled so that they tilt the beam virtually about the intermediate image just behind the condenser. By synchronous wobbling of both deflection elements, one can set the coupling constant in such a way that, for small amplitudes (beam tilt around 0.1 mrad), the image in the specimen plane does not move. The result of the tilt series is shown in Figure 44. Just as with the determination of the chromatic aberration, one must distinguish between the x direction and the y direction. Instead of the notation Cs for the round portion of the spherical aberration, one uses the notations Cααα and Cβββ for the spherical aberration coefficients in the xz section and the yz section, which appear in the series expansion of the aberrations. For the polynomials fitted to the measured displacements y_o and δ_o in the object plane, which are shown in Figure 44, one obtains

$$y_o/\mu\mathrm{m} = -0.04 - 3.6\,\alpha + 38\,\alpha^{2} + 53{,}000\,\alpha^{3}$$
$$\delta_o/\mu\mathrm{m} = -0.01 + 11\,\beta + 124\,\beta^{2} + 51{,}000\,\beta^{3}$$

The third-order coefficients Cααα = 53 mm and Cβββ = 51 mm are significantly larger than the theoretical values Cααα^(theo) = 23.5 mm and Cβββ^(theo) = 29.5 mm. The mixed coefficient Cαββ has not been measured. Owing to the structure of the corresponding aberration integral, the actual value of this
FIGURE 45. Image displacement in measurement mode (B) in the x and y directions, respectively, as a function of the beam tilt in the corresponding section about the intermediate image behind the condenser, for a nominal energy of En = 14.8 keV. The magnification M of the upper field lens amounts to 0.61. The tilt angle refers to the object side. It is related to the intermediate-image-side angle by the factor −M. The third-order coefficients of the polynomial fitted to the data for each section give the spherical aberration coefficients Cααα and Cβββ, respectively, while the linear components of the fit represent the defocus.
coefficient will not deviate significantly from the geometric mean of the coefficients Cααα and Cβββ. In mode (B), a similar result is also found in the measurement shown in Figure 45, taken with the upper field lens and the beam separator. The coupling of the deflector elements B1 and B2 need not be altered, since the positions of the intermediate image and of the specimen remain unchanged. The fitting polynomials take the form

$$y_o/\mathrm{m} = -1\,\alpha + 4200\,\alpha^{3}$$
$$\delta_o/\mathrm{m} = 0.6\,\beta + 10{,}300\,\beta^{3}$$

from which the spherical aberration coefficients Cααα = 4.2 km and Cβββ = 10.3 km are obtained. The calculated values Cααα^(theo) and Cβββ^(theo), however, are considerably smaller.
The discrepancy between experiment and theory cannot be explained by an inaccuracy in the angle calibration. The angle calibration of the double-deflector elements was carried out in measurement mode (B), in which the deflectors B1 and B2 are excited in the ratio 1:2. This corresponds to a tilt through the center of the upper field lens. The angle calculated from the image displacement is accurate to 1%. The reason for the discrepancy is that the beam separator does not in fact image 1:1, but instead acts like a weak diverging lens. This leads to apparently larger angles behind the beam separator than those created by the double deflector. The angle calibration is almost unaffected, since on tilting through the center of the upper field lens, the axial distance in the entrance plane of the beam separator is negligible. The tilt angle induced by the beam separator is proportional to this distance. In measurements of the spherical aberration, the illuminating bundle is tilted about the intermediate image behind the condenser. Even for small beam tilts, the distance of the central trajectory of the bundle in the entrance plane of the beam separator from the optic axis is appreciable. An accurate quantification of this effect is difficult. The (negative) refractive power of the beam separator can be estimated with the double-deflection element B. To center the lower field lens, one must, on account of the separations of the individual elements along the axis, install a coupling of the elements B1 and B2 of 1:−1.43, while the measurements give 1:−1.54 ± 0.02 in the xz section and 1:−1.46 ± 0.02 in the yz section. The difference between the two sections is sufficient to explain why the spherical aberration coefficients measured in mode (A) are in contradiction to the inequality Cβββ > Cααα predicted by the calculations. The center of the objective lens is struck in the experiment at a ratio of 1:−1.15 ± 0.03, in contradiction to the calculated ratio of 1:−1.10. On account of the measuring uncertainty, the refractive power of the beam separator can be said only to lie somewhere between −1 and −4 m⁻¹. Furthermore, the applied voltage of U = 7.45 kV on the central electrode of the lower field lens deviates significantly from the theoretical value U^(theo) = 7.12 kV for focusing the electron bundle in the object plane. From all this, and assuming that the field lens is exactly machined and finished, a refractive power of the beam separator of −1.7 m⁻¹ results, which, however, could be different in each section. Under the assumption of this refractive power, in measuring method (A) the axial distance of the electrons from the axis in the objective lens is increased by 20%. For the objective lens, the intermediate image appears to lie closer to the beam separator. The virtual intermediate-image-side angle is thereby increased by about 20%. This apparently larger angle, compared with the value actually chosen at the double-deflection element, explains the difference between theory and the experimental data shown in Figure 44. The discrepancy of the measurement shown in Figure 45 in the yz section can be explained through a higher excitation of the upper field lens to compensate for the negative
refractive power of the beam separator. The value in the xz section, which is too large by a factor of 3, points to cross talk from the yz section. It is possible that the tilt of the electron bundle about the intermediate image was not sufficiently well oriented relative to the main axes of the beam separator. Although the diverging action of the beam separator influences the measurement of the spherical aberration coefficient enormously, this is, in relation to the effective internal refractive power, an error of only about 1.5%. In the equivalent 8f arrangement of four round lenses, the refractive power of each individual lens amounts to 30 m⁻¹. The action of the beam separator as a diverging lens can be explained in that the electrons inside the complete beam separator are focused slightly too weakly. This can be balanced by an increase of the main current in the coils of the beam separator. However, the stigmatic imaging is thus lost. To guarantee both imaging properties, one must use adjustment coils or stigmators. Initial investigations with the superposition of an additional static magnetic field by coils wound on the yokes show that, in this way, the applied current for stigmatic focusing can be varied.
F. The Electron Mirror
To investigate the electron mirror, one needs the beam separator. The beam separator separates the incoming electrons from the reflected ones by two deflections of 90°. With the successful characterization of all quadrants of the beam separator and the measuring method for determining the chromatic and spherical aberration coefficients, the preliminaries for the test are complete. The measuring setup is shown in Figure 46. The electron source with the condenser and the aperture is mounted on the upper flange of the beam separator. The electron mirror with the two multipole elements is attached inside the vacuum chamber flanged on the left. The positions of the individual electron-optical elements may be seen in Figure 24. For all measurements, the multipoles between the mirror and the upper field lens were operated as an electrostatic double-deflection element. Instead of the 24 high-precision voltage sources of the final construction stage, in this case only 8 sources were needed. The two deflection elements are, in what follows, referred to as S1 and S2, where the element S1 lies next to the mirror. For a first basic adjustment of the system, it was appropriate to leave the position of the beam separator unaltered from that in the test of the single 90° deflection, since the beam separator was already positioned with respect to the flanges of the framework of the beam separator that defines the optic axis. With the aperture, one aims at the center of the lower field lens. An irradiation of the
FIGURE 46. Test equipment for investigating the electron mirror. The mirror is attached at the left inside its vacuum chamber made of Mu-metal; the electron source is flanged on top of the framework of the beam separator. The positions of the individual electron-optical elements are depicted in Figure 24.

beam separator deviating from this can be achieved by the double-deflection element B. The introduction of a small displacement of 15 μm proved useful, in order to guarantee the dispersion-free irradiation of the beam separator. The central irradiation of the objective lens was ensured by a suitable excitation of the deflection element O. The main current Is in the coils of the beam separator was so chosen that, upon switchover from direct transmission (Is = 0)
to imaging with the mirror, the setting of the stigmator of the objective lens remained as unchanged as possible. To adjust the axis of the mirror, a displacement of only a few micrometers was necessary. For this purpose, both electric deflection elements S1 and S2 were excited in the ratio 1:−1. The adjustment was carried out by wobbling the voltage U0 on the reversal electrode of the mirror. With a beam tilt about an arbitrary point on the optic axis, the mirror could also be centered, but the attainable resolution decreased notably, probably because the reentrance of the reflected electrons into the subsequent quadrant of the beam separator occurred at the wrong position. The basic adjustment did not yet guarantee that the upper field lens would be hit centrally by the entering and departing electrons. Moreover, without a magnetic deflection element between the mirror and the upper field lens, the reentry of the electron bundle into the beam separator after reflection at the mirror cannot be freely adjusted. A definitive and reproducible alignment strategy remains to be developed. The micrographs of gold clusters on carbon in Figure 47 were taken with different mirror arrangements. With the three mirror voltages and the voltage on the central electrode of the upper field lens, the chromatic and spherical aberrations can be varied over a wide range, while at the same time the focal length and magnification are kept constant. The latter were so chosen that an image at the edge of the beam separator is imaged mirror-reversed onto itself. In the upper micrograph, the mirror images without introducing chromatic and spherical aberrations. In the middle image, both aberration coefficients were adjusted so that the round aberration component of the total system is optimally corrected. For the bottom image, twice-as-large aberration coefficients were chosen at the mirror. The aberration-corrected image reveals the highest contrast. The resolution is 24 nm, as in the upper image. Despite the washed-out structures in the lower image, there are regions showing a resolution of 30 nm. The remarkable increase of contrast is an important advantage of a corrected electron microscope and means, in the case of an SEM, that a higher portion of the electrons is focused in the central region of the probe, whereas in the uncorrected case the foot of the probe profile is broadened. The theoretical limit of 4.5 nm, assuming an aperture with a diameter of 40 μm, has not been attained yet. However, if one takes into account the diverging action of the beam separator with a refractive power of −1.7 m⁻¹, the attainable resolution lies at only 11 nm. A further cause of the reduced resolving power could be the imperfectly controlled simultaneous irradiation of the two quadrants of the beam separator. The demands on the correct alignment in the test equipment, compared with those on the operation in the SMART, are significantly higher, since the diameter of the electron bundle inside the beam separator is, on average, three times larger than in the SMART. This comes from the position of the intermediate image of the source at a distance of 135 mm from the beam separator. Comparable
FIGURE 47. Micrographs of gold clusters on carbon with different mirror settings. (Upper micrograph) The potentials on the mirror electrodes are chosen so that the mirror does not introduce chromatic or spherical aberration. (Middle micrograph) The mirror is excited so that both aberrations of the complete system are compensated for. (Lower micrograph) The chromatic and spherical aberrations amount to the negative value of the system without the corrector.
conditions can be obtained if the intermediate image of the source is located in the entrance plane of the beam separator. The correction of the chromatic aberration is shown in Figure 48. It can be recognized by the change of sign of the slope of the defocus as a function of the electron energy for different mirror settings. In the compensated mode, the
FIGURE 48. Proof of the chromatic aberration correction. The slope of the defocusing Δf as a function of the energy deviation ΔE of the electron from the nominal energy En = 15 keV yields the chromatic aberration coefficient. The defocusing was obtained by means of the objective lens, with the coil current calibrated according to Figure 40. The chromatic aberration of the total system can be changed by different excitations of the mirror from an undercompensated state to an overcompensated state.
focus remains constant within an error tolerance of 0.05 mm. In the uncompensated case, electrons with increasing energy are focused at a larger distance behind the objective lens. For the position of the focus to remain constant, the excitation of the objective lens must be increased. In the overcompensated case, the relationship is just the opposite. The possibility of correcting the spherical aberration with the electron mirror may be seen in Figure 49. By prescribing different potentials on the electrodes of the mirror, one can change the sign of the total spherical aberration of the system. The spherical aberration coefficient is the third-order coefficient of the polynomial fitted to the measured values. The aberration coefficients are slightly different in the two sections. In the compensated state, the third-order spherical aberration coefficients are not completely compensated for, but are set up so that the fifth-order spherical aberration is counterbalanced over as broad a range as possible. The absolute values of the chromatic and spherical aberration coefficients do not agree with the theoretical values because of the diverging effect of the beam separator, described in Section V.E. A quantitative evaluation of the coefficients has therefore had to be postponed.
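As a rough illustration of how such a measurement series can be evaluated, the following short Python sketch (an addition to this text; the data values are invented stand-ins for one curve of a Figure 48-type series) fits a straight line to the defocus versus the energy deviation and converts the slope into a chromatic aberration coefficient through the usual first-order relation Δf = Cc ΔE/En:

    import numpy as np

    # Hypothetical defocus measurements (mm) at energy deviations dE (eV),
    # standing in for one curve of a Figure 48-type measurement series.
    dE = np.array([-60.0, -40.0, -20.0, 0.0, 20.0, 40.0, 60.0])
    df = np.array([0.26, 0.17, 0.09, 0.00, -0.08, -0.17, -0.25])

    En = 15e3  # nominal energy in eV

    # Linear fit df = slope * dE; with df = Cc * dE / En the chromatic
    # aberration coefficient of the total system is Cc = slope * En.
    slope = np.polyfit(dE, df, 1)[0]
    print(f"Cc = {slope * En:.1f} mm")

A change of sign of the fitted slope, as in the figure, then corresponds directly to a change of sign of Cc.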
[Figure 49 plots the induced image shift against the beam tilt (in mrad) for the uncompensated, compensated, and over-compensated states, each in the xz- and yz-sections.]

FIGURE 49. Proof of the spherical aberration correction. For this, one measures the induced image shift as a function of the beam tilt for various mirror settings. The change of sign of the coefficients of the polynomial fitted to the measured curve shows that with the aid of the electron mirror, a negative spherical aberration coefficient can be adjusted over a wide range. The beam tilting is performed with the double-deflection element B (virtually) about the intermediate image behind the condenser. The tilt angle is defined with respect to the object plane.
The measurements of this section show, beyond any doubt, that with the mirror corrector, consisting of a low-aberration beam separator and an electrostatic tetrode mirror, the simultaneous correction of the chromatic and spherical aberrations can be carried out in a low-voltage electron microscope. All the measurements that have been made promise a successful incorporation of the corrector into the SMART. The improvement of the resolution of the test microscope is not essential for the incorporation of the corrector into the SMART. To improve the resolution of the test microscope, one must implement the following measures. The adjustment strategy for the simultaneous irradiation of two quadrants of the beam separator must be reconsidered, in order to lower the current resolution limit of 24 nm toward the theoretical value of 11 nm (obtained by adopting a refractive power of the beam separator of −1.7 m⁻¹). The requirements on the correct alignment of the beam separator can be reduced if the intermediate image of the source, as in operation in the SMART, is moved into the entrance plane of the beam separator. This can be achieved by weakening the second condenser lens.
A further reduction of the resolution limit to 4.5 nm implies the elimination of the diverging action of the beam separator while keeping the focusing stigmatic. Preliminary investigations indicate that this goal can be reached by superposing a homogeneous magnetic field on the inside of the beam separator. In any case, the 1:1 imaging of the beam separator should be set up exactly with the aid of the fine-adjustment coils.
VI. CONCLUSION

The theoretical resolution of present, uncorrected, direct-imaging low-energy electron microscopes is limited to 5 nm. In the case of electron illumination (LEEM) the best edge resolution achieved is 10 nm, while with photon illumination (PEEM) resolution limits of only 20 nm have been observed (Schmidt et al., 2000). The difference is mainly caused by the decreased electron yield in the latter case, which leads to long recording times and hence to problems with mechanical and electrical stability. With the aid of a mirror corrector for simultaneous correction of chromatic and spherical aberrations, the spatial resolution limit can in principle be lowered to 0.5 nm; alternatively, it is possible, at a resolution comparable to that of uncorrected microscopes, to increase the limiting aperture angle by a factor of 4-10. The electron gain then ranges between 16 and 100. This application of the corrector is therefore especially interesting for the PEEM mode. The corrected high-resolution SMART with an in-column energy filter has been built by several research establishments in cooperation with industry. At the Darmstadt University of Technology, the mirror corrector has been designed and tested. It consists of an electrostatic electron mirror for the correction of chromatic and spherical aberrations and a low-aberration beam separator, which separates the incident electrons from those reflected by the mirror. Electrostatic field lenses are located close to the edges of the beam separator. Between the mirror and the beam separator, two electric-magnetic multipoles are attached, which serve as combined stigmators and double-deflection elements. Since the spectromicroscope will be used in surface physics, the construction must provide ultra-high-vacuum conditions. Furthermore, the choice of materials and manufacturing processes is limited by the high demands of the electron optics with respect to magnetic properties and mechanical accuracy. The present assembly of the individual components fulfills all the required conditions. Initial problems with deviations from the rotational symmetry of the field lenses and the electron mirror have been solved. The lateral deviations
of the bores are now well below 6 μm. This corresponds to the machining accuracy of the lathe. The test of the individual components was performed in a conventional SEM. For this purpose the components were integrated, with adapter flanges, between the illumination system and the objective lens. The shielding factor of the beam separator was improved by simple measures. With additional shielding at the adapter flanges, a resolution of the test microscope of better than 10 nm could be verified. The ideal beam separator images its entrance plane 1:1 into its exit plane. Neither geometric aberrations of second order nor dispersion of first or second degree is introduced, owing to the intrinsic symmetry at a deflection of 90°. The residual aberrations can be determined only if the electrons are focused by a field lens and by the beam separator. If one employs the objective lens of the test microscope for focusing the electrons, one achieves a resolution limit of 9.5 nm, which is comparable to that of the unmodified microscope. For the characterization of each quadrant of the beam separator, the geometric aberrations were estimated by measuring the edge resolution with a large aperture, and the dispersion was investigated with large deviations of the electron energy, up to 25% from the nominal energy. The resolving power of the field lens was checked both with the beam separator switched off, in straight transmission, and with the beam separator switched on, for a single 90° deflection. The simultaneous correction of the chromatic and spherical aberrations with the electron mirror could be verified unambiguously by the measurement of the aberration coefficients of the system as a whole, consisting of the SEM, the beam separator, and the electron mirror. Through a suitable choice of the potentials at the mirror electrodes, the system was transferred from an undercompensated state through the corrected state to an overcompensated state. As a characteristic feature of the successful correction, the micrograph of gold clusters shows the highest contrast in the corrected state. In this case a maximum proportion of the incident electrons is focused in the central region of the scanning spot. The results clearly demonstrate that the correction of the chromatic and spherical aberrations of an LEEM by means of an electron mirror is feasible. The beam separator fulfills the required electron-optical constraints without using additional adjustment coils. The successful test of the corrector in an SEM augurs well for the implementation of the mirror corrector in the SMART. The fine alignment of the mirror corrector and the proof of the ultra-high-resolution limit of 1 nm can be performed only within the completed SMART. The greatest challenges for the future are to develop computer-controlled adjustment procedures for the overall system and to provide the necessary electrical and mechanical stability of the microscope.
APPENDIX: ADDITION OF REFRACTIVE POWERS IN THE TWO-LENS SYSTEM

A simple method of measuring the chromatic aberration of electron lenses is to investigate the defocusing Δf of an image as a function of the electron energy. The defocusing on raising the energy can be compensated for by altering the excitation of a second lens in the ray path. The determination of the defocusing from the excitation of the second lens is performed with the matrix method for the calculation of beam transport in linear approximation (Hawkes and Kasper, 1989). An electron trajectory in some plane z_i is characterized by the vector with elements axial distance x_i and slope x_i'. The beam transport in the field-free region from the plane z_i to the plane z_{i+1} is described by the matrix

\hat{T} = \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}

with the convention

\begin{pmatrix} x_{i+1} \\ x_{i+1}' \end{pmatrix} = \hat{T} \begin{pmatrix} x_i \\ x_i' \end{pmatrix} \qquad (29)

where d = z_{i+1} − z_i denotes the spacing of the two planes. The slope of the rays does not change, since the axial distance grows in proportion to the slope. A thin lens leaves the axial distance unchanged; the slope, however, is reduced abruptly in its midplane z_i by the product of the refractive power k = 1/f and the axial distance. The matrix

\hat{L} = \begin{pmatrix} 1 & 0 \\ -k & 1 \end{pmatrix}

transfers by means of

\begin{pmatrix} x_i \\ x_i' \end{pmatrix}_{z_i + \varepsilon} = \hat{L} \begin{pmatrix} x_i \\ x_i' \end{pmatrix}_{z_i - \varepsilon} \qquad (30)

the axial distance and the slope from the plane z_i − ε immediately in front of the lens into the plane z_i + ε directly behind the lens. The imaging properties of systems that are composed of drift spaces and thin lenses can be obtained by successive matrix multiplication. The image planes are determined by the zeros of the axial distance, if the initial conditions of an axial ray are chosen to be (x_i; x_i')^T = (0; 1)^T. In the following, a two-lens system is considered with lengths and refractive powers according to Figure 50. The coordinates in the initial image plane z_i and those of the final image plane z_f are related to each other by the system of equations
\begin{pmatrix} x_f \\ m \end{pmatrix} = \hat{T}_c \, \hat{L}_K \, \hat{T}_b \, \hat{L}_k \, \hat{T}_a \begin{pmatrix} x_i \\ x_i' \end{pmatrix} \qquad (31)

(the indices on \hat{T} denote drifts over the lengths a, b, and c of Figure 50) with the parameters s, k, K, and m, where m denotes the slope of the electron trajectory in the plane z_f and s the displacement of the actual image position from z_f. For the measurement of the chromatic aberration in an SEM the specimen remains in the plane z_f. The lens to be measured has constant excitation, so that at the nominal energy En, the intermediate image of the source is imaged from the plane z_i into the plane z_f. For an energy increase of ΔE, the image moves into the plane z̃_f.
[Figure 50 sketches the two-lens system: refractive powers k and K on the z axis, separated by the drift lengths a, b, and c.]

FIGURE 50. Geometric optical construction for calculating the image position in a system of two round lenses. The ray path above the z axis illustrates the situation in which the first lens is weak and the second lens has constant excitation, while the path below the axis shows the situation when the roles of the two lenses are reversed.
For the refractive power of the strongly excited lens we find

K = \frac{a+b+c}{(a+b)\,c} \qquad \text{and likewise} \qquad k = \frac{a+b+c}{a\,(b+c)} \qquad (32)
by employing the lens equation

\frac{1}{f} = \frac{1}{g} + \frac{1}{b}

with the object distance g and the image distance b. For the determination of the defocusing Δf, one focuses with the second lens sharply on the plane z_f. When one is using the constant refractive power K according to Eq. (32) and the second line of the system of Eqs. (31), the variation Δf of the image position for constant excitation of the second lens as a function of the refractive power k of the first lens is given by
\Delta f = -\frac{a^2 c^2}{(a+b)^2} \cdot \frac{k}{1 - \dfrac{a}{a+b}\left(b - \dfrac{ac}{a+b}\right) k} \qquad (33)
If one focuses with the second lens for constant excitation of the first lens with the refractive power k, as given in Eq. (32), one obtains for the defocus

\Delta f = -\frac{c^2 K}{1 + Kc} \qquad (34)

as a function of the refractive power K of the second lens.
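The matrix algebra of this appendix is easy to check numerically. The following Python sketch (an illustration added to this text, with arbitrary lengths; the exact layout of Eq. (31) in the printed original could not be recovered, so the ray is simply propagated as a matrix product) traces the axial ray (0; 1) through drift a, lens k, drift b, and lens K, locates the zero of the axial distance behind the second lens, and compares the resulting defocus with the closed form of Eq. (33):

    import numpy as np

    def drift(d):
        # Field-free transport, Eq. (29): x grows with the slope, x' unchanged.
        return np.array([[1.0, d], [0.0, 1.0]])

    def lens(k):
        # Thin lens, Eq. (30): x unchanged, slope reduced by k times x.
        return np.array([[1.0, 0.0], [-k, 1.0]])

    def defocus(a, b, c, k, K):
        # Trace the axial ray (x, x') = (0, 1) from z_i through the system.
        ray = lens(K) @ drift(b) @ lens(k) @ drift(a) @ np.array([0.0, 1.0])
        d_image = -ray[0] / ray[1]   # drift length from lens K to the image
        return d_image - c           # defocus relative to the plane z_f

    a, b, c = 0.10, 0.05, 0.03                 # drift lengths in meters
    K0 = (a + b + c) / ((a + b) * c)           # second lens per Eq. (32)

    for k in (0.5, 1.0, 2.0):                  # weak first lens, in 1/m
        df_trace = defocus(a, b, c, k, K0)
        df_eq33 = -(a * c / (a + b)) ** 2 * k / (
            1 - a / (a + b) * (b - a * c / (a + b)) * k)
        print(f"k = {k}: ray tracing {df_trace:+.6e} m, Eq. (33) {df_eq33:+.6e} m")

The two columns agree to machine precision, which is a useful consistency check on the reconstructed Eq. (33).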
ACKNOWLEDGMENTS
This work would have been impossible without the open and informal exchange of expertise among all colleagues connected with the SMART project. The project coordinator, Professor E. Umbach, of the University of Würzburg, deserves special mention. In addition, we are extremely grateful to Professor T. Mulvey for preparing this excellent English translation and for many helpful comments. Finally, the project was financially supported by the German Federal Ministry of Education and Research (BMBF).
REFERENCES

Anders, S., Padmore, H. A., Duarte, R. M., Renner, T., Stammler, T., Scholl, A., Scheinfein, M. R., Stöhr, J., Sève, L., and Sinkovic, B. (1999). Photoemission electron microscope for the study of magnetic materials. Rev. Sci. Instrum. 70, 3973-3981.
Beck, V. D. (1979). A hexapole spherical aberration corrector. Optik 53, 241-255.
Bernhard, W. (1980). Erprobung eines sphärisch und chromatisch korrigierten Elektronenmikroskops. Optik 57, 73-94.
Degenhardt, R. (1992). Korrektur von Aberrationen in der Teilchenoptik mit Hilfe von Symmetrien. Doctoral dissertation, TH Darmstadt, Darmstadt, Germany, D 17.
Dellby, N., Krivanek, O. L., Nellist, P. D., Batson, P. E., and Lupini, A. R. (2001). Progress in aberration-corrected scanning transmission electron microscopy. J. Electron Microsc. 50, 177-185.
De Stasio, G., Perfetti, L., Gilbert, B., Fauchoux, O., Capozi, M., Perfetti, P., Margaritondo, G., and Tonner, B. P. (1999). MEPHISTO spectromicroscope reaches 20 nm lateral resolution. Rev. Sci. Instrum. 70, 1740-1742.
Fink, R., Weiss, M. R., Umbach, E., Preikszas, D., Rose, H., Spehr, R., Hartel, P., Engel, W., Degenhardt, R., Wichtendahl, R., Kuhlenbeck, H., Erlebach, W., Ihmann, K., Schlögl, R., Freund, H.-J., Bradshaw, A. M., Lilienkamp, G., Schmidt, T., Bauer, E., and Benner, G. (1997). SMART: a planned ultrahigh-resolution spectromicroscope for BESSY II. J. Electron Spectrosc. Relat. Phenom. 84, 231-250.
Haider, M., Rose, H., Uhlemann, S., Schwan, E., Kabius, B., and Urban, K. (1998). A spherical-aberration-corrected 200 kV transmission electron microscope. Ultramicroscopy 75, 53-60.
Haider, M., Uhlemann, S., Schwan, E., Rose, H., Kabius, B., and Urban, K. (1998). Electron microscopy image enhanced. Nature 392, 768-769.
Hawkes, P. W., and Kasper, E. (1989). Principles of Electron Optics, Vol. 1. Basic Geometrical Optics. New York: Academic Press, pp. 226-235.
Henzler, M., and Göpel, W. (1991). Oberflächenphysik des Festkörpers. Stuttgart: Teubner-Verlag. (Assisted by C. Ziegler.)
Joos, G. (1989). Lehrbuch der theoretischen Physik, 15th ed. Wiesbaden: Aula-Verlag.
Kahl, F. (1999). Design eines Monochromators für Elektronenquellen. Doctoral dissertation, TU Darmstadt, Darmstadt, Germany, D 17.
Lanio, S. (1986). Test and improved design of a corrected imaging magnetic energy filter. Optik 73, 56-68.
Müller, H., Preikszas, D., and Rose, H. (1999). A beam separator with small aberrations. J. Electron Microsc. 48, 191-204.
Preikszas, D. (1995). Korrektur des Farb- und Öffnungsfehlers eines Niederspannungs-Elektronenmikroskops mit Hilfe eines Elektronenspiegels. Doctoral dissertation, TH Darmstadt, Darmstadt, Germany, D 17.
Preikszas, D., Hartel, P., Spehr, R., and Rose, H. (2000). SMART electron optics, in Proceedings of the Twelfth European Congress on Electron Microscopy, edited by L. Frank and F. Čiampor. Vol. III. Brno (Czech Republic): Czechoslovak Society for Electron Microscopy, pp. 181-184.
Preikszas, D., and Rose, H. (1995). Procedures for minimizing the aberrations of electromagnetic compound lenses. Optik 100, 179-187.
Preikszas, D., and Rose, H. (1997). Correction properties of electron mirrors. J. Electron Microsc. 46, 1-9.
Rempfer, G. F., Desloge, D. M., Skoczylas, W. P., and Griffith, O. H. (1997). Simultaneous correction of spherical and chromatic aberrations with an electron mirror: an electron optical achromat. Microsc. Microanal. 3, 14-27.
Rose, H. (1971). Abbildungseigenschaften sphärisch korrigierter elektronenoptischer Achromate. Optik 33, 1-24.
Rose, H. (1978). Aberration correction of homogeneous magnetic deflection systems. Optik 51, 15-38.
Rose, H. (1981). Correction of aperture aberrations in magnetic systems with threefold symmetry. Nucl. Instrum. Methods 187, 187-199.
Rose, H. (1990). Outline of a spherically corrected semiaplanatic medium-voltage transmission electron microscope. Optik 85, 19-24.
Rose, H. (in press). Advances in electron optics, in High-Resolution Imaging and Spectrometry of Materials, edited by M. Rühle and F. Ernst. Heidelberg: Springer-Verlag.
Rose, H., and Krahl, D. (1995). Electron optics of imaging energy filters, in Energy-Filtering Transmission Electron Microscopy, edited by L. Reimer. Berlin/Heidelberg: Springer-Verlag, pp. 110-146.
Rose, H., and Preikszas, D. (1992). Outline of a versatile corrected LEEM. Optik 92, 31-44.
Rose, H., and Preikszas, D. (1995). Time-dependent perturbation formalism for calculating the aberrations of systems with large ray gradients. Nucl. Instrum. Methods A 363, 19-24.
Scherzer, O. (1936). Über einige Fehler von Elektronenlinsen. Z. Phys. 101, 593-603.
Scherzer, O. (1947). Sphärische und chromatische Korrektur von Elektronen-Linsen. Optik 2, 114-132.
Schmidt, T., Ressel, B., Heun, S., Prince, K. C., and Bauer, E. (2000). Growth of thin metal films studied by spectromicroscopy, in Proceedings of the Sixth International Conference on X-ray Microscopy, Berkeley, CA, August 1999, edited by W. Meyer-Ilse, T. Warwick, and D. Attwood. Melville, NY: Am. Inst. of Phys., pp. 27-32.
Watts, R. N., Liang, S., Levine, Z. H., Lucatorto, T. B., Polack, F., and Scheinfein, M. R. (1997). A transmission X-ray microscope based on secondary-electron imaging. Rev. Sci. Instrum. 68, 3464-3476.
Zach, J. (1989). Design of a high-resolution low-voltage scanning electron microscope. Optik 83, 30-40.
Zach, J., and Haider, M. (1995). Correction of spherical and chromatic aberration in a low-voltage SEM. Optik 98, 112-118.
Ziethen, C., Schmidt, O., Fecher, G. H., Schneider, C. M., Schönhense, G., Frömter, R., Seider, M., Grzelakowski, K., Merkel, M., Funnemann, D., Swiech, W., Gundlach, H., and Kirschner, J. (1998). Fast elemental mapping and magnetic imaging with high lateral resolution using a novel photoemission microscope. J. Electron Spectrosc. Relat. Phenom. 88, 983.
Zworykin, V. K., Morton, G. A., Ramberg, E. G., Hillier, J., and Vance, A. W. (1945). Electron Optics and the Electron Microscope. New York: Wiley, pp. 603-649.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 120
Characterization of Texture in Scanning Electron Microscope Images

JUAN LUIS LADAGA
Laser Laboratory, Department of Physics, Faculty of Engineering, Universidad de Buenos Aires, 1063 Buenos Aires, Argentina

and

RITA DOMINGA BONETTO
Center of Research and Development in Catalytic Processes, National Scientific and Technical Research Council (CONICET), Universidad Nacional de La Plata, 1900 La Plata, Argentina
I. Introduction
II. The Variogram as a Surface Characterization Tool
   A. Surfaces with Random Fractal Characteristics
      1. Fractal Geometry
      2. Fractional Brownian Motion and Its Relationship to Self-Affine Records
      3. Some Examples of the Use of the Variogram
III. Variogram Use for Texture Characterization of Digital Images
   A. Images That Are Not Self-Affine Fractals at All Scales
      1. Bonetto and Ladaga Method
   B. Analysis of Theoretically Generated Images
   C. Physics of Image Formation in SEM
      1. Interaction of Electron Beam with the Specimen
      2. Contrast Mechanisms in the E-T Detector
      3. Noise Influence in the Fractal Dimension Calculation
   D. The FERImage Program
IV. Two Examples of Application in SEM Images
   A. Differences of Plastic and Crystalline Sulfur Behavior under the Action of Two Thiobacillus Types
   B. Quality Difference between Two Types of Emery Paper
V. Conclusions
Appendix I: Correlation between Fourier Power Spectrum Maximum and Variogram Characteristic Minimum
Appendix II: Theoretical Example to Show the Correlation between the Fourier Power Spectrum Maximum and the Variogram Characteristic Minimum
References
I. INTRODUCTION
Different questions arise when one is studying the textures of materials in diverse areas of modern science and technology. For such studies, the scanning electron microscope (SEM) is very versatile, and its use has increased lately because of the high lateral resolution that can now be achieved with it. Our goal is to work in the range of scales that the SEM scans, because in this range there are new elements of investigation that allow us to avoid the abrupt discontinuity between the description of typical phenomena of the atomic and subatomic scales and the phenomena studied in mechanics and classical physics. In this sense, computerization has allowed a fast growth in the use of fractals applied to science and technology. In this article, we show, by means of examples of numerically generated images and SEM images corresponding to sample surfaces of several origins, the identification of different parameters taken from measurements carried out on these images. Such parameters are used to try to identify the image texture characteristics and their relationship to the surface from which they result. For instance, we analyze the attainment of two parameters, dper and dmin, suggested by Bonetto and Ladaga (1998) to characterize, together with the fractal dimension D, the texture of SEM images. This method is applicable to many images observed by SEM that present a fractal behavior at one scale and a periodic behavior at another scale in the variogram. In some examples, the results of the fractal dimension measurement obtained through the variogram method are compared with those obtained through the Fourier power spectrum. We also point out the development of an interactive program that enables one to obtain the previous parameters for isotropic or anisotropic samples from calculations carried out along different directions in the image. Finally, we show that for the images studied in this article, the correlation method does not lead directly, because of its statistical nature, to texture aspects that are evident with the use of the empirically defined parameters dper and dmin.
II. THE VARIOGRAM AS A SURFACE CHARACTERIZATION TOOL
The separation between a material medium and a vacuum defines a surface. To study this surface, we can characterize it through the positions of relief points with respect to a reference: distances between points, relative slopes, and the quality of being rough or plain and smooth. Undoubtedly, all this plays an important role in the formation of a surface. The purpose of studying a surface is to find parameters that quantitatively characterize it. Different methods are used to accomplish this. On the one hand,
there are models that allow a height correlation. On the other hand, and more indirectly, there is the study of images of these surfaces. The variogram method is very efficient at detecting differences in the behavior of surfaces at different scales. This method consists of graphing, at logarithmic scale, the variance of the variation of heights in a surface (or of the variation of gray levels in an image) for different steps, as a function of such steps. One objective of using the variogram is to obtain the parameter known as the fractal dimension. In this case the surface roughness has a random character.
A. Surfaces with Random Fractal Characteristics

Randomness is present in all natural phenomena, since even real states of the most perfect systems contain random elements. Therefore, when one is studying natural fractal surfaces, it is convenient to work with a random fractal model. Of the different random models, the Brownian motion (Brown, 1828), or random walk, process (Einstein, 1905) is discussed because of its usefulness in studying many physical phenomena of chaotic origin. We do not review fractal geometry and its applications in detail because there are many books that deal thoroughly with the subject in different science and technology fields (Barabási and Stanley, 1995; Barnsley et al., 1988; Bunde and Havlin, 1994, 1996; Family et al., 1995; Feder, 1988; Kaye, 1989, 1993; Mandelbrot, 1982; Nottale, 1993; Russ, 1994). However, we do briefly summarize the concept of fractal geometry so that the Brownian fractal surfaces can be understood in the global context of fractal geometry.
1. Fractal Geometry

Fractal geometry was introduced by Benoît B. Mandelbrot, and one of his books (Mandelbrot, 1982) is a standard reference since it presents elementary concepts and new ideas, such as the ones which gave rise to what was later called multifractals. Fractal geometry provides a description and a mathematical model for many shapes found in nature, such as borders, coasts, clouds, mountains, and so forth. These shapes, although complex, present invariance under magnification changes. This statistical self-similarity can be quantified by a fractal dimension (i.e., a number that involves an intuitive concept of dimension but does not need to be an integer). Euclidean geometry results are inadequate to define these complex shapes. Another noticeable difference between Euclidean shapes and fractal shapes is that while the former are usually described by simple algebraic formulas, the latter, in general, may be produced by an algorithm or a building process that often is iterative or recursive. Mandelbrot (1967) used the fractal dimension concept for the first time when studying Richardson's empiric work (1961) on coast and border length.
Richardson used a method to measure such lengths that consisted of taking a compass with an opening ε and moving it along the coast, beginning each step where the previous one finished. In this case, the length L(ε) equals the number of steps N(ε) multiplied by ε. He observed that as ε diminished, the length increased indefinitely. Graphing L(ε) versus ε at logarithmic scale, he verified that in all studied cases he obtained straight lines conforming to the following relation:

L(\varepsilon) \sim \varepsilon^{1-D} \qquad (1)
where the exponent D depended on the considered coast. Mandelbrot (1967) suggested that although D is not an integer, it could and should be interpreted as a dimension, in the fractal dimension sense. The graph, at logarithmic scale, of L(ε) versus ε obtained with the compass method is known as the Richardson plot. Kaye (1989, 1993) used the Richardson plot in particle contour measurement and found, in many cases, a slope at low scale called the textural fractal and a different slope at high scale called the structural fractal. In general, when one is speaking of the dimension of a set, it is referred to as the topological dimension. From the physics point of view, the topological dimension is interpreted as the minimum number of coordinates necessary to determine a fundamental element or space point. For a line segment, the topological dimension equals one, its length is finite, and its area and volume are zero. For a surface, the topological dimension is two, its area is finite, its length is infinite, and its volume is zero. For a body the topological dimension is three, with a finite volume and an infinite area and length. However, there are closed sets whose topological dimension is one, length is infinite, and area is null. In these cases, a new kind of dimension can be considered, between one and two, for which the whole measure would be finite and different from zero. This is the basic concept of the Hausdorff dimension (Hausdorff, 1919), but it is worth noting that this dimension does not have the same topological meaning as that of the Hausdorff space dimension (Mandelbrot, 1982). If T is a set of points in space, we can take a test function h(δ) ∼ δ^d (a line, a square, a circle, a sphere, a cube, etc.) and cover the set to obtain the measure Γ_d = Σ h(δ). In general, when δ → 0, the measure Γ_d is zero or infinite depending on the choice of the d value. The Hausdorff-Besicovitch dimension D of the T set is the dimension for which the d measure is infinite for d < D and is zero for d > D:

\Gamma_d \sim N(\delta)\,\delta^d \;\xrightarrow{\;\delta \to 0\;}\; \begin{cases} \infty, & d < D \\ 0, & d > D \end{cases} \qquad (2)
where N(δ) is the number of boxes of size δ necessary to cover the T set of points. The Hausdorff-Besicovitch dimension is defined as a local property, since it measures the properties of a set in the limit of δ sizes tending to
zero. The Γ_d value for d = D is often finite, but it may be zero or infinite. In most known cases the Hausdorff-Besicovitch dimension corresponds to the integer D values for lines, planes, surfaces, spheres, and so forth. However, there are many sets for which the Hausdorff-Besicovitch dimension is not an integer, and in such cases, following Mandelbrot, these dimensions are said to be fractal. If the Γ_d limit is a finite number different from zero, for a small enough δ this approximation is valid:

N(\delta) \sim \delta^{-D} \qquad (3)
Then the fractal dimension can be determined by finding the slope of the ln N(δ) versus ln(δ) plot. The dimension D, determined by counting the number of boxes necessary to cover the set as a function of box size, is known as the box dimension (Feder, 1988). Fractals are naturally grouped into two categories: random and deterministic. The fractals in physics belong to the first category; however, it is better to discuss first some of the deterministic fractals, such as the von Koch curve. Figure 1 shows an iterative process for building this fractal curve. A segment is divided into three segments and the middle one is replaced by two equal segments that form part of an equilateral triangle. In the next building phase, four new segments, each one third as long as the previous ones, replace each of these four segments.
FIGURE 1. The first (top) and fourth (bottom) generations of the von Koch self-similar fractal curve of fractal dimension Ds = log(4)/log(3) ≈ 1.262.
This procedure is performed repeatedly, which produces the von Koch curve. Figure 1 shows the first (top) and fourth (bottom) generations of the von Koch curve. This figure has an exact self-similarity (each small portion, when magnified, reproduces exactly a larger one). As can easily be seen, the curve length is multiplied by four thirds at each step, and the limit curve has an infinite length in a finite area of the plane without self-interception. The self-similarity property is one of the basic concepts of fractal geometry. A line segment, a bidimensional object, or a three-dimensional object presents self-similarity properties. These elements can be divided into N identical parts, each of which is scaled down by the ratio r = 1/N, r = 1/N^{1/2}, or r = 1/N^{1/3}, respectively. In general, it can be said that a D-dimensional self-similar object can be divided into N copies smaller than itself, each scaled down by a factor r = 1/N^{1/D}. The fractal, or similarity, dimension Ds is given by

D_s = \frac{\log(N)}{\log(1/r)} \qquad (4)
In the case of the von Koch curve, the fractal dimension is Ds = log(4)/log(3) ≈ 1.262. The real cases, such as coasts, are not exactly self-similar (i.e., each little portion looks like a bigger portion, but is not exactly identical to it). The fractal dimension concept, however, can also be applied to objects that are statistically self-similar. The dimension, in this case, is also given by Eq. (4). The similarity concept can be expressed more formally by taking T as a set of points in the Euclidean space of dimension E placed in positions Z = (Z_1, ..., Z_E). The T set is self-similar with respect to a real scale factor r (0 < r < 1) if T is transformed into rT with points in positions rZ = (rZ_1, ..., rZ_E). Therefore, a closed set T is self-similar with respect to an r scale factor when T is the union of N different and nonoverlapping subsets, each being congruent with rT (identical under translation and rotation transformations). In contrast, the set T is statistically self-similar if it is composed of N different subsets, each being scaled down by an original factor r and identical to rT in all statistical aspects. In many cases the studied sets are not self-similar but self-affine. An affine transformation transforms the points Z = (Z_1, ..., Z_E) into new points Z' = (r_1 Z_1, ..., r_E Z_E), where the scale factors r_1, ..., r_E are not all the same. A closed set is self-affine with respect to a scale vector r = (r_1, ..., r_E) if T is the union of N nonoverlapping subsets, each identical to rT in all statistical aspects. The fractal dimension of a self-affine record is not univocally determined. The similarity dimension is not defined, since it exists only for self-similar fractals. As regards the box dimension, depending on the initial box size, two possible values are obtained: one called the local dimension, D = 2 − H, with
0 < H < 1, and another, D = 1, which is called the global dimension, for the bigger boxes (Feder, 1988, and references therein; Mandelbrot, 1985). That is to say, globally, a self-affine record is not fractal. The Hausdorff-Besicovitch local dimension also gives the value D = 2 − H. Another widely used method for the fractal dimension measurement is the Fourier power spectrum (P ∼ f^β). This consists of graphing, at logarithmic scale, the power spectrum (P) versus the frequency (f). The straight-line slope, β, is related to the fractal dimension. For Brownian fractal surfaces (Barnsley et al., 1988),

D = 4 + \beta/2 \qquad (5)
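As an illustration of the box dimension of Eq. (3), the following Python sketch (an addition to this text, not part of the original; the counting uses the curve's vertices, which is a rough but adequate approximation at these box sizes) builds the von Koch curve of Figure 1 and estimates its dimension from the slope of ln N(δ) versus ln(1/δ). The result should approach the similarity value log(4)/log(3) of Eq. (4):

    import numpy as np

    def koch_curve(n):
        # Iterate the generator: each segment is replaced by four segments
        # one third as long, the middle pair forming an equilateral tent.
        pts = np.array([0.0, 1.0], dtype=complex)
        rot = np.exp(1j * np.pi / 3)
        for _ in range(n):
            a, b = pts[:-1], pts[1:]
            d = (b - a) / 3.0
            new = np.empty(4 * len(a) + 1, dtype=complex)
            new[0::4] = np.append(a, pts[-1])
            new[1::4] = a + d
            new[2::4] = a + d + d * rot
            new[3::4] = a + 2 * d
            pts = new
        return pts

    def box_count(points, delta):
        # Number of delta-sized grid boxes visited by the curve's vertices.
        return len({(int(p.real // delta), int(p.imag // delta)) for p in points})

    curve = koch_curve(7)                      # 4**7 segments
    deltas = 1.0 / 3.0 ** np.arange(1, 6)
    counts = [box_count(curve, d) for d in deltas]
    slope = np.polyfit(np.log(1.0 / deltas), np.log(counts), 1)[0]
    print(f"box dimension ~ {slope:.3f} (log 4 / log 3 = {np.log(4)/np.log(3):.3f})")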
2. Fractional Brownian Motion and Its Relationship to Self-Affine Records

In the one-dimensional random walk, a particle moves by jumping a step length +η or −η in a time τ. If the step length follows a Gaussian distribution with ⟨η⟩ = 0, the probability distribution is

p(\eta, \tau) = \frac{1}{\sqrt{4\pi D_e \tau}} \exp\left(-\frac{\eta^2}{4 D_e \tau}\right) \qquad (6)

and the variance of the process is

\langle \eta^2 \rangle = \int_{-\infty}^{\infty} \eta^2\, p(\eta, \tau)\, d\eta = 2 D_e \tau \qquad (7)
where the parameter D_e is the diffusion coefficient. The random walk study was an important development by Einstein (1905) which provided the theoretical basis for Brownian motion. Although the diffusion coefficient is a Gaussian distribution parameter that determines the variance by means of Eq. (7), this equation is more general and valid even when the jumps take place at regular intervals and when the probability distribution for the step length is discrete, continuous, or of some arbitrary shape (see Feder, 1988). Figure 2 shows the particle position as a function of time t, and Figure 3 the random step-length record. In these figures it can be observed that in Brownian motion it is not the particle position that is independent of time, but its displacement over a time interval that is independent of the interval. The record corresponding to a random walk is scale invariant (i.e., it is statistically the same at different resolutions). This means that, independently of the number b of time steps τ between observations, the increments in the particle position constitute an independent Gaussian random process (Fig. 3)
FIGURE 2. The particle position as a function of time t.

FIGURE 3. The random step length of the particle as a function of time t.
with ⟨η⟩ = 0 and variance equal to

\langle \eta^2 \rangle = 2 D_e b \tau \qquad (8)
The probability distribution follows a scaling relation equal to
p(b^{1/2}\eta,\; b\tau) = b^{-1/2}\, p(\eta, \tau) \qquad (9)
The previous equation says that the Brownian process is invariant in distribution under a transformation that changes the time scale by b and the length scale by √b. Therefore, the Brownian record is self-affine. Mandelbrot introduced the fractional Brownian motion concept (Mandelbrot, 1982; Mandelbrot and Van Ness, 1968), generalizing the Wiener equation (1923) for the Brownian particle increments:
Y(t + \tau) - Y(t) \sim \xi\, \tau^{H} \qquad (10)

where H = 1/2 for the Brownian movement and ξ is a random number from a Gaussian distribution. Changing the exponent from H = 1/2 to any real number in the range 0 < H < 1, we can prove that the fractional Brownian process has an average value of increments equal to zero,

\langle Y(t + \tau) - Y(t) \rangle = 0 \qquad (11)

and increment variance ⟨(Y(t + τ) − Y(t))²⟩ = V(τ) given by

V(\tau) = 2 D_e\, \tau^{2H} \qquad (12)
The exponent H is called the Hurst exponent by Mandelbrot. It is worth mentioning that, unlike in ordinary Brownian motion, in fractional Brownian motion the past and future increments are correlated. This is observed when one calculates the correlation coefficient between, for example, the increments ΔY₋ (between −t and 0) and ΔY₊ (between 0 and t):

C(t) = \frac{\mathrm{Cov}(\Delta Y_-, \Delta Y_+)}{\mathrm{Var}(t - 0)}
     = \frac{\langle [Y(0) - Y(-t)]\,[Y(t) - Y(0)] \rangle}{2 D_e |t - 0|^{2H}}
     = \frac{\langle Y(0)Y(t) - (Y(0))^2 - Y(-t)Y(t) + Y(-t)Y(0) \rangle}{2 D_e |t - 0|^{2H}}
     = \frac{-\langle (Y(t) - Y(0))^2 \rangle + \langle (Y(t) - Y(-t))^2 \rangle - \langle (Y(0) - Y(-t))^2 \rangle}{2 \cdot 2 D_e |t|^{2H}}
     = \frac{-2 D_e t^{2H} + 2 D_e 2^{2H} t^{2H} - 2 D_e t^{2H}}{2 \cdot 2 D_e t^{2H}}
     = 2^{2H-1} - 1 \qquad (13)
The previous equation implies that fractional Brownian motion (H ≠ 1/2) shows a correlation coefficient C(t) ≠ 0 independent of t, which presents persistence (H > 1/2) or antipersistence (H < 1/2). What was said for Brownian motion can be generalized to roughness records of surfaces that present a fractal Brownian behavior. In this case, the difference |Δz| between two points of the three-dimensional component z may be regarded as equivalent to the step length η of the Brownian movement, and the other two dimensions (x and y) play the role of time. Equation (12) then takes the form

V(s) = \langle (\Delta z)^2 \rangle \sim s^{2H} \qquad (14)

where s is the step along the x axis or the y axis corresponding to the increment in z. For a self-affine record of a fractal Brownian surface, the fractal dimension D is related to H as

D = 3 - H \qquad (15)
As H varies between 1 and 0, D varies between 2 (smooth surface) and 3 (completely rough surface). The H coefficient can be obtained from the slope value in the variogram (log(V) vs. log(s)). Such a slope is equal to 2H, as can be deduced from Eq. (14).
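A quick numerical check of Eqs. (13) and (14) can be made with synthetic records. The Python sketch below (added for illustration; spectral synthesis only approximates fractional Brownian motion, so the printed values are approximate) generates a record with power spectrum P(f) ∼ f^(−(2H+1)), verifies that the variance of the increments grows as τ^(2H), and compares the correlation of past and future increments with the value 2^(2H−1) − 1 of Eq. (13):

    import numpy as np

    rng = np.random.default_rng(1)

    def fbm_record(n, H):
        # Approximate fBm: filter white noise so that P(f) ~ f**-(2H + 1).
        f = np.fft.rfftfreq(n)
        f[0] = 1.0                               # avoid division by zero
        spec = f ** (-(2 * H + 1) / 2) * np.exp(2j * np.pi * rng.random(f.size))
        return np.fft.irfft(spec, n)

    for H in (0.3, 0.5, 0.7):
        y = fbm_record(2 ** 17, H)
        # Variance of the increments: V(tau) ~ tau**(2H), Eqs. (12) and (14).
        taus = np.array([8, 16, 32, 64, 128])
        V = [np.mean((y[t:] - y[:-t]) ** 2) for t in taus]
        H_est = 0.5 * np.polyfit(np.log(taus), np.log(V), 1)[0]
        # Correlation of past and future increments, Eq. (13).
        t = 64
        past, future = y[t:-t] - y[:-2 * t], y[2 * t:] - y[t:-t]
        C = np.mean(past * future) / np.mean(past ** 2)
        print(f"H = {H}: slope/2 ~ {H_est:.2f}, C ~ {C:+.2f}, "
              f"2**(2H-1) - 1 = {2 ** (2 * H - 1) - 1:+.2f}")

For H = 0.5 the increment correlation should be close to zero, while H above or below 1/2 should give, respectively, positive (persistent) or negative (antipersistent) values.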
3. Some Examples of the Use of the Variogram

The first step for measuring surfaces is to determine their elevation profile (height data). This can be accomplished by several methods; among these we mention contact profilometry, atomic force microscopy, interference microscopy, radiation scattering from rough surfaces (X-ray scattering, neutron scattering, radar, etc.), sonar, and so forth. Many authors have used the variogram to determine parameters that characterize the surface roughness. Burrough (1981) analyzed a great amount of environmental data (different soil characteristics, iron minerals in rocks) and obtained estimations of fractal dimensions. Mark and Aronson (1984) studied topographical surfaces and showed that in most of them there were scale ranges with different fractal dimensions, separated by distinct scale breaks. These breaks represent characteristic horizontal scales, at which surface behavior changes substantially. Sahimi et al. (1995) used the variogram to calculate the fractal dimension of porous media at field scale and thus to show the existence of antipersistence or negative correlations. One surface may behave differently according to the scale under study (Bunde and Havlin, 1996; Kaye, 1989, 1993; Mark and Aronson, 1984; Sahimi
et al., 1995; Sinha et al., 1988; Zhao et al., 1998). Another way of characterizing the rough surface behavior is by means of the definition of other parameters apart from the fractal dimension. Many authors have worked with three-dimensional roughness, using different experimental techniques but a similar parametric approach. Sinha et al. (1988), with X-ray and neutron scattering from rough surfaces, used three parameters to characterize rough surfaces. These parameters were the roughness coefficient H, which is related to the fractal dimension D; the root-mean-square (r.m.s.) roughness amplitude σ; and the cutoff length ξ. The suggested expression for the variance in the case of isotropic self-affine surfaces was

V(s_x, s_y) = V(s) = 2\sigma^2 \left[ 1 - e^{-(s/\xi)^{2H}} \right] \qquad (16)
where s = (s_x² + s_y²)^{1/2}. V(s) tends toward 2σ² when s tends toward infinity, and Eq. (16) tends toward Eq. (14) for s ≪ ξ. When Eq. (14) is taken into account,

V(s_x, s_y) = \langle (Z(x + s_x, y + s_y) - Z(x, y))^2 \rangle = 2\langle Z^2(x, y) \rangle - 2\langle Z(x + s_x, y + s_y)\, Z(x, y) \rangle \qquad (17)
Taking σ² = ⟨Z²(x, y)⟩, these authors defined the height-height correlation function C(s_x, s_y) as follows:

C(s_x, s_y) \equiv \langle Z(x + s_x, y + s_y)\, Z(x, y) \rangle = \sigma^2 - \tfrac{1}{2} V(s_x, s_y) \qquad (18)

Using Eq. (16) for V(s), they obtained

C(s_x, s_y) = \sigma^2 e^{-(s/\xi)^{2H}} \qquad (19)
Therefore, if ln(C(s_x, s_y)/σ²) versus s^{2H} is graphed, a straight line with slope −(1/ξ)^{2H} should be obtained. From this slope the ξ value can be obtained. The H value is obtained from the low-scale region of the variogram (s ≪ ξ). Different authors have worked with these three parameters (V. Holý and Baumbach, 1994; V. Holý et al., 1993; Palasantzas and De Hosson, 2000; Press et al., 1996; Stettner et al., 1996; Zhao et al., 1998; and others). Zhao et al. (1998) suggested a phenomenologic model for rough surfaces with periodic patterns with only one characteristic period. They suggested the following expression for the variance in diffraction studies of diffusion-barrier-induced mound structures in epitaxial growth fronts:

V(s) = \begin{cases} 2\sigma^2 \left[ 1 - e^{-(s/\xi)^{2H}} \cos\!\left( \dfrac{2\pi s}{\lambda} \right) \right] & \text{for } 1 + 1 \text{ dimensions} \\[1ex] 2\sigma^2 \left[ 1 - e^{-(s/\xi)^{2H}} J_0\!\left( \dfrac{2\pi s}{\lambda} \right) \right] & \text{for } 2 + 1 \text{ dimensions} \end{cases} \qquad (20)
where J₀(x) is the zeroth-order Bessel function. In this case, four parameters (i.e., the r.m.s. roughness amplitude σ, the system correlation length ξ, the roughness exponent H, and the average mound separation λ) are used to describe the surface, which presents a periodic behavior at high scale. Similarly, Palasantzas and De Hosson (2000) applied the Sinha et al. model for self-affine surfaces and the Zhao et al. model for mound surfaces to study the effect of roughness on the measurement of interface stress.
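For illustration, the 2 + 1-dimensional branch of Eq. (20) can be evaluated directly. The following Python sketch (an added example; the parameter values for σ, ξ, H, and λ are invented) shows how the Bessel factor makes the variance oscillate about the 2σ² plateau, producing the periodic maxima and minima exploited later in this article:

    import numpy as np
    from scipy.special import j0

    s = np.linspace(0.5, 300.0, 600)            # step length in pixels
    sigma2, xi, H, lam = 1.0, 40.0, 0.7, 50.0   # assumed surface parameters

    # Zhao et al.'s mound model, Eq. (20), 2 + 1-dimensional case.
    V = 2 * sigma2 * (1 - np.exp(-(s / xi) ** (2 * H)) * j0(2 * np.pi * s / lam))

    # Local minima of V(s) appear roughly near multiples of the mound
    # separation lam, where the Bessel factor passes through its maxima.
    interior = (V[1:-1] < V[:-2]) & (V[1:-1] < V[2:])
    print("minima near s =", np.round(s[1:-1][interior], 1))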
III. VARIOGRAM USE FOR TEXTURE CHARACTERIZATION OF DIGITAL IMAGES

The scanning electron microscope (SEM) is a very useful tool in the image studies of rough surfaces. This method is nondestructive and easy to apply. Therefore, SEM studies to obtain quantitative information on roughness are ongoing. The different signals produced by the SEM do not directly supply the elevation profile. Instead, SEM images can be used to obtain the elevation profile by means of stereoscopy. The stereoscopic technique requires the use of two images obtained from different viewpoints. The location of matching points in the two images enables the attainment of the elevation profile through their displacement. However, the main problem of this technique is the computational difficulty of matching enough points between the two images (Russ, 1990). Another way of obtaining information about surface roughness is by means of a texture study of SEM images. In the case of the fractal dimension, for instance, the previously mentioned methods (variogram, Fourier power spectrum, etc.) for obtaining it from an elevation profile of the surface can also be used with SEM bidimensional images. The fractal dimension values obtained in the two cases may not coincide. Nevertheless, the fractal behavior of the sample is retained in the corresponding image texture. This statement is based on the work of Pentland (1984), who proved mathematically that the gray levels in the digitized optical image of a fractal surface show the same fractal behavior as that of the original fractal surface. Later, Russ and Russ (1987) obtained an empirical correlation between the fractal dimension of the original surface and a statistical textural parameter obtained from brightness differences of an SEM image. Skands (1996), using SEM, found a noticeable linear correlation between the topographical properties of surfaces and the secondary electrons emitted from them. He studied the roughness of fractured steel surfaces, using fractal dimension values obtained with the elevation profile and with SEM images. He obtained a correlation of 0.9757 between both measures and found that both measures could be related directly to mechanical properties of the samples, because both showed a monotonic increase with the impact toughness.
FIGURE 4. Variance of the brightness levels (V) as a function of s steps for a theoretically generated fractal Brownian image (H = 0.4), which is shown at the bottom left of the variogram. (Adapted from Bonetto and Ladaga (1998, Fig. 1, p. 458). Copyrighted and reprinted with the permission of SCANNING, and/or the Foundation for Advances of Medicine and Science (FAMS), Box 832, Mahwah, New Jersey 07430, USA.)

In the case of images, for the fractal dimension calculation, Eq. (14) is used, where V(s) is the variance of the gray-level difference, and Δz_i is the gray-level difference between two different positions in the digital image for a step i of length s measured in pixels or microns (Bonetto, Forlerer et al., submitted; Bonetto and Ladaga, 1998; Bonetto, Ozols et al., 1998; Bonetto, Sánchez et al., 1996; Briand et al., 1999; Pentland, 1984; Van Put, 1991). Figure 4 shows the variogram of a self-affine image numerically generated with H = 0.4 (D = 2.6). The variance values are obtained by scanning the image first along the x axis and then along the y axis with the same step s in each case. Thus, the influence of preferential directions on the sample surface is partly avoided. The variance is calculated by scanning the image starting from all possible origins, to obtain the best statistics for each step and to avoid a preferential starting point. Although all the variance values are calculated up to an s value equal to half the size of the image, for the regression calculation only those steps that are approximately equally spaced at logarithmic scale are taken. This is done to account for the fact that the larger steps weigh more statistically, due to their higher population density (Hamblin and Stachowiak, 1993). As a way to avoid having few statistics for the smaller s values, the first four values are taken and, from the fifth, the expression
s = integer part(0.5 + 4*(s − 1)/3) is used. The equally spaced data in the logarithmic scale correspond to the large white squares in Figures 4 through 6. The smaller black squares represent the total calculated steps. All the numerically generated images are created in a matrix of 256 × 256 pixels and 256 gray levels. For the image in Figure 4, the slope value obtained from the variogram was 0.770 ± 0.012 with r = 0.99, and the D value obtained by using Eq. (14) was 2.615 ± 0.006. Figures 5a through 5c depict the variograms corresponding to theoretically generated fractal Brownian images of theoretical dimension Dtheo = 2.1, 2.3, and 2.7, respectively. The values of the corresponding slopes obtained through the variogram method were 1.756 ± 0.010, 1.461 ± 0.015, and 0.652 ± 0.008, respectively, with r = 0.999 regression coefficients, which gives fractal dimension values of D = 2.122 ± 0.005, 2.269 ± 0.007, and 2.674 ± 0.004, respectively. Unless otherwise stated, all the fractal dimension values are obtained from the slopes of the straight lines that fit the data, with r = 0.999 regression coefficients. This is done in order to follow a consistent calculation procedure, mainly necessary when conclusions must be drawn from the different values obtained for different images. The sum of two fractal Brownian images results in images such as those analyzed in Figures 6a and 6b. The fractal dimension calculation can be as complicated as the one shown in Figure 6a, or simpler, as in Figure 6b.
FIGURE 5. V versus s for theoretically generated images. (a) H = 0.9; (b) H = 0.7; (c) H = 0.3.
In the first case (the image resulting from the sum of Figs. 5a and 5c), it is impossible to obtain only one value, and two well-defined regions appear, one at low scale and another at high scale. In the low-scale region the image with the larger fractal dimension prevails, and a value of D1 = 2.658 ± 0.004 is obtained. In the high-scale region the image with the smaller fractal dimension prevails, and a value of D2 = 2.415 ± 0.009 is obtained.
FIGURE 6. V versus s for theoretically generated images. (a) The sum of Figures 5a and 5c. (b) The sum of Figures 5b and 5c.

In Figure 6b (the image resulting from the sum of Figs. 5b and 5c), only one fractal dimension value, D = 2.657 ± 0.002, can be obtained. This work with theoretical images enables us to show that the sum of two fractal Brownian images does not produce a fractal Brownian image (Fig. 6a).
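The variogram procedure described above can be condensed into a few lines of Python. The sketch below (added for illustration; the fractal image is synthesized by spectral filtering rather than by the authors' generator, and a simple consecutive step list replaces their logarithmic step-selection rule) computes V(s) along both axes from all origins and converts the low-scale slope into a fractal dimension via Eqs. (14) and (15):

    import numpy as np

    rng = np.random.default_rng(0)

    def fbm_image(n, H):
        # Synthetic fractal Brownian image: power spectrum P ~ f**-(2H + 2).
        fx = np.fft.fftfreq(n)
        f = np.hypot(*np.meshgrid(fx, fx))
        f[0, 0] = 1.0
        spec = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
        z = np.fft.ifft2(spec * f ** (-(H + 1))).real
        return z / z.std()

    def variogram(img, steps):
        # Average squared gray-level differences over the x and y directions
        # and over all possible origins, for each step s.
        return np.array([0.5 * (np.mean((img[:, s:] - img[:, :-s]) ** 2) +
                                np.mean((img[s:, :] - img[:-s, :]) ** 2))
                         for s in steps])

    img = fbm_image(256, H=0.4)
    steps = np.arange(1, 17)                   # low-scale region only
    slope = np.polyfit(np.log(steps), np.log(variogram(img, steps)), 1)[0]
    print(f"H ~ {slope / 2:.2f},  D = 3 - H ~ {3 - slope / 2:.2f}")

Summing two such images with different H values and fitting the two scale regions separately reproduces the two-slope behavior of Figure 6a.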
Therefore, the calculation of the "true" value of the fractal dimension is relative. In Figure 6b, for example, one could think that the process that gives rise to this image is a single process, of fractal dimension 2.657 ± 0.002, and not two processes, as is actually the case. The previous discussion and the study of the noise that affects SEM images (discussed later) lead to the following conclusion: the value of the fractal dimension obtained in each case serves as a characterization tool of the image texture. However, the interpretation of the physical process that gave rise to this value may lead to errors if the multifractal spectrum is not studied (see Feder, 1988).

A. Images That Are Not Self-Affine Fractals at All Scales

For many images observed with the SEM, the variogram presents a fractal behavior at low scale and a behavior that seems to have an asymptotic tendency at high scale. One example is shown in Figure 7, which corresponds to an SEM image of emery paper obtained at 100× magnification. If the fluctuations at high scale were neglected, the variogram shape would correspond to Eq. (16) of Sinha et al. (1988). Actually, the variance seems to tend asymptotically toward 2σ² = 3916, and the image is seen as isotropic. The value H = 0.518 ± 0.015 is obtained from the slope of Figure 7 at low scale. In the case in which ⟨z⟩ is different from zero, as occurs with digital images, Eq. (19) becomes
\Gamma(s_x, s_y) = C(s_x, s_y) - \langle Z \rangle^2 = \sigma^2 e^{-(s/\xi)^{2H}} \qquad (21)

FIGURE 7. V versus s for an SEM image of emery paper.
[The plot is annotated with ξ = 14.58.]

FIGURE 8. ln(Γ/σ²) versus s^{2H} for the image shown at the bottom left of the variogram in Figure 7. a.u., arbitrary units.
1. Bonetto and Ladaga Method To help us explain this method, we have graphed the variogram corresponding to the image at the bottom left of Figure 10. This image is built by squares of 24 pixels per side, approximately. The centers of the squares of the same color are at an average distance of 48 pixels for both axes. In this plot, a characteristic minimum at high scale can be observed at s = 49 pixels and its multiple at
CHARACTERIZATION OF TEXTURE IN SCANNING
153
V H =0.518 = 14.58 2(3,2 = 3 9 1 6 10 a-
100
101
102
s(pixel)
FIGURE9. Vversus s for a theoretical case that complies with Sinha et al.' s equation, Eq. (16), with parameters H, ~, and o- equal to those obtained from Figures 7 and 8.
V
10 s-
D = 2 . 4 9 9 6 + 0 . 0 0 0 3 (r = 1
\
104 9
[99,25731 [49,1282] 9
'
'
'
'1
100
'
'
'
'
'
'
' ' 1
101
'
s(pixel)
'
'
'
'
'
'
'
'
1
102
FIGURE 10. V versus s for the image shown at the bottom left of the variogram. (Adapted from Bonetto and Ladaga (1998, Fig. 2, p. 458). Copyrighted and reprinted with the permission of S C A N N I N G , and/or the Foundation for Advances of Medicine and Science (FAMS), Box 832, Mahwah, New Jersey 07430, USA.)
FIGURE 11. V versus s for the image shown at the bottom left of the variogram (plot labels: D = 2.4984 ± 0.0001, r = 1; labeled points at [15, 13636], [45, 11117], [60, 2631], and [120, 5184]). The elemental cell presents a different periodicity according to each orthogonal direction. (Adapted from Bonetto and Ladaga (1998, Fig. 4, p. 459). Copyrighted and reprinted with the permission of SCANNING, and/or the Foundation for Advances of Medicine and Science (FAMS), Box 832, Mahwah, New Jersey 07430, USA.)
s = 99 pixels. The minimums in the variogram are a consequence of choosing an s step similar to the period of the image: the difference between the gray levels should then be zero, and therefore the variance should also be zero. The variance is not zero because the period is not exactly the same in all directions and the pattern is not repeated exactly in the image. In this image the variance change between the linear region and the periodic region is clearly defined. We were able to calculate a fractal dimension only at lower scale by using the variogram slope. The obtained H value was 0.5004 ± 0.0003 (r = 1), which gives a fractal dimension value of D = 2.4996 ± 0.0003. Figure 11 corresponds to another numerically generated image, which consists of elemental cells of about 29 × 7 pixels (along the x axis and the y axis, respectively). A distance equal to 15 pixels along the y axis and 57 pixels along the x axis separates the centers of cells of the same color. As can be seen, this figure shows minimums of the variance at step values corresponding to these distances and their multiples. The first minimum corresponding to the x axis (57 pixels) is superposed with the fourth multiple (60 pixels) of the characteristic minimum on the y axis (see Fig. 12). In the Figure 11 variogram we observe that the change from the fractal region to the periodic region appears at a step of approximately 8 pixels. In Bonetto and Ladaga's work, the periodic region for any surface was described as a spectrum of spatial frequencies that can be represented by
FIGURE 12. Variance of the brightness levels for the image in Figure 11 in the periodic region, as a function of (a) s_x (minimums at s_x = 57 and 114) and (b) s_y (minimums at s_y = 15, 30, 60, and 120). (Adapted from Bonetto and Ladaga (1998, Fig. 5, p. 459). Copyrighted and reprinted with the permission of SCANNING, and/or the Foundation for Advances of Medicine and Science (FAMS), Box 832, Mahwah, New Jersey 07430, USA.)
"k wave vectors." We should note that in this definition there is no variable depending on time. By projecting the k wave vectors on the pair of orthogonal coordinate axes, we were able to obtain (k 2) as (k 2) - - ( k 2) + (k~}
(22)
From the previous equation we note that it is not necessary to make explicit to which vector the components belong. The spatial frequencies are directly related to the minimums of the variogram in the orthogonal representation. Thus, in search of a new parameter, we can calculate ⟨k_x²⟩ and ⟨k_y²⟩ by taking only the k_xi and k_yi corresponding to minimums of the variogram. The subscript index i identifies the corresponding k_i wave vector. The suggested expressions are
\langle k_x^2 \rangle = \sum_i \omega_{xi} k_{xi}^2, \qquad \langle k_y^2 \rangle = \sum_i \omega_{yi} k_{yi}^2

where the ω_xi and ω_yi values denote the statistical probability of k_xi² and k_yi², respectively:

\omega_{xi} = \frac{\Delta V_{xi}/V_{Mxi}}{\sum_j \Delta V_{xj}/V_{Mxj}}, \qquad \omega_{yi} = \frac{\Delta V_{yi}/V_{Myi}}{\sum_j \Delta V_{yj}/V_{Myj}}

The values V_Mxi and V_Myi are the variance values of the left-side maximum closest to s_xi and to s_yi, respectively. ΔV_xi and ΔV_yi are obtained from the difference between the nearest maximum and the nearest minimum.* The spatial frequencies are k_xi = 2π/s_xi and k_yi = 2π/s_yi, where the step values s_xi and s_yi correspond to the minimums of the variogram. Therefore, ⟨k²⟩ calculated this way should not be mistaken for the average of the squares of all Fourier wave vectors. If the Fourier transform of the image is carried out, one will see a correlation between a peak in the Fourier power spectrum and the corresponding characteristic minimum in the variance. This is shown in Van Put's work (1991) for an image of a radiolarian. In Appendix I this correlation is analyzed for a numerically generated image, and in Appendix II a theoretical example is drafted. In both appendixes one can see that minimums appear in the variogram at characteristic steps and their multiples. The square root of the ⟨k²⟩ value of Eq. (22) can be interpreted as the inverse of an average distance d_per, characteristic of the surface in the periodic region:

d_{per} = \frac{2\pi}{\langle k^2 \rangle^{1/2}} \qquad (23)
*The ω_xi (ω_yi) expressions in Bonetto and Ladaga's paper (1998) are incorrect. ΔV_xi (ΔV_yi) must be divided by V_Mxi (V_Myi).
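As a concrete illustration of Eqs. (22) and (23), the sketch below (ours; it assumes the minima steps and the ΔV/V_M ratios have already been read off the variogram) computes the weights ω and the d_per parameter. With near-uniform weights, which is what Table 1 reports, the steps of Figure 12 reproduce d_per ≈ 31 pixels:

import numpy as np

def mean_k2(s_min, dV, VM):
    # <k^2> over the variogram minima: k_i = 2*pi/s_i, weighted by
    # w_i = (dV_i/VM_i) / sum_j (dV_j/VM_j)
    k = 2.0 * np.pi / np.asarray(s_min, dtype=float)
    w = np.asarray(dV, dtype=float) / np.asarray(VM, dtype=float)
    w /= w.sum()
    return np.sum(w * k**2)

def d_per(sx, dVx, VMx, sy, dVy, VMy):
    # Eq. (23): d_per = 2*pi / (<kx^2> + <ky^2>)**0.5
    return 2.0 * np.pi / np.sqrt(mean_k2(sx, dVx, VMx) +
                                 mean_k2(sy, dVy, VMy))

# steps of Figure 12, with equal dV/VM ratios assumed for simplicity
print(d_per([57, 114], [1, 1], [1, 1],
            [15, 30, 45, 60, 75, 90, 105, 120], [1] * 8, [1] * 8))
# -> approximately 31.0 pixels, matching Table 1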
TABLE 1
CALCULUS OF THE d_per PARAMETER CORRESPONDING TO THE VARIOGRAMS IN FIGURE 12^a

  s_xi     k_xi         ω_xi          s_yi     k_yi         ω_yi
  57       0.110231     0.49956       15       0.418879     0.12487
  114      0.055116     0.50044       30       0.209439     0.12492
                                      45       0.139626     0.12496
                                      60       0.104720     0.12499
                                      75       0.083776     0.12501
                                      90       0.069814     0.12504
                                      105      0.059840     0.12508
                                      120      0.052360     0.12513

  ⟨k_x²⟩ = 0.007590                    ⟨k_y²⟩ = 0.033474

d_per = 2π/⟨k²⟩^{1/2} = 2π/(⟨k_x²⟩ + ⟨k_y²⟩)^{1/2} = 31.01 pixels

^a Adapted from Bonetto and Ladaga (1998, Table I, p. 460). Copyrighted and reprinted with the permission of SCANNING, and/or the Foundation for Advances of Medicine and Science (FAMS), Box 832, Mahwah, New Jersey 07430, USA.
Table 1 gives the values of the steps s_xi and s_yi (in pixels), the values of the spatial frequencies k_xi and k_yi, their respective probabilities ω_xi and ω_yi, and the average values ⟨k_x²⟩ and ⟨k_y²⟩, as well as the parameter d_per, for the plots of Figure 12. Taking into account the nearest elementary cells with enough statistical weight to produce periods in the variogram, we can state that the d_per parameter is a measurement of the average diameter of "holes" among these cells ("virtual circles" representative of the regions that separate these cells). In the images, we have drawn the circle of diameter d_per. As the spatial frequencies belong to the set of real numbers, k² = k_x² + k_y² is a rotational invariant and its average is also a rotational invariant. This can be verified only statistically, because of numerical uncertainties (digitization and edge effects) and because of the precision level of the chosen probability calculation. The d_per parameter is determined from the average of the different ⟨k²⟩ values deriving from the different rotations of the image between 0 and 90°.* In general, when the image is markedly anisotropic it is convenient to take more rotations than for an image of the isotropic type. The rotations are necessary since the error in the measurement is strongly influenced by the image anisotropy.

*In Bonetto and Ladaga's work (1998) the rotations were taken between 0 and 45°, but it is more convenient to take rotations between 0 and 90°.
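The rotation averaging just described can be sketched as follows (our illustration; texture_param stands for any of the d_per, d_min, or D pipelines, which are not shown here, and scipy is assumed to be available):

import numpy as np
from scipy.ndimage import rotate

def rotation_average(img, texture_param, angles=(0, 22.5, 45, 67.5, 90)):
    # average a textural parameter over image rotations between 0 and
    # 90 degrees and report its mean and standard deviation
    vals = np.array([texture_param(rotate(img, a, reshape=False,
                                          order=1, mode='reflect'))
                     for a in angles])
    return vals.mean(), vals.std(ddof=1)

# usage sketch: rotation_average(img, compute_dper), where compute_dper
# is the user's own d_per calculation (a hypothetical placeholder here)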
Another parameter defined by Bonetto and Ladaga (1998) is the parameter d_min, which corresponds to the inferior end of the periodic-scale region and is related to the smallest cell size with enough statistical weight to produce periods. The value of d_min in an image is also the superior limit of the low-scale region to which the fractal belongs. In the case of images whose variograms do not present a clear separation between the fractal region and the periodic region, the parameter d_min can be obtained as the intersection of the straight line that fits the low-scale region, to which the fractal belongs, and the straight line that best fits the maximums of the periodic region. The d_min and D values are calculated with the complete variogram (i.e., by adding all the squared gray-level differences resulting from scanning the image along x and along y). These two parameters will not necessarily be constant in different directions if the image is anisotropic. Therefore, it is convenient to take the average of the values obtained for these parameters over different rotations between 0 and 90°. The values of d_per, d_min, and D shown in the figures are obtained from an image without rotation. Many of the figure captions include the values and the errors corresponding to five rotations between 0 and 90°. In Figure 10, these values are d_per = 41.4 ± 1.5 pixels, d_min = 16.1 ± 3.0 pixels, and D = 2.522 ± 0.008. In Figure 11, the values are d_per = 30.27 ± 0.32 pixels, d_min = 7.397 ± 0.055 pixels, and D = 2.527 ± 0.011. The d_per and d_min parameters, together with the D fractal dimension, provide a satisfactory characterization of digital images of rough surfaces with periodic or quasi-periodic characteristics.
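For images in which the fractal and periodic regions do not separate cleanly, the intersection construction for d_min can be sketched like this (ours; the caller supplies the low-scale points and the maxima picked from the periodic region of the variogram):

import numpy as np

def d_min_intersection(s_low, v_low, s_maxima, v_maxima):
    # d_min as the step at which the low-scale (fractal) fit and the
    # fit through the periodic-region maxima intersect, in log-log space
    a1, b1 = np.polyfit(np.log(s_low), np.log(v_low), 1)        # fractal line
    a2, b2 = np.polyfit(np.log(s_maxima), np.log(v_maxima), 1)  # maxima line
    return float(np.exp((b2 - b1) / (a1 - a2)))  # solve a1*x + b1 = a2*x + b2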
B. Analysis of Theoretically Generated Images

One option with Bonetto and Ladaga's method is to try to apply the Zhao et al. model to images that present a periodic variogram. As an example, in the case of Figure 10, the Zhao et al. parameters are H = 0.5004 and σ² = 16,100 arbitrary units (a.u.). We obtained a value of 4σ² = 64,400 (a.u.), similar to the variance value at the upper limit of the fractal region, which in this case corresponds to the step that defines the d_min value. The parameter λ (average mound separation) is the s value corresponding to the first minimum (i.e., 49 pixels), taking the same-color squares as projections onto the plane of the Zhao et al. mounds. Since the x and y axes in this image coincide with the main axes of symmetry, we can use Eq. (20) for the 1 + 1 dimension. From the V(λ) value, corresponding to the first minimum, we obtain ξ = 2519 pixels, and from V(2λ), corresponding to the second minimum, we obtain ξ = 2510. The approximate equality of the ξ values would indicate the applicability of the Zhao et al. model to this kind of image. However, this ξ value does not have the same physical meaning (system correlation length) as theirs, because the variance in Bonetto and Ladaga's method is calculated from gray levels, not heights. In particular, we cannot find that the value obtained for ξ is
FIGURE 13. V versus s for the image shown at the bottom left of the variogram, obtained from addition of the images in Figure 4 and Figure 11, the latter with two nearby gray levels (132 and 165). The D, d_per, and d_min values in this plot correspond to the nonrotated image. The average values for different angles between 0 and 90° are D = 2.572 ± 0.008 and d_min = 7.71 ± 0.37. The corresponding parameter d_per = 26.32 ± 0.52.
representative of any textural property of this image. In fact, if we scanned the image in different directions, we would see that the image is not isotropic. This finding implies that the corresponding study for the 2 + 1 dimension would show only an average value of λ. Likewise, when the pattern has more than one characteristic period, the model is not applicable either. For most images that can be presented, we consider that the parameters D, d_per, and d_min enable us to characterize their texture without imposing isotropy conditions. In Figure 13 we can observe the variogram corresponding to a numerically generated image obtained by the pixel-to-pixel sum of the image of Figure 4 and another image exactly like that of Figure 11 but with two nearby gray levels (132 and 165). In this case, the fractal dimension obtained as the average of five rotations is 2.572 ± 0.008 (r = 0.999). As the elemental cells are not so well defined, the minimums are not so sharp, although they appear at about the same s values as in Figure 11. The d_per value obtained in Figure 13 is d_per = 26.32 ± 0.52. The d_per values in the two figures are different as a consequence of the lower edge definition mentioned previously. The d_min average value is d_min = 7.71 ± 0.37, which may be considered equal, within error, to the d_min average value in Figure 11. Figure 14 shows an array of rectangles with a qualitatively different anisotropy. Each rectangle has 7 × 8 pixels. We obtain the following values for the
FIGURE 14. V versus s for the image shown at the bottom left of the variogram (plot labels: D = 2.4985 ± 0.0003 (r = 1), d_per = 46.94, d_min ≈ 8; labeled points at [57, 1405] and [114, 1724]). The D, d_per, and d_min values in this plot correspond to the nonrotated image. The average values for different angles between 0 and 90° are D = 2.541 ± 0.016 and d_min = 7.53 ± 0.22. The corresponding parameter d_per = 46.71 ± 0.72.
three parameters: d_per = 46.71 ± 0.72 pixels, d_min = 7.53 ± 0.22 pixels, and D = 2.541 ± 0.016. Figure 15 shows an example with cells of two sizes (rectangles of approximately 7 × 8 pixels and squares of 15 × 15 pixels). The fractal region is narrower than in Figure 11 and the beginning of the periodic region is not so well defined. In this case we obtain the value d_min ≈ 7 pixels from the intersection between the straight line that best fits the maximums in the high-scale region and the straight line that fits the low-scale region, to which the fractal belongs. The obtained value is near the value of the smallest rectangle side. The average values obtained for five rotations between 0 and 90° were d_min = 7.28 ± 0.14 pixels and D = 2.528 ± 0.003. The value of d_per = 36.80 ± 0.52 pixels is the result of a larger number of cells per unit area than in Figure 11. Taking into account that for Figures 11, 14, and 15 the average values of D and d_min are similar, we find that the d_per parameter is the one that can be used to distinguish among these different images. In all the preceding cases analyzed, the fluctuation observed for d_per due to digitization effects, border effects, and uncertainty in the probability calculus amounts to nearly 10% or less.
FIGURE 15. V versus s for the image shown at the bottom left of the variogram (plot labels: D = 2.511 ± 0.004, d_per = 37.5; labeled points at [60, 2117] and [120, 4253]). The D, d_per, and d_min values in this plot correspond to the nonrotated image. The average values for different angles between 0 and 90° are D = 2.528 ± 0.003 and d_min = 7.28 ± 0.14. The corresponding parameter d_per = 36.80 ± 0.52.
C. Physics of Image Formation in SEM

The image in the SEM is formed in an unusual way, very different from the way images are formed by light. The electron gun in the electron column produces a source of electrons, which are accelerated to an energy between 1 and 40 keV. Two pairs of electron lenses reduce the electron beam diameter to a small area of about 5 nm, which is focused on the specimen. In the last lens, two pairs of electromagnetic deflection coils are used to control the raster of the beam across the specimen. Many different signals are produced as a result of the interaction of the electron beam with the specimen atoms. The Everhart-Thornley (E-T) scintillator-photomultiplier detector detects a mixture of two of these signals (secondary electrons (SEs) and backscattered electrons (BSEs)). This detector has a collector grid that can be polarized between −100 and 300 V according to whether we want to reject or attract the SEs, whose energy lies between 0 and 50 eV. The resulting amplified signal modulates the gray levels of the screen of a cathode ray tube (CRT) in synchronism with the scan across the specimen. This ensures that a point of the image corresponds to a point of the sample. An increase in the magnification is obtained by scanning the beam over an
even smaller area of the sample, whereas the signal is always observed on the same screen of the CRT. Since the aperture of the electron beam is a few milliradians, the depth of focus is much greater than in optical microscopy. Despite this unusual and complicated way of forming images, the result is easy to interpret in samples observed at low magnification (i.e., up to 5000× or 10,000×). The ease of interpreting SEM images is due, partly, to a light-optical analogy demonstrated by Oatley et al. (1965). In particular, the light-optical analogy for SEM images produced by the E-T detector with SEs and BSEs is to place a directional light source at the position of the E-T detector, add a general diffuse illumination, and have an observer looking down on the scene. (For a more complete discussion, see Goldstein et al., 1992.) Next, we provide a brief explanation of the interaction of electrons with matter and the contrast mechanisms detected through the E-T detector. There are other SEM imaging modes (see, for example, Goldstein et al., 1992; Reimer, 1985), but we analyze only SEs and BSEs in the E-T detector because this mode has been used at our institutions for studying the texture of SEM images.
1. Interaction of Electron Beam with the Specimen

When an electron beam interacts with the surface of a sample, different and complex phenomena are produced. Elastic scattering and inelastic scattering are the most common. About 98% of the interactions are elastic and 2% are inelastic. In elastic scattering, the interaction can be with the high coulomb field near the nuclei of the sample atoms or with the outermost electrons of the sample atoms. In the first case, there is little possibility of energy interchange between the incident electron and the atomic nucleus, but a great deflection of the incident electron is produced. Although the most probable deflection angle is approximately 5°, the range is between 0 and 180°. Thus, one or more of these elastically scattered electrons can be backscattered and even leave the sample surface (BSEs), and their fraction increases with the atomic number. When the elastic scattering is due to an interaction between the incident electron and the most peripheral electrons of the sample atoms, many of these valence electrons are ejected out of the specimen as SEs of low energy (lower than 50 eV). In inelastic scattering, the interaction of the incident electron may be with the electrons in the inner shells of the atom, which gives rise to an X-ray spectrum with the characteristic lines of the elements present in the sample. Because of the small mass of atomic electrons, the kinetic energy transferred in these coulomb transitions may be so great that the electron is ejected outside the atom. Usually, the energy transferred to the ejected electron can be 1 keV or more, and consequently this electron in turn produces secondary ionizations until it is totally stopped. These inelastic collisions with the electrons in the
inner levels of the atom are the most usual mechanism by which the electron loses its kinetic energy in a sample. Sometimes the vacancy produced in an inner shell is filled by an electron through a nonradiative transition (i.e., the energy released in that transition is used by the atom to eject another electron, which leaves the sample). This effect, which competes with characteristic X-ray emission, is called the Auger effect. When the inelastic scattering is with the coulomb field of the specimen atoms, the incident electron invariably changes its initial direction. In some cases radiation is emitted and the interacting electron loses an equivalent amount of its incident energy. At energies lower than 100 keV, approximately 0.5% of the incident electrons lose their energy through this process, which gives rise to bremsstrahlung, or continuum, X-rays.
2. Contrast Mechanisms in the E-T Detector

The contrast in an image produced by the E-T detector, which is placed next to the specimen, may be due to BSEs or to a mixture of SEs and BSEs. In the first case, the collector grid is negatively polarized and the SEs, which have very low energy, are rejected. The BSE energy oscillates between zero and E0, the incident electron energy, with a maximum at about 0.9E0. Only the BSEs resulting from the primary electron beam that travel toward the detector will be collected, in a small solid angle. This gives the image a large component due to the electron trajectory. In the second case, apart from BSEs, SEs produced by the incident beam at the impact point (SE1), SEs produced in the sample by the BSEs (SE2), and remote SEs produced at different places in the sample holder chamber by the impact of BSEs (SE3) will be collected. The SE1 secondary electrons are high-resolution electrons, as they come from an area similar to the incident beam size. SE2 and SE3 correspond to the BSE signal and therefore have a poorer resolution. The BSEs travel distances many times the electron beam size, deeply and laterally, depending on the atomic number of the sample and on the incident beam energy. Their range estimation depends on the approximations carried out for the range calculation of the incident electron. It can be said, as an example, that for an incident energy of 20 keV the distance is about 0.15 μm for Au and 1.5 μm for C. The SE signal is more strongly influenced by SE1 for light elements and by SE2 and SE3 for heavier elements. Two main contrast mechanisms can be considered for BSEs and SEs: the material, or compositional, contrast and the topographic contrast. The BSE coefficient (η) strongly depends on the atomic number of the sample; it is higher for elements of higher atomic number. This causes an important compositional contrast in the BSE case. Because of the dependence of the backscatter coefficient on the atomic number, there is a greater contrast between elements of adjacent atomic numbers for low Z than for high Z. The SE coefficient
(δ) is approximately 0.1 for all the elements. However, a compositional contrast is observed, and it results from the contribution of the SE2 and SE3 components. As regards the topographic contrast, this is the most important mechanism, since it is continuously being used in SEM to observe the topographic characteristics of samples in different fields of technology and science. When a negative bias is applied to the collector grid, the topographic contrast in the E-T detector is exclusively due to BSEs. The backscattering coefficient η increases with the tilt angle ψ (the angle between the incident beam and the surface normal). The angular distribution of BSEs, dη/dΩ, varies with the angle φ between the surface normal and the emission direction, following approximately Lambert's cosine law (dη/dΩ ∼ cos φ) for normal incidence. When the tilt angle increases, the angular distribution deviates from the cosine law, becoming very elongated. At large tilt angles, low-Z elements may have such an angular distribution that the backscattering coefficient in the direction of maximum emission is larger than that for high Z. The small solid angle from which these BSEs are received gives the image a very marked directional component, with very bright regions pointing at the detector and dark regions corresponding to surfaces pointing away from the detector (Fig. 16a). When a positive bias is applied to the collector grid, the topographic contrast is due to BSE and SE signals. The high BSE directionality is not so marked in this situation because of the indirect re-collection through the SE2 and SE3 signals, as can be observed in Figure 16b. The SEs produced by the primary beam, SE1, are the ones that contribute to the high resolution of this contrast mode. The SE coefficient δ depends on the tilt angle ψ and increases with increasing specimen tilt (δ ≈ δ0 sec ψ, with δ0 equal to the SE coefficient at 0°). The SE angular distribution follows Lambert's law (dδ/dΩ ∼ sec ψ cos φ), regardless of the energy of the primary electron, the surface tilt, and the material.

3. Noise Influence in the Fractal Dimension Calculation
In all digital images resulting from different instruments, noise is always present, and eliminating it is very difficult. In the case of images produced with the SEM, the primary electron (PE) beam presents the so-called shot noise, which means that the number of PEs that strike the sample in a given time τ per pixel is statistically distributed. This implies that if this time τ is divided into a large number of time intervals, the probability of observing one electron in one of these intervals is much less than unity and the probability of observing more than one electron per time interval is negligible. The shot noise is transmitted as noise in the SE and BSE signals. If I_p is the electron beam current and e is the electron charge, the mean number of
FIGURE 16. Image of a steel fracture surface, with Everhart-Thornley (E-T) detector (a) biased negatively and (b) biased positively.
PEs per pixel is equal to

\langle n_p \rangle = I_p \tau / e \qquad (24)
The distribution of PEs follows a Poisson distribution with mean value ⟨n_p⟩ and variance Var(n_p) = ⟨n_p⟩. The signal-to-noise ratio (S/N) of the PEs is given by

(S/N)_{PE} = \langle n_p \rangle / [\mathrm{Var}(n_p)]^{1/2} = (I_p \tau / e)^{1/2} \qquad (25)
When the PEs interact with the specimen, BSEs and SEs result. In the BSE case, a binomial distribution applies, because the only possibilities are absorbed PEs or BSEs, with backscattering coefficient η. The cascade of a Poisson distribution of PEs and a binomial BSE distribution follows a Poisson distribution, and the signal-to-noise ratio is as follows (see Reimer, 1985, pp. 155-158):

(S/N)_{BSE} = (I_p \eta \tau / e)^{1/2} \qquad (26)
The distribution of the emitted SEs is neither binomial nor Poisson, since a PE can excite zero, one, or more SEs with decreasing probability. In this case, δ being the SE coefficient, the signal-to-noise ratio is given by

(S/N)_{SE} = [I_p \tau / e(1 + b)]^{1/2} \qquad (27)
In the case of a Poisson distribution, b = 1/δ. However, because of the deviation from this statistic, b is higher than this value by a factor of between 1.2 and 1.5 for electrons between 10 and 20 keV. In addition, SEM images may present a noise that can be considered spatially uncorrelated, characterized by a Gaussian distribution of gray levels. Noise elimination is necessary so that mistakes are not made in the analysis of the image textures under study. If a digital filter is used, it must eliminate the noise effectively without distorting the image texture. The median filter suggested by Tukey (1971) is effective in removing image shot noise (Podsiadlo and Stachowiak, 1995). It consists of replacing the central pixel value by the median value of the pixel grays within the window. This filter rounds off the areas corresponding to image corners, since an edge displacement is produced. It also erases linear structures that are narrower than the window half-width. As a way to keep the lines and corners that are eliminated by the standard median filter, the hybrid-median filter was invented (Nieminen et al., 1987). This filter arranges the brightness values of the pixels within the window in two groups: one contains the elements of the diagonals of the N × N pixel window, including the central pixel, and the other contains the elements of the medians (the central row and column) of the window, including the central pixel. The statistical median of each of the two groups and the brightness value of the central pixel form another group. The median value of this group is the brightness value of the filtered pixel.
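A minimal 3 × 3 version of the hybrid-median filter just described might look as follows (our sketch; we read the two groups as the diagonal 'X' neighbors and the central row and column '+' neighbors, each including the central pixel):

import numpy as np

def hybrid_median_3x3(img):
    # median of (a) diagonal neighbors + center, (b) row/column
    # neighbors + center, and (c) the center pixel itself
    z = img.astype(np.float64)
    p = np.pad(z, 1, mode='edge')
    c = p[1:-1, 1:-1]
    diag = np.stack([p[:-2, :-2], p[:-2, 2:], p[2:, :-2], p[2:, 2:], c])
    plus = np.stack([p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:], c])
    m1 = np.median(diag, axis=0)
    m2 = np.median(plus, axis=0)
    return np.median(np.stack([m1, m2, c]), axis=0)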
FIGURE 17. Influence of noise in a theoretical Brownian image of fractal dimension 2.7, shown at the bottom left of the variogram. When the value corresponding to noise is subtracted from the image with noise, one obtains a value (D = 2.673 ± 0.006) that is very close to the value obtained for the image without noise (D = 2.670 ± 0.006). The same is not true when one uses the hybrid-median filter, for which the fractal dimension value is too low (D = 2.592 ± 0.012).
It has been observed that applying the hybrid-median filter to images with shot noise can yield fractal dimension values below the "true" values. To show this, the following simulation can be carried out: an image without noise is taken and the fractal dimension is calculated ("image" in Fig. 17); a slope value of 0.660 ± 0.012 is obtained, with a correlation coefficient r = 0.99 in the variogram. This implies a fractal dimension value D = 2.670 ± 0.006. Then 1% of shot noise is added and the new fractal dimension is calculated, which will be larger ("image and 1% noise" in Fig. 17). The slope value obtained in this case is 0.552 ± 0.012, with r = 0.99, and the fractal dimension value is D = 2.724 ± 0.006. If the hybrid-median filter is then applied with a 3 × 3 window to the second image, the fractal dimension value obtained is smaller than that corresponding to the image without noise ("image and 1% noise (H-M)" in Fig. 17). The slope value in this case is 0.816 ± 0.024, with r = 0.99, and D = 2.592 ± 0.012. The shot noise is spatially uncorrelated and can be discounted in the variogram by directly subtracting its variance from the variance value of the image with noise. To show this, we generate an image with a gray level equal to the average gray level of the image without noise, and we add 1% noise. The variance of this image corresponds to "noise" in Figure 17. Subtracting the average variance of the noise from each variance value of the image with noise, we obtain the curve called "image and 1% noise − noise."
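The subtraction is simple to reproduce numerically; in the sketch below (ours, with a cumulative-sum stand-in texture and spatially uncorrelated Gaussian noise of matched variance instead of true shot noise), the noise variogram is flat, so its mean can be subtracted point by point:

import numpy as np

def variogram_x(z, max_step):
    # variance of gray-level differences along the x scan direction only
    return np.array([np.mean((z[:, s:] - z[:, :-s])**2)
                     for s in range(1, max_step + 1)])

rng = np.random.default_rng(1)
clean = np.cumsum(rng.normal(size=(256, 256)), axis=1)   # stand-in texture
sigma = 0.05 * clean.std()
noisy = clean + rng.normal(scale=sigma, size=clean.shape)

# "noise" curve: a flat image at the mean gray level plus the same noise
flat = clean.mean() + rng.normal(scale=sigma, size=clean.shape)

v_noisy = variogram_x(noisy, 64)
v_noise = variogram_x(flat, 64)           # ~ constant = 2 * noise variance
v_corrected = v_noisy - v_noise.mean()    # tracks variogram_x(clean, 64)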
FIGURE 18. Fourier power spectrum (P) versus spatial frequency (f): (a) for the image of fractal dimension 2.7 without noise, shown at the bottom left of the variogram in Figure 17, and (b) for the image of fractal dimension 2.7 with 1% noise added.
The variance values practically coincide with the values corresponding to the image without noise, which results in a slope of 0.654 ± 0.012, with r = 0.99, and a fractal dimension D = 2.673 ± 0.006. We can also obtain the fractal dimension values of these images by using the Fourier power spectrum method. Figure 18a shows the graph corresponding
to the image without noise, numerically generated as shown at the bottom left of the variogram in Figure 17. The slope value in this case, with all the spectrum points approximately equally spaced on a logarithmic scale, is −2.65 ± 0.05, with r = −0.99, and the fractal dimension is D = 2.67 ± 0.02. Figure 18b shows the influence of noise at high frequencies. The slope obtained by eliminating the high-frequency region, corresponding to noise, is −2.65 ± 0.06, with r = −0.99, and the fractal dimension is D = 2.67 ± 0.03, which, as can be observed, is equal to the fractal dimension of the image without noise. Hamblin and Stachowiak (1994) observed that SEM images yielded significantly different fractal dimension values when the magnification was varied by 10×. This finding does not mean that the SEM is inadequate for determining fractal dimension values. Rather, the different values obtained were the result of the noise associated with the decrease of the current on the sample, which in turn resulted from the necessary decrease in probe size when the magnification was increased. It has been proved that when one observes the sample at different magnifications but keeps the probe size constant, the fractal dimension does not vary. Conversely, it has been observed that if a sample area is kept at the same magnification value but the probe size is changed, the fractal dimension grows as the probe size decreases (Bonetto, 1994). To show that the probe size variation influences the obtained value of the fractal dimension, we can take a coal sample and obtain two images at 10,000× magnification with electron probe sizes of 20 and 10 nm. The probe currents are 0.9 × 10⁻¹⁰ and 1.2 × 10⁻¹¹ A, respectively, and the SE coefficient for coal is δ ≈ 0.1 for a PE energy of 25 keV. To calculate the signal-to-noise ratio, we use Eq. (27). The beam permanence time in each pixel is τ ≈ 1 × 10⁻⁴ s/pixel, the electron charge is e = 1.6 × 10⁻¹⁹ C, and we suppose b = 1/δ = 10. Therefore,

(S/N)_{SE} = I_p^{1/2} \left[ \frac{\tau}{e(1+b)} \right]^{1/2} = I_p^{1/2} \left[ \frac{10^{-4}\,\mathrm{s}}{1.6 \times 10^{-19}\,\mathrm{C} \times (1+10)} \right]^{1/2} = I_p^{1/2} \times 7.5 \times 10^6\ (\mathrm{s/C})^{1/2}
For I_p = 0.9 × 10⁻¹⁰ A, the signal-to-noise ratio results in

(S/N)_{SE} = 71.15, \quad \text{or} \quad (N/S)_{SE} = 0.014 \approx 1.4\%

For I_p = 1.2 × 10⁻¹¹ A, the signal-to-noise ratio results in

(S/N)_{SE} = 25.98, \quad \text{or} \quad (N/S)_{SE} = 0.038 \approx 4\%
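These figures follow directly from Eq. (27); a quick numerical check (ours) reproduces them to within rounding:

e = 1.6e-19    # electron charge (C)
tau = 1.0e-4   # dwell time per pixel (s)
b = 10         # b = 1/delta, with delta ~ 0.1 for coal

def sn_se(ip):
    # Eq. (27): (S/N)_SE = (Ip * tau / (e * (1 + b)))**0.5
    return (ip * tau / (e * (1 + b))) ** 0.5

for ip in (0.9e-10, 1.2e-11):
    print(f"Ip = {ip:.1e} A: S/N = {sn_se(ip):.1f}, N/S = {1.0 / sn_se(ip):.3f}")
# prints roughly 71.5 and 0.014, then 26.1 and 0.038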
Again we produce two images with a brightness level equal to the average of the brightness levels of the coal images obtained with electron probe sizes of 10 and 20 nm. We add 4% of shot noise to the first one and 1% of shot noise to the second, and we calculate both variograms.
FIGURE 19. Noise effect in coal images taken at 10,000× magnification with electron probe sizes of 10 and 20 nm. The variograms corresponding to images with 4% and 1% noise are also present. The variogram corresponding to "image 10 nm" presents a larger fractal dimension at low scale than that of the 20-nm image (D = 2.8397 ± 0.0008 and D = 2.717 ± 0.003, respectively). When one subtracts the corresponding variance value of the shot noise, closer values are obtained (D = 2.682 ± 0.003 and D = 2.627 ± 0.005 for 10 and 20 nm, respectively).
Figure 19 shows the variograms corresponding to the coal images obtained with electron probe sizes of 10 and 20 nm ("image 10 nm" and "image 20 nm" in the graph), as well as the variograms corresponding to the images with 4% and 1% noise ("4% noise" and "1% noise" in Fig. 19). The size of these images is 512 × 512 pixels. The variograms of the coal images are different, and they produce very different fractal dimensions at low scale (slope of 0.3205 ± 0.0019, with r = 0.999, and fractal dimension D = 2.8397 ± 0.0008 for 10 nm; slope of 0.567 ± 0.006, with r = 0.999, and fractal dimension D = 2.717 ± 0.003 for 20 nm). At high scale the periodic zone appears. When one subtracts the noise value corresponding to each image, similar variograms are obtained ("image 10 nm − 4% noise" and "image 20 nm − 1% noise" in Fig. 19). Therefore, similar fractal dimension values are obtained in the low-scale zone (slope equal to 0.635 ± 0.006, with r = 0.999, and fractal dimension D = 2.682 ± 0.003 for 10 nm; slope equal to 0.746 ± 0.010, with r = 0.999, and fractal dimension D = 2.627 ± 0.005 for 20 nm), as it should be, since it is the same sample at the same magnification. The fact that the values are not exactly equal can be the result of a Gaussian noise component present in the image.
Podsiadlo and Stachowiak (1995) designed a filter for noise reduction in SEM images of biological wear particles that present shot noise and Gaussian noise. This filter is a combination of hybrid-median and sigma filters using a modulation function. These workers proved that, compared with other reference filters, this filter yielded very good results for their images. Again, if we calculate the power spectrum for the two previous images (Figs. 20a and 20b), we observe the influence of the spatially uncorrelated noise in the high-frequency region. In the low-frequency region, the influence of the periods is observed. The slope values of the graph in the intermediate region in the two cases are −2.59 ± 0.08 and −2.66 ± 0.07, with r = −0.99. The fractal dimensions are D = 2.71 ± 0.04 and D = 2.67 ± 0.03, respectively. With this example we want to point out that the influence of the probe size on the determination of fractal dimensions can be eliminated if the noise present in the image can be eliminated. The following example aims at showing that a magnification change in the SEM does not modify the obtained fractal dimension value in a sufficiently homogeneous specimen. The images shown in the Fourier power spectrum graphs in Figures 21a and 21b correspond to a gold sample excited with (a) an electron beam of probe size 40 nm and a magnification of 5000× and (b) an electron beam of probe size 15 nm and a magnification of 10,000×, respectively. The obtained slopes are −2.43 ± 0.07 and −2.56 ± 0.12, with r = −0.99, and D = 2.78 ± 0.03 and D = 2.72 ± 0.06, respectively. For high-magnification images, for which the influence of the spatially uncorrelated noise is important, the Fourier method seems to provide better fractal dimension values. In such cases, it may be convenient to use the Fourier method for the fractal dimension calculation and the variogram method for obtaining the periodic region parameters, in the search for a univocal characterization of the image. Nevertheless, it should be clear that when the periodic region is very extended, until it overlaps with the spectrum corresponding to the noise, the Fourier power spectrum does not allow a precise calculation of the fractal dimension.
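The Fourier route used for Figures 18, 20, and 21 can be sketched as follows (our illustration, not the FERImage implementation; it assumes a square image and the relation D = (8 − β)/2 between the fractal dimension of a surface image and the spectral exponent β, with the fit restricted to frequencies between the periodic region and the noise region):

import numpy as np

def fractal_dim_fourier(img, f_lo, f_hi):
    # radially averaged power spectrum; fit log P versus log f on
    # [f_lo, f_hi] and convert the slope to a fractal dimension
    z = img - img.mean()
    P = np.abs(np.fft.fftshift(np.fft.fft2(z))) ** 2
    cy, cx = np.array(P.shape) // 2
    yy, xx = np.indices(P.shape)
    r = np.hypot(xx - cx, yy - cy).astype(int)     # radial frequency bin
    counts = np.bincount(r.ravel())
    radial = np.bincount(r.ravel(), weights=P.ravel()) / np.maximum(counts, 1)
    f = np.arange(len(radial))
    m = (f >= max(f_lo, 1)) & (f <= f_hi)
    slope, _ = np.polyfit(np.log(f[m]), np.log(radial[m]), 1)
    return (8.0 + slope) / 2.0    # slope = -beta, so D = (8 - beta)/2

For instance, a fitted slope of −2.65 gives D = 2.675, in line with the values quoted above.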
D. The FERImage Program*

So that the variance calculation of an image could be carried out, an interactive and easy-to-use software program, the FERImage program, was developed. With this program, the values of the fractal dimension D and of the d_per and d_min parameters can be obtained from the variance data.

*From Bianchi and Bonetto (2001). Copyrighted and reprinted with the permission of SCANNING, and/or the Foundation for Advances of Medicine and Science (FAMS), Box 832, Mahwah, New Jersey 07430, USA.
FIGURE 20. P versus f: (a) 10 nm, D = 2.71 ± 0.04 (r = −0.99); (b) 20 nm, D = 2.67 ± 0.03 (r = −0.99).
As an alternative, the program allows the user to obtain fractal dimension values with the Fourier power spectrum method. The FERImage program is freely available at the following web site: http://dalton.quimica.unlp.edu.ar/cindeca/programa.htm
FIGURE 21. P versus f: (a) Au, 5000×; (b) Au, 10,000×.
FERImage was developed so that the user could choose the range in which the behavior is fractal, both in the variogram method and in the Fourier power spectrum method, and thus be able to obtain fractals at different scales. Even though the maximum available step in the variance calculation is equal to half the image, the program allows the user to take the necessary number of points
for each case. The step values, either initial or final, can be varied to exclude regions that do not respond to fractal behavior. In the cases in which the fractal region extends over almost the complete scale under study, the largest steps will weigh more statistically than the rest because of their high population density. As a way to avoid this, the program enables the user to take only steps that are approximately equally spaced on a logarithmic scale for the regression calculation. In the calculation of the three parameters for different angles, the program enables the user to take a square or a circle as the limit of the image. This last option is recommended since the user can then work with the same border effect in each direction. A histogram of the gray-level differences can be obtained for each step. If the fractal is Brownian, the histogram will fit a Gaussian centered at zero. Polar graphs for the calculation of the fractal dimension with the variogram method and with the Fourier power spectrum method can also be obtained. The program enables the user to graph the variogram, the average power spectrum, the polar graphs, and the Fourier transform. These graphs can be edited and saved. The results of all analyses can be saved to be read later, by other graphics programs for instance.
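The log-spaced step selection mentioned above takes only a couple of lines (our sketch; the density of points per decade is a free choice):

import numpy as np

def log_spaced_steps(max_step, points_per_decade=10):
    # steps approximately equally spaced on a log scale, so that the
    # densely populated large steps do not dominate the regression
    n = int(points_per_decade * np.log10(max_step)) + 1
    s = np.round(np.logspace(0, np.log10(max_step), n)).astype(int)
    return np.unique(s[s >= 1])

# e.g., fit the variogram regression only on log_spaced_steps(256)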
IV. TWO EXAMPLES OF APPLICATION IN SEM IMAGES
The ease of sample preparation, the excellent contrast, and the large depth of focus are the reasons for considering the SEM an attractive tool for characterizing roughness. Attaining an absolute quantitative value of the fractal dimension is difficult, mainly because of the complex physical processes involved in SEM imaging. However, one must always remember that it is possible to obtain comparative values of the fractal dimensions of the same material under the same conditions of image acquisition. In this section we show two real cases in which the three parameters D, d_per, and d_min have been applied to study texture in SEM images. Unlike the previously discussed theoretical examples, in these examples there is not just one particle of only one size, or only one structure that is repeated. However, it will be seen that the developed method enables us to characterize differences in samples due to differences in the physical characteristics of their formation. The examples given in this section, together with the data for the noise analysis used in the previous section, correspond to images of 512 × 512 pixels and 256 gray levels, obtained in a Philips SEM 505 scanning electron microscope. The digitizer attached to this microscope was designed and built in our laboratory (Viturro et al., 1994). This device has a Wormate high-speed data acquisition system, which digitizes the same analog signal that goes to the photograph screen of
the microscope. At each beam position, 16 measurements are taken in order to minimize the electronic noise. The images are stored in an ASCII character file, where each character represents a brightness level, and images of up to 1024 × 1024 pixels, in 256 gray levels, can be digitized.
A. Differences of Plastic and Crystalline Sulfur Behavior under the Action of Two Thiobacillus Types

The objective of this section is to show that the variogram method, applied to SEM images, enables us to characterize the behavior of two Thiobacillus types (ferrooxidans and thiooxidans) on crystalline and plastic sulfur substrates. In Briand et al.'s work (1999), the superficial interaction between Thiobacillus ferrooxidans and Thiobacillus thiooxidans and crystalline and plastic elemental sulfur was studied. Both strains were preserved in Imai culture medium (Imai, 1978) at an initial pH of 2, obtained with sulfuric acid addition, and with elemental sulfur as the only energy source. In that work the fractal dimension at low scale of images corresponding to different situations was measured. Bonetto and Ladaga (1998) extended the study to the high-scale region of the variogram, which presents periods. The samples of residual sulfur were observed and digitized in the SEM at 25 kV, with an electron probe 50 nm in diameter and a magnification of 5000×. Figure 22 shows the variograms for the four particular cases of crystalline and plastic sulfur under the action of each bacterium. Table 2 presents the production values of sulfuric acid, in millimoles of H+ per liter after 20 days of testing, together with the fractal dimension values at low scale and the d_per and d_min parameters for each of the four residual sulfur samples, taken as representative of each particular case. The evolution of the amount of sulfuric acid is a measure of the bacterial growth. The fractal dimension values at low scale for crystalline and plastic sulfur from the sterile medium (without bacteria) are given too. Each of the chosen samples corresponds to a particular condition: residual plastic sulfur taken from cultures of T. ferrooxidans (Plastic S° + T.f.) and T. thiooxidans (Plastic S° + T.t.), and residual crystalline sulfur taken from both cultures (Crystalline S° + T.f. and Crystalline S° + T.t.). The values of the three textural parameters are the average result of five measurements in different directions between 0 and 90°. The surface, both of crystalline sulfur and of plastic sulfur, when attacked by bacteria, shows the bacterial action in the image, which leaves marks in the bacteria's own size range (0.7-1.7 μm). These marks show clearly that the bacteria work with a different mechanism on the plastic sulfur than on the crystalline sulfur. This different behavior is well characterized by D, d_per, and d_min. The bacterial action on crystalline sulfur produces a decrease of the fractal dimension at low scale (lower than d_min), this effect being more pronounced for T. ferrooxidans
FIGURE 22. V versus s corresponding to the bottom left image of the variogram. (a) Residual crystalline sulfur grain + T. thiooxidans. (b) Residual crystalline sulfur grain + T. ferrooxidans. (c) Residual plastic sulfur grain + T. ferrooxidans. (d) Residual plastic sulfur grain + T. thiooxidans. A magnification corresponding to the periodic region is shown. The average values of D and d_min for different angles between 0 and 90°, as well as the corresponding parameter d_per, can be seen in Table 2. (Adapted from Bonetto and Ladaga (1998, Fig. 7, p. 461). Copyrighted and reprinted with the permission of SCANNING, and/or the Foundation for Advances of Medicine and Science (FAMS), Box 832, Mahwah, New Jersey 07430, USA.)
FIGURE 22. (Continued)

than for T. thiooxidans. In the case of plastic sulfur an opposite situation can be observed, in the sense that the decrease of the fractal dimension is more pronounced for T. thiooxidans than for T. ferrooxidans. In all cases the decrease of the fractal dimension implies that the bacteria smooth the substrate. Bacterial action is also transferred toward the bulk, being more pronounced in plastic sulfur than in crystalline sulfur. This effect gives rise to correlations at high scale, which can be seen in the variogram through the periods. By definition, the d_per parameter contains, in an implicit way, all the periods. The d_per values are
TABLE 2
THE ACID PRODUCTION AND THE THREE TEXTURAL PARAMETERS FOR DIFFERENT SAMPLES^a

  Sample^b                   Acid production    Fractal dimension    d_per (μm)        d_min (μm)
                             (mmol H+/liter)    D (r = 0.999)^c
  Sterile crystalline S°                        2.91 ± 0.04
  Sterile plastic S°                            2.71 ± 0.02
  Crystalline S° + T.t.      212 ± 4            2.874 ± 0.032        1.178 ± 0.036     0.228 ± 0.003
  Crystalline S° + T.f.      252 ± 4            2.743 ± 0.004        1.549 ± 0.036     0.343 ± 0.004
  Plastic S° + T.f.          240 ± 4            2.651 ± 0.005        1.876 ± 0.024     0.426 ± 0.005
  Plastic S° + T.t.          280 ± 4            2.594 ± 0.006        1.982 ± 0.065     0.474 ± 0.015

^a Adapted from Bonetto and Ladaga (1998, Table III, p. 462). Copyrighted and reprinted with the permission of SCANNING, and/or the Foundation for Advances of Medicine and Science (FAMS), Box 832, Mahwah, New Jersey 07430, USA.
^b T.t., Thiobacillus thiooxidans; T.f., Thiobacillus ferrooxidans.
^c At low scale.
correlated with the size of the marks left by the bacteria. A correlation exists between the fractal dimension behavior and the d_per parameter behavior, since more smoothing corresponds to a larger transfer toward the bulk. From the analysis of Table 2, we can conclude that there is a correlation between the acid production and the three textural parameters: a greater acid production corresponds to a smaller fractal dimension and larger d_per and d_min parameters. In the case of plastic and crystalline sulfur + T. ferrooxidans this effect does not hold, but it is necessary to make clear that acid production is a qualitative measurement of bacterial growth, since a broad variety of intermediate species are produced during elemental sulfur oxidation that can affect the sulfuric acid concentration (Schippers et al., 1996).
B. Quality Difference between Two Types of Emery Paper

The example in this section corresponds to Bonetto et al.'s work (submitted), and its purpose is to study the application of the three textural parameters D, d_per, and d_min for characterizing emery papers of different quality. These papers are manufactured according to FEPA (Federation of European Producers of Abrasives) standards or ISO 6344/1 and 6344/3 standards. In Figures 23 and 24, the SEM images for two paper qualities, A and B, respectively, are shown. These images were obtained at 100× magnification. The low magnification used allowed a significant number of particles to be included, even in papers of lower grit. The variance of gray levels was calculated
FIGURE 23. SEM images of A-quality emery paper for different grit sizes.
and the variogram was graphed; the fractal dimension was calculated from the slope obtained with a linear regression coefficient r = 0.999. The values that were obtained, shown in Tables 3 and 4, are the average results for five rotations between 0 and 90°. In these tables, the data for the smallest grain size (when 95% of each load has been deposited) are presented in the column d_s95. In this column the average diameter values of the particles for grits #220 and #320 can be observed, although these were determined by mechanical separation by means of sieves. In these tables, the values corresponding to d_min, d_per, and D are also included for the two paper qualities (A in Table 3 and B in Table 4, the former being of better quality). The outstanding difference in quality between papers A and B makes the comparison between papers A#800 and B#1000 possible (as will be seen further on). As an example, in Figure 25 the variograms corresponding to papers A#800 and B#1000 are shown. It is worth mentioning that even though the FEPA and ISO standards are well detailed as to the average diameters of the particles for each grit, they do not specify the
FIGURE 24. SEM images of B-quality emery paper for different grit sizes.
superficial density of particles. This prevents a numerical comparison with the d_per parameter, and the comparison can occur only qualitatively, by observing the images. A visual analysis of the images immediately shows the behavior tendency as regards the size and agglutination of the particles that form the emery paper: the smaller the observed particle size, the higher the grit number in A-quality emery paper. The average separation between particles decreases monotonically. In Table 3 the values obtained for the d_min and d_per parameters, respectively, enable us to confirm the visual inspection. Moreover, the value of d_min approaches d_s95.
In B-quality emery paper, the average separation between particles and the particle size change abruptly between grits #220 and #320, whereas these characteristics are very similar in grits #500 and #600. In grit #1000 an unexpected irregularity in the distribution and particle size is observed. These behaviors are not in agreement with those observed in better-quality papers. These
TABLE 3
THE d_s95 VALUES AND THE THREE TEXTURAL PARAMETERS, d_min, d_per, AND D, FOR A-QUALITY EMERY PAPERS

  Grit      d_s95 (μm)    d_min (μm)       d_per (μm)       D (r = 0.999)
  #2400     ≈5.4          4.93 ± 0.05      43.9 ± 3.6       2.68 ± 0.01
  #1000     12.4          8.26 ± 0.19      68.3 ± 1.8       2.53 ± 0.01
  #800      15.1          12.30 ± 0.30     75.8 ± 1.8       2.50 ± 0.01
  #500      21.5          22.90 ± 0.19     100.6 ± 4.8      2.522 ± 0.008
  #320      34.2          30.82 ± 0.30     120.9 ± 4.0      2.552 ± 0.004
  #220      53            52.19 ± 0.87     140.2 ± 7.2      2.586 ± 0.006
irregularities can be confirmed by studying in detail the d_per and d_min values in Table 4. From the visual inspection of the images, grits A#800 and B#1000 look similar, and this is confirmed since d_min and d_per can be considered equal within error for both cases. As regards the fractal dimensions, these differ significantly. For the B#1000 case, the fractal dimension D = 2.608 ± 0.006 is observed to be antipersistent. For paper A#800 the fractal dimension is D = 2.50 ± 0.01. It is known that when the edges are rougher the fractal dimension increases. The particles of paper B#1000 are smaller particle clusters and therefore the edges are more irregular. It should be noted that the calculated error of d_per in paper B#1000 is twice the error for paper A#800. This is a typical result of cluster heterogeneity.
TABLE 4
THE d_s95 VALUES AND THE THREE TEXTURAL PARAMETERS, d_min, d_per, AND D, FOR B-QUALITY EMERY PAPERS

  Grit      d_s95 (μm)    d_min (μm)        d_per (μm)       D (r = 0.999)
  #1000     12.4          12.85 ± 0.39      77.0 ± 3.5       2.608 ± 0.006
  #600      18            6.801 ± 0.077     49.28 ± 0.39     2.546 ± 0.013
  #500      21.5          7.29 ± 0.11       54.69 ± 0.71     2.538 ± 0.014
  #400      25.2          7.98 ± 0.16       69.1 ± 1.9       2.547 ± 0.015
  #320      34.2          14.07 ± 0.30      91.2 ± 3.4       2.558 ± 0.010
  #220      ≈53           46.2 ± 1.1        161.4 ± 5.1      2.611 ± 0.005
FIGURE 25. V versus s for images of two types of emery paper. Plot labels: (a) D = 2.5 (r = 0.999), d_per = 73.8 μm, d_min = 11.5 μm; (b) D = 2.598 (r = 0.999), d_per = 69.7 μm, d_min = 12.0 μm. The D, d_per, and d_min values in these plots correspond to the nonrotated image. The average values for different angles between 0 and 90° are as follows: (a) For image #800 shown in Figure 23, D = 2.50 ± 0.01 and d_min = 12.3 ± 0.3 μm. The corresponding parameter d_per = 75.8 ± 1.8 μm. (b) For image #1000 shown in Figure 24, D = 2.608 ± 0.006 and d_min = 12.85 ± 0.39 μm. The corresponding parameter d_per = 77.0 ± 3.5 μm.
V. CONCLUSIONS
We have demonstrated, through theoretical and SEM images, the capacity of Bonetto and Ladaga's method to determine parameters that characterize digital image textures. We have shown the existing correlation between a characteristic minimum and its multiples in the variogram and the corresponding characteristic maximum in the Fourier power spectrum. Although it is difficult to eliminate all noise sources, the Bonetto and Ladaga method emphasizes the determination of the periodic region; the noise sources influence mainly the determination of the fractal dimension. In texture characterization by means of this method it is necessary to work at the same magnification if the d_per parameter is used for comparing different samples. This then enables us to use the fractal dimension values of different samples without eliminating noise, since the main noise source, due to the current of the incident beam, is the same in all samples. When the Fourier power spectrum allows it, we have shown that the fractal dimension can be calculated from this spectrum once the noise and the periodic region are eliminated. The d_min parameter can be understood as the inferior end of the periodic region. In the theoretical and SEM images it has been observed that this value is related to the smallest cell size with enough statistical weight to produce periods. The FERImage program is usable as a tool for obtaining the aforementioned parameters in an interactive way, allowing the elimination of nonfractal regions, both in the variogram and in the Fourier power spectrum. The examples of the studied SEM images lead to the conclusion that d_per, d_min, and D can be used to characterize surface transformations due to certain physical-chemical processes. The variogram turns out to be a very efficient means of characterizing the different mechanisms of bacterial interaction with different substrates, over all the studied range and in samples that present a Brownian fractal region and a periodic region at different scales. A correlation has been observed between the D, d_per, and d_min parameters and the bacterial growth, measured by the sulfuric acid production. In the case of abrasive papers, it has been shown that quality control can be carried out by using the d_min parameter for characterizing the smallest grain size (when 95% of each load has been deposited (d_s95)). The d_per parameter allows an estimation of the particle density.

APPENDIX I: CORRELATION BETWEEN FOURIER POWER SPECTRUM MAXIMUM AND VARIOGRAM CHARACTERISTIC MINIMUM
The fast Fourier transform (FFT) method is used in image analysis for different purposes. The FFT is an algorithm designed to compute the discrete Fourier
transform (DFT). The DFT is the classical Fourier series with only the first N terms retained. It is stated (see, for example, Kanatani, 1984) that if the object is well represented by the Fourier series, it does not contain frequency components equal to or higher than N/2. The first N/2 coefficients of the DFT coincide with the coefficients of the classical Fourier series. The second half of the DFT coefficients coincides with the mirror reflection of the first half. The frequency component corresponding to N/2 is called the Nyquist frequency. That is to say, if the object is well represented by N points, there will be no frequency components in the DFT higher than the Nyquist frequency. The complete set of the squares of the N/2 coefficients is called the power spectrum, and the graph of those coefficients as a function of frequency is used to study the relative magnitude of each frequency component. In particular, for the calculation of the fractal dimension a log-log scale plot is used. Next, for a numerically generated image, we study the correlation between a characteristic minimum and its multiple in the variogram and a maximum in the power spectrum obtained from the Fourier-transformed image. Figure 26 shows the variogram corresponding to a theoretically generated image having elemental cells similar to a not very eccentric ellipse (longer axis about 14 and 15 pixels (y axis) and shorter axis about 12 pixels (x axis)). The gaps between the centers of same-color cells are about 29 pixels (y axis) and 26 pixels (x axis). Owing to this slight difference, there is not only one well-defined minimum in the variogram. In fact, there are two minimums between the two previous ones, which come from the inverse of the component of the wave vector corresponding to the period of about 19 pixels in an approximate direction of 45°.
FIGURE 26. V versus s for the image shown at the bottom left of the variogram. [Log-log plot of V versus s (pixel); characteristic minimum near [26, 5668]; the analyzed cell image is inset at bottom left.]
FIGURE 27. V versus s corresponding to the image in Figure 26 but scanned at an angle of 30°. [Log-log plot of V versus s (pixel); characteristic minimums near [19, 18431] and [60, 13845].]

In Figures 27 and 28 the variograms corresponding to image rotations of 30 and 45°, respectively, are shown. In Figure 27 two characteristic minimums (at about 19 and 60 pixels) and their multiples are observed. In Figure 29 one can see the power spectrum corresponding to the Fourier-transformed image in Figure 26. The abscissa corresponds to the wave number (n = 256/s), with s given in pixels.
FIGURE 28. V versus s corresponding to the image in Figure 26 but scanned at an angle of 45°. [Log-log plot of V versus s (pixel); characteristic minimum near [19, 5180].]
FIGURE 29. Fourier power spectrum (P) as a function of the wave number (n). [Log-log plot; outstanding maximums at wave numbers of about 4, 10, and 13, the last near [13, 7.1 × 10⁴].]
In the power spectrum graph only 128 pixels were used. Since the image is binary, the power spectrum shows great fluctuations at high frequencies. However, three outstanding maximums are evident in the graph. The wave numbers 4, 10, and 13 correspond approximately to the steps 60, 26, and 19, respectively, observed in the three variograms shown in Figures 26 through 28.
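The correspondence n ≈ 256/s between variogram minimums and power spectrum maximums can be reproduced numerically. The sketch below is our own illustration (the function names and the synthetic striped texture are assumptions, not the FERImage implementation): for an assumed stripe period of 32 pixels along x, the variogram minimum falls near s = 32 and the radially summed power spectrum peaks near n = 256/32 = 8.

```python
import numpy as np

def variogram(img, max_step):
    """V(s): mean squared gray-level difference of pixel pairs s apart (x axis)."""
    steps = np.arange(1, max_step + 1)
    V = np.array([np.mean((img[:, s:] - img[:, :-s]) ** 2) for s in steps])
    return steps, V

def power_spectrum(img):
    """|DFT|^2 summed over rings of constant wave number, up to Nyquist N/2."""
    F = np.fft.fft2(img - img.mean())
    N = img.shape[0]
    freq = np.fft.fftfreq(N) * N
    fy, fx = np.meshgrid(freq, freq, indexing="ij")
    ring = np.rint(np.hypot(fx, fy)).astype(int)
    P = np.bincount(ring.ravel(), weights=(np.abs(F) ** 2).ravel())[: N // 2]
    return np.arange(N // 2), P

# synthetic texture: stripes with a 32-pixel period along x, plus weak noise
y, x = np.mgrid[0:256, 0:256]
rng = np.random.default_rng(0)
img = (np.sin(2 * np.pi * x / 32) > 0) + 0.2 * rng.standard_normal((256, 256))

steps, V = variogram(img, 100)
n, P = power_spectrum(img)
print("variogram minimum near s =", steps[16 + np.argmin(V[16:64])])   # expect 32
print("spectrum maximum near n =", n[1 + np.argmax(P[1:])])            # expect 8
```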
APPENDIX II: THEORETICAL EXAMPLE TO SHOW THE CORRELATION BETWEEN THE FOURIER POWER SPECTRUM MAXIMUM AND THE VARIOGRAM CHARACTERISTIC MINIMUM

We next carry out some calculations to prove the correlation between a maximum in the Fourier power spectrum and a characteristic minimum and its multiples in the variogram. We assume the gray-level variance defined in the continuous space as

$$V(h, k) = \frac{1}{C} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} dx\, dy\, [z(x + h, y + k) - z(x, y)]^2 \tag{28}$$
where C is the normalization constant related to the area of the finite integration space and ⟨Δz⟩ = 0 has been assumed. Then, taking into account that |z(x, y)|² is a square-integrable function,

$$
\begin{aligned}
V(h, k) &= \frac{1}{C} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dx\, dy\, [z(x+h, y+k)\, z^*(x+h, y+k) \\
&\quad - 2\,\mathrm{Re}\{z(x, y)\, z^*(x+h, y+k)\} + z(x, y)\, z^*(x, y)] \\
&= \frac{2}{C} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dx\, dy\, \big[|z(x, y)|^2 - \mathrm{Re}\{z(x, y)\, z^*(x+h, y+k)\}\big] \\
&= \frac{2}{C} \Bigg[ \iiiint_{-\infty}^{\infty} du\, dv\, du'\, dv'\, \hat{z}(u, v)\, \hat{z}^*(u', v') \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dx\, dy\, e^{i 2\pi [(u-u')x + (v-v')y]} \\
&\quad - \mathrm{Re}\Bigg\{ \iiiint_{-\infty}^{\infty} du\, dv\, du'\, dv'\, \hat{z}(u, v)\, \hat{z}^*(u', v')\, e^{-i 2\pi (u'h + v'k)} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dx\, dy\, e^{i 2\pi [(u-u')x + (v-v')y]} \Bigg\} \Bigg] \\
&= \frac{2}{C} \Bigg[ \iiiint_{-\infty}^{\infty} du\, dv\, du'\, dv'\, \hat{z}(u, v)\, \hat{z}^*(u', v')\, \delta(u' - u)\, \delta(v' - v) \\
&\quad - \mathrm{Re}\Bigg\{ \iiiint_{-\infty}^{\infty} du\, dv\, du'\, dv'\, \hat{z}(u, v)\, \hat{z}^*(u', v')\, e^{-i 2\pi (u'h + v'k)}\, \delta(u' - u)\, \delta(v' - v) \Bigg\} \Bigg]
\end{aligned} \tag{29}
$$
In Eq. (29), the Fourier transform ẑ(u, v) is

$$\hat{z}(u, v) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dx\, dy\, z(x, y)\, e^{-i 2\pi (ux + vy)} \tag{30}$$

If we develop the Dirac-δ in its Fourier components (conjugate variables), the resulting expressions are

$$\delta(u' - u) = \int_{-\infty}^{\infty} e^{i 2\pi (u - u')x}\, dx \tag{31}$$

and

$$\delta(v' - v) = \int_{-\infty}^{\infty} e^{i 2\pi (v - v')y}\, dy \tag{32}$$
Equation (29) becomes

$$V(h, k) = \frac{2}{C} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} du\, dv\, |\hat{z}(u, v)|^2 \{1 - \cos(2\pi(uh + vk))\} \tag{33}$$

A simple theoretical example to show the correlation between a maximum in the power spectrum (|ẑ(u, v)|²) and a minimum (and its multiples) in the variance (V(h, k)) consists of supposing that |ẑ(u, v)|² = a² δ(u − u_i) δ(v − v_i). In this case we observe that V(h, k) = 0 for u_i = m/h and v_i = n/k, with m and n integer numbers. When the Dirac-δ corresponds to the Fourier power spectrum, the peak is perfectly identified and it corresponds to the minimums (zeros) of the variance for the corresponding h, k steps. If we now consider a more complex system in which |ẑ(u, v)|² = a² δ(u − u_i) δ(v − v_i) + g(u, v), with g(u, v) a finite, positive function defined in a finite space (pertaining to the rectangle −u_max ≤ u ≤ u_max and −v_max ≤ v ≤ v_max, and null outside this rectangle), then the variance becomes

$$V(h, k) = \frac{2}{C} \left[ a^2 \{1 - \cos(2\pi(u_i h + v_i k))\} + \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} du\, dv\, g(u, v) \{1 - \cos(2\pi(uh + vk))\} \right] \tag{34}$$
To see how well the minimums stand out from the background in V(h, k), we compare the maximum values of the two terms on the right-hand side of Eq. (34). In the first term, the maximum is 4a²/C. For the second term, if we denote by g_max the maximum value of the function g(u, v), we can write

$$\frac{2}{C} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} du\, dv\, g(u, v) \{1 - \cos(2\pi(uh + vk))\} \le \frac{4}{C} \int_{-u_{\max}}^{u_{\max}}\!\!\int_{-v_{\max}}^{v_{\max}} du\, dv\, g_{\max} = \frac{16}{C}\, g_{\max} u_{\max} v_{\max}$$

So that the minimum and its multiples in the variance can be well defined, the g(u, v) function must be such that

$$\frac{a^2}{4\, g_{\max} u_{\max} v_{\max}} \gg 1$$
The smaller the previous ratio, the less defined the minimums, and consequently their presence will be more difficult to ensure. In the case of digital images, the power spectrum is discrete and therefore the Dirac-δ can be thought of as a Gaussian function of small dispersion or as an impulse function of small width, and the previous analysis can be retained.
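A one-dimensional numerical check of this analysis, with toy values assumed for a², u_i, and the background g(u), shows the variance of Eq. (34) vanishing (up to the background term) at the characteristic step h = 1/u_i and its multiples:

```python
import numpy as np

# One-dimensional analogue of Eq. (33) with the assumed spectrum of Eq. (34):
# a sharp peak of weight a2 at u_i = 1/26 plus a weak constant background
# g_max on |u| <= u_max (all values are toy assumptions).
a2, u_i = 100.0, 1.0 / 26.0
u_max, g_max = 0.5, 0.02
u_bg = np.linspace(-u_max, u_max, 513)

def V(h):
    peak = 2 * a2 * (1 - np.cos(2 * np.pi * u_i * h))
    background = 2 * np.trapz(g_max * (1 - np.cos(2 * np.pi * u_bg * h)), u_bg)
    return peak + background

for h in (13, 26, 52, 78):
    print(f"h = {h:2d}   V(h) = {V(h):8.3f}")   # near zero at h = 26, 52, 78

# visibility condition (1-D analogue of a^2 / (4 g_max u_max v_max) >> 1)
print("visibility ratio:", a2 / (4 * g_max * u_max))
```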
ACKNOWLEDGMENTS

The authors are pleased to acknowledge Dr. Elena Forlerer for her important participation in this work and fruitful discussions, Lic. Mario Sánchez for his collaboration to obtain SEM images, and Dr. Pablo Bilmes for providing us with the sample of fractured steel; and Drs. Norma Gallegos, Linda Saal, Carmen Paniagua, and Silvana Bertolino for their help revising the manuscript. The authors also acknowledge financial support from the Consejo Nacional de Investigaciones Científicas y Técnicas de la República Argentina, the Universidad Nacional de La Plata, and the Universidad de Buenos Aires.
REFERENCES

Barabási, A. L., and Stanley, H. E. (1995). Fractal Concepts in Surface Growth. Cambridge, UK: Cambridge Univ. Press.
Barnsley, M. F., Devaney, R. L., Mandelbrot, B. B., Peitgen, H.-O., Saupe, D., and Voss, R. F. (1988). Fractals in nature: From characterization to simulation, in The Science of Fractal Images, edited by H.-O. Peitgen and D. Saupe. New York: Springer-Verlag, pp. 21-70.
Bianchi, F. D., and Bonetto, R. D. (2001). FERImage: an interactive program for fractal dimension, dper, and dmin calculation. Scanning 23, 193-197.
Bonetto, R. D. (1994). Private communication.
Bonetto, R. D., Forlerer, E., and Ladaga, J. L. (submitted). Characterization of the texture of digital images which have a periodicity or a quasi-periodicity. Meas. Sci. Technol.
Bonetto, R. D., and Ladaga, J. L. (1998). The variogram method for characterization of SEM images. Scanning 20, 457-463.
Bonetto, R. D., Ozols, A., Ladaga, J. L., and Sancho, E. (1998). Quantitative evaluation of solidification microstructures in centrifugally atomized powders, in Proceedings of the PM'98 Powder Metallurgy World Congress, Vol. 1. Granada, Spain: European Powder Metallurgy Association, pp. 202-207.
Bonetto, R. D., Sánchez, M., Alvarez, A. G., and Ladaga, J. L. (1996). Descripción de la rugosidad de superficie a través de las imágenes del microscopio electrónico de barrido, in Avances en Análisis por Técnicas de Rayos X, Vol. 9, edited by SARX. Córdoba, Argentina: Facultad de Matemática, Astronomía y Física, Universidad Nacional de Córdoba, ISSN 1515-1565, pp. 113-118.
Briand, L. E., Bonetto, R. D., Ladaga, J. L., and Donati, E. (1999). Bulk and surface characterization of crystalline and plastic sulfur oxidized by two Thiobacillus species. Process Biochem. 34(3), 249-256.
Brown, R. (1828). On the existence of active molecules in organic and inorganic bodies. Philos. Mag. 4, 162-173.
Bunde, A., and Havlin, S. (1994). Fractals in Science. Berlin/Heidelberg/New York: Springer-Verlag.
Bunde, A., and Havlin, S. (1996). Fractals and Disordered Systems. Berlin/Heidelberg/New York: Springer-Verlag.
Burrough, P. A. (1981). Fractal dimensions of landscapes and other environmental data. Nature 294, 240-242.
Einstein, A. (1905). Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen. Ann. Phys. 322, 549-560.
Family, F., Meakin, P., Sapoval, B., and Wool, R., eds. (1995). Fractal Aspects of Materials. Vol. 367, Materials Research Society Symposium Proceedings, Boston, MA, 1994. Pittsburgh, PA: Materials Research Society.
Feder, J. (1988). Fractals. New York/London: Plenum.
Goldstein, J. I., Newbury, D. E., Echlin, P., Joy, D. C., Romig, A. D., Jr., Lyman, C. E., Fiori, C., and Lifshin, E. (1992). Scanning Electron Microscopy and X-Ray Microanalysis, 2nd ed. New York/London: Plenum.
Hamblin, M. G., and Stachowiak, G. W. (1993). Comparison of boundary fractal dimensions from projected and sectioned particle images. Part I: Technique evaluation. J. Comput. Assist. Microsc. 5(4), 291-300.
Hamblin, M. G., and Stachowiak, G. W. (1994). Measurement of fractal surface profiles obtained from scanning electron and laser scanning microscope images and contact profile meter. J. Comput. Assist. Microsc. 6(4), 181-194.
Hausdorff, F. (1919). Dimension und äusseres Mass. Math. Ann. 79, 157-179.
Holý, V., and Baumbach, T. (1994). Nonspecular x-ray reflection from rough multilayers. Phys. Rev. B 49(15), 10668-10676.
Holý, V., Kubena, J., Ohlidal, I., Lischka, K., and Plotz, W. (1993). X-ray reflection from rough layered systems. Phys. Rev. B 47(23), 15896-15903.
Imai, K. (1978). On the mechanism of bacterial leaching, in Metallurgical Applications of Bacterial Leaching and Related Microbiological Phenomena, edited by L. E. Murr, A. E. Torma, and J. A. Brierley. New York: Academic Press, pp. 275-294.
Kanatani, K. (1984). Fast Fourier transform, in Particle Characterization in Technology, Vol. II, edited by J. K. Beddow. Boca Raton, FL: CRC Press, pp. 31-50.
Kaye, B. (1989). A Random Walk through Fractal Dimensions. Weinheim/New York: VCH.
Kaye, B. (1993). Chaos and Complexity: Discovering the Surprising Patterns of Science and Technology. Weinheim/New York: VCH.
Mandelbrot, B. B. (1967). How long is the coast of Britain? Statistical self-similarity and fractional dimension. Science 155, 636-638.
Mandelbrot, B. B., and Van Ness, J. W. (1968). Fractional Brownian motions, fractional noises and applications. SIAM Rev. 10, 422-437.
Mandelbrot, B. B. (1982). The Fractal Geometry of Nature. New York: Freeman.
Mandelbrot, B. B. (1985). Self-affine fractals and fractal dimension. Physica Scripta 32, 257-260.
Mark, D. M., and Aronson, P. B. (1984). Scale-dependent fractal dimensions of topographic surfaces: an empirical investigation, with applications in geomorphology and computer mapping. Math. Geol. 16(7), 671-683.
Nieminen, A., Heinonen, P., and Neuvo, Y. (1987). A new class of detail-preserving filters for image processing. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 74-90.
Nottale, L. (1993). Fractal Space-Time and Microphysics: Towards a Theory of Scale Relativity. Singapore: World Scientific.
Oatley, C. W., Nixon, W. C., and Pease, R. F. W. (1965). In Advances in Electronics and Electron Physics. New York: Academic Press.
Palasantzas, G., and De Hosson, J. T. M. (2000). Roughness effect on the measurement of interface stress. Acta Mater. 48, 3641-3645.
Pentland, A. P. (1984). Fractal-based description of natural scenes. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6(6), 661-674.
Podsiadlo, P., and Stachowiak, G. W. (1995). Median-sigma filter for SEM wear particle images. J. Comput. Assist. Microsc. 7(2), 67-82.
Press, W., Tolan, M., Stettner, J., Seeck, O. H., Schlomka, J. P., Nitz, V., Schwalowsky, L., Müller-Buschbaum, P., and Bahr, D. (1996). Roughness of surfaces and interfaces. Phys. B 221, 1-9.
Reimer, L. (1985). Scanning Electron Microscopy: Physics of Image Formation and Microanalysis. Berlin/Heidelberg: Springer-Verlag. (Springer Series in Optical Sciences.)
Richardson, L. F. (1961). The problem of contiguity: an appendix of statistics of deadly quarrels. Gen. Syst. Yearbook 6, 139-187.
Russ, J. C. (1990). Computer Assisted Microscopy: The Measurement and Analysis of Images. New York: Plenum.
Russ, J. C. (1994). Fractal Surfaces. New York/London: Plenum.
Russ, J. C., and Russ, J. C. (1987). Feature-specific measurement of surface roughness in SEM images. Part. Charact. 4, 22-25.
Sahimi, M., Rassamdana, H., and Mehrabi, A. (1995). Fractals in porous media: from pore to field scale, in Fractal Aspects of Materials, Vol. 367, Materials Research Society Symposium Proceedings, Boston, MA, 1994, edited by F. Family, P. Meakin, B. Sapoval, and R. Wool. Pittsburgh, PA: Materials Research Society, pp. 200-214.
Schippers, A., Jozsa, P., and Sand, W. (1996). Sulphur chemistry in bacterial leaching of pyrite. Appl. Environ. Microbiol. 62, 3424-3431.
Sinha, S. K., Sirota, E. B., Garoff, S., and Stanley, H. B. (1988). X-ray and neutron scattering from rough surfaces. Phys. Rev. B 38(4), 2297-2311.
Skands, U. (1996). Quantitative methods for the analysis of electron microscope images. Doctoral thesis, Technical University of Denmark, Institute of Mathematical Modelling (IMM), ISSN 0909-3192, pp. 73-77.
Stettner, J., Schwalowsky, L., Seeck, O. H., Tolan, M., Press, W., Schwarz, C., and von Känel, H. (1996). Interface structure of MBE-grown CoSi2/Si/CoSi2 layers on Si(111): partially correlated roughness and diffuse x-ray scattering. Phys. Rev. B 53(3), 1398-1412.
Tukey, J. W. (1971). Exploratory Data Analysis. Reading, MA: Addison-Wesley.
Van Put, A. (1991). Geochemical and morphological characterization of individual particles from the aqueous environment by EPXMA. Doctoral thesis, Universiteit Antwerpen, Universitaire Instelling Antwerpen, Departement Scheikunde, Belgium.
Viturro, H., Peez, C., and Bonetto, R. D. (1994). Desarrollo e implementación de un sistema digital de adquisición, procesamiento y análisis de imágenes y espectros de rayos x, in Avances en Análisis por Técnicas de Rayos X, Vol. 8, edited by D. M. Inés. Santiago, Chile: Universidad de Chile, pp. 379-384.
Wiener, N. (1923). Differential-space. J. Math. Phys. Mass. Inst. Technol. 2, 131-174.
Zhao, Y. P., Yang, H. N., Wang, G. C., and Lu, T. M. (1998). Diffraction from diffusion-barrier-induced mound structures in epitaxial growth fronts. Phys. Rev. B 57(3), 1922-1934.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 120
Degradation Identification and Model Parameter Estimation in Discontinuity-Adaptive Visual Reconstruction

ANNA TONAZZINI AND LUIS BEDINI

Institute for the Elaboration of Information, Area della Ricerca CNR di Pisa, I-56124 Pisa, Italy
I. Introduction 194
   A. Image Restoration 194
   B. Blind Image Restoration 197
   C. Unsupervised Blind Image Restoration 199
   D. Overview 201
II. Fully Bayesian Approach to Unsupervised Blind Restoration 202
   A. MAP-ML, or Alternating Maximization 204
   B. Expectation Maximization 204
III. The MAP-ML Method 206
IV. MAP Estimation of the Image Field 208
   A. The Mixed-Annealing Algorithm 208
   B. Use of a Preconditioned Conjugate Gradient 210
V. ML Estimation of the Degradation Parameters 215
VI. ML Estimation of the Model Parameters 217
   A. Derivation of the Model Parameter Updating Rule by Means of the Saddle Point Approximation 218
   B. Derivation of the Model Parameter Updating Rule by Means of MCMC Techniques 224
VII. The Overall Architecture for the Fully Blind and Unsupervised Restoration 227
VIII. Adaptive Smoothing and Edge Tracking 231
   A. An MRF Model for Constrained Implicit Discontinuities 232
   B. Coarse-to-Fine Detection of Edges by Means of GNC 234
IX. Experimental Results: The Blind Restoration Subcase 238
   A. Blur Identification with Known Ideal Image 238
   B. Joint Blur Identification and Image Restoration 241
X. Experimental Results: The Unsupervised Restoration Subcase 247
   A. ML Hyperparameter Estimation 247
   B. Adaptive Edge Tracking 261
XI. Experimental Results: The Fully Unsupervised Blind Restoration Case 270
XII. Conclusions 279
References 280
I. INTRODUCTION
In several and diverse applications of image processing, the observed image or data g is produced by an imaging system or a measurement system that introduces some amount of degradation and noise. This degradation and noise must be removed before any other processing can be efficiently performed on the data, and it must be removed in such a way as to reconstruct an image which is as faithful as possible to the ideal image f of the original object or scene to be analyzed. Although in general the operator of the imaging system can be either linear or nonlinear, for the applications considered in this article, we assume that the degradation operator is linear and shift invariant, and, more specifically, that g is the two-dimensional convolution of f with the point-spread function (PSF) of the system, plus superposition of additive white Gaussian noise. In many applications, such as astronomy, remote sensing, and biomedical imaging, these assumptions are reasonable and the PSF introduces a blur on the ideal image f. Blur is typically caused by optical aberrations, out-of-focus cameras, relative motion between the camera and the object, or atmospheric turbulence, whereas noise is due to the superposition of many factors, including electronic thermal noise, photoelectric noise, film grain noise, transmission noise, and, finally, quantization noise. Both blur and noise are unavoidable in practice, since the physical requirements for improving the performance of the imaging system are often unachievable, too costly, or even inconvenient. For example, in X-ray imaging applied to medicine, a better-quality image could be obtained by increasing the incident X-ray beam intensity; however, doing so would be dangerous for the patient's health. Another example is in biological microscopy: The fluorescence study of labeled or autofluorescent samples, which can support only very weak excitation light, requires the use of low-level light detector technology. Quantitative detection of extremely low-level light imposes severe limitations on photodetector performance with regard to signal-to-noise ratio, resolution, and dynamic range. Hence, in many practical situations, digital image restoration/reconstruction represents the only reliable solution for recovering the original image or, at least, for increasing the signal-to-noise ratio and the resolution.
A. Image Restoration

Most methods for recovering the original image f in visual reconstruction tasks have been developed under the assumption that the imaging system operator is explicitly and exactly known. Even under this favorable condition,
visual reconstruction remains an inverse ill-posed and ill-conditioned problem, for which a unique solution does not exist or is unstable (Bertero et al., 1988). For instance, the problem treated in this article (i.e., image deconvolution and denoising) typically consists of the inversion of a convolution operator. This process requires the numerical solution of a large ill-conditioned linear system arising from the discretization of two-dimensional Fredholm integral equations of the first kind with displacement kernels (Andrews and Hunt, 1977). As a consequence, the solution may be very sensitive to even small perturbations, which, as already noted, in practical applications always affect the data term. Various regularization strategies have been adopted to stabilize ill-posed inverse problems. Some methods exploit the intrinsic regularization properties of Wiener filters, recursive Kalman filters, truncated singular value decomposition (SVD), and iterative conjugate gradient techniques (Andrews and Hunt, 1977; Hansen, 1990; Van der Sluis and Van der Vorst, 1990; Woods and Ingle, 1981). Because the system matrix is a block Toeplitz matrix with Toeplitz block, some authors have proposed using the preconditioned conjugate gradient algorithm as a regularization procedure (Hanke et al., 1993). They have used a block circulant matrix with circulant blocks as a preconditioner, for which the fast Fourier transform (FFT) can be used in the computations, and have shown the effectiveness of this preconditioner in speeding the convergence of the conjugate gradient (R. H. Chan et al., 1984, 1993). Again in an attempt to overcome the ill-conditioning of the blur operator, other methods force specific regularity constraints onto the solution. The idea is to exploit as much a priori knowledge as possible to reduce the set of feasible solutions and thus, it is hoped, obtain a unique and robust inverse. Constrained least squares methods, such as the maximum entropy method (MEM), and Tikhonov standard regularization (Bertero et al., 1988; Burch et al., 1983; Hunt, 1973; Tikhonov and Arsenin, 1977) assume generic information of global smoothness properties of the image. For Tikhonov regularization, extensions of preconditioners for Toeplitz matrices have also been developed (R. H. Chan et al., 1993). The principal drawback of these approaches is that, since they enforce global smoothness constraints, discontinuities or edges between homogeneous regions of the image are not adequately recovered. To alleviate this inconvenience, several authors have proposed edge-preserving regularization strategies, which are either deterministic (Blake and Zisserman, 1987; Charbonnier et al., 1997), probabilistic (Bedini and Tonazzini, 1992; S. Geman and Geman, 1984; J. Marroquin et al., 1987), or variational (March, 1992; Mumford and Shah, 1989). All of these strategies lead to the minimization of an energy function derived by formulating the original problem as a
least squares problem, which accounts for data consistency, augmented by terms enforcing local smoothness constraints derived from available a priori information on the image and its discontinuities. According to the probabilistic approach, techniques of Bayesian estimation coupled with Markov random field (MRF) models are now recognized as a powerful tool for defining feasible solutions (S. Geman and Geman, 1984; Jeng and Woods, 1991; Li, 1995). Indeed, MRF models are very flexible for enforcing on the expected solution constraints which are derived from available real-world knowledge about the image features and the imaging system. For instance, these models are well suited to address a line process, related to intensity discontinuities, in order to prevent the reconstructions from being oversmoothed. Different models have been proposed in which the line process is either explicitly addressed, through the incorporation of extra variables, or implicitly addressed, through suitable functions (stabilizers) of the intensity gradients of various degrees. Furthermore, the line process can be either binary or continuous (Bedini, Gerace et al., 1996; D. Geman and Reynolds, 1992). Some studies have also explored the relationships between the various models and in particular have shown that when suitable properties are satisfied by the stabilizers, an equivalent recovery of the discontinuities is guaranteed without the introduction of auxiliary variables (D. Geman and Reynolds, 1992). The advantage of considering the discontinuities implicitly rather than explicitly is related to the reduction in computational costs through the use of deterministic algorithms. Nevertheless, most stabilizers for implicit line processes have been developed for the case of noninteracting discontinuities. This means that significant constraints on the discontinuities, such as connection and thinness, are not enforced on the geometry of the edges. Indeed they would require self-interactions of the line process, which are particularly difficult to model implicitly unless some approximation is made (Geiger and Girosi, 1991). Conversely, stabilizers that address a line process in an explicit manner are well suited to manage self-interactions among lines, which thus allows for the incorporation of geometric edge constraints (e.g., regularity constraints), which have proven to be very useful for improving the quality of the reconstructions (Bedini, Gerace et al., 1996). However, the cost of this advantage has to be paid in terms of increased computational complexity. All these models, expressed in the form of Gibbs priors and likelihood functions, can be merged according to Bayes' rule to derive a posterior probability which collects all the knowledge we have about the problem. The availability of well-understood estimators, such as the maximum a posteriori (MAP) estimator, allows for the definition of a single solution (S. Geman and Geman, 1984). If the preceding approaches are followed, many problems in image processing and in computer vision can thus be solved by minimizing a cost
function which includes both terms of data consistency and terms which force constraints on the solution. Whereas in Tikhonov regularization the energy function is convex, in regularization techniques involving discontinuities, the energy function usually presents many local minima, and the descent algorithms do not ensure that the global minimum can be found. Although not yet conclusive, a great deal of work has also been devoted to the definition of efficient and fast nonconvex optimization algorithms, both stochastic and deterministic, such as simulated annealing (SA) (Aarts and Korst, 1989; S. Geman and Geman, 1984), graduated nonconvexity (GNC) (Bedini, Gerace et al., 1994a, 1994b; Blake and Zisserman, 1987), mean field annealing (Bilbro and Snyder, 1990; Geiger and Girosi, 1991), the iterated conditional modes (ICM) algorithm (Besag, 1986), and the generalized expectation-maximization (EM, GEM) technique (Bedini, Salerno et al., 1994; Dempster et al., 1977). Both the stochastic and the deterministic algorithms are inherently distributed and parallel, so they are naturally suitable for implementations which take advantage of existing parallel architectures or innovative architectures based on neural network models (Bedini and Tonazzini, 1992; Horn, 1988; Koch et al., 1986; Lumsdaine et al., 1991).
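As a concrete illustration of the energy-minimization view just outlined, the following sketch (our own minimal Python example, for the pure denoising case H = I; it is not code from the cited works) performs gradient descent on E(f) = ‖g − f‖² + λ Σ φ(∇f), with φ either the quadratic Tikhonov-like stabilizer t² or a saturating, edge-preserving stabilizer θ²t²/(θ² + t²) that stops penalizing gradients much larger than the assumed threshold θ:

```python
import numpy as np

def denoise(g, lam=2.0, theta=0.3, edge_preserving=True, iters=400, step=0.05):
    """Gradient descent on E(f) = ||g - f||^2 + lam * sum phi(differences),
    with phi quadratic (Tikhonov-like) or saturating (edge preserving)."""
    f = g.copy()
    for _ in range(iters):
        dx = np.diff(f, axis=1, append=f[:, -1:])   # forward differences
        dy = np.diff(f, axis=0, append=f[-1:, :])
        if edge_preserving:
            # phi(t) = theta^2 t^2 / (theta^2 + t^2); below is phi'(t) / (2t),
            # a weight that vanishes across large (true-edge) gradients
            wx = theta**4 / (theta**2 + dx**2) ** 2
            wy = theta**4 / (theta**2 + dy**2) ** 2
        else:
            wx = wy = 1.0                            # phi(t) = t^2
        px, py = wx * dx, wy * dy
        # divergence of the weighted gradient (adjoint of the differences)
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        f -= step * (2 * (f - g) - 2 * lam * div)
    return f

rng = np.random.default_rng(1)
clean = np.zeros((64, 64)); clean[:, 32:] = 1.0      # a single sharp edge
noisy = clean + 0.15 * rng.standard_normal(clean.shape)
f_q = denoise(noisy, edge_preserving=False)
f_e = denoise(noisy, edge_preserving=True)
print("jump across the edge: quadratic %.2f, edge-preserving %.2f (true 1.00)"
      % (f_q[32, 33] - f_q[32, 30], f_e[32, 33] - f_e[32, 30]))
```

With the quadratic stabilizer the step edge is visibly smeared, while the saturating one leaves it essentially intact; this is the behavior that motivates the edge-preserving models discussed above.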
B. Blind Image Restoration
Image restoration methods assuming a known PSF are not suitable for many realistic image-processing applications. Indeed, the degradation process is often unknown or, alternatively, it is difficult and expensive to measure. Thus, in practice, the PSF has to be estimated from the observed data as an integral part of the restoration procedure. Blind image restoration, or joint PSF identification and image restoration, attempts to estimate both the true image and the PSF from the degraded image by using partial information about the imaging system and the features of the true image. Nevertheless, as can be expected, blind image restoration is far more indeterminate than image restoration with a known PSF, in the sense that this problem admits many, possibly an infinite number, of solutions. One cause of this, among others, is the higher number of variables with respect to the data. For instance, the couple given by a PSF in the form of a Dirac function and the observed image itself always constitutes a trivial solution. Hence, the adoption of constraints for both the image and the PSF becomes necessary. In the case of simple blurs, such as motion blur, and modest noise levels, blur identification was first tackled by methods based on the analysis of the zeros of the transfer function in the frequency domain (Gennery, 1973; Stockham et al., 1975). Subsequent work was based on ARMA (autoregressive moving
average) models for the distorted image, in which the AR parameters, related to the original image, and the PSF, which corresponds to the MA parameters, can be estimated by maximum likelihood (ML) estimation (Lagendijk, Tekalp et al., 1990; Tekalp and Kaufman, 1988; Tekalp et al., 1986), Kalman filtering (Angwin, 1989), and EM algorithms (Katsaggelos and Lay, 1991; Lagendijk, Biemond et al., 1990; Lay and Katsaggelos, 1990). Alternatively, still using the ARMA model for the degraded image, the generalized cross-validation (GCV) method has been used to determine the parameters that minimize a weighted sum of prediction errors, based on standard regularization (Reeves and Mersereau, 1992). Another approach, which has been widely experimented with, consists of forcing positivity and/or size constraints on both the image and the PSF. This can be done by means of iterative algorithms that alternately estimate the image, on the basis of the current estimate of the PSF, and the PSF, on the basis of the current image estimate. These estimates are computed by means of inverse filtering or Wiener filtering and then modified to satisfy the constraints (Ayers and Dainty, 1988; Davey et al., 1989). Alternatively, an error measure enforcing positivity constraints is defined and minimized by using SA (McCallum, 1990) or the conjugate gradient algorithm (Lane, 1992); also, a priori knowledge about the image and the PSF can be expressed through constraint sets and the solution can be computed by means of projection-based algorithms (Yang et al., 1994). The positivity and/or size constraints, although useful in reducing the number of admissible solutions, are not sufficient for a satisfactory solution of the blind restoration problem. For instance, the trivial solution given by the Dirac function and the data satisfies these constraints. Other constraints, such as the knowledge of the type of blur (e.g., motion blur) or the global smoothness of the blur itself, are too restrictive or force the estimate toward uniform blurs. We argue instead that the choice of a good image model is crucial for the success of joint image restoration and blur estimation. Indeed, in the theoretical case in which the true image is exactly known and the data are noiseless, excellent blur estimates are always obtainable through simple ML techniques, even in the absence of a priori constraints on the blur coefficients. When some noise is present on the data, increasingly better estimates can be obtained as the number of true edges in the image increases (Tonazzini and Bedini, 1998). This is not surprising, since, assuming for example that the image is piecewise smooth, most of the information about the blur is expected to be located across the discontinuity edges. Although global smoothness of the image has been used in ML- and GCV-based methods, it is well known that this constraint destroys sharp intensity transitions in the image, and then it cannot perform well for blur identification. Thus, it is to be expected that further improvement can be achieved by incorporating the piecewise smoothness of the image, through models which allow for an accurate edge location.
In a paper by You and Kaveh (1996) the piecewise smoothness of the image is coupled to the piecewise smoothness of the blur and incorporated in space-adaptive regularization, performed by means of alternating image estimation and blur identification. Experimental results are shown for uniform blurs only. In this case, a smoothness constraint for the blur coefficients is naturally well suited. However, we have experimentally observed that, because of the generally very small size of the blur masks, the use of a smoothness constraint of comparable weight with respect to the data consistency term tends to force the solution toward uniform blurs. MRF image models which account for intensity edges have been used along with the previously used EM algorithm (Zhang, 1993). In this approach, the observations are viewed as incomplete data, and the coupled image field as unobservable variables, to be estimated along with the degradation parameters by means of maximization of the marginal distribution of the data conditioned on the parameters themselves. The computational problems related to the estimation of expectations have been dealt with by using the mean field approximation. As an alternative to EM and still using MRF image models with explicit lines, an iterative alternating MAP-ML procedure can be adopted (Tonazzini, 2001; Tonazzini and Bedini, 2000). More precisely, given an initial set of parameters, the MAP reconstruction of the image is performed, and the estimate is interpreted as the original image and then used to compute, by means of MAP or ML estimation, a new set of blur parameters.
C. Unsupervised Blind Image Restoration

In the previous approaches, the need for an image estimate with reliable intensity edges requires a fine-tuning of the MRF model hyperparameters (also called Gibbs parameters). These hyperparameters express the confidence we have in the corresponding constraints. As the most typical example, the so-called regularization or smoothness parameter defines the extent of smoothness in the desired image and is related to the image scale. Whereas the functional form of the Gibbs prior is usually known, these hyperparameters must be estimated. In the case in which one or more clean samples of the MRF are available, satisfactory results were found by using ML estimators (Gidas, 1993; Winkler, 1995; Younes, 1988). Unfortunately, in practical restoration tasks, samples of the MRF model are usually unavailable, and one has to rely solely on the data, which are the only available observations, although incomplete and unclean, of the underlying, unobservable MRF. Thus, a more realistic and challenging approach to restoration consists of considering both the image field and all the parameters (degradation parameters plus model hyperparameters) as variables of the problem, to be jointly estimated from
the data. The Bayesian approaches (EM or MAP-ML) devised previously for blind restoration alone can theoretically be extended to this case straightforwardly. Unfortunately, although appealing, this procedure has an intractable computational complexity even for small-size problems. Indeed, the hyperparameter estimation step requires summing over all possible configurations of the MRF, for computing either the partition function or related expectations. Thus, some amount of approximation is unavoidable. For instance, in Younes (1989) the hyperparameters are estimated by means of ML through a stochastic gradient algorithm in which the expectations are substituted by the values computed in the last configurations of single-sweep-long Markov chains. Other popular approximations are the maximum pseudolikelihood (MPL) approximation, the coding method, and the mean field approximation (Besag, 1986; Chalmond, 1989; Lakshmanan and Derin, 1989; Saquib et al., 1998; Zhang, 1992). Nevertheless, when explicit lines are introduced into the MRF model, the pseudolikelihood and the mean field distribution are not good approximations of the original distribution, in that intensity and line elements are highly correlated. In addition, since restoration requires considering a multilevel or even a continuous-valued intensity field, their computation is very complex. Another approach is to view the model parameters not only as related to the desired image per se, but also to be determined in relationship to the particular problem to be solved. This is especially true for some of the parameters. For instance, the regularization parameter also has to be tuned according to the amount of noise which affects the data: the higher the noise, the stronger the smoothness to be selected. Another parameter, namely the threshold for the creation of a line, represents the intensity gradient above which a true edge is likely to be present in the true image. Thus, in principle, it should be lower than the lowest jump in the image and higher than the highest noise peak. Unfortunately, these two conditions are unlikely to be simultaneously verified in real images. One way to solve this dilemma is to perform an adaptive reconstruction in which the process is started with a high threshold in order to smooth off the noise and then the threshold is gradually reduced to obtain a coarse-to-fine recovery of even the finest true edges. This adaptive variation of the model parameters can be performed heuristically, but many attempts would probably have to be made before the correct variation schedule could be found. Thus, some automatic rule should be devised. GNC-like algorithms seem to be suitable for performing this task, especially with reference to the threshold parameters. GNC was first derived to minimize the energy function F(f) which arises when the explicit, binary line process is eliminated from the weak membrane energy (Blake and Zisserman, 1987). The basic idea is to construct a family of approximations, dependent on a parameter p, p ∈ [0, 1], such that F^(0) = F and F^(1) is convex. Starting
from p = 1, a gradient descent algorithm is subsequently applied to the various approximations, for a prescribed decreasing sequence of values of p. Since the first derivation of GNC, various GNC-like algorithms have been proposed to manage the nonconvex minimization of different edge-preserving stabilizers (Bedini, Gerace et al., 1994a, 1994b, 1995; Bilbro et al., 1992; Li, 1998). It can be observed that parameter p, while approximating the original stabilizer, usually relaxes the threshold. Nevertheless, in most proposed stabilizers, the extent of this threshold variability is very limited. Moreover, the stabilizers suitable for GNC-like algorithms do not allow for a straightforward incorporation of constraints into the edge geometry, while the adaptive tracking of the edges could very much benefit from the incorporation of at least a line connection constraint, in such a way as to favor the recovery of well-behaved edges against sparse edges due to noise.
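A compact sketch of the GNC idea on a one-dimensional signal follows (our own illustrative construction: the family φ_p below interpolates between a convex quadratic at p = 1 and a saturating stabilizer at p = 0, and is not the exact Blake and Zisserman approximation):

```python
import numpy as np

def gnc_minimize(g, theta=0.5, lam=1.0, p_schedule=(1.0, 0.5, 0.25, 0.1, 0.0),
                 iters=200, step=0.1):
    """Minimize ||f - g||^2 + lam * sum phi_p(f[i+1] - f[i]) by gradient
    descent, tracking the solution as p decreases from 1 (convex) to 0."""
    f = g.copy()
    for p in p_schedule:
        for _ in range(iters):
            t = np.diff(f)
            # phi_p(t) = theta^2 t^2 / (theta^2 + (1 - p) t^2): p = 1 gives
            # the convex quadratic t^2, p = 0 a saturating stabilizer whose
            # effective threshold tightens as p decreases
            dphi = 2 * t * theta**4 / (theta**2 + (1 - p) * t**2) ** 2
            grad = 2 * (f - g)
            grad[:-1] -= lam * dphi
            grad[1:] += lam * dphi
            f -= step * grad
    return f

rng = np.random.default_rng(1)
g = np.concatenate([np.zeros(50), np.ones(50)]) + 0.2 * rng.standard_normal(100)
f = gnc_minimize(g)
print("recovered jump across the discontinuity: %.2f" % (f[55] - f[45]))
```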
D. Overview
This article describes our recent experiences and progress toward an efficient solution of the highly ill-posed and computationally demanding problem of blind and unsupervised visual reconstruction. Our case study is image restoration (i.e., deblurring and denoising). The methodology employed makes reference to edge-preserving regularization. This is formulated both in a fully Bayesian framework using an MRF image model with explicit and possibly geometrically constrained line processes, and in a deterministic framework in which the line process is addressed implicitly by using a particular MRF model which allows for self-interactions of the line and an adaptive variation of the model parameters. These MRF models have proven to be efficient in modeling the local regularity properties of most real scenes, as well as the local regularity of object boundaries and intensity discontinuities. In both cases, our approach to this problem attempts to effectively exploit the correlation between intensities and lines and is based on the assumption that the line process alone, when correctly recovered and located, can retain a good deal of information about both the hyperparameters that best model the whole image and the degradation features. We show that these approaches offer a way to improve both the quality of the reconstructed image and the estimates of the degradation and model parameters, and that they significantly reduce the computational burden of the estimation processes. In this article, we first describe a fully Bayesian approach, which allows the problem to be formulated as the joint maximization of a distribution of the image field, the data, and the degradation and model parameters. This very complex joint maximization is initially broken down into a sequence of MAP
and/or ML estimations, to be performed alternately and iteratively, with an initial significant reduction in the complexity and computational load (Sections II and III). Some feasible approximations are then applied to further improve the computational performance of the method (Sections IV through VI). In Section VII a procedure for its practical implementation is proposed, and some hints are given for a possible parallel architecture based on neural networks such as the Hopfield network and the Boltzmann machine. We then describe a discontinuity-adaptive smoothing method, which is essentially based on a GNC-like algorithm applied to a particular MRF model in which the constrained discontinuities are treated implicitly. This is shown to allow for the automatic tuning of the model threshold, for a coarse-to-fine detection of the edges during the image reconstruction process (Section VIII). In Sections IX through XI, the performance of these methods is evaluated and discussed through numerical simulations taken from our previous work. In particular, we examine separately the blind restoration subcase (Section IX), the unsupervised MRF-based restoration subcase (Section X), and, finally, the fully data-driven restoration case (Section XI).
II. FULLY BAYESIAN APPROACH TO UNSUPERVISED BLIND RESTORATION

In this section we establish the mathematical foundations of a general Bayesian estimation method which allows for the joint estimation of the image and all the free parameters of the problem. The free parameters are related both to the degradation process and the noise (degradation parameters) and to the MRF model parameters (Gibbs parameters or hyperparameters). To develop this theory, we adopt edge-preserving, coupled MRF models to describe a priori information about the local behavior of both the intensity field f and the unobservable discontinuity field l (line elements), which are treated as explicit variables of the problem. As already highlighted, when compared with alternative models in which the discontinuities are implicitly addressed through functions of the intensity gradient, this approach enables the introduction of useful available information about the geometry of the image edges. In addition, as we will show, this approach can simplify the computational complexity of the blind, unsupervised estimation problem. A coupled MRF is characterized by the following Gibbs distribution:

$$P(f, l) = \frac{1}{Z(w)} \exp[-U(f, l)] \tag{1a}$$

$$U(f, l) = w \cdot V(f, l) = \sum_{r} w_r V_r(f, l) \tag{1b}$$
where Z(w) is the normalizing constant, or partition function, and U(f, l), the prior energy, is in the form of a scalar product between the hyperparameter vector w and the vector of the clique potential functions V_r(f, l). In practice, for an isotropic and homogeneous MRF each V_r is the sum of the local potentials over the set of homogeneous cliques and w_r is a parameter expressing the relative weight of these potentials. Depending on the particular problem, the neighborhood system and the potential functions should be chosen to capture the desired statistical relationship among the elements in the field. For instance, a typical model for piecewise smooth images must express the smoothness constraint on the pixel intensities, the conditions under which this constraint can be broken to create a discontinuity, and the self-interactions between line elements, so as to describe the admitted or forbidden local configurations of the discontinuities. Assuming a linear data formation model, and Gaussian uncorrelated noise with zero mean and variance σ², the likelihood function of the data given the intensity field is

$$P(g \mid f) = (2\pi\sigma^2)^{-(n \times m)/2} \exp\left[-\frac{\|g - Hf\|^2}{2\sigma^2}\right] \tag{2}$$
where n × m is the number of pixels, f is the vector of the lexicographically ordered notation (Andrews and Hunt, 1977) for f, g is the vector of the data, and H is the matrix associated with the linear operator. Thus P(g|f) contains, as degradation parameters, the elements of matrix H and the variance of the noise. When the degradation operator is a blur, H is a block Toeplitz matrix whose elements derive, according to a known rule, from a usually square mask d of small and odd size 2D + 1. Let us rewrite H as H(d) and call q the set of the degradation parameters d and σ². If we explicitly introduce the dependence of all the previously defined functions on their own parameters (e.g., U(f, l) = U(f, l|w), P(g|f) = P(g|f, q)), a fully Bayesian approach to reconstruct the image, estimate the "best" Gibbs parameters, and accomplish the degradation process identification for a given problem is to assume a prior distribution for w and q and then simultaneously compute (f, l), w, and q by maximizing some joint distribution of them all. Assuming at the moment a uniform prior for w and q, and considering the distribution P(f, l, g|w, q), the problem can be formulated as

$$\max_{f, l, w, q} P(f, l, g \mid w, q) \tag{3}$$
The joint maximization (3) is a very difficult task which necessarily requires some form of reduction. In the following two subsections, we review the two
basic approaches adopted for this purpose: the alternating maximization approach and the EM approach.
A. MAP-ML, or Alternating Maximization

Given the separability of the degradation parameters and the model parameters, the distribution of Eq. (3) can be rewritten as

$$P(f, l, g \mid w, q) = P(f, l \mid g, w, q)\, P(g \mid w, q) = P(g \mid f, q)\, P(f, l \mid w) \tag{4}$$

This allows for the adoption of the following suboptimal iterative procedure:

$$(f^{(k)}, l^{(k)}) = \arg\max_{f, l} P(f, l \mid g, w^{(k)}, q^{(k)}) \tag{5a}$$

$$q^{(k+1)} = \arg\max_{q} P(g \mid f^{(k)}, q) \tag{5b}$$

$$w^{(k+1)} = \arg\max_{w} P(f^{(k)}, l^{(k)} \mid w) \tag{5c}$$
Starting from an initial guess w^(0), q^(0), and iterating and alternating steps (5a), (5b), and (5c), we can obtain a sequence (f^(k), l^(k), w^(k), q^(k)) which converges to a local maximum of P(f, l, g|w, q). In this sense, problem (5) is weaker than problem (3). Nevertheless, if (f*, l*, w*, q*) is the solution of Eq. (5), then (f*, l*) is the MAP estimate of (f, l) based on g, w*, and q*, while q* is the ML estimate of q based on the likelihood function computed in f*, and w* is the ML estimate of w based on the prior computed in (f*, l*) (Lakshmanan and Derin, 1989). The same considerations hold at each stage of the iterative procedure (5). Thus, the final solution (f*, l*, w*, q*) is adaptively obtained by the iterative execution of a MAP estimation for (f, l) and ML estimations for w and q in turn.
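The alternating procedure (5a)-(5c) can be sketched in a strongly simplified form. In the Python sketch below (our own illustration), a quadratic Tikhonov prior stands in for the coupled MRF model, the noise variance and the hyperparameters are held fixed so that only steps (5a) and (5b) alternate, and circular boundary conditions are assumed; all function names are ours. Step (5b) is a small least squares fit, since the data are linear in the mask coefficients.

```python
import numpy as np

def shifts(D):
    return [(i, j) for i in range(-D, D + 1) for j in range(-D, D + 1)]

def conv(x, d):
    """Circular convolution of image x with a (2D+1) x (2D+1) mask d,
    i.e., the action of H(d) (periodic boundaries for simplicity)."""
    D = d.shape[0] // 2
    return sum(d[i + D, j + D] * np.roll(x, (i, j), axis=(0, 1))
               for i, j in shifts(D))

def conv_T(x, d):
    """Adjoint (transpose) of conv."""
    D = d.shape[0] // 2
    return sum(d[i + D, j + D] * np.roll(x, (-i, -j), axis=(0, 1))
               for i, j in shifts(D))

def map_image(g, d, lam=0.05, iters=200, step=0.2):
    """Step (5a) with a quadratic prior replacing the coupled MRF model (no
    line process): gradient descent on ||g - H(d)f||^2 + lam ||grad f||^2."""
    f = g.copy()
    for _ in range(iters):
        r = conv(f, d) - g
        lap = (np.roll(f, 1, 0) + np.roll(f, -1, 0)
               + np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f)
        f -= step * (2 * conv_T(r, d) - 2 * lam * lap)
    return f

def ml_blur(g, f, D):
    """Step (5b): the mask enters linearly, so the ML estimate is a small
    least squares fit, followed by crude positivity and normalization."""
    A = np.stack([np.roll(f, s, axis=(0, 1)).ravel() for s in shifts(D)], axis=1)
    d, *_ = np.linalg.lstsq(A, g.ravel(), rcond=None)
    d = np.clip(d.reshape(2 * D + 1, 2 * D + 1), 0, None)
    return d / d.sum()

rng = np.random.default_rng(0)
true = np.kron(rng.integers(0, 2, (16, 16)).astype(float), np.ones((8, 8)))
true_d = np.array([[0.0, 0.1, 0.0], [0.1, 0.6, 0.1], [0.0, 0.1, 0.0]])
g = conv(true, true_d) + 0.01 * rng.standard_normal(true.shape)

d = np.zeros((3, 3)); d[1, 1] = 1.0          # initial guess: no blur
for k in range(5):                           # alternate (5a) and (5b)
    f = map_image(g, d)
    d = ml_blur(g, f, 1)
print(np.round(d, 2))
```

With such a weak image model the recovered mask is only indicative; as argued in Section I.B, reliable blur estimates require an image model that preserves the discontinuities.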
B. Expectation Maximization

According to the EM formalism, unsupervised blind image restoration can be described as an incomplete data problem in which part of the data (the image field) is not observable, or is hidden, and must be estimated along with the degradation and the model parameters. In this respect, the true problem to be solved, in the incomplete data set of the observations, becomes

$$\max_{w, q} P(g \mid w, q) \tag{6}$$
whereas the original problem of Eq. (3) is seen as the transposition of problem (6) in the complete data set. The many-to-one relationship among unobservable and observable data in this case is (f, l, g) → g, and the two probability distributions are related as follows:

$$P(g \mid w, q) = \int \sum_{l} P(f, l, g \mid w, q)\, df \tag{7}$$
The EM algorithm is an iterative procedure for solving Eq. (6), making use of the associated distribution P(f, l, g|w, q). Each iteration consists of the following two steps:

E step: Compute the conditional expectation of log P(f, l, g|w, q), with respect to (f, l) and conditioned on the observed data and the current estimate (w^(k), q^(k)) of the parameters:

$$Q(w, q \mid w^{(k)}, q^{(k)}) = E[\log P(g \mid f, l, w, q) + \log P(f, l \mid w, q) \mid g, w^{(k)}, q^{(k)}]$$

M step: Find (w^(k+1), q^(k+1)) that maximizes Q(w, q | w^(k), q^(k)).
This algorithm has been proved to converge to the ML estimate (Dempster et al., 1977). Although apparently very different, the EM approach and the alternating MAP-ML approach are intimately related. In fact, in the alternating maximization approach, the estimation of the parameters is conditioned on the current MAP estimate (f^(k), l^(k)) of the image field, given the data and the previous estimate of the parameters themselves, whereas in the EM approach, the estimation of the parameters is conditioned on the average of all configurations of the image field which are consistent with the data g and the previous estimate of the parameters. To see this in formulas, let us consider the kth iteration of the EM algorithm and consider performing the E step and the M step together. The problem to be solved at the kth iteration is then

$$(w^{(k+1)}, q^{(k+1)}) = \arg\max_{w, q} E[\log P(g \mid f, l, w, q) + \log P(f, l \mid w, q) \mid g, w^{(k)}, q^{(k)}]$$
$$= \arg\max_{w, q} \int \sum_{l} \{\log P(g \mid f, l, w, q) + \log P(f, l \mid w, q)\}\, P(f, l \mid g, w^{(k)}, q^{(k)})\, df \tag{8}$$
If we concentrate the posterior distribution P(f, l|g, w^(k), q^(k)) around its maximizer (f^(k), l^(k)), that is, the MAP estimate of the image field given the data and the current estimate (w^(k), q^(k)) of the parameters, problem (8) turns out to be

$$\max_{w, q}\ \log P(g \mid f^{(k)}, l^{(k)}, w, q) + \log P(f^{(k)}, l^{(k)} \mid w, q) \tag{9}$$

and moreover, by taking into account the separability of the degradation and model parameters, and the independence of the likelihood function from w and l,

$$\max_{w, q}\ \log P(g \mid f^{(k)}, q) + \log P(f^{(k)}, l^{(k)} \mid w) \tag{10}$$

which is equivalent to the following two separated problems:

$$q^{(k+1)} = \arg\max_{q}\ \log P(g \mid f^{(k)}, q) \tag{11a}$$

$$w^{(k+1)} = \arg\max_{w}\ \log P(f^{(k)}, l^{(k)} \mid w) \tag{11b}$$

It is easy to recognize in steps (11a) and (11b) the two parameter estimation steps (5b) and (5c) of the alternating maximization approach described in Section II.A, apart from the unaffecting logarithm transform.
III. THE MAP-ML METHOD

Both of the prior formulations greatly decrease the complexity of the original blind unsupervised restoration problem. Nevertheless, they still have a very high computational cost, so further simplifications must be adopted. The main difficulty in using EM procedures when the hidden variables are MRFs is that the expectation of the E step is difficult to calculate because of the interactions among the hidden variables at different sites. To overcome this difficulty, Zhang (1993) proposed using a mean field distribution of (f, l) conditioned on g, to be used to compute the expectation in the E step. The mean field values give the estimates of the image field, while the estimate of the parameters is obtained by means of a conjugate gradient applied to the expectation, according to the M step. In the alternating maximization method, the image estimation step (5a) amounts to an ordinary supervised restoration process, to be handled by standard descent algorithms and stochastic algorithms such as the Gibbs sampler (S. Geman and Geman, 1984). The blur estimation step (5b) consists of a small-size least squares problem, possibly augmented by projection onto
constraint sets to enforce positivity, size, and normalization constraints, or by standard regularization when smoothness constraints on the blur coefficients are enforced as well. These algorithms, for their local and distributed nature, are relatively cheap. In addition, they are inherently parallel and suitable for implementation either on general-purpose parallel machines or on dedicated neural architectures. Nevertheless, step (5c), as it stands, remains very heavy, and its efficient solution is still an open problem. Pseudolikelihood- and mean field-based approaches consist of factorizing the original distribution, assuming the independence of the elements of the image field, so as to simplify the computation of the partition function or related expectations that, as already mentioned, depend nonlinearly on w. Our approach to this problem attempts to positively exploit the correlation between intensities and lines and is based on the assumption that the line process alone can retain a good deal of information about the hyperparameters that best model the whole image. This assumption allows for the adoption of approximations which significantly decrease the computational cost of the hyperparameter estimation step and thus makes feasible the blind, unsupervised image restoration process with coupled MRF models. Our method is based on an SA scheme for the joint maximization of the distribution of the image and the data, given all the parameters. This scheme is practically implemented through a mixed-annealing algorithm for the estimation of the intensity and the line fields, periodically interrupted for parameter updating. The degradation parameters are updated through a small-size least squares technique. The hyperparameters are updated by means of a gradient descent technique which requires the computation of expectations. If the saddle point approximation (Chandler, 1987) or the Markov chain Monte Carlo (MCMC) methods (Geyer and Thompson, 1992) are exploited, the computation of these expectations is reduced to low-cost time averages over binary variables only. In previous papers (Bedini and Tonazzini, submitted; Bedini, Tonazzini et al., 1999; Tonazzini et al., 1997), we highlighted the satisfactory performance of this approach for unsupervised noise removal with coupled MRF models, when images are unblurred, and proposed a possible neural architecture for an efficient implementation of the procedure (Bedini, Tonazzini, and Minutoli, 2000). In other papers (Bedini and Tonazzini, 2001; Tonazzini and Bedini, 1998), we presented the results of blind unsupervised deconvolution in the presence of noise. In the following sections, we provide details and derive the formulas and the algorithms that we adopt to solve the three separated problems of Eqs. (5). In particular, in Section VI we review an approach, based on the saddle point approximation or on MCMC techniques, which exploits the presence of the line process to drastically decrease the computational cost of step (5c).
IV. MAP ESTIMATION OF THE IMAGE FIELD
A. The Mixed-Annealing Algorithm

With respect to the MAP estimation of the image field for fixed values of the parameters, we can observe that by taking the negative logarithm of the posterior distribution and dropping the constant terms, it is reduced to

$$(f^{(k)}, l^{(k)}) = \arg\min_{f, l}\ \frac{\|g - H(d^{(k)})f\|^2}{2\sigma_{(k)}^2} + U(f, l \mid w^{(k)}) \tag{12}$$
The function to be minimized in Eq. (12) is called the posterior energy function E(f, l|g, w^(k), q^(k)). To solve Eq. (12), we can adopt any of several existing algorithms (for example, SA, ICM, GNC, or GEM; Bedini, Gerace et al., 1996), depending on the form of the posterior distribution. In this case, owing to the presence of an explicit binary line process, we propose to use the mixed-annealing algorithm (Marroquin, 1985) in which the continuous variables are updated in a deterministic way, while the binary variables are updated in a stochastic way. This strategy is based on the fact that when suitable forms for the prior energy are adopted, the posterior energy E(f, l|g, w^(k), q^(k)) is a convex function with respect to f, so that its minimum f*(l), for any fixed configuration of l, can be computed by using deterministic schemes, such as gradient descent techniques. The problem is thus reformulated as the minimization of E(f*(l), l|g, w^(k), q^(k)) with respect to l. This minimization entails using SA techniques, but only with respect to the line variables, and hence with a greatly reduced complexity. Indeed, because the neighborhood size is generally modest and the line process is binary, the computational cost of drawing a sample of the line process is low. The scheme of the annealing algorithm for the binary line variables consists of a rule for decreasing a temperature parameter T, of the length of the Markov chains at each temperature (i.e., the number of element visitations, provided that each element is visited at least once), and of an algorithm for updating the elements themselves. When this scheme is implemented in parallel, asymptotic convergence is ensured for asynchronous updates (i.e., independent but not simultaneous updates) (Aarts and Korst, 1989; S. Geman and Geman, 1984). The line element l_{i,j} at site (i, j), horizontal or vertical, is updated by using the Gibbs sampler, according to the local conditional posterior
\Pr(l_{i,j} = 1 \mid l_{m,n} \text{ s.t. } (m, n) \neq (i, j), g, w^{(k)}, q^{(k)}) = \frac{1}{1 + \exp[\Delta E / T]}   (13)
where
\Delta E = E(f^*(l^1), l^1 \mid g, w^{(k)}, q^{(k)}) - E(f^*(l^0), l^0 \mid g, w^{(k)}, q^{(k)})   (14)
and l^1 and l^0 are line configurations which differ only in the element considered, which is "1" in the first and "0" in the second. The practical implementation of the overall mixed-annealing scheme is still very expensive because f*(l) must be computed whenever an update for a line element is proposed. In a previous paper (Bedini and Tonazzini, 1992), we showed that the availability of an analogue Hopfield-type neural network would permit the computation of f*(l) in almost real time, as the stable state of an electrical circuit (Hopfield, 1984), which would thus make the mixed-annealing algorithm very efficient. Nevertheless, even in the absence of such an architecture, we found that, from an experimental point of view, good results can also be obtained by using an approximation of the theoretical scheme. This approximation is based on the local nature of the intensity process and on its usual short-range interactions with the line process: under the reasonable hypothesis that f*(l) undergoes localized, small variations for mild perturbations in l, f*(l) can be computed, for instance, after only one or more updates of the whole line process, or even once per Markov chain. Since, as already highlighted, the posterior energy is convex in f and the intensity field exhibits local, short-range interactions, the computation of f*(l) can be performed deterministically, through parallel algorithms such as successive overrelaxation (SOR) (Blake and Zisserman, 1987), or through a conjugate gradient. If the value of f*(l) is kept fixed along the updating of the line process at each given temperature, Eq. (14) reduces to
\Delta E = U(f^*, l^1 \mid w^{(k)}) - U(f^*, l^0 \mid w^{(k)})   (15)
which means that the Gibbs sampler draws samples according to the prior probability, P(l \mid f^*, w^{(k)}), of the lines conditioned on the intensity process, rather than according to the conditional posterior. Although the convergence of the algorithm approximated in this way can no longer be guaranteed, the computational efficiency of the algorithm is greatly improved, and its satisfactory performance in restoring piecewise smooth images has been shown experimentally (Bedini and Tonazzini, 1992). It is well known that SA with the Gibbs sampler or the Metropolis algorithm represents the running modality of the Boltzmann machine, which is a stochastic neural network whose units evolve until they stabilize at the minimizer of an internal energy (Aarts and Korst, 1989; Hinton et al., 1984). This internal energy can be made to coincide with the conditional prior energy of the lines
given the intensity, so it can be used, at fixed temperatures, to produce samples of the line process (Bedini, Tonazzini, and Minutoli, 2000). In particular, we refer to a generalized Boltzmann machine (GBM) with mixed continuous and binary units interconnected according to a system of cliques and distributed over an input layer and an output layer (Azencott, 1990, 1992). The input units are associated with the pixels f and the output units with the line elements l. The weights of the GBM are the hyperparameters of the conditional prior, which coincides with the internal distribution according to which the units evolve. This Boltzmann machine can be realized by means of a grid of processors working in parallel.
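As an illustration of the update rule of Eqs. (13)-(15), the following Python/NumPy sketch (our own illustration, not the authors' code) performs one sweep of the binary Gibbs sampler over a line field at temperature T; the helper delta_energy is a hypothetical callable that must return the energy difference between the configurations with the considered element set to 1 and to 0.

    import numpy as np

    def gibbs_sweep_lines(lines, delta_energy, T, rng=None):
        """One asynchronous sweep of the binary Gibbs sampler over a line field.

        lines        : 2D array of 0/1 line elements (horizontal or vertical)
        delta_energy : callable (lines, i, j) -> E(l_ij = 1) - E(l_ij = 0),
                       with all other elements held fixed (the posterior
                       difference of Eq. (14), or the prior-only difference
                       of Eq. (15) when f* is kept fixed)
        T            : current temperature of the annealing schedule
        """
        rng = rng or np.random.default_rng()
        n, m = lines.shape
        for i in range(n):
            for j in range(m):
                dE = delta_energy(lines, i, j)
                # Local conditional posterior of Eq. (13): Pr(l_ij = 1 | rest)
                p_one = 1.0 / (1.0 + np.exp(dE / T))
                lines[i, j] = 1 if rng.random() < p_one else 0
        return lines

Repeating such sweeps at each temperature of a decreasing schedule, with f*(l) recomputed only occasionally, reproduces the approximated mixed-annealing scheme described above.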
B. Use of a Preconditioned Conjugate Gradient
Preconditioned conjugate gradient algorithms (Axelsson and Lindskog, 1986; Bjorck, 1989) have been successfully used to significantly reduce the number of iterations in Tikhonov or standard regularization techniques for image restoration (R. H. Chan et al., 1993; Hanke et al., 1993). In the mixed annealing described previously, the deterministic computation of f*(l) represents the most expensive part of the whole algorithm. When the prior energy used to enforce local smoothness is related to first-order derivatives of the intensity, the posterior energy for fixed lines is convex and quadratic. Thus, a large-size conjugate gradient can be applied for computing f*(l). In these conditions, the use of preconditioners can greatly improve the computational performance by reducing the number of iterations required. Nevertheless, the presence of the line process in the energy function, although kept fixed, requires a specific preconditioning strategy to manage the particular structure of the matrix of the equivalent least squares problem. Indeed, as a substantial difference with respect to the case of Tikhonov regularization, this matrix is composed of three parts: the first is the imaging system matrix H, and the remaining two are (0-1) matrices associated with the local smoothness term and depending on the current configuration of the line process. Because of the adaptive nature of the restoration process through mixed annealing, these last two matrices change every time the line process is updated. Nevertheless, these changes do not involve the structure of the matrices, and consist only of flipping some elements from 0 to 1. Thus, on the basis of an extension of Chan's idea (R. Chan and Ng, 1996; R. H. Chan et al., 1993), in Bedini, Del Corso et al. (2001) we proposed computing a single Chan-type preconditioner for the case in which all the line elements are set to zero and using it for every configuration of the line process. This
approximation is particularly justified when the restoration process approaches convergence. Indeed, near convergence, the restored image is close to the undegraded image which, as is usual in natural images, contains a small number of edges. We demonstrated the effectiveness of this computational strategy when it was applied to the restoration of both synthetic and real images. In particular, we showed that the conjugate gradient used without a preconditioner requires up to 45% more iterations. In the following discussion, we briefly review the strategy adopted to build the aforementioned Chan-type preconditioner for our specific problem. To develop this argument, we first provide details of the associated matrix form of the typical posterior energy function
E(f, l \mid g, w, q) = \frac{\|g - Hf\|^2}{2\sigma^2} + \lambda^2 \sum_{i=1}^{n} \sum_{j=1}^{m-1} (f_{i,j} - f_{i,j+1})^2 (1 - v_{i,j}) + \lambda^2 \sum_{i=1}^{n-1} \sum_{j=1}^{m} (f_{i,j} - f_{i+1,j})^2 (1 - h_{i,j}) + Q(l)   (16)
where for simplicity's sake we dropped the dependence of the variables on the order of the iteration and the dependence of H on d. In energy (16), n x m is the dimension of the image; h and v are the horizontal and vertical line processes, respectively; \lambda^2 is the regularization or smoothness hyperparameter; and Q(l) is a term accounting for constraints on the lines. Energy (16) is to be considered in this case as a function of the intensity process alone (i.e., the configuration l = (h, v) of the line process is arbitrary and fixed). When we drop the terms that do not depend on f, and in view of the binary nature of the line process, Eq. (16) can be equivalently rewritten as

E(f \mid l, g, w, q) = \|g - Hf\|^2 + \lambda^2 \sum_{i=1}^{n} \sum_{j=1}^{m-1} (f_{i,j} - f_{i,j+1})^2 (1 - v_{i,j})^2 + \lambda^2 \sum_{i=1}^{n-1} \sum_{j=1}^{m} (f_{i,j} - f_{i+1,j})^2 (1 - h_{i,j})^2   (17)
where the regularization parameter \lambda^2 is now intended to incorporate the constant term 2\sigma^2. This formulation can be rewritten in the following fully matrix form:

E(f \mid l, g, w, q) = f^T (H^T H + \lambda^2 A^T A + \lambda^2 B^T B) f - 2 g^T H f + g^T g   (18)
In Eq. (18), H is an n x n block matrix in which each block H_i is m x m and is related to the i-th row of the kernel matrix d, where d is a square matrix of small and odd size 2D + 1. In particular, the structure of H is as follows:

H = \begin{pmatrix}
H_0 & H_{-1} & \cdots & H_{-D} & O_m & \cdots & O_m \\
H_1 & H_0 & H_{-1} & \cdots & H_{-D} & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & & \ddots & O_m \\
H_D & \cdots & H_1 & H_0 & H_{-1} & \cdots & H_{-D} \\
O_m & \ddots & & \ddots & \ddots & \ddots & \vdots \\
\vdots & \ddots & & & H_1 & H_0 & H_{-1} \\
O_m & \cdots & O_m & H_D & \cdots & H_1 & H_0
\end{pmatrix}

with

H_i = \begin{pmatrix}
d_{i,0} & d_{i,-1} & \cdots & d_{i,-D} & 0 & \cdots & 0 \\
d_{i,1} & d_{i,0} & d_{i,-1} & \cdots & d_{i,-D} & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & & \ddots & 0 \\
d_{i,D} & \cdots & d_{i,1} & d_{i,0} & d_{i,-1} & \cdots & d_{i,-D} \\
0 & \ddots & & \ddots & \ddots & \ddots & \vdots \\
\vdots & \ddots & & & d_{i,1} & d_{i,0} & d_{i,-1} \\
0 & \cdots & 0 & d_{i,D} & \cdots & d_{i,1} & d_{i,0}
\end{pmatrix}
Matrix A is (n - 1)m x nm, has a block bidiagonal structure, and depends on the configuration of the horizontal line process, while B is an n(m - 1) x nm block diagonal matrix and depends on the configuration of the vertical line process. In particular,

A = \begin{pmatrix}
A^{(1)} & -A^{(1)} & O_m & \cdots & O_m \\
O_m & A^{(2)} & -A^{(2)} & \cdots & O_m \\
\vdots & & \ddots & \ddots & \vdots \\
O_m & \cdots & O_m & A^{(n-1)} & -A^{(n-1)}
\end{pmatrix}
and

B = \begin{pmatrix}
B^{(1)} & & & \\
& B^{(2)} & & \\
& & \ddots & \\
& & & B^{(n)}
\end{pmatrix}
The generic block A^{(k)} is an m x m diagonal matrix, and the block B^{(k)} is (m - 1) x m bidiagonal. In particular, we have

(A^{(k)})_{i,j} = \begin{cases} 1 - h_{k,j} & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \quad \text{for } i, j = 1, 2, \ldots, m

and

(B^{(k)})_{i,j} = \begin{cases} 1 - v_{k,j} & \text{if } i = j \\ -(1 - v_{k,j-1}) & \text{if } i = j - 1 \\ 0 & \text{otherwise} \end{cases} \quad \text{for } i = 1, 2, \ldots, m - 1, \; j = 1, 2, \ldots, m
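As a concrete illustration of these definitions, the following Python/SciPy sketch (our own, not the authors' implementation; the indexing conventions are the ones just stated) assembles A and B as sparse matrices directly from the 0-1 line fields.

    import numpy as np
    import scipy.sparse as sp

    def build_AB(h, v, n, m):
        """Assemble the smoothness matrices of Eq. (18) from the line fields.

        h : (n-1, m) horizontal line process -> A is (n-1)m x nm, block bidiagonal
        v : (n, m-1) vertical line process   -> B is n(m-1) x nm, block diagonal
        """
        blocks_A = []
        for k in range(n - 1):
            Ak = sp.diags(1.0 - h[k, :])          # m x m diagonal block A^(k)
            row = [None] * n
            row[k], row[k + 1] = Ak, -Ak          # [.. A^(k)  -A^(k) ..]
            blocks_A.append(row)
        A = sp.bmat(blocks_A, format='csr')

        blocks_B = []
        for k in range(n):
            w = 1.0 - v[k, :]                     # weights 1 - v_{k,j}
            Bk = sp.diags([w, -w], offsets=[0, 1], shape=(m - 1, m))
            blocks_B.append(Bk)
        B = sp.block_diag(blocks_B, format='csr')
        return A, B

Turning a line element "on" simply zeros the corresponding row weights, which is why the structure of A and B never changes during mixed annealing.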
The minimization of the posterior energy (18) is then equivalent to the solution of the following least squares problem:

\min_f \| \tilde{g} - \tilde{H} f \|   (19)
where \tilde{g} is the vector (g, 0_{nm}, 0_{nm})^T and \tilde{H} is the rectangular matrix given by

\tilde{H} = \begin{pmatrix} H \\ \lambda A^+ \\ \lambda B^+ \end{pmatrix}   (20)
where A^+ is obtained by padding A with m rows of zeros at the bottom, and B^+ is obtained from B by padding a row of zeros at the bottom of each B^{(i)}, for i = 1, 2, \ldots, n. As a way to solve Eq. (19), the conjugate gradient (CG) method can be applied to the equivalent linear system:

\tilde{H}^T (\tilde{g} - \tilde{H} f) = 0   (21)
where 0 represents the vector with all zero entries. The CG method converges quickly for data matrices that either are well conditioned or have just a few distinct singular values. Thus, as a way to improve convergence, a "preconditioned" conjugate gradient (PCG) can be defined by transforming the original problem with a preconditioner C and applying the conjugate gradient
method to the problem:

\min_y \| \tilde{g} - \tilde{H} C^{-1} y \|   (22)
The solution is then obtained by solving the system Cf = y. It is clear that the preconditioner C should be very simple to invert, while the singular values of the matrix \tilde{H} C^{-1} should be well clustered. Of particular interest for our purposes are the preconditioners for block Toeplitz matrices with Toeplitz blocks. Indeed, as we already observed, the matrix associated with a typical degradation operator in imaging problems, so far called H, has a block Toeplitz structure. This structure is maintained by the matrix of the least squares problem associated with the energy function when a Tikhonov regularization is adopted, and the Toeplitz structure can also be enforced in the matrix \tilde{H} associated with edge-preserving regularization. In particular, the preconditioner we adopted is based on T. Chan's (1988) idea. Chan's preconditioner for a block Toeplitz matrix employs circulant approximations and is defined as the circulant matrix C that minimizes the Frobenius norm of the difference with respect to the given matrix. The circulant approximation of Toeplitz matrices is a classical technique in image restoration, which has been applied in inverse filtering, truncated SVD, and Tikhonov regularization. The main advantage is the availability of fast algorithms for the computation of the inverse and of the eigenvalues of the matrix. In fact, each n x n circulant matrix can be diagonalized by the Fourier matrix; therefore, its eigenvalues, as well as those of its inverse, can be computed in O(n log n) operations when the FFT algorithm is used. Similarly (Davis, 1979), each block circulant matrix with circulant blocks can be diagonalized by computing the two-dimensional FFT in O(nm(log n + log m)) operations. The idea of using the PCG method in association with circulant preconditioners is, however, a relatively "new" technique, first proposed in 1986 by Strang and re-proposed many times since (R. H. Chan et al., 1993; David and Dusan, 1994; Strela and Tyrtyshnikov, 1996). To adapt the Chan-type circulant preconditioner to the solution of the least squares problem for which the data matrix has the form (20), we have the additional difficulty that this matrix changes each time we update the line variables. Computing a new preconditioner at each round would drastically increase the overall time and eliminate the benefit of using the PCG instead of the CG. So, we construct a preconditioner only once and use this matrix for each call to the PCG subroutine. Since real images have sparse contours, our preconditioner C is constructed by assuming that the line variables are all off and extending the Toeplitz structure of A and B.
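For reference, the following one-dimensional sketch (Python/NumPy, our own illustration under the assumption of a dense Toeplitz input) builds T. Chan's optimal circulant approximation and applies its inverse through the FFT; the block Toeplitz case used in this chapter applies the same construction blockwise, with two-dimensional FFTs.

    import numpy as np

    def chan_circulant(T_mat):
        """T. Chan's optimal circulant approximation of a square Toeplitz
        matrix: the circulant C minimizing ||C - T||_F, obtained by averaging
        the k-th diagonal of T with its wrapped (k - n)-th diagonal."""
        n = T_mat.shape[0]
        col = T_mat[:, 0]          # t_0, t_1, ..., t_{n-1}
        row = T_mat[0, :]          # t_0, t_{-1}, ..., t_{-(n-1)}
        c = np.empty(n)
        c[0] = col[0]
        for k in range(1, n):
            c[k] = ((n - k) * col[k] + k * row[n - k]) / n
        return c                   # first column of the circulant C

    def apply_inverse(c_first_col, y):
        """Solve C x = y in O(n log n) using the FFT diagonalization of
        circulant matrices."""
        eig = np.fft.fft(c_first_col)              # eigenvalues of C
        return np.real(np.fft.ifft(np.fft.fft(y) / eig))

The fast inverse is exactly what makes C attractive as a preconditioner: each PCG iteration costs only a few FFTs more than a plain CG iteration.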
In particular, we define the new system matrix

\tilde{H}_0 = \begin{pmatrix} H \\ \lambda \tilde{A} \\ \lambda \tilde{B} \end{pmatrix}   (23)

where \tilde{A} and \tilde{B} are (nm) x (nm) and are given, respectively, by

\tilde{A} = \begin{pmatrix}
I_m & -I_m & & \\
& I_m & -I_m & \\
& & \ddots & \ddots \\
& & & I_m & -I_m \\
& & & & O_m
\end{pmatrix}   (24)

and

\tilde{B} = \begin{pmatrix}
\tilde{V} & & & \\
& \tilde{V} & & \\
& & \ddots & \\
& & & \tilde{V}
\end{pmatrix}   (25a)

where \tilde{V} is an m x m bidiagonal matrix of the form

\tilde{V} = \begin{pmatrix}
1 & -1 & & \\
& 1 & -1 & \\
& & \ddots & \ddots \\
& & & 1 & -1 \\
& & & & 0
\end{pmatrix}   (25b)
In Bedini, Del Corso et al. (2001), we proved the convergence properties of the adopted preconditioner by showing that the singular values of \tilde{H} C^{-1}, where \tilde{H} is given in Eq. (20), are clustered around one, except for at most O(n) + O(m) + 2k of them, where k is the number of lines that are "on" in the current configuration of the line process.
V. ML ESTIMATION OF THE DEGRADATION PARAMETERS
Problem (5b)--that is, the estimation of the degradation parameters--does not depend on the line field and can be reduced to the following two computational steps:

d^{(k+1)} = \arg\min_d \|g - H(d) f^{(k)}\|^2   (26a)

(\sigma^2)^{(k+1)} = \frac{\|g - H(d^{(k+1)}) f^{(k)}\|^2}{n \times m}   (26b)
which are derived by taking the negative logarithm of the likelihood function and dropping the constant terms. In particular, the closed-form solution for \sigma^2
is obtained by setting to zero the first derivative of the log-likelihood with respect to \sigma. To solve problem (26a), that is, to estimate the degradation parameters, we basically need to perform a least squares minimization. Let us consider the blur mask d to be a matrix of size (2D + 1) x (2D + 1) such that
d = \begin{pmatrix}
d_{-D,-D} & \cdots & d_{-D,D} \\
\vdots & & \vdots \\
d_{D,-D} & \cdots & d_{D,D}
\end{pmatrix}   (27)
and let d be its lexicographic version. Then it is
\|g - H(d) f\|^2 = \sum_{i,j} \left[ g_{i,j} - \sum_{r=-D}^{D} \sum_{s=-D}^{D} f_{i+r,j+s} \, d_{r,s} \right]^2   (28)
where the order of the iteration has been dropped, g is the matrix version of g, and a free boundary condition is assumed. If we call F_{i,j} the matrix of the elements of f covered by the mask d when d is centered on f_{i,j}, and F_{i,j} its lexicographic version, then we have

\|g - H(d) f\|^2 = \sum_{i,j} [g_{i,j} - \langle F_{i,j}, d \rangle]^2   (29)
where \langle \cdot, \cdot \rangle indicates the scalar product. The gradient of function (29) is

\frac{\partial \|g - H(d) f\|^2}{\partial d(r)} = -2 \sum_{i,j} [g_{i,j} - \langle F_{i,j}, d \rangle] F_{i,j}(r)
= -2 \sum_{i,j} g_{i,j} F_{i,j}(r) + 2 \sum_{i,j} [F_{i,j}(r) \langle F_{i,j}, d \rangle]
= -2 \sum_{i,j} g_{i,j} F_{i,j}(r) + 2 \sum_{k=1}^{(2D+1) \times (2D+1)} \sum_{i,j} F_{i,j}(r) F_{i,j}(k) \, d(k)   (30)

for r = 1, \ldots, (2D+1) \times (2D+1). Setting this gradient to zero reduces the original minimization problem to the solution of a linear system. Although the matrix of the system is very small, owing to the usually small size of the blur masks considered, it clearly depends on the current intensity map, and it is not easy to verify whether it is nonsingular. For this reason, instead of solving the system by the direct inversion of this matrix, we prefer to adopt a gradient descent method to minimize the original
function. Once again, it can easily be shown that an analogue Hopfield network could find the solution of this problem in real time.
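As an illustration, the following Python/NumPy sketch (our own; the step size and iteration count are illustrative assumptions) minimizes the least squares function of Eq. (29) over the blur mask by gradient descent, with the intensity clamped at its current estimate and the free boundary handled by zero padding.

    import numpy as np

    def estimate_blur_mask(g, f, D, n_iter=200, step=1e-7):
        """Gradient descent for problem (26a): minimize ||g - H(d) f||^2 over
        the (2D+1) x (2D+1) blur mask d, with f fixed at the current estimate."""
        size = 2 * D + 1
        d = np.zeros((size, size))
        d[D, D] = 1.0                              # start from the delta mask
        fp = np.pad(f, D)                          # zero-padded intensity
        n, m = g.shape
        # shifted[r, s] holds f_{i+r-D, j+s-D}, i.e., f under mask offset (r, s)
        shifted = np.array([[fp[r:r + n, s:s + m] for s in range(size)]
                            for r in range(size)])
        for _ in range(n_iter):
            pred = np.tensordot(d, shifted, axes=([0, 1], [0, 1]))   # H(d) f
            res = pred - g
            # Gradient of Eq. (30): 2 * sum_{i,j} residual_{i,j} * f_{i+r,j+s}
            grad = 2.0 * np.tensordot(shifted, res, axes=([2, 3], [0, 1]))
            d -= step * grad
        # Closed-form noise estimate of Eq. (26b)
        sigma2 = np.sum(res ** 2) / (n * m)
        return d, sigma2

Since the mask has at most a few dozen entries, each iteration is dominated by the correlations with f, which is why the overall step remains cheap in practice.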
VI. ML ESTIMATION OF THE MODEL PARAMETERS
As formulated in Section II, the hyperparameter estimation step (5c) entails maximizing the prior distribution of the image field computed in the current image estimate (f^{(k)}, l^{(k)}). This ML estimation problem can be equivalently reformulated as the minimization of the negative log-prior. This function can easily be shown to be convex with respect to w, so that a possible criterion for obtaining its minimum is to look for the zero of its gradient, that is, to solve the following system of equations:

\frac{\partial}{\partial w_r} \{-\log P(f^{(k)}, l^{(k)} \mid w)\} = \frac{\partial U(f^{(k)}, l^{(k)} \mid w)}{\partial w_r} + \frac{1}{Z(w)} \frac{\partial Z(w)}{\partial w_r} = V_r(f^{(k)}, l^{(k)}) - E_w[V_r(f, l)] = 0   (31)
where the expectations E_w are computed with respect to the prior distribution P(f, l \mid w), and Z(w) is the related partition function, which depends on w. Eq. (31) highlights a well-known result in statistical mechanics and probability theory, which states that all mean values of the system described by a Gibbs distribution can be obtained from the partition function (Chandler, 1987). Hence, a gradient descent technique could be applied to update the parameter vector w according to the following iterative scheme:

w_r^{t+1} = w_r^t - \eta \frac{\partial}{\partial w_r} \{-\log P(f^{(k)}, l^{(k)} \mid w)\}\Big|_{w^t} = w_r^t + \eta \{E_{w^t}[V_r(f, l)] - V_r(f^{(k)}, l^{(k)})\}   (32)
where \eta is a positive, small parameter to be suitably chosen to ensure convergence. Starting with w^0 = w^{(k)}, at the end of the iterative procedure (32) we would get the new estimate w^{(k+1)} of the parameters, according to Eq. (5c). Nevertheless, as it stands, the computational load of the iterative scheme (32) is insurmountable because of the computation of the expectations, or equivalently of the partition function, which requires summation over binary variables and integration over continuous variables. In previous papers (Bedini and Tonazzini, submitted; Bedini, Tonazzini et al., 1999; Tonazzini et al., 1997), we showed that the explicit presence of a line process in the MRF model enables one to drastically reduce the burden related to the computation of either the partition functions or their related expectations. In a first approach, we exploited a well-known result of the
statistical mechanical theory, that is, the saddle point approximation (Chandler, 1987; Geiger and Girosi, 1991), to feasibly approximate the partition function. We showed that this leads to basing the estimation on the conditional prior P(l^{(k)} \mid f^{(k)}, w) rather than on the joint prior P(f^{(k)}, l^{(k)} \mid w). In a second approach, we exploited the importance sampling theorem, within MCMC techniques (Geyer and Thompson, 1992), to approximate, by means of time averages, the ratio between partition functions (Bedini and Tonazzini, submitted; Tonazzini et al., 1997). In both cases, the time averages can be computed by means of a low-cost binary Gibbs sampler, or, when the line process is noninteracting, the original partition functions or expectations can even be computed by means of analytical calculations, with an additional, significant reduction of the computational cost. Furthermore, as already mentioned, a Boltzmann machine architecture could be devised for performing the stochastic sampling. To derive these two approaches, we found it more convenient to equivalently reformulate the hyperparameter estimation step (5c) as an ML estimation based on the posterior distribution rather than on the prior distribution. The details of these derivations are given in the following two subsections.
A. Derivation of the Model Parameter Updating Rule by Means of the Saddle Point Approximation

In this section we show how, by means of the saddle point approximation, the ML estimation of the model hyperparameters, originally formulated on the basis of the prior distribution of the coupled image field, can be reduced to an ML estimation based on the conditional prior of the line field given the current estimate of the intensity field. To this aim, we first formulate the hyperparameter estimation as the maximization with respect to w of the posterior distribution P(l^{(k)}, f^{(k)} \mid g, w), where we dropped the explicit dependence on q^{(k)}, which is considered fixed at this stage. As a function of w, P(l^{(k)}, f^{(k)} \mid g, w) is given by
P(f^{(k)}, l^{(k)} \mid g, w) = \frac{P(g \mid f^{(k)}, l^{(k)}, w) \, P(f^{(k)}, l^{(k)} \mid w)}{P(g \mid w)}   (33)
where
P(g \mid w) = \int \sum_l P(f, l, g \mid w) \, df = \int \sum_l P(g \mid f, l, w) \, P(f, l \mid w) \, df   (34)
Substituting in Eqs. (33) and (34) the expressions that hold in our case for P(g \mid f, l, w) and P(f, l \mid w), we get
P(f^{(k)}, l^{(k)} \mid g, w) = \frac{\exp\left[-\frac{\|g - Hf^{(k)}\|^2}{2\sigma^2}\right] \exp[-U(f^{(k)}, l^{(k)} \mid w)]}{\int \sum_l \exp\left[-\frac{\|g - Hf\|^2}{2\sigma^2}\right] \exp[-U(f, l \mid w)] \, df}   (35)
Since

\arg\max_w P(f^{(k)}, l^{(k)} \mid g, w) = \arg\min_w \{-\log P(f^{(k)}, l^{(k)} \mid g, w)\}   (36)
we reformulate the ML parameter estimation problem as the minimization of the negative log-posterior. This function, too, can easily be shown to be convex with respect to w, so that one possible criterion for obtaining its minimum is to look for the zero of its gradient, that is, to solve the following system of equations:
\frac{\partial}{\partial w_r} \{-\log P(f^{(k)}, l^{(k)} \mid g, w)\} = \frac{\partial U(f^{(k)}, l^{(k)} \mid w)}{\partial w_r} + \frac{1}{Z(w)} \frac{\partial Z(w)}{\partial w_r} = V_r(f^{(k)}, l^{(k)}) - E_w[V_r(f, l)] = 0   (37)
which is similar to that already given in Eq. (31), but where Z(w) is now the partition function of the posterior distribution P(f, l \mid g, w), and the expectations are computed with respect to the posterior rather than with respect to the prior. The application of a gradient descent technique gives the following updating rule:
w_r^{t+1} = w_r^t - \eta \frac{\partial}{\partial w_r} \{-\log P(f^{(k)}, l^{(k)} \mid g, w)\}\Big|_{w^t} = w_r^t + \eta \{E_{w^t}[V_r(f, l)] - V_r(f^{(k)}, l^{(k)})\}   (38)
where \eta, as in Eq. (32), is a positive, small parameter to be carefully chosen to ensure convergence. The physical meaning of this updating rule is to look for the parameters which make the various potentials, computed in the MRF realization (f^{(k)}, l^{(k)}), equal to their averages when (f, l) is left free to assume all the possible configurations. In other words, the parameter updating rule aims to determine the probability under which the frequencies of occurrence of the various local configurations in (f^{(k)}, l^{(k)}) are equal to their statistical expectations. It is easy to recognize in this updating rule the learning algorithm for the weights of a GBM with mixed continuous and binary units distributed over a single layer (Azencott, 1990). One particularity is that these units are
associated with the data, the pixels, and the line elements, but, whereas the units f and l are free to evolve in the unclamped phase and are kept fixed to the single example (f^{(k)}, l^{(k)}) in the clamped phase, the units associated with the data are always clamped to g. Moreover, the interconnection weights among the data units and the pixel units are fixed and related to the degradation model through the elements of matrix H. Starting with w^0 = w^{(k)}, at the end of the iterative procedure (38) we would get the new estimate w^{(k+1)} of the parameters, to be used in Eq. (5a) for computing a new update (f^{(k+1)}, l^{(k+1)}) of the image. Nevertheless, as for the case in which the update of the hyperparameters is based on the prior, the computational cost of the iterative scheme (38) is extremely high. The summation with respect to the binary variables can be estimated, in the expectations, by means of a relatively cheap Gibbs sampler and, in the case of noninteracting lines, can even be performed analytically. On the contrary, there is no way to reduce the cost of the summation in f, so that some form of approximation needs to be adopted. Our approach to derive this approximation acts at the level of the partition function and uses the saddle point approximation. The expectations in the gradient are then computed by derivatives of the approximated partition function. Let us consider first the following simplified form for the isotropic, homogeneous prior energy U(f, l \mid w):
U(f, l \mid \lambda, \alpha) = \lambda \sum_{i,j} (f_{i,j} - f_{i+1,j})^2 (1 - h_{i,j}) + \lambda \sum_{i,j} (f_{i,j} - f_{i,j+1})^2 (1 - v_{i,j}) + \alpha \sum_{i,j} h_{i,j} + \alpha \sum_{i,j} v_{i,j}   (39)

which corresponds to the weak membrane energy. Prior (39) favors solutions which present discontinuities when the absolute value of the horizontal or vertical gradient is greater than the threshold value \sqrt{\alpha/\lambda}, and which vary smoothly when there are no discontinuities. Considered individually, \lambda tunes the strength of the smoothness constraint, while \alpha determines the cost to be paid whenever a discontinuity is created, which thus prevents the creation of too many discontinuities. For the weak membrane model, the partition function Z(\lambda, \alpha) of the posterior distribution is

Z(\lambda, \alpha) = \int \sum_h \sum_v \exp\left[ -\frac{\|g - Hf\|^2}{2\sigma^2} - \lambda \sum_{i,j} (f_{i,j} - f_{i+1,j})^2 (1 - h_{i,j}) - \lambda \sum_{i,j} (f_{i,j} - f_{i,j+1})^2 (1 - v_{i,j}) - \alpha \sum_{i,j} (h_{i,j} + v_{i,j}) \right] df   (40)
In Eq. (40) the contribution of the line process can be computed exactly. Indeed, by distributing the summation over h and v and exploiting the independence of the single line elements from each other, one can integrate out the variables h and v, thus getting

Z(\lambda, \alpha) = \int \exp\left[-\frac{\|g - Hf\|^2}{2\sigma^2}\right] \prod_{i,j} \{\exp[-\lambda (f_{i,j} - f_{i+1,j})^2] + \exp(-\alpha)\} \prod_{i,j} \{\exp[-\lambda (f_{i,j} - f_{i,j+1})^2] + \exp(-\alpha)\} \, df
= \int \exp\left[-\frac{\|g - Hf\|^2}{2\sigma^2}\right] \exp\left[\sum_{i,j} \log\{\exp[-\lambda (f_{i,j} - f_{i+1,j})^2] + \exp(-\alpha)\}\right] \exp\left[\sum_{i,j} \log\{\exp[-\lambda (f_{i,j} - f_{i,j+1})^2] + \exp(-\alpha)\}\right] df   (41)
which can be interpreted as the partition function of a new posterior distribution depending on f only. Geiger and Girosi (1991) call the Gibbsian prior energy associated with this distribution the effective potential. Indeed, it describes the effect of the interaction of the line field with the intensity field by suitably modifying the interaction of the field f with itself. To overcome the need for integrating with respect to f, one could argue that a good approximation for Z is given by the value which is predominant in the function to be integrated. In fact, if one neglects the statistical fluctuations of the field f, the partition function of Eq. (41) can be approximated through the saddle point approximation as follows:

Z(\lambda, \alpha) \approx C \exp\left[-\frac{\|g - H\bar{f}\|^2}{2\sigma^2}\right] \exp\left[\sum_{i,j} \log\{\exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i+1,j})^2] + \exp(-\alpha)\}\right] \exp\left[\sum_{i,j} \log\{\exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i,j+1})^2] + \exp(-\alpha)\}\right]   (42)
where C is a constant and \bar{f} satisfies

\nabla_f \left\{ \frac{\|g - Hf\|^2}{2\sigma^2} - \sum_{i,j} \log\{\exp[-\lambda (f_{i,j} - f_{i+1,j})^2] + \exp(-\alpha)\} - \sum_{i,j} \log\{\exp[-\lambda (f_{i,j} - f_{i,j+1})^2] + \exp(-\alpha)\} \right\} \Bigg|_{f = \bar{f}} = 0   (43)
Assuming this approximation for the partition function, the explicit forms for the expectations appearing in the gradient of Eq. (37) are
E_{\lambda,\alpha}\left[\sum_{i,j} (f_{i,j} - f_{i+1,j})^2 (1 - h_{i,j}) + \sum_{i,j} (f_{i,j} - f_{i,j+1})^2 (1 - v_{i,j})\right] = -\frac{1}{Z(\lambda, \alpha)} \frac{\partial Z(\lambda, \alpha)}{\partial \lambda}
= \sum_{i,j} \frac{(\bar{f}_{i,j} - \bar{f}_{i+1,j})^2 \exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i+1,j})^2]}{\exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i+1,j})^2] + \exp[-\alpha]} + \sum_{i,j} \frac{(\bar{f}_{i,j} - \bar{f}_{i,j+1})^2 \exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i,j+1})^2]}{\exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i,j+1})^2] + \exp[-\alpha]}   (44a)
and

E_{\lambda,\alpha}\left[\sum_{i,j} h_{i,j} + \sum_{i,j} v_{i,j}\right] = -\frac{1}{Z(\lambda, \alpha)} \frac{\partial Z(\lambda, \alpha)}{\partial \alpha}
= \sum_{i,j} \frac{\exp[-\alpha]}{\exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i+1,j})^2] + \exp[-\alpha]} + \sum_{i,j} \frac{\exp[-\alpha]}{\exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i,j+1})^2] + \exp[-\alpha]}   (44b)

respectively. It is clear that these expectations can be considered to be computed with respect to the conditional posterior P(l \mid \bar{f}, g, \lambda, \alpha), which coincides with the conditional prior P(l \mid \bar{f}, \lambda, \alpha). In particular, from Eq. (44b), it results that the two terms

\bar{h}_{i,j} = \frac{\exp[-\alpha]}{\exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i+1,j})^2] + \exp[-\alpha]}   (45a)

and

\bar{v}_{i,j} = \frac{\exp[-\alpha]}{\exp[-\lambda (\bar{f}_{i,j} - \bar{f}_{i,j+1})^2] + \exp[-\alpha]}   (45b)

represent the mean values of the variables h_{i,j} and v_{i,j}, respectively, under the same probability. Geiger and Girosi (1991) applied the preceding saddle point approximation to derive a set of deterministic equations for the solution of an image
reconstruction problem from sparse and noisy data. In particular, they proposed a gradient descent method as a fast, parallel, and iterative scheme to obtain the solution of Eq. (43). In our application, as \bar{f} depends on w, Eq. (43) should in principle be solved for each step of the parameter updating rule of Eq. (38). A further reduction in the computational cost can be obtained by assuming that mild changes in the values of the parameters produce slight modifications in \bar{f}, so that \bar{f} can be kept fixed along scheme (38) at the value corresponding to w^{(k)}. A second, possible level of approximation is thus based on assuming \bar{f} \approx f^{(k)}. This assumption is justified by the fact that, when \bar{f} is computed with respect to w^{(k)}, as in this case, it actually converges to f^{(k)} in the limit to zero of a temperature parameter T to be inserted as usual in the posterior probability. This argument is the basis for the convergence of a practical algorithm, described in Section VII, in which this approximation is used in the context of a mixed annealing for MAP restoration, periodically interrupted to update the parameters according to rule (38). The use of the previously described saddle point approximations leads to the result that the estimation is actually based on the conditional prior P(l^{(k)} \mid f^{(k)}, \lambda, \alpha) = P(l^{(k)} \mid f^{(k)}, g, \lambda, \alpha) rather than on the joint prior or on the joint posterior P(f^{(k)}, l^{(k)} \mid g, \lambda, \alpha). This can easily be understood by verifying that the gradient of the negative logarithm of the conditional prior with respect to \lambda and \alpha is identical to that obtained by considering Eqs. (44) with \bar{f} = f^{(k)}. It is clear that in this case the practical computation of the expectations is very fast, since f^{(k)} is already available from step (5a). Hence, the computational cost of the iterative scheme of Eq. (38) is very low. This result can be extended to the case of an MRF model in general form (e.g., self-interacting line processes), so that, in the MAP-ML procedure for the blind, unsupervised restoration of images, formalized in Section II.A through the three computational steps of Eqs. (5), step (5c) can now be rewritten as

w^{(k+1)} = \arg\max_w P(l^{(k)} \mid f^{(k)}, w)   (46)

and can be implemented through the general updating rule

w^{t+1} = w^t - \eta \nabla(-\log P(l^{(k)} \mid f^{(k)}, w))\big|_{w^t}   (47)
As a remark, the meaning of this updating rule is now the search for the prior probability under which the potentials computed in (f^{(k)}, l^{(k)}) are equal to the corresponding potentials computed in (f^{(k)}, \bar{l}), where \bar{l} is the expected value of the line process for the given intensity f^{(k)}. The physical point of view is that the fluctuations of l, with respect to sample l^{(k)} and conditioned on some relevant value for f, are assumed to give almost the same information about the
parameters as the fluctuations of both fields around (f^{(k)}, l^{(k)}). Note that since f^{(k)} is already available from step (5a), the computation of the expectations, although generally not available in analytical form, requires summation over the binary variables alone and can be performed through time averages by means of a cheap Gibbs sampler. In the formalism of the GBM, this updating rule is equivalent to a learning algorithm for a network with mixed continuous and binary units interconnected according to a system of cliques and distributed over an input layer and an output layer. The input units are associated with the pixels f, and the output units with the line elements l. The weights of the GBM are the hyperparameters of the conditional prior, which coincides with the internal distribution according to which the units evolve. In the learning algorithm, based on the single example (f^{(k)}, l^{(k)}), the input units are kept fixed to the environment both in the clamped phase and in the unclamped phase, while the output units are free to evolve in the unclamped phase. This special case of learning is known as strict classification without hidden units (Aarts and Korst, 1989). Thus, the same Boltzmann machine, already defined for updating the line elements in mixed annealing, can be used according to its learning modality to estimate the model hyperparameters (Bedini, Tonazzini, and Minutoli, 2000).
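To make the weak membrane case concrete, the following Python/NumPy sketch (our own illustration; the step size eta is an assumption) evaluates the mean line variables of Eqs. (45a)-(45b) with \bar{f} clamped to the current estimate f^{(k)} and performs one step of the gradient rule (38) for \lambda and \alpha.

    import numpy as np

    def mean_lines(f, lam, alpha):
        """Mean values of the line variables under the conditional prior
        P(l | f, lambda, alpha) of the weak membrane model (Eqs. (45a)-(45b))."""
        dh = (f[:-1, :] - f[1:, :]) ** 2     # squared differences for h lines
        dv = (f[:, :-1] - f[:, 1:]) ** 2     # squared differences for v lines
        h_bar = np.exp(-alpha) / (np.exp(-lam * dh) + np.exp(-alpha))
        v_bar = np.exp(-alpha) / (np.exp(-lam * dv) + np.exp(-alpha))
        return h_bar, v_bar, dh, dv

    def update_hyperparams(f, l_h, l_v, lam, alpha, eta=1e-4):
        """One step of rule (38): move each parameter toward equality between
        the expected potential (under the mean line values) and the potential
        observed in the current sample (f, l_h, l_v)."""
        h_bar, v_bar, dh, dv = mean_lines(f, lam, alpha)
        # E[V_lambda] - V_lambda(f, l): smoothness potential (cf. Eq. (44a))
        grad_lam = (np.sum(dh * (1 - h_bar)) + np.sum(dv * (1 - v_bar))
                    - np.sum(dh * (1 - l_h)) - np.sum(dv * (1 - l_v)))
        # E[V_alpha] - V_alpha(l): number of line elements (cf. Eq. (44b))
        grad_alpha = (np.sum(h_bar) + np.sum(v_bar)
                      - np.sum(l_h) - np.sum(l_v))
        return lam + eta * grad_lam, alpha + eta * grad_alpha

Because the weak membrane lines are noninteracting, no sampling at all is needed here: the expectations factorize and are evaluated in closed form.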
B. Derivation of the Model Parameter Updating Rule by Means of MCMC Techniques
We now consider a different approach to the ML estimation of the unknown Gibbs parameters. This approach is based on the use of Markov chain Monte Carlo (MCMC) methods (Geyer and Thompson, 1992). Higdon et al. (1995) and Descombes et al. (1997) applied MCMC techniques to estimate the hyperparameter set, with application to positron emission tomography (PET) and image segmentation, respectively. For the application of this approach to our case (Tonazzini et al., 1997), steps (5a) and (5c) are first joined and formulated as the simultaneous maximization of the posterior distribution with respect to (f, l) and w, for the current value of the degradation parameter vector q^{(k)}:
(f^{(k)}, l^{(k)}, w^{(k)}) = \arg\max_{f,l,w} P(f, l \mid g, w, q^{(k)})   (48)
We rewrite problem (48) in the equivalent form

(l^{(k)}, w^{(k)}) = \arg\max_{l,w} P(f^*(l, w, q^{(k)}), l \mid g, w, q^{(k)})   (49a)
where
f^*(l, w, q^{(k)}) = \arg\max_f P(f, l \mid g, w, q^{(k)})   (49b)
Problem (49a) can be approximated, as usual, by the following two steps:

l^{(k)} = \arg\max_l P(f^*(l, w^{(k)}, q^{(k)}), l \mid g, w^{(k)}, q^{(k)})   (50a)

w^{(k+1)} = \arg\max_w P(f^*(l^{(k)}, w, q^{(k)}), l^{(k)} \mid g, w, q^{(k)})   (50b)
where, as stated, f^*(l, w, q) is the maximizer of the posterior for the current values of the parameters. Steps (50a) and (50b), together with a step for computing f^*, are now intended as substitutes for steps (5a) and (5c), respectively. For simplicity of notation, the dependence of f^* and of the posterior on q will not be explicitly referred to from now on. Let us assume, at this stage, that f^*(l, w) is available for each l and each w. Then, the solution of problem (50a) entails using SA only with respect to the line variables. Because of the small size of the neighborhoods and the binary nature of the lines, the adoption of a Gibbs sampler to generate Markov chains in l would result in a cheap algorithm. Since the posterior distribution in Eq. (50b) is likely to be a nonconcave function with respect to w, an SA must also be adopted to compute its global maximum. Considering the log-posterior and introducing a temperature parameter T, suitably decreased during the iterations, a Metropolis algorithm is chosen over the Gibbs sampler since it avoids integrating over w. Thus, starting with w^{(k)}, at each step t we propose to update the current value w^t of the parameters by a quantity \Delta w. According to the Metropolis algorithm, the ratio
r = \frac{P(f^*(l^{(k)}, w^t + \Delta w), l^{(k)} \mid g, w^t + \Delta w)}{P(f^*(l^{(k)}, w^t), l^{(k)} \mid g, w^t)}   (51)
must be computed. The update is definitely accepted if r \geq 1, so that w^{t+1} = w^t + \Delta w; otherwise the update is accepted with probability r. The expanded form of r is given by

r = \frac{\exp[-E(f^*(l^{(k)}, w^t + \Delta w), l^{(k)} \mid g, w^t + \Delta w)]^{1/T}}{\exp[-E(f^*(l^{(k)}, w^t), l^{(k)} \mid g, w^t)]^{1/T}} \times \frac{Z(w^t)^{1/T}}{Z(w^t + \Delta w)^{1/T}}   (52)

The difficulty with Eq. (52) is mainly related to the presence of the ratio Z(w^t)/Z(w^t + \Delta w); however, we can exploit the well-known importance sampling theorem in MCMC methods, according to which the ratio between two partition functions can always be seen as an expectation with respect to one of
the two corresponding distributions (Geyer and Thompson, 1992). In our case it is
\frac{Z(w^t + \Delta w)}{Z(w^t)} = E\left[\frac{\exp[-E(f^*(l, w^t + \Delta w), l \mid g, w^t + \Delta w)]}{\exp[-E(f^*(l, w^t), l \mid g, w^t)]}\right]   (53)
where the expectation is computed with respect to the distribution P(f^*(l, w^t), l \mid g, w^t). Provided that \Delta w is not too large, if a sequence \{l^m\}, m = 1, \ldots, M, of samples is drawn from P(f^*(l, w^t), l \mid g, w^t), the ratio Z(w^t)/Z(w^t + \Delta w) can be approximated by the inverse of
\frac{1}{M} \sum_{m=1}^{M} \frac{\exp[-E(f^*(l^m, w^t + \Delta w), l^m \mid g, w^t + \Delta w)]}{\exp[-E(f^*(l^m, w^t), l^m \mid g, w^t)]}   (54)
Again, the samples \{l^m\} will be constituted by realizations of the line process, obtained by means of a low-cost Gibbs sampler over binary variables. So far, we have assumed the ready availability of f^*(l, w) for each element of the Markov chains involved in the solution of Eqs. (50a) and (50b). Unfortunately, no analytical formula for f^*(l, w) is generally available. However, f^*(l, w) is well defined because of the concavity of the posterior distribution with respect to f. Indeed, for the typical forms of U(f, l \mid w), and assuming a linear data model with Gaussian uncorrelated noise, the posterior energy E(f, l \mid g, w) is a quadratic, positive-definite form when l and w are kept fixed. We already highlighted that the availability of an analogue Hopfield-type neural network would permit the computation of f^*(l, w) in almost real time, which thus would make the whole preceding procedure extremely fast. In the absence of such an architecture, the numerical computation of f^*(l, w) by means of gradient descent for each proposed update of l would still make the method expensive, so that the adoption of some form of approximation becomes necessary. The most natural approximation consists of trying to reduce as much as possible the number of computations of f^*(l, w). For instance, we can assume that f^*(l, w) is computed only after one or more updates of the whole line process. This approximation is justified by the fact that, because of the local nature of the MRF model for the image, f^*(l, w) undergoes small variations for small perturbations in l. The application of this approximation to the solution of problem (49a) leads to the already cited mixed-annealing algorithm for the MAP estimation of the image field (f, l). With regard to problem (50b), the same approximation simplifies the computation of Eq. (54). Moreover, if, as a limit case, we assume that f^*(l, w^t) is always clamped to f^*(l^{(k)}, w^t), and f^*(l, w^t + \Delta w) always clamped to f^*(l^{(k)}, w^t + \Delta w), the computation of Eq. (54) will be extremely fast. When we adopt the approximation of keeping f^* always computed at l^{(k)} in conjunction with a weak membrane model,
in which the line elements are noninteracting, the ratio in Eq. (52) can be computed exactly, since the two partition functions Z(w^t) and Z(w^t + \Delta w) can be computed exhaustively through analytical calculations (Bedini and Tonazzini, submitted).

VII. THE OVERALL ARCHITECTURE FOR THE FULLY BLIND AND UNSUPERVISED RESTORATION
In Sections VI.A and VI.B two techniques were described to simplify the estimation of the model's hyperparameters within the iterative procedure of Eqs. (5). The first employs a deterministic updating rule based on the saddle point approximation; the second employs an SA based on the importance sampling theorem. It is to be noted, however, that they both can be used for the implementation of the theoretical iterative procedure (5), in conjunction with a step for estimating the image field, through the mixed-annealing algorithm briefly described in Section IV, and a step for estimating the degradation parameters, through the algorithms described in Section V. Although greatly simplified by virtue of the aforementioned approximation, this procedure still remains computationally costly, since in principle it requires, at each of the three steps, a global maximization of the relative distributions. To further reduce the computational complexity, in the following we devise a practical scheme for the implementation of the procedure, based on an overall, external mixed-annealing procedure for the MAP estimation of the image field, occasionally interrupted for performing partial estimations (i.e., updates) of w and q. In particular, when the saddle point approximation is used, parameters w are obtained by performing a single step of the iterative scheme (47), so as to increment the value that the probability assumes on the current image estimate. When MCMC techniques are used, parameters w are updated by a single visitation of all the vector elements according to the Metropolis algorithm. In this practical procedure, the mixed annealing is carried out for a decreasing sequence of temperatures, based on the posterior distribution
P_T(f, l \mid g, w^{(k)}, q^{(k)}) = \frac{1}{Z} \exp\left[-\frac{E(f, l \mid g, w^{(k)}, q^{(k)})}{T}\right]   (55)
During each iteration of the algorithm the temperature is kept fixed at its current value and the following main tasks are executed. The line elements are updated to l^{(k)} according to the Gibbs sampler; all the line sites are repeatedly visited, say L times, and their values are set by drawing samples from the local prior distributions. We already highlighted that, because of the small size of the neighborhood system adopted and because the line elements are
binary, the Gibbs sampler can be considered a cheap algorithm in this case. The minimization of the convex posterior energy in f, to get f^{(k)}, is performed deterministically, through a conjugate gradient, on the basis of the current line configuration. This conjugate gradient can eventually be improved by a preconditioner or stopped after a given number of iterations. The degradation parameters are estimated by means of Eqs. (26). In particular, the estimation of d^{(k+1)} is performed by solving a least squares problem whose complexity is low, as a result of the generally low size of the blur masks used in practical applications, and (\sigma^2)^{(k+1)} is obtained in closed form through Eq. (26b). The hyperparameters are then updated by a single step of rule (47),

w^{(k+1)} = w^{(k)} - \eta \nabla(-\log P(l^{(k+1)} \mid f^{(k+1)}, w))\big|_{w^{(k)}}

or, alternatively:

5. Propose an increment \Delta w for w^{(k)}.
   a. Compute
      f^*(l^{(k+1)}, w^{(k)} + \Delta w, q^{(k)}) = \arg\min_f E(f, l^{(k+1)} \mid g, w^{(k)} + \Delta w, q^{(k)})
   b. By exploiting approximation (54), compute
      r = \frac{P(f^*(l^{(k+1)}, w^{(k)} + \Delta w, q^{(k)}), l^{(k+1)} \mid g, w^{(k)} + \Delta w, q^{(k)})^{1/T_k}}{P(f^{(k+1)}, l^{(k+1)} \mid g, w^{(k)}, q^{(k)})^{1/T_k}}
   c. Accept the update and set w^{(k+1)} = w^{(k)} + \Delta w if r \geq 1; otherwise accept it with probability r.
6. Set k = k + 1; go to step 2 until a termination criterion is satisfied.

It is easy to see that when the deterministic updating rule is used for the hyperparameters, the convergence of the whole algorithm (i.e., stabilization of the solutions) is ensured since, as T_k approaches zero, P_{T_k}(l \mid f^{(k)}, w^{(k)}) becomes peaked around l^{(k)}, and hence the gradient in step 5 approaches zero as well. In fact, when this condition occurs, both w and l do not change anymore. By virtue of the stabilization of w and l, it is easy to see that f, d, and \sigma^2 stabilize as well, since they are computed deterministically as the minimizers of cost functions with now-fixed parameters, or by means of analytical formulas. Figure 1 shows a possible hardware architecture for implementing the preceding scheme when the hyperparameter updating rule based on the saddle point approximation is adopted. The proposed architecture is suitable for near real-time applications. A similar architecture can be derived when MCMC techniques are used instead. The architecture consists of five main blocks which allow the various steps of the scheme to be executed. Block B1 has as input a realization f of the intensity field and updates each line element l_{i,j}, horizontal or vertical, according to a Gibbs sampler. From a computational point of view this block can be considered as consisting of as many processing units as there are output units. Each unit has a local memory where the values of the parameters w, of the temperature T, and of the intensity pixels needed for the Gibbs sampler are stored. It operates asynchronous updates of the value of the corresponding line element.
FIGURE 1. The overall architecture for the MAP-ML (maximum a posteriori-maximum likelihood) method.
Blocks B2 and B3 perform, respectively, the computation of f^*(l) and the computation of d^{(k+1)}. Block B4 has as input the current realization of f and a given number of realizations, say M, of l, and computes an estimate of the unclamped mean. The controller performs two main tasks: First, it coordinates the tasks executed by the four blocks, by providing the inputs and the parameters used by each; second, it updates the values of the parameters w, by computing the clamped mean and using the received sample (f^{(k+1)}, l^{(k+1)}) of the image field and the value of the estimated unclamped mean. Another task of the controller is to decrease the temperature T according to a pre-fixed schedule and to generate the commands for activating the various operative modes needed to implement the functions of the overall system. In Bedini and Tonazzini (2001) and Bedini, Tonazzini, and Minutoli (2000), we showed that fully parallel hardware can be used to implement blocks B1, B2, and B3. For instance, block B1 can be considered as a GBM and could be implemented by hardware constituted of (2nm - m - n) processors operating asynchronously in parallel. Since each processor needs about 30 CPU cycles to update a line element, considering a 512 x 512 image, the updating of the whole line process could be executed by using a single standard digital signal processor (DSP) when response times of about 1 s are required. The functions performed by blocks B2 and B3 can be executed by two Hopfield neural networks with a number of neurons equal to the number of pixels in the image and of elements in the blur mask, respectively. Because the number of elements of a blur mask is usually small (often not more than 49), parallel hardware to implement the Hopfield network corresponding to block B3 is already commercially available (Lindsey and Lindblad, 1995). Conversely, the implementation of the Hopfield network corresponding to block B2 is impractical as a result of the usually very high number of pixels in images. Thus some modifications to standard implementations of the Hopfield network have to be devised. To this purpose one could exploit the fact that the number of actually active interconnections for each neuron is small (81 for a 5 x 5 blur mask, 169 for a 7 x 7 blur mask). This could make feasible the implementation of the Hopfield network by means of parallel hardware with acceptable execution time. So that the unclamped mean can be computed, for each element w_r of the vector w, about nm values of clique potentials have to be computed and averaged M times. Thus, assuming that the average number of CPU cycles needed to compute a clique potential is 20, the total number of CPU cycles needed to compute the unclamped mean can be estimated at about 20M x nm. Considering that the typical average value for n and m is 512 and for M is 200, as a way to obtain computation times fitting real-time requirements, parallel hardware has to be designed for the implementation of block B4. The computation could be performed fully in parallel by using nm independent processors, with
a very low computation time; otherwise a good trade-off between computation time and number of independent processors can be reached by grouping the pixels and using each processor to compute the unclamped means of the pixels belonging to each group.
VIII. ADAPTIVE SMOOTHING AND EDGE TRACKING
A different way of dealing with the problem of unsupervised image recovery involves considering the hyperparameters not only as peculiar features of the image model, strictly related to the desired image characteristics, but also in relation to the particular problem in question. In this sense, the hyperparameters can be considered as weights to be determined on the basis of the relative confidence we have in the various constraints and in the data. For instance, the regularization parameter also has to be tuned according to the amount of noise which affects the data: the higher the noise, the stronger the smoothness to be chosen. Another parameter, namely the threshold for the creation of a line, represents the intensity gradient above which a true edge is likely to be present in the true image. Thus, in principle, it should be lower than the lowest jump in the image and higher than the highest noise peak. Unfortunately, as in this latter example, the parameters often should be chosen in order to satisfy conditions that are unlikely to be simultaneously verified in real situations. One solution to this dilemma could be to perform an adaptive reconstruction in which the parameters are first roughly selected on the basis of the available information and then modified according to some criterion. For example, a strategy that has proven to perform well is to start the process with a high threshold in order to smooth off the noise and then gradually reduce it for a coarse-to-fine recovery of even the finest true edges. This adaptive variation of the model parameters can be performed heuristically, but many attempts would likely have to be made before the correct variation schedule could be found. Thus, some automatic rule should be devised. GNC-like algorithms seem to be suitable for performing this task, especially with reference to the threshold parameters. The idea behind GNC, which was first derived to minimize the energy function F(f) which arises from the elimination of an explicit, binary line process from the weak membrane energy, and was then extended to other stabilizers, is to construct a family of approximations, dependent on a parameter p, p \in [0, 1], such that F^{(0)} = F and F^{(1)} is convex. Starting from p = 1, a gradient descent algorithm is subsequently applied to the various approximations, for a prescribed decreasing sequence of values of p. It can be seen that the parameter p, while approximating the original stabilizer, usually relaxes the threshold. Nevertheless, in most proposed stabilizers, the extent of this threshold variability is very limited.
Moreover, the stabilizers suitable for GNC-like algorithms do not allow for a straightforward incorporation of constraints on the edge geometry, while the adaptive tracking of the edges could very much benefit from the incorporation of a line connection constraint, in such a way as to favor the recovery of well-behaved edges against sparse edges due to noise. In this section, we show that a particular edge-preserving stabilizer, previously proposed for managing self-interacting discontinuities (Bedini, Gerace et al., 1995), can be effectively used for the fully automatic adaptive tracking of well-behaved edges (Tonazzini, 2000). This stabilizer was derived by substituting each explicit line element of an energy function E(f, l), accounting for useful discontinuity constraints, with a step function of the local intensity gradient. Approximating the step function with a sigmoidal function, whose gain is determined by a parameter T called temperature, yielded a GNC-like algorithm for minimizing the resulting energy function F_T(f) at decreasing values of T. We applied the technique to the problem of image reconstruction from sparse and noisy data, and we showed that considerable improvements in the quality of the reconstructions can be achieved when constraints on the line process are introduced, without any increase in the computational cost. In this case, comparative experiments were carried out by choosing, by trial and error, the best regularization and threshold parameters to be adopted with and without a line connection constraint, respectively. Nevertheless, this sigmoidal stabilizer has the extra property of greatly changing its shape as the temperature varies. This property is reflected in the effective threshold of the stabilizer itself. In more detail, for high values of the temperature the threshold increases, while for T lowered to zero the threshold tends to the one originally assigned to E(f, l). Next, we investigate whether this property of the sigmoidal approximation can effectively be used to significantly relax the usual need for fine-tuning.
A. An MRF Model for Constrained Implicit Discontinuities

In the MRF formalism, the easiest way to define an edge-preserving model that is able to manage self-interactions of the intensity discontinuities is to consider a coupled model with an explicit binary line process l = (h, v). For instance, a prior energy (from now on also called a stabilizer) of the following form

U(f, l) = \lambda \sum_{i,j} (f_{i,j} - f_{i+1,j})^2 (1 - h_{i,j}) + \lambda \sum_{i,j} (f_{i,j} - f_{i,j+1})^2 (1 - v_{i,j})
+ \alpha \sum_{i,j} h_{i,j} + \alpha \sum_{i,j} v_{i,j} + \gamma \sum_{i,j} h_{i,j} h_{i+1,j} + \gamma \sum_{i,j} v_{i,j} v_{i,j+1}
- \varepsilon \sum_{i,j} h_{i,j} h_{i,j+1} - \varepsilon \sum_{i,j} h_{i,j} v_{i,j} - \varepsilon \sum_{i,j} h_{i,j} v_{i+1,j} - \varepsilon \sum_{i,j} v_{i,j} v_{i+1,j}
- \varepsilon \sum_{i,j} v_{i,j} h_{i,j+1} - \varepsilon \sum_{i,j} v_{i+1,j} h_{i,j+1} + 2\varepsilon \sum_{i,j} v_{i,j} v_{i+1,j} h_{i,j+1}
+ \beta \sum_{i,j} h_{i,j} h_{i,j+1} v_{i,j} + \beta \sum_{i,j} h_{i,j} v_{i,j} v_{i+1,j} + \beta \sum_{i,j} h_{i,j} h_{i,j+1} v_{i+1,j}
+ \kappa \sum_{i,j} h_{i,j} h_{i,j+1} v_{i,j} v_{i+1,j}   (56)
can describe a good deal of useful isotropic geometric edge properties, such as the favoring of line continuation and the penalization of parallel adjacent lines, in addition to a local smoothness constraint on the image intensity. The extra parameters \varepsilon, \gamma, \beta, and \kappa are positive weights for the related line constraints. Parameter \varepsilon takes positive values in [0, 1] and allows the amount of propagation of the lines to be controlled; in other words, the price to be paid for creating a discontinuity will be decreased by \varepsilon\alpha when a discontinuity at a neighboring site is present. Parameter \gamma is any positive value that establishes the price to be paid for creating a discontinuity when a parallel adjacent line element is already present, while \beta and \kappa serve to prevent too many branches and crosses. It is worth noting that the formalism in Eq. (56) permits the introduction of higher-order self-interactions of lines in order to describe more complex constraints on the configurations of the discontinuities. In view of Eq. (56), the overall energy function to be minimized for solving the edge-preserving restoration problem is given, as usual in the adopted framework, by
E(f, l) = \|g - Hf\|^2 + U(f, l)   (57)
The minimization of E(f, l) with respect to both f and l is a very difficult task, which would require the adoption of expensive stochastic relaxation algorithms, such as SA with a Metropolis or Gibbs sampler. Moreover, the preliminary elimination of the line process, as in the GNC or in mean field annealing (Bilbro and Snyder, 1990; Bilbro et al., 1992), is impracticable because of the presence of interaction terms between lines. In Bedini, Gerace et al. (1995), we proposed, when only the line continuation constraint is present, to approximate E(f, l) (in the form of Eq. (57) with \gamma = \beta = \kappa = 0) by substituting the adjacent line elements with a step function of the intensity gradient. Considering, for example, the line element h_{i,j+1}, we set
h_{i,j+1} = \Theta(f_{i,j+1} - f_{i+1,j+1}) = \begin{cases} 0 & \text{if } |f_{i,j+1} - f_{i+1,j+1}| < \sqrt{\alpha/\lambda} \\ 1 & \text{otherwise} \end{cases}   (58)
In this way the analytical minimization of E(f, l) with respect to l is possible, and an energy function F(f) is obtained that implicitly refers to binary discontinuities. It should be noted that the approximation (58) can be extended to every line element in (56) and in other possible higher-order terms to be added to U(f, l), which thus directly produces an F(f) for implicitly treating any kind of constraint on the discontinuity field. The minimization of this F(f) was performed by constructing a family of ad hoc approximations to be minimized in turn according to the GNC strategy. In particular, we proposed constructing the approximations for the nonconvex F(f) by simply substituting the step function with a sequence of parametric functions converging to it; specifically, we adopted a family of sigmoidal functions of the gradient, with values in [0, 1] and depending on a temperature parameter T. Thus, for the generic horizontal line element h_{i,j} we set

h_{i,j} = \varphi_T(f_{i,j} - f_{i+1,j}) = \frac{1}{1 + \exp\{-[(f_{i,j} - f_{i+1,j})^2 - \theta^2]/T\}}   (59a)

and analogously for the generic vertical line element v_{i,j},

v_{i,j} = \varphi_T(f_{i,j} - f_{i,j+1}) = \frac{1}{1 + \exp\{-[(f_{i,j} - f_{i,j+1})^2 - \theta^2]/T\}}   (59b)
where \theta is the step threshold, assumed to be equal to \sqrt{\alpha/\lambda}. It is straightforward to verify that when T is lowered to zero, \varphi_T converges to the step function. We focused on the flexibility of this approach in effectively introducing self-interactions between lines in an implicit way, and we provided experimental results to show that significant improvements in the quality of the reconstructions can be achieved with respect to the reconstructions obtained by using a weak membrane energy as a stabilizer.
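For illustration, a minimal Python/NumPy sketch of the sigmoidal approximation of Eqs. (59a)-(59b) follows (our own illustration, not the authors' code); it returns continuous line fields that harden into the binary step of Eq. (58) as T is lowered to zero.

    import numpy as np

    def phi_T(t, theta, T):
        """Sigmoidal line function of Eqs. (59a)-(59b):
        phi_T(t) = 1 / (1 + exp(-[t^2 - theta^2] / T)).
        As T -> 0 it converges to the step of Eq. (58):
        0 if |t| < theta, 1 otherwise."""
        return 1.0 / (1.0 + np.exp(-(t ** 2 - theta ** 2) / T))

    def implicit_lines(f, theta, T):
        """Implicit (continuous) horizontal and vertical line fields of the
        intensity image f, computed from its first differences."""
        h = phi_T(f[:-1, :] - f[1:, :], theta, T)   # horizontal lines
        v = phi_T(f[:, :-1] - f[:, 1:], theta, T)   # vertical lines
        return h, v

Evaluating phi_T over a range of T values also makes the temperature-dependent effective threshold discussed in the next subsection easy to inspect numerically.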
B. Coarse-to-Fine Detection of Edges by Means of GNC

In our previous study on the performance of the sigmoidal stabilizer, we left open the crucial issue of the choice of the best parameters for a given problem. In this subsection we resume this issue and investigate the robustness of this stabilizer against a rough choice of \lambda and \alpha. Thus in the following discussion we focus on the properties of the stabilizer with respect to the values of the temperature T. Let us consider, for simplicity's sake, the case in which \varepsilon = \gamma = \beta = \kappa = 0. In relation to Eqs. (59a) and (59b), from Eqs. (56) and (57) we
obtain

F_T(f) = \|g - Hf\|^2 + \sum_{i,j} \frac{\lambda (f_{i,j} - f_{i+1,j})^2 \exp\{-[(f_{i,j} - f_{i+1,j})^2 - \theta^2]/T\} + \alpha}{1 + \exp\{-[(f_{i,j} - f_{i+1,j})^2 - \theta^2]/T\}} + \sum_{i,j} \frac{\lambda (f_{i,j} - f_{i,j+1})^2 \exp\{-[(f_{i,j} - f_{i,j+1})^2 - \theta^2]/T\} + \alpha}{1 + \exp\{-[(f_{i,j} - f_{i,j+1})^2 - \theta^2]/T\}}   (60)
The local stabilizer is given by
\phi_T(t) = \frac{\lambda t^2 \exp\{-(t^2 - \theta^2)/T\} + \alpha}{1 + \exp\{-(t^2 - \theta^2)/T\}} \qquad (61)
where t is the local intensity gradient. This function is even, takes the value α/(1 + exp(θ²/T)) at t = 0, and asymptotically converges to α as t goes to infinity. It is easy to see that, when T goes to infinity, φ_T(t) converges to a parabola,

\phi_\infty(t) = \frac{\lambda t^2 + \alpha}{2} \qquad (62)
that is, the classical quadratic stabilizer of standard regularization, apart from a constant term. In this limit case, the sigmoidal stabilizer loses its edge-preserving features. Conversely, in the limit of T to zero, φ_T(t) converges to a truncated parabola,

\phi_0(t) =
\begin{cases}
\lambda t^2 & \text{if } |t| < \theta \\
\alpha & \text{otherwise}
\end{cases} \qquad (63)
whose edge-preserving properties were first highlighted in Blake and Zisserman (1987) for the case θ = √(α/λ). With this stabilizer, all discontinuities whose amplitude is higher than θ can be reconstructed. Let us now consider the behavior of φ_T(t) for a finite value of T. In Figure 2 a typical plot of φ_T(t) versus t > 0 is shown. It is immediately clear that, differently from most stabilizers proposed in the literature, φ_T(t) is not monotonically increasing, but attains a maximum at a certain point t*. Thus, it is reasonable to expect that the stabilizer performs a smoothing for t < t*, while it favors, or at least does not excessively penalize, the creation of edges for t > t*. The maximizer t* can then be considered the effective threshold of the stabilizer. The interesting point is to see whether and to what extent this maximizer depends on T. Unfortunately, setting the first derivative of Eq. (61) to zero gives a transcendental equation, which cannot be solved in a straightforward way. However, by plotting φ_T(t) for a large set of values of T, we have found that both t* and φ_T(t*) always increase as T increases. Moreover, the variations that they undergo are very large. As an example, Figure 3 shows some plots of Eq. (61) for T ranging from 100 to 2000. This large variability is also theoretically confirmed: from the limit cases of Eqs. (62) and (63) it is clear that the maximizer ranges from θ, here called the nominal threshold, to infinity.
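Since the maximizer cannot be obtained in closed form, it is easy to locate numerically; the sketch below (our illustration, with arbitrarily chosen parameter values) does so with a dense grid search on Eq. (61).

```python
import numpy as np

def phi_T_stab(t, lam, alpha, theta, T):
    """Local sigmoidal stabilizer of Eq. (61)."""
    e = np.exp(-((t ** 2 - theta ** 2) / T))
    return (lam * t ** 2 * e + alpha) / (1.0 + e)

def effective_threshold(lam, alpha, theta, T, t_max=200.0, n=100001):
    """Grid-search the maximizer t* of Eq. (61), the effective threshold."""
    t = np.linspace(0.0, t_max, n)
    return t[np.argmax(phi_T_stab(t, lam, alpha, theta, T))]

for T in (100.0, 500.0, 1000.0, 2000.0):
    print(T, effective_threshold(lam=1.0, alpha=100.0, theta=10.0, T=T))
# both t* and phi_T(t*) grow with T, as observed in the text
```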
FIGURE 2. Typical plot of the function φ_T(t).
A close look at Figure 3 also highlights some extra considerations. For large values of T, the stabilizer performs in practice like a parabola, whose smoothing capacity is governed by the value of λ. In particular, this holds for all values of T greater than the value for which the effective threshold is higher than the maximum gradient in the degraded image. Moreover, since the value of the maximum increases with T and the stabilizer is asymptotically convergent, large gradients are promoted at high temperatures, whereas at low temperatures the penalization for large gradients is constant. These behaviors of the stabilizer can be exploited for simultaneous edge-preserving noise removal and adaptive tracking of the image jumps. For this purpose, according to the GNC strategy, the reconstruction should be performed at a decreasing sequence of temperatures. Noise is likely to be removed at high temperatures, when the threshold is higher than the noise itself, whereas image jumps with high amplitude are strongly promoted. As the temperature decreases, the threshold also decreases, which should allow for the reconstruction of edges of smaller amplitude without recovery of the edges due to the residual noise.
FIGURE 3. Behavior of the function φ_T(t) versus T.
Indeed, the promotion of intensity discontinuities over the threshold loses its strength as the temperature, and then the threshold itself, diminishes. By virtue of the natural hysteresis properties of edge-preserving regularization (Blake and Zisserman, 1987), it may happen that under these conditions well-behaved discontinuities, due to connected true image edges and object boundaries, will be favored against sparse jumps due to peaks of noise. This hypothesis is confirmed by the results provided in the following sections. Moreover, the flexibility of the sigmoidal stabilizer in introducing self-interactions between lines may be exploited to strengthen hysteresis by means of the incorporation of a line continuity constraint. In this way, the usual need for fine-tuning the parameters λ and α is also greatly relaxed. Indeed, it is sufficient to choose a λ large enough to smooth off the noise at the first iteration, and an α such that the nominal threshold is lower than the lowest jump to be recovered in the image. In particular, a threshold lower than the noise level can also be chosen, and the regularization parameter is required only not to completely destroy the largest image structures. Thus, only rough estimates of these parameters are required. This is particularly useful since, in practical cases, no precise knowledge of the noise level and the image features is available. Conversely, when stabilizers with fixed thresholds are used, it is extremely difficult, and often impossible, to reconcile the two needs of recovering the details in the image and removing all the noise. In the
best cases, expensive trial-and-error procedures must be adopted. Otherwise, a compromise between the noise left and the details lost in the reconstruction must be accepted. Also, the choice of the initial temperature is not particularly important. Indeed, any value large enough to make the initial effective threshold higher than the highest noise peak is sufficient. In practice, the initial temperature can be chosen arbitrarily high, in that the sigmoidal stabilizer cannot smooth more than the standard regularization stabilizer associated with the chosen λ. In the results section we provide evidence of the validity of the aforementioned considerations.

IX. EXPERIMENTAL RESULTS: THE BLIND RESTORATION SUBCASE

In this section we analyze the subproblem of blind restoration (i.e., when it is assumed that the MRF model hyperparameters are known in advance). To do this we refer to the fully data-driven restoration procedure proposed in Section VII and restrict it by disabling the hyperparameter estimation step. Our aim is to qualitatively and quantitatively analyze the performance of joint blur identification and image restoration with respect to the image characteristics and to the image model adopted. In other words, we want to experimentally verify the efficiency of edge-preserving image models for a satisfactory recovery of both the image and the blur.

A. Blur Identification with Known Ideal Image

An initial set of experiments was aimed at quantitatively analyzing the identifiability of the blur versus increasing amounts of noise on the data when the resolution of the original image (that is, the number of edges) increases. Indeed, as previously highlighted, we expected that most of the information for blur identification would be located across the image edges. For this purpose, we assumed that the original image was known exactly and used the blind restoration algorithm with the image reconstruction process disabled, in order to test the performance of the blur identification process in the most favorable case. We recall that, under these conditions and according to our method, the blur identification process amounts to a single conjugate gradient algorithm whose dimension is given by the size assumed for the unknown blur mask. We have found that in most papers on blind restoration that employ a smoothness constraint for the blur, results are shown only for uniform blurs, for which this constraint is naturally well suited. However, we have experimentally observed that, because of the generally very small size of the blur masks, the use of a smoothness constraint of comparable weight with respect to the data consistency term forces the solution toward uniform blurs, no matter what
A. Blur Identification with Known Ideal Image An initial set of experiments was aimed at quantitatively analyzing the identifiability of the blur versus increasing amounts of noise on the data when the resolution of the original imagemthat is, the number of edgesmincreases. Indeed, as previously highlighted, we expected that most of the information for blur identification would be located across the image edges. For this purpose, we assumed that the original image was known exactly and used the blind restoration algorithm with the image reconstruction process disabled, in order to test the performance of the blur identification process in the most favorable case. We recall that, under these conditions and according to our method, the blur identification process amounts to a single conjugate gradient algorithm whose dimension is given by the size assumed for the unknown blur mask. We have found that in most papers on blind restoration that employs a smoothness constraint for the blur, results are shown only for uniform blurs, for which this constraint is naturally well suited. However, we have experimentally observed that because of the generally very small size of the blur masks, the use of a smoothness constraint of comparable weight with respect to the data consistency term forces the solution toward uniform blurs, no matter what
DISCONTINUITY-ADAPTIVE VISUAL RECONSTRUCTION
239
the true blur is. For this reason, we preferred to consider a nonuniform blur, characterized by the following quasi-Gaussian mask of size 3 x 3: 0.058824 0.117647 0.058824
0.117647 0.294115 0.117647
0.058824 0.117647 0.058824
(64)
The images considered consist of four checkerboards with 4, 16, 64, and 256 squares, respectively. Each was degraded by convolution with the 3 × 3 mask of Eq. (64) and then by addition of white Gaussian noise of zero mean and standard deviation σ = 5, 10, 20, respectively, as shown in Figure 4. These amounts of noise give values of the signal-to-noise ratio (SNR) of about 28.8, 22.8, and 16.8 dB, respectively, for the four checkerboards. For the sake of generality, we started the process from a blur mask in the form of a Dirac function, which represents the most noninformative starting point.
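The degradation model used in these tests is easy to reproduce. The following Python sketch (our illustration; the gray levels, the boundary handling, and the SNR convention are our assumptions) builds a checkerboard, applies the mask of Eq. (64), and adds the noise.

```python
import numpy as np
from scipy.signal import convolve2d

# quasi-Gaussian 3 x 3 mask of Eq. (64)
mask = np.array([[0.058824, 0.117647, 0.058824],
                 [0.117647, 0.294115, 0.117647],
                 [0.058824, 0.117647, 0.058824]])

def checkerboard(size=64, squares=16, low=50.0, high=200.0):
    """Checkerboard test image with the given number of squares."""
    side = int(np.sqrt(squares))
    idx = np.indices((size, size)) // (size // side)
    return np.where((idx[0] + idx[1]) % 2 == 0, low, high)

def degrade(f, mask, sigma, seed=0):
    """Blur by convolution with the mask, then add white Gaussian noise."""
    rng = np.random.default_rng(seed)
    blurred = convolve2d(f, mask, mode='same', boundary='symm')
    return blurred + rng.normal(0.0, sigma, f.shape)

f = checkerboard(64, squares=16)
g = degrade(f, mask, sigma=10.0)
snr_db = 10 * np.log10(np.var(f) / 10.0 ** 2)  # one common SNR convention
```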
FIGURE 4. Synthetic images used to test the identifiability of the blur of Eq. (64) versus the amount of noise and the number of edges in the true image. By row: images with the same amount of degradation and increasing resolution. By column: images of the same resolution and increasing amount of noise.
TABLE 1
RMSE^a BETWEEN THE TRUE MASK AND THE ESTIMATED MASK FOR THE DEGRADED IMAGES OF FIGURE 4

                              No. of squares
Image                 4         16        64        256
RMSE (σ = 5)        0.022     0.0085    0.0070    0.0020
RMSE (σ = 10)       0.037     0.018     0.014     0.0035
RMSE (σ = 20)       0.059     0.031     0.027     0.0080

^a RMSE, root-mean-square error.
Although the considered mask has all positive elements and is circularly symmetric, we did not enforce these constraints, since we assumed that in real cases this information is not available. Instead, we enforced only the sum of the mask elements to be one, by simply normalizing the values at each estimation step. The results obtained are summarized in Table 1, where for each degraded image we show the root-mean-square error (RMSE) between the estimated mask and the ideal mask. Two major observations can be made by analyzing Table 1. First, for each kind of image, the RMSE increases as the amount of noise increases, because the blur estimation process, while minimizing the data consistency term, tends to fit the noise. Second, and most interesting, for each value of σ, the RMSE decreases as the number of edges in the checkerboard increases. For example, the blur coefficients computed for the case of the 256-squares checkerboard and σ = 5 are

0.057355  0.117542  0.060106
0.116954  0.294619  0.114327
0.061778  0.114551  0.062768
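As a sketch of how this identification step can be realized when the ideal image is known, the following Python code (our illustration, not the authors' implementation) fits the mask coefficients by minimizing the data consistency term with a conjugate gradient optimizer and normalizes the estimate to unit sum; normalizing only once at the end, rather than at each step, is a simplification of ours.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.signal import convolve2d

def identify_blur(f_true, g, size=3):
    """Least-squares fit of the blur mask h in g ~ f_true * h."""
    def data_term(h_flat):
        h = h_flat.reshape(size, size)
        r = g - convolve2d(f_true, h, mode='same', boundary='symm')
        return np.sum(r ** 2)

    h0 = np.zeros(size * size)        # Dirac mask: noninformative start
    h0[(size * size) // 2] = 1.0
    res = minimize(data_term, h0, method='CG')
    h = res.x.reshape(size, size)
    return h / h.sum()                # enforce unit sum

def mask_rmse(h_est, h_true):
    """Root-mean-square error between estimated and true masks."""
    return np.sqrt(np.mean((h_est - h_true) ** 2))
```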
For each trial, we also computed the value at convergence of the data consistency term (i.e., the energy to be minimized). In all cases, we found that this value is slightly lower than the energy of the total noise (Gaussian noise plus quantization error) which affects the data. This finding confirms the noise-fitting features of the blur identification process and automatically provides the value, although slightly underestimated, of the noise variance, according to Eq. (26b). Furthermore, this means that the amount of noise on the data need not be known a priori for blur identification purposes. For comparative purposes, we also let the procedure run by forcing both the positivity and the circular symmetry constraints for the blur mask. We found that the positivity constraint does not substantially affect the results. Indeed, whereas at the first iterations of the conjugate gradient some blur coefficients could be negative, at convergence the coefficients were always all positive. Moreover, we found that by forcing the symmetry we could obtain only slight improvements, with an insignificant reduction in the computational complexity, but at the price of a loss of generality.
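The variance estimate this supports can be written in one line. The sketch below (our reading of the role played by Eq. (26b), with a function name of our own) takes the mean squared residual of the data consistency term at convergence.

```python
import numpy as np
from scipy.signal import convolve2d

def estimate_noise_variance(g, f_est, h_est):
    """Mean squared residual at convergence; as noted in the text,
    this slightly underestimates the true noise variance."""
    r = g - convolve2d(f_est, h_est, mode='same', boundary='symm')
    return np.mean(r ** 2)
```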
As a final experiment, we considered the case in which the size of the underlying blur mask is not known in advance. In all trials, assuming a larger size for the starting Dirac function, we found that the process converges to an enlarged blur mask which, however, has near-zero elements outside the support of the true mask and values which give a similar RMSE inside.
B. Joint Blur Identification and Image Restoration

In subsequent experiments we tested the blind restoration subprocedure to identify the blur and estimate the noise under the realistic assumption that we do not know the original image, which is jointly estimated. These experiments were performed on artificially degraded synthetic and real images, and on real degraded images whose blur and noise are unknown. For comparison, we considered as synthetic images the checkerboard sequence of Figure 4. Since the original images are piecewise constant, we adopted the following isotropic, homogeneous prior energy model U(f, l | λ, α, ε):
U(f, l \,|\, \lambda, \alpha, \varepsilon) = \lambda \sum_{i,j} (f_{i,j} - f_{i+1,j})^2 (1 - h_{i,j}) + \lambda \sum_{i,j} (f_{i,j} - f_{i,j+1})^2 (1 - v_{i,j})
 + \alpha \sum_{i,j} h_{i,j} + \alpha \sum_{i,j} v_{i,j}
 - \varepsilon\alpha \sum_{i,j} h_{i,j} h_{i,j+1} - \varepsilon\alpha \sum_{i,j} v_{i,j} v_{i+1,j} \qquad (65)
This model accounts for a useful line continuation constraint. Because we disabled the hyperparameter estimation step, we kept the hyperparameters fixed throughout all the iterations, at given values based on guesses regarding the characteristics of the original image and the amount of noise. Moreover, we neglected the dependence of the posterior energy on the noise variance. This amounts to implicitly considering the noise variance as incorporated into the hyperparameter vector w, so that there is no need for its initialization. Obviously, this modification affects neither the estimation of the image field nor the estimation of the blur mask. Furthermore, the estimation of the noise variance according to step 4b (in Section VII) can be accomplished only once, at convergence. As an initial value of temperature, we adopted T₀ = 500, and the chosen law for decreasing it was exponential, given by the formula T_{k+1} = 0.85 T_k. According to the mixed-annealing algorithm, at each temperature we performed 10 updates of the image field, while the chosen length of the Markov chains for updating the line elements was 20(2nm − m − n), corresponding to 20 visitations of each line element in turn. Again, we started the process with a Dirac function and the degraded image itself as the initial estimates
of the blur mask and the restored image, respectively. The initial configuration of the line process was set to zero everywhere. It is worth noting again that, in the absence of a good image model, this starting point constitutes a trivial solution to the blind restoration problem, as it makes the data consistency term zero. Convergence to the final values of the degradation parameters and stabilization of the reconstructions were reached in all cases in less than 30 iterations of the whole procedure. With regard to noise standard deviation σ = 5, by adopting for the fixed hyperparameters the values λ = 2, α = 180, and ε = 0.7, we obtained almost perfect reconstructions of the image field, as well as blur and noise estimates which were only slightly worse than those obtained for the ideal case in which the original image is exactly known, reported in Table 1 of the previous subsection. The high value we chose for the parameter λ allows for a good flattening of the constant patches in the images, while the threshold, which is lower than the true jumps and higher than the noise peaks, allows for excellent noise removal and detection of the true edges. This is confirmed by the image RMSEs, which were confined in all cases to around one. More interesting observations can be made by analyzing the case of noise standard deviation σ = 10. The results, shown in Figure 5 and Table 2, were obtained by adopting the same threshold used in the previous experiment, but with a lower (if compared with the amount of noise) value for λ. Comparing Table 2 with Table 1, we see that, as expected, the blur RMSEs are always slightly greater than those obtained for the ideal case in which the original image is exactly known, except for the case of the 64-squares checkerboard, for which we obtained a slightly lower value. Moreover, the estimates of
FIGURE 5. Results of the blind reconstruction process on synthetic images, σ = 10. Top: degraded images. Bottom: reconstructed images (λ = 3, α = 270, and ε = 0.7).
TABLE 2
RESULTS OF PARAMETER ESTIMATION FOR THE IMAGES OF FIGURE 5

                              No. of squares
Image                 4         16        64        256
Blur RMSE           0.080     0.035     0.0114    0.0156
Image RMSE          2.78      3.75      7.50      11.97
Estimated σ²        88.77     88.94     93.88     97.90
both the blur mask and the noise variance improve from the 4-squares checkerboard to the 64-squares checkerboard. This finding once again confirms that, when the image model is suitable for a correct location of the intensity edges during reconstruction, the blur estimate improves when the number of edges is higher. Nevertheless, while the noise variance estimate further improves for the 256-squares checkerboard, the blur RMSE does not. This effect is certainly related to a modeling defect due to the concurrence of the small scale of the image and the strong noise. In other words, while the chosen hyperparameters are good for modeling the large-scale images of the 4-, 16-, and 64-squares checkerboards, the value of λ is too high for the 256-squares checkerboard, which exhibits very fine details. Under these conditions, the strong noise, with peaks of amplitude possibly larger than the intensity discontinuity threshold, prevents the model from completely removing the peaks. The excessive propagation of the smoothness, together with the persistence of noise peaks, leads to an incorrect detection of the edges, with consequent incorrect blur identification. Correspondingly, the image RMSE deteriorates as the image scale diminishes. This phenomenon is in accordance with the qualitative inspection of the reconstructions in Figure 5, which, however, appear to be satisfactory. Although generally worse, the results we obtained for noise standard deviation σ = 20 exhibit a qualitative and quantitative behavior similar to those reported for σ = 10. The original 256 × 256 real image considered for artificial degradation is shown in Figure 6. Because of the larger size of this image with respect to that of the synthetic images, we considered a 5 × 5 quasi-Gaussian blur mask characterized by the following coefficients:

0.000000  0.019231  0.019231  0.019231  0.000000
0.019231  0.067308  0.086538  0.067308  0.019231
0.019231  0.086538  0.153844  0.086538  0.019231
0.019231  0.067308  0.086538  0.067308  0.019231
0.000000  0.019231  0.019231  0.019231  0.000000
                                                      (66)
FIGURE 6. Original 256 × 256 real image.
Figures 7a and 7b show the corresponding degraded images, obtained by convolution with the blur mask of Eq. (66) plus addition of Gaussian noise of zero mean and standard deviations σ = 5 and σ = 10, respectively. For this kind of image, which is well contrasted and can therefore be considered roughly piecewise constant, we adopted the same prior energy model as in Eq. (65). Because of the small scale of some of the fine details present in the original image, we kept the hyperparameters fixed throughout all the iterations, at the values λ = 0.7, α = 200, and ε = 0.5 when σ = 5, and λ = 1.4, α = 400, and ε = 0.5 when σ = 10. Figures 7c and 7d show the final reconstructions of the intensity images in the two noise cases. For the reconstruction in Figure 7c, the RMSE between the exact and estimated blur masks was 0.0041, the RMSE between the original and estimated images was 14.31, and the, as usual underestimated, noise variance was 20.45. The final blur mask was

0.000000  0.020951  0.010904  0.019876  0.000000
0.021556  0.069378  0.087740  0.070003  0.016108
0.021902  0.085495  0.155618  0.082609  0.018069
0.014030  0.067903  0.084247  0.077619  0.013287
0.009683  0.018676  0.019970  0.014374  0.000000

For the reconstruction in Figure 7d, the RMSE between the exact and estimated blur masks was 0.0062, the RMSE between the original and estimated images was 14.97, and the noise variance was 92.15. The final blur mask was
0.000000  0.018580  0.000000  0.028170  0.000000
0.018733  0.080092  0.091118  0.064063  0.016741
0.028819  0.080877  0.154710  0.081870  0.017679
0.013820  0.067969  0.089423  0.071008  0.015452
0.003922  0.024714  0.011525  0.020718  0.000000
As expected from a physical point of view, and as already confirmed by the experiments on synthetic images, the performance of the blind restoration
FIGURE 7. Blind restoration of a real 256 × 256 image artificially degraded. (a) Image degraded by convolution with the blur mask of Eq. (66) plus addition of Gaussian noise, σ = 5. (b) Image degraded by convolution with the blur mask of Eq. (66) plus addition of Gaussian noise, σ = 10. (c) Blind reconstruction of Figure 7a (λ = 0.7, α = 200, ε = 0.5). (d) Blind reconstruction of Figure 7b (λ = 1.4, α = 400, ε = 0.5).
FIGURE 8. Blind restoration of the emission of a Euglena gracilis photoreceptor: (a) Digitized image, of size 96 × 160, of the fluorescent photoreceptor under the 365-nm excitation light. In this case the photoreceptor is barely detectable. (b) Blind restoration result obtained by setting λ = 2, α = 100, ε = 0.5, β = 0.5, and γ = 0.5 and assuming a 7 × 7 size for the unknown blur mask. This reconstruction shows a photoreceptor with a detectable emission.

procedure deteriorates as the noise level increases. However, provided that the image model adopted is suitable for the considered image, the reconstructions remain satisfactory. The subsequent experiments involve the recovery of two real images, as shown in Figures 8a and 9a, respectively. The first is a 96 × 160 image drawn
FIGURE 9. Blind restoration of the anatomic characteristics of an isolated Euglena gracilis photoreceptor: (a) Highly defocused and noisy 200 × 200 image, badly acquired by means of a transmission electron microscope. (b) Result of the blind restoration obtained by setting λ = 1, α = 200, ε = 0.5, β = 0.1, and γ = 0.5 and assuming a 7 × 7 size for the unknown blur mask. This reconstruction shows a photoreceptor with very sharply defined anatomic structures.
from the sequence of digitized fluorescence images showing the variation in the emission of the Euglena gracilis photoreceptor under a 365-nm excitation light (Barsanti et al., 1997). In this case, the emission of the photoreceptor is almost undetectable, since the photocycle is just beginning. The second image is a 200 × 200 highly defocused and noisy image of an isolated Euglena gracilis photoreceptor, poorly acquired by means of a transmission electron microscope. Although we know that the theoretical PSF of a microscope can be expressed in the form of an Airy diffraction pattern, in practical situations the true blur mask depends on a number of factors, including the light level. Thus, in this specific case we did not know the exact size of the masks or the values of the blur coefficients. Assuming a 7 × 7 size to be sufficient for both images, we used the blind restoration procedure already tested in the synthetic case. For these images, we found that a model which addresses various constraints on the lines is necessary for obtaining satisfactory reconstructions. We thus augmented the prior of Eq. (65) with the following terms:
U(l) = \gamma\alpha \sum_{i,j} h_{i,j} h_{i+1,j} + \gamma\alpha \sum_{i,j} v_{i,j} v_{i,j+1}
 - \beta\alpha \sum_{i,j} h_{i,j} v_{i,j} - \beta\alpha \sum_{i,j} h_{i,j} v_{i+1,j}
 - \beta\alpha \sum_{i,j} v_{i,j} h_{i,j+1} - \beta\alpha \sum_{i,j} v_{i+1,j} h_{i,j+1} \qquad (67)
which encourage line turns and penalize double lines. Figure 8b shows the result of the restoration of the image in Figure 8a, obtained by setting λ = 2, α = 100, ε = 0.5, β = 0.5, and γ = 0.5. Figure 9b shows the result of the restoration of the image in Figure 9a, obtained by setting λ = 1, α = 200, ε = 0.5, β = 0.1, and γ = 0.5. In both cases the blind restoration procedure produces good-quality images, that is, a photoreceptor with a detectable emission (Fig. 8b) and a photoreceptor with very sharply defined anatomic structures (Fig. 9b) (Bedini, Tonazzini, and Gualtieri, 2000).
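In the same array convention as the earlier energy sketch, these extra terms can be written as follows (our illustration; the slicings that align neighboring elements reflect our reading of Eq. (67)).

```python
import numpy as np

def line_interaction_energy(h, v, alpha, beta, gamma):
    """Extra interaction terms of Eq. (67): the positive gamma terms
    penalize double parallel lines, the negative beta terms encourage
    turns between horizontal and vertical line elements.
    Assumed shapes: h is (m-1, n), v is (m, n-1)."""
    u = gamma * alpha * np.sum(h[:-1, :] * h[1:, :])    # h_{i,j} h_{i+1,j}
    u += gamma * alpha * np.sum(v[:, :-1] * v[:, 1:])   # v_{i,j} v_{i,j+1}
    hc, vc = h[:, :-1], v[:-1, :]                       # common support
    u -= beta * alpha * np.sum(hc * vc)                 # h_{i,j} v_{i,j}
    u -= beta * alpha * np.sum(hc * v[1:, :])           # h_{i,j} v_{i+1,j}
    u -= beta * alpha * np.sum(vc * h[:, 1:])           # v_{i,j} h_{i,j+1}
    u -= beta * alpha * np.sum(v[1:, :] * h[:, 1:])     # v_{i+1,j} h_{i,j+1}
    return u
```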
X. EXPERIMENTAL RESULTS: THE UNSUPERVISED RESTORATION SUBCASE
A. ML Hyperparameter Estimation

In this subsection we analyze the performance of the ML hyperparameter estimation methods proposed in Section VI. To this purpose, we implemented the procedure described in Section VII with the degradation parameter estimation step (step 4) disabled. We further simplified the method by considering the degradation to be only the addition of uncorrelated Gaussian noise, of zero mean and various standard deviations, to the original image, and by considering the image model to be the basic weak membrane. Thus the vector of the Gibbs
parameters w reduces to the parameters λ and α alone. We carried out several experiments, using as the hyperparameter updating rules (step 5) both the rule derived with the saddle point approximation and the one derived by using MCMC techniques. Since we discovered that the two updating rules gave substantially equivalent results, we show only the results obtained by exploiting the saddle point approximation. It is noteworthy to recall that, at least in principle, the procedure does not require knowledge of the noise variance value. Thus, we could adopt the simplification of neglecting the dependence of the posterior energy on it. Indeed, this parameter does not affect the result of the estimation of the image field, and it does not enter in the hyperparameter estimation step. Nevertheless, this modification amounts to implicitly considering the noise variance as incorporated into the hyperparameter vector w, so that at the end of the procedure the obtained hyperparameters have to be divided by 2σ², where σ is either the a priori known noise standard deviation or the one estimated by computing the equation in step 4b for the final value of the image intensity. However, in our experiments, we assumed that we knew the noise variance and adopted λ = 2/2σ² and α = 5000/2σ² as starting values for the parameters, which give an initial threshold √(α/λ) = 50. The initial temperature value was T₀ = 1500, and the law for decreasing it was chosen to be exponential, according to T_{k+1} = 0.9 T_k. At each temperature we performed L = 10 updates of the image field, according to steps 2a and 2b of the algorithm, while the length of the Markov chains for updating the line elements in step 2b was chosen to be (2nm − m − n), corresponding to a single visitation of each line element in turn. Indeed, a single complete update is sufficient because, owing to the independence of the line elements, the Gibbs sampler is memoryless. The single-step update of each hyperparameter (step 5, first version) was executed by decreasing its current value by a variable percentage of the gradient of the negative conditional log-prior, where the percentage value η is computed as the ratio between the gradient itself and the clamped expectation. In all cases, convergence to the final values and stabilization of the reconstruction were reached in less than 30 iterations of the whole procedure, which corresponded to a final temperature value of around 70. The first synthetic image we considered was a piecewise constant Mondrian image (Fig. 10a) of size 64 × 64 over 256 gray levels, artificially degraded by adding Gaussian noise with three standard deviation values (i.e., σ = 10, 20, 30, respectively). The results in the three noisy cases are shown in Figures 10 through 12, respectively, while a typical plot of the behavior of the parameters versus the number of iterations is shown in Figure 13. For comparison, we also computed for each degraded image the reconstruction corresponding to the initial values of the parameters. These reconstructions were obtained by running the mixed annealing alone, under the same conditions of initial temperature, annealing schedule, and Markov chain length.
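To fix ideas on the annealing setup just described, here is a compact Python skeleton of mixed annealing for the weak membrane in the noise-only case (our illustration). Only the schedule (T₀ = 1500, T_{k+1} = 0.9 T_k, L = 10 image updates, one sweep of the independent line elements per temperature) follows the text; the gradient update of the image field and its step size are our choices.

```python
import numpy as np
from scipy.special import expit

def mixed_annealing(g, lam, alpha, T0=1500.0, decay=0.9,
                    n_outer=30, L=10, step=0.1, seed=0):
    """Alternate deterministic updates of f with Gibbs sampling of the
    independent line elements, while lowering the temperature."""
    rng = np.random.default_rng(seed)
    f = g.astype(float).copy()
    h = np.zeros((f.shape[0] - 1, f.shape[1]))   # breaks f[i,j]-f[i+1,j]
    v = np.zeros((f.shape[0], f.shape[1] - 1))   # breaks f[i,j]-f[i,j+1]
    T = T0
    for _ in range(n_outer):
        for _ in range(L):
            grad = 2.0 * (f - g)                 # data term, noise only
            dv = f[:-1, :] - f[1:, :]
            dh = f[:, :-1] - f[:, 1:]
            grad[:-1, :] += 2.0 * lam * dv * (1.0 - h)
            grad[1:, :] -= 2.0 * lam * dv * (1.0 - h)
            grad[:, :-1] += 2.0 * lam * dh * (1.0 - v)
            grad[:, 1:] -= 2.0 * lam * dh * (1.0 - v)
            f -= step * grad
        # one Gibbs sweep: P(l=1) = 1 / (1 + exp((alpha - lam*d^2)/T))
        dv = f[:-1, :] - f[1:, :]
        dh = f[:, :-1] - f[:, 1:]
        h = (rng.random(h.shape) < expit((lam * dv**2 - alpha) / T)) * 1.0
        v = (rng.random(v.shape) < expit((lam * dh**2 - alpha) / T)) * 1.0
        T *= decay
    return f, h, v
```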
FIGURE 10. Synthetic Mondrian image of size 64 × 64: (a) Piecewise constant original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 10). (c) MAP reconstruction with fixed parameters (λ = 0.01, α = 25). (d) Related edge map. (e) MAP-ML reconstruction (initial parameters, λ = 0.01, α = 25; final parameters, λ = 0.035, α = 1.075). (f) Related edge map.
FIGURE 11. Synthetic Mondrian image of size 64 × 64: (a) Piecewise constant original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 20). (c) MAP reconstruction with fixed parameters (λ = 0.0025, α = 6.25). (d) Related edge map. (e) MAP-ML reconstruction (initial parameters, λ = 0.0025, α = 6.25; final parameters, λ = 0.029, α = 0.7). (f) Related edge map.
FIGURE 12. Synthetic Mondrian image of size 64 × 64: (a) Piecewise constant original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 30). (c) MAP reconstruction with fixed parameters (λ = 0.0011, α = 2.78). (d) Related edge map. (e) MAP-ML reconstruction (initial parameters, λ = 0.0011, α = 2.78; final parameters, λ = 0.025, α = 0.65). (f) Related edge map.
FIGURE 13. Graphical behavior of the two parameters versus the number of iterations, for the experiment in Figure 10. Top: α parameter. Bottom: λ parameter.
The second synthetic image was a piecewise planar roof image (Fig. 14a), still of size 64 × 64 over 256 gray levels and artificially degraded by adding Gaussian noise with three standard deviation values (i.e., σ = 10, 20, 30, respectively). The results obtained for this image, including the reconstructions with hyperparameters kept fixed at the initial values, are shown in Figures 14 through 16, respectively. From a qualitative analysis of the results obtained for the Mondrian and roof images, the performance of our method can be considered quite satisfactory. In particular, whereas the reconstructions obtained with the initial values of the
FIGURE 14. Synthetic roof image of size 64 × 64: (a) Piecewise planar original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 10). (c) MAP reconstruction with fixed parameters (λ = 0.01, α = 25). (d) Related edge map. (e) MAP-ML reconstruction (initial parameters, λ = 0.01, α = 25; final parameters, λ = 0.04, α = 2.4). (f) Related edge map.
FIGURE 15. Synthetic roof image of size 64 × 64: (a) Piecewise planar original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 20). (c) MAP reconstruction with fixed parameters (λ = 0.0025, α = 6.25). (d) Related edge map. (e) MAP-ML reconstruction (initial parameters, λ = 0.0025, α = 6.25; final parameters, λ = 0.028, α = 1.37). (f) Related edge map.
FIGURE 16. Synthetic roof image of size 64 × 64: (a) Piecewise planar original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 30). (c) MAP reconstruction with fixed parameters (λ = 0.0011, α = 2.78). (d) Related edge map. (e) MAP-ML reconstruction (initial parameters, λ = 0.0011, α = 2.78; final parameters, λ = 0.015, α = 0.63). (f) Related edge map.
parameters are, at the same time, excessively rough inside the homogeneous areas and excessively smooth across the edges, because of a too-low smoothness parameter and a too-high threshold, the final reconstructions globally show a nice recovery of the smooth areas and an almost-perfect reconstruction of the related edges. Indeed, the edge curves are all connected, while the few defects that are observable in the cases of higher noise levels (σ = 20, 30) are essentially related to a lack of straight-line continuity and thinness. This finding is not surprising if we remember that the weak membrane model adopted herein for the images does not enforce any constraints on the line geometry. In fact, although the weak membrane model is generally sufficient to describe piecewise constant images, there are practical situations in which the particular noise realization, or the very low values of the smallest jump in the true image, does not allow any couple (λ, α) to give a perfect edge map. This is particularly evident in the planar image, where in some regions the values of the jumps go to zero (see Fig. 14a). In those regions the reconstruction is not correct. In these cases, the aid of extra constraints on the regularity of the line configurations, such as straight-line connectivity and double-line penalization, becomes necessary. Further improvements can be obtained by using more complex models including higher-order derivatives and/or implicitly addressed graded discontinuities (Bedini, Gerace et al., 1994b; D. Geman and Reynolds, 1992). Another effect due to poor image modelization is the slight flattening of the junction between the two planar patches with different slopes in the image of Figure 14a, which is well seen in the perspective view provided. This image feature is not a true intensity edge, in that it does not correspond to a discontinuity in the intensity values. Instead, it is mathematically defined as a discontinuity of the first derivative of the intensity. To preserve it, a local smoothness constraint should be recursively enforced on the line process itself. The final values of the parameters obtained in these six examined cases are reported in Table 3 for the Mondrian image and in Table 4 for the planar image. The study of these tables allows for a quantitative analysis of the convergence
TABLE 3
INITIAL AND FINAL VALUES OF THE HYPERPARAMETERS FOR THE EXPERIMENTS SHOWN IN FIGURES 10 THROUGH 12

Noise      Initial λ    Final λ    Initial α    Final α
σ = 10     0.01         0.035      25           1.075
σ = 20     0.0025       0.029      6.25         0.7
σ = 30     0.0011       0.025      2.78         0.65
TABLE 4
INITIAL AND FINAL VALUES OF THE HYPERPARAMETERS FOR THE EXPERIMENTS SHOWN IN FIGURES 14 THROUGH 16

Noise      Initial λ    Final λ    Initial α    Final α
σ = 10     0.01         0.04       25           2.4
σ = 20     0.0025       0.028      6.25         1.37
σ = 30     0.0011       0.015      2.78         0.63
properties of the method. According to the MAP-ML principle, the final reconstructions should be a sample of the MRF corresponding to the given prior and the final parameters. In this sense, we should expect the parameters to converge to the same values for a given original image, almost independently of the degradation level and of the initial values. Thus, for the λ parameter, starting with very different initial values, ranging from 0.0011 to 0.01 in the three noise-level cases, we obtained final values ranging from 0.015 to 0.04, for both images. Similarly, for the α parameter, starting with initial values ranging from 2.78 to 25, we obtained final values ranging from 0.63 to 2.4. However, in all our simulations we found that the obtained values slightly decrease when the values of the noise standard deviation increase. We are not able to decide if this finding is due to the random variability of the estimates or if it is intrinsically connected to the method. Nevertheless, it is to be noted that the average variability of the reconstructions computed inside the homogeneous zones (which is governed by the λ parameter) and the total number of edges (which is governed by the α parameter) remain almost constant for a given image in the three noise cases. As another example of a synthetic image, we considered the piecewise constant Mondrian image of size 128 × 128 shown in Figure 17a, degraded by uncorrelated Gaussian noise with σ = 25. At convergence we obtained the final values λ = 0.014 and α = 0.39. The very small value of the α parameter obtained in this case is related to the very small jump between the background and the square in the bottom-right corner of the image. From inspection of the reconstructed edge map, we can see that the boundaries are all complete and closed, although not well behaved. Indeed, it is possible to observe, besides many deviations from the straight line, some unwanted double parallel lines. This is a clear consequence of the low α value. In another paper (Bedini, Gerace et al., 1994a), we observed that for this image the recovery of a well-behaved edge map is possible only if the weak membrane model is augmented by a straight-line continuity constraint. In that case, the best values that we found by trial and error for the λ and α parameters were different from those obtained in this article, because of the presence of the line propagation factor ε.
FIGURE 17. Synthetic Mondrian image of size 128 × 128: (a) Piecewise constant original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 25). (c) MAP reconstruction with fixed parameters (λ = 0.0016, α = 4). (d) Related edge map. (e) MAP-ML reconstruction (initial parameters, λ = 0.0016, α = 4; final parameters, λ = 0.014, α = 0.39). (f) Related edge map.
FIGURE 18. Real image of size 128 × 128: (a) Image degraded by randomly removing 50% of the pixels and adding uncorrelated Gaussian noise (σ = 12). (b) MAP reconstruction with fixed parameters (λ = 0.007, α = 17.36). (c) MAP-ML reconstruction (initial parameters, λ = 0.007, α = 17.36; final parameters, λ = 0.046, α = 2.08). (d) Related edge map.
As real images we first used two 128 × 128 images of printed characters, which can also be roughly considered piecewise constant. The first image was degraded by randomly removing 50% of the pixels of the original image and adding Gaussian noise with σ = 12 (Fig. 18a). For display purposes, in Figure 18a the missing pixels are replaced by white pixels. The second image (Fig. 19a) was degraded by adding Gaussian noise with standard deviation σ = 25 (Fig. 19b). The results obtained for the two images are shown in Figures 18b through 18d and in Figures 19c and 19d, respectively. At convergence we reached the values λ = 0.046, α = 2.08 for the first image, and λ = 0.0078, α = 0.69 for the second image. The results obtained give rise to interesting observations. Let us consider first Figure 18. In this case, the degradation, besides the addition of
FIGURE 19. Real image of size 128 × 128: (a) Original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 25). (c) MAP-ML reconstruction (initial parameters, λ = 0.0016, α = 4; final parameters, λ = 0.0078, α = 0.69). (d) Related edge map.

noise, consists of randomly removing 50% of the pixels. Thus the restoration problem becomes a problem of noise removal plus interpolation of the missing data. The good quality of the reconstruction clearly indicates that the use of even very simple MRF models for the image allows for an excellent solution of the interpolation problem, even when the missing data are a significant percentage of the pixels. Thus the true problem is again noise removal. We can observe that, both from the point of view of the image model (piecewise constant) and with respect to the amount of degradation, this experiment is similar to that shown in Figure 10. Indeed, we obtained very similar values of the smoothness degree (0.035 and 0.046) for the two experiments. Instead, the α values are different in the two cases, higher in the second, for which the jump between the printed characters and the background is higher than the minimum
jump in the Mondrian image. The artifacts in the edge map of Figure 18d are clearly related to the presence of very fine details in the various characters. Finally, as shown in Figure 18b, in the reconstruction obtained with the initial parameter values the edges are poorly detected. This is in accordance with the high value of the threshold used. With respect to the second image of printed characters (Fig. 19), we can observe that the very low value of the final smoothness parameter (λ = 0.0078), as compared with the amount of noise (σ = 25), is due to the very fine scale of the characters. Despite the obvious presence of many spurious edges, the quality of the reconstruction shown in Figure 19c is excellent. We again emphasize that better results for the edge maps, with consequent further improvement of the intensity maps, could be obtained by using image models which incorporate constraints on the line configuration geometry. In a last experiment, we considered a 170 × 290 portion of a 512 × 512 image of "Lena" (Fig. 20a), which can be roughly considered piecewise smooth, degraded by the addition of Gaussian noise with standard deviation σ = 20 (see Fig. 20b). Although this image does not exactly fit the weak membrane model, which is more suitable for piecewise constant images than for piecewise smooth images, we obtained the satisfactory results shown in Figures 20e and 20f. The final values of the parameters, after 25 iterations, were λ = 0.0019 and α = 0.97. These values are well balanced, since they allow for sufficient removal of the noise and good detection of the most salient edges. In particular, the very low value of λ prevents the reconstruction from being too stylized, in accordance with the fact that the true image does not contain large constant patches and is slowly varying. The previous considerations as to how to further improve the edge map also hold in this case. Finally, the poor performance of the initial set of parameters is to be noted (Figs. 20c and 20d). All the computations were executed on the IBM RISC System/6000. For the 64 × 64 images the CPU time was about 30 s for 30 iterations; for the 128 × 128 images the times were around 120 s. We found that the number of iterations for all the experiments was approximately the same as that needed for a typical mixed-annealing procedure with known parameters. Thus the computational cost of the whole MAP-ML method can be considered equivalent (apart from the cheap step 5) to that of a simple mixed-annealing algorithm for MAP estimation, with the advantage that the optimal values of the parameters do not need to be known in advance.
B. Adaptive Edge Tracking

In this subsection we analyze the performance of the GNC-like algorithm described in Section VIII, when a sigmoidal stabilizer for the implicit treatment
of constrained lines is adopted. The robustness of this adaptive method against a rough choice of parameters, and its ability to gradually recover edges of decreasing width, were tested by considering images degraded by blur plus noise. To focus on the main aim of this article, we considered the case of noninteracting line processes, since in Bedini, Gerace et al. (1995) we showed the capability of the considered stabilizer to address discontinuity constraints and then to produce well-behaved edge maps. In all the experiments performed we compared the results obtained with those produced by the standard (i.e., nonadaptive) GNC algorithm. As already mentioned, this algorithm is based on polynomial approximations of the truncated parabola stabilizer, which thus results in a stabilizer with a fixed threshold. The same values of λ and α were used in the two cases. As regards our method, in all the trials the initial temperature was T = 10,000 and the chosen law for decreasing it was T_{k+1} = T_k/10. We let the temperature decrease to 0.0001, but in every case stabilization of the reconstructions was achieved for values of T lower than one. As a comparison, we ran the standard GNC for 10 iterations, by decreasing p by a factor of 2 at each iteration, starting with p = 1. It is worth noting that the cost of each iteration in the two cases is comparable. The value of the regularization parameter λ was chosen empirically on the basis of the noise level and the image scale. Conversely, the value of α was chosen almost arbitrarily, in such a way as to obtain a very low nominal threshold. In this case, we neglected the dependence of the energy function on the noise variance, although this was known in our synthetic experiments. As well as not affecting the image estimation, this assumption fits more realistic situations in which the amount of noise is unknown. Still, for comparison with the original GNC, θ was set equal to √(α/λ). The first image treated was a synthetic 64 × 64 step image (Fig. 21a), artificially degraded by convolution with a 7 × 7 uniform blur plus addition of uncorrelated, zero-mean Gaussian noise with standard deviation σ = 20 (Fig. 21b). Because of the large scale of the image and the high level of noise, we chose λ = 4 as the regularization parameter, while we set α = 1600, in such a way as to give a nominal threshold of 20, which is, on average, comparable to the noise. Figures 21c and 21d show the results obtained by running the standard GNC and our GNC with sigmoidal approximation. The RMSEs between the original and reconstructed images in the two cases were 16.06 and 3.08, respectively.
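For concreteness, a minimal Python sketch of the adaptive GNC loop used in these experiments is given below (our illustration): gradient descent on F_T(f) of Eq. (60) for the denoising-only case (the blur operator is omitted for brevity), with the temperature schedule stated above; the inner step size, iteration count, and numerical differentiation of the stabilizer are our assumptions.

```python
import numpy as np

def phi_T_stab(t, lam, alpha, theta, T):
    """Sigmoidal stabilizer of Eq. (61), written in a stable form:
    phi = lam*t^2*(1-s) + alpha*s, with s the logistic of Eq. (59)."""
    s = 1.0 / (1.0 + np.exp(np.clip(-((t**2 - theta**2) / T), -60, 60)))
    return lam * t**2 * (1.0 - s) + alpha * s

def adaptive_gnc(g, lam, alpha, T0=10000.0, T_min=1e-4, inner=50, step=0.05):
    """Minimize F_T(f) (Eq. (60), H = identity) at decreasing T."""
    theta = np.sqrt(alpha / lam)
    f = g.astype(float).copy()
    T, eps = T0, 1e-3
    while T >= T_min:
        for _ in range(inner):
            grad = 2.0 * (f - g)
            for d, sl_a, sl_b in (
                (f[:-1, :] - f[1:, :], np.s_[:-1, :], np.s_[1:, :]),
                (f[:, :-1] - f[:, 1:], np.s_[:, :-1], np.s_[:, 1:]),
            ):
                dphi = (phi_T_stab(d + eps, lam, alpha, theta, T)
                        - phi_T_stab(d - eps, lam, alpha, theta, T)) / (2*eps)
                grad[sl_a] += dphi
                grad[sl_b] -= dphi
            f -= step * grad
        T /= 10.0   # schedule: T_{k+1} = T_k / 10, down to T_min
    return f
```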
FIGURE 20. Real image of size 170 × 290: (a) Original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 20). (c) MAP reconstruction with fixed parameters (λ = 0.0025, α = 6.25). (d) Related edge map. (e) MAP-ML reconstruction (initial parameters, λ = 0.0025, α = 6.25; final parameters, λ = 0.0019, α = 0.97). (f) Related edge map.
FIGURE 21. Synthetic 64 × 64 step image: (a) Original image. (b) Image degraded by convolution with a 7 × 7 uniform blur plus addition of uncorrelated Gaussian noise (σ = 20). (c) Reconstruction with standard GNC (λ = 4, α = 1600). (d) Reconstruction with GNC and sigmoidal approximation (λ = 4, α = 1600).
The first image is noisy, and the edges are not correctly recovered, as might be expected given the very low value of the nominal threshold, which also corresponds to the effective threshold of the various polynomial approximations employed by the GNC. Instead, the second image is cleaner and well resolved, with the true intensity edges perfectly recovered. To better illustrate our method, Figure 22 shows the sequence of the partial results obtained at the various temperatures. The first two images are clean but very smooth, which indicates that processing at high temperatures has removed almost all the noise. As the temperature, and then the effective threshold, is reduced, the noise is not recovered any further, while the true intensity edges are gradually recovered. With regard to this image we also attempted a further low nominal threshold of 10, by setting λ = 6 and α = 600, and obtained the results shown in Figure 23, whose RMSEs are 25.08 and 6.37, respectively.
FIGURE 22. Sequence of the reconstructions of the degraded image in Figure 21b with the sigmoidal approximation, at various temperatures: (a) T = 10,000; (b) T = 100; (c) T = 10; (d) T = 1.
FIGURE 23. Reconstruction of the degraded image in Figure 21b with a nominal threshold of 10: (a) Standard GNC. (b) GNC with sigmoidal approximation.
FIGURE 24. Real image, 128 × 128, of printed characters: (a) Original image. (b) Image degraded by convolution with a 3 × 3 quasi-Gaussian mask (Eq. (64)) plus addition of uncorrelated Gaussian noise (σ = 5). (c) Reconstruction with standard GNC (λ = 1, α = 25). (d) Reconstruction with GNC and sigmoidal approximation (λ = 1, α = 25).
As initial real images we considered the already-used 128 × 128 image of printed characters (Fig. 24a), this time degraded by convolution with a 3 × 3 quasi-Gaussian mask plus addition of Gaussian, zero-mean noise with standard deviation σ = 5 (Fig. 24b). In the first experiment, we assumed λ = 1 and α = 25, which give a nominal threshold of 5. Figures 24c and 24d show the results we obtained by running the standard GNC and our GNC with sigmoidal approximation. The RMSEs obtained in the two cases are 25.56 and 13.19, respectively. Since this image has jumps of almost constant width, and the noise is modest, it should be possible to find a good reconstruction by also employing
FIGURE 25. Reconstruction of the degraded image in Figure 24b with λ = 0.1 and α = 90: (a) Standard GNC. (b) GNC with sigmoidal approximation.
a fixed threshold, provided that it is carefully selected. To do this, we performed a large set of experiments with standard GNC. By trial and error, we tried different couples of parameters in order to find the "best" ones. We found that a very good solution can be achieved for λ = 0.1 and α = 90. This solution, shown in Figure 25a, has an RMSE with respect to the original image of about 5.7. We then ran our GNC with the same values of the parameters and with both widely variable and slightly variable thresholds. When the threshold varies considerably, we obtain the slightly worse solution shown in Figure 25b, whose RMSE with respect to the original image is about 11.6. Nevertheless, if we let the threshold vary only slightly, by starting the process with a lower initial temperature, we get roughly the same results as those produced by the standard GNC. From the preceding experiments we can conclude that when the image has edges of constant width and the noise is moderate, so that a single optimal threshold can be found, a stabilizer with a widely variable threshold gives worse, although still acceptable, results. However, also in this case, our stabilizer with a variable threshold performs better than a stabilizer with a fixed threshold when the parameters are chosen arbitrarily. Thus, since the choice of the optimal threshold has a very high computational cost, in some practical applications it may be preferable to accept a little degradation in quality. More frequently, real images have edges of variable widths. Often, in the presence of even modest noise, a single optimal threshold cannot be found. In these cases, the main feature of the sigmoidal stabilizer, the adaptive tracking of intensity discontinuities of decreasing amplitude, turns out to be extremely
FIGURE 26. A 170 × 170 portion of the image "Lena": (a) Original image. (b) Image degraded by convolution with a 3 × 3 quasi-Gaussian mask (Eq. (64)) plus addition of uncorrelated Gaussian noise (σ = 5). (c) Reconstruction with standard GNC (λ = 2, α = 50). (d) Reconstruction with GNC and sigmoidal approximation (λ = 2, α = 50).
useful. Thus, in subsequent experiments, we considered a 170 × 170 portion of the image "Lena" (Fig. 26a), which is characterized by intensity jumps of very different widths. To this image we applied the same degradation already used for the printed characters (see Fig. 26b). We first set λ = 2 and α = 50, which still give a nominal threshold of 5. However, the value of λ is higher, since the scale of the characters is finer than that of a face. We obtained
FIGURE 27. Sequence of the reconstructions of the degraded image in Figure 26b with the sigmoidal approximation, at various temperatures: (a) T = 10,000; (b) T = 1000; (c) T = 100; (d) T = 1.
the results shown in Figures 26c and 26d, whose RMSEs are 22.6 and 8.97, respectively. So that the coarse-to-fine recovery of the edges can be appreciated, Figure 27 shows the sequence of partial results obtained with our adaptive GNC at the various temperatures. For this image we also attempted even rougher values of the parameters, λ = 4 and α = 50, which give a nominal threshold of 3.54. The obtained results are shown in Figures 28a and 28b, respectively, with corresponding RMSEs of 25.77 and 10.17.
FIGURE 28. Reconstruction of the degraded image in Figure 26b with a nominal threshold of 3.54: (a) Standard GNC. (b) GNC with sigmoidal approximation.
XI. EXPERIMENTAL RESULTS: THE FULLY UNSUPERVISED BLIND RESTORATION CASE
In this section we analyze the qualitative and quantitative performance of the whole procedure described in Section VII (i.e., a fully data-driven image restoration process) in the general case of both blurred and noisy images. For comparison with the subcases of unsupervised reconstruction with known degradation and of blind reconstruction with known model parameters, some of the images employed are those used for the experiments of Sections IX and X. Again, we found that similar results can be obtained by using the two model parameter updating rules proposed in this article, so we show only the results obtained when the saddle point approximation is exploited. For all the experiments, we considered the model to be the weak membrane energy and the Gibbs parameters to be only the two parameters λ and α. As previously mentioned, the independence of the line elements allows the analytical computation of the expectations appearing in the hyperparameter updating rule, with a significant reduction in the computational load of the whole procedure. With respect to the degradation parameters, we assumed that we knew in advance the size d of the blur mask, while we estimated the mask coefficients and the noise variance σ². We always adopted as initial guesses a low value of λ (λ = 0.001) and a value of α (α = 2.5) which gives a threshold higher than the standard deviation of the noise. Although the step for estimating the noise variance is given by an
analytical formula, so that in principle an initial value for it need not be set, the term 2σ² explicitly appears in the energy function to be minimized, and hence a rough initial estimate of the variance has to be provided anyway. This can be done, for instance, by computing the image variance in a homogeneous background area. In our experiments, we instead considered a realistic value of 10 for initializing σ. The blur mask was initialized with a Dirac function for blurred and noisy data, and with a uniform mask for only noisy data. For the rest, the procedure of Section VII was run under the conditions already described for the experiments in Section X.A. In all the experiments performed, we found that convergence of the parameters and stabilization of the reconstructions were always reached in less than 30 iterations of the whole procedure. We considered synthetic and real images, artificially degraded by three types of degradation: high noise but no blur, blur plus moderate noise, and blur plus significant noise, where the noise, as usual, is always an uncorrelated Gaussian process of zero mean. The first set of experiments deals with the comparative recovery of the synthetic piecewise constant Mondrian image already used (see Fig. 10a), when affected by the three types of degradation just mentioned. For strong noise (σ = 30) but no blur (Dirac mask), we obtained results similar to those shown in Figure 12, apart from slight differences due to the different noise realization, with an RMSE between the original and estimated image of 3.89. In addition, we obtained the following almost-perfect estimate of the blur mask (RMSE = 0.0042):

0.000000  0.000000  0.008957
0.000000  0.991043  0.000000
0.000000  0.000000  0.000000
From this experiment, we can observe that when no blur affects the data, the blind restoration process is robust even against a significant amount of noise. Figures 29a and 30a show the degraded images when the degradation is given by blur (again the nonuniform 3 x 3 blur mask of Eq. (64)) plus modest noise (σ = 5) and strong noise (σ = 30), respectively. The corresponding restored images are shown in Figures 29b and 30b, respectively. With blur and modest noise the reconstruction is excellent (Fig. 29b, RMSE = 1.36), as is the following estimated blur mask (RMSE = 0.0086):

0.055254  0.112481  0.076750
0.111137  0.300023  0.112725
0.068267  0.115089  0.048275
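For reference, here is a minimal sketch of how such degraded test data can be generated and scored (our own helper names; the convolution boundary handling is an assumption, since the text does not specify it):

    import numpy as np
    from scipy.ndimage import convolve

    def degrade(image, mask, sigma, rng=None):
        """Degradation model of these experiments: convolution with a blur
        mask plus zero-mean uncorrelated (white) Gaussian noise."""
        if rng is None:
            rng = np.random.default_rng(0)
        blurred = convolve(image.astype(float), mask, mode="nearest")
        return blurred + rng.normal(0.0, sigma, image.shape)

    def rmse(a, b):
        """Root-mean-square error, the figure of merit quoted here for both
        the restored images and the estimated blur masks."""
        a, b = np.asarray(a, float), np.asarray(b, float)
        return float(np.sqrt(np.mean((a - b) ** 2)))

The data of Figure 30a, for instance, would correspond to degrade(mondrian, mask_3x3, sigma=30) under this sketch, with mondrian and mask_3x3 standing for the test image and the mask of Eq. (64).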
When both blur and high-level noise are present, both the image reconstruction (Fig. 30b, RMSE = 15.0) and the estimated blur mask deteriorate. The
FIGURE 29. Synthetic 64 x 64 image: (a) Image degraded by convolution with the 3 x 3 blur mask of Eq. (64) plus addition of uncorrelated Gaussian noise (σ = 5). (b) MAP-ML reconstruction (initial parameters, λ = 0.001, α = 2.5; final parameters, λ = 0.038, α = 1.1).
obtained blur mask is the following (RMSE = 0.11):

0.049968  0.023446  0.117686
0.000000  0.553230  0.034766
0.111355  0.015997  0.093551
From these experiments we can observe that the quality of the reconstructions worsens as the total amount of degradation increases. In particular, when
FIGURE 30. Synthetic 64 x 64 image: (a) Image degraded by convolution with the 3 x 3 blur mask of Eq. (64) plus addition of uncorrelated Gaussian noise (σ = 30). (b) MAP-ML reconstruction (initial parameters, λ = 0.001, α = 2.5; final parameters, λ = 0.027, α = 0.62).
TABLE 5
RESULTS OF PARAMETER ESTIMATION FOR THE EXPERIMENTS WITH THE MONDRIAN IMAGE

Degradation     σ = 30, no blur   3 x 3 blur + σ = 5   3 x 3 blur + σ = 30
Blur RMSE       0.0042            0.0086               0.11
Image RMSE      3.89              1.36                 15.0
Estimated λ     0.028             0.038                0.027
Estimated α     0.66              1.1                  0.62
Estimated σ     27.0              4.76                 27.95
some blur is present, the method becomes much more sensitive to the noise. However, for this synthetic image the quality of the reconstruction is satisfactory in all cases, since the deterioration due to the presence of blur and high-level noise is partially compensated for by the perfect fit between the ideal image and the model adopted for the intensity process. Moreover, a good deal of the deterioration is related to a defect in the line model. Indeed, although the edge curves are all connected, the defects observable at the higher noise level (σ = 30) are essentially related to a lack of straight-line continuity and thinness, which the weak membrane model adopted herein does not address. Table 5 quantitatively summarizes the results by showing the RMSE between the true and estimated blur, the RMSE between the true and estimated images, the estimated noise standard deviation, and the estimated final values of the hyperparameters, for the three degradation situations. In all cases the obtained standard deviation is slightly underestimated. As already highlighted, this underestimation is due to the noise-fitting behavior of the restoration process, which makes the data consistency term computed on the reconstructed image slightly lower than the energy of the total noise (Gaussian noise plus quantization error) affecting the data. With respect to the final estimates of both λ and α, we obtained nearly the same values in the three degradation cases. Let us now consider the results obtained for three real images. The first is the noisy but unblurred image of printed characters (σ = 25) already shown in Figure 19b. We obtained results that are qualitatively and quantitatively similar to those obtained when only the hyperparameters are jointly estimated with the image (see Section X.A and Fig. 19). In particular, in this case the final parameters are λ = 0.0082 and α = 0.73, the RMSE between the Dirac mask and the estimated blur mask is 0.0032, and the estimated noise standard deviation is σ = 22.85. Figure 31 shows the results obtained for the same image when the degradation consists of a convolution with the blur mask of Eq. (64) plus an increasing amount of noise (σ = 5, 15, 25, respectively).
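The analytical noise update itself is not restated in this section, but under the Gaussian data model it plausibly amounts to the mean squared residual, as in the following sketch (our own naming and an assumed form). The sketch also makes the bias noted for Table 5 easy to see: because the MAP image partly fits the noise, the residual energy, and hence the estimated σ, comes out slightly low.

    import numpy as np
    from scipy.ndimage import convolve

    def estimate_noise_std(data, image_hat, mask_hat):
        """Plausible form of the analytical noise-variance update,
        sigma^2 = ||y - h * x||^2 / N (an assumption; the exact formula
        is given in Section VII, not reproduced here). The residual is
        computed on the current MAP image and blur estimates."""
        residual = data - convolve(np.asarray(image_hat, float),
                                   mask_hat, mode="nearest")
        return float(np.sqrt(np.mean(residual ** 2)))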
FIGURE 31. Real image of size 128 x 128, degraded by convolution with the 3 x 3 blur mask of Eq. (64) plus addition of uncorrelated Gaussian noise with increasing standard deviations. For all three cases the initial parameters were λ = 0.001 and α = 2.5. (a) Degraded image (σ = 5). (b) MAP-ML reconstruction (final parameters, λ = 0.0138, α = 1.596). (c) Degraded image (σ = 15). (d) MAP-ML reconstruction (final parameters, λ = 0.057, α = 0.7). (e) Degraded image (σ = 25). (f) MAP-ML reconstruction (final parameters, λ = 0.0347, α = 0.225).
As already observed for the synthetic Mondrian image, in this case too, because of the presence of blur, the quality of the reconstruction deteriorates as the amount of noise increases. In particular, for this specific image we found that the quality of both the reconstructed images and the blur masks deteriorates abruptly once the noise level exceeds σ = 10. Thus, while for σ = 5 we obtained a satisfactory restored image (Fig. 31b) and a good blur mask (RMSE = 0.0044), for σ = 25 both the restored image (Fig. 31f) and the blur mask estimate (RMSE = 0.12) are very poor. As a consequence, while the estimated hyperparameters for σ = 5 have values comparable to those obtained for the case of strong noise alone, the values obtained for σ = 15 and σ = 25 are meaningless. As the second and third real images we chose two 128 x 128 images of sculptures. Figures 32 and 33 show the reconstructions of the first image (Fig. 32a),
FIGURE 32. Real image of size 128 x 128: (a) Original image. (b) Image degraded by adding uncorrelated Gaussian noise (σ = 25). (c) MAP-ML reconstruction (initial parameters, λ = 0.001, α = 2.5; final parameters, λ = 0.012, α = 2.05).
FIGURE 33. Real image of size 128 x 128: (a) Image degraded by convolution with the 3 x 3 blur mask of Eq. (64) plus addition of uncorrelated Gaussian noise (σ = 5). (b) MAP-ML reconstruction (initial parameters, λ = 0.001, α = 2.5; final parameters, λ = 0.015, α = 2.44).
degraded only by high noise (σ = 25; Fig. 32b) and by the same blur mask of Eq. (64) plus modest noise (σ = 5; Fig. 33a). With respect to the estimation of the degradation parameters and the model hyperparameters, we obtained results in accordance with those obtained for the image of printed characters. With respect to the quality of the restored images, we observe that, in both cases, although we obtained an excellent removal of the noise and a satisfactory reconstruction of the largest structures, we lost some details, for example, around the eye. In other words, the restored images appear more "stylized" than the original, and this is in accordance with the model adopted, which is better suited to piecewise constant images. Also in this case, further improvements could be obtained by using more complex models that include higher-order derivatives and/or implicitly addressed graded discontinuities (Bedini, Gerace et al., 1994b). The second image of a sculpture (Fig. 34a) is better suited to the model adopted. For this reason we could attempt a reconstruction at a higher noise level (σ = 10; Fig. 34b). We obtained a satisfactory reconstruction (Fig. 34c); the blur mask estimate (RMSE = 0.042) shows a slight worsening with respect to that of the image of Figure 33 (RMSE = 0.012). This finding agrees with the fact, already mentioned, that the blur mask estimate deteriorates as the noise level increases. We also applied our procedure to the 200 x 200 highly defocused and noisy image of an isolated Euglena gracilis photoreceptor, already used to test the subprocedure of blind restoration (see Fig. 9). In that experiment we adopted as the model the prior in Eq. (65), augmented by terms which encouraged not
FIGURE 34. Real image of size 128 x 128: (a) Original image. (b) Image degraded by convolution with the 3 x 3 blur mask of Eq. (64) plus addition of uncorrelated Gaussian noise (σ = 10). (c) MAP-ML reconstruction (initial parameters, λ = 0.001, α = 2.5; final parameters, λ = 0.015, α = 2.36).
only straight-line continuations but also line turns, and which penalized double lines. In this experiment, in which the hyperparameters have to be estimated as well, we reduced the model to the prior of Eq. (65) alone, which, however, accounts for a line continuation constraint. This constraint is weighted by the extra hyperparameter ε, with values in the range (0, 1), which serves to decrease the threshold for the intensity gradient above which a discontinuity can be created when an adjacent discontinuity is already present. To further reduce the computational cost of the procedure, we kept parameter ε fixed at a given value through all the iterations. Indeed, although the line continuation constraint expressed by this parameter is fundamental for improving the
quality of the reconstructions, its value is far less critical than the values of λ and α. Moreover, keeping ε fixed still allows for the analytical computation of the expectations in the hyperparameter updating rule. The mixed-annealing algorithm was run under the same conditions as for the previous experiments, except for the length of the Markov chains for updating the line elements, which was chosen to be 20(2mn - m - n), corresponding to 20 visitations of each line element in turn. This length was chosen because, as a result of the presence of the line continuation constraint, the Gibbs sampler needs some time to reach equilibrium. Assuming again a 7 x 7 size for the blur mask, at convergence we obtained σ² = 104.47, λ = 0.0054, α = 1.074, and the following estimated blur mask:

0.000000  0.000000  0.014574  0.018354  0.016408  0.008515  0.000000
0.016100  0.019304  0.024174  0.033544  0.030240  0.012281  0.007582
0.017801  0.031971  0.027620  0.048958  0.042737  0.020959  0.012359
0.020476  0.035412  0.053524  0.057797  0.045794  0.022813  0.015319
0.014319  0.023609  0.028775  0.052928  0.038197  0.012744  0.010387
0.011874  0.016349  0.021407  0.025551  0.025228  0.010800  0.011543
0.000000  0.000000  0.013158  0.015713  0.012800  0.000000  0.000000
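The chain length above is easy to unpack: an m x n lattice carries m(n - 1) horizontal and (m - 1)n vertical line elements, i.e., 2mn - m - n sites in all. A small sketch (our own helper names):

    def num_line_sites(m, n):
        """Line elements on an m x n lattice: m*(n-1) horizontal plus
        (m-1)*n vertical inter-pixel sites, i.e., 2*m*n - m - n."""
        return m * (n - 1) + (m - 1) * n

    def chain_length(m, n, sweeps=20):
        """Markov-chain length used for the line-process updates here:
        20 visitations of every line element in turn, a longer schedule
        than usual because the line-continuation constraint slows the
        Gibbs sampler's approach to equilibrium."""
        return sweeps * num_line_sites(m, n)

    # For the 200 x 200 photoreceptor image this gives
    # 20 * (2*200*200 - 200 - 200) = 1,592,000 single-site updates.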
From a qualitative inspection of the reconstructed image (see Fig. 35b), it is possible to appreciate the excellent removal of the noise and the satisfactory deblurring effect. Nevertheless, this image is slightly worse than that shown in Figure 9b (i.e., it appears more "stylized"), with only the largest structures
FIGURE 35. Result of the blind unsupervised reconstruction process on a 200 x 200 real microscope image. (a) Degraded image. (b) Reconstructed image.
recovered and some of the fine details lost. The reasons for this behavior are essentially that in the experiment of Figure 9 a more complex model was used and only the blur mask was estimated, while the Gibbs parameters were chosen by trial and error.
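Summing up the experiments of this section, one outer iteration of the whole procedure can be organized as in the following skeleton (a sketch with our own decomposition into callables standing for the MAP, degradation-ML, and hyperparameter-ML stages; it is not the authors' code):

    def blind_unsupervised_restore(data, init, map_step,
                                   ml_degradation_step, hyper_update,
                                   max_iter=30):
        """Skeleton of the fully data-driven procedure: alternate a MAP
        step for the image and line fields, an ML update of the blur
        mask and noise level, and an ML (saddle-point) update of the
        hyperparameters. In the experiments above, fewer than 30 outer
        iterations always sufficed for the parameters to converge and
        the reconstructions to stabilize."""
        x, lines, lam, alpha, sigma, mask = init(data)
        for _ in range(max_iter):
            x, lines = map_step(data, lam, alpha, sigma, mask)
            mask, sigma = ml_degradation_step(data, x)
            lam, alpha = hyper_update(x, lines, lam, alpha)
        return x, lam, alpha, sigma, mask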
XII. CONCLUSIONS

In this article we described our recent experience and the progress made in finding efficient solutions to the highly ill-posed and computationally demanding problem of blind and unsupervised visual reconstruction. The problem was dealt with in the general framework of edge-preserving regularization through the use of Bayesian estimation and MRF image models. This approach is known to be one of the most promising and efficient for solving a large body of problems in image processing and computer vision. Thus, although our case study dealt with image restoration (i.e., deblurring and denoising), the solutions we proposed can be considered representative of the possible solutions for many other related problems. The MRF models considered account for a line process, addressed both explicitly and implicitly, and possibly geometrically constrained. These models have proven efficient for capturing the local regularity properties of most real scenes, as well as the local regularity of object boundaries and intensity discontinuities. In both cases, our approach to the estimation problem attempted to positively exploit the correlation between intensities and lines and was based on the assumption that the line process alone, when correctly recovered and located, retains a good deal of information about both the hyperparameters that best model the whole image and the degradation features. We showed that this idea offers a way to improve not only the quality of the reconstructed image, but also the quality of the degradation and model parameter estimates, and moreover to significantly reduce the computational burden of the estimation processes. First we described a fully Bayesian approach, which is essentially based on the joint maximization of a distribution of the image field, the data, and the degradation and model parameters. This very complex joint maximization was initially decomposed into a sequence of MAP and/or ML estimations, to be performed alternately and iteratively, with a first significant reduction in complexity and computational load. The saddle point approximation from statistical mechanics and the importance sampling theorem from MCMC theory were then applied to further improve the computational performance of the MRF parameter estimation. A procedure to practically implement the derived methods was designed, and some suggestions were given for a possible parallel architecture based on the Hopfield network and the GBM. We
then described a discontinuity-adaptive smoothing method, which is essentially based on a GNC-like algorithm applied to a particular MRF model in which the constrained discontinuities are treated in an implicit manner. We showed that the combination of the particular image model and the GNC-like algorithm allows for the automatic reduction of the model threshold during the reconstruction process, and hence for a coarse-to-fine detection of the edges. The computational savings and the expected good quality of the estimates were confirmed by the experiments, for the estimation of all the degradation and model parameters and of the image. Starting with poor parameter values, the number of iterations required for convergence was nearly the same as that required by a typical mixed annealing for supervised MAP restoration with known degradation operator and model hyperparameters.

REFERENCES

Aarts, E., and Korst, J. (1989). Simulated Annealing and Boltzmann Machines. New York: Wiley.
Andrews, H. C., and Hunt, B. R. (1977). Digital Image Restoration. Englewood Cliffs, NJ: Prentice-Hall.
Angwin, D. L. (1989). Adaptive image restoration using reduced order model based Kalman filters. Doctoral dissertation, Department of Electrical Engineering and Computer Science, Rensselaer Polytechnic Institute, Troy, New York.
Axelsson, O., and Lindskog, G. (1986). On the rate of convergence of the preconditioned conjugate gradient algorithm. Num. Math. 48, 499-523.
Ayers, G. R., and Dainty, J. G. (1988). Iterative blind deconvolution method and its applications. Opt. Lett. 13, 547-549.
Azencott, R. (1990). Synchronous Boltzmann machines and Gibbs fields: learning algorithms, in Neurocomputing, Vol. NATO ASI F-68, edited by F. Fogelman and J. Hérault. Berlin: Springer-Verlag, pp. 51-64.
Azencott, R. (1992). Boltzmann machines: high-order interactions and synchronous learning, in Stochastic Models in Image Analysis, edited by P. Barone and A. Frigessi. Berlin/Heidelberg: Springer-Verlag, pp. 14-45. (Lecture Notes in Statistics)
Barsanti, L., Passarelli, V., Walne, P. L., and Gualtieri, P. (1997). In vivo photocycle of the Euglena gracilis photoreceptor. Biophys. J. 72, 545-553.
Bedini, L., and Tonazzini, A. (1992). Image restoration preserving discontinuities: the Bayesian approach and neural networks. Image Vision Comput. 10, 108-118.
Bedini, L., Gerace, I., and Tonazzini, A. (1994a). A deterministic algorithm for reconstructing images with interacting discontinuities. CVGIP: Graphical Models Image Processing 56, 109-123.
Bedini, L., Salerno, E., and Tonazzini, A. (1994). Edge-preserving tomographic reconstruction from Gaussian data using a Gibbs prior and a generalized expectation-maximization algorithm. Int. J. Imaging Syst. Technol. 5, 231-238.
Bedini, L., Gerace, I., and Tonazzini, A. (1994b). A GNC algorithm for constrained image reconstruction with continuous-valued line processes. Pattern Recogn. Lett. 15(9), 907-918.
Bedini, L., Gerace, I., and Tonazzini, A. (1995). Sigmoidal approximations for self-interacting line processes in edge-preserving image restoration. Pattern Recogn. Lett. 16(10), 1011-1022.
Bedini, L., Gerace, I., Salerno, E., and Tonazzini, A. (1996). Models and algorithms for edge-preserving image reconstruction, in Advances in Imaging and Electron Physics, Vol. 97, edited by P. W. Hawkes. San Diego: Academic Press, pp. 86-189.
Bedini, L., Del Corso, G. M., and Tonazzini, A. (2001). Preconditioned edge-preserving image deblurring and denoising. Pattern Recogn. Lett. 22(10), 1083-1101.
Bedini, L., Tonazzini, A., and Minutoli, S. (1999). Unsupervised edge-preserving image restoration via a saddle point approximation. Image Vision Comput. 17(11), 779-793.
Bedini, L., Tonazzini, A., and Gualtieri, P. (2000). Image restoration in two-dimensional microscopy, in Image Analysis: Methods and Applications, 2nd ed., edited by D.-P. Haeder. Boca Raton, FL: CRC Press, pp. 159-183.
Bedini, L., Tonazzini, A., and Minutoli, S. (2000). A neural architecture for fully data driven edge-preserving image restoration. Integrated Computer-Aided Eng. 7(1), 1-18. (Special issue on Architectural Trends for Image Processing and Machine Vision)
Bedini, L., and Tonazzini, A. (2001). Fast fully data driven image restoration by means of edge-preserving regularization. Real-Time Imaging 7(1), 3-19. (Special issue on Fast Energy-Minimization-Based Imaging and Vision Techniques)
Bedini, L., and Tonazzini, A. (submitted). Monte Carlo Markov chain techniques for unsupervised noisy image deconvolution with MRF models.
Bertero, M., Poggio, T., and Torre, V. (1988). Ill-posed problems in early vision. Proc. IEEE 76(8), 869-889.
Besag, J. (1986). On the statistical analysis of dirty pictures. J. R. Stat. Soc. B 48, 259-302.
Bilbro, G. L., and Snyder, W. E. (1990). Applying mean field annealing to image noise removal. J. Neural Network Comput. (Fall), 5-17.
Bilbro, G. L., Snyder, W. E., Garnier, S. J., and Gault, J. W. (1992). Mean field annealing: a formalism for constructing GNC-like algorithms. IEEE Trans. Neural Networks 3(1), 131-138.
Bjorck, A. (1990). Least squares methods, in Handbook of Numerical Analysis, Vol. 1, edited by P. Ciarlet and J. Lions. Amsterdam: Elsevier/North-Holland, pp. 466-647.
Blake, A., and Zisserman, A. (1987). Visual Reconstruction. Cambridge, MA: MIT Press.
Burch, S. F., Gull, S. F., and Skilling, J. (1983). Image restoration by a powerful maximum entropy method. Comput. Vision Graphics Image Processing 23, 113-128.
Chalmond, B. (1989). An iterative Gibbsian technique for reconstruction of m-ary images. Pattern Recogn. 22(6), 747-761.
Chan, T. (1988). An optimal preconditioner for Toeplitz systems. SIAM J. Sci. Stat. Comput. 9, 766-771.
Chan, R., and Ng, K. P. (1996). Conjugate gradient method for Toeplitz systems. SIAM Rev. 38, 427-482.
Chan, R. H., Nagy, J. G., and Plemmons, R. J. (1994). Circulant preconditioned Toeplitz least squares iterations. SIAM J. Matrix Anal. Appl. 15, 88-97.
Chan, R. H., Nagy, J. G., and Plemmons, R. J. (1993). FFT-based preconditioners for Toeplitz-block least squares problems. SIAM J. Num. Anal. 30, 1740-1768.
Chandler, D. (1987). Introduction to Modern Statistical Mechanics. London: Oxford Univ. Press.
Charbonnier, P., Blanc-Féraud, L., Aubert, G., and Barlaud, M. (1997). Deterministic edge-preserving regularization in computed imaging. IEEE Trans. Image Processing 6(2), 298-311.
Davey, B. L. K., Lane, R. G., and Bates, R. H. T. (1989). Blind deconvolution of noisy complex-valued images. Opt. Commun. 69, 353-356.
David, E., and Dusan, C. (1994). A sparse preconditioner for symmetric positive definite banded circulant and Toeplitz linear systems. Int. J. Comput. Math. 54, 229-238.
Davis, P. (1979). Circulant Matrices. New York: Wiley.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 39, 1-38.
Descombes, X., Morris, R., Zerubia, J., and Berthod, M. (1997). Maximum likelihood estimation of Markov random field parameters using Markov chain Monte Carlo algorithms, in Lecture Notes in Computer Science, Vol. 1223, edited by M. Pelillo and E. R. Hancock. Heidelberg: Springer-Verlag, pp. 133-148. (Proceedings of the International Workshop EMMCVPR'97, Venezia, 1997)
Geiger, D., and Girosi, F. (1991). Parallel and deterministic algorithms for MRFs: surface reconstruction. IEEE Trans. Pattern Anal. Machine Intell. 13, 401-412.
Geman, S., and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Machine Intell. 6, 721-740.
Geman, D., and Reynolds, G. (1992). Constrained restoration and the recovery of discontinuities. IEEE Trans. Pattern Anal. Machine Intell. 14, 367-383.
Gennery, D. B. (1973). Determination of optical transfer function by inspection of the frequency-domain plot. J. Opt. Soc. Am. 63, 1571-1577.
Geyer, C. J., and Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. J. R. Stat. Soc. B 54, 657-699.
Gidas, B. (1993). Parameter estimation for Gibbs distributions from fully observed data, in Markov Random Fields: Theory and Applications, edited by R. Chellappa and A. Jain. San Diego: Academic Press, pp. 471-498.
Hanke, M., Nagy, J., and Plemmons, R. (1993). Preconditioned iterative regularization for ill-posed problems, in Numerical Linear Algebra and Scientific Computing, edited by L. Reichel, A. Ruttan, and R. S. Varga. Berlin: de Gruyter, pp. 141-163.
Hansen, P. C. (1990). Truncated singular value decomposition solutions to discrete ill-posed problems with ill-determined numerical rank. SIAM J. Sci. Stat. Comput. 11, 503-518.
Higdon, D. M., Johnson, V. E., Turkington, T. G., Bowsher, J. E., Gilland, D. R., and Jaszczak, R. J. (1995). Fully Bayesian estimation of Gibbs hyperparameters for emission computed tomography data. Technical Report No. 96-21, Institute of Statistics and Decision Sciences, Duke University, Durham, NC.
Hinton, G. E., Sejnowski, T. J., and Ackley, D. H. (1984). Boltzmann machines: constraint satisfaction networks that learn. Technical Report No. CMU-CS-84-119, Carnegie-Mellon University, Pittsburgh, PA.
Hopfield, J. J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Natl. Acad. Sci. USA 81, 3088-3092.
Horn, B. K. P. (1988, Dec.). Parallel networks for machine vision. A.I. Memo No. 1071.
Hunt, B. R. (1973). The application of constrained least squares estimation to image restoration by digital computer. IEEE Trans. Comput. 22, 805-812.
Jeng, F. C., and Woods, J. W. (1991). Compound Gauss-Markov random fields for image estimation. IEEE Trans. Signal Processing 39, 683-697.
Katsaggelos, A. K., and Lay, K. T. (1991). Maximum likelihood blur identification and image restoration using the EM algorithm. IEEE Trans. Signal Processing 39, 729-732.
Koch, C., Marroquin, J., and Yuille, A. (1986). Analog "neuronal" networks in early vision. Proc. Natl. Acad. Sci. USA 83, 4263-4267.
Lagendijk, R. L., Tekalp, A. M., and Biemond, J. (1990). Maximum likelihood image and blur identification: a unifying approach. Opt. Eng. 29, 422-435.
Lagendijk, R. L., Biemond, J., and Boekee, D. E. (1990). Identification and restoration of noisy blurred images using the expectation-maximization algorithm. IEEE Trans. Acoust. Speech Signal Processing 38, 1180-1191.
Lakshmanan, S., and Derin, H. (1989). Simultaneous parameter estimation and segmentation of Gibbs random fields using simulated annealing. IEEE Trans. Pattern Anal. Machine Intell. 11, 799-813.
Lane, R. G. (1992). Blind deconvolution of speckle images. J. Opt. Soc. Am. A 9(9), 1508-1514.
Lay, K. T., and Katsaggelos, A. K. (1990). Image identification and restoration based on the expectation-maximization algorithm. Opt. Eng. 29(5), 436-445.
Li, S. Z. (1995). Markov Random Field Modeling in Computer Vision. Tokyo: Springer-Verlag.
Li, S. Z. (1998). Closed-form solution and parameter selection for convex minimization-based edge-preserving smoothing. IEEE Trans. Pattern Anal. Machine Intell. 20(9), 916-932.
Lindsey, C. S., and Lindblad, T. (1995). Survey of neural network hardware, part II. Proc. SPIE 2432, 1194-1205. (Proceedings of Applications and Science of Artificial Neural Network Conference)
Lumsdaine, A., Wyatt, J. L., and Elfadel, I. M. (1991, Jan.). Nonlinear analog networks for image smoothing and segmentation. A.I. Memo No. 1280.
March, R. (1992). Visual reconstruction with discontinuities using variational methods. Image Vision Comput. 10(1), 30-38.
Marroquin, J. L. (1985). Probabilistic solution of inverse problems. MIT-AI Technical Report No. 860.
Marroquin, J., Mitter, S., and Poggio, T. (1987). Probabilistic solution of ill-posed problems in computational vision. J. Am. Stat. Assoc. 82, 76-89.
McCallum, B. C. (1990). Blind deconvolution by simulated annealing. Opt. Commun. 75, 101-105.
Mumford, D., and Shah, J. (1989). Optimal approximations by piecewise smooth functions and associated variational problems. Commun. Pure Appl. Math. 42(5), 577-685.
Reeves, S. J., and Mersereau, R. M. (1992). Blur identification by the method of generalized cross-validation. IEEE Trans. Image Processing 1, 301-311.
Saquib, S. S., Bouman, C. A., and Sauer, K. (1998). ML parameter estimation for Markov random fields with application to Bayesian tomography. IEEE Trans. Image Processing 7(7), 1029-1044.
Stockham, T. G., Jr., Cannon, T. M., and Ingebretsen, R. B. (1975). Blind deconvolution through digital signal processing. Proc. IEEE 63, 678-692.
Strang, G. (1986). A proposal for Toeplitz matrix calculations. Stud. Appl. Math. 74, 171-176.
Strela, V. V., and Tyrtyshnikov, E. E. (1996). Which circulant preconditioner is better? Math. Comput. 65, 137-150.
Tekalp, A. M., and Kaufman, H. (1988). On statistical identification of a class of linear space-invariant blurs using nonminimum-phase ARMA models. IEEE Trans. Acoust. Speech Signal Processing 38, 1360-1363.
Tekalp, A. M., Kaufman, H., and Woods, J. (1986). Identification of image and blur parameters for the restoration of noncausal blurs. IEEE Trans. Acoust. Speech Signal Processing 34, 963-972.
Tikhonov, A. N., and Arsenin, V. Y. (1977). Solutions of Ill-Posed Problems. Washington, DC: Winston-Wiley.
Tonazzini, A. (2000). Adaptive smoothing and edge tracking in image deblurring and denoising. Pattern Recogn. Image Anal. 10(4), 492-499.
Tonazzini, A., Bedini, L., and Minutoli, S. (1997). Joint MAP image restoration and ML parameter estimation using MRF models with explicit lines, in Proceedings of the IASTED International Conference on Signal and Image Processing (SIP'97). Calgary: IASTED/ACTA Press, pp. 215-220.
Tonazzini, A., and Bedini, L. (1998). Using intensity edges to improve parameter estimation in blind image restoration. Proc. SPIE 3459, 73-81. (SPIE's International Symposium on Optical Science, Engineering, and Instrumentation, Bayesian Inference for Inverse Problems, San Diego, July 19-24, 1998)
Tonazzini, A., and Bedini, L. (2000). Blind image restoration through edge-preserving regularization, in Proceedings of the Fifth Congresso Nazionale della Società Italiana di Matematica Applicata e Industriale SIMAI 2000, Miniconvegno su "Modelli differenziali e metodi numerici per il trattamento delle immagini," Ischia, 5-9 Giugno 2000. Roma: SIMAI, pp. 384-387.
Tonazzini, A. (2001). Blur identification analysis in blind image deconvolution using Markov random fields. Pattern Recogn. Image Anal. 11(4), 699-710.
Van der Sluis, A., and Van der Vorst, H. (1990). SIRT- and CG-type methods for the iterative solution of sparse least squares problems. Linear Algebra Appl. 130, 257-302.
Winkler, G. (1995). Image Analysis, Random Fields and Dynamic Monte Carlo Methods: A Mathematical Introduction. Berlin/Heidelberg: Springer-Verlag.
Woods, J. W., and Ingle, V. K. (1981). Kalman filtering in two dimensions: further results. IEEE Trans. Acoust. Speech Signal Processing 29, 188-197.
Yang, A. Y., Galatsanos, N. P., and Stark, H. (1994). Projection-based blind deconvolution. J. Opt. Soc. Am. A 11, 2401-2409.
You, Y., and Kaveh, M. (1996). A regularization approach to joint blur identification and image restoration. IEEE Trans. Image Processing 5, 416-428.
Younes, L. (1988). Estimation and annealing for Gibbsian fields. Ann. Inst. Henri Poincaré 24(2), 269-294.
Younes, L. (1989). Parametric inference for imperfectly observed Gibbsian fields. Probability Theory Relat. Fields 82, 625-645.
Zhang, J. (1992). The mean field theory in EM procedures for Markov random fields. IEEE Trans. Signal Processing 40, 2570-2583.
Zhang, J. (1993). The mean field theory in EM procedures for blind Markov random field image restoration. IEEE Trans. Image Processing 2, 27-40.
Index
A Aberration coefficients, 44, 62-63 determination of, 113-122 ARMA (autoregressive moving average), 197-198 Auger effect, 163 Axial rays, 44
B Bayes decision rule, 22 Bayesian approach to unsupervised restoration, 202-206 Bayesian estimation, 196 Beam separator, 50 framework, 77-78 performance calculation, 65-70 pole plates and coils, 73-76 requirements, 51-52 testing of, 102-113 Blind image restoration See also Image restoration architecture for, 227-231 blur identification and image restoration, 241-247 blur identification with known ideal image, 238-241 blurs, 197-199 point-spread function, 197-198 Blurs, 194 blind image restoration, 197-199 identification with known ideal image, 238-241 joint identification and image restoration, 241-247
Bonetto and Ladaga method, 152-158 Boundary-based methods, 2, 3 Boundary refinement, 18, 21-27 Box dimension, 139 Brownian motion or random walk, 141-144
C Canny edge detector, 28 Chromatic aberration, 44-45 correction of, with an electron mirror, 47-51 defocusing, 130-131 determination of coefficients, 113-122 Coding method, 200 Color segmentation, Bayesian, 19 Contour detection, 7 Contrast, 29 Contrast mechanisms in E-T detectors, 163-164
D Darmstadt corrector, 42 Defocusing, 130-131 Degradation parameters, ML estimation of, 215-217 Degree, 44, 62 Delaunay triangulation, 12 Differentiation, first- and second-order, 2 Discrete Fourier transform (DFT), 183-184
E Edge detection, 7-8 Edge-preserving regularization strategies, 195,202, 214 Edge tracking and smoothing, 231-238, 261-269 Effective potential, 221 Elastic scattering, 162 Electron beam interaction, 162-163 Electron illumination experiments, 59 Electron mirror correction of chromatic and spherical aberration with an, 47-51 design, 78-81 performance calculation, 60-64 testing of, 122-128 EM (expectation maximization) algorithms, 198, 199, 204-206 Embedded integration conclusions and future work on, 35-36 control of growing criteria, 6, 10-14 defined, 4 disadvantages of, 33-35 fuzzy logic, 14-17 guidance of seed placement, 6, 7-10 region-growing algorithms, 2, 12-14 rule-based system, 11 split-and-merge algorithms, 2, 7, 10-12 summary of, 31-35 versus postprocessing integration, 4-5 Emery paper, quality differences in, 178-182
Everhart-Thornley (E-T) detector, 161-162, 163-164 Expanding a segment, 8
F Fast Fourier transform (FFT), 183-184, 195,214 FERImage program, 171-174 Field lenses design of, 78-81 testing of, 96-102 Field rays, 44 Filter, hybrid-median, 166-167 Fourier parameterization, 26 Fourier power spectrum, 141, 168, 171,183-188 Fractal characteristics, random, 137-146 Fractal dimension, 137, 140 noise influence in calculating, 164-171 Fractal geometry, 137-141 Fuzzy logic, 14-17
G Game theory, use of, 27 Gaussian image planes, 44 Generalized Boltzmann machine (GBM), 210, 218, 219, 224 Generalized cross-validation (GCV) method, 198 Generalized expectation maximization (GEM) technique, 197, 208 Gibbsian prior energy, 221 Gibbs parameters, 199, 202-203 Gibbs sampler, 209, 218, 220, 224-226, 228 Global dimension, 141
Graduated nonconvexity (GNC), 197, 200-201, 231-232 detection of edges using, 234-238, 261, 263-270 Graph edge, 12 Greedy algorithm, 25 Green's theorem, 27
H Hausdorff-Besicovitch dimension, 138-139 Hausdorff dimension, 138 Hausdorff distance, 29 Hexapoles, 43 High-frequency characteristics, 18 Hopfield-type network, 209, 217, 226 Hurst exponent, 143
I Image field, MAP estimation of, 208-215 Image formation in SEM, 161-171 Image restoration See also Blind image restoration; Unsupervised image restoration Bayesian approach to, 202-206 conclusions, 279-280 edge tracking and smoothing, 231-238 importance sampling theorem, 218, 224-227 MAP estimation of image field, 208-215 MAP-ML method, 199, 200, 204, 206-207 Markov chain Monte Carlo (MCMC) method, 207, 218, 224-227
ML estimation of degradation parameters, 215-217 ML estimation of model parameters, 217-227 saddle point approximation, 207, 218-224 types of regularization strategies, 195 visual reconstruction, 194-195 Image segmentation techniques, 224 See also Embedded integration; Postprocessing integration boundary-based methods, 2, 3 discontinuity and similarity, 2 goal of, 1 pixel-level, 6 region-based methods, 2-3 symbol-level, 6 Importance sampling theorem, 218, 224-227 Inelastic scattering, 162-163 Inverse filtering, 198 Iterated conditional modes (ICM) algorithm, 197, 208
K Kalman filtering, 198
L Lambert's cosine law, 164 Lena image, 261,262, 268-270 Light-optical law of reflection, 48 Local dimension, 140-141 Lorentz force, 60 Low-energy electron microscope (LEEM), mirror corrector and, 50, 51
M Magnetic shielding, 91-96 Markov chain Monte Carlo (MCMC) method, 207, 218, 224-227, 229 Markov random field (MRF), 196, 199, 217, 223 Bayesian approach to unsupervised restoration, 202-206 model for constrained implicit discontinuities, 232-234 Maximum a posteriori (MAP) estimator, 196, 227 estimation of image field, 208-215 -ML method, 199, 200, 204, 206-207 Maximum entropy method (MEM), 195 Maximum likelihood (ML) estimation, 198 estimation of degradation parameters, 215-217 estimation of model parameters, 217-227 hyperparameter estimation, 247-261 MAP-ML method, 199, 200, 204, 206-207 Maximum pseudolikelihood (MPL) approximation, 200 Mean field annealing, 197 Mean field approximation, 199, 200, 206 Merit function, 19 Metropolis algorithm, 209, 225, 227 Mirror correctors See also SMART (spectromicroscope)
beam separator requirements, 51-52 chromatic and spherical aberration correction with an electron mirror, 47-51 conclusions, 128-129 necessity and usefulness of correction, 44-47 research and development of, 42-44 Mirror correctors, design of, 72 beam separator, 73-78 field lenses and electron mirror, 78-81 multipoles, 82-84 Mirror correctors, testing of beam separator, 102-113 determination of chromatic and spherical aberration coefficients, 113-122 electron mirror, 122-128 field lenses, 96-102 magnetic shielding, 91-96 measurement arrangement, 84-91 Mixed-annealing algorithm, 208-210 Model parameters, ML estimation of, 217-227 Mondrian image, 248-252, 256-258, 271 Multifractals, 137 Multipoles, 82-84
N Noise influence in fractal dimension calculation, 164-171 need to eliminate, 194 Nyquist frequency, 184
O Off-axial rays, 45 Order, 44, 62 Oversegmentation, 18-21
P Paraxial rays, 44 Particle position, 141 Photoemission electron microscope (PEEM), 51 Photon illumination experiments, 57-59 Pixel-level integration, 6 Point diagrams, 70 Point-spread function (PSF), 194 blind image restoration, 197-198 Pole plates and coils, beam separator, 73-76 Positron emission tomography (PET), 224 Posterior energy function, 208 Postprocessing integration boundary refinement, 18, 21-27 conclusions and future work on, 35-36 defined, 4 disadvantages of, 33-35 oversegmentation, 18-21 region-growing algorithms, 2, 18 selection evaluation, 18, 27-31 split-and-merge algorithms, 2, 7, 19-21 summary of, 31-35 versus embedded integration, 4-5, 17 Power spectrum, 184 Preconditioned conjugate gradient (PCG) algorithms, 210-215
Q Quadrupole fields, 65 Quadrupole-octopole corrector, 43
R Random fractal characteristics, 137-146 Random walk, 141-144 Rank, 62 Region-based methods, 2-3 Region-growing algorithms, 2, 12-14, 18 Resolution determination, 70-71 Result refinement, 21 Richardson plot, 138 Roof image, 252-256 Root-mean-square error (RMSE), 240, 242-245, 263, 264, 266, 267, 271-276 Rosenfeld method of local maxima distance, 7
S Saddle point approximation, 207, 218-224, 227, 229 Scanning electron microscope (SEM) construction of, 86-87 mirror corrector and, 50, 51 Scanning electron microscope, texture characterization and analysis of theoretically generated images, 158-161 applications, 174-182 Bonetto and Ladaga method, 152-158 contrast mechanisms, 163-164 electron beam interaction, 162-163
Scanning electron microscope, texture characterization and (Cont.) Everhart-Thornley (E-T) detector, 161-162, 163-164 FERImage program, 171-174 Fourier power spectrum, 141, 168, 171, 183-188 image formation in, 161-171 noise influence in fractal dimension calculation, 164-171 quality differences in emery paper, 178-182 Thiobacillus type behavior, 175-178 variogram as a surface characterization tool, 136-146 variogram for texture characterization of digital images, 146-174 Scherzer theorem, 44, 45, 47 Seed placement, 6, 7-10 Self-affine records, Brownian motion or random walk and, 141-144 Self-similarity property, 140 Simulated annealing (SA), 197, 207, 208, 209-210, 225, 227 SMART (spectromicroscope) See also Mirror correctors accuracy and stability requirements, 71-72 beam separator, 65-70 construction of, 54-57 defined, 51 development of, 52-53 electron illumination experiments, 59 electron mirror, 60-64 performance calculation, 59-72 photon illumination experiments, 57-59 resolution determination, 70-71 Snakes, boundary refinement by, 23-27 Spherical aberration, 45 correction of, with an electron mirror, 47-51 determination of coefficients, 113-122 Split-and-merge algorithms, 2, 7, 10-12, 19-20 Stabilizers, 232-238 Strict classification without hidden units, 224 Structural fractal, 138 Symbol-level integration, 6
T Textural fractal, 138 Thiobacillus type behavior, 175-178 Thresholding, 21 effective, 235 nominal, 236 Through-focus method, 99 Tikhonov standard regularization, 195, 197, 210 Time of fusion, 3, 4 Toeplitz blocks, 214 Toeplitz matrices, 195, 214 Topological dimension, 138 Trajectories, fundamental, 44 Transition couple, 30 Transmission electron microscope (TEM), 85-86
U Unsupervised image restoration, 199 See also Image restoration
architecture for, 227-231 Bayesian approach to, 202-206 edge tracking, 261-269 graduated nonconvexity (GNC), 197, 200-201 ML hyperparameter estimation, 247-261 performance evaluation, 270-279
V Variogram analysis of theoretically generated images, 158-161 Bonetto and Ladaga method, 152-158 Brownian motion or random walk, 141-144
examples of, 144-146 Fourier power spectrum, 141, 168, 171, 183-188 fractal geometry, 137-141 random fractal characteristics, 137-146 as a surface characterization tool, 136-146 for texture characterization of digital images, 146-174 von Koch curve, 139-140 Voronoi image, 7
W Wiener filtering, 198 Wobbling, 99