Academic Press is an imprint of Elsevier
Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands
32 Jamestown Road, London NW1 7BY, UK
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
525 B Street, Suite 1900, San Diego, CA 92101-4495, USA

First edition 2009

Copyright © 2009, Elsevier Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher.

Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333; email: [email protected]. Alternatively you can submit your request online by visiting the Elsevier web site at http://www.elsevier.com/locate/permissions, and selecting Obtaining permission to use Elsevier material.

Notice

No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-374769-3
ISSN: 1076-5670

For information on all Academic Press publications visit our Web site at elsevierdirect.com

Printed in the United States of America
09 10 11 12 10 9 8 7 6 5 4 3 2 1
Contents

Preface
Contributors
Future Contributions

1. Surface Plasmon-Enhanced Photoemission and Electron Acceleration with Ultrashort Laser Pulses
Péter Dombi
1. Introduction
2. Electron Emission and Photoacceleration in Surface Plasmon Fields
3. Numerical Methods to Model Surface Plasmon-Enhanced Electron Acceleration
4. Experimental Results
5. The Role of the Carrier-Envelope Phase
6. Conclusions
Acknowledgments
References

2. Did Physics Matter to the Pioneers of Microscopy?
Brian J. Ford
1. Introduction
2. Setting the Scene
3. Traditional Limits of Light Microscopy
4. Origins of the Cell Theory
5. Pioneers of Field Microscopy
6. The Image of the Simple Microscope
Acknowledgments
References

3. Image Decomposition: Theory, Numerical Schemes, and Performance Evaluation
Jérôme Gilles
1. Introduction
2. Preliminaries
3. Structures + Textures Decomposition
4. Structures + Textures + Noise Decomposition
5. Performance Evaluation
6. Conclusion
Appendix A. Chambolle's Nonlinear Projectors
References

4. The Reverse Fuzzy Distance Transform and its Use when Studying the Shape of Macromolecules from Cryo-Electron Tomographic Data
Stina Svensson
1. Introduction
2. Preliminaries
3. Segmentation Using Region Growing by Means of the Reverse Fuzzy Distance Transform
4. Cryo-Electron Tomography for Imaging of Individual Macromolecules
5. From Electron Tomographic Structure to a Fuzzy Objects Representation
6. Identifying the Subunits of a Macromolecule
7. Identifying the Core of an Elongated Macromolecule
8. Conclusions
Acknowledgments
References

5. Anchors of Morphological Operators and Algebraic Openings
M. Van Droogenbroeck
1. Introduction
2. Morphological Anchors
3. Anchors of Algebraic Openings
4. Conclusions
References

6. Temporal Filtering Technique Using Time Lenses for Optical Transmission Systems
Dong Yang, Shiva Kumar, and Hao Wang
1. Introduction
2. Configuration of a Time-Lens-Based Optical Signal Processing System
3. Wavelength Division Demultiplexer
4. Dispersion Compensator
5. Optical Implementation of Orthogonal Frequency-Division Multiplexing Using Time Lenses
6. Conclusions
Acknowledgment
Appendix A
Appendix B
Appendix C
References

Contents of Volumes 151–157

Index
Preface
Ben Kazan, 1982

Before describing the contents of this volume, let me first say a few words about Benjamin Kazan, one of the Honorary Associate Editors of these Advances, whose death on January 14, 2009 was mentioned briefly in the preface to volume 157. He was editor of the Academic Press series Advances in Image Pickup and Display from 1974 to 1983, after which the title was absorbed into Advances in Electronics and Electron Physics (the earlier title of these Advances).

Ben Kazan, born in New York in 1917, received his B.S. degree from the California Institute of Technology, Pasadena, in 1938 and his M.A. from Columbia University, New York, in 1940. In 1961, he was awarded the D.Sc. degree by the Technical University of Munich. From 1940 to 1950, he was Section Head at the Signal Corps Engineering Laboratories, working on the development of new microwave storage and display tubes. For the next eight years, he was engaged in work on colour television tubes and solid-state intensifiers at the RCA Research Laboratories. From 1958 to 1962, he was head of the Solid-state Display Group at Hughes Research Laboratories, after which he moved to Electro-Optical Systems, an affiliate of the Xerox Corporation, again working on solid-state and electro-optical systems. From 1968 to 1974, he was employed at the IBM Thomas J. Watson Research Center. His last position was head of the Display Group at the Palo Alto Research Center of the Xerox Corporation, and a dinner was held in his honour there as the person holding the most patents at Xerox. In addition to his editorship of Advances in Image Pickup and Display, he was co-author of two books (notably, Electronic Image Storage with M. Knoll,
Academic Press, New York, 1968) and was also editor of the Proceedings of the Society for Information Display. He was a Fellow of this Society as well as a member of the American Physical Society. In his leisure hours, he played the violin and enjoyed books about music and medical topics, biographies and many other subjects. He was a man of great kindness and generosity and will be greatly missed by his family and friends. On behalf of the publishers and myself, we extend our sincerest condolences to Gerda Mosse-Kazan, his widow.

The present volume contains six chapters on very different subjects, ranging from the early history of the microscope to mathematical morphology, time lenses, fuzzy sets and electron acceleration.

We begin with a study of surface-plasmon-enhanced photoemission and electron acceleration using ultrashort laser pulses by P. Dombi. This is a very young subject and P. Dombi explains in detail what is involved and the physics of these complicated processes.

This is followed by a fascinating article on the development of (light) microscopy by B.J. Ford, with the provocative title 'Did physics matter to the pioneers of microscopy?' He has chosen to work back to Hooke and van Leeuwenhoek, starting with the microscopes we know today. I do not need to do more than urge all readers of these Advances to plunge into this chapter, which is truly 'unputdownable'!

How can an image be decomposed into its various structural and textural components? This is the subject of the chapter by J. Gilles, who provides a very lucid account of recent progress in this area. The mathematical preliminaries, which cover all the newer kinds of wavelets – ridgelets, curvelets and contourlets – form an essential basis on which the remainder reposes.

The fourth chapter, by S. Svensson, brings together two different topics: fuzzy distance transforms and electron tomography.
Once again, the opening sections provide a solid mathematical basis for the application envisaged and I am certain that this full introductory account of these techniques will be heavily used.

The next chapter will appeal to mathematical morphologists: here, M. van Droogenbroeck describes the notion of anchors of morphological operators and algebraic openings. This concept is placed in context and the chapter forms a self-contained account of this particular aspect of mathematical morphology.

The volume ends with another new subject, time lenses for optical transmission systems, by D. Yang, S. Kumar and H. Wang. Spatial imaging has a perfect analogy in the time domain and this is exploited for temporal filtering. The authors introduce us to the subject before going more deeply into the possible ways of pursuing this analogy.

As always, I thank the authors for all the trouble they have taken to make their work accessible to a wide readership.

Peter W. Hawkes
Contributors
Péter Dombi
Research Institute for Solid-State Physics and Optics, Konkoly-Thege M. út, Budapest, Hungary

Brian J. Ford
Gonville & Caius College, University of Cambridge, UK

Jérôme Gilles
DGA/CEP - EORD Department, 16bis rue Prieur de la Côte d'Or, Arcueil, France

Stina Svensson
Department of Cell and Molecular Biology, Karolinska Institute, Stockholm, Sweden

M. Van Droogenbroeck
University of Liège, Department of Electrical Engineering and Computer Science, Montefiore, Sart Tilman, Liège, Belgium

Dong Yang, Shiva Kumar, and Hao Wang
Department of Electrical and Computer Engineering, McMaster University, Canada
Future Contributions
S. Ando
Gradient operators and edge and corner detection

K. Asakura
Energy-filtering x-ray PEEM

W. Bacsa
Optical interference near surfaces, sub-wavelength microscopy and spectroscopic sensors

C. Beeli
Structure and microscopy of quasicrystals

C. Bobisch and R. Möller
Ballistic electron microscopy

G. Borgefors
Distance transforms

Z. Bouchal
Non-diffracting optical beams

A. Buchau
Boundary element or integral equation methods for static and time-dependent problems

B. Buchberger
Gröbner bases

E. Cosgriff, P. D. Nellist, L. J. Allen, A. J. d'Alfonso, S. D. Findlay, and A. I. Kirkland
Three-dimensional imaging using aberration-corrected scanning confocal electron microscopy

T. Cremer
Neutron microscopy

A. V. Crewe (Special volume on STEM, 159)
Early STEM

A. Engel (Special volume on STEM, 159)
STEM in the life sciences

A. N. Evans
Area morphology scale-spaces for colour images

A. X. Falcão
The image foresting transform

R. G. Forbes
Liquid metal ion sources

C. Fredembach
Eigenregions for image classification

J. Giesen, Z. Baranczuk, K. Simon, and P. Zolliker
Gamut mapping

A. Gölzhäuser
Recent advances in electron holography with point sources

M. Haschke
Micro-XRF excitation in the scanning electron microscope

P. W. Hawkes (Special volume on STEM, 159)
The Siemens and AEI STEMs

L. Hermi, M. A. Khabou, and M. B. H. Rhouma
Shape recognition based on eigenvalues of the Laplacian

M. I. Herrera
The development of electron microscopy in Spain

H. Inada and H. Kakibayashi (Special volume on STEM, 159)
Development of cold field-emission STEM at Hitachi

M. S. Isaacson (Special volume on STEM, 159)
Early STEM development

J. Isenberg
Imaging IR-techniques for the characterization of solar cells

K. Ishizuka
Contrast transfer and crystal images

A. Jacobo
Intracavity type II second-harmonic generation for image processing

B. Jouffrey (Special volume on STEM, 159)
The Toulouse high-voltage STEM project

L. Kipp
Photon sieves

G. Kögel
Positron microscopy

T. Kohashi
Spin-polarized scanning electron microscopy

O. L. Krivanek (Special volume on STEM, 159)
Aberration-corrected STEM

R. Leitgeb
Fourier domain and time domain optical coherence tomography

B. Lencová
Modern developments in electron optical calculations

H. Lichte
New developments in electron holography

M. Mankos
High-throughput LEEM

M. Matsuya
Calculation of aberration coefficients using Lie algebra

S. McVitie
Microscopy of magnetic specimens

J. Mendiola Santibáñez, I. R. Terol-Villalobos, and I. M. Santillán-Méndez (Vol. 160)
Connected morphological contrast mappings

I. Moreno Soriano and C. Ferreira
Fractional Fourier transforms and geometrical optics

M. A. O'Keefe
Electron image simulation

D. Oulton and H. Owens
Colorimetric imaging

N. Papamarkos and A. Kesidis
The inverse Hough transform

K. S. Pedersen, A. Lee, and M. Nielsen
The scale-space properties of natural images

E. Rau
Energy analysers for electron microscopes

P. Rudenberg (Vol. 160)
The work of R. Rüdenberg

R. Shimizu, T. Ikuta, and Y. Takai
Defocus image modulation processing in real time

S. Shirai
CRT gun design methods

A. S. Skapin and P. Ropret (Vol. 160)
The use of optical and scanning electron microscopy in the study of ancient pigments

K. C. A. Smith (Special volume on STEM, 159)
STEM in Cambridge

T. Soma
Focus-deflection systems and their applications

P. Sussner and M. E. Valle
Fuzzy morphological associative memories

L. Swanson and G. A. Schwind (Special volume on STEM, 159)
Cold field-emission sources

I. Talmon
Study of complex fluids by transmission electron microscopy

M. E. Testorf and M. Fiddy
Imaging from scattered electromagnetic fields, investigations into an unsolved problem

N. M. Towghi
lp norm optimal filters

E. Twerdowski
Defocused acoustic transmission microscopy

Y. Uchikawa
Electron gun optics

K. Vaeth and G. Rajeswaran
Organic light-emitting arrays

V. Velisavljevic and M. Vetterli (Vol. 160)
Space-frequency quantization using directionlets

S. von Harrach (Special volume on STEM, 159)
STEM development at Vacuum Generators, the later years

J. Wall, M. N. Simon, and J. F. Hainfeld (Special volume on STEM, 159)
History of the STEM at Brookhaven National Laboratory

I. R. Wardell and P. Bovey (Special volume on STEM, 159)
STEM development at Vacuum Generators, the early years

M. H. F. Wilkinson and G. Ouzounis
Second generation connectivity and attribute filters

P. Ye
Harmonic holography
Chapter 1

Surface Plasmon-Enhanced Photoemission and Electron Acceleration with Ultrashort Laser Pulses

Péter Dombi

Contents

1. Introduction
2. Electron Emission and Photoacceleration in Surface Plasmon Fields
2.1. Emission Mechanisms
2.2. Emission Currents
2.3. Electron Acceleration in Evanescent Surface Plasmon Fields
3. Numerical Methods to Model Surface Plasmon-Enhanced Electron Acceleration
3.1. Elements of the Model
3.2. Model Results
4. Experimental Results
4.1. Surface Plasmon-Enhanced Photoemission
4.2. Generation of High-Energy Electrons
4.3. Time-Resolved Studies of the Emission
5. The Role of the Carrier-Envelope Phase
5.1. Light-Matter Interaction with Few-Cycle Laser Pulses, Carrier-Envelope Phase Dependence
5.2. Carrier-Envelope Phase-Controlled Electron Acceleration
6. Conclusions
Acknowledgments
References

Research Institute for Solid-State Physics and Optics, Konkoly-Thege M. út, Budapest, Hungary

Advances in Imaging and Electron Physics, Volume 158, ISSN 1076-5670, DOI: 10.1016/S1076-5670(09)00006-8. Copyright © 2009 Elsevier Inc. All rights reserved.
1. INTRODUCTION

It was shown recently that ultrashort, intense laser pulses are particularly well suited for the generation of electron and other charged-particle beams, both in the relativistic and the nonrelativistic intensity regimes of laser-solid interactions (Irvine, Dechant, & Elezzabi, 2004; Leemans et al., 2006, and references therein). One method to generate well-behaved, optically accelerated electron beams with relatively low-intensity light pulses is surface plasmon polariton (SPP)-enhanced electron acceleration. Due to the intrinsic phenomenon of the enhancement of the SPP field (with respect to the field of the SPP-generating laser pulse), substantial field strength can be created in the vicinity of metal surfaces with simple, high-repetition-rate, unamplified laser sources. This results in both SPP-enhanced electron photoemission and electron acceleration in the SPP field. SPP-enhanced photoemission was demonstrated in several experimental publications; typical photocurrent enhancement values ranged from ×50 to ×3500, achieved solely by SPP excitation (Tsang, Srinivasan-Rao, & Fischer, 1991). In addition to SPP-enhanced photoemission, the electrons in the vicinity of the metal surface can undergo significant cycle-by-cycle acceleration in the evanescent plasmonic field. This phenomenon, termed SPP-enhanced electron acceleration, was discovered recently and was experimentally demonstrated to be suitable for the production of relatively high-energy, quasi-monoenergetic electron beams using simple femtosecond lasers (Irvine et al., 2004; Kupersztych, Monchicourt, & Raynaud, 2001; Zawadzka, Jaroszynski, Carey, & Wynne, 2001). In this scheme, the evanescent electric field of SPPs accelerates photoemitted electrons away from the surface. This process can be so efficient that multi-keV kinetic energy levels can be reached without external direct current (DC) fields (Irvine and Elezzabi, 2005; Irvine et al., 2004).
This method seems particularly advantageous for the generation of well-behaved femtosecond electron beams that can later be used for infrared pump/electron probe methods, such as ultrafast electron diffraction or microscopy (Lobastov, Srinivasan, & Zewail, 2005; Siwick, Dwyer, Jordan, & Miller, 2003). These time-resolved methods using electron beams can gain importance in the future by enabling material characterization with both high spatial and high temporal resolution at the same time. They will become particularly interesting if the attosecond temporal resolution domain comes within reach of electron diffraction and microscopy methods, as suggested recently (Fill, Veisz, Apolonski, & Krausz, 2006; Stockman, Kling, Krausz, & Kleineberg, 2007; Varró and Farkas, 2008). Moreover, studying the spectral properties of femtosecond electron beams has the potential to reveal ultrafast excitation dynamics in solids and to provide the basis for a single-shot measurement tool of the carrier-envelope (CE) phase (or the optical waveform) of ultrashort laser
pulses, as we suggested recently (Dombi and Rácz, 2008a; Irvine, Dombi, Farkas, & Elezzabi, 2006). Other waveform-sensitive laser-solid interactions that have already been demonstrated (Apolonski et al., 2004; Dombi et al., 2004; Fortier et al., 2004; Mücke et al., 2004) suffer from low experimental contrast; therefore, it is necessary to look for higher-contrast tools for direct phase measurement. Motivated by these possibilities, it was shown numerically (and also partly experimentally) that surface plasmonic electron sources can be ideally controlled with ultrashort laser pulses so that they deliver highly directional, monoenergetic electron beams readily synchronized with the pump pulse (Dombi and Rácz, 2008a; Irvine et al., 2004, 2006). We developed a simple semiclassical approach for the simulation of this process, analogous to the three-step model of high harmonic generation (Corkum, 1993; Kulander, Schafer, & Krause, 1993). In this chapter, we review the basic elements of this model and prove that it delivers the same results as a much more complicated treatment of the problem based on the rigorous, but computationally time-consuming, solution of Maxwell's equations. Results gained with this latter method showed very good agreement with experimental electron spectra (Irvine, 2006). We also provide new insight into the spatiotemporal dynamics of SPP-enhanced electron acceleration, which is also important if one intends to realize adaptive emission control methods (Aeschlimann et al., 2007).
2. ELECTRON EMISSION AND PHOTOACCELERATION IN SURFACE PLASMON FIELDS 2.1. Emission Mechanisms Laser-induced electron emission processes of both atoms and solids are determined by the intensity of the exciting laser pulse. At low intensities where the field of the laser pulse is not sufficient to distort the potential significantly, multiphoton-induced processes dominate at visible wavelengths. These nonlinear processes can be described by a perturbative approach in this case. Light-matter interaction is predominantly nonadiabatic and it is governed by the evolution of the amplitude of the laser field, or, in other words, the intensity envelope of the laser pulse. Tunneling or field emission takes over at higher intensities. This emission regime is determined by the fact that the potential is distorted by the laser field to an extent that it allows tunneling (or, at even higher intensities, above-barrier detachment) of the electron through the modulated potential barrier, the width of which is determined by the laser field strength. The interaction is determined by the instantaneous field strength of the laser pulse; the photocurrent generated in this manner follows the field evolution
FIGURE 1 Schematic illustration of photo-induced electron emission processes in different laser-intensity regimes when the work function of the metal is more than twice the photon energy (typical for most metals and for near-infrared wavelengths). (a) Multiphoton-induced photoemission. (b) Tunneling or field emission through the potential barrier, the width of which is modulated by the laser field.
adiabatically. This interaction type is also referred to as the strong-field regime of nonlinear optics. The difference between multiphoton-induced and field emission is illustrated in Figure 1. There are, of course, intermediate intensities where the contributions of multiphoton and field emission processes can become comparable. This case is termed non-adiabatic tunnel ionization and its theoretical treatment is considerably more complicated (Yudin and Ivanov, 2001). It should be mentioned that at significantly higher intensities, characteristically different plasma and relativistic effects can also contribute to the light-matter interaction process. This regime, however, is not discussed here.

It follows from simple considerations that the average oscillation energy of an electron in the field of an infinite electromagnetic plane wave is

$$U_p = \frac{e^2 E_l^2}{4 m \omega^2}, \tag{1}$$

where the electron charge and rest mass are denoted by $e$ and $m$, respectively, $\omega$ is the angular frequency, and the field strength of the laser field is given by $E_l$. This quantity is called the ponderomotive potential in the literature. The analysis by Keldysh (1965) yielded the perturbation parameter $\gamma$, which proved to be an efficient scale parameter to describe bound-free transitions induced by laser fields. Its value is given by

$$\gamma = \sqrt{\frac{W}{2 U_p}} = \frac{\omega \sqrt{2 m W}}{e E_l}, \tag{2}$$
where $W$ is the binding energy of the most weakly bound electron in an atom (ionization potential) or the work function of the metal. It can be shown that for the case $\gamma \gg 1$, multiphoton-induced processes dominate. On the other hand, the $\gamma \ll 1$ condition indicates the dominance of field emission. The intensity corresponding to $\gamma \sim 1$ signifies the transition regime between multiphoton-induced and field emission (Farkas, Chin, Galarneau, & Yergeau, 1983; Tóth, Farkas, & Vodopyanov, 1991) and this parameter region is sometimes termed the non-adiabatic tunnel ionization regime (Yudin and Ivanov, 2001). It can also be shown that $\gamma = \tau_t \omega$ holds, where $\tau_t$ is the Büttiker–Landauer traversal time for tunneling (Büttiker and Landauer, 1982).
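Eqs. (1) and (2) can be evaluated directly. The following sketch (with assumed values: 800 nm wavelength, a 5 eV work function, and the standard plane-wave intensity-to-field conversion $E_l = \sqrt{2I/\varepsilon_0 c}$, none of which are prescribed by this chapter) computes $U_p$ and $\gamma$:

```python
import math

# Physical constants (SI units)
E_CH = 1.602176634e-19    # electron charge, C
M_E  = 9.1093837015e-31   # electron mass, kg
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
C0   = 2.99792458e8       # speed of light, m/s

def ponderomotive_energy_eV(intensity_W_cm2, wavelength_m):
    """U_p = e^2 E_l^2 / (4 m omega^2), Eq. (1), returned in eV."""
    intensity = intensity_W_cm2 * 1e4                # -> W/m^2
    e_l = math.sqrt(2 * intensity / (EPS0 * C0))     # peak field, V/m (assumed conversion)
    omega = 2 * math.pi * C0 / wavelength_m
    u_p = E_CH**2 * e_l**2 / (4 * M_E * omega**2)    # J
    return u_p / E_CH

def keldysh_gamma(intensity_W_cm2, wavelength_m, work_function_eV):
    """gamma = sqrt(W / (2 U_p)), Eq. (2)."""
    u_p = ponderomotive_energy_eV(intensity_W_cm2, wavelength_m)
    return math.sqrt(work_function_eV / (2 * u_p))

# Near the damage threshold of bulk metal surfaces (~1e13 W/cm^2, see text):
print(ponderomotive_energy_eV(1e13, 800e-9))   # U_p in eV
print(keldysh_gamma(1e13, 800e-9, 5.0))
```

Consistent with the statement below about the damage threshold of metal surfaces, $\gamma$ comes out of order unity (roughly 2) for these assumed parameters at $10^{13}$ W/cm².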
2.2. Emission Currents

2.2.1. Multiphoton-Induced Emission

As suggested by the previous considerations, the time dependence of the electron emission currents can be described by different formulas in the multiphoton and the field emission cases. During multiphoton-induced emission the energy of $n$ photons is converted into overcoming the work function of the metal and into the kinetic energy of the freed electron: $n\hbar\omega = E_{kin} + W$. In this case, the probability of electron generation is proportional to the $n$th power of the intensity of the laser field:

$$j(t) \propto I^n(t). \tag{3}$$

This formula yields a very good approximation of the temporal emission profile, provided that no finite-lifetime intermediate states exist. For example, the full quantum mechanical description of the multiphoton-induced photoemission process yielded a very similar dependence recently (Lemell, Tong, Krausz, & Burgdörfer, 2003), although with a somewhat asymmetric temporal profile. Thus, it can be seen that in this case it is the momentary amplitude of the field oscillation that determines the emission probability. As a result of formula (3), for example, if we take a Gaussian laser pulse profile, $I(t)$, the electron emission curve, $j(t)$, has a full width at half maximum (FWHM) that is $\sqrt{n}$ times shorter than the FWHM of the original $I(t)$ curve (Figure 2).

2.2.2. Field or Tunneling Emission

The case of field or tunneling emission can be described by more complex equations. Depending on the model used, several tunneling formulas have
FIGURE 2 Examples of electron emission temporal profiles for a few-cycle laser pulse with a duration of 3.5 fs (intensity full width at half maximum, FWHM). The dotted curve depicts the evolution of the field envelope. The dashed curve is the temporal distribution of the photocurrent for three-photon-induced photoemission. The solid curve is the photocurrent profile for tunneling electron emission from the surface, determined by the Fowler–Nordheim equation (see text for further details).
been proposed. The one used most generally for metals, both for static and for oscillating laser fields, is the so-called Fowler–Nordheim equation (Binh, Garcia, & Purcell, 1996; Hommelhoff, Sortais, Aghajani-Talesh, & Kasevich, 2006), where the electric field dependence of the tunneling current is described by

$$j(t) \propto \frac{e^3 E_l(t)^2}{8 \pi h W t^2(w)} \exp\!\left(-\frac{8 \pi \sqrt{2m}\, W^{3/2}}{3 h e\, |E_l(t)|}\, v(w)\right), \tag{4}$$
where $E_l(t)$ denotes the laser field strength, $e$ and $m$ the electron charge and mass, respectively, and $h$ is the Planck constant. $W$ stands for the work function of the metal. $v(w)$ is a slowly varying function taking into account the image force of the tunneling electron, with $0.4 < v(w) < 0.8$, and the value of the function $t(w)$ can be taken as $t(w) \approx 1$ for tunneling emission, with $w = \sqrt{e^3 E_l / 4\pi\varepsilon_0}/W$. The characteristic form of the $j(t)$ curve for this case is shown in Figure 2. The electron emission is concentrated mainly in the vicinity of those instants when the field strength reaches its maximum value. Note that the experimental investigation of pure field emission is very limited for metals (at visible wavelengths) since the damage threshold of bulk metal surfaces and thin films lies around an intensity of $10^{13}$ W/cm$^2$, which is very close to the intensity value where the $\gamma \sim 1$ condition is met. A practical approach
to circumvent this problem is needed to be able to investigate these processes experimentally. The exploitation of far-infrared sources proved suitable for this purpose, where the $\gamma \sim 1$ condition can be met at much lower intensities (Farkas et al., 1983). In addition, plasmonic field enhancement can be exploited in the visible spectral region so that $\gamma < 1$ can be achieved for metal films without damaging the surface. This latter method is also more advantageous due to the lack of ultrashort laser sources in the far-infrared domain. The phenomenon of plasmonic field enhancement is described in detail in the next section.
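The two emission regimes of Section 2.2 can be contrasted with a short numerical sketch in the spirit of Figure 2. The parameters below (3.5 fs intensity FWHM, 800 nm carrier, $W = 5$ eV, $v(w) = 0.5$, and a peak field of $10^{10}$ V/m) are illustrative assumptions rather than values fixed by the text:

```python
import math

E_CH, M_E, H_PL = 1.602176634e-19, 9.1093837015e-31, 6.62607015e-34
W_F = 5.0 * E_CH                                   # work function, J (assumed 5 eV)
# Exponent coefficient of Eq. (4): 8*pi*sqrt(2m)*W^(3/2)*v(w) / (3he), with v(w) ~ 0.5
B_FN = 8 * math.pi * math.sqrt(2 * M_E) * W_F**1.5 * 0.5 / (3 * H_PL * E_CH)

def envelope(t, fwhm=3.5e-15):
    """Gaussian intensity envelope with the given intensity FWHM."""
    return math.exp(-4 * math.log(2) * (t / fwhm) ** 2)

def j_multiphoton(t, n=3):
    """Eq. (3): j ~ I^n, i.e., a Gaussian narrowed by sqrt(n)."""
    return envelope(t) ** n

def j_fowler_nordheim(t, e0=1e10, wavelength=800e-9):
    """Eq. (4) (up to a constant): current follows the instantaneous field."""
    omega = 2 * math.pi * 2.99792458e8 / wavelength
    el = e0 * math.sqrt(envelope(t)) * math.cos(omega * t)
    return 0.0 if abs(el) < 1e-6 * e0 else el**2 * math.exp(-B_FN / abs(el))

def fwhm_of(f, span=20e-15, npts=20001):
    """Crude grid-based FWHM of a unimodal profile."""
    ts = [-span / 2 + span * i / (npts - 1) for i in range(npts)]
    vals = [f(t) for t in ts]
    half = max(vals) / 2
    above = [t for t, v in zip(ts, vals) if v >= half]
    return max(above) - min(above)

# Multiphoton (n = 3) emission profile is sqrt(3) times shorter than I(t):
print(fwhm_of(envelope) * 1e15, fwhm_of(j_multiphoton) * 1e15)
```

The printed widths (3.5 fs versus about 2.0 fs) reproduce the $\sqrt{n}$ narrowing of Section 2.2.1, while evaluating `j_fowler_nordheim` on the same time grid yields the spiked, field-following profile of the solid curve in Figure 2.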
2.3. Electron Acceleration in Evanescent Surface Plasmon Fields

After photoemission has taken place from the metal surface, the electrons travel in vacuum dressed by the SPP field. This situation can be approximated by solving the classical equations of motion for the electron in the electromagnetic field of the surface plasmons. This concept is somewhat similar to the three-step model of high harmonic generation on atoms, where the electron is considered a free particle after tunneling photoionization, induced by the electric field of the laser pulse, has taken place (Corkum, 1993; Kulander et al., 1993). We adapted a similar model to the SPP environment, where, instead of a single atom, a solid surface is involved that determines the conditions for recollision. Because of the presence of the surface, many electrons recollide or cannot even accelerate, because the Lorentz force points toward the surface at the instant of photoemission, or, in other words, at the instant of the "birth" of the electron in vacuum. This latter situation is also modeled by recombination; therefore, these electrons must be disregarded when the properties of the electron bunch are determined. The rest of the electrons experience cycle-by-cycle kinetic energy gain and become accelerated along the electric field gradient. The mechanism is the same if the envelope of the laser pulse is made up of only a few optical cycles; however, the final kinetic energy will not be composed of a large number of incremental, cycle-by-cycle energy gain portions as in the case of long pulses. Due to the reduced time the electrons spend in the field of the few-cycle SPPs, the expected final kinetic energy will be lower. These intuitive predictions are confirmed numerically in the upcoming sections.
3. NUMERICAL METHODS TO MODEL SURFACE PLASMON-ENHANCED ELECTRON ACCELERATION 3.1. Elements of the Model As discussed previously, SPP-enhanced electron acceleration involves distinct physical processes such as (i) the coupling of the incident light and surface plasmonic electromagnetic fields, (ii) the photoinjection of the electrons into vacuum from the metal layer, and (iii) the subsequent
acceleration of free electrons by the decaying SPP field on the vacuum side of the surface. The elements of the model that we used correspond to these individual steps of the process; therefore, they are presented in separate sections below.

3.1.1. Solution of the Field

In order to determine SPP fields accurately, Maxwell's equations can be solved with the so-called finite-difference time-domain (FDTD) method. This approach was used for the Kretschmann–Raether SPP coupling configuration in previous studies (Irvine and Elezzabi, 2006; Irvine et al., 2004). In this case, the components of the electric field, the electric displacement, and the magnetic intensity vectors are solved for a grid placed upon the given geometry. Since the FDTD method provides the complete numerical solution of Maxwell's equations, it is computationally rather intensive, and more complex geometries cannot be handled with simple personal computers due to the increased processor times required. Therefore, we proposed analytic formulas to describe SPP fields (Dombi and Rácz, 2008a). Based on the well-known fact that these fields decay exponentially moving away from the surface (Raether, 1988), we took an analytic expression for the SPP field components on the vacuum side of the metal layer in the form of

$$E_y^{SPP}(x, y, t) = \eta E_0 E_{env}(x, t) \cos(k_{SPP} x - \omega t + \varphi_0) \exp(-\alpha y), \tag{5a}$$

$$E_x^{SPP}(x, y, t) = \eta a E_0 E_{env}(x, t) \cos\!\left(k_{SPP} x - \omega t - \frac{\pi}{2} + \varphi_0\right) \exp(-\alpha y), \tag{5b}$$

where $E_0$ is the field amplitude, $E_{env}(x, t)$ is an envelope function determined by the temporal and spatial beam profiles of the incoming Gaussian pulse, $\eta$ is the field enhancement factor resulting from plasmon coupling (Raether, 1988), $k_{SPP}$ is the SPP wave vector, $\omega$ is the carrier frequency, $\varphi_0$ is the CE phase of the laser pulse, and $\alpha^{-1}$ is the decay length of the plasmonic field in vacuum, given by

$$\alpha^{-1} = \left(k_{SPP}^2 - \frac{\omega^2}{c^2}\right)^{-1/2} \tag{6}$$

(Irvine and Elezzabi, 2006). For laser pulses with a central wavelength of 800 nm, the evanescent decay length $\alpha^{-1} = 247$ nm follows from Eq. (6). We used the value of $a = 0.3$ according to the notion that the amplitudes of the $x$- and $y$-components of the plasmonic field have this ratio according to the numerical solution of Maxwell's equations (Irvine and Elezzabi, 2006). It can be concluded that the field given by Eqs. (5a) and (5b) approximates the exact SPP field with very good accuracy by comparing our results to
FIGURE 3 Illustration of the setup for the generation of electron beams by surface plasmon-enhanced electron acceleration, with the distribution of the electric field amplitude on the vacuum side of the surface, field vectors (inset), and electron trajectories. For further details, see text. (Source: Dombi and Rácz (2008a).)
those of Irvine and Elezzabi (2006). The distribution of the field amplitude in the vicinity of the surface is shown in Figure 3, in very good agreement with the above-mentioned calculation. We also succeeded in reproducing the vector representation of the field depicted in Figure 3 of Irvine and Elezzabi (2006) with this method; the vector field calculated with our model is depicted in the inset of Figure 3.

3.1.2. Electron Emission Channels and Currents Induced by Plasmonic Fields

After the determination of the field, a point array can be placed along the prism surface, and the spatial and temporal distribution of the photoemission induced by the SPP field along the surface can be examined, assuming that field emission takes place at higher intensities. To this end, we applied the Fowler–Nordheim equation routinely used in studies of electron emission from metal nanotips (Hommelhoff, Kealhofer, & Kasevich, 2006; Hommelhoff, Sortais et al., 2006; Ropers, Solli, Schulz, Lienau, & Elsaesser, 2007). It describes the instantaneous tunneling current, which is substantial here because plasmonic fields carry large enhancement factors (up to ×100) compared with the generating field. In this way, one gains a spatially and temporally resolved map of the tunneling probabilities determined by the SPP field. The temporal distribution, for example, can be seen in Figure 2; similar probability distribution curves result for the spatial coordinates. According to these probabilities, each photoemitted and SPP-accelerated electron can be assigned a corresponding weight, which must be used to determine the final kinetic energy spectrum of the electron beam accurately.
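As an illustration of this step, the instantaneous tunneling rate can be sketched with the standard elementary Fowler–Nordheim expression. This is a simplified sketch, not the exact implementation of the chapter: the elementary Fowler–Nordheim constants and a gold-like 5.1-eV work function are assumptions, and image-charge corrections as well as the finite tunneling delay discussed later are omitted.

```python
import numpy as np

# Standard elementary Fowler-Nordheim constants (image-force correction neglected)
A_FN = 1.541434e-6    # A eV V^-2
B_FN = 6.830890       # eV^-3/2 V nm^-1
PHI = 5.1             # eV; gold-like work function (assumed value)

def fn_current_density(field_v_per_nm):
    """Instantaneous tunneling current density j(F) = (A/phi) F^2 exp(-B phi^1.5 / F).
    Emission occurs only for fields pointing out of the metal (F > 0)."""
    f = np.asarray(field_v_per_nm, dtype=float)
    j = np.zeros_like(f)
    pos = f > 0
    j[pos] = (A_FN / PHI) * f[pos]**2 * np.exp(-B_FN * PHI**1.5 / f[pos])
    return j

# Temporal emission-probability map for a 5-fs, 800-nm pulse with a
# 5.8e10 V/m (= 58 V/nm) enhanced peak field, as quoted in the text
t = np.linspace(-10e-15, 10e-15, 4001)
omega = 2 * np.pi * 2.99792458e8 / 800e-9
envelope = np.exp(-2 * np.log(2) * (t / 5e-15)**2)
field = 58.0 * envelope * np.cos(omega * t)
weights = fn_current_density(field)
weights /= weights.sum()          # normalized emission probability per instant
```

In the full model, such weights are evaluated along the spatial coordinate as well and attached to each launched trajectory.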
FIGURE 4 Two selected electron trajectories for a 5-fs-long SPP-exciting laser pulse (a) and a 30-fs-long pulse (b), illustrating the difference between the few-cycle and the multicycle case. The central wavelength of the laser pulse is 800 nm in both cases. (Courtesy of P. Rácz.)
3.1.3. Particle Acceleration in the Evanescent Field

As a final step in the numerical model, the vacuum trajectory of a photoemitted electron in the plasmonic field is computed for each point in the above-mentioned array and for several emission instants. This is done by solving the free-electron equations of motion numerically in the SPP field given by Eqs. (5a) and (5b). Some representative trajectories are shown in Figure 3 (gray curves). Two selected trajectories, for 5-fs (FWHM) and for 30-fs exciting pulses, are depicted in Figure 4, illustrating the difference between the acceleration process in the few-cycle and in the multicycle case. In some cases the electron trajectory involves a recollision with the metal surface; when this happens, no electron emission is assumed. In all other cases, the final kinetic energies and directions of the photoemitted and photoaccelerated electrons are stored in a matrix for each emission point in space and for each emission instant. Figure 5 illustrates the final kinetic energy as a function of the electron "birth" instant for a maximum plasmonic field strength of 5.8 × 10¹⁰ V/m, for electrons emitted from the central spot of the illuminated surface by a 5-fs exciting pulse with 800-nm central wavelength. Figure 5 shows similarities to the corresponding kinetic energy distributions of atomic electrons accelerated by an ionizing laser field (Reider, 2004). In contrast to that case, however, only roughly one-fourth of all emission instants contribute to the acceleration process. This is due to the symmetry breaking by the metal surface and the associated electron recollision and reabsorption processes, as discussed in Section 2.3. Macroscopic emission distributions and electron spectra can be calculated
FIGURE 5 Surface plasmon-accelerated electron energy as a function of the birth instant of the electrons (scatterplots). The electric field of the plasmon-generating 5-fs laser pulse (illustrated with solid and dashed lines) has either a ''cosine'' (dashed) or ''minus cosine'' (solid) waveform under the same envelope. The corresponding electron energies for the cosine waveform are depicted as circles and those for the minus cosine waveform as squares. See text for further pulse parameters.
after the assessment of each trajectory by integrating the above-described emission maps along the spatial and/or temporal coordinates.
3.2. Model Results

3.2.1. Electron Acceleration with Multiphoton-Induced Emission

We first checked whether the modeling results reproduce former measured and simulated spectra (published in Irvine et al. (2004, 2006) and Irvine and Elezzabi (2006)) to gain confidence in our simplified three-step model. To this end, we carried out simulations for the same parameters as those published in these papers. Although we assume multiphoton-induced electron emission for these simulations (as previously used in these references), we must mention that this assumption does not necessarily hold for higher intensities. However, our purpose here was to reproduce former results; therefore, the spatiotemporal distribution of photoemission was described by j(t, x) ∝ Iⁿ(t, x), according to Eq. (3). We used n = 3, in accordance with the 4–5 eV work function of most metal surfaces and films and the 1.5-eV photon energy at 800 nm. Figure 6a depicts macroscopic electron spectra obtained with our model for peak plasmonic fields of 1.9 × 10¹¹ V/m, 2.7 × 10¹¹ V/m, and 3.7 × 10¹¹ V/m, respectively (the FWHM duration of the input Gaussian laser pulse was 30 fs, with a central wavelength of 800 nm). This figure can thus be directly compared with the results in Irvine and Elezzabi (2006) (see Figure 6b). The characteristics
FIGURE 6 (a) Macroscopic electron spectra at peak plasmonic fields of 1.9 × 1011 V/m (solid line), 2.7 × 1011 V/m (dashed line), and 3.7 × 1011 V/m (dotted line) for a Gaussian input laser pulse of 30-fs FWHM duration with a central wavelength of 800 nm. The model used was based on the simplified SPP field description given by Eqs. (5a)–(5b). (b) Electron spectra for the same input parameters with the field calculated with a full FDTD-based simulation. (Source of (b): Irvine and Elezzabi (2006).)
of the electron spectra are very well reproduced, as is the linear scaling of the kinetic energies of the most energetic electrons with intensity. Slight differences in the peak and cutoff positions can be attributed to the approximate nature of the SPP field expression [Eqs. (5a) and (5b)] used in our case, in contrast to the more accurate numerical field solution used by Irvine and Elezzabi (2006). In another comparative simulation, we changed the input pulse length to 5 fs FWHM and assumed that this pulse is focused to a spot of 60-µm diameter on the prism surface. The field peak was 1.9 × 10¹¹ V/m. Figure 7 shows that the spectrum of the electron beam obtained with this approach reproduces the spectrum computed with other methods, such as that of Irvine and Elezzabi (2006) (depicted with a dashed curve in Figure 7). Slight differences in the cutoff positions can still be observed; however, all spectral features and the position of the main peak are exactly the same. Thus, these examples confirm the applicability of the analytic field expressions [Eqs. (5a) and (5b)] and the robustness of our approach.

3.2.2. Electron Acceleration with Field Emission

We now turn our attention to modeling electron spectra by assuming field emission from the metal surface, which is a more realistic assumption for higher-intensity input beams approaching the damage threshold of thin metal films. The experimental motivation for this study is that high-repetition-rate, ultrafast laser output delivering focused intensities in this range is achievable with simple titanium:sapphire oscillators with an extended cavity, as we demonstrated recently (Dombi and Antal, 2007;
FIGURE 7 Electron spectrum for a 5-fs generating pulse with a peak plasmonic field strength of 1.9 × 1011 V/m, assuming multiphoton photoemission calculated with simplified numerical methods (solid curve) and electron spectrum for the same input parameters with the electric field calculated with a full FDTD-based simulation (source for dashed curve: Irvine and Elezzabi (2006)). See text for further details.
Dombi, Antal, Fekete, Szipőcs, & Várallyay, 2007; Naumov et al., 2005). We then used the Fowler–Nordheim formula, as given by Eq. (4), and resolved the photoaccelerated electron beam both angularly and spectrally, assuming a maximum input field of 5.8 × 10¹⁰ V/m, a rather realistic maximum value considering the damage threshold of gold and silver films. We also assumed a tunneling time of 600 attoseconds, which, in our model, describes the delay between the actual distortion of the potential by the field and the corresponding appearance of the electron in the continuum. Several emission maps are presented in the following text, using realistic parameters to reveal the fine structure of the acceleration process and to draw conclusions about macroscopically observable properties of the generated electron beams. We examined the final kinetic energy distribution of SPP-accelerated electrons along the plasmon propagation direction (x-axis, representing emission locations along the surface) for interacting pulses with a Gaussian shape, intensity FWHM durations of 15 fs and 5 fs, and a CE phase of ϕ0 = 0 (which means that the envelope and field maxima coincide). The central wavelength was 800 nm. The pulse was assumed to be focused on a spot with a diameter of 4 µm on the prism surface so that a peak plasmonic field strength of 5.8 × 10¹⁰ V/m (Keldysh gamma of 0.31) was reached. With this effective value we have already taken into account that substantial field enhancement factors (up to ×100) can be achieved with respect to the SPP-generating field. The spatial and spectral distribution of the emitted electrons along the plasmon propagation direction was calculated with these simulation
FIGURE 8 Normalized photoacceleration maps (kinetic energy distribution of electrons emitted at different points of the surface: (a), (d), and (g), in grayscale representation); angular and kinetic energy distributions ((b), (e), and (h)); and macroscopic electron spectra ((c), (f), and (i)) of surface plasmon-accelerated electrons for three example parameter sets. Panels (a)–(c) are for 15-fs and (d)–(i) are for 5-fs laser pulses. In panels (g)–(i) we restricted the emission to a spot with a 300-nm radius, as illustrated in (g), modeling a nanolocalized emission region. See text for further details. (Source: Dombi and Rácz (2008a).)
parameters (shown in Figures 8a and d) for two different pulse lengths to illustrate few-cycle effects. Whereas in the multicycle regime (15-fs pulse length, Figure 8a) a much more structured distribution can be observed, in Figure 8d (5-fs pulse length) the emission is concentrated primarily in a single structure on the emission map, providing a better-behaved electron beam. It can also be seen that the emission of high-energy electrons is localized to the center of the illuminated spot and that the number of distinct structures on the emission maps roughly corresponds to the number of optical cycles in the generating pulse. This is because the "birth" interval of those electrons that can leave the vicinity of the surface is limited to about one-fourth of every laser cycle. This is due to the breaking of the symmetry by the surface such that positive and negative
half-cycles are not equivalent in this respect. Every laser cycle has one such favored interval, and electrons emitted in each of these intervals spend different amounts of time in the field; hence, they undergo different acceleration. An even more conspicuous property is crucially important from the point of view of applications of this electron source. Figures 8b and 8e depict the angular-kinetic energy distributions of the emitted electron beams, showing the direction in which the energetic electrons leave the surface. The emission is confined to a small range of angles, supporting a directionally emitted electron beam ideally suited for novel ultrafast techniques. Provided that the pulse length is in the few-cycle range (Figure 8e), the angular emission map is reduced to a single distinct structure corresponding to a highly directional, quasi-monoenergetic electron beam, representing the most favorable regime of SPP-enhanced electron acceleration. By integrating any of the distributions along the x-axis, we derive the macroscopically observable electron spectra depicted in Figures 8c and f. The spectrum in Figure 8f has a FWHM ΔE_kin/E_kin value of 0.22 (with E_kin denoting the electron kinetic energy), corresponding to a quasi-monoenergetic spectrum. The spectral properties of this electron beam can be further enhanced under experimental circumstances by applying a retarding potential to suppress the low-energy wing of the spectrum. The integrated spectra in Figures 8c, f, and i differ significantly from that in Figure 7. This can be attributed exclusively to the different emission regimes (multiphoton vs. tunneling) involved. The sharp temporal distribution of the tunneling peaks located at the field maxima favors the emission of electrons at those time instants when they can gain significant kinetic energy.
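The Keldysh parameter quoted above for the 5.8 × 10¹⁰ V/m field can be checked directly from its definition γ = ω√(2mW)/(eE). A short sketch, assuming a gold-like work function of 5.1 eV (the exact value used in the chapter is not stated here):

```python
import math

# Elementary charge, electron mass, speed of light (SI units)
E_CH, M_E, C = 1.602176634e-19, 9.1093837015e-31, 2.99792458e8

def keldysh_gamma(field_v_per_m, wavelength_m, work_function_ev):
    """Keldysh adiabaticity parameter gamma = omega * sqrt(2 m W) / (e E)."""
    omega = 2 * math.pi * C / wavelength_m
    w_joule = work_function_ev * E_CH
    return omega * math.sqrt(2 * M_E * w_joule) / (E_CH * field_v_per_m)
```

For the enhanced peak field of 5.8 × 10¹⁰ V/m at 800 nm, this yields γ ≈ 0.31, matching the value quoted in the text and placing the interaction in the tunneling regime (γ < 1).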
The sharp spectral cutoff appears at the same energy as the highest-energy electrons in the multiphoton case; however, in the field emission case it is primarily these high-energy electrons that are represented; therefore, a sharp peak appears in the spectrum. On the other hand, the low-energy wings of the spectra in Figures 8c and f display a broader feature, making the source less suitable for ultrafast applications. To generate spectra with higher monoenergeticity, we suggest spatial confinement of the emission area on the metal surface. This can be realized experimentally by various nanofabrication techniques, for example, by depositing a dielectric layer on top of the metal with a nanoscale opening where the dielectric overlayer is absent and the metal surface is exposed to vacuum. Another possibility is roughening a small rectangular area on top of the metal surface, thereby enhancing the emission from that portion of the film. These potential schemes were taken into account in our simulations by selecting only smaller areas of the illuminated surface and considering only those photoelectrons emitted from this area. Results are shown in Figures 8g–i, where the same emission maps and spectra are given as in Figures 8d–f with
the only difference being that only electrons coming from a 300-nm-wide central portion of the surface were considered. By confining the emission area in this way, the distribution in Figure 8h shows highly enhanced contrast. This means that even more monoenergetic spectra and even more directional beams can be generated from this spatially confined source. The ΔE_kin/E_kin value of the integrated spectrum improves by almost an order of magnitude, to 0.033 (see Figure 8i). Our results suggest that SPP-enhanced electron acceleration offers a robust and powerful technique for the generation of ultrafast, monoenergetic, highly directional electron beams (Dombi and Rácz, 2008a).
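The ΔE_kin/E_kin figure of merit used above can be extracted from any simulated or measured spectrum as the ratio of the FWHM to the peak energy. A generic sketch (the function name and interpolation scheme are illustrative, not taken from the chapter):

```python
import numpy as np

def relative_fwhm(energy, counts):
    """FWHM Delta-E_kin / E_kin of a single-peaked spectrum, with E_kin taken
    at the peak. Half-maximum crossings are found by linear interpolation."""
    energy = np.asarray(energy, dtype=float)
    counts = np.asarray(counts, dtype=float)
    i_pk = int(np.argmax(counts))
    half = counts[i_pk] / 2.0
    # Left half-maximum crossing
    l = np.where(counts[:i_pk] < half)[0][-1]
    e_lo = np.interp(half, [counts[l], counts[l + 1]], [energy[l], energy[l + 1]])
    # Right half-maximum crossing
    r = i_pk + np.where(counts[i_pk:] < half)[0][0]
    e_hi = np.interp(half, [counts[r], counts[r - 1]], [energy[r], energy[r - 1]])
    return (e_hi - e_lo) / energy[i_pk]
```

Applied to a Gaussian peak centered at 100 eV with a 22-eV FWHM, this returns the 0.22 value discussed above.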
4. EXPERIMENTAL RESULTS

4.1. Surface Plasmon-Enhanced Photoemission

It is well known that the efficiency of several light–matter interaction phenomena and applications, such as Raman scattering, plasmonic biosensors (Lal, Link, & Halas, 2007, and references therein), surface harmonic generation (Quail, Rako, Simon, & Deck, 1983; Simon, Mitchell, & Watson, 1974), and other surface physical and chemical processes, can be significantly enhanced by the roughness of the metal surface involved. It was recently shown that, with the help of this phenomenon, even high harmonic generation on atoms is possible in the vicinity of tailored, nanostructured metal surfaces (Kim et al., 2008). The common reason for the increased effects in most such cases is the field enhancement and SPP coupling due to the roughness of the metal surface involved. The incident electromagnetic field can be enhanced by a factor of up to ×100 (Raether, 1988) on a rough surface if SPPs are also coupled. This means an enhancement of 10⁴ in intensity, which in this favorable case corresponds to a 10⁸ enhancement in two-photon photoemission yield according to Eq. (3). Moreover, even if the surface of a thin metal film is perfectly (atomically) flat, SPP coupling in the Kretschmann–Raether configuration alone results in a field enhancement of ×3–4 at the metal–vacuum interface with respect to the field of the incident beam (Raether, 1988). Even this effect means a drastic photoemission yield enhancement for a perturbative n-photon process. Accordingly, one of the first examples of newly discovered femtosecond surface plasmon-enhanced phenomena was SPP-induced photoemission from metal surfaces (Tsang, Srinivasan-Rao, & Fischer, 1990). More systematic studies with Au, Ag, Cu, and Al surfaces revealed photoemission yield enhancement factors of ×50 to ×3500, which indicate field enhancement values of ×2 to
×8, suggesting that the surfaces involved were of relatively good quality (Tsang et al., 1991). Figure 9 shows the main results of these experiments. The curves show the intensity dependence of the photoelectron yield on double-logarithmic scales. Therefore, the slope of
FIGURE 9 The enhancement of SPP-induced multiphoton photoemission yield as a function of the intensity of the incident laser beam for four different surfaces plotted on double logarithmic scales. The slope of each linear fit equals the nonlinearity of the photoemission process. The lower data sets marked as ‘‘nonresonance’’ depict photoelectron yield from the same metal film without SPP coupling but with a similar illumination geometry. The substantial increase of the SPP-enhanced photoelectron yield is clearly illustrated with the upper curves plotted with solid symbols and marked with ‘‘SP’’. (Source: Tsang et al. (1991)).
each linear fit equals the nonlinearity of the photoemission process. In each case, multiphoton-induced emission takes place, since there is no deviation from the linear fits. Moreover, the figure illustrates the enhancement of the SPP-coupled photoelectron yield compared with nonlinear photoemission induced from the same film without SPP coupling. These first pioneering results paved the way toward SPP-mediated electron acceleration. Later independent experiments confirmed these results (Chen, Boneberg, & Leiderer, 1993; Irvine et al., 2004). The fact that the electron yield is much higher with SPP coupling than with direct surface illumination underscores a very important feature of SPP-enhanced emission processes: it is primarily the SPPs that induce the observed photocurrent; therefore, it would be more appropriate to term the multiphoton-induced emission picture in this case multiplasmon-induced electron emission. Accordingly, it is the enhanced SPP field that distorts the surface
potential in the field emission picture and lowers the tunneling barrier. This means that the field emission regime can be reached at much lower laser input intensities, and strong-field phenomena can be induced with high-repetition-rate, cost-effective laser oscillators (see, e.g., Dombi and Antal, 2007; Dombi et al., 2007; Naumov et al., 2005).
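The yield arithmetic above follows directly from Eq. (3): an n-photon yield scales as the n-th power of intensity, i.e., as the 2n-th power of the field enhancement. A small sketch (the inversion from measured yield ratios to field enhancements is the same relation read backward):

```python
def yield_enhancement(eta, n):
    """n-photon photoemission yield enhancement for a field enhancement eta:
    intensity scales as eta**2 and, by Eq. (3), the yield as intensity**n."""
    return (eta ** 2) ** n

def inferred_field_enhancement(yield_ratio, n):
    """Field enhancement implied by a measured n-photon yield enhancement."""
    return yield_ratio ** (1.0 / (2 * n))
```

For eta = 100 and n = 2 this reproduces the 10⁸ two-photon figure quoted above, and the ×50 to ×3500 yield enhancements of Tsang et al. (1991) translate, for n = 2, into field enhancements of roughly ×2.7 to ×7.7, consistent with the ×2 to ×8 range cited.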
4.2. Generation of High-Energy Electrons

In addition to enhancing the photoemission yield, SPP fields can also accelerate the electrons that are set free from the surface, thanks to the mechanisms described in Section 2.1. Recently performed spectrally resolved measurements of SPP photoemission delivered the experimental confirmation of this powerful particle acceleration mechanism in evanescent plasmonic fields (Irvine et al., 2004; Kupersztych et al., 2001; Zawadzka, Jaroszynski, Carey, & Wynne, 2000; Zawadzka et al., 2001). The main features of these electron spectra, especially the scaling of the cutoff energies resulting from this mechanism, could be explained within the framework of the semiclassical three-step model described in Section 3.1 (Irvine, 2006). In more detail, Zawadzka et al. demonstrated SPP-enhanced electron spectra extending up to 400 eV at 40 TW/cm² focused intensity in the Kretschmann SPP coupling configuration; the pulse length was 100–150 fs in that case (Zawadzka et al., 2000, 2001). Kupersztych et al. also demonstrated this phenomenon with 60-fs laser pulses at 8 GW/cm² focused intensity (Kupersztych et al., 2001). The highest electron energy was ∼40 eV in their experiments. SPPs were coupled on a grating surface, and in contrast to the results of Zawadzka et al., the spectra possessed a peak at higher energies. Irvine et al. demonstrated even more conspicuous results in 2004 by accelerating electrons in SPP fields up to 400 eV with a simple titanium:sapphire oscillator delivering merely 1.5-nJ pulse energy; the resulting focused intensity was 1.8 GW/cm². Most interestingly, the SPP-enhanced electron spectrum became quasi-monoenergetic, peaking at 300 eV with a FWHM of 83 eV (Figure 10). The increased enhancement and confined electron emission in the latter experiment can be explained by considering the surface morphology of the silver film.
Surface roughness alters the spatial distribution of the SPP field on a nanometer scale (<50 nm) in ways not included in the FDTD-based model calculations (Irvine et al., 2004). In such cases, the overall energy of the pulse is conserved, but the energy density is drastically increased by confinement of the radiation to subwavelength volumes, manifested as an additional localized electric field enhancement. This explanation is further supported by the fact that the modeled electron emission had to be restricted to within 10% of the laser spot (Figure 10) to reproduce the measurement results. Due to the highly nonlinear photoemission, small peaks or protrusions at the metal surface would
FIGURE 10 Comparison between a measured electron energy spectrum using a Ti–sapphire laser oscillator (circles) and theoretical energy spectra (solid line) as calculated from an FDTD-based model. (Source: Irvine (2006).)
dominate the electron emission in the presence of an SPP wave, so that electrons would appear to originate only from such defects, with a reduced spatial extent. A full account of surface roughness necessitates three-dimensional FDTD calculation, which, over the length scales of electron emission and acceleration, requires enormous computational effort. Nevertheless, the principal effects underlying efficient SPP-enhanced acceleration can be seen from these initial, approximate simulations. The same authors recently demonstrated acceleration up to 2 keV by applying higher-intensity laser pulses from an amplified titanium:sapphire laser system, delivering proof of the scalability of the electron acceleration process with laser intensity (Irvine and Elezzabi, 2005). In summary, spectrally resolved measurements confirmed that SPP-enhanced electron acceleration is a very powerful method for generating multikiloelectronvolt electron beams with high-repetition-rate, low-intensity laser pulses. Simple scaling laws, such as the linear scaling of the highest electron energies with the incident laser intensity, were confirmed in these measurements. Since a great variety of spectral shapes were observed in these pioneering studies, more systematic experiments are needed to establish the optimal focusing conditions and coating methods (surface morphologies) for generating well-behaved, high-energy, monoenergetic beams.
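The linear cutoff-energy scaling with intensity is what one expects from ponderomotive arguments, since the cycle-averaged quiver energy U_p = e²E²/(4mω²) is proportional to Iλ². A sketch, using the enhanced field from Section 3 as an example (the actual cutoff is a model-dependent multiple of U_p, not U_p itself):

```python
import math

# Elementary charge, electron mass, speed of light (SI units)
E_CH, M_E, C = 1.602176634e-19, 9.1093837015e-31, 2.99792458e8

def ponderomotive_energy_ev(field_v_per_m, wavelength_m):
    """Cycle-averaged quiver (ponderomotive) energy U_p = e^2 E^2 / (4 m omega^2), in eV."""
    omega = 2 * math.pi * C / wavelength_m
    return (E_CH * field_v_per_m) ** 2 / (4 * M_E * omega ** 2) / E_CH
```

For the 5.8 × 10¹⁰ V/m enhanced field at 800 nm this gives U_p ≈ 27 eV; since U_p ∝ E² ∝ I, any cutoff proportional to U_p scales linearly with intensity, as observed.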
4.3. Time-Resolved Studies of the Emission

By combining optical pump–probe methods with surface science techniques, time-resolved studies can be performed on photoemission processes from
thin metal films. The most widespread example of such an experimental scheme is the recording of autocorrelation functions using multiphoton-induced emission or surface harmonic generation as the nonlinear "detector", instead of the standard second harmonic generation scheme with nonlinear crystals (Melnikov, Povolotskiy, & Bovensiepen, 2008; Moore and Donnelly, 1999; Petek and Ogawa, 1997). In this particular case, the technique amounts to recording the SPP-enhanced photoemission signal as a function of the delay between ultrashort-pulse replicas produced by a Michelson interferometer and extracting information on the evolution and characteristic time scales of the surface process by deconvolution. This method was applied particularly successfully in the case of two-photon photoemission phenomena from metals and surface adsorbates (Petek and Ogawa, 1997, and references therein). With the rapid development of femtosecond laser technology, such methods were recently extended to the few-cycle domain as well (Dombi, Krausz, & Farkas, 2006). The interferometric or background-free autocorrelation functions measured this way provide indirect information on the lifetime of any potential intermediate states involved in photoemission (Georges and Karatzas, 2008), on the optical excitation of hot electrons (Petek and Ogawa, 1997), on image potential effects (Schoenlein, Fujimoto, Eesley, & Capeherat, 1988), and so forth. Time-resolved characterization of SPP-induced photoemission was carried out by several groups to gain insight into the ultrafast emission dynamics of this process on pico- and femtosecond time scales (Chen et al., 1993; Irvine et al., 2004; Kupersztych et al., 2001; Tsang et al., 1991).
The higher-order autocorrelation traces revealed in each case that the electron pulse length is roughly √n times shorter than that of the exciting laser pulse, where n is the order of the photoemission process (n = 2–4 in these experiments). Therefore, the temporal profile of the electron bunch is well approximated by Eq. (3). As the comparison of these articles shows, this holds for a very broad range of pulse durations, since such behavior was observed both for nanosecond exciting pulses (Chen et al., 1993) and for femtosecond excitation (Irvine et al., 2004), where pulses of 27-fs duration were used. For example, the third-order interferometric autocorrelation trace measured with three-photon-induced, SPP-enhanced photoemission in the latter case is depicted in Figure 11. These results suggest that the influence of surface states and hot-electron excitation is negligible on the time scales examined, and the photoemission process can be considered instantaneous with respect to the intensity evolution of the exciting laser pulse in the material environments considered (mostly polycrystalline, evaporated Ag and Au thin films). Unfortunately, compared with the wealth of measurements conducted with the two-photon photoemission technique on ultrafast dynamics at
FIGURE 11 Measured interferometric two-pulse three-photon photoemission correlation trace for 27-fs laser pulses, indicating that at the surface of the film, the electron pulse duration is less than 27 fs (Source: Irvine et al. (2004).)
metal surfaces (Petek and Ogawa, 1997), there are significantly fewer data from time-resolved measurements of SPP-enhanced electron acceleration. These measurements also lack in-depth evaluation, and thus further studies are needed to establish the processes underlying the ultrafast dynamics of this phenomenon. Nevertheless, these initial studies carry a very important positive message: the ultrashort nature of these electron bunches upon leaving the surface indicates that SPP-enhanced electron acceleration can be implemented particularly well in novel ultrafast time-resolved methods where electron pulses of few-femtosecond duration are required.
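The √n shortening quoted above follows because the emission current is proportional to Iⁿ(t), and the n-th power of a Gaussian of FWHM τ is a Gaussian of FWHM τ/√n. A grid-based numerical check (the analytic result is exact for Gaussians):

```python
import numpy as np

def nphoton_fwhm(tau_fwhm, n, npts=20001, span=4.0):
    """FWHM of I(t)**n for a Gaussian intensity envelope I(t) of FWHM tau_fwhm."""
    t = np.linspace(-span * tau_fwhm, span * tau_fwhm, npts)
    profile = np.exp(-4 * np.log(2) * (t / tau_fwhm) ** 2) ** n
    above = t[profile >= 0.5]      # region at or above half maximum
    return above[-1] - above[0]
```

For the 27-fs pulses and three-photon emission of Irvine et al. (2004), this gives an electron bunch duration of about 27/√3 ≈ 15.6 fs at the surface.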
5. THE ROLE OF THE CARRIER-ENVELOPE PHASE

5.1. Light–Matter Interaction with Few-Cycle Laser Pulses: Carrier-Envelope Phase Dependence

Optical waveform control of the recollision processes of atomic electrons has brought deeper insight into atomic physics since CE phase control of few-cycle light pulses enabled the reproducible generation of attosecond light pulses in gas targets (see, e.g., Agostini and DiMauro, 2004, and references therein). Similarly, in solids, the CE phase played a decisive role in governing various charge transfer and photoemission processes, as measured with the first CE phase-stabilized oscillators (Apolonski et al., 2004; Dombi and Rácz, 2008b; Dombi et al., 2004; Fortier et al., 2004; Mücke et al., 2004). These experiments revealed several new aspects of the underlying light–matter interaction physics, even though the mechanisms of many of these processes are not fully understood and there is substantial discrepancy between various semiclassical and quantum mechanical models and measured results (Dombi et al., 2004; Lemell et al., 2003).
FIGURE 12 Few-cycle laser pulses with different CE phase values, representing different optical waveforms (solid line) under the same field envelope (dashed line). (τ_L = 4 fs, λ_0 = 750 nm, Gaussian pulse shape: A(t) = A_0 exp(−2t² ln 2/τ_L²).)
The optical waveform of a transform-limited ultrashort laser pulse can be parameterized with the CE phase value for a given envelope shape. An arbitrary, chirp-free laser pulse shape can be defined by the equation

El(t) = A(t) cos(ωt + ϕ0),    (7)
where A(t) is the field envelope, ω is the central angular frequency of the laser, and ϕ0 is the CE phase. Depending on the value of ϕ0, significantly different optical waveforms can occur provided that the pulse length is in the few-cycle domain. This is illustrated in Figure 12, where different waveforms are depicted under the same Gaussian envelope. The self-referencing or f-to-2f technique allows control of the CE phase evolution in the typically multi-megahertz train of pulses by stabilizing Δϕ0 (i.e., the pulse-to-pulse CE phase shift in the output of a mode-locked laser) (Jones et al., 2000). State-of-the-art laser systems were developed in past years by exploiting this novel optical technology; these delivered CE phase-stabilized pulses as short as 3.7 fs (Yakovlev et al., 2003). However, the f-to-2f interferometer usually used in these schemes is not suitable for measuring the absolute value of ϕ0; only the pulse-to-pulse CE phase shift (Δϕ0) can be stabilized. Despite this shortcoming, basic experiments could be done with these types of lasers, demonstrating the effect of the optical waveform on laser-solid interactions (Apolonski et al., 2004; Dombi and Rácz, 2008b; Dombi et al., 2004; Fortier et al., 2004; Mücke et al., 2004).
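The parameterization of Eq. (7) is straightforward to evaluate numerically. The following sketch (the function name and the sampling are ours; the pulse parameters are those assumed in Figure 12) illustrates how, for a few-cycle pulse, even the peak field strength depends on the CE phase:

```python
import numpy as np

def few_cycle_field(t_fs, phi0, tau_fwhm=4.0, lam_nm=750.0):
    """Few-cycle field E(t) = A(t) cos(w*t + phi0) with a Gaussian envelope
    A(t) = exp(-2 ln2 t^2 / tau^2), normalized so A(0) = 1."""
    c = 299.792458                      # speed of light in nm/fs
    omega = 2.0 * np.pi * c / lam_nm    # central angular frequency, rad/fs
    envelope = np.exp(-2.0 * np.log(2.0) * t_fs**2 / tau_fwhm**2)
    return envelope * np.cos(omega * t_fs + phi0)

t = np.linspace(-8.0, 8.0, 20001)       # fs
for phi0 in (0.0, np.pi / 2, np.pi, 3 * np.pi / 2):
    E = few_cycle_field(t, phi0)
    # for a 4-fs pulse at 750 nm the maximum attainable field differs with phi0
    print(f"phi0 = {phi0:4.2f} rad -> max |E| = {np.abs(E).max():.3f}")
```

For a "cosine" pulse (ϕ0 = 0) the field maximum coincides with the envelope maximum; for a "sine" pulse (ϕ0 = π/2) the strongest half-cycle falls on the slope of the envelope and is weaker, which is the origin of the CE phase sensitivity discussed in the text.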
5.2. Carrier-Envelope Phase-Controlled Electron Acceleration

Motivated by these developments, we examined the effect of the optical waveform on the electron beam generated in this parameter regime
FIGURE 13 Angle-energy distributions of SPP-enhanced photoacceleration for carrier-envelope phase values of (a) π/2 and (b) π. Other than the CE phase value, the simulation parameters were the same as those used to calculate Figure 8e. These distributions can be directly compared to Figure 8e, where the CE phase value was ϕ0 = 0. (Source: Dombi and Rácz (2008a).)
numerically (Dombi and Rácz, 2008a). The angle-energy distributions in Figures 13a and b (CE phases of π/2 and π, respectively) can be directly compared to that of Figure 8e (CE phase ϕ0 = 0); the only difference in the simulation input is that we varied the CE phase of the interacting pulses and left all other parameters unchanged. We can see that the spectral cutoffs determined by the acceleration process depend strongly on the CE phase of the pulses, in accordance with previous results (Irvine and Elezzabi, 2006). In our case, however, because tunneling emission was taken into account (instead of multiphoton emission), the influence of the CE phase becomes more pronounced. The number of structures observable on the emission maps corresponds to the number of optical cycles in the laser pulse (two in this case). These structures coincide for CE phase values of 1.75π and π/4, serving as a basis for an ideal photoelectron source. Therefore, it is anticipated that the generation of electron beams with the desired spatial, spectral, and temporal features requires femtosecond laser sources with CE phase stabilization. The experimental verification of these predictions can be carried out with state-of-the-art few-cycle laser sources.
6. CONCLUSIONS

In conclusion, SPP-enhanced electron acceleration proved to be a powerful method for the all-optical generation of ultrashort, high-repetition-rate electron beams with kiloelectronvolt-range energy. The initial duration of these electron bunches is limited by the width of the envelope function of the laser pulse, and in the case of high-order multiphoton-induced processes it can be significantly shorter than the FWHM of the femtosecond
optical excitation. The properties of these electron beams can be sensitively tuned with the parameters of the SPP-exciting laser pulse, such as the intensity, focusing, pulse shape, or the CE phase, demonstrating a full coherent-control perspective in a solid-state system. In addition, the field enhancement intrinsic to SPPs enables the investigation of a wealth of strong-field phenomena in surface environments. SPP coupling together with surface nanostructures holds promise of circumventing the damage threshold problem related to surfaces, which is the main obstacle to strong-field light-matter interaction experiments in solid environments. Thus, SPP-enhanced photoemission and photoacceleration processes will, as versatile tools, open the door to novel surface characterization, to ultrafast, spatially resolved pump-probe methods, and to strong-field plasmonics in the future.
ACKNOWLEDGMENTS

The author received support from the Hungarian Scientific Research Fund (OTKA Project F60256). The author was also supported by the Bolyai Fellowship of the Hungarian Academy of Sciences. Fruitful discussions with Győző Farkas and the help of Péter Rácz in the preparation of the figures are gratefully acknowledged.
REFERENCES

Aeschlimann, M., Bauer, M., Bayer, D., Brixner, T., García de Abajo, F. J., Pfeiffer, W., et al. (2007). Adaptive subwavelength control of nano-optical fields. Nature, 446, 301–304.
Agostini, P., & Dimauro, L. F. (2004). The physics of attosecond light pulses. Reports on Progress in Physics, 67, 813–855.
Apolonski, A., Dombi, P., Paulus, G. G., Kakehata, M., Holzwarth, R., Udem, T., et al. (2004). Observation of light-phase-sensitive photoemission from a metal. Physical Review Letters, 92, 073902.
Binh, V. T., Garcia, N., & Purcell, S. T. (1996). Electron field emission from atom-sources: fabrication, properties, and applications of nanotips. Advances in Imaging and Electron Physics, 95, 63.
Büttiker, M., & Landauer, R. (1982). Traversal time for tunnelling. Physical Review Letters, 49, 1739–1742.
Chen, H., Boneberg, J., & Leiderer, P. (1993). Surface-plasmon-enhanced multiple-photon photoemission from Ag and Al films. Physical Review B, 47, 9956–9958.
Corkum, P. B. (1993). Plasma perspective of strong-field multiphoton ionization. Physical Review Letters, 71, 1994–1997.
Dombi, P., & Antal, P. (2007). Investigation of a 200 nJ Ti:sapphire oscillator for white light generation. Laser Physics Letters, 4, 538–542.
Dombi, P., & Rácz, P. (2008a). Ultrafast monoenergetic electron source by optical waveform control of surface plasmons. Optics Express, 16, 2887–2893.
Dombi, P., & Rácz, P. (2008b). Carrier-envelope phase-controlled laser-surface interactions. Proceedings of SPIE, 6892, 68921J.
Dombi, P., Krausz, F., & Farkas, G. (2006). Ultrafast dynamics and carrier-envelope phase sensitivity of multiphoton photoemission from metal surfaces. Journal of Modern Optics, 53, 163–172.
Dombi, P., Apolonski, A., Lemell, C., Paulus, G. G., Kakehata, M., Holzwarth, R., et al. (2004). Direct measurement and analysis of the carrier-envelope phase in light pulses approaching the single-cycle regime. New Journal of Physics, 6, 39.
Dombi, P., Antal, P., Fekete, J., Szipőcs, R., & Várallyay, Z. (2007). Chirped-pulse supercontinuum generation with a long-cavity Ti:sapphire oscillator. Applied Physics B, 88, 379–384.
Farkas, G., Chin, S. L., Galarneau, P., & Yergeau, F. (1983). Optics Communications, 48, 275–278.
Fill, E., Veisz, L., Apolonski, A., & Krausz, F. (2006). Sub-fs electron pulses for ultrafast electron diffraction. New Journal of Physics, 8, 272.
Fortier, T. M., Roos, P. A., Jones, D. J., Cundiff, S. T., Bhat, R. D. R., & Sipe, J. E. (2004). Carrier-envelope phase-controlled quantum interference of injected photocurrents in semiconductors. Physical Review Letters, 92, 147403.
Georges, A. T., & Karatzas, N. E. (2008). Modeling of ultrafast interferometric three-photon photoemission from a metal surface irradiated with sub-10-fs laser pulses. Physical Review B, 77, 085436.
Hommelhoff, P., Kealhofer, C., & Kasevich, M. A. (2006). Ultrafast electron pulses from a tungsten tip triggered by low-power femtosecond laser pulses. Physical Review Letters, 97, 247402.
Hommelhoff, P., Sortais, Y., Aghajani-Talesh, A., & Kasevich, M. A. (2006). Field emission tip as a nanometer source of free electron femtosecond pulses. Physical Review Letters, 96, 077401.
Irvine, S. E. (2006). Laser-field femtosecond electron pulse generation using surface plasmons. Doctoral thesis, University of Alberta, Canada.
Irvine, S. E., & Elezzabi, A. Y. (2005). Ponderomotive electron acceleration using surface plasmon waves excited with femtosecond laser pulses. Applied Physics Letters, 86, 264102.
Irvine, S. E., & Elezzabi, A. Y. (2006). Surface-plasmon-based electron acceleration. Physical Review A, 73, 013815.
Irvine, S. E., Dechant, A., & Elezzabi, A. Y. (2004). Generation of 0.4-keV femtosecond electron pulses using impulsively excited surface plasmons. Physical Review Letters, 93, 184801.
Irvine, S. E., Dombi, P., Farkas, G., & Elezzabi, A. Y. (2006). Influence of the carrier-envelope phase of few-cycle pulses on ponderomotive surface-plasmon electron acceleration. Physical Review Letters, 97, 146801.
Jones, D. J., Diddams, S. A., Ranka, J. K., Stentz, A., Windeler, R. S., Hall, J. L., et al. (2000). Carrier-envelope phase control of femtosecond mode-locked lasers and direct optical frequency synthesis. Science, 288, 635–639.
Keldysh, L. V. (1965). Ionization in the field of a strong electromagnetic wave. Soviet Physics Journal of Experimental and Theoretical Physics, 20, 1307.
Kim, S., Jin, J., Kim, Y.-J., Park, I.-J., Kim, Y., & Kim, S.-W. (2008). High-harmonic generation by resonant plasmon field enhancement. Nature, 453, 757–760.
Kulander, K. C., Schafer, K. J., & Krause, J. L. (1993). Dynamics of short-pulse excitation, ionization and harmonic conversion. In B. Piraux (Ed.), Proceedings of the workshop on super-intense laser-atom physics (SILAP III) (p. 95). New York: Plenum.
Kupersztych, J., Monchicourt, P., & Raynaud, M. (2001). Ponderomotive acceleration of photoelectrons in surface-plasmon-assisted multiphoton photoelectric emission. Physical Review Letters, 86, 5180–5183.
Lal, S., Link, S., & Halas, N. J. (2007). Nano-optics from sensing to waveguiding. Nature Photonics, 1, 641–648.
Leemans, W. P., Nagler, B., Gonsalves, A. J., Tóth, Cs., Nakamura, K., Geddes, C. G. R., et al. (2006). GeV electron beams from a centimetre-scale accelerator. Nature Physics, 2, 696–699.
Lemell, C., Tong, X.-M., Krausz, F., & Burgdörfer, J. (2003). Electron emission from metal surfaces by ultrashort pulses: Determination of the carrier-envelope phase. Physical Review Letters, 90, 076403.
Lobastov, V. A., Srinivasan, R., & Zewail, A. H. (2005). Four-dimensional ultrafast electron microscopy. Proceedings of the National Academy of Sciences, 102, 7069–7073.
Melnikov, A., Povolotskiy, A., & Bovensiepen, U. (2008). Magnon-enhanced phonon damping at Gd(0001) and Tb(0001) surfaces using femtosecond time-resolved optical second-harmonic generation. Physical Review Letters, 100, 247401.
Moore, K. L., & Donnelly, T. D. (1999). Probing nonequilibrium electron distributions in gold by use of second-harmonic generation. Optics Letters, 24, 990–992.
Mücke, O. D., Tritschler, T., Wegener, M., Morgner, U., Kärtner, F. X., Khitrova, G., et al. (2004). Carrier-wave Rabi flopping: role of the carrier-envelope phase. Optics Letters, 29, 2160–2162.
Naumov, S., Fernandez, A., Graf, R., Dombi, P., Krausz, F., & Apolonski, A. (2005). Approaching the microjoule frontier with femtosecond laser oscillators. New Journal of Physics, 7, 216.
Petek, H., & Ogawa, S. (1997). Femtosecond time-resolved two-photon photoemission studies of electron dynamics in metals. Progress in Surface Science, 56, 239–310.
Quail, J. C., Rako, J. G., Simon, H. J., & Deck, R. T. (1983). Optical second-harmonic generation with long-range surface plasmons. Physical Review Letters, 50, 1987–1990.
Raether, H. (1988). Surface plasmons on smooth and rough surfaces and on gratings. Berlin: Springer-Verlag.
Reider, G. (2004). XUV attosecond pulses: Generation and measurement. Journal of Physics D, 37, R37–R48.
Ropers, C., Solli, D. R., Schulz, C. P., Lienau, C., & Elsaesser, T. (2007). Localized multiphoton emission of femtosecond electron pulses from metal nanotips. Physical Review Letters, 98, 043907.
Schoenlein, R. W., Fujimoto, J. G., Eesley, G. L., & Capehart, T. W. (1988). Femtosecond studies of image-potential dynamics in metals. Physical Review Letters, 61, 2596–2599.
Simon, H. J., Mitchell, D. E., & Watson, J. G. (1974). Optical second-harmonic generation with surface plasmons in silver films. Physical Review Letters, 33, 1531–1534.
Siwick, B. J., Dwyer, J. R., Jordan, R. E., & Miller, R. J. D. (2003). An atomic-level view of melting using femtosecond electron diffraction. Science, 302, 1382–1385.
Stockman, M., Kling, M. F., Krausz, F., & Kleineberg, U. (2007). Attosecond nanoplasmonic field microscope. Nature Photonics, 1, 539–544.
Tóth, C., Farkas, G., & Vodopyanov, K. L. (1991). Laser-induced electron emission from an Au surface irradiated by single picosecond pulses at λ = 2.94 µm. The intermediate region between multiphoton and tunneling effects. Applied Physics B, 53, 221–225.
Tsang, T., Srinivasan-Rao, T., & Fischer, J. (1990). Surface-plasmon-enhanced multiphoton photoelectric emission from thin silver films. Optics Letters, 15, 866–868.
Tsang, T., Srinivasan-Rao, T., & Fischer, J. (1991). Surface-plasmon field-enhanced multiphoton photoelectric emission from metal films. Physical Review B, 43, 8870–8878.
Varró, S., & Farkas, G. (2008). Attosecond electron pulses from interference of above-threshold de Broglie waves. Laser and Particle Beams, 26, 9–20.
Yakovlev, V. S., Dombi, P., Tempea, G., Lemell, C., Burgdörfer, J., Udem, T., et al. (2003). Phase-stabilized 4-fs pulses at the full oscillator repetition rate for a photoemission experiment. Applied Physics B, 76, 329–332.
Yudin, G. L., & Ivanov, M. Y. (2001). Nonadiabatic tunnel ionization: Looking inside a laser cycle. Physical Review A, 64, 013409.
Zawadzka, J., Jaroszynski, D., Carey, J. J., & Wynne, K. (2000). Evanescent-wave acceleration of femtosecond electron bunches. Nuclear Instruments & Methods in Physics Research, Section A, 445, 324–328.
Zawadzka, J., Jaroszynski, D., Carey, J. J., & Wynne, K. (2001). Evanescent-wave acceleration of ultrashort electron pulses. Applied Physics Letters, 79, 2130–2132.
Chapter 2

Did Physics Matter to the Pioneers of Microscopy?

Brian J. Ford
Contents

1. Introduction  27
2. Setting the Scene  28
3. Traditional Limits of Light Microscopy  30
   3.1. Questions of Image Quality  31
   3.2. Foundation of Optical Physics  32
   3.3. Single-Lens Microscopes  33
   3.4. Microscope Design  35
   3.5. From Simple to Achromatic Microscopes  37
4. Origins of the Cell Theory  39
   4.1. Robert Brown's Key Observations  42
   4.2. A Failure to Understand  44
   4.3. States of Denial  45
   4.4. Aberrations, Real and Irrelevant  49
   4.5. Resolution and the Art of Seeing  51
5. Pioneers of Field Microscopy  58
   5.1. The Polype Shows the Way  64
   5.2. The Dutch Draper's Roots  67
6. The Image of the Simple Microscope  70
   6.1. Analyzing the Image  73
   6.2. Sources of Inspiration  77
Acknowledgments  84
References  85
Gonville & Caius College, University of Cambridge, UK
Advances in Imaging and Electron Physics, Volume 158, ISSN 1076-5670, DOI: 10.1016/S1076-5670(09)00007-X. Copyright © 2009 by Brian J. Ford.

1. INTRODUCTION

Microscopes characterize modern science. It is hard to find any instrument that more immediately symbolizes laboratory research, and a microscope is
featured as a logo for scientific societies the world over. The birth of the microscope is an absorbing study, relating personal enthusiasms, rivalry, opportunism, and technical skill to the demands both of the physics and the technology of microscope manufacture. Experienced microscopists soon discover that the instrument is uniquely amenable to being tweaked. If an image is very slightly out of focus, for example, then phase disparity may render an otherwise invisible structure visible. Precision is not the aim; seeing what you need to see is the overriding consideration, and personal preferences defy the form of definition on which physics is founded.

Microscopy began as a hobby. During the Victorian era, an interest in microscopy was a conventional pastime for large numbers of people; even in the modern age there are some surviving clubs and societies that are specifically aimed at part-time microscopists who regard the topic as an enthusiasm. Only in ornithology and observational astronomy is there a comparable level of encouragement of the amateur investigator.

In appraising the history of any branch of science, we view the topic from a lofty vantage-point, heady in the knowledge that our hindsight gives us the sense of continuity that we seek. That can be a mistake. Innovations that now seem crucial may have been inconsequential, at the time, to the innovator. If we are to gain a fuller understanding of the early development of the light microscope, I believe that we can benefit by telling the tale backwards. We shall start with the modern microscope, and move progressively backwards in time towards the beginning. In this way we can see how today's instrument rests on foundations laid down in response to the practical demands of previous generations of investigators. The microscope can thus be conceived, not just as a means of magnification, but as an instrument of increasing practicality.
The convenient convention of retrospection retreats into a clearer context, and the lens can be seen as just one component in an instrument that has bequeathed to us our concept of what we are, and how our world is comprised. And so we will tell the story in reverse. History related backwards can do much to set each stage of development into context.
2. SETTING THE SCENE

For well over a century physics has driven technological developments in light microscopy. The continuing quest has been for bigger and better pictures, for increasing magnification, pressing back the boundaries of resolution so that ever-finer details can come within the compass of human scrutiny.

Taken by itself, that quest has been a mistake. It has nurtured a reductionism that has led us to unravel so many of the details within living cells, while largely ignoring how cells behave and what they can do. We
have been seduced by molecular biology, which has done surprisingly little to improve our lot; we have been captivated by genetics, even though the hyperbole in which the topic is immersed has not been translated into the practical benefits of which we were assured. Our new need should be for the study of the cell as organism, and the impetus towards cell biology and genetics is turning us away from these crucial topics. We need to observe living cells, and not merely analyze their contents.

The obsession with elaborate instrumentation handicapped scientists who too easily came to regard the limits of resolution as an uncrossable frontier, somewhat like the sound barrier. This is proving not to be the case. We can see objects that are not, in theory, amenable to resolution. It is recognized that dark-ground light microscopy can take the observer below the theoretical limits of resolution (Ford, 1970) and that sub-microscopic luminous objects can be visualized, even if they cannot strictly be resolved (Ford, 1968). Fluorescence microscopy now gives us an opportunity to generate identifiable self-luminous components within cells, and the potentiation of confocal microscopy through stimulated emission depletion and photo-switching microscopy now allows us to attain a resolution better than 100 nm (Punge et al., 2008).

The computer is at the core of these techniques. In confocal laser scanning microscopy, a beam of laser illumination is focused into a minute, diffraction-limited focal spot within a fluorescent specimen. A beam splitter separates the reflected laser light from the fluorescent light emitted by the specimen, and passes the fluorescent light into a photomultiplier detection device so that it can be recorded by a computer.
All the light that does not originate from the focal point is suppressed, and the scanning of the laser beam across the specimen allows an image to be constructed pixel by pixel, and then line by line, so that far greater resolution is obtained than a conventional light microscope can offer. Although the process is complex, there are advantages, since little specimen preparation is necessary and three-dimensional images can be constructed. Apart from the addition of the fluorescent dye, which is in very low concentrations, the technique is essentially noninvasive.

Magnetic resonance force microscopy now gives us resolutions down to ≈4 nm, and this burgeoning discipline has been fittingly set into context by a paper by Sidles (2009), which reminds us of the early ideas of John von Neumann dating from 1948. These techniques can give an improvement in resolution 100 million times better than conventional magnetic resonance imaging (MRI) (Degen, Poggio, Mamin, Rettner, & Rugar, 2009). Other techniques of increasing resolution include near-field scanning optical microscopy, which provides enormously increased resolution by placing the detector very close (closer than one wavelength, λ) to the specimen and scanning it across the surface.
This gives very high spatial and spectral resolution that is related to the dimensions of the detector's aperture, rather than to the wavelength of the illuminating beam. In techniques like these, we are witnessing new thinking brought to microscopy from other areas of science and technology. These revolutionary approaches are demonstrating how optical microscopy can reach far beyond the long-accepted limits of resolution (Hell, 2003) and they are shattering beliefs held for more than a hundred years (Hell & Schönle, 2008).

Of course, these novel instruments are far removed from traditional microscopes. Objectives of increasingly high specifications are still being developed, however—but not for use by microscopists. These ultimate lenses are made up of as many as 14 separate components with meticulously designed aspheric contours. They are used by the major microchip manufacturers to create ultra-sharp images of the details of which the circuitry is comprised. Current chips measure 24 × 32 mm, and today's production processes are aiming to print 50-nm features. To do this, a much-reduced image of the template is projected onto the substrate and the chips are then built up by photolithography. This is microscopy backwards, where the object (the template) is large and the image greatly reduced, and it has led to the development of large-field, diffraction-limited camera systems using extreme ultraviolet. This is the ultimate refinement of objective design, and each lens is rumored to cost as much as $20,000.

How curious it is that the best optical microscopes of our era do not look like microscopes at all, and require a computer to drive them; whereas the highest-specification objective lenses are not used by microscopists. It is now timely for us to retrace our steps back to the age when the design of microscope lenses was scientifically established for the first time.
3. TRADITIONAL LIMITS OF LIGHT MICROSCOPY

It was the pioneering work of Ernst Abbe in the 1870s (Abbe, 1873) that laid the groundwork for our understanding of the limits of optical microscopy (Brocksh, 2005). He demonstrated that diffraction causes a light wave, when focused through an [objective] lens, to form a spot of light. The wavelength of light exerts constraints upon this spot, which must therefore be approximately 200 nm in diameter. The spot exists in three dimensions, however, and not just two; as its diameter is 200 nm, it is some 500 nm in length. It is the construction of an objective lens and the nature of light which determine these dimensions, and it had not been envisaged that such constraints would ever be overthrown.

Matters were dramatically changed when Stefan Hell at Göttingen revisited Abbe's work and reworked the theory. Hell recognized that, no
Did Physics Matter to the Pioneers of Microscopy?
31
matter how large the aperture of a conventional lens, it can capture only a segment of a spherical wavefront from one direction. However, if one could utilize a fully spherical wavefront of solid angle 4π, then the spot could become a small sphere, rather than an elongated micro-pool of light. This 4π microscope (conventionally written as 4Pi) reduces the length of the axis of the light spot and offers a better than fourfold increase in resolution for an optical microscope (Schrader, Hell, & van der Voort, 1998).

Much enthusiasm has focused on electron microscopy. Nobody can doubt the spectacular insight that these instruments offer us, though they are essentially restricted to the examination of cells that are dead. Electron microscopes do not offer us insights into living cells, and it is the sociology, the responses, and the behavior of cells that I believe we are ignoring at our peril. These pose us the timeliest problems. Attempting to elicit what living cells do—using electron micrographs—is as fruitless as trying to deduce the behavior and social structure of hens by looking at a hard-boiled egg. We need to recognize, in an era dominated by molecular biology and genetics, that light microscopy is being too widely ignored. Tomorrow's bioscientists need to become familiar with how cells behave, rather than how they are comprised; and only the light microscope offers us this opportunity.
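Abbe's constraint can be made concrete with his well-known resolution formula, d = λ/(2 NA), where NA is the numerical aperture of the objective. The formula is standard, though not spelled out in the text, and the parameter values below are merely illustrative:

```python
def abbe_limit_nm(wavelength_nm, numerical_aperture):
    """Lateral resolution limit d = lambda / (2 * NA) from Abbe's diffraction theory."""
    return wavelength_nm / (2.0 * numerical_aperture)

# Green light (550 nm) through a high-NA oil-immersion objective (NA = 1.4):
d = abbe_limit_nm(550.0, 1.4)
print(f"Abbe limit: {d:.0f} nm")  # on the order of the 200-nm spot diameter quoted above
```

The axial extent of the focal spot is larger than this lateral figure, which is exactly the asymmetry (200 nm wide, some 500 nm long) that the 4Pi geometry was designed to remove.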
3.1. Questions of Image Quality

Light microscopy cut its teeth in biology, and the clarity of the image has long preoccupied light microscopists. Prior to Abbe's theoretical work on resolution, the greatest improvement in image quality had been the introduction of the achromatic microscope objective, in which spherical aberration was reduced. The pioneering work on achromatism was done for telescopes, rather than microscopes; the inventor was Chester Moor Hall (1703–1771) of London, who recognized that the answer to chromatism lay in the utilization of lenses of disparate refractive indices. In 1729 he found that crown and flint glass gave him the results he sought, and by 1733 he had produced several refracting telescopes with apertures up to 65 mm (Court & von Rohr, 1929). Perversely, when the Royal Society awarded its prestigious Copley Medal for the invention in 1758, it went to John Dollond (1706–1761). Dollond was a silk-weaver and an autodidact who went on to produce beautifully engineered single-lens microscopes, and his work on achromatism had been done independently of Hall, his predecessor and the true innovator.

An achromatic microscope objective with a focal length of 25 mm was manufactured as early as 1807 by Harmanus van Deijl of Amsterdam (Lovell, 1967), but much of the impetus came from Joseph Jackson Lister (1786–1869) (Hodgkin & Lister, 1827), whose son Joseph Lister (1827–1912),
32
Brian J. Ford
went on to introduce aseptic practices to British hospitals. The 1827 paper by Goring (Goring, 1827) established the groundwork for this crucial aspect of microscopy, to which Lister returned with a definitive paper in 1830 (Lister, 1830). Lister (the elder) not only foresaw that combinations of lenses of differing refractive indices could minimize chromatic aberration, but further showed that spherical aberration could be minimized by the correct separation of the components of a compound lens.

Such was the state of the art that parallel developments were under way in France, where Vincent Chevalier (1771–1841) had been experimenting with the manufacture of achromatic doublets in 1824 (Hughes, 1855; Nuttall, 1971). His son, Charles Chevalier (1804–1859), continued work on the development of the light microscope after his father's demise. This assault on the problem of achromatism was prolonged. One early experimenter with lenses of disparate refractive indices was Giovanni Battista Amici (1786–1863), who attempted to produce an achromatic system in 1827, concluded that the problem was insoluble, and concentrated instead on reflecting microscopes for the following 20 years (Optical Microscopy Division, 2008) before returning to refracting instruments.

Progress in England continued apace, however, and by the middle of the nineteenth century achromatic objective lenses were becoming widely accepted. However, many of the fundamental discoveries had been made before these fine lenses were available. Cells and nuclei, fungi and bacteria, crystals and pollen grains had all been studied with simpler microscopes—many of them with optics that an optical theorist would regard as not up to the job. Most high-power observations prior to the mid-nineteenth century were made with single lenses little bigger than the head of a pin.
3.2. Foundation of Optical Physics

Theoretical optics is a branch of physics that has a more ancient lineage than you might expect. We are familiar with Snell's law, which relates the path of a beam of light to the refractive effects of passing from one medium to another. Refraction is due to the change in velocity that light undergoes when it passes from an optically less-dense medium (like air) to one of greater density (such as glass). The law was coined in 1621 by Willebrord Snel van Royen (1580–1626) of Leiden, who became known as Snellius. He determined the direction of light rays through refractive media with varying indices of refraction:

η1 sin θ1 = η2 sin θ2,

where η is the refractive index.
The ray path is delineated in the following diagram: a ray traveling in air (index η1) strikes the air-glass interface at angle θ1 to the normal and is refracted into the glass (index η2) at angle θ2.
The mathematical expression is attributed to Snell, though his own spelling of his family name was Snel, and his theories were not published in his lifetime. In France, Anglophone physicists are rarely surprised to learn, it is known as Descartes’ law. Yet the refraction of light by glass lenses had been similarly explored centuries earlier, for the first description of the principle known to us was in the year 984, when it was published by the Persian philosopher, Ibn Sahl of Baghdad (Figure 1) in his celebrated manuscript “On Burning Mirrors and Lenses” (Wolf, 1995).
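As a quick numerical check of Snell's law (a minimal sketch; the function name and the refractive indices chosen are illustrative):

```python
import math

def refraction_angle(theta1_deg, n1=1.0, n2=1.5):
    """Solve Snell's law, n1*sin(theta1) = n2*sin(theta2), for theta2 in degrees."""
    s = n1 * math.sin(math.radians(theta1_deg)) / n2
    if abs(s) > 1.0:
        # going from dense to rare medium beyond the critical angle
        raise ValueError("total internal reflection: no refracted ray exists")
    return math.degrees(math.asin(s))

# A ray entering glass (n = 1.5) from air (n = 1.0) at 30 degrees bends
# toward the normal:
print(f"{refraction_angle(30.0):.1f} degrees")  # -> 19.5 degrees
```

The same function, called with the indices reversed, reproduces total internal reflection beyond the critical angle, which is the regime exploited in Ibn Sahl's burning lenses and in modern fiber optics alike.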
3.3. Single-Lens Microscopes

It has been widely accepted that achromatic lenses were crucial in the furtherance of serious microscopical observations. Single-lens microscopes, known in the trade as simple microscopes, have been condemned as too primitive for serious investigation. The images produced have been widely dismissed as being “indistinct and often surrounded by color fringes” (Bradbury, 1968). Yet there is precious little evidence on which we can base such judgments; none of those who make allegations of low image quality have used the microscopes. Images of objects seen through these early instruments are missing from all the major textbooks and are rarely found even in the scientific literature. A search through Google images for “simple microscope” micrograph or “single lens microscope” micrograph produces only about 50 images from the whole world, all but a few of them irrelevant. We are faced with an extraordinary proposition: the results of an entire branch of investigation—single-lens microscopy—have been condemned without a scrap of empirical evidence on which to base the conclusion.

Pioneering users of compound microscopes quickly discovered that lenses amplify aberrations more than they magnify images. Lenses in a coaxial array can produce a heavily chromatic image. A single lens, however, is less prone to problems. It is plainly true that a single spherical surface
FIGURE 1 Laws of Refraction from the ancient world. The foundations of optical physics were laid down more than 1000 years ago. The Arabian mathematician and philosopher, Abu Sa’d al-’Ala’ ibn Sahl, popularly known as Ibn Sahl (c. 940–1000), published on the laws of refraction in Baghdad in 984, even earlier than the better-known Alhazen of Basra. In Ibn Sahl’s manuscript, On Burning Mirrors and Lenses, the internal hypotenuse of the right-angled triangle shows the path of the incident ray, while the outer hypotenuse shows an extension of the path of the refracted ray if the incident ray intersects a crystal whose face is vertical at the point where the two intersect. The calculations by [Rashed, R (1950) Isis 81 464–491] show that the ratio of the length of the smaller hypotenuse to the larger is the reciprocal of the refractive index of the crystal.
lens must—by definition—produce an image that shows both chromatic and spherical aberration; but what matters more is the practical fact of whether the aberrations interfere with the observations. As a rule, the answer is no. Simple microscopes are small and inexpensive and can produce images that compare surprisingly well with those obtained with presentday instruments. Many of today’s microscopists do not know how best to use their instruments, and the worst examples of chromatic aberration I have ever encountered all come from generously funded and well-equipped modern research laboratories. Simple microscopes from several centuries ago can perform better than a modern instrument in unskilled hands.
Did Physics Matter to the Pioneers of Microscopy?
35
Simple microscopes were valued long after the achromatic compound microscope had become available. Charles Darwin (1809–1882) was an enthusiast for these uncomplicated instruments and, even in 1848, he was still recommending the use of a pocket simple microscope. In a letter to Richard Owen in March that year he wrote: “I daresay what I am going to write will be absolutely superfluous, but I have derived such infinitely great advantage from my new simple microscope in comparison with the one, which I used on board the Beagle & which was recommended to me by R. Brown, that I cannot forego the mere chance of advantage of urging this on you”.1 At about that time, the London firm of instrument makers Smith and Beck were advertising “Darwin’s Simple Microscope” at £10. Yet brass compound microscopes with achromatic lenses were already becoming widely available at the time.
3.4. Microscope Design

The practical advantages of the single-lens, simple microscopes were evident. They were inexpensive. Almost anyone could use them. There was little to keep clean, and maintenance was a relatively easy affair. They were small, and the hardwood box into which they packed away could fit easily into a coat pocket. The physics of magnification, however, remained a closed book. The topic had advanced relatively little since the time of Snell and Descartes (p. 33). The quality of a lens was a subtle blend of chance and guesswork, yet the optical results these lenses gave were more than sufficient for routine microscopical observations, and the effects of aberrations were slight. The principal disadvantage was that the simple microscopes were small and uncomfortable to use. The observer, having to set up the instrument on a desk, had to bend low to make observations. This was the main difficulty with the early simple microscopes—a lack of anthropometrics. Put simply, they did not comfortably fit the user. This anthropometric principle underpinned the design of the grand bench microscopes—not only the great brass instruments of the Victorian era, but also the microscopes of the mid-seventeenth century used by pioneers like Robert Hooke (1635–1703) (Figure 2). Not all microscopes have the same optical tube length. The English standard ratified by the Royal Microscopical Society was set at 10 inches (∼250 mm), while Continental manufacturers opted for 160 mm. Lens systems in compound microscopes are typically corrected for the tube length specified by the instrument maker. Yet, no matter what the tube length, compound microscopes are all roughly the same size, standing some 400 mm tall. The reason lies in the way we sit, the dimensions of our furniture, and the proportions of the human body. The height of 400 mm fills in the
1 Darwin C (1848) letter #1166 to Richard Owen dated 26 March, Darwin Project at Cambridge University.
36
Brian J. Ford
FIGURE 2 Robert Hooke’s compound microscope manufactured in London. Christopher Cock was the London instrument maker who produced the microscope that Robert Hooke illustrated in Micrographia. With a body made of turned wood, covered with tooled leather, this instrument was fitted with three lenses. Hooke states in his book that he removed one of them and used just two lenses in his microscope for high-power work. A sagittal section to show the concept is published here as Figure 4. Yet this is disingenuous—as we shall see, the fine details he published could not thus be observed. Hooke was clearly enamored of his “great microscope,” as he called it, and simply declined to admit that he had made his crucial observations with nothing more than a tiny hand-made lens. Practical experiment, rather than theoretical physics, has substantiated the answer.
space between a comfortable bench top and the level of the observer’s eyes. In truth, the design of bench microscopes is founded on anthropometric principles and not those of physics. Today’s bench microscope has clear and recognizable components: a vertical main body and a horizontal limb that holds the lens assembly firmly in position. Many early microscopes lacked this kind of solidity. Some, like the instrument designed by William Withering (1741–1799), were made of wood. Others, like the screw-barrel microscope perfected by James Wilson (1665–1730), took the form of a cylinder into which the object could be inserted through slots on either side (Figure 3). Others had the body and the lenses mounted on a tripod, a design popularized by the younger Edmund Culpeper (1670–1738). These were all fine designs in their way and have been comprehensively discussed elsewhere (Clay & Court, 1932), but none of them can be viewed as ancestral to today’s microscope.
FIGURE 3 The Wilson screw-barrel microscope of the eighteenth century. Small pocket microscopes of this sort were popularized by James Wilson. [This woodcut is from Disney, A. N., Hill, C. F., and Watson Baker, W. E. (1928). “Origin and Development of the Microscope.” Royal Microscopical Society, London, p. 174.] The lens and the specimen holders were fitted into a cylinder of metal. The sliders were held between springs, and the lens holder could be rotated to bring the object into focus. This design was mentioned by earlier workers, including Bonanni (1681), Campani (1686), and Hartsoeker (1694), who refined the focusing system. Wilson’s design became the best known, and the microscope was at its height of popularity in 1740, after Wilson’s demise. The compressing effects of the specimen holder made it impossible to examine delicate living organisms, like Hydra, and this gave Ellis the impetus to design a microscope with an open stage (see Figures 20 and 22).
3.5. From Simple to Achromatic Microscopes

The simple microscope used by Charles Darwin and his contemporaries, however, does point the way. It boasted a strong, vertical supporting limb mounted on a firm base and, at right angles to it, an arm that held the magnifiers. In terms of design, this was the forebear of the optical microscopes we see today. Charles Darwin’s microscopes were of several types (Burnett & Martin, 1992), and his later instruments made the leap from the simple, portable design to the heavy, elaborate, achromatic microscope that was to become so popular in the second half of the nineteenth century. Among the other workers whose active careers spanned the time when the simple microscope was being superseded by compound microscopes were George Bentham (1800–1884) and Sir Joseph Dalton Hooker (1817–1911). We have seen that Darwin recommended a simple microscope as late as 1848, and even in 1860 he was writing to Daniel Oliver saying:

Put a [growing Drosera rotundifolia] leaf under a simple microscope and observe whether the sensitive hairs are uniformly covered.2

2 Darwin C (1860) letter #2949 to Daniel Oliver dated 14 October, Darwin Project at Cambridge University.
FIGURE 4 Bancks’ design of simple microscope. The father and son firm of Bancks of London manufactured fine single-lens (=simple) botanical microscopes in the 1820s used by such luminaries as Robert Brown, Charles Darwin, and William Jackson Hooker. The design above was the microscope of George Bentham, a leading exponent of nineteenth-century systematic botany. It is made from brass with ground soda-glass lenses magnifying up to 170× and is stored within the mahogany box that serves as a stand when the instrument is in use. Noteworthy are the two-sided mirror (one side plane, the other concave) and the substage condenser lens. The image quality of these uncorrected microscopes is higher than the accounts in the standard textbooks would allow one to anticipate (see Figure 21).
And, even later, in 1864 he wrote to Asa Gray as follows (Darwin, 1864):

I will enclose some specimens, and if you think it worthwhile, you can put them under the simple microscope.

In his celebrated Outlines of Botany, Bentham (Figure 4) wrote (Bentham, 1877):

At home it is more convenient to have a mounted lens or simple microscope, with a stage holding a glass plate, upon which the flowers may be laid; and a pair of dissectors, one of which should be narrow and pointed, or a mere point, like a thick needle, in a handle; the other should have a pointed blade, with a sharp edge, to make clean sections across the ovary. A compound microscope is rarely necessary, except in cryptogamic botany and vegetable anatomy. For the simple microscope, lenses of 1/4, 1/2, 1 and 1 1/2 inches focus are sufficient.

For most scientists it is a revelation to recognize that the single-lens microscope remained so popular long after the era of achromatic microscopes had already dawned. It is certainly true that our current view of the microscopical nature of life was derived from the endeavors of pioneering microscopists in the chromatic era, just as the consolidation of what we might call the received view of the understanding of life was clearly due to achromatic, compound microscopes, the design of which was more firmly founded on the principles of optical physics.
4. ORIGINS OF THE CELL THEORY

Cells were revealed by a simple microscope, and the ubiquity of cell division was established by the Polish microscopist Robert Remak (1815–1865) in Berlin in 1841 (Remak, 1852, 1855). This crucial phenomenon was further investigated by many others in the 1850s. Remak made serious strides in our understanding of the role of the cell. Karl Ernst von Baer had held that there were four germ layers in the embryo, for example, and it was Remak who recognized that there were just three: ectoderm, mesoderm, and endoderm. Already, by the middle of the nineteenth century, the scientific understanding of how life works was entering the era we would recognize today. Globular theories, precursors of the cell theory, were popular at the beginning of the nineteenth century and suggested that living matter was ultimately composed of small globules. The diffraction effects of observing small structures with a restricted cone of illumination cause particulates to take on a “globular” appearance, and the fringes that can appear when a single lens is used slightly out of focus can convey a similar impression. Many workers, as the understanding of microscopes improved, accurately described various cell types and structures (including the nucleus and the cytoplasmic streaming observed by Brown), but the idea that cells were the universal units is associated with Schleiden in 1838 and Schwann in 1839 (Figure 5). This work, however, followed on the heels of major discoveries made with simple microscopes, discoveries that led Rudolf Virchow, in 1855, to exclaim: “All living cells arise from pre-existing cells” (omnis cellula e cellula), which became known as the biogenic law. The two colleagues, Matthias Jacob Schleiden (1804–1881) and Theodor Schwann (1810–1882), laid down the basis of the cell theory, recognizing that living organisms are essentially composed of cells (Figure 6).
The view was first propounded in 1838 by Matthias Schleiden, whose studies with simple microscopes led him to the revelation that plants are made up of cells (Schleiden, 1838). Legend has it that Schleiden was drinking coffee after dinner with Theodor Schwann when the two microscopists realized that their findings in plant and animal microscopy had much in common. They adjourned to Schwann’s laboratory so that Schleiden could observe the structures Schwann had described, and from this the cell theory emerged.
FIGURE 5 Cellular structure depicted by Schwann. Schwann published these diagrams of cell structure in 1839. They belie the relatively unsophisticated microscopes then available. Theodor Schwann (1810–1882) was a Professor at the Universities of Louvain and Liège. His coinage of the cell theory, jointly with Schleiden, brought him Fellowship of the Royal Society and Membership of the French Academy of Science. He received the Copley Medal in 1845. These diagrams of cells appear in his book, “Mikroskopische Untersuchungen über die Übereinstimmung in der Struktur und dem Wachsthum der Thiere und Pflanzen” [(1839) Berlin: Sanderschen Buchhandlung (G. E. Reimer)]. The first section of the book is devoted to the structure and development of the spinal cord and contains an exposition that cells are the basis of all animal tissues. The second part contains sections on the ovum, the cellular structure of other tissues, and a defense of the cell theory.
The next year, Theodor Schwann published his view that all animals, too, are composed of cells (Matile, 1998), though he mistakenly thought that cells could form de novo. Schleiden had an extraordinary introduction to the life sciences (Nordenskiöld, 1928), for he began adult life as a barrister. Born in Hamburg, the son of a doctor, he studied jurisprudence and became a Doctor of Law. He was not good at it, and became known as an unsuccessful advocate, which depressed him to the extent that he decided to end his melancholy by shooting himself in the head. He was no better at suicide than at practicing law, for he survived the attempt and was left with a wound to the forehead that soon healed. He turned to the natural sciences, gained doctorates in both medicine and philosophy, and ended his career as Professor of Botany at Jena.
FIGURE 6 The microscope and the cell theory. Clear drawings of cells appear in the book, “Microscopical Researches into the Accordance in the Structure and Growth of Animals and Plants” [(1843) London: Sydenham Society, jointly written by Theodor Schwann and Matthias Schleiden]. The tissues depicted in Figures 1–10 here are from the tropical palm, Chamaedorea. Figure 1 is captioned “cellular tissue from the embryo sac.” Many of the diagrams are remarkably accurate. Note, for example, Figure 16—“the embryonal end of the pollen tube from the ovulum of Orchis morio” (the green-winged orchid)—which is typical of the high standards of observational accuracy in this book. Schwann’s understanding of the origin of cells was very wide of the mark, however, for they were envisaged as arising de novo, somewhat like a process of crystallization. Thus Figures 4 and 5 are incorrectly claimed to show a cytoblast “with the cell forming upon it.”
Although the cell theory (Wolpert, 1995) is synonymous with the names of Schleiden and Schwann (Schwann, 1839),3 these two investigators were not the first to light upon the essential concepts. Ludolph Christian Treviranus (1779–1864) published on the internal structure of vascular plants (Treviranus, 1806) and clearly recognized that the tissues were divided into discrete cells. He was closely followed by Johann Jacob Paul Moldenhawer (1766–1827), who held the position of außerordentlicher Professor für Botanik und Obstbau (Extraordinary Professor of Botany and Fruit Trees) at Kiel, and whose great work revealed much of the microscopic nature of vascular tissues in plants (Moldenhawer, 1812). Moldenhawer demonstrated the true nature of stomata (surrounded by two guard cells, rather than being a single cell with an opening) and of parenchyma. He was also the first investigator to recognize the meaning of annual rings in the trunks of trees. All his work was done with simple microscopes, the physics of which was at the time an unfathomed science. The simple microscope was used by the French autodidact Félix Dujardin (1801–1860) in his studies of living cells, leading him in 1835 to recognize in all such cells what became known as protoplasm (Matile, 1998). This was not the first observation of cytoplasm in action. Fine cytoplasmic strands within living cells had already been identified and studied by the Scottish physician and botanist Robert Brown (1773–1858) in his studies of the Virginia spiderwort (Tradescantia virginiana) in 1828 (Figure 7). While Brown was examining the flowering structures of this attractive plant with his simple microscope, he studied the large purple cells that make up the staminal hairs that are a feature of each flower. Running across the vacuole that comprises the bulk of each cell, Brown perceived fine strands of cytoplasm in which the flow of the semi-viscous fluid could clearly be seen.
Brown spent much time observing this fascinating phenomenon, though one cannot tell what he made of it. He teased Charles Darwin about it. When Brown first showed the phenomenon to an astonished Darwin, Brown would only describe the phenomenon as “my little secret!” (Ford, 1992a). Cytoplasmic streaming is now known to be widespread in cells and is a principal means of translocating nutriments and waste components within living cells (Shimmen & Yokota, 2004).
3 This early writing is discussed in a 2007 article by Pierre Clément, “Introducing the cell concept by both animal and plant cells: a historical and didactic approach,” published in Science and Education, 16, 423–440.

FIGURE 7 Dark-ground microscopy in the Georgian era. One form of image intensification available to the early microscopists was dark-ground microscopy. If the illuminating beam is set off-axis, it is possible to illuminate the specimen brightly while the background of the field of view remains dark. The technique is often referred to, inaccurately, as “darkfield” (but it is the background that is dark, not the field). The method is infrequently used by modern microscopists but offers high-contrast images of tenuous or nearly invisible structures. Here we see the staminal hairs of Tradescantia virginiana using the No. 3 lens of Robert Brown’s microscope. These structures were first studied by Robert Brown in 1828. One large rectangular cell (its rounded nucleus clearly visible near the center) can be seen across the top center of this image. (See Color Insert.)

4.1. Robert Brown’s Key Observations

Robert Brown also used his simple microscope to elicit the occurrence of the nucleus within living cells. At the time, he was studying orchid tissues and noted what he described as a circular “areola” within each cell, adding that he recalled having observed the same structure within other plant tissues that he had studied. Perhaps, Brown mused, this structure might better be
called a “nucleus” (King, 1827). It was this revelation in 1827 that gave us a concept of fundamental importance in the modern era of the biosciences. Brown is best remembered for his descriptions of Brownian motion, a concept of importance in theoretical physics. This is the ceaseless random movement of minute particles suspended in a fluid, related to the molecular bombardment that the particles experience in the suspending medium. Although he made his observations in 1827, they have since been incorrectly recorded in the annals of science. Typical of these summaries is this entry from the Einstein Year website (Institute of Physics, 2009):

In 1827 the biologist Robert Brown noticed that if you looked at pollen grains in water through a microscope, the pollen jiggles about. He called this jiggling “Brownian motion”, but Brown could not work out what was causing it.

Even in such a short account there are two mistakes, one minor, the other an error of fundamental physics. Of minor importance is the fact that Brown gave the phenomenon no such name, which would have been immodest in the extreme. He spoke of “active molecules”. It was not until the topic was given extensive historical analysis, following the explanations of Albert Einstein (1879–1955), that it was resolved to name the phenomenon after Brown. The erroneous origin of the term by which we describe this phenomenon pales into insignificance alongside the far more serious misunderstanding of the nature of the phenomenon. Most reference sources claim that Brown observed the ceaseless movement of Clarkia pulchella (pinkfairies) pollen grains, but this is not the case. Brown saw no such thing. The physics of Brownian motion applies to particles orders of magnitude smaller. As Brown himself makes plain in his account, he was observing the movement of minute particles within the pollen grains (Brown, 1827).4 The best-known early examination of Brownian motion was by the distinguished physicist Jean Perrin (1870–1942), who published his paper on the topic as a book in 1910 (see Perrin, 1909).
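Einstein’s analysis also shows, quantitatively, why the distinction between the grains and the particles within them matters. The following back-of-envelope sketch uses the Stokes–Einstein relation; the particle radii are assumed round figures for illustration, with water at about 20 °C taken as the medium:

```python
import math

def rms_displacement_um(radius_um, t_s=1.0, T=293.0, eta=1.0e-3):
    """Per-axis RMS Brownian displacement, in microns, of a sphere in water.

    Uses the Stokes-Einstein diffusion coefficient D = kT / (6*pi*eta*r)
    and Einstein's result <x^2> = 2*D*t.
    """
    k_B = 1.380649e-23                        # Boltzmann constant, J/K
    r_m = radius_um * 1e-6                    # radius in metres
    D = k_B * T / (6 * math.pi * eta * r_m)   # diffusion coefficient, m^2/s
    return math.sqrt(2 * D * t_s) * 1e6       # metres -> microns

# An intracellular granule of the kind Brown watched (assumed r = 0.5 um):
print(round(rms_displacement_um(0.5), 2))   # 0.93 um per second: a visible jiggle
# A whole pollen grain (assumed r = 15 um):
print(round(rms_displacement_um(15), 2))    # 0.17 um per second: barely perceptible
```

On these assumed figures, a thirty-fold increase in radius cuts the excursion by a factor of roughly √30 ≈ 5.5, which is why the internal granules dance conspicuously while the motion of whole grains sits at the edge of visibility.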
4.2. A Failure to Understand

It is deceptively easy to trace back this train of events and see the emergence of a logical sequence. Science rarely works like that. Much of Brown’s work has been widely misrepresented and denigrated over the past 100 years, and the insistence that he was observing the motion of pollen grains, rather than particles within them, is just one example of many. The type of microscope that Robert Brown used belongs to the category known as botanical microscopes, designed for the new breed of botanists who were exploring the newly discovered territories and penetrating their microscopical structure through the power of the lens. Because his instrument bore just a single lens, this fact was used to downplay his research and throw doubt upon his methods. We can see this prejudice in 1922, when the Linnean Society of London was presented with a microscope of Robert Brown’s. It came with a neatly handwritten letter that read as follows:

Amberley, Reigate
Jan 19 1922

Dear Sir,
By the kindness of Mr Salmon, I have much pleasure in offering Mr Brown’s microscope to the Linnean Society if they care to accept it. Its credentials are in the box with it. At the sale of Mr Bell of Selbourne’s effects, it was bought by my father & so its history since the original owner is accounted for.
Yours faithfully
Ida M. Silver (Miss)

4 After a private printing in 1827, this was published in 1828 as “A brief account of microscopical observations, and on the general existence of active molecules in organic and inorganic bodies” in The Philosophical Magazine 4, 161–173; and in 1829 as “Additional remarks on active molecules” in The Philosophical Magazine 6, 161–166.
This was a microscope of great historical importance. It had been bequeathed by John Bennett (who had been Brown’s assistant from 1827 until Brown’s death in 1858) to Thomas Bell, surgeon and naturalist, who served as President of the Linnean Society from 1853 to 1861. Although Bell published much pioneering work in dentistry, he was an expert amateur naturalist and published a History of British Quadrupeds. He died at Selborne in 1880, and the little microscope was purchased during the house sale, whence it passed to Miss Silver (Ford, 1985). At that time, the Linnean Society was planning to celebrate the centenary of the first paper on the cell nucleus, and one would imagine that the timely arrival of the microscope would have been greeted with enthusiasm. It was not to be. Doubt was expressed as to whether Brown could ever have resolved so small a structure as a nucleus with such an unsophisticated instrument (see Report, 1932). When the microscope was examined by experts, it was dismissed as “surprisingly simple, being little more than a dissecting-microscope.” The instrument remained in the Society’s possession as a neglected curiosity. In 1951 the organizers of the Festival of Britain approached the Linnean Society and invited them to put the historic microscope on display. The request was refused: the microscope, it was felt, was not worth exhibiting. In the early 1970s the Honorary Secretary of the Society, Mr. T. O’Grady, and one of the Fellows, W. A. S. Burnett, examined the microscope, and Burnett took a photograph of it. One of the few microscopists who examined it was Professor Irene Manton, and her technicians even managed to take a photograph of onion epidermis through the microscope. Manton, in her George Bidder Lecture at Leeds University in 1974, stated that “its condition... is not very good since minor repairs are needed”.
She was certainly right, for when I took on the challenge of restoring the microscope it was dirty, bent, and neglected; it was even wrongly assembled with parts jammed together. Reassembly of the microscope was a challenging procedure (Ford, 1982, 1984) and it allowed me to see how the instrument had been constructed. Once the superficial dirt was removed, one could once again perceive a discolored area of wear, where Brown’s forefinger had rubbed continually against the body pillar as he focused the instrument. Meticulous cleaning of the lenses restored them to their original condition, and photomicrography confirmed just how much single lenses can reveal. The detail revealed by the lenses from the Brown microscope proves to be sufficient for normal light microscopy, and the ease with which it packs away into its hardwood box makes it into an ingenious and compact instrument.
4.3. States of Denial

Even the most basic application of optical physics confirms that single lenses could theoretically resolve bacteria. Yet each generation has been dominated by detractors who claim that the simple microscope was not, and could never have been, up to the task. There is a continuing unwillingness to accept that our predecessors, with their unsophisticated instruments and limited understanding, could in any sense rival our grand and enlightened era of contemporary science. We have seen that the Brown microscope was dismissed as a mere dissecting instrument when it was presented to the Linnean Society. There should have been no reason to doubt—even if nobody then knew how to use it properly—that some degree of cell structure would have been visible with it. The staminal hairs of the spiderwort Tradescantia virginiana are visible even to the naked eye, and there can be no doubt that a modest microscope would have reasonable value as an instrument of scrutiny. It proved still more difficult for today’s scientists to accept that Brown could have witnessed the movement that now bears his name. The particles witnessed by the microscopist in Brownian motion are mere microns in diameter, and resolving them with a home-made lens is a considerable achievement. In a short publication for the American Physical Society in 1991, Daniel Deutsch of Pasadena, California, argued that Brown’s microscope could not have been up to the task. The title of Deutsch’s submission was admirably descriptive, sufficiently so as to obviate the need to read further; it was “Did Robert Brown observe Brownian motion: probably not” (Deutsch, 1991). The timing was good, as I was booked to give my annual lecture to the Inter/Micro conference organized by the McCrone Research Institute in Chicago, Illinois. I had recently obtained video recordings of the phenomenon of Brownian motion viewed through Brown’s original microscope from the Linnean Society, and rushed to produce an illustrated presentation that would put Deutsch, and the other detractors, to rights. The demonstration was given in Chicago and provoked much international interest.
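Whether a single lens could in principle resolve structures on this scale is a short calculation against the Rayleigh criterion. A sketch under stated assumptions (the numerical aperture is a plausible guess for a good short-focus single lens, not a measured value from Brown’s instrument):

```python
def rayleigh_limit_um(wavelength_nm, numerical_aperture):
    """Smallest resolvable separation d = 0.61 * lambda / NA (Rayleigh criterion),
    returned in microns for a wavelength given in nanometres."""
    return 0.61 * (wavelength_nm * 1e-3) / numerical_aperture

# Green light (550 nm) and an assumed NA of 0.4:
print(round(rayleigh_limit_um(550, 0.4), 2))   # 0.84 um
```

On these assumed figures the limit sits comfortably below the 1–5 µm size of bacteria, and close to the micron scale of the particles seen in Brownian motion, which is consistent with what the restored lenses actually show.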
A paper based on the lecture was published that same year (Ford, 1992b), and another, for the Institute of Biology in London, appeared shortly afterwards (Ford, 1992c). As interest in the topic continued to grow, the controversy was discussed in New Scientist magazine (Bown, 1992) and elsewhere. It was intriguing to see this attempt to deny Brown the right to his discovery published in an American physics journal. It underlines a common problem in experimental microscopy. The individuals who produced the simple microscopes used by those pioneers were not burdened by the constraints of theoretical physics, nor were they inhibited by foreknowledge of the limits of resolution. They worked by trial and error, and the users of these microscopes managed to visualize details of a surprisingly diminutive nature, no matter what the theories might have implied. Confirming that Brown could indeed have observed Brownian motion (Ford, 1992b) was a highly gratifying result. What made it better was
FIGURE 8 The stage of the simple microscope. The circular stage of the Bentham microscope was well adapted to the examination of preparations mounted in ivory sliders (the precursor of the glass slide of the modern era). Dried plant sections were mounted between mica disks and retained with steel circlips. Around the stage is an engraving stating that the supplier made scientific instruments, by appointment, to the Royal household. Here a low-power lens is in use (magnification 19.6×). The lens mounts were simply unscrewed and replaced as required. Note the focusing of the illuminating light beam by means of the substage condenser lens.
that it seemed to fly in the face of a detached scientific appraisal (Raspail, 1830). Chevalier (Hughes, 1855; Nuttall, 1971) made a microscope used by Brown, but those I have personally inspected were made by Robert Bancks & Son in London (Figure 4). This father and son firm also made the instruments used by Hooker, Bentham, Darwin and others (Figures 4, 8 and 9). Their microscopes were supplied with a selection of lenses, the mounts for which ¨ included one or more Lieberkuhn reflectors. These are concave, silvered downward-pointing mirrors that focused a pool of light on the specimen and provided the view that we would now obtain with a metallurgical (incident illumination) instrument. This form of illumination provided remarkably clear views of the plant structures for which these “botanical microscopes” were primarily designed. The finest designs produced by Bancks (Figure 10) included a movable substage condenser and a concentric fine-focusing adjustment; this is the type that came to be favored for botanical research by Charles Darwin. Pioneers in emergent disciplines other than botany used this type of instrument. In France, for example, where Chevalier was producing botanical microscopes, the new science of chemical microscopy was being painstakingly established by Franc¸ois-Vincent Raspail (1794–1878). He ordered a microscope, based on these British designs, from the Parisian instrument maker Louis Joseph Deleuil (1795–1862) and used it to found the
48
Brian J. Ford
FIGURE 9 Robert Brown’s microscope and the plant cell. Brown first observed the nucleus within cells taken from orchid tissues. When in 1922 his microscope was returned to the Linnean Society (of which Brown had been president) it was dismissed as of little interest—and certainly incapable of providing the images on which Brown had based his conclusions. Indeed, when the 1951 Festival of Britain organizers asked to exhibit it as an example of British scientific achievement, the proposal was declined. Here we reprise his observations with epidermis from the orchid Cymbidium. Three stomata and about twelve epidermal cells are clearly seen, the nuclei being conspicuously visible within each cell with the No. 2 lens magnifying 75×. (See Color Insert.)
new science of microchemistry (Ford, 1996). Although we remember Raspail for this contribution to science, it is also noteworthy that he stood in the French elections for the position of president of the new republic. He gained 36,900 votes (Napoleon III won with 5,434,226). Each of the microscopes used by these pioneering investigators was fashioned from brass, with focusing mechanisms that were either made from steel or took the form of sliding the main supporting pillar through a tight collar of cork. As their experience deepened, the instrument makers went on to produce a more advanced design: Bancks in London introduced a microscope in which the coarse-focusing control was a rack-and-pinion mechanism, situated at the rear of the body pillar, with an excellent fine-focusing adjustment mounted concentrically near the base. Robert Brown had one of these instruments, and to this day it is a pleasure to use. Not all the microscopes were as elegant or useful. Perhaps the most unsuccessful—at least in use—was the botanical microscope designed about 1790 by William Withering (1741–1799). Withering’s most important contribution to science was his recognition and promulgation of a traditional herbal remedy that has since become widely adopted: digitalin, now extracted from the foxglove (Digitalis lutea but, in Withering’s time, from the common woodland species, D. purpurea). The Withering microscope folded down into a box with a sliding lid, and the act of opening the box drew
Did Physics Matter to the Pioneers of Microscopy?
49
FIGURE 10 Peak of perfection—Bancks’ advanced design for a simple microscope. Careful manufacture and the incorporation of experience derived during the production of previous models gave Bancks the ability to tune their designs as time went by. Their typical microscope stood some 200 mm tall, and here we see both coarse- and fine-focusing adjustments (situated to the rear of the body pillar, and set into its base, respectively). There is a racking control for the lens arm (top) and the position of the substage lens can be fixed by means of a knurled knob (center). Robert Brown clearly preferred to incline his microscope, showing that it was used for botanical purposes—an aquatic specimen would spill from its watchglass—though most Bancks microscopes did not have this feature. This example was made around 1828.
the instrument upright on a hinge. Ingenious it certainly was; but it would be hard to claim that this was a well-made and functional instrument. As botanical microscopes go, this was one of the worst.
4.4. Aberrations, Real and Irrelevant
The physics of the lenses imposes constraints on performance, and manufacturing techniques add further limitations. The lenses made by Bancks are diminutive and were ground from soda glass. All show minor imperfections. They are mounted in turned brass holders, and held in place by a screw-in stop that restricts the lens aperture. This reduces spherical aberration, but imposes limits on the transmission of light. Trial and error allowed the microscope maker to find the optimum result, with an image which—even if it was somewhat dimmer than it
might be—had significantly greater clarity. Chromatic aberration could not be addressed by any of the lens production methods available at the time, for this required the use of a convex soda glass lens coupled with a concave lens made of flint glass, which has a higher refractive index. Until the development of the solution to this most recalcitrant problem, chromatic aberration would remain a bugbear to all microscopists, and a stimulus to optical experimenters. Until the era of corrected lenses, aberrations would remain: chromatic and spherical aberration, coma, focal plane curvature, distortion and astigmatism.
• Chromatic aberration The greater the wavelength, the less the light is refracted; thus there is inevitable chromatic aberration. Its severity is a function of the dispersion of the mineral from which the lens is made, that is to say, the extent to which the refraction of light at the lens/air interface varies with wavelength. Low-dispersion minerals (like spinel) produce images in which chromatic aberration is reduced. With a single refracting lens, this problem can never be entirely overcome.
• Spherical aberration With a lens of spherical curvature, light refracted through the periphery is brought to a focal point closer to the lens than light passing through the lens nearer the center. The resulting image thus suffers from spherical aberration: if the image is critically focused at the center, it will be marginally out of focus farther from the lens axis. The production of a lens with an aspheric contour can theoretically address this issue.
• Coma This form of aberration results from light approaching the lens at an angle to its axis, producing a focal pool of light that is teardrop shaped rather than circular. The problem is of particular importance to astronomers, for whom sharp images of distant luminous bodies are crucial. In conventional light microscopy, where the field of view is generally illuminated (not dark), coma matters less.
• Focal plane curvature We tend to assume that an object and the image that it generates lie in flat planes normal to the axis of the lens. In fact, for a simple lens the surface of sharp focus is not flat: points at the edge of the field lie farther from the center of the lens than points on the axis, and since focal distance is measured from the lens, the surface of best focus is curved, with the distance from the center of the lens constant across the entire field.
• Distortion Forcing an image that is best viewed in a curved plane into a flat focal field can result in image distortion. When a square object is imaged, the straight sides of the object may appear to curve slightly inwards, producing what is known as pin-cushion distortion.
Alternatively, the straight edges of the object may be seen to bow out slightly in the image, a result known as barrel distortion.
• Astigmatism Finally, we also have to consider astigmatism, in which there is a cylindrical component (in addition to the intended spherical contour) of the finished lens. Hand-made lenses tend to lack perfect axial symmetry, and this will result in astigmatic degeneration of the image.
Any or all of these aberrations cause image degeneration. Some (like chromatic aberration) are unavoidable in single lenses. Others (like spherical aberration) can theoretically be obviated in a lens of aspherical contour, though the surface of lenses of this sort is hard to calculate and the lenses are very difficult to manufacture. Astigmatism is avoided altogether if the lens is radially symmetrical, and distortions can similarly be minimized.
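The dependence of chromatic aberration on dispersion can be made concrete with a small numerical sketch. The Cauchy coefficients and lens radius below are illustrative values for a generic crown-type soda glass and a small biconvex lens; they are assumptions for the purpose of the example, not measurements of any historical lens:

```python
# Chromatic focal shift of a thin symmetric biconvex lens.
# Cauchy dispersion: n(lambda) = A + B / lambda^2 (lambda in micrometres).
# The coefficients are illustrative values roughly matching a crown glass.

def cauchy_n(wavelength_um, A=1.5046, B=0.00420):
    """Refractive index at a given wavelength (micrometres)."""
    return A + B / wavelength_um**2

def thin_lens_focal_mm(n, radius_mm):
    """Focal length of a symmetric biconvex thin lens:
    1/f = (n - 1) * (2 / R), so f = R / (2 * (n - 1))."""
    return radius_mm / (2.0 * (n - 1.0))

R = 5.0  # mm; a plausible curvature for a small simple-microscope lens

f_blue = thin_lens_focal_mm(cauchy_n(0.486), R)  # blue (hydrogen F line)
f_red = thin_lens_focal_mm(cauchy_n(0.656), R)   # red (hydrogen C line)

# Longer wavelengths are refracted less, so red light focuses farther away:
print(f"f(blue 486 nm) = {f_blue:.3f} mm")
print(f"f(red  656 nm) = {f_red:.3f} mm")
print(f"longitudinal chromatic shift = {f_red - f_blue:.3f} mm")
```

Because the index is higher for blue light, the blue focus falls closer to the lens than the red, and the spread between the two is the longitudinal chromatic aberration that the crown-plus-flint doublet described above was eventually devised to cancel.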
4.5. Resolution and the Art of Seeing
Finally, we are left with the limits to resolution. These are easily defined, as the capacity of an object to modify an illuminating light ray is a function of the relative values for object size and wavelength. It is only in recent years that different approaches to light microscopy have allowed us to find our way around these restrictions (p. 29), and for the conventional light microscope these limits remain:

d = λ / (2A_N),
where d is the limit of resolution, λ the wavelength of the illuminant, and A_N is the numerical aperture of the lens (normally written NA). The resolution is technically defined as the ability of the lens to distinguish two self-luminous points separated by distance d. In conventional optics, we take the value of λ as 550 nm (i.e., 0.55 µm), which corresponds to apple green light. This convention is useful since this color is near the center of the visible spectrum, and it is also the color to which the human eye is most sensitive. With air as the medium between the objective lens and the specimen slide, the highest practical A_N is 0.95, and with oil immersion lenses, up to 1.5. In practice the lowest value of d obtainable is around 200 nm (i.e., 0.2 µm). Pedants can modify the expression given above. One additional factor that can be introduced is the refractive index of the substage condenser lens array. Indeed, there are several formulae for the calculation of resolution (Zeiler, 1969) and there are debates about which one might be the best to use. In practice, though, the single constraint that we can derive from first principles of optical physics is that a single lens in air can resolve structures about one-fifth of a micron across. This is a crucial conclusion, for it allows us to look with renewed respect at what the pioneers might have seen. This
FIGURE 11 The Bancks microscope of Robert Brown in its box. Robert Brown had several microscopes during his lifetime, and this is the example from the Linnean Society of London, where I serve as Honorary Surveyor of Scientific Instruments. The use of the hardwood box as a storage facility, and also as a rigid base for the microscope, was common to botanical microscopes of the era. The main body pillar, stage, and mirror are in their places, while the accessories are lying alongside. One can see the stage forceps (left), an ivory slider, a pair of tweezers and the six lenses. Note that the two lenses on the left are fitted with Lieberkühns to facilitate illumination by means of reflected light.
level of resolution is good enough to permit the imaging of typical bacteria, for example, which seems extraordinary to the tyro. For the first two centuries of high-power microscopy, the users (like the instrument makers) relied entirely on experience and craft, not optical physics. If we look back at the extraordinary work done by the microscopist members of the Royal Society of London (Ford, 2001), we can watch the understanding of the physics slowly developing, long after many major microscopical discoveries had been made. Given the developments of the late nineteenth century, it is less easy to forgive commentators in recent times who have failed to grasp the capacity of the single lens. Had the Linnean Society’s experts in the early twentieth century paid proper attention to the physics, they would have seen at a glance that resolving the nucleus would not have been an unattainable aim for a microscopist (like Brown) using a single-lens instrument. Not only could Brown have observed the nucleus in the 1820s, but his microscope allows one to resolve some of the structure within it. The Bancks microscopes used by Robert Brown and his contemporaries, including Darwin, Hooker, and Bentham (Figures 4 and 8), demonstrate the capacity of early Victorian instrument makers—with no understanding of theoretical physics—to design and construct microscopes that were finely tuned to the task in hand (Figure 11). They were capable of providing remarkable results. Even under the low-power lens of Robert Brown’s Linnean Society microscope, we can resolve discrete cells of
FIGURE 12 Yeast cells observed with Robert Brown’s low-power lens. Because they were made individually, by hand, no two Bancks microscopes were identical. The instrument of Robert Brown was similar to that of Bentham, but different in the details: It lacked the substage condenser lens, but was fitted with a reclining supporting pillar that allowed it to be leant towards the observer in use. Here we observe Saccharomyces cerevisiae, yeast cells, under low power. This image is taken with the Brown microscope, using the No. 3 lens, which magnifies 32.5×. Light is directed onto the specimen with the concave mirror. Each cell can be clearly discerned and, although spherical aberration is apparent towards the periphery of the field of view, there is remarkably little chromatic aberration.
the yeast Saccharomyces cerevisiae (Figure 12). The highest-power lens of this microscope, magnifying some 170×, gives surprisingly clear images (Figure 13). If we look to the ultimate resolution of a single-lens microscope, we can examine the same preparation with a spinel lens ground by Horace Dall (Figure 14). Spinel has relatively low dispersion, and this fine lens (magnifying 395×) was fitted into a hand-made holder to form a pocket microscope (Figure 15). This has since been placed in the Royal Microscopical Society’s collections, and it produces remarkable pictures. Diatom frustules have long been regarded as test objects for the microscope, and we may see how well these specimens are resolved by the same lenses. With the No. 3 lens of Brown’s microscope we can make out little more than the outline of the frustule, which is 25 µm in diameter, and the presence of small details (Figure 16). The No. 1 lens just permits the visualization of the details as circular pores (Figure 17). The Dall lens gives a reasonable impression of the pattern of perforations, and reveals something of the radial patterning at the periphery of the cell (Figure 18), whereas with a modern-day Leitz microscope we may resolve this patterning around the edge of the cell and clearly see the perforations (Figure 19). Here too we can see that the view provided by the single lens does indeed give surprisingly good resolution in comparison with a present-day instrument.
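The figures quoted in the previous section are easy to check against the resolution formula d = λ/(2A_N). A minimal sketch, using the text’s values of 550 nm illumination and numerical apertures of 0.95 (air) and 1.5 (oil immersion):

```python
# Resolution limit of the conventional light microscope: d = lambda / (2 * NA).

def resolution_limit_nm(wavelength_nm=550.0, numerical_aperture=0.95):
    """Smallest separation d (in nm) of two self-luminous points
    that the lens can distinguish."""
    return wavelength_nm / (2.0 * numerical_aperture)

# Dry objective in air (highest practical NA of 0.95):
d_air = resolution_limit_nm(550.0, 0.95)
# Oil-immersion objective (NA up to 1.5):
d_oil = resolution_limit_nm(550.0, 1.5)

print(f"air (NA 0.95): d = {d_air:.0f} nm")  # ~289 nm
print(f"oil (NA 1.5):  d = {d_oil:.0f} nm")  # ~183 nm, i.e. about 0.2 um
```

The oil-immersion result of roughly 0.2 µm matches the “around 200 nm” practical limit given in the text, and the 0.2 µm spacing of the diatom’s peripheral markings shows why they sit just beyond the reach of the single lens in Figure 18.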
FIGURE 13 Robert Brown’s best lens reveals the Saccharomyces culture. The highest-power lens in the Bancks microscope made for Robert Brown, and preserved at the Linnean Society of London, is the No. 1 lens that magnifies 170×. Yeast cells vary in size; most are somewhat larger than blood cells, and they thus provide a suitable test object for these microscopes. The optical values of each of these micrographs are normalized through Adobe Photoshop CS2 in order to approximate more closely the view that the observer experiences. The lens holders are easily changed, allowing the observer to change magnification as required. In spite of the theoretical constraints imposed by physics, a microscope of this sort would still be usable for day-to-day microscopy.
FIGURE 14 The ultimate performance of a single lens microscope. The same Saccharomyces cells are here imaged with a single lens ground from the mineral spinel by the late Mr. Horace Dall of Luton, England, and magnifying 400×. This microscope is now in the possession of the Royal Microscopical Society in Oxford, England. Spinel has low dispersion and a refractive index of 1.712–1.762 (compared with approximately 1.5 for conventional soda glass, and more than 1.7 for lead glass) and thus offers high image quality. This is the peak of perfection for a simple microscope. Each cell can be clearly observed, and even something of the internal structure can be discerned. This dispels the notion that single lenses produced indistinct images that were afflicted with severe aberration. (See Color Insert.)
FIGURE 15 In quest of the ultimate performance–Horace Dall’s spinel microscope. The late Horace Dall of Luton, England, constructed exquisitely small simple microscopes. This one, dating from 1950, is his finest. The lens (marked here 400×) is ground from the mineral spinel, which has a refractive index higher than that of soda glass (1.5) and close to that of lead glass (1.7). It also has lower dispersion than glass, and thus offers the best results that a single lens could reasonably be expected to provide. The lens is mounted in a circular holder (left), which screws into the stage (right), and is held firm by means of a concentric spring (center). This fine microscope, since given to the Royal Microscopical Society in Oxford, was used by the author to show the extremes which a simple microscope could attain.
FIGURE 16 The image of a diatom under Brown’s 32.5× lens. The clarity obtainable with a single-lens microscope can be best demonstrated if we take serial magnifications, using different lenses, of the same specimen. In this case, we see a cell of the centric diatom Coscinodiscus. The diatoms secrete shells of silica that are typically perforated by regular arrays of apertures and are thus ideal specimens to act as test objects for microscopes. In this case, we are using the No. 3 lens magnifying 32.5×.
FIGURE 17 The image of a diatom under Brown’s 170× lens. As the magnification is increased, we can begin to make out increasingly fine detail within the diatom frustule. This specimen measures 25 µm across, and under the No. 1 lens, magnifying 170×, we can already begin to discern that the darker features seen in Figure 16 are actually circular structures. Some structure is also appearing towards the periphery, where we can now see that the edge of the cell forms a translucent rim.
FIGURE 18 The image of a diatom under the spinel 400× lens. The highest-quality image obtainable with a simple microscope is offered by the spinel lens made by Dall, magnifying 400×, and here the perforations can be clearly observed. Features that seemed to coalesce into one under the No. 1 170× lens used by Robert Brown can now be resolved as discrete structures. We can also distinguish the radial patterning that marks the rim of the frustules. At 25 µm in diameter, this frustule is roughly twice the diameter of a typical cell.
During the nineteenth century, other manufacturers, like Dollond of London, took these concepts to yet greater heights by producing beautifully tooled portable instruments with increasingly high magnifications. These
FIGURE 19 The image of a diatom under a present-day Leitz microscope. Modern instruments use phase-contrast, differential interference, Hoffman modulation, or dark-ground microscopy to amplify structural detail. Here we are using a modern Leitz oil-immersion objective lens to show the detail revealed by a fine present-day microscope, but without the benefits of these contrast-enhancing optical systems. The separation of the fine radiating peripheral markings is 0.2 µm, close to the limits of light microscopy. They are just beyond the resolution of the single lens in Figure 18. These correlated images substantiate that simple microscopes produced surprisingly clear images of fine microscopical features.
instruments became well known. In the collections at the University Museum of Utrecht, Netherlands, is a Dollond microscope that is described in the published catalogue as the “Pocket Microscope of Robert Brown”, and gives magnifications for the various lenses as 185×, 330× and 480×. The microscope itself fits into the palm of your hand, and can collapse into a small leather-covered box little larger than a cigarette packet. There remained a mystery surrounding this instrument, however, for there was no record of how it could have been translocated from Brown’s home in Soho Square, London, to the Physics Laboratory in Utrecht University. The lesson here is—never rely on second-hand sources, even when dignified by print in a collections catalogue. It transpired that the microscope had never belonged to Robert Brown. The entry in the hand-written accessions book revealed to me a story that was significantly different from the printed list. This was, it said, a microscope “folgens Rob. Brown”; i.e., after, or following, Brown’s instrument. It was his type of microscope, and was never in his possession: mystery solved (Ford, 1985). A magnification of 480× would be a remarkable performance by a single lens. The working distance of the lens would be less than a millimeter; the lens itself would be no bigger than the head of a pin. In traveling to Utrecht to experiment with the microscope I realized that I would be faced with the summit of achievement for the maker of a single-lens instrument. But it was not to be. Sadly, the lens holder no longer held its tiny lens. This
design offered the highest magnification of any microscope ever put into production, and it was also the smallest such instrument in history. As such, it was a dead-end in development. We can see vague connections in concept with the portable microscope designed by John McArthur in Cambridge, but there is no direct lineage from this Dollond microscope to the present day. The Bancks designs, by contrast, clearly reveal the way ahead. We can follow the stages of development, and it is easy to see the continuity between the design of these simple microscopes and the modern research microscope. The lineage is unmistakable. We have seen how the design of the Bancks type of microscope evolved during the first quarter of the nineteenth century from a modest lens support on a stand, with crude focusing, a circular stage, and a mirror (Figure 20) into a range of microscopes that variously boasted a substage condenser, concentric controls, and both coarse- and fine-focusing adjustments (see Figure 10). If achromatic microscopes had never been envisaged, this type of instrument would still be in vogue today, for the images that they can produce are impressive (Figure 21).
5. PIONEERS OF FIELD MICROSCOPY
Microscopes of the eighteenth century lacked the ingenious accoutrements of the botanical microscope of the early 1800s. As we have seen, these earlier microscopes were uncomplicated, and the entire device fit into a boss set into the lid of its box. There were no finely made fine-focusing mechanisms and little that was ingenious about them. Although they were often supplied with a Lieberkühn reflector (and thus could be used for entire botanical specimens), they were more often known, generically, as “aquatic microscopes” because they had been developed to study freshwater organisms. The origin of this design can be traced to one investigator, and a single instrument maker. It was the concept of John Ellis (1710–1776), an Irish-born British government official who spent much time based in Florida and Dominica, and who was an enthusiastic microscopist in his spare time. Ellis was an active member of the burgeoning class of natural historians who were investigating the wonders of an expanding world, and his social milieu is itself a fascinating commentary on the rapid expansion in awareness of the microscopic world (Duyker & Tingbrand, 1995). He had used microscopes for years, but found the instruments then available unsatisfactory for the study of freshwater microscopical life. So he turned to a well-established instrument maker in London who had provided him with microscopes and proposed an alteration in design (see Figure 20). Microscopes of the time enclosed the specimen in a confining stage, which meant that delicate living organisms could be crushed or—if they survived intact—could not be
FIGURE 20 The aquatic microscope designed by John Ellis. John Ellis (1710–1776) commissioned the production of this aquatic microscope seventy years before Bancks, in 1754. His design was initially made by the London instrument maker, John Cuff, and it has a solid square-section vertical brass pillar, compared with the hollow tube of the Bancks design that was to follow. The stage has no embellishments and is meant to support a watchglass (containing aquatic microorganisms) or alternatively an ivory slider that would be simply laid across the circular stage. The brush (P) is fashioned from a quill, the hollow end of which could be used to transport drops of water to the watchglass (M). Note too the Lieberkühn (G) – one is shown fitted to this microscope (top). This illustration is from “Essays on the Microscope” by George Adams (1750–1795), instrument maker to King George III, printed by Dillon and Keating in 1787.
FIGURE 21 Chromatic aberration and the botanical microscope. Here we see a transverse section of fern rhizome, and the main features are several large vessels—the tube-like structures that convey sap from the roots to the fronds. Using the No. 1 lens from the Robert Brown microscope (see Figure 11) we can see the histological structures clearly, and the supporting cells that surround the vessels are all well resolved. The color plate also shows significant chromatic aberration. It is important to note that, although the spurious colors are apparent, they do not greatly detract from the clarity of the image. Standard textbooks cite the rainbow-hued fringes that prevented early microscopists from seeing clearly but, as can be seen, this is a conclusion published by commentators who have never had the benefit of seeing the images they describe. (See Color Insert.)
reached by a dissecting needle or a probe, which an investigator might wish to utilize. It was a London instrument maker to whom Ellis entrusted the task, John Cuff (1708–1792). Here we have a successful craftsman of the highest standards, already accustomed to producing compound microscopes—admittedly with uncorrected lenses, but finely tooled and beautifully finished. The wooden cases were meticulously constructed of hardwood (apple, mahogany, oak) and lined with baize. The microscopes were of lacquered brass, with accessories to help the user hold the specimen or illuminate it in a variety of ways. For their time, they were as perfect as a microscope could be; but they were bulky, which prevented their being used for visits to the countryside, and the design of the stage made it difficult to study aquatic organisms. Something altogether simpler was required. The design which Ellis devised had a vertical pillar supporting a circular stage, into which a concave watch-glass could easily fit, and which could be mounted into the lid of the microscope box (Figure 22). The lens arm could be raised or lowered to focus the image, and also turned from side to side to scan across the stage, which is important for studies of pond life. Beneath the stage was a double-sided mirror, one side plane, the other concave. The whole instrument could be disassembled and packed into the small wooden box, itself typically adorned with shagreen and finely finished. This
FIGURE 22 From a pioneering era: the microscope of Linnaeus. This instrument was owned by Carl Linnaeus, father of taxonomy, and was photographed by the author at Uppsala, Sweden, where it is preserved at Linnaeus’ former home. In this design, the wooden box was typically covered with shagreen made by polishing the spines from shark or ray skin. As in the microscope illustrated in Figure 20, a Lieberkühn is shown in the fitted position. Only one low-power lens now remains in this microscope case, and it gives images of poor quality. Linnaeus had something of a blind spot for microorganisms, and I have found no record of his using his microscope to any great effect. Great taxonomist though he was, Linnaeus was no microscopist.
was a microscope that could fit easily into the coat pocket. It is easy to use, and (unlike the microscopes for which Cuff was already well known) these were simple microscopes. The problems caused by aberrations were minimized with a single lens, and the user had a highly portable instrument with a wide range of uses both in the field and back at the desk. The basic design also meant that these microscopes were affordable, and the intelligentsia could easily obtain one of their own. The principal problem these instruments posed was one of nomenclature: Are they properly described as a Cuff, or Ellis, microscope? Both terms are used, but here I will settle for Ellis. Although it was manufactured by Cuff, the original description was “Mr Ellis’ Aquatic Microscope”. It was Ellis’ design, after all, and many manufacturers subsequently produced versions of their own. Not everybody who bought one of these diminutive microscopes used it. In Sweden, the father of taxonomic terminology, Carl Linnaeus, purchased an Ellis microscope in a sharkskin case (see Figure 22). It came with two lenses mounted into a holder bearing a Lieberkühn, which made the
FIGURE 23 Macroscopic observations by Carl Linnaeus. These are the closest I have found to true microscopical observations by Linnaeus. The drawings show the crane fly, Pedicia rivosa (below) and the moss, Funaria hygrometrica (above). When Robert Hooke portrayed the same moss in his book, Micrographia (1665), he clearly showed the cells of which each leaflet is composed; these pictures by Linnaeus lack such fine detail, and show little more than can be seen with the naked eye (compare with Figures 21 and 39), though the venation of the crane fly wings is well portrayed. The illustration is from Linnaeus’ journal for 1732 at the Linnean Society, to whom the author extends grateful acknowledgement.
microscope eminently suitable for a busy botanist (Report, 1932), but there is no first-hand evidence that Linnaeus used it. None of his surviving drawings, or published diagrams, shows microscopical detail. There is an indifferent drawing of the crane fly, Pedicia (formerly Tipula) rivosa, dating from 1732, for which a low-power lens might have been employed, and a few macroscopical botanical studies, all of which could have been made with the naked eye, but nothing more detailed than that (Figure 23). Linnaeus was also surprisingly uninterested in the myriad microscopic organisms that had been documented before his time. As we can see from the published accounts, Linnaeus was always vague about microorganisms (Linnaeus, 1758). He set down a genus “Microcosmus”, which was defined as “Corpus variis heterogeneis tectum,” and recognized Volvox globator as “Corpus liberum, gelatinosum, rotundum, artubus destitutum”. Someone must have drawn his attention to amœbæ, for Linnaeus also recorded the genus as “Chaos 2. V. polymorpho-mutabilis”, and indeed the common pond amœba was designated Chaos chaos (L.) well into the Victorian era. Linnaeus’ microscope survives in Uppsala to this day. Of the lenses, only one remains and it is of poor quality. This low-power lens is suitable for only the most basic investigations, and it does not have the
quality that is ordinarily associated with lenses made by Cuff. Perhaps it is a magnifier dating from earlier in Linnaeus’ career and used for close views of plant specimens; in any event, the microscope is now useless for microscopy. Expert taxonomist and indefatigable collector though Linnaeus may have been, he was certainly no microscopist and he missed out on this fundamentally important realm of life. It was his greatest “blind spot”. Why would Linnaeus have purchased a microscope, if not to use it? There was one minute organism that he did describe; this was Hydra, the fresh-water polyp that he named in 1758. Hydra was an extremely popular organism for study by late eighteenth-century microscopists, and indeed interest in this diminutive creature remains current (Lenhoff, 1983). It was specifically for the study of Hydra that the Ellis aquatic microscope had originally been conceived. The problem with early simple microscopes was that they were delicate and troublesome to use. It was hard to mount the specimen; harder still to focus it. Many designers had tried to find ways to make the task easier, and one of the most widespread designs was a screw-barrel microscope (Clay & Court, 1932). In this instrument, the lens was mounted at one end of a tube. The specimen was slid into the body of the microscope, and the lens focused by screwing its holder in or out of the tube. This design had been perfected by James Wilson (1665–1740), who presented it to a meeting of the Royal Society of London in 1702 (see Figure 23). It proved to be very useful for botanists and others working in the field (Wilson, 1743). Commentators have alleged that Wilson plagiarized his design from one already announced by Nicholas Hartsoecker (1656–1725). It is certainly true that Hartsoecker, a Dutch physicist and pupil of Huygens, had indeed constructed a screw-barrel microscope in 1694, eight years before Wilson. 
It is also clear that the idea of the design had spread to England, for Wilson’s account did not describe himself as the designer, only as the maker. He described the screw-barrel microscope as “Late Invented,” which clearly acknowledges that it had been devised elsewhere. Screw-barrel microscopes were handy devices for observing specimens that were amenable to sliding into the spring-loaded specimen holder. These were the microscopes that gave rise to the ivory slider, a thin sliver of bone or ivory as wide as a pencil and as long as a matchstick. Countersunk holes (usually four in number) were set into the slider, and the dried specimen was held in position between two disks of mica that were secured in position with circlips (see Figure 8). This form of mounting was ideal for insects, butterfly wings and antennae, wood sections, fabrics, and hairs. Sliders are widely found as collectors’ items and the specimens that they contain are often in excellent order, despite three centuries in store. Although the idea of a “slider” seems decidedly dated, it gave rise to the microscope “slide” that is universally familiar to present-day microscopists.
Brian J. Ford
The manufacture of slides was made feasible by the introduction of “patent plate” or “flattened crown” glass in the 1850s. At the same time, Chance Bros of Birmingham, England, began the production of coverslips described as being “of various degrees of thickness, from 1/20th to 1/250th of an inch” (Carpenter, 1862). These slides are the standard mounting materials in the present day. In imitation of the appearance of the ivory sliders, it was the convention to cover early glass slides with paper, only the circular areas of the coverslip remaining clear. These are now popular items in slide collections. Slides and sliders are of the greatest value for the preservation and examination of specimen material, particularly entire small specimens and thin sections of larger objects (like tissue sections). They are of less value for the examination of delicate living organisms, like Hydra, the body of which is typically 1 cm in length and which is disturbed (and even structurally disrupted) by perturbations in its environment. Hence the move from the screw-barrel instrument to the aquatic microscope—this alone, with its open stage and watch-glass specimen chamber, allowed microscopists to examine the world of Hydra in a near-natural state.
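Carpenter’s quoted coverslip range is easier to picture in modern units. The conversion below is my own illustration; the modern No. 1.5 figure of 0.17 mm is an added reference point, not part of Carpenter’s text.

```python
# Convert Carpenter's (1862) coverslip thicknesses, quoted in
# fractions of an inch, into millimetres for comparison with
# present-day coverslips.
MM_PER_INCH = 25.4

thicknesses_in = {
    "1/20 inch (thickest)": 1 / 20,
    "1/250 inch (thinnest)": 1 / 250,
}

for label, inches in thicknesses_in.items():
    print(f"{label}: {inches * MM_PER_INCH:.2f} mm")

# A modern No. 1.5 coverslip is nominally 0.17 mm thick, so the
# thinnest Victorian coverslips were already close to today's standard.
```

The range works out to roughly 1.27 mm down to 0.10 mm, which shows how quickly coverslip manufacture approached the dimensions still in use.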
5.1. The Polype Shows the Way It was the demands imposed by Hydra that led to the development of the aquatic and, later, botanical microscopes. But why? Hydra is one pond creature of many. Was there a reason for the sudden burst of popularity that propelled it to the number one slot on the microscopists’ charts? There was a reason, and it lay in the Netherlands, where a young Swiss naturalist, engaged as a tutor, began a series of experiments that effectively launched the era of experimental biology. This experimenter was Abraham Trembley (1710–1784) of Geneva (Breen, 1956), who was appointed tutor-in-residence to the two children of Count Bentinck of The Hague, Netherlands. He cultured the polyps in glass vessels, observing them as they grew and reproduced. Then he began to experiment with them, everting their bodies, transplanting parts of one onto another, showing how one could use the organism as an experimental animal and coining the concept of transplantation (Trembley, 1744). The experiments would have been a wonderful introduction to microscopical zoology for Trembley’s young charges, and they had many implications for experimental biology. But they would have remained undisclosed in The Hague had it not been for the encouragement of the distinguished philosopher René Réaumur (1683–1757) and the growing interest of the Royal Society, to whom Réaumur communicated Trembley’s findings. For two years the Royal Society withheld support, repeatedly asking for supplies of the organisms and details of how the experiments were performed. Eventually they were convinced. Martin Folkes (1690–1754) at the Society wrote a description of Trembley’s work as “one of the most
beautiful discoveries in natural philosophy” and the London philosophers suddenly started to show interest (Lenhoff & Lenhoff, 1986). One of these was Henry Baker (1698–1774) who, faced with this remarkable series of experiments, hastily compiled a popular book on the use of the microscope. Baker was a remarkable character who made a living as a young man teaching deaf-mutes to communicate and helping to show them how to live fulfilling lives. His work as a teacher brought him to the attention of Daniel Defoe (1660–1731), who in 1727 invited Baker to visit him at home. Two years later, Baker married Sophia, Defoe’s daughter, and his acceptance into learned London society was complete. In 1741 he was elected Fellow of the Royal Society. Baker threw himself whole-heartedly into amateur science. He was captivated by Trembley’s work, and immediately set about carrying out experiments of his own with Hydra. The published results appeared at length in Baker’s book, enticingly entitled The Microscope Made Easy and published in 1743 (Baker, 1743) (Figure 24). The book was dedicated to “Martin Folkes Esqr; President, And to the Council and Fellows of the Royal Society of London” and it presents a popular account of the state of microscopy at the time. Baker was no great innovator, but he was an enthusiastic popularizer, and he set down his account of the experiments on Hydra, which he describes as “an insect” (it is of course a coelenterate) discovered by Mr Trembley “who now resides in Holland”. Baker tried to reprise Trembley’s experiments and added a few observations of his own. The list of experiments that Baker described was comprehensive and attracted widespread attention:
I. Cutting off a Polype’s Head
II. Cutting a Polype into two Pieces, Transversely
III. A Polype cut into three Pieces Transversely
IV. Cutting the Head of a Polype in four Pieces
V. Cutting a Polype in two Parts, Lengthways
VI. Cutting a young Polype in two Pieces whilst still hanging to its Parent
VII. Cutting a Polype lengthwise through the Body, without dividing the Head
VIII. A Repetition of the foregoing Experiment, with different Success
IX. Cutting a Polype in two Places through Head and Body, without dividing the Tail
X. Cutting off half a Polype’s Tail
XI. Cutting a Polype transversely, not quite through
XII. Cutting a Polype obliquely, not quite through
XIII. Slitting a Polype open, cutting off the End of its Tail
XIV. Cutting a Polype with four young Ones hanging to it
XV. Quartering a Polype
XVI. Cutting a Polype in three Pieces the long way
XVII. An Attempt to turn a Polype, and the Event
FIGURE 24 Baker publishes Leeuwenhoek’s studies of the common polyp, Hydra viridis. The extensive studies of Hydra carried out by Trembley stimulated interest in aquatic microscopy, yet he was not the first to study the organism. Leeuwenhoek sent to London a series of studies of Hydra starting in his letter to the Royal Society dated 25 December 1702. The findings appeared in Philosophical Transactions (1703). [This version of Leeuwenhoek’s studies appeared in Baker (1743). ‘‘The Microscope made Easy.’’ R Dodsley & Co., London, p. 93.] This popular book gave publicity to many earlier workers, and hosts of amateur observers sought a way of observing these polyps for themselves.
XVIII. Turning a Polype inside out
XIX. An Attempt to make the divided parts of different Polypes unite
XX. A speedy Reproduction of a new Head
XXI. A young Polype becoming its Parent’s Head
XXII. A cut Polype producing a young One, but not repairing itself.
Interest in Hydra increased enormously. From being an insignificant organism that few had even noticed, it was suddenly the most fascinating of subjects for any microscopist. And this was where the problems began, for the popular Wilson screw-barrel microscope could not cope with a delicate creature like this. This is the microscope that Baker describes in detail as his book opens: The first that I shall mention, is Mr. Wilson’s Single Pocket Microscope; the Body whereof made either of Brass, Ivory, or Silver.... Baker proceeds by describing how this instrument can be mounted on a stand, leaving the observer’s hands free; and concludes with the claim that this microscope: ... is as easy and pleasant in its Use, and as fit for the most curious Examination of the Animalcules and Salts in Fluids... it is as likely to make Discoveries in Objects that have some Degree of Transparency, as any Microscope I have ever seen or heard of. Not everyone agreed, John Ellis for one. Ellis found that the screw-barrel design tended to crush delicate specimens and was a hindrance to his investigations of these aquatic organisms, and he resolved to design something more appropriate. Thus it was that the popularity of Hydra as an experimental subject led to the design of the aquatic microscopes that established a direct lineage to the bench microscope of today.
5.2. The Dutch Draper’s Roots Trembley believed that he had been the first to discover Hydra. He was wrong. This organism had already been described and figured more than 30 years earlier by the renowned Dutch microscopist, Antony van Leeuwenhoek (1632–1723). Indeed, Leeuwenhoek’s studies of Hydra feature in the same book by Henry Baker as the experiments by Trembley. Leeuwenhoek takes us yet deeper into history, back towards the birth of microscopy. All his life he used home-made simple microscopes with crude focusing controls which, in use, required diligence, skill, and unimaginable patience (see Figures 20 and 22). They were the crudest single-lens microscopes ever used for serious scientific research, yet they provided astonishing results. Leeuwenhoek himself was still working on microscopy when aged 90 and lying on his death-bed, yet for all his indefatigable research and his indomitable persistence, he founded no school of microbiology (van Niel, 1949) and left no group of devoted followers to carry on his teachings. His influence lived on, of course, and was clearly a trigger for Baker’s enthusiasms (Baker, 1743) in addition to stimulating others to look into the remarkable world he had revealed. My work, which I’ve done for a long time, was not pursued in order to gain the praise I now enjoy, but chiefly from a craving after knowledge, which I notice resides
in me more than in most other men. And therewithal, whenever I found out anything remarkable, I have thought it my duty to put down my discovery on paper, so that all ingenious people might be informed thereof.—Antony van Leeuwenhoek [in] Letter to the Royal Society of London dated 12 June 1716. It was Antony van Leeuwenhoek who founded the modern science of microscopical biology. Apart from his description of Hydra, he produced innovative and unprecedented studies of sperm cells and blood corpuscles, cell nuclei and bacteria, protozoa and algae; the whole realm of the microscope lay within his grasp. His lenses were produced by painstakingly grinding beads of soda glass into biconvex magnifiers a few millimeters across, some magnifying several hundred diameters, or—more rarely—by blowing a bubble of glass and utilizing the small “nipple” that formed at the far end as his lens (Figure 32). Each of his microscopes was small, the body of each not much larger than a rectangular postage stamp, fashioned from workable metals like brass and silver that he extracted from the ore. Unlike the lenses, the bodies were not fine examples of workmanship; they did not need to be. Their purpose was simply to hold specimen and lens in juxtaposition and clearly in focus so that observations could be made over time. Then—once Leeuwenhoek knew what he wanted to portray—he handed over the entire instrument to his limner and commanded that the view be drawn for posterity. Leeuwenhoek himself could not draw, and said so in his records. He always had a skilled artist to perform this task. Other contemporaneous Dutch microscopists, like the gifted Jan Swammerdam (1637–1680), could make their own drawings, but Leeuwenhoek lacked the skill. There is a parallel connection between Leeuwenhoek and the Dutch painter Johannes Vermeer (1632–1675).
The births of both men were recorded on the same page of the baptismal register of the Old Church in Delft, and—when the young Vermeer met his untimely demise at the age of 43—it was Leeuwenhoek, by this time a town official, who was appointed executor to Vermeer’s estate. Because the two were contemporaries, it has been claimed that Leeuwenhoek appears in Vermeer’s paintings, but there is no evidence for the assertion, and little resemblance between the persons in the Vermeer paintings and the surviving portraits of Leeuwenhoek. It has even been suggested that Vermeer was Leeuwenhoek’s limner, but that is even more fanciful. In truth, we still have no notion who that person (or, more likely, persons) might have been. As we have seen, Leeuwenhoek’s studies were revived by Baker (1743), and his research was published in translation by Hoole (1798–1807). An 1875 biography was written by Haaxman (Haaxman, 1875), and in 1932 Dobell published his masterful biography (Dobell, 1932). My 1991 book, The Leeuwenhoek Legacy (Ford, 1991), is perhaps best seen as a modest supplement to Dobell’s definitive work. Looking at Leeuwenhoek’s prodigious output gives an overriding impression of devotion and a highly developed ability
FIGURE 25 A brass Leeuwenhoek microscope from Delft, Netherlands. This is the only Leeuwenhoek microscope still held in his hometown of Delft and is the only one that lacks its lens. Most of his surviving microscopes were based on this design. The microscope is to be found behind the plate-glass window of a display cabinet in the foyer to the Technical University. The body plates are fashioned from Leeuwenhoek’s home-made brass alloy and measure 22 mm × 46 mm. The two plates are fixed together by means of four small rivets and the main securing screw at the base. With this type of diminutive microscope, Leeuwenhoek was the first to observe a great range of now commonplace specimens, including cell nuclei and bacteria.
to record his unprecedented observations with accuracy. In many cases, the descriptions and figures that are recorded in Leeuwenhoek’s voluminous letters are so meticulously accurate that each species can immediately be recognized by present-day microscopists. Leeuwenhoek’s hand-made microscopes were crudely made, with just enough attention to detail to ensure that they functioned properly (Figure 25). The design of the microscopes was simple, but highly effective (Figure 26), and the detail he observed seems so remarkable for such an unsophisticated instrument that detractors have doubted his claims since the day they were made. Professor R. V. Jones once recounted to me how he had supervised an examination for medical students in which one question asked for an explanation of why it was theoretically impossible for Leeuwenhoek to have observed bacteria with a simple microscope. This is much the same kind of skepticism that surrounded Robert Brown’s claims (Report, 1932) when his microscope was examined. There are many die-hard skeptics in science. During Leeuwenhoek’s life, his work became widely known. His discoveries even attracted the interest of King Charles II,5 who, according to a
5. Dobell (1932, p. 184), incorrectly cited in Fournier, M. (1996). The Fabric of Life: Microscopy in the Seventeenth Century. Baltimore and London: Johns Hopkins University Press, p. 234, reference 100.
FIGURE 26 How the silver Leeuwenhoek microscopes were designed. Two of the surviving microscopes have body plates made from silver that Leeuwenhoek beat into sheet metal at his home in Delft. This diagram has been prepared by the author to illustrate how it was constructed and used in practice. The 25 mm × 45 mm plates were secured together in this case by six rivets, doubtless because of the softness of the silver. A long main screw, beaten and then tapped to provide a thread, is used to raise and lower the transverse stage. The inclined screw is used to focus the image. The stage of this microscope is triangular in section and the specimen pin has been made from wrought iron. A specimen was fixed to the point with sealing wax, and was contained—if necessary—within a flat-sided glass phial in which aquatic microorganisms could flourish.
letter from Robert Hooke to Leeuwenhoek in 1679, “was desirous to see them and very well pleased with the Observations and mentioned your Name at the same time.” Hoole records that Leeuwenhoek had also presented “a Couple of his Microscopes” to Queen Mary II, wife of William III of Orange. Leeuwenhoek also met (and presented a microscope to) Peter the Great in 1698. Leeuwenhoek’s discoveries helped to found the science of experimental biology (Ford, 1995), and he was renowned across Europe by the time he died. There are nine surviving Leeuwenhoek microscopes, not all of them of proven provenance (Ford, 1991); in addition, we have the legacy of the nine original specimen packets that Leeuwenhoek bequeathed to the Royal Society (Ford, 1981a,b). They include plant tissue sections, slices of optic nerve, dried aquatic algae and serial sections of seeds, all prepared with diligence and uncommon skill (Figure 27).
6. THE IMAGE OF THE SIMPLE MICROSCOPE Here we arrive at the crucial question: Was it possible for Leeuwenhoek to make the discoveries that are associated with his name? Was he just,
FIGURE 27 Original Leeuwenhoek specimens sent to the Royal Society of London. On 2 April 1686, Leeuwenhoek sent his final two packets of specimens to London. The first (top) contained, he stated, ‘‘a cotton seed cut into 24 round slices.’’ This was the launch of serial sectioning as an aid to the study of microanatomy. The second packet (bottom) contained ‘‘9 seeds from the cotton tree which have been stripped of their involucres and in which the leaves have been separated.’’ They are fine examples of microdissection; in this packet, the detached specimen (top left corner) clearly shows the cotyledons, plumule, and radicle visible to the naked eye. Dutch scholars, relying on microfiche copies of the pages, described the packets as ‘‘drawn rectangles,’’ missing entirely the fact that specimens lay concealed within.
as has been alleged, a dilettante who exaggerated his results? What can an observer perceive with a single-lens microscope? Could bacteria truly have been discovered with anything so simple? We have already seen that a single lens can theoretically resolve structures as small as 0.25 µm, and this is in principle sufficient to resolve many species of bacteria, typical examples of which are >2 µm in breadth. The best Leeuwenhoek microscope known to survive is in the collection of the University Museum of Utrecht, Netherlands. The lens, as calibrated by my late colleague J. van Zuylen of Zeist, has a focal length of 0.94 mm and magnifies 266×. I used this microscope to image a dry smear of my own blood, and the results displayed the erythrocytes with remarkable clarity. They are biconcave discs with an in vivo diameter of 7.8 µm that is reduced to 7.2 µm in dried smears, and that Leeuwenhoek first described in 1674 (note that Swammerdam—unbeknown to Leeuwenhoek—may have observed the erythrocytes of Rana, the frog, in 1658). Scattered among the erythrocytes (more popularly known as red corpuscles, or red cells) are leucocytes (white blood cells). Of these, the granulocyte has a curiously lobed nucleus in which each lobe measures approximately 2 µm (Figure 28). These lobes are of bacterial dimensions, and my photomicrographs (taken with the Utrecht microscope) clearly show these structures. As we have seen in previous examples, the problem of chromatic aberration is minimal. There are some slight perturbations in color, but not enough to prevent the microscopist from seeing what needs to be seen. We have a second strand of evidence. I examined the specimens of the stem of elder, Sambucus, that Leeuwenhoek sent to the Royal Society of London in 1674 (Figure 29). In one specimen I observed a fine fibril under the scanning electron microscope (SEM). On the SEM scale it measured 0.7 µm across.
I had managed to find the same fibril as had been visible in the micrographs taken through the Leeuwenhoek lens, so we had directly comparable images from both Leeuwenhoek’s microscope and a present-day electron microscope. This made calibration an easy matter. Finally, there is the practical demonstration of living bacteria with a simple microscope. We can see from the published images how the Victorian microscopists resolved spiral bacteria (Figure 30). For this experiment I utilized a spinel lens ground by my good friend, the late Horace Dall of Luton, which was calibrated to magnify 395× (and is thus comparable with the best surviving Leeuwenhoek microscope). With this modern version of a single-lens microscope I easily observed living aquatic spiral bacteria of the genus Spirillum (Figure 31). The results were unambiguous. The bacteria could be well resolved, thus confirming that Leeuwenhoek had living bacteria within his range (Ford, 1998).6 These three results leave no room for doubt. 6 This article has been reprinted as: Premières images au microscope, Pour la Science, 249, 169–173; and Frühe Mikroskopie, Spektrum der Wissenschaft, June, 68–71. [Other editions in Chinese, Japanese, Polish, and Spanish.]
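Van Zuylen's calibration of the Utrecht lens can be cross-checked with the standard formula for a simple magnifier, M = D / f, where D is the conventional least distance of distinct vision (250 mm). This is my own arithmetic rather than van Zuylen's published method, but it reproduces his figure:

```python
# Angular magnification of a single lens used as a magnifier:
#     M = D / f
# with D the conventional near point of 250 mm. (For f << D the
# alternative form M = D/f + 1 gives almost the same answer.)

def simple_magnification(focal_length_mm: float, d_mm: float = 250.0) -> float:
    return d_mm / focal_length_mm

# Utrecht Leeuwenhoek lens: focal length 0.94 mm, as measured by van Zuylen.
print(f"{simple_magnification(0.94):.0f}x")  # -> 266x
```

The agreement with the calibrated value of 266× suggests the quoted focal length and magnification are internally consistent.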
FIGURE 28 Human blood cells imaged by the Leeuwenhoek microscope at Utrecht. Some additional studies were made at the conclusion of the photomicrography session with the Leeuwenhoek microscope (see Figure 33). A smear of blood on a coverslip was air-dried and focused for photography. This image shows the entire field of view; noteworthy is the high proportion of the field that is usably in focus. For such small specimens, chromatic aberration is inconsequential. Note (top right) a single leukocyte with its lobed nucleus clearly resolved. The cell itself is some 12 µm across, and the lobes are about 2 µm in diameter, roughly the size of bacteria like Staphylococcus. This picture, taken almost as an afterthought, vividly demonstrates the capacity of the single lens. As in all of the photographs in this chapter, the image contrast and color have been optimized in Adobe Photoshop CS2. (See Color Insert.)
6.1. Analyzing the Image To the optical physicist, it is the lens data that matter above all. The results obtained by van Zuylen give the measurable parameters that define the theoretical performance of the Utrecht lens: a linear magnification of 266×, a numerical aperture of 0.13, a calculated resolution of 1.16 µm, and a measured resolution of 1.35 µm. Van Zuylen measured the lens as 1.2 mm thick, and the glass from which it is made is standard soda (bottle or window) glass of refractive index 1.5—just as one would expect. Very well, but what does this mean in practice? If we take the literal interpretation of these figures, then it seems logical to deduce that anything measuring <1 µm will not be seen. Logical that may be, but it is also wrong. Even with a simple microscope, we can perceive the presence of objects, or visualize structures, that are below the theoretical limits of resolution (Ford, 1968). Submicroscopic structures can be imaged by dark-ground light microscopy even when they are below the resolution limit (Ford, 1970), and diffraction fringes set up by linear features in a specimen can allow one to infer their presence. Thus, the image of the microfibril previously recorded in the SEM can be compared with the optical image obtained with a Leeuwenhoek microscope. Similarly, consider the
FIGURE 29 Cells from Sambucus nigra, the common elder, through a Leeuwenhoek lens. In 1674 Leeuwenhoek sent to London his first four packets of specimens (see also Figure 27). One of the packets contained remarkably fine sections of pith from the elder tree. Here we view a small portion of a specimen through Leeuwenhoek’s microscope at Utrecht. Several cells are clearly seen, each measuring some 45 µm across. Tracing across some of the cell walls are fine fungal hyphæ, testimony to the three centuries during which the specimens lay in storage at the Royal Society. The fungus threads are <1 µm across. These high levels of resolution allow one to observe, with the Leeuwenhoek lens, most of the important structures that later Victorian compound microscopes were used to discover.
FIGURE 30 The Victorian microscopist examines bacteria of the genus Spirillum. Spirillum undula is a common bacterium in pond water. These are robust bacterial cells, typically 40 µm from end to end and thus somewhat larger than typical bacteria. [These organisms were drawn by the Rev. Mr. William Dallinger and appeared as Figure 495 in Carpenter (1862). ‘‘The Microscope and its Revelations.’’ J. & A. Churchill, London, p. 659.] These images are typical of the Victorian era, when fully achromatic and apochromatic microscopes were becoming available. It has been popular to assume that single-lens microscopes are unsophisticated and too primitive to permit pioneering observers to have resolved living bacterial cells. Optical theory tended to confirm the skepticism.
statement that the resolution of the Leeuwenhoek lens is no more than 1.35 µm. The lobes of the granulocyte nucleus are scarcely larger, leading
FIGURE 31 The simple microscope resolves living bacteria. Not only has there been much skepticism expressed about the capacity of simple microscopes to resolve living bacteria, but Professor R. V. Jones reported to me that the topic had appeared in an examination question [reported in Ford (1991). ‘‘Leeuwenhoek Legacy.’’ Bristol: Biopress; and London: Farrand Press, p. 45]. Students had to explain why the principles of optical physics led to the conclusion that bacteria could not be resolved with a single lens. Here is evidence to the contrary. The spinel lens produced by Dall is here used to resolve living Spirillum undula bacteria obtained from a low-pH pond rich in beech leaf litter. The organisms are so clearly resolved that even a lower-specification lens—and smaller bacteria—would allow the observation to be repeated.
the optical physicist to predict that the image of blood would be highly unsatisfactory. Add to that the further problems of chromatic and spherical aberration, and we would surely be left with a profoundly deficient image. This is all in contrast with what we find in the real world. The result is far better than our prejudice would lead us to expect. The erythrocytes are clearly displayed, and the details of the granulocyte are visualized with remarkable clarity. That’s a surprise. And here is a greater one, for the chromatic aberration does little to detract from our interpretation of the image. Still more surprising is the flat field of view. Whereas one might anticipate that only the center of the field of view would give in-focus images of the blood cells, it is clear that most of the observable field shows cells that are reasonably in focus. This seems impossible, for only an aspheric lens can provide results like this. Herein lies perhaps the greatest surprise of all. The lens is aspheric. No ground lens from the period could possibly have this form of contour, and this minor mystery was solved only through the ingenious investigations of van Zuylen (1981). He explained to me how
FIGURE 32 Leeuwenhoek discovers the aspheric magnifier. The surface of the hand-ground lens fitted to simple microscopes is approximately spherical. The Leeuwenhoek microscope at Utrecht is an exception; the lens has an aspheric contour that gives it a remarkably flat focal field. J. van Zuylen of Zeist, Netherlands, showed how it was almost certainly produced. He blew a large bubble of soda-glass at the end of a piece of tubing, and detached the ‘‘nipple’’ of glass that forms at the end (right). This simple technique provides a serviceable lens without the need for grinding, and Leeuwenhoek himself wrote that he sometimes melted glass to blow lenses. The clarity of view provided by the Utrecht microscope is remarkable (see Figure 28).
he had taken a length of glass tubing, of the kind used to draw a Pasteur pipette, and had melted the end in a blue Bunsen flame. Once a sizeable accumulation of half-melted glass had formed, he blew it into a large (wineglass-sized) bubble, at the extremity of which a glass “nipple” had formed. The thin glass of the bubble itself was broken away and discarded, and the “nipple” was retained for use as a lens (Figure 32). This aspheric magnifier gives excellent results, with images in focus across the entire central field of view. Van Zuylen told me that the contour of the Leeuwenhoek lens at Utrecht matched his experimental replica in every significant respect. Making micrographs through such a historic instrument demands the greatest care for the microscope itself, and I designed a purpose-built carrier that allowed the Leeuwenhoek microscope to be carefully focused on a 100-year-old forensic microscope base at Utrecht (Figure 33). A series of images was obtained. Image clarity is remarkable for such a diminutive lens. There is little chromatic aberration; indeed, I often see more in images taken with the most expensive research microscopes available to a present-day researcher. Sometimes, when short of a specimen to show in a lecture, I have slotted in an image taken with a seventeenth-century type of microscope, and nobody has ever commented on the difference. This reminds us of a prime principle of experimental microscopy. We are well advised to disregard the constraints imposed by theoretical physics. Some careful adjustments, coupled with
FIGURE 33 Imaging Leeuwenhoek’s specimens through Leeuwenhoek’s microscope. The author is here capturing the view that Leeuwenhoek obtained of his own specimens, the first time the two had been thus brought together for more than three centuries. With the help of Dr. Peter Hans Kylstra at the Museum for the History of Science, Utrecht University, an old forensic microscopy stand was adapted by the author to support the Leeuwenhoek microscope and specimen for micrography. The results showed that the original specimens gave excellent images (see Figure 29) and present-day specimens (see Figure 28) are resolved with remarkable detail.
judicious tweaking of the illumination source and the alignment of the lens, can combine to allow the microscopist to see features in the specimen that should not be resolved in theory. This form of experimental adjustment was second nature to the pioneers. Our knowledge of optics, and our understanding of the theoretical constraints, can hinder rather than help. Disregarding the conventional physics can be the start of great discoveries. This can be as true in the modern world (Hell, 2003; Hell & Schönle, 2008; Punge et al., 2008) as it was when microscopy was young.
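The skeptics' reasoning can be made concrete. The sketch below is my own back-of-envelope calculation using the Abbe two-point criterion d = λ/(2·NA) at an assumed mid-visible wavelength; both the criterion and the wavelength are my assumptions, and van Zuylen's published calculated resolution of 1.16 µm evidently rests on a different basis.

```python
# A skeptic's estimate of the smallest detail the Utrecht lens
# "should" resolve, via the Abbe two-point criterion:
#     d = wavelength / (2 * NA)
WAVELENGTH_UM = 0.55   # green light, mid-visible (assumed)
NA_UTRECHT = 0.13      # numerical aperture measured by van Zuylen

d_um = WAVELENGTH_UM / (2 * NA_UTRECHT)
print(f"Abbe limit at NA {NA_UTRECHT}: {d_um:.2f} um")  # ~2.12 um
```

On this estimate, detail much below about 2 µm should be lost and a bacterium of ordinary size should be invisible: exactly the conclusion the examination question described earlier expected, and exactly what the practical observations in this chapter contradict.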
6.2. Sources of Inspiration Leeuwenhoek’s investigations are justly respected, and his status among the pioneers of microscopy is assured. For centuries, writers have commented on his ingenious inquisitiveness, and praised his brave excursions into untrodden fields of research. His discovery of the power of the lens has been attributed to his work in Delft as a draper, for dealers in textiles used magnifiers to assess the quality of their goods. My view is different. I believe that Leeuwenhoek was not the purely self-taught microscopist that we imagine. The evidence exists to substantiate a very different view: Leeuwenhoek’s microscope was not designed by him at all, but by the English natural philosopher, Robert Hooke.
Brian J. Ford
FIGURE 34 How Robert Hooke observed the flea through his compound instrument. Among the subjects that are featured in Robert Hooke’s great work, ‘‘Micrographia’’ (1665), are a range of insect specimens. The range of detail that Hooke portrays is remarkable (see Figure 36), and this poses us a seemingly unanswerable paradox. As can be seen here, the image produced by Hooke’s type of compound microscope is of low resolution. None of the key features (including those fine chitinous body hairs) is resolved. Yet his illustrations are produced at far higher resolution—and the details that are published in Hooke’s book are clearly beyond the capacity of his microscope to resolve.
Robert Hooke (1635–1703) is best remembered for the coinage (familiar to physicists) of Hooke’s law, which he published in 1678 as Ut tensio, sic vis, which translates as “As is the extension, so is the force”. Equally familiar to physicists is the concept that a perfect arch has the same shape inverted, as a hanging chain. Few realize that this was also a coinage of Hooke’s, expressed thus: Ut pendet continuum flexile, sic stabit contiguum rigidum inversum, which is translated: “As hangs a flexible cable, so stand the contiguous pieces of an arch inverted”. Hooke’s folio-sized book Micrographia (Hooke, 1665) was the first major popular science book and was filled with vivid engravings of objects seen through a microscope. There were fleas and lice, ants and flies, gnats and nettles; all vividly portrayed in superb relief with all the eye-catching brilliance of a SEM. That was not all, though, for the book was rooted firmly in physics. Hooke published an illustration of his microscope (see Figure 21), which showed his design for a lens-grinding machine, explored the heavens and set down engravings of the stars and the craters on the moon; considered light, and the interplay of colors in shot silk (he long protested bitterly that his ideas on light had been systematically purloined by Isaac Newton) in a wide-ranging volume that, in many ways, was a survey of current scientific thinking. There remains a major paradox, however. The fine details published by Hooke could not have been observed through his famous microscope. The image of the early compound microscope was poor (Figure 34), yet his fine
FIGURE 35 Pediculus humanus, the human body louse, resolved by a simple microscope. The antenna, anterior structures, and an anterior claw of a human head louse are imaged here by the spinel lens made by Horace Dall (see Figure 15) and magnifying 400 ×. The degree of detail that the single lens reveals is remarkable. Note, for instance, the chitinous lens (below, left) of an eye; the details of the segmented antenna; and the claw which is also vividly resolved. Of greatest interest are the fine hairs of chitin that project from the exoskeleton. These are exceedingly fine structures and they are close to the limits of resolution. These structures are largely imaged as diffraction fringes—but they can clearly be observed, and this is the important point. Clearly, Hooke used a simple microscope to observe such diminutive details. His compound microscope (see Figures 2 and 34) was not up to the task.
engravings showed details that were simply incapable of being resolved by the lenses. The truth is surprising: Hooke used a simple microscope for the observations of fine structural detail (Figure 35) and relied on his grand, compound instrument for the overall impression of an entire specimen. It was the single lens that gave him access to the extraordinary amount of detail in his engraved images (Figure 36). I have deduced that Leeuwenhoek was influenced by the book, for Robert Hooke describes the microscopic structure of three selected specimens: cork, elder pith, and the white of a quill feather. Leeuwenhoek visited London in 1666, when Hooke’s book was the talk of the town. When he started sending the results of his own microscopical investigations to London in 1674, I find that he chose exactly the same specimens and even listed them in the same order. Leeuwenhoek’s first specimens sent to London were cork, elder pith, and the white of a quill feather. The line of influence is unmistakable. Hooke’s influence on Leeuwenhoek as a microscopist is clear enough, although Hooke’s microscopes were compound instruments, made by the London instrument maker Christopher Cock, whereas Leeuwenhoek made his own simple microscopes. They are always described as of Leeuwenhoek’s own design, but in my view this conclusion needs to be revised.
FIGURE 36 The larva of common gnat Culex pipiens published by Hooke in 1665. Hooke’s engravings of lice, fleas, ants, houseflies, and gnats are well known, but this mosquito larva is rarely reproduced. It is a fine example of his assiduous investigations of the insect world. Note, for example, the clarity with which the chitinous body hairs are reproduced—this view (as we have seen) is clearly the result of observations using a simple microscope, rather than his familiar compound instrument (see Figure 2). Most of the specimens Hooke published in his book are visible to the naked eye. They are essentially macroscopic (rather than microscopical) observations. Yet the fine details are truly microscopic, and can only have been made with a single lens.
The designer of the Leeuwenhoek microscope was actually none other than Robert Hooke. Hidden away in the Preface to Micrographia is the recipe (Figure 37): Take a very clear piece of broken Venice glass [7] and in a lamp draw it out into very small hairs or threads, then holding the ends of these threads in the flame, till they melt and run into a small round Globule or drop, which will hang at the end of the thread; and if further you stick several of these upon the end of a stick with a little sealing Wax, so as that the threads stand upwards, and then on a Whetstone first grind off a good part of them, and afterward on a smooth Metal plate, with a little Tripoly, rub them till they come to be very smooth; and if one of these be fixt with a little soft Wax against a small needle hole, prick’d through a thin Plate of Brass, Lead, Pewter, or any other Metal, and an Object, plac’d very near, be looked at through it, it will both

[7] Venice glass meant any high-quality soda glass. Venice (more properly Murano, to which island the Venetian glass-makers fled when repeatedly attacked by raiders) was, and remains, a center of glassmaking excellence.
FIGURE 37 From the Preface to Micrographia: Hooke’s recipe for success. The description of the way that Hooke made simple microscopes is hidden away in the unnumbered pages of the Preface. Lurking on the twenty-second page is this description of his method: ‘‘Take a very clear piece of...glass, and in a Lamp draw it out into very small hairs...they melt and run into a small round Globule...first grind off a good part of them [and] rub them till they come to be very smooth; and if one of these be fixt...against a hole [in] a thin Plate of Brass...it will magnifie and make some Objects more distinct than any of the great [=compound] Microscopes.’’ There we have it—the description that Antony van Leeuwenhoek used as the basis of his design. What we describe as a ‘‘Leeuwenhoek microscope’’ is actually a design by Robert Hooke.
magnifie and make some Objects more distinct than any of the great [i.e. compound] Microscopes. These Preface pages are unnumbered, unlike those in the body of the book, which may account for the past neglect of this important passage. Hooke’s description shows how anyone could make a serviceable microscope at home, and Leeuwenhoek’s design closely follows these instructions. And so we trace the microscope back to its origins. Yes, there were others who went before, though they did not give rise to significant discoveries, nor to a major conceptual breakthrough. Earlier studies, like those of a bee published by Stelluti in 1625 (Figure 38) and of a moth by Borel in 1655 (Figure 39), revealed little more than can be seen with the naked eye. These observations were made before the manufacture of microscope lenses had been developed. There are older lenses, the oldest known lens dating from 3,000 years ago (Figure 40), but objects like this were probably decorative gems, rather than genuine magnifiers. If there is any criticism to make of Micrographia, it is that most of the objects featured are portrayed in macro- rather than micro-mode. Yet beneath the stunning insects and beautiful plant specimens featured in the book lies the clue to Leeuwenhoek’s success: Hooke’s own design for the first-ever highpower microscope.
FIGURE 38 First microscopical observation of a compound eye from 1625. Vivid studies of the honey bee, Apis mellifera, were made by Francesco Fontana and published by Francesco Stelluti as illustrations for a tract begun by Frederico Cesi who, unfortunately, had died before the work was completed. The book, entitled Apiarium, was dedicated to Cardinal Francesco Barberini, whose family’s coat of arms was a shield bearing three bees. Small components of the bee are shown (e.g., the mouthparts, No. 9) and details of the antennae and compound eye are revealed (No. 8). [See also Bardell (1983). The first record of microscopic observations. BioScience 33(1), 36–38.]
FIGURE 39 Borel’s moth from the pre-microscopical study of nature. This crude woodcut of the antennae of a moth was published by Pierre Borel (1620–1671) in 1655. Borel was a French physician and botanist who published a range of rambling discussions of microscopes (and telescopes) in his book, ‘‘De vero telescopii inventore, cum brevi omnium conspiciliorum historia accessit etiam centuria observationum microcospicarum’’ [published by Adriaan Vlacq, The Hague]. Other enlarged images had been published by many philosophers, from Francesco Stelluti (a honey bee, 1625) right back to Olaus Magnus (a snowflake, 1555); but these showed no more detail than close inspection with the naked eye could reveal. It was Robert Hooke who popularized the use of the microscope and produced the design that Leeuwenhoek would use to carry us far into the world of microscopy.
FIGURE 40 The Nimrod lens—the earliest yet discovered. The earliest artefact in the field of optics is this ground crystal lens dating back 3000 years. It was discovered at the ancient Assyrian palace of Nimrod by Austen Henry Layard in 1845, near the present-day Mosul in Iraq. Suggestions have been made that it was used to magnify, or perhaps served a practical function as a burning-glass to ignite a fire. Even more fanciful is the idea that it was used in an early telescope. It is likely just to be a jewel, an ornament; however, it is clearly a lens and takes us back to the cradle of civilization—and the earliest artifact that physicists could legitimately embrace.
The value of a Leeuwenhoek image stood the test of time. After Leeuwenhoek died, his daughter sent a box of silver microscopes (many of them still complete with specimens) to the Royal Society in token of her late father’s esteem for that learned body. In 1738, Lord Jersey wrote recommending that these Leeuwenhoek microscopes would be the best available for research (Ford, 1991). Baker described them in 1743. He wrote, “At the time I am writing this, the Cabinet of Microscopes left by that famous Man, at his Death, to the Royal Society, as a legacy, is standing upon my Table” (Baker, 1743). They were eventually borrowed from the Royal Society in the 1820s by Sir Everard Home, a surgeon and associate of John Hunter, the surgeon and anatomist, whose papers Home was given to publishing as his own. They were never seen again and were probably lost in a fire at Home’s apartment in Chelsea (Ford, 1991). It is a salutary revelation that most of the major discoveries of the nineteenth-century microscopists could have been made with the single-lens microscope made by Leeuwenhoek, if only people had possessed the mounting techniques and the knowledge of what to look for. Bacteria and microscopic fungi, even human chromosomes, all could be seen with a Leeuwenhoek lens. Not only did Leeuwenhoek observe cells, but he also figured nuclei and spermatozoa, and even observed Brownian motion. And it was all done by working on a design by Robert Hooke, and by ignoring what the specialists said. Just adjust the instrument, tweak the light, and use unimaginable patience. Optical physics? This honorable discipline dates back more than 1,000 years. Yet who needs it, when the wind of innovation fills the microscopist’s sails? It is “what you can see” that counts.
ACKNOWLEDGMENTS My greatest debt of gratitude is to Professor Archie Howie, FRS, of Churchill College, Cambridge University, for his advice over many years and particularly for his generous and authoritative advice on the early draft of this chapter. I am also indebted to Sir Sam Edwards, FRS, for scrutinizing the final paper prior to publication. Funding in support of aspects of the work has been provided by the Royal Society, the Linnean Society of London, the Leverhulme Trust, the American Microscopical Society, the Kodak Bursary Scheme, and the National Endowment for Science, Technology and the Arts (NESTA). Many colleagues have been cited in the Leeuwenhoek Legacy (Ford, 1991), and I would like to offer special thanks to Sir Andrew Huxley of Cambridge; Dr. J. van Zuylen of Zeist, Netherlands; Dr. H. Hansen of Antwerp, Belgium, Professor R. V. Jones of Aberdeen, Scotland; Mr. Leslie Townsend of Broadstairs, England; Mr. Horace Dall of Luton, England; Professor Peter Hans Kylstra of Utrecht, Netherlands; Dr. G. Van Steenbergen of Munich, Germany; Professor Denis Bellamy of Cardiff University, Wales;
Mr. Gren Ll Lucas of Kew, England, Ms Gina Douglas, Linnean Society of London; and Mr. Norman Robinson of the Royal Society. I remain grateful to the librarians and staff of the University Library, Cambridge; the Linnean Society of London; the Whipple Museum for the History of Science, Cambridge; the Royal Society and the University of Cardiff.
REFERENCES

Abbe, E. (1873). Beiträge zur Theorie des Mikroskops und der mikroskopischen Wahrnehmung. Archiv für mikroskopische Anatomie, 9, 413–420.
Baker, H. (1743). The microscope made easy. London: Dodsley, Cooper and Cuff.
Bentham, G. (1877). Outlines of Botany with special relevance to local floras. In J. G. Baker (Ed.), Flora of Mauritius and the Seychelles: Vol. 133 (25) (p. 244). London: L. Reeve & Co, IV.
Bown, W. (1992). Brownian Motion sparks renewed debate. New Scientist, 133(25), 15.
Bradbury, S. (1968). The eighteenth century. In The microscope past and present (p. 85). Oxford: Pergamon.
Breen, Q. (1956). Abraham Trembley of Geneva, scientist and philosopher, 1710–1784. Journal of the Medical Library Association, 44, 84–85.
Brocksh, D. (Ed.) (2005). In memory of Ernst Abbe. Innovation, 15, 1–15.
Brown, R. (1827). A brief account of microscopical observations, and on the general existence of active molecules in organic and inorganic bodies. London (privately printed).
Burnett, W. A. S., & Martin, L. V. (1992). Charles Darwin’s microscopes. Microscopy, 36(8), 604–627.
Carpenter, W. (1862). The microscope and its revelations (2nd edition). London: John Churchill.
Clay, R. S., & Court, T. H. (1932). The history of the microscope. London: Charles Griffin & Co.
Court, T. H., & von Rohr, M. (1929). A history of the development of the telescope from about 1675 to 1830 based on documents in the Court collection. Transactions of the Optical Society, 30, 207–260.
Darwin, C. (1864). Letter to Asa Gray of 28 May 1864. In Life and letters of Charles Darwin II (1903) (p. 497). London: John Murray.
Degen, C. L., Poggio, M., Mamin, H. J., Rettner, C. T., & Rugar, D. (2009). Nanoscale magnetic resonance imaging. Proceedings of the National Academy of Sciences USA, 106, 1313–1317.
Deutsch, D. H. (1991). Did Robert Brown observe Brownian Motion: probably not. Bulletin of the American Physical Society, 36(4), 1374. Reported in Scientific American, 265: 20.
Dobell, C. (1932). Antony van Leeuwenhoek and his little animals. Amsterdam: Swets and Zeitlinger.
Duyker, E., & Tingbrand, P. (editors and translators) (1995). Daniel Solander, collected correspondence 1753–1782. Melbourne: Miegunyah Press.
Ford, B. J. (1968). The concept of “antipoint” applied to submicroscopic fibrillar structures. Proceedings of the Royal Microscopical Society, 3(1), 14–16.
Ford, B. J. (1970). Searching for ultimate resolution. British Journal of Photography, 117(5716), 144–145.
Ford, B. J. (1981a). Leeuwenhoek’s specimens discovered after 307 years. Nature, 292, 407.
Ford, B. J. (1981b). The van Leeuwenhoek specimens. Notes and Records of the Royal Society, 36(1), 37–59.
Ford, B. J. (1982). The microscope of Robert Brown. The Linnean, 1(4), 12–17.
Ford, B. J. (1984). The restoration of Brown’s first botanical microscope. Microscopy, 34, 406–418.
Ford, B. J. (1985). Single lens, story of the simple microscope. London: Heinemann, p. 154.
Ford, B. J. (1991). The Leeuwenhoek Legacy. Bristol: Biopress and London: Farrand Press.
Ford, B. J. (1992a). Brownian movement in Clarckia pollen, a reprise of the first observations. Microscope, 40(4), 235–241.
Ford, B. J. (1992b). Robert Brown, Brownian movement, and teethmarks on the hatbrim. Microscope, 39(3 & 4), 161–171.
Ford, B. J. (1992c). The controversy of Robert Brown and Brownian movement. Biologist, 39(3), 82–83.
Ford, B. J. (1995). First steps in experimental microscopy, Leeuwenhoek as practical scientist. The Microscope, 43(2), 47–57.
Ford, B. J. (1996). Confirming Robert Brown’s observations of Brownian Movement. Proceedings of the Royal Microscopical Society, 31(4), 316–321.
Ford, B. J. (1998). The earliest views. Scientific American, 278(4), 50–53 (US edition); Scientific American, 278(4), 42–45 (British edition).
Ford, B. J. (2001). The Royal Society and the microscope. Notes and Records of the Royal Society, 55(1), 29–49.
Goring, C. R. (1827). On achromatic microscopes with a description of certain objects for trying their definition and penetrating powers. Quarterly Journal of Science, Literature, Natural History, and the Fine Arts, 23, 410–415.
Haaxman, P. J. (1875). Antony van Leeuwenhoek, de ontdekker der infusorien. Leiden: van Doesburg.
Hell, S. W. (2003). Toward fluorescence nanoscopy. Nature Biotechnology, 21, 1347–1355.
Hell, S. W., & Schönle, A. (2008). Nanoscale resolution in far-field fluorescence microscopy. In P. W. Hawkes, & J. C. H. Spence (Eds.), Science of Microscopy (pp. 790–834). New York: Springer.
Hodgkin, T., & Lister, J. J. (1827). Notice of some microscopical objects of the blood and animal tissue. Philosophical Magazine, 2, 130.
Hooke, R. (1665). Micrographia, etc. London: Martyn and Allestry. Republished in paperback (1961), New York: Dover Publications.
Hoole, S. (1798–1807). The select works of Antony Van Leeuwenhoek containing his microscopical discoveries in many of the works of nature, in 2 volumes. London: G. Sidney.
Hughes, A. (1955). Studies in the history of the microscope 1—the influence of achromatism. Journal of the Royal Microscopical Society, 75, 1–22.
Institute of Physics. Did you know... Brownian motion. Einstein Year website. Available at http://www.einsteinyear.org/facts/brownian motion/ (accessed January 2009).
King, P. P. (1827). Character and description of Kingia. In Narrative of a survey of the intertropical and western coasts of Australia II (pp. 536–563). London. 5: Botanical Appendix: 534 et seq.
Lenhoff, H. M. (Ed.) (1983). Hydra: research methods. New York and London: Plenum Press.
Lenhoff, S. G., & Lenhoff, H. M. (1986). Hydra and the birth of experimental biology. Pacific Grove, California: Boxwood Press.
Linnaeus, C. (1758). Systema naturæ per regnum tria naturæ. Holmiæ: Laurentii Salvii.
Lister, J. J. (1830). On some properties in achromatic object-glasses applicable to the improvement of the microscope. Philosophical Transactions of the Royal Society of London, 120, 187–200.
Lovell, D. J. (1967). Optical peregrinations in the Netherlands. Applied Optics, 6, 785–791.
Matile, P. (1998). Vacuoles—discovery of lysosomal origin. In S. D. Kung, & S. F. Yang (Eds.), Discoveries in plant biology (p. 279). Singapore: World Scientific Publishing.
Moldenhawer, J. P. (1812). Beyträge zur Anatomie der Pflanzen. Berlin: Königlichen Schulbuchdruckerei.
Nordenskiöld, E. (1928). The history of biology. New York: Tudor Publishing.
Nuttall, R. H. (1971). The achromatic microscope in the first half of the nineteenth century. Journal of the Quekett Microscopical Club, 32, 116.
Optical Microscopy Division, National High Magnetic Field Laboratory. Molecular expressions: exploring the world of optics and microscopy. Available at http://www.molecularexpressions.com/primer/museum/amiciachromatic.html (accessed November 2008).
Perrin, J. B. (1909). Annales de Chimie et de Physique, 8me série. Translated by F. Soddy (1910), Brownian Movement and Molecular Reality. London: Taylor and Francis.
Punge, A., Rizzoli, S. O., Jahn, R., Wildanger, J. D., Meyer, L., Schönle, A., et al. (2008). 3D reconstruction of high-resolution STED microscope images. Microscopy Research and Technique, 71(9), 644–650.
Raspail, F. V. (1830). Essai de chimie microscopique appliquée à la physiologie. Paris: Meilhac.
Remak, R. (1852). Über extracellulare Entstehung thierischer Zellen und über Vermehrung derselben durch Theilung. Archiv für Anatomie, Physiologie und wißenschaftliche Medicin, 47, 1903.
Remak, R. (1855). Untersuchungen über die Entwickelung der Wirbelthiere. Berlin: G. Reimer.
Report (1932). Centenary of Robert Brown’s discovery of the nucleus. Journal of Botany, (January).
Schleiden, M. J. (1838). Beiträge zur Phytogenesis. Archiv für Anatomie, Physiologie und wißenschaftliche Medicin: 136–176.
Schrader, M., Hell, S. W., & van der Voort, H. T. M. (1998). Three-dimensional super-resolution with a 4Pi-confocal microscope using image restoration. Journal of Applied Physics, 84, 4033–4042.
Schwann, T. (1839). Mikroskopische Untersuchungen über die Übereinstimmung in der Struktur und dem Wachstum der Thiere und Pflanzen. Berlin: G. Reimer.
Shimmen, T., & Yokota, E. (2004). Cytoplasmic streaming in plants. Current Opinion in Cell Biology, 16(1), 68–72.
Sidles, J. A. (2009). Spin microscopy’s heritage, achievements and prospects. Proceedings of the National Academy of Sciences USA, 106, 2477–2478.
Trembley, A. (1744). Mémoires pour servir à l’histoire d’un genre de polypes d’eau douce, à bras en forme de cornes. Leiden: Jean & Herman Verbeek.
Treviranus, L. (1806). Vom inwendigen Bau der Gewächse und von der Saftbewegung in denselben. Göttingen: Dieterich.
van Niel, C. B. (1949). The “Delft School” and the rise of general microbiology. Bacteriological Reviews, 13(3), 161–164.
van Zuylen, J. (1981). The microscopes of Antony van Leeuwenhoek. Proceedings of the Royal Microscopical Society, 121(3), 309–328.
Wilson, J. (1743). The description and manner of using a late invented set of small pocket microscopes. Philosophical Transactions, 23, 1241–1247.
Wolf, K. B. (1995). Geometry and dynamics in refracting systems. European Journal of Physics, 16, 14–20.
Wolpert, L. (1995). Evolution of the cell theory. Philosophical Transactions of the Royal Society of London, B 349, 227–233.
Zeiler, H. W. (1969). What resolving power formula do you use? Microscope, 17(4), 19.
Chapter 3

Image Decomposition: Theory, Numerical Schemes, and Performance Evaluation

Jérôme Gilles
Contents
1. Introduction
2. Preliminaries
   2.1. Wavelets
   2.2. Multiresolution Analysis
   2.3. Directional Multiresolution Analysis
   2.4. Ridgelets
   2.5. Curvelets
   2.6. Contourlets
   2.7. Function Spaces
3. Structures + Textures Decomposition
4. Structures + Textures + Noise Decomposition
   4.1. BV–G–G Local Adaptive Model
   4.2. Aujol–Chambolle BV–G–E Model
   4.3. The BV–G–Ċo^{−1,∞} Decomposition Model
5. Performance Evaluation
   5.1. Test Image
   5.2. Evaluation Metrics
   5.3. Image Decomposition Performance Evaluation
6. Conclusion
Appendix A. Chambolle’s Nonlinear Projectors
   A.1. Notations and Definitions
   A.2. Total Variation
   A.3. Chambolle’s Projectors
   A.4. Extension
References
DGA/CEP – EORD Department, 16bis rue Prieur de la Côte d’Or, Arcueil, France
Advances in Imaging and Electron Physics, Volume 158, ISSN 1076-5670, DOI: 10.1016/S1076-5670(09)00008-1. Copyright © 2009 Elsevier Inc. All rights reserved.
1. INTRODUCTION

In the last few years, different algorithms have been proposed to decompose an image into structures and textures components, or further into structures, textures, and noise components. The initial idea was proposed by Meyer (2001). He proposed starting from the Rudin–Osher–Fatemi (ROF) algorithm (Rudin, Osher, & Fatemi, 1992), which was designed to perform image denoising. Meyer showed that this model rejects the textures and then proposed a new function space, G, replacing the L²-norm by the G-norm. He proved that this space corresponds to a space of oscillating functions that is well suited to model textures. Two years later, two numerical schemes were proposed to solve Meyer’s model, in particular the algorithm based on Chambolle’s nonlinear projector. It is easy to implement, and its convergence conditions are given by a theorem. These models work well provided no noise is present in the image. Otherwise, it is necessary to extend them to a three-part model. Different approaches were proposed, based on a locally adaptive algorithm or on wavelet soft thresholding (Aujol & Chambolle, 2005; Gilles, 2007b). This chapter describes the philosophy developed by Meyer and gives a description of the different structures + textures models in Section 3, and structures + textures + noise models in Section 4. A new three-part model, based on contourlet soft thresholding, is introduced; this model improves on the results of the previous algorithms. Section 5 deals with performance evaluation of the decomposition algorithms. A specific methodology is proposed. First, we create test images by recomposing structures, textures, and noise reference images that are generated separately. We then define metrics to evaluate the quality of the different components obtained at the output of the decomposition algorithms (in particular, the problem of how to measure the residue remaining in the noise component).
Before detailing the different decomposition models, the next section provides some preliminaries and notation, such as the wavelet and contourlet formalisms. It also presents the different function spaces and their associated norms that are used in the remainder of the chapter. We conclude by summarizing the different models and their performance, and we give some perspectives on this work.
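The ROF model and Chambolle's projector mentioned above can be made concrete. The following is a minimal numpy sketch, not the chapter's implementation: it performs the two-part split via Chambolle's (2004) projection iteration, with the step size tau = 0.125 taken from the convergence theorem; the weight `lam` and iteration count are illustrative choices.

```python
import numpy as np

def grad(u):
    """Forward-difference gradient with Neumann boundary conditions."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    """Backward-difference divergence, the adjoint of -grad."""
    dx = np.zeros_like(px)
    dy = np.zeros_like(py)
    dx[0, :] = px[0, :]
    dx[1:-1, :] = px[1:-1, :] - px[:-2, :]
    dx[-1, :] = -px[-2, :]
    dy[:, 0] = py[:, 0]
    dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]
    dy[:, -1] = -py[:, -2]
    return dx + dy

def rof_decompose(f, lam=0.5, tau=0.125, n_iter=100):
    """Split f into a cartoon part u (structures) and a residual v = f - u
    by minimizing ||f - u||^2 / (2 * lam) + TV(u) with Chambolle's projector."""
    px = np.zeros_like(f)
    py = np.zeros_like(f)
    for _ in range(n_iter):
        gx, gy = grad(div(px, py) - f / lam)
        norm = np.sqrt(gx ** 2 + gy ** 2)
        px = (px + tau * gx) / (1.0 + tau * norm)
        py = (py + tau * gy) / (1.0 + tau * norm)
    u = f - lam * div(px, py)
    return u, f - u
```

The residual v = f - u collects the oscillating content that the ROF model rejects, which is precisely the observation that led Meyer to introduce the G-norm for textures.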
2. PRELIMINARIES

This section describes all the definitions used in the chapter. We start by recalling the multiresolution formalism, especially as based on wavelets and on other geometric approaches such as curvelets and contourlets. We also introduce different function spaces, such as the space of functions of bounded variation (BV), Besov spaces, and so on. We complete these descriptions by defining a space
based on the contourlet expansion, which will be used in the new three-part decomposition model presented in Section 4.3.
2.1. Wavelets

Let us start with the notation and properties of wavelet analysis. The first wavelet expansions of a one-dimensional (1D) signal appeared in the 1980s (Härdle, Kerkyacharian, Picard, & Tsybakov, 1997; Mallat, 1999; Vidakovic & Mueller, 1991). The well-known contributors to wavelet theory include, but are not restricted to, Meyer (1993), Mallat (1999), and Daubechies (1992). In the following, we assume that we have a 1D signal, but the D-dimensional extension is obtained naturally by using D separable transforms along the different variables. Wavelet analysis outperforms the Fourier representation. The Fourier transform decomposes a signal over a sine–cosine basis. This transform is well localized in frequency but not in time (sine and cosine functions are defined over an infinite domain). For example, if we analyze a transient phenomenon, its Fourier transform covers the whole frequency plane even though the phenomenon is well localized in time. Evidently, a transform that is localized in both time and frequency is needed. The first solution used a windowed Fourier transform, which decomposes the time-frequency plane into many time-frequency atoms. However, this transform is not completely satisfactory because it does not allow adaptable atoms. Yet we may want to analyze many transient phenomena of different lengths, for which adaptable atoms are necessary. The wavelet transform affords us this opportunity, and we now recall its definition.

2.1.1. Continuous Case

The wavelet transform decomposes a signal over a set of translated and dilated versions of a mother wavelet. A mother wavelet is a function $\psi \in L^2(\mathbb{R})$ that satisfies the following criteria:

$$\int_{\mathbb{R}} \psi(t)\,dt = 0 \quad \text{(zero mean)}, \tag{1}$$

$$\|\psi\|_{L^2} = 1 \quad \text{(normalized)}, \tag{2}$$

and $\psi$ needs to be centered on 0. If we denote by $a$ and $b$ the dilation and translation parameters, respectively, then the set of wavelets is obtained from the mother wavelet $\psi$ by

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right). \tag{3}$$
Then, we can define the wavelet transform of a function $f \in L^2(\mathbb{R})$ at time $b$ and scale $a$ by ($\psi^*$ is the complex conjugate of $\psi$)

$$WT_f(a,b) = \langle f, \psi_{a,b}\rangle = \int_{\mathbb{R}} f(t)\,\frac{1}{\sqrt{a}}\,\psi^*\!\left(\frac{t-b}{a}\right) dt. \tag{4}$$

It is easy to see that the wavelet transform can be written as a convolution product (denoted $\star$):

$$WT_f(a,b) = (f \star \bar{\psi}_a)(b), \quad \text{where} \quad \bar{\psi}_a(t) = \frac{1}{\sqrt{a}}\,\psi^*\!\left(\frac{-t}{a}\right). \tag{5}$$
The following theorem gives the conditions that permit reconstruction of the function $f$ from its wavelet expansion.

Theorem 1. Let $\psi \in L^2(\mathbb{R})$ be a real wavelet that respects the following admissibility condition:

$$C_\psi = \int_0^{+\infty} \frac{|\hat{\psi}(\xi)|^2}{\xi}\,d\xi < +\infty, \tag{6}$$

where $\hat{\psi}$ is the Fourier transform of $\psi$. Then, all functions $f \in L^2(\mathbb{R})$ verify

$$f(t) = \frac{1}{C_\psi} \int_0^{+\infty}\!\! \int_{\mathbb{R}} WT_f(a,b)\,\frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right) db\,\frac{da}{a^2} \tag{7}$$

and (Parseval relation)

$$\int_{\mathbb{R}} |f(t)|^2\,dt = \frac{1}{C_\psi} \int_0^{+\infty}\!\! \int_{\mathbb{R}} |WT_f(a,b)|^2\,db\,\frac{da}{a^2}. \tag{8}$$
A proof can be found in Mallat (1999). Many papers in the literature deal with the choice of the mother wavelet $\psi$. According to the intended application, we can impose some complementary constraints on the wavelet (e.g., its regularity, the length of its support, the number of its vanishing moments).

2.1.2. Discrete Case

In practice, we have digital signals composed of $N$ samples, denoted $f[n]$. Let $\psi(t)$ be a continuous wavelet whose support is $[-K/2, K/2]$; then the discrete wavelet, dilated by $2^j$, is defined as

$$\psi_{jn}[k] = \frac{1}{\sqrt{2^j}}\,\psi[2^{-j}k - n]. \tag{9}$$
Then the discrete wavelet transform can be written as

$$WT_f[n,j] = \sum_m f[m]\,\psi^*_{jn}[m] = \langle f, \psi_{jn}\rangle, \tag{10}$$

and the reconstruction formula holds if $\psi$ has some complementary properties (see Mallat (1999) for more details). Then, we have

$$f[m] = \sum_{j=0}^{+\infty} \sum_n WT_f[n,j]\,\psi_{jn}[m]. \tag{11}$$
These relations show that filter banks, defined from ψ, can be used to implement the wavelet transform and its inverse.
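The filter-bank implementation can be illustrated with the simplest case, the Haar wavelet. The sketch below is an illustrative one-level analysis/synthesis pair (it assumes an even-length signal and is not the chapter's own code): the analysis step is a low-pass and a high-pass filter followed by downsampling, and the synthesis step inverts it exactly.

```python
import numpy as np

S2 = np.sqrt(2.0)

def haar_analysis(f):
    """One level of the Haar filter bank: approximation and detail
    coefficients, each half the length of the (even-length) input."""
    f = np.asarray(f, dtype=float)
    approx = (f[0::2] + f[1::2]) / S2   # low-pass filter + downsample
    detail = (f[0::2] - f[1::2]) / S2   # high-pass filter + downsample
    return approx, detail

def haar_synthesis(approx, detail):
    """Inverse filter bank: upsample and recombine the two channels."""
    f = np.empty(2 * len(approx))
    f[0::2] = (approx + detail) / S2
    f[1::2] = (approx - detail) / S2
    return f
```

Because the Haar basis is orthonormal, the transform also preserves the signal energy, a discrete analogue of the Parseval relation (8).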
2.2. Multiresolution Analysis

Multiresolution analysis is defined in Mallat (1999). Let $\{V_j\}_{j\in\mathbb{Z}}$ be a set of closed subspaces of $L^2(\mathbb{R})$. It is said to be a multiresolution approximation if it meets the following conditions:

$$\forall (j,k) \in \mathbb{Z}^2, \quad f(t) \in V_j \Leftrightarrow f(t - 2^j k) \in V_j, \tag{12}$$

$$\forall j \in \mathbb{Z}, \quad V_{j+1} \subset V_j, \tag{13}$$

$$\forall j \in \mathbb{Z}, \quad f(t) \in V_j \Leftrightarrow f\!\left(\frac{t}{2}\right) \in V_{j+1}, \tag{14}$$

$$\lim_{j\to+\infty} V_j = \bigcap_{j=-\infty}^{+\infty} V_j = \{0\}, \tag{15}$$

$$\lim_{j\to-\infty} V_j = \overline{\bigcup_{j=-\infty}^{+\infty} V_j} = L^2(\mathbb{R}), \tag{16}$$

and there exists a function $\theta$ such that $\{\theta(t-n)\}_{n\in\mathbb{Z}}$ is a Riesz basis of $V_0$. Let $\varphi$ be a function (called the scale function) whose Fourier transform is defined by

$$\hat{\varphi}(\omega) = \frac{\hat{\theta}(\omega)}{\left(\sum_{k=-\infty}^{+\infty} |\hat{\theta}(\omega + 2k\pi)|^2\right)^{1/2}}. \tag{17}$$
Jérôme Gilles
Then the set {ϕ_{jn}}_{n∈ℤ} defined by

\varphi_{jn}(t) = \frac{1}{\sqrt{2^j}}\, \varphi\!\left(\frac{t - 2^j n}{2^j}\right)  (18)

is an orthonormal basis of V_j. If we define W_j = V_j ⊖ V_{j+1}, the wavelet set {ψ_{jn}}_{n∈ℤ} associated with ϕ (see Härdle et al. (1997), Mallat (1999), and Vidakovic and Mueller (1991) to learn how to build such functions) is an orthonormal basis of W_j. Then every function f ∈ L²(ℝ) can be decomposed as

f(t) = \sum_n \alpha_n \varphi_{0n}(t) + \sum_{j=0}^{+\infty} \sum_n \beta_{jn} \psi_{jn}(t),  (19)

where the coefficients β_{jn} = ⟨f, ψ_{jn}⟩ are the wavelet transform coefficients and α_n = ⟨f, ϕ_{0n}⟩ are the coefficients from the projection on the subspace V₀. In other terms, we have

(19) \Leftrightarrow f \in V_0 \oplus \bigoplus_{j=0}^{\infty} W_j.  (20)
2.3. Directional Multiresolution Analysis
The two-dimensional (2D) extension of a wavelet generally uses the separability principle: a 1D wavelet filter is applied along the horizontal and vertical directions. In natural images, however, the information is not limited to these two directions. It is easy to understand that the multiresolution analysis needs to be extended to encompass directions in the image. Many authors have proposed different approaches to this directional analysis. This chapter describes only those best known in the literature: ridgelets, curvelets, and contourlets.
2.4. Ridgelets
In his doctoral dissertation, Candès (1998) proposes a new transform that deals with directionality in images: the ridgelet transform. The ridgelet functions ψ_{a,b,θ} are defined in a manner similar to wavelets but add the notion of orientation (tuned by the θ parameter):

\psi_{a,b,\theta}: \mathbb{R}^2 \longrightarrow \mathbb{R},  (21)
\psi_{a,b,\theta}(x_1, x_2) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{x_1 \cos\theta + x_2 \sin\theta - b}{a}\right).  (22)
The function ψ_{a,b,θ} is constant along the lines x₁ cos θ + x₂ sin θ = c (c a constant) and is a wavelet ψ in the orthogonal direction. Many properties of wavelet theory can be transposed.

Definition 1. The admissibility condition for a ridgelet is

K_\psi = \int \frac{|\hat\psi(\xi)|^2}{|\xi|^2}\, d\xi < \infty,  (23)

which is equivalent to \int_{\mathbb{R}} \psi(t)\, dt = 0. Moreover, we assume that ψ is normalized:

\int \frac{|\hat\psi(\xi)|^2}{|\xi|^2}\, d\xi = 1.  (24)
Under these assumptions, Candès (1998) defines the ridgelet transform of a function f as follows.

Definition 2. For a function f, the coefficients of its ridgelet transform are given by

R_f(a, b, \theta) = \int \psi_{a,b,\theta}^*(x_1, x_2)\, f(x_1, x_2)\, dx_1\, dx_2 = \langle f, \psi_{a,b,\theta} \rangle,  (25)

and the reconstruction formula is given by

f(x_1, x_2) = \int_0^{2\pi} \int_{-\infty}^{+\infty} \int_0^{+\infty} R_f(a, b, \theta)\, \psi_{a,b,\theta}(x)\, \frac{da}{a^3}\, db\, \frac{d\theta}{4\pi}.  (26)
In addition, the Parseval relation is verified, as in Proposition 1 below.

Proposition 1. If f ∈ L¹ ∩ L²(ℝ²) and if ψ is admissible, then

\|f\|_{L^2}^2 = c_\psi \int |\langle f, \psi_{a,b,\theta} \rangle|^2\, \frac{da}{a^3}\, db\, d\theta,  (27)

where c_\psi = (4\pi)^{-1} K_\psi^{-1}. The proof can be found in Candès (1998). In practice, the ridgelet transform can be implemented by using the Radon transform and the 1D wavelet transform (see Candès (1998) for more details).
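As a rough illustration of this implementation route, one can chain a crude Radon transform with a 1D Haar detail filter. The rotation-based Radon transform and the single-level Haar step below are our simplifications of the exact digital transform described in Candès (1998):

```python
import numpy as np
from scipy.ndimage import rotate

def radon(img, angles_deg):
    """Crude Radon transform: rotate the image and sum along one axis.
    (Only an illustration; exact digital implementations exist.)"""
    return np.stack([rotate(img, a, reshape=False, order=1).sum(axis=0)
                     for a in angles_deg])

def haar_detail(p):
    """One level of 1D Haar wavelet detail coefficients along a projection."""
    return (p[0::2] - p[1::2]) / np.sqrt(2.0)

# A vertical line: its 0-degree projection is a sharp spike, so the
# ridgelet-like coefficients concentrate at that orientation.
img = np.zeros((32, 32))
img[:, 16] = 1.0
sino = radon(img, [0.0, 45.0, 90.0])
coeffs = np.array([haar_detail(p) for p in sino])
```

The line aligned with the 0-degree direction produces large coefficients at that angle and nearly none at 90 degrees, which is the sparsity behavior the ridgelet transform is designed to exploit.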
2.5. Curvelets
From the definition of the ridgelet transform, it is easy to see that it is a global transform (that is, it is efficient at representing lines that cross the entire image). But images contain more general edges that are present only locally. Candès and Donoho (1999), Candès, Demanet, Donoho, and Ying (2005), and Donoho and Duncan (1999) propose a new approach that provides a local directional multiresolution analysis, called the curvelet transform. The idea is to perform a specific tiling of the space and frequency planes by using two windows, the radial window W(r) and the angular window V(t), where (r, θ) are the polar coordinates in the frequency plane and r ∈ (1/2, 2). The window V is defined for t ∈ [−1, 1]. These windows obey the following admissibility conditions:

\sum_{j=-\infty}^{+\infty} W^2(2^j r) = 1, \quad r \in (3/4, 3/2),  (28)

and

\sum_{l=-\infty}^{+\infty} V^2(t - l) = 1, \quad t \in (-1/2, 1/2).  (29)

Then, for each j > j₀, a frequency window U_j is defined in the Fourier domain by

U_j(r, \theta) = 2^{-3j/4}\, W(2^{-j} r)\, V\!\left(\frac{2^{\lfloor j/2 \rfloor} \theta}{2\pi}\right),  (30)
where ⌊j/2⌋ is the integer part of j/2. Let ϕ_j(x) denote the function whose Fourier transform is \hat\varphi_j(\omega) = U_j(\omega) ((r, θ) are the polar coordinates corresponding to ω = (ω₁, ω₂)). Then we define, at scale 2^{−j}, orientation θ_l, and position x_k^{(j,l)}, a set of curvelets by

\varphi_{j,l,k}(x) = \varphi_j\!\left(R_{\theta_l}(x - x_k^{(j,l)})\right),  (31)

where R_{θ_l} is the rotation by θ_l radians. Then the curvelet transform is simply defined by the inner product between a function f ∈ L²(ℝ²) and the set of curvelets. A curvelet coefficient can be written

c(j, l, k) = \langle f, \varphi_{j,l,k} \rangle = \int_{\mathbb{R}^2} f(x)\, \varphi_{j,l,k}^*(x)\, dx.  (32)
More details can be found in Candès et al. (2005). In their paper, the authors prove the following proposition.

Proposition 2. Let f ∈ L²(ℝ²) denote a function expanded over a set of curvelets ϕ_{j,l,k}; we have the following reconstruction formula:

f = \sum_{j,l,k} \langle f, \varphi_{j,l,k} \rangle\, \varphi_{j,l,k} \quad \text{(tight frame)},  (33)

and the Parseval relation is verified:

\sum_{j,l,k} |\langle f, \varphi_{j,l,k} \rangle|^2 = \|f\|_{L^2}^2, \quad \forall f \in L^2(\mathbb{R}^2).  (34)
All details about the numerical aspects can be found in Cand`es et al. (2005).
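The admissibility conditions (28)–(29) can be checked numerically for concrete windows. The sketch below uses simple raised-cosine windows, which satisfy the partition-of-unity identities exactly; this window choice is our assumption (Candès et al. use smoother windows):

```python
import numpy as np

# Angular window V(t) = cos(pi*t/2) on [-1, 1]: V^2(t) + V^2(t-1) = cos^2 + sin^2 = 1,
# so Eq. (29) holds. Radial window W(r) = cos(pi*log2(r)/2) on [1/2, 2] gives
# Eq. (28) by the same identity in u = log2(r).

def V(t):
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, np.cos(np.pi * t / 2.0), 0.0)

def W(r):
    r = np.asarray(r, dtype=float)
    u = np.log2(np.where(r > 0, r, 1.0))
    return np.where((r >= 0.5) & (r <= 2.0), np.cos(np.pi * u / 2.0), 0.0)

t = np.linspace(-0.49, 0.49, 101)
angular_sum = sum(V(t - l) ** 2 for l in range(-3, 4))        # Eq. (29)
r = np.linspace(0.76, 1.49, 101)
radial_sum = sum(W(2.0 ** j * r) ** 2 for j in range(-4, 5))  # Eq. (28)
assert np.allclose(angular_sum, 1.0) and np.allclose(radial_sum, 1.0)
```

Both sums equal 1 on the required intervals, so these windows tile the frequency plane in the sense demanded by the curvelet construction.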
2.6. Contourlets
When Candès et al. proposed the curvelet transform in 1999, they showed many promising results. The main drawback of the first version of curvelets was the difficulty of its numerical implementation (the discrete curvelet transform was proposed in 2005 (Candès et al., 2005)). In order to overcome this problem, Do, Vetterli, and Po (Do, 2001, 2003; Do & Vetterli, 2001, 2002, 2003a,b; Po & Do, 2006) proposed a new algorithm, called the contourlet transform, designed from the start in a discrete framework. The idea is to combine a multiscale decomposition and directional filtering at each scale (Figure 1). The multiscale decomposition is obtained by using a Laplacian pyramid (LP) decomposition (Burt & Adelson, 1983). The directional filtering uses a directional filter bank (DFB) based on quincunx filters (Bamberger & Smith, 1992). In the next theorem, the authors show that this transform produces a tight frame.

Theorem 2. Let j denote the scale, n the position, and (l_j)_{j \leq j_0} the number of directions at each scale j. Then the set

\left\{ \phi_{j_0,n}(t);\ \rho_{j,k,n}^{(l_j)}(t) \right\}_{j \leq j_0,\ 0 \leq k \leq 2^{l_j}-1,\ n \in \mathbb{Z}^2}  (35)

is a tight frame of L²(ℝ²).

All details about the construction of the functions \phi_{j_0,n}(t) and \rho_{j,k,n}^{(l_j)}(t) can be found in Do (2001).
FIGURE 1 Contourlet transform principle (the input f passes through a Laplacian pyramid (LP), and each scale through a directional filter bank (DFB)).
This implies Corollary 1:

f(t) = \sum_n \alpha_n \phi_{j_0,n}(t) + \sum_{j \leq j_0} \sum_{k=0}^{2^{l_j}-1} \sum_n \beta_{j,k,n}\, \rho_{j,k,n}^{(l_j)}(t)  (36)

or

f(t) = \sum_{j \in \mathbb{Z}} \sum_{k=0}^{2^{l_j}-1} \sum_n \beta_{j,k,n}\, \rho_{j,k,n}^{(l_j)}(t),  (37)

where \alpha_n = \langle f \,|\, \phi_{j_0,n} \rangle and \beta_{j,k,n} = \langle f \,|\, \rho_{j,k,n}^{(l_j)} \rangle are the contourlet transform coefficients.
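The multiscale stage of the contourlet transform (Figure 1) is a Laplacian pyramid. A minimal sketch of its decomposition and perfect reconstruction follows; the block-average and pixel-replication filters are simple stand-ins for the Burt–Adelson kernels, and the directional filter bank is omitted:

```python
import numpy as np

def reduce2(img):
    """Downsample by 2 via 2x2 block averaging (stand-in for the LP low-pass)."""
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def expand2(img):
    """Upsample by 2 with pixel replication (stand-in for the LP interpolation)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def lp_decompose(img, levels):
    """Laplacian pyramid: each level stores the prediction residual (band-pass)."""
    bands, low = [], img
    for _ in range(levels):
        small = reduce2(low)
        bands.append(low - expand2(small))  # detail kept at current resolution
        low = small
    return bands, low

def lp_reconstruct(bands, low):
    for band in reversed(bands):
        low = expand2(low) + band
    return low

rng = np.random.default_rng(0)
img = rng.standard_normal((16, 16))
bands, low = lp_decompose(img, 2)
assert np.allclose(lp_reconstruct(bands, low), img)  # LP is perfectly invertible
```

The residual-plus-coarse structure is what makes the pyramid invertible regardless of the filters used, which is why the contourlet construction can focus its design effort on the directional stage.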
2.7. Function Spaces
In Sections 3 and 4, we will use some function spaces and, more particularly, their associated norms. This section briefly describes the spaces of interest (it is assumed that the reader knows the L^p spaces; d denotes the dimension). The goal of these spaces is to characterize properties such as the differentiability and the regularity of functions.
2.7.1. Sobolev Spaces
The first spaces we are interested in are the Sobolev spaces W^{k,p}. These are the spaces of functions f such that f and its weak derivatives up to order k have a finite L^p norm, for a given p ≥ 1. These spaces are endowed with the following norm:

\|f\|_{W^{k,p}} = \left( \sum_{i=0}^{k} \|f^{(i)}\|_{L^p}^p \right)^{1/p} = \left( \sum_{i=0}^{k} \int |f^{(i)}(t)|^p\, dt \right)^{1/p}.  (38)
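As a numerical illustration of Eq. (38), the W^{k,p} norm of a sampled function can be approximated with finite differences; a minimal sketch (the Riemann-sum quadrature and np.gradient derivatives are our simplifications):

```python
import numpy as np

def sobolev_norm(f, dx, k=1, p=2):
    """Approximate the W^{k,p} norm of Eq. (38): sum of the L^p norms (to the
    power p) of f and its derivatives up to order k, via finite differences."""
    total, g = 0.0, np.asarray(f, dtype=float)
    for _ in range(k + 1):
        total += np.sum(np.abs(g) ** p) * dx  # ||f^{(i)}||_{L^p}^p (Riemann sum)
        g = np.gradient(g, dx)                # next derivative, numerically
    return total ** (1.0 / p)

# H^1 = W^{1,2} norm of sin on [0, 2*pi]: the integrals of sin^2 and cos^2 are
# both pi, so the norm is sqrt(2*pi) ~ 2.507.
x = np.linspace(0.0, 2.0 * np.pi, 4001)
val = sobolev_norm(np.sin(x), x[1] - x[0], k=1, p=2)
```

The approximation matches the closed-form value to within the discretization error of the grid.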
An interesting particular case is p = 2, denoted H^k = W^{k,2}, because of its relation with Fourier series. More information about the Sobolev spaces can be found in the book by Adams (1975).

2.7.2. Besov Spaces
The next kind of spaces are the Besov spaces B^s_{p,q}. Functions taken in B^s_{p,q} have s derivatives in L^p. The parameter q permits a more precise characterization of the regularity. A general description of these spaces can be found in Triebel (1992). In this paper, we give only their connection with wavelets. Indeed, different expressions exist for the norm associated with a Besov space, but one uses the wavelet coefficients; see (39):

\forall f \in B^s_{p,q}, \quad \|f\|_{B^s_{p,q}} = \left[ \sum_n |\alpha_n|^p \right]^{1/p} + \left\{ \sum_{j=0}^{+\infty} 2^{j\left(\frac{d}{2} - \frac{1}{p} + s\right)q} \left[ \sum_n 2^{\frac{jp}{2}} |\beta_{jn}|^p \right]^{q/p} \right\}^{1/q}.  (39)

The homogeneous version is

\forall f \in \dot B^s_{p,q}, \quad \|f\|_{\dot B^s_{p,q}} = \left\{ \sum_{j=-\infty}^{+\infty} 2^{j\left(\frac{d}{2} - \frac{1}{p} + s\right)q} \left[ \sum_n 2^{\frac{jp}{2}} |\beta_{jn}|^p \right]^{q/p} \right\}^{1/q},  (40)

where α_n and β_{jn} are the coefficients issued from the wavelet expansion (see Section 2.2).

2.7.3. Ridgelet Spaces
In the same way, Candès defines the ridgelet spaces R^s_{p,q}, endowed with a norm based on the ridgelet coefficients.
Definition 3. For s > 0 and p, q > 0, we say that f ∈ R^s_{p,q} if f ∈ L¹ and

\operatorname{Ave}_u \|R_f(u, \cdot) \star \varphi\|_{L^p} < \infty \quad \text{and} \quad \left( 2^{js}\, 2^{j(d-1)/2} \left[ \operatorname{Ave}_u \|R_f(u, \cdot) \star \psi_j\|_{L^p}^p \right]^{1/p} \right)_j \in l^q(\mathbb{N}),  (41)

where R_f(u, t) = \int_{u \cdot x = t} f(x)\, dx is the Radon transform of f (u = (\cos\theta; \sin\theta)). The function ϕ is the scale function associated with ψ.
Then the induced norm is defined by

\|f\|_{R^s_{p,q}} = \operatorname{Ave}_u \|R_f(u, \cdot) \star \varphi\|_{L^p} + \left\{ \sum_{j \geq 0} \left( 2^{js}\, 2^{j(d-1)/2} \left[ \operatorname{Ave}_u \|R_f(u, \cdot) \star \psi_j\|_{L^p}^p \right]^{1/p} \right)^q \right\}^{1/q}  (42)

and its homogeneous version \dot R^s_{p,q} by

\|f\|_{\dot R^s_{p,q}} = \left\{ \sum_{j \in \mathbb{Z}} \left( 2^{js}\, 2^{j(d-1)/2} \left[ \operatorname{Ave}_u \|R_f(u, \cdot) \star \psi_j\|_{L^p}^p \right]^{1/p} \right)^q \right\}^{1/q}.  (43)
As in the Besov case, these norms can be calculated from the ridgelet coefficients. Let w_j(u, b)(f) = \langle f(x), \psi_j(u \cdot x - b) \rangle for j ≥ 0 and v(u, b)(f) = \langle f(x), \varphi(u \cdot x - b) \rangle denote these ridgelet coefficients; then

\|f\|_{R^s_{p,q}} = \left[ \int |v(u, b)(f)|^p\, du\, db \right]^{1/p} + \left\{ \sum_{j \geq 0} \left( 2^{js}\, 2^{j(d-1)/2} \left[ \int |w_j(u, b)(f)|^p\, du\, db \right]^{1/p} \right)^q \right\}^{1/q}.  (44)
More information can be found in Candès (1998).

2.7.4. Contourlet Spaces
Inspired by the previous spaces, we propose to define the contourlet spaces, which will be denoted Co^s_{p,q}.
Definition 4. Let s > 0 and p, q > 0. If f ∈ Co^s_{p,q}, then

\|f\|_{Co^s_{p,q}} = \left[ \sum_n |\alpha_{j_0,n}|^p \right]^{1/p} + \left\{ \sum_{j \leq j_0} 2^{j\left(\frac{d}{2} - \frac{1}{p} + s\right)q} \left[ \sum_{k=0}^{2^{l_j}-1} \sum_n 2^{\frac{jp}{2}} |\beta_{j,k,n}|^p \right]^{q/p} \right\}^{1/q},  (45)

or, in the homogeneous case,

\|f\|_{\dot{Co}^s_{p,q}} = \left\{ \sum_{j \in \mathbb{Z}} 2^{j\left(\frac{d}{2} - \frac{1}{p} + s\right)q} \left[ \sum_{k=0}^{2^{l_j}-1} \sum_n 2^{\frac{jp}{2}} |\beta_{j,k,n}|^p \right]^{q/p} \right\}^{1/q},  (46)
where α_{j₀,n} and β_{j,k,n} are the contourlet coefficients mentioned in Section 2.6.

2.7.5. Bounded Variation (BV) Spaces
The last space of interest is the BV space, the space of functions of bounded variation. This space is widely used in image processing because it is a good candidate to model the structures in images.

Definition 5. The space BV over a domain Ω is defined as

BV(\Omega) = \left\{ f \in L^1(\Omega);\ \int_\Omega |\nabla f| < \infty \right\},  (47)

where ∇f is the gradient of f in the distributional sense, and

\int_\Omega |\nabla f| = \sup_{\vec\varphi} \left\{ \int_\Omega f \operatorname{div} \vec\varphi;\ \vec\varphi \in C_0^1(\Omega, \mathbb{R}^2),\ |\vec\varphi| \leq 1 \right\}.  (48)

This space is endowed with the following norm:

\|f\|_{BV} = \|f\|_{L^1} + \int_\Omega |\nabla f|.  (49)

In general, however, we keep only the second term, which is well known as the total variation of f. In the rest of the paper, we will use the notation

J(f) = \int_\Omega |\nabla f|.  (50)
More information about the BV space is available in Haddad (2005) and Vese (1996). We now have all the basic tools needed to describe the image decomposition models. The next two sections present the structures + textures and structures + textures + noise models, respectively.
3. STRUCTURES + TEXTURES DECOMPOSITION
The starting point of the image decomposition models is the work of Meyer (2001) about the Rudin–Osher–Fatemi (ROF) algorithm (Rudin et al., 1992). Let us recall the ROF model. Assume f is an observed image that is the sum of the ideal scene image u, which we want to retrieve, and a noise b. The authors propose to minimize the following functional to get u:

F_\lambda^{ROF}(u) = J(u) + \lambda \|f - u\|_{L^2}^2.  (51)

This model assumes that u is in BV because this space preserves sharp edges. This algorithm gives good results and is very easy to implement by using the nonlinear projectors proposed by Chambolle (2004) (see Appendix A). Now, if we take the image decomposition point of view, f = u + v, the functional in Eq. (51) can be rewritten as

F_\lambda^{ROF}(u, v) = J(u) + \lambda \|v\|_{L^2}^2.  (52)
We remind the reader that decomposition means u is the structures part and v the textures part. Meyer shows that this model is not adapted to achieve this decomposition. To see why, the following example illustrates that the more a texture oscillates, the more it is removed from both the u and v parts.

Example 1. Let v be a texture created from an oscillating signal over a finite domain. Then v can be written (x = (x₁, x₂)) as follows:

v(x) = \cos(\omega x_1)\, \theta(x),  (53)

where ω is the frequency and θ the indicator function of the considered domain. We can then calculate the L² and BV norms of v, respectively. We get

\|v\|_{L^2} \approx \frac{1}{\sqrt{2}} \|\theta\|_{L^2},  (54)

which is constant ∀ω and does not specially capture textures. In addition,

\|v\|_{BV} = \frac{\omega}{2\pi} \|\theta\|_{L^1},  (55)
which grows as ω → ∞ and thus clearly rejects textures. In order to adapt the ROF model so that it captures the textures in the v component, Meyer proposes to replace the L² space by another space, called G, a space of oscillating functions. He proves that this space is the dual space of BV (where BV = {f ∈ L²(ℝ²), ∇f ∈ L¹(ℝ²)}, which is close to the BV space and the total variation described earlier in the paper); see Meyer (2001) for more theoretical details about these spaces. The space G is endowed with the following norm:

\|v\|_G = \inf_g \left\| \sqrt{|g_1|^2 + |g_2|^2} \right\|_{L^\infty},  (56)

where g = (g₁, g₂) ∈ L^∞(ℝ²) × L^∞(ℝ²) and v = div g. If we calculate the G-norm of the oscillating texture in Eq. (53) of Example 1, we get

\|v\|_G \leq \frac{C}{|\omega|},  (57)
where C is a constant. It is then easy to see that the space G is well adapted to capture textures. Now, the modified functional performing the structures + textures decomposition is

F_\lambda^{YM}(u, v) = J(u) + \lambda \|v\|_G,  (58)

where f = u + v, f ∈ G, u ∈ BV, v ∈ G. The drawback of this model is the presence of an L^∞ norm in the expression of the G-norm (this does not allow classic variational calculus). The first to propose a numerical algorithm to solve Meyer's model were Vese and Osher (2002). Their approach uses the theorem stating that ∀f ∈ L^∞(Ω), \|f\|_{L^\infty} = \lim_{p \to \infty} \|f\|_{L^p}, together with a slightly modified version of Meyer's functional:
F_{\lambda,\mu,p}^{OV}(u, g) = J(u) + \lambda \|f - (u + \operatorname{div} g)\|_{L^2}^2 + \mu \left\| \sqrt{g_1^2 + g_2^2} \right\|_{L^p}.  (59)
Then variational calculus applies and results in a system of three coupled partial differential equations. All the details of the discretization of these equations are available in Vese and Osher (2002). This algorithm works well but is very sensitive to the choice of its parameters, which induces instabilities.
Another way to solve Meyer's model was proposed by Aujol et al. (Aujol, 2004; Aujol, Aubert, Blanc-Féraud, & Chambolle, 2003; Aujol, Gilboa, Chan, & Osher, 2006). The authors propose a dual-method approach, which arises naturally from the dual relation between the G and BV spaces. The problem is assumed to be discrete and defined over a finite domain Ω. They propose minimizing the modified functional

F_{\lambda,\mu}^{AU}(u, v) = J(u) + J^*\!\left(\frac{v}{\mu}\right) + (2\lambda)^{-1} \|f - u - v\|_{L^2}^2  (60)

with

(u, v) \in BV(\Omega) \times G_\mu(\Omega).  (61)

The set G_μ is the subset of G where ∀v ∈ G_μ, \|v\|_G ≤ μ. Moreover, J^* is the characteristic function of G₁, with the property that J^* is the dual operator of J (J^{**} = J). Thus,

J^*(v) = \begin{cases} 0 & \text{if } v \in G_1, \\ +\infty & \text{else.} \end{cases}  (62)
The interesting point is that the aforementioned Chambolle projectors are exactly the projectors onto the sets G_μ; these operators will be denoted P_{G_μ} in the rest of the paper. More details about these projectors can be found in Chambolle (2004) and are recalled in Appendix A. The authors then propose an iterative algorithm that gives the minimizers (û, v̂) of F_{\lambda,\mu}^{AU}(u, v).

• Let us fix v; we seek the minimizer u of

\inf_u \left\{ J(u) + (2\lambda)^{-1} \|f - u - v\|_{L^2}^2 \right\}.  (63)

• Now we fix u and seek the minimizer v of

\inf_v \left\{ J^*\!\left(\frac{v}{\mu}\right) + \|f - u - v\|_{L^2}^2 \right\}.  (64)

Chambolle's results show that the solution of Eq. (63) is given by

\hat u = f - \hat v - P_{G_\lambda}(f - \hat v)  (65)

and the solution of Eq. (64) by

\hat v = P_{G_\mu}(f - \hat u).  (66)
Then the numerical algorithm is:
1. Initialization: u₀ = v₀ = 0.
2. Iteration n + 1:
   v_{n+1} = P_{G_\mu}(f - u_n),
   u_{n+1} = f - v_{n+1} - P_{G_\lambda}(f - v_{n+1}).
3. We stop the algorithm if max(|u_{n+1} - u_n|, |v_{n+1} - v_n|) ≤ ε or if we reach a prescribed maximal number of iterations.

The authors prove that the minimizers (û, v̂) are also minimizers of the original Meyer functional [Eq. (58)], and that it is better to start by calculating v_{n+1} rather than u_{n+1}; see Aujol (2004) and Aujol et al. (2003) for the complete proofs. Figure 2 presents the three original images (Barbara, House, and Leopard) used for the tests in the rest of the paper. Figures 3–5 illustrate the results of Aujol's algorithm. The chosen parameters are (λ = 1, μ = 100), (λ = 10, μ = 1000), and (λ = 5, μ = 1000), respectively. For clarity, we enhanced the contrast of the textured components. In each test we see that the separation between structures and textures works well. Some residual textures remain in the structures part; this can be explained by the fact that the parameter λ acts as a tradeoff between the "power" of separability and an excessive regularization of u.

As the G-norm is difficult to handle, Meyer (2001) proposes to replace the space G by the Besov space \dot B^\infty_{-1,\infty}, because G ⊂ \dot B^\infty_{-1,\infty} (in the following, we will denote E = \dot B^\infty_{-1,\infty}). The advantage is that the norm of a function v over this space can be defined from its wavelet coefficients. The corresponding model proposed by Meyer is

F_\lambda^{YM2}(u, v) = J(u) + \lambda \|v\|_E.  (67)
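The alternating BV–G scheme above can be sketched with Chambolle's fixed-point projection algorithm (Chambolle, 2004). A simplified sketch follows; the step size 1/8, the iteration counts, and the parameter values are illustrative choices, not those used in the paper's experiments:

```python
import numpy as np

def grad(u):
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(px, py):
    """Discrete divergence, the negative adjoint of grad (Chambolle's setup)."""
    dx = np.zeros_like(px); dy = np.zeros_like(py)
    dx[0, :] = px[0, :]; dx[1:-1, :] = px[1:-1, :] - px[:-2, :]; dx[-1, :] = -px[-2, :]
    dy[:, 0] = py[:, 0]; dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; dy[:, -1] = -py[:, -2]
    return dx + dy

def project_G(f, lam, n_iter=50, tau=0.125):
    """Chambolle's fixed point: P_{G_lam}(f) = lam * div p, the projection of f
    onto the G-ball of radius lam."""
    px = np.zeros_like(f); py = np.zeros_like(f)
    for _ in range(n_iter):
        gx, gy = grad(div(px, py) - f / lam)
        norm = 1.0 + tau * np.sqrt(gx ** 2 + gy ** 2)
        px = (px + tau * gx) / norm
        py = (py + tau * gy) / norm
    return lam * div(px, py)

def bv_g_decompose(f, lam=10.0, mu=50.0, n_iter=10):
    """Alternating scheme: v <- P_{G_mu}(f - u), u <- f - v - P_{G_lam}(f - v)."""
    u = np.zeros_like(f); v = np.zeros_like(f)
    for _ in range(n_iter):
        v = project_G(f - u, mu)
        u = f - v - project_G(f - v, lam)
    return u, v

# A step (structure) plus an oscillation (texture): u should be smoother than f.
xx = np.arange(32)[None, :] * np.ones((32, 1))
f = (xx < 16).astype(float) + 0.3 * np.sin(1.5 * xx)
u, v = bv_g_decompose(f)
```

Computing v before u, as above, matches the ordering recommended in the text.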
Aujol and Chambolle were the first to propose a numerical algorithm that uses the space E. As previously, they reformulated the model in a dual-method approach, where E_μ is the subset of E such that ∀f ∈ E_μ, \|f\|_E ≤ μ, and B^*(f) is the indicator function of E₁. Then the functional to minimize is

F_{\lambda,\mu}^{AC}(u, v) = J(u) + B^*\!\left(\frac{v}{\mu}\right) + (2\lambda)^{-1} \|f - u - v\|_{L^2}^2.  (68)
FIGURE 2 Original Barbara, House, and Leopard images.
FIGURE 3 BV–G structures + textures image decomposition of the Barbara image.
FIGURE 4 BV–G structures + textures image decomposition of the House image.

FIGURE 5 BV–G structures + textures image decomposition of the Leopard image.
Chambolle, DeVore, Lee, and Lucier (1998) proved the existence of a projector onto this space, denoted P_{E_μ}, defined by

P_{E_\mu}(f) = f - WST(f, 2\mu),  (69)

where WST is the wavelet soft-thresholding operator (i.e., we first perform the wavelet expansion of the function, then soft-threshold the wavelet coefficients, and finally reconstruct the image). Then the new numerical algorithm is as follows:
FIGURE 6 BV–E_μ structures + textures image decomposition of the Barbara image.
1. Initialization: u₀ = v₀ = 0.
2. Iteration n + 1:
   v_{n+1} = P_{E_\mu}(f - u_n) = f - u_n - WST(f - u_n, 2\mu),
   u_{n+1} = f - v_{n+1} - P_{G_\lambda}(f - v_{n+1}).
3. We stop if max(|u_{n+1} - u_n|, |v_{n+1} - v_n|) ≤ ε or if we reach a prescribed maximal number of iterations.

The results obtained by this model are presented in Figures 6–8. This algorithm works, but its main drawback is that it captures some structure information (like the legs of the table in the Barbara image; see Figure 6). This behavior appears because the space E is much bigger than the space G; in particular, the space E contains functions that are not only textures.

Osher, Solé, and Vese (2002) explore the possibility of replacing the space G by the Sobolev space H^{-1}. They propose the following functional (v is obtained by v = f − u):

F_\lambda^{VS}(u) = J(u) + \lambda \|f - u\|_{H^{-1}}^2,  (70)
where \|v\|_{H^{-1}}^2 = \int |\nabla(\Delta^{-1}) v|^2\, dx\, dy. The authors give the corresponding Euler–Lagrange equations and their discretization. Another way to solve the problem numerically is to use a modified version of Chambolle's projector, P_{H^{-1}_\lambda} (see Appendix A). Figures 9–11 present the results obtained with this algorithm.

FIGURE 7 BV–E_μ structures + textures image decomposition of the House image.

FIGURE 8 BV–E_μ structures + textures image decomposition of the Leopard image.

Some other models have been proposed that test different spaces to replace the BV or G spaces. We mention the work of Aujol et al. (Aujol & Chambolle, 2005; Aujol & Gilboa, 2006), who propose replacing the space BV by the smaller Besov space B^1_{1,1}, or replacing G by some Hilbert spaces, which permits extracting textures with a certain directionality. Haddad (2005) proposes using the Besov space \dot B^1_{1,\infty} instead of BV (the norms over these two spaces are equivalent), with the L² norm for the v part.
FIGURE 9 BV–H⁻¹ structures + textures image decomposition of the Barbara image.

FIGURE 10 BV–H⁻¹ structures + textures image decomposition of the House image.

Garnett, Jones, Triet, and Vese (2005) and Triet and Vese (2005) study the use of the spaces div(BMO), \dot{BMO}^{-\alpha}, and \dot W^{-\alpha,p} to model the textures component.
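The projector P_{E_μ} of Eq. (69) reduces to wavelet soft thresholding. A minimal single-level 2D Haar sketch follows; the actual algorithm uses a full multiscale wavelet transform, and the Haar choice here is ours:

```python
import numpy as np

def soft(x, t):
    """Soft thresholding: shrink coefficients toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def haar2(f):
    """One level of the 2D orthonormal Haar transform (approx + 3 detail bands)."""
    a = (f[0::2, 0::2] + f[1::2, 0::2] + f[0::2, 1::2] + f[1::2, 1::2]) / 2.0
    h = (f[0::2, 0::2] - f[1::2, 0::2] + f[0::2, 1::2] - f[1::2, 1::2]) / 2.0
    v = (f[0::2, 0::2] + f[1::2, 0::2] - f[0::2, 1::2] - f[1::2, 1::2]) / 2.0
    d = (f[0::2, 0::2] - f[1::2, 0::2] - f[0::2, 1::2] + f[1::2, 1::2]) / 2.0
    return a, h, v, d

def ihaar2(a, h, v, d):
    f = np.empty((2 * a.shape[0], 2 * a.shape[1]))
    f[0::2, 0::2] = (a + h + v + d) / 2.0
    f[1::2, 0::2] = (a - h + v - d) / 2.0
    f[0::2, 1::2] = (a + h - v - d) / 2.0
    f[1::2, 1::2] = (a - h - v + d) / 2.0
    return f

def wst(f, thr):
    """WST(f, thr): threshold the detail bands, keep the approximation."""
    a, h, v, d = haar2(f)
    return ihaar2(a, soft(h, thr), soft(v, thr), soft(d, thr))

def project_E(f, mu):
    """P_{E_mu}(f) = f - WST(f, 2*mu), as in Eq. (69)."""
    return f - wst(f, 2.0 * mu)

rng = np.random.default_rng(0)
f = rng.standard_normal((8, 8))
assert np.allclose(wst(f, 0.0), f)  # zero threshold: perfect reconstruction
```

With a very large threshold, WST keeps only the approximation band, so P_{E_μ} returns exactly the oscillatory detail content — the behavior exploited by the BV–E algorithm.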
FIGURE 11 BV–H⁻¹ structures + textures image decomposition of the Leopard image.

4. STRUCTURES + TEXTURES + NOISE DECOMPOSITION
The previous algorithms yield good results but are of limited interest for noisy images (we add a Gaussian noise with σ = 20 to each test image of Figure 2; the corresponding noisy test images are shown in Figure 12). Indeed, noise can be viewed as a very highly oscillatory function (meaning that noise can be viewed as living in the space G). Therefore, the algorithms incorporate the noise in the textures component, and the textures are then corrupted by noise (see Figure 13, for example).

FIGURE 12 Original Barbara, House, and Leopard images corrupted by Gaussian noise (σ = 20).

FIGURE 13 BV–G structures + textures image decomposition of the noisy Barbara image.

In this section, we present some extensions of the two-component model to the three-component model, f = u + v + w, which can discriminate among structures (u), textures (v), and noise (w).
4.1. BV–G–G Local Adaptive Model
In Gilles (2007b), we proposed a new model to decompose an image into three parts: structures (u), textures (v), and noise (w). As in the u + v model, we consider that structures and textures are modeled by functions in the BV and G spaces, respectively. We also consider a zero-mean Gaussian noise added to the image. Let us view noise as a specific, very oscillating function. By virtue of Meyer's work (2001), where it is shown that the more a function oscillates, the smaller its G-norm is, we propose to model w as a function in G and consider that its G-norm is much smaller than the norm of the textures (\|w\|_G \ll \|v\|_G). These assumptions are equivalent to choosing

v \in G_{\mu_1}, \quad w \in G_{\mu_2}, \quad \text{where } \mu_1 \gg \mu_2.  (71)
To increase performance, we propose adding local adaptability to the algorithm, following an idea proposed by Gilboa, Zeevi, and Sochen (2003). These authors investigate the ROF model given by Eq. (51) and propose a modified version that preserves textures in the denoising process. To do this, they do not choose λ constant over the entire image but as a function λ(f)(x, y) that represents local properties of the image. In a cartoon-type region, the algorithm enhances the denoising process by increasing the value of λ; in a texture-type region, the algorithm decreases λ to attenuate the regularization and preserve the details of the textures. So λ(f)(x, y) can be viewed as a smoothed partition between textured and untextured regions. Then, in order to decompose an image into three parts, we propose to use the following functional:

F_{\lambda,\mu_1,\mu_2}^{JG}(u, v, w) = J(u) + J^*\!\left(\frac{v}{\mu_1}\right) + J^*\!\left(\frac{w}{\mu_2}\right) + (2\lambda)^{-1} \|f - u - \nu_1 v - \nu_2 w\|_{L^2}^2,  (72)
where the functions ν_i represent the smoothed partition into textured and untextured regions (and play the role of λ in Gilboa's paper). The ν_i functions must have the following behavior:
• for a textured region, we want to favor v instead of w; this is equivalent to ν₁ close to 1 and ν₂ close to 0;
• for an untextured region, we want to favor w instead of v; this is equivalent to ν₁ close to 0 and ν₂ close to 1.
We see that ν₁ and ν₂ are complementary, so it is natural to choose ν₂ = 1 − ν₁ : ℝ² → ]0; 1[. The choice of ν₁ and ν₂ is discussed after the following proposition, which characterizes the minimizers of F_{\lambda,\mu_1,\mu_2}^{JG}(u, v, w).

Proposition 3. Let u ∈ BV, v ∈ G_{μ₁}, and w ∈ G_{μ₂} be the structures, textures, and noise parts, respectively, and f the original noisy image. Let the functions (ν₁(f)(·,·), ν₂(f)(·,·)) be defined on ℝ² → ]0; 1[, and assume that these functions can be considered locally constant compared with the variations of v and w. Then a minimizer defined by

(\hat u, \hat v, \hat w) = \arg \min_{(u,v,w) \in BV \times G_{\mu_1} \times G_{\mu_2}} F_{\lambda,\mu_1,\mu_2}^{JG}(u, v, w)  (73)

is given by

\hat u = f - \nu_1 \hat v - \nu_2 \hat w - P_{G_\lambda}(f - \nu_1 \hat v - \nu_2 \hat w),  (74)
\hat v = P_{G_{\mu_1}}\!\left(\frac{f - \hat u - \nu_2 \hat w}{\nu_1}\right),  (75)
\hat w = P_{G_{\mu_2}}\!\left(\frac{f - \hat u - \nu_1 \hat v}{\nu_2}\right),  (76)

where P_{G_μ} denotes Chambolle's nonlinear projector (see Appendix A). The proof of this proposition can be found in Gilles (2007b). As in the two-part BV–G decomposition model, we get an equivalent numerical scheme:
1. Initialization: u₀ = v₀ = w₀ = 0.
2. Compute ν₁ and ν₂ = 1 − ν₁ from f.
3. Compute w_{n+1} = P_{G_{\mu_2}}\!\left(\frac{f - u_n - \nu_1 v_n}{\nu_2 + \kappa}\right) (κ is a small value that prevents division by zero).
4. Compute v_{n+1} = P_{G_{\mu_1}}\!\left(\frac{f - u_n - \nu_2 w_{n+1}}{\nu_1 + \kappa}\right).
5. Compute u_{n+1} = f - \nu_1 v_{n+1} - \nu_2 w_{n+1} - P_{G_\lambda}(f - \nu_1 v_{n+1} - \nu_2 w_{n+1}).
6. If max{|u_{n+1} − u_n|, |v_{n+1} − v_n|, |w_{n+1} − w_n|} ≤ ε or if we performed N_step iterations, stop the algorithm; else go to step 3.

FIGURE 14 Texture partition ν₁ obtained by local variance computation.

Concerning the choice of the ν_i functions, we were inspired by the work of Gilboa et al. (2003). The authors compute a local variance on the texture + noise part of the image obtained by the ROF model (f − u). In our model, we use the same strategy but on the v component obtained by the two-part decomposition algorithm. This choice is motivated by the fact that the additive Gaussian noise can be considered orthogonal to the textures. As a consequence, the variance of a textured region is larger than the variance of an untextured region. So, in practice, we first compute the two-part decomposition of the image f. On the textures part, for each pixel (i, j), we compute the local variance on a small window (of odd size L) centered at (i, j). Finally, we normalize it to obtain values in ]0; 1[. All the details about the computation of the ν_i functions can be found in Gilles (2007b). Figure 14 shows an example from the noisy Barbara image. As expected, the variance is higher in the textured regions and lower in the others. Figures 15–17 show the results of the u + v + w decomposition obtained by the BV–G–G local adaptive model. This model can separate the noise from the textures. Looking more closely, we can see that some residual noise remains in the textures, and some textures are partially captured in the noise part. This is due to the choice of the parameters λ, μ₁, and μ₂, which act on the separability power of the algorithm.

FIGURE 15 BV–G–G structures + textures + noise image decomposition of the Barbara image.
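The local-variance partition ν₁ described above can be sketched as follows; the window size and the normalization used here are our illustrative choices (see Gilles (2007b) for the exact computation):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def texture_partition(v, L=7, eps=1e-8):
    """nu_1: local variance of the texture component v over an L x L window,
    normalized into ]0, 1[ (higher values = more textured region)."""
    m = uniform_filter(v, size=L)                # local mean
    var = uniform_filter(v * v, size=L) - m * m  # local variance
    var = np.maximum(var, 0.0)                   # clip tiny negative round-off
    nu1 = var / (var.max() + eps)
    return np.clip(nu1, eps, 1.0 - eps)          # stay strictly inside ]0, 1[

# Left half flat, right half oscillating: nu_1 should be larger on the right.
x = np.arange(64)
v = np.where(x >= 32, np.sin(0.9 * x), 0.0) * np.ones((64, 1))
nu1 = texture_partition(v)
nu2 = 1.0 - nu1
assert nu1[:, 40:56].mean() > nu1[:, 8:24].mean()
```

The complementary map ν₂ = 1 − ν₁ then favors the noise component in the flat regions, exactly the behavior required of the functional (72).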
4.2. Aujol–Chambolle BV–G–E Model
At the same time as our work, Aujol and Chambolle considered the same structures + textures + noise decomposition problem (Aujol & Chambolle, 2005). They proposed a model close to the one described in the previous subsection, with the difference that they consider the noise as a distribution taken in the Besov space E = \dot B^\infty_{-1,\infty}. The associated functional is

F_{\lambda,\mu,\delta}^{AC2}(u, v, w) = J(u) + J^*\!\left(\frac{v}{\mu}\right) + B^*\!\left(\frac{w}{\delta}\right) + (2\lambda)^{-1} \|f - u - v - w\|_{L^2}^2,  (77)
FIGURE 16 BV–G–G structures + textures + noise image decomposition of the House image.
where u ∈ BV, v ∈ G_μ, and w ∈ E_δ, as defined in the previous sections. The authors prove that the minimizers are (see Aujol and Chambolle (2005)):

\hat u = f - \hat v - \hat w - P_{G_\lambda}(f - \hat v - \hat w),  (78)
\hat v = P_{G_\mu}(f - \hat u - \hat w),  (79)
\hat w = P_{E_\delta}(f - \hat u - \hat v) = f - \hat u - \hat v - WST(f - \hat u - \hat v, 2\delta),  (80)

where WST(f − û − v̂, 2δ) is the wavelet soft-thresholding operator applied to f − û − v̂ with a threshold set to 2δ. Then the numerical algorithm is given by
1. Initialization: u₀ = v₀ = w₀ = 0.
2. Compute w_{n+1} = f - u_n - v_n - WST(f - u_n - v_n, 2\delta).
FIGURE 17 BV–G–G structures + textures + noise image decomposition of the Leopard image.
3. Compute v_{n+1} = P_{G_\mu}(f - u_n - w_{n+1}).
4. Compute u_{n+1} = f - v_{n+1} - w_{n+1} - P_{G_\lambda}(f - v_{n+1} - w_{n+1}).
5. If max{|u_{n+1} − u_n|, |v_{n+1} − v_n|, |w_{n+1} − w_n|} ≤ ε or if we performed N_step iterations, stop the algorithm; else go to step 2.

The results of this algorithm on our test images are shown in Figures 18–20, respectively. We can see that the textures are better denoised by this model, a consequence of better noise modeling by distributions in the Besov space. However, the residual texture in the noise part is more important than that given by our algorithm. Another drawback appears in the structures part: the edges in the image are damaged because some important wavelet coefficients are removed. Gilles (2007b) provides the possibility of adding the local adaptivity behavior of the BV–G–G model to the BV–G–E model; we refer the reader to Gilles (2007b) for the BV–G–E local adaptivity functional and the corresponding results. This modified version shows little improvement over the original. We prefer to explore the replacement of wavelets by new geometric multiresolution tools such as contourlets.

FIGURE 18 BV–G–E structures + textures + noise image decomposition of the Barbara image.
4.3. The BV–G–\dot{Co}^\infty_{-1,\infty} Decomposition Model
As mentioned previously, the new directional multiresolution tools, such as curvelets and contourlets, exhibit very good results in denoising. They also better reconstruct the edges in an image. So the idea of replacing wavelets with curvelets or contourlets naturally arises. In this paper, we focus on contourlets. This choice is equivalent to replacing the Besov space in the model described in the previous subsection by the homogeneous contourlet space \dot{Co}^\infty_{-1,\infty}. The equivalent functional is given in Eq. (81)
FIGURE 19 BV–G–E structures + textures + noise image decomposition of the House image.

as below:

F_{\lambda,\mu,\delta}^{Co}(u, v, w) = J(u) + J^*\!\left(\frac{v}{\mu}\right) + J_{Co}^*\!\left(\frac{w}{\delta}\right) + (2\lambda)^{-1} \|f - u - v - w\|_{L^2}^2,  (81)

where J_{Co}^*(f) is the indicator function over the set Co₁, with Co_\delta = \left\{ f \in \dot{Co}^\infty_{-1,\infty} \,/\, \|f\|_{\dot{Co}^\infty_{-1,\infty}} \leq \delta \right\} (the norm over the contourlet spaces is defined in Section 2.7.4), defined by

J_{Co}^*(f) = \begin{cases} 0 & \text{if } f \in Co_1, \\ +\infty & \text{else.} \end{cases}  (82)
Then, the following proposition gives the solutions that minimize the previous functional.
FIGURE 20 BV–G–E structures + textures + noise image decomposition of the Leopard image (panels: Structures, Textures, Noise).
Proposition 4. Let $u \in BV$, $v \in G_\mu$, $w \in Co_\delta$ be the structures, textures, and noise components derived from the image decomposition. Then the solution
$$(\hat{u}, \hat{v}, \hat{w}) = \arg \inf_{(u,v,w) \in BV \times G_\mu \times Co_\delta} F^{Co}_{\lambda,\mu,\delta}(u,v,w) \quad (83)$$
is given by
$$\hat{u} = f - \hat{v} - \hat{w} - P_{G_\lambda}(f - \hat{v} - \hat{w})$$
$$\hat{v} = P_{G_\mu}(f - \hat{u} - \hat{w})$$
$$\hat{w} = f - \hat{u} - \hat{v} - CST(f - \hat{u} - \hat{v}; 2\delta),$$
where $P_{G_\lambda}$ is the Chambolle nonlinear projector and $CST(f - \hat{u} - \hat{v}, 2\delta)$ is the Contourlet Soft Thresholding operator applied to $f - \hat{u} - \hat{v}$.
Proof. The components $\hat{u}$, $\hat{v}$ are obtained by the same arguments used in the proof of Proposition 3 (this proof is available in Gilles, 2007b). The particular point concerns the expression of $\hat{w}$ in terms of the soft thresholding of the contourlet coefficients. Assume we want to minimize $F^{Co}_{\lambda,\mu,\delta}(u,v,w)$ with respect to $w$; it is equivalent to finding $w$ solution of (we set $g = f - u - v$)
$$\hat{w} = \arg\min_{w \in Co_\delta} \left\{ \|g - w\|^2_{L^2} \right\}. \quad (84)$$
We can replace it by its dual formulation: $\hat{w} = g - \hat{h}$, such that
$$\hat{h} = \arg\min_{h \in Co^1_{1,1}} \left\{ 2\delta \|h\|_{Co^1_{1,1}} + \|g - h\|^2_{L^2} \right\}. \quad (85)$$
We can use the same approach used by Chambolle et al. (1998). Let $(c_{j,k,n})_{j \in \mathbb{Z},\, 0 \leqslant k \leqslant 2^{(l_j)},\, n \in \mathbb{Z}^2}$ and $(d_{j,k,n})_{j \in \mathbb{Z},\, 0 \leqslant k \leqslant 2^{(l_j)},\, n \in \mathbb{Z}^2}$ denote the coefficients issued from the contourlet expansions of $g$ and $h$, respectively. As contourlets form a tight frame, with a bound of 1, we have (we denote $\Lambda = \mathbb{Z} \times [[0, 2^{(l_j)}]] \times \mathbb{Z}^2$)
$$\|g\|^2_{L^2} = \sum_{(j,k,n) \in \Lambda} |c_{j,k,n}|^2. \quad (86)$$
Then Eq. (85) can be rewritten as the minimization of
$$\sum_{(j,k,n) \in \Lambda} |c_{j,k,n} - d_{j,k,n}|^2 + 2\delta \sum_{(j,k,n) \in \Lambda} |d_{j,k,n}|, \quad (87)$$
which is equivalent to minimizing, for each coefficient,
$$|c_{j,k,n} - d_{j,k,n}|^2 + 2\delta |d_{j,k,n}|. \quad (88)$$
However, Chambolle et al. (1998) prove that the solution of this kind of problem is the soft thresholding of the coefficients $(c_{j,k,n})$ with $2\delta$ as the threshold. Then $\hat{h} = CST(g, 2\delta)$, which by duality implies that $\hat{w} = g - CST(g, 2\delta)$. We conclude that
$$\hat{w} = f - \hat{u} - \hat{v} - CST(f - \hat{u} - \hat{v}, 2\delta), \quad (89)$$
which ends the proof.
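The per-coefficient problem of the form in Eq. (88) has the closed-form soft-thresholding solution invoked in the proof. A brute-force check is sketched below in plain NumPy; the coefficient values are hypothetical, and the penalty is written as $2t|d|$, whose minimizer is the soft threshold at $t$.

```python
import numpy as np

def soft_threshold(c, t):
    # closed-form minimizer of |c - d|^2 + 2*t*|d| with respect to d
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

t = 0.7
grid = np.linspace(-5.0, 5.0, 200001)              # dense search grid for d
for c in (-3.0, -0.5, 0.2, 1.4):
    energies = (c - grid) ** 2 + 2 * t * np.abs(grid)
    brute = grid[np.argmin(energies)]              # brute-force minimizer
    assert abs(brute - soft_threshold(c, t)) < 1e-3
```

Since the energy decouples coefficient by coefficient, as in Eq. (88), applying this scalar rule to every expansion coefficient solves the full problem.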
The corresponding numerical scheme is the same as in the BV –G–E algorithm, except we replace the wavelet expansion by the contourlet expansion in the soft thresholding:
FIGURE 21 BV–G–Co structures + textures + noise image decomposition of the Barbara image (panels: Structures, Textures, Noise).

1. Initialization: $u_0 = v_0 = w_0 = 0$,
2. Compute $w_{n+1} = f - u_n - v_n - CST(f - u_n - v_n, 2\delta)$,
3. Compute $v_{n+1} = P_{G_\mu}(f - u_n - w_{n+1})$,
4. Compute $u_{n+1} = f - v_{n+1} - w_{n+1} - P_{G_\lambda}(f - v_{n+1} - w_{n+1})$,
5. If $\max\{|u_{n+1}-u_n|, |v_{n+1}-v_n|, |w_{n+1}-w_n|\} \leqslant \epsilon$, or if we performed $N_{step}$ iterations, then stop the algorithm; else jump to step 2.
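The thresholding step of this scheme can be sketched in NumPy. Contourlet transforms are not part of standard numerical libraries, so the sketch below uses a one-level 2D orthonormal Haar transform as an illustrative stand-in tight frame (bound 1, as in Eq. (86)); `haar2`, `ihaar2`, and `st` are our own hypothetical helpers, not the chapter's operators.

```python
import numpy as np

def haar2(x):
    # one level of the 2D orthonormal Haar transform (an energy-preserving tight frame)
    a, d = (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)
    x = np.vstack([a, d])
    a, d = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2), (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    return np.hstack([a, d])

def ihaar2(y):
    # exact inverse of haar2
    n = y.shape[1] // 2
    a, d = y[:, :n], y[:, n:]
    x = np.empty_like(y)
    x[:, 0::2], x[:, 1::2] = (a + d) / np.sqrt(2), (a - d) / np.sqrt(2)
    m = x.shape[0] // 2
    a, d = x[:m], x[m:]
    z = np.empty_like(x)
    z[0::2], z[1::2] = (a + d) / np.sqrt(2), (a - d) / np.sqrt(2)
    return z

def st(coeffs, t):
    # soft thresholding of the expansion coefficients, threshold t
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

# step 2 of the scheme: w = g - thresholded(g), with g = f - u_n - v_n
rng = np.random.default_rng(0)
g = 20 * rng.standard_normal((64, 64))     # pure-noise stand-in for f - u_n - v_n
w = g - ihaar2(st(haar2(g), 20.0))         # part removed by the thresholding step
```

Replacing `haar2`/`ihaar2` with a contourlet (or wavelet) expansion gives the BV–G–Co (or BV–G–E) version of the step.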
Figures 21–23 show the results obtained by replacing wavelets with contourlets. The advantage of using geometric frames is that they preserve well the integrity of oriented textures, as seen in the zoomed images in Figure 24. In this section, we presented many decomposition models. We can imagine the use of other frames and bases, like curvelets, cosines, and so on. The idea of decomposing an image by thresholding the coefficients of different basis expansions corresponds to the recent theory of morphological component
FIGURE 22 BV–G–Co structures + textures + noise image decomposition of the House image (panels: Structures, Textures, Noise).
analysis (MCA) (Bobin, Starck, & Moudden, 2007; Bobin, Starck, Fadili, & Donoho, 2007). This approach seeks sparse representations of the different components and is useful for source separation.
5. PERFORMANCE EVALUATION

The previous section described different decomposition models based on specific function spaces. But one question arises: Which is the best one? This section addresses this question by defining well-adapted criteria and their associated metrics. We build a special test image by creating the different components separately and then adding them. We denote by $f_0$ the test image composed of $u_0$ (the structures reference image) + $v_0$ (the textures reference image) + $w_0$ (the noise reference image). We finish by giving the measures obtained for this image.
FIGURE 23 BV–G–Co structures + textures + noise image decomposition of the Leopard image (panels: Structures, Textures, Noise).
5.1. Test Image

Because we want to compare the quality of each extracted component, we create specific components: $u_0$ for structures, $v_0$ for textures, and $w_0$ for noise. Textures are built by sine functions over some finite domains; structures are made by drawing some shapes with adapted software such as GIMP. The noise part is simply a Gaussian noise with $\sigma = 20$. The $u_0$ and $v_0$ reference parts and the recomposed test image are shown in Figure 25.
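A construction in this spirit can be sketched in NumPy. All shapes, domains, and amplitudes below are our own hypothetical choices (the chapter's actual reference images were drawn by hand with GIMP); only the $\sigma = 20$ Gaussian noise follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 128
yy, xx = np.mgrid[0:N, 0:N]

# u0: structures -- simple piecewise-constant shapes (a disk and a rectangle)
u0 = np.zeros((N, N))
u0[(xx - 40) ** 2 + (yy - 40) ** 2 < 20 ** 2] = 180.0
u0[70:110, 60:120] = 90.0

# v0: textures -- sine patterns restricted to finite domains
v0 = np.zeros((N, N))
v0[10:60, 70:120] = 30 * np.sin(2 * np.pi * xx[10:60, 70:120] / 6)
v0[80:120, 10:50] = 30 * np.sin(2 * np.pi * (xx + yy)[80:120, 10:50] / 8)

# w0: Gaussian noise with sigma = 20
w0 = 20 * rng.standard_normal((N, N))

f0 = u0 + v0 + w0   # recomposed test image
```

Since $u_0$, $v_0$, and $w_0$ are known exactly, the error norms of Section 5.2 can be evaluated against them.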
5.2. Evaluation Metrics Assume the test image is composed of known reference images u 0 , v0 , and w0 . We choose the following criteria to measure the decomposition quality: the L 2 -norms of errors u − u 0 and v − v0 , where u and v are the structures and textures components issued from the decomposition. Another quantity
FIGURE 24 Zoomed images of the textured components of the wavelet- and contourlet-based algorithms (left: wavelet thresholding; right: contourlet thresholding).

FIGURE 25 Structures and textures reference images and the recomposed test image.
that is interesting to evaluate is the residual structures + textures present in the noise component $w$. To measure this quantity, we prove the following proposition.

Proposition 5. Let $b(i,j)$ denote a Gaussian noise of variance $\sigma^2$ and $d(i,j)$ an image free of noise (we assume that the intercorrelation between $b$ and $d$ is negligible). Let $f = Ad + b$ be a simulated noise + residue image, where $A \in \mathbb{R}$ corresponds to the residue level. Then
$$\|\gamma_f - \gamma_b\|_{L^2} \approx A^2 \|\gamma_d\|_{L^2}, \quad (90)$$
where $\gamma_f$ and $\gamma_b$ are the autocorrelation functions of $f$ and $b$, respectively.

Proof. We start by calculating the autocorrelation function of $f$:
$$\gamma_f(k,l) = \sum_{(i,j) \in \mathbb{Z}^2} f(i,j) f^*(i+k, j+l). \quad (91)$$
FIGURE 26 Residual reference image.
However, we assume that images are real; then $f(i,j) = f^*(i,j)$, and we deduce that
$$\gamma_f(k,l) = \sum_{(i,j)\in\mathbb{Z}^2} [Ad(i,j) + b(i,j)]\,[Ad(i+k,j+l) + b(i+k,j+l)] \quad (92)$$
$$= \sum_{(i,j)\in\mathbb{Z}^2} A^2 d(i,j)\, d(i+k,j+l) + \sum_{(i,j)\in\mathbb{Z}^2} b(i,j)\, b(i+k,j+l) + \sum_{(i,j)\in\mathbb{Z}^2} \left[ Ad(i,j)\, b(i+k,j+l) + Ad(i+k,j+l)\, b(i,j) \right] \quad (93)$$
$$= A^2 \gamma_d(k,l) + \gamma_b(k,l) + A\left(\gamma_{db}(k,l) + \gamma_{bd}(k,l)\right). \quad (94)$$
Now we examine the norm $\|\cdot\|_{L^2}$ of this autocorrelation function. First, notice that $\gamma_b(k,l) = \sigma^2 \delta(k,l)$ (where $\delta(k,l)$ is the Kronecker symbol) because we assumed that the noise is Gaussian. The statement of the proposition assumed that the intercorrelations are negligible; in practice, it is easy to check that the quantity $A(\gamma_{db}(k,l) + \gamma_{bd}(k,l))$ is negligible compared to $A^2 \gamma_d(k,l)$. We deduce that
$$\gamma_f(k,l) - \gamma_b(k,l) \approx A^2 \gamma_d(k,l); \quad (95)$$
then, by passing to the norm, we get
$$\|\gamma_f - \gamma_b\|_{L^2} \approx A^2 \|\gamma_d\|_{L^2}. \quad (96)$$
To illustrate this proposition, assume that we take the image in Figure 26 as $d(i,j)$ and we generate an image $b(i,j)$ full of Gaussian noise ($\sigma = 20$). Then we compose the image $f = Ad + b$ for the different values $A \in \{0.05; 0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9\}$ (this means that more and more residue appears as $A$ increases; see Figure 27, top row).
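Proposition 5 can be checked numerically. The sketch below is not the chapter's experiment: it uses a circular autocorrelation computed with the FFT (Wiener–Khinchin relation) and a hypothetical sinusoidal residue image $d$, but it reproduces the quadratic growth of $\|\gamma_f - \gamma_b\|_{L^2}$ with $A$.

```python
import numpy as np

def autocorr(img):
    # circular autocorrelation via the Wiener--Khinchin relation
    F = np.fft.fft2(img)
    return np.fft.ifft2(F * np.conj(F)).real

rng = np.random.default_rng(1)
N = 64
yy, xx = np.mgrid[0:N, 0:N]
d = 100 * np.sin(2 * np.pi * xx / 8) * np.sin(2 * np.pi * yy / 16)  # noise-free residue image
b = 20 * rng.standard_normal((N, N))                                # Gaussian noise, sigma = 20
gamma_b = autocorr(b)

metric = {}
for A in (0.2, 0.4, 0.8):
    f = A * d + b
    metric[A] = np.linalg.norm(autocorr(f) - gamma_b)  # || gamma_f - gamma_b ||_{L^2}
```

Doubling $A$ roughly quadruples the metric, as Eq. (96) predicts.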
FIGURE 27 Noisy reference images affected by different residual levels (A = 0.05, 0.3, 0.8) and their associated autocorrelation images.
$A$      $\|\gamma_f - \gamma_b\|_{L^2}$
0.05     849.093432
0.1      3312.071022
0.2      13099.095280
0.3      29367.800483
0.4      52118.223554
0.5      81350.371724
0.6      117064.247377
0.7      159259.851531
0.8      207937.184693
0.9      263096.247142

FIGURE 28 Results of the measured norm $\|\gamma_f - \gamma_b\|_{L^2}$ for the different values of $A$ (left) and the associated graph.
Figure 28 gives the measured values of $\|\gamma_f - \gamma_b\|_{L^2}$ and shows the associated graph. As announced by the proposition, we observe the quadratic behavior of the norm of the autocorrelation differences as $A$ grows. We will use this metric in the next subsection to evaluate the residual quantity in the noise parts at the output of the different decomposition algorithms.
TABLE 1 Evaluation measures obtained for all u, v, w decomposition algorithms

Measure                           $F_{JG}$     $F_{AC2}$    $F_{Co}$
$\|\tilde{u} - u_0\|_{L^2}$       792.8        873.5        984.6
$\|\tilde{v} - v_0\|_{L^2}$       1844.9       2832.4       1598.6
$\|\gamma_w - \gamma_{w_0}\|_{L^2}$  423.2     423.5        255.3
5.3. Image Decomposition Performance Evaluation

In this subsection, we apply the three-part image decomposition algorithms to the test image built in Section 5.1 and use the metrics defined in Section 5.2 to evaluate their performances. In this chapter, we restrict the choice of the different parameters to the ones that give the best visual performances; in the future, a more global test, in terms of parameter variability, could explore the complete behavior of the algorithms. The chosen parameters are
• Algorithm $F_{JG}$: $\lambda = 10$, $\mu_1 = 1000$, $\mu_2 = 100$, and a window size of 3 × 3 pixels,
• Algorithm $F_{AC2}$: $\lambda = 1$, $\mu = 500$, and $\delta = 9.4$ ($\kappa = 0.2$ and $\sigma = 20$),
• Algorithm $F_{Co}$: $\lambda = 1$, $\mu = 500$, and $\delta = 23.5$ ($\kappa = 0.5$ and $\sigma = 20$).
Figure 29 shows the outputs of the different algorithms, while Table 1 gives the corresponding measures. We can see that the BV–G–G-based algorithm $F_{JG}$ has the smallest error for the structures image, but the textures are slightly less preserved than with the contourlet-based model $F_{Co}$. Its noise part is of the same quality as that of the wavelet-based model $F_{AC2}$. Moreover, it is clear that the $F_{Co}$ algorithm gives the best denoising performance and has the least residue; it also has the best score for the textures quality. Even if its visual quality seems close to that of the $F_{JG}$ algorithm, the contourlet-based model has the worst score on the structures component. Globally, then, as expected, the model based on the contourlet expansion gives the best decomposition.
6. CONCLUSION

This chapter provides an overview of structures + textures image decomposition. We also present the extension to noisy image decomposition and show that it is necessary to adopt a three-part decomposition model (structures + textures + noise). The different models are based on the bounded-variation space to describe the structures component of an image. The textures are defined by the space $G$ of oscillating functions proposed by Meyer; different strategies can be used for the noise. Some other function spaces can be chosen; most often, this is equivalent to choosing the best basis
FIGURE 29 Outputs of the decomposition algorithms. First row: $F_{JG}$ algorithm; second row: $F_{AC2}$ algorithm; last row: $F_{Co}$ algorithm.
or frame to represent the different components. This approach follows the same philosophy as the principle of morphological component analysis recently introduced by the work of Bobin et al. (Bobin, Starck, & Moudden, 2007; Bobin, Starck, Fadili, & Donoho, 2007). An interesting property used in the BV–G–G model is the local adaptability of the algorithm, obtained by choosing a nonconstant parameter $\nu$. Some recent theoretical work on the Besov and Triebel–Lizorkin spaces seems to provide some insight into the local behavior of an image (in terms of local scales). Here this approach is used to improve the quality of the decomposition. The main problem of the decomposition models, and it remains an open question, is the choice of the different parameters. Aujol et al. (2006) propose a method to automatically select the parameter $\lambda$, but it is very expensive in computing time. We have currently started some work to find solutions.
We have proposed a method, which consists of building specific test images and using three different metrics, to evaluate the quality of the components issued from the different decomposition algorithms. The first tests seem to confirm that the model based on the thresholding of contourlet coefficients is the best one. However, more complete tests, based on different test images with different kinds of textures, noise, or structures and on tuning the different parameters, are needed. This could help us completely understand the behavior of this kind of algorithm. The last topic explored in this study is the application of image decomposition. A previous study (Gilles, 2007a) proves that the BV–G model enhances thin and long structures. We therefore use the textures component as the input of a road detection algorithm in aerial images. We believe that many applications could be created in the future.
APPENDIX A. CHAMBOLLE'S NONLINEAR PROJECTORS

Chambolle (2004) proposes an algorithm based on a nonlinear projector to solve a certain category of total variation–based functionals. This appendix summarizes this work. Some proofs are provided because they are relevant to the rest of the chapter.
A.1. Notations and Definitions

We assume the processed image is of size $M \times N$. We denote $X = \mathbb{R}^{M \times N}$ and $Y = X \times X$.

Definition 6. Let $u \in X$; then the discrete gradient of $u$, written $\nabla u \in Y = X \times X$, is defined by
$$(\nabla u)_{i,j} = \left( (\nabla u)^1_{i,j}, (\nabla u)^2_{i,j} \right), \quad (97)$$
with, $\forall (i,j) \in [[0, \ldots, M-1]] \times [[0, \ldots, N-1]]$,
$$(\nabla u)^1_{i,j} = \begin{cases} u_{i+1,j} - u_{i,j} & \text{if } i < M-1 \\ 0 & \text{if } i = M-1 \end{cases} \quad (98)$$
$$(\nabla u)^2_{i,j} = \begin{cases} u_{i,j+1} - u_{i,j} & \text{if } j < N-1 \\ 0 & \text{if } j = N-1. \end{cases} \quad (99)$$
Definition 7. Let $p \in Y$ ($p = (p^1, p^2)$); we define the numerical divergence operator $\mathrm{div}: Y \to X$ such that $\mathrm{div} = -\nabla^*$ ($\nabla^*$ is the adjoint operator of $\nabla$) by the following:
$$(\mathrm{div}\, p)_{i,j} = \begin{cases} p^1_{i,j} - p^1_{i-1,j} & \text{if } 0 < i < M-1 \\ p^1_{i,j} & \text{if } i = 0 \\ -p^1_{i-1,j} & \text{if } i = M-1 \end{cases} \; + \; \begin{cases} p^2_{i,j} - p^2_{i,j-1} & \text{if } 0 < j < N-1 \\ p^2_{i,j} & \text{if } j = 0 \\ -p^2_{i,j-1} & \text{if } j = N-1. \end{cases} \quad (100)$$
We recall that $\langle -\mathrm{div}\, p, u \rangle_X = \langle p, \nabla u \rangle_Y$.
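These two discrete operators are straightforward to implement with array slicing. The sketch below (plain NumPy, our own helper names `grad` and `div`) also makes the adjoint relation $\langle -\mathrm{div}\, p, u \rangle_X = \langle p, \nabla u \rangle_Y$ easy to verify numerically.

```python
import numpy as np

def grad(u):
    # discrete gradient of Eqs. (98)-(99): forward differences, zero on the last row/column
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def div(p1, p2):
    # numerical divergence of Eq. (100), built so that div = -grad^*
    d = np.zeros_like(p1)
    d[0, :] = p1[0, :]; d[1:-1, :] = p1[1:-1, :] - p1[:-2, :]; d[-1, :] = -p1[-2, :]
    d[:, 0] += p2[:, 0]; d[:, 1:-1] += p2[:, 1:-1] - p2[:, :-2]; d[:, -1] -= p2[:, -2]
    return d

# numerical check of the adjoint relation on random fields
rng = np.random.default_rng(0)
u = rng.standard_normal((5, 7))
p1, p2 = rng.standard_normal((5, 7)), rng.standard_normal((5, 7))
gx, gy = grad(u)
lhs = -(div(p1, p2) * u).sum()       # <-div p, u>_X
rhs = (p1 * gx + p2 * gy).sum()      # <p, grad u>_Y
```

The two inner products agree to machine precision, which is exactly the statement $\mathrm{div} = -\nabla^*$.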
A.2. Total Variation

In the discrete case, the total variation can be written as:
$$J(u) = \sum_{i,j} \left| (\nabla u)_{i,j} \right| \quad (101)$$
$$= \sum_{i,j} \sqrt{ \left( (\nabla u)^1_{i,j} \right)^2 + \left( (\nabla u)^2_{i,j} \right)^2 }. \quad (102)$$
However, $J$ is a 1-homogeneous function ($J(\lambda u) = \lambda J(u)$); then, if we apply the Legendre–Fenchel transform, we get:
$$J^*(v) = \sup_u \langle u, v \rangle_X - J(u) \quad (103)$$
with
$$\langle u, v \rangle_X = \sum_{i,j} u_{i,j} v_{i,j}, \quad (104)$$
where $J^*$ is the characteristic function of the closed convex set $K$:
$$J^*(v) = \chi_K(v) = \begin{cases} 0 & \text{if } v \in K \\ +\infty & \text{else.} \end{cases} \quad (105)$$
We have the property $J^{**} = J$. In the continuous case (see the properties of the $BV$ space), we have:
$$K = G_1 = \left\{ \mathrm{div}\,\xi : \xi \in C^1_c(\Omega, \mathbb{R}^2); \; |\xi(x)| \leqslant 1, \; \forall x \in \Omega \right\} \quad (106)$$
then
$$J(u) = \sup_\xi \left\{ \int_\Omega u(x)\, \mathrm{div}\,\xi(x)\, \mathrm{d}x : \xi \in C^1_c(\Omega, \mathbb{R}^2); \; |\xi(x)| \leqslant 1, \; \forall x \in \Omega \right\}; \quad (107)$$
however, $\int_\Omega u(x)\, \mathrm{div}\,\xi(x)\, \mathrm{d}x = \langle u, \mathrm{div}\,\xi \rangle_X$, then we can write:
$$J(u) = \sup_\xi \langle u, \mathrm{div}\,\xi \rangle_X, \quad (108)$$
which is equivalent, if we write $v = \mathrm{div}\,\xi$, to
$$J(u) = \sup_{v \in K} \langle u, v \rangle_X. \quad (109)$$
Now, we would like to have the same kind of expression for the discrete case. Chambolle (2004) proves the following lemma:

Lemma 1. In the discrete case, we have:
$$J(u) = \sup_{v \in G_1} \langle v, u \rangle, \quad (110)$$
where
$$G_1 = \left\{ \mathrm{div}\, p; \; p \in Y; \; |p_{i,j}| \leqslant 1 \right\}. \quad (111)$$

Definition 8. Let us define the inner product over $Y$: let $p \in Y$, $q \in Y$ such that $p = (p^1, p^2)$ and $q = (q^1, q^2)$; then
$$\langle p, q \rangle_Y = \sum_{i,j} \left( p^1_{i,j} q^1_{i,j} + p^2_{i,j} q^2_{i,j} \right). \quad (112)$$
A.3. Chambolle's Projectors

We want to solve
$$\min_{u \in X} \frac{\|u - g\|^2}{2\lambda} + J(u) \quad (113)$$
with $g \in X$, $\lambda > 0$, and $\|\cdot\|$ the Euclidean norm defined by $\|u\|^2 = \langle u, u \rangle_X$. If we apply Euler–Lagrange calculus to Eq. (113), we get
$$\frac{2(u - g)}{2\lambda} + \partial J(u) \ni 0 \quad (114)$$
$$\iff u - g + \lambda\, \partial J(u) \ni 0, \quad (115)$$
where $\partial J$ is the subdifferential of $J$ defined by
$$w \in \partial J(u) \iff J(v) \geqslant J(u) + \langle w, v - u \rangle_X \quad \forall v; \quad (116)$$
then Eq. (115) can be written as
$$\frac{g - u}{\lambda} \in \partial J(u) \quad (117)$$
$$\iff u \in \partial J^*\left(\frac{g - u}{\lambda}\right) \quad (118)$$
$$\iff \frac{u}{\lambda} \in \frac{1}{\lambda}\, \partial J^*\left(\frac{g - u}{\lambda}\right) \quad (119)$$
$$\iff \frac{g}{\lambda} \in \frac{g - u}{\lambda} + \frac{1}{\lambda}\, \partial J^*\left(\frac{g - u}{\lambda}\right). \quad (120)$$
If we reach a minimizer of
$$\frac{1}{2}\left\| w - \frac{g}{\lambda} \right\|^2 + \frac{1}{\lambda} J^*(w), \quad (121)$$
then by applying Euler–Lagrange calculus to Eq. (121), we get
$$w - \frac{g}{\lambda} + \frac{1}{\lambda}\, \partial J^*(w) \ni 0 \quad (122)$$
$$\iff w + \frac{1}{\lambda}\, \partial J^*(w) \ni \frac{g}{\lambda}. \quad (123)$$
Thanks to Eq. (120), we see that
$$w = \frac{g - u}{\lambda} \quad (124)$$
is a minimizer of Eq. (121). However, as $J^*(w) = \chi_{G_1}(w)$, if $w = P_{G_1}(g/\lambda)$ (the projection operator onto $G_1$), then $J^*(w) = 0$ and $\|w - g/\lambda\|$ is minimal. We deduce that
$$P_{G_1}\left(\frac{g}{\lambda}\right) = \frac{g - u}{\lambda} \quad (125)$$
$$u = g - \lambda P_{G_1}\left(\frac{g}{\lambda}\right). \quad (126)$$
We have $P_{G_\lambda}(g) = \lambda P_{G_1}(g/\lambda)$; then we have
$$u = g - P_{G_\lambda}(g). \quad (127)$$
Now, we need to find how to calculate $P_{G_\lambda}(g)$. Chambolle gives the following result:
$$\text{computing } P_{G_\lambda}(g) \iff \min_{p \in Y} \left\{ \|\lambda\, \mathrm{div}(p) - g\|^2 ; \; |p_{i,j}|^2 \leqslant 1 \quad \forall i,j \right\}. \quad (128)$$
The Karush–Kuhn–Tucker conditions show the existence of a Lagrange multiplier $\alpha_{i,j} \geqslant 0$ associated with each constraint of Eq. (128) such that we have, $\forall i,j$:
$$-\left(\nabla(\lambda\, \mathrm{div}(p) - g)\right)_{i,j} + \alpha_{i,j}\, p_{i,j} = 0 \quad (129)$$
with either
$$\alpha_{i,j} > 0 \quad \text{and} \quad |p_{i,j}| = 1 \quad (130)$$
or
$$\alpha_{i,j} = 0 \quad \text{and} \quad |p_{i,j}| < 1. \quad (131)$$
Then we can see that if $\alpha_{i,j} = 0$, then $(\nabla(\lambda\, \mathrm{div}(p) - g))_{i,j} = 0$, which is not an interesting case. For the case $\alpha_{i,j} \neq 0$:
$$\alpha_{i,j}\, p_{i,j} = \left(\nabla(\lambda\, \mathrm{div}(p) - g)\right)_{i,j} \quad (132)$$
$$\Rightarrow |\alpha_{i,j}|\, |p_{i,j}| = \left| \left(\nabla(\lambda\, \mathrm{div}(p) - g)\right)_{i,j} \right|; \quad (133)$$
however, $|\alpha_{i,j}| = \alpha_{i,j}$ because $\alpha_{i,j} > 0$, and $|p_{i,j}| = 1$; then
$$\alpha_{i,j} = \left| \left(\nabla(\lambda\, \mathrm{div}(p) - g)\right)_{i,j} \right|. \quad (134)$$
Now, if we use a gradient steepest descent method with $\tau > 0$, $p^0 = 0$, $n \geqslant 0$, we get
$$p^{n+1}_{i,j} = p^n_{i,j} + \tau \left[ \left(\nabla\left(\mathrm{div}(p^n) - \frac{g}{\lambda}\right)\right)_{i,j} - \left| \left(\nabla\left(\mathrm{div}(p^n) - \frac{g}{\lambda}\right)\right)_{i,j} \right| p^{n+1}_{i,j} \right]. \quad (135)$$
Finally, we get the following iterative formulation:
$$p^{n+1}_{i,j} = \frac{ p^n_{i,j} + \tau \left(\nabla\left(\mathrm{div}(p^n) - \frac{g}{\lambda}\right)\right)_{i,j} }{ 1 + \tau \left| \left(\nabla\left(\mathrm{div}(p^n) - \frac{g}{\lambda}\right)\right)_{i,j} \right| }. \quad (136)$$
Chambolle proves the following important theorem.

Theorem 3. If $\tau < \frac{1}{8}$, then $\lambda\, \mathrm{div}(p^n)$ converges to $P_{G_\lambda}(g)$ when $n \to +\infty$.

In practice, we note that the choice $n = 20$ is sufficient to reach the wanted convergence.
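A minimal NumPy sketch of the fixed-point iteration (136) applied to problem (113) is given below. The helper name `chambolle_denoise` is our own; $\tau = 0.12$ is chosen under the $1/8$ bound of Theorem 3, and more iterations than the $n = 20$ mentioned above are used to stay on the safe side.

```python
import numpy as np

def chambolle_denoise(g, lam, tau=0.12, n_iter=80):
    """Solve min_u ||u - g||^2/(2*lam) + J(u) via u = g - lam*div(p^n), Eq. (136)."""
    p1 = np.zeros_like(g)
    p2 = np.zeros_like(g)

    def divergence(q1, q2):
        # numerical divergence of Eq. (100)
        d = np.zeros_like(q1)
        d[0, :] = q1[0, :]; d[1:-1, :] = q1[1:-1, :] - q1[:-2, :]; d[-1, :] = -q1[-2, :]
        d[:, 0] += q2[:, 0]; d[:, 1:-1] += q2[:, 1:-1] - q2[:, :-2]; d[:, -1] -= q2[:, -2]
        return d

    for _ in range(n_iter):
        r = divergence(p1, p2) - g / lam          # div(p^n) - g/lam
        gx = np.zeros_like(g)                     # discrete gradient, Eqs. (98)-(99)
        gy = np.zeros_like(g)
        gx[:-1, :] = r[1:, :] - r[:-1, :]
        gy[:, :-1] = r[:, 1:] - r[:, :-1]
        norm = np.sqrt(gx ** 2 + gy ** 2)
        p1 = (p1 + tau * gx) / (1 + tau * norm)
        p2 = (p2 + tau * gy) / (1 + tau * norm)
    return g - lam * divergence(p1, p2)           # u = g - P_{G_lam}(g), Eq. (127)

# hypothetical demonstration: a noisy square is smoothed toward the clean image
clean = np.zeros((32, 32)); clean[8:24, 8:24] = 100.0
rng = np.random.default_rng(3)
g = clean + 10 * rng.standard_normal((32, 32))
u = chambolle_denoise(g, lam=20.0)
```

The parameter $\lambda$ plays its role from Eq. (113): larger values give stronger smoothing.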
A.4. Extension The previous result can be extended to the case of BV −H functional where H is a Hilbert space such that there exists a linear positive symmetric operator K that defines the following norm over H: h f, giH = h f, K gi L 2 .
(137)
Then, if we want to minimize J (u) +
λ k f − uk2H , 2
(138)
we can use the following modified Chambolle projector: pi,n+1 j
pi,n j + τ ∇ K −1 div( p n ) − λg i, j = . 1 + τ ∇ K −1 div( p n ) − λg i, j
(139)
And the corresponding convergence theorem is shown below. 1 , then λ1 K −1 div( p n ) converges to vˆ when 8kK −1 k L 2 1 −1 div( p n ) → uˆ where uˆ is the minimizer of Eq. (138). λK
Theorem 4. If τ < and f −
n → +∞
A special case is for K = −1−1 , which corresponds to the Sobolev case H = H −1 .
REFERENCES

Adams, R. (1975). Sobolev spaces. Academic Press.
Aujol, J. (2004). Contribution à l'analyse de textures en traitement d'images par méthodes variationnelles et équations aux dérivées partielles. Doctoral thesis. University of Nice-Sophia Antipolis, France.
Aujol, J., & Chambolle, A. (2005). Dual norms and image decomposition models. International Journal of Computer Vision, 63(1), 85–104.
Aujol, J., & Gilboa, G. (2006). Constrained and SNR-based solutions for TV-Hilbert space image denoising. Journal of Mathematical Imaging and Vision, 26(1–2), 217–237.
Aujol, J., Aubert, G., Blanc-Féraud, L., & Chambolle, A. (2003). Decomposing an image. Application to textured images and SAR images. Technical Report. University of Nice-Sophia Antipolis.
Aujol, J., Gilboa, G., Chan, T., & Osher, S. (2006). Structure-texture image decomposition—modeling, algorithms and parameter selection. International Journal of Computer Vision, 67(1), 111–136.
Bamberger, R., & Smith, M. (1992). A filter bank for the directional decomposition of images: Theory and design. IEEE Transactions on Signal Processing, 40(4), 882–893.
Bobin, J., Starck, J.-L., & Moudden, Y. (2007). Sparsity and morphological diversity in blind source separation. IEEE Transactions on Image Processing, 16(11), 2662–2674.
Bobin, J., Starck, J.-L., Fadili, J., & Donoho, D. (2007). Morphological component analysis: An adaptive thresholding strategy. IEEE Transactions on Image Processing, 16(11), 2675–2681.
Burt, P., & Adelson, E. (1983). The Laplacian pyramid as a compact image code. IEEE Transactions on Communication, 31(4), 532–540.
Candès, E. (1998). Ridgelets: Theory and applications. Doctoral thesis. Department of Statistics, Stanford University.
Candès, E., & Donoho, D. (1999). Curvelets: A surprisingly effective nonadaptive representation of objects with edges. Technical Report. Department of Statistics, Stanford University. Available at: http://www.curvelet.org/papers/Curve99.pdf.
Candès, E., Demanet, L., Donoho, D., & Ying, L. (2005). Fast discrete curvelet transforms. Multiscale Modeling and Simulation, 5, 861–899.
Chambolle, A. (2004). An algorithm for total variation minimization and applications. Journal of Mathematical Imaging and Vision, 20(1–2), 89–97.
Chambolle, A., DeVore, R., Lee, N., & Lucier, B. (1998). Nonlinear wavelet image processing: Variational problems, compression and noise removal through wavelet shrinkage. IEEE Transactions on Image Processing, 7, 319–335.
Daubechies, I. (1992). Ten lectures on wavelets. Philadelphia: Society for Industrial and Applied Mathematics.
Do, M. (2001). Directional multiresolution image representations. Doctoral thesis. Department of Communication Systems, Swiss Federal Institute of Technology, Lausanne.
Do, M. (2003). Contourlets and sparse image representations. In SPIE conference on wavelet applications in signal and image processing X, San Diego, USA.
Do, M., & Vetterli, M. (2001). Pyramidal directional filter banks and curvelets. In IEEE international conference on image processing (ICIP).
Do, M., & Vetterli, M. (2002). Contourlets: A directional multiresolution image representation. In IEEE international conference on image processing (ICIP).
Do, M., & Vetterli, M. (2003a). The contourlet transform: An efficient directional multiresolution image representation. IEEE Transactions on Image Processing, 14(12), 2091–2106.
Do, M., & Vetterli, M. (2003b). Framing pyramids. IEEE Transactions on Signal Processing, 51, 2329–2342.
Donoho, D., & Duncan, M. (1999). Digital curvelet transform: Strategy, implementation and experiments. Technical Report. Department of Statistics, Stanford University. Available at: http://www.curvelet.org/papers/DCvT99.pdf.
Garnett, J. B., Jones, P. W., Triet, M. L., & Vese, L. (2005). Modeling oscillatory components with the homogeneous spaces $\dot{BMO}^{-\alpha}$ and $\dot{W}^{-\alpha,p}$. Technical Report. UCLA CAM Report 07-21. Available at: ftp://ftp.math.ucla.edu/pub/camreport/cam07-21.pdf.
Gilboa, G., Zeevi, Y., & Sochen, N. (2003). Texture preserving variational denoising using an adaptive fidelity term. In Proceedings of VLSM (pp. 137–144). Available at: http://www.math.ucla.edu/~gilboa/pub/vlsm03.pdf.
Gilles, J. (2007a). Noisy image decomposition: A new structure, texture and noise model based on local adaptivity. Journal of Mathematical Imaging and Vision, 28(3), 285–295.
Gilles, J. (2007b). Choix d'un espace de représentation image adapté à la détection de réseaux routiers. In Traitement et Analyse de l'Information: Méthode et Application (TAIMA) Workshop.
Haddad, A. (2005). Méthodes variationnelles en traitement d'image. Doctoral thesis. École Normale Supérieure de Cachan, France.
Härdle, W., Kerkyacharian, G., Picard, D., & Tsybakov, A. (1997). Wavelets, approximation and statistical applications. In Proceedings of Paris–Berlin seminar. http://www.quantlet.com/mdstat/scripts/wav/html/index.html.
Mallat, S. (1999). A wavelet tour of signal processing (2nd ed.). London: Academic Press.
Meyer, Y. (1993). Wavelets: Algorithms and applications. Philadelphia: Society for Industrial and Applied Mathematics.
Meyer, Y. (2001). Oscillating patterns in image processing and in some nonlinear evolution equations. In The fifteenth Dean Jacqueline B. Lewis memorial lectures. Providence, RI: American Mathematical Society.
Osher, S., Sole, A., & Vese, L. (2002). Image decomposition and restoration using total variation minimization and the $H^{-1}$ norm. Multiscale Modeling and Simulation, 1(3), 349–370.
Po, M., & Do, M. (2006). Directional multiscale modeling of images using the contourlet transform. IEEE Transactions on Image Processing, 15(6), 1610–1620.
Rudin, L., Osher, S., & Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Physica D, 60, 259–268.
Triebel, H. (1992). Theory of function spaces II. Basel: Birkhäuser.
Triet, M. L., & Vese, L. (2005). Image decomposition using total variation and div(BMO). Multiscale Modeling and Simulation, 4(2), 390–423.
Vese, L. (1996). Variational methods and partial differential equations for image analysis and curve evolution. Doctoral thesis. University of Nice-Sophia Antipolis, France.
Vese, L., & Osher, S. (2002). Modeling textures with total variation minimization and oscillating patterns in image processing. Journal of Scientific Computing, 19(1–3), 553–572.
Vidakovic, B., & Mueller, P. (1991). Wavelets for kids: An introductory course on wavelets. http://www.isye.gatech.edu/brani/wp/kidsA.ps.
Chapter 4

The Reverse Fuzzy Distance Transform and its Use when Studying the Shape of Macromolecules from Cryo-Electron Tomographic Data

Stina Svensson

Contents
1. Introduction
2. Preliminaries
2.1. The Reverse Fuzzy Distance Transform
2.2. The Centers of Maximal Fuzzy Balls
3. Segmentation Using Region Growing by Means of the Reverse Fuzzy Distance Transform
4. Cryo-Electron Tomography for Imaging of Individual Macromolecules
4.1. Methods for Analyzing Cryo-Electron Tomography Data
4.2. Specific Imaging Settings
4.3. Phantoms Constructed From Structures Deposited in the Protein Data Bank
4.4. Simulated Data
5. From Electron Tomographic Structure to a Fuzzy Objects Representation
6. Identifying the Subunits of a Macromolecule
7. Identifying the Core of an Elongated Macromolecule
8. Conclusions
Acknowledgments
References

Department of Cell and Molecular Biology, Karolinska Institute, Stockholm, Sweden
Advances in Imaging and Electron Physics, Volume 158, ISSN 1076-5670, DOI: 10.1016/S1076-5670(09)00009-3. Copyright © 2009 Elsevier Inc. All rights reserved.
1. INTRODUCTION

For many applications in research or industry, information is collected in the form of digital images. The interpretation of these images can be done by manual visual inspection or in a more (semi-)automatic way by using various computerized methods. The latter are preferable because they potentially increase both the speed and the objectivity of the analysis. Typically, scientists are interested in identifying the structures, in the following denoted objects, in the digital image, and in drawing some conclusions regarding, for example, their shape. In some cases, the image acquisition technique forces researchers to base the methods on information in images with a low signal-to-noise ratio (SNR). Moreover, in some cases the objects of interest are represented by only a small number of image points. In the first case, it is difficult to locate the border of the object with high accuracy; that is, to decide whether a certain image point belongs to the object or not. In the latter case, the shape of the object is difficult to analyze, as a relatively small change in the positioning of the border may result in a relatively large change of a measured shape feature. In such cases, we can gain robustness of the methods by using fuzzy set methods. Partial membership to a set was introduced by Zadeh (1965) and, more in an image analysis context, by Rosenfeld (1979, 1984). The idea is to avoid a binary, or crisp, segmentation into object and background and instead assign each image point membership values related to the degree of "belongingness" to the structure of interest. The segmented fuzzy object is then used in the subsequent analysis. Such analysis is less dependent on small changes in the border, something that may be imposed by a crisp segmentation, as well as more scale invariant. (See, for example, Udupa and Samarasekera, 1996, and followers.)
The approach is well established in the field of medical imaging (e.g., through the review by Udupa and Saha, 2003). Recently it has also gained interest in the field of electron tomography (ET) (Bongini et al., 2007; Garduño, Wong-Barnum, Volkmann, & Ellisman, 2008). It can be used not only for identification and representation of the objects, but also for extracting more stable measurements; see, for example, Bloch (2005); Bogomolny (1987); Chanussot, Nyström, and Sladoje (2005); Sladoje and Lindblad (2007). When analyzing the shape of an object (crisp or fuzzy), the distance between image points often provides useful information. For example, we might be interested in the thickness of an elongated structure. For discrete images, the computation of distances between image points has been a crucial issue, the aim being a computationally convenient and mathematically correct computation. Rosenfeld and Pfaltz (1966) presented the first ideas and were followed by, among others, Borgefors (1986) for improved approximation of the Euclidean distance. In a distance transform (DT), each point in the object is assigned a value corresponding to the
distance to its closest point in the background. The DT can then be used for subsequent shape analysis. The grey-weighted distance, for which the grey-level of a point is added to the spatial distance to the background, was introduced by Rutovitz (1968). This introduces the question of how to weight the different dimensions compared to each other. Shortly thereafter, Levi and Montanari (1970) introduced another grey-weighted distance, for which the mean of the grey-levels of two neighboring points is multiplied by the spatial distance between them. The latter was put into a theoretical framework and denoted the fuzzy distance transform (FDT) by Saha, Wehrli, and Gomberg (2002); this term will be used in this chapter. Fuzzy is used to stress the fact that distances are computed on a fuzzy object. The grey-weighted distance in Rutovitz (1968) is similar to what is denoted topographic distance by Philipp-Foliguet, Vieira, and De Albuquerque Araújo (2001). The notion of topographic distance was introduced by Meyer (1994), but with a grey-weighting based on the local gradient between two neighboring points. Put into a more general framework, the FDT is equal to the geodesic time with the fuzzy object as a geodesic mask (Soille, 1994). A DT can be used not only to directly extract shape information from the (crisp) object it represents, but also to extract shape descriptors of the object. The shape descriptors can, in turn, be used to facilitate shape measures or to achieve a compact representation of the object for efficient storage of the information in the image. One such representation is the medial axis introduced by Blum (1967), where the object is represented by a curve that is centrally located in the object. Blum (1967) described this shape descriptor for two-dimensional (2D) objects, but it can be used also for 3D objects, being a medial axis or a medial surface depending on the shape of the object.
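The FDT just described — step length equal to the mean membership of two neighboring points times their spatial distance — can be sketched with a Dijkstra-style propagation. The sketch below is an illustrative 2D, 4-connected implementation with a hypothetical helper name; it is not the algorithm used in the chapter.

```python
import heapq
import numpy as np

def fuzzy_distance_transform(mu):
    """Fuzzy DT of a 2D fuzzy object mu (memberships in [0, 1]), 4-connectivity.

    The length of a step between neighbors is the mean of their membership
    values times the spatial distance (here 1), following Levi and Montanari
    (1970) and Saha et al. (2002). Background points (mu == 0) are sources.
    """
    rows, cols = mu.shape
    dist = np.full(mu.shape, np.inf)
    heap = []
    for i in range(rows):
        for j in range(cols):
            if mu[i, j] == 0:                  # background: distance 0
                dist[i, j] = 0.0
                heapq.heappush(heap, (0.0, i, j))
    while heap:
        d, i, j = heapq.heappop(heap)
        if d > dist[i, j]:
            continue                           # stale heap entry
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols:
                step = 0.5 * (mu[i, j] + mu[ni, nj])
                if d + step < dist[ni, nj]:
                    dist[ni, nj] = d + step
                    heapq.heappush(heap, (d + step, ni, nj))
    return dist
```

For a crisp object (memberships 0 and 1) this reduces to an ordinary city-block DT up to the half-step at the border, and halving all memberships halves all distances, which illustrates the grey-weighting.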
A medial representation, such as the medial axis, is well suited to represent an elongated object and can be used, for example, to facilitate measurement of its length. By assigning the distance values in the DT to the points in the medial representation, it also provides information about the thickness of the object. An object can be seen as the union of a number of balls, defined by their center points and respective radii. In fact, the distance values in a DT are the radii of balls (for the corresponding distance function) that are subsets of the object. Many of these balls are redundant for the shape, as they are subsets of other balls. A maximal ball is defined as a ball that is not a subset of any other single ball. The center points of such balls, the set of centers of maximal balls (CMBs), give a medial representation of the object, as they constitute a set centrally located in the object (Arcelli & Sanniti di Baja, 1988). By adding a ball to each point in the set, where the ball's radius is equal to the distance value for the point, the object can be recovered. The concept of CMBs can be generalized to a fuzzy setting to deal with fuzzy objects, yielding the centers of maximal fuzzy balls (CMFBs). This generalization was described by Svensson (2007a, 2008). Fuzzy is used to indicate that the
Stina Svensson
balls are extracted from the FDT. Various other aspects of balls in nonconvex domains have been described previously (e.g., in Bloch, 2000; Sanniti di Baja and Svensson, 2002). As mentioned, a crisp object can be recovered from its CMBs. This recovery process can be efficiently implemented using the reverse distance transformation (see, for example, Nyström and Borgefors, 1995). This concept can also be generalized to a fuzzy setting, resulting in the reverse fuzzy distance transform (RFDT) (Svensson, 2007b). It is of interest not only for the recovery of the support of a fuzzy object from its CMFBs, but also because it can be used as a region-growing process for the segmentation of subunits of a fuzzy object, following the approach described for crisp objects in Svensson and Sanniti di Baja (2002). We have previously described the RFDT for 2D and 3D images, and the CMFBs for 2D images (Svensson, 2007a,b, 2008). Here, we recall this theoretical framework and apply it to shape analysis for a specific application. In fact, through collaborations with the Electron Tomography group at the Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden, where cryo-electron tomography (ET) is used for structural biology research, many shape-related issues have arisen in which the mentioned methods can be applied, resulting in a more robust analysis of the 3D reconstructions. As the intended application involves 3D images, the theory is described for the 3D case, even though it can be applied equally well to 2D images. Two-dimensional images are used as illustrations for easier understanding. In Section 2, we provide some preliminaries, with special focus on the RFDT (Section 2.1) and CMFBs (Section 2.2). In Section 3, we show how the RFDT can be used as a region-growing algorithm.
We continue with a description of the intended application (Section 4) and give some ideas on how a 3D reconstruction can be converted to a fuzzy object (Section 5). The latter is not the focus of this chapter but is still essential. In Section 6, we show how region growing by RFDT can be used to identify the subunits of a macromolecular structure. Section 7 describes how CMFBs can be used to find the core of an elongated macromolecular structure.
2. PRELIMINARIES

We recall the definition in Zadeh (1965): let X be the reference set; then a fuzzy subset A of X is defined as a set of ordered pairs A = {(x, µA(x)) | x ∈ X}, where µA : X → [0, 1] is the membership function of A in X. A 3D fuzzy digital object O is a fuzzy subset defined on Z³; that is, O = {(v, µO(v)) | v ∈ Z³}, where µO : Z³ → [0, 1]. A voxel v belongs to the support of O if µO(v) > 0. The fuzzy distance between two points v and u in a fuzzy object O is defined as the length of the shortest path between v and u. In Levi and
The Reverse Fuzzy Distance Transform
Montanari (1970) and Saha et al. (2002), the fuzzy distance $d_O^{\langle 1,\sqrt{2},\sqrt{3}\rangle} : \mathbb{Z}^3 \times \mathbb{Z}^3 \to \mathbb{R}$ between v and u in O is set to

$$d_O^{\langle 1,\sqrt{2},\sqrt{3}\rangle}(v, u) = \min_{\langle v = v_1,\ldots,v_m = u\rangle} \sum_{i=1}^{m-1} \frac{1}{2}\left(\mu_O(v_i) + \mu_O(v_{i+1})\right) \cdot w(v_{i+1} - v_i), \qquad (1)$$
where w(v_{i+1} − v_i) is the spatial Euclidean distance between v_i and v_{i+1}; that is, 1 if v_i and v_{i+1} are face neighbors, √2 if they are edge neighbors, and √3 if they are vertex neighbors. We remark that, for two voxels v₁ = (x₁, y₁, z₁), v₂ = (x₂, y₂, z₂) ∈ Z³ for which max{|x₁ − x₂|, |y₁ − y₂|, |z₁ − z₂|} = 1, we use the notion face neighbor if |x₁ − x₂| + |y₁ − y₂| + |z₁ − z₂| = 1, edge neighbor if the sum is 2, and vertex neighbor if the sum is 3. The FDT can be computed using a sequential algorithm similar to the one used for computing the DT of a 3D binary image (Borgefors, 1996). However, this is computationally inefficient because repeated sets of forward and backward scans over the image are necessary, as opposed to binary images, where the DT of an nD image can be calculated in two scans. The number of iterations depends on the problem domain and may be large. Therefore, Saha et al. (2002) proposed an approach using sorted priority queues. As there is no difference in the results, we use a sequential algorithm for simplicity, even if it is more time consuming. The Euclidean distance is approximated using weights for the local distance between two neighboring voxels. When taking into account a 3 × 3 × 3 neighborhood, the ⟨3, 4, 5⟩ distance is preferably used, where 3 is the weight to a face neighbor, 4 the weight to an edge neighbor, and 5 the weight to a vertex neighbor (Borgefors, 1996). To compute the FDT for a fuzzy object O, the initial values, D₀^F, of the points in the support of O are set to infinity and those of the remaining points to 0. Increasing fuzzy distances are then locally propagated over O in forward and backward scans. For the ⟨3, 4, 5⟩ FDT, using Eq. (1) is straightforward.
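The face/edge/vertex classification in the remark above can be written down directly; the following is a small illustrative helper (the function name is my own, not from the chapter):

```python
def neighbor_type(v1, v2):
    """Classify two 26-adjacent voxels by |x1-x2| + |y1-y2| + |z1-z2|."""
    diffs = [abs(a - b) for a, b in zip(v1, v2)]
    if max(diffs) != 1:
        raise ValueError("voxels are not 26-neighbors")
    # Sum 1 -> face, 2 -> edge, 3 -> vertex neighbor.
    return {1: "face", 2: "edge", 3: "vertex"}[sum(diffs)]
```

For example, (0, 0, 0) and (1, 1, 0) are edge neighbors, so they receive spatial weight √2 in Eq. (1) (or 4 in the ⟨3, 4, 5⟩ approximation).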
In scan i, i ≥ 1, a voxel v ∈ O is assigned the temporary distance value D_i^F(O, ⟨3, 4, 5⟩, v):

$$D_i^F(O, \langle 3,4,5\rangle, v) = \min_{u \in \mathrm{scan}_i}\left\{ D_{i-1}^F(O, \langle 3,4,5\rangle, v+u) + \frac{1}{2}\left(\mu_O(v) + \mu_O(v+u)\right)\cdot w(u) \right\}, \qquad (2)$$

where w(u) refers to the spatial distance weight used for the neighbor u, given in local coordinates around v, and scan_i to the set of neighbors used in a forward scan (i odd) and in a backward scan (i even). The weights are shown in Figure 1. The thick lines separate the neighborhood into the part
[Figure 1 here shows the 3 × 3 × 3 weight mask as three 3 × 3 slices: z = ±1 slices (5 4 5 / 4 3 4 / 5 4 5) and z = 0 slice (4 3 4 / 3 0 3 / 4 3 4).]
FIGURE 1 Weights used for computation of the FDT taking into account a 3 × 3 × 3 neighborhood. The thick lines indicate which part is used in the forward (z = −1 and upper part of z = 0) and backward scan (lower part of z = 0 and z = 1).
used in the forward (front and upper) and in the backward scan (lower and back). The central voxel is included in both scans. The process is repeated until, in scan l + 1, no further updating of voxel values is performed. The result is an image (FDT) in which v is assigned the value D^F(O, ⟨3, 4, 5⟩, v) = D_l^F(O, ⟨3, 4, 5⟩, v); that is, its fuzzy ⟨3, 4, 5⟩ distance from the complement of O. Note that D^F is used throughout the chapter to denote the value of a voxel in an FDT, and d_O to denote the underlying distance function used to calculate an FDT. The methods described in this chapter are applied to a fuzzy object O. Hence, to have a complete process starting directly from the provided data, algorithms are needed to extract a fuzzy object representation of the structure of interest. This can be done in various ways; how it is done in our case is described in Section 5. There we use the membership function often referred to as fuzzy connectedness, c_A (Rosenfeld, 1979). For c_A, the strength of a contiguous path between two voxels u and v in a fuzzy subset A is defined as the smallest membership value along the path, and the degree of connectedness c_A(u, v) as the strength of the strongest path between u and v; that is,

$$c_A(u, v) = \max_{p \in \pi_{uv}} \left( \min_{e \in E(p)} \mu_A(e) \right), \qquad (3)$$
where π_{uv} is the set of all paths between u and v, and E(p) is the set of all voxels along the path p. This process can be implemented using a raster scan technique similar to the one described for the FDT above (Smedby, Svensson, & Löfstrand, 1999) or using a dynamic programming approach (Udupa & Samarasekera, 1996).
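As a concrete illustration of the sequential FDT computation described above, the following is a minimal 2D ⟨3, 4⟩ sketch (my own implementation, not the author's code) of the iterated forward/backward raster scans of Eq. (2); the support is assumed not to touch the image border:

```python
import numpy as np

def fdt_34(mu):
    """2D <3,4> fuzzy distance transform of a membership image mu in [0, 1].

    Iterated forward/backward raster scans until no value changes,
    following Eq. (2) with 3 = face weight and 4 = diagonal weight.
    """
    rows, cols = mu.shape
    # Support gets infinity, background gets 0.
    d = np.where(mu > 0, np.inf, 0.0)
    # Half-neighborhoods (dr, dc, spatial weight) for the two scans.
    fwd = [(-1, -1, 4), (-1, 0, 3), (-1, 1, 4), (0, -1, 3)]
    bwd = [(1, 1, 4), (1, 0, 3), (1, -1, 4), (0, 1, 3)]
    changed = True
    while changed:
        changed = False
        for offs, rr, cc in ((fwd, range(rows), range(cols)),
                             (bwd, range(rows - 1, -1, -1),
                              range(cols - 1, -1, -1))):
            for r in rr:
                for c in cc:
                    if mu[r, c] == 0:
                        continue  # background keeps value 0
                    for dr, dc, w in offs:
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < rows and 0 <= nc < cols:
                            # Local cost: mean membership times spatial weight.
                            cand = d[nr, nc] + 0.5 * (mu[r, c] + mu[nr, nc]) * w
                            if cand < d[r, c]:
                                d[r, c] = cand
                                changed = True
    return d
```

For a binary 3 × 3 square, each border pixel receives 1.5 (the background neighbor has membership 0, so the step cost is ½ · 1 · 3) and the central pixel 4.5; for genuinely fuzzy memberships the values shrink with decreasing µ, as Eq. (1) prescribes.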
2.1. The Reverse Fuzzy Distance Transform

A (crisp) object O ⊂ Zⁿ can be seen as the union of a set of balls B_i(d, c_i, r_i) = {x ∈ Zⁿ | d(c_i, x) ≤ r_i} for some distance function d : Zⁿ × Zⁿ → R, with center
points c_i ∈ Zⁿ and radii r_i ∈ R, i = 1, . . . , m. Hence, O can be represented by a set S(O, d) = {(c_i, r_i) | i = 1, . . . , m}, as O is recovered from S by placing a ball of radius r_i at each c_i. This recovery process is efficiently implemented using the reverse distance transform (RDT). The RDT is computed by propagating local distance information in two scans over the image, starting from a set of points each assigned a distance value, for example, S. It results in a grey-level image in which each point assigned a distance value belongs to at least one of the balls. For details, see, for example, Nyström and Borgefors (1995). The RDT can be generalized to a fuzzy setting, resulting in the RFDT. To compute the RFDT starting from a set A = {(c_i, r_i) | i = 1, . . . , m}, where c_i ∈ O and r_i ∈ Z, we assign initial values, RD₀^F, to all image points. Points in A are assigned their respective r_i, while the remaining points are set to 0. Decreasing fuzzy distances are then propagated on O. For the ⟨3, 4, 5⟩ RFDT and scan i, i ≥ 1, a voxel v ∈ O is assigned the temporary distance value RD_i^F(O, ⟨3, 4, 5⟩, v):

$$RD_i^F(O, \langle 3,4,5\rangle, v) = \max_{u \in \mathrm{scan}_i}\left\{ RD_{i-1}^F(O, \langle 3,4,5\rangle, v+u) - \frac{1}{2}\left(\mu_O(v) + \mu_O(v+u)\right)\cdot w(u) \right\}, \qquad (4)$$

where w(u) refers to the spatial distance weight (see Figure 1) used for the neighbor u, given in local coordinates around v, and scan_i to the set of neighbors used in a forward scan (i odd) and in a backward scan (i even). The process is repeated until, in scan l + 1, no further updating of voxel values is performed. The result is an image (RFDT) in which each voxel v is assigned the value RD^F(O, ⟨3, 4, 5⟩, v) = RD_l^F(O, ⟨3, 4, 5⟩, v); that is, its reverse fuzzy ⟨3, 4, 5⟩ distance to its closest voxel in A. To illustrate the effect of the RFDT, and also to compare it with the RDT, we have constructed a set of 2D fuzzy objects O_i, i = 1, . . . , 4 (Figure 2, left column, rows one to four).
Each O_i corresponds to the union of two balls (O_i′ and O_i″), where the radius of the support of O_i′ (left) is larger than the radius of the support of O_i″ (right). The border of each ball has linearly decreasing membership values. Two different slopes have been used, giving different degrees of fuzziness. The objects O_i, i = 1, . . . , 4, have been constructed in such a way that O₁′ and O₁″ both have weak slopes; O₂′ has a weak slope and O₂″ a strong slope; O₃′ has a strong slope and O₃″ a weak slope; and O₄′ and O₄″ both have strong slopes. Three crisp objects, O₅, O₆, and O₇, created by taking µ_{O₁} > 0, µ_{O₂} > 0.6, and µ_{O₃} > 0.6, respectively, are also shown in Figure 2 (left column, rows five to seven). In Figure 2 (middle), the ⟨3, 4⟩ FDT and ⟨3, 4⟩ DT for O_i, for i = 1, . . . , 4 and i = 5, . . . , 7, respectively, are shown. (We remark that the ⟨3, 4⟩ FDT and ⟨3, 4⟩ DT correspond, for 3D images, to the
FIGURE 2 Left: Fuzzy objects O_i, i = 1, . . . , 4, and crisp objects O_i, i = 5, 6, 7. Middle: The ⟨3, 4⟩ FDT for O_i, i = 1, . . . , 4, and the ⟨3, 4⟩ DT for O_i, i = 5, 6, 7. Right: The pixels, shown in grey and white, reached by the RFDT when applied to A_i for O_i, i = 1, . . . , 4, and by the RDT when applied to A_i for O_i, i = 5, 6, 7. The sets A_i, i = 1, . . . , 7, are shown overlaid in blue. The support of O₁ is outlined in green. (See Color Insert.)
above-described ⟨3, 4, 5⟩ FDT and ⟨3, 4, 5⟩ DT.) For each O_i, i = 1, . . . , 7, we have a set A_i(O_i, d_{O_i}) = {(c_i′, r_i′), (c_i″, r_i″)}, where r_i′ and r_i″ are set equal to the shortest fuzzy ⟨3, 4⟩ distance from c_i′ and c_i″ to the complement of the support of O_i, for i = 1, . . . , 4, respectively, and to the shortest ⟨3, 4⟩ distance from c_i′ and c_i″ to the complement of O_i, for i = 5, . . . , 7, respectively. The sets A_i, i = 1, . . . , 7, are shown in blue in Figure 2 (right column). In this case, A_i does not represent O_i, i = 1, . . . , 7, completely. The pixels reached when applying the RFDT for i = 1, . . . , 4, and the RDT for i = 5, . . . , 7,
respectively, to A_i, i = 1, . . . , 7, are shown in grey, corresponding to pixels closest to c_i′, and white, corresponding to pixels closest to c_i″, in Figure 2 (right column). Due to the choice of r_i′ and r_i″, pixels on the border of the support of O_i are reached, but no pixels outside. For O_i, i = 1, . . . , 4, we see the effect of the fuzziness of the borders of O_i′ and O_i″. When different slopes have been used for the two balls, that is, for O₂ in row two and O₃ in row three, the border between the reached regions is slightly shifted toward the left (O₂) or the right (O₃). When the same slope has been used, that is, for O₁ in row one and O₄ in row four, the border between the reached regions is located in the same position, but the size of the regions differs. In the case of a weak slope, that is, for O₁, a smaller region is reached than for the strong slope, that is, for O₄. To a certain extent we can see the same effect for the crisp objects O_i, i = 5, . . . , 7, but there it depends on the threshold of the membership function used to create the objects. By using a fuzzy setting, we avoid this dependency and can directly analyze the (fuzzy) shape of the structure of interest. In Figure 2, the support of O₁, which is equal to the support of O_i, i = 2, . . . , 5, is outlined in green for easier comparison. For implementational aspects of the RFDT, we refer to Svensson (2007b). Note that for the computation of the local fuzzy distance between two voxels v and u required in the algorithm, µ_O(v) and µ_O(u) need to be known. Hence, the RFDT is actually computed on O. Saha et al. (2002) suggested using the Euclidean distance as the spatial local distance between voxels when computing the FDT [Eq. (1)]. We use Eq. (1) for the computation of the FDT and the RFDT, but with the ⟨3, 4, 5⟩ distance instead of the Euclidean distance, following the concept of weighted DTs for binary images (Borgefors, 1996).
By using the ⟨3, 4, 5⟩ distance, we can work with integer numbers and still achieve a good approximation of the Euclidean distance. The RDT, as well as the RFDT, can be used as a region-growing process. We start from a set of seeds, each of which has a distance value as well as a label. The label is propagated together with the decreasing distance information. The result is that points in the object are labeled with the label of the closest seed point. In fact, this type of label propagation was used for the examples shown in Figure 2 and is used in Section 3.
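The label propagation just described can be sketched as follows; this is an illustrative 2D ⟨3, 4⟩ variant of the RFDT using a sorted priority queue (the queue-based formulation and all names are my own assumptions, not the author's code):

```python
import heapq

# 8-neighborhood offsets with <3,4> spatial weights (3 = face, 4 = diagonal).
NEIGHBORS_34 = [(-1, -1, 4), (-1, 0, 3), (-1, 1, 4), (0, -1, 3),
                (0, 1, 3), (1, -1, 4), (1, 0, 3), (1, 1, 4)]

def rfdt_region_growing(mu, seeds):
    """Propagate decreasing fuzzy distances with labels (2D <3,4> RFDT).

    mu:    dict {(r, c): membership > 0} over the support of the object.
    seeds: dict {(r, c): (distance_value, label)}.
    Returns a dict mapping each reached pixel to the label of its
    'closest' seed in the reverse fuzzy distance sense.
    """
    value = {p: v for p, (v, _) in seeds.items()}
    label = {p: l for p, (_, l) in seeds.items()}
    heap = [(-v, p) for p, v in value.items()]  # max-heap via negation
    heapq.heapify(heap)
    while heap:
        neg, p = heapq.heappop(heap)
        v = -neg
        if v < value.get(p, 0.0):
            continue  # stale queue entry
        r, c = p
        for dr, dc, w in NEIGHBORS_34:
            q = (r + dr, c + dc)
            if q not in mu:
                continue  # propagation stays inside the support
            # Decreasing value: subtract the local fuzzy step cost, Eq. (4).
            nv = v - 0.5 * (mu[p] + mu[q]) * w
            if nv > value.get(q, 0.0):
                value[q] = nv
                label[q] = label[p]
                heapq.heappush(heap, (-nv, q))
    return label
```

On a binary line of five pixels with two equal-valued seeds at the ends, each seed labels its own half, and the propagation fronts meet in the middle.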
2.2. The Centers of Maximal Fuzzy Balls

Given a (crisp) object O ⊂ Zⁿ, a distance function d : Zⁿ × Zⁿ → R, and the corresponding DT computed on O, the distance value of a point c ∈ O in the DT can be interpreted as the radius r of a ball B(d, c, r) = {x ∈ Zⁿ | d(c, x) ≤ r} such that B ⊆ O and B(d, c, r + ε) ⊄ O for any ε > 0. Let DTB = {B_i(d, c_i, r_i) | i = 1, . . . , m}, where the c_i are all the points in O, the r_i are their respective values in the DT, and m is the number of points in O. Hence, O = ∪_{i=1}^m B_i, where B_i ∈ DTB.
A ball B^M is denoted a maximal ball, and c^M a center of a maximal ball, if B_i ⊅ B^M for all B_i ∈ DTB, i = 1, . . . , k. Thus, ∪_{i=1}^k B_i^M is equal to O. This means that O can be represented by its set of centers of maximal balls, denoted CMB(O, D), and O can be recovered from CMB(O, D) using the RDT. CMB(O, D) can be identified in one scan over the DT by value comparison. This is due to the fact that CMB(O, D) consists of the points in the DT that do not propagate distance information to neighboring points; that is, CMB(O, D) consists of "local maxima." Considering O ⊂ Z³ and the ⟨3, 4, 5⟩ distance, a voxel v belongs to CMB(O, ⟨3, 4, 5⟩) if, for every element n_i, i = 1, . . . , 26, in the neighborhood, given in local coordinates around v, with respective weight w(n_i) ∈ {3, 4, 5},

$$D(O, \langle 3,4,5\rangle, v + n_i) < D(O, \langle 3,4,5\rangle, v) + w(n_i), \qquad (5)$$
where D(O, ⟨3, 4, 5⟩, v) is the distance value of voxel v in the ⟨3, 4, 5⟩ DT of O. Special treatment is necessary for voxels with small values in order not to detect false CMBs. For the ⟨3, 4, 5⟩ distance, voxels with value 3 are considered as having value 1 while performing the comparison (Arcelli & Sanniti di Baja, 1988). The reason is that not all distance values can occur in the ⟨3, 4, 5⟩ DT, where no voxel can have value 1 or 2. The identification process for CMB(O, ⟨3, 4, 5⟩) is valid for other DTs, with suitable adjustment of the condition for voxels with small distance values. CMB(O, D) is not only a compact representation of O, but is also often used as the set of nonremovable points in skeletonization (a commonly used medial representation) to guarantee that the object can be recovered from its CMB(O, D) (Sanniti di Baja, 1994). Once the RFDT has been introduced, we can use it to define what we will denote the CMFB for a fuzzy object O. The concept of CMFBs is of interest since they can be used, for example, in fuzzy object-based skeletonization, representing the most internal structure of O. A (fuzzy) object O can to some extent be treated as a set of balls B_i^F(d_O, c_i, r_i) = {x ∈ Zⁿ | d_O(c_i, x) ≤ r_i} for some distance function d_O : Zⁿ × Zⁿ → R, with center points c_i ∈ Zⁿ and radii r_i ∈ R, i = 1, . . . , m, where the union of the set, ∪_{i=1}^m B_i^F, is equal to the support of O. In the fuzzy case, we cannot, as in the crisp case, say that O can be represented by a set S(O, d_O) = {(c_i, r_i) | i = 1, . . . , m}, as B_i^F depends not only on c_i and r_i but also on O itself. Despite this, the concept is still of importance. The support of O can be obtained from S by means of the corresponding RFDT. We extend the concept of CMBs (the crisp case) to a fuzzy framework by introducing the CMFBs, denoted CMFB(O, D^F), where d_O : Zⁿ × Zⁿ → R is the underlying fuzzy distance function and D^F are the values in the corresponding FDT.
Analogously to the crisp case, we define CMFB(O, D^F) to be the points that do not propagate distance information to neighboring points while calculating the FDT. CMFB(O, D^F) can be detected in one scan over the FDT of O by value comparison, taking into account also the
membership values in O. To be more precise, considering O ⊂ Z³ and the fuzzy ⟨3, 4, 5⟩ distance, a voxel v belongs to CMFB(O, ⟨3, 4, 5⟩) if, for every element n_i, i = 1, . . . , 26, in the neighborhood, with respective weight w(n_i) ∈ {3, 4, 5},

$$D^F(O, \langle 3,4,5\rangle, v + n_i) < D^F(O, \langle 3,4,5\rangle, v) + \frac{1}{2}\left(\mu_O(v) + \mu_O(v + n_i)\right)\cdot w(n_i), \qquad (6)$$
where D^F(O, ⟨3, 4, 5⟩, v) is the distance value at v in the ⟨3, 4, 5⟩ FDT. The identification of CMFB(O, D^F) does not require any special treatment of small fuzzy distance values. This is a difference from the crisp case described above. (Note that this differs from what was stated earlier in Svensson, 2008, a discovery made after the publication was in press.) Following the definition, given the ball B^F(d_O, c, r) for which (c, r) represents a voxel v ∈ CMFB(O, D^F), there exists no other voxel u ∈ CMFB(O, D^F), with (c′, r′), such that the support of B^F(d_O, c, r) is a subset of the support of B^F(d_O, c′, r′). However, the support of B^F(d_O, c, r) can be a subset of the union of the supports of a set of balls B^F(d_O, c_i, r_i), i = 1, . . . , n (corresponding to voxels v₁, . . . , v_n in CMFB(O, D^F)). Hence, it is possible to find a smaller set that actually recovers the support of O. (We will see in the following text that a reduction of CMFB(O, D^F) is actually of interest.) However, such a reduction process is not a trivial task. The most brute-force way is to calculate the RFDT from CMFB(O, D^F) on a leave-one-out basis and compare its support with the support of O, meaning that the RFDT must be computed a number of times corresponding to the number of voxels in CMFB(O, D^F). In Borgefors and Nyström (1997), a more efficient reduction process was described for the crisp case. The examples shown here use the brute-force way, as a generalization of, for example, the algorithm in Borgefors and Nyström (1997) to a fuzzy setting is currently not available. We will denote the reduced set CMFBR(O, D^F). To show the advantages of a fuzzy approach compared to a crisp one, we use the example in Figure 3. The fuzzy object O₈ (Figure 3, left, top row) is composed of six balls of different fuzziness, but with supports having the same radius. The CMFB(O₈, ⟨3, 4⟩) is shown in blue.
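Returning to Eq. (6), the value-comparison test for a CMFB can be sketched as follows (a 2D ⟨3, 4⟩ variant for illustration; the helper name and the dictionary representation are mine):

```python
# 8-neighborhood offsets with <3,4> spatial weights (3 = face, 4 = diagonal).
NBRS_34 = [(-1, -1, 4), (-1, 0, 3), (-1, 1, 4), (0, -1, 3),
           (0, 1, 3), (1, -1, 4), (1, 0, 3), (1, 1, 4)]

def is_cmfb(df, mu, p):
    """Test the CMFB condition of Eq. (6) at pixel p (2D <3,4> variant).

    df: dict of FDT values over the support; mu: dict of memberships.
    p is a CMFB if it propagates no distance value to any neighbor.
    """
    r, c = p
    for dr, dc, w in NBRS_34:
        q = (r + dr, c + dc)
        # Background neighbors have membership 0 and FDT value 0.
        cost = 0.5 * (mu[p] + mu.get(q, 0.0)) * w
        if df.get(q, 0.0) >= df[p] + cost:
            return False  # p propagates a value to q: not a CMFB
    return True
```

For a binary 3 × 3 square (border FDT values 1.5, center 4.5), the center passes the test, while a border pixel face-adjacent to the center fails, since it propagates the value 1.5 + 3 = 4.5 to the center.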
Two crisp objects, O₉ and O₁₀, created by taking µ_{O₈} > 0 and µ_{O₈} > 0.4, respectively, are shown in Figure 3 (left, middle and bottom rows), with CMB(O₉, ⟨3, 4⟩) and CMB(O₁₀, ⟨3, 4⟩) in blue. The support of O₈ is outlined in green for comparison. As can be seen, CMFB(O₈, ⟨3, 4⟩) provides a robust representation that still reflects the fuzziness of the borders of the balls, as the vertical part centrally located in each of the balls becomes more evident as the fuzziness decreases. CMB(O₉, ⟨3, 4⟩) and CMB(O₁₀, ⟨3, 4⟩) present a different situation: O₉ consists of balls having the same radius and, thus, CMB(O₉, ⟨3, 4⟩) has a similar constitution throughout the object.
FIGURE 3 Top row: A fuzzy object O₈ with CMFB(O₈, ⟨3, 4⟩) shown in blue (left) and with CMFBR(O₈, ⟨3, 4⟩) (right). Middle row: A crisp object O₉ with CMB(O₉, ⟨3, 4⟩) (left) and with CMBR(O₉, ⟨3, 4⟩) (right). Bottom row: A crisp object O₁₀ with CMB(O₁₀, ⟨3, 4⟩) (left) and with CMBR(O₁₀, ⟨3, 4⟩) (right). The support of O₈ is outlined in green. (See Color Insert.)
O₁₀ consists of balls with increasing radii and, thus, CMB(O₁₀, ⟨3, 4⟩) gives a representation that varies more. Again, the vertical part becomes more evident as the radius increases. However, as for Figure 2, this is dependent on the threshold of the membership function used to create O₁₀. For comparison, Figure 3 also shows CMFBR(O₈, ⟨3, 4⟩), CMBR(O₉, ⟨3, 4⟩), and CMBR(O₁₀, ⟨3, 4⟩) (right column). The reduction process removes the spurious pixels in the far right ball of CMFB(O₈, ⟨3, 4⟩). In this case, the reduction is not that evident. However, in other cases, especially for real data, it is crucial to remove unnecessary points to achieve a representation more suitable for subsequent analysis. In Figure 4a and d, two fuzzy objects O₁₁ and O₁₂ are shown with CMFBR(O₁₁, ⟨3, 4⟩) and CMFBR(O₁₂, ⟨3, 4⟩) in blue and the supports of O₁₁ and O₁₂ outlined in green, respectively. O₁₁ shows that the internal grey-level structure is enhanced by the described fuzzy approach. The support of O₁₁ is a ball. Internally, O₁₁ has an ellipsoidal region of higher membership values. CMFBR(O₁₁, ⟨3, 4⟩) reflects both of these aspects. This is a property that cannot be achieved using a crisp approach. For comparison, we show two crisp objects O₁₃ and O₁₄ (Figure 4b and c, respectively) with CMBR(O₁₃, ⟨3, 4⟩) and CMBR(O₁₄, ⟨3, 4⟩) in blue and the support of O₁₁ outlined in green. O₁₃ and O₁₄ have been created by taking µ_{O₁₁} > 0 and µ_{O₁₁} > 0.5, respectively. CMBR(O₁₃, ⟨3, 4⟩) gives a suitable representation of the support of O₁₁, while CMBR(O₁₄, ⟨3, 4⟩) emphasizes one aspect of the internal structure of O₁₁. Hence, we need to choose which information is the most important. In a fuzzy setting, we can instead consider both aspects. O₁₂ illustrates another important aspect. It consists of a set of equally sized ellipsoids, all with a fuzzy border, placed at increasing distances from each other.
Because of this, the border of the support of O₁₂ has concavities of increasing size. Especially for a crisp approach, jaggedness
FIGURE 4 Two fuzzy objects (a) O₁₁ and (d) O₁₂ with CMFBR(O_i, ⟨3, 4⟩) in blue and the support of O_i outlined in green (i = 11, 12). Two crisp objects (b) O₁₃ and (c) O₁₄ with CMBR(O_i, ⟨3, 4⟩) in blue and the support of O₁₁ outlined in green (i = 13, 14). O₁₃ and O₁₄ are constructed by thresholding µ_{O₁₁}. (See Color Insert.)
of a border will result in many spurious CMBs located around the main axis. CMFBR(O₁₂, ⟨3, 4⟩) shows this effect only where the concavities become more evident.
3. SEGMENTATION USING REGION GROWING BY MEANS OF THE REVERSE FUZZY DISTANCE TRANSFORM

Image segmentation is the process used to define the relevant structures (the objects) in an image by separating them from each other and from the nonrelevant parts, the background. It is a crucial step in the analysis of a digital image and is often, despite years of research, the most difficult part. Many segmentation algorithms are based on the concept of region growing. Seeds, which are a set of points carrying information about both position and identity label, are placed inside the potential objects, either manually or by some automatic process. The seed of a certain label is then allowed to propagate its label onto neighboring points, if they are similar enough according to some cost function, in an iterative manner. This process is done in parallel for all seeds. The label propagation from a seed terminates when a propagation front originating from a seed with a different label is reached or when there are no more neighboring points that are similar enough. The cost function can be based, for example, on grey-level homogeneity in the region corresponding to the seed or on gradient magnitude information extracted from the image. Methods that originate from this basic idea are, for example, level set based segmentation (Sethian, 1999) and watershed segmentation (WS) (Beucher & Lantuéjoul, 1979). We here propose to use region growing by means of the RFDT. In this section, we assume that the
seeds are given. In Section 6, we show how seed detection can be done for a specific application. Given a fuzzy object O, and with the aim to segment it, or rather decompose it, into relevant subunits with prior knowledge about seeds, we suggest emphasizing the shape further than other methods do by using the RFDT as a region-growing process. Initially, we have O, its FDT calculated using d_O, and a set of labeled seeds L₀(O, D^F) = {(C_{0i}, R_{0i}, l_i) | i = 1, . . . , m}, where C_{0i} ⊂ O, and l_i and R_{0i} are their labels and distance values in the FDT, respectively. L₀ can be, for example, the local grey-level maxima in the FDT. The RFDT of L₀ is computed on O, where labels are propagated together with the decreasing fuzzy distance information. By this, a subset of the points in O is assigned a label. The process is repeated using L_j(O, D^F) = {(C_{ji}, R_{ji}, l_i) | i = 1, . . . , m} as input, where C_{ji} ⊂ O are the points labeled l_i by the previous steps 0, . . . , j − 1, and R_{ji} are their corresponding distance values in the FDT. After a number of iterations, dependent on O and L₀, all points in O are assigned a label. We note that this process allows us to incorporate information about O and at the same time use fuzzy distance information (i.e., shape) in the region growing. When region growing is applied to the FDT, information about O and fuzzy distance information are also used, but they are weighted together in the FDT. In our opinion, the shape of an object is emphasized even more if region growing by means of the RFDT is applied, since the two types of information, from O and its FDT, are better exploited when used not only for computing the FDT and selecting seeds, but also in the actual region-growing process. A region-growing process resembling region growing by RFDT is the seeded WS described by Vincent (1993).
However, for seeded WS the region growing is done grey-level after grey-level, which means that we would need to choose between basing the region growing on the information stored in the FDT or directly on the membership values of O. The RFDT can be implemented, for example, by using sorted priority queues in the same manner as seeded WS, giving comparable computational cost. When CMFB(O, D^F) (or CMFBR(O, D^F)) is not included in L_j(O, D^F) for any j, iterative use of the RFDT will not be enough to assign all voxels in O a label. A constrained FDT can be used for the assignment of the remaining voxels: an unlabeled voxel in O is assigned the same label as its closest, in terms of the fuzzy distance d_O, already labeled voxel. Region growing by RFDT is well suited for identification of rather spherical subunits of a fuzzy object or, to use a slightly different terminology, rather spherical clustered structures. We use the fuzzy objects O_i, i = 1, . . . , 4, previously shown in Figure 2, to illustrate region growing by RFDT. In Figure 5, the left column shows O_i with the support outlined in green and L₀(O_i, ⟨3, 4⟩) overlaid in magenta and blue, i = 1, . . . , 4. In the right column, the subunits resulting from the region growing are shown. Notice that the size of the region corresponding to a seed depends on the
FIGURE 5 Left column: Fuzzy objects O_i with L₀(O_i, ⟨3, 4⟩) in magenta and blue, i = 1, . . . , 4. Right column: Subunits in magenta and blue found using region growing by RFDT starting from L₀. The support of O_i, i = 1, . . . , 4, is outlined in green. (See Color Insert.)
significance of the subunit it corresponds to compared with the significance of the subunit with which it “competes”.
4. CRYO-ELECTRON TOMOGRAPHY FOR IMAGING OF INDIVIDUAL MACROMOLECULES

This section describes an application from the field of structural biology for which the above-described theoretical framework can be used to advantage. Macromolecular structures can be imaged using a transmission electron microscope (TEM). In fact, transmission electron microscopy is a powerful tool for increased understanding of biological processes. To achieve a three-dimensional (3D) view of the sample under study, one of two techniques can be applied: so-called single-particle imaging or individual-particle imaging. Both methods need to be combined with a reconstruction technique to obtain a 3D electron tomographic image from 2D micrographs (i.e., 2D projection images). We focus here on the latter approach. Micrographs of the sample are captured from different angles. The micrographs are then used to reconstruct a 3D image (in the following, 3D reconstruction). The 3D reconstruction can be determined by using the well-known filtered backprojection technique as described by Crowther, DeRosier, and Klug (1970), algebraic reconstruction techniques (ART) as described in Gordon,
Bender, and Herman (1970), the simultaneous iterative reconstruction technique (SIRT) as described in Gilbert (1972), or by the technique used in this work, the iterated regularization method Constrained Maximum Entropy Tomography (COMET), with filtered backprojection as prior, as described in Skoglund, Öfverstedt, Burnett, and Bricogne (1996) and Rullgård, Öktem, and Skoglund (2007). By using cryo-ET, it is possible to examine proteins and other macromolecules with a resolution of a few nanometers. The 3D reconstruction provides structural information about macromolecules that cannot be captured in any other way, as the properties of individual molecules can be examined. To make full use of this unique type of information, methods for interpretation of the 3D reconstructions are necessary. This type of image is challenging. The image quality is rather poor for several reasons. The electron irradiation destroys the sample, which means that the total dose must be kept very low, resulting in images with a low SNR and thereby low contrast. The electron microscope limits the angular range that can be examined to 120°–140°, resulting in limited-angle artefacts in the 3D reconstructions. The samples are often of uneven thickness, which skews the background level in the 2D micrographs and hence also in the 3D reconstructions. These are just a few of the image-quality challenges. In addition, the macromolecules are often represented by a rather small number of voxels. Considering all these facts, the presented theoretical framework is well suited for application to 3D reconstructions.
4.1. Methods for Analyzing Cryo-Electron Tomography Data

As mentioned previously, developing automatic, or even semi-automatic, methods for ET data is not a trivial task. This is true both when the data are collected in a single-particle manner and in an individual-particle manner. Over the past 10 to 15 years a number of articles have been published on this topic, but the field is still far behind the developments in, for example, medical imaging. Methods for segmentation of subunits can be found in Volkmann (2002), Yu and Bajaj (2006), Baker, Yu, Chiu, and Bajaj (2006), Garduño et al. (2008), and Nguyen and Ji (2008), among others. As ET gives low-resolution data, it is often of interest to incorporate the structural information that can be extracted from high-resolution data (i.e., docking of high-resolution structures, as resolved by, for example, X-ray crystallography) into the low-resolution data. Such methods can be found in Wriggers, Milligan, and McCammon (1999) and Birmanns and Wriggers (2007), among others.
4.2. Specific Imaging Settings
To show the performance of our methods on experimental data, we have used the material published by Sandin, Öfverstedt, Wikström, Wrange, and Skoglund (2004). There, cryo-ET experiments on the monoclonal murine antibody IgG2a in solution were described. Antibodies are crucial
The Reverse Fuzzy Distance Transform
constituents of our immunological defense systems. They bind to foreign agents and target them, for instance, for destruction. The IgG antibody is the most abundant antibody in blood and has a molecular weight of about 150 kDa. It consists of three subunits, two fragment antigen-binding (Fab) arms and a stem (Fc). The subunits are pairwise connected by a flexible hinge that allows for significant relative mobility. The three subunits can be identified in 3D reconstructions, but the resolution is too low to actually resolve the hinge region. The hinge region consists of 19 amino acids and, depending on the extent to which this region is stretched, the subunits appear either disconnected or connected. In the study, IgG2a solution was mixed with a solution of 10-nm colloidal gold particles at a ratio of 2:1. Ultrathin films of the mixed solution were plunged into liquid ethane at −175 °C, which causes a very rapid freezing rate with the effect that the water forms vitreous ice around the proteins, preserving the hydrated structure and immobilizing them in the states they last occupied. The specimen was imaged under low-dose conditions (the total electron dose was ∼20 e/Å²) using a field emission gun (FEG) 200-keV TEM (Philips CM200). Two-dimensional micrographs were recorded over an angular range of 120°–125°, at either every degree or every other degree, on a CCD detector with a pixel size of 14 µm², at a magnification of 26,715×. The colloidal gold particles were used to align the micrographs to one another. Three-dimensional reconstructions were computed using COMET. For a more detailed description, see Sandin et al. (2004). In Figure 6, four 50 × 50 × 50 voxel extracts from the 3D reconstructions, each containing one manually identified IgG antibody, are shown volume-rendered. The antibodies are labeled IgG1 (top left), IgG2 (top right), IgG3 (bottom left), and IgG4 (bottom right).
4.3. Phantoms Constructed From Structures Deposited in the Protein Data Bank
To show various aspects of our methods in a more controlled way, we have constructed a set of phantoms from structures deposited in the Protein Data Bank (PDB) (Berman et al., 2000). From a PDB entry, it is possible to generate a 3D image, with a certain voxel size and resolution, of the corresponding macromolecule. For example, this can be done using a model where a Gauss kernel is placed at each atom position and multiplied by the mass of that atom (Pittet, Henn, Engel, & Heymann, 1999). The total density is then calculated by adding the contributions from the Gauss kernels of atoms in the vicinity of the voxel. The resolution of the image is 2σ, where σ is the standard deviation used to calculate the Gauss kernels. This approach is a simplification of the macromolecule, as each atom is approximated by a kernel of the same shape. Instead, we have used RHOGEN, a program originating from the crystallographic community and
Stina Svensson
FIGURE 6 Four 50 × 50 × 50 voxel extracts from 3D reconstructions each containing one manually identified IgG antibody (shown volume-rendered).
adapted to ET at the Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden, and freely available for academic use,1 to create the phantoms. Each atom is approximated by a function that reflects the shape and the different oxidation states of the atom. The potential functions differ depending on whether the incoming scattering is from the electron cloud or from the nuclei; hence the needed adjustment of the program. We have chosen three different structures to show the different features of our methods. They are, listed with their respective PDB ids, as follows:
1igt This corresponds to the murine IgG2a monoclonal antibody (mAb) 231 and is the only crystallographic structure of an intact IgG2 antibody (Harris, Larson, Hasel, & McPherson, 1997). Hence, it is the structure that best corresponds to the experimental data described in Section 4.2. As previously mentioned, the molecular weight of an IgG antibody is ∼150 kDa.
1 Contact Prof. Ulf Skoglund ([email protected]) for a copy of the program.
2rec This corresponds to the RecA protein from the experiments by Yu and Egelman (1997), where it was suggested that the RecA hexamer is a structural homologue of ring helicases. The RecA protein plays several key roles in genetic recombination and repair and has therefore been the focus of several electron microscopy and X-ray crystallography studies. In Birmanns and Wriggers (2007), this structure was used to illustrate the proposed algorithm for multiresolution anchor-point registration of biomolecular assemblies and their components. The molecular weight of RecA is ∼230 kDa.
1q5a This corresponds to the S-shaped trans-interactions of cadherins described in He, Cowin, and Stokes (2003), which is a model based on the fitting of C-cadherin ectodomains (PDB id 1l3w, Boggon et al. (2002)) to 3D reconstructions of desmosomes. Cadherins are transmembrane proteins that mediate adhesion between cells in the tissues of animals. Cell adhesion relies on interactions between cadherin molecules. Such interactions between cadherin molecules and their organization in the desmosome can (as far as we are aware) only be studied in situ using ET. Recently a model for the organization was built using this technique (Al-Amoudi, Castaño Díez, Betts, & Frangakis, 2007). The molecular weight of the structure (1q5a) is ∼200 kDa.
We start by constructing phantoms of atomic resolution (i.e., grid size 1 Å). These are shown in Figure 7 (left column). The 3D images are then low-pass filtered to a resolution of 30 Å and then resampled to grid size 5.74 Å (i.e., what corresponds to the grid size of the experimental data). The resulting low-resolution phantoms are shown in Figure 7 (middle and right columns).
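The simple Gauss-kernel density model described at the start of this section (the simplified alternative to RHOGEN) can be sketched as follows; the grid shape, voxel size, and atom list are illustrative values, not those of the actual phantoms:

```python
import numpy as np

def density_map(coords, masses, shape, voxel_size, sigma):
    """Render atoms as mass-weighted Gauss kernels on a regular grid.

    coords: (n, 3) atom positions in Angstrom; masses: (n,) atomic masses;
    shape: grid dimensions; voxel_size: grid spacing in Angstrom;
    sigma: kernel width (the model resolution is taken as 2*sigma)."""
    vol = np.zeros(shape)
    zz, yy, xx = np.indices(shape)
    grid = np.stack([zz, yy, xx], axis=-1) * voxel_size
    cutoff = 4.0 * sigma                       # ignore atoms beyond 4 sigma
    for p, m in zip(coords, masses):
        d2 = ((grid - p) ** 2).sum(axis=-1)
        near = d2 < cutoff ** 2                # only voxels in the vicinity
        vol[near] += m * np.exp(-d2[near] / (2.0 * sigma ** 2))
    return vol

# Two "atoms" on a small grid; the heavier one yields the higher peak.
coords = np.array([[5.0, 5.0, 5.0], [5.0, 5.0, 15.0]])
masses = np.array([12.0, 16.0])   # e.g. carbon and oxygen
vol = density_map(coords, masses, (20, 20, 20), voxel_size=1.0, sigma=2.0)
```

A real implementation would bin atoms spatially instead of looping over the full grid per atom, and, as noted above, RHOGEN replaces the identical Gauss kernels by atom-specific potential functions.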
4.4. Simulated Data
One difficulty when evaluating methods intended for use in ET applications is the fact that no ground truth exists, and phantoms, constructed as in Section 4.3, provide a "too-ideal" view of the structures, as many artefacts are imposed by the image acquisition technique. This makes a fully objective evaluation hard to achieve. One way to overcome this, to some extent, is to construct a phantom and use a TEM simulator to simulate micrographs from it, which can then be used to obtain a 3D reconstruction of the phantom. The phantom can be constructed from any suitable structure deposited in the PDB as described in Section 4.3. For our simulations, we use the TEM simulator implemented by Hans Rullgård, Department of Mathematics, Stockholm University, Stockholm, Sweden, which is freely available.2
2 Download from URL: www.math.su.se/~hansr.
The TEM model in
FIGURE 7 From top to bottom: Phantoms constructed of PDB id 1igt, 2rec, and 1q5a, respectively. The left column shows surface renderings of phantoms of atomic resolution. The middle column shows surface renderings of phantoms that have been low-pass filtered to a resolution of 30 Å and resampled to grid size 5.74 Å. The right column shows volume renderings of the phantoms in the middle column.
the simulator is described by Fanelli and Öktem (2008). To run the simulator, the user needs to provide a file containing a 3D image in raw format with the model for the sample (that is, the phantom to be imaged), together with text files describing the parameters specifying the model for the ET experiment. We have used phantoms of atomic resolution as described in Section 4.3. The settings in the simulator correspond to the descriptions in Section 4.2, which means that the generated micrographs resemble micrographs from a FEG 200-keV TEM (Philips CM200). Three-dimensional reconstructions from the micrographs were computed using COMET. Figure 8 shows the simulated 3D
FIGURE 8 From top to bottom: Simulated 3D reconstructions of phantoms constructed of PDB id 1igt, 2rec, and 1q5a, respectively. The left column shows maximum intensity projections and the right column shows surface renderings of the 3D reconstructions.
reconstructions as maximum intensity projections (left column) and surface renderings (right column). Of note, the simulated data are still rather ideal compared with the experimental data, as the simulator takes into account many aspects of the image acquisition process but is far from complete. However, this technique
provides a much more realistic type of data than the phantoms and allows us to better verify the performance of our methods.
5. FROM ELECTRON TOMOGRAPHIC STRUCTURE TO A FUZZY OBJECT REPRESENTATION
A 3D reconstruction is a 3D image containing the density values of the imaged structure. The methods described here require a fuzzy object as input. Hence, before applying the methods, we need to extract a fuzzy object corresponding to the structure. This can be done in many different ways. Some articles on fuzzy segmentation of electron tomographic data sets, which is a process that can be used to extract a fuzzy object, have already been published (Garduño et al., 2008; Gedda, Skoglund, Öfverstedt, & Svensson, submitted for publication; Svensson, 2007b; Svensson et al., 2006). Because the connectedness of the image points making up the objects is imprecise, fuzzy set methods are well suited and, hence, it is probable that more publications on this topic will be forthcoming. To extract a fuzzy object from the data described in Section 4, we use the following approach. For the experimental data, some preprocessing is required to remove small variations in the 3D reconstructions, most likely related to noise. Because we want to retain significant features such as edges, an edge-preserving smoothing algorithm is a suitable choice. A number of algorithms are available for this purpose; an evaluation of such algorithms, focused on ET data, was recently published (Narasimha et al., 2008). We use the anisotropic diffusion filtering introduced by Perona and Malik (1990), since it has been shown to perform well on our data. This step is not required for the phantoms or for the simulated data, because in those cases the original data provide sufficient input. After the data are preprocessed, we extract the fuzzy object O corresponding to a certain structure by using the degree of belongingness a voxel in the image has to voxels that we are certain belong to the structure (seed points), that is, voxels placed in the most internal parts of the structure.
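The Perona–Malik smoothing used in the preprocessing step above can be sketched in two dimensions as follows (the actual processing is 3D, and the parameter values here are illustrative, not those used on the reconstructions):

```python
import numpy as np

def perona_malik(img, n_iter=20, kappa=0.3, dt=0.2):
    """Edge-preserving smoothing by Perona-Malik anisotropic diffusion (2D).

    The conduction coefficient g(d) = exp(-(d/kappa)^2) is close to 1 for
    small grey-level differences (flat regions are smoothed) and close to 0
    across strong edges (which are preserved)."""
    u = img.astype(float).copy()
    for _ in range(n_iter):
        dN = np.roll(u, 1, axis=0) - u
        dS = np.roll(u, -1, axis=0) - u
        dW = np.roll(u, 1, axis=1) - u
        dE = np.roll(u, -1, axis=1) - u
        # zero flux across the image border (replicated boundary)
        dN[0, :] = dS[-1, :] = dW[:, 0] = dE[:, -1] = 0.0
        g = lambda d: np.exp(-(d / kappa) ** 2)
        u += dt * (g(dN) * dN + g(dS) * dS + g(dW) * dW + g(dE) * dE)
    return u

# A noisy step edge: the noise is smoothed while the edge survives.
rng = np.random.default_rng(1)
step = np.zeros((16, 16)); step[:, 8:] = 1.0
noisy = step + 0.05 * rng.standard_normal((16, 16))
smooth = perona_malik(noisy)
```

The choice of kappa between the noise amplitude and the edge contrast is what makes the scheme edge-preserving.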
The degree of belongingness is measured using the fuzzy connectedness c [Eq. (3)]. The first step is then to identify suitable seed points. For this purpose, we follow the approach described, for example, in Svensson et al. (2006). There, the FDT, calculated directly from the input data, is used to emphasize the most internal parts of the structure. Seed points are found among the voxels with the highest FDT values, in our case by using a simple thresholding. The seeds are considered to have µO = 1. We calculate c for all other voxels in the image. As we use a fuzzy setting, the positioning of the border of the support of O is less crucial than if a crisp setting were used. We have chosen to include in O the voxels having the 3000, 4500, and 4000 highest c values for 1igt, 2rec, and 1q5a, respectively,
FIGURE 9 Top row: Fuzzy objects representing the experimental data of the IgG antibodies in Figure 6. Bottom row: Fuzzy objects representing the simulated data of phantoms constructed of PDB id 1igt, 2rec, and 1q5a in Figure 8.
which is a number proportional to their molecular weight. Once the support of O has been determined, µO is set to the calculated c values, rescaled so that the voxels in the support of O have values in the range [0, 1]. A different approach is suggested in Garduño et al. (2008). There, seeds are marked manually in the image and are assigned different labels. One seed is used for the background. The seeds are then allowed to compete with each other (i.e., the label of a seed is propagated together with the membership value). A voxel is assigned the label of the seed to which it has the highest degree of belongingness. Furthermore, Garduño et al. (2008) use a more sophisticated membership function than Eq. (3), in which the expected homogeneity of the region corresponding to each seed is reflected. Hence, the membership function may differ for the different seeds. Figure 9 shows (top row) the fuzzy objects representing the experimental data of the IgG antibodies in Figure 6 and (bottom row) those representing the simulated data of 1igt, 2rec, and 1q5a in Figure 8. The fuzzy objects representing the phantoms (Figure 7), not surprisingly, turn out very similar to the input data (but with voxel values in the range [0, 1]), and we therefore do not include them in Figure 9.
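The seeded fuzzy connectedness computation used to build O can be sketched as follows (a 2D toy with 4-adjacency; the affinity min(µp, µq) is a hedged stand-in for Eq. (3), which is not reproduced here):

```python
import heapq
import numpy as np

def fuzzy_connectedness(mu, seeds):
    """Max-min fuzzy connectedness to a set of seed voxels (2D sketch).

    The affinity between 4-adjacent voxels is taken as min(mu_p, mu_q);
    the strength of a path is the smallest affinity along it, and c(v) is
    the largest strength over all paths from the seeds. Computed with a
    Dijkstra-like best-first propagation."""
    c = np.zeros_like(mu, dtype=float)
    heap = []
    for s in seeds:
        c[s] = 1.0                      # seeds have full connectedness
        heapq.heappush(heap, (-1.0, s))
    while heap:
        strength, (i, j) = heapq.heappop(heap)
        strength = -strength
        if strength < c[i, j]:
            continue                    # stale heap entry
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= ni < mu.shape[0] and 0 <= nj < mu.shape[1]:
                s_new = min(strength, min(mu[i, j], mu[ni, nj]))
                if s_new > c[ni, nj]:
                    c[ni, nj] = s_new
                    heapq.heappush(heap, (-s_new, (ni, nj)))
    return c

# A bright region, a weak "bridge" (column of 0.1), and a second region:
# connectedness to a seed in the first region is capped by the bridge.
mu = np.array([[0.9, 0.8, 0.1, 0.7],
               [0.9, 0.8, 0.1, 0.7]])
c = fuzzy_connectedness(mu, [(0, 0)])
```

Any path from the seed to the rightmost column must cross the 0.1 bridge, so those voxels receive c = 0.1 regardless of their own membership.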
6. IDENTIFYING THE SUBUNITS OF A MACROMOLECULE
Two of the previously mentioned structures (IgG/1igt and 2rec) have been chosen to show the identification of subunits in a macromolecular
structure using region growing by RFDT (Section 3). This is a relevant topic because subunit identification is often required before subsequent analysis, where the goal of the analysis can be, for instance, to determine interrelations between subunits (i.e., the structural conformation of the imaged macromolecule). Region growing by RFDT can be used for this purpose. As mentioned earlier, it is tailored for structures with rather spherical subunits. This is the case for a large set of macromolecules, such as virus particles. What is required, given a fuzzy object representation O of the macromolecule, is to detect suitable seeds [referred to as L0(O, DF) in Section 3] for the region growing. In most cases, subunits have the highest density in their most internal parts; hence, µO will be close to 1 in those regions (following the construction of µO described in Section 5). The FDT computed on O further emphasizes this fact. In previous publications, we have suggested the use of Euclidean- or fuzzy distance-based clustering of local grey-level maxima in the FDT computed on O as a way to detect relevant seeds (Gedda et al., submitted for publication; Svensson, 2007b). Another, simpler and less general, but still efficient way to detect seeds is to use a suitable set of mathematical morphology operations. In this chapter, we chose the latter approach, as it proved robust enough for the data at hand. For seed detection before region growing by RFDT, other approaches developed for ET data could be followed, such as the gradient vector diffusion method presented in Yu and Bajaj (2005). We assume that each subunit of O has an internal, rather homogeneous region of adjacent voxels v1, . . . , vn for which µO(vi), i = 1, . . . , n, is higher than its surroundings. However, due to the poor quality of the input data, it is not guaranteed that there will be one single local grey-level maximum corresponding to each subunit.
Therefore, we use a method to detect such regions called image reconstruction by erosion (Vincent, 1993). We start by creating a structuring element (SE) shaped as a ball with a diameter slightly smaller than the expected diameter of one subunit. O is then eroded by SE; that is, each voxel v in the support of O is assigned the smallest membership value found in the neighborhood of v corresponding to SE. We denote the resulting image εSE(O). εSE(O) is then iteratively dilated by SE using O as a constraint. The result is an image with a locally maximal plateau corresponding to each subunit. The plateaus can be easily identified and are included in L0(O, DF). For more details on this mathematical morphology-based technique, see Vincent (1993) and Soille (1999). Once L0(O, DF) has been identified, region growing by RFDT is applied. Figure 10 shows the subunits identified for the experimental data of the IgG antibodies in Figure 6. The subunits in this case correspond to the two Fab arms (yellow and orange) and the Fc stem (red). To identify the seeds we used a structuring element of diameter seven voxels: ∼40 Å.
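The erosion-plus-constrained-dilation seed detection can be sketched in one dimension as follows (a toy membership profile with two subunits and one spurious bump; the real data are 3D and use a ball-shaped SE):

```python
import numpy as np

def grey_erode(f, radius):
    """Grey-level erosion of a 1D signal by a flat SE of the given radius."""
    return np.array([f[max(0, i - radius):i + radius + 1].min() for i in range(len(f))])

def grey_dilate(f, radius):
    """Grey-level dilation of a 1D signal by a flat SE of the given radius."""
    return np.array([f[max(0, i - radius):i + radius + 1].max() for i in range(len(f))])

def reconstruct(marker, mask):
    """Iteratively dilate marker under the constraint mask until stable."""
    while True:
        nxt = np.minimum(grey_dilate(marker, 1), mask)
        if np.array_equal(nxt, marker):
            return marker
        marker = nxt

def regional_maxima(f):
    """Plateaus of a 1D signal strictly higher than their surroundings."""
    g = np.pad(f.astype(float), 1, constant_values=-np.inf)
    maxima, i = [], 1
    while i <= len(f):
        j = i
        while j + 1 <= len(f) and g[j + 1] == g[i]:
            j += 1                       # extend the plateau
        if g[i - 1] < g[i] and g[j + 1] < g[i]:
            maxima.append((i - 1, j - 1))  # plateau f[i-1 .. j-1]
        i = j + 1
    return maxima

# Membership profile with two subunits and a small spurious bump at index 6.
mu = np.array([0., .2, .8, 1., .8, .3, .5, .3, .9, 1., .9, .2, 0.])
rec = reconstruct(grey_erode(mu, 1), mu)
```

The raw profile has three regional maxima, while the reconstructed one retains exactly one locally maximal plateau per subunit; the spurious bump is suppressed because its peak does not survive the erosion.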
FIGURE 10 The Fab arms (yellow and orange) and the Fc stem (red) for the IgG antibodies in Figure 6 identified using region growing by RFDT. (See Color Insert.)
One important feature to consider when evaluating a certain method is its robustness toward changes in resolution of the input data. For this purpose, we refer to the phantoms described in Section 4.3. The PDB ids 1igt and 2rec were used to construct phantoms of resolution 20, 30, and 40 Å, with a grid size of 5.74 Å. (The 30 Å resolution phantoms are shown in Figure 7.) Figure 11 shows the subunits identified for each of the phantoms, where the subunits correspond to the Fab arms (yellow and orange) and the Fc stem (red) for 1igt and to the individual RecA subunits for 2rec. To identify the seeds we used a structuring element of diameter seven voxels (i.e., ∼40 Å) for 1igt and a structuring element of diameter three voxels (i.e., ∼17 Å) for 2rec. The subunits are satisfactorily identified, and region growing by RFDT thereby appears robust to changes in resolution, in all cases except for the phantom constructed of PDB id 2rec at a resolution of 20 Å. There the border between the subunits shown in blue and yellow does not reflect the structure well. As the resolution becomes higher, the amount of internal structure becomes richer. This is useful for several structure-focused analysis purposes, but it actually affects the seed detection process used here in a negative way. The grid size used is not fine enough to fully represent the internal structure, so that, even though 2rec is symmetric, discretization caused the yellow seed to be weaker (in the sense of containing fewer relevant image points) than the blue seed. We use this example to show that, even with this difference, the fuzzy shape is better preserved than if other region-growing methods are used. In Section 3, region growing by RFDT was briefly compared with seeded WS (Vincent, 1993). In WS, the grey-level image is considered as a topographic map and the final segmentation corresponds to the catchment basins, one for each local maximum.
Region growing starts from labeled local grey-level maxima and propagates on a grey-level basis. When the propagation fronts from different maxima meet, a watershed is built to prevent further region growing. In a real application, the number of local maxima is often large. Many of the maxima correspond to small, nonrelevant grey-level variations in the image, which can result in
FIGURE 11 Subunits identified using region growing by RFDT for phantoms constructed of PDB id 1igt (top row) and 2rec (bottom row). The phantoms are at a resolution of, from left to right, 20, 30, and 40 Å, and have a grid size of 5.74 Å. The subunits correspond to the Fab arms (yellow and orange) and the Fc stem (red) for 1igt and to the individual RecA subunits for 2rec. (See Color Insert.)
FIGURE 12 Subunits identified using region growing by RFDT (left) and seeded WS (right) for a phantom constructed of PDB id 2rec at a resolution of 20 Å, with grid size 5.74 Å.
oversegmentation. This can be overcome by only allowing region growing from seeds. WS is a well-known segmentation algorithm used in many different applications (see Meyer and Beucher (1990) and Vincent (1993) for overviews). Lately, WS has also drawn interest in the electron microscopy community (see Volkmann (2002)). Therefore, it is relevant for use as a comparison. Figure 12 shows the subunits identified for the phantom constructed of PDB id 2rec at a resolution of 20 Å using region growing by RFDT (left) and seeded WS (right), starting from the same seeds. The positioning of the border between the subunits shown in blue and yellow is even more erroneous when seeded WS is used.
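Seeded WS can be sketched as a best-first flood from the labeled maxima (a 2D toy with 4-adjacency; this is an illustration of the principle, not the implementation used for Figure 12):

```python
import heapq
import numpy as np

def seeded_watershed(img, seeds):
    """Seeded watershed on a 2D grey-level image, flooding from maxima.

    seeds: dict mapping a nonzero label to a list of (i, j) seed points.
    Voxels are processed in order of decreasing grey value; each unlabeled
    voxel takes the label of the already-labeled neighbour that reaches it
    first, so fronts meet (and stop) in the valleys between maxima."""
    label = np.zeros(img.shape, dtype=int)
    heap, counter = [], 0                     # counter breaks grey-level ties
    for lab, pts in seeds.items():
        for p in pts:
            label[p] = lab
            heapq.heappush(heap, (-img[p], counter, p, lab)); counter += 1
    while heap:
        _, _, (i, j), lab = heapq.heappop(heap)
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if (0 <= ni < img.shape[0] and 0 <= nj < img.shape[1]
                    and label[ni, nj] == 0):
                label[ni, nj] = lab
                heapq.heappush(heap, (-img[ni, nj], counter, (ni, nj), lab))
                counter += 1
    return label

# Two maxima (values 3 and 4) separated by a valley: the fronts meet there.
img = np.array([[1., 2., 3., 2., 1., 2., 4., 2.]])
labels = seeded_watershed(img, {1: [(0, 2)], 2: [(0, 6)]})
```

Restricting the flood to the supplied seeds is exactly what suppresses the oversegmentation caused by spurious local maxima.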
FIGURE 13 Subunits identified using region growing by RFDT for simulated data of phantoms constructed of PDB id 1igt (left) and 2rec (right). The subunits correspond to the Fab arms and the Fc stem (red, yellow, and orange) for 1igt. For 2rec, two subunits are identified (instead of the desired six) due to loss of information in the simulation process. (See Color Insert.)
Finally, Figure 13 shows the subunits identified for the simulated data in Figure 8 (top and middle row), corresponding to phantoms constructed of PDB id 1igt and 2rec. Region growing by RFDT was applied to their respective fuzzy object representations (Figure 9, bottom row, left and middle). For the seed detection we used the same settings as for the experimental data and for the phantoms. For 2rec, the resolution is insufficient to actually resolve the six subunits; the seed detection method fails to detect six separate regions.
7. IDENTIFYING THE CORE OF AN ELONGATED MACROMOLECULE
The third structure in Section 4.3, namely PDB id 1q5a, was chosen to illustrate the use of CMFBs in the analysis of elongated structures. For elongated structures, identification of subunits is often not an issue. Instead, we are interested in tracking the structure and measuring, for example, its length or the thickness along it. To facilitate such measurements, the core, preferably a centrally located curve representing the original structure, can be used. For this purpose, the CMFBs are of interest. In Figure 14, the fuzzy objects Oi, CMFB(Oi, ⟨3, 4, 5⟩), and CMFBR(Oi, ⟨3, 4, 5⟩), i = 1, 2, where O1 is used to represent the phantom constructed of PDB id 1q5a at a resolution of 30 Å and grid size 5.74 Å in Figure 7, and O2 the simulated data of 1q5a in Figure 8, are shown surface-rendered. The numbers of voxels included are 5502 (O1), 2133 (CMFB(O1, ⟨3, 4, 5⟩)), and 1355 (CMFBR(O1, ⟨3, 4, 5⟩)) for the phantom and 3875 (O2), 2361 (CMFB(O2, ⟨3, 4, 5⟩)), and 1061 (CMFBR(O2, ⟨3, 4, 5⟩)) for the simulated data. Note that from CMFBR(Oi, ⟨3, 4, 5⟩), the support of Oi can be recovered using the RFDT. Hence, we have an efficient way of representing Oi while still preserving important aspects of its shape. Figure 14 clearly shows the effect of the limited angular range in the electron microscope. Because the sample can only be examined over 120°–140°,
FIGURE 14 From left to right: The fuzzy objects Oi, CMFB(Oi, ⟨3, 4, 5⟩), and CMFBR(Oi, ⟨3, 4, 5⟩), i = 1, 2, where O1 is used to represent the phantom constructed of PDB id 1q5a in Figure 7 and O2 the simulated data of 1q5a in Figure 8. Surface rendering is used in all subfigures.
the 3D reconstructions will have missing-data artefacts. In the case of an elongated, tubular structure such as 1q5a, the result is that the cross section along its main axis more resembles an ellipse than the expected disk. This causes CMFBR(O2, ⟨3, 4, 5⟩) to be a medial surface rather than a medial curve. For a fuzzy object O, CMFB(O, DF) [as well as CMFBR(O, DF)] constitutes a medial representation of O. One common difficulty when extracting a medial representation of a structure is the identification of end points. Intuitively, a medial representation can be obtained by "peeling" or thinning (i.e., iterative removal of all current border voxels until a curve has been obtained). This process, even when constraints are used for a topology-preserving thinning, evidently causes shortening of the structure. For CMFB(O, DF), all relevant protrusions will be represented. As the support of O can be recovered from CMFB(O, DF), this set appears, for some applications, to be a too-rich structure. For instance, if the interest is to measure the length of O1 or O2, CMFBR(O1, ⟨3, 4, 5⟩) and CMFBR(O2, ⟨3, 4, 5⟩) are preferably further reduced to curves, with the induced loss of information. We remark that CMFB(O, DF) is not always as dense as for the examples shown for 1q5a, but can constitute a set of disconnected voxels internally located in O. To facilitate subsequent analysis,
such a sparse set is preferably connected into a curve. Yu and Bajaj (2006) suggested extracting the centerline by tracing the eigenvectors of local structure tensors. Initially, they detect seeds corresponding to local maximal grey-level points in the 3D reconstructions. The seeds are then connected by starting from the seeds in two opposite directions and following the principal axis, defined by the eigenvector corresponding to the minimum eigenvalue of the local structure tensor. A similar approach could be applied starting from CMFBR(O, DF) and calculating the local structure tensors from the fuzzy object.
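The support-recovery property used in this section can be illustrated with a crisp 2D ⟨3, 4⟩ analogue of the ⟨3, 4, 5⟩ reverse transform (the fuzzy RFDT additionally weights the propagated values by the membership values; this sketch shows only the crisp chamfer propagation):

```python
import numpy as np

def reverse_chamfer_dt(radii):
    """Reverse <3,4>-chamfer distance transform (2D crisp sketch).

    radii holds the ball radius at each centre of a maximal ball and 0
    elsewhere; the returned image is positive exactly on the union of the
    balls, i.e. the support of the object is recovered."""
    r = radii.astype(float).copy()
    h, w = r.shape
    fwd = ((-1, -1, 4), (-1, 0, 3), (-1, 1, 4), (0, -1, 3))   # forward mask
    bwd = ((1, 1, 4), (1, 0, 3), (1, -1, 4), (0, 1, 3))       # backward mask
    for i in range(h):                 # forward raster scan
        for j in range(w):
            for di, dj, wgt in fwd:
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    r[i, j] = max(r[i, j], r[ni, nj] - wgt)
    for i in range(h - 1, -1, -1):     # backward raster scan
        for j in range(w - 1, -1, -1):
            for di, dj, wgt in bwd:
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    r[i, j] = max(r[i, j], r[ni, nj] - wgt)
    return r

# A single maximal ball of radius 7 at the centre of a 7 x 7 grid.
radii = np.zeros((7, 7)); radii[3, 3] = 7
r = reverse_chamfer_dt(radii)
```

Two raster scans suffice for chamfer masks, which is why the reverse transform is as cheap as the forward one.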
8. CONCLUSIONS
This chapter has focused on the use of fuzzy set methods in the shape analysis of structures in images with low signal-to-noise ratio (and thereby low contrast). I have described a theoretical framework, including the reverse fuzzy distance transform (RFDT), region growing by RFDT, as well as the concept of centers of maximal fuzzy balls (CMFBs), which are applied to fuzzy object representations of relevant structures. All are generalizations to a fuzzy setting of methods well known for crisp objects. These generalizations allow us to achieve a more robust analysis, especially when the input data are of low contrast and low resolution (i.e., when a crisp segmentation of the image into object and background is difficult). The output of region growing by RFDT is a crisp segmentation of a fuzzy object. Hence, we are still somewhere between a crisp and a truly fuzzy setting. For example, when addressing the relation between different regions, we consider crisp relations. It may be of interest to take one step further and use partial belongingness to a region and thereby obtain a fuzzy decomposition into regions, where points belong to a region up to a certain membership. Ideas like this, also including fuzzy adjacencies, are described by Bloch (2005). The second part of the chapter applies the methods to cryo-electron tomographic data of macromolecules. To show various aspects of the methods and to illustrate artefacts imposed by the image acquisition technique (and thereby further emphasize the need for robust methods tailored for this specific application), I use phantoms constructed from structures deposited in the Protein Data Bank, simulated data of the constructed phantoms, and experimental data. I believe that basing validation of methods on datasets with different resolutions and image quality is very relevant, especially as there is no way to create true ground-truth datasets when cryo-electron tomography is used.
One final remark regards the computation time required for the presented methods. In this chapter, no optimization with respect to computational efficiency has been done. Of course, this aspect needs to be considered before using the methods in large-scale projects. For the computation of the FDT
and the RFDT, sorted priority queues are preferable (Saha et al., 2002). Region growing by RFDT can be implemented in a manner similar to watershed segmentation (Vincent & Soille, 1991) to thereby obtain a computationally more efficient algorithm. The reduction of the CMFBs is still an open question. However, suggestions have been made for a solution in a crisp setting (Borgefors & Nyström, 1997).
ACKNOWLEDGMENTS
Joakim Lindblad and Robin Strand, Centre for Image Analysis, Uppsala, Sweden, and Nataša Sladoje, Faculty of Engineering, Novi Sad, Serbia, are acknowledged for scientific support regarding the theoretical parts of the manuscript. Ulf Skoglund, Lars-Göran Öfverstedt, and Lars Norlén, Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden, are acknowledged for scientific support regarding the electron tomography application. Hans Rullgård, Department of Mathematics, Stockholm University, Stockholm, Sweden, is gratefully acknowledged for implementing and providing the TEM simulator. The experimental cryo-ET data sets on the IgG antibody were provided by Sara Sandin, Division of Structural Studies, MRC Laboratory of Molecular Biology, Cambridge, United Kingdom (formerly at the Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden). The work presented here is part of a project conducted with Magnus Gedda, Centre for Image Analysis, Uppsala, Sweden, who is acknowledged for various valuable contributions. The author received financial support from the Swedish Research Council (Project 621-2005-5540) and the Visualisation Research Programme by the Knowledge Foundation, Vårdalstiftelsen (the Foundation for Health Care Sciences and Allergy Research), the Foundation for Strategic Research, VINNOVA, and Invest in Sweden Agency.
REFERENCES
Al-Amoudi, A., Castaño Díez, D., Betts, M. J., & Frangakis, A. S. (2007). The molecular architecture of cadherins in native epidermal desmosomes. Nature, 450(7171), 832–837.
Arcelli, C., & Sanniti di Baja, G. (1988). Finding local maxima in a pseudo-Euclidean distance transform. Computer Vision, Graphics, and Image Processing, 43(3), 361–367.
Baker, M. L., Yu, Z., Chiu, W., & Bajaj, C. (2006). Automated segmentation of molecular subunits in electron cryomicroscopy density maps. Journal of Structural Biology, 156, 432–441.
Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., et al. (2000). The Protein Data Bank. Nucleic Acids Research, 28(1), 235–242.
Beucher, S., & Lantuejoul, C. (1979). Use of watersheds in contour detection. In International workshop on image processing: Real-time edge and motion detection/estimation.
Birmanns, S., & Wriggers, W. (2007). Multi-resolution anchor-point registration of biomolecular assemblies and their components. Journal of Structural Biology, 157, 271–280.
Bloch, I. (2000). Geodesic balls in a fuzzy set and fuzzy geodesic mathematical morphology. Pattern Recognition, 33, 897–905.
Bloch, I. (2005). Fuzzy spatial relationships for image processing and interpretation: A review. Image and Vision Computing, 23, 89–110.
Blum, H. (1967). A transformation for extracting new descriptions of shape. In W. Wathen-Dunn (Ed.), Models for the perception of speech and visual form (pp. 362–380). Cambridge, MA: MIT Press.
Boggon, T. J., Murray, J., Chappuis-Flament, S., Wong, E., Gumbiner, B. M., & Shapiro, L. (2002). C-cadherin ectodomain structure and implications for cell adhesion mechanisms. Science, 296(5571), 1308–1313.
Bogomolny, A. (1987). On the perimeter and area of fuzzy sets. Fuzzy Sets and Systems, 23, 257–269.
Bongini, L., Fanelli, D., Svensson, S., Gedda, M., Piazza, F., & Skoglund, U. (2007). Resolving the geometry of biomolecules imaged by cryo electron tomography. Journal of Microscopy, 228, 174–184.
Borgefors, G. (1986). Distance transformations in digital images. Computer Vision, Graphics, and Image Processing, 34, 344–371.
Borgefors, G. (1996). On digital distance transforms in three dimensions. Computer Vision and Image Understanding, 64(3), 368–376.
Borgefors, G., & Nyström, I. (1997). Efficient shape representation by minimizing the set of centres of maximal discs/spheres. Pattern Recognition Letters, 18, 465–472.
Chanussot, J., Nyström, I., & Sladoje, N. (2005). Shape signatures of fuzzy star-shaped sets based on distance from the centroid. Pattern Recognition Letters, 26, 735–746.
Crowther, R. A., DeRosier, D. J., & Klug, A. (1970). The reconstruction of three-dimensional structure from projections and its application to electron microscopy. Proceedings of the Royal Society of London.
Series A, Mathematical and Physical Sciences, 317(1530), 319–340. ¨ Fanelli, D., & Oktem, O. (2008). Electron tomography: A short overview with an emphasis on the absorption potential model for the forward problem. Inverse Problems, 24, 013001. ˜ Garduno,, E., Wong-Barnum, M., Volkmann, N., & Ellisman, M. H. (2008). Segmentation of electron tomographic data sets using fuzzy set theory principles. Journal of Structural Biology, 162, 368–379. ¨ Gedda, M., Skoglund, U., Ofverstedt, L.-G., & Svensson, S. (2008). Image processing system for localising macromolecules in electron tomography data (submitted for publication). Gilbert, P. (1972). Iterative methods for the three-dimensional reconstruction of an object from projections. Journal of Theoretical Biology, 36(1), 105–117. Gordon, R., Bender, R., & Herman, G. T. (1970). Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography. Journal of Theoretical Biology, 29(3), 471–481. Harris, L. J., Larson, S. B., Hasel, K. W., & Mcpherson, A. (1997). Refined structure of an intact IgG2a monoclonal antibody. Biochemistry, 36, 1581–1597. He, W., Cowin, P., & Stokes, D. L. (2003). Untangling desmosomal knots with electron tomography. Science, 302(5642), 109–113. Levi, G., & Montanari, U. (1970). A grey-weighted skeleton. Information and Control, 17, 62–91. Meyer, F. (1994). Topographic distance and watershed lines. Signal Processing, 38, 113–125. Meyer, F., & Beucher, S. (1990). Morphological segmentation. Journal of Visual Communication and Image Representation, 1(1), 21–46. Narasimha, R., Aganj, I., Bennett, A. E., Borgnia, M. J., Zabransky, D., Sapiro, G., et al. (2008). Evaluation of denoising algorithms for biological electron tomography. Journal of Structural Biology, 164(1), 7–17. Nguyen, H. (2008). Shape-driven three-dimensional watersnake segmentation of biological membranes in electron tomography. IEEE Transactions on Medical Imaging, 27(5), 616–628. 
Nyström, I., & Borgefors, G. (1995). Synthesising objects and scenes using the reverse distance transformation in 2D and 3D. In C. Braccini, L. D. Floriani, & G. Vernazza (Eds.), Proceedings of ICIAP'95: Image Analysis and Processing (pp. 441–446). Berlin: Springer-Verlag.
Stina Svensson
Perona, P., & Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7), 629–639.
Philipp-Foliguet, S., Vieira, M. B., & De Albuquerque Araújo, A. (2001). Segmentation into fuzzy regions using topographic distance. In Proceedings of the XIV Brazilian Symposium on Computer Graphics and Image Processing (pp. 282–288). Washington, DC: IEEE Computer Society.
Pittet, J.-J., Henn, C., Engel, A., & Heymann, J. B. (1999). Visualizing 3D data obtained from microscopy on the internet. Journal of Structural Biology, 125, 123–132.
Rosenfeld, A. (1979). Fuzzy digital topology. Information and Control, 40(1), 76–87.
Rosenfeld, A. (1984). The fuzzy geometry of image subsets. Pattern Recognition Letters, 2, 311–317.
Rosenfeld, A., & Pfaltz, J. L. (1966). Sequential operations in digital picture processing. Journal of the Association for Computing Machinery, 13(4), 471–494.
Rullgård, H., Öktem, O., & Skoglund, U. (2007). A component-wise iterated relative entropy regularization method with updated prior and regularization parameters. Inverse Problems, 23, 2121–2139.
Rutovitz, D. (1968). Data structures for operations on digital images. In G. C. Cheng, D. K. Pollock, & A. Rosenfeld (Eds.), Pictorial Pattern Recognition (pp. 105–133). Washington, DC: Thompson.
Saha, P. K., Wehrli, F. W., & Gomberg, B. R. (2002). Fuzzy distance transform: Theory, algorithms, and applications. Computer Vision and Image Understanding, 86, 171–190.
Sandin, S., Öfverstedt, L.-G., Wikström, A.-C., Wrange, Ö., & Skoglund, U. (2004). Structure and flexibility of individual immunoglobulin G molecules in solution. Structure, 12(3), 409–415.
Sanniti di Baja, G. (1994). Well-shaped, stable, and reversible skeletons from the (3, 4)-distance transform. Journal of Visual Communication and Image Representation, 5, 107–115.
Sanniti di Baja, G., & Svensson, S. (2002). A new shape descriptor for surfaces in 3D images. Pattern Recognition Letters, 23(6), 703–711. (Special issue on Discrete Geometry for Computer Imagery.)
Sethian, J. A. (1999). Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science. Cambridge University Press.
Skoglund, U., Öfverstedt, L.-G., Burnett, R. M., & Bricogne, G. (1996). Maximum-entropy three-dimensional reconstruction with deconvolution of the contrast transfer function: A test application with adenovirus. Journal of Structural Biology, 117, 173–188.
Sladoje, N., & Lindblad, J. (2007). Representation and reconstruction of fuzzy discs by moments. Fuzzy Sets and Systems, 158(5), 517–534.
Smedby, Ö., Svensson, S., & Löfstrand, T. (1999). Greyscale connectivity concept for visualizing MRA and CTA volumes. In S. K. Mun & Y. Kim (Eds.), Medical imaging 1999: Image display (Vol. 3658 of Proceedings of SPIE) (pp. 212–219). SPIE–The International Society for Optical Engineering.
Soille, P. (1994). Generalized geodesy via geodesic time. Pattern Recognition Letters, 15, 1235–1240.
Soille, P. (1999). Morphological Image Analysis. Berlin: Springer-Verlag.
Svensson, S. (2007a). Centres of maximal balls extracted from a fuzzy distance transform. In G. J. F. Banon, J. Barrera, U. de M. Braga-Neto, & N. S. T. Hirata (Eds.), Proceedings of the 8th international symposium on mathematical morphology, Vol. 2 (pp. 19–20). Available from http://urlib.net/dpi.inpe.br/ismm@80/2007/06.13.10.08.
Svensson, S. (2007b). A decomposition scheme for 3D fuzzy objects based on fuzzy distance information. Pattern Recognition Letters, 28(2), 224–232.
Svensson, S. (2008). Aspects on the reverse fuzzy distance transform. Pattern Recognition Letters, 29(7), 888–896.
Svensson, S., Gedda, M., Fanelli, D., Skoglund, U., Öfverstedt, L.-G., & Sandin, S. (2006). Using a fuzzy framework for delineation and decomposition of immunoglobulin G in cryo electron tomographic images. In Y. Y. Tang, S. P. Wang, G. Lorette, D. S. Yeung, & H. Yan (Eds.), Proceedings of the 18th International Conference on Pattern Recognition: Vol. 4 (pp. 520–523).
Svensson, S., & Sanniti di Baja, G. (2002). Using distance transforms to decompose 3D discrete objects. Image and Vision Computing, 20(8), 529–540.
Udupa, J. K., & Saha, P. K. (2003). Fuzzy connectedness and image segmentation. Proceedings of the IEEE, 91(10), 1649–1669.
Udupa, J. K., & Samarasekera, S. (1996). Fuzzy connectedness and object definition: Theory, algorithms, and applications in image segmentation. Graphical Models and Image Processing, 58, 246–261.
Vincent, L. (1993). Morphological grayscale reconstruction in image analysis: Applications and efficient algorithms. IEEE Transactions on Image Processing, 2(2), 176–201.
Vincent, L., & Soille, P. (1991). Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(6), 583–597.
Volkmann, N. (2002). A novel three-dimensional variant of the watershed transform for segmentation of electron density maps. Journal of Structural Biology, 138, 123–129.
Wriggers, W., Milligan, R. A., & McCammon, J. A. (1999). Situs: A package for docking crystal structures into low-resolution maps from electron microscopy. Journal of Structural Biology, 125, 185–195.
Yu, X., & Egelman, E. H. (1997). The RecA hexamer is a structural homologue of ring helicases. Nature Structural & Molecular Biology, 4(2), 101–104.
Yu, Z., & Bajaj, C. (2005). Automatic ultrastructure segmentation of reconstructed CryoEM maps of icosahedral viruses. IEEE Transactions on Image Processing, 14(9), 1324–1337.
Yu, Z., & Bajaj, C. (2006). Computational approaches for automatic structural analysis of large bio-molecular complexes. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1, 1–15.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–353.
Chapter 5

Anchors of Morphological Operators and Algebraic Openings

M. Van Droogenbroeck
Contents
1. Introduction
 1.1. Terminology and Scope
 1.2. Toward Openings
 1.3. Anchors
2. Morphological Anchors
 2.1. Set and Function Operators
 2.2. Theory of Morphological Anchors
 2.3. Local Existence of Anchors
 2.4. Algorithmic Properties of Morphological Anchors
3. Anchors of Algebraic Openings
 3.1. Spatial and Shift-Invariant Openings
 3.2. Granulometries
4. Conclusions
References
1. INTRODUCTION

Over the years, mathematical morphology, a theory initiated by Matheron (1975) and Serra (1982), has grown into a major theory in the field of nonlinear image processing. Tools of mathematical morphology, such as morphological filters, the watershed transform, and connectivity operators, are now widely available in commercial image-processing software packages, and the theory itself has expanded considerably over the past decade (Najman & Talbot, 2008). This expansion includes new operators, algorithms, methodologies, and concepts that have led mathematical morphology to become part of the mainstream of image analysis and image-processing technologies.

University of Liège, Department of Electrical Engineering and Computer Science, Montefiore, Sart Tilman, Liège, Belgium
Advances in Imaging and Electron Physics, Volume 158, ISSN 1076-5670, DOI: 10.1016/S1076-5670(09)00010-X. Copyright © 2009 Elsevier Inc. All rights reserved.
The growth in popularity is due not only to the theoretical work of some pioneers but also to the development of powerful tools for image processing, such as granulometries (Matheron, 1975), pattern spectrum analysis-based techniques (Maragos, 1989) that provide insights into shapes, and transforms like the watershed (Beucher & Lantuéjoul, 1979; Vincent & Soille, 1991) or connected operators (Salembier & Serra, 1995) that help to segment an image. All these operators have been studied intensively, and tractable algorithms have been found to implement them effectively, that is, in real time on an ordinary desktop computer. Historically, mathematical morphology is considered a theory concerned with the processing of images using operators based on topological and geometrical properties. According to Heijmans (1994), the first books on mathematical morphology discuss a number of mappings on subsets of the Euclidean plane, which have in common that they are based on set-theoretical operations (union, intersection, complementation) as well as translations. More recently, researchers have extended morphological operators to arbitrary complete lattices, a step that has paved the way to more general algebraic frameworks. The geometrical interpretation of mathematical morphology relates to the use of a probe, a set called the structuring element. The basic idea in binary morphology is to probe an image with the shape of the structuring element and draw conclusions on how this shape fits or misses the regions of that image. Consequently, there are as many interpretations of an image as structuring elements, although one often falls back on a small subset of structuring elements, such as lines, squares, balls, or hexagons. In geometrically motivated approaches to mathematical morphology, the focus clearly lies on the shape and size of the structuring element. Algebraic approaches do not refer to geometrical or topological concepts.
They concentrate on the properties of operators. Consequently, algebraic approaches embrace larger classes of operators and functions. But in both approaches the goal is to characterize operators to help design solutions useful for image processing applications.
1.1. Terminology and Scope

Let R and Z be the sets of all real and integer numbers, respectively. In this chapter, we consider transformations and functions defined on a space E, which is the continuous Euclidean space Rn or the discrete space Zn, where n ≥ 1 is an integer. Elements of E are written p, x, . . ., while subsets of elements are denoted by upper-case letters X, Y, . . . Subsets of R2 are also called binary sets or images because two colors suffice to draw them (elements of X are usually drawn in black, and elements that do not belong to X in white). Next we introduce an order on P(E), the power set comprising all subsets of E, that results from the usual inclusion notion: X is smaller than or equal to Y if
and only if X ⊆ Y. Note that this order encompasses the case of equality, in contrast to strict inclusion ⊂. The power set P(E) with the inclusion ordering is a complete lattice (Heijmans, 1994). This chapter also deals with images. Images are modeled as functions (denoted f, g, . . .) that map a non-empty subset E of the space E into R, where R is a set of binary values {0, 1}, a discrete set of grey-scale values {0, . . . , 255}, or a closed interval [0, 255], defining, respectively, binary, grey-scale, or continuous images. The ordering relation ≤ is given by "f ≤ g if and only if f(x) ≤ g(x) for every x ∈ E". It can be shown that the space Fun(E, R) of all functions together with ≤ forms a complete lattice, which means, among other things, that every subset of the lattice has an infimum and a supremum. The frameworks of P(E) and ⊆, or Fun(E, R) and ≤, are equivalent as long as we deal with complete lattices; most results can thus be transposed from one framework to the other. In the following, we arbitrarily decide to restrict functions to single-valued grey-scale images. Also, for convenience, we use the single term operators to refer to operations that map sets of E into E or handle Fun(E, R). In addition, we restrict the scope of this chapter to operators that map sets to sets, or functions to functions. This implies that both the input and the output lattices are the same or, equivalently, that there is only one complete lattice under consideration. Operators are denoted by Greek letters ε, δ, γ, ψ, . . .

1.1.1. The Notion of Idempotence

Linear filters and operators are common in many engineering fields that process signals. As an analogy, remember that every computer with a sound card contains a hardware module that prefilters the acquired signal before sampling to prevent aliasing. Likewise, digital signals are converted to analog signals by means of a low-pass filter. The concept of a reference filter, called an ideal filter, is often used to characterize linear filters.
Ideal filters have a binary transmittance with only two values, 0 or 1: either they retain a frequency or they drop it. Ideal filters stabilize the result after a single pass; further applications no longer modify the result in the spectral domain (nor in the spatial domain!). Unfortunately, it can be shown that because practical linear filters have a finite kernel in the spatial domain, they cannot be ideal. It might be impossible to build ideal filters, but they nevertheless serve as a reference because it is pointless to repeat the filtering process. The notion of frequency is irrelevant in nonlinear image processing. Nonlinear operators operate in the initial domain, either globally or locally. An important class of nonlinear operators computes rank statistics inside a moving window. For example, the median operator, which selects the median from the collection of values taken inside a moving local window centered on x and allocates that median value to ψ(f)(x), is known to be efficient for the removal of salt-and-pepper noise (Figure 1). However, in some cases, the median filter oscillates, as shown in Figure 2.
FIGURE 1 Effect of a median filter. (a) Original grey-scale image, (b) original image corrupted by noise, and (c) filtered image.
FIGURE 2 Successive applications of a three-pixel wide median filter on a binary image may result in oscillations.
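The oscillation of Figure 2 can be reproduced numerically. The sketch below is illustrative rather than a reconstruction of the figure: it assumes a plus-shaped five-pixel window (the center plus its four direct neighbors) and periodic borders, choices under which a binary checkerboard is inverted by every pass of the median.

```python
from statistics import median

def cross_median(img):
    """Median over the plus-shaped window {center, N, S, E, W}, with
    periodic (wrap-around) borders -- an illustrative boundary choice."""
    h, w = len(img), len(img[0])
    return [[int(median([img[y][x],
                         img[(y - 1) % h][x], img[(y + 1) % h][x],
                         img[y][(x - 1) % w], img[y][(x + 1) % w]]))
             for x in range(w)] for y in range(h)]

# On a checkerboard, every pixel disagrees with its four neighbors, so the
# window holds one copy of the pixel value and four copies of the opposite
# value: the median flips each pixel, and the next pass flips it back.
board = [[(y + x) % 2 for x in range(8)] for y in range(8)]
once = cross_median(board)
twice = cross_median(once)
assert once != board and twice == board   # a period-2 oscillation
```

The signal never converges to a root: the filter keeps alternating between the checkerboard and its complement, which is exactly the kind of behavior that motivates the study of root signals below.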
Oscillations may not be common in practice, but they still cast doubts on the significance of the output function. Therefore, the behavior of median operators has been characterized in terms of root signals. Technically, a root signal f of an operator, sometimes called a fixed point, is a function that is invariant to the application of that operator at each location: ∀x ∈ E, ψ(f)(x) = f(x), where f is the root signal. The existence of root signals is not restricted to the median operator. Let us consider a simple example of a one-dimensional constant function g(x) = k and a linear filter with an impulse response h(x). The filtered signal r(x) is the convolution of g(x) by h(x):

r(x) = ∫_{−∞}^{+∞} g(t − x) h(t) dt = k ∫_{−∞}^{+∞} h(t) dt = k H(0),

where H(0) is the Fourier transform of h(t) taken for f = 0. In the case of an ideal filter, H(0) = 1, so that r(x) = k = g(x). This illustrates that, up to a constant, root signals for linear operators typically include constant-valued or straight-line signals. For nonlinear operators, there is a property somewhat similar to that of an ideal filter for linear processing: mathematical morphology uses a property called idempotence.
Definition 1. Consider an operator ψ on a complete lattice (L, ≤). The operator ψ is idempotent if and only if

ψ(ψ(X)) = ψ(X)    (1)

for any X of L. In the following, we use this notation or the operator composition ψψ = ψ.

In contrast to the case of ideal filters, it is possible to implement idempotent operators. Therefore, the property is part of the design of filters and not just a goal; idempotence is chosen as one of the compulsory properties of the operator. This explains why many idempotent operators have been proposed: algebraic filters, morphological openings, attribute openings, etc. Idempotence might be one of the requirements in the design of an operator, but it does not suffice! For example, an operator ψ that maps every function to g(x) = −∞ is idempotent but useless. Hereafter, we present additional properties that complete the algebraic framework and elaborate on the formal definition of openings.
1.2. Toward Openings

By definition, a notion of order exists on a complete lattice (L, ≤). The property of increasingness guarantees that an order between objects in the lattice is preserved (remember that we deal only with objects that belong to a unique lattice, that is, X, ψ(X) ∈ L, and there is only one notion of order ≤). That is,

Definition 2. The lattice operator ψ is called increasing if X ≤ Y implies ψ(X) ≤ ψ(Y) for every X, Y ∈ L.

Let ψ be an operator on L, and let X be an element of L for which ψ(X) = X. Then X is called invariant under ψ or, alternatively, a fixed point of ψ. The set of all elements invariant under ψ is denoted Inv(ψ) and is called the invariance domain of ψ. Tarski's fixed-point theorem specifies that the invariance domain Inv(ψ) of an increasing operator ψ on a complete lattice is nonempty. Increasingness builds a bridge between ordering relations before and after the operator. But, as X and ψ(X) are defined on the same lattice, one can also compare X to ψ(X), leading us to define additional properties:

Definition 3. Let X, Y be two sets (or functions) of a lattice (L, ≤). An operator ψ on L is called
• extensive if ψ(X) ≥ X for every X ∈ L;
• anti-extensive if ψ(X) ≤ X for every X ∈ L.

In more practical terms, increasingness tells us if an order in the source lattice is preserved in the destination lattice, idempotence if the application
FIGURE 3 Illustrations of increasingness and anti-extensivity. (a) and (b) Original grey-scale image f(x) and after processing with an opening operator γ; (c) and (d) similar displays for a whiter image g(x).
of the operator stabilizes the result, and extensivity if the result is smaller or larger than its source. Figure 3 illustrates these remarks: f(x) ≤ g(x) implies γ(f) ≤ γ(g), γ(f(x)) ≤ f(x), and γ(g(x)) ≤ g(x). When an operator ψ is both increasing and idempotent, it is called an algebraic filter. Regarding the extensivity property, there are two types of algebraic filters: anti-extensive and extensive algebraic filters are respectively called algebraic openings and algebraic closings. Openings and closings share the common properties of increasingness and idempotence, but are dual with respect to extensivity. Thanks to this duality, we can limit the scope of this chapter to openings; handling closings brings similar results. Figure 4 shows the effect of an algebraic opening Qn that rounds grey-scale values down to the closest inferior multiple of an integer n; this operator is called quantization in signal processing. It is increasing, anti-extensive, and idempotent, but it should be noted that if no value f(x) is a multiple of n, then
FIGURE 4 Quantization operator. (a) Original grey-scale image f(x), (b) image Q10(f(x)) rounded to the closest inferior multiple of 10, and (c) image Q50(f(x)) rounded to the closest inferior multiple of 50.
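The quantization operator fits in one line of code. The snippet below is an illustrative sketch (the name Q and the sample grey values are chosen for this example); it checks the three properties claimed above on a pair of pointwise-ordered signals.

```python
def Q(n, f):
    """Round every grey value down to the closest inferior multiple of n."""
    return [n * (v // n) for v in f]

f = [7, 23, 50, 119, 255]
g = [9, 30, 50, 200, 255]                                # f <= g pointwise

assert all(a <= b for a, b in zip(Q(10, f), Q(10, g)))   # increasing
assert all(a <= b for a, b in zip(Q(10, f), f))          # anti-extensive
assert Q(10, Q(10, f)) == Q(10, f)                       # idempotent
```

Since Qn is increasing, anti-extensive, and idempotent, it is an algebraic opening in the sense of the definitions above, even though it involves no geometry at all.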
f(x) ≠ ψ(f(x)) for all x ∈ E. In other words, quantization may produce values that are not present in the original image and thus have questionable statistical significance. Note that the definition of order is a pointwise property: f(x) is compared with g(x) or γ(f(x)), but not with the value at a different location (called a pixel in image analysis). In practice, however, neighboring pixels share some common physical significance, which, for example, rank operators exploit. A rank operator of rank k within a discrete sliding window centered at a given location x is obtained by sorting the values falling inside the window in ascending order and selecting the kth value of the sorted list as the output value for x. Some of the best-known rank operators are the local minimum and maximum operators. In mathematical morphology, these operators are referred to as erosion and dilation, respectively, and the window itself is termed a structuring element or a structuring set. These filters are also referred to as min- or max-filters in the literature. The interaction between neighboring pixels introduced by rank operators is the reason why their characterization becomes more challenging. Consider a complete lattice (L, ≤). To elaborate on the notion of neighborhood, we propose the definition of a property called spatiality.

Definition 4. An operator ψ on (L, ≤) is said to be spatial if for every location x ∈ E and for every function f, there exists y ∈ E such that

ψ(f)(x) = f(y),    (2)

and at least one y is different from x. The trivial case of x = y for every x ∈ E is thus excluded. As explained previously, the quantization operator is not spatial because it does not consider the neighborhood of x.
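The rank operator described above can be sketched in a few lines. The code below is illustrative (replicated borders are an assumed boundary rule, not part of the definition); it shows erosion, dilation, and the median as the lowest, highest, and middle ranks of a three-pixel window.

```python
def rank_filter(f, k, width):
    """Rank-k operator: sort the values in an odd-width centered window
    and keep the k-th smallest (1-based); borders are replicated."""
    n, r = len(f), width // 2
    out = []
    for x in range(n):
        window = sorted(f[min(max(x + d, 0), n - 1)] for d in range(-r, r + 1))
        out.append(window[k - 1])
    return out

f = [3, 7, 2, 9, 4]
erosion  = rank_filter(f, 1, 3)   # local minimum (morphological erosion)
dilation = rank_filter(f, 3, 3)   # local maximum (morphological dilation)
med      = rank_filter(f, 2, 3)   # median

# The extreme ranks bracket the signal pointwise.
assert all(e <= v <= d for e, v, d in zip(erosion, f, dilation))
```

The bracketing assertion is a direct consequence of x belonging to its own window: the minimum over the window can only decrease f(x) and the maximum can only increase it.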
Assuming that an image is the result of an observation, the smaller the choice of the neighborhood for finding y, the higher the physical correlation between pixels will be. On purpose, there is no notion of distance between x and y in the definition of spatiality, although one hopes that operators with a reasonable physical significance restrict the search for y to a close neighborhood of x. Spatiality constrains operators to select values in the neighborhood of a pixel. But a twofold underlying question remains: (1) does an operator drive any input to a root signal (this is called the convergence property), and (2) if not, do oscillations propagate? Root signals have been studied with a particular emphasis on median filters (Arce & Gallagher, 1982; Arce & McLoughlin, 1987; Astola, Heinonen, & Neuvo, 1987; Eberly, Longbotham, & Aragon, 1991; Eckhardt, 2003; Gallagher & Wise, 1981). The convergence property is of no particular interest for idempotent operators, as ψ(ψ(f)) = ψ(f), so that the question becomes that of determining the subset of locations x ∈ E such that f(x) = ψ(f)(x). In other words, the study of the invariance domain Inv(ψ) is a key to a better understanding of ψ; indeed, characterizing locations for a function f with respect to ψ can help implement the operator (as shown in Van Droogenbroeck and Buckley, 2005).
1.3. Anchors

To analyze the behavior of some operators, we introduce the concept of anchors, which can be seen as an extension of that of roots: an anchor is essentially a version of the root notion whose domain of definition is reduced to a subset of E.

Definition 5. Given a signal f and an operator ψ on a complete lattice (L, ≤), the pair comprising a location x in the domain of definition of f and the value f(x) is an anchor for f with respect to ψ if

ψ(f)(x) = f(x).    (3)

In marketing terms, one would say "the right value at the right place." The set of anchors is denoted A_ψ(f). Note that Definition 5 differs from the initial definition provided in Van Droogenbroeck and Buckley (2005) to emphasize the role of both the location x and the value f(x). We provide an illustration in Figure 5. In this particular case, there is no evidence that anchors should always exist: take a grey-scale image whose values f(x) are all odd; then Q2(f) has no anchor, although f and Q2(f) look identical. The existence of anchors is an open issue. It is also interesting to determine whether an order between operators implies a similar inclusion order between anchor sets. In general, γ1 ≤ γ2 does not guarantee an inclusion of their respective anchor sets. However, drawings (d) and (e) of
FIGURE 5 Quantization operator and anchors, whose locations are drawn in black in (d) and (e). (a) Original image f(x), (b) image rounded to the closest inferior multiple of 10, (c) image rounded to the closest inferior multiple of 50, (d) and (e), respectively, anchors of (b) and (c).
Figure 5 suggests that A_Q50(f) ⊆ A_Q10(f), which is true in this case because, in addition to Q50(f) ≤ Q10(f), Q50(f) "absorbs" Q10(f); more precisely, it is required that Q50(f) = Q50(Q10(f)) (see Theorem 8). Figure 6 shows anchors of two other common operators, morphological erosions and openings (detailed further in Section 2). Papers dealing with roots, convergence, or invariance domains focus either on the operator itself or on the entire signal. Anchors characterize a function locally, but they also help in finding algorithms or interpreting existing ones. Van Droogenbroeck and Buckley (2005) presented algorithms applicable to morphological operators based on linear structuring elements and showed how they offer an alternative to implementations like that of van Herk (1992). In this chapter, we use an algebraic framework, with an eye on the geometrical notions, to expose the notion of anchors. The remainder of this chapter is organized as follows. Section 2 recalls several definitions and
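Anchor sets are directly computable from Definition 5. The sketch below uses illustrative names and sample values (not taken from Figure 5); it confirms the two observations made above: an image with only odd values has no anchors under Q2, and on this sample A_Q50(f) ⊆ A_Q10(f).

```python
def Q(n, f):
    """Quantization: round each grey value down to a multiple of n."""
    return {x: n * (v // n) for x, v in f.items()}

def anchors(f, psi_f):
    """A_psi(f): the locations x where psi(f)(x) = f(x)."""
    return {x for x in f if psi_f[x] == f[x]}

f = {0: 40, 1: 55, 2: 100, 3: 150, 4: 163}      # location -> grey value
A10 = anchors(f, Q(10, f))                      # values that are multiples of 10
A50 = anchors(f, Q(50, f))                      # values that are multiples of 50
assert A50 <= A10                               # here Q50(f) = Q50(Q10(f))

odd = {0: 3, 1: 7, 2: 255}                      # all values odd
assert anchors(odd, Q(2, odd)) == set()         # Q2 has no anchor at all
```

The inclusion holds for this pair because every multiple of 50 is a multiple of 10, which is exactly the absorption condition Q50(f) = Q50(Q10(f)) mentioned above.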
FIGURE 6 Illustration of anchors [marked in black in (c) and (e)]. (a) Original image f, (b) the image eroded by a 3 × 3 square structuring element, (c) anchor locations of ε_B(f), (d) the image opened by a 3 × 3 square structuring element, and (e) anchor locations of γ_B(f).
details theoretical results valid for morphological operators; anchors related to morphological operators are called morphological anchors. This section rephrases many results presented in Van Droogenbroeck and Buckley (2005). Section 3 extends the notion of anchors to the framework of algebraic operators. In particular, we present the concept of algebraic anchors, which applies to algebraic openings and closings. The major contribution is the proof that, although some operators might have no anchors (remember the case of the quantization operator Q2 on an image filled with odd grey-scale values), classes of openings and closings other than their morphological "brothers" have anchors, too.
2. MORPHOLOGICAL ANCHORS

After a brief reminder on basic morphological operators, we emphasize the role of anchors in the context of erosions and openings by discussing their
existence and density. It is shown that anchors are intimately related to morphological openings and closings (their duals), and that the existence of anchors is guaranteed for openings. Furthermore, it is possible to derive properties useful for the implementation of erosions and openings. Section 3 generalizes a few results to the case of algebraic openings.
2.1. Set and Function Operators

If E is the continuous Euclidean space Rn or the discrete space Zn, then the translation of x by b is given by x + b. To translate a given set X ⊆ E by a vector b ∈ E, it is sufficient to translate all the elements of X by b: X_b is defined by X_b = {x + b | x ∈ X}. Due to the commutativity of +, X_b is equivalent to b_X, where b_X is the translate of b by all elements of X. Let us consider two subsets X and B of E. The erosion and dilation of X by a set B are respectively defined as

X ⊖ B = ∩_{b∈B} X_{−b} = {p ∈ E | B_p ⊆ X},    (4)

X ⊕ B = ∪_{b∈B} X_b = ∪_{x∈X} B_x = {x + b | x ∈ X, b ∈ B}.    (5)
For X ⊕ B, X and B are interchangeable, but not for the erosion, where it is required that B_p be contained within X. Note that there are as many erosions as sets B. As B serves to highlight some geometrical characteristics of X, it is called a structuring element or structuring set. Although the window shape might be arbitrary, it is common practice in applied image analysis to use linear, rectangular, or circular structuring elements. If B contains the origin o,

X ⊖ B = ∩_{b∈B} X_{−b} = (∩_{b∈B\{o}} X_{−b}) ∩ X,    (6)
which is included in X. Therefore, if o ∈ B, the erosion and dilation are, respectively, anti-extensive and extensive. In addition, both operators are increasing but not idempotent. Because erosions and dilations are, respectively, anti-extensive and extensive (when the structuring element contains the origin), the cascade of an erosion and a dilation suggests itself. This set, denoted X ◦ B, is called the opening of X by B and is defined by

X ◦ B = (X ⊖ B) ⊕ B.    (7)

Similarly, the closing of X by B is the dilation of X followed by the erosion, both with the same structuring element. It is denoted by X • B and defined by X • B = (X ⊕ B) ⊖ B. Dilations and erosions are closely related although
FIGURE 7 Opening and closing with a ball B.
not inverse operators. A precise relation between them is expressed by the duality principle (Serra, 1982), which states that

X ⊖ B = (X^c ⊕ B̌)^c  or  X ⊕ B = (X^c ⊖ B̌)^c,    (8)

where the complement of X, denoted X^c, is defined as X^c = {p ∈ E | p ∉ X}, and the symmetric or transposed set of B ⊆ E is the set B̌ defined as B̌ = {−b | b ∈ B}. Therefore, all statements concerning erosions and openings have an equivalent form for dilations and closings, and vice versa. When B contains the origin, X ⊖ B is the set of locations p that satisfy B_p ⊆ X. When a dilation is applied to this set, the resulting set sums B-like contributions, which are the translates B_p. So X ◦ B is the union of the B_p that fit into X:

X ◦ B = ∪ {B_p | B_p ⊆ X}.    (9)
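These set definitions translate directly into code. The sketch below is an illustrative implementation over finite point sets (the 8 × 8 universe, the 5 × 5 square, and the 2 × 2 structuring element are arbitrary choices made for this example); it checks the duality (8) within the finite universe, together with the anti-extensivity and idempotence of the opening.

```python
def dilate(X, B):
    """X ⊕ B = {x + b | x in X, b in B}."""
    return {(x + u, y + v) for (x, y) in X for (u, v) in B}

def erode(X, B):
    """X ⊖ B = {p | B_p ⊆ X}; candidate p's come from translating X by -b0."""
    (u0, v0) = next(iter(B))
    candidates = {(x - u0, y - v0) for (x, y) in X}
    return {(px, py) for (px, py) in candidates
            if all((px + u, py + v) in X for (u, v) in B)}

def opening(X, B):                    # X ◦ B = (X ⊖ B) ⊕ B, Eq. (7)
    return dilate(erode(X, B), B)

U = {(x, y) for x in range(8) for y in range(8)}        # finite universe
X = {(x, y) for x in range(1, 6) for y in range(1, 6)}  # a 5 x 5 square
B = {(0, 0), (1, 0), (0, 1), (1, 1)}                    # 2 x 2, contains o

O = opening(X, B)
assert erode(X, B) <= X and O <= X          # anti-extensive since o in B
assert opening(O, B) == O                   # the opening is idempotent
Bt = {(-u, -v) for (u, v) in B}             # transposed set B̌
assert erode(X, B) == U - dilate(U - X, Bt) # duality (8), inside U
```

The duality check uses the complement taken inside U; this is safe here because X stays far enough from the border of U that no complement point outside U contributes new locations.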
In addition, it can be shown that X ◦ B is identical to X ◦ B_p, so that the opening does not depend on the position of the origin when choosing B. The interpretation of X ◦ B as the union ∪{B_p | B_p ⊆ X} is referred to as the geometrical interpretation of the morphological opening. A similar interpretation holds for the closing: the closing is the complementary set of the union of all the translates B_p contained in X^c. Figure 7 illustrates an opening and a closing with a ball. The geometrical interpretation suffices to prove that if X ◦ B is not empty, then there are at least #(B) anchors, where #(B) denotes the cardinality or area of B. The existence of anchors for X ⊖ B is less trivial; assume that X is a
Anchors of Morphological Operators and Algebraic Openings
185
chessboard and B = {p}, where p is located at a distance of one square of the chessboard. In this case, X ⊖ B = X_{−p} and X ∩ X_{−p} = ∅; the anchor set of the erosion is empty. To the contrary, if o ∈ B and X ⊖ B is not empty, then the erosion of X by B has anchors.

In the following, we define operators on grey-scale images and then discuss the details of anchors related to erosions and openings. The previous definitions can be extended to binary and grey-scale images. If f is a function and b ∈ E, then the spatial translate of f by b is defined by f_b(x) = f(x − b). The spatial translate is also called the horizontal translate. The vertical translate, used later in this chapter, of a function f by a value v is defined by f^v(x) = f(x) + v. The vertical translate shifts the function values in the grey-scale domain. The erosion of a function f by a structuring element B is denoted by ε_B(f)(x) and is defined as the infimum of the translates of f by the elements −b, where b ∈ B:

ε_B(f)(x) = ⋀_{b∈B} f_{−b}(x) = ⋀_{b∈B} f(x + b).   (10)
Likewise, we define the dilation of f by B, δ_B(f)(x), as

δ_B(f)(x) = ⋁_{b∈B} f_b(x) = ⋁_{b∈B} f(x − b).   (11)
Note that we consider so-called flat structuring elements; more general definitions using non-flat structuring elements exist, but they are not considered here. Just as for sets, the morphological opening γ_B(f) and closing φ_B(f) are defined as compositions of erosion and dilation operators:

γ_B(f) = δ_B(ε_B(f)),   (12)
φ_B(f) = ε_B(δ_B(f)).   (13)
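Equations (10)–(13) can be reproduced directly with off-the-shelf flat operators. The sketch below (SciPy's ndimage, an illustrative assumption; the test image is random) composes erosion and dilation as in Eqs. (12) and (13) and checks the ordering ε_B(f) ≤ γ_B(f) ≤ f ≤ φ_B(f) ≤ δ_B(f) that holds when B contains the origin.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(1)
f = rng.integers(0, 256, size=(64, 64)).astype(float)
size = (5, 5)                                   # flat 5 x 5 structuring element

eroded  = ndimage.grey_erosion(f, size=size)    # ε_B(f), Eq. (10)
dilated = ndimage.grey_dilation(f, size=size)   # δ_B(f), Eq. (11)

opening = ndimage.grey_dilation(eroded, size=size)   # γ_B(f), Eq. (12)
closing = ndimage.grey_erosion(dilated, size=size)   # φ_B(f), Eq. (13)

# the compositions agree with the library's built-in composites
assert np.array_equal(opening, ndimage.grey_opening(f, size=size))
assert np.array_equal(closing, ndimage.grey_closing(f, size=size))

# ordering for a structuring element containing the origin
assert (eroded <= opening).all() and (opening <= f).all()
assert (f <= closing).all() and (closing <= dilated).all()
```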
Figure 8 shows the effects of several morphological operators on an image. Again, ε_B(f) and δ_B(f), and γ_B(f) and φ_B(f), are duals of each other (Serra, 1982), which is interpreted as stating that they process the foreground and the background symmetrically. If, by convention, we choose to represent low values with dark pixels in an image (background) and large values with white pixels (foreground), erosions enlarge dark areas and shrink the foreground. From all the previous definitions, it can be seen that erosions, dilations, openings, and closings are spatial operators, as defined previously: they use values taken in the neighborhood. Heijmans (1994) and other authors have shown that set operators can be extended to function operators and hence the entire apparatus of
FIGURE 8 Original image (a), erosion (b), dilation (c), opening (d), and closing (e), with a 15 × 15 square.
morphology on sets is applicable in the grey-scale case as well. The underlying idea is to slice a function f into a family of sets obtained by thresholding f. Without further details, consider a complete lattice Fun(E, R). We associate a series of threshold sets with f, defined by (Figure 9)

X(t) = {x ∈ E | f(x) ≥ t}.   (14)

Note that X(t) is decreasing in t and that these sets obey the continuity condition

X(t) = ⋂_{s<t} X(s).   (15)
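The threshold decomposition and the superposition principle can be verified numerically. The following sketch (NumPy/SciPy, an illustrative assumption; the image is a small random array with few grey levels to keep the loop cheap) rebuilds f from its threshold sets and checks that opening every X(t) and superposing the results gives the grey-scale opening, with the image border treated as background in both computations.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(2)
f = rng.integers(0, 8, size=(32, 32))       # small range: few threshold sets
B = np.ones((3, 3), dtype=bool)
levels = range(1, int(f.max()) + 1)

# Threshold decomposition, Eq. (14): X(t) = {x | f(x) >= t}
sets = {t: f >= t for t in levels}

# Superposition: f(x) is the largest t such that x belongs to X(t)
rebuilt = np.zeros_like(f)
for t in levels:
    rebuilt = np.where(sets[t], t, rebuilt)
assert np.array_equal(rebuilt, f)

# A flat opening commutes with thresholding: open each X(t), then superpose
opened_stack = np.zeros_like(f)
for t in levels:
    opened_stack = np.where(ndimage.binary_opening(sets[t], B), t, opened_stack)

grey = ndimage.grey_dilation(
    ndimage.grey_erosion(f, footprint=B, mode="constant", cval=0),
    footprint=B, mode="constant", cval=0)
assert np.array_equal(opened_stack, grey)
```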
In addition, there is a one-to-one correspondence between a function and its family of threshold sets X(t). In fact, the function f can be recovered from the series of X(t) by means of f(x) = ⋁{t | x ∈ X(t)}; this is the threshold superposition principle.

FIGURE 9 The profile of a function f and one of its threshold sets.

One interesting application of the correspondence is the possibility to interpret an opening on f as the union of the B that fit in the threshold sets. From an implementation point of view, this leads to alternative definitions for morphological operators. Although morphological openings were defined as the cascade of an erosion followed by a dilation [see Eq. (12)], this does not mean that one must implement an opening according to its definition. Examples of this conclusion are, among others, the two implementations of an opening with a line proposed in Van Droogenbroeck (1994) and Vincent (1994). These implementations scan the image line by line and use threshold sets to compute the opening.
2.2. Theory of Morphological Anchors

Let us first consider the simple case of the set opening of X by B. If X is empty, X ◦ B is empty, and there is no anchor. Similarly, if X = E and B is finite, then X ◦ B = E and all points are anchors. Leaving these trivial cases aside, let us take X containing some elements of E. As X ◦ B ⊆ X, if X ◦ B is not empty, all locations of X ◦ B are anchors. Therefore, in the binary case, anchors always exist for non-empty sets. For openings, the notion of anchors is linked to that of the invariance domain. Remember that the opening of X by B is the union of the translates of B that fit into X. Therefore, the corresponding invariance domain of the opening by B is given by Inv(X ◦ B) = {Y ⊕ B | Y ∈ P(E)}. Accordingly, if X ◦ B is not empty, there exists a set Y such that X ◦ B = Y ⊕ B and, as X ◦ B ⊆ X, the number of anchors must be larger than #(Y) + #(B) for continuous sets (#(Y) + #(B) − 1 on a digital grid).

If ψ is an opening by B, then one can derive, from the decomposition of f by threshold sets or, equivalently, from the geometrical interpretation of an opening, that the lower bound of a function f is an anchor, if the lower bound exists. However, some topological issues arise here. To circumvent the case of functions such as f(x) = 1/x for x > 0, which have no attained lower bound, we have previously restricted the range of grey-scale values to a finite set (which means that it is countable) or a closed interval. Likewise, we must deal with
finite structuring elements to be able to count the number of anchors. Both finiteness assumptions are used in the following. Consequently, there is at least one global minimum, and at least one anchor point. That is,

Theorem 1. For a finite structuring element B, the set of anchors of a morphological opening is always non-empty:

A_{γ_B}(f) ≠ ∅.   (16)

We provide an improved formal statement on the number of anchors for openings later. Note that the position of the origin in B has no influence on the set of anchors of γ_B(f). This originates from the corresponding property of the operator itself, that is, γ_B(f) = γ_{B_p}(f) for any p (on an infinite domain). Similar properties do not hold for erosions. In fact, the set of anchors of a morphological erosion may be empty, and the location of the origin plays a significant role; a basic property states that X ⊖ B_p = (X ⊖ B)_{−p}. Figure 10 shows two erosions with the same but translated structuring element. Note that choosing the origin in the middle of B is no guarantee that the number of anchors will be larger. Again, based on the interpretation of openings in terms of threshold sets, larger structuring elements are less likely to lead to large sets of anchors. Indeed, large structuring elements do not fit into higher threshold sets, so that at higher grey-scale levels there are fewer anchors. Figure 11 shows the evolution of the cardinality of A_{γ_B}(f) as the size of B increases.
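Theorem 1 and the fact that global minima are opening anchors are straightforward to observe numerically. The sketch below (SciPy's grey_opening on a random image, an illustrative assumption) computes the anchor set {x | γ_B(f)(x) = f(x)} and checks both properties.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(3)
f = rng.integers(0, 256, size=(64, 64)).astype(float)
size = (7, 7)

opening = ndimage.grey_opening(f, size=size)

# anchor set of the opening: locations where γ_B(f)(x) = f(x)
anchors = (opening == f)
assert anchors.any()                      # Theorem 1: never empty

# every global minimum of f is an opening anchor
assert anchors[f == f.min()].all()
```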
2.3. Local Existence of Anchors

Because δ_B(f)(x) is defined as ⋁_{b∈B} f(x − b), the dilation is a spatial operator. So the supremum (or maximum for real images) is reached at a given location p such that δ_B(f)(x) = f(p), where p = x − b₀. But if b₀ ∈ B, then p ∈ B̌_x; B̌_x is the symmetric of B translated by x. Up to a translation, B̌_x = x + B̌ defines the neighborhood where the supremum for x can be found. Intuitively, there are as many anchor candidates for δ_B(f) as there are disjoint sets like B̌_x. Similar arguments lead to a relation valid for erosions. The following proposition gives the respective neighborhoods:

Proposition 1. If B is finite and x is any point in the domain of definition of f, then

δ_B(f)(x) = f(p),   (17)
ε_B(f)(x) = f(q),   (18)

for some p ∈ B̌_x and some q ∈ B_x.
FIGURE 10 Original image (a), erosion by B (b), erosion by B_p (c), and their anchor sets marked in E, respectively, (d) and (e). B is an 11 × 11 centered square and p = (5, 5).
We can combine Eqs. (17) and (18) to find the neighborhoods of openings and closings. From Eq. (12), we have γ_B(f) = δ_B(ε_B(f)). Therefore, γ_B(f)(x) = ε_B(f)(p) with p ∈ B̌_x. Similarly, ε_B(f)(p) = f(q) with q ∈ B_p. So we have γ_B(f)(x) = f(q), with q ∈ (B̌ ⊕ B)_x = (B ⊕ B̌)_x. For the closing, the neighborhood is identical. This can be summarized as

Proposition 2. If B is finite and x is any point in the domain of definition of f, then

γ_B(f)(x) = f(p),   (19)
φ_B(f)(x) = f(q),   (20)

for some p, q ∈ (B ⊕ B̌)_x.

As mentioned previously, openings and closings are insensitive to the location of the origin of the structuring element. Let us consider B_r instead
FIGURE 11 Percentage of opening anchors with respect to a size parameter n; B is an n × n square structuring element and the percentage is the ratio of the cardinality of A_{γ_B}(f) to the image size (curves shown for the Boat image and for the theoretical lower bound). The figure also displays the lower bound established later in the chapter.
of B and compute the corresponding neighborhood. As (B_r)ˇ = (B̌)_{−r}, this neighborhood becomes ((B_r)ˇ ⊕ B_r)_x = (B ⊕ B̌)_{x+r−r} = (B ⊕ B̌)_x. Also, note that B ⊕ B̌ always contains the origin, which means that x ∈ (B ⊕ B̌)_x in all cases. To the contrary, if B does not contain the origin, the neighborhood of x for a dilation (that is, B̌_x) does not contain x, nor does B_x for the erosion.

Let us now consider that o ∈ B. Then the dilation is extensive: f(x) ≤ δ_B(f)(x). If f is bounded, then there exists r ∈ E such that f(r) is the upper bound of f. As r belongs to its own neighborhood and f(r) is an upper bound, δ_B(f)(r) ≤ f(r) too. This means that (r, f(r)) is an anchor with respect to the dilation: δ_B(f)(r) = f(r). In other words, the (r, f(r)) pair of an upper bound is an anchor for the dilation when the structuring element B contains the origin.

In contrast to the cases of dilations and erosions, the number of anchors for the opening is not limited by the number of lower or upper bounds. To get a better lower bound on the cardinality of anchors, we establish a relationship between erosion anchors and openings. By definition and according to Eq. (18),

ε_B(f)(x) = ⋀_{b∈B} f(x + b) = ⋀_{q∈B_x} f(q).   (21)
As B is finite, there exists q ∈ B_x such that

ε_B(f)(x) = f(q).   (22)
Next we show that (q, f(q)) is an anchor for the opening. As before, note that q ∈ B_x implies x ∈ (B̌)_q. Now

γ_B(f)(q) = ⋁_{r∈(B̌)_q} ε_B(f)(r)   (23)
          ≥ ε_B(f)(x)   (24)
          = f(q).   (25)
As before, we use the anti-extensivity property of an opening, that is, γ_B(f) ≤ f. This proves that γ_B(f)(q) = f(q) and therefore (q, f(q)) is an anchor for the opening. The following theorem establishes a formal link between erosion and opening anchors.

Theorem 2. If B is finite and x is any location in the domain of definition of f, then

ε_B(f)(x) = γ_B(f)(p)   (26)

for some p ∈ B_x. Moreover, (p, f(p)) is an anchor for γ_B(f), that is,

γ_B(f)(p) = f(p).   (27)
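Theorem 2 can be checked exhaustively on a small image: for every interior location x, the erosion value ε_B(f)(x) is the minimum over the window B_x, and that minimum is attained at an opening anchor. The sketch below (SciPy, an illustrative assumption) runs this check on interior pixels, where the window lies entirely inside the image.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(4)
f = rng.integers(0, 256, size=(48, 48)).astype(float)
k = 5                                       # B is a k x k square
r = k // 2

eroded  = ndimage.grey_erosion(f, size=(k, k))
opening = ndimage.grey_opening(f, size=(k, k))
anchors = (opening == f)

# Theorem 2: for every interior x there is q in B_x with
# ε_B(f)(x) = f(q) and (q, f(q)) an anchor of the opening
ok = True
for i in range(r, 48 - r):
    for j in range(r, 48 - r):
        win_f = f[i - r:i + r + 1, j - r:j + r + 1]
        win_a = anchors[i - r:i + r + 1, j - r:j + r + 1]
        assert win_f.min() == eroded[i, j]            # the window minimum is ε_B(f)(x)
        ok &= bool(win_a[win_f == win_f.min()].any()) # ...attained at an anchor q
assert ok
```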
The density of anchors for the opening is thus related to the size of B_x. It is also true that for each (B ⊕ B̌)_x-like neighborhood, there is an anchor for γ_B(f). To prove this result, remember that

γ_B(f)(x) = ε_B(f)(p) = f(q)   (28)

for some p ∈ B̌_x and q ∈ B_p. Next, we want to prove that (q, f(q)) is an anchor. By definition, γ_B(f)(q) may be written as

γ_B(f)(q) = ⋁_{r∈(B̌)_q} ε_B(f)(r).   (29)

However, r ∈ (B̌)_q implies that q ∈ B_r. Then, according to Eq. (28),

γ_B(f)(q) = ⋁_{r∈(B̌)_q} ε_B(f)(r) ≥ ε_B(f)(p)   (30)

and, as ε_B(f)(p) = f(q),

γ_B(f)(q) ≥ ε_B(f)(p) = f(q).   (31)
But openings are anti-extensive, which means that γ_B(f)(q) ≤ f(q). This proves that (q, f(q)) is an opening anchor:

Theorem 3. If B is finite and x is any point in the domain of definition of f, then

γ_B(f)(x) = γ_B(f)(q) = f(q)   (32)

for some q ∈ (B ⊕ B̌)_x.

Theorems 2 and 3 lead to bounds on the number of anchors because they establish the existence of anchors locally. Intuitively, regions with a constant grey-scale value contain more anchor points; in such a neighborhood all points will be anchors. But the number of anchors is also related to the size of the structuring element. Theorem 3 specifies that at least one opening anchor exists for each region of type (B ⊕ B̌)_x. Surprisingly, it is Theorem 2, which links erosion to opening, that provides the tightest lower bound for the density of opening anchors:

1 / #(B).   (33)
This limit is the minimum proportion of opening anchors contained in an image; it is plotted in Figure 11. It is reachable only if E can be tiled by translations of B. Where such tiling is not possible, for example, when B is a disk, this bound is conservative. Note also that the number of opening anchors is expected to decrease when the size of B increases. This phenomenon is illustrated in Figure 12, where opening anchors have been overwritten in black.
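The 1/#(B) bound of Eq. (33) can be checked empirically. The sketch below (SciPy on a random image, an illustrative assumption) uses a 120 × 120 image and square sizes that tile it, so the bound applies exactly; it also confirms that the anchor density drops as B grows.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(5)
f = rng.integers(0, 256, size=(120, 120)).astype(float)

densities = {}
for n in (3, 5, 15):                 # sizes that tile the 120 x 120 domain
    opening = ndimage.grey_opening(f, size=(n, n))
    densities[n] = np.mean(opening == f)
    # Eq. (33): at least one anchor per n x n tile, hence density >= 1/#(B)
    assert densities[n] >= 1.0 / (n * n)

# the anchor density decreases as the structuring element grows
assert densities[15] <= densities[3]
```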
2.4. Algorithmic Properties of Morphological Anchors

In addition to providing a weak bound on the number of anchors, Theorem 3 has an important practical consequence. It shows that all the information needed to compute γ_B(f) is contained in its opening anchors. In other words, from a theoretical point of view, it is possible to reconstruct γ_B(f)(x) from a subset of A_{γ_B}(f). The only pending question is how to determine this subset of A_{γ_B}(f). Should an algorithm be able to detect the location of opening anchors that influence their neighborhood, it would provide the opening for each x immediately. Unfortunately, unless f(x) has been processed previously and information on anchors has been collected, there is no way to locate anchor points. But with an appropriate scanning order and a linear structuring element, it is possible to retain some information about f to locate anchor points effectively. Such an algorithm has been proposed by Van Droogenbroeck and Buckley (2005). Figure 13 shows the computation times of such an algorithm for a very large image and a linear structuring
FIGURE 12 Density of opening anchors for increasing sizes of the structuring element. From left to right, and top to bottom: original (a) and openings with a square structuring element B (of size 3 × 3, 11 × 11, and 21 × 21, respectively).
element L whose length varies. For this figure, one image was built by tiling pieces of a natural image; the other was filled randomly to consider the worst case. An interesting characteristic of this algorithm is that the computation times decrease with the size of the structuring element. To explain this behavior, remember that the number of anchors also decreases with the size of B. Because the algorithm is based on anchors, there are fewer anchors to be found. Once an anchor is found, the algorithm is efficient in propagating this value in its neighborhood.

So far we have worked on the opening, but we can use Theorem 2 and anchors for a different algorithm to compute the erosion. Because the set of erosion anchors may be empty, we cannot rely on erosion anchors to develop such an algorithm. However, it is known (Heijmans, 1994) that the erosion of f is equal to the erosion of γ_B(f):
FIGURE 13 Computation times on two images (of identical size): opening by anchors on a random image and on a natural image.
FIGURE 14 Computation times of two algorithms that use opening anchors to compute the erosion and the morphological opening.
ε_B(f) = ε_B(γ_B(f))

for any function f and any B. The conclusion is that the computation of erosions should be based on opening anchors rather than on erosion anchors. Computation times of such an algorithm for several erosions are displayed in Figure 14, side by side with those of the opening. The algorithm for the erosion is slower for two reasons: anchors are to be propagated in a smaller neighborhood, and the propagation process is more complicated than in the case of the opening. However, this shows that opening anchors are also useful for the computation of erosions.
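The identity ε_B(f) = ε_B(γ_B(f)) that underpins this approach follows from ε_B δ_B ε_B = ε_B and is easy to confirm numerically (a sketch using SciPy's ndimage, an illustrative choice not taken from the chapter):

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(6)
f = rng.integers(0, 256, size=(64, 64)).astype(float)
size = (5, 5)

opening = ndimage.grey_opening(f, size=size)

# eroding f and eroding its opening give exactly the same result
assert np.array_equal(ndimage.grey_erosion(f, size=size),
                      ndimage.grey_erosion(opening, size=size))
```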
Note that the relative position of the computation-time curves is unusual. Openings are defined as the cascade of an erosion followed by a dilation, so slower computation of openings would be expected; Figure 14 contradicts this belief.

To close the discussion on morphological anchors, let us examine the impact of the shape of B on the implementation. The shape of B is usually not arbitrary: typical shapes include lines, rectangles, circles, hexagons, and so on. If B is constrained to contain the origin or to be symmetric, we can derive useful properties for implementations. Suppose, for example, that (p, f(p)) is an anchor with respect to the erosion ε_B(f) and that B contains the origin o. Then the dilation is extensive (δ_B(f) ≥ f) and therefore

f(p) = ε_B(f)(p) ≤ δ_B(ε_B(f))(p) = γ_B(f)(p).   (34)

But openings are anti-extensive (γ_B(f) ≤ f), so that γ_B(f)(p) = f(p). In other words, an anchor for ε_B(f) is always an anchor for γ_B(f) when B contains the origin:

Theorem 4. If o ∈ B and (p, f(p)) is an anchor for the erosion ε_B(f), then (p, f(p)) ∈ A_{γ_B}(f).

Another interesting case occurs when B is symmetric (that is, when B = B̌). This covers B being a rectangle, a circle, a hexagon, and so on (many software packages propose only morphological operations with symmetric structuring elements to facilitate handling border effects). Anchors of operations with B and B̌ then coincide, and it is equivalent to scan images in one order or in the reverse order.
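Theorem 4 can be checked directly with a symmetric structuring element containing the origin (a sketch using SciPy on a random image, an illustrative assumption):

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(7)
f = rng.integers(0, 256, size=(64, 64)).astype(float)
B = np.ones((5, 5), dtype=bool)        # symmetric, contains the origin

eroded  = ndimage.grey_erosion(f, footprint=B)
opening = ndimage.grey_opening(f, footprint=B)

erosion_anchors = (eroded == f)
opening_anchors = (opening == f)

# Theorem 4: every erosion anchor is an opening anchor
assert not (erosion_anchors & ~opening_anchors).any()
# with o in B, global minima are erosion anchors, so the set is non-empty
assert erosion_anchors.any()
```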
3. ANCHORS OF ALGEBRAIC OPENINGS

The existence of anchors has been proven for morphological openings. The question is whether the existence of anchors still holds for other types of openings, or even for any algebraic opening. From a theoretical perspective, an operator is called an algebraic opening if it is increasing, anti-extensive, and idempotent. Therefore, algebraic openings include but are not limited to morphological openings. Known algebraic openings are area openings (Vincent, 1992), openings by reconstruction (Salembier & Serra, 1995), attribute openings (Breen & Jones, 1996), and so on. The family of algebraic openings is also extensible, as there exist properties, like the one given hereafter, that can be used to engineer new openings.
Proposition 3. If γ_i is an algebraic opening for every i ∈ I, then the supremum ⋁_{i∈I} γ_i is an algebraic opening as well.

Attribute openings are most easily understood in the binary case. Unlike morphological openings, attribute openings preserve the shape of a set X, because they simply test whether or not a connected component satisfies some increasing criterion Γ, called an attribute. An example of a valid attribute consists of preserving a set X if its area is greater than λ and removing it otherwise; this is, in fact, the surface area opening. More formally, the attribute opening γ_Γ of a connected set X preserves this set if it satisfies the criterion Γ:

γ_Γ(X) = X, if X satisfies Γ,
         ∅, otherwise.   (35)

The definition of attribute openings can be extended to nonconnected sets by considering the union of all their connected components. Since the attribute is increasing, attribute openings can be directly generalized to grey-scale images using the threshold superposition principle. Such openings always have anchors.

But do all openings have anchors? The reason we fail to prove that all openings have anchors is as follows. Let us consider an algebraic opening γ. Since γ is increasing (as it is an opening), γ is upper bounded by the identity operator: γ ≤ I. Assume now that A_γ(f) = ∅; then γ < I. Remember that γ is also anti-extensive; it follows that γγ ≤ γI. Were it instead true that γγ < γI (this property is not true!), then using the idempotence property γγ = γ, one would conclude that γ < γ, which is impossible, and anchors would exist in all cases. But γγ ≤ γI and not γγ < γI, so we derive that anti-extensivity itself does not provide a strict order; it leaves the operator some freedom to allow functions not to have anchors. The properties of an algebraic opening are not sufficient to guarantee the existence of anchors. We need to introduce additional requirements on an algebraic opening to ensure the existence of anchors.

Openings that explicitly refer to the threshold values can have no anchor. Remember the case of the quantization operator Q₂ applied to an image with odd grey-scale values. Obviously, if f(x) = 3, Q₂(f)(x) = 2; there is no anchor. Similarly, consider the operator ψ(f)(x) = x ∧ f(x). This operator is an opening, but if g(x) = x + 1, then ψ(g)(x) = x; again, there is no anchor. This time the opening does not refer to threshold levels but explicitly to the absolute, not relative, location. Two constraints are considered hereafter.
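Before turning to these constraints, note that the binary attribute opening of Eq. (35) with the area criterion is easy to sketch. The implementation below (NumPy/SciPy connected-component labeling, an illustrative assumption, not the chapter's algorithm) keeps each connected component whose area is at least λ, and checks the algebraic-opening properties of anti-extensivity and idempotence; since the result is a subset of X, every preserved pixel is an anchor.

```python
import numpy as np
from scipy import ndimage

def area_opening(X, lam):
    """Binary attribute opening, Eq. (35), with the area criterion:
    keep every connected component of X whose area is at least lam."""
    labels, n = ndimage.label(X)
    if n == 0:
        return np.zeros_like(X, dtype=bool)
    areas = ndimage.sum(X, labels, index=np.arange(1, n + 1))
    keep = np.concatenate(([False], areas >= lam))   # label 0 = background
    return keep[labels]

rng = np.random.default_rng(8)
X = rng.random((40, 40)) > 0.7
X[5:10, 5:10] = True            # plant one component of area >= 25
Y = area_opening(X, 10)

assert Y.any()                  # the planted component survives
assert (Y <= X).all()           # anti-extensive
assert np.array_equal(area_opening(Y, 10), Y)   # idempotent
```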
The first constraint, spatiality, relates to the usual notion of neighborhood as used in the section on morphological anchors, and the second constraint, shift invariance, relates to the ordering of function values.
Definition 6. An operator ϕ is shift-invariant if, for every function f, applying ϕ to the vertical translate f^v (v ∈ R; see the previous definition of a vertical translate) is equivalent to applying ϕ to f and translating the result vertically by v. In formal terms, for every function f and every real value v,

ϕ(f^v)(x) = ϕ(f)(x) + v.   (36)
3.1. Spatial and Shift-Invariant Openings

Section 2 showed that the minima of a function automatically provide anchors for every morphological opening. A simple example suffices to show that this property does not necessarily hold for any opening. Let us reconsider the previous examples and a constant function k̄, defined as k̄(x) = k for all x ∈ E. If ψ(f) = Q₂(f), then ψ(3̄) = 2̄. In addition, for ψ(f)(x) = x ∧ f(x), we have ψ(3̄) ≠ 3̄. Therefore, the processing of a constant function by an algebraic opening can produce a non-constant function or a constant that takes a different value. If entropy is understood here as the cardinality of the set of grey-scale values after processing, then, contrary to what morphological operators suggest, the entropy of an algebraic opening may increase. Obviously, these situations do not occur for spatial openings.

Morphological openings are a particular case of spatial openings, denoted ξ hereafter. We have proven that the minimum values of a function are anchors with respect to a morphological opening. Let us denote by min f the minimum of a lower-bounded function f, and assume that the minimum is reached for p ∈ E. Because ξ is an opening, ξ(f) ≤ f for any function f. In particular, ξ(f)(p) ≤ f(p) = min f. By definition of spatiality, for every location, including p, there exists a location q such that ξ(f)(p) = f(q). But such a value is lower bounded by min f. Therefore, ξ(f)(p) ≥ min f, and ξ(f)(p) = f(p) = min f.

Theorem 5. Consider a spatial opening ξ. Then all global minima of f are anchors for ξ.

This theorem can also be rephrased in the following terms: provided the set of grey-scale values of a function processed by an opening is a subset of the original set of grey-scale values, there are anchors. Indirectly, it also proves the existence of anchors for any spatial opening; to some extent, it generalizes Theorem 1. Let us now consider the shift-invariance property.
From a practical point of view, shift-invariance means that functions can handle offsets, or equivalently, that offsets have no impact on the result except that the result is shifted by the same offset. This is an acceptable theoretical assumption, but in practice images are defined on a finite set of integer values (typically {0, . . . , 255}); handling an offset requires redefining the range of grey-scale values to maintain the full dynamic range of values.
Consider a shift-invariant operator ϕ. Imagine, for a moment, that there is no anchor with respect to ϕ. Since ϕ is anti-extensive (as it is an opening), ϕ(f)(x) ≤ f(x) becomes

ϕ(f)(x) < f(x)   (37)

for every x ∈ E. In other words, there exists λ > 0 such that

ϕ(f)(x) + λ ≤ f(x).   (38)

By increasingness, ϕ(ϕ(f) + λ) ≤ ϕ(f). After some simplifications and using the shift-invariance property, ϕ(ϕ(f) + λ) = ϕ(ϕ(f)) + λ = ϕ(f) + λ ≤ ϕ(f), which is equivalent to λ ≤ 0. But this conclusion is incompatible with our initial statement on λ. Therefore,

Theorem 6. Every shift-invariant opening ϕ has one or more anchors. For every function f,

A_ϕ(f) ≠ ∅.   (39)
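Flat morphological openings are shift-invariant in the sense of Eq. (36), so Theorem 6 applies to them; both facts are easy to observe numerically (a sketch using SciPy on a random image, an illustrative assumption):

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(9)
f = rng.integers(0, 200, size=(64, 64)).astype(float)
size = (7, 7)

opening = ndimage.grey_opening(f, size=size)

# shift-invariance, Eq. (36): γ_B(f + v) = γ_B(f) + v
v = 17.0
assert np.array_equal(ndimage.grey_opening(f + v, size=size), opening + v)

# ...and, as Theorem 6 predicts, the anchor set is non-empty
assert (opening == f).any()
```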
A subsequent question is whether the minimum is an anchor, regardless of the type of opening. Let us build a constant function filled with the minimum value of f; this function is denoted τ̄_min. Since an anchor does exist for τ̄_min, at least some of the values of τ̄_min are anchors, though not necessarily all of them (see the previous discussion of ψ(f)(x) = x ∧ f(x)). Through increasingness, τ̄_min ≤ f implies γ(τ̄_min) ≤ γ(f), where γ is an algebraic opening. Anti-extensivity implies that γ(f) ≤ f. We can conclude that there exists p ∈ E such that γ(τ̄_min)(p) = τ̄_min(p) ≤ γ(f)(p) ≤ f(p). So, if f(p) = τ_min, then τ_min = γ(f)(p). Therefore,

Theorem 7. If the set of anchors with respect to an algebraic opening is always non-empty, then at least one global minimum of a function f is an anchor for that opening.

This theorem applies to morphological, spatial, and shift-invariant openings, but in the first two cases we have proven that all minima are anchors. Note, however, that anchors must always exist for this property to be true. Neither the quantization operator Q₂ nor ψ(f)(x) = x ∧ f(x) meets this requirement.
3.2. Granulometries

In practice, one uses openings that filter images with several different degrees of smoothness. For example, one opening is intended to maintain many details; another opening filters the image to obtain a background image. When the openings are ordered, we have a granulometry.
Definition 7. A granulometry on Fun(E) is a one-parameter family of openings {γ_r | r > 0}, such that

γ_s ≤ γ_r,  if s ≥ r.   (40)

If γ_s ≤ γ_r, then γ_s γ_r ≥ γ_s γ_s = γ_s. Also, γ_r ≤ I implies that γ_s γ_r ≤ γ_s, so that γ_s γ_r = γ_s. The identity γ_r γ_s = γ_s is proved analogously. It follows that the operators of a granulometry also satisfy the semigroup property:

γ_r γ_s = γ_s γ_r = γ_s,  s ≥ r.   (41)
As a result, anchor sets are ordered like the openings of a granulometry:

Theorem 8. Anchor sets of a granulometry {γ_r | r > 0} on Fun(E) are ordered according to

A_{γ_s}(f) ⊆ A_{γ_r}(f),  s ≥ r.   (42)

There is a similar statement for morphological openings. Suppose B contains A (that is, A ⊆ B) and B ◦ A = B; then, according to Haralick, Sternberg, and Zhuang (1987),

γ_B(f) ≤ γ_A(f).   (43)

For example, B is a circle and A is a diameter, or B is a square and A is one side of the square. Note that A ⊆ B alone is not sufficient to guarantee that γ_B(f) ≤ γ_A(f). Applying Theorem 8, we obtain

Corollary 1. For any function f, if A ⊆ B, B ◦ A = B, and A, B are both finite, then

A_{γ_B}(f) ⊆ A_{γ_A}(f).   (44)

This corollary is essential for morphological granulometries. It tells us that if we order a family of morphological openings, the anchor sets will be ordered (reversely) as well. In fact, Vincent (1994) developed an algorithm, based on the concept of opening trees, that relies on this property.
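The nesting of anchor sets stated in Eq. (44) can be observed with openings by increasing squares, which form a granulometry (for squares A ⊆ B, the condition B ◦ A = B holds). A sketch using SciPy on a random image (an illustrative assumption):

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(10)
f = rng.integers(0, 256, size=(96, 96)).astype(float)

# anchor sets for openings by squares of increasing size
anchor_sets = []
for n in (3, 7, 11):
    opening = ndimage.grey_opening(f, size=(n, n))
    anchor_sets.append(opening == f)

small, medium, large = anchor_sets
# Eq. (44): anchor sets are nested (reversely to the opening sizes)
assert (large <= medium).all()     # anchors of the 11x11 opening lie in those of the 7x7
assert (medium <= small).all()     # anchors of the 7x7 opening lie in those of the 3x3
```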
4. CONCLUSIONS

Anchors are features that characterize an operator and a function. This chapter has discussed the properties of an opening and shown how they relate to anchors. First, we have established properties valid for
morphological operators. Anchors then depend on the size and shape of the chosen structuring element. For example, it has been proven that anchors always exist for openings and that global minima are anchors. For algebraic openings, the concept of a structuring element is no longer explicitly present. It also appears that some algebraic openings have no anchor for some functions. However, with additional constraints on the openings (that is, spatiality or shift-invariance), the framework is sufficient to ensure the existence of anchors for any function f. In addition, it has been proven that the existence of anchors then implies that some global minima are anchors. This is an interesting property that could lead to new algorithms in the future.
REFERENCES

Arce, G., & Gallagher, N. (1982). State description for the root-signal set of median filters. IEEE Transactions on Acoustics, Speech and Signal Processing, 30(6), 894–902.
Arce, G., & McLoughlin, M. (1987). Theoretical analysis of the max/median filter. IEEE Transactions on Acoustics, Speech and Signal Processing, 35(1), 60–69.
Astola, J., Heinonen, P., & Neuvo, Y. (1987). On root structures of median and median-type filters. IEEE Transactions on Acoustics, Speech and Signal Processing, 35(8), 1199–1201.
Beucher, S., & Lantuéjoul, C. (1979). Use of watersheds in contour detection. In International workshop on image processing, Rennes, CCETT/IRISA, pp. 2.1–2.12.
Breen, E. J., & Jones, R. (1996). Attribute openings, thinnings, and granulometries. Computer Vision and Image Understanding, 64(3), 377–389.
Eberly, D., Longbotham, H., & Aragon, J. (1991). Complete classification of roots to one-dimensional median and rank-order filters. IEEE Transactions on Acoustics, Speech and Signal Processing, 39(1), 197–200.
Eckhardt, U. (2003). Root images of median filters. Journal of Mathematical Imaging and Vision, 19(1), 63–70.
Gallagher, N., & Wise, G. (1981). A theoretical analysis of the properties of median filters. IEEE Transactions on Acoustics, Speech and Signal Processing, 29(6), 1136–1141.
Haralick, R., Sternberg, S., & Zhuang, X. (1987). Image analysis using mathematical morphology. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(4), 532–550.
Heijmans, H. (1994). Morphological image operators. In Advances in electronics and electron physics series. Boston: Academic Press.
Maragos, P. (1989). Pattern spectrum and multiscale shape representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7), 701–716.
Matheron, G. (1975). Random sets and integral geometry. New York: Wiley.
Najman, L., & Talbot, H. (2008). Morphologie mathématique 1: approches déterministes. Paris: Hermes Science Publications.
Salembier, P., & Serra, J. (1995). Flat zones filtering, connected operators, and filters by reconstruction. IEEE Transactions on Image Processing, 4(8), 1153–1160.
Serra, J. (1982). Image analysis and mathematical morphology. New York: Academic Press.
Van Droogenbroeck, M. (1994). On the implementation of morphological operations. In J. Serra & P. Soille (Eds.), Mathematical morphology and its applications to image processing (pp. 241–248). Dordrecht: Kluwer Academic Publishers.
Van Droogenbroeck, M., & Buckley, M. (2005). Morphological erosions and openings: fast algorithms based on anchors. Journal of Mathematical Imaging and Vision, 22(2–3), 121–142. (Special issue on Mathematical Morphology after 40 Years.)
van Herk, M. (1992). A fast algorithm for local minimum and maximum filters on rectangular and octagonal kernels. Pattern Recognition Letters, 13(7), 517–521.
Vincent, L. (1992). Morphological area openings and closings for greyscale images. In Proc. Shape in Picture '92, NATO Workshop, Driebergen, The Netherlands: Springer-Verlag.
Vincent, L. (1994). Fast grayscale granulometry algorithms. In J. Serra & P. Soille (Eds.), Mathematical morphology and its applications to image processing (pp. 265–272). Dordrecht: Kluwer Academic Publishers.
Vincent, L., & Soille, P. (1991). Watersheds in digital spaces: an efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(6), 583–598.
Chapter 6

Temporal Filtering Technique Using Time Lenses for Optical Transmission Systems

Dong Yang, Shiva Kumar, and Hao Wang

Contents
1. Introduction
2. Configuration of a Time-Lens-Based Optical Signal Processing System
3. Wavelength Division Demultiplexer
4. Dispersion Compensator
5. Optical Implementation of Orthogonal Frequency-Division Multiplexing Using Time Lenses
6. Conclusions
Acknowledgment
Appendix A
Appendix B
Appendix C
References
1. INTRODUCTION

The analogy between spatial diffraction and temporal dispersion has been known for years (Kolner, 1994; Papoulis, 1994; Van Howe & Xu, 2006). In the spatial domain, an optical wave propagating in free space diverges due to diffraction. As an analog, in the temporal domain, an optical pulse propagating in a dispersive medium broadens due to dispersion. This space-time duality can also be extended to lenses. The conventional space lens produces quadratic phase modulation on the transverse profile of the input

Department of Electrical and Computer Engineering, McMaster University, Canada

Advances in Imaging and Electron Physics, Volume 158, ISSN 1076-5670, DOI: 10.1016/S1076-5670(09)00011-1. Copyright © 2009, Elsevier Inc. All rights reserved.
FIGURE 1 Scheme of a typical 4-f system. PM1 = phase modulator 1; PM2 = phase modulator 2. (Based on Yang et al. (2008).)
beam, and its analog, the time lens, simply applies a quadratic phase modulation to a temporal optical waveform. On this basis, several temporal analogs to spatial systems based on thin lenses have been created, and many real-time optical signal processing applications, including temporal imaging, pulse compression, and temporal filtering based on time-lens systems, have been proposed (Azana & Muriel, 2000; Berger, Levit, Atkins, & Fischer, 2000; Kolner & Nazarathy, 1989; Lohmann & Mendlovic, 1992). The temporal filtering technique was first proposed by Lohmann and Mendlovic (1992). In their pioneering work, a temporal filter was introduced in a 4-f configuration consisting of time lenses. In the spatial domain, a conventional lens produces the Fourier transform (FT) at the back focal plane of an optical signal placed at the front focal plane, which is known as a 2-f configuration or 2-f subsystem. The spatial filter placed at the back focal plane modifies the signal spectrum, and a subsequent 2-f subsystem provides the Fourier transform of the modified signal spectrum, which returns the signal to the spatial domain with spatial inversion. There exists an exact analogy between spatial filtering and temporal filtering techniques (Lohmann & Mendlovic, 1992). In the case of temporal filtering, the spatial lens is replaced by a time lens (which is nothing but a phase modulator), and spatial diffraction is substituted with second-order dispersion. Yang, Kumar, and Wang (2008) discuss a modified 4-f system consisting of two time lenses (Figure 1). In this 4-f configuration, the subsystem T1 provides the Fourier/inverse FT of the input signal. The signs of the chirp and dispersion coefficients are reversed in the 2-f subsystem T2 after the temporal filter. In contrast, Lohmann and Mendlovic (1992) proposed the 4-f configuration in which the signs of the dispersion coefficients of the 2-f subsystems are identical.
This implies that the second 2-f subsystem (Lohmann & Mendlovic, 1992) provides the FT of the modified signal spectrum leading to a time-reversed bit pattern, whereas in the approach used by Yang et al. (2008), it provides the inverse Fourier transform (IFT) so that the output bit pattern is not reversed in time. The approach proposed
by Yang et al. (2008) has no spatial analog since the sign of spatial diffraction cannot be changed (Goodman, 1996, chap. 4). Based on the 4-f configuration consisting of time lenses, three applications have been numerically implemented. One of them is a tunable wavelength division demultiplexer (Yang et al., 2008). The wavelength division multiplexing (WDM) demultiplexer is realized using a temporal band pass filter in a 4-f system. The temporal filter is realized using an amplitude modulator, and the channel to be demultiplexed can be dynamically changed by changing the input voltage to the amplitude modulator. The passband of the temporal filter is chosen to be at the central frequency of the desired channel with a suitable bandwidth such that only the signal carried by the channel to be demultiplexed passes through it with the least attenuation. The wavelength division multiplexed signal passes through the 2-f subsystem T1 (see Figure 1) and its FT in the time domain is obtained. After the temporal filter, the signals in all other undesired channels are blocked, and the signal in the desired channel then passes through the 2-f subsystem T2, which finally reproduces the demultiplexed signal with its original bit sequence by the exact IFT. Another application of the 4-f temporal filtering scheme is a higher-order dispersion compensator (Yang et al., 2008). The temporal filter in this application is realized by a phase modulator. To compensate for fiber dispersion, the time domain transfer function of the phase modulator has the same form as the frequency domain transfer function of the fiber, but the signs of the dispersion coefficients are opposite. At the temporal filter, the Fourier transformed input signal is multiplied by the time domain transfer function of the phase modulator so that the fiber dispersion–induced phase shift is canceled out.
Finally, an implementation of orthogonal frequency-division multiplexing (OFDM) in the optical domain using Fourier transforming properties of time lenses is discussed (Kumar & Yang, 2008). The first 2-f subsystem provides the IFT of the input signal that carries the information. This input signal is obtained by the optical/electrical time division multiplexing of several channels. The kernel of the IFT is of the form exp (i2π f t) and therefore, IFT operation can be imagined as the multiplication of the signal samples of several channels by the optical subcarriers. These subcarriers are orthogonal and therefore, the original signal can be obtained by a Fourier transformer at the receiver, which acts as a demultiplexer. The temporal filter between T1 and T2 is characterized as the transfer function of a fiber-optic link with nonlinearity. The second 2-f subsystem provides the FT of the output from the fiber link so that the convolution of the input optical signal and the fiber transfer function in time domain is converted into the product of the FT of both, but still in time domain. The output of the Fourier transformer is the product of the original input signal (input signal of T1) and the phase correction due to the fiber transfer
function. The photodetector responds only to the intensity of the optical signal and therefore, the deleterious effects introduced by the fiber that appear as the phase correction are removed by the photodetector. This occurs because the fiber transfer function due to dispersive effects is of the type exp[iβ( f )]. In the absence of fiber nonlinearity, the optical subcarriers are orthogonal and the Fourier transformer at the receiver demultiplexes the subcarriers without introducing any penalty. However, strong nonlinear effects can destroy the orthogonality of the optical subcarriers. Nevertheless, the simulation in Kumar and Yang (2008) shows that the time-lens–based optical OFDM scheme has good tolerance to the fiber nonlinearity.
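The dispersive half of the space-time duality invoked above is easy to check numerically. The short sketch below is illustrative only (it is not one of the chapter's simulations; the grid, pulse width, and fiber values are arbitrary choices): it multiplies the spectrum of a Gaussian pulse by the second-order dispersion factor exp(iβ₂zω²/2) and confirms that the pulse broadens, the temporal counterpart of a free-space beam spreading by diffraction.

```python
import numpy as np

# Time grid (ps) and a transform-limited Gaussian pulse
N = 4096
t = np.linspace(-200.0, 200.0, N)
dt = t[1] - t[0]
T0 = 10.0                              # initial 1/e half-width of the field, ps
u0 = np.exp(-t**2 / (2 * T0**2))

# Angular-frequency grid matching np.fft bin ordering (rad/ps)
w = 2 * np.pi * np.fft.fftfreq(N, d=dt)

# Dispersive propagation: multiply the spectrum by exp(i*beta2*z*w^2/2)
beta2 = 20.0                           # ps^2/km (illustrative)
z = 5.0                                # km
uz = np.fft.ifft(np.fft.fft(u0) * np.exp(1j * beta2 * z * w**2 / 2))

def rms_width(u):
    """RMS width (ps) of the intensity |u|^2 about its centroid."""
    p = np.abs(u)**2
    p = p / p.sum()
    tc = (t * p).sum()
    return np.sqrt(((t - tc)**2 * p).sum())

# The pulse broadens: the temporal analog of diffraction.
# Analytically, T(z) = T0*sqrt(1 + (beta2*z/T0^2)^2) = T0*sqrt(2) here.
assert rms_width(uz) > rms_width(u0)
```

With β₂z = T0² the analytic width grows by √2, which the RMS measure reproduces; setting β₂z = 0 leaves the pulse unchanged.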
2. CONFIGURATION OF A TIME-LENS-BASED OPTICAL SIGNAL PROCESSING SYSTEM

Figure 1 shows the modified 4-f system based on time lenses (Yang et al., 2008). This 4-f system consists of two cascaded 2-f subsystems T1 and T2, each of which contains a time lens and two segments of single-mode fibers (SMFs) that are symmetrically placed on both sides of the time lens. A time lens is a temporal analog of a space lens and it is realized by an electro-optic phase modulator that can generate quadratic phase modulation. The signal spectrum can be modified using a temporal filter. The temporal filter can be realized using an amplitude and/or phase modulator. The transfer function of the temporal filter can be changed by changing the input voltage to the amplitude and/or phase modulator. For the proper operation of this system, the two phase modulators and the temporal filter should be properly synchronized. If $T_k$ is the absolute time at the phase modulator $k$, $k = 1, 2$, then they are related by

$$T_2 = T_1 + f_1/v_{g1} + f_2/v_{g2}, \qquad (1)$$

where $v_{g1}$ and $v_{g2}$ are the group speeds of the fibers after the time lens 1 and just before the time lens 2, respectively, and $f_1$ and $f_2$ are the fiber lengths as shown in Figure 1. This delay between the driving voltages of the phase modulators can be achieved using a microwave delay line. Similarly, the absolute time $T_f$ at the temporal filter is related to $T_1$ by

$$T_f = T_1 + f_1/v_{g1}, \qquad (2)$$

where $f_1$ is the length of the SMF in T1. Propagation of the optical signal in a system consisting of a dispersive SMF and the time lens (such as the 2-f subsystem T1 in Figure 1) results in the FT of the input signal at the length $2f_1$. We use the following definitions
of FT pairs:

$$\tilde{U}(\omega) = \mathcal{F}\{u(t)\} = \int_{-\infty}^{+\infty} u(t) \exp(i\omega t)\, dt, \qquad (3)$$

$$u(t) = \mathcal{F}^{-1}\{\tilde{U}(\omega)\} = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \tilde{U}(\omega) \exp(-i\omega t)\, d\omega, \qquad (4)$$
where $u(t)$ is a temporal function and $\tilde{U}(\omega)$ is its FT. The time lens is implemented using an optical phase modulator whose time domain transfer function is given by

$$h_j(t) = \exp\left(iC_j t^2\right), \qquad j = 1, 2, \qquad (5)$$

where $C_j$ is the chirp coefficient of the phase modulator $j$ in the 2-f subsystem T$j$, $j = 1, 2$, and $t$ is the time variable in a reference frame that moves with an optical pulse. Using Eq. (5) in Eq. (3), we obtain the FT of the transfer function $h_j(t)$:

$$\tilde{H}_j(\omega) = \sqrt{i\pi/C_j}\, \exp\left(-i\frac{\omega^2}{4C_j}\right), \qquad j = 1, 2. \qquad (6)$$
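Equation (6) can be verified by direct quadrature. The sketch below is illustrative (the chirp coefficient C and the weak Gaussian regularizer eps are arbitrary choices of ours; the ideal chirp transform is the eps → 0 limit of the tapered integral): it evaluates the chapter-convention transform of h(t) = exp(iCt²) numerically and compares it with √(iπ/C) exp(−iω²/4C).

```python
import numpy as np

C = 0.5          # chirp coefficient, rad/ps^2 (illustrative)
eps = 1e-3       # weak Gaussian taper that regularizes the oscillatory integral

dt = 0.002
t = np.arange(-150.0, 150.0, dt)        # ps
h = np.exp(-eps * t**2) * np.exp(1j * C * t**2)

def ft(wv):
    """Chapter-convention FT of Eq. (3): H(w) = integral of h(t) exp(i*w*t) dt."""
    return (h * np.exp(1j * wv * t)).sum() * dt

for wv in (0.0, 1.0, 2.0):
    numeric = ft(wv)
    analytic = np.sqrt(1j * np.pi / C) * np.exp(-1j * wv**2 / (4 * C))
    # Agreement to ~1% is limited only by the finite regularizer eps
    assert abs(numeric - analytic) < 0.02 * abs(analytic)
```

The dominant contribution comes from the stationary-phase point t = −ω/(2C), which is why the taper far from the origin barely affects the result.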
Suppose that the field envelope of the input signal is $u(t, 0)$, and the corresponding FT is $\tilde{U}(\omega, 0)$. Let $\beta_{21}^{(k)}$ and $\beta_{22}^{(k)}$ be the dispersion coefficients of the first and second fibers in the subsystem T$k$, $k = 1, 2$, respectively, and $f_k$ be the length of the SMF in T$k$, $k = 1, 2$. Before the time lens of T1 ($z = f_1^-$), the temporal signal is (Agrawal, 1997)

$$u(t, f_1^-) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \tilde{U}(\omega, 0) \exp\left(\frac{i}{2}\beta_{21}^{(1)} f_1 \omega^2 - i\omega t\right) d\omega. \qquad (7)$$

The behavior of the time lens is described as follows:

$$u(t, f_1^+) = u(t, f_1^-)\, h_1(t). \qquad (8)$$

Because a product in the time domain becomes a convolution in the spectral domain, taking the FT of Eq. (8), we obtain

$$\tilde{U}(\omega, f_1^+) = \mathcal{F}\{u(t, f_1^-)\, h_1(t)\} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{U}(\omega - \omega', f_1^-)\, \tilde{H}_1(\omega')\, d\omega', \qquad (9)$$
where $\tilde{U}(\omega, f_1^-)$ is the FT of $u(t, f_1^-)$. Hence, at the end of the first 2-f subsystem T1, we obtain

$$u(t, 2f_1) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{U}(\omega, f_1^+) \exp\left(\frac{i}{2}\beta_{22}^{(1)} f_1 \omega^2 - i\omega t\right) d\omega. \qquad (10)$$

Substituting Eq. (9) into Eq. (10), choosing the focal length as (Lohmann & Mendlovic, 1992)

$$f_1 = \frac{1}{2\beta_{22}^{(1)} C_1}, \qquad (11)$$

and after some algebra (see Appendix A), we obtain the following:

$$u(t, 2f_1) = \frac{\sqrt{i\pi/C_1}}{2\pi \beta_{22}^{(1)} f_1}\, \tilde{U}\left(\frac{t}{\beta_{22}^{(1)} f_1},\, 0\right) \exp\left(-i\phi_{\mathrm{res}}\right), \qquad (12)$$

where the residual phase term is given by

$$\phi_{\mathrm{res}} = \left[\frac{1}{\beta_{22}^{(1)} f_1} - \frac{\left(\beta_{21}^{(1)} + \beta_{22}^{(1)}\right) f_1}{2\left(\beta_{22}^{(1)} f_1\right)^2}\right] t^2. \qquad (13)$$
Let us consider the case when the temporal filter is absent. This can be divided into two cases.

Case 1: $\beta_{22}^{(1)} = \beta_{21}^{(1)} = \beta_{2}^{(1)}$. In this case, $\phi_{\mathrm{res}} = 0$ and

$$u(t, 2f_1) = \frac{\sqrt{i\pi/C_1}}{2\pi \beta_{2}^{(1)} f_1}\, \tilde{U}\left(\frac{t}{\beta_{2}^{(1)} f_1},\, 0\right). \qquad (14)$$

Choosing $\beta_{21}^{(2)} = \beta_{22}^{(2)} = \beta_{2}^{(2)}$ and $f_2 = 1/\left(2\beta_{22}^{(2)} C_2\right)$ in the 2-f subsystem T2 and noting that

$$\mathcal{F}\left\{\tilde{U}\left(t/\beta_{2}^{(1)} f_1,\, 0\right)\right\} = 2\pi \beta_{2}^{(1)} f_1\, u\left(-\beta_{2}^{(1)} f_1\, \omega,\, 0\right), \qquad (15)$$
we finally obtain (see Appendix B)

$$u(t, 2f_1 + 2f_2) = u\left(-\frac{\beta_{2}^{(1)} f_1}{\beta_{2}^{(2)} f_2}\, t,\, 0\right), \qquad (16)$$

where $2f_1 + 2f_2$ is the total length of the time-lens system. For the 4-f configuration proposed by Lohmann and Mendlovic (1992), the magnification factor is defined as

$$M = -\frac{C_2}{C_1} = -\frac{\beta_{2}^{(1)} f_1}{\beta_{2}^{(2)} f_2}. \qquad (17)$$

If $\mathrm{sgn}\,\beta_{2}^{(1)} = \mathrm{sgn}\,\beta_{2}^{(2)}$, from Eq. (16) it follows that $M$ is negative. Defining a positive stretching factor $m = |M|$, Eqs. (17) and (16) can be rewritten as

$$\beta_{2}^{(1)} f_1 = m\beta_{2}^{(2)} f_2 \quad \text{and} \quad C_2 = mC_1 \qquad (18)$$

and

$$u(t, 2f_1 + 2f_2) = u(-mt, 0), \qquad (19)$$

which shows that T2 provides the scaled FT of its input and then leads to an inverted image of the signal input of the 4-f configuration (Lohmann & Mendlovic, 1992). If $M = -1$, T2 provides the exact FT and this leads to the reversal of the bit sequence within a frame, which requires additional signal processing in the optical/electrical domain to recover the original bit sequence.

In Yang et al. (2008), the 4-f system of Lohmann and Mendlovic (1992) is reconfigured. Suppose $\mathrm{sgn}\,\beta_{2}^{(1)} = -\mathrm{sgn}\,\beta_{2}^{(2)}$; then Eqs. (17) and (16) can be rewritten as

$$\beta_{2}^{(1)} f_1 = -m\beta_{2}^{(2)} f_2 \quad \text{and} \quad C_2 = -mC_1 \qquad (20)$$

and

$$u(t, 2f_1 + 2f_2) = u(mt, 0), \qquad (21)$$

which shows that the 2-f subsystem T2 provides the scaled IFT so that the signal at the end of the 4-f system is not time-reversed. If $M = 1$, the output signal is identical to the input signal. We note that in spatial optical
TABLE 1 Comparison of fiber dispersions and phase coefficients for different time-lens systems

         C1 (PM1)   C2 (PM2)   β21(1) (T1)   β22(1) (T1)   β21(2) (T2)   β22(2) (T2)   Function
Case 1      +           +           +             +             +             +        Time reversal
            −           −           −             −             −             −        Time reversal
            +           −           +             +             −             −        No time reversal
            −           +           −             −             +             +        No time reversal
Case 2      +           −           −             +             −             +        No time reversal
            −           +           +             −             +             −        No time reversal

Based on Yang et al. (2008).
signal processing, it is not possible to obtain both direct and inverse Fourier transformation since the sign of diffraction cannot be changed (Goodman, 1996, chap. 4). Table 1 lists the signs of fiber dispersions and chirp coefficients of subsystems T1 and T2 required to produce signals with and without time reversal.

Case 2: $\beta_{22}^{(1)} = -\beta_{21}^{(1)}$. In this case,

$$\phi_{\mathrm{res}} = \frac{t^2}{\beta_{22}^{(1)} f_1} \qquad (22)$$

and

$$u(t, 2f_1) = \frac{\sqrt{i\pi/C_1}}{2\pi \beta_{22}^{(1)} f_1}\, \tilde{U}\left(\frac{t}{\beta_{22}^{(1)} f_1},\, 0\right) \exp\left(-i\frac{t^2}{\beta_{22}^{(1)} f_1}\right). \qquad (23)$$
If we configure the 2-f subsystem T2 with reversed parameters,

$$\beta_{21}^{(2)} f_2 = -\beta_{22}^{(1)} f_1, \qquad C_2 = -C_1, \qquad \beta_{22}^{(2)} f_2 = -\beta_{21}^{(1)} f_1, \qquad (24)$$

then we find that the input signal of T1 can be exactly recovered at the end of T2 (see Appendix C):

$$u(t, 2f_1 + 2f_2) = u(t, 0). \qquad (25)$$
Lines 5 and 6 in Table 1 present two possible configurations of Case 2. The result of Eq. (25) has a simple physical explanation. When $\beta_{21}^{(2)} f_2 = -\beta_{22}^{(1)} f_1$, the accumulated dispersion of the second fiber of T1 is compensated by that
of the first fiber of T2, leading to a unity transfer function. After that, since the chirp coefficients $C_1$ and $C_2$ are of opposite sign, they cancel each other too, making the transfer function from PM1 to PM2 (see Figure 1) equal to unity. Finally, the accumulated dispersion of the first fiber of T1 is compensated by that of the second fiber of T2 when $\beta_{22}^{(2)} f_2 = -\beta_{21}^{(1)} f_1$. Thus, the total transfer function of the 4-f system is unity. By inserting a temporal filter between the two 2-f subsystems (see Figure 1), we can easily perform various kinds of optical signal processing in the time domain. The important advantage of the time-lens–based temporal filtering technique is that the transfer function of the temporal filter can be dynamically altered by changing the input voltage to the amplitude/phase modulator; therefore, this technique could have potential applications for switching and multiplexing in optical networks. Next, we provide two examples of optical signal processing based on this time-lens temporal filtering technique to show its potential advantages. One is a tunable wavelength division demultiplexer and the other is a higher-order fiber dispersion compensator.
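Both cases above can be checked with a direct numerical sketch of the 4-f chain: dispersion, time lens, dispersion, followed by the sign-reversed mirror of the same. The example below is illustrative only; the helper names (disperse, lens), the grid, the pulse shapes, and the value β₂⁽¹⁾f₁ = 400 ps² are our own choices, with FFT-based operators standing in for the fibers and phase modulators. It verifies the Fourier-transforming property of Eq. (14) for the 2-f subsystem T1 and the exact recovery of Eq. (25) for the reversed-sign T2.

```python
import numpy as np

# Grid: a 1024-ps window, fine enough to hold all chirped intermediate fields
N = 8192
dt = 0.125                                   # ps
t = (np.arange(N) - N // 2) * dt
w = 2 * np.pi * np.fft.fftfreq(N, d=dt)      # rad/ps

def disperse(u, b2f):
    """Accumulated dispersion b2f = beta2*length (ps^2): the phase of Eq. (7)."""
    return np.fft.fft(np.fft.ifft(u) * np.exp(1j * b2f * w**2 / 2))

def lens(u, Cj):
    """Time-lens quadratic phase, Eq. (5)."""
    return u * np.exp(1j * Cj * t**2)

# Input: two Gaussian pulses (real field, so |U| is even in frequency)
T0, t1, t2, A2 = 15.0, -40.0, 60.0, 0.6
u0 = np.exp(-(t - t1)**2 / (2 * T0**2)) + A2 * np.exp(-(t - t2)**2 / (2 * T0**2))

# 2-f subsystem T1, Case 1: dispersion a, lens C1 = 1/(2a) (Eq. (11)), dispersion a
a = 400.0                                    # beta2^(1) * f1, ps^2 (illustrative)
C1 = 1.0 / (2 * a)
u2f = disperse(lens(disperse(u0, a), C1), a)

# Eq. (14): |u(t, 2f1)| = |U(t/a)| / sqrt(2*pi*a); U is known in closed form here
wt = t / a
U = np.sqrt(2 * np.pi) * T0 * np.exp(-wt**2 * T0**2 / 2) \
    * (np.exp(1j * wt * t1) + A2 * np.exp(1j * wt * t2))
target = np.abs(U) / np.sqrt(2 * np.pi * a)
assert np.max(np.abs(np.abs(u2f) - target)) < 0.02 * target.max()

# Sign-reversed T2 (Eq. (24)): the 4-f output reproduces the input, Eq. (25)
u4f = disperse(lens(disperse(u2f, -a), -C1), -a)
assert np.max(np.abs(u4f - u0)) < 1e-8
```

The second assertion holds to machine precision because the reversed-sign T2 is the exact operator inverse of T1, mirroring the unity-transfer argument above; the first assertion tests the nontrivial Fourier-transforming property.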
3. WAVELENGTH DIVISION DEMULTIPLEXER

For fiber-optic networks, tunable optical filters are desirable so that the center wavelengths of the channels to be added or dropped at a node can be dynamically changed. Tunable optical filters are typically implemented using directional couplers or Bragg gratings. Yang et al. (2008) discuss a temporal filtering technique for the implementation of a tunable optical filter. As an example, let us consider the case of a 2-channel WDM system at 40 Gb/s per channel with a channel separation of 200 GHz. Let the input to the 4-f system be the superposition of two channels as shown in Figure 2b. Here we ignore the impairments caused by the fiber-optic transmission and assume that the input to the 4-f system is the same as the transmitted multiplexed signal. In this example, we have simulated a random bit pattern consisting of 16 bits in each channel. The bit "1" is represented by a Gaussian pulse of width 12.5 ps. We assume that the bit rate is 40 Gb/s and therefore the signal bandwidth for each channel is approximately 80 GHz (return-to-zero (RZ) signal with duty cycle of 0.5). Thus, the channel separation of 200 GHz is wide enough to avoid channel interference. Assume that channel 1 is centered at the optical carrier frequency $\omega_0$ and channel 2 is centered at $\omega_0 + 2\Delta\omega$, where $2\Delta\omega$ is the channel separation. A band pass filter that is also centered at $\omega_0$ and with a double-side bandwidth of $2\Delta\omega$ can allow channel 1 to pass while it blocks channel 2. In this case, the temporal band pass filter is realized using an amplitude modulator, for instance an
FIGURE 2 A WDM demultiplexer based on a 4-f time-lens system (M = 1). (a) Input signals from channel 1 and channel 2. (b) Multiplexed output signal. (c) Combined signals before and after the temporal filter. (d) Demultiplexed signal in channel 1. (Based on Yang et al. (2008).)
electroabsorption modulator (EAM), with a time domain transfer function

$$H(t) = \begin{cases} 1, & \left|t/\left(\beta_{2}^{(1)} f_1\right)\right| \leq \Delta\omega \\ 0, & \text{otherwise,} \end{cases} \qquad (26)$$

where $\beta_{21}^{(1)} = \beta_{22}^{(1)} = \beta_{2}^{(1)}$ is the dispersion coefficient of the SMF in the 2-f subsystem T1 and $\Delta\omega/2\pi = 100$ GHz. In this section and in Section 4, we assume that $\beta_{21}^{(j)} = \beta_{22}^{(j)} = \beta_{2}^{(j)}$, $j = 1, 2$, and $M = 1$ unless otherwise specified. Considering the realistic implementation of the high-speed amplitude modulator, we need to choose the parameters of the 4-f system carefully. The state-of-the-art EAM operates up to a bit rate of 40 Gb/s (Choi et al., 2002; Fukano, Yamanaka, Tamura & Kondo, 2006), which implies that it can be turned on and off with a temporal separation of 25 ps. From Eq. (26), we see that the amplitude modulator should be turned on for a duration

$$\Delta T = 2\left|\beta_{2}^{(1)} f_1\right| \Delta\omega \qquad (27)$$
and then it is turned off. Setting $\Delta T \geq 25$ ps, we find that

$$\left|\beta_{2}^{(1)} f_1\right| \geq 20\ \mathrm{ps}^2. \qquad (28)$$

Equation (28) can be satisfied using a dispersion compensation module (DCM) with dispersion coefficient $\beta_{2}^{(1)} = 123\ \mathrm{ps^2/km}$ and length $f_1 = 1$ km. From Eq. (27), we find $\Delta T = 155$ ps. To implement the IFT in T2, we use a standard SMF with dispersion coefficient $\beta_{2}^{(2)} = -21\ \mathrm{ps^2/km}$ and length $f_2 = 5.86$ km, which leads to $\beta_{2}^{(2)} f_2 = -\beta_{2}^{(1)} f_1$ and $M = 1$.

Figure 2 shows a wavelength division demultiplexer based on a 4-f time-lens system. After the 2-f subsystem T1, we obtain the FT of the multiplexed signal in the time domain, given by

$$q(t, 2f_1^-) = \left[\tilde{U}_1(\omega) + \tilde{U}_2(\omega)\right]_{\omega = t/\left(\beta_{2}^{(1)} f_1\right)}, \qquad (29)$$

where $\tilde{U}_1$ and $\tilde{U}_2$ are the spectra of the signals for channel 1 and channel 2, respectively. Then it passes through the temporal filter defined by Eq. (26), and the output is given by

$$q(t, 2f_1^+) = H(t)\left[\tilde{U}_1(\omega) + \tilde{U}_2(\omega)\right]_{\omega = t/\left(\beta_{2}^{(1)} f_1\right)} = \left.\tilde{U}_1(\omega)\right|_{\omega = t/\left(\beta_{2}^{(1)} f_1\right)}, \qquad (30)$$
and as shown in Figure 2c, the signal from channel 2 is blocked. Thus, at the input end of the 2-f subsystem T2, only the signal from channel 1 is retained. Finally, we obtain the demultiplexed signal for channel 1 at the output of the 2-f subsystem T2 as shown in Figure 2d. According to Eq. (21), it is $q(t, 2f_1 + 2f_2) = u_1(t)$ for $m = 1$, which is identical to the original input of channel 1. As can be seen, the data in channel 1 can be successfully demultiplexed. For a practical implementation, the quadratic phase factor of Eq. (5) cannot increase indefinitely with time and therefore, a periodic time lens is introduced in Kumar (2007). It is given by
$$h_j(t) = \sum_{n=-\infty}^{+\infty} h_{0j}\left(t - n t_f\right), \qquad (31)$$
FIGURE 3 Input and output bit sequences of the WDM demultiplexer based on a 4-f time-lens system. (a) Input. (b) Output with time reversal, M = −1. (c) Output without time reversal, M = +1. Guard time tg = 0 and t f = 400 ps. (Based on Yang et al. (2008).)
where

$$h_{0j}(t) = \begin{cases} \exp\left(iC_j t^2\right), & |t| < \dfrac{t_f + t_g}{2} \\ 0, & \text{otherwise,} \end{cases} \qquad j = 1, 2, \qquad (32)$$
where $t_f$ is the frame time and $t_g$ is the guard time between the frames. Based on the periodic time lenses, we redo the above simulation with a random bit pattern consisting of 3 frames (= 48 bits) in each channel. We assume the frame time $t_f = 400$ ps and therefore, a frame consists of 16 bits. Figure 3a shows the input bit sequence of channel 1. Figures 3b and 3c show the demultiplexed signals for $M = -1$ (time reversal) and $M = 1$ (without time reversal), respectively. For the case of $M = -1$, we choose the same parameters as for the case of $M = 1$ except that $\beta_{2}^{(1)} f_1 = \beta_{2}^{(2)} f_2$. From Figure 3b, we see that the pulses at the edges of frames are distorted, whereas the output pulses without time reversal as shown in Figure 3c suffer only some minor distortions. As discussed in Kumar (2007), the time-reversal system introduces distortion if there is no guard time between the frames because the bits at the left and right edges of a frame are imaged by the neighboring (in the time domain) time lenses. As $\beta_{2}^{(1)}$ increases, the distortion also increases because pulses at the edges of a frame broaden more due to the high dispersion of the 2-f subsystems and occupy the neighboring frames. In the modified 4-f configuration ($M = +1$) of Yang et al. (2008), the distortion
FIGURE 4 Input and output bit sequences of the WDM demultiplexer based on a 4-f time-lens system. (a) Input. (b) Output with time reversal, M = −1. (c) Output without time reversal, M = +1. Guard time tg = 50 ps and t f = 400 ps. (Based on Yang et al. (2008).)
is less (Figure 3c) than with M = −1 (Figure 3b). This is probably because the distortion introduced by the first 2-f system is suppressed by that due to the second 2-f system since the signs of the dispersions are reversed. These distortions can be avoided by introducing a guard time between frames, as shown in Figure 4. Figure 4a shows the input bit sequence with a guard time of 50 ps. Figures 4b and 4c show the output signals with and without time reversal, respectively. From Figures 4b and 4c, we find that the distortion is effectively eliminated by adding the guard time between frames. The advantage of this scheme is that the channels to be demultiplexed at a node can be dynamically reconfigured. For example, if channel 1 has to be blocked instead of channel 2, the transmittivity of the amplitude modulator [Eq. (26)] can be dynamically changed to
$$H(t) = \begin{cases} 1, & \left|\dfrac{t}{\beta_{2}^{(1)} f_1} - 2\Delta\omega\right| \leq \Delta\omega \\ 0, & \text{otherwise} \end{cases} \qquad (33)$$
such that channel 1 is in the stop band of the temporal filter and channel 2 is in the passband.
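The demultiplexer of this section can be sketched end to end with the same kind of FFT-based operators. The code below is an illustration, not a reproduction of the chapter's simulation: it uses 8 bits per channel instead of 16, an ideal rectangular gate, and an arbitrary bit pattern, while the values β₂⁽¹⁾f₁ = 123 ps² and Δω/2π = 100 GHz follow the text. Channel 2 is modeled as a complex envelope offset by the channel separation 2Δω; the helper names (disperse, lens, pattern) are ours.

```python
import numpy as np

N = 8192
dt = 0.125                                   # ps
t = (np.arange(N) - N // 2) * dt
w = 2 * np.pi * np.fft.fftfreq(N, d=dt)      # rad/ps

def disperse(u, b2f):
    """Accumulated dispersion b2f = beta2*length (ps^2)."""
    return np.fft.fft(np.fft.ifft(u) * np.exp(1j * b2f * w**2 / 2))

def lens(u, Cj):
    """Time-lens quadratic phase, Eq. (5)."""
    return u * np.exp(1j * Cj * t**2)

def pattern(bits, T0, spacing):
    """RZ bit pattern: one Gaussian per '1', centered in its bit slot."""
    u = np.zeros(N, dtype=complex)
    for k, b in enumerate(bits):
        if b:
            tk = (k - (len(bits) - 1) / 2) * spacing
            u += np.exp(-(t - tk)**2 / (2 * T0**2))
    return u

# Two 40-Gb/s channels; channel 2 is offset by the 200-GHz separation 2*dw
T0 = 12.5 / 2.355                            # Gaussian with 12.5-ps FWHM
dw = 2 * np.pi * 0.1                         # Delta-omega, rad/ps (100 GHz)
ch1 = pattern([1, 0, 1, 1, 0, 0, 1, 0], T0, 25.0)
ch2 = pattern([0, 1, 1, 0, 1, 0, 0, 1], T0, 25.0) * np.exp(1j * 2 * dw * t)
u0 = ch1 + ch2

# 2-f subsystem T1 with a = beta2^(1)*f1 = 123 ps^2, as in the text
a, C1 = 123.0, 1.0 / (2 * 123.0)
q = disperse(lens(disperse(u0, a), C1), a)

# Temporal gate of Eq. (26): pass |t/a| <= dw, a window of 2*a*dw ~ 155 ps
q = q * (np.abs(t / a) <= dw)

# Sign-reversed T2 recovers channel 1 without time reversal (M = 1)
out = disperse(lens(disperse(q, -a), -C1), -a)
assert np.max(np.abs(out - ch1)) < 0.05 * np.max(np.abs(ch1))
```

Shifting the gate to the Fourier-plane position of channel 2, as in Eq. (33), would instead select that channel; only the drive to the amplitude modulator changes, which is the tunability argument made above.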
4. DISPERSION COMPENSATOR

The following example is the implementation of a higher-order fiber dispersion compensator based on the temporal filtering technique. As the bit rate of a single channel increases, its spectrum broadens, and the higher-order dispersion leads to pulse distortion that limits the performance of the transmission system. Various dispersion compensation techniques have been developed. The most commercially advanced technique is negative-dispersion fiber-based dispersion compensation, but the disadvantage is that, in practice, it is hard to design and manufacture a dispersion-compensating fiber with negative dispersion slope due to its higher sensitivity to waveguide profile fluctuations (Srikant, 2001). Yang et al. (2008) discuss a dispersion compensation technique based on the time-lens-based temporal filtering technique that can compensate for any order of fiber dispersion in fiber-optic transmission systems. In this application, the temporal filter is realized using a phase modulator instead of an amplitude modulator and its time domain transfer function is given by

$$H(t) = \left.\exp\left(-i\sum_{n \geq 2} \frac{\omega^n}{n!} \frac{d^n \beta}{d\omega^n} L\right)\right|_{\omega = t/\left(\beta_{2}^{(1)} f_1\right)}, \qquad (34)$$
where $d^n\beta/d\omega^n$ is the $n$th-order dispersion. To realize the transfer function given by Eq. (34), an arbitrary waveform generator (AWG) is required to drive the phase modulator. Considering that the bit sequence is processed frame by frame, if the frame time is $t_f$, then for each frame, the frequency resolution is given by

$$\Delta f = \frac{1}{t_f}. \qquad (35)$$

Before using the temporal filter we obtain the FT of the input signal in the time domain as $\tilde{U}\left(t/\beta_{2}^{(1)} f_1\right)$. Therefore, the corresponding time resolution is given by

$$\Delta t = \left|\beta_{2}^{(1)} f_1\right| (2\pi \Delta f). \qquad (36)$$

The maximum achievable bandwidth, $B_{\max}$, of an AWG using the current technology is ${\sim}12.5$ GHz (Kondratko et al., 2005). Therefore, we obtain

$$\Delta t \geq \frac{1}{B_{\max}} = 80\ \mathrm{ps}. \qquad (37)$$
Using Eqs. (35) and (36) in Eq. (37), we obtain the following constraint on the parameters of the time-lens system:

$$\left|\beta_{2}^{(1)} f_1\right| \geq \frac{t_f}{2\pi B_{\max}}. \qquad (38)$$

From Eq. (38), we see that if the frame is too wide, a fiber with a higher dispersion coefficient is required when the bandwidth of the AWG is fixed. On the other hand, if the frame is too narrow, synchronization of the various modulators must be done at a higher frame rate ($= 1/t_f$). Usually synchronization is done by extracting the clock from the signal (Nakazawa, Yamada, Kubota & Suzuki, 1991), and synchronization is easier at lower frame rates. From Eq. (38), we also see that if the bandwidth of the AWG is too small, we need a fiber with large dispersion. We choose dispersion coefficient $\beta_{2}^{(1)} = 123\ \mathrm{ps^2/km}$ and the frame time $t_f = 100$ ps. From Eq. (38), we find $f_1 = 10.4$ km. With the above choice of frame time, the synchronization of the frame needs to be done by extracting a clock at 10 GHz ($= 1/t_f$). Without loss of generality, in this example only the second- and third-order fiber dispersion effects are taken into account. Therefore, the temporal filter simplifies to

$$H(t) = \left.\exp\left(-i\omega^2 \beta_2 L/2 - i\omega^3 \beta_3 L/6\right)\right|_{\omega = t/\left(\beta_{2}^{(1)} f_1\right)}, \qquad (39)$$
where $\beta_2$ and $\beta_3$ are the second- and third-order dispersions of the transmission fiber, respectively, $L$ is the transmission distance, and $\beta_{2}^{(1)}$ is the dispersion coefficient of the SMF in the 2-f subsystem T1. In the simulation, we ignore the nonlinear effect and amplifier noise. The bit "1" is represented by a Gaussian pulse with a width of 25 ps, and the input bit sequence is shown in Figure 5a. We simulate a random bit pattern with 20 bits (10 frames), and the bit rate is 20 Gb/s such that there are two bits within each frame in this case. The guard time is chosen as $t_g = 12$ ps. The second-order dispersion of the transmission fiber is $\beta_2 = -21\ \mathrm{ps^2/km}$ and the transmission distance is 10 km. To highlight the effect of the third-order dispersion, we set $\beta_3 = 20\ \mathrm{ps^3/km}$, rather than the standard value of $0.1\ \mathrm{ps^3/km}$, but this does not affect the generality of the dispersion compensation technique. After propagation along the transmission fiber with both the second- and third-order dispersions, each bit has broadened beyond its bit interval and hence the output signal after fiber transmission is distorted as shown in Figure 5b, which is then input to the 4-f system. As analyzed previously, the 2-f subsystem T1 provides the FT of the signal input to the 4-f system in the time domain such that
FIGURE 5 A dispersion compensator based on a 4-f time-lens system. t f = 100 ps, tg = 12 ps, and M = 1. (a) Input signal. (b) Output signal after fiber propagation. (c) Dispersion compensator based on the time-lens system. (d) Output after the dispersion compensator. (Based on Yang et al. (2008).)
$$q(t, 2f_1^-) = \left.\tilde{U}(\omega) \exp\left(i\omega^2 \beta_2 L/2 + i\omega^3 \beta_3 L/6\right)\right|_{\omega = t/\left(\beta_{2}^{(1)} f_1\right)}, \qquad (40)$$

where $\tilde{U}(\omega)$ is the FT of the launched bit sequence as shown in Figure 5a and the exponential part is the linear transfer function of the fiber transmission system. Thus, combining Eqs. (39) and (40), after the temporal filter we obtain

$$q(t, 2f_1^+) = q(t, 2f_1^-)\, H(t) = \left.\tilde{U}(\omega)\right|_{\omega = t/\left(\beta_{2}^{(1)} f_1\right)}, \qquad (41)$$
and it is shown from Eq. (41) that the phase shift caused by the transmission fiber has been completely canceled out by the temporal filter. Since $M = 1$, the 2-f subsystem T2 provides the exact IFT of $q(t, 2f_1^+) = \tilde{U}\left(t/\beta_{2}^{(1)} f_1\right)$ so that $q(t, 2f_1 + 2f_2) = u(t, 0)$, where $u(t, 0)$ is the input signal of the fiber transmission system. Figure 5 demonstrates the process of the dispersion compensation undertaken by the time-lens–based temporal filtering system.
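The cancellation argument of Eqs. (39)-(41) can also be sketched numerically. The example below is illustrative (the value β₂⁽¹⁾f₁ = 400 ps², the 4-bit pattern, and the helper names apply_freq and lens are our own choices, and frame-by-frame processing and the AWG bandwidth limit are ignored), but it uses the chapter's fiber values β₂ = −21 ps²/km, β₃ = 20 ps³/km, and L = 10 km: the phase filter applied in the Fourier plane at ω = t/(β₂⁽¹⁾f₁) cancels the fiber's dispersive phase and restores the launched bits.

```python
import numpy as np

N = 8192
dt = 0.125                                   # ps
t = (np.arange(N) - N // 2) * dt
w = 2 * np.pi * np.fft.fftfreq(N, d=dt)      # rad/ps

def apply_freq(u, P):
    """Multiply the spectrum of u by the transfer function P sampled on w."""
    return np.fft.fft(np.fft.ifft(u) * P)

def lens(u, Cj):
    """Time-lens quadratic phase, Eq. (5)."""
    return u * np.exp(1j * Cj * t**2)

# Launched 20-Gb/s pattern: Gaussians with 25-ps FWHM in 50-ps bit slots
T0 = 25.0 / 2.355
u0 = np.zeros(N, dtype=complex)
for k, b in enumerate([1, 1, 0, 1]):
    if b:
        u0 += np.exp(-(t - (k - 1.5) * 50.0)**2 / (2 * T0**2))

# Transmission fiber with second- and third-order dispersion (phase of Eq. (40))
B2L = -21.0 * 10.0                           # beta2*L, ps^2
B3L = 20.0 * 10.0                            # beta3*L, ps^3
ufib = apply_freq(u0, np.exp(1j * B2L * w**2 / 2 + 1j * B3L * w**3 / 6))
assert np.max(np.abs(ufib - u0)) > 0.2       # visibly distorted

# 2-f subsystem T1, temporal phase filter of Eq. (39), reversed 2-f subsystem T2
a, C1 = 400.0, 1.0 / (2 * 400.0)
q = apply_freq(lens(apply_freq(ufib, np.exp(1j * a * w**2 / 2)), C1),
               np.exp(1j * a * w**2 / 2))
q = q * np.exp(-1j * B2L * (t / a)**2 / 2 - 1j * B3L * (t / a)**3 / 6)
Pm = np.exp(-1j * a * w**2 / 2)
out = apply_freq(lens(apply_freq(q, Pm), -C1), Pm)
assert np.max(np.abs(out - u0)) < 0.02 * np.max(np.abs(u0))
```

The filter is the pointwise inverse of the fiber's transfer function sampled at ω = t/(β₂⁽¹⁾f₁), so the cancellation is exact at every grid point; in practice the accuracy would be set by how finely the AWG can approximate this phase, per the constraint of Eq. (38).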
FIGURE 6 Input and output bit sequences of the dispersion compensator based on a 4-f time-lens system (M = 1 and t f = 100 ps). (a) Input. (b) Output with guard time tg = 12 ps. (c) Output with guard time tg = 24 ps. (Based on Yang et al. (2008).)
Figure 5d shows that there is some distortion of the output signal since the guard time (= 12 ps) used in this simulation is quite small. To avoid this distortion, we use the guard time $t_g = 24$ ps and redo the above simulation. Figure 6 shows the simulation results with $t_g = 12$ ps and $t_g = 24$ ps; the rest of the parameters are the same as those used in Figure 5. Figure 6a shows the input bit sequence (the same as Figure 5a). Figure 6b (the same as Figure 5d) and Figure 6c show the output signals for the cases of $t_g = 12$ ps and $t_g = 24$ ps, respectively. As can be seen, the distortion is more effectively suppressed in the case of $t_g = 24$ ps compared with the case of $t_g = 12$ ps.
5. OPTICAL IMPLEMENTATION OF ORTHOGONAL FREQUENCY-DIVISION MULTIPLEXING USING TIME LENSES

Orthogonal frequency-division multiplexing (OFDM) is widely used in wireless communication systems and has recently drawn significant attention in optical communication (Djordjevic & Vasic, 2006; Lowery & Armstrong, 2005). OFDM is a form of multiple-subcarrier modulation in which the subcarriers are orthogonal. Kumar and Yang (2008) proposed an optical implementation of OFDM using time lenses. Figure 7 shows an example of a fiber-optic communication system based on OFDM. Single-channel input data are converted into N parallel data paths using serial-to-parallel conversion. Let the symbol interval of the input to the inverse fast Fourier transform (IFFT) be T_block and each of the parallel
FIGURE 7 Block diagram of a conventional OFDM system. (Based on Kumar and Yang (2008).)
FIGURE 8 Block diagram of the time-lens–based scheme. (Based on Kumar and Yang (2008).)
data channels consists of complex data a_k, k = 1, 2, ..., N, within each block. These signals are multiplied by subcarriers and multiplexed together using the IFFT, which processes data on a block-by-block basis. The subcarriers are sinusoids with frequencies that are integer multiples of 1/T_block, which makes them mutually orthogonal. The output of the IFFT is up-converted to the radio-frequency (RF) domain after parallel-to-serial and digital-to-analog (D/A) conversion. An optical carrier is intensity modulated by the RF signal consisting of an OFDM band using an optical modulator. At the receiver, the RF signal is down-converted to baseband, and the input complex data a_k can be recovered from the output of the fast Fourier transform (FFT). Signals in adjacent blocks could interfere with each other due to fiber dispersion, leading to interblock interference, which can be avoided by introducing guard intervals between blocks (Djordjevic & Vasic, 2006; Lowery & Armstrong, 2005). Figure 8 shows an optical implementation of OFDM using time lenses. Let the input optical field envelope be ũ_in(t'). We
define the FT pairs as

$$ \tilde{u}(t') = F[u(t);\, t \to t'] = \int_{-\infty}^{\infty} u(t)\exp(-i2\pi t' t)\,dt \tag{42} $$

and

$$ u(t) = F^{-1}[\tilde{u}(t');\, t' \to t] = \int_{-\infty}^{\infty} \tilde{u}(t')\exp(i2\pi t' t)\,dt'. \tag{43} $$
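For comparison with the continuous transforms used below, the discrete IFFT/FFT multiplexing of the conventional OFDM system (Figure 7) can be sketched in a few lines; the block size and modulation here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16                                  # subcarriers per block (illustrative)

# One block of complex data symbols a_k (QPSK-like, for illustration)
a = (rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)) / np.sqrt(2)

# Transmitter: the IFFT multiplexes the symbols onto orthogonal subcarriers
# (frequencies k / T_block); the resulting block is then sent serially.
tx_block = np.fft.ifft(a)

# Receiver: the FFT separates the subcarriers and recovers the symbols.
a_hat = np.fft.fft(tx_block)

assert np.allclose(a_hat, a)            # orthogonality gives perfect recovery
```

In a real system the block is up-converted, transmitted, and down-converted before the FFT; the sketch only shows the transform pair that the time-lens scheme replaces with continuous FT/IFT operations.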
The IFT and FT are implemented using time lenses. Propagation of an input optical field envelope in a 2-f system consisting of dispersive elements and a phase modulator (time lens) results in either the FT or the IFT of the input signal at the output of the 2-f system (Jannson, 1983; Lohmann & Mendlovic, 1992). Whether the output of the 2-f system is the FT or the IFT depends on the signs of the second-order dispersion coefficients and the chirp factors of the phase modulator (Yang et al., 2008). We choose

$$ \beta_{21}^{(1)} f_1 = \beta_{22}^{(1)} f_1 = \beta_2^{(1)} f_1, \qquad \beta_{21}^{(2)} f_2 = \beta_{22}^{(2)} f_2 = -\beta_2^{(1)} f_1, \tag{44} $$

so that the first 2-f subsystem provides the IFT and the second 2-f subsystem provides the direct FT. Defining $S_j = \beta_2^{(j)} f_j$, $j = 1, 2$, as the accumulated second-order dispersion of the 2-f subsystems T1 and T2, respectively, we have $S_1 = -S_2$. The temporal filter in this application is characterized by the transfer function of the fiber link. Using the time-lens–based 2-f system, the IFT of the input field envelope $\tilde{u}_{in}(t')$ is given by (Lohmann & Mendlovic, 1992)

$$ u_{out}^{IFT}(t) = F^{-1}[\tilde{u}_{in}(t');\, t' \to t/(2\pi S_1)]/\sqrt{-i2\pi S_1} = u_{in}(t/(2\pi S_1))/\sqrt{-i2\pi S_1}. \tag{45} $$
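The dispersion–time-lens–dispersion construction behind Eq. (45) can be checked numerically: with the matched chirp C = 1/(2S) (cf. Eq. (66) of Appendix A), the 2-f system maps the input spectrum onto time. A sketch in dimensionless units with a Gaussian test pulse (the grid and parameter values are illustrative):

```python
import numpy as np

# Verify that dispersion - time lens - dispersion acts as a Fourier transformer:
# |u_out(t)| ∝ |ũ_in(t / S)| for accumulated dispersion S per arm and the
# matched chirp C = 1/(2S). Dimensionless units; parameters are illustrative.
n = 8192
t = np.linspace(-100.0, 100.0, n, endpoint=False)
dt = t[1] - t[0]
w = 2 * np.pi * np.fft.fftfreq(n, dt)          # angular-frequency grid

def disperse(u, S):
    """Propagate through a dispersive element with accumulated dispersion S."""
    return np.fft.ifft(np.fft.fft(u) * np.exp(0.5j * S * w**2))

S = 5.0
C = 1.0 / (2.0 * S)                            # time-lens chirp (cf. Eq. (66))

u_in = np.exp(-0.5 * t**2)                     # Gaussian test pulse
u_out = disperse(disperse(u_in, S) * np.exp(1j * C * t**2), S)

# For this Gaussian, |spectrum| ∝ exp(-w^2 / 2); the 2-f output should be that
# spectrum mapped onto time via w = t / S (up to a residual phase factor).
expected = np.exp(-0.5 * (t / S) ** 2)
a = np.abs(u_out) / np.abs(u_out).max()
assert np.allclose(a, expected, atol=1e-4)
```

Only magnitudes are compared, because the 2-f output generally carries an extra quadratic phase (see Eq. (68)); a symmetric Gaussian also hides the FT/IFT distinction, which in practice is set by the signs of S and C.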
The input and output of the IFT (T1) are serial; therefore, there is no need for serial-to-parallel or parallel-to-serial conversion. The signal $u_{out}^{IFT}(t)$ is transmitted over the fiber-optic link and, at the receiver, the output of the fiber-optic link, $u_{out}^{fiber}(t)$, passes through a Fourier transformer (T2) whose output is

$$ \tilde{u}_{out}^{FT}(t') = \frac{1}{\sqrt{-i2\pi S_2}}\, F[u_{out}^{fiber}(t);\, t \to t'/(2\pi S_2)] = \tilde{u}_{out}^{fiber}(t'/(2\pi S_2))/\sqrt{-i2\pi S_2}. \tag{46} $$
Dong Yang et al.
After passing through the direct-detection optical receiver, the photodiode current is given by

$$ I(t') = |\tilde{u}_{out}^{FT}(t')|^2, \tag{47} $$

assuming unity responsivity of the photodiode. Consider the optical OFDM signal $u_{out}^{IFT}(t)$ propagating in a fiber whose linear transfer function is (Agrawal, 1997)

$$ H_f(f) = \exp[i\theta(f)L], \tag{48} $$

where

$$ \theta(f) = \beta_2 (2\pi f)^2/2 + \beta_3 (2\pi f)^3/6 + \beta_4 (2\pi f)^4/24 + \cdots, \tag{49} $$

L is the fiber length, and $\beta_n$ is the nth-order dispersion coefficient. In the absence of noise and nonlinearity, the fiber output is

$$ u_{out}^{fiber}(t) = u_{out}^{IFT}(t) * h_f(t), \tag{50} $$

where * denotes convolution and $h_f(t) = F^{-1}[H_f(f)]$ is the fiber impulse response function. Convolution in the time domain becomes a product in the frequency domain; therefore, after the Fourier transformer (see Figure 8), the output signal can be written as

$$ \tilde{u}_{out}^{FT}(t') = \frac{1}{\sqrt{-i2\pi S_2}}\, F\{u_{out}^{IFT}(t);\, t \to t'/(2\pi S_2)\}\, F\{h_f(t);\, t \to t'/(2\pi S_2)\}. \tag{51} $$

Using Eqs. (45) and (48), we obtain

$$ \tilde{u}_{out}^{FT}(t') = \tilde{u}_{in}(t')\exp[iL\theta(t'/(2\pi S_2))]. \tag{52} $$

From Eq. (47), the photocurrent is

$$ I(t') = |\tilde{u}_{in}(t')|^2. \tag{53} $$
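The chain leading from Eq. (45) to Eq. (53) is easy to check numerically: sandwiched between an IFT and an FT, the fiber's dispersive phase survives only as a spectral-domain phase factor, which direct detection discards. A minimal sketch with discrete transforms standing in for the time-lens IFT and FT (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1024
f = np.fft.fftfreq(n)                       # normalized frequency grid

# Arbitrary dispersive phase theta(f) * L with beta2, beta3, beta4 terms
# (coefficients illustrative, dimensionless)
theta_L = 40.0 * f**2 + 300.0 * f**3 + 2000.0 * f**4

u_in = rng.normal(size=n) + 1j * rng.normal(size=n)   # input envelope ũ_in(t')

u_ifft = np.fft.ifft(u_in)                  # time-lens IFT (T1), cf. Eq. (45)
# Fiber: multiply the spectrum of the transmitted signal by H_f = exp(i*theta*L)
u_fiber = np.fft.ifft(np.fft.fft(u_ifft) * np.exp(1j * theta_L))
u_ft = np.fft.fft(u_fiber)                  # time-lens FT (T2), cf. Eq. (46)

I = np.abs(u_ft) ** 2                       # direct detection, Eq. (47)
assert np.allclose(I, np.abs(u_in) ** 2)    # dispersion leaves no trace, Eq. (53)
```

Any unitary phase-only transfer function would cancel the same way, which is why the scheme handles all dispersion orders at once but, as discussed below, not nonlinearity.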
Thus, we see that the effect of fiber dispersion is a mere phase shift [Eq. (52)] in this system; therefore, dispersive effects completely disappear after direct detection, and the photocurrent is directly proportional to the input power. So far we have assumed that the aperture of the time lens is infinite. To process the input signal on a block-by-block basis, periodic time lenses with a finite aperture should be introduced (Kumar, 2007; Yang et al., 2008). In this
case, the phase modulator multiplies the incident optical field envelope by a function

$$ h(t) = \sum_{n=-\infty}^{\infty} h_0(t - nT_{ofdm}), \tag{54} $$

$$ h_0(t) = \exp(i\alpha t^2) \quad \text{for } |t| \le T_{block}/2, \qquad = 0 \text{ elsewhere}, \tag{55} $$

and the input signal is of the form

$$ \tilde{u}_{in}(t') = \sum_{n=-\infty}^{\infty} u_n(t' - nT_{ofdm}), \tag{56} $$

$$ u_n(t') = f_n(t') \quad \text{for } |t'| \le T_{block}/2, \qquad = 0 \text{ elsewhere}, \tag{57} $$

where $f_n(t')$ is an arbitrary input signal in block n and $T_{ofdm}$ is the total width of the OFDM signal. The difference $T_{ofdm} - T_{block}$ is the guard time introduced between consecutive blocks to ensure that interblock interference is not significant. Consider the input signal $u_n(t')$, which is limited to the interval $[-T_{block}/2, T_{block}/2]$. Let the highest-frequency component of the signal $u_n(t')$ be $f_{max}$. After passing through the IFT, the corresponding signal should be confined within the interval $[-T_{block}/2, T_{block}/2]$. Using Eq. (45), this condition leads to

$$ |S_1| \le T_{block}/(2 f_{max}), \tag{58} $$
which gives an upper bound for |S₁|. There are a few differences between conventional OFDM and the scheme proposed by Kumar and Yang (2008) (see Figure 8). First, conventional OFDM uses the discrete FT, whereas the scheme of Figure 8 uses the continuous FT. Second, conventional OFDM requires high-speed digital signal processors to compute the FFT and IFFT, and the computational time increases as the aggregate bit rate becomes higher. In contrast, the FT and IFT operations of Figure 8 are almost instantaneous, except for the small propagation delays in the dispersive elements. This time-lens–based scheme should not be considered a method to compensate for the second-order dispersion of the transmission fiber, because the dispersive elements of the FT and IFT must have both positive and negative second-order dispersion coefficients. The technique can, however, be used to compensate for third- and higher-order dispersion. To illustrate this point, numerical simulations of a dispersion-managed transmission system are carried out
FIGURE 9 Plot of input and output powers vs time. The solid line denotes output power without using FT and IFT. The dotted line shows input power and the crosses show output power with FT and IFT. (Based on Kumar and Yang (2008).)
using the following parameters: bit rate = 320 Gb/s; second- (β₂), third- (β₃), and fourth-order (β₄) dispersion of the transmission fiber (SMF) of −21.6 ps²/km, 0.05 ps³/km, and 1.69 × 10⁻⁴ ps⁴/km, respectively; and total transmission distance = 400 km. The commercially available dispersion-compensating module (DCM) has a zero dispersion slope; therefore, we choose β₂ = 128 ps²/km, β₃ = 0, and β₄ = 0 for the DCM. The nonlinear coefficients of the SMF and DCM are set to zero. Fiber losses are 0.2 dB/km and 0.5 dB/km for the SMF and DCM, respectively, and are fully compensated by amplifiers placed 80 km apart. The amplifiers are assumed to be noiseless so that the impact of dispersion can be seen clearly. The second-order dispersion of the SMF is fully compensated by the inline DCM placed just before each amplifier. u_n(t') is a random bit sequence consisting of Gaussian pulses with an FWHM of 1.56 ps, T_block = 6.4 ns, T_ofdm = 7.2 ns, and S₁ = 397 ps². The dotted line, crosses, and solid line in Figure 9 show the input power (|ũ_in(t')|²), the output power (I(t')) with FT and IFT, and the output power without FT and IFT, respectively. Four blocks of data are simulated, and Figure 9 shows part of the data within one block. As can be seen, there is significant distortion due to β₃ and β₄ when the FT and IFT are not used; these higher-order dispersive effects are suppressed by the scheme of Figure 8. The transmission distance can be increased further without distortion by increasing T_block.
In Figure 10, we compare the bit error rate (BER) of the time-lens–based scheme (see Figure 8) and the system without FT and IFT. In this simulation, amplifier noise is turned on and 4 million bits are used at the input; all other parameters are the same as in Figure 9, except that T_block is varied. The optical signal-to-noise ratio (OSNR) is calculated using a bandwidth
FIGURE 10 Comparison of BER (log₁₀ BER vs. OSNR) for OFDM-based systems with different T_block (3.84 ns and 6.4 ns) and the conventional OOK system without OFDM. (Based on Kumar and Yang (2008).)
FIGURE 11 Nonlinear performance of the time-lens–based scheme. (Based on Kumar and Yang (2008).)
of 0.1 nm. In the absence of the FT and IFT, Figure 10 shows that the BER is nearly independent of the OSNR, indicating that the degradation is mainly due to higher-order dispersion. As the OFDM symbol interval increases, the BER decreases, since a larger T_block corresponds to lower subchannel bit rates and therefore higher dispersion tolerance. Figure 11 shows the evolution of pulses in the presence of fiber nonlinearity. The parameters are the same as in Figure 9, except that the nonlinear coefficients of the SMF and DCM are 1.25 W⁻¹km⁻¹ and 5 W⁻¹km⁻¹, respectively, and the launch peak power is 4 mW. As can be seen, there is pulse distortion due to intrachannel nonlinear impairments, which should be expected
since this time-lens–based scheme does not compensate for nonlinear impairments. Note that the FT and IFT should be properly synchronized to ensure that they process the data on a block-by-block basis. However, the synchronization needs to be done only at the OFDM symbol rate (=1/Tofdm ), which is much lower than the aggregate bit rate.
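As a sanity check, the aperture bound of Eq. (58) can be evaluated for the simulation parameters above (T_block = 6.4 ns, 1.56-ps FWHM Gaussian pulses, S₁ = 397 ps²). Estimating f_max from the Gaussian time-bandwidth product is our assumption, not the authors':

```python
# Quick check of the aperture bound, Eq. (58), using the simulation parameters
# quoted in the text (T_block = 6.4 ns, Gaussian pulses with 1.56 ps FWHM,
# S1 = 397 ps^2). Taking f_max as the pulses' spectral FWHM, 0.441 / FWHM,
# is an illustrative assumption.
T_block = 6.4e-9            # s
fwhm = 1.56e-12             # s
S1 = 397e-24                # s^2

f_max = 0.441 / fwhm        # ~0.28 THz spectral width of the pulses (assumed)
bound = T_block / (2 * f_max)

assert abs(S1) < bound      # 397 ps^2 sits well below the ~1e4 ps^2 bound
```

With this estimate the chosen S₁ satisfies Eq. (58) with a wide margin, consistent with the undistorted output reported in Figure 9.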
6. CONCLUSIONS

Various issues in the design of a time-lens–based optical signal processing system have been discussed. The 2-f subsystem T2 is chosen to be anti-symmetric with respect to the 2-f subsystem T1, that is, β₂^(2) = −β₂^(1) and C₂ = −C₁, so that T2 provides the exact inverse Fourier transform and, as a result, the bit sequence at the output is not time-reversed. In contrast, in the 4-f configuration of previous work (Lohmann & Mendlovic, 1992), the sign of the dispersion of the fiber in the 2-f subsystem T2 is identical to that in T1; T2 therefore provides the FT of its input, which leads to a time-reversed output bit pattern and requires additional signal processing in the optical or electrical domain to recover the original bit sequence.
We have discussed three applications of the time-lens–based optical signal processing scheme. First, a possible implementation of a tunable wavelength-division demultiplexer was demonstrated; the temporal filter in this example is an amplitude modulator. The advantage of this time-lens–based wavelength-division demultiplexer is that the channels to be demultiplexed at a node of an optical network can be chosen dynamically by changing the passbands of the temporal filter. Second, a temporal filtering technique for implementing a higher-order fiber dispersion compensator was discussed; in this case, the temporal filter is realized by a phase modulator. The transfer function of the filter can be expressed as a sum of independent terms, each corresponding to the phase shift caused by a given order of fiber dispersion, so the technique can flexibly compensate for any order of fiber dispersion simply by modifying the transfer function of the temporal filter. Finally, an implementation of orthogonal frequency-division multiplexing in the optical domain using the Fourier-transforming properties of time lenses was discussed.
In this application, the first 2-f subsystem (T1) takes the role of an inverse Fourier transformer and the second 2-f subsystem (T2) works as a Fourier transformer. The temporal filter between T1 and T2 is characterized by the linear transfer function of a fiber-optic link. The output of the Fourier transformer is the original input signal of the IFT (T1) multiplied by a phase factor due to fiber dispersive effects. Since the photocurrent is proportional to the absolute square of the optical field, dispersive effects are eliminated with this approach.
ACKNOWLEDGMENT

The authors thank the Natural Sciences and Engineering Research Council of Canada for research grant support.
APPENDIX A

The output signal at the end of the 2-f subsystem T1 is given by

$$ u(t, 2f_1) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} \tilde{U}(\omega, f_1^+)\exp\left(\frac{i}{2}\beta_{22}^{(1)} f_1\omega^2 - i\omega t\right) d\omega, \tag{59} $$

where

$$ \tilde{U}(\omega, f_1^+) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \tilde{U}(\omega-\Omega, f_1^-)\,\tilde{H}_1(\Omega)\,d\Omega. \tag{60} $$

Substituting Eq. (60) into Eq. (59), we obtain

$$ u(t, 2f_1) = \left(\frac{1}{2\pi}\right)^2 \int_{-\infty}^{+\infty}\int_{-\infty}^{\infty} \tilde{U}(\omega-\Omega, f_1^-)\,\tilde{H}_1(\Omega) \exp\left(\frac{i}{2}\beta_{22}^{(1)} f_1\omega^2 - i\omega t\right) d\Omega\, d\omega. \tag{61} $$

From Eqs. (7) and (6), we have

$$ \tilde{U}(\omega, f_1^-) = \tilde{U}(\omega, 0)\exp\left(\frac{i}{2}\beta_{21}^{(1)} f_1\omega^2\right) \tag{62} $$

and

$$ \tilde{H}_1(\omega) = \sqrt{i\pi/C_1}\,\exp\left(-i\frac{\omega^2}{4C_1}\right). \tag{63} $$

Let ω′ = ω − Ω in Eq. (61). Inserting Eqs. (62) and (63) into Eq. (61) and rearranging terms, we obtain

$$ u(t, 2f_1) = \left(\frac{1}{2\pi}\right)^2 \sqrt{\frac{i\pi}{C_1}} \int_{-\infty}^{\infty} \phi(\omega')\,\tilde{U}(\omega', 0)\, \exp\left(\frac{i}{2}\beta_{21}^{(1)} f_1\omega'^2\right) \exp\left(\frac{i}{2}\beta_{22}^{(1)} f_1\omega'^2 - i\omega' t\right) d\omega', \tag{64} $$

where

$$ \phi(\omega') = \int_{-\infty}^{+\infty} \exp\left[\left(\frac{i}{2}\beta_{22}^{(1)} f_1 - \frac{i}{4C_1}\right)\Omega^2\right] \exp\left(i\beta_{22}^{(1)} f_1\Omega\omega' - i\Omega t\right) d\Omega. \tag{65} $$

The Ω² terms appearing in the argument of the exponent in Eq. (65) are eliminated if the chirp coefficient C₁, the dispersion β₂₂⁽¹⁾, and the fiber length f₁ are related by

$$ C_1 = \frac{1}{2\beta_{22}^{(1)} f_1}. \tag{66} $$

Hence, Eq. (65) becomes

$$ \phi(\omega') = \int_{-\infty}^{+\infty} \exp\left(i\beta_{22}^{(1)} f_1\Omega\omega' - i\Omega t\right) d\Omega = 2\pi\delta\left(t - \beta_{22}^{(1)} f_1\omega'\right), \tag{67} $$

where δ(·) is the delta function. Inserting Eq. (67) into Eq. (64), we finally obtain

$$ u(t, 2f_1) = \frac{\sqrt{i\pi/C_1}}{2\pi\beta_{22}^{(1)} f_1}\, \tilde{U}\left(\frac{t}{\beta_{22}^{(1)} f_1}, 0\right) \exp\left\{-it^2\left[\frac{1}{\beta_{22}^{(1)} f_1} - \frac{\beta_{21}^{(1)}+\beta_{22}^{(1)}}{2\left(\beta_{22}^{(1)}\right)^2 f_1}\right]\right\}. \tag{68} $$
APPENDIX B

If β₂₁⁽²⁾ = β₂₂⁽²⁾ in T2, then, following the same derivation as in Appendix A, we obtain

$$ u(t, 2f_1 + 2f_2) = \frac{\sqrt{i\pi/C_2}}{2\pi\beta_{22}^{(2)} f_2}\, \tilde{U}\left(\frac{t}{\beta_{22}^{(2)} f_2}, 2f_1\right), \tag{69} $$

where

$$ \tilde{U}\left(\frac{t}{\beta_{22}^{(2)} f_2}, 2f_1\right) = F\{u(t, 2f_1)\}\Big|_{\omega = t/\beta_{22}^{(2)} f_2}. \tag{70} $$

From Eqs. (14) and (15), we have

$$ u(t, 2f_1) = \frac{\sqrt{i\pi/C_1}}{2\pi\beta_{22}^{(1)} f_1}\, \tilde{U}\left(\frac{t}{\beta_{22}^{(1)} f_1}, 0\right) \tag{71} $$

and

$$ F\left\{\tilde{U}\left(\frac{t}{\beta_{22}^{(1)} f_1}, 0\right)\right\} = 2\pi\beta_{22}^{(1)} f_1\, u\left(-\beta_{22}^{(1)} f_1\,\omega,\, 0\right). \tag{72} $$

With the help of Eqs. (71) and (72), and noting that

$$ 1/C_1 = 2\beta_{22}^{(1)} f_1, \tag{73} $$

$$ 1/C_2 = 2\beta_{22}^{(2)} f_2, \tag{74} $$

we obtain

$$ u(t, 2f_1 + 2f_2) = \frac{i\sqrt{\beta_{22}^{(1)} f_1\,\beta_{22}^{(2)} f_2}}{\beta_{22}^{(2)} f_2}\, u\left(-\frac{\beta_{22}^{(1)} f_1}{\beta_{22}^{(2)} f_2}\, t,\; 0\right). \tag{75} $$

Ignoring the trivial constant in Eq. (75), we finally obtain

$$ u(t, 2f_1 + 2f_2) = u\left(-\frac{\beta_{22}^{(1)} f_1}{\beta_{22}^{(2)} f_2}\, t,\; 0\right). \tag{76} $$
APPENDIX C

For Case 2, from Eq. (23) we have

$$ u(t, 2f_1) = \frac{\sqrt{i\pi/C_1}}{2\pi\beta_{22}^{(1)} f_1}\, \tilde{U}\left(\frac{t}{\beta_{22}^{(1)} f_1}, 0\right) \exp\left(-i\frac{t^2}{\beta_{22}^{(1)} f_1}\right). \tag{77} $$

The FT of Eq. (77) is given by

$$ F\{u(t, 2f_1)\} = \tilde{U}(\omega, 2f_1). \tag{78} $$

After propagating in the first single-mode fiber of T2, we obtain

$$ \tilde{U}(\omega, 2f_1 + f_2^-) = \tilde{U}(\omega, 2f_1)\exp\left(\frac{i}{2}\beta_{21}^{(2)} f_2\omega^2\right), \tag{79} $$

where $\tilde{U}(\omega, 2f_1 + f_2^-)$ is the Fourier transform of the signal just before time lens 2. Noting that

$$ \tilde{U}(\omega, 2f_1) = \tilde{U}(\omega, f_1^+)\exp\left(\frac{i}{2}\beta_{22}^{(1)} f_1\omega^2\right), \tag{80} $$

substituting Eq. (80) into Eq. (79), and choosing $\beta_{21}^{(2)} f_2 = -\beta_{22}^{(1)} f_1$, we obtain

$$ \tilde{U}(\omega, 2f_1 + f_2^-) = \tilde{U}(\omega, f_1^+), \tag{81} $$

which leads to

$$ u(t, 2f_1 + f_2^-) = u(t, f_1^+). \tag{82} $$

After time lens 2, we obtain

$$ u(t, 2f_1 + f_2^+) = u(t, 2f_1 + f_2^-)\exp(iC_2 t^2). \tag{83} $$

The first time lens introduces a quadratic phase factor,

$$ u(t, f_1^+) = u(t, f_1^-)\exp(iC_1 t^2). \tag{84} $$

If $C_2 = -C_1$, inserting Eqs. (82) and (84) into Eq. (83) gives

$$ u(t, 2f_1 + f_2^+) = u(t, f_1^-). \tag{85} $$

From Eq. (85), we have

$$ \tilde{U}(\omega, 2f_1 + f_2^+) = \tilde{U}(\omega, f_1^-). \tag{86} $$

After propagating in the second single-mode fiber of T2, we obtain

$$ \tilde{U}(\omega, 2f_1 + 2f_2) = \tilde{U}(\omega, 2f_1 + f_2^+)\exp\left(\frac{i}{2}\beta_{22}^{(2)} f_2\omega^2\right). \tag{87} $$

Also note that

$$ \tilde{U}(\omega, f_1^-) = \tilde{U}(\omega, 0)\exp\left(\frac{i}{2}\beta_{21}^{(1)} f_1\omega^2\right). \tag{88} $$

Substituting Eqs. (86) and (88) into Eq. (87) and choosing $\beta_{22}^{(2)} f_2 = -\beta_{21}^{(1)} f_1$, we obtain

$$ \tilde{U}(\omega, 2f_1 + 2f_2) = \tilde{U}(\omega, 0) \tag{89} $$

and

$$ u(t, 2f_1 + 2f_2) = u(t, 0), \tag{90} $$

where $u(t, 2f_1 + 2f_2)$ is the output signal of T2 and $u(t, 0)$ is the input signal of T1. In summary, for Case 2, if we choose

$$ \beta_{21}^{(2)} f_2 = -\beta_{22}^{(1)} f_1, \tag{91} $$

$$ C_2 = -C_1, \tag{92} $$

$$ \beta_{22}^{(2)} f_2 = -\beta_{21}^{(1)} f_1, \tag{93} $$

then at the end of the 4-f time-lens system we exactly replicate the input signal without time reversal.
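The Case-2 cancellation, Eqs. (91)–(93), can also be verified numerically: each element of T2 undoes the corresponding element of T1 exactly, for any input and any chirp C₁. A sketch in dimensionless units (parameter values are illustrative):

```python
import numpy as np

# Numerical check of the Case-2 conditions, Eqs. (91)-(93): with anti-symmetric
# dispersion and chirp, the 4-f time-lens system reproduces its input without
# time reversal. Dimensionless units; all parameter values are illustrative.
n = 4096
t = np.linspace(-50.0, 50.0, n, endpoint=False)
w = 2 * np.pi * np.fft.fftfreq(n, t[1] - t[0])

def fiber(u, S):
    """Dispersive propagation with accumulated second-order dispersion S."""
    return np.fft.ifft(np.fft.fft(u) * np.exp(0.5j * S * w**2))

def lens(u, C):
    """Time lens: quadratic phase modulation exp(iC t^2)."""
    return u * np.exp(1j * C * t**2)

S21_1, S22_1, C1 = 2.0, 3.0, 0.7          # T1 parameters (arbitrary)
S21_2, S22_2, C2 = -S22_1, -S21_1, -C1    # T2 chosen per Eqs. (91)-(93)

u0 = np.exp(-0.5 * (t - 4.0) ** 2)        # asymmetric input (offset Gaussian)
u1 = fiber(lens(fiber(u0, S21_1), C1), S22_1)   # 2-f subsystem T1
u2 = fiber(lens(fiber(u1, S21_2), C2), S22_2)   # 2-f subsystem T2

assert np.allclose(u2, u0)                # replicated, not time-reversed
```

The offset Gaussian makes any time reversal visible; the operators cancel pairwise from the inside out, mirroring the step-by-step cancellation of Eqs. (79)–(90).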
REFERENCES

Agrawal, G. P. (1997). Fiber-optic communication systems. In Series in microwave and optical engineering (2nd ed.) (p. 49). New York: Wiley.
Azana, J., & Muriel, M. A. (2000). Real-time optical spectrum analysis based on the time-space duality in chirped fiber gratings. IEEE Journal of Quantum Electronics, 36(5), 517–526.
Berger, N. K., Levit, B., Atkins, S., & Fischer, B. (2000). Time-lens-based spectral analysis of optical pulses by electrooptic phase modulation. Electronics Letters, 36(19), 1644–1646.
Choi, W.-J., Bond, A. E., Kim, J., Zhang, J., Jambunathan, R., Foulk, H., et al. (2002). Low insertion loss and low dispersion penalty InGaAsP quantum-well high-speed electroabsorption modulator for 40-Gb/s very-short-reach, long-reach, and long-haul applications. Journal of Lightwave Technology, 20(12), 2052–2056.
Djordjevic, I. B., & Vasic, B. (2006). Orthogonal frequency division multiplexing for high-speed optical transmission. Optics Express, 14(9), 3767–3775.
Fukano, H., Yamanaka, T., Tamura, M., & Kondo, Y. (2006). Very-low-driving-voltage electroabsorption modulators operating at 40 Gb/s. Journal of Lightwave Technology, 24(5), 2219–2224.
Goodman, J. W. (1996). Fresnel and Fraunhofer diffraction. In Introduction to Fourier optics (2nd ed.) (pp. 63–95). San Francisco: McGraw-Hill.
Jannson, T. (1983). Real-time Fourier transformation in dispersive optical fibers. Optics Letters, 8(4), 232–234.
Kolner, B. H. (1994). Generalization of the concepts of focal length and f-number to space and time. Journal of the Optical Society of America A, 11(12), 3229–3234.
Kolner, B. H., & Nazarathy, M. (1989). Temporal imaging with a time lens. Optics Letters, 14(12), 630–632.
Kondratko, P. K., Leven, A., Chen, Y.-K., Lin, J., Koc, U.-V., Tu, K.-Y., et al. (2005). 12.5-GHz optically sampled interference-based photonic arbitrary waveform generator. IEEE Photonics Technology Letters, 17(12), 2727–2729.
Kumar, S. (2007). Compensation of third-order dispersion using time reversal in optical transmission systems. Optics Letters, 32(4), 346–348.
Kumar, S., & Yang, D. (2008). Optical implementation of orthogonal frequency-division multiplexing using time lenses. Optics Letters, 33(17), 2002–2004.
Lohmann, A. W., & Mendlovic, D. (1992). Temporal filtering with time lenses. Applied Optics, 31(29), 6212–6219.
Lowery, A. J., & Armstrong, J. (2005). 10 Gb/s multimode fiber link using power-efficient orthogonal frequency division multiplexing. Optics Express, 13(25), 10003–10009.
Nakazawa, M., Yamada, E., Kubota, H., & Suzuki, K. (1991). 10 Gbit/s soliton data transmission over one million kilometres. Electronics Letters, 27(14), 1270–1272.
Papoulis, A. (1994). Pulse compression, fiber communications, and diffraction: A unified approach. Journal of the Optical Society of America A, 11(1), 3–13.
Srikant, V. (2001). Broadband dispersion and dispersion slope compensation in high bit rate and ultra long haul systems. In Optical Fiber Communication Conference (OFC), 2001 OSA Technical Digest Series. Optical Society of America.
Van Howe, J., & Xu, C. (2006). Ultrafast optical signal processing based upon space-time dualities. Journal of Lightwave Technology, 24(7), 2649–2662.
Yang, D., Kumar, S., & Wang, H. (2008). Temporal filtering for optical communication systems. Optics Communications, 281(2), 238–247.
Contents of Volumes 151–157

VOLUME 151*
C. Bontus and T. Köhler, Reconstruction algorithms for computed tomography
L. Busin, N. Vandenbroucke, and L. Macaire, Color spaces and image segmentation
G. R. Easley and F. Colonna, Generalized discrete Radon transforms and applications to image processing
T. Radlička, Lie algebraic methods in charged particle optics
V. Randle, Recent developments in electron backscatter diffraction
VOLUME 152
N. S. T. Hirata, Stack filters: from definition to design algorithms
S. A. Khan, The Foldy–Wouthuysen transformation technique in optics
S. Morfu, P. Marquié, B. Nofiélé, and D. Ginhac, Nonlinear systems for image processing
T. Nitta, Complex-valued neural network and complex-valued backpropagation learning algorithm
J. Bobin, J.-L. Starck, Y. Moudden, and M. J. Fadili, Blind source separation: the sparsity revolution
R. L. Withers, "Disorder": structured diffuse scattering and local crystal chemistry
VOLUME 153  Aberration-corrected Electron Microscopy
H. Rose, History of direct aberration correction
M. Haider, H. Müller, and S. Uhlemann, Present and future hexapole aberration correctors for high-resolution electron microscopy
O. L. Krivanek, N. Dellby, R. J. Kyse, M. F. Murfitt, C. S. Own, and Z. S. Szilagyi, Advances in aberration-corrected scanning transmission electron microscopy and electron energy-loss spectroscopy
P. E. Batson, First results using the Nion third-order scanning transmission electron microscope corrector
A. L. Bleloch, Scanning transmission electron microscopy and electron energy loss spectroscopy: mapping materials atom by atom
F. Houdellier, M. Hÿtch, F. Hüe, and E. Snoeck, Aberration correction with the SACTEM-Toulouse: from imaging to diffraction
B. Kabius and H. Rose, Novel aberration correction concepts
A. I. Kirkland, P. D. Nellist, L.-Y. Chang, and S. J. Haigh, Aberration-corrected imaging in conventional transmission electron microscopy and scanning transmission electron microscopy
S. J. Pennycook, M. F. Chisholm, A. R. Lupini, M. Varela, K. van Benthem, A. Y. Borisevich, M. P. Oxley, W. Luo, and S. T. Pantelides, Materials applications of aberration-corrected scanning transmission electron microscopy
N. Tanaka, Spherical aberration-corrected transmission electron microscopy for nanomaterials
K. Urban, L. Houben, C.-L. Jia, M. Lentzen, S.-B. Mi, A. Thust, and K. Tillmann, Atomic-resolution aberration-corrected transmission electron microscopy
Y. Zhu and J. Wall, Aberration-corrected electron microscopes at Brookhaven National Laboratory

* Lists of the contents of volumes 100–149 are to be found in volume 150; the entire series can be searched on ScienceDirect.com
VOLUME 154
H. F. Harmuth and B. Meffert, Dirac's difference equation and the physics of finite differences

VOLUME 155
D. Greenfield and M. Monastyrskiy, Selected problems of computational charged particle optics

VOLUME 156
V. Argyriou and M. Petrou, Photometric stereo: an overview
F. Brackx, N. de Schepper, and F. Sommen, The Fourier transform in Clifford analysis
N. de Jonge, Carbon nanotube electron sources for electron microscopes
E. Recami and M. Zamboni-Rached, Localized waves: a review

VOLUME 157
M. I. Yavor, Optics of charged particle analyzers
Index
A Aberrations astigmatism, 51 chromatic, 50 coma, 50 distortion, 50, 51 image degeneration, 51 spherical, 49 Algebraic anchors, 182 Algebraic filter, 178 Algebraic framework, 181 Algebraic openings, anchors of, 195 attribute openings, 196 granulometries, 198, 199 morphological openings, 195 spatial and shift-invariant openings, 197, 198 Algebraic reconstruction techniques, 153 Amplifier noise, 224 Anchors. See also Erosion anchors; Morphological anchors concept of, 180 illustration of, 182 Anti-extensivity, illustrations of, 178 Aquatic microscope, 59, 61 Arbitrary waveform generator, 216 ART. See Algebraic reconstruction techniques Aspheric magnifier, 76 Astigmatism, 51 Atomic resolution, 158 phantoms, 3D reconstructions of, 159 Attribute openings, 196 Aujol–Chambolle BV-G-E mode Besov space, 115 numerical algorithm, 116, 117 structures + textures + noise decomposition problem, 115 Aujol’s algorithm, 105–107 AWG. See Arbitrary waveform generator B Ball, opening/closing, 184 Bancks microscopes, 52 Band pass filter, 211 Barrel distortion, 51
BER. See Bit error rate Besov spaces, 90, 99, 109, 118 Binary image, 175, 176 Biogenic law, 39 Bit error rate, 224 Borel’s moth from pre-microscopical study of nature, 83 Botanical microscopes, 44, 47 Bounded variation spaces, 101 Bragg gratings, 211 Brownian motion, 43, 44, 46 C Carrier-envelope phase control in multi-megahertz train of pulses, 22 electron acceleration control by, 22, 23 few-cycle laser pulses with, 21, 22 Cell theory cellular structure, 40 microscope and, 41, 42 nucleus within living cells, 42, 43 precursors, 39 Centers of maximal balls center points of, 141 concept of, 148 crisp object, 142 spurious, 151 Centers of maximal fuzzy balls, 141, 147–151 blue/green, fuzzy object, 151 concept of, 148 2D images, 142 distance information, 148 fuzzy object, 141, 142, 148, 150, 166 identification of, 149 macromolecular structure, 142 medial surface, 166 membership function, threshold of, 150 PDB, phantom construction of, 166 reduction of, 168 CE phase. See Carrier-envelope phase Chambolle nonlinear projector, 90, 104, 113 CST operator, 120 Euclidean norm, 132 Euler–Lagrange calculus, 132, 133
extension, 135 Hilbert space, 135 Karush–Kuhn–Tucker conditions, 134 notations and definitions, 130, 131 numerical algorithm, 105 total variation, 131, 132 Chromatic aberration and botanical microscope, 60 concept of, 50 CMBs. See Centers of maximal balls CMFBs. See Centers of maximal fuzzy balls Colloidal gold particles, 155 Coma aberration, 50 COMET. See Constrained maximum entropy tomography Compound eye, microscopical observation of, 82 Compound microscopes, lens systems in, 35 Computation times, 194 Confocal laser scanning microscopy, operation mechanism, 29 Constrained maximum entropy tomography, 154 Constraints, 196 Continuous image, 175 Contourlet transform, 97, 98 Convergence property, 180 Cryo-electron tomography, 142 of macromolecules data analyzing methods, 154 data simulation, 157–160 phantoms construction, 155–157 specific imaging settings, 154, 155 TEM, 153 of monoclonal murine antibody IgG2a, 154 Crystallographic community, 155 Culex pipiens, larva of, 80 Curvelet transformation, 96 drawback of, 97 Parseval relation, 97 D Dark-ground light microscopy, 29 in Georgian era, 43 submicroscopic structures studies, 73 DCM. See Dispersion compensation module Decomposition algorithms, 129 Demultiplexed signal, 212 Descartes’ law, 33 DFB. See Directional filter bank Diatom image under Brown’s 32.5×lens, 55
under Brown’s 170×lens, 56 under Leitz microscope, 57 under spinel 400×lens, 56 Digital images, 140 Dilation, definition of, 185 Directional filter bank, 97 Dispersion compensation module, 213 Dispersion compensator, 216–219 input and output bit sequences of, 219 time-lens system, 218 Dispersion-managed transmission system, numerical simulations, 224 Distance transform, 140 binary images, 147 subsequent shape analysis, 141 Dollond microscope, 57 DT. See Distance transform E EAM. See Electroabsorption modulator Electroabsorption modulator, 212 Electron acceleration controlled by CE phase, 22, 23 SPP-enhanced (see Surface plasmon polariton-enhanced electron acceleration) Electron emission, surface plasmon fields laser-induced, 3 photo-induced, 4 tunneling/field, 3, 4 Electron irradition, 154 Electron microscope limits, 154 Electron microscopes, 31 Electron tomography, 140. See also Cryo-electron tomography fuzzy segmentation, 160 Erosion anchors algorithms for, 194 computation of, 194 openings, relationship, 190 Euclidean distance, 140, 147 Euclidean norm, 132 Euclidean plane, 174 Euclidean space, 174 Euler–Lagrange calculus, 108, 132, 133 F Fab. See Fragment antigen binding Fast-Fourier transform, 220 FDT. See Fuzzy distance transform FEG. See Field emission gun Femtosecond electron beams, spectral properties of, 2
Index FFT. See Fast-Fourier transform Fiber dispersions chirp coefficients, 210 phase coefficients, comparison of, 210 phase shift, 205 Fiber nonlinearity, 206 Fiber optics link, nonlinearity, 205 networks, 211 transmission systems, 216 Fiber propagation output signal, 218 Fiber transfer function, 205, 206 Fiber transmission system, 217 linear transfer function, 218 Field emission gun, 155 Field microscopy, pioneers of Baker, Henry, 65, 66 Ellis, John, 58–61 Folkes, Martin, 64, 65 Leeuwenhoek, Antony van, 67, 68, 77 Linnaeus, Carl, 61–63 Trembley, Abraham, 64 Finite set, grey-scale values, 187 Fluorescence microscopy, 29 Focal length, 208 Focal plane curvature, 50 Fourier transformation, 91, 176 definition, 93 diffraction sign, 210 multiplexed signal, 213 optical domain, 205 signal right, time lens, 230 Fourier transformer, 205 Fragment antigen binding, 155 Fuzzy distance transform, 141 Fuzzy objects, 140–142 crisp objects, 146 electron tomographic structure, 160, 161 IgG antibodies, experimental data of, 161 phantoms, 161 G Gaussian noise, 111, 112, 124–126 Gaussian pulses, 217, 224 Gauss kernel, atom position, 155 GIMP, adapted software, 124 Granulometry, 198, 199 Grey-scale image, 176, 180, 196 Grey-weighted distance, 141 H Higher-order fiber dispersion compensator, 211
Hilbert spaces, 109, 135 History of British Quadrupeds, 45 Horace Dall’s spinel microscope, 55 Horizontal translate, 185 I Ideal filter binary transmittance, 175 IFFT. See Inverse fast Fourier transform IFT. See Inverse Fourier transform IgG2a murine antibody, 156 IgG antibodies, 163 Image acquisition technique, 140 Image decomposition application of, 130 BV-E µ structures and textures, 108, 109 BV-G-E structures, textures, and noise, 119, 120 BV-G-G structures, textures, and noise, 115–117 BV-G structures and textures, 106, 107, 112 BV-H−1 structures and textures, 110, 111 noise, 90 performance evaluation evaluation metrics, 124–128 parameters variability, 128 test image, 123, 124 preliminaries contourlets, 97, 98 curvelets, 96, 97 directional multiresolution analysis, 94 function spaces, 98–102 multiresolution analysis, 93, 94 ridgelet transform, 94, 95 wavelets, 90–93 problem of, 129 structures and textures decomposition, 102–110 structures, textures, and noise decomposition Aujol–Chambolle BV-G-E model, 115–118 ˙ ∞ , 118–123 BV-G-Co −1,∞ BV-G-G local adaptative model, 112–115 BV-G structures + textures, 112 BV-H−1 structures + textures, 111 gaussian noise, 111 oscillatory function, 110 Image decomposition performance evaluation, 128 Image processing
  corrupted by noise, 176
  erosion/dilation/opening/closing, 186
  granulometries, 174
  pattern spectrum analysis-based techniques, 174
Image segmentation, 151
Inverse fast Fourier transform, 219
  high-speed digital signal processors, 223
  radio frequency domain, 220
Inverse Fourier transform, 204, 205

K
Karush–Kuhn–Tucker conditions, 134
Kronecker symbol, 126

L
Laplacian pyramid decomposition (LP), 97
Laws of refraction, 34
Leeuwenhoek microscope, 69, 70
  brass, 69
  designer of, 80
  design of silver, 70
  human blood cells imaged by, 73
Leeuwenhoek's specimens imaging, 77
Legendre–Fenchel transform, 131
Leitz microscope, 53
Lens systems in compound microscopes, 35
Lieberkühn reflectors, 47, 58
Light microscope
  resolution of, 51
  technological developments in, 28
Linnean Society microscope, resolution of, 52, 53

M
Macromolecular identification
  elongated structures, 165–167
  fuzzy object representation, 162
  subunits, 161
Magnetic resonance force microscopy, resolutions of, 29
Mathematical morphology
  geometrical interpretation of, 174
  idempotence, 176, 177
  image analysis, 173
  images processing, 174
  tools of, 173
MCA. See Morphological component analysis
Median filter, 176
Median operator, 175
Meyer model
  dual-method approach, 104
  numerical algorithm, 103
Meyer's model, 90
Micrographia, 78, 80, 81
Morphological anchors, 182
  algorithmic properties of, 192–195
  local existence of, 188–192
  role of, 182
  set and function operators, 183–187
  theory of, 187, 188
Morphological component analysis, 123
Morphological erosions, 181
Morphological opening, 185
  geometrical interpretation of, 184
  spatial openings, 197
Morphological operators
  anchors, role of, 182
  definitions of, 187
  linear structuring elements, 181
Multi-keV kinetic energy levels, 2
Multiresolution analysis, 93, 94

N
Near-field scanning optical microscopy, resolution of, 29
Nimrod lens, 83
Noise decomposition, 110–123
Noise modeling, 117
Noise reference images, 123
  autocorrelation images, 127
  residual levels, 127
Noisy test images, 110, 111
Nonlinear image processing, 175
Nonlinear operators, 175, 176

O
OFDM. See Orthogonal frequency-division multiplexing
Opening anchor
  anti-extensivity property of, 191
  density of, 193
  percentage of, 190
  relationship, 190
Optical communication, 219
Optical/electrical domain signal processing, 209
Optical microscopy, limits of
  image quality, 31, 32
  spot, 30, 31
Optical phase modulator time domain transfer function, 207
Optical pulse propagating, 203
Optical signal, 205, 211
  noise ratio, bandwidth, 224
  photodetector responds, 206
  propagation of, 206
  time domain, 211
Optical waveform quadratic phase modulation, 204
Optical wave propagating, 203
Orthogonal frequency-division multiplexing
  bit error rate, comparison of, 225
  conventional, block diagram of, 220
  implementation of, 205
  optical modulator, 220
  symbol rate, 226
  time lenses, 206
    optical implementation of, 219–226
  wireless communication systems, 219
Oscillating texture, G-norm of, 103

P
Parseval relation, 92
PDB. See Protein data bank
Phase modulator
  chirp coefficient, 207
  temporal filter, 206
  time domain transfer function of, 205
Phase modulator 1, 204
  transfer function, 211
Phase modulator multiplies, 223
Photoemission processes, time-resolved studies of, 19
  autocorrelation functions, 20
  ultrashort nature of electron bunches, 21
Pin-cushion distortion, 50
PM1. See Phase modulator 1
Protein data bank
  phantoms construction, 158, 164, 166
  reference list
    1igt, 156
    1q5a, 157
    2rec, 157
  structures deposition, 155

Q
Quantization operator, 179, 181
Quincunx filters, 97

R
Radon transform, 100
RecA protein, 157
Residual phase, 208
Residual reference image, 126
Resolution
  of Bancks microscopes, 52
  definition of, 51
  Leitz microscope, 53
  limit of, 51
  Linnean Society microscope, 52, 53
Reverse fuzzy distance transform
  brute-force calculation, 149
  crisp segmentation, 167
  2D and 3D images, 142
  effect of, 145
  FDT, computation of, 147
  fuzzy setting, 145
  grey and white pixels, 146
  IgG antibodies, 163
  image segmentation, 151
  implementational aspects of, 147
  magenta/blue, region growing, 153
  preliminaries
    computation of, 147
    computation weight, 144
    Euclidean distance, 143
    fuzzy connectedness, 144
    fuzzy digital object, 142
    fuzzy setting, 145
    grey-level image, 145
    pixels, 146
    sequential algorithm, 143
    voxel values updation, 145
  region growing, 152, 162
    algorithm, 142
    magenta and blue, 153
    subunits identification, 164
  seed detection, 165
  seeded WS, 164
  subunits identification, 164, 165
RFDT. See Reverse fuzzy distance transform
Ridgelet spaces, 99, 100
Ridgelet transformation, 94
  1D wavelet transform, 95
  Radon transform, 95
  reconstruction formula, 95
Robert Brown's microscope and plant cell, 48
Root signals, 176
Rudin–Osher–Fatemi algorithm, 90, 102

S
Salt-and-pepper noise, 175
Scale function, 93
Scanning electron microscope (SEM), 72
Screw-barrel microscopes, 36, 37, 63
Seed/gradient magnitude, 151
Shift invariance, 196, 197
Shift-invariant operator, 198
Signal output, 2-f subsystem, 227, 228
Signal processing, 210
  noise ratio, 140
  optical/electrical domain, 209
  quantization, 178
Signal spectrum, 204
Signal to noise ratio, 140
Simple microscopes. See Single lens microscopes
Simultaneous iterative reconstruction technique, 154
Single-channel input data, 219
Single lens microscopes
  aberrations in, 33, 34
  Bancks' design of, 38, 49, 58
  disadvantage of, 35
  image quality of, 33
  living bacteria studies with, 72, 74, 75
  magnification of, 57
  Saccharomyces cells analysis by, 54
  stages of, 47
Single-mode fibers, 206
SIRT. See Simultaneous iterative reconstruction technique
SMFs. See Single-mode fibers
Snell's law, 32, 33
SNR. See Signal to noise ratio
Sobolev spaces, 99
Spatial filtering, 204
Spatiality constrains operators, 180
Spatiality, definition of, 179
Spatial operators, 185, 188
Spherical aberration, 49, 50
SPP-enhanced photoacceleration, angle-energy distributions of, 23
SPP-induced photoemission, time-resolved studies of, 20
Spurious pixels reduction process, 150
Structures reference image, 123
Surface plasmon-enhanced photoemission, 16–18
Surface plasmon fields
  electron acceleration, 7
  electron emission in
    laser-induced, 3
    photo-induced, 4
    tunneling/field, 3, 4
  emission currents
    field/tunneling emission, 5–7
    multiphoton-induced emission, 5
Surface plasmon polariton-enhanced electron acceleration
  definition of, 2
  numerical models
    electron emission channels, 9
    FDTD method, 8, 9
    field emission, 12–16
    high-energy electrons generation, 18, 19
    instantaneous tunneling current, 9
    multiphoton-induced emission, 11, 12
    physical processes, 7, 8
    surface plasmon-enhanced photoemission, 16–18
    vacuum electron trajectory, 10, 11

T
Tarski's fixpoint theorem, 177
TEM. See Transmission electron microscope
Temporal filtering technique, 204
  amplitude/phase modulator, 206
  time-lens, advantage of, 211
Textures reference image, 123
Theoretical optics, 32
Threshold sets, profile of, 187
Threshold superposition principle, 187
Time-lens-based scheme
  block diagram of, 220
  nonlinear performance of, 225
Time-lens system, 209
  total length of, 228, 229
Transform-limited ultrashort laser pulse, optical waveform of, 22
Transmission electron microscope, macromolecular structures, 153
Tunable optical filters, 211
Tunable wavelength division demultiplexer, 211

U
Ultrathin films, 155

V
Voxel extracts 3D reconstructions, 156

W
Watershed segmentation, 151
Wavelength division demultiplexer, 211–215
Wavelength division multiplexing
  4-f time-lens system, 212
  input and output bit sequences, 214, 215
Wavelet analysis, 91
Wavelet/contourlet-based algorithms, 125
Wavelet expansion, 99
Wavelet soft thresholding operator, 107
Wavelet transform
  continuous case, 91, 92
  convolution product, 92
  digital signals, 92
  discrete case, 92, 93
  Fourier transformation, 92
  Parseval relation, 92
  signal decomposition, 91
WDM. See Wavelength division multiplexing
Withering microscope, 48, 49
WS. See Watershed segmentation
WST operator. See Wavelet soft thresholding operator