HANDBOOK OF VLSI MICROLITHOGRAPHY SECOND EDITION Principles, Technology, and Applications
Edited by
John N. Helbert Motorola, Inc. Phoenix, Arizona
NOYES PUBLICATIONS Park Ridge, New Jersey, U.S.A. WILLIAM ANDREW PUBLISHING, LLC Norwich, New York, U.S.A.
Copyright © 2001 by Noyes Publications
No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without permission in writing from the Publisher.
Library of Congress Catalog Card Number: 00-028173
ISBN: 0-8155-1444-1
Printed in the United States
Published in the United States of America by
Noyes Publications / William Andrew Publishing, LLC
13 Eaton Avenue
Norwich, NY 13815
1-800-932-7045
www.knovel.com
10 9 8 7 6 5 4 3 2 1
Library of Congress Cataloging-in-Publication Data
Handbook of VLSI Microlithography / [edited] by John Helbert.--2nd edition p. cm. Includes bibliographical references and index. ISBN 0-8155-1444-1 1. Integrated circuits--Very large scale integration. 2. Microlithography. I. Helbert, John N. TK7874 .H3494 2001 621.3815'31--dc21
00-028173 CIP
MATERIALS SCIENCE AND PROCESS TECHNOLOGY SERIES

Series Editors
Rointan F. Bunshah, University of California, Los Angeles
Gary E. McGuire, Microelectronics Center of North Carolina
Stephen M. Rossnagel, IBM Thomas J. Watson Research Center
Electronic Materials and Process Technology

CHARACTERIZATION OF SEMICONDUCTOR MATERIALS, Volume 1: edited by Gary E. McGuire
CHEMICAL VAPOR DEPOSITION FOR MICROELECTRONICS: by Arthur Sherman
CHEMICAL VAPOR DEPOSITION OF TUNGSTEN AND TUNGSTEN SILICIDES: by John E. J. Schmitz
CHEMISTRY OF SUPERCONDUCTOR MATERIALS: edited by Terrell A. Vanderah
CONTACTS TO SEMICONDUCTORS: edited by Leonard J. Brillson
DIAMOND CHEMICAL VAPOR DEPOSITION: by Huimin Liu and David S. Dandy
DIAMOND FILMS AND COATINGS: edited by Robert F. Davis
DIFFUSION PHENOMENA IN THIN FILMS AND MICROELECTRONIC MATERIALS: edited by Devendra Gupta and Paul S. Ho
ELECTROCHEMISTRY OF SEMICONDUCTORS AND ELECTRONICS: edited by John McHardy and Frank Ludwig
ELECTRODEPOSITION: by Jack W. Dini
HANDBOOK OF CARBON, GRAPHITE, DIAMONDS AND FULLERENES: by Hugh O. Pierson
HANDBOOK OF CHEMICAL VAPOR DEPOSITION, Second Edition: by Hugh O. Pierson
HANDBOOK OF COMPOUND SEMICONDUCTORS: edited by Paul H. Holloway and Gary E. McGuire
HANDBOOK OF CONTAMINATION CONTROL IN MICROELECTRONICS: edited by Donald L. Tolliver
HANDBOOK OF DEPOSITION TECHNOLOGIES FOR FILMS AND COATINGS, Second Edition: edited by Rointan F. Bunshah
HANDBOOK OF ION BEAM PROCESSING TECHNOLOGY: edited by Jerome J. Cuomo, Stephen M. Rossnagel, and Harold R. Kaufman
HANDBOOK OF MAGNETO-OPTICAL DATA RECORDING: edited by Terry McDaniel and Randall H. Victora
HANDBOOK OF MULTILEVEL METALLIZATION FOR INTEGRATED CIRCUITS: edited by Syd R. Wilson, Clarence J. Tracy, and John L. Freeman, Jr.
HANDBOOK OF PLASMA PROCESSING TECHNOLOGY: edited by Stephen M. Rossnagel, Jerome J. Cuomo, and William D. Westwood
HANDBOOK OF POLYMER COATINGS FOR ELECTRONICS, 2nd Edition: by James Licari and Laura A. Hughes
HANDBOOK OF REFRACTORY CARBIDES AND NITRIDES: by Hugh O. Pierson
HANDBOOK OF SEMICONDUCTOR SILICON TECHNOLOGY: edited by William C. O’Mara, Robert B. Herring, and Lee P. Hunt
HANDBOOK OF SEMICONDUCTOR WAFER CLEANING TECHNOLOGY: edited by Werner Kern
HANDBOOK OF SPUTTER DEPOSITION TECHNOLOGY: by Kiyotaka Wasa and Shigeru Hayakawa
HANDBOOK OF THIN FILM DEPOSITION PROCESSES AND TECHNIQUES: edited by Klaus K. Schuegraf
HANDBOOK OF VACUUM ARC SCIENCE AND TECHNOLOGY: edited by Raymond L. Boxman, Philip J. Martin, and David M. Sanders
HANDBOOK OF VLSI MICROLITHOGRAPHY: edited by William B. Glendinning and John N. Helbert
HIGH DENSITY PLASMA SOURCES: edited by Oleg A. Popov
HYBRID MICROCIRCUIT TECHNOLOGY HANDBOOK, Second Edition: by James J. Licari and Leonard R. Enlow
IONIZED-CLUSTER BEAM DEPOSITION AND EPITAXY: by Toshinori Takagi
MOLECULAR BEAM EPITAXY: edited by Robin F. C. Farrow
SEMICONDUCTOR MATERIALS AND PROCESS TECHNOLOGY HANDBOOK: edited by Gary E. McGuire
ULTRA-FINE PARTICLES: edited by Chikara Hayashi, R. Ueda and A. Tasaki
Ceramic and Other Materials: Processing and Technology

ADVANCED CERAMIC PROCESSING AND TECHNOLOGY, Volume 1: edited by Jon G. P. Binner
CEMENTED TUNGSTEN CARBIDES: by Gopal S. Upadhyaya
CERAMIC CUTTING TOOLS: edited by E. Dow Whitney
CERAMIC FILMS AND COATINGS: edited by John B. Wachtman and Richard A. Haber
CORROSION OF GLASS, CERAMICS AND CERAMIC SUPERCONDUCTORS: edited by David E. Clark and Bruce K. Zoitos
FIBER REINFORCED CERAMIC COMPOSITES: edited by K. S. Mazdiyasni
FRICTION AND WEAR TRANSITIONS OF MATERIALS: by Peter J. Blau
HANDBOOK OF CERAMIC GRINDING AND POLISHING: edited by Ioan D. Marinescu, Hans K. Tonshoff, and Ichiro Inasaki
HANDBOOK OF INDUSTRIAL REFRACTORIES TECHNOLOGY: by Stephen C. Carniglia and Gordon L. Barna
SHOCK WAVES FOR INDUSTRIAL APPLICATIONS: edited by Lawrence E. Murr
SOL-GEL TECHNOLOGY FOR THIN FILMS, FIBERS, PREFORMS, ELECTRONICS AND SPECIALTY SHAPES: edited by Lisa C. Klein
SOL-GEL SILICA: by Larry L. Hench
SPECIAL MELTING AND PROCESSING TECHNOLOGIES: edited by G. K. Bhat
SUPERCRITICAL FLUID CLEANING: edited by John McHardy and Samuel P. Sawan
Preface
The chapter topics of this lithography handbook deal with the critical and enabling aspects of the intriguing task of printing very high resolution, high density integrated circuit (IC) patterns into thin resist-process pattern-transfer coatings. Circuit pattern density or resolution drives Dynamic Random Access Memory (DRAM) technology, which is the principal circuit density driver for the entire Very Large Scale Integrated Circuit (VLSI) industry. This book’s main theme concerns the special printing processes created by workers striving to achieve volume high density IC chip production. The current goal is pattern feature sizes near 0.25 µm for 64 Mbit DRAM lithography and, ultimately, the production of devices with features well below 0.1 µm.

The text is meant for a full spectrum of readers spanning university, industrial, and government research and development scientists, and production-minded engineers, technicians, and students. Specifically, we have attempted to consider the needs of lithography-oriented students and practicing industrial engineers and technicians.

The leadoff chapter focuses on the view that lithography methods (printing patterns) are pursued for the singular purpose of manufacturing IC chips in the highly competitive commercial sector, and it attempts to delineate the factors determining lithographic tool selection. The reader is drawn to consider IC device electrical performance criteria versus plausible and alternative energetic, or circuit density limited, particle printing methods: visible or shorter UV optical, electron, x-ray, and ion beams. The criteria for high quality micrometer and submicrometer lithography are very simply defined by the three major patterning parameters: line/space resolution, line edge, and pattern feature dimension control, which when combined with pattern-to-pattern alignment capability determine lithographic overlay
accuracy. Patterning yield and throughput further enter as dependent economic factors.

Resist and resist process equipment technology have a logical, prominent, second-chapter position indicative of resist’s overall importance in lithography, i.e., the end product of any IC lithography process is the patterned-resist masking layer needed to delineate the VLSI circuit level. Example coverage of optical resist process optimization assures the reader a grasp of the most commonly and widely used (worldwide) lithographic process and equipment technologies. The coauthors believe this chapter to be the first comprehensive and practical work describing the relationship between the resist process and the resist processing equipment parameters. The basic resist design concepts and definitions as well as advanced lithographic processes are thoroughly covered.

Chapter 3 deals primarily with basic lithographic resist process defectivity and its potential limiting effect on device production yields. It is also a fairly comprehensive summary of defectivity detection systems used for basic lithographic process characterization, and device yield enhancement efforts in general.

Basic metrology considerations (Chapter 4) are absolutely imperative to rendering a total description of lithography pattern transfer methodology. The task of precisely measuring printed linewidth or space artifacts at dimensions which are submicron and below is of paramount interest to lithographic technology. The elucidation of optical, scanning-electron-microscope (SEM), and electrical test device linewidth measurement data presents the reader with key boundary conditions essential for obtaining meaningful linewidth characterization.

The portrayal of energetic photon or particle microlithography is totally incomplete without some detail of the actual printing tool concepts, design, construction, and performance.
The printing tools and their usage in the IC manufacturing world are presented and described in the later chapters, beginning with Chapter 5. Clearly, optical lithography has been the backbone and mainstay of the world’s microchip production activity and will most likely continue in this dominant role into the next century. In the optical arena, it is found that 1–5X reduction printers, of the projection scanned and unscanned variety, must be described in subsets according to coherent and noncoherent radiation, as well as by wavelengths ranging from visible to deep ultraviolet (UV), then extending to angstrom levels. Higher resolution or more energetic sourced tools are also well described. The goals of Chapters 6 and 8 were to provide the first source of documented information on basic lithographic tool automation and vibrational analysis principles. Heretofore, these topics were always relegated to
private communications and private notebooks of individual engineers or non-lithographic journal sources. The editor thanks the authors of these two chapters for providing this important and incisive information.

Next, in world manufacturing usage, electron beam (e-beam) pattern printing has been vital, mostly because of its application in a pattern generation capacity for making photo masks and reticles, but also because of direct-write-on-wafer device photo usage. The writing strategy divides e-beam printers, in general, into three groups: Gaussian beam raster scan, principally for pattern generation; newer stenciled projection systems; and fixed or variable-shaped beam vector scan for direct-write-on-wafer applications. Subsets of the latter groups depend upon site-by-site versus write-on-the-fly substrate movements. The sophistication and complexity of e-beam printers require diverse expertise in many technical areas such as electrostatic and electromagnetic beam deflection, high speed beam blanking, intense electron sources, precise beam shapers, and ultra fast data flow electronics and storage. Interestingly, important special beam relationships of maximum current, density, and writing pattern path-speed require the observance of unique boundary conditions in meeting printing criteria.

On a worldwide basis, x-ray printing does not yet have high volume IC device production background examples, but high density prototype CMOS devices have been fabricated by IBM, and their feasibility demonstrated. The x-ray chapter presents x-ray lithography as a system approach with source, mask, aligner, and resist components. Of the competing volume manufacturing printing methods (optical and x-ray), the x-ray process is unique as a proximity and 1:1 method. As such, in order to meet the IC patterning quality criteria, extreme demands are placed on the mask fabrication process, much more so than for masks or reticles produced for the optical analogue.
For economically acceptable IC production, laser/diode plasma and synchrotron ring x-ray sources must be presented as high density photon emitters. In the second part of Chapter 10, the synchrotron is given special attention and is presented as a unique x-ray generator with an x-ray flux collimation feature. In spite of the synchrotron’s massive size and very large cost, its multi-port throughput capacity makes it viable for the very high production needs of certain industrial IC houses or possibly for multi-company or shared-company situations.

In the last of the charged particle printing tool chapters, Chapter 9, the energetic ion is depicted in a controllable, steerable, particle-beam serial-pattern writer performing lithography at a high mass ratio compared to an e-beam writer. The focused ion beam not only can deposit energy to form IC pattern latent resist images, but offers, as another application, the direct implant of impurity ions into semiconductor wafers, obviating completely the
need for any resist whatsoever and greatly simplifying the IC chip processing sequence. The versatile energetic ion plays yet another and possibly its most significant role in a “steered beam” tool, indispensable for optical and x-ray mask repair through the precise localized ablation and/or deposition of mask absorber material.

One of the editor’s purposes in assembling this book has been to accurately disseminate the results of many and varied microlithography workers. Since it is not possible in any one book to provide enough detail to satisfy every reader’s full curiosity, we’ve attempted to enable him to perform his own valid analysis and make some meaningful conclusions regarding the status and trends of the vital technical thrust areas of submicron IC pattern printing technology.

Many individuals representing industrial, government, and university sectors have been extremely helpful in providing technical discussions, data, and figures to the chapter authors of this book. Gratitude is extended here to those persons and their organizations. Gratitude is also expressed via courtesy annotations in the figure captions. Finally, we commend and thank Roxie Helbert for her compilation and editing skills.

March 21, 2000
Mesa, Arizona
John N. Helbert
Contributors
Phillip Blais, Westinghouse Electric Corp. (Retired), Baltimore, Maryland
Franco Cerrina, University of Wisconsin, Madison, Wisconsin
Tony Daou, Motorola, Inc., Chandler, Arizona
William B. Glendinning, Consultant, South Bristol, Maine
John N. Helbert, Motorola, Inc., Mesa, Arizona
Charles T. Lambson, ASML, Hong Kong, China
Fourmun Lee, Motorola, Inc., Chandler, Arizona
Allen Lepore, Army Research Laboratory, Adelphi, Maryland
Kenneth Medearis, Kenneth Medearis Associates, Fort Collins, Colorado
John Melngailis, University of Maryland, College Park, Maryland
Michael Michaels, Westinghouse Electric Corp., Baltimore, Maryland
Whit Waldo, Motorola, Inc., Austin, Texas
Arnold Yanof, Motorola, Inc., Chandler, Arizona
NOTICE To the best of our knowledge the information in this publication is accurate; however the Publisher does not assume any responsibility or liability for the accuracy or completeness of, or consequences arising from, such information. This book is intended for informational purposes only. Mention of trade names or commercial products does not constitute endorsement or recommendation for use by the Publisher. Final determination of the suitability of any information or product for use contemplated by any user, and the manner of that use, is the sole responsibility of the user. We recommend that anyone intending to rely on any recommendation of materials or procedures mentioned in this publication should satisfy himself as to such suitability, and that he can meet all applicable safety and health standards.
Contents
1 Issues and Trends Affecting Lithography Tool Selection Strategy
Phillip Blais, Michael Michaels, and John N. Helbert
1.0 INTRODUCTION
    1.1 Device Lithography Requirements: Advances and Predictions
    1.2 Semiconductor World Fab Status
    1.3 Wavefront Engineering and Reticle Fabrication Maskshop Issues
2.0 STRATEGY
    2.1 Charter
    2.2 Marketing
    2.3 Product Development
    2.4 Production Facility
    2.5 Technical Capability
    2.6 Types of Lithography
    2.7 Economic Factors
3.0 IMPLEMENTATION OF STRATEGY
4.0 SUMMARY
REFERENCES
2 Resist Technology: Design, Processing, and Applications
John N. Helbert and Tony Daou
PREFACE
1.0 INTRODUCTION TO PATTERN TRANSFER TECHNOLOGY
2.0 RESIST DESIGN
    2.1 Conventional Photoresists
    2.2 Deep UV Resists
    2.3 Radiation Resists
    2.4 Future Resists
3.0 RESIST PROCESSING
    3.1 Resist Parameter Screening
    3.2 Resist Adhesion Requirements
    3.3 Resist Application
    3.4 Prebake/Exposure/Postbake/Development Processing
4.0 LITHOGRAPHIC PROCESSING EQUIPMENT
    4.1 Wafer Processes and Equipment (Wafer Tracks)
    4.2 Resist and Develop Track Fab Qualification
    4.3 DUV Resist Wafer Tracks
    4.4 Photochemical Support to Modern Fabs
5.0 APPLICATIONS AND SPECIAL PROCESSES
    5.1 Future Device Demands
    5.2 Introduction to Multilayer Applications
    5.3 Introduction to MLM Lithography
    5.4 Applications
    5.5 Summary and Future Predictions
    5.6 Future Processes
REFERENCES
3 Lithography Process Monitoring and Defect Detection
Fourmun Lee
1.0 OVERVIEW
2.0 DEFECT DETECTION TOOLS
    2.1 History
    2.2 Inspection Equipment Requirements
    2.3 Detection Techniques
3.0 DATA ANALYSIS AND DEFECT CHARACTERIZATION
4.0 PROCESS OPTIMIZATION AND QUALIFICATION
5.0 DEFECT REDUCTION
6.0 CASE STUDIES
    6.1 Center Stripe Defects
    6.2 Circle Defects
    6.3 Repeater Defects
    6.4 New Process Optimization
REFERENCES
4 Techniques and Tools for Photo Metrology
Arnold Yanof
1.0 INTRODUCTION
2.0 CD SCANNING ELECTRON MICROSCOPE (CD-SEM)
    2.1 Basic CD-SEM Equipment and Measurement
    2.2 Characteristics and Limitations of Low Voltage SEM Imaging and Metrology
    2.3 CD-SEM Measurement Validity
3.0 ELECTRICAL CD (ECD) METROLOGY
    3.1 Types of ECD Test Structures
    3.2 Gauge Capability and Accuracy of ECD
4.0 OVERLAY MEASUREMENT
    4.1 Basic Optical Overlay Measurement
    4.2 Overlay Metrology Tool Performance
    4.3 Plotting Overlay Results
    4.4 Process-Related Overlay Measurement Errors
5.0 FILM THICKNESS BY ELLIPSOMETRY AND REFLECTANCE SPECTROMETRY
    5.1 Optical Thin Film Phenomena
    5.2 Light Polarization Basics for Ellipsometry
    5.3 Basic Ellipsometer
    5.4 Film Thickness Instrumentation for Semiconductor Use
    5.5 Physics of Optical Film Thickness Measurement
6.0 STATISTICAL APPLICATIONS TO METROLOGY
    6.1 Definitions of Accuracy, Precision, Reproducibility and Matching
    6.2 Analysis of Variance for Metrology Gauge Studies and Process Analysis
    6.3 Chi-Square Test for Variance Comparisons
REFERENCES
5 Techniques and Tools for Optical Lithography
Whit Waldo
1.0 INTRODUCTION
2.0 FRAUNHOFER DIFFRACTION
    2.1 Diffraction Through a Rectangular Aperture
    2.2 Diffraction Through a Circular Aperture
    2.3 Airy Disk
3.0 THEORETICAL RESOLUTION LIMIT
4.0 DIFFRACTION GRATINGS
5.0 FOURIER SYNTHESIS
6.0 ABBE’S THEORY OF IMAGE FORMATION
7.0 INTRODUCTION TO TRANSFER FUNCTIONS
    7.1 Spread Functions
    7.2 Modulation
    7.3 Modulation, Phase, and Optical Transfer Functions
    7.4 Cascading Linear Functions
    7.5 Illumination Degree of Coherence
    7.6 Wavelength Effect on MTF
    7.7 Depth of Focus
    7.8 Diffraction Limited Resolution
    7.9 Minimum MTF Requirement
    7.10 Field Application of Transfer Functions
8.0 DESIGN CONSIDERATIONS FOR IMAGING EFFECTS
    8.1 Laser Interferometry
    8.2 Aberration Modeling
    8.3 Aerial Image Intensity Distribution
    8.4 Shaped Illumination Sources and Spatial Filtering
9.0 NUMERICAL AND STATISTICAL METHODS
    9.1 Data Regression
    9.2 F-Test and T-Test
    9.3 Multifactor Experiments
    9.4 Analysis of Experiments
    9.5 Process Control
10.0 PRACTICAL IMAGING QUALITY
    10.1 Field Diameter and Resolution
    10.2 Exposure-Defocus Diagrams
    10.3 Depth Of Focus Issues
    10.4 Illumination
    10.5 Thin Film Interference and Standing Waves
    10.6 Vibration
    10.7 Miscellaneous Processing Issues
    10.8 Industrially Accepted Designs
11.0 PRACTICAL IMAGE PLACEMENT
    11.1 Alignment
    11.2 Field Errors
12.0 MASK ISSUES
    12.1 Particulate Protection
    12.2 Phase Shifting Masks
    12.3 Serifs
    12.4 Excimer Laser Irradiation Damage
    12.5 Registration Error Contributions
REFERENCES
6 Microlithography Tool Automation
Charles T. Lambson
1.0 AUTOMATION BASICS
    1.1 Introduction
    1.2 Automation Is a Gradual Process
    1.3 Cluster Tools
2.0 CELL CONTROLLERS
    2.1 Motivation for Cell Controllers
    2.2 Work Cells
    2.3 Model Cell Controller
    2.4 Cell Controller Benefits
3.0 EQUIPMENT COMMUNICATION INTERFACES
    3.1 SECS-I Protocol
    3.2 The SECS-II Standard
    3.3 The GEM Standard
    3.4 The SEM Standards
4.0 STATE MODELS
5.0 LADDER DIAGRAMS
6.0 MATERIAL TRANSPORT
    6.1 Fab Layout Considerations for Automated Material Transport
    6.2 Interface Considerations for Automated Material Transport
    6.3 CIM (Computer Integrated Manufacturing) Architecture Considerations
REFERENCES
7 Electron-Beam ULSI Applications
Allen Lepore
1.0 INTRODUCTION
2.0 THE LITHOGRAPHY PROCESS
    2.1 Logistics of Exposure
    2.2 Physics of Exposures
    2.3 Lithography Process Issues and Parameters
3.0 ELECTRON-BEAM LITHOGRAPHY EQUIPMENT
    3.1 Introduction
    3.2 Electron-Optical System
    3.3 Writing Strategies and Architecture
    3.4 Calibrations
    3.5 Examples of Commercial Equipment
    3.6 Novel Electron-Beam Technologies
4.0 RESIST
    4.1 Introduction
    4.2 Resist Properties
    4.3 Positive Electron-Beam Resists
    4.4 Multiple-Layer Resist Strategies
    4.5 Negative Electron-Beam Resists
    4.6 Conductive Overlayers
    4.7 Inorganic Resists and Self-Assembled Monolayers
5.0 COMPETING TECHNOLOGIES
6.0 ACKNOWLEDGEMENTS
REFERENCES
8 Rational Vibration and Structural Dynamics for Lithographic Tool Installations
Kenneth Medearis
1.0 INTRODUCTION
2.0 STRUCTURAL DYNAMICS, VIBRATION, AND STRUCTURAL ENGINEERING
3.0 TOOL EXCITATION SOURCES AND LEVELS
    3.1 Tool Excitation Sources and Levels—Case Study 1
    3.2 Tool Excitation Sources and Levels—Case Study 2
4.0 DISPLACEMENT, VELOCITY, OR ACCELERATION CRITERIA
    4.1 Floor Displacement Criteria
    4.2 Tool Manufacturers Floor Vibration Specifications
5.0 VIBRATION-RESISTANT SUPPORT PEDESTALS FOR TOOLS
6.0 SYSTEM “ISOLATION”
7.0 CONCLUSIONS AND COMMENTS
8.0 RECOMMENDED TOOL AND FLOOR VIBRATION CRITERIA
REFERENCES
9 Applications of Ion Microbeams Lithography and Direct Processing
John Melngailis
1.0 INTRODUCTION
2.0 ION-SURFACE INTERACTION
3.0 FOCUSED ION BEAMS
    3.1 Machinery
    3.2 Point Sources of Ions
    3.3 Ion Column
    3.4 Beam Writing
4.0 FOCUSED ION BEAM APPLICATIONS
    4.1 Low Energy Ga Ion Beam Applications
    4.2 Applications of the High-Voltage Mass-Separated FIB Systems
5.0 FOCUSED ION BEAM LITHOGRAPHY
6.0 MASKED ION BEAM LITHOGRAPHY
    6.1 The Mask
7.0 ION PROJECTION LITHOGRAPHY
    7.1 Ion Source
    7.2 Mask
    7.3 Ion Optical Column
    7.4 Pattern Lock System
    7.5 Optical Column Design
    7.6 Stochastic Blur
    7.7 Resist Exposure
8.0 CONCLUSION
REFERENCES
10 X-Ray Lithography
William B. Glendinning and Franco Cerrina
PART I
1.0 INTRODUCTION
2.0 X-RAY PRINTING METHOD—SYSTEM APPROACH
    2.1 X-Ray System Definitions
    2.2 Minimum Feature Size and Line Width Control
    2.3 Overlay Accuracy
    2.4 Throughput
3.0 X-RAY SYSTEM COMPONENTS
    3.1 Sources for X-Ray Flux
4.0 MASK TECHNOLOGY
    4.1 Minimum Line Width and Control
    4.2 Overlay
    4.3 Throughput
5.0 MASK CONSTRUCTION
    5.1 Mechanical and Optical Distortions
    5.2 Defects
    5.3 Inspection
    5.4 Pattern Generation
6.0 ALIGNMENT
    6.1 Interferometric Schemes
    6.2 Non-Interferometric Schemes
7.0 RESIST
8.0 METROLOGY
9.0 X-RAY SYSTEM
    9.1 X-Ray Radiation Damage to IC Devices
10.0 CONCLUSION FOR PART I
PART II
11.0 SYNCHROTRON RADIATION SOURCES
    11.1 Introduction
    11.2 Properties of Synchrotron Radiation
12.0 TYPES OF MACHINES
13.0 BEAM TRANSPORT SYSTEMS
    13.1 Vacuum Requirements
    13.2 Optical
    13.3 Data Communication
    13.4 Safety Issues
    13.5 Machines and Lithography
14.0 ACKNOWLEDGMENT
REFERENCES

Index
1 Issues and Trends Affecting Lithography Tool Selection Strategy

Phillip Blais and Michael Michaels
Westinghouse Electric Corporation, Advanced Technology Labs, Baltimore, Maryland

John N. Helbert
Motorola, Inc., Compound Semiconductor Fab-2, Mesa, Arizona
1.0 INTRODUCTION
Integrated circuit (IC) fabrication requires a long sequence of complex processes. Lithography, which typically recurs ten to thirty or more times in a given device flow, is the most important of these because it defines the dimensions, doping, and interconnection of each segment of each device. Literally, this indirect process defines nearly all of the working elements of the IC device. The dominance of lithography in the total cycle time to fabricate an IC device is shown in Fig. 1.[1] Lithography consumes ~60% of the total time and roughly 40% of the cost required to fabricate IC devices. Since labor costs are directly proportional to cycle time, the selection of the appropriate, and ideally optimum, lithographic technique and associated tool can be critical to the success of a wafer fab operation. The best choices may differ between experimental fab operations and high-volume production fabs, but in either situation the choice can be critical. Factors governing lithographic technique and tool selection begin with a basic requirement for technical capability, continue through economic considerations, and end with such factors as production volume, turnaround time, product planning, process availability, and others.
Figure 1. Lithography dominates in determining the total cycle time for IC processing.
Since the early days of semiconductor production, optical lithography has been the choice for volume semiconductor manufacturing. The real question is when a non-optical method will take its place. The non-optical candidates have not really changed since the early 1980s, apart from the continued delay in their becoming mainstream or being used for volume production. These candidates are still e-beam, x-ray, ion-beam, and extreme-UV (EUV).[2] For example, optical lithography has been the cost-effective tool of choice since the 1960s, while the number of CMOS device levels has grown from eight to thirty or more; in the meantime, x-ray lithography (1979) has been relegated to R&D status and will probably remain there until dimensions fall below 0.13–0.10 µm. The barriers are always cost and infrastructure, not technical capability, even for the next optical generations. Infrastructure is defined as reliable resist processing and supplies, appropriate mask technology, metrology with gauge capability at the next generation of critical dimension (CD), tool vendor support, and so on.
In the 1970s and early 1980s, optical exposure tools operated at ~400 nm on average, and feature sizes, at 1.5 to many microns, were always greater than the exposure wavelength. In 1996, Sematech felt 0.30 µm was the limit for i-line lithography; for the first time, this limit was less than the tool illumination wavelength.[3] In optical lithography, there has always been an effort to print features smaller than the wavelength of the tool’s actinic source. In the past, these efforts were mainly confined to research or development areas, but in the future, production tools will also be employed at more aggressive performance levels. Optical design rules at a fixed numerical aperture (NA)/wavelength (WL) ratio are governed by k1 selection and are confined to k1 values greater than 0.5 for the sum of a line and a space (the pitch). Extending optical tool usage at a given NA/WL to lower k1 values requires wavefront engineering: off-axis reticle illumination (OAI), optical proximity correction (OPC), phase-shifting mask (PSM) technology, or all three. OPC has seen applications over the last two years, but the powerful combinations of techniques are still being developed. If achieved, these combinations could lead to <0.13–0.18 micron devices printed with 248 nm DUV optical lithography. With the caveat that experts are usually wrong because of vested interest or sheer optimism, John Sturtevant of Motorola SPS/APRDL predicts that optical lithography will extend to the year 2015 without a serious challenge. He also assumes that at least two laser wavelength generations will be involved. This innovation trend will not stop, and optical lithography will continue to be the main production tool for semiconductor device fabrication for at least another decade, possibly longer. For optical lithography to survive another decade, it too must have new technology that is cost effective.
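The k1 discussion above rests on the Rayleigh scaling relation, dimension = k1 × WL/NA. As a minimal sketch, the k1 factor implied by a target dimension can be computed as shown here. The function name and the NA values in the example cases are illustrative assumptions, not figures taken from this chapter.

```python
def k1_factor(dimension_nm: float, wavelength_nm: float, na: float) -> float:
    """Rayleigh k1 implied by printing a given dimension.

    Uses dimension = k1 * wavelength / NA, solved for k1.
    """
    if dimension_nm <= 0 or wavelength_nm <= 0 or na <= 0:
        raise ValueError("dimension_nm, wavelength_nm and na must be positive")
    return dimension_nm * na / wavelength_nm


# Illustrative cases; the NA values are assumptions chosen for the example.
cases = [
    ("i-line (365 nm), 0.30 um feature", 300.0, 365.0, 0.60),
    ("DUV (248 nm), 0.18 um feature", 180.0, 248.0, 0.60),
    ("DUV (248 nm), 0.13 um feature", 130.0, 248.0, 0.70),
]

for label, dim, wl, na in cases:
    print(f"{label}: k1 = {k1_factor(dim, wl, na):.2f}")
```

The same relation applies whether the dimension is a single feature or a line-plus-space pitch, provided the k1 limit is quoted for the same quantity. Lower k1 values correspond to more aggressive imaging, which is where OAI, OPC, and PSM come into play.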
Expert projections in ~1997 put the development costs for 193 nm DUV and 0.13 micron EUV lithographies in the ballpark of $350 million and $700 million, respectively. Will this development occur? No one really knows for sure, but based on history, the answer is probably yes, and these tools will more than likely begin to appear in the 1999–2000 timeframe, as discussed in Sec. 2.6.

1.1 Device Lithography Requirements: Advances and Predictions
Historically, dynamic random access memory (DRAM) devices have been a lithography driver because higher resolution lithography produces more bits per unit area and concurrently lowers the cost per bit. About every three years, a new DRAM generation requires a new lithography tool system, typically a new-generation optical stepper capable of producing three times more pixels at greater resolution.[4] The roadmap milestones of Fig. 2 clearly illustrate the dependence of DRAM and MPU technology, as well as device capacity and performance, upon lithographic capability.[5] In addition to the ever-increasing device density, the cost per logic transistor continuously decreases, as shown in Fig. 3 for an example modern 8" fab. This is the real driving force for the industry: as long as this cost/transistor trend continues, the semiconductor industry will continue to be profitable and to grow.
Selected Roadmap Milestones             1997       1999       2001       2003       2006       2009       2012
Dense lines (DRAM half-pitch) (nm)      250        180        150        130        100        70         50
Iso lines (MPU gates) (nm)              200        140        120        100        70         50         35
DRAM capacity (introduced)              256 M      1 G        n/a        4 G        16 G       64 G       256 G
MPU transistors/chip                    11 M       21 M       40 M       76 M       200 M      520 M      1.4 B
DRAM chip size, mm2 (year 1/year 6)     280/100    400/140    445/160    560/200    790/280    1120/390   1580/550
Litho field size (mm × mm)              22 × 22    25 × 32    25 × 34    25 × 36    25 × 40    25 × 44    25 × 52
Frequency (MHz) (across chip, hi-perf)  750        1200       1400       1600       2000       2500       3000
Max. wiring levels                      6          6–7        7          7          7–8        8–9        9
Min. logic Vdd (supply voltage) (V)     1.8–2.5    1.5–1.8    1.2–1.5    1.2–1.5    0.9–1.2    0.6–0.9    0.5–0.6
Max. wafer dia. (mm)                    200        300        300        300        300        450        450
Figure 2. SIA Roadmap milestones showing the linkage between lithography and device evolution. Taken from Ref. 5.
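The pace implied by the dense-line row of Fig. 2 can be made explicit with a little arithmetic. The sketch below computes the linear shrink between consecutive milestones and the equivalent per-year shrink rate, using only the half-pitch values from the roadmap. The function and variable names are illustrative.

```python
# Dense-line (DRAM half-pitch) milestones from the roadmap of Fig. 2, in nm.
half_pitch_nm = {1997: 250, 1999: 180, 2001: 150, 2003: 130,
                 2006: 100, 2009: 70, 2012: 50}


def shrink_steps(milestones):
    """Return (start_year, end_year, step_ratio, per_year_ratio) per step.

    step_ratio is new/old half-pitch for the step; per_year_ratio is the
    constant yearly ratio that would give the same shrink over the step.
    """
    years = sorted(milestones)
    steps = []
    for y0, y1 in zip(years, years[1:]):
        ratio = milestones[y1] / milestones[y0]
        per_year = ratio ** (1.0 / (y1 - y0))
        steps.append((y0, y1, ratio, per_year))
    return steps


for y0, y1, ratio, per_year in shrink_steps(half_pitch_nm):
    print(f"{y0}->{y1}: x{ratio:.2f} per step, x{per_year:.3f} per year")
```

Over the full 1997–2012 span, the roadmap half-pitch falls by a factor of five (250 nm to 50 nm).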
Figure 3. Cost per transistor vs. wafers produced (called outs) for an example 200 mm wafer logic factory.
To a first approximation, the cost of the lens system for a new-generation optical stepper is roughly proportional to the number of pixels it can image. Lens numerical aperture (NA) values also increase with each generation, which means the lens elements must increase in size as well. These lenses contain more than 100 lbs. of high-purity quartz and are ~3 feet tall or more; their optimization requires the solution of 100 simultaneous non-linear equations with many practical lithography boundary conditions.[6] The lens surfaces must now be produced to within the order of a few angstroms. Of course, these lenses could not be produced, or assembled, without phase interferometry. The 1-Gbit DRAM is projected to be made by 248 nm DUV lithography at a 0.18 micron minimum feature size in the year 2000, with a die size of 19 × 37 mm.[7] Although DRAM technology drives lithography, logic devices remain very profitable to build, even though the world-wide microprocessor business is only ~20%. Since 1981, the number of logic transistors per chip has been increasing at a rate of 21%/year.[8] Logic devices, microcontrollers, and microprocessors require higher-density circuits to achieve greater operating speeds. Since these speeds are driven by the vertical and horizontal device dimensions, higher resolution lithography drives this business as well. Channel lengths for CMOS devices are controlled by gate critical dimension (CD) control, all other things being equal.[9] Recently (Q3 1998), Motorola announced a 4 Mb CMOS SRAM with enhanced contact features of 150 nm (0.15 micron) and copper metallization (5 to 6 levels). The use of these features allows SRAM cell geometries to be reduced by 35%, with 50–70% less power consumption, improved interconnect reliability, and higher levels of performance (clock rates of 333 MHz) at lower cost.
Without this lithographic advancement, none of these device goals would have been achievable. According to John Sturtevant of Motorola, each nanometer of gate variation for these logic devices translates to a loss of 1 MHz in operating performance and ~$7.50/chip in device cost.[10] Digital signal processor (DSP) chips are gaining ground on microprocessors and programmable gate arrays. For example, in 1995 DSPs grew at a 73% rate. The expansion of the embedded controller and digital communications markets will continue and provide for continued DSP chip sales.[11] In the early 1990s, the semiconductor device world transitioned from bipolar to complementary metal oxide semiconductor (CMOS) device technology. According to Gordon Moore, the percentage is now 99% MOS, and bipolar has been supplanted primarily due to the ability of MOS to make high-frequency devices.[12] Moore further states that “0.35 micron design rule MOS is in production as of 1997 and 0.25 is closing in 1998. Excimer laser 248 nm DUV lithography will prevail from 0.25 to 0.18 or so, but tools operating at <193 nm have a lack of optical materials problem, i.e., nothing is transparent there.” For this reason, he feels some other lithography must be used at ~0.13 micron, finally moving away from optical. X-ray and e-beam stencil printing, the competing technologies, still suffer from high mask costs and manufacturability issues. “To achieve devices at <0.13 micron EUV technology may handle one to two generations of devices, but after that device technology becomes iffy, and can we afford the cost involved? EUV requires reflective optics of phenomenal fabrication precision, but it provides focus depth at features less than 0.13 and reduction optics.”[12] In Europe, production of 0.25 micron devices occurred in 1998, with production at 0.18 in 1999 and early production at 0.15 micron in the year 2000, all utilizing optical tools at 248 nm, according to J. 
Wauters.[13] STM plans to have a factory in production in 2000 producing 0.15 micron, 6-layer metal ICs with 15 million transistors/cm2. Siegel of AMD stipulates that 200 nm gates went into production in late 1998, but that getting to 100 nm gates will require a dizzying number of materials innovations and long, difficult timelines to be met (for example, he reminds us that it took DUV at 248 nm fifteen years to reach production applications). IC technology has always been moving towards higher levels of integration, but usually within a particular device family, for example, memories as they moved from 1 kb to 64 Mb. Chip component density is roughly inversely proportional to the minimum feature size squared.
Now, however, circuits or devices that were previously on separate chips are being combined on a single chip, a System-on-a-Chip (SOC).[15] To integrate device cores including microprocessors, digital signal processors (DSPs), embedded memory, control functions, analog, rf, and other devices, the cores must be designed from the outset for reuse and for process integration. An example SOC for the multimedia PC market would include main memory, CPU graphics, flash memory, controllers, MPEG encoders/decoders, and at least four controller chip functions, all on a single chip. Sasaki also shows that the number of components on a chip divided by the area is proportional to the inverse of the minimum feature size squared, a defining relationship.[15] SOC requires design reuse of device IP and cooperation between companies with different IP strengths. McIntosh, the COO of Philips, emphasizes that this requires lithographic tool sets to be reusable and capable of supporting two or more generations of design IP rules.
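The inverse-square relationship between component density and minimum feature size can be illustrated with a short calculation. The sketch below is a minimal example that assumes density scales exactly as 1/(feature size)^2, which the text describes only as a rough proportionality. The function name is illustrative, and the feature sizes are the design-rule generations quoted above.

```python
def relative_density(old_feature_um: float, new_feature_um: float) -> float:
    """Density gain from a shrink, assuming density ~ 1 / feature_size**2."""
    if old_feature_um <= 0 or new_feature_um <= 0:
        raise ValueError("feature sizes must be positive")
    return (old_feature_um / new_feature_um) ** 2


# Successive design-rule generations mentioned in the text, in microns.
generations = [0.35, 0.25, 0.18, 0.13]

for old, new in zip(generations, generations[1:]):
    gain = relative_density(old, new)
    print(f"{old:.2f} um -> {new:.2f} um: density x{gain:.2f}")
```

Under this assumption, each of these generation steps roughly doubles the achievable component density.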
What, then, are the limits of semiconductor device applications and standard processing? Device performance gains may slow down due to the non-scaling of threshold voltage, but CMOS devices will scale to at least 0.05 micron with substantial speed improvements.[16] According to Meindl of GIT, CMOS scaling will stop at ~62 nm, while optical lithography will stop at ~125 nm, due to the physical limitations of channel lengths and lens materials, respectively.[17] According to Gordon Moore, devices may stop behaving well at 50–100 nm, and statistical problems arise with individual devices because of doping atom variations.[12] Nevertheless, if these limits are true, they will allow lithography and semiconductor processing years, if not decades, of continued application and further approximate conformance to Moore’s law.

1.2 Semiconductor World Fab Status
In 1998, the top ten semiconductor companies were headed by Intel, which was ~70% larger than the next company, NEC (see Table 1). In 1995, the world semiconductor chip business stood at $144 billion, having previously doubled in size every five years, from roughly $50 billion in 1990.[18] Unfortunately, according to the Semiconductor Industry Association, the world chip market had shrunk to $122 billion in 1998, but 9.1% growth is predicted for 1999. On the more positive side, the average car’s semiconductor content, for instance, will go from $60 in 1984 to about $400 by 2000; this is driving the growth in the microcontroller semiconductor business. At the same time, the cost per 1 million computer instructions went down by a factor of about eighteen from 1980 to 1995.[19] Over the last three decades, growth of the IC market, and of semiconductor fabs, has been fueled by a 25–30%/year improvement in the cost per DRAM bit or logic function. This occurred due to increased capital utilization, increased wafer size, improved die yield, and shrinking lithographic print capabilities. From the data above, it is easy to see why the chip business is predicted to resume growing at such a rapid rate.

In 1996, Dataquest reported that nearly half of the fabs were still producing devices at greater than 0.8 microns. Furthermore, DUV exposure tool shipments were predicted to remain at fairly low levels until the late 1990s. This has in fact remained the case into 1998, with most advanced fabs still running predominantly 0.5 micron technology on i-line optical steppers. The use of DUV steppers is limited to device critical levels in a hybrid mix-and-match with i-line and to new device designs. To meet future demands on capital outlay, more mix-and-match lithographic technology is predicted as DUV tool costs approach $7 million.

Table 1. Top Ten Worldwide Semiconductor Vendors by Revenue Estimates (Source: Dataquest, January 1998)

1996 Rank  1997 Rank  Company            1996 Revenue ($M)  1997 Revenue ($M)  1996–1997 Growth (%)
1          1          Intel              17,781             21,083             18.6
2          2          NEC                10,428             10,656             2.2
3          3          Motorola           8,076              8,120              0.5
6          4          Texas Instruments  7,064              7,660              8.4
5          5          Toshiba            8,065              7,507              -6.9
4          6          Hitachi            8,071              6,523              -19.2
7          7          Samsung            6,464              6,010              -7.0
8          8          Fujitsu            4,427              4,872              10.1
9          9          Philips            4,219              4,435              5.1
11         10         Mitsubishi         4,100              4,097              -0.1
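The growth column of Table 1 follows directly from the two revenue columns. The sketch below recomputes it as a consistency check, assuming the first revenue column is 1996 and the second is 1997. The helper name is illustrative.

```python
# (company, 1996 revenue, 1997 revenue, published growth in %) from Table 1.
table1 = [
    ("Intel", 17781, 21083, 18.6),
    ("NEC", 10428, 10656, 2.2),
    ("Motorola", 8076, 8120, 0.5),
    ("Texas Instruments", 7064, 7660, 8.4),
    ("Toshiba", 8065, 7507, -6.9),
    ("Hitachi", 8071, 6523, -19.2),
    ("Samsung", 6464, 6010, -7.0),
    ("Fujitsu", 4427, 4872, 10.1),
    ("Philips", 4219, 4435, 5.1),
    ("Mitsubishi", 4100, 4097, -0.1),
]


def growth_percent(previous: float, current: float) -> float:
    """Year-over-year change, expressed as a percentage of the earlier value."""
    return 100.0 * (current - previous) / previous


for company, rev_1996, rev_1997, published in table1:
    computed = growth_percent(rev_1996, rev_1997)
    print(f"{company:18s} computed {computed:6.1f}%   published {published:6.1f}%")
```

The recomputed values agree with the published column to within rounding, which supports reading the first revenue column as 1996 and the second as 1997.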
Early in the next decade, modern new fabs will most likely be part of the transition to the 300 mm wafer size. In fact, the Siemens 300 mm pilot line, an example of an early 300 mm line, was established in the 1996–97 timeframe. This transition will affect total lithographic throughput through longer stepping time and total exposure time (i.e., more shots will be required). The tool challenges are to make the tool faster and to improve the alignment accuracy and precision (values <40 nm). These are not simple tasks; for example, the stage does not merely scale. Assuming current stage technology, and probably even new technology, wafer throughput will inevitably go down.[20]

Another future fab trend is the move to foundry device factories or, stated another way, to fabless semiconductor companies. The Asian, or Far East, countries are the leaders in these foundry fabs. For example, Charter Semiconductor in Singapore has a joint venture with HP and Toshiba to provide their deep-submicron lithographically fabricated devices. The companies that can reuse IP and utilize foundries will survive the future; the others may not, primarily due to costs.[21] Furthermore, the number of fabless companies is predicted to increase to ~550 in 2003, from roughly 50 in 1988.[22] Motorola, a company with a large chip factory portfolio and definitely not currently fabless, even plans to outsource ~50% of its die production to foundries by the middle of the next decade. Therefore, this predicted trend may be real and may have a significant future impact on cost and on fabs world-wide.

1.3 Wavefront Engineering and Reticle Fabrication Maskshop Issues
Wavefront engineering in optical lithography is usually made possible through technology applied at reticle or mask data design/preparation and at maskshop glass plate fabrication. It can also occur in the condenser optics of the stepper tool through the use of special apertures, but this still involves the reticle and how it is illuminated, so OAI is included in this section for completeness.[23] John Peterson of SEMATECH agrees and stresses that optical lithography will be extended by high-transmission ternary simulated phase shift applications, alternating phase shift blanks, phase shift error capabilities of less than 2 degrees, and better OPC; he also stresses that tunable illuminators, aberration-smart lens systems, and grid-based IC designs will be required.[24] Optical tool performance can be improved in resolution and depth of focus (DOF) by altering the reticle illumination angle, referred to as oblique or off-axis illumination (OAI), or by employing an advanced reticle fabrication technology, phase-shifting mask technology (PSM), which imposes a destructive-interference phase shift on the illuminating light passing through the reticle clear areas. Both techniques are described in Fig. 4.[25] Because oblique illumination apertures physically block light, their implementation can sometimes decrease overall light intensity and thus reduce throughput; on some, but not all, systems, and if not done in a certain manner, it can also decrease image field uniformity. ASML’s AERIAL system produces annular OAI illumination by refracting a circular beam into a ring, so that as much as 90% of the light is still transmitted; it is an example of a system in which light intensity, and ultimately throughput, are not lost. For device manufacturers employing both PSM and oblique illumination, custom-crafted apertures may also be required because the required aperture settings are pattern dependent. Lens distortion and overlay may also be affected by oblique illumination. 
Annular and quadrupole illumination schemes are only effective for line and space patterns oriented to the optical axis; patterns at odd angles are not enhanced, which reduces the general applicability. Finally, mask inspection is difficult with PSM because all mask areas have the same transparency.
[Figure 4 panels: normal illumination (T-mask); alternating aperture; off-axis illumination (T-mask). Each panel shows the intensity at the lens pupil and the intensity at the wafer; horizontal axis: displacement.]
Figure 4. Reticle illumination techniques and resulting intensities at the pupil (aperture) and wafer planes. *Note: normal binary masks are represented on the left, while PSM (center) and off-axis OAI are featured to the right.
According to John Nistler of AMD, “4X DUV S&S systems are installed at every major manufacturing production fab. Pilot production in 1998 was at sub-0.2 micron gates. Maskmakers haven’t pushed for smaller resolutions because many leading fabs have the ability to do off-axis reticle illumination to avoid costly and difficult OPC down to 180 nm.” [24]
Optical proximity correction predistorts images on the photomask so that the developed image on the wafer has the desired size.[25] Since OPC should be software-implemented, one would expect data handling for the non-resolvable structures, called serifs and facets, to be extensive, and it is. Inspection and certification of these reticles is also a problem due to the large number of edge-detected defects. Phase-shifting masks employ reticle structures that produce destructive interference in the image, which enhances resolution and DOF at the wafer plane. Since destructive interference does not require a precise in-focus condition, the resolution enhancement produced by the PSM structures is accompanied by increased DOF, produced via 2-beam (or two-order) image composition (see Fig. 5). Both of these techniques provide relief from the DOF loss associated with reduced feature sizes. The data of Fig. 5 show this clearly. Levenson has further refined the wavefront engineering limits in Fig. 6. In late 1996, Chris Spence presented a paper entitled “Attenuated PSM (AttPSM): From R&D to Production” at the IEEE Lithography workshop.[26] It dealt with embedded or attenuated phase-shifting masks and was a key definitive paper for AttPSM. For most companies, even Japanese companies, this is the favored approach to PSM due to its fabrication simplicity. Even so, a great deal of work had to be done in the areas of resists, multiple film stacks, subresolution mask writing, and inspection algorithms. Although more work was required than expected, this technique is the most advanced and the most production-worthy at this time. Feature biasing and OPC structures to prevent sidelobe printing proved to be very important, and the addition of subresolution features next to larger metrology features was needed. The end result of this project was a reticle technology and a written spec which can be utilized at DuPont Photomask. DuPont Photomask confirms this capability.
The mask is a 2-layer structure requiring either two writing steps or one step with subresolution features; in addition, new metrology equipment for measuring the phase and transmission of mask blanks had to be added. Phase angle control of better than eight degrees and transmission errors of less than 1% are required for successful applications. Attenuated phase-shift masks are being employed in conjunction with OPC to reduce artifacts of PSM itself.[27] The combination of techniques allows lithography at about one-half the wavelength, while overcoming attenuated phase-shift artifacts to achieve superior CD control and resolution improvements.
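The destructive-interference principle behind phase-shifting masks can be illustrated with a deliberately simple model. The sketch below is a toy calculation, not a lithography simulator. It assumes coherent imaging, represents each of two neighboring clear openings as a Gaussian amplitude spot in arbitrary units, and sums amplitudes before squaring. The spot width, spacing, and function names are all assumptions made for illustration. The 8 degree phase error case corresponds to the phase control figure quoted above.

```python
import math


def image_intensity(x, centers, phases_deg, blur):
    """Toy coherent image: sum complex amplitudes of Gaussian spots, then square."""
    re = im = 0.0
    for center, phase in zip(centers, phases_deg):
        amp = math.exp(-((x - center) ** 2) / (2.0 * blur ** 2))
        re += amp * math.cos(math.radians(phase))
        im += amp * math.sin(math.radians(phase))
    return re * re + im * im


def contrast(centers, phases_deg, blur):
    """Contrast between the first opening's center and the gap midpoint."""
    peak = image_intensity(centers[0], centers, phases_deg, blur)
    gap = image_intensity(0.5 * (centers[0] + centers[1]), centers, phases_deg, blur)
    return (peak - gap) / (peak + gap)


centers = (-0.5, 0.5)  # two openings one unit apart (arbitrary units)
blur = 0.4             # assumed image blur, comparable to the spacing

print("binary mask (0, 0 deg):        ", round(contrast(centers, (0, 0), blur), 3))
print("alternating PSM (0, 180 deg):  ", round(contrast(centers, (0, 180), blur), 3))
print("PSM with 8 deg error (0, 172): ", round(contrast(centers, (0, 172), blur), 3))
```

In this toy model, the in-phase (binary) case leaves substantial light between the two openings, whereas the 180 degree case forces a null at the midpoint. A phase error of a few degrees raises that minimum only slightly.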
Figure 5. Comparison of reticle illumination techniques to provide greater depth of focus, as shown in the center and right hand figures. Figure courtesy of A. R. Neureuther and SPIE, volume 1674, p. 98, 1992.
Lithography Tool Selection 13
JMR
2/23/01
JMR
Figure 6. Resolution limits of the various wavefront engineering techniques as a function of illumination wavelength. Taken from Ref. 24. [Chart not reproduced. Horizontal axis: wavelength (365 nm, 248 nm, and 193 nm), each at NA = 0.55 and 0.70. Vertical axis: CD (µm), with ticks at 0.100, 0.125, 0.18, 0.25, 0.35, and 0.50.]
Legend: n = Dark lines, o = Bright spaces, C = Conventional, OPC = Optical proximity correction, WPSM = Weak phase-shifting mask, PSM = Phase-shifting mask, OAI+φ = Off-axis illumination plus WPSM, or "alternating-aperture" PSM with σ = 0.3.
The choice for lithographic tools below 0.1 micron will be, according to G. Feit of Sematech, one of mask technologies.[28] All of the non-optical options here (ion projection, SCALPEL, EUV, and even the older x-ray projection technique) require sophisticated and expensive mask or reticle support, and it may be that some of these are bypassed entirely by mainstream IC production. The manufacturing challenges of producing the next generation of large format photomask materials may also be formidable for DUVL maskmaking at 0.25 µm and below. First of all, a 9 × 9" reticle blank technology must be developed and scaled up. New resists, not merely hand-me-down systems, must be developed; the problem here is that resist manufacturers do not see the return potential. There are also infrastructure issues, and Hoya is concerned about whether it can absorb the total development costs for the required mask blanks. Even Cr film work will be necessary to meet the dry etch CD requirements. At 193 nm, reticle materials may be scarce relative to demand. The cost of meeting the challenges associated with upgrading a mask fab facility to 230 mm (the 9" mask specification) is estimated to be in the range of $50–100 M. Grenon further estimates the cost of quartz blanks with coated resist to be $5–10 k.[29] Results from the poll of the referenced paper had OPC as the top priority and PSM second for areas “needing attention.”[30] There is an immediate need for manufacturable and aggressive (i.e., fast or short cycle time) optical proximity correction; basically, drop-in OPC is what is needed. In 1998, 0.3 micron i-line reticle and 0.2 micron DUV 6" quartz technology was available. In early 1999, 0.3 micron reticle technology still represented only a small percentage of the total reticle market.
For advanced logic devices at <0.3 micron design rules, the share of standard binary reticles per device set is projected to drop from ~70% in 1999 to <40% by 2004, with the difference made up by OPC/PSM reticles. Average reticle cycle time is roughly four days or less. Experts at Hoya believe 9" reticles will be developed, but the development costs could be $100 M, $150 M, and $250 M for reticles for 0.25, 0.18, and 0.13 micron lithography, respectively. There is also a projected shortage of technical people for mask shops, and an x-ray blank currently costs about $20 k.
2.0 STRATEGY
The purpose of this chapter is to define a method for selecting an optimum lithography system. There are many factors in the selection process. The primary factor is the technical requirements: Are the equipment and process capable of defining and registering the features we need to produce? Then we turn to the economics of the situation: entry cost (capital) and operating costs (material and labor) are the major components of cost. Economics is the dominant factor once the decision has been made as to which equipment can meet the technical requirements. Within the economic constraints, we can establish the tradeoffs among capital investment, recurring cost of operation including maintenance, and acceptable yield range. Figure 7 shows a block diagram for a prudent decision-making process. The role of charter, marketing, and production requirements in the decision process is clearly evident. Together, these factors determine the requirements imposed on the lithography system to be acquired. The strategy for choosing an optimum lithography system requires that these nontechnical factors be considered in addition to technical capability and economics.
Figure 7. Block diagram for strategy of tool selection.
Dataquest analysts Fuhs and Pakdaman predicted in 1996 that there would be a pause in equipment market growth starting in 1997 and extending into 1998, as new capacity was absorbed in the DRAM and foundry sectors.[31] They proved to be prophetic: in October of 1998 the book-to-bill ratio for equipment dropped to less than 0.7. Their prediction that 1998 was to be the most difficult year looks like an excellent call at the time of this writing. They also predicted the lengthening of the development cycle for DRAMs from 3.5 to 5 years, which remains to be verified. In Fig. 8 from Ref. 32, it is apparent that the IC and equipment B/B ratios are cyclical, and that there is a phase lag of roughly 6–8 months between the curves; note also that over the period from 1992 to early 1996 neither B/B dropped below ~1. It took a global recession in 1998 to deter the results for that time period as hypothesized by Fuhs and Pakdaman.
Figure 8. Book-to-bill (B/B) ratio for the IC and semiconductor equipment businesses. Book is product orders and bill is the billed delivered product. A B/B above one is a sign of prosperity and order backlogs.
2.1 Charter
An important input to strategy is to consider the charter. Charter is the overall philosophy and objective of the operation. The charter derives from the types and quantities of the IC products and the degree of difficulty
of the process and product. For instance, an operation working on research and development will be at the cutting edge of the technology and, therefore, will be more concerned about technical capability than about throughput and yield. On the other hand, a facility which is responsible for only one product using one IC process, and which is expected to reach very high volumes, will use a mature process on a well-defined product with emphasis on costs, yield, and throughput. One can visualize classes of applications based on volume and diversity of products and IC processes. This classification is subjective, and other schemes can be used. The classes are described below in terms of volume, which is a complex result of demand and pricing in the marketplace.

Research and Development Class. These applications are primarily driven by technical issues. They usually involve many types of IC processing with versions of each IC process such that the total number of variants could be well over ten. In addition, the number and types of products, including digital devices and analog devices, is usually well over ten. Such an environment will produce a very small number of any one product, a hundred or less, and will at best produce prototype devices. More often than not, such a process line will run a large number of short loop experiments as opposed to running the entire process sequence. This is the only class in which production cost is not the main consideration. Some of the primary considerations for this class are flexibility, quick turnaround time, pushing the state of the art, exploratory work, and a very small volume of end product. As such, a trade-off of throughput versus cost of fabrication has only a secondary influence. This is not to say that cost is ignored but, instead, to state that cost associated with volume is of secondary importance.
An example of work done at this type of facility is provided by recent work at Sematech, an industrial consortium with a large lithography program in Austin, Texas. Their Delphi Project has produced 0.1 micron, or 100 nm, images on a 248 nm S&S system by the application of OPC and alternating phase shift photomasks. This is viewed as the establishment of capability limits, not that of a volume-production fab. For example, their 0.13 µm feature required a 0.5 µm space and their 0.1 µm structure required a ~0.3 µm space.[33]

Very Low Volume/Many Processes and Products Class. This class of process fabrication line is often considered a development line, and not a research line. Its primary purpose is to develop a refined process sequence and to check out some early prototyping of products which are not
necessarily ready to be sold as product but could provide early product samples. This class is not dominated by economics, but is strongly influenced by it. Its primary drivers are still flexibility and technical criticality. Throughput and yield start to be more of an issue, but major issues of cost due to volume are still secondary. Such a line usually supports several types of processes with many types of products. The product volume is still in the 100’s per month at most, and the number of process versions is still large (in the low 10’s). Concerns about capital cost start to become more of an issue.

Low Volume/Multiple Processes and Products Class. This class of fabrication line is more manufacturing oriented than the lower volume classes previously discussed. Throughput vs. cost of fabrication and yields are becoming the primary drivers but still must be balanced against the flexibility necessary to run multiple processes and usually an even larger set of products. The processes will be mature, with a moderate degree of process upgrade and incorporation of new versions of an existing process. Product volume will start to reach into the hundreds per month and could even be in the thousands. This is the first class where configuration control, certification of process, and qualification of the line and products start to become very important issues. As such, the technical difficulty of working at the state of the art is less of a factor than in the lower volume classes.

Moderate Volume/Few Processes and Products Class. This class is a manufacturing operation. Its process and products are mature. It has only a few different types of processes and a small number of products, usually ten or less. The volume of products becomes the primary driver to produce a cost competitive device, and is numbered in thousands of devices per month.
The trade-off of throughput versus cost of fabrication dominates, and yield is the number one consideration. The cost of capital including facilities, equipment, raw materials and especially maintenance is the driving parameter for tool selection. Products need to have a long life, measured in years. The process sequence needs to become simplified to reduce the burden of fabrication. Repeatability is more important than flexibility although the presence of both could be the most significant factor.

High Volume/Few Products Class. This class of fabrication line is strictly a manufacturing operation whose charter is to produce low cost product. No development and few process adjustments are made on such a fabrication line. In fact, the class has basically only one process and only
a few products with product volume in the tens of thousands. The major factor is economics. The process and products are very mature and the process sequence is simplified more so than for the lower volume classes. Product performance falls within a wide band and flexibility is a minor consideration. Repeatability, very short calibration and characterization time, and equipment availability (which includes reliability) are the dominant factors. Scheduling is one of continuous flow, and multiple sets of equipment are the order of the day. Capital cost including facilities is a major order of business due to the large amount of equipment and the physical size of such an operation. Asian foundry operations can be considered to be in this class. With the emphasis on fabless companies and the move by larger semiconductor companies to increase their foundry production, this class of factory may become very important economically. It is also known that these factories have very low wafer cost and can be technically competitive.

Very High Volume/One Product Class. This class is nearly totally inflexible, with only one process and one product. Economics is the dominant consideration, especially when considering volume. As such, throughput and yield are the two key parameters. The process and product are the most mature of any class. Yield is paramount and throughput is a close second. Repeatability and availability (including reliability), along with the most simplified operation, are the dominant issues. No development is ever done on this line, which consists of the most streamlined of product flows with as few operations as can be tolerated. Efficiency and time of process are keywords. Scheduling is as straightforward as possible. Emphasis is on high yield and the shortest possible cycle time. It could be argued that the volume fabs of the future will be in this class.
These are fabs with 6–10 k wafer starts per week that run just DRAMs, microcontrollers, or microprocessors, employing essentially one process flow. Lithographically, we may see in the next decade a divergence from optical tools across these different types of dedicated fabs; i.e., the DRAM factories may go to x-ray tools, whereas the other types of fabs may not, due to costs and production compatibility.

2.2 Marketing
Marketing is an important input to tool strategy because it provides information regarding the future technical requirements for the lithography system to be acquired. Generally, the resolution requirements for production obey an exponential relationship with time. First discovered by
Gordon Moore,[34] the slope predicts halving of the critical dimension every six years (see Fig. 9). A more recent[35] extension of Moore’s Law is shown in Fig. 10, showing how DRAM and logic devices have developed since the 1975 timeframe of Moore’s original curve.
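The halving trend cited above can be written as CD(t) = CD0 · 2^(−(t − t0)/T), with T = 6 years. The short Python sketch below only illustrates that arithmetic; the function name, the starting year, and the 1.0 µm starting dimension are hypothetical values chosen for the example and are not data read from Fig. 9.

```python
def projected_cd_um(cd0_um, year0, year, halving_years=6.0):
    """Critical dimension after an exponential shrink: CD0 * 2**(-(t - t0)/T)."""
    return cd0_um * 2.0 ** (-(year - year0) / halving_years)


# Hypothetical starting point (illustrative only): 1.0 um in 1990.
for year in (1990, 1996, 2002):
    print(year, round(projected_cd_um(1.0, 1990, year), 3), "um")
```

Changing `halving_years` lets the same function be used to compare other statements of the trend, such as faster shrink rates quoted in later updates of the law.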
Figure 9. Minimum IC dimensions for future requirements.
Figure 10. IEDM update of Moore’s original curve to illustrate how the limits of the original curve in Fig. 9 did not materialize.
The curve shown in Fig. 9 only presents an approximation to the future requirements of the industry. A 1998 update to the law states that the size of the details delineated in a silicon device decreases by a factor of two every eighteen months. Brunner of IBM restates the law as: the number of pixels per image tool field doubles every two years, where a pixel is defined as the minimal printable square. Computing power has increased exponentially with time because of increases in the number of transistors/chip and memory size, and decreases in geometric feature size and gate delay.[36] Chip area also increases with time, placing very stringent requirements upon the exposure field for the lithography equipment, assuming an optical aligner.

The Marketing Department of the business entity has the responsibility to determine the precise requirements which are necessary to reflect the charter, customer demands, and areas selected for new growth. Unfortunately, many marketing departments tend to minimize the importance of this critical function as a result of emphasis on current sales. No IC facility can hope to become an industry leader without an annual report from Marketing which attempts to objectively predict future sales trends. Such a report is needed to formulate an action plan to prepare for future sales. Companies which fail to provide this guidance can hardly hope to have products ready for customer evaluation in prototype equipment with assurance to the customer that deliveries will be prompt. The report should at least include sales volume as a function of minimum geometry. The distribution of chip area vs. minimum geometry should also be presented, along with sales price. These reports in turn form the basis for evolution of the charter. A majority of the inputs to the annual report should be obtained through candid discussions with the customers throughout the year.
Additional sources of information should include observations of competitor announcements/trends, government reports, cyclic patterns in book/sell ratios, etc. Supplemental data can be gathered from internal sources, e.g., process development engineers and device designers.

2.3 Product Development
Product development is where the ideas developed in Marketing are transformed into a few physical samples which can be electrically evaluated. Product development frequently works with potential customers by evaluating their needs and submitting sample IC’s for evaluation. Here, the impact upon fab photo capability must also be assessed and decisions have
to be made on whether existing fabs can be employed or whether an entirely new factory must be built for this next-generation chip. The conversion of ideas into physical devices begins at a Computer Aided Design (CAD) system, which assists the designer in obtaining a detailed layout of the IC device which is consistent with a prescribed set of design rules. Sophisticated CAD systems are commercially available and their capability varies widely between manufacturers, depending in large part on the financial and engineering investments made. Normally, the CAD system is also capable of predicting the performance of the proposed design. Finally, the CAD system prepares a magnetic tape which contains the coordinates of all the surface features of the design. In the future, this system will be taxed even more heavily by advanced reticle technology, e.g., OPC and PSM, which requires extensive computing power. The magnetic tape is then used by the mask fabrication facility to make, inspect, and repair the masks to be used by the production facility. Alternatively, the magnetic tape can be used to pattern the wafers directly by using electron-beam lithography.

The product development group oversees the fabrication of prototype samples of the new design. Fabrication will normally be performed in either a special low-volume pilot line set up for prototype development, or in the actual production facility. The pilot line concept is typical for large semiconductor companies which can ill afford to disturb large production facilities. Smaller production facilities, on the other hand, normally have the flexibility to handle odd lots. Small facilities with Computer Aided Manufacturing (CAM) are especially adroit at mixing several products in a production line without processing errors. In the case where the state of the art is being pressed by more demanding processing requirements, the Process Development Department will be solicited for assistance.
Specifically, process development lithographers will be requested to reduce critical dimensions. The resulting efforts by the lithographer to meet smaller CDs will later provide important technical inputs for the selection of future lithography equipment. The successful development of sample ICs with improved performance will normally result in a decision to fabricate production quantities to meet the volume of expected sales. Input information to production should include:
i. Minimum critical dimensions (CDs)
ii. Chip area
iii. Inter-level registration requirements
iv. Delivery schedule
v. Mask levels
vi. Estimated sales price
2.4 Production Facility
Having received technical requirements for manufacturing new IC devices from Product Development, it is Production’s responsibility to fabricate the devices at a reasonable profit margin with the yield and volume necessary to satisfy the customers’ needs. Production is ultimately responsible for efficiently employing the optimum lithography equipment, and begins the process of selection by determining the minimum yield necessary to make a profit. This is a relatively easy task when the new product only represents a minor variation from the present products. The present total wafer fabrication costs are first assumed to remain fixed. The fabrication cost per functional chip is estimated from the difference between the expected selling price for the IC device and the sum of the profit margin, electrical testing and packaging cost. The total necessary yield, Yt, is then determined by:

Eq. (1)    $Y_t = \dfrac{\text{Total Wafer Fabrication Cost}}{n\,[\text{IC Sales Price} - (\text{Profit} + \text{Test \& Packaging Costs})]}$
where n is the number of chips fabricated on a wafer and the IC sales price is the price as sold to the customer. The volume of production per unit time is then determined by the ratio of delivery schedule to the yield, Yt. This is the full IC yield budget, a significant part of which is the lithography budget. The lithography personnel must then start translating this yield into specifications for their tools. Production is fully aware of the manufacturing space available for expansion, and the cost per square foot of area required to install the lithography equipment. These facts will later be used as inputs to compare the total economic impact of acquiring each lithographic system which has been found capable of meeting all the technical requirements. Production then sets a maximum capital appropriation which is available to purchase the lithography equipment. The appropriation is generally determined by a combination of: (i) total capital equipment funds to be made available during the year, (ii) depreciation schedules, (iii) available cash flow, (iv) tax considerations, and (v) sales trends.
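As a rough illustration of Eq. (1) and of the delivery-to-yield ratio described above, the following Python sketch computes the required yield and the corresponding wafer starts. All function names and dollar figures are hypothetical, and converting chips into wafer starts by dividing by n is an added assumption rather than something stated in the text.

```python
import math


def required_yield(wafer_cost, chips_per_wafer, sales_price, profit, test_and_packaging):
    """Eq. (1): minimum total yield Y_t needed to make the stated profit per chip."""
    allowed_fab_cost = sales_price - (profit + test_and_packaging)
    if allowed_fab_cost <= 0:
        raise ValueError("price does not cover profit plus test and packaging costs")
    return wafer_cost / (chips_per_wafer * allowed_fab_cost)


def wafer_starts(good_chips_needed, chips_per_wafer, yield_fraction):
    """Wafers to start so that deliveries are met at the given yield (assumed conversion)."""
    return math.ceil(good_chips_needed / (chips_per_wafer * yield_fraction))


# Hypothetical inputs: $1000 wafer cost, 200 chips per wafer, $20 sales price,
# $5 profit and $5 test and packaging cost per chip.
y_t = required_yield(1000.0, 200, 20.0, 5.0, 5.0)
print("required yield:", y_t)
print("wafer starts for 10,000 good chips:", wafer_starts(10_000, 200, y_t))
```

A computed Yt greater than 1 would indicate that the product cannot meet its profit target at the assumed wafer cost, whatever the lithography tool.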
Production engineering finally assigns an expert in lithography to determine which types of equipment and specific models are technically able to meet the functional requirements. Obviously, the lithographer, usually with help from an equipment engineer, will simultaneously try to stay within the available capital appropriation, installation area, and required production volume.

2.5 Technical Capability
Technical capability is an absolute requirement which describes the ability of a lithography system to produce a resist profile of given dimensions (resolution, edge acuity, CD control, etc.) properly registered to previous patterns and the effect of the tool on yield. Technical capability is a necessary but insufficient criterion for selecting a particular lithography system/tool. The technical capability of the equipment and its interactions with resist systems must be completely understood and compared with the requirements of the acquisition. This section will define how technical capability is determined from a combination of theoretical considerations, prior experience, experimental evaluation, and vendor information.

2.6 Types of Lithography
A wide variety of lithography systems exist, and a brief description of each is given in this section to orient inexperienced readers and to provide a background for technical capability. The reader will find each system more fully described technically in later chapters. This section will outline the resolution range over which each technique is technically capable of operation, and the approximate throughput in wafer-levels/hour. The types of lithography may be primarily categorized according to exposure radiation, such as optical, x-ray, e-beam and ion beam. Optical lithography is further subcategorized into proximity/contact printing, direct step on a wafer, full wafer reduction projection, and direct step and scan on a wafer (S&S). Likewise, e-beam lithography has subcategories of direct write on a wafer, projection printing (ELIPS), and stencil-masked printing (SCALPEL). Ion-beam lithography is being developed along three variations, namely: focused ion beam (FIB, direct write on a wafer), masked flood beam (step and repeat), and reduction stepper. The many types of lithography systems make the selection process very complex when comparing technical merit, cost of equipment, labor cost of fabrication, and many other facets.
A possibly critical future lithographic technology not covered in this book’s first edition may be extreme ultraviolet lithography (EUV). This tool technology started as a Dept. of Energy and DARPA funded research program at Lawrence Livermore National Lab aimed at 0.13 or 0.1 micron optical projection lithography with greater than 1 micron DOF. It must be remembered that this system is a vacuum lithographic system, like e-beam and ion-beam systems, thus requiring vacuum load locks and other subsystems that can affect throughput.

Optical Lithography. Optical lithography is generally the most cost effective lithography technique, whenever it is capable of meeting all technical requirements. Because the entire mask pattern is exposed in parallel, throughput is high with resists of moderate to fast sensitivity. The commonly used positive type diazide-novolak resists feature high sensitivity, high contrast, excellent adhesion, good resistance to dry etching, and low cost. The primary limitations are resolution due to diffraction and substrate reflectivity. Diffraction limits resolution because the image projected on the resist becomes increasingly blurred as the dimensions decrease. Substrate reflectivity adversely affects linewidth control because of standing waves in the resist and the effect of interference at the air-resist interface on coupling of the incident exposure dose into the resist. The major thrust of optical lithography over the last twenty years has been to reduce the wavelength used to illuminate the reticle circuit pattern, i.e., the actinic exposure wavelength of the aligner tool. Besides wavelength reductions, further tool performance improvements have also been made in lens and condenser optics technology, and through improving resists. The trend in tool forecasting for optical tools is shown in Fig. 11.
In Table 2, Brunner tabulates the data for several generations of aligners.[37] The reduction in wavelength occurs because the minimum feature size changes proportionately, and each generation of device requires it.[38] Optical lithography advances have been made possible through advances in optical design and materials, lens assembly techniques, interferometric metrology, lens element anti-reflective coatings, and precision engineering. The practical limit in tool numerical aperture (NA) probably lies between 0.7 and 0.8, due to the difficulty in fabricating large field lenses with pixel counts approaching 7–10 × 10^10 and with negligible aberrations. The increase in reduction lens pixel counts is beginning to level off, as shown in Fig. 12. Shrinking DOF will continue to be an issue as NA values increase, since the Rayleigh depth of focus is only about two times the wavelength for NAs in that range.
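The scaling behind these statements can be sketched with the Rayleigh relations Rmin = k1·λ/NA and DOF = k2·λ/NA². The Python example below is a minimal sketch, not the method used to build Table 2: the values k1 = 0.5, k2 = 1.0 and NA = 0.7 are assumptions, chosen only because they appear to reproduce the tabulated Rmin and DOF values approximately, and the function names are hypothetical.

```python
def rayleigh_resolution_nm(wavelength_nm, na, k1=0.5):
    """Rayleigh minimum resolution, R_min = k1 * lambda / NA (k1 is an assumed value)."""
    return k1 * wavelength_nm / na


def rayleigh_dof_nm(wavelength_nm, na, k2=1.0):
    """Rayleigh depth of focus, DOF = k2 * lambda / NA**2 (k2 is an assumed value)."""
    return k2 * wavelength_nm / na ** 2


# Exposure wavelengths (nm) for the sources listed in Table 2.
SOURCES = {"g-line": 436, "h-line": 405, "i-line": 365,
           "KrF": 248, "ArF": 193, "F2": 157, "Ar2": 126}

for name, wavelength in SOURCES.items():
    r_min = rayleigh_resolution_nm(wavelength, na=0.7)
    dof = rayleigh_dof_nm(wavelength, na=0.7)
    print(f"{name:7s} {wavelength:4d} nm  Rmin ~ {r_min:5.0f} nm  DOF ~ {dof:5.0f} nm")
```

With NA = 0.7 the DOF comes out at about twice the wavelength, which is consistent with the remark above about NAs in the 0.7 to 0.8 range.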
Figure 11. Tool forecasts through the year 2002. Courtesy of Semiconductor International, p. 20, Mar. 1998. [Chart not reproduced: "Lithography System Forecast"; vertical axis, unit shipments (0 to 1800); horizontal axis, year (1994 to 2002).]
Table 2. Wavelengths,[1] ∆λ/λ (calculated as (λ1 − λ2)/λ2), Minimum Resolution (Rmin), and Depth of Focus (DOF) for Optical Lithography Systems.

Source    λ (nm)    ∆λ/λ (%)    Rmin (nm)    DOF (nm)
g-line    436       0           311          850
h-line    405       7.6         290          790
i-line    365       19          260          730
KrF       248       47          175          500
ArF       193       28          140          400
F2        157       23          112          320
Ar2       126       25          90           257
Figure 12. Lens pixel counts, or resolution performance, vs. year. Note the slope reduction in the data; there is also a trend toward S&S systems and away from purely reduction steppers.
Being able to print features smaller than the wavelength of the tool exposure system is actually relatively easy.[39] Printing that feature with production-worthy DOF and sufficient overlay is another thing altogether. For example, at 0.18 micron optical lithography, the overlay (OVL) tolerance is expected to be ~50 nm. Measuring this requires a metrology tool with ~5 nm gauge, and the verification depends upon how the wafer is actually sampled.

Contact/Proximity Printing. Contact printing represents an optical system with the lowest total cost since both the equipment and process are simple, and throughput is high. The primary disadvantages are low yield caused by contact between the resist and mask, and poor registration due to the extremely large field area. “Hard” contact printing gives the highest resolution, but the yield is severely degraded by the repeated forceful intimate contact. Proximity printing alleviates the defect issue by not allowing the mask to directly contact the wafer. Resolution, however, is severely impaired by diffraction, which increases rapidly with mask-to-resist spacing (gap, g). Minimum linewidth is equal to:

Eq. (2)    $LW_{min} = Q\,(\lambda g/2)^{1/2}$
where Q is a constant, λ is the wavelength, and g is the mask-to-resist gap distance.
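Eq. (2) is simple to evaluate numerically. The Python sketch below is only an illustration with a hypothetical function name; it uses Q = 2.0, the approximate minimum value this section quotes for adequate CD control, together with the 365 nm wavelength and 1 µm gap of the section's own example.

```python
import math


def proximity_min_linewidth_um(wavelength_um, gap_um, q=2.0):
    """Eq. (2): LW_min = Q * sqrt(lambda * g / 2), all lengths in micrometers."""
    return q * math.sqrt(wavelength_um * gap_um / 2.0)


# 365 nm (i-line) exposure with a 1 um mask-to-resist gap.
print(round(proximity_min_linewidth_um(0.365, 1.0), 2), "um")

# The same wavelength with larger gaps (hypothetical values) shows the square-root growth.
for gap_um in (5.0, 10.0, 20.0):
    print(gap_um, "um gap ->", round(proximity_min_linewidth_um(0.365, gap_um), 2), "um")
```

Because the linewidth grows only as the square root of the gap, quadrupling the gap doubles the minimum printable linewidth.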
Equation 2 is only valid when λ² << gλ << LW². Adequate CD control requires a minimum value of Q equal to approximately 2.0. As an example, the minimum line/space width is approximately equal to 0.85 µm for a mask-resist gap of 1 µm using 365 nm radiation. The exposure time per wafer is of the order of 10 seconds, and throughput is primarily determined by the time required for the operator to align the mask to the wafer. The throughput, with automatic cassette loading and unloading and automatic alignment, is in the vicinity of 60 wafers/hour. Contact/proximity printing is generally becoming obsolete because of the high defect density associated with physical contact of the mask with the resist and should only be considered for acquisition under unusual conditions. An example is III-V and II-VI semiconductor devices, where circuit density is very low or device area is very small.

Full Wafer Scanning Projection Printing. Full wafer projection printing eliminates the problem of high defect density associated with contact printing, because a lens is used to image the mask onto the wafer. Full wafer projection systems normally project the image at 1X magnification using primarily reflection type lenses and a scanning slit to reduce distortion. The minimum linewidth practical for optical projection systems is equal to k1 × λ/NA, where k1 is a proportionality constant, λ is the wavelength of the exposure radiation, and NA is the numerical aperture of the projection lens. The value for k1 has been determined by experience to be 0.8 for production worthiness using single layer resist systems. Multilayer resist systems can use a value of k1 = 0.7. Pilot line operation with multilayer resist systems can, with extra care, use a k1 value of 0.6. The throughput is approximately 80 wafers/hour and is dominated by alignment time.

Direct Step On a Wafer, Stepper.
Direct step on a wafer extends the resolution capability of full wafer projection printing by easing the lens fabrication problem through limiting the exposure field. Reducing the exposure field decreases the focal length of the lens, which makes high numerical aperture lenses easier to design, fabricate, and assemble. The lenses are normally chromatically corrected only over a narrow range of wavelengths, and interference between the incident and reflected waves causes standing waves in the resists and variable coupling between the incident radiation and the resist system. The resolution limit of this type of tool is equal to k1λ/NA, where k1 is a constant, λ is the wavelength of the exposure radiation, and NA is the numerical aperture of the lens. At the time of the first edition of this book (ca. 1991), a state-of-the-art i-line (λ = 365 nm) stepper with a 0.42 NA
projection lens could reasonably be expected to resolve 0.69 µm with multilayer resist systems and special care. The depth of field for this system was at that time, however, only 1.04 µm, and this value included resist thickness, topographical effects and the precision of the focusing mechanism. At that time, the depth of field problem posed a most serious obstacle to optical lithography, especially at higher tool NA values and in the CD region between 0.25 and 0.5 µm. Today, this issue has subsided, at least for i-line, due to advances in resist technology, reticle technology, and stepper optics design. I-line tools today have larger NA (0.63) and can operate in volume production at less than 0.35 micron; 0.5 micron production with the previous generation i-line tool has been routine since the early 1990s. A rough opinion census of experts in 1996 predicted that the 0.35 micron chip generation would be done with i-line, the 0.25 micron generation with 248 nm tools, and the 0.18 micron generation with 193 nm tools. The throughput of an optical stepper, a critical element for large wafer fabs, is calculated using the following equation:
Eq. (3)    $W = \dfrac{3600}{t_{oh} + N_{field}\,(t_{align} + t_{step} + t_{exp})}$
where $W$ is the throughput in wafers/hour, $t_{oh}$ is the overhead time, $N_{field}$ is the number of exposure fields per wafer, and $t_{align}$, $t_{step}$, and $t_{exp}$ are the times necessary to align, step the stage, and expose each field (all in seconds). The overhead time $t_{oh}$ includes the time necessary to set up the lithography tool, change masks, and load wafers. The exposure time, $t_{exp}$ (sec), is equal to the required resist dose, $D_i$ (mJ/cm²), divided by the exposure intensity, $I_i$ (mW/cm²). Equations very similar to Eq. (3) can be written for every lithographic tool. Normally, the exposure time, $t_{exp}$, dominates the denominator; therefore, resist sensitivity and exposure brightness are very important factors in determining throughput. Throughput is typically 30 to 50 wafers/hour for 150 mm (6") wafers, and specifications for modern tools call for 100 or more 8" wafers/hour.
Direct Step and Scan On a Wafer. Step and scan (S&S) systems have recently been developed which use a full wafer scanning type of optics to expose small areas. These systems can be viewed as hybrids, a marriage between steppers and full wafer scanning systems. The advantages of broad exposure bandwidth and fewer optical surfaces are formidable. Equipment of this type is now available using both deep UV and i-line radiation sources.
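As a minimal sketch of Eq. (3), the Python fragment below computes the exposure time from dose and intensity and then the wafer throughput. The function names and all numerical inputs (overhead, field count, per-field times, dose, and intensity) are hypothetical values chosen only to illustrate the arithmetic; they are not specifications of any tool discussed here.

```python
def exposure_time_s(dose_mj_cm2: float, intensity_mw_cm2: float) -> float:
    """t_exp = D_i / I_i; mJ/cm^2 divided by mW/cm^2 gives seconds."""
    if intensity_mw_cm2 <= 0:
        raise ValueError("intensity must be positive")
    return dose_mj_cm2 / intensity_mw_cm2


def throughput_wph(t_oh: float, n_field: int, t_align: float,
                   t_step: float, t_exp: float) -> float:
    """Eq. (3): W = 3600 / (t_oh + N_field * (t_align + t_step + t_exp)).

    All times are in seconds; the result is in wafers per hour.
    """
    return 3600.0 / (t_oh + n_field * (t_align + t_step + t_exp))


# Hypothetical inputs, for illustration only.
t_exp = exposure_time_s(dose_mj_cm2=100.0, intensity_mw_cm2=200.0)
w = throughput_wph(t_oh=20.0, n_field=40, t_align=0.2, t_step=0.3, t_exp=t_exp)
print(f"t_exp = {t_exp:.2f} s, W = {w:.1f} wafers/hour")
```

With these assumed inputs the exposure term is half of the per-field time, so doubling the source intensity (or halving the required dose) shortens the per-field time by a quarter, which illustrates why brightness and resist sensitivity weigh so heavily on throughput.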
Since the cost of conventional reduction stepper lenses is approaching $1 M, scanning steppers, hybrid tools that scan a reduction reticle through a small well-corrected image field prior to stepping, become attractive. These tools, with higher NAs and reduced illumination wavelengths, also allow lithographers to keep up with the trend of increasing chip sizes with decreasing pattern dimensions; historically, when these trends coincide, a new generation of tool is usually required. Canon introduced its 300 mm wafer DUV step and scan system, the FPA 5000ES2, at Semicon in 1998; it has a 0.68 NA lens and a 26 × 33 mm field, and runs at 100 wafers/hr in 8" wafer throughput tests. The new flat stage and TTL (through the lens) phase grating auto-alignment system allow this system to achieve overlay approaching 50–70 nm.[40] Furthermore, Canon expects the overall cost to dictate the use of these systems for critical levels only, in a DUV/i-line mix and match hybrid method.[41] Nikon Stepper has a step and scan system similar in performance, and of course, SVGL first introduced this type of system operating at 248 nm in the 1980s.
Step and scan lenses can be purely refractive or can combine refractive and reflective elements, the latter being referred to as catadioptric.[42] The issue is size and weight. Cote of SVGL stipulates that a properly designed catadioptric lens having an NA of 0.6 can be two to three times more compact than the comparison reflective system, and five to six times more compact than a purely refractive stepping option; lens weights go from ~300 lbs to ~900 lbs to ~1900 lbs, respectively. The size differences can have a significant effect upon tool mechanical mounting and support systems, and upon the fab building structural and vibrational support systems as well (see Ch. 8). Bruning[43] has published the definitive article on why step and scan systems were needed.
Optical system lenses must be diffraction limited, which means that their aberrations, as determined by ray tracing through the optics, must be dimensionally smaller than the exposure wavelength. These aberrations are a composite of design and fabrication errors. Most importantly, Bruning stipulates that the critical scaling characteristics are proportional to field size and that lens material costs are proportional to the cube of the field size. Scanners obtain a sufficient exposure field to accommodate ever larger die without large and expensive refractive lenses. If the potential of optical lithography is to be exploited, then step and scan systems, which use smaller scanning optics, become attractive both economically and technically. Bruning stipulates that this trend should continue in the future, where virtually perfect smaller lenses with high NA will be developed, and where 2D or raster scanning is accomplished with only minor penalties to throughput.
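The cubic cost scaling attributed to Bruning can be illustrated with a rough sketch. It assumes, as a simplification not stated in the text, that the relevant "field size" is the diameter of the circular lens field that just covers the exposed rectangle, and it assumes a hypothetical 26 × 8 mm scanning slit. Only the 22 × 22 mm stepper field and the 26 × 33 mm scan field come from this chapter; the function names are illustrative.

```python
import math


def image_circle_mm(width_mm: float, height_mm: float) -> float:
    """Diameter of the smallest circular lens field covering a rectangle."""
    return math.hypot(width_mm, height_mm)


def relative_material_cost(d_mm: float, d_ref_mm: float) -> float:
    """Cost ratio under the quoted scaling: cost proportional to (field size)^3."""
    return (d_mm / d_ref_mm) ** 3


stepper = image_circle_mm(22.0, 22.0)       # conventional stepper field
static_large = image_circle_mm(26.0, 33.0)  # 26 x 33 mm exposed in one static shot
scan_slit = image_circle_mm(26.0, 8.0)      # ASSUMED slit height of 8 mm

print(f"stepper lens field:      {stepper:.1f} mm")
print(f"static 26 x 33 field:    {static_large:.1f} mm, "
      f"cost x{relative_material_cost(static_large, stepper):.2f}")
print(f"scanner slit (assumed):  {scan_slit:.1f} mm, "
      f"cost x{relative_material_cost(scan_slit, stepper):.2f}")
```

Under these assumptions, covering 26 × 33 mm in a single static exposure would require a lens field whose material cost is well over twice that of the 22 mm stepper lens, whereas the scanner reaches the same printed field with a lens field smaller than the stepper's.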
Step-and-scan systems also have been shown to reduce dynamic lens distortion by 50%, and dynamic field flatness by ~75%. DOF, one of the most important issues in semiconductor production lithography at 0.25 micron design rules, is increased by ~45%. According to ASML researchers, the advantages of step and scan systems are not realized until CDs reach 0.25 micron or below and device die sizes exceed 16 × 16 mm.[44] That is where the larger field sizes of the S&S tools become an advantage over stepper fields, which are typically limited to 22 × 22 mm; this is a throughput issue for single product, large volume fabs. The 0.18 and 0.15 micron device generations will probably be accomplished with 248 nm tools as well, because the S&S tools at those respective wavelengths may not be available except for product development. IMEC researchers in Europe have demonstrated 0.18 and 0.25 micron lithography with 248 nm DUV. Their goal was to show feasibility, because they also feel that 193 nm DUVL and its associated resist technology will be late to the market. The DOF at 0.18 µm was, however, only 0.8 micron (with 16% exposure latitude), which may not be adequate for high-volume production. Their 0.25 micron DOF was greater than a micron, and other published data exceeded 1.2 microns, so it does look encouraging for 248 nm applications at those feature sizes. Note that a bottom, RIE-developed ARC was used for this processing demonstration; without it, results are much poorer or totally unacceptable. Using top surface imaging, 1.1 microns of focus was obtained with 17% process latitude.
Stepper vs. Scanner Performance Comparison. According to Bill Arnold of AMD, who is considered an expert on step-and-scan systems, die sizes may be saturating at 100–200 mm², and the trend to even larger die sizes could be subsiding or leveling off as of late 1996.[45] Then why are scanners needed if the print field pixel requirements are leveling off at 2–8 × 10⁹?
His answer, like that of John Sturtevant of Motorola, is CD control. His concerns, on the other hand, are that (i) throughput could be lower for scanners (i.e., if restricted to smaller stepper field sizes), (ii) the costs for a mixed line would be 1.5 times that of an all-stepper line, and (iii) the cost for an all step-and-scan line would be nearly two times more, even with i-line and DUV step-and-scan systems mixed. As a result, Arnold has proposed that a 2D scanner needs to be developed for the future, consistent with the views of Bruning above.[46]
IBM’s Hibbs stresses that DUV step-and-scanners have distinct advantages over reduction steppers. Scanning systems can employ large scan fields (26 × 33 mm vs. 22 mm square stepper fields) achieved with a
smaller projection lens, but at the cost of greater complexity in the mechanical scanning systems. Scanner optics can provide coma, distortion, and lateral color relief, and the elimination of spherical aberrations, leaving just astigmatism and field curvature to be dealt with. A scanner can also provide anisotropic magnification, X and Y non-linearity and slit, anamorphism, and skew and rotation corrections, which are not correctable in stepping systems. At a given field size, it is always easier to fabricate low aberration optics for S&S systems than for steppers, according to M. Levenson. These scan adjustments can also produce a small amount of image smearing, but the dominant effect is a reduction of lens distortion. Field curvature is also reduced by focus averaging in the scan direction, at little cost to image quality. An additional benefit in accuracy of focus is gained by dynamically varying focus during the scan. This “terrain following” feature allows a closer match of the focal plane to the non-flat surface of the wafer than can be achieved with a single, large, statically exposed reduction stepper field.
CD control can be enhanced by these systems as well. A properly designed catadioptric lens centralizes the power to refractive elements where reflection angles are affected less by barometric changes. Intrafield CD control is also less affected, due to a smaller aberration susceptibility, and these systems have a smaller dependence upon laser bandwidth centering; thus bandwidths greater than 50 pm are acceptable. Refractive stepping systems do not have this technical advantage, another reason for employing the S&S systems. Phil Ware of Canon agrees that CD control can be better with a scanner.[40] He reasons that a scanner lens, having a smaller diameter, is easier to manufacture and should have less aberration.
When comparing laser-based steppers and scanning steppers, the scanner requires twice as many light pulses as the stepper to ensure uniform exposure, because the scanner is moving during exposure. He further sees the initial machine cost, running cost, and throughput at stepper-type field sizes as favoring conventional DUV steppers, but the scanner has better CD uniformity and lower distortion, and a clear win in printable field size. For two chips per field, as an example, the cost difference can be as great as two times, assuming laser sources for both tools. He further stipulates that a stepper may be favored for making high volume, low cost chips such as DRAMs, and scanners may be favored for making logic devices.
Older volume fabs with existing i-line tool sets often find that adding DUV S&S tools to their equipment list yields limited throughput, determined by the stepper field size, usually 22 mm square. A possible
solution to the dilemma of reduced throughput for DUV S&S tools is to mix them with an i-line S&S tool with the same field size. New fabs with no existing i-line tools may also find this option attractive. Bill Arnold of AMD further predicts the extensive use of i-line scanners and ArF DUV scanners in a mixed usage mode. He also notes that the 180 nm device generation will be the first produced with a sub-wavelength tool (in ~1999); usually the next generation tool is employed and the feature size is again greater than the tool wavelength.[47] To accomplish this, Arnold further stipulates the need for optical assist features on the reticles and for device designers to put OPC and/or phase-shifting patterns into their design software earlier in the design cycle. A final roadblock to large field S&S applications may lie in the development of 9" (230 × 230 mm) reticle technology. DUV scanner costs (greater than $5 M) and their large footprints are also still factors, but nearly all of the big players are in this game, and the Japanese stepper companies are looking at step and scan in some form or another too.
193 nm Technology Lithography. According to Prof. Bill Oldham of UC Berkeley, 193 nm light compacts fused silica refractive lenses to a degree that depends upon the dose, which in turn depends upon the resist sensitivity.[48] Optical doses at 193 nm break and reform bonds in silica, which causes atoms to move slightly and compact the structure. At the resist speeds of today’s experimental resists, lenses will only last about a year. With new faster resists, lifetimes up to 10 years are predicted. According to Rothchild of MIT Lincoln Labs, fused silica is usable at 193 nm, but today a ten-year lifetime against compaction cannot be assured. Needless to say, this would be alarming to most production fab Operations Managers purchasing these systems.
Systems employing 193 nm and fused silica optics suffer from color center formation, or loss of lens transparency, as well as lens material compaction. Joe Langston[49] of Intel feels logic device production will employ these tools in 1999, while DRAM needs are scheduled for 2001. Based on 1997 projections, Langston predicted the early production would still be at 248 nm DUV; only time will tell. DUV at 193 nm may be employed for 4-GB DRAM production, but arguments could also be made for 1:1 x-ray. If x-ray is chosen, it would really be the first generation of device to employ a non-optical lithography tool in volume production. As of 1999, first-generation 193-nm production systems are being installed in development lines.[50] In addition to the reticle challenges discussed above, the systems have a full slate of issues to address:
• The cost of ownership (COO) of commercial systems is very high, at 2.5 to 3.0 times that of the 248-nm lithography used at 180 nm. But COO improves with time; the natural development progression is to improve throughput and laser power, plus resist process maturity, thus driving yield up and COO down. Today’s COO compares, and perhaps confuses, 193-nm process development tools with 248-nm production tools. It is expected that 193-nm COO will come down to ~1.5 times that of 248 nm in the next few years, when there is a drive toward volume production. Comparison must properly consider that 248-nm COO will be driven up as it is enhanced with OPC and PSM.
• An infrastructure for volume manufacturing of CaF2, the required lens material, has to develop. This is certainly part of bringing COO down. Progress is being made on both verification of material quality and supply. For example, Schott Glass has reported lens blanks up to 233-mm dia. with less than two ppm homogeneity, roughly twice that required for 193-nm lenses. According to Richard George, “ASML has an adequate supply of CaF2, which we also use in illuminator optics of our 248-nm system (model 700B), for all 1999 shipments of 193-nm systems. We are working with our suppliers to ramp up for 2000.” John Bruning, president of Tropel, adds, “The optical material lifetime for 193 nm is now well understood and not the show stopper it was once.”
In the timeframe 2000–2001, 193 nm S&S tools with 230 mm square reticles and new resists will be required to produce DRAMs at the 1 GB level, according to M. W. Powell.[51] TI researchers, on the other hand, predict 0.1 micron isolated gate transistors on mixed signal CMOS in volume production with 300 mm wafers by 2001, again employing 193 nm tools. The first TI machine is to be delivered in mid-1999.
This would beat the 1997 SIA roadmap by ~2 years.[52] ASML will deliver its first 193 nm full-field exposure tool in early 1999, while SVGL also expects to deliver its first tool in the first half of 1999. In other words, as the Delphi roadmap indicates, many experts think that the 150-nm technology node, and even the 130-nm node, and some applications of 100-nm nodes are still the territory of 248-nm DUV with full application of OPC, PSM, and OAI technologies. But achieving optical lithography at this level will clearly be an economic issue, and
economic comparisons will be measured against the costs of 193-nm DUV, not NGL. This is not to say that the DUV route will be straightforward, but the path seems brighter today than a year ago.
157 nm Lithography. At 157 nm, where F2 lasers are employed, Rothchild asserts that the only non-birefringent optical lens materials available with decent optical properties are CaF2 and MgF2 (or sapphire). These same materials would be required as mask substrates as well. These materials are transparent, single-crystalline materials, and are not subject to compaction as is fused silica. According to Bruning, however, these materials are hard to fabricate to precise levels of smoothness and stability; material availability is also in doubt according to SVGL sources. Nevertheless, an early 157 nm microstepper tool has been built to produce 80 nm features using PSM.[53] Even though operating at this wavelength poses optical materials problems, this laser is a true excimer laser, an excited dimer, and possesses a linewidth in the pm range, which allows the elimination of some optics that could cause further light losses.
As optical lithography marches towards 0.10 micron, many technical and economic questions remain.[54] Can new lasers such as the F2 at 157 nm prove feasible, and can the cost of applying OPC and PSM technologies be overcome to provide volume production capabilities? Of course, the impact of wavelength can provide another 70% improvement versus the 248 nm capability, or at least a 0.15 micron resolution capability (see Table 2). NA will be limited to gains from ~0.6 to 0.8, at which point the DOF is less than 0.6 times the wavelength. Many consider this number to be too small, given wafer non-flatness, topography, and tool performance. DUV step and scan tools have the advantage in that they print over small scan areas where aberrations can be controlled. H. Y.
Liu of HP stipulates that achieving less than 15 nm CD control at 150 nm feature sizes requires PSM reticles and working at a k1 value of 0.237. With so much emphasis on, and remaining potential for, 248- and 193-nm DUV, the relatively narrow production process window for 157-nm DUV seems simply too narrow to make economic sense, as the argument often goes. In addition, 157 nm has some seemingly overriding “show stoppers”; for example, the high coefficient of thermal expansion of CaF2 reticles makes them difficult to build and use. Just when the industry again projects the ultimate limit to optical lithography, however, it becomes evident that the fundamental research from which lithography rises is not static, nor is the calculation of the economic advantages of the various possibilities for the future in terms of circuit element density. Ultimately, if the narrow window of 157 nm makes
economic sense from the standpoint of DOF and process latitude, then there are those in the industry who will do it. Indeed, the list of 157-nm work is growing; consider, for example:
• Recent work from Corning Glass shows some promise with fluorine-doped fused silica; it has enough transmission to make it practical, at least for consideration as an alternative to CaF2.
• Intel, a major supporter of NGL EUV technology, sees 157 nm as possibly more cost-effective than any NGL methods. It is at least funding a nine-month study with MIT Lincoln Labs to identify potential 157-nm show stoppers.
• Late in 1998, SVGL announced its intention to develop a 157-nm lithography system to fill in the gaps between 193 nm and EUV, which the company is also developing for 70 nm and beyond.
Optical Lithography at Low k1 Values. Lithography has always set the pace of the IC industry as far as circuit integration and pattern density are concerned. A fundamental change occurred at 0.5 micron technology, however, where for the first time production fabs began to run at k1 values less than 0.8, or ~0.6.[55] The downward trend in k1 is shown in Fig. 13. Alternative advanced technologies such as PSM, off-axis illumination, and OPC may allow production operation at k1 values lower than 0.5; in order to accomplish this, lens aberrations must be controlled to much tighter tolerances, or tool choices must be switched to S&S systems, where lens aberrations are inherently lower. This trend will have to continue in order to obtain adequate CD control; tools will have to have precise machine-to-machine matching, and high-volume manufacturing will have to have recipe control of performance-enhancing configurations, as is becoming the case.
Figure 13. Trend in k1 to higher resolution performance.
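The resolution relation given earlier for steppers, R = k1λ/NA, can be inverted to see what k1 a given feature size implies. The sketch below does this for operating points taken from this chapter; the function names are illustrative, and the pairing of 0.18 µm features with a 0.68 NA 248 nm lens is an assumption made here for illustration rather than a combination reported in the text.

```python
def resolution_um(k1: float, wavelength_nm: float, na: float) -> float:
    """Resolution limit R = k1 * lambda / NA, returned in micrometers."""
    return k1 * wavelength_nm / na / 1000.0


def k1_factor(feature_um: float, wavelength_nm: float, na: float) -> float:
    """k1 implied by printing a given feature: k1 = R * NA / lambda."""
    return feature_um * 1000.0 * na / wavelength_nm


cases = [
    ("i-line, 0.42 NA, 0.69 um", 0.69, 365.0, 0.42),
    ("i-line, 0.63 NA, 0.35 um", 0.35, 365.0, 0.63),
    ("248 nm, 0.68 NA, 0.18 um (assumed pairing)", 0.18, 248.0, 0.68),
]
for label, feature_um, wavelength_nm, na in cases:
    print(f"{label}: k1 = {k1_factor(feature_um, wavelength_nm, na):.2f}")
```

The three cases give k1 values of roughly 0.8, 0.6, and 0.5, which follows the downward trend described above.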
Reaching small features, those below 0.15 micron, will require working at k1 values less than 0.65. According to Prof. Pease of Stanford University, a k1 value of 0.25 represents the limit of resolution, i.e., zero modulation for incoherent light. This will require OPC, PSMs, improved resists, and equipment with off-axis reticle illumination capabilities. According to Levenson, these wavefront engineering techniques will be required because process windows get small at k1 values less than 0.5, and tools capable of printing 180 and 150 nm features at higher k1 will not be available in time.[56] Combining OAI, OPC, and weak PSM correctly may allow lithography at k1 less than 0.4, but care must be exercised to have specific combinational conditions.[56] The application of these techniques will provide features smaller than the actinic wavelength of the illuminating optics. Ogawa et al.[57] have demonstrated that tuned quadrupole (see middle image of Fig. 14) 248 nm illumination combined with attenuated PSM can lead to less than 0.25 micron features with greater than 1.8 micron DOF. These gains are not without cost: asymmetric lens aberrations such as coma and three-leaf clover can cause enhanced secondary peaks with attenuated PSM imaging.[58] OPC also requires a clear understanding between design data handling software and the CAD tools of mask writing. Their examples point out that data file sizes with and without OPC can differ by large factors, approaching ten times. The required unresolvable serifs and scattering bars of OPC are the cause of this increase in data size.
Figure 14. Light distributions of conventional, quadrupole, and annular type illumination systems. (Figure courtesy of Semiconductor Fabtech.)
Working at low k1 imposes requirements on mask making. The Mask Error Factor (MEF), which describes how strongly small mask CD variations carry through to the printed CD, is normally small and ignorable for higher k1 reticles or masks. This is not the case for low k1 reticles. Langston of Intel reported at SPIE 98 that MEF is a big problem, one that is ignored by technical roadmaps and funding sources. Etec Corp. was awarded a Sematech contract in 1997 to address mask making equipment for low k1 lithography, or 0.15–0.13 micron design rule devices.[59] At these CD levels the data volume approaches 128 GB, and the so-called “maskmaker vacation” is over.[60] Improvements in mask pattern generation (PG) and mask making techniques will have to happen, and accelerate, to meet these requirements; the advent of extensive OPC applications will also aggravate these requirements.
CD Control Challenges for Sub-0.25 µm Patterning. Historically, CD budgets have decreased from a few tenths of a micron to the nm region.[61] Actual CDs have decreased at the rate of 11% a year, while wafer area has been increasing at the rate of ~15%/year. Similarly, CD budgets across the exposure field have decreased to ±25 nm for memory devices and even tighter for logic devices. Overlay tolerances are now less than 100 nm. Typically, overlay budgets are about one-third of the CD, and are made up of stage errors and intrafield errors such as mask, lens distortions, wafer field and image tilt errors, and machine matching. For 0.25 µm devices, the total allowable CD variation for the gate transistor level will be of the order of less than 40 nm, 3-sigma, and at 0.18 µm it will be less than 30 nm. Sturtevant et al. of Motorola[62] stipulated in 1996 that the correlation to Leff from gate CD is 85% and the cost, or lost revenue, is about $10/nm of CD variation above the target. Taking another view, each nanometer is equal to 1 MHz of operation speed loss.
This may be the first talk ever to put these specifications in terms of dollars, the real issue to semiconductor manufacturers. They also stipulate that the CD budget for Photo is not 10%; it is really less than 8% total, with the bulk of that going to Etch. Furthermore, they hold that DUV scanners provide better CD control than DUV steppers, another reason to go with scanner equipment. Factors influencing CD control include reticle linewidth and proximity errors, lens focal plane deviations and aberrations, intra-field proximity effects, and inter-level reflectivity effects. Inorganic DUV ARCs were also viewed as essential to success at the gate level, with reflectivity reductions to less than 1% being successfully achieved.
X-Ray Lithography. X-ray lithography overcomes the diffraction problems associated with optical proximity printing by using ultra short wavelengths in the region of 5 to 15 angstroms. X-ray lithography also
allows for thicker resists, which aids from a metal RIE standpoint, and, unlike optical lithography, does not require anti-reflective coatings to suppress light reflection effects. The resolution is again limited, as in contact/proximity printing, by $LW = Q(\lambda g/2)^{1/2}$. In the case of x-ray lithography, λ is typically equal to 10 Å, and the minimum linewidth is approximately 0.2 µm for a typical mask-wafer gap of 20 µm. The large mask-resist gap made possible with x-ray radiation eliminates the high defect density normally associated with optical contact printing. X-rays exhibit negligible reflection at resist-substrate boundaries, and excellent CD control can be maintained in simple single layer resist systems. Defects due to particulate contamination are very low since organic materials are nearly transparent to x-rays. The low absorption by resists also ensures steeper profiles in thick resists. Initially, x-ray systems were developed using full wafer exposure, but fabricating large area masks proved unworkable, and present systems use a step-and-repeat mode. The fabrication of 1X masks with zero defects and accurate placement of images on the mask remain the most difficult technical problems to be solved in developing production worthy systems. The future of x-ray lithography is very dependent on solving all the problems associated with mask technology. The very large costs expected for fabricating a set of defect free x-ray masks make this lithography technique most practical when the production volume per device type is large enough to satisfactorily amortize the mask costs per IC device.
Point Source Systems. X-ray lithography was initially developed using electron impact sources of radiation, where high energy electrons bombarding the target excite the target electrons into higher orbits and x-rays are released upon relaxation.
These sources are extremely inefficient (~3 × 10⁻⁵ W/W-sr-Å for palladium) and have been replaced with more efficient ultra high temperature plasma sources. Laser or electrical discharge driven plasma sources are capable of delivering 20–40 mJ/cm² at the resist surface, and feature a source size of less than 200 µm in diameter. The throughput for 125 mm wafers with a resist sensitivity of 100 mJ/cm² is approximately 25 wafer levels per hour, using a 20 × 20 mm field size. The use of point sources in conjunction with finite mask-resist gaps results in penumbral blur and geometric magnification of the mask. Figure 15 depicts the phenomena of penumbral blur and geometric runout. The runout, relative to the mask, is equal to R(g/d), where R is the radius from the center of the wafer, g is the mask-resist gap, and d is the mask-source distance. The differential runout with changes in gap is equal to ∆r = ∆g(R/d). Typically, values for R and d are 4 and 30 cm, respectively.
Under these conditions a registration error of 0.1 µm will occur at the edge of an 8 cm field for a 750 nm variation in mask-resist gap. An allowed variation of 750 nm places stringent demands on mask flatness and wafer substrate topography. The corresponding penumbral blur for a 200 µm source size and a mask-wafer gap of 20 µm is equal to 13 nm, which is negligible compared to diffraction effects.
Figure 15. Penumbral blur and mask-wafer runout which occurs when using point sources of finite size.
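The point-source geometry above can be checked numerically. The sketch below evaluates the gap-limited linewidth, the runout R(g/d), the differential runout ∆g(R/d), and the penumbral blur. The blur is computed as source size × gap / source distance, a similar-triangles relation assumed here because it reproduces the 13 nm figure quoted in the text. Q is left as a parameter; a value of 2 is used in the example only because it matches the approximately 0.2 µm linewidth quoted for a 20 µm gap, and is not a value given in the text. Function names are illustrative.

```python
import math


def gap_limited_linewidth_um(wavelength_um: float, gap_um: float, q: float = 1.0) -> float:
    """LW = Q * sqrt(lambda * g / 2), all lengths in micrometers."""
    return q * math.sqrt(wavelength_um * gap_um / 2.0)


def runout_um(radius_um: float, gap_um: float, source_distance_um: float) -> float:
    """Geometric runout relative to the mask: R * g / d."""
    return radius_um * gap_um / source_distance_um


def differential_runout_um(delta_gap_um: float, radius_um: float,
                           source_distance_um: float) -> float:
    """Change in runout for a change in gap: delta_g * R / d."""
    return delta_gap_um * radius_um / source_distance_um


def penumbral_blur_um(source_size_um: float, gap_um: float,
                      source_distance_um: float) -> float:
    """ASSUMED similar-triangles relation: blur = s * g / d."""
    return source_size_um * gap_um / source_distance_um


R_UM = 4.0e4   # R = 4 cm
D_UM = 3.0e5   # d = 30 cm
GAP_UM = 20.0  # mask-wafer gap
WAVELENGTH_UM = 1.0e-3  # 10 angstroms

print(f"linewidth (Q=2, assumed): {gap_limited_linewidth_um(WAVELENGTH_UM, GAP_UM, q=2.0):.2f} um")
print(f"runout at R:              {runout_um(R_UM, GAP_UM, D_UM):.2f} um")
print(f"runout change for 750 nm: {differential_runout_um(0.75, R_UM, D_UM):.2f} um")
print(f"penumbral blur:           {penumbral_blur_um(200.0, GAP_UM, D_UM) * 1000:.1f} nm")
```

The differential runout and blur reproduce the 0.1 µm and 13 nm values given above; the absolute runout of a few micrometers is a fixed magnification that matters only insofar as the gap varies from field to field or level to level.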
X-ray systems using plasma point sources will probably find niches in modest size facilities where the cost of synchrotrons is prohibitive, and the cost of masks is satisfactorily amortized.
Synchrotron Source Systems. Synchrotrons feature high intensity and excellent collimation of the radiation. High intensity results in high throughput with single layer resist systems, and high collimation minimizes the penumbral blur and runout problems described for point sources. The output beam is in the form of a horizontal line approximately 1 mm in height. Each exposure field must therefore be individually scanned, either by moving the aligned mask and wafer as a unit in a vertical plane or, more desirably, by using an oscillating x-ray mirror to make the beam scan
the exposure field. The radiant intensity for a typical storage ring is in the range of 0.25–2.5 W/mrad. The number of mrad/cm² of resist surface is dependent on the length of the beamline (port) and ranges between 5 and 10 mrad/cm² as the beamline varies from 5 m to 10 m. The intensity at the resist surface ranges from 12.5 to 25 W/cm². The exposure time per exposure field is approximately 1 second for a resist sensitivity of 100 mJ/cm². Exposure fields of 4 × 4 cm are practical, and the throughput for 150 mm wafers is approximately 50 wafer-levels per hour. Mask fabrication, inspection, and repair, along with a huge capital investment, are the primary obstacles to widespread industrial implementation of x-ray. In the second quarter of 1996, however, the American National Mask shop reported shipping ten-level quantities of production worthy masks per week to the four US x-ray manufacturers, as well as 0.13 micron capability.[63] A synchrotron costs in the vicinity of $30M, and requires about two to three years to assemble and commission. A single synchrotron can easily accommodate eight to ten alignment machines with a combined throughput of 400 to 500 wafer-levels per hour. The high cost of mask fabrication per device type limits practical application to high volume production. Smaller volumes (100–5,000 wafers) are more economically fabricated using e-beam lithography since mask fabrication is not required. X-ray lithography, using synchrotron sources, is likely to emerge first in large corporations with large captive requirements for ICs with dimensions between 0.1 and 0.4 µm. Another barrier to projection x-ray is the requirement to achieve and maintain very small mask to wafer gaps during exposure.[64] Achieving these 20–30 micron gaps leaves little margin of error for particulates or for mask or wafer flatness, and the fact that these requirements will be applied to large wafers is cause for concern. Achieving 0.10 micron lithography will require gaps of less than 20 microns, a substantial task.
E-Beam Lithography. An electron accelerated to 25 keV has a de Broglie wavelength of only 0.074 Å! Clearly, an electron experiences negligible diffraction. A second advantage for electrons is that, because of their charge, they can easily be focused into a fine spot and deflected by electrostatic or magnetic fields. While the electron impact area can be precisely controlled, the electrons are easily scattered by interactions with the nuclei and electrons of atoms in the resist. The electron beam in effect expands due to “forward scatter” in the resist. The scattering becomes more intense with atoms of high atomic weight, and substrates such as silicon and GaAs actually turn some electrons around, and these “backscattered electrons” experience further scatter on their return path toward the
vacuum-resist interface. The range of electron spreading is of the order of microns, and two phenomena are observed. The first phenomenon is that isolated exposure areas exhibit poor edge acuity compared to the incident electron dose, and a CD which is highly dependent on exposure dose and development time results. The second phenomenon is that the CD of a pattern is also a function of nearby exposures. The latter phenomenon is loosely termed “proximity effect” and only becomes a serious problem in submicron lithography, where adjacent patterns are very close. Partial correction for this effect can be achieved by GHOST, a technique where the whole wafer is flood-exposed with a low dose, which swamps out the proximity dose from adjacent patterns. Full correction is beyond the scope of this chapter, and the reader at this point only needs to know that it is computationally complex and seriously reduces the throughput of e-beam lithography.
Thermionic emission from tungsten filaments was used in early systems, but these were replaced by the higher brightness LaB6 source, and finally by field emitter sources. The field emission type sources feature very high brightness and low chromatic distortion (low electron energy spread), which facilitates more effective focusing. State-of-the-art beam columns typically deliver current densities, $I_d$, of 50–200 A/cm² at the resist plane at an energy of 20 keV. High resolution resists are commercially available with sensitivities, $S$, of 1 µC/cm². The exposure time per pixel is equal to $S/I_d$, and therefore ranges between 5 and 20 ns. A considerable amount of research is still continuing on the development of tools with higher throughput, tools with multiple beamlets for parallel writing (100 × 100 or 50 × 50 arrays), and the AT&T SCALPEL e-beam stencil projection system. The latter provides less than 0.1 micron capability, but stencil masks are not available yet or are scarce and expensive.
Hitachi also announced a new direct-write e-beam machine in Semiconductor International in early 1998. It uses cell exposure and diffraction that allows a high-precision wide-angle exposure of electrons. This system is capable of ten 8" wafers per hour.

Direct Write - Gaussian Beam. Direct writing with circularly shaped Gaussian beams is performed with beam diameters ranging from 0.01 µm to 0.25 µm. Two types of Gaussian beam system exist, namely, (i) raster scan and (ii) vector scan. Raster scan systems traverse the entire chip area on the wafer, and the beam is blanked on and off as it scans a required pattern. The exposure time, Texp, for a wafer is approximately equal to the number of pixels times the time to expose a pixel, or:
Eq. (4)    \[ T_{exp} = \left(\frac{nA}{d^{2}}\right)\left(\frac{S}{I_d}\right) \]
where n is the number of chips/wafer, A is the area of each chip, S is the resist sensitivity, d is the beam diameter, and Id is the beam current density. The total write time increases rapidly as resolution is increased (i.e., d made smaller). As an example calculation of Texp using Eq. (4), consider a 100 mm wafer with 61 cm² of total write area, a resist sensitivity of 5 × 10⁻⁶ C/cm², and a beam diameter of 0.1 µm with a current density of 100 A/cm². The exposure time per pixel is only 50 ns, but there are 6.1 × 10¹¹ pixels. The total exposure time, Texp, is therefore about 8½ hrs. In addition, the total time required to process a wafer must include: (i) the time necessary to settle the deflection amplifiers prior to each exposure, (ii) the stage motion time, and (iii) the time to load/unload wafers. The throughput is therefore less than 0.12 wafers/hour.

Vector scan systems improve throughput by limiting the scanning area to only the required patterns. In this case, the exposure time, Texp, is equal to:

Eq. (5)    \[ T_{exp} = \left(\frac{nAP}{d^{2}}\right)\left(\frac{S}{I_d}\right) \]
where P is the fraction of the combined area to be exposed, typically 20% of the total chip area. The throughput in this case is increased to 0.59 wafer levels/hour with no loss in resolution. Very high resolution is possible with the smallest beam diameters, but the time required to expose a pattern increases rapidly due to the increase in the number of pixels.

Direct Write - Shaped Beam. Shaped beam systems are of the vector scan type, but additional throughput is achieved by flashing the patterns as a series of relatively large rectangles. The exposure time for each rectangle (4 µm square, maximum) is still equal to S/Id, but tens of pixels are normally exposed simultaneously. Each rectangle is shaped by displacing a small flood beam of electrons from the center axis of the column such that they are partially intercepted by metal blades at the edges. The total electron dose per unit time is several orders of magnitude larger than for Gaussian beam systems, and throughput is increased in proportion. Shaped beam systems are expected to dominate in the future, especially when modest product volume is required. European Silicon Structures (ESS) reports a throughput using the Perkin-Elmer AEBLE system of 10 wafer levels per hour on 5 inch wafers at 1.2 µm geometries for ASIC CMOS devices.[65] The cost is reported as $31 per wafer level.
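The raster scan and vector scan estimates of Eqs. (4) and (5) can be reproduced numerically. The following is a minimal sketch, assuming a hypothetical helper `write_time_hours` and using the example values from the text (61 cm² write area, 0.1 µm beam, 5 × 10⁻⁶ C/cm² resist, 100 A/cm², P = 20%). It counts exposure time only, so settling, stage motion, and load/unload overheads are ignored.

```python
def write_time_hours(write_area_cm2, beam_diameter_um, sensitivity_c_per_cm2,
                     current_density_a_per_cm2, exposed_fraction=1.0):
    """Exposure time per wafer from Eq. (4) (exposed_fraction = 1) or Eq. (5)."""
    d_cm = beam_diameter_um * 1e-4                        # beam diameter in cm
    pixels = write_area_cm2 * exposed_fraction / d_cm**2  # nA/d^2 or nAP/d^2
    t_pixel_s = sensitivity_c_per_cm2 / current_density_a_per_cm2  # S/Id
    return pixels * t_pixel_s / 3600.0


# Raster scan: every pixel in the write area is visited
raster_hours = write_time_hours(61.0, 0.1, 5e-6, 100.0)
# Vector scan: only the exposed fraction P = 0.20 is visited
vector_hours = write_time_hours(61.0, 0.1, 5e-6, 100.0, exposed_fraction=0.20)

print(f"raster: {raster_hours:.2f} h, {1.0 / raster_hours:.2f} wafers/h")
print(f"vector: {vector_hours:.2f} h, {1.0 / vector_hours:.2f} wafers/h")
```

Because the pixel count scales as 1/d², halving the beam diameter quadruples the exposure time in both cases.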
IBM continues to extend its lead in shaped e-beam writing, as shown by a talk given by Hans Pfeiffer introducing the new EL-5 writing tool in late 1996. It is a 75 keV, 50 A/cm² shaped beam tool with RISC parallel data path processing and 45 nm resolution. It is equipped with LEARN software, which removes deflection distortion errors, and the column is modified to write telecentrically, even when the beam is deflected. The tool provides IBM with better than state-of-the-art mask and x-ray mask writing capability ahead of ETEC commercial efforts.

E-Beam Proximity Printing, EBP. Proximity systems utilize a 1X mask, which is patterned with a photoemissive material (for example, palladium). Photoelectron emission from the pattern is obtained by illuminating it with ultraviolet light. The electrons are accelerated toward a resist coated wafer in a strong magnetic field to maintain collimation. These systems are not commercially ready at present, and the reader need not consider this technology as practical at this time.

SCALPEL (Scattering with Angular Limitation Projection Electron-beam Lithography) is an e-beam projection tool which forms a 4:1 reduced image of the scattering mask on the wafer. The mask is made of a low atomic number material so that few electrons are absorbed by the structure. This makes the mask insensitive to heating induced distortions. The advantages are less than 0.1 micron features with large DOF, but the mask technology is very immature and there is no mask production infrastructure.

EUV Lithography. The 13.5 nm source will be based upon a high-pressure jet of Xe gas which is excited by a 1.7 kW laser.
A major problem for this system is debris from the laser-produced plasma, and work was presented at SPIE 98 concerning a 3-mirror aspherical projection lens which avoids this issue as a potential industrial prototype.[66] EUV is still years away: the program needs better figure and roughness capabilities for making the projection mirrors, further development of interferometric metrology to make these measurements, and more improvements in the debris-free plasma light source. According to Bruning of Tropel, EUV lenses will require coatings as precise as those used for space telescopes,[66] and their lifetime may not match the requirements either. For EUV to come to fruition, breakthroughs will have to occur in resists and mask technology, as well as in the optics.[67] EUV employs a series of resonant reflectors in its condenser and imaging optics, placing practical constraints on the optics. The reflective optics and the metrology required for their application still remain to be validated.[68] New materials will have to be found in all areas.[69] A further question is how to align using optics that reflect only at certain wavelengths, although this issue may be a minor hindrance.
EUV tools will be developed through an alliance of companies.[70] The EUV LLC, the United States lithography Limited Liability Company, is composed of Ultratech Stepper, Intel, Motorola, and AMD at this writing. The Alliance is needed to spread the cost of the future generation tool. Although this is a specific example of a tool alliance, alliances are becoming the norm for semiconductor manufacturers as well, for example, the agreements between Siemens/Motorola, AMD/Motorola, etc.

Ion Beam Lithography. Ion beams are particulate in nature, and diffraction is negligible. The resolution capability of ion beam lithography is very high. Low atomic weight ions such as H+ are preferred for high penetration into the resist, but Ga+ ions are more frequently used because their sources have been more fully developed with very high flux density. Ion energies of between 60 and 100 keV are typical. The ions have very high energy, and each ion can create many chemical/physical events.

Focused Ion Beam, FIB. The charged nature of ions is used to focus an extremely fine source into a small spot on the wafer, and the beam is electronically deflected in a manner similar to vector scan electron-beam lithography. The relatively heavy ion is, however, more difficult to deflect, and the deflection field is only of the order of 1 mm square. The throughput is adversely affected by the numerous stage motions necessary to expose a wafer.

Masked Flood Beam, MIBL. Masked ion-beam systems are essentially proximity printers and offer the prospect of high throughput by means of the parallel writing scheme. A very thin membrane mask with absorber patterns is used to shadow image a flood beam of ions. To avoid ion scatter by the membrane, “channel” membranes are used where single crystal materials are oriented to reduce the probability of ion collision with heavy atomic nuclei.
Again these systems are not commercially available at present, and the reader may for the present exclude these systems from application in production. Ion Projection Lithography, IPL. Ions, unlike photons, have an electrical charge, and they can be focused or collimated using electromagnetic or electrostatic lenses. Mask images can therefore be demagnified by lenses to project submicron images on the wafer. Mask fabrication, inspection, and repair is significantly less difficult than in 1X mask flood beam systems. The only commercially available system of this type is a 10X reduction machine manufactured by Ion Microfabricating Systems. A resolution of 0.2 µm has been demonstrated with exposure times of 0.5 sec. in image reversal resist. The image field of the IPS 200 system is 7 × 7 mm2,
and the depth of field is ±100 µm. The throughput is approximately 490 cm2/hr, or 3.3 six inch wafer levels/hr. A major drawback to this technology is the need for stencil-masking, which is difficult when isolated features are involved; two exposures are required for isolated features in clear areas. Lithography Support Equipment. Each lithography system requires a substantial amount of support equipment. The support equipment generally includes: (i) masks and mask making equipment, (ii) mask inspection and repair equipment, (iii) resist coating equipment, and (iv) resist development equipment. Items i and ii usually reside in mask making facilities, while items iii and iv reside in wafer fabrication facilities. Mask inspection equipment is usually found in both facilities. Masks and Mask Making Equipment. The yield, therefore cost of manufacturing, is crucially dependent on the fact that masks must be defect free and that all patterns are correctly sized and registered on the mask. The fabrication difficulty, or cost, is very dependent on the minimum dimensions required on the mask. 1X masks require high resolution and precise image placement; their cost is very high compared to 10X masks used in reduction type optical step-and-repeat systems. The cost for fabricating and inspecting a defect free 10X mask for an optical stepper is in the vicinity of $1000. The cost for fabricating and inspecting a defect free 5X mask for an optical stepper is in the vicinity of $3–5k, depending upon the specification. Ten to fourteen masks are required to fabricate a typical IC Chip. In comparison, the cost for fabricating, inspecting, and repairing a 1X mask for x-ray lithography is estimated to be in the vicinity of $10,000 to $25,000. Figure 16 shows the estimated mask cost trend for the competing lithographic strategies. The total cost to fabricate IC’s must take into account the substantial amounts of money invested in masks. Mask Inspection and Repair Equipment. 
The yield of functional IC devices is very dependent upon the use of defect free masks. The masks must be verified to be free of defects since no mask fabrication technique is presently able to guarantee the absence of defects. A typical mask contains some 10⁸ to 10¹⁰ pixels of information, each of which must be verified for existence and proper placement. The task of fabricating defect free masks is technically extremely difficult for 1X masks with submicron pixel dimensions because the probability of detecting a defect is less than unity due to limited resolution in the optical inspection systems currently being used.

Resist Coating and Development Equipment. The rightful emphasis on exposure equipment has frequently caused the lithographer to neglect
the technical importance of resist coating and developing equipment. Initially, coating equipment was merely a “spinner” where individual wafers were manually placed on a vacuum chuck, the wafer was flooded with resist, and the motor was activated for several seconds. The result was poor resist thickness uniformity, both intra-wafer and inter-wafer, and with poor defectivity levels.

[Figure 16 plot: "Mask Cost vs CD Generation Trend Estimates". Vertical axis: Cost (K$), 0 to 120. Horizontal axis: Generation, micron (0.5, 0.35, 0.25, 0.18, 0.13, 0.1). Curve labels: Binary, Engineered, EUV, Ion Beam, SCALPEL, X-Ray.]

Figure 16. Mask cost trends per device generation.
In contrast, modern coating and developing track equipment is quite sophisticated and expensive, and the results are dramatically improved. The coating thickness on planar surfaces can be readily controlled to within ±20 Å with defect densities of less than 0.01/cm². In the future, film thicknesses and other equipment parameters will be monitored in real time to ensure even tighter tolerances. This approach is termed adaptive control: individual wafers can be corrected for wafer-to-wafer variance to achieve tighter process control.[71] The improvements were largely realized by automatic wafer handling, dynamic dispense with nozzle suck-back, high acceleration digital control of motor speed, precise control of air flow, edge bead removal, etc.[72] Further, modern coaters provide improved resist adhesion through
dehydration baking and HMDS (hexamethyldisilazane) treatment. The resist is also normally baked on a hot plate, which provides excellent transfer of heat to the wafer, and precise temperature control to ±1°C is easily achieved. The entire mechanism is generally referred to as a “wafer track.” In the 1980s, two or three parallel tracks were normally contained in one piece of equipment, but in the mid-1990s track systems became robot-accessed, with randomly accessed modular processing systems replacing the previous linear processing systems. Computer aided manufacturing (CAM) has been extended to wafer track equipment, with automatic monitoring of wafer and lot flows, recipe downloads, resist thickness, spinner speed, air flow, bubbles in the resist line, hot plate temperatures, etc. The cost of a modern resist coater is in the vicinity of $2 million, and with computer interfacing it can be in the vicinity of $2.4 million. The cost is no longer a minor factor in the total capital budget. The throughput of serial processing equipment is determined by the slowest step, normally the HMDS prime or, more typically, the develop cycle. Typical throughput is sixty wafers/hr for older stand-alone systems. For modern tracks, the throughput can be robot speed limited, but multiple robot systems have been developed with throughput values greater than 100 wafers/hr.

Resist development track equipment follows the serial nature of resist coating equipment. The equipment is normally automated with cassette-to-cassette operation, and with temperature control of the developer (critical for metal-ion-free, MIF, developers). Ultrasonic dispersion of the developer and other soft-spray dispensing systems have become available. Monitoring of the resist development rate can be performed during development to certify that the process is proceeding correctly (i.e., exposure dose, developer strength/temperature, etc., are correct).
The capital cost of dual older stand-alone track resist development equipment is approximately $400 thousand when outfitted with CAM sensors. The throughput is approximately 70 wafers/hr, but an on-vendor-site throughput test should be performed using the proposed resist system to be used in production. Modern tracks usually have two to four develop modules in the system to provide high throughput, and their cost is included in the figures above for the modern integrated resist coat systems. Modern resist track support systems (see Ch. 2) are also integrated lithographic cells in addition to the CAM-like control specified above (see Lithographic Automation Ch. 6). This means there are interfaces for handing wafers between the exposure tools and the support track, and for
handling wafers between process modules. Although these systems increase total system cost, no high volume FAB will be able to survive without these integrated cell systems for throughput, device yield, and manufacturing compatibility reasons.

Resist Technology. The resolution, throughput, critical dimension control, adhesion, and post process compatibility are all very dependent on the resist system. The non-linear response of resists to exposure dose is in large part responsible for the excellent resist profiles obtained in spite of the modest imaging quality achieved by the equipment. The resolution capability of a tool and its associated resist system is directly proportional to the edge acuity of the resist profile, ∂h/∂x. By simple differentials, the edge acuity can be calculated:
Eq. (6)    \[ \frac{\partial h}{\partial x} = \left(\frac{\partial h}{\partial D}\right)\left(\frac{\partial D}{\partial x}\right) \]
where h is the resist height, D is the exposure dose, and x is the lateral distance perpendicular to the line/space. The first term on the right side is strictly a function of the resist and its processing. The second term is equal to the gradient of the aerial image and is only a function of the exposure tool; note that this term is generally fixed, while the first term will vary with resist quality. The term ∂h/∂D is proportional to the contrast factor (gamma) of the resist system. The contrast factor for resists can be increased through resist design, the use of high exposures with weak developers, and the use of contrast enhancement materials. High resist contrast is an important factor in achieving control of critical dimensions, as well as providing sufficient edge acuity to control the effects of resist ablation during dry etching. High resistance to thermal flow during dry etching requires resists with a high glass transition temperature, Tg. The throughput of the exposure tool is normally dominated by the resist sensitivity, or speed, since exposure time is proportional to the dose the resist requires (i.e., inversely proportional to resist speed). The exposure time is most significant in systems which expose only a small fraction of the wafer at a time, for example, 5X optical steppers and direct write e-beam lithography tools.

Technical Evaluation Of Tools. The lithography engineer/scientist has primary responsibility for determining which specific types of equipment, and which models, meet the technical requirements of the proposed acquisition. The necessity to meet the technical requirements is absolute.
Failure to meet the technical requirements, which also include an adequate statistical process/tolerance ratio capability, may result in financial failure of the manufacturing entity. Unlike the other decisions which involve marketing, product development, etc., the task of technical evaluation is a challenging one because of the depth and breadth of technical understanding which is required. The technical evaluation should be performed by personnel with maximum technical competence and objectivity. The decisions must be made from a deliberate perspective of maximum emphasis on measurements and calculations, and with minimum emphasis on intuition, personal vendor relationships, and subjective reasoning. The following minimum parameters should be measured for the case of an optical alignment/exposure tool:

i. Adequate resist profile at minimum CD
ii. CD control, normally ±10%
    a. exposure latitude
    b. focus latitude
iii. Registration, normally ±0.2–0.3 CD
iv. Throughput, wafers/hr
v. Maximum chip area printable
vi. Field distortion over the maximum print field
vii. Impact on yield

The list is specific to optical steppers, but very similar lists exist for all tools. All these measurements should be performed with the resist system to be used in production. The lithographer may consider new multilayer resist systems which allow less costly equipment to be purchased, but the new resist system must already be sufficiently developed to provide a meaningful evaluation, and the cost impact of more expensive resist technology must be considered.

Resist profile is especially important when dry etching or lift-off is used. Assuring an adequate resist profile is relatively simple. Scanning Electron Microscope, SEM, photographs of the resist profile as a function of exposure and focus are sufficient.
Measuring the development variance in resist profile with focus is especially important since the edge acuity is a function of the gradient in the aerial image as previously given in Eq. (6). The image quality must be checked in the corners of the field and compared with those in the center. Critical dimension, CD, control is the ability to maintain the dimensions of resist images with time over all areas of the exposure field. CD
variations occur as a result of uncontrolled variations in resist thickness, exposure dose, focus, development, etc. Absolute assurance can only be obtained by long term statistical process control measurements of linewidth in a production environment. Partial characterization is, however, possible at vendor facilities for the first machine, or when a single machine is to be purchased. Plots of CD vs. exposure/focus can be made at vendor facilities to simulate the probable uncontrolled variations. The incident exposure is well controlled by steppers, but the coupling of exposure dose into the resist is a strong function of resist thickness and substrate reflectivity. The linewidth can be accurately measured using the NBS crossed bridge test pattern and electrical probing on a Prometrix LithoMap® System. Figure 17(a) shows the linewidth test structure, which is imaged in a resist over a conductive film that has been deposited on an oxidized wafer. The van der Pauw cross at the top measures the sheet resistance, r, of the conductive film, and the Kelvin bridge structure essentially measures the linewidth, W, through the bridge resistance R = r(L/W). The bridge length, L, is typically 200 µm. The resist images are anisotropically transferred into the conductive film by RIE etching. Lift-off metallization is also possible, even when the resist profile is non-reentrant, if the edge acuity is high and the metallization is thin (less than 1000 Å). Doped polysilicon and thin films of refractory metals are frequently used as conductive films for electrical measurements. Copious amounts of data are obtained for statistical analysis, and regression analysis to complex models can be accomplished. A typical exposure/focus plot obtained using a Prometrix System is shown in Fig. 17(b). Plots of this type adequately characterize the tolerance for defocus, but CD variations due to exposure are difficult to determine because of close and irregular contour spacing. A plot of the type shown in Fig.
18 is more desirable in accurately characterizing this most important variable. The percent exposure latitude, PEL, is graphically determined by finding the exposure range between +10% and -10% variance from nominal CD, and dividing by the correct nominal exposure dose. Finally, the range is multiplied by 50 to convert to ± percentage. The value considered sufficient for adequate control is a function of the resist system’s ability to control reflectance and the degree of process control which exists. The exact minimum value is set by the lithographer’s experience. Generally, ±12% is adequate for production using single layer resist (SLR) without reflectance control whereas ±8% is adequate for pilot line situations using rigid process control and resist systems which control the effects of reflectivity.
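The two calculations described above, linewidth from the electrical bridge measurement and percent exposure latitude from a CD vs. dose curve, can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular measurement system: the function names are hypothetical, the bridge relation is taken as R = r·L/W (bridge length L, linewidth W), the CD curve is assumed monotonic and is linearly interpolated, and the dose/CD data are invented for the example.

```python
def linewidth_um(sheet_resistance_ohm_sq, bridge_resistance_ohm, bridge_length_um=200.0):
    """Electrical linewidth W = r * L / R from the cross-bridge structure."""
    return sheet_resistance_ohm_sq * bridge_length_um / bridge_resistance_ohm


def dose_at_cd(doses, cds, target_cd):
    """Dose at which the measured CD curve crosses target_cd (linear interpolation)."""
    points = list(zip(doses, cds))
    for (e0, c0), (e1, c1) in zip(points, points[1:]):
        if c0 != c1 and min(c0, c1) <= target_cd <= max(c0, c1):
            return e0 + (target_cd - c0) * (e1 - e0) / (c1 - c0)
    raise ValueError("target CD lies outside the measured range")


def percent_exposure_latitude(doses, cds, nominal_cd, tolerance=0.10):
    """PEL as +/- percent: dose range for CD within +/-tolerance, over nominal dose, times 50."""
    e_nominal = dose_at_cd(doses, cds, nominal_cd)
    e_plus = dose_at_cd(doses, cds, nominal_cd * (1.0 + tolerance))
    e_minus = dose_at_cd(doses, cds, nominal_cd * (1.0 - tolerance))
    return 50.0 * abs(e_plus - e_minus) / e_nominal


# Hypothetical data: resist linewidth (um) shrinking with dose (mJ/cm2)
doses = [80.0, 90.0, 100.0, 110.0, 120.0]
cds = [1.15, 1.075, 1.0, 0.925, 0.85]

width = linewidth_um(sheet_resistance_ohm_sq=20.0, bridge_resistance_ohm=4000.0)
pel = percent_exposure_latitude(doses, cds, nominal_cd=1.0)
print(f"electrical linewidth: {width:.2f} um")
print(f"percent exposure latitude: +/-{pel:.1f}%")
```

With real data the CD curve is usually not linear, so a fitted model would replace the piecewise interpolation; the factor of 50 converts the full dose range into a ± percentage as described in the text.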
Figure 17. Electrical measurement of CD control. (a) Linewidth measurement structure, (b) typical focus/exposure plot.
Figure 18. Characterization of CD vs. exposure dose.
Inter-level registration measurements, like CD measurements, are most easily obtained using electrical test structures of the Stickman[73] type. Figure 19(a) describes the two level Stickman structure. The structure is etched into an electrically conductive film (doped silicon) using two resist/etch cycles. The measurements are valid for intra-machine capability when both cycles are performed using the same machine. Alternatively, inter-machine (mix-and-match) lithography can be evaluated using different machines for each level. A vector plot of the misregistration at each die site is shown in Fig. 19(b) for the case of a typical inter-machine test. The plot reveals an obvious stepping difference in both the X and Y directions. Subsequent computer analysis of the data will reveal the systematic sources of misregistration, for example, alignment errors (translation and rotation), lens magnification, reticle rotation, stage orthogonality, etc. Finally, a residuals plot will be made showing that part of the misregistration which is random in nature. Registration measurements are relatively easy to perform using alignment/exposure tools at vendor facilities, followed by analysis at the production facility using a Prometrix System.

When electrical measurements cannot be obtained, or when getting them is too time-consuming, overlay metrology tools are employed as described in the metrology chapter (Ch. 4). These systems, made by companies such as IVS and KLA, have become the day-to-day production tools in modern fabs due to their good performance and real-time data generation capability.
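The separation of misregistration data into systematic terms and random residuals can be illustrated with a small least-squares fit. The sketch below is an assumption-laden example rather than the analysis performed by any commercial system: it uses a hypothetical four-parameter model (x and y translation, one isotropic magnification term, one rotation term) fitted about the centroid of the measurement sites, and the site grid and error values are synthetic. Terms such as stage orthogonality or reticle rotation would need additional parameters.

```python
def fit_registration(sites, errors):
    """Fit dx = tx + mag*u - rot*v and dy = ty + mag*v + rot*u, with (u, v) the
    site position relative to the centroid. Returns (tx, ty, mag, rot, residuals)."""
    n = len(sites)
    xc = sum(x for x, _ in sites) / n
    yc = sum(y for _, y in sites) / n
    # With centered coordinates the least-squares solution decouples
    tx = sum(dx for dx, _ in errors) / n
    ty = sum(dy for _, dy in errors) / n
    s_rr = s_mag = s_rot = 0.0
    for (x, y), (dx, dy) in zip(sites, errors):
        u, v = x - xc, y - yc
        s_rr += u * u + v * v
        s_mag += u * dx + v * dy
        s_rot += u * dy - v * dx
    mag = s_mag / s_rr
    rot = s_rot / s_rr
    residuals = []
    for (x, y), (dx, dy) in zip(sites, errors):
        u, v = x - xc, y - yc
        residuals.append((dx - (tx + mag * u - rot * v),
                          dy - (ty + mag * v + rot * u)))
    return tx, ty, mag, rot, residuals


# Synthetic example: die sites in mm, misregistration in um
grid = (-20.0, -10.0, 0.0, 10.0, 20.0)
sites = [(x, y) for x in grid for y in grid]
true_tx, true_ty, true_mag, true_rot = 0.05, -0.03, 0.002, 0.001
errors = [(true_tx + true_mag * x - true_rot * y,
           true_ty + true_mag * y + true_rot * x) for x, y in sites]

tx, ty, mag, rot, residuals = fit_registration(sites, errors)
print(f"translation: ({tx:.3f}, {ty:.3f}) um")
print(f"magnification: {mag:.4f} um/mm, rotation: {rot:.4f} um/mm")
```

With sites in mm and errors in µm, the magnification and rotation terms come out in µm per mm. The residuals list is what a residuals plot would display after the systematic terms are removed.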
Figure 19. Registration testing using electrical test structures. (a) Stickman registration structure, (b) vector map of misregistration.
The throughput of the exposure tool must not only be determined under realistic conditions (applicable resist system, etc.) but the result must be analyzed to reveal the individual time delays for each step in alignment and exposure. The throughput data can be obtained as ancillary information during CD control testing. The maximum chip area is usually specified by the equipment vendor, but the useful area may be substantially less as a result of poor assembly and/or design. A reduction in area may occur as a result of poor exposure uniformity or radial fall-off of modulation for the aerial image. Both of these failures can be quantified through CD measurements as a function of radial distance from the center of the field. Care must be given to the matching of lens systems at the stepper factory prior to their shipping to ensure system-system matching and maximum field usage across the bank of machines. The effect of lens distortion on registration may be revealed during the registration tests. Intra-machine testing will not reveal this factor, however, since both levels will have identical distortion. Relative distortion will be revealed when the exposure tool being evaluated is used in the second exposure/etch cycle. Absolute distortion can only be measured if the first exposure is performed in a system without any distortion. Electron-beam lithography can be considered a reasonably distortion free system, since the table’s position is very precisely measured using optical interference. The effect of defects on yield, Y, is predicted by Stapper’s Model[74] as: Eq. (7)
\[ Y = (1 + SAD)^{-1/S} \]
where A is the area, D is the defect density per square cm, and S is a constant which describes the fact that the spatial distribution of the defects is not completely random. The Poisson model corresponds to the limit S → 0, and Seeds' Model assumes a value of 1.0. Westinghouse measured S and found that it is normally equal to 0.44. Equation (7) is applicable for a single masking level. The total lithography yield is the product of the Y values for all m mask levels:

Eq. (8)    \[ Y_t = \prod_{n=1}^{m} (1 + S_n A_n D_n)^{-1/S_n} \]
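The yield expressions of Eqs. (7) and (8), together with the inversion of Eq. (7) used to extract defect density from a measured yield, can be written out as a short sketch. The function names and the example chip area, defect density, and level count are hypothetical; the S → 0 case is handled with the exponential (Poisson) limit of the same expression.

```python
import math


def level_yield(area_cm2, defect_density_per_cm2, s=0.44):
    """Single-level yield, Eq. (7): Y = (1 + S*A*D)**(-1/S); exp(-A*D) when S = 0."""
    ad = area_cm2 * defect_density_per_cm2
    if s == 0:
        return math.exp(-ad)
    return (1.0 + s * ad) ** (-1.0 / s)


def total_yield(levels):
    """Total lithography yield, Eq. (8): product over (area, defect density, S) per level."""
    y = 1.0
    for area_cm2, defect_density_per_cm2, s in levels:
        y *= level_yield(area_cm2, defect_density_per_cm2, s)
    return y


def defect_density(yield_fraction, area_cm2, s=0.44):
    """Invert Eq. (7) for D: D = (Y**(-S) - 1) / (S*A); -ln(Y)/A when S = 0."""
    if s == 0:
        return -math.log(yield_fraction) / area_cm2
    return (yield_fraction ** (-s) - 1.0) / (s * area_cm2)


# Hypothetical example: 1 cm2 chip, 0.1 defects/cm2 per level, 12 mask levels
y_level = level_yield(1.0, 0.1)
y_total = total_yield([(1.0, 0.1, 0.44)] * 12)
print(f"yield per level: {y_level:.3f}")
print(f"yield over 12 levels: {y_total:.3f}")
print(f"recovered defect density: {defect_density(y_level, 1.0):.3f} per cm2")
```

Because the per-level yields multiply, a defect density that looks harmless for one level compounds quickly over ten or more masking levels.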
Defect density may be measured using optical inspection equipment which identifies resist patterns not contained in the database, or by measuring the yield of metallized test patterns which contain closely spaced lines and spaces. Defects in either the lines or spaces are detected by electrically probing for opens and shorts and interpreted as missing resist patterns or excess resist, respectively, when the resist images are transferred into conductive films by etching. The defect density is calculated from the probability of non-defective test patterns by: Eq. (9)
\[ D = \frac{Y^{-S} - 1}{SA} \]
where Y is the probability of no defects (fractional yield) determined from electrical probing of the test patterns. The impact on yield of new equipment is extremely difficult to determine using vendor facilities, since vendor facilities are seldom up to the required cleanroom standards, and the transportation of wafers generates defects. Complete assurance of yield cannot be practically made in vendor facilities. The problem is only fully remedied when a machine is purchased and installed in the manufacturing facility in anticipation of a much larger follow-on order. The high impact of yield on cost does not, however, allow the evaluator to overlook the yield factor. A partial evaluation is possible prior to purchasing a machine, when an equal number of control wafers are transported to vendor facilities and process simulated, but actually exposed and processed at the production facility. Likewise, the wafers processed at vendor facilities must undergo simulated processing in the production facility. The described pseudo evaluation technique is cumbersome and subject to unavoidable introduction of new variables. Nevertheless, it is a technique of some value which can be used when no alternative is available. A table comparing several alignment/exposure tools will quickly reveal the strong and weak points of each tool and will provide the basic information necessary for a detailed cost analysis.

2.7 Economic Factors
Cost of Manufacturing. The cost of manufacturing a wafer includes all capital, direct labor, materials, and overhead costs. Direct labor normally dominates in a commercially viable manufacturing plant, but in modern factories this may be changing. While lithography is only one of many sequential processes, its repeated use on each wafer has a large weighting factor. The cost to manufacture a functional IC is:
Eq. (10)    \[ \text{Cost/Chip} = \frac{\text{Total fabrication cost per chip}}{\text{Total Yield}} \]

Eq. (11)    \[ \text{Cost/Chip} = \frac{\sum_{n=1}^{m} (\text{Labor} + \text{Capital} + \text{Overhead} + \text{Material})_n}{\text{Total Yield}} \]
where labor cost, capital cost, overhead cost, and material cost per wafer level must be summed for all major processing steps, for n equal to one through m. The total yield is the product of the production yield, electrical or probe yield, and packaging yield.

Labor Cost. Direct labor cost and indirect labor cost must both be included in labor cost. Direct labor cost is that directly applied to fabricate devices, and is primarily associated with operator cost. Direct cost is obviously very dependent on the throughput of the lithography system. Indirect labor costs include the cost of engineering, maintenance, management, etc. The total labor cost for lithography is implicitly dependent on the throughput of the tool and the fractional number of personnel necessary to operate the tool. The turnaround time for processing a lot is an important factor in determining the minimum time required to completely process a lot of wafers. Turnaround time is therefore very important for quick delivery of functional ICs. The turnaround time, Ta, in hours, for a lithographic tool is equal to:

Eq. (12)    \[ T_a = \sum_{n=1}^{m} \frac{\text{Lot Size (wafers)}}{\text{Throughput (wafers/hr)}} \]
where n and m were previously defined in Eq. (11). The turnaround time for the complete IC process varies with the charter and typically ranges from 4 to 25 weeks, of which approximately 60% is consumed in lithography. Capital Equipment Costs. Lithography equipment must be periodically replaced, primarily due to antiquation. Today, however, the trend is towards the building of entire new fabs when the design-rule requirement calls for a new tool set, rather than buying new tools for an old existing
factory. The model for this is really Intel, which closes older fabs containing older generations of tool sets and replaces them with new fabs, which may also process larger wafers. The pace of equipment development is especially rapid in the semiconductor industry, and lifetimes of only five to ten years are normal. During this lifetime the cost of the equipment must be amortized (i.e., money must be set aside to replace it when replacement is due). The capital cost or depreciation cost per year, C, is the total capital acquisition cost, A, multiplied by a factor which includes the interest rate, i, and the years of service, n (here n denotes years, not the step index of Eq. 11):

Eq. (13)
$$C = A\left[\frac{i(1 + i)^{n}}{(1 + i)^{n} - 1}\right]$$
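As a quick numerical check of Eq. (13), the sketch below evaluates the capital recovery factor for the i-line stepper example discussed in the text ($2 M acquisition cost, 8% interest, eight-year life). The function names are illustrative, and the throughput, availability, and scheduled-hours figures used for the per-wafer-level estimate are hypothetical values chosen only to show the arithmetic; they do not come from the source.

```python
def annual_capital_cost(acquisition_cost, interest_rate, years):
    """Annual capital (depreciation) cost per Eq. (13)."""
    if interest_rate <= 0 or years <= 0:
        raise ValueError("interest_rate and years must be positive")
    growth = (1.0 + interest_rate) ** years
    return acquisition_cost * interest_rate * growth / (growth - 1.0)


def capital_cost_per_wafer_level(annual_cost, throughput_per_hr,
                                 availability, scheduled_hours_per_year):
    """Spread the annual capital cost over the wafer levels processed per year."""
    wafer_levels_per_year = (throughput_per_hr * availability
                             * scheduled_hours_per_year)
    return annual_cost / wafer_levels_per_year


if __name__ == "__main__":
    # Values from the text: $2 M stepper, 8% interest, eight-year life.
    c = annual_capital_cost(2_000_000, 0.08, 8)
    print(f"Annual capital cost: ${c:,.0f}")  # about $348,000

    # Hypothetical operating point (an assumption, not from the source):
    # 40 wafer levels/hr, 70% availability, 8,000 scheduled hours per year.
    per_level = capital_cost_per_wafer_level(c, 40, 0.70, 8_000)
    print(f"Capital cost per wafer level: ${per_level:.2f}")
```

Because the annual cost is fixed, halving either throughput or availability doubles the capital cost carried by each wafer level, which is why both appear in the tool comparison.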
A typical example: an i-line optical stepper costs $2 M to purchase and install, the interest rate is 8% per annum, and the lifetime is eight years. The capital cost per year is then $348,000. The capital cost per wafer level is equal to $348,000 divided by the total wafer levels processed per year, which in turn depends strongly on the product of machine throughput and machine availability (the latter determined by machine reliability, stability, and the maintenance received). In 1996, John Carruthers predicted the cost of a 193 nm stepper to be ~$15 million and that of an EUV tool to be in the $20–25 million range, so one can imagine how dramatically capital cost will rise as these tools come into use.[75] Lithography is the largest segment of the semiconductor equipment business, accounting for $3.8 billion in sales in 1996 according to Dataquest.[76]

Overhead Cost. Overhead cost generally includes all indirect costs necessary to fabricate the IC devices. Examples include amortized facility cost, management, heating/air conditioning, sewer and water, quality control, purchasing, insurance, etc.

Material Cost. The material costs involved in making semiconductor IC devices are divided into two categories: (i) direct material costs, and (ii) indirect material costs. Direct materials include those materials which physically become part of the IC device. Silicon wafers, metal deposition alloys, packaging materials, and diffusion/ion-implantation materials are examples of direct materials. Indirect materials include all materials which do not actually become part of the IC device. Examples of indirect materials are photoresist, masks, resist developers, processing gases, cleaning solvents, deionized water, and other chemicals.
The sum of direct material and indirect material costs is the total material cost. The total material costs are usually a small fraction of the total fabrication costs. Exceptions to this generality are exotic wafer materials such as GaAs, silicon-on-sapphire, and II-VI compounds.

Total Cost. The total cost is the sum of the capital costs, labor costs, overhead costs, and material costs. Profits are necessary for the long term viability of any commercial manufacturing facility. The market value of an IC device is largely dependent on the competition’s cost to manufacture and on market elasticity. The economic health of a manufacturing facility critically depends on the ratio of its total IC cost to that of its competition. Factories are now also going to be judged by their profit per employee and per capital dollar spent; these are the new criteria. The total cost to fabricate a wafer depends on the minimum dimensions, primarily because of the increased cost of lithography. Typical costs for processing 6-inch (150 mm) and 8-inch (200 mm) wafers are listed in Table 3 as a function of minimum geometry.[77]
Table 3. Typical Total Wafer Processing Cost of VLSI Circuits

Minimum Geometries (µm)    Total Fabrication Cost* ($/Wafer)
1.5                        300–350
1.0                        400–450
0.8                        500–600
0.5–0.18                   800–1800
0.13–0.10                  2000–4000**

* Estimated costs
** Assumes 300 mm wafers
These costs are for high volume facilities running approximately 3000 to 4000 wafer starts per week; they must be increased by a factor of two to three for low production volumes and their inefficiencies.[77] Electrical wafer probing will increase the wafer processing cost by $100 to $200 or more, depending on the degree of testing performed at the wafer level. Finally, the total cost must also include the assembly, packaging, and final testing costs.
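Equations (10) and (11) can be put in executable form as a rough sketch. Eq. (11) sums costs per wafer level but does not spell out how the wafer-level total becomes a per-chip figure; the sketch below assumes the summed wafer cost is divided by the number of gross die per wafer before the yield correction is applied. The class and function names and all numbers are hypothetical and serve only to illustrate the bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class LevelCost:
    """Cost contributions ($/wafer) of one major processing step."""
    labor: float
    capital: float
    overhead: float
    material: float

    def total(self):
        return self.labor + self.capital + self.overhead + self.material


def total_yield(production_yield, probe_yield, packaging_yield):
    """Total yield as the product of the three yields named in the text."""
    return production_yield * probe_yield * packaging_yield


def cost_per_chip(levels, gross_die_per_wafer, yield_total):
    """Eq. (11), with the per-wafer sum divided by gross die per wafer (assumption)."""
    if not 0.0 < yield_total <= 1.0:
        raise ValueError("yield_total must be in (0, 1]")
    wafer_cost = sum(level.total() for level in levels)
    return wafer_cost / gross_die_per_wafer / yield_total


if __name__ == "__main__":
    # Hypothetical process: 20 identical steps at $25 per wafer each.
    levels = [LevelCost(labor=10, capital=8, overhead=5, material=2)] * 20
    y = total_yield(0.90, 0.50, 0.95)
    print(f"Total yield: {y:.4f}")
    print(f"Cost per good chip: ${cost_per_chip(levels, 100, y):.2f}")
```

Since yield sits alone in the denominator, a drop in total yield from 50% to 5% raises the cost per good chip tenfold regardless of how the numerator is distributed among labor, capital, overhead, and material.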
DM Data Inc. has analyzed the cost of a typical 50K gate array device using 1 µm geometries, a die area of 129 mm², a package cost of $25, and a probe yield of 6.5%.[8] The cost per good die is determined to be $71. The final cost of the assembled, packaged, and tested device is projected at $240. The analysis shows that while yield dominates in determining the cost of a good die at the wafer probing stage, the total device cost is roughly 3.4 times (338% of) the cost at wafer probing.[77] The cost of lithography in this analysis appears to have minimal impact on final device cost. The assumed yield of 6.5%, combined with high assembly, test, and screening costs, is responsible for this result. The fractional cost of lithography increases rapidly with decreasing yield. At a yield of 1%, the cost per good die jumps to $498, while the final device cost increases to $668. In this case, the lithography cost dominates, since final chip processing increases the total cost by only 34%. The importance of having a strategy for selecting the optimum lithography tools increases dramatically as yield falls below the 2% level.

For modern lithography tools and for modern chips, this situation may be dramatically different. Melliar-Smith, the COO of Sematech, poses the question, “The issue is not whether we can image a 70 nm line, but can we do it for $50/layer?”[78] Rapid advances in lithography, interconnect technology, and transistor scaling have led us to devices with 12–14% size reductions per year. Sematech’s view of the future predicts the end of optical lithography in ~2006. Note also that microprocessor fabrication is predicted to reach the end of optical tools before DRAM manufacturing. Reticle or mask costs are also escalating, as shown in Fig. 16. Again, will we be able to absorb these costs? Gerhard Gross, Dir. of Sematech, predicts photomask costs of $100 k for the 0.13 micron generation.[79] In late 1998, following a Sematech meeting held in Colorado in Nov. 1997, the five contenders for lithography beyond 0.1 micron were cell-block-printing e-beam, ion-beam projection, SCALPEL e-beam stenciling, EUV, and 1X x-ray. The projected costs of the five sub-0.1 micron techniques range from $50/level for ion projection to over $160/level for e-beam direct write. DUV S&S is projected at ~$25/level for reference. The last four techniques also have substantial masking costs and, in all four cases, an immature or nonexistent infrastructure. Another issue for the future: is 12-inch wafer throughput acceptable for any of these tool choices?

Yield. The yield, as shown in Eq. (11), is probably the most important “key” to the cost of shippable IC devices, since it can range over several decades. In Fig. 20, the influence of yield on cost per device function is shown; note that, as in Fig. 3, productivity increases continue to drive cost per
function down, which allows the semiconductor business to continue to grow. Low yield also affects the ability to manufacture and ship ICs on time. Failure to achieve competitive yield will ultimately result in lost customers. Assuming that the IC has been properly designed, the yield is generally dependent on defects produced during processing. Defects can be classified as (i) random point defects and (ii) non-random defects. Point defects are small (<10 µm) and randomly located. They usually originate from particulate contamination, either airborne dirt or, more frequently, particles generated within the processing equipment. Non-random defects due to improper processing are usually much larger than 10 µm in diameter and have a definite spatial relationship to the patterns on the wafer. Examples of this type include inadequate resolution, poor registration of mask levels, incomplete etching, non-uniform deposition, etc.
Figure 20. Log of the cost per function vs. time. (Taken from Industry Watch, Semiconductor International, p. 17, Dec. 1997.)
3.0 IMPLEMENTATION OF STRATEGY
As discussed in Sec. 2.1, there is a large set of factors which influence the selection of lithography tools. A precise strategy must be exercised during the selection process to ensure that all technical requirements will be met at the lowest total cost. Note also that, in general, the quantification of the selection parameters is not precise and involves some subjective judgment. This section will attempt to unify all the factors into a cohesive quantitative approach to selecting optimum equipment and processes. It could be argued that for the next several years this analysis will be an optical tool to optical tool comparison for production lithography systems, as has been observed historically since the early 1980s.

Iida[80] first attempted in 1983 to quantify a comparison of various lithographies using a figure of merit defined as follows:

Eq. (14)
$$G = \left(\frac{T}{C} \cdot \frac{Y}{F^{2}} \cdot A_v \cdot M_n\right)^{P} \times \left[\frac{1}{W^{2}} \cdot \left(1 - \frac{A}{W}\right)\right]^{R} \cdot \frac{1}{T_a}$$

where:

G = Figure of merit
T = Throughput, wafer-levels/hour
C = System cost including clean room
Y = Yield/unit exposure area
F = Normalized chip edge length
Av = Equipment availability, %
Mn = Equipment maintainability
W = Linewidth
A = Alignment accuracy
Ta = Turnaround time
P, R = Constants defining production/research environments
The first term is primarily an economic factor of great importance in a volume manufacturing entity, while the second term is a technical factor of primary importance in research and development. In research, R is greater than 1 and P is less than 1, whereas in volume production R is less than 1 and P is greater than 1. This approach is extremely useful in determining which type of lithography is appropriate for the application involved. Iida compared optical, e-beam, and x-ray lithography using data available in 1982. The results are tabulated in Table 4 and plotted in Fig. 21. Although these data are somewhat dated and tool capabilities have improved substantially, the method is still a good general approach to the tool selection issue.
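A minimal sketch of Iida's figure of merit follows. It assumes that P is the exponent on the first (economic) term and R the exponent on the second (technical) term, which is what the description of the two terms and of the research and production regimes suggests. The parameter values are placeholders invented for illustration and are not Iida's data.

```python
def figure_of_merit(T, C, Y, F, Av, Mn, W, A, Ta, P, R):
    """Iida-style figure of merit: economic term ** P * technical term ** R / Ta."""
    if A >= W:
        # The technical term (1 - A/W) would be zero or negative.
        raise ValueError("alignment accuracy A must be smaller than linewidth W")
    economic = (T / C) * (Y / F**2) * Av * Mn
    technical = (1.0 / W**2) * (1.0 - A / W)
    return economic**P * technical**R / Ta


if __name__ == "__main__":
    # Placeholder tool description (hypothetical units and values).
    tool = dict(T=30.0, C=3.0, Y=0.9, F=1.0, Av=0.8, Mn=0.9,
                W=0.5, A=0.1, Ta=1.0)

    # Volume production weights the economic term (P > 1, R < 1);
    # research weights the technical term (P < 1, R > 1).
    print("production:", figure_of_merit(**tool, P=2.0, R=0.5))
    print("research:  ", figure_of_merit(**tool, P=0.5, R=2.0))
```

Evaluating the same candidate tools under both sets of exponents makes the environment dependence explicit: a tool that ranks first for production may rank last for research, and vice versa.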
Table 4. Figure of Merit Involving Technical and Economic Factors
Figure 21. Figure of merit as a function of resolution. (Ref. Y. Iida.)
The curves are of course no longer accurate, but their shapes and relative locations are still generally valid. The curves show that each type of lithography has a resolution region where a maximum Figure of Merit occurs. The resolution at which two curves cross may properly be regarded as a critical resolution. The e-beam/stepper critical resolution in Fig. 21 is 0.68 µm. Today, this critical resolution has shifted to the vicinity of 0.15 µm through the evolution of optical technology. The critical resolution for x-ray/stepper technology has likewise shifted from Iida’s value of 0.97 µm to today’s value of approximately 0.13 µm. Figure 21 incorrectly shows that e-beam lithography is not advantageous at any resolution. The primary reason for this obvious error is that Iida’s work does not reflect the importance of production volume. Equation (14) assumes that the lithography equipment will be 100% utilized (not including downtime). The non-recurring costs of masks, clean room space, etc. have also been ignored, since these costs become less significant in very high volume production. Figure 22 presents a picture in which volume per device type is an important factor. The plot in Fig. 22 shows that e-beam lithography dominates in effectiveness when production volume per device type is very low, regardless of feature size, because amortized mask cost dominates total cost at small volume but is zero for e-beam lithography. The manufacture of ASIC (Application Specific Integrated Circuit) devices is a prime example of low volume device types. E-beam’s advantage rises slowly as resolution improves from 0.7 to 0.45 µm because of improved yield compared to optical lithography and the need to use multilayer resist systems when optical lithography is pushed to the limit. The area shown as advantageous to x-ray lithography assumes that mask technology problems will be solved in time and at a reasonable cost.
The increasing advantage with volume at high resolution is a result of the high fidelity of imaging in this region. Optical lithography has a commanding position in the region above 0.5 µm because of its low cost and good performance. Today, this region extends even lower, to the 0.18 micron size and probably below. The low volume performance decreases primarily as a result of mask costs.

D. W. Peters[81] also proposed a cost-effectiveness Figure of Merit, F, defined as follows:

Eq. (15)
$$F = \frac{NQ}{(C + KA)\,R\,P}$$

where N is the real-time throughput in wafer-levels/hr, including all overhead (not the free-running throughput), and Q is the resist process complexity
factor and denotes relative yield loss (Q = 1 denotes conventional, single level, positive novolak resist processing; Q = 0.70 denotes unconventional resist materials or processing; and Q = 0.50 denotes multilayer resist processing).

C = Total Capital Cost
K = Clean Room Cost (~$21,500/m²)
A = Footprint Area (m²)
R = Working Resolution
P = Registration Accuracy
Figure 22. Optimum lithography technology based on both minimum feature size and production volume.
Using Eq. (15), Peters assembled the comparison shown in Table 5. Peters concluded that x-ray steppers using plasma-generated x-rays are optimum for resolving 0.5 µm geometries. The reader is warned, however, that the values used in Table 5 do not reflect a general consensus of lithographers, and that the high cost of x-ray masks is not included. The work is nevertheless shown here as another example of the models which can be used to compare lithographic tools economically. The reader may use the model with additional factors and values obtained from personal experience, or use the standard Sematech-based Cost of Ownership (COO) methods,[82] all of which were developed after the first edition of this book. Recently (1998),[82] a COO analysis from Japan has indicated that x-ray may have a chance of being used economically versus optical lithography when production volumes reach 3 million chips/month.[83]
Table 5. Figures of Merit for High Resolution Lithography

Parameters*                 Deep-UV steppers   E-beam direct-write   Tube source x-ray steppers   Laser generated plasma source x-ray steppers
R (µm)                      0.50               0.50                  0.50                         0.50
P (µm)                      ±0.20              ±0.20                 ±0.20                        ±0.20
N (wafer levels/hr)         10                 2                     5                            48
Q                           0.50               0.50                  0.70                         1
C ($M)                      1.65               4                     1.5                          1.75
A (m²)                      5.58               6.51                  1.86                         1.95
Figure of Merit (/$10 K)    0.28               0.02                  0.22                         5.36

*Reference: Peters, D. W., Ref. 81
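The Table 5 entries can be recomputed from Eq. (15) as a sanity check. The sketch below assumes that the total cost C + KA is expressed in units of $10 K (as the "/$10 K" label suggests), that K is the $21,500/m² clean room cost quoted in the text, and that P is the magnitude of the ± registration value. Under these assumptions the first three columns reproduce the tabulated figures of merit to within rounding. The fourth column, with the inputs as printed, evaluates to about 2.68 rather than the tabulated 5.36; a different registration or resolution value for that tool in the original table would account for the factor of two, but that is a guess and not something the source states.

```python
CLEAN_ROOM_COST_PER_M2 = 21_500.0  # K in $/m^2, value quoted in the text


def peters_figure_of_merit(N, Q, C_million, A_m2, R_um, P_um,
                           K=CLEAN_ROOM_COST_PER_M2):
    """Eq. (15): F = N*Q / ((C + K*A) * R * P), with cost in units of $10 K."""
    cost_in_10k = (C_million * 1e6 + K * A_m2) / 1e4
    return N * Q / (cost_in_10k * R_um * P_um)


# Inputs as printed in Table 5 (P taken as the magnitude of the +/- value).
TABLE_5 = {
    "Deep-UV stepper": dict(
        N=10, Q=0.50, C_million=1.65, A_m2=5.58, R_um=0.50, P_um=0.20),
    "E-beam direct-write": dict(
        N=2, Q=0.50, C_million=4.0, A_m2=6.51, R_um=0.50, P_um=0.20),
    "Tube source x-ray stepper": dict(
        N=5, Q=0.70, C_million=1.5, A_m2=1.86, R_um=0.50, P_um=0.20),
    "Laser plasma source x-ray stepper": dict(
        N=48, Q=1.0, C_million=1.75, A_m2=1.95, R_um=0.50, P_um=0.20),
}

if __name__ == "__main__":
    for tool, params in TABLE_5.items():
        print(f"{tool:35s} F = {peters_figure_of_merit(**params):.2f}")
```

The calculation also shows why the capital term dominates: for every tool in the table the clean room contribution KA is under 10% of C, so the ranking is driven mainly by N, Q, and C.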
We have discussed charter, marketing, technical requirements, and economics. These factors are key to the methodology for the selection of lithography tools. First consider the charter. Once the charter has been firmly established, the class of operation is defined. Combined with marketing, this sets the volume of product and the price range for the product. From this, one can determine the budgeted operating cost. In parallel, marketing has defined what the product performance must be to remain competitive. This in turn leads to technical requirements such as minimum geometries, chip area, mask levels, etc. At this point, the technical requirements can define several sets of lithography tools. The remaining decision is largely determined by economics, with slight adjustments from the charter. Recall that an operating budget was previously determined. Device selling price dictates what yield and throughput are required; total cost can then be computed from direct and indirect costs. One will then need to choose the tools that are affordable for the proposed life of the product or products. One last time, the charter, along with marketing, must be used to decide whether the selected tools are consistent with the proposed product price and operating cost. This whole strategy may take several iterations before converging. Note that the result is the selection of a class of operations; then, within that class, trade-offs are made among the parameters of yield, throughput, cycle time, technical requirements, and indirect costs.
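The screen-then-cost sequence described above can be sketched in a few lines. This is a deliberately simplified illustration, not a cost-of-ownership model: it considers only annualized capital and mask costs, treats capacity as a hard limit, and ignores labor, yield, and cycle time. The `Tool` fields and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    min_cd_um: float            # smallest CD the tool can print reliably
    annual_capital_cost: float  # $/yr, e.g. from Eq. (13)
    max_levels_per_year: float  # capacity in wafer levels per year
    mask_cost: float            # $ per new mask level (0 for direct write)


def cost_per_wafer_level(tool, wafer_levels_per_year, new_masks_per_year):
    """Fixed yearly cost (capital + masks) spread over the yearly volume."""
    fixed = tool.annual_capital_cost + tool.mask_cost * new_masks_per_year
    return fixed / wafer_levels_per_year


def select_tool(tools, required_cd_um, wafer_levels_per_year, new_masks_per_year):
    """Return (name, cost per wafer level) of the cheapest qualified tool, or None."""
    qualified = [
        t for t in tools
        if t.min_cd_um <= required_cd_um                    # technical screen
        and t.max_levels_per_year >= wafer_levels_per_year  # volume screen
    ]
    if not qualified:
        return None
    best = min(qualified, key=lambda t: cost_per_wafer_level(
        t, wafer_levels_per_year, new_masks_per_year))
    return best.name, cost_per_wafer_level(
        best, wafer_levels_per_year, new_masks_per_year)


# Hypothetical tool set (all values invented for illustration).
TOOLS = [
    Tool("optical stepper", 0.18, 500_000, 200_000, 10_000),
    Tool("e-beam direct write", 0.10, 900_000, 10_000, 0),
]

if __name__ == "__main__":
    # R&D charter: low volume, many new designs (many new masks per year).
    print(select_tool(TOOLS, 0.18, 5_000, 200))
    # High-volume, single-product charter: few masks, very large volume.
    print(select_tool(TOOLS, 0.35, 150_000, 20))
```

With these invented inputs the mask-free tool wins the low-volume, many-design case and the optical tool wins the high-volume case, which mirrors the qualitative behavior the text attributes to charter and volume.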
Let’s look at two examples that apply the lithography tool selection strategy. The first example is a research and development division whose responsibility is to push the state of the art in density and performance. Let’s say we are trying to fabricate devices with CDs ≤0.18 micron on chips ≥1.0 cm on a side. First, our charter is the R&D class. We have several processes and products, but our volume is extremely low. Marketing has told us they need a certain density and transistor count to sell future products, and this establishes the CD requirements. The speed of the device has also dictated the need for smaller CDs, along with some other parameters which will not be discussed here. Our volume is very low, but since we have many products to support, we must have flexibility in our equipment. Cost is important, but flexibility is vital. We are, therefore, interested in reducing the various setup support functions. We will want to choose a lithography tool with little or no nonrecurring cost and short cycle times, but one that is able to achieve the technical requirements. Note that amassing the final technical requirements is a long process in which each facet discussed earlier must be explored. Finally, we can select several tools which can meet the technical requirements. For our case, let’s assume that an x-ray stepper, e-beam direct-write, and a DUV optical stepper are all technically qualified. However, the optical stepper requires the use of multilevel resist, and 0.18 micron CDs truly push the state of the art for it. Let’s say the x-ray and the optical steppers cost $3 million each while the e-beam costs $5 million. At first one might believe that economics would eliminate the e-beam immediately. However, the e-beam needs no masks and uses a single level resist. Both the x-ray and optical steppers need masks. The high cost of x-ray masks will be prohibitive for any operation that is continuously changing the product design. 
Even the optical system becomes expensive when new masks are constantly being made, and waiting for masks to be made causes the time from finish of design to fabrication completion to be much longer than the corresponding time for an e-beam. Here is a case where the wall clock time is as important as, if not more important than, the capital cost. The cycle time is important because we are trying to establish a product to take to market, and the timing of that product may be the most critical factor. The cycle time for an e-beam based product could be five weeks, compared to ten weeks for an optical based system. Thus, the choice for this situation is the e-beam system. In the 1990s, this situation was real for leading edge companies in Japan and at IBM.

Let’s now take the other extreme. A factory is to be set up for a single process, single product operation which requires 0.35 µm CDs with chip sizes of 10 mm on a side. Note, the charter is for the Very High Volume/One
Product class. Marketing has already described the product and therefore has essentially set the requirements for technical capability. There are obviously numerous tools which can technically do the job. These include an optical stepper, which requires only a single level resist. The e-beam, plasma x-ray, and ion beam tools are far too expensive to operate in a high volume mode, and thus the optical stepper is the obvious choice. The next step is to determine what support technologies and equipment are needed to support such a high volume operation. Note that this type of operation is more concerned with recurring cost than with nonrecurring cost. Once the equipment and all support tools are set, little or no change will be necessary, so the initial nonrecurring cost can be amortized over a large volume. In the second example, the optical tool is the obvious choice and will continue to be into the next century.

Recently, Gomei and Suzuki[84] have compared costs for 64 Mb, 256 Mb, 1 Gb, and 4 Gb DRAM production with optical and non-optical tools (see Fig. 23). They have upgraded the COO modeling from the capability above to arrive at a cost per chip layer exposure (CLE) metric. From their study, the optical lithography costs for 200 mm wafers are $0.035/CLE for 64 Mb (i-line optical) and $0.1/CLE for 256 Mb (KrF). Strikingly, the x-ray costs are less than the KrF costs because the BARC etch and CMP processing items were left out for x-ray. Although omitting the BARC cost is legitimate, eliminating CMP is highly questionable. Nevertheless, the x-ray costs do become competitive. Of course, mix-and-match i-line and KrF will allow 256 Mb devices to be produced cost effectively, and may be economical up to 1 Gb. At 4 Gb, their x-ray assertion may be valid.

Fabs today depend upon wafer or die outs per month, and this can be influenced heavily by lithographic tool throughput. Throughput is dependent upon field illumination intensity, field size, stage design and overheads, alignment times and wafer handling, and even mask design in the case of OPC-designed contact reticles. For the latter, throughput is improved by reducing the shutter (exposure) time, which for large wafers can limit production cell throughput. Production fabs maximize productivity by increasing throughput and maximizing tool availability. Reducing S&S field sizes to match stepper fields, which are typically smaller, can, for example, reduce overall throughput. Newer stepper and S&S systems will run at higher wafer throughputs, of the order of 90–100 wafers/hr, owing to the extensive use of stiff air-bearing stages which allow frictionless, high-speed acceleration.[85] Tool decisions for volume fabs over the next couple of years will therefore most likely be based upon which optical tool has the greatest throughput.
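The dependence of throughput on field size, exposure time, and overhead can be illustrated with a common first-order approximation: time per wafer equals a fixed per-wafer overhead (load, align, unload) plus the number of fields times the per-field expose and step time. This model is an assumption introduced here for illustration and is not taken from the source; the field count ignores partial fields at the wafer edge, and all timing values are hypothetical.

```python
import math


def fields_per_wafer(wafer_diameter_mm, field_w_mm, field_h_mm):
    """Rough field count: wafer area divided by field area, rounded up."""
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    return math.ceil(wafer_area / (field_w_mm * field_h_mm))


def wafers_per_hour(n_fields, exposure_s, step_s, overhead_s):
    """First-order throughput: 3600 s divided by the time spent on one wafer."""
    return 3600.0 / (overhead_s + n_fields * (exposure_s + step_s))


if __name__ == "__main__":
    # Hypothetical 200 mm wafer, comparing a larger scanner-style field
    # with a smaller stepper-style field (dimensions are illustrative).
    for label, (w, h) in {"25 x 33 mm field": (25, 33),
                          "22 x 22 mm field": (22, 22)}.items():
        n = fields_per_wafer(200, w, h)
        wph = wafers_per_hour(n, exposure_s=0.2, step_s=0.3, overhead_s=11)
        print(f"{label}: {n} fields, {wph:.0f} wafers/hr")
```

In this model a smaller field raises the field count and therefore lowers throughput, while a shorter exposure time helps in proportion to the number of fields, which is consistent with the qualitative statements in the text.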
[Figure 23 plot, titled "Lithography COO: 40,000 Wafers/Masks": lithography costs ($/CLE, axis from 0.00 to 0.20), broken down into mask costs, resist process costs, and lithography system costs, for exposure tools labeled i-line, KrF, KrF, X-ray, ArF, and X-ray across the 64 Mb, 256 Mb, 1 Gb, and 4 Gb DRAM generations.]

Figure 23. Lithography costs for several generations of DRAM devices and lithography tool strategies. (Taken from Ref. 84 courtesy of Semiconductor International.)
4.0 SUMMARY
Selection of an optimum lithography tool is a very involved process requiring knowledge and experience in several disciplines including physics, chemistry, electronics, device design, processing, marketing, manufacturing, and economics. This chapter has established a basic strategy by which one skilled in these disciplines can select the best tool or set of lithography tools for his/her situation. Although many facets make up the strategy, the technical aspect is the dominant item because the tool has to be able to perform technically. Technical capability alone is not sufficient, however, because several types of machines will normally be technically capable; it must be weighed together with the charter, marketing, and economics. This is true whether the manufacturing facility is very large or simply a small laboratory in a university. Often one finds that the technical analysis has qualified several strikingly different types of lithography tools. At this point economics, whether manifested through the charter, marketing, manufacturing operations, or any combination thereof, will be the final decision maker in the selection of the best lithography for the user.
REFERENCES
1. “ICE, Lithography Dominates Wafer Processing,” ICE Midterm 1988 compendium, Semiconductor International, 11:36 (1988)
2. Castrucci, P., Henley, W., and Liebmann, W., Solid State Technology, p. 127 (Nov. 1997)
3. DeJule, R., Semiconductor International, p. 78 (Sept. 1998)
4. Jeong, H. J., Markle, D., Owen, G., Pease, F., Grenville, A., and von Bunau, R., Solid State Technology, p. 39 (April 1994)
5. Gargini, P., Glaze, J., and Williams, O., Solid State Technology, p. 73 (Jan. 1998)
6. Hilbert, R., Laser Focus World, p. 71 (April 1998)
7. Makimoto, T., Future Fab International, p. 36 (1997)
8. SRC News Letter, p. 5 (Dec. 1996)
9. Derbyshire, K., Solid State Technology, p. 133 (May 1997)
10. Technology News, Solid State Technology, p. 52 (May 1997)
11. Editorial Staff, Semiconductor International, p. 22 (April 1997)
12. Deal, B. E., and Talbot, J. B., ECS Interface, p. 18 (Spring 1997)
13. Haavind, R., Solid State Technology, p. 32 (Dec. 1998)
14. Industry Watch, Semiconductor International, p. 29 (July 1998)
15. Sasaki, H., IEEE IEDM-97, p. 4 (1997)
16. Taur, Y., and Nowak, E., IEEE IEDM Proceedings, 97:215 (1997)
17. DeTar, J., Electronic News, p. 14 (Sept. 1, 1997)
18. Cohen, W., Bus. and Tech., U.S. News and World Report (Mar. 25, 1996)
19. Anderson, D., Solid State Technology, p. 57 (March 1997)
20. DeJule, R., Semiconductor International, p. 36 (Feb. 1997)
21. Electronic News, p. 74 (Oct. 13, 1998)
22. Industry News, Semiconductor International, p. 42 (June 1998)
23. Murakami, M., Semiconductor International, p. 114 (July 1997)
24. Tech. News, Solid State Technology, p. 24 (Dec. 1998)
25. Levenson, M. D., Solid State Technology, p. 57 (Feb. 1995)
26. Spence, C., Schmidt, R., and Quinto, U., IEEE Lithography Workshop (Aug. 1996)
27. DeJule, R., Semiconductor International, p. 44 (May 1998); Semiconductor International, p. 74 (June 1998)
28. DeJule, R., Semiconductor International, p. 42 (Jan. 1998)
29. Grenon, B. J., Solid State Technology, p. 46 (Aug. 1998)
30. Nulty, J., Solid State Technology, p. 36 (Dec. 1998)
31. Haviland, R., and Dunn, P., Solid State Technology, p. 50 (Mar. 1996)
32. Korczyski, E., Solid State Technology, p. 40 (Dec. 1996)
33. Editorial Staff, Solid State Technology, p. 54 (July 1998)
34. Moore, G. E., Remarks of Dr. Gordon E. Moore, keynote address to Kodak Interface ’75, G-45 (1975)
35. Singer, P., Semiconductor International, p. 46 (Jan. 1995)
36. Claasen, T., Semiconductor International, p. 175 (July 1998)
37. Brunner, T., IEEE IEDM Proceedings 97, p. 9 (1997)
38. Brunner, T. A., OCG Interface, p. 1 (1996)
39. DeJule, R., Semiconductor International, p. 50 (Sept. 1998)
40. Ware, P., Submicron Focus, Canon, 2(2):2 (Spring 1997); Solid State Technology, p. 86 (1995)
41. Dickinson, A., Solid State Technology, p. 92 (Sept. 1995)
42. DeJule, R., Semiconductor International, p. 52 (Nov. 1997)
43. Bruning, J., Solid State Technology, p. 59 (Nov. 1998)
44. DeJule, R., Semiconductor International, p. 72 (June 1996)
45. Arnold, W., IEEE Lithography Workshop, Maui, Hawaii (Aug. 1996)
46. Arnold, W., Solid State Technology, p. 77 (Mar. 1997)
47. Derbyshire, K., Solid State Technology, p. 78 (Jan. 1998)
48. Oldham, W. G., and Schenker, R., IEEE Lithography Workshop, Maui, Hawaii (Aug. 1996); Solid State Technology, p. 95 (Apr. 1997)
49. Technology News, Solid State Technology, p. 36 (Mar. 1997)
50. Burggraaf, P., Solid State Technology, p. 31 (Feb. 1999)
51. Powell, M. W., Solid State Technology, p. 81 (Oct. 1998)
52. Technology News, Solid State Technology, p. 1, 44 (Oct. 1998)
53. DeJule, R., Semiconductor International, p. 46 (May 1998)
54. DeJule, R., Semiconductor International, p. 54 (Feb. 1998)
55. Silverman, P., and Levenson, M. D., Solid State Technology, p. 81 (Sept. 1995)
56. Levenson, M. D., SPIE, 3051:2 (1997)
57. Ogawa, T., Uematsu, M., and Oda, T., Semiconductor Fabtech, 3:169 (1995)
58. Wampler, K., and Caldwell, R., Solid State Technology, p. 91 (1998)
59. McGrath, D., Electronic News, p. 26 (Sept. 15, 1997)
60. Miller, D., Solid State Technology, p. 69 (Nov. 1997)
61. DeJule, R., Semiconductor International, p. 81 (Feb. 1997)
62. Sturtevant, J., Barrick, M., Blackley, S., Crabtree, P., Gerold, D., Hershey, R., Lucas, K., Maltabes, J., Robertson, S., Roman, B., Yang, D., and Yuan, C-M., IEEE Lithography Workshop, Maui, Hawaii (Aug. 1996)
63. One Step Ahead, SAL Pub., Q2 and Q3 (1996)
64. Dunn, P., Solid State Technology, p. 49 (June 1994)
65. Burggraff, P., “AEBLE-based E-beam Production Results Divulged,” Semiconductor International, p. 34 (Nov. 1989)
66. Levenson, M. D., Solid State Technology, p. 42 (May 1998)
67. Bruning, J. H., SPIE, vol. 3049, p. 14 (1997)
68. Hawryluk, A., Ceglio, N., and Markle, D., Solid State Technology, p. 75 (Aug. 1997)
69. Electronic News, p. 45 (Oct. 13, 1997)
70. Industry News, Semiconductor International, p. 52 (July 1998)
71. Tepermeister, I., Conner, W., Alzaben, T., Barnard, H., Gehlert, K., and Scipione, D., Solid State Technology, p. 63 (March 1996)
72. Skidmore, K., “Applying Photoresist for Optional Coatings,” Semiconductor International, 2:45 (Feb. 1988)
73. Stemp, I. J., Nicholas, K. H., and Brockman, H. E., Proc. Conf. Microcircuit Engineering 1978, Cambridge, England (Apr. 1978)
74. Stapper, C. H., “Defect Density Distributions for LSI Yield Calculations,” IEEE Trans. Electron Devices, ED-20:655 (1973)
75. Levenson, M. D., and Dunn, P. N., Solid State Technology, p. 205 (June 1996)
76. Editorial Staff, Technology News, Solid State Technology, p. 58 (July 1996)
77. Semiconductor Economics Report, Published by DM Data Inc., 6900 E. Camelback Road, Scottsdale, 3:7 (Feb. 1989)
78. Industry Watch, Semiconductor International, p. 17 (Dec. 1997)
79. Wafer News Staff, Solid State Technology, p. 42 (Sept. 1998)
80. Iida, Y., Hybrid Lithography, Nikkei Electronics, 2-1:213 (1983)
81. Peters, D. W., “Examining Competitive Submicron Lithography,” Semiconductor International, 2:96 (1988)
82. Dance, D., Jimenez, D., and Levine, A., Semiconductor International, p. 117 (July 1998)
83. Semiconductor International Editors, Semiconductor International, p. 332 (July 1998)
84. Gomei, Y., and Suzuki, M., Semiconductor International, p. 143 (July 1998)
85. DeJule, R., Semiconductor International, p. 36 (Apr. 1997)
2 Resist Technology— Design, Processing, and Applications John N. Helbert Motorola, Inc. Compound Semiconductor Fab-2 Mesa, Arizona
Tony Daou Motorola, Inc. MOS 12 Chandler, Arizona
PREFACE

The objective of this chapter is to provide a user’s view of resist/lithographic process technology. Other notable authors have previously provided insightful views of resist technology,[1]–[5] but from a research or resist inventor’s point of view. My intent is to supplement those excellent works, not to reproduce them in another source. Some material must be rehashed for completeness, but hopefully from another complementary perspective. The emphasis of this chapter will be placed on applications of this technology to the manufacturing of prototype and production integrated circuit devices. Furthermore, a greater emphasis will be placed upon empirical resist process development to achieve reproducible and statistically-controlled resist manufacturing processes.
1.0 INTRODUCTION TO PATTERN TRANSFER TECHNOLOGY
Photolithography technology, the combination of the exposure tool, and the image transfer process, is vital to integrated circuit fabrication,[6] or more generally, semiconductor device manufacturing. Nearly every primary device fabrication step requires a process-compatible masking layer, which is capable of providing a desired circuit level pattern. This indirect patterning is required because either the layer is not directly patternable technologically, or it cannot be accomplished economically. Resist-image transfer layers, as their name implies, “resist” individual layer processing steps to enable electronic devices to be fabricated vertically, layer by layer, on a thin silicon crystal wafer.[1] (See Fig. 1.)[5] These individual layers can beinsulating dielectrics, semiconducting active device elements, or metallic interconnects as shown in the figure; more recently, however, greater than six levels of metal are either designed or are being fabricated complexity-wise. In addition to providing device manufacturability by circuit element definition,[6] manufacturability and yield, and circuit density points-ofview, the lithographic process is also capable of directly influencing device performance. The resist lithographic resolution and critical dimension (CD) control, for example, can directly influence device reliability and electrical performance. Historically, the resist CD requirements have reduced approximately 20–30% every two years,[5] thus, pushing some older lithographic tools to their limits and making them obsolete, except for the fabrication of older more mature devices. The capability of the lithographic process is determined, to a large degree, by the wavelength of the electromagnetic energy source used to carry out the selective patterning image exposure process (see also Ch. 1 and 5). Typically, visible light is used of wavelengths ranging from 248–420 nm for most photostepper equipment. 
Most of the stepper and process results in this chapter are for i-line or DUV exposure, at wavelengths of 365 nm (near UV) and 248 nm (deep UV), respectively. The light is imaged on the wafer through transparent quartz masks patterned with chromium metal, using reflective or refractive optics.[7] Electromagnetic energy sources of wavelengths less than 300 nm can be provided by deep UV (DUV) producing systems such as high pressure mercury arc lamps or laser systems.[8] Further reductions in wavelength are achieved by employing focused electron beam sources, like those found in scanning electron microscopy (i.e., 10–100 keV), or soft x-ray systems (2–20 Å wavelength). Most importantly, the wavelength in most cases determines what type of resist can be employed, because the energy of the lithographic tool must be coupled to the resist to ensure that electromagnetic energy is converted to radiation chemical energy. Otherwise, the energy is simply absorbed and no lithographic relief images are produced.
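The dependence of resist choice on wavelength can be made concrete with the Planck relation E = hc/λ, which gives the energy carried by each photon. The following is a minimal sketch rather than anything taken from the chapter: the function name `photon_energy_ev` is illustrative, and the 2 nm and 0.2 nm entries simply restate the 20 Å and 2 Å ends of the soft x-ray range quoted above.

```python
# Photon energy E = h*c/wavelength for exposure wavelengths named in the text.
H_PLANCK = 6.62607015e-34   # Planck constant, J*s
C_LIGHT = 2.99792458e8      # speed of light, m/s
EV = 1.602176634e-19        # joules per electron-volt


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the photon energy in eV for a wavelength given in nm."""
    return H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9) / EV


# Wavelengths quoted in the introduction (labels are illustrative).
sources = {
    "upper photostepper limit (420 nm)": 420.0,
    "i-line (365 nm)": 365.0,
    "DUV (248 nm)": 248.0,
    "soft x-ray, 20 A (2 nm)": 2.0,
    "soft x-ray, 2 A (0.2 nm)": 0.2,
}

for name, wavelength in sources.items():
    print(f"{name:36s} {photon_energy_ev(wavelength):10.2f} eV")
```

Each step down in wavelength raises the energy per photon, which is one way to see why a chromophore tuned for 365 nm is not automatically suitable at 248 nm or for x-ray exposure.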
Figure 1. Cross sectional view of vertical layering in the manufacturing of silicon VLSI circuits.
Although the lithographic properties of resists can determine circuit density and performance, the resist must first of all be compatible and integratable with the device layer process, or it is of academic importance only. Unfortunately, the literature abounds with resist systems of great lithographic capability that cannot be employed in the commercial fabrication of semiconductor devices because they cannot withstand, or “resist,” certain required processes. In the tool and process applications sections of this chapter, process-compatible resist processes and tools will be discussed. In essence, the perspective is a practical user’s view, where actual device fabrication experience exists and the process and manufacturing issues are real. The preceding paragraphs point out that the resist and alignment tool make up a total lithographic system; both can influence the final result. However, since the tool aerial image,[8] or the light contrast across the reticle feature edge, is fixed at the wafer plane by the tool manufacturer, the process engineer is left with resist process optimization as the only primary variable for influencing circuit pattern density. Furthermore, resist systems or processes will be described which actually extend the useful resolution or lifetime of certain lithographic aligners.
2.0
RESIST DESIGN
2.1
Conventional Photoresists
A manufacturable submicron device process requires a photoresist that integrates well with the other fabrication processes and also has excellent lithographic properties, especially for ≤ 0.5 µm advanced CMOS or BiCMOS VLSI device fabrication. The photoresist used in the fabrication of these devices is vital to the quality of the final VLSI product. The photoresist must retain its image feature size when subjected to different temperatures and etching processes. Dry etching process temperatures sometimes exceed the image flow temperatures of some current photoresists, especially in harsh multilevel metal (MLM) backend processing. Therefore, thermal stability and dry etch compatibility are vital photoresist evaluation parameters for MLM process applications. In addition, the photoresist must be lithographically capable of producing quality submicron device features, especially submicron via cuts and metal spaces. The contrast, exposure latitude, depth of focus, and linearity of a photoresist must all be adequate to meet the design rule needs of the device
being produced. This section provides results of an evaluation of the lithographic and process compatibility properties of two model 4th generation positive photoresists, resist A and resist B, to demonstrate example photoresist process capabilities for the fabrication of next-generation submicron devices with MLM backends. The results are provided merely as example data for two resists; no resist recommendation is intended or implied.

Positive Resists. These resist materials[1]–[3][7] are the workhorses of modern integrated circuit (IC) manufacturing technology. All new very large scale IC (VLSIC) fabrication lines employ high resolution positive toned material, while the older lines with more mature products still rely heavily on negative toned resists. Positive toned resists develop away in the exposed areas to create recessed relief images, using dilute aqueous base developer solutions that can be disposed of safely. When employed, they can be used at all device levels by simply changing the density of the reticle. Positive photoresists are formulated from several components: polymeric resins with molecular weights of the order of 1–10 kilograms/mole, photoactive molecular organic additives (PAC) or non-photoactive dissolution inhibitors, leveling agents (SLA), optional dyes to reduce substrate reflectivity effects, sensitizers, surfactants for developer wetting, and organic spinning solvents. The resin molecular weights are intentionally chosen to be low to ensure solubility in the polar aqueous base developers. The photoactive species also acts as a dissolution inhibitor; that is, it prevents development in the unirradiated regions of the film, which are needed to resist (i.e., mask) further processes (see Sec. 3). The leveling agents prevent undulations on the resist surface by plasticizing the resin or by lowering the surface tension of the resist solution to improve wafer wetting during spin coating.

PAC Influence.
Photoactive compounds (PACs), or sensitizers, are usually naphthoquinone diazides like those pictured in Fig. 2.[9] The diazide (DAQ) moiety of this molecule absorbs in the near-UV to visible region of the spectrum; most importantly, it undergoes a photochemically induced reaction, the photo-elimination of the azo nitrogen, that results in a solubility change in the dissolution inhibitor photoproduct (Refs. 1–3, 7 and references therein). It is this conversion of electromagnetic energy to chemical reaction product that produces the observed resist behavior. This conversion process is fairly efficient, as determined by basic quantum efficiency measurements for some PACs. The quantum efficiency, φ, defined as the ratio of the number of molecules reacting to photoproduct to the number of photons absorbed, can in general be as large as $10^6$. Values larger than 1 are usually associated with a free radical chain reaction mechanism, while most photoresist photochemical reactions have values ranging from a few hundredths to a few tenths. The quantum efficiency for acetone, a model carbonyl-containing compound (i.e., one with a C=O group like the PACs of Fig. 2), was measured to be 0.17.[10] The quantum efficiencies for the PACs of Fig. 2 were all determined to be about 0.3 at the typical optical exposure wavelengths.[10] These values are quite high when compared to other energy conversion processes; thus, these photoprocesses are very energy efficient, at roughly 30%. Even greater efficiency, 50%, has been observed for some resists by other researchers.[11]
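The definition of φ as molecules reacted per photon absorbed lends itself to a short worked example. The sketch below is illustrative only: the absorbed dose of 10 mJ/cm² is an assumed value chosen for the arithmetic, not a figure from the chapter, and the function names are hypothetical. It uses the φ of about 0.3 quoted above for the PACs of Fig. 2.

```python
# Worked example of the quantum-efficiency definition:
#   phi = (molecules converted to photoproduct) / (photons absorbed)
H_PLANCK = 6.62607015e-34   # Planck constant, J*s
C_LIGHT = 2.99792458e8      # speed of light, m/s


def photons_per_cm2(absorbed_mj_per_cm2: float, wavelength_nm: float) -> float:
    """Number of photons per cm^2 carried by an absorbed energy dose."""
    energy_per_photon_j = H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9)
    return absorbed_mj_per_cm2 * 1e-3 / energy_per_photon_j


def quantum_efficiency(molecules_reacted: float, photons_absorbed: float) -> float:
    """phi, the ratio of molecules reacted to photons absorbed."""
    return molecules_reacted / photons_absorbed


# ASSUMPTION: 10 mJ/cm^2 is actually absorbed by the PAC at i-line (365 nm).
absorbed = photons_per_cm2(10.0, 365.0)
converted = 0.3 * absorbed            # phi ~ 0.3 as quoted for the PACs of Fig. 2

print(f"photons absorbed per cm^2 : {absorbed:.3e}")
print(f"PAC molecules converted   : {converted:.3e}")
print(f"phi recovered             : {quantum_efficiency(converted, absorbed):.2f}")
```

Note that only photons absorbed by the PAC chromophore count in the denominator; light absorbed elsewhere in the film does not enter φ.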
Figure 2. Structural formulae for photoactive diazoquinone (DAC) components of positive photoresist. Note that q = 3 for the tri-functional PAC.
In the acetone example above, the light is absorbed by the specific carbonyl chromophore group, which in turn leads to the chemical reaction. The first law of photochemistry, “only the light absorbed by the molecule (for example, the PAC) can result in a chemical change in the molecule,”[12] applies for this example and for PAC absorption in photolithography. Further, by the second law of photochemistry,[12] the sum of the quantum efficiencies must be one unless a chain reaction is involved. This definition stipulates that the absorption of energy is a one-quantum process. Light that is merely absorbed in the resin or the substrate, and not at the specific PAC chromophore, does not contribute to φ. In other words, only the bleachable absorption of the resist over the exposure spectrum is important in the lithography (Fig. 3).
AZ 5214 1.45 µm Coating, Unexposed & 5 Sec Exposure
Figure 3. Bleaching curves for AZ 5214 mid-UV resist. The bleached portion of the spectrum is the difference between the exposed and unexposed spectra of the figure. (Courtesy of AZ Photoresists.)
Absorption of light in the resist is given by the Beer-Lambert law, $I/I_0 = 10^{-Ecl}$,[11] where E is the molar extinction coefficient, c the chromophore concentration, and l the resist film thickness. Arden et al.[13] have shown that a high E can lead to poor resist image edge walls and larger CD variation, so E should be judiciously chosen in designing the positive photoresist. The coefficient E is a linear function of A plus B, the resist absorption parameters to be discussed later in more detail (see Sec. 3.0), with both A and B corrected for concentration.
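A brief numerical sketch of the Beer-Lambert law shows how quickly the light reaching the bottom of the film falls off as the product Ecl grows. The function names are illustrative, and the absorbance values in the loop are arbitrary sample points rather than measured resist data.

```python
# Beer-Lambert law: I/I0 = 10 ** (-E * c * l)
def transmitted_fraction(extinction: float, concentration: float,
                         thickness: float) -> float:
    """Fraction of incident light that reaches the bottom of the film."""
    return 10.0 ** (-(extinction * concentration * thickness))


def absorbed_fraction(extinction: float, concentration: float,
                      thickness: float) -> float:
    """Fraction of incident light absorbed within the film."""
    return 1.0 - transmitted_fraction(extinction, concentration, thickness)


# ASSUMPTION: sample values of the total absorbance E*c*l, chosen for
# illustration; concentration and thickness are set to 1 so that the first
# argument is the absorbance itself.
for absorbance in (0.1, 0.3, 0.5, 1.0, 2.0):
    t = transmitted_fraction(absorbance, 1.0, 1.0)
    print(f"E*c*l = {absorbance:3.1f}  transmitted = {t:6.3f}  "
          f"absorbed = {1.0 - t:6.3f}")
```

At an absorbance of 1 only a tenth of the light reaches the substrate, which is consistent with the edge-wall and CD concerns attributed to high E above.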
The photochemical quantities of interest to photoresist design are therefore φ and E. Both quantities can be measured empirically, as outlined in Ref. 12. Resist sensitivity is influenced by φ, but E is merely a measure of the film absorption and may not represent absorption that leads to useful photochemical change in the resist. For example, conventional photoresists have large E at wavelengths less than 300 nm but are very poor resists at those wavelengths because of the high absorption of the novolac resin alone, regardless of the φ value of the PAC involved. E must not approach one, or the system is useless at those wavelengths; it must instead have some intermediate value (i.e., 0.3–0.5) so that the “skin absorption effect” can be avoided. This ensures that the resist image will be cleared to the substrate and that the resist image edge wall will not be severely degraded from normal (i.e., undercut) by high resist absorptivity.[7][13]

The composition of the PAC can influence both the spectral response and the contrast or resolution of the resist. This is important because mask/reticle aligners operate at different wavelengths; the PAC must therefore be designed for the wavelength characteristic of the particular aligner tool. Willson and coworkers (Ref. 4 and references therein) have written key papers in the area of PAC design and have demonstrated successful PAC wavelength tuning through chemical synthesis. By adding chemical substituents to the PAC molecules at specific molecular bonding sites and by blending PACs, they succeeded in formulating a resist designed for use with a Perkin-Elmer Micralign 500 lithographic exposure system operating in the mid-UV (UV-3; 310 nm) region of the Hg lamp emission spectrum.
It had bleachable absorption in the mid-UV region, which was an indication that the photochemical conversion of the diazonaphthoquinone molecule to the soluble acid product was occurring as required for image formation (Fig. 4). Daniels and coworkers[14] have also shown the importance of polyDAC substitution of the PAC upon resist contrast or effective aerial image (i.e., CMTF improvement) of the total resist system. In Fig. 5, the theoretical polyphotolysis photoproduct modulation transfer function is compared to that provided by the phototool. The resist can be designed to provide image resolution better than the resolution-limited tool performance, a result which is becoming more prevalent; that is, photolithography has gone from being aligner limited, with low contrast resists, to being limited by resist performance or contrast. Furthermore, higher contrast resists and special resist processes are being developed to extend tool lifetimes in some cases.
Figure 4. Chemical reactions of photoactive component during the UV exposure of positive photoresists.
Figure 5. Image modulation enhancement function or aerial image vs. lateral position for positive photoresist exposure. (Courtesy of Shipley Co. and SPIE Ref. 14.)
Successful, high contrast positive resist design requires a very nonlinear response between exposed and unexposed resist. For any degree of polyphotolysis, q, the general dissolution rate can be given by:

Eq. (1)    $R = r_0 \left(1 - e^{-Ec}\right)^{q}$

where $r_0$ is the fully photolyzed dissolution rate.[14] The potential influence of polyphotolysis upon resist contrast is demonstrated in Fig. 6, where it is seen that as q increases, the resist contrast increases. Of course, these theoretical limits are rarely obtained due to the complexity of the total system; but over the last three years, significant gains in resist contrast have been achieved by several commercial positive resist manufacturers (for example, Shipley, JSR, OCG or ARCH, and others), and average q values approaching 4 and above have been achieved.
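The effect of q in Eq. (1) can be explored numerically. The sketch below treats the exponent product Ec as a single dimensionless variable x, which is an assumption made purely for illustration, and uses the slope of ln R against ln x as a stand-in for contrast; the function names are hypothetical. Differentiating Eq. (1) gives d ln R / d ln x = q·x·e^(−x)/(1 − e^(−x)), which tends to q as x becomes small.

```python
import math


def relative_rate(x: float, q: int) -> float:
    """R/r0 = (1 - exp(-x))**q, with x standing for the exponent in Eq. (1)."""
    return (-math.expm1(-x)) ** q


def loglog_slope(x: float, q: int) -> float:
    """d ln(R) / d ln(x) = q * x * exp(-x) / (1 - exp(-x))."""
    return q * x * math.exp(-x) / (-math.expm1(-x))


# ASSUMPTION: x values are arbitrary sample points, not measured doses.
print("   x     " + "  ".join(f"q={q}: R/r0  slope" for q in (1, 3, 5)))
for x in (0.1, 0.5, 1.0, 2.0, 4.0):
    cells = [f"{relative_rate(x, q):10.4f} {loglog_slope(x, q):6.2f}"
             for q in (1, 3, 5)]
    print(f"{x:5.1f}  " + "  ".join(cells))
```

At every sample point the slope scales linearly with q, while the rate at low x is suppressed far more strongly for large q; this is the sharper switching between unexposed and exposed behavior that Fig. 6 depicts.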
Figure 6. Theoretical characteristic curves for positive photoresist assuming a polyphotolysis mechanism. (Courtesy of Shipley Co. and SPIE Ref. 14.)
Influence of Resin Composition. The novolac resin composition of the positive photoresist formulation has been shown to influence resist contrast performance by Hanabata et al.,[15] Templeton et al.,[16] and Pampalone.[17] The influence arises through resin conformational effects upon the dissolution rates at image development, created by isomeric compositional effects that are built in at resin synthesis. Not just the unexposed development rate is important, but also the compositional effect of the resin upon the ratio of exposed to unexposed rates in the photoresist. Novolac resins are polymers synthesized via a condensation reaction between substituted phenols (o-, m-, and p-cresols) and formaldehyde (Fig. 7).[15] Because of the poor reproducibility of feedstock and resin compositions, the normally high polydispersity novolac resins are usually custom blended to achieve a better confidence level in the final formulated product and improved lot-to-lot reproducibility of photoresist performance.[17] Their dissolution rates in aqueous base developers are determined primarily by isomer composition, methylene bond position in the resin, and/or molecular weight; for example, high molecular weight fractions are synthesized most easily from m-cresols,[18] and these resins have lower dissolution rates. High resin molecular weights (i.e., ≥9000) lead to development resistance, an attractive property in the unexposed area of resist, but may also lead to film residues under conditions of low humidity[18] and to reduced contrast.[17] Low molecular weight resins, such as pure para-cresol resins, similarly lead to poor photoresist formulations because of increased development rates; therefore, resin molecular weight and dispersity must be optimized in positive photoresist design. Modern conventional resists contain higher quality resins that are more narrowly dispersed for improved development dissolution performance.
Hanabata et al.[15] find the dissolution rate of unexposed “high ortho-bonding” resist (i.e., ortho to ortho´ methylene bonding; see Fig. 8) to be strongly inhibited. Hanabata hypothesizes that this contrast-enhancing property results from an azo-coupling interaction between this type of resin and the diazide dissolution inhibitor; it ultimately gives higher resist contrast without sacrificing resist speed. Similarly, Templeton et al.[16] found novolac resin solubility rates to be highly dependent on methylene bonding position (i.e., on structure), but they emphasize intra- and intermolecular hydrogen bonding effects, tied to isomeric composition, as performance dominating. Regardless of mechanism, however, resin structural composition, molecular weight, and molecular weight distribution can be performance limiting, and their influence upon dissolution rates (i.e., exposed vs. unexposed) must be taken into account for successful resist design.
Figure 7. Novolac photoresist resin structural formulae for representative novolac resins found in conventional photoresist materials.
Figure 8. Graphical representation of o-o´ interactions leading to enhanced dissolution inhibition as introduced by Hanabata. (Figure courtesy of Shipley Co.)
Positive I-line Resist Advances. I-line technology was originally thought to be limited to 0.35 micron and above.[19] The latest i-line resists from Shipley build upon Hanabata’s ortho to ortho´ resin work above, while also improving the resin molecular weight control and reducing the molecular weight dispersity to a controlled lower level. These advances lead to gains in resist contrast, exposure latitude, CD control, iso/dense CD bias, and depth of focus. Shipley has also improved PAC photospeed, component solubility (to reduce particulate defectivity), and dissolution inhibition levels, all through PAC synthetic means. The design goal for these advanced systems is 0.2–0.25 micron positive i-line performance; advanced tool illumination, OPC, and PSM techniques are also assumed, but these goals are nonetheless substantial and impressive. They have forced some users to rethink the financial viability of DUV resist usage at some device levels.

JSR has focused its advanced i-line positive resist efforts on reducing proximity bias and line shortening effects; in accomplishing this, it also achieves better overall resist performance. By modifying the development rate ratio between the unexposed resist and the very bottom exposed 1000 Å resist layer, through chemical PAC synthesis tailoring and blending, JSR researchers were able to reduce proximity bias from 40 nm to 8 nm.[20] Line shortening, which affects both DRAM and microprocessor applications, was also improved for high NA usage and thinner resist films, but not eliminated.
They also achieved a very high performance resist in this search as the photos of Fig. 9 attest. The dissolution behavior desired is approached by the results for JSR 790 resist in Fig. 10; note, the dramatic non-linear dissolution characteristics, as specified by C. G. Willson and Shipley researchers, are required to achieve the high resolution images of these JSR resists.
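The sub-wavelength i-line targets discussed in this section can be put in perspective with the standard Rayleigh scaling relation CD = k1·λ/NA, which underlies the k1 factors the chapter quotes elsewhere. The sketch below is illustrative only: the numerical aperture values are assumptions, since the chapter does not state the NA of the tools involved, and the function name is hypothetical.

```python
# Rayleigh scaling: CD = k1 * wavelength / NA, so k1 = CD * NA / wavelength.
def k1_factor(cd_um: float, wavelength_nm: float,
              numerical_aperture: float) -> float:
    """Return the k1 factor implied by a printed CD on a given tool."""
    return cd_um * 1000.0 * numerical_aperture / wavelength_nm


# ASSUMPTION: the NA values below are hypothetical tool settings.
cases = [
    ("i-line, 0.35 um", 0.35, 365.0, 0.50),
    ("i-line, 0.25 um", 0.25, 365.0, 0.60),
    ("i-line, 0.20 um", 0.20, 365.0, 0.60),
    ("DUV,    0.25 um", 0.25, 248.0, 0.60),
]

for label, cd, wavelength, na in cases:
    print(f"{label}  NA = {na:.2f}  k1 = {k1_factor(cd, wavelength, na):.2f}")
```

Under these assumed apertures, printing 0.20–0.25 micron features at 365 nm implies k1 values well below 0.5, which is why the text notes that advanced illumination, OPC, and PSM are assumed alongside the resist improvements.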
Figure 9. SEM cross sections of resist designed by JSR to have non-linear dissolution kinetics as specified in reference 20. (Photo courtesy of JSR.)
Figure 10. JSR data for the resist of Figure 9 demonstrating the design principle for nonlinear dissolution. (Figure courtesy of Dr. Nobu Koshiba of JSR.)
Photoresists (OiR 600 series) that extend i-line lithography to 0.24 micron using standard binary masks have been developed and reported[21] at Olin, now called ARCH Chemicals. Their researchers, working with their Fuji Olin partners, have improved their i-line resists by incorporating hindered -OH groups and greater hydrophobicity into the resist resin, and by providing longer, more rigid backbones for the PACs. They have achieved structured novolacs with controlled branching, and have developed a new PAC system that enhances exposure tool response. The novolac dissolution is controlled in the developer, and this renders an i-line resist capable of sub-wavelength lithography at less than 0.30 micron. Both main-ingredient chemical changes provide better “dissolution switching,” greater resolution, and steeper resist profiles.

In the last decade, i-line resist companies have begun to provide resists designed for specific layer applications, where differentiation depends upon the CD and process compatibility requirements of the specific application. While this originated as more of a general RIE erosion application thickness issue, it has now become more widespread. Nearly every company also provides a fast, low-cost cross-over resist, usable in fabs running multiple-wavelength tools or tools like Ultratech broadband steppers with broad illumination emission spectra. For fabs at large wafer sizes, i.e., with very large numbers of stepped shots per wafer as for 200 mm wafers, fast resists are important for stepper throughput reasons, as category (iv) below reflects. These layer application specific (AS) resist design categories are listed below:

i. Contact/via and PSM
ii. Line/space, gate-like high contrast and resolution
iii. Isolated features on reflective and non-reflective substrates
iv. Less-critical layers, fast resist
v. Metal layer
vi. Ion-implant with thick fast resists (>2–6 microns)

The CD-critical layers (i–iii), where k1 factors below 0.8 are dictated and which are found in the front-end processing of device flows before metal 1, can also be subdivided into attenuated PSM resists, annular reticle illumination compatible resists, resists with great surface inhibition, high NA standard application resists, ARC compatible resists, and so on. Metal layer application resists can be metal process specific as well. Thickness is still very important to metal processing resist design, but with the advent of front-end high-energy MeV implants, application thickness has become even more critical. Some keV-level ion implants also require thinner, higher resolution resists to achieve correct angled shadowing and greater pattern resolution.

Other advances related to i-line resist design and applications which are being implemented in volume fabs (defined as 2000–8000 wafer starts per week) and at less than 0.5 micron design rules are: improvements in lamp intensity and more frequent lamp replacement to keep light intensity up and throughput high, better stepper field shot utilization, greater tool utilization percentage vs. theoretical, a move to all cascading and parallel processing track flows, and reduced alignment site mapping overhead times. All of these phototool hardware changes allow fabs to take advantage of the faster resists as wafer sizes go from 8" to 12".

Positive Photoresist Summary. All of the positive photoresist design effects discussed above have one thing in common.
They all lead to non-linear dissolution characteristics, which create high contrast resist imagery. Willson[22] has captured this concept graphically in Fig. 11, which demonstrates how a high-resolution lithographic resist image can result from a poor aerial image from the tool, via non-linear dissolution. This is further consistent with the concept of “multiple PAC blending” proposed by Koshiba et al. of JSR above. Furthermore, these effects also lead to improved latitude in dimensional control,[19] a desirable characteristic that determines resist and device performance and is required in all modern semiconductor device manufacturing.

Negative Toned Mid-UV Photoresist. Negative photoresist, the mainstay of semiconductor manufacturing production from the 1960s to the late 1970s, is also basically a two-component resist formulation, like conventional positive systems. The resist mechanism, however, is quite different. Here, the photoactive species is a di-functional photocrosslinking azide, abbreviated as N3-X-N3, where X is a conjugated aromatic moiety. The bisazide efficiently absorbs light to form a very reactive nitrene, -N:, which is capable of chemically inserting into any C-H or C=C bond of the partially cyclized rubber resin to form an intermolecular crosslink between resin molecules. This crosslinking reaction creates a large increase (i.e., 2X) in the molecular weight of the cyclized rubber binder polymer every time two azide crosslinks occur, thus substantially decreasing the solubility rate of the optically exposed image. The image is negative toned: the resist remains after development where the light strikes, because of the increased molecular weight produced by the photocrosslinking reaction, whereas the unexposed area is completely developed away (Fig. 12).
Figure 11. Figure taken from C.G. Willson Ref. 22 illustrating the principle of non-linear dissolution and how it can lead to high-resolution resist images even though the projected image contrast from the exposure tool is actually relatively poor in contrast.
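The principle illustrated in Fig. 11 can be sketched numerically. The toy model below is not the model of Ref. 22: it assumes a sinusoidal aerial image of low modulation and a hypothetical power-law dissolution response, rate ∝ intensity^n, solely to show how a non-linear response amplifies a weak image. All names and parameter values are illustrative assumptions.

```python
import math


def aerial_image(x: float, pitch: float, modulation: float) -> float:
    """Normalized intensity of a sinusoidal aerial image (toy model)."""
    return 1.0 + modulation * math.cos(2.0 * math.pi * x / pitch)


def dissolution_rate(intensity: float, n: float) -> float:
    """ASSUMED power-law response: rate proportional to intensity**n."""
    return intensity ** n


def modulation_of(values: list) -> float:
    """Contrast metric (max - min) / (max + min)."""
    high, low = max(values), min(values)
    return (high - low) / (high + low)


pitch = 1.0                                   # arbitrary units
xs = [i * pitch / 200 for i in range(201)]    # one full period
image = [aerial_image(x, pitch, 0.3) for x in xs]   # assumed weak image

print(f"aerial image modulation: {modulation_of(image):.3f}")
for n in (1, 2, 5, 10):
    rates = [dissolution_rate(v, n) for v in image]
    print(f"n = {n:2d}  dissolution-rate modulation: {modulation_of(rates):.3f}")
```

With a linear response (n = 1) the dissolution-rate pattern is no sharper than the optical image, whereas larger exponents push the rate modulation toward unity, which is the qualitative behavior the figure conveys.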
Figure 12. Resist and etching processing sequence relevant to the fabrication of VLSI circuits.
Negative resists are designed by controlling the degree of partial cyclization of the resin and by extending the conjugation of the bis-arylazide sensitizer.[3] Control of the resin cyclization reaction is thought to influence the resin molecular weight distribution, which in turn influences resist contrast, while the degree of conjugation of the azide-containing chromophore determines the spectral absorption characteristics of the crosslinking azide sensitizer. These photoresist formulations are generally very sensitive because the bis-arylazides have high quantum efficiency, with φ ~0.4 to 0.7 as a biphotonic average, i.e., (φ1 + φ2)/2.[23] Unfortunately, negative resists do not withstand advanced dry etch processing (for example, in Applied Materials’ reactive ion etchers) very well (Fig. 13); therefore, negative photoresists remain in use only in older production lines, where large design rules (i.e., large image sizes) are called for and wet isotropic etching is still acceptable in terms of process image dimension bias. Negative photoresists generally suffer from low contrast (i.e., usually have contrast values ≥1.0), created at least partially by resist swelling effects which occur during development. The highest contrast negative photoresist tested at Motorola is Merck Selectilux, with a contrast value of 1.7.[24] Negative resists also suffer from oxygen sensitivity or reciprocity failure, which is manifested as a thinner resist image than expected because a competing nitrene/oxygen reaction takes place instead of the desired nitrene/resin reaction.

Negative I-line Resist Advances. These resists are needed in newer flows for back-end metal processing and front-end ion-implantation applications. To achieve speed and RIE compatibility, they are nearly always chemically amplified (CA) resists. (See Fig. 14.) So, why use them? They have speed, resolution, “Dual Damascene” metal process compatibility (i.e., where contacts and interconnects are imaged at once), immunity from DUV hardening requirements for RIE resistance and compatibility, and no EBR requirement. The latter advantage is unique to negative systems and saves chemical EBR costs. The Shipley I-300 resist is an example of this type of resist. It has high thermal stability, a non-HCl acid generating PAG, low PEB vs. dose sensitivity, reduced outgassing, and high speed (less than 150 mJ/cm²). Although these systems may not become volume manufacturing products, products like them will most likely evolve further.
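The throughput value of a fast resist follows from simple arithmetic: exposure time per shot is the required dose divided by the irradiance at the wafer. The sketch below is a rough illustration; the irradiance, shot count, and per-shot overhead are hypothetical values, not tool specifications from the chapter, and only the 150 mJ/cm² dose comes from the text above.

```python
def exposure_time_s(dose_mj_cm2: float, irradiance_mw_cm2: float) -> float:
    """Seconds per shot: dose (mJ/cm^2) divided by irradiance (mW/cm^2)."""
    return dose_mj_cm2 / irradiance_mw_cm2


def wafer_expose_time_s(dose_mj_cm2: float, irradiance_mw_cm2: float,
                        shots: int, overhead_per_shot_s: float) -> float:
    """Total stepping time for one wafer, including per-shot overhead."""
    per_shot = exposure_time_s(dose_mj_cm2, irradiance_mw_cm2)
    return shots * (per_shot + overhead_per_shot_s)


# ASSUMPTIONS (hypothetical): 500 mW/cm^2 at the wafer, 60 shots per wafer,
# 0.25 s of stage-move and settle overhead per shot.
IRRADIANCE = 500.0
SHOTS = 60
OVERHEAD = 0.25

for dose in (150.0, 100.0, 50.0):
    total = wafer_expose_time_s(dose, IRRADIANCE, SHOTS, OVERHEAD)
    print(f"dose {dose:5.0f} mJ/cm^2 -> {exposure_time_s(dose, IRRADIANCE):.2f} s/shot, "
          f"{total:5.1f} s/wafer")
```

Under these assumptions the exposure portion scales directly with dose, so halving the required dose shortens the stepping time per wafer until the fixed per-shot overhead dominates.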
Figure 13. Illustration of poor etch compatibility (reticulation or frying) of negative photoresist under AME 8000 etch conditions. (a) After AME 8110 etch, (b) after resist ash removal.
Figure 14. (a) Negative resist chemistry/reaction and (b) example negative resist profiles for Shipley 300 resist. (Figures courtesy of Shipley Co.)
2.2
Deep UV Resists
Deep ultraviolet (DUV) responsive resists, material formulations sensitive to light with 100–248 nm wavelengths, are becoming very important because the lithography tools requiring them are appearing in larger numbers (see Ch. 1). Lithographic alignment tools with excimer and other laser light sources are being delivered at a rapid rate, and modern fabs making sub-0.3 micron patterns usually run at least a mix of these tools alongside i-line steppers or field-size-matched i-line Step & Scan aligners. Conventional photoresists, as discussed earlier, are largely ineffective at these wavelengths because of the strong absorbance of the novolac resins and PACs involved. Work on the earliest DUV resists occurred mainly in Japan,[25] at AT&T Bell Labs,[26] and at Shipley and IBM.[27] Early DUV work with conventional positive resists focused upon minimizing absorption through isomer resin synthesis,[28] and this same basic design principle of maintaining good DUV resist transparency still holds true today. Since late 1997, higher performance commercial systems have been emerging to meet current and projected device fabrication needs; design activity for these materials at resist vendor labs was widespread and extensive in the 1990s and continues. In fact, nearly every major resist company now has multiple systems for sale. Shipley’s road map for 248 nm DUV resists calls for extension to 0.1 micron dimensions; advanced reticle and S&S illumination technologies are required to accomplish this, but it is still an impressive projection, extending image capability far below the exposure wavelength and to a k1 factor less than 0.5. Recently, Motorola announced devices made with 0.1 micron CDs employing 248 nm DUV lithography extended by PSM and OPC techniques.[29]

Positive DUV Resists.
The activity for these resists has focused mainly upon chemically amplified (CA) systems responding to 248 nm illumination. These systems are chemically amplified to provide sensitivity, primarily because the earlier aligners, Perkin-Elmer 500/600s and early SVGL Micrascan S&S systems, were sourced by Hg arc lamps with low emission at 248 nm. Chemical amplification was first proposed by Ito and Willson.[30] In their CA imaging system, a small amount of acid produced by the photochemical decomposition of a photoacid generator (PAG) induces a cascade of subsequent chemical transformations in the resist film, typically driven thermally on a hotplate after exposure. Although these systems were characterized by high sensitivity or speed, their widespread acceptance stems from their high contrast or resolution performance.[22][32] This attribute leads to stable microprocessor operating frequencies and DRAM memory applications through stable gate and contact resistance performance. Chemical equations in the references cited show that the acid is regenerated many times as the resin side linkage is deprotected by the acid-catalyzed reaction; hence the term chemical amplification.

Ten photoresists were extensively tested in the 1990s. They fall into the t-BOC (Shipley APEX-E), ESCAP (Shipley UV3 and UV6), and acetal (Olin Arch-200, TOK P007, 015 and 024, Shin Etsu SEPR 4103 PB50, Clariant AZ-DX1200 and Sumitomo PEK 101A6) design families. Of these resists, the reference is APEX-E, even though it possesses only moderate resolution, as verified by a linearity parameter of only 0.3 microns in 1 micron of resist. It has been used in thinner versions by many fabs to make advanced 0.25 micron device prototypes and is even used in limited production in MOS logic factories; it is also used extensively with a PVA top coat to lessen the effect of absorbed bases (see below). The performance results of the Shin Etsu and Clariant resists are most comparable with those of APEX-E, while the other systems perform to varying degrees vs. the APEX reference.[33]

Although these early systems possessed speed for semiconductor production throughput, and resolution, they all suffered from intermittent anomalous insoluble skin effects, or line bridging.[31] The culprit of this serious defect was airborne bases, which were prevalent in all semiconductor fabs at levels greater than 10 ppb. The effect was exacerbated by process delays between exposure and post-exposure bake (PEB), and between PEB and develop. Since these amine bases react with the acids generated by PAG photolysis, it is not surprising that the surface is rendered undevelopable and bridges, while the bulk of the resist is developed out below it (Fig. 15).
APEX-E, an early commercial tBOC system, was especially susceptible to this effect, and special processes and sequences were employed to make the early systems usable for device production, as listed below (Ref. 31 and references therein):
  i. Application of a protective coating, such as PVA
  ii. Equipment with base-removing filtered air
  iii. The incorporation of stabilizing additives
  iv. Selection of a strongly associating PAG
  v. Delayed generation of acid via PEB
  vi. Reduction of activation energy of deprotection
  vii. Reduction of free volume by higher temperature annealing at resist bake prior to exposure
Figure 15. SEM micrograph of T-topped resist images due to base contamination of the resist surface. (Taken from Y. K, H. Tomiyasu, M. Tsukamoto, T. Niinomi, Y. Tanaka, J. Fujita, T. Ochiai, A. Uedono, S. Tanigawa, SPIE, 3049, 238, 1997.)
Of these methods, the most promising has been the use of high-temperature resist pre-exposure bakes. Ito et al. have shown that the Tg of the resin polymer governs the deleterious absorption of base, and that the free volume reduction achieved by elevated bakes reduces, rather than fully prevents, these contamination levels. This also makes the resist more flow resistant during etch processing and other thermal treatments, so the higher bake is beneficial from an overall processing view. By careful synthetic copolymerization methods, Ito and coworkers were able to formulate experimental systems with base absorption levels reduced by four to five times and PEB delay resistance increased to hours; most high-performance modern ESCAP systems employ this design or process principle. Shipley data showing the improvements in post-exposure delay (PED) stability created by improved resin formulation and higher pre-exposure bakes are shown in Fig. 16; note that only slight T-topping is observed for the longest delay time. Kameyama et al.[34] have also seen T-topping in tBOC resists (Fig. 15 above). In addition, they found that base absorption depended upon the free volume of the resist film, which in turn depended upon the solvent employed. They further report that the acetal systems have generally improved base resistance, and that the acid generator should be a weak acid for reduced activation energies.
Figure 16. Linewidth vs. PED comparing the behavior of ESCAP resist UV5 to tBoc APEX resist. Note the SEM pictures corresponding to the respective process times at the top of the figure. (Figure courtesy of Shipley Co.)
PAG Effects. The photoacids have been reviewed for tBOC and ESCAP systems by Willson, and for acetal systems by Houlihan et al.[35] The choice of photoacid plays a critical role in determining the performance of both types of systems. Photoacids less susceptible to surface depletion, such as 4-methoxybenzenesulphonic acid, give acetal resist formulations having excellent PEB delay time latitude, high sensitivity, resolution, and focus latitude. PAGs must be bulky as well. The PAG must also be thermally stable and able to deliver enough acid to the resist matrix for complete functionality.
Resin Design and Mechanism. Acid-catalyzed deprotection has become the exclusive paradigm for aqueous-base-developed positive resists, and all the following new generations of resists are essentially variations on this same theme. These systems all accomplish the conversion of a hydrophobic dissolution-inhibiting functionality to a hydrophilic base-soluble one, as embodied by the original tBOC system.[31] The new generation resists are basically of three types:
  i. tBOC, poly(4-hydroxystyrene) (PHS) based, with onium salt PAG
  ii. Environmentally stable systems (ESCAPs), copolymers of PHS and t-butyl acrylate, with high bake temperature
  iii. Acetal protected resin systems[35]
The acetal, or ketal, systems are based on low activation energy protecting groups, where the deprotection reaction occurs at room temperature during exposure. Example terpolymer resists from Olin MM (now Arch) are shown in Fig. 17; by building higher molecular weights through the trans-acetalation reaction, the acetal protection is improved and enhanced thermal stability is also achieved. Although room-temperature deprotection simplifies the process by eliminating bake steps after exposure, it raises concern about volatile products, which has to be addressed by employing tools with a cover plate in the optics or a similar measure.
According to Jim Cameron of Shipley, these systems also undergo film shrinkage, have poor Rmax dissolution values, and may not be as etch compatible. The three types of systems can also be two- or three-component systems, where the third ingredient is a molecular dissolution inhibitor, as opposed to the inhibitor being primarily the resin polymer itself. The new generation resists must also be 0.26 N TMAH developable, because most volume fabs bulk-deliver developer and this one is the industry standard. To accomplish this, W. Conley et al.[36] needed a t-butyl acrylate comonomer concentration of greater than 25%; this ultimately led to Shipley UV-5 and later generations of ESCAP resists.
Figure 17. Structural diagrams for new DUV resins. (Taken from H.-T. Schacht, P. Falcigno, N. Münzel, K. Petschel, R. Schulz, A. Medina, T. Sarubbi, Polymeric Materials Science and Engineering, ACS, Vol. 77, p. 428, Fall Meeting, September 8-11, 1997, Las Vegas, Nevada.)
Most activity in recent years has focused on the promising ESCAP t-butyl methacrylate copolymer-like systems because of their resistance to airborne-base induced CD changes, lower dense/isolated feature biasing, and improved etch resistance. Even so, all modern fabs have installed expensive carbon filtration systems, and even then base concentrations are sometimes high; the combination of base filters and ESCAP formulations does lead to near-manufacturability. According to C. Grant Willson of the U. of Texas, a variety of CA systems have emerged, all of which have greatly improved environmental stability and excellent imaging performance.[37]
Shipley researchers and others (for example, see also Fig. 17) are focusing on resin system synthesis, as shown by the terpolymer structure of Fig. 18. They are tailoring the resin to be more aqueous developable, more thermally resistant to image flow, or dry-etch resistant, by incorporating PHS and styrene groups into the polymeric resin, respectively. For attenuated-PSM resists, Shipley researchers are targeting the resist design for minimal unwanted sidelobe formation (i.e., an interference pattern that prints in resist with no corresponding Cr pattern on the reticle) by building in high resist surface development inhibition; this is manifested empirically by an increased E95/Eo ratio, or c-value, as determined from the basic contrast curve. The larger the c-value, the greater the focus latitude without sidelobe printing. By carefully controlling the concentrations of the three repeating groups (the X, Y, and Z contents), they can tailor the resist to a particular application. These application-specific resist types are listed below:
  i. Contact/via and PSM
  ii. Line/space, gate-like high contrast and resolution
  iii. Isolated features on reflective and non-reflective substrates
  iv. Less-critical layers, fast resist
  v. Metal layer
  vi. Ion-implant with thick resists (>2–3 microns)
Figure 18. Example general resin terpolymer structural formula for Shipley experimental positive DUV resists, with emphasis on modifying the three monomer compositions to build in dry-process compatibility and high-resolution development characteristics. Note that the exact compositions are confidential to Shipley. (Figure courtesy of Shipley Co.)
Arai and Sato of Ohka Am.[38] divide the DUV resists into the three categories i, ii, and iv above. This layer-specific approach has yielded three products, TDUR P015PM, 10PM, and 09PM, respectively. An excellent example of imaging results for a potential resist product from JSR is shown in Fig. 19; this advanced system shows what DUV ESCAP resists can do in terms of resolution, and also provides an example of a gate resist with low proximity bias. Shipley also separates its products into memory-specific and logic-specific applications. This trend holds for i-line resists as well, although the customer really only wants to run one resist. Most volume i-line factories, however, commonly run at least two or even three different resists, so the concept of application-specific (AS) resists is not unrealistic. In fact, the trend of late is to swap out some DUV layers to i-line for cost, or COO, reasons.
Figure 19. SEM cross sections of an example advanced positive DUV resist. (Figure courtesy of Japan Synthetic Rubber.)
Resist Process Performance. The target process performance goals for DUV resists are listed below, from Streefkerk et al.:[33]
  i. RIE or overall process compatibility (not taken from SPIE ref.)
  ii. Resolution/photospeed: exposure dose < 17 mJ/cm² for 0.25, 0.225, and 0.175 micron lines and spaces
  iii. Process sensitivity: CD sensitivity to PEB < 5 nm/°C
  iv. Isofocal bias for dense lines < 10% of the nominal CD; zero is the goal
  v. Depth of focus (DOF) at 10% exposure latitude (EL) comparable to or better than APEX
  vi. Contamination sensitivity without RTC, 5 nm/half hour in 10 ppb NH3 environment
  vii. Iso-dense bias < 20 nm
  viii. BARC compatible (future requirements) but also compatible with highly reflective substrates
  ix. Acceptable behavior for both dense and isolated lines
Modern DUV positive resists all satisfy these criteria to varying degrees, but criteria (i), (ii), and (viii) are the most important to volume manufacturing.[39] In Fig. 20 from Kamberg et al., the improved CD vs. PEB temperature deviation behavior for ESCAP vs. tBoc resists can be compared. Note that the slope for the advanced DUV resists is always lower, i.e., better. It has also been observed that the slope of the CD vs. PED time curve gets steeper, i.e., worse, with higher base contamination levels (i.e., 10–40 ppb levels); this means that if there are base level excursions, the ESCAP systems can still function well compared with the older tBoc systems. All of the newer resists possess greater immunity to PEB delay times, as the data of Fig. 21 from Ref. 33 illustrate. The data for the ESCAP example resist are flat for 15 minutes, in contrast to those for the tBoc reference system. In Figs. 22 and 23 from Ref. 40, the improved iso/dense line bias is easily seen; note that Chen's data show greater separation for the tBoc resist, APEX-E, than for the example ESCAP system, Shipley UV5.
Almost all new generation resists satisfy number (v) above, due to improved design, but note that the reason for replacing APEX is not that it cannot perform, but more of a manufacturability and defectivity issue.
Figure 20. PEB temperature sensitivity of APEX-E (tBoc) vs. Shipley UV5 (ESCAP) to demonstrate the improved performance of the newer resists. Temperature sensitivity at the operating conditions is ~2 1/3 times worse for APEX than for UV5: APEX is 14 nm/°C, whereas UV5 is 6 nm/°C.
Figure 21. Post exposure delay vs. measured CD for Shipley APEX-E and UV5 resists. The CD change with post exposure delay (PED) for UV5 is ~2 nm over 15 minutes within an amine-free environment; for APEX it is ~7 nm over the same time frame.
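The numbers quoted in the captions of Figs. 20 and 21 can be turned into a rough CD error estimate. The sketch below is only an illustration: the hotplate temperature error and the delay time are hypothetical inputs, and scaling the 15-minute PED drift linearly with time is an assumption, not something the figures establish.

```python
# PEB temperature sensitivities quoted in the Fig. 20 caption (nm per deg C).
PEB_SENSITIVITY = {"APEX-E": 14.0, "UV5": 6.0}
# CD drift over a 15-minute post-exposure delay quoted in the Fig. 21 caption (nm).
PED_DRIFT_15MIN = {"APEX-E": 7.0, "UV5": 2.0}

def cd_shift_from_peb(resist, delta_t):
    """Linear estimate of CD change (nm) for a PEB temperature error delta_t (deg C)."""
    return PEB_SENSITIVITY[resist] * delta_t

def ped_drift(resist, delay_min):
    """CD drift (nm) for a given delay, assuming linear scaling of the 15-minute value."""
    return PED_DRIFT_15MIN[resist] * delay_min / 15.0

HOTPLATE_ERROR = 0.5   # deg C, hypothetical hotplate non-uniformity
DELAY = 10.0           # minutes, hypothetical exposure-to-bake delay

for name in PEB_SENSITIVITY:
    peb = cd_shift_from_peb(name, HOTPLATE_ERROR)
    ped = ped_drift(name, DELAY)
    print(f"{name}: PEB {peb:.1f} nm, PED {ped:.1f} nm, worst-case sum {peb + ped:.1f} nm")

ratio = PEB_SENSITIVITY["APEX-E"] / PEB_SENSITIVITY["UV5"]
print(f"APEX-E / UV5 PEB sensitivity ratio = {ratio:.2f}")
```

The ratio reproduces the "~2 1/3 times" figure in the Fig. 20 caption; the absolute CD shifts depend entirely on the assumed temperature error and delay.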
Figure 22. CD data for Shipley APEX-E vs. defocus for grouped and isolated line features. (Data courtesy of Dr. Gong Chen of Motorola MOS 12.)
Figure 23. CD data for Shipley UV5 vs. defocus for grouped and isolated line features. (Data courtesy of Dr. Gong Chen of Motorola MOS 12.)
Olin and IMEC researchers reported at OCG 1996 no gain in exposure latitude or DOF from annular illumination exposures of their ARC DUV system. Their results were good, i.e., had large process windows, at NA 0.57 and sigma of 0.6 for L/S geometries. Optical proximity effects were not alleviated by going to off-axis illumination without losing process latitude; OPC is therefore recommended at 0.2 micron CDs.
Although earlier tBOC resists exhibited resist footing problems on inorganic ARC layers and other substrates, the modern ESCAP resists have a lower tendency for this. There are also surface treatments that improve this situation, but cost is always an issue for an extra process step. Ashing on TiN ARC is an example of this type of added process; it effectively prevents footing for ESCAP resists on that and other nitride substrates, and is standard practice in some DUV fabs.
Negative DUV Resists. The first commercial CA resist was actually a negative-acting system, Shipley’s SNR200. IBM’s early use of the tBOC negative resist, containing triphenylsulfonium hexafluoroantimonate, for 1-Mbit DRAM production with Perkin-Elmer DUV scanners in UV2 mode set the stage for modern DUV lithography applications.[31] The negative systems depend upon acid-catalyzed condensation polymerization, which can be viewed simply as polymeric crosslinking, to function. As a result, the ratio of their unexposed to exposed dissolution rates is very high. These systems are less susceptible, but not totally immune, to the PEB delay problem. They also develop in 0.14 N TMAH, a dilute developer, which is a major problem for mainstream applications in a fab running existing i-line processes. Negative systems have, however, been developed to be compatible with 0.26 N TMAH, the fab standard. They also enjoy the advantages of better isolated image formation, due to improved dark-field aerial images, and good resist image thermal flow resistance. Use of these systems has been somewhat limited; their primary use has been as a low-volume e-beam resist application (see Ch. 7).
193 DUV Resist Design Principles. The 193 nm DUV wavelength is the next logical wavelength after 248 nm, and a laser source is available as discussed in Ch. 1. S&S tools operating at this wavelength may be capable of imaging to dimensions less than 0.14 microns. Designers of 193 nm resists will be able to use 248 nm tools to develop new 193 nm resists, because of the similarity in CA chemistry, and a new multiple-wavelength resist may emerge as a result.
Historically, resist development tends to lag tool development, but 193-nm resist solutions are coming on strong. A worldwide effort over the last five years has created new classes of resists. Allen at IBM’s Almaden Research Center notes, “Much of today’s emphasis is on the cyclic olefin chemical platform, which affords an excellent way to transition the industry towards an entirely new chemical platform.”[41] Cyclic olefin polymer materials offer the potential for superior etch properties, surpassing the most robust 248-nm resists. In terms of etch resistance for polysilicon and oxide etching, cyclic olefin polymer materials approach, if not surpass, novolak-based mid-UV resists.
Allen of IBM[42] has outlined the design criteria for resists functioning at 193 nm. They must have the following characteristics:
  i. Optical transparency of the order of 0.5/micron of resist, in the same range as that for 248 nm resists
  ii. High resolution and single layer resist usage
  iii. Photospeed high enough for large-wafer throughput (5–50 mJ/cm²)
  iv. 0.26 N TMAH base developable
  v. Dry-etch resistance, e.g., be alicyclic-containing copolymers
The design of 193 nm resists is hampered by the lack of a base resin equivalent to PHS. Multiple choices are available for a polymer backbone to build the resist around; none, however, has emerged as the drop-in replacement for PHS. Selecting the proper PAG is also more complicated for 193 nm resists, because PAGs used in 248 nm products are ineffective when formulated with 193 nm resist polymers. Allen attributes this to the lack of electron transfer sensitization for those polymers, presumably because of the absence of the phenolic hydroxyl groups.
Willson of the U. of Texas has been working on 193 nm CA resists based on norbornyl-containing tetrapolymers. These researchers were among the first to recognize the potential of cyclic olefin polymers. These systems can have RIE etch rates better, i.e., slower, than standard photoresists, along with good thermal stability. Even though substantial progress has been made with these systems, they are, like 193 nm DUVL in general, years away from production applications. Furthermore, the jury is still out on whether the resist at 193 nm will be surface imaging or fully imaged like the CA resists originally invented by Willson and coworkers.
Om Nalamasu, technical manager for optical lithography and imaging materials research at Lucent’s Bell Labs, notes that for 193-nm lithography, the combination of etch stability and reflectivity control is particularly important because of the relatively thin resist thicknesses (0.6–0.4 µm) required for 150- and 130-nm design rules. Nalamasu says, “because of the inherent ‘paradigm shift’ in technology with 193-nm DUV resist development, many of the issues have been addressed up front and this is speeding up progress.” Murrae Bowden, R&D director for Olin Microelectronic Materials, says, “These concepts are being actively exploited by resist vendors with first-generation products expected early in 1999, along with the first tools.”
At the recent Interface 1998 OMM meeting, U. Okorananyanwu of AMD presented a paper on progress in 193-nm technologies, which reported 27 different formulations of single- and bilayer resists being developed for 193 nm. According to Okorananyanwu, there are at least 890 variations of alicyclic, acrylate, hybrid, and Si-containing bilayer formulations. While the industry seems to be converging on a few systems with similar properties, Okorananyanwu reported that none of them is entirely adequate for 130-nm production, many showing inadequate etch resistance, excessive line-edge roughness, and out-gassing. More hopefully, Francis Houlihan of Lucent Technologies described a material strategy combining a low-volatility photoacid with a norbornene maleic-anhydride polymer and a photodecomposable base to achieve an etch resistance 45% better than that of i-line resists. Houlihan predicted that this system may be useful as a crossover resist for 248-nm, 193-nm, and even SCALPEL exposure technologies.[43]
Perhaps the most challenging of the problems is the issue of resist out-gassing and subsequent contamination of optical elements.
Ron Kunz of MIT-Lincoln Lab, a leading site for 193-nm research, indicated that the out-gassed material is a complex mixture of low molecular weight photoproducts, which can be oxidized and bonded chemically to the lens surfaces by DUV light.[44] The problem has affected several developmental exposure tools, which have required extensive cleaning. Rumors are also circulating that lenses have been ruined by contamination from low activation energy resists, which are the most susceptible to the problem. Stepper lenses have always been vulnerable to resist out-gassing, and cleaning procedures of one type or another are standard, but the issue takes on added significance at 193 nm because of the extremely close tolerances of the calcium fluoride optics. Even a few nanometers of chemical buildup (which would occur in a year’s production even if today’s out-gas levels were drastically reduced) can result in light scattering, image flare, aberrations, and a decrease in ultimate resolution. This is an especially difficult problem given that 193-nm systems will be asked to begin their work at device geometries substantially smaller than the 193-nm wavelength, and to extend a generation or two beyond.
2.3 Radiation Resists
Introduction. The term radiation resist refers generically to materials that function under exposure to ionizing radiation, that is, radiation with short wavelengths such as soft x-rays, electron beams, and ion beams. Since only the relative sensitivities, and not the resist process, are changed when the radiation source is changed, the discussion in this chapter will be restricted to e-beam resists only. The basic resist mechanisms are unaffected by a change of ionizing radiation source, even though the energy absorption mechanisms may change significantly. The main demand for these resists is in the area of e-beam fabrication of Cr-patterned glass or quartz photomasks and reticles.[45] For these resists, the energy is absorbed much differently than for photoresists. Here, enough energy is available to cleave any bond in the resist and initiate any possible reaction, whereas for photoresists only absorption at specific chromophoric sites in the resist can result in chemical reactivity. Even though the energy absorption is more uniform with depth for radiation resists and seemingly nonspecific, the radiation chemistry results are surprisingly specific; hence, design criteria exist and will be reviewed.
Energy Absorption Considerations. Resist atomic composition can influence both resolution and energy absorption. For electrons with the 10–50 keV energies typical of e-beam lithography, the energy loss per unit path length is linear with resist density, with the quantity Z/A (where Z and A are the average atomic and mass numbers for the resist), and with the term ln E.[46] The energy absorbed depends primarily upon the energy of the beam, but resist compositional effects are most favorable for resists with the greatest H content, where Z/A is greatest. Since Z/A approaches 0.5 rapidly for elements with Z greater than five, resists with high atomic numbered compositions may actually have their energy transfer per unit length penetrated vertically reduced by up to ~30%.
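The Z/A argument can be checked numerically for a few resist repeat units. The sketch below is illustrative only: it takes the composition-averaged Z/A as the total number of electrons divided by the total mass number of the repeat unit, and the repeat-unit formulas are the standard ones for these polymers rather than values given in this chapter.

```python
# Atomic number Z and mass number A of the most abundant isotope.
ELEMENTS = {"H": (1, 1), "C": (6, 12), "O": (8, 16), "S": (16, 32)}

def z_over_a(formula):
    """Electrons per nucleon, sum(n*Z) / sum(n*A), for a repeat-unit formula.

    `formula` maps element symbols to atom counts, e.g. {"C": 5, "H": 8, "O": 2}.
    """
    z_total = sum(n * ELEMENTS[el][0] for el, n in formula.items())
    a_total = sum(n * ELEMENTS[el][1] for el, n in formula.items())
    return z_total / a_total

# Standard repeat-unit formulas (assumed here, not taken from the text).
REPEAT_UNITS = {
    "PMMA (C5H8O2)": {"C": 5, "H": 8, "O": 2},
    "PBS (C4H8SO2)": {"C": 4, "H": 8, "S": 1, "O": 2},
    "polystyrene (C8H8)": {"C": 8, "H": 8},
}

print(f"pure H: Z/A = {z_over_a({'H': 1}):.3f}")
print(f"pure C: Z/A = {z_over_a({'C': 1}):.3f}")
for name, unit in REPEAT_UNITS.items():
    print(f"{name}: Z/A = {z_over_a(unit):.3f}")
```

Only hydrogen lifts Z/A above 0.5, so the spread among practical organic resists is small compared with the scattering penalties of heavy substituents discussed next.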
More importantly, lateral scattering and backscattering, the effects which ultimately limit resolution in e-beam lithography, increase dramatically for higher atomic numbered substituents, and hence such substituents should be avoided by resist designers.[47][48]
Positive Resists. Positive resists have historically been designed as single component polymer systems. The classic examples are poly(methyl methacrylate) and poly(butene sulfone), PMMA and PBS, respectively. Their radiation chemistry was well known prior to their application as resists. Both were known to degrade, that is, to produce large quantities of gaseous radiation byproducts (CO, CO2, and SO2), as well as to exhibit reduced molecular weights under gamma-ray and e-beam exposure. In fact, both do function as positive e-beam resists,[49][50] and the latter material is the major positive e-beam resist in use today, mostly in mask shop applications. In both resists, the design principle is the incorporation of groups that are thermodynamically favored to be split out of the molecule when irradiated; for PBS that group is in the main polymer chain, while for PMMA it is in the ester side chain of the molecule. Later positive resist molecular designs involved derivatives of PMMA, namely substituted acrylates and methacrylates. These systems are represented by the general vinyl polymer structural formula:
    -[CH2-CX(CO2Y)]-        Eq. (2)
where X could be CH3 as in PMMA or any electron-withdrawing group such as a halogen or CN group, and Y could be any alkyl or halogenated alkyl group. When polymers with this general structural formula are employed, their Gs values (polymer main chain bond scission yields per 100 eV of exposure dose) for degradation are large, and their Gx values (intermolecular bonds formed, i.e., crosslinks, between polymer molecules per 100 eV of dose) are nearly zero in many cases.[51][52] The term Gs is analogous to φ for photoresists. It determines to first order the sensitivity of the resist and is approximately inversely proportional to the e-beam sensitivity.[47] Actually, the most important parameter for determining resist tone is Gs/Gx. When Gs/Gx is ≥4, the resist will predominantly degrade and hence be a positive resist. The ultimate in substituted acrylates was designed and synthesized, and is currently being manufactured, by Toray in Japan as EBR-9, where X is Cl and Y is CH2CF3. This resist rivals PBS as a sensitive positive e-beam resist for mask making purposes, because both the X and Y substituents increase the radiation susceptibility of the resist.[47] The methacrylate polymer resist materials are characterized by fair to good sensitivity, and poor to good contrast or resolution. Unfortunately, all of these resists suffer from poor dry etch compatibility, and their use in lithography is restricted to mask-making only (see Table 1). This market is, however, reasonably large, since most masks require a master reduction reticle for step-and-repeat fabrication or are MEBES master plates themselves.
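The roles of Gs and Gx can be sketched in a few lines. The tone rule below is the Gs/Gx ≥ 4 criterion quoted above. The molecular weight function uses the standard random-scission relation 1/Mn = 1/Mn0 + Gs·D/(100·N_A), with D the absorbed dose in eV per gram and crosslinking neglected; that relation is not stated in this chapter and is included here as an assumption. All numerical inputs are hypothetical.

```python
AVOGADRO = 6.022e23  # molecules per mole

def resist_tone(gs, gx):
    """Classify tone with the Gs/Gx >= 4 rule of thumb quoted in the text."""
    if gs <= 0:
        return "not positive"            # no main-chain scission at all
    if gx == 0 or gs / gx >= 4:
        return "positive"                # scission dominates
    return "not predominantly positive"  # crosslinking competes

def mn_after_exposure(mn0, gs, dose_ev_per_g):
    """Number-average molecular weight after random main-chain scission.

    Assumed relation: 1/Mn = 1/Mn0 + Gs*D/(100*N_A), with the absorbed
    dose D in eV per gram and crosslinking neglected (Gx ~ 0).
    """
    return 1.0 / (1.0 / mn0 + gs * dose_ev_per_g / (100.0 * AVOGADRO))

# Hypothetical inputs chosen only to exercise the functions.
GS, GX = 1.3, 0.0
MN0 = 1.0e5      # g/mol before exposure
DOSE = 1.0e21    # absorbed dose, eV per gram

print(resist_tone(GS, GX))
print(f"Mn after exposure: {mn_after_exposure(MN0, GS, DOSE):.3g} g/mol")
```

Because the scission term scales with the product Gs·D, doubling Gs halves the dose needed for the same molecular weight reduction, which is the inverse proportionality between Gs and exposure dose noted above.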
Table 1. Methacrylate-Based Positive E-Beam Resists

RESIST                      SENSITIVITY (µC/cm²)   CONTRAST        DRY-PROCESS COMPATIBILITY
PMMA (IBM)                  80                     4               NO (see Table 9)
EBR-9 (JAPAN)               12                     1.4             NO
FBM-120 (JAPAN)             5                      1.4             NO
HP POS CROSSLINKED PMMA     40                     3               NO
PBS (AT&T)                  1                      2               NO
PMCN (ARMY)                 12                     1               OK
PMCA (ARMY/HONEYWELL)       16                     NOT MEASURED    NO
Direct-write e-beam positive resists, i.e., resists used to fabricate devices directly on silicon wafers, must be hardier or more dry process compatible than the mask making resists above. As such, they have been restricted to two basic resist types: conventional positive photoresists and novolac/sulfone copolymer blends. Table 2 contains a representative list of these systems and their resist characteristics. Generally, these resists are less sensitive but are dry process compatible, and semiconductor devices can actually be made directly with them on silicon wafers.
Fahrenholz et al. of AT&T[53] and Hunt Chemical researchers[54] have been actively involved in designing novolac-based positive e-beam resists for direct-write circuit fabrication. The AT&T activity has focused upon two component polymer blend systems where the dissolution inhibitor is a poly(alkene sulfone) like PBS[55] and the novolac binder resin is designed to actually degrade with the inhibitor and have minimal concomitant crosslinking.[53] The idea here is to minimize the competing crosslinking reaction from the resin, which tends to counteract the positive action occurring in the degrading sulfone. Novolacs with bulky substituents on the phenyl ring (n-propyl, sec-butyl, and phenyl) all produce positive acting novolac resins without any dissolution inhibitor at all. Although images of these uninhibited resins do not clear to the substrate without extensive resist loss in the unexposed areas, they provide a substantial advantage over more conventional resins that crosslink extensively over the entire dose range, which tends to counteract the positive behavior in the two component resist. Other conditions which promote positive behavior are post-exposure, pre-develop exposure to air (oxygen), higher resin molecular weights (limited by solubility), and stronger base developers. As with the AT&T resists, the Hunt system also contains a proprietary resin which contributes significantly to the positive e-beam behavior.
Table 2. Novolak-Based Positive E-Beam Resists

RESIST             SENSITIVITY (µC/cm²)   CONTRAST   DRY-PROCESS COMPATIBILITY
HUNT-204           50                     2.9        YES (0.5)
PC-129             200                    4          YES (0.2)
HITACHI NPR        14                     0.7–0.9    YES
AZ-2400 (IBM)      10                     -          YES IF DUV
                   25                     2.5
                   50                     1.6
HUNT-1182          25                     1.3–1.6    UNKNOWN
ALLIED-6010        25                     1.2–4      YES
HUNT WX-214        15                     2.5–4      YES IF DUV
Negative E-beam Resists. Like positive resists, these resists can be conveniently classified as mask-making or direct-write (see Tables 3 and 4). Basically, these resists are direct-write compatible when they are aromatic in nature, that is, when they are polystyrenes or naphthalenes or their derivatives, and mask-making compatible when they are vinyl polymers without any unsaturated bonding except at the polymer crosslinking site. Mask-making resists generally possess very good sensitivity, but low contrast or resolution (see Table 3 for examples). Their applications are restricted to making 5X or 10X reduction reticles where larger features (>4–5 microns) are required. These resists withstand wet etching of thin chromium films, but also require descum processing prior to etch. They are basically useless for device fabrication applications using direct-write e-beam due to their poor plasma etch resistance.[56] This is due to a general lack of selectivity to harsh RIE treatments and to the swelling behavior exhibited by these materials at feature sizes below 1 micron, the size domain where direct-write e-beam techniques are needed to improve circuit packing density.
Table 3. Mask-Making Negative E-Beam Resists

RESIST                       SENSITIVITY (µC/cm²)   CONTRAST   DRY-PROCESS COMPATIBILITY
AT&T COP                     0.8                    1.0        NO AND SWELLING
SEL-N (JAPAN)                2                      1.2        NO AND SWELLING
KODAKS                       0.5                    0.8–1.3    NO PLUS SOME SWELLING PLUS EDGE-SCALING
CONVENTIONAL PHOTORESISTS    1–3                    0.7–1.2    NO AND SWELLING
The dry-process compatible resists shown in Table 4 generally possess reduced sensitivity, but higher contrast and resolution. The sensitivity trade-off, however, is not completely prohibitive (i.e., reasonable exposure levels <10 × 10⁻⁶ C/cm² can be employed). Most importantly, these materials are less susceptible to pattern swelling during development, and submicron images are easily obtained. As for the mask-making resists, these materials must also be descummed after development for best resolution performance. Figure 24 illustrates the effect of the oxygen RIE descum on the negative resist image foot at the base of the example image. The alpha-M-CMS, CMS, polystyrene, and AZ tone-reversed positive photoresist systems have all been used in direct-write applications to fabricate high performance MOS and bipolar circuits.
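To put the quoted exposure levels in perspective, an areal dose can be converted to a number of electrons per unit area and to a minimum exposure time. The sketch below is a unit-conversion illustration only; the written area and beam current are hypothetical values, not tool specifications from this chapter.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs per electron

def electrons_per_um2(dose_uc_per_cm2):
    """Convert an areal dose in microcoulombs/cm^2 to electrons per square micron."""
    coulombs_per_cm2 = dose_uc_per_cm2 * 1e-6
    electrons_per_cm2 = coulombs_per_cm2 / ELEMENTARY_CHARGE
    return electrons_per_cm2 / 1e8   # 1 cm^2 = 1e8 um^2

def write_time_s(dose_uc_per_cm2, area_cm2, beam_current_na):
    """Minimum exposure time in seconds: total charge divided by beam current."""
    charge_c = dose_uc_per_cm2 * 1e-6 * area_cm2
    return charge_c / (beam_current_na * 1e-9)

AREA = 1.0       # cm^2 of exposed pattern, hypothetical
CURRENT = 100.0  # nA of beam current, hypothetical

for dose in (10.0, 80.0):   # microcoulombs/cm^2
    print(f"{dose:5.1f} uC/cm^2: {electrons_per_um2(dose):.3g} electrons/um^2, "
          f"{write_time_s(dose, AREA, CURRENT):.0f} s for {AREA} cm^2 at {CURRENT} nA")
```

Exposure time scales linearly with dose at fixed current, which is why the sensitivity trade-off matters for direct-write throughput.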
Table 4. Dry-Process Compatible Negative E-Beam Resists

RESIST                   SENSITIVITY (µC/cm²)   CONTRAST   DRY-PROCESS COMPATIBILITY
POLYSTYRENE              80                     2.0        YES (0.05)
CMS-EX-S (JAPAN)         1.7                    1.6        YES (<0.1)
MES-E (JAPAN)            1.5                    1.5        YES (<0.1)
CMS-EX-R (JAPAN)         17                     2.2        YES
ALPHA-M-CMS-S (JAPAN)    10                     2.0        YES
AZ-1450                  100                    1.4        YES (<0.5)
-1450 (+300)             160                    2.4
-1350 (+300)             60                     3.2
GMC (AT&T)               4                      1.6        YES

[Figure 24 panels: (a) with descum, (b) without descum; scale bar 1 micron]
Figure 24. Illustration of line edge improvement via reactive-ion-etch oxygen descum processing on negative e-beam resist images: (a) with descum, (b) without descum.
Negative e-beam resists are designed by incorporating radiation crosslinking groups into what are usually single component vinyl polymer resists.[1][2] These appendage groups range typically from alpha hydrogen or halogen, to side chain epoxy[57] and allyl,[58] to halogenated alkyl groups attached to styrene[59] or acrylate esters[60] (Fig. 25). These groups are all very radiation susceptible, and their incorporation into the resist leads to easily crosslinkable polymers with high Gx values and, hence, good e-beam sensitivities (≤5 × 10⁻⁶ C/cm²).
Figure 25. Polymeric crosslinking active sites for negative e-beam (or radiation) resists.
Recent advances in negative e-beam resist design have been in the areas of polymer blending[61] and chemical amplification.[62] The blending technique, similar to that for two-component positive e-beam resists, allows the preparation of a sensitive high contrast resist from a polymer with poor sensitivity. The requirement for the insensitive material is that it possess a good electron donating ring substituent for improved H-abstraction induced crosslinking efficiency with the co-blended chloromethylstyrene. The latter technique, developed at IBM, involves radiation-induced acid formation in the resist to catalyze crosslinking or induce degradation; hence, both positive and negative behavior can be designed. An example negative behaving system is now marketed by Shipley as SAL 601 EBR-7 or later improved derivatives.[63] (See Ch. 7 also.)
The limitation of negative e-beam systems stems primarily from the same property that gives them their advantage. Sensitive crosslinked or gelled polymers are also very susceptible to developer solvent swelling (see Fig. 26) due to the three dimensional crosslinked networks formed in the irradiated polymer regions. Hence, these resists generally suffer from reduced resolution and are highly susceptible to proximity effects (cooperative exposures which occur due to backscattered electrons from adjacent lines). As a result, these systems will probably always be limited to high pattern area coverage layers requiring somewhat lower resolution, as is typical of many metal layer interconnect requirements. The new Shipley acid-catalyzed resists, however, are less susceptible to this resolution limiting effect because they are base developed novolac systems (recall that positive novolac resists are non-swelling).
Figure 26. Optical micrographs illustrating the snaking (developer solvent swelling) behavior of negative e-beam resists: (a) PCMS, (b) COP.
2.4
Future Resists
Conventional positive photoresist technology combined with multilevel processing techniques (see applications and special processes section) will inevitably allow the lithographic community to achieve 0.5 micron or below design rules without the extensive use of resists responsive to ionizing radiation.[64] The future of optical lithography beyond 0.1–0.25 micron will require extensive development in the areas of deep-UV resists, where new resins and PAGs, or single component resists, and even new resist mechanisms will be required. New advances, such as those demonstrated in DUV binder resin design by Turner et al.[65] and the chemical amplification resist mechanism by Willson et al.,[62] or combinations of theoretical lithographic simulation[66] and statistical design of experiments[67] to achieve the ultimate resist design, as demonstrated by Monohan et al., will evolve to provide the next generation photoresists for the next century. These new resists will probably be employed as top layer imaging systems for multilayer resist processes, which are emphasized because of the sharp increase in numerical aperture, and hence decreased depth of focus, of proposed advanced optical lithographic systems.[64]

Future DUV resists for 157 nm applications will be delayed, because the laser operating at that wavelength is years away. That technology, when available, will be capable of resolution below 0.08 micron. As for the EUV resists (see below), it too may end up being a bilayer process.

Future resists for EUV technology will probably come from multilayer systems as projected above, with the imaging layer being a thin top layer on a hard mask. The 193 and 157 nm resists will probably be at least candidates to consider. Contract work on these systems started in early 1999. Of course, transparency, adhesion, aspect ratio, defects, and overall process compatibility are, and always will be, concerns for the systems of this developmental work. According to C. G. Willson, the EUV tool resist of the future may require properties not found in current resists, requiring a very complex compound resist chemical.[68] A test stand exposure unit for EUV development may be available in late 1999, with the first beta-site EUV tools available in approximately the 2002 timeframe, and with production tools predicted to be available in 2007–2010.[68]
3.0
RESIST PROCESSING
3.1
Resist Parameter Screening
Before a resist, commercial or otherwise, can be introduced into a device fabrication process flow, it must demonstrate high performance in a battery of fundamental tests. Completing these tests by no means provides an optimized resist process ready to be plugged into the production line. That comes later, but the results do give the processing engineer on the IC fab line a basis for selecting one resist over another in a quantitative, impartial way. Usually, the results for a new material are compared to those of an existing baseline process, whose capability may have become insufficient at one or more critical levels. Usually, the resist vendor has spent a lot of time selecting a suitable developer for the resist to be tested, and that recommendation should be used at least initially for performance screening purposes; most resists, however, even modern resists, are designed to develop in 0.26 N TMAH. The investigating engineer should nevertheless make clear to the vendor what performance is actually being sought: high contrast, speed, metal-ion-free or surfactant-free base developer, thermal stability, or whatever else applies. The device requirement will usually dictate the type of developer selected for testing.

Photoresist Parameter Screening: Sensitivity and Contrast. A resist is characterized functionally by six basic parameters. The first three, the Dill A, B, and C parameters,[69] are related to the exposure absorption characteristics, while the second three, E1, E2, and E3, are related to the empirical dissolution characteristics of the resist in the specific developer system.[70] All six parameters are easily measured empirically, as outlined in Refs. 69 and 70, and can be employed in modeling packages to calculate image profiles and other important quantities. The parameters A and B are related to E, the resist absorptivity, by E = AM + B, where M is the local PAC concentration.
The parameters A and C are related: both are determined by the photoactive component quantum efficiency, φ,[16] track each other in value, and together determine the resist sensitivity. A is also increased by resist photoactive component (PAC) loading,[71] usually 15–25% by weight of the resist formulation. B is determined by the transparency of the PAC ballast molecule, the transparency of the formulation resin, and whether absorbing dyes are present in the resist formulation. Table 5 contains some example resist parameter data. Older resists are represented in Table 5 by the OFPR 800 data. They are characterized by lower contrast or resolution, poorer CD linearity (i.e., larger values), and lower A and greater B parameters. Notice that the contrast, linearity, and depth of focus (DOF at 0.5 µm CD) values of the table are all well correlated.
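The absorptivity relation given above (absorptivity = AM + B) can be illustrated with a short sketch. It assumes that M is the PAC concentration normalized to its unexposed value (1 for unexposed resist, 0 for fully bleached resist) and applies a simple Beer-Lambert estimate that ignores substrate reflections and standing waves. The function names and the 1.0 µm film thickness are illustrative choices; the A and B values are those listed for Shipley 500 in Table 5.

```python
import math

def absorption_coefficient(a, b, m):
    """Dill-type absorptivity A*M + B in 1/um.

    m is the relative PAC concentration (assumed normalized: 1 = unexposed, 0 = bleached).
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie between 0 and 1")
    return a * m + b

def transmittance(a, b, m, thickness_um):
    """Beer-Lambert transmittance of a film with uniform m (no reflections or standing waves)."""
    return math.exp(-absorption_coefficient(a, b, m) * thickness_um)

# Illustrative inputs: A and B from the Shipley 500 row of Table 5, assumed 1.0 um film.
A, B, THICKNESS_UM = 0.85, 0.05, 1.0
t_unexposed = transmittance(A, B, 1.0, THICKNESS_UM)  # PAC intact, film most absorbing
t_bleached = transmittance(A, B, 0.0, THICKNESS_UM)   # PAC converted, only B absorbs
print(f"unexposed: {t_unexposed:.3f}, bleached: {t_bleached:.3f}")
```

The gap between the two transmittances shows why a large A (bleachable absorption) paired with a small B (residual absorption) is associated with the higher performance resists in the table.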
Table 5. Resist Performance Parameters

Resist System         A        B        Contrast   Linearity   DOF @ 0.5 µm   Swing Curve   Swing Curve
                      (µm⁻¹)   (µm⁻¹)              (µm)        CD (µm)        Amplitude     Slope
                                                                              (mJ/cm²)      (mJ/cm²/kÅ)
OFPR 800 (Ito)        0.5      0.11     ~1.3       0.6         NM             NM            NM
System 9              0.44     <0.06    3.4        ~0.5        ≈0.5           60            16.2
Shipley 500           0.85     0.05     3.8        0.4         2              31            6.4
Shipley 518L†         ≥0.85    >0.05    <SPR 500   NM          NM             25            7–9
JSR 500               0.75     0.069    7          0.4         2              40            8–9
OCG 897i              0.73     0.07     4.3        0.45        2              38            9
PFI-D11A (undyed)*    0.92     0.50     NM         NM          1.20           NM            NM
MC-231A*              0.76     0.22     NM         NM          1.65           NM            NM
MC-232A*              0.76     0.34     NM         NM          1.50           NM            NM
MC-233A*              0.76     0.46     NM         NM          1.50           NM            NM
MC-234A*              0.76     0.58     NM         NM          1.35           NM            NM

NM - No measurement and/or equipment was not a wafer stepper.
† - Dyed Shipley 500; B is > B for SPR 500.
* - Sumitomo dyed resist data courtesy of M. Hanabata.
Choosing a resist for any layer before metal 1, or for any MLM layer application beyond first metal, requires a careful, judicious selection based upon the six parameters above. The choice also depends upon the lithographic requirement; i.e., the resolution or contrast required, the topography and composition of the substrate, the reflectivity of the substrate or wafer, the speed required for large wafer throughput, and, of course, the processing cost. Older IC fab lines utilize less expensive resists with typically lower contrast or lower PAC q values, smaller A and larger B parameters, while new fab lines requiring higher metal or via layer resolution will employ resists with greater values of contrast and q, and lower B and greater A parameters. Resists with very nonlinear resist dissolution characteristics, as demonstrated in Refs. 14 and 72, and low A and B parameters, are required for advanced high resolution pre-metal fabrication facility applications. Nonlinear characteristics are primarily driven by the esterification degree of the PAC (i.e., q values greater than one) (Fig. 2). The optimum Dill A value has been established for newer advanced resists at between 0.6 and 0.8, as demonstrated in Ref. 72. B values for newer undyed resists range between 0.02 and 0.06 µm⁻¹, while C values typically range between 0.01 and 0.03 cm²/mJ. Via resists suitable for advanced applications usually have a low B value, while for older devices with greater than 1.5 µm dimensional resist requirements lower A and greater B values are more typical. Resist contrast is inversely related to the sum of the A and B parameters,[73] both corrected for local concentration; and, as we will see in later sections (see Sec. 5.4), there are dyed resist systems with higher B values (Table 5) that are still not capable of suppressing substrate reflectivity effects upon linewidth CD control, so that more complex processing must be employed (see multilayer process section).
Process Swing Curve Evaluations for Operating Points due to Reflective Interference and Bulk Effects. The basic data needed to establish resist thickness requirements (for example, for lithographic testing) are found in Figs. 27–30.[74] These data must be generated because of the need to know how the resist system responds to reflective interference between light reflected from the test substrate and forward-traveling light within the film. To generate the data of Figs. 27–30, resist dispense spin speeds are varied from 4500 rpm to 6400 rpm in 100 rpm increments on silicon test wafers (i.e., no thin films present). The pre-exposure resist thicknesses are measured at each spin speed on a Prometrix FT-600 or later version KLA-Tencor tool (e.g., 1050 or 1200 series). Each wafer with a different resist thickness is then exposed on a wafer stepper with a 10 × 10 exposure matrix starting at an exposure dose of 50 mJ/cm² with an increment of 1–2 mJ/cm². This enables the exposure-to-clear value, Eo (the dose of the first exposed square where the resist is completely developed), to be determined for each thickness after development (see Figs. 31 and 32). For reference, actual high resolution MLM images are usually generated at exposure values ≥ 2Eo.
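The Eo determination described above can be sketched as a simple scan over the open-frame fields of one wafer. This is a minimal illustration, not the procedure used to produce Figs. 27 and 28; the thickness readings are hypothetical, and the function name and the 2 mJ/cm² step are assumptions chosen from within the stated 1–2 mJ/cm² range.

```python
def dose_to_clear(remaining_thickness, start_dose, dose_step):
    """Return Eo: the lowest dose whose open-frame field is completely developed.

    remaining_thickness holds the post-develop resist thickness of each field,
    ordered by increasing dose. Returns None if no field cleared.
    """
    for index, thickness in enumerate(remaining_thickness):
        if thickness <= 0.0:
            return start_dose + index * dose_step
    return None

# Hypothetical post-develop readings (kA) for the first eight fields of a matrix
# that starts at 50 mJ/cm2 and steps by 2 mJ/cm2.
fields_kA = [6.1, 4.8, 3.2, 1.9, 0.7, 0.0, 0.0, 0.0]
e0 = dose_to_clear(fields_kA, start_dose=50.0, dose_step=2.0)
imaging_dose_floor = 2 * e0  # text: images are usually generated at >= 2 Eo
print(e0, imaging_dose_floor)
```

Repeating this scan for each spin speed (each resist thickness) gives the Eo vs. thickness points that make up a swing curve.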
Figure 27. Eo vs. thickness for resist B.
Figure 28. Eo vs. resist thickness for resist A.
Figure 29. Contrast vs. resist thickness for resist B.
Figure 30. (a) Normalized open frame exposed image thickness vs. log exposure for resist A. The contrast value was obtained by a least squares fit of the values and the thickness of resist was lithographically equivalent to that at the second maximum from Figure 29. (b) Detailed contrast method example with numbers shown as for (a).
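The least squares contrast extraction named in the Figure 30 caption can be sketched as follows. The sketch assumes a positive resist, fits only the points in the linear part of the normalized thickness vs. log exposure curve, and reports the magnitude of the slope as the contrast. The window limits (0.1 to 0.9) and the synthetic data are assumptions for illustration only.

```python
import math

def contrast_from_fit(doses, norm_thickness, lo=0.1, hi=0.9):
    """Contrast as |least squares slope| of normalized thickness vs. log10(dose).

    Only points with lo <= thickness <= hi are fitted, to stay in the linear region.
    """
    pts = [(math.log10(d), t) for d, t in zip(doses, norm_thickness) if lo <= t <= hi]
    if len(pts) < 2:
        raise ValueError("need at least two points in the linear region")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return abs(sxy / sxx)

# Synthetic open-frame data built with a contrast of 4: thickness falls by 0.2
# for every 0.05 increase in log10(dose), starting at 30 mJ/cm2.
doses = [30.0 * 10 ** (k / 20) for k in range(6)]
thickness = [1.0, 0.8, 0.6, 0.4, 0.2, 0.0]
gamma = contrast_from_fit(doses, thickness)
print(round(gamma, 3))
```

For an ideal straight-line response the same value follows from the two end points alone, as the inverse of the difference of the log exposures at full and zero thickness.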
Figure 31. Full wafer optical micrograph of a 6 × 6 Eo Canon stepper large field exposure matrix example. Optical micrograph of exposed and developed open frame exposures showing a defect on the objective lens of the stepper. Note: the defect appears in every partially developed frame, and is termed a repeating defect.
Figure 32. Relationship between Eo and the contrast measured from a wafer as in Fig. 31.
Figure 27 shows the graph of Eo vs. resist thickness. Incidentally, resist image critical dimension, or CD, plots vs. exposure have exactly the same cyclical shape. The 10.8 kÅ thickness on the graph was chosen as an optimum example operating point for the resist A evaluation, because it was at a swing curve maximum. This choice was made for a number of reasons: resist image tops are more square at the maximum; the resistance to underdeveloped images (i.e., scumming in image spaces) created by thickness variation is greatest at an Eo maximum; process contrast values are largest for operation at swing curve maxima (Fig. 27); and at least a micron of resist is usually needed to provide etch masking. Processes requiring thicker resists for masking purposes dictate operating at maxima at greater thickness values along the horizontal axis of the swing curve. Resist A curves are found in Figs. 28 and 30. Notice that the operating maxima for both example resists are very similar, indicating similar resist refractive indices.

To establish the contrast data for resist B of Fig. 29, dispense spin speeds and exposures were varied as for the Eo studies above. The post-exposure, post-develop resist image thickness was measured for each open frame exposure field of the exposure array for each resist thickness. From these data, the contrast for each resist thickness was calculated with a least squares fitting algorithm. In most cases, the phases of Eo and contrast are the same;[74] therefore, contrast is usually also at a maximum when Eo (or image dimension) is at a maximum. This provides better image dimensional control, another reason to operate at a resist thickness maximum.[74] The swing curve of Fig. 27 has three peaks (~ every 1.0 kÅ) with operating thickness potential.
The peak at 10.9 kÅ is optimum because of the reasons above and because the photoresist thickness must be at a value higher than 1.0 µm due to etching process constraints; metal thicknesses are typically ≥0.7 µm at metal levels 1–3. In addition, the larger thickness maxima operating points have significantly higher contrast values or better resist image dimension control capability, which also affects device layer to layer total overlay; therefore, they must be chosen unless RIE needs dictate otherwise. The swing curve for resist A (Fig. 28) shows a similar result (10.75 kÅ) for operating thickness. A comparison of the max to min ranges of both resist swing graphs shows that for resist A, thickness has a slightly greater effect, or that resist A is slightly more transparent than resist B. Comparing the slopes of the swing curves, generated by either the min-to-min or max-to-max connecting lines, provides a measure of the resistance to image dimensional changes over device topography (i.e., the bulk effect); the data shows that resist B is slightly more
tolerant to topography. The differences observed here for the latter parameters are, however, fairly small and may even be close to measurement uncertainty limits. Figure 29 is the contrast vs. resist thickness response curve obtained for resist B. At the optimum operating thickness (10.9 kÅ), resist B has a contrast of 4.3. The data of Fig. 30 establish the contrast at the operating thickness of resist A to be 7.0, a significantly higher value than that for resist B; thus, resist A is the higher contrast/resolution resist of the two.

Depth of Focus (DOF) and Exposure Latitude. To obtain resist DOF data, resist B and resist A coated wafers were exposed with a test reticle on a Canon 2000i (0.52 NA) stepper using a 13 × 13 matrix array. The test reticle had CDs ranging from 0.3 µm to 2.0 µm. Machine focus was varied by column, with values ranging from -1.8 to 1.8 µm in steps of 0.3 µm per column. The exposure dose was varied by row, with values ranging from 110 to 230 mJ/cm² in steps of 10 mJ/cm² per row. The 0.5 µm CD was measured in each die using a Hitachi SEM after development. From this type of data, the best focus and the exposure dose to CD size can be determined for each photoresist. Figure 33 contains the resist image critical dimension vs. defocus data for each photoresist at the same best relative exposure, or the iso-focal exposure condition (190 mJ/cm² for resist B and 210 mJ/cm² for resist A). The depth of focus values observed were both approximately 2.1 µm for a line CD specification of 0.5 µm ± 10%. Both photoresists have good depth of focus, which ensures that CDs are printed within the given specification even when process variations occur (i.e., changes in wafer flatness, topography, etc.). It is not atypical, for example, for local total indicated range (LTIR) flatness values[75] for individual wafer die at the wafer edges to be of the order of 0.5–2 microns. Thus, processes with large DOF are very important.
DOF is designed into a resist by increasing the PAC q value through increased PAC esterification and by resin molecular weight tailoring.[76] (See also later subsections of 5.4.) CD Linearity. Figure 34 shows the actual Hitachi SEM image size vs. nominal size for resist B. Figures 34 and 35 allow the resist example systems to be compared under equivalent processing conditions for CD linearity. Linearity is a fundamental parameter because it measures the ability of the resist to delineate the smallest feature possible within the standard ±10 % CD criterion. The test reticle has all line sizes in the unbiased condition. It is important that there be no bias when printing different sized features simultaneously. Resist A can print from 2.0 µm down to 0.4 µm CD sizes with no bias, while resist B can print only down
to 0.45 µm CD sizes without bias. This observation is consistent with the higher observed resist contrast performance for the resist A system, since it is well known that high contrast resists also have greater CD linearity. CD linearity is basically determined by the degree of PAC esterification, or contrast, and its effect upon the nonlinearity of resist development. For advanced designs such as 64M DRAM and beyond, linearity below 0.35 µm will be required.

Resist Image Edge Wall. This parameter defines the criterion for the depth of focus determination given above. The usable edge wall angle definition varies, but is typically 80–85 degrees. Figure 36 contains some typical resist image SEM cross sections illustrating greater than 85 degree edge walls over a fairly large range of focus for Shipley SPR 500 resist. The resist image edge wall angle also plays an important role in the RIE etch process bias; i.e., the more sloped the resist image, the greater the RIE process bias, even under anisotropic etch conditions. When the B parameter, or both the A and B parameters, are large, resist image edge walls degrade to values less than 80 degrees, and resolution suffers as well.
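The CD linearity criterion discussed above (features printing within ±10% of nominal size) can be expressed as a short check. This is a sketch with hypothetical SEM readings; the function name is illustrative, and the rule of stopping at the first out-of-tolerance feature when scanning from large to small sizes is an assumption about how the limit is read from curves such as Figs. 34 and 35.

```python
def linearity_limit(nominal_um, measured_um, tol=0.10):
    """Smallest nominal feature that prints within +/- tol of its nominal size.

    Features are scanned from largest to smallest, and the scan stops at the
    first feature that falls outside the tolerance band.
    """
    limit = None
    for nominal, measured in sorted(zip(nominal_um, measured_um), reverse=True):
        if abs(measured - nominal) <= tol * nominal:
            limit = nominal
        else:
            break
    return limit

# Hypothetical SEM measurements (um) for an unbiased test reticle.
nominal = [2.0, 1.0, 0.7, 0.5, 0.45, 0.4, 0.35]
measured = [2.0, 1.01, 0.70, 0.49, 0.44, 0.35, 0.25]
print(linearity_limit(nominal, measured))
```

With these made-up readings the 0.4 µm feature misses the ±10% band, so the linearity limit is reported as 0.45 µm.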
Figure 33. Resist image CD vs. defocus for resist B and resist A resist systems under equivalent exposure conditions.
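A depth of focus value can be read from CD vs. defocus data such as those of Fig. 33 by finding the widest run of focus settings whose CD stays inside the 0.5 µm ± 10% specification. The sketch below uses the focus grid stated in the text (-1.8 to 1.8 µm in 0.3 µm steps) with hypothetical CD readings. It works on the sampled grid only, without interpolation, so it gives a conservative estimate.

```python
def depth_of_focus(focus_um, cd_um, target_um=0.5, tol=0.10):
    """Widest contiguous focus span (um) whose CDs stay within target * (1 +/- tol)."""
    low, high = target_um * (1 - tol), target_um * (1 + tol)
    best = 0.0
    start = None
    for focus, cd in zip(focus_um, cd_um):
        if low <= cd <= high:
            if start is None:
                start = focus  # first in-spec setting of this run
            best = max(best, focus - start)
        else:
            start = None       # run broken by an out-of-spec CD
    return best

# Focus columns of the 13 x 13 matrix, and hypothetical CDs (um) at one exposure dose.
focus = [round(-1.8 + 0.3 * i, 1) for i in range(13)]
cds = [0.40, 0.44, 0.46, 0.48, 0.49, 0.50, 0.50, 0.50, 0.49, 0.47, 0.46, 0.43, 0.38]
dof = depth_of_focus(focus, cds)
print(round(dof, 2))
```

Applying the same function row by row over the exposure doses of the matrix would show how the in-spec focus range changes with dose.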
Figure 34. Linearity curve for resist B.
Figure 35. Linearity curve for Resist A.
Figure 36. Resist image edgewalls for JSR 790 features printed on a Canon 2000i4 stepper through defocus (panels labeled from -0.4 µm to -1.5 µm in 0.1 µm steps).
Photoresist CD Latitude. Figure 37 shows the CD vs. exposure dose data for each photoresist at the best relative focus offset condition (-0.45 µm for resist B and -0.4 µm for resist A). The conditions for this comparison represent equivalence, i.e., both offset values represent a CD data set centered on the offset value. The exposure latitudes were both approximately 18% over a CD specification of 0.5 µm ± 10%. These values indicate that both example resists are capable of printing CDs within specification when process variations (e.g., resist thickness, lamp illumination problems, substrate reflectivity, etc.) occur within ±18%.

Process Compatibility Evaluations. Etch Resistance. It is an obvious advantage for a photoresist to have a high level of device process reactive-ion etch (RIE) selectivity, or etch resistance. This means the photoresist is resistant to etching during the etching processes (see Sec. 5.0). If the resist erodes or flows during the etch process, the device cannot be made with reproducible overlay or high yield. Resist B and resist A example photoresist coated wafers were measured for thickness on a Prometrix SM-200, both with and without Fusion deep ultraviolet (DUV) radiation image stabilization. An oxide test wafer was etched with the resist B and resist A coated wafers in an Applied Materials (AME) 8110 RIE to determine selectivity. Here, selectivity is simply the ratio between the oxide film etch rate and the resist etch rate. An Al test wafer was also etched with one wafer of each photoresist to determine the Al RIE selectivity values. The Al and oxide (TEOS plasma enhanced deposited glass) thickness steps were measured on an Alpha-Step profilometer. The thicknesses of the photoresist coated wafers were measured on a Prometrix SM-200 to determine the etch induced resist loss during the etch process.
Table 6 contains the selectivity results of each resist for the TEOS inter-level dielectric and metal etch processes, which can be employed to fabricate BiCMOS MLM devices. The selectivity values for resist A (4.30 and 2.00) are marginally higher than the values for resist B (3.95 and 1.86), meaning resist A is slightly more resistant to the metal and ILD etch chemistries than resist B. Thermal Image Flowing. The thermal stability of the images for a photoresist process is an important factor because some backend metallization etch processes are very long and significantly high temperatures can be achieved, especially under high overetch conditions which are prevalent (see Fig. 38). Deep-UV curing of the photoresist image after development stabilizes the pattern to this type of image flow through both radiation and thermal crosslinking, but unfortunately some production areas do not have this capability or cannot afford to provide this extra process. As a result, this type of fundamental testing is important.
Figure 37. Exposure latitude curves for resist B and resist A.
Table 6. Resist Etch Selectivities vs. AlCu Metal and Via TEOS Glass Materials

                  Selectivity (material/resist)
Resist System     TEOS      Metal
Resist A          4.3       2.0
Resist B          3.9       1.9
Resist C          6.0       2.8
Resist D          5.3       2.2
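The selectivity defined in the text, the ratio of the film etch rate to the resist etch rate, reduces to a ratio of thicknesses removed when film and resist are etched together for the same time. The sketch below computes that ratio and then estimates how much resist an etch would consume. The thickness readings and the 50% overetch are hypothetical; only the 0.7 µm metal thickness and the metal selectivity of 2.0 come from the text and Table 6.

```python
def etch_selectivity(film_removed, resist_removed):
    """Selectivity = film etch rate / resist etch rate.

    With a common etch time, the ratio of thicknesses removed is equivalent.
    """
    if resist_removed <= 0:
        raise ValueError("resist loss must be positive")
    return film_removed / resist_removed

def resist_consumed(film_thickness, selectivity, overetch_fraction=0.0):
    """Resist thickness consumed while etching the film plus a fractional overetch."""
    return film_thickness * (1 + overetch_fraction) / selectivity

# Hypothetical step-height and resist-loss readings (angstroms) from one etch run.
teos_selectivity = etch_selectivity(film_removed=8600.0, resist_removed=2000.0)

# Resist budget for a 0.7 um metal layer at a selectivity of 2.0 with an assumed 50% overetch.
loss_um = resist_consumed(0.7, 2.0, overetch_fraction=0.5)
print(round(teos_selectivity, 2), round(loss_um, 3))
```

Under these assumptions roughly half a micron of resist is consumed, which is consistent with the earlier remark that at least a micron of resist is usually needed for etch masking.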
Figure 38. Schematic representation of image flow illustrating how SEM or SiScan CD measurements are made to generate the data for Fig. 39.
The standard vendor processes were followed when preparing wafers for thermal image flow testing. To determine the effects of DUV, four control wafers for each photoresist did not have the DUV treatment. One wafer (DUV) from each photoresist was baked on a hot plate for 180 seconds at 110°C. Two wafers (DUV and No DUV) from each photoresist were baked on a hot plate for 180 seconds at 120°C, 130°C, 140°C, and 150°C.[22] The wafers were then SEM cross-sectioned for 0.5 µm line and 10.0 µm pad CDs. Both sizes were required because larger images always flow at lower temperatures,[77] and because at least two image size domains should be evaluated for flow. The results of the work shown in Fig. 39 demonstrate that the resist B (No DUV) photoresist images begin to flow starting at a hotplate bake temperature of 120°C. Resist A, on the other hand, is a much more thermally stable system (without DUV) and exhibits good image stability at hotplate temperatures up to 150°C. Of course, resist images of both resists exhibited high temperature stability after DUV curing.
Figure 39. Delta CD vs. image flow temperature for 0.5 µm CDs of resist B and A.
Sensitivity and Contrast for Non-Optical Resists. Actual sensitivity curves for example negative and positive systems are found in Figs. 40 and 41, respectively. For these curves to have meaning, the experimental details of developer type (or composition and concentration), development time and conditions, image dimension size, and resist characteristics such as thickness must be known. The resist sensitivity for the negative system in Fig. 40 is defined by that dose where 50–80% of the original resist thickness is maintained (i.e., the data may be fit to any point between 0 and 50 to 80% film retention, but the point used should be explicitly specified); of course, the unexposed areas of resist are completely developed away and only the exposed images remain. For the positive system, sensitivity is defined as the dose where the image is cleared to the substrate (i.e., the dose where (lo − ld)/lo = 0), a unique point on the data plot. It should be noted here that the negative resist sensitivity curve is essentially invariant with development time whereas, for the positive system, a series of curves is obtained when the development time is changed. These curves will remain fairly parallel until appreciable unexposed resist loss begins to occur. The resulting curve then never reaches (lo − ld)/lo values of 1.0, even at low exposure. Both negative and positive sensitivity curves shift on the exposure axis when the image CD is changed. Most lithographic tools allow exposure to be varied across the wafer, so these data can be generated on a single wafer if desired.
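The negative resist sensitivity definition above (the dose at a stated film retention level) can be evaluated from measured points by interpolating on the log exposure axis. The sketch below is illustrative only: the data are hypothetical, the function name is an assumption, and linear interpolation in log10(dose) is one reasonable choice rather than a prescribed method. The retention level is passed explicitly because, as noted above, it should always be specified.

```python
import math

def dose_at_retention(doses, retained_fraction, level=0.5):
    """Dose at which a negative resist retains `level` of its original thickness.

    doses must be in increasing order; interpolation is linear in log10(dose).
    Returns None if the level is not bracketed by the data.
    """
    points = list(zip(doses, retained_fraction))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 <= level <= r1 and r1 > r0:
            frac = (level - r0) / (r1 - r0)
            log_dose = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    return None

# Hypothetical negative e-beam resist data: dose in uC/cm2, normalized film retention.
doses = [0.5, 1.0, 2.0, 4.0, 8.0]
retention = [0.0, 0.2, 0.6, 0.9, 1.0]
s50 = dose_at_retention(doses, retention, level=0.5)
s80 = dose_at_retention(doses, retention, level=0.8)
print(round(s50, 3), round(s80, 3))
```

Quoting both values side by side makes clear how strongly the reported sensitivity depends on the retention level chosen.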
Figure 40. Characteristic sensitivity curve for two negative e-beam resists. The AZ resist is imaged density reversed with high e-beam exposure combined with a 300 mJ/cm2 UV-4 flood exposure prior to development.
Figure 41. Characteristic sensitivity contrast curve for a positive e-beam resist example.
Resist process contrast is given by the slope of the least squares fit of the data in Figs. 40 and 41. The contrast of the process is obtained by simply subtracting the log values as depicted on the figures and taking the mathematical inverse. These values are important because they can be used to predict edge wall angles and resist resolution.[48]

To better clarify positive resist sensitivity, researchers at IBM[167] have adopted a modified sensitivity plot in which l(unexposed)/lo is plotted vs. exposure (see Fig. 42). Here, each data point represents an entirely different wafer and development time combination for the chosen CD to develop. Note that it would take a series of exposure response curves like Fig. 41 to generate a single curve like those of Figs. 42a and b. Since, for positive resists, the full resist thickness is usually required for further process masking, this method of resist sensitivity measurement is of great value. The slope of these curves, however, is not a direct measure of the image dose response and cannot be used as a measure of resist process contrast.

Resist Image Edge Wall. When reactive-ion etching (RIE) techniques are employed, the resist image edge wall becomes an important evaluation characteristic and should be measured. A vertical resist image edge wall is essential in RIE to minimize etch bias, even for the more anisotropic processes. Unfortunately, angle measurement requires that a scanning electron microscope (SEM) picture be taken edge-on of a cleaved and mounted wafer piece, which is a tedious procedure. The edge wall can be preliminarily estimated from the bulk contrast as measured above,[48] but in the final analysis the edge wall must be verified. Near vertical resist image edge walls (i.e., 87–90°) are specified in nearly all modern facilities, so this resist process characteristic must be known and employed as a criterion for resist/process selection (see Fig. 43 for examples).

Resist CD Latitude.
The most important resist process parameter to a photoengineer is image critical dimension control, or CD latitude.[75] This parameter can be quantitatively measured for a resist/developer combination as set forth by Walker and Helbert.[79] In this method, a quantity called delta (∆), which is a measure of the difference of the resist image CD from that nominally exposed, is plotted vs. exposure and development time to obtain a series of characteristic curves. Figure 44 illustrates this for a positive photoresist example. Each point on this graph represents a different exposure and development time combination that results in a cleared, or fully developed, image to the substrate. The solid uppermost line of Fig. 44 is a least squares fit of data where the resist image just clears to the substrate; there are no data above this line because they would represent images
that did not fully develop. The largest exposure on the horizontal ∆ = 0 line is an equivalence sensitivity condition for every plot, where sensitivities for resists can be compared on the basis of equivalent CD transfer. The slope of the just clearing line, called RPL, is a measure of the exposure/development latitude of the resist process being evaluated.
Figure 42. (a) Positive resist sensitivity curve for positive e-beam resists. (b) Actual e-beam positive resist sensitivity curve for PC-129 positive photoresist.
Figure 43. SEM micrographs of positive and negative resist images illustrating the results of image edge wall profile angle measurement: (a) positive resist, and (b) negative resist.
Figure 44. Characteristic sensitivity curve for positive photoresist employing CD transfer as a criterion.
In Fig. 45, a strong developer has been intentionally employed to illustrate the effect upon RPL and sensitivity; here an undesirably large value for RPL is obtained. Obviously, the flattest ∆, or CD, vs. dose curve is most attractive from a process stability view, and the resist process yielding that performance characteristic should be implemented onto the IC fab line.
Figure 45. Delta vs. exposure characteristic curve for positive photoresist with a concentrated developer.
Tables of RPL results for representative first, second, and third generation resists are found in Tables 7–10. Since these results are dependent upon testing conditions, they are not meant to be absolute as to the performance of the resists tabulated, but they are of interest for gathering some general performance trends. Second generation resists tend to be more sensitive than first generation materials, while third generation resists tend to have greater contrast as well as sensitivity without sacrificing CD latitude in the process. The effects of developer concentration and development method upon sensitivity and CD control are also evident from these tables. Developer concentration increases can lead to greater sensitivity (Table 7), but usually at the expense of latitude. Development method (see Table 10 and Fig. 46) can influence process contrast and should be investigated when feasible.
Table 7. Photoresist Performance Data for First Generation Resists

Resist              γ      RPL    Qᵃ (mJ/cm²)   Developer/Method
PC-129              2.0    1.0    110±20        D900/DIP
(Allied P2025)             1.6    102           D900/PUDDLE
                           1.6    119           D910/DIP
                           4–8    50            CONC 900/DIP
AZ-1350             1.5    1.4    81            1:1 MF 312/DIP
KTI-II              1.8    2.1    105           DE-3/SPIN-SPRAY
                           1.0    132           DE-3/DIP
Kodak-809           1.6    0.4    100–160       809 Developer
HPR-204             1.7    2.7    91            1:1 LSI/DIP
                           3.5    60            1:1 LSI/PUDDLE

a Data for nominally 3 micron line and space mask images.
Table 8. Photoresist Performance Data for Second Generation Resists

Resist              γ          RPL    Q (mJ/cm²)   Developer/Method
Allied P-5019       2.5–3.8    1.0    13           2:1 D100/DIP
                               3.9    72           D150/DIP
                               3.0    95           LM1500/DIP
AZ-1470             1.8        2.3    40           1:1 MF312/DIP
AZ-4110             1.6        1.5    43           1:1 MF312/DIP
OFPR-800-1          1.4        1.3    74           NMD3-1/DIP
OFPR-800-2                     2.7    75           NMD3-2/DIP
OFPR-800-1                     0.5    37           1:1 MF312/DIP
KMPR-820-AA         1.2        1.6    18           AA-0980-52/DIP
KMPR-820                       2.2    17           AA-0980-52/DIP
HUNT-159            1.5–1.6    0.9    73           1:3 LSI/PUDDLE
HUNT-118                       1.4    62           1:3 LSI/PUDDLE
MAC-9564, -9574     1.1        3.8    24           1:1 9562(MF)/DIP
                               4.2    24           1:1 9564/DIP
                               1.9    12           1:1 9571/DIP
Table 9. Third Generation Resist Results

Resist                  Q (mJ/cm²)   RPL    γ             Developer
ALLIED 6010             83           1.1    1.3           LMI-600/DIP
                        89           2.3    2.3           D-360/DIP
AZ 4110                 45           1.0    1.5           1:1 MF 312/DIP
AZ 5214                 62           2.1    2.3           1:1 MF 312/DIP
DYNACHEM XPR 1000       52           2.2    2.4           NMD-3/PUDDLE
DYNACHEM 1501           120          1.2    1.9
MACDERMID 914           250          1.5    2.4           1:1 MF-62/DIP
KTI 9000                72           2.1    1.7           DE-3/SPRAY
JSR 3003A               60           0.8    1.6           JSR DEV/DIP
TOKYO OKA ONPR 830      40           0.6    1.7           NMD-3/DIP
KTI 9000                72           1.2    1.7           DE-3 SPRAY
MONSANTO RX7            94           2.0    1.9           MFD DIP
SUMITOMO PF6200         80           0.9    1.5           SOPD DIP
OFPR 800                75           1.3    1.4 (1.6)*    NMD-3/PUDDLE

* DYNACHEM DATA
Table 10. Development Method Effect

Resist        Method        RPL (mJ/cm²)⁻¹
KTI-II        SPIN/SPRAY    2.1
              DIP           1.0
HUNT-204      SPIN/SPRAY    SURFACE SPOTS
              DIP           2.7
              PUDDLE        3.5
PC-129SF      SPIN/SPRAY    SEVERE SURFACE SPOTS
              DIP           1.0
              PUDDLE        1.6
Figure 46. Development methods used to create relief images in photoresist materials.
Figure 47 compares two real e-beam gate processes actually employed to make CMOS transistors. The process on the left had a severe exposure latitude problem, and as a result rework rates for this process were between 40–60%; due to this poor latitude, the process to the left was never fully implemented. Sometimes CD vs. dose curves are flat (i.e., have latitude like that shown in Fig. 48), but are not flat at the CD required. When this occurs, a process bias is applied to achieve the CD specification. This is done by shaving the e-beam directly written image, by software or by biasing the Cr image on the reticle or photomask, and is fairly routine to all fabs.
Figure 47. Exposure latitude comparison for two electron beam gate processes. The process to the left employed positive resist with unexposed pattern islands, while the process to the right is the same as that for the Fig. 13 AZ process.
Figure 48. Resist image CD vs. dose for an example e-beam positive photoresist process.
Process Compatibility. Thermal image flow and degradation resistance, the second part of process compatibility, can be estimated from polymer or resin glass transition temperatures (Tg) measured by differential scanning calorimetry (DSC) and decomposition temperatures measured by thermal gravimetric analysis (TGA), respectively.[80] See Figs. 49 and 50 for examples for pure novolac resin polymer. Resist manufacturers usually list these parameters in their technical brochures, and these additional data form the rest of the database required to make a resist process decision.
Figure 49. Differential Scanning Calorimetric (DSC) thermo-analysis curve for an example novolac resin.
Figure 50. Thermogravimetric Analysis (TGA) curve for example novolac system.
3.2 Resist Adhesion Requirements
Before spinning resist onto the IC wafer in process, resist adhesion promotion is typically carried out. This process, involving either liquid or vapor phase treatment of the wafer with hexamethyldisilazane (HMDS), has become an industrial standard. It is typically used at every lithographic step in the IC fab process, whether it accomplishes surface modification or not. HMDS processing can be carried out on automatic wafer tracks with liquid or vapor HMDS modules (e.g., DNS, Tel, or SVG) or in stand-alone, microprocessor-controlled, all stainless-steel commercial reactor chambers (e.g., IMTEC or YES); these processes are proven for high volume production (see also Sec. 4.0). The in-line priming modules are hot plates in a vacuum chamber, integrated into the coating track. Due to requirements of chamber evacuation and sequencing with the coating process, the actual priming time is usually limited to 20–50 seconds. To increase the effectiveness of in-line priming modules, trimethylsilyldiethylamine (TMSDEA) can also be used.[81] TMSDEA reacts to form trimethylsilyl groups and diethylamine. The contact angle vs. time data obtained for HMDS treated Si wafers in a Tel Mark 8 vacuum oven at 150°C are shown in Fig. 51. The reaction rate is much faster with TMSDEA than with HMDS (see Fig. 7 of Ref. 81). With TMSDEA, high contact angles are obtained in less than 10 seconds. The proper contact angle range for the TMSDEA process in question was 65 to 75°. The data show that a process with a broad latitude is possible at a temperature of 50°C with a time between 5 and 20 seconds.[81]
Figure 51. Contact angle vs. time for MicroSi HMDS. Note higher contact angles are achieved with the special laminated bag vs. the standard teflon bag.
The author’s experience with SVG 8000 in-line vapor HMDS priming is that contact angle values of 58–79 degrees are achieved in less than 60 sec at temperatures between 100 and 150°C. We found no interactions between prime time, temperature, or bubbler flow, but the best results, i.e., higher CA values, are obtained for that specific equipment at temperatures greater than 100°C, and at longer prime times and greater pressures and nitrogen carrier gas flows (12 psi/30 scfh, respectively).
Substrate Condition and Contact Angle Concerns. The pretreatment of the substrates (cleaning, dehydration, and exposure to humidity) will greatly influence their surface tension. Contamination of high energy surfaces with water (dependent on relative ambient humidity) and organic materials will occur during exposure to air, which strongly reduces the surface tension of those substrates. Ambient oxygen and cleaning processes with an oxidizing effect lead to thin oxide layers, or energy-reducing layers, on substrate surfaces such as Si and Cr. For example, Cr coated masks are typically treated with an oven bake (200°C, t = 20 min) before resist coating. This process strongly reduces the surface tension of Cr and results in improved resist image adhesion.[82] Adhesion to tungsten (W) and aluminum (Al) substrates is not improved by HMDS or other surface treatments, but is most affected by slightly oxidizing air dehydration baking prior to coat, usually on a wafer track. Tungsten oxide is observed by ESCA on all W metal surfaces prepared in semiconductor wafer depositions.
Bauer et al. have demonstrated that the development step is the most critical process relating to adhesion. They have further demonstrated that the contact angle for water, θw, can be used to adjust optimal conditions for adhesion of a resist on different substrate materials. They demonstrated that pretreatment-caused variations of surface tension can be compensated by specific surface modification to provide stable surface conditions for reproducible and defect-free lithography processing. Good adhesion, as evidenced by surface tensions greater than 5 dyn/cm, was obtained when a low-polarity developer (x3p < 0.3) is used, corresponding to a process window of 10° < θw < 75°. In contrast, when a high-polarity developer (x3p > 0.3) is applied, the priming process has to be modified so that the surface tension of the substrates can be reduced. The process window for this case is 60° < θw < 75°. They show that a resist with a relatively small surface tension (σ1 = 40 dyn/cm) and high polarity (x1p = 0.4), combined with a developer with high surface tension, is a good prerequisite for process adhesion.
Resist Adhesion Concerns. Resist image adhesion is of paramount importance, because if IC circuit patterns are missing or are not the right dimension, the device will simply not function as designed. The resist systems most susceptible to resist image lifting at the development stage of processing are positive e-beam and conventional photoresists.[83] Historically, negative photoresists also have required adhesion promotion, but in that case the goal was reduced undercut from wet etch processing and not the prevention of simple image lifting at develop.[83][84] Undercutting is less of a consideration for positive photoresists, because they are used primarily, though not entirely, in conjunction with anisotropic dry etching processes. Examples of negative resist wet etch undercut and positive resist lifting at develop are shown in Figs. 52 and 53.
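The developer-dependent contact-angle windows quoted above from Bauer et al. can be written as a small lookup. The following Python sketch is illustrative only: the function names are hypothetical, and the handling of the boundary case x3p = 0.3, which the text does not specify, is an assumption.

```python
def contact_angle_window(developer_polarity):
    """Return (low, high) water contact-angle limits in degrees.

    Windows are those quoted in the text: 10-75 degrees for a
    low-polarity developer (x3p < 0.3) and 60-75 degrees for a
    high-polarity developer (x3p > 0.3).
    """
    if developer_polarity == 0.3:
        # Assumption: the text gives no window at exactly 0.3, so do not guess.
        raise ValueError("window undefined at x3p = 0.3")
    if developer_polarity < 0.3:
        return (10.0, 75.0)
    return (60.0, 75.0)


def in_window(theta_w, developer_polarity):
    """True if the water contact angle lies strictly inside the window."""
    low, high = contact_angle_window(developer_polarity)
    return low < theta_w < high
```

Under these assumptions a 40° surface would be acceptable with a low-polarity developer but would fall outside the window once a high-polarity developer is used, which is the case where the priming process has to be modified.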
Figure 52. Wet etching adhesion test of silicon illustrating wet etch undercutting at the silicon/resist interface during the etch process. (Panel labels: VTS; NO PROMOTER.)
Figure 53. “Image Lifting” phenomena observed with and without adhesion promoter on SiO2 test surfaces. This type of lifting occurs at development prior to etch processing. (Panel title: Effect of Adhesion Promoter on “Lifting”; magnification 500X; substrate SiO2.)
Resist image adhesion processes for most MLM applications are standard vapor phase hexamethyldisilazane (HMDS) processes, preferably at reduced pressures in stand-alone reaction chambers or high throughput wafer track system modules.[85] Since most dielectric layers exposed to lithography on the backend are oxide substrates, which are handled well by standard HMDS processing, very few problems are encountered with this processing. Metal image adhesion, on the other hand, can be a problem, especially when rework is encountered or when wet etching is practiced, as for older fab lines. Freshly prepared Al, AlCu, and W wafers, however, do not usually present an adhesion problem for photoresist images. In fact, test image failures or lifting are usually impossible to induce. Furthermore, no evidence of a chemical reaction between the HMDS and the Al films has been obtained from ESCA studies, so this treatment is probably ineffective on metal substrates except for the concurrent wafer dehydration which also occurs in the process. Problems arise when Al metal is ashed or contaminated in any other way. Then, the only way to save the wafers is to sputter or etch the film surfaces to provide a near virgin type surface, hopefully with minimal metal thickness loss and further contamination. Wafer rework for MLM resist processing, however, is also fairly well understood and routine.[85]
Another wet processing problem is underetch.[81] Underetch is typified by a sloped profile of the etched layer due to the penetration of the aqueous etchant between the resist and the substrate. Experiments by the authors of Ref. 81 have shown a dependence of wet etch undercut upon trimethylsilyl surface coverage. Figure 4 from Ref. 81 shows the amount of underetch as a strong function of the contact angle for primed BPSG (borophosphosilicate glass) wafers: the greater the CA, the less the wet etch undercut. The underetch is described as the ratio of the etch distance horizontally (X) to the etch distance vertically (Y) in BPSG glass in a buffered HF etch. A high contact angle indicates a high surface coverage of trimethylsilyl groups and a lower X/Y ratio. At higher contact angles (i.e., higher surface coverage with trimethylsilyl groups) less underetching is observed. This shows that the trimethylsilyl groups from the HMDS reaction at the BPSG surface block the penetration of the etchant under the photoresist at the interface.
Non-HMDS Photoresist Promoters. One of the ongoing debates about resist adhesion in recent years has been on what chemical or chemicals give the best adhesion results. There are quite a few chemicals that can be
used as adhesion promoters, for example, TCPS (trichlorophenylsilane), BSA (bistrimethylsilylacetamide), Monazoline C, trichlorobenzene, xylene, and HMDS (hexamethyldisilazane).[86] No matter what chemical is used, the basic premise behind adhesion promoters is the repulsion of water from the resist/adhesion promoter interface, i.e., the surface dewetting characteristics of the adhesion promoter, and the improvement of the bonding strength between the resist and the surface. As a rule of thumb, a contact angle greater than 50 degrees on silicon will give fair adhesion results for geometries of 2.5 µm and greater on most films. A double prime is shown to make definite improvements over a single prime, leading to higher CA and greater trimethylsilyl wafer surface coverage. This was confirmed by contact angle measurements in Ref. 81 (see Fig. 5 of Ref. 81). The authors have also experimented with MicroSi samples spiked with DEATS (see Fig. 54); DEATS spiked samples always exhibit greater CAs, holding all other variables constant, as shown. We have also demonstrated that HMDS primer NowPack packaging can improve contact angles; Fig. 51 demonstrates the improved CA values provided by the Al-laminated storage bag over a storage time of seven months. This bag and the NowPack allow HMDS to be stored and transferred to track bubbler vessels nearly free of air contact, which is desired to prevent HMDS reaction with the moisture in the fab air.
ESCA Results for HMDS Treated Si Wafers. What is accomplished by adhesion promotion treatments in IC manufacturing should actually be referred to as wafer substrate preparation, and not adhesion. Adhesion in the structural sense, as found in airplane composite material part attachment, is not accomplished in HMDS processing treatments. The term adhesion, as it is used here, refers to a more practical definition, that is, resist image adhesion. Figure 55 demonstrates what is actually accomplished when a Si wafer is treated with HMDS. The ESCA[87] spectra shown clearly illustrate the removal of carbon containing adsorption species at lower binding energies in favor of the monolayer of trimethylsilyl surface reaction product (see Fig. 55a and b).[88] In addition, the surface is dehydrated in situ, as verified by increased water contact angle[89] and a lower O/Si surface concentration ratio as measured by ESCA. Furthermore, this converted surface is stabilized for days against recontamination; therefore, the HMDS process provides a very stable surface for resist adherence. The Si 2p ESCA spectra of Fig. 55c and d verify the appearance of the trimethylsilyl silanol reaction species.
Figure 54. Contact angle vs. time for MicroSi MP-95, 95% HMDS/5% DEATS.
Figure 55. (a) N(E)/E vs. BE for carbon 1s of bare silicon wafer, (b) N(E)/E vs. BE for carbon 1s of HMDS vapor (Star 2000) treated silicon wafer, (c) N(E)/E vs. BE for silicon 2p of untreated silicon wafer, and (d) N(E) vs. BE for silicon 2p of HMDS treated silicon wafer.
Resist Adhesion: TOF Results.[81] Time of flight secondary ion mass spectroscopy (TOF-SIMS) has been used to determine the surface coverage of trimethylsilyl groups after priming with HMDS (hexamethyldisilazane) and TMSDEA (trimethylsilyldiethylamine). The TOF-SIMS work by Micheilsen et al. shows that the trimethylsilyl groups at the resist/substrate interface function as a barrier to undercutting during wet processes such as developing and wet etch. The surface coverage is determined by the signal ratio of the trimethylsilyl cation to the silicon cation from the TOF-SIMS spectra. The relative surface coverage of trimethylsilyl groups is given by the integrated signal of the (CH3)3Si+ peak at mass 73 amu normalized to the signal intensity of the Si+ peak at mass 28 amu. Their Si-wafer contact angle measurements have also been correlated with TOF-SIMS data. A high contact angle means the surface coverage with trimethylsilyl groups is relatively high; for example, at a 94 degree contact angle the surface coverage is 0.98 (relative coverage vs. 1 for full coverage) and no resist image lift-off occurs, but resist solution dewetting is also observed. As contact angles run from 65–85 degrees, no resist lift occurs as the relative trimethylsilyl surface concentration runs from 0.46–0.75. For contact angles from 36–56 degrees, resist lifting is observed and lower partial surface coverages, i.e., less than 0.4/nm2, are observed by TOF-SIMS methodology.
Resist Dewetting and Popping. Substrate non-wetting, another adhesion problem, has been observed most frequently with mistakenly over-promoted wafers. It occurs after repeated treatments, when the wrong liquid treatment has been applied, or when vapor times exceed the optimum time for the respective substrate. It can also occur in selected circuit pattern areas rather than over the whole layer, and can be characterized by a higher water droplet contact angle. Although it is not generally well understood, it can be prevented by reducing prime times for vapor treatments, corrected by ion treatment of the wafer,[90] or prevented by using resist containing a spinning solvent of lower surface tension or by double resist application under dynamic spin conditions. Over-priming results in dewetting and popping problems. Dewetting occurs when the surface coverage of trimethylsilyl groups is high. The trimethylsilyl groups lower the surface energy of a substrate, and the resist cannot wet the surface. Therefore, the attraction forces between the trimethylsilyl groups and the resist are very small. Also occurring at high contact angles is the phenomenon called popping. Popping is the blistering of the resist, observed when high exposure doses are required. It is reasoned that the popping is caused by the accumulation of nitrogen (the consequence
of the exposure reaction) between the resist and the substrate. When the exposure dose is high enough, the nitrogen cannot be dissipated by diffusion through the resist and diffuses to the resist/substrate interface. The lowered surface energy of the substrate, and therefore the weakened attraction between the substrate and the resist, facilitates popping. Figure 6 from Ref. 81 establishes the relationship of the dose threshold for popping as a function of the contact angle, i.e., it drops by a factor of two as the CA goes from 65 degrees to 88 degrees, for a given lamp intensity and photoresist. The authors of Ref. 81 have, therefore, established that high CAs can lead to resist popping, a serious defect and device yield limiter. One then concludes that for adequate control of lift-off a contact angle of at least 65 degrees is required (taken from Table 1 of Ref. 81). To ensure that popping will not occur, the exposure dose required for patterning with the process in question must be carefully examined. In the particular case in this example, the standard exposure dose was 200 mJ/cm2 or less and a contact angle of 75 degrees was allowed. Therefore, the process window for this resist process is from 65 degrees to 75 degrees. Experience has shown that for different stepper intensities and photoresist processes the window can be narrower or wider.
Resist Adhesion Method Comparison. The silicon-based substrate layers, which are nearly all the layers encountered in IC device production except metal layers, can usually be successfully promoted against lifting by treatment with liquid silanes or silane vapor treatments at reduced pressures[84][88][90] (see Fig. 56). The carbon 1s ESCA spectra are shown for a Si (100) substrate with native oxide (< 50 Å) in Fig. 57 to illustrate the surface chemical changes between liquid and vapor promoter processes. The LPIII process, a vapor HMDS process, efficiently removes the carboxylic, ethereal, and hydrocarbon impurities from the surface and replaces them with a blanket of trimethylsilyl groups comprising a monolayer (also see Fig. 55c and d). The Mallinckrodt system, a model liquid primer with both amine and alkoxy silane molecular moieties, replaces the carbon surface species with CHx species resulting from the condensation polymerization reaction on the surface, which produces a 20–50 Å thick adhesion promoting layer. When the Si 2p spectrum is observed, a new signal appears at 101.8 eV (see Fig. 55c and d) from the Si(CH3)3 groups for the vapor treated HMDS substrates, while no such signal appears for the Mallinckrodt system. Hence, the two comparison systems differ in the basic method of lifting prevention even though they are both successful “lift preventing”
processes. In Table 11, the ESCA results and water droplet contact angle (CA) measurements are listed for a range of representative processes. Total surface C/Si ratios from ESCA are also listed because lifting has been shown to occur when that parameter is found to be from 30–100% larger than that for primed wafers.[88]
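The contact-angle regimes reported from Ref. 81 in the preceding discussion can be sketched as a simple check. This Python fragment is illustrative only: the function names are hypothetical, the default 65° and 75° limits are the ones quoted for the example process (standard dose of 200 mJ/cm2 or less) and would differ for other resists and exposure tools, and the coverage function returns only the raw 73 amu / 28 amu signal ratio, since any further calibration to a full-coverage reference is not described here.

```python
def relative_tms_coverage(signal_73_amu, signal_28_amu):
    """Raw ratio of the (CH3)3Si+ signal (73 amu) to the Si+ signal (28 amu)."""
    if signal_28_amu <= 0:
        raise ValueError("Si+ signal must be positive")
    return signal_73_amu / signal_28_amu


def classify_contact_angle(ca_deg, low=65.0, high=75.0):
    """Place a water contact angle relative to the example process window.

    Below `low` the text reports a risk of resist image lifting; above
    `high` the text reports a risk of popping or dewetting.
    """
    if ca_deg < low:
        return "lifting risk"
    if ca_deg > high:
        return "popping/dewetting risk"
    return "in window"


for ca in (50, 70, 94):
    print(ca, classify_contact_angle(ca))
```

The limits are passed as parameters because, as noted above, the window can be narrower or wider for different stepper intensities and photoresist processes.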
Figure 56. Chemical reactions for different silicon-based adhesion promoter systems.
Figure 57. N(E)/E carbon 1s spectra for blank wafer and two comparison photoresist adhesion promoter processes, standard LP-3 vapor vs. Mallinckrodt (liquid silane).
Table 11. Adhesion Prime Methods Comparison

Method              | Type               | ESCA Carboxylic        | ESCA Trimethylsilyl | C/SI    | CA
STAR 2002 (5 MIN.)  | VAPOR (100 C)      | NO                     | YES                 | 0.64/NA | 75
STAR 2002 (90 SEC.) | VAPOR (100 C)      | NO                     | YES                 | 0.58/NA | 75
STAR 2000           | VAPOR (150 C)      | NO                     | YES                 | NA/0.8  | 77
YES LP-3            | VAPOR (120 C)      | NO                     | YES                 | 0.54/NA | 76
SVG                 | VAPOR (RT; 760 MM) | YES; CA<70 / NO; CA>70 | YES                 | 0.46/NA | 75
HMDS SVG            | LIQUID             | YES                    | SMALLER             | NA/1.1  | 58
MALLINCKRODT        | LIQUID             | SMALL                  | NO                  | NA/1.5  | 60
VIRGIN              | NA                 | YES                    | NO                  | 1.0/1.1 | 24
The CA measurements of Table 11 indicate these treatments are also very successful at removing wafer surface water contamination, as has been verified by others (see Ref. 89 and references therein). It is notable, however, that there is a correlation between CA, resist image lifting results and ESCA surface condition. If CA is less than 60 degrees, the ESCA C 1s carboxylic peak at 290 eV is present, and there is a high relative total C/Si ratio, lifting or poor resist image adhesion is very likely to occur, either intermittently or quite frequently. For semiconductor manufacturing or any manufacturing process, this kind of processing uncertainty is unacceptable, and the vapor processes and some liquid treatments have created better process reproducibility. HMDS SVG, a wafer track applied liquid HMDS process, is an example of a process that worked most of the time, but provided only marginal resist image lift prevention reproducibility. Vapor temperature is seen to make little difference to the measured parameters, while time of prime does provide more attractive (i.e., higher) CA values. Multiple priming and long prime times can also
lead to over-promoted or non-wetting (i.e., at coat) wafers; therefore, the prime time should be optimized for each substrate.
Results for Reworked Wafers. After silicon wafers have been fabricated to a certain level, they become too valuable to terminate fabrication when misprocessed. Sometimes the wrong mask is used, the resist exposure is adjusted improperly, or the resist is improperly developed. At lithography, since the pattern is just a thin polymer layer and not an integral part of the circuit, the wafer can be saved and completed by simply removing the misprocessed layer. A good rework process, sometimes referred to as recycle or redo, is required to accomplish this task. For device implantation, etch, or deposited layers, this flexibility is lost. As a result, rework or redo is quite common in fabrication frontends. Unfortunately, Deckert[91] has found that silicon oxide wafer surfaces exhibit random photoresist adhesion variation, which is very much affected by previous chemical treatments, such as those for rework. Of course, these effects can cause defects, and the additional wafer processing required is known to statistically increase defect levels, which in turn decreases circuit electrical yield. Obviously, original and rework processes which prevent adhesion failure and are clean from a surface point of view are very important economically. The best case occurs when the rework process is also the resist removal process, the one which removes the resist following layer patterning completion; then only one process is required. When considering adhesion effects with rework wafers, we must first look at the effect upon substrate surface chemistry of a representative group of resist strip or removal processes. This is done in Table 12. The same wafer parameters as in Table 11 are used. Oxygen plasma and sulphuric acid/peroxide are both oxidizing carbon removal techniques, Carbitol is a commercial mildly alkaline organic solvent stripper, and acetone is simply a representative organic resist solvent stripper. All but the acetone treatment restore the wafer to a state close to the original before prime and coat, but the simple acetone dissolution strip leaves the substrate in the primed, ready-for-recoat state, thus saving a reprime processing step. Importantly, the other processes tend to leave the substrate less clean and with larger CA values than the virgin wafers. These observations are consistent with those reported by Deckert,[91] where greater lifting or poorer adhesion was reported to occur for sulphuric/peroxide reworked wafers. Obviously, reworking wafers is not desirable, but when economically necessary, simple solvent treatments on a particle filtered wafer track, like the acetone treatment, are attractive and can be effective. Rework/reprimed wafer results are found in Table 13. Here, it is seen that all the rework processes return the substrate to a primed condition, but they
are all a little less properly conditioned than the virgin wafer controls. Either the CA or the ESCA data are a little worse than those values for the virgin controls. Most importantly, these processes must all be concluded to provide a more particulated wafer simply due to the increased handling and processing involved, and this most likely will result in reduced circuit yield.

Table 12. Rework Method Wafer Characterization

Rework Process      | ESCA Carboxylic | ESCA Trimethylsilyl | C/SI       | CA
Acetone Dissolution | -               | +                   | NA/0.48    | 70
Oxygen Plasma       | +               | -                   | NA/0.7     | 39
Acetone/Plasma      | +               | -                   | NA/0.8     | 26
Sulphuric/Peroxide  | +               | -                   | 0.6-0.8/NA | 25
Carbitol Strip      | +               | -                   | 0.6-0.8/NA | 48
Virgin Control      | +               | -                   | 0.6/1.0    | 25
Table 13. Rework Reprime Characterization

Rework Process           | ESCA Carboxylic | ESCA Trimethylsilyl | C/SI | CA
Acetone Dissolution Only | NO              | YES                 | 0.48 | 70
Sulphuric/Peroxide/SVG   | NO              | YES                 | 0.48 | 68
Sulphuric/Peroxide/2000  | Very Small      | YES                 | 0.59 | 77
Carbitol/SVG             | Very Small      | YES                 | 0.56 | 77
Carbitol/2000            | NO              | YES                 | 0.60 | 70
VIRGIN/SVG               | NO              | YES                 | 0.43 | 73
VIRGIN/2000              | NO              | YES                 | 0.46 | 78
3.3 Resist Application
Resists are applied by wafer spinning modules either integral to a wafer track system (for example, GCA or SVG) or as individual units like those sold by Headway Research. The resist solutions with an optimally volatile spinning solvent are applied dynamically (i.e., with the wafer spinning at 2–100 rpm) or statically to the wafer using 1–4 cc of resist, spread across the wafer by a low frequency spin (for example, 500 rpm), followed by a high acceleration ramp (for example, 5,000–40,000 rpm/sec) to the final thickness-determining spin speed, usually ~2000–4000 rpm. The film thickness for positive photoresists can usually be approximated by:
$$t = \frac{K\,C^{2}}{\mathrm{SS}^{0.5}} \tag{3}$$
where C is the concentration of solids and SS the spin frequency in rpm. Stein[92] of Hunt Chemical has shown that K increases with the average molecular weight of the resin. The more general form of this empirical equation is:
$$t = \frac{K\,C^{\beta}\,v^{\gamma}}{\mathrm{SS}^{\alpha}} \tag{4}$$
where v is the solution viscosity. Log-log plots of t vs. C, v, and SS provide the constants β, γ, and α empirically from the slopes of the least squares data fits. More typically, technicians in fab lines simply run a curve of t vs. SS and fit the curve with an exponential function as shown in Fig. 58. Due to differences in coating equipment and fab ambient conditions, these curves must always be run, and vendor-provided spin curves used only as rough thickness guides. Resist vendors also adjust resist solutions so that the approximate thickness desired is obtained at roughly 2–4 krpm, because thickness variation is often minimized at mid-range rpm values. Meyerhofer[93] has theoretically treated wafer spinning application of resist and accounted for thickness and drying times considering only centrifugal force, linear shear forces, and uniform evaporation as variables. Middleman[94] has further shown that air flow induced by disc or wafer rotation provides the required shear stress at the liquid/air interface to enhance the rate of resist thinning on the wafer. Of more importance than average film thickness across the wafer is the film thickness variation across the wafer measured radially from the center of the wafer. This is important, because resist image critical dimensions can vary beyond specification limits due to the resist film thickness variation. Film
thickness variation depends upon the spin frequency, and an optimum frequency is usually measurable at a given solution viscosity. Typical thickness variation within wafer and wafer-to-wafer for 1986 vintage wafer track spinners is of the order of 20–100Å and 50–200Å, respectively, which is more than adequate for CD uniformity requirements of most lithography areas. Control charts are typically used (see Fig. 59) to monitor coat processes and if values for monitor wafers fall out of specification, the wafer tracks are shut down for maintenance or engineering adjustment.
Figure 58. Thickness vs. RPM for AZ 5206 prebaked at 125°. The solid curve is an exponential fit of the experimental data.
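As an illustration of how a spin curve can be reduced to constants, the sketch below fits the power-law dependence of thickness on spin speed from Eqs. (3) and (4) at fixed concentration and viscosity, using a least squares line on log-log data. It is a minimal example under stated assumptions: the data are synthetic, generated from Eq. (3) with made-up values of K and C, the function name is hypothetical, and a power-law fit is not necessarily the fit used for Fig. 58.

```python
import math


def fit_spin_curve(speeds_rpm, thicknesses):
    """Fit t = A * SS**(-a) by least squares on log(t) vs. log(SS).

    Returns (A, a), where A lumps together K and the concentration and
    viscosity terms, which are held fixed along one spin curve.
    """
    xs = [math.log(s) for s in speeds_rpm]
    ys = [math.log(t) for t in thicknesses]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope


# Synthetic data from Eq. (3); K and C are hypothetical, not from the text.
K, C = 100.0, 30.0
speeds = [2000, 3000, 4000, 5000, 6000]
thick = [K * C ** 2 / s ** 0.5 for s in speeds]

A, a = fit_spin_curve(speeds, thick)
print(f"prefactor A = {A:.1f}, spin exponent a = {a:.3f}")
```

With real data the recovered exponent will generally differ from 0.5, which is one reason the curve should be run on the actual coater rather than taken from a vendor data sheet.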
Figure 59. Thickness control chart for OFPR-800.
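A thickness control chart of the kind described above can be sketched in a few lines. This is a simplified illustration, not the charting procedure used for Fig. 59: it assumes limits set at the baseline mean plus or minus three sample standard deviations, and all thickness values are hypothetical.

```python
import statistics


def control_limits(baseline):
    """Return (lcl, center, ucl) as mean -/+ 3 sample standard deviations."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return center - 3 * sigma, center, center + 3 * sigma


def out_of_control(values, lcl, ucl):
    """Return (index, value) pairs for monitor wafers outside the limits."""
    return [(i, v) for i, v in enumerate(values) if not lcl <= v <= ucl]


# Hypothetical monitor-wafer thicknesses in angstroms.
baseline = [12010, 11990, 12005, 11995, 12000, 12008, 11992, 12003]
lcl, center, ucl = control_limits(baseline)

new_wafers = [12001, 11998, 12060, 12004]
flagged = out_of_control(new_wafers, lcl, ucl)
print(f"limits: {lcl:.1f} to {ucl:.1f}; flagged: {flagged}")
```

Any flagged monitor wafer would be the trigger for taking the track down for maintenance or engineering adjustment, as described in the text.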
Film nonuniformity, visually observed as radial stripes in the resist called striations, is prevalent when the resist solvent is too volatile, dynamic dispense is employed, wetting additives are omitted, or resist concentrations are high. Striations are easily measured using interferometric or profilometric techniques. Orange peel, another spin problem, occurs due to rapid evaporation of volatile spinning solvents, and cloudy resist films sometimes occur when hygroscopic solvents are employed.[95] Machine variables which have been observed to influence resist coating uniformity and average thickness are exhaust flow-through, the coat module, motor frequency control and acceleration precision, dispenser type, and of course possible interactions which may occur between these and the resist variables already mentioned. Optimizing coating processes is a complex, time-consuming empirical task. It involves screening the many potentially material-dependent variables through statistically designed experiments[96] to reduce or minimize process variability to achieve manufacturing success. This process is well documented[59][97] and is very routine except where substrate topography approaches ≥ 2X the thickness of the resist patterning layer. Methods to overcome these rare coating problems, when they occur, are found in Ref. 98. With standard front end device layers (i.e., those prior to 1st metal) and metal MLM processing layers, resist applications are very routine, and thickness variation can be controlled to ≤ 100 Å, 3-sigma, all families of variation, assuming that good wafer planarity is maintained. Most resist coatings are very conformal because the viscosity of the film is rising quickly during coating, thus freezing in the topography to be planarized due to local surface tension gradients.[99] Achieving greater planarization requires increasing the solvent fraction or increasing the residual solvent in the film,[99] which does not seem very practical.
3.4 Prebake/Exposure/Postbake/Development Processing
For positive resists, the prebake conditions, exposure, and development conditions, are inseparable, and together can critically determine the performance of the overall resist process. For example, sometimes interactions occur between prebake and development variables; a 2-factor interaction between prebake temperature and development method has been observed for Shipley 1400 resist. Due to this interaction, both of these variables would have to be changed in parallel to optimize this resist process. Prebake is almost always a primary variable for positive resists, because their development rates are influenced strongly by residual solvent content and the thermal history of the
film. For negative resists, this is not the case; prebake and developer compositions have less influence over resist contrast, but they can impact resist swelling behavior and hence resolution performance. Since positive resists are the evolutionary choice of most new fab lines and they are most affected by these variable combinations, this section primarily addresses positive resist effects. Prebake temperatures and exposure levels are determined through careful process optimization response surface experiments,[74] usually carried out by the resist manufacturer, and from Bossung curve generation (see CD vs. exposure data for metal and via layers in Sec. 5), respectively. If the user must determine these parameters, see Ch. 5 for a description of the methodologies. The ARC process optimization of Sec. 5 is a very representative model process optimization methodology sequence for any type of photo process. Currently, nearly all high-end processes, i.e., positive resist processes at 1 µm and below, are run with 0.26 N TMAH (2.38% tetramethylammonium hydroxide in water) developer without surfactant wetting agents. Most resists used at this level of processing contain surfactants; therefore, resist wetting at the develop step with the surfactant-free developer is not a problem. Some manufacturers do use surfactant-containing developers, supposedly to prevent puddle-development-induced undeveloped spot defects, but these defects can be prevented through developer prewetting, developer arm program optimization, and exhaust level adjustments. Most positive resist processes in use today require a post-exposure, pre-development bake (PEB) to improve process contrast and to help prevent standing wave interference effects upon the resist image side walls.[74] Adhesion may also be improved, but the greatest improvement comes in the form of improved resist image CD control, which results from the contrast improvement of the PEB process.
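The equivalence between 0.26 N and 2.38% TMAH quoted above can be checked with a short calculation. The sketch assumes that the percentage is by weight and that the dilute solution has a density of about 1.00 g/mL; both are assumptions, as the text states neither. TMAH is a monobasic hydroxide, so its normality equals its molarity.

```python
TMAH_MOLAR_MASS = 91.15  # g/mol for (CH3)4NOH


def tmah_normality(weight_percent, density_g_per_ml=1.00):
    """Convert a TMAH weight percentage to normality (eq/L).

    density_g_per_ml is an assumed value for the dilute aqueous solution.
    """
    grams_per_liter = weight_percent / 100.0 * density_g_per_ml * 1000.0
    return grams_per_liter / TMAH_MOLAR_MASS


print(f"{tmah_normality(2.38):.3f} N")
```

Under these assumptions the result rounds to the 0.26 N figure used in the text.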
Variables which can influence resist image edge wall, critical dimension control, and resist sensitivity are prebake temperature and time, exposure level, developer composition and conditions, rinse composition, and fab ambient. The effect of developer composition upon CD RPL, sensitivity and contrast were demonstrated earlier in this section. All of these parameters can potentially interact, therefore, statistical engineering methods and experimental designs are invaluable in optimizing the overall resist process. Mid-UV Resist Statistical Process Optimization/Characterization Example (circa, 1987). The best way to illustrate a statistically oriented resist process optimization and its efficiency is to provide an example. In this section, a simple OFPR-800 positive photoresist process will be characterized by generating a CD response surface space for the
process. From that response, an optimum operating point for the resist process is obtained. Most importantly, after completing the statistically designed experiment(s), we know how the CD response varies over a much larger set of operating conditions than the single set established as the baseline photoprocess. Since a resist process has many different substeps, it is impractical to evaluate each variable individually. It is also unwise, since the variables may interact. A statistical experimental approach, which investigates many variables simultaneously and can assign quantitative values to variable effects and interactions, is necessary. The objective of this study was to develop and optimize a process for a photoresist (OFPR-800, manufactured by Dynachem) from both the critical dimension transfer and contrast points of view. Two of the statistical experiments, a 7-factor variable screen[100] and a 3-factor Box-Behnken[101] response surface investigation, are described in this example. In addition, the track development process employed is characterized for completeness. Background. The purpose of a variable screen experiment is to determine which of the many independent variables involved in any process step are significant, that is, which variables the engineer needs to control to optimize the overall process performance as determined by the process parameters (dependent variables). A variable screen design is a small fraction of a full 2^k factorial.[100] It is designed only to determine the significant independent variables in relatively few experimental trials or wafers, and is unable to identify any variable interactions. A full 2^k factorial can determine variable effects and interactions, but requires too many experimental runs to be practical for investigating more than four or five variables.
Designs for variable screens are available in several references.[100][101] When employing these fractional screening designs, watch for confounding between main variables and two-factor interactions. After identifying the significant variables for a process from a screen design, the final step is process optimization. This involves determining any independent variable interactions (non-additive responses) or nonlinearity in the dependent variable response curves. Several statistical designs are available for this type of experiment, but unlike the variable screens, they are usually based on a full factorial experiment[101] or a high resolution fractional factorial design. Since fewer variables are investigated, the process optimization experiment yields much more information than a variable screen design in approximately the same number of runs, because the two-factor and higher order interactions are no longer confounded in favor of screening a larger number of primary variables.
Experimental Designs. The resolution III screen design used in this example is given in Fig. 60. The procedure for running a variable screen design is to choose two levels of interest for each variable (designated + and -,although they need not be quantitative values), and then run each sample, in random sequence, through the process determined by the screen design. Seven independent variables were examined: softbake time and temperature, postbake time and temperature, exposure, spin method, and development method. The results were quantitatively evaluated based on one dependent variable: resist image linewidth transfer to the wafer. The main variables screened are confounded with three two-factor interactions, therefore, this design will almost always be followed by a response surface design of higher resolution or by a higher order screening design. The purpose of a process optimization experiment is to study, in more detail, a small number of variables that are known to be significant in their useful range without loss of precision. Based on the results of the variable screen design described in Fig. 60, softbake temperature and exposure were chosen for further investigation. Since this experiment was run with a Perkin-Elmer 544 projection aligner, exposure tool aperture was also chosen as a related process variable to make three total variables. The dependent variables measured were resist critical dimension transfer (sidewall angles: 70–80°) and resist contrast.
Figure 60. Seven factor screen design for searching experimental variables of a photoresist process for significance.
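As an illustration of how such a screen can be laid out and analyzed, the following Python sketch builds an 8-run, two-level design for the seven variables listed above and computes a main effect as a difference of averages. The 8-run size and the column generators (D = AB, E = AC, F = BC, G = ABC) are an assumed textbook construction, not a transcription of Fig. 60, and the assignment of variables to columns is arbitrary. In this construction each main effect is aliased with three two-factor interactions, which is consistent with the confounding noted above.

```python
from itertools import product

# Variable names come from the text; their column order here is arbitrary.
FACTORS = ["softbake_time", "softbake_temp", "postbake_time",
           "postbake_temp", "exposure", "spin_method", "develop_method"]

def screen_design_2_7_4():
    """Return 8 runs of a 2^(7-4) resolution III design as lists of +1/-1.

    Assumed generators: D = AB, E = AC, F = BC, G = ABC.
    """
    runs = []
    for a, b, c in product((-1, 1), repeat=3):   # full 2^3 in the base columns
        runs.append([a, b, c, a * b, a * c, b * c, a * b * c])
    return runs

def main_effect(runs, responses, column):
    """Average response at (+) minus average response at (-) for one column."""
    hi = [y for r, y in zip(runs, responses) if r[column] == 1]
    lo = [y for r, y in zip(runs, responses) if r[column] == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

design = screen_design_2_7_4()
print(" ".join(name[:8].ljust(8) for name in FACTORS))
for run in design:
    print(" ".join(("+" if level == 1 else "-").ljust(8) for level in run))
```

In practice the eight runs would be processed in random order, as the text recommends, and `main_effect` would be applied to the measured linewidths for each column in turn.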
The 3-variable Box-Behnken cube design employed is shown in Fig. 61. The cube is defined by three levels each of three variables; the experimental points are the midpoints of the twelve edges of the cube, which allows a check for response surface curvature. The center point of the cube is replicated three times to provide an estimate of process variability (i.e., the precision of the experiment). The effect of this design is to run a complete 2 × 2 factorial in each pair of variables while holding the third variable at its center point. Since the Perkin-Elmer 544 can expose a single wafer at a number of different exposure levels, this design actually examined five different exposure levels for each run instead of only three. Other process information is given in Table 14. Critical dimension results were evaluated by line and pitch measurements made on the 2.0 µm line of a 4.0 µm pitch structure. Measurements were made on a Leitz MPV-CD system. The precision capability of this tool was determined to be ±0.09 µm (3-sigma), a value less than that required for the process tolerance. Leitz measurements of the 1.5 and 2 µm resist dimensions showed a variation of only 5% or less across the 4-inch wafer. Individual line measurements were calibrated against a sample measured by both the Leitz and a Cambridge SEM. Developer Process Characteristics. The developer process employed was an NMD-3 metal-ion-free spray/puddle process at the fab temperature (70°F) on a model GCA 1006 Wafertrac. The wafer was sprayed for 2 seconds at 100 rpm, followed by a 1 sec static spray to ensure good puddle formation, a static 45 sec development, and a 30 sec wafer rinse. Variable Screen T-test. The data analysis of the experimental design of Fig. 60 is performed for each independent variable by subtracting the average linewidth of those runs for which the variable was at its low level (-) from the average linewidth of those runs for which the variable was at its high level (+).
This result is designated Y+ - Y-. Linewidth measurements were made at the center and edge of each wafer. Theoretically, the result Y+ - Y- measures the effect on linewidth of changing the independent variable from its lower level to its higher level. In the real world, however, each process has a certain amount of variability no matter how carefully the independent variables are controlled. One way to ensure that the results demonstrate a real effect, i.e., one that is significant beyond normal intrinsic process variability, is to apply a t-test to the data. The t-test is used to compare independent sample averages from two populations (in this case, the (+) and (-) levels of the independent variable) to determine whether the difference between them is statistically significant.[102] It works by comparing the experimental result (Y+ - Y-) to that which should have
Figure 61. Three variable Box-Behnken experimental design for probing an experimental response surface.
Table 14. Experimental Conditions

Coat:         GCA Wafer Trac; 1 µm OFPR-800; hotplate bake, 45 sec, temperature variable
Exposure:     PE 544; exposure time and aperture variable
Develop:      GCA Wafer Trac
Measurement:  Leitz MPV-CD; CD measured was the 2.0 µm line on a 4.0 µm pitch structure
occurred based upon variability results from either both populations of data or control wafers assuming a t-distribution, which is a small sample approximation of the normal (Gaussian) distribution. The greater the result is from the control wafer population variability, the more likely the result was caused by the independent variable instead of random process variation due to lack of experimental precision. If we choose a given “confidence level” (probability that our conclusion is correct) needed, we can calculate the t-statistic from our results and compare it to the t-table entry for the appropriate number of degrees of freedom and the level of confidence. If the test statistic is greater than the corresponding table entry, then we can conclude that the independent variable is significant (that is, it must be carefully monitored to assure process control) with a corresponding level of confidence. The test statistic for the t-test is given by:
Eq. (5)
$$ \mathrm{T.S.} = \frac{Y_{+} - Y_{-}}{S_p \left( \dfrac{1}{n_{+}} + \dfrac{1}{n_{-}} \right)^{1/2}} $$
where Y+ and Y - are as defined before, n+ and n- are the number of samples at each level, and Sp is the pooled variation calculated from:
Eq. (6)
$$ S_p = \left[ \frac{\sum_i (n_i - 1)\, S_i^{2}}{\sum_i (n_i - 1)} \right]^{1/2} $$
where ni is the number of replicates at each experimental condition and Si is the cell variation between the replicate observed dependent values. Results and Analysis. The results from the variable screen are given in Table 15. Prebake temperature, develop method, postbake temperature, and exposure are all clearly significant in the range studied. Spin method (conventional or special) is also significant. The "special" spin was a method developed to enable easier target detection by the Ultratech stepper alignment system by removing the spread cycle at low rpm and by shortening the final spin dry cycle at the final rpm. Unfortunately, it also resulted in large resist thickness variation across the wafer, which led to the large CD variations illustrated by the values in the table. "Special" spin resist coating has been discontinued for this reason and is not recommended. Critical dimension vs. exposure results for the 3-factor Box-Behnken design at different softbake temperatures are graphed in Fig. 62. Figure 62 shows two possible operating points for the process that will result in a critical dimension within specification: a 75°C softbake with approximately 75 mJ/cm² exposure, or a 90°C softbake with approximately 96 mJ/cm² exposure. The graph also suggests that some variable interaction is occurring between 54–75 and 96–116 mJ/cm², since the three lines are less parallel over those regions. Aperture had a negligible effect on critical dimension within the range of interest (2.0 µm ± 10%).
Table 15. Seven-Factor Screen Design Results

                    EFFECTS (Y+ - Y-)
FACTOR              CENTER ISLAND IMAGE CD    EDGE ISLAND IMAGE CD
SPIN METHOD         -0.29 µm                  +0.11 µm
PREBAKE TEMP.       -0.11                     -0.18
PREBAKE TIME        +0.0                      +0.09
DEVELOP METHOD      -0.30                     -0.30
POSTBAKE TEMP.      -0.53                     -0.35
POSTBAKE TIME       +0.03                     -0.05
EXPOSURE            +0.15                     +0.21

CONFIDENCE LEVEL (T-STATISTICS): 0.11 at 90%
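The significance test of Eqs. (5) and (6) can be sketched in a few lines of Python. The linewidth values below are hypothetical, chosen only to exercise the arithmetic, and are not data from this study. The absolute value of the difference is taken so that negative effects can be compared against the same table entry, and the critical value quoted is the standard two-sided 90% t-table entry for 6 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def pooled_sd(cells):
    """Eq. (6): pooled standard deviation over cells of replicate values."""
    numerator = sum((len(c) - 1) * variance(c) for c in cells)
    denominator = sum(len(c) - 1 for c in cells)
    return sqrt(numerator / denominator)

def t_statistic(high, low):
    """Eq. (5), using |Y+ - Y-| so the sign of the effect does not matter."""
    sp = pooled_sd([high, low])
    return abs(mean(high) - mean(low)) / (sp * sqrt(1 / len(high) + 1 / len(low)))

# Hypothetical linewidths (um): four wafers at each level of one variable.
cd_high = [2.10, 2.14, 2.08, 2.12]
cd_low = [1.95, 1.99, 1.93, 1.97]
T_CRIT_90 = 1.943   # t-table, two-sided 90%, n+ + n- - 2 = 6 degrees of freedom

ts = t_statistic(cd_high, cd_low)
print(f"effect = {mean(cd_high) - mean(cd_low):+.3f} um, T.S. = {ts:.2f}")
print("significant" if ts > T_CRIT_90 else "not significant", "at 90% confidence")
```

With these made-up numbers the test statistic is well above the table entry, so the variable would be judged significant and would need to be monitored.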
Figure 62. Critical dimension vs. exposure for OFPR-800 photoresist process.
In order to evaluate the experimental results quantitatively, a Yates[103] analysis was performed on the data. The Yates algorithm is a method of taking advantage of the "hidden replication" found in factorial experiments: the average result (critical dimension) of the points at which a certain factor was at its low value is subtracted from the average result of the points at which the factor was at its high value. The result is a more accurate estimate of the actual effect of that factor than would be available from only one measurement at the high and low value of each factor. The Yates algorithm is used to check for independent variable significance as well as variable interactions. A Yates analysis can be performed for two levels of the variables only. Based on the graph shown in Fig. 62, those levels closest to our expected working range were chosen. The results are shown in Table 15. The results confirm those inferred from the graph: exposure and softbake are clearly significant variables over the range studied, but aperture has a much smaller effect. All of the variable interactions are small enough to be safely ignored. Iso-CD response surfaces have been drawn in three dimensions in Fig. 63. Roughly speaking, an optimal working point would be near the center of the cube, a result that was not intentionally designed to occur. Note also that the exposure latitude for critical image control falls off rapidly at a 75°C prebake, which would make operating at that prebake temperature risky; that is, if exposure unexpectedly changed a little, the chance of CD failure would be great.
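For readers who want to reproduce a Yates analysis by computer, a minimal Python sketch follows. It assumes a two-level, three-factor subset with the responses listed in standard order; the CD values are generated from an assumed additive model (A = exposure, B = softbake, C = aperture) purely for illustration and are not the measurements of this study.

```python
def yates(responses):
    """Yates algorithm for a 2^k factorial with responses in standard order.

    Standard order for factors A, B, C is (1), a, b, ab, c, ac, bc, abc.
    Returns the grand mean followed by the effect estimates in that order.
    """
    n = len(responses)
    k = n.bit_length() - 1
    if n != 2 ** k:
        raise ValueError("need 2^k responses")
    column = list(responses)
    for _ in range(k):
        sums = [column[i] + column[i + 1] for i in range(0, n, 2)]
        diffs = [column[i + 1] - column[i] for i in range(0, n, 2)]
        column = sums + diffs
    return [column[0] / n] + [c / (n / 2) for c in column[1:]]

def coded_runs(k):
    """Yield factor levels (-1/+1) in standard order, first factor fastest."""
    for index in range(2 ** k):
        yield [1 if (index >> bit) & 1 else -1 for bit in range(k)]

# Hypothetical CD model (um), used only to generate example responses.
responses = [2.0 + 0.10 * a - 0.08 * b + 0.01 * c for a, b, c in coded_runs(3)]
labels = ["mean", "A", "B", "AB", "C", "AC", "BC", "ABC"]
effects = dict(zip(labels, yates(responses)))
for name, value in effects.items():
    print(f"{name:>4}: {value:+.3f}")
```

Each effect is the average response at the high level minus the average at the low level, so an additive model with coefficient 0.10 on a coded factor shows up as an effect of 0.20.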
Figure 63. Iso-CD response surface for OFPR-800, generated using a Box-Behnken design.
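The run list behind a three-variable Box-Behnken response surface such as this one can be enumerated as follows. This is a sketch in coded units only (-1, 0, +1); the variable order (softbake temperature, exposure, aperture) is an assumption, and the physical settings for each level would come from the actual experiment.

```python
from itertools import combinations, product

def box_behnken_3(n_center=3):
    """Coded runs of a three-factor Box-Behnken design.

    A 2 x 2 factorial is run on each pair of factors with the third factor
    held at its center level (the twelve cube-edge midpoints), followed by
    n_center replicated center points.
    """
    runs = []
    for i, j in combinations(range(3), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0, 0, 0]
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0, 0, 0)] * n_center)
    return runs

design = box_behnken_3()
print("softbake exposure aperture   (assumed column order)")
for run in design:
    print(" ".join(f"{level:+d}".rjust(8) for level in run))
print(len(design), "runs")
```

The twelve edge midpoints plus three center replicates give fifteen runs, and the spread among the center replicates provides the estimate of experimental precision.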
The second objective of this investigation was to optimize resist contrast, where the greatest resolution is possible. Since resist contrast also correlates with resist image sidewall angle, it is an important resist processing variable, especially as linewidths decrease. Contrast is determined by measuring remaining resist thickness as a function of exposure for underdeveloped wafers. Hence, exposure cannot be used as a variable in a contrast optimization experiment. Contrast was measured as a function of aperture and softbake, and the results are graphed in Fig. 64. Aperture had little effect on contrast, but contrast was clearly greatest at a 75°C softbake. Unfortunately, unexposed resist thickness loss was unacceptably high (i.e., 10%) with a 75°C softbake. Some loss of process latitude (slope of line) can also be detected in the critical dimension vs. exposure graph at the 75°C softbake compared to the 90°C softbake in the exposure range needed for correct image size transfer. Therefore, it was decided to use a 90°C softbake, accepting an approximately 10% loss of contrast performance as the trade-off. Conclusions. These general experimental designs enabled near optimum performance parameters to be chosen for the model photoresist process studied before Statistical Process Control (SPC) methods were implemented to monitor process performance. As a result of this careful
process characterization, this process was successfully employed to fabricate CMOS test devices in SPC control over an extended period of time before process replacement with a higher contrast process. The variable screen and Box-Behnken response surface designs allowed a great deal of knowledge to be obtained in actually very few trials. The experimental designs employed were extremely efficient in terms of processing time, wafers used, measurement and analysis time, and with built-in design replication to provide experimental precision.
Figure 64. OFPR-800 process contrast vs. softbake temperature from Box-Behnken response surface design.
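Contrast values such as those plotted in Fig. 64 are obtained from remaining-thickness vs. exposure data. The sketch below uses the conventional definition for a positive resist, the magnitude of the slope of normalized remaining thickness plotted against log10(exposure) over the linear part of the curve. The 10% to 90% thickness window used to select that linear part and the dose values are assumptions made for illustration; the data are synthetic and generated from a known contrast so the fit can be checked.

```python
from math import log10

def contrast(doses, norm_thickness, lo=0.1, hi=0.9):
    """Positive-resist contrast from a remaining-thickness curve.

    Least-squares slope of normalized thickness vs. log10(dose), fitted only
    to points with lo <= thickness <= hi (assumed linear region); the
    magnitude of that slope is returned.
    """
    pts = [(log10(d), t) for d, t in zip(doses, norm_thickness) if lo <= t <= hi]
    if len(pts) < 2:
        raise ValueError("need at least two points in the linear region")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return abs(sxy / sxx)

# Synthetic curve: contrast 2.0, dose-to-clear 100 mJ/cm^2 (both hypothetical).
GAMMA_TRUE, E_CLEAR = 2.0, 100.0
doses = [20, 40, 50, 63, 80, 100, 120]
thickness = [min(1.0, max(0.0, GAMMA_TRUE * log10(E_CLEAR / d))) for d in doses]

print(f"fitted contrast = {contrast(doses, thickness):.2f}")
```

Because exposure is the abscissa of this measurement, it cannot also be varied as an experimental factor, which is the point made in the text above.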
In the example above, the surface analysis was accomplished graphically, and no computer-aided analysis was employed. If computer facilities are available, regression programs are widely available and are useful in plotting the data graphically to achieve a compromise optimum set of process operating conditions. Other experimental designs, with their own advantages and disadvantages, could have been employed such as the central composite design[100][104] (see also Ch. 5). All of the designs, however, are capable of providing the tools for successful and efficient process development to the process engineer on the fab line.
Philosophy for CD Control for I-line Resists and Beyond. To achieve good CD control or high Cpk for device critical levels, like the gate level for a CMOS device, one must optimize the resist process. Our methodological approach is shown graphically in Fig. 65. In progressing through steps 1–5, you define the CD variation problem (step 1), find the significant variables (steps 2 and 3), find the optimum CD operating point (step 4), and then prove that the CD variability has been reduced (step 5). If step 5 does not provide evidence of reduced CD variability, then you have to cycle back through steps 2–5. The first thing to do with a vendor-transferred resist process for a given layer is to run the step 1 multivari study (see Sec. 9.5 of Ch. 5). The best way to run this is with product material, because it is the real thing. After calculating the families of variation as discussed earlier, you have characterized or defined your CD problem, i.e., you have found the family of variation dominating the total variation for CDs. If the overall CD variation is low and the families of variation are equally distributed, then you most likely have a stable, optimized process. If, on the other hand, one family, e.g., within wafer, is much larger than the other families, then there is a problem. Reducing the variation of that dominating family usually leads to a reduction of the total CD variation. The next experiments to run are the fractional statistically designed experiments (DOEs) shown to the left of Fig. 65 (step 2). These experiments possess different levels of statistical resolution, and sometimes allow you to reduce the number of variables affecting, in this example, the within-wafer effect. After reducing the number of variables with these factorial designs, the next thing to do is the high resolution response surface design in the middle of the figure.
An example of this design usage is given in the next section below. After running that design, the optimized process affecting within wafer variability in the example should be known. The last step is to run the multivari again to verify that you’ve achieved the reduced variability you sought. If the families are not equal and the overall variability of your response isn’t achieved, then you have to return through the whole cycle again, because you’ve missed the main variable or interactions affecting the variability you’re trying to reduce for a manufacturing application. I-line Resist Process Optimization Example.[19][105] In this section, an example of a resist process optimization using statistically designed experiments is demonstrated. Although most resist processes sold commercially now are statistically optimized, when this system was obtained in 1988, it was not optimized, and the users completed this example process optimization.
Figure 65. Methodology for process variability definition (step 1) and reduction (step 4), and variability reduction confirmation (step 5).
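The step 1 multivari study amounts to splitting the observed CD variation into families and seeing which one dominates. The Python sketch below shows one simple descriptive way of doing that for lot, wafer, and site data; it is not the formal treatment of Sec. 9.5 of Ch. 5, the family estimates are not corrected for the nesting of one family inside another, and the CD values are hypothetical.

```python
from statistics import mean, variance

def families_of_variation(cd):
    """Split CD variation into three families.

    cd maps lot -> wafer -> list of site CDs (um). Returns a variance for
    each family: within wafer, wafer to wafer (within lot), and lot to lot.
    """
    wafer_means = {lot: [mean(sites) for sites in wafers.values()]
                   for lot, wafers in cd.items()}
    within_wafer = mean([variance(sites)
                         for wafers in cd.values() for sites in wafers.values()])
    wafer_to_wafer = mean([variance(means) for means in wafer_means.values()])
    lot_to_lot = variance([mean(means) for means in wafer_means.values()])
    return {"within wafer": within_wafer,
            "wafer to wafer": wafer_to_wafer,
            "lot to lot": lot_to_lot}

# Hypothetical gate CDs (um): 2 lots x 2 wafers x 3 sites.
cd_data = {
    "lot A": {"w1": [1.00, 1.06, 0.94], "w2": [1.01, 1.07, 0.95]},
    "lot B": {"w1": [1.00, 1.06, 0.94], "w2": [0.99, 1.05, 0.93]},
}

families = families_of_variation(cd_data)
dominant = max(families, key=families.get)
for name, value in families.items():
    print(f"{name:>15}: variance = {value:.6f}")
print("dominant family:", dominant)
```

In this made-up data set the within-wafer family dominates, so the follow-on designed experiments of steps 2 to 4 would concentrate on variables that act across the wafer.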
The designs most commonly used are the Central Composite and the Box-Behnken designs.[101] An example 5-level, 3-variable Central Composite design is shown in Fig. 66; also shown is a non-linear measured response Y vs. a controlled variable A, to show why more than two levels must be measured or probed in order to avoid assuming linearity of a response that is actually non-linear. The exact five-level, five-factor design of this example is shown in Fig. 67. The measured variables are at the bottom of the figure. The most important measured value is the contrast value, because CD control and resist performance depend inversely upon this value. For a stable manufacturing process, we want the value to be reasonably high, stable, and optimized (i.e., changing slowly with control variable changes), thus leading to good CD control and high production resist CD Cpk values. Contrast plots for the five control variables are found in Figs. 68a–e. The variables affecting contrast or CD control from the SAS software data analysis were PEB temperature and time, developer time, and prebake temperature and time. Notice the figure for prebake temperature: without doing a statistical analysis with SAS, you can see that the best operating temperature is about 110°C. The slope of the contrast vs. temperature curve steepens at both higher and lower temperatures. By using this simple 2-D graphical analysis, the optimal values for System 9 i-line resist are 110°C and 50 sec, 120°C and 80 sec, and 60 sec for the prebake temperature and time, the PEB temperature and time, and the develop time, respectively. These values are all relatively stable operating points, i.e., they are all on the lowest slope regions of the contrast vs. control variable 2-D curves.
SAS RSM analysis yielded numbers very close to those obtained graphically, and this resist was run in production for roughly two years before it was replaced by a next-generation single-layer resist. SAS analysis uncovered (i) a two-factor interaction between prebake time and temperature, (ii) that PEB time was important, and (iii) that development time was also a factor affecting contrast. Results for optimized contrast and multivari CD control studies are in Table 16. Note that the example optimized single-layer Aspect (now Shipley) System 9 process, run with optimized values at two swing curve thicknesses, yielded superior values to the older generation 820 resist and equivalent values to the lithographic trilayer resist process 388 CEM/PVA/820 stack, an expensive and high defectivity resist system. Production lithography will always choose the less expensive single-layer resist system over the expensive-to-run multilayer system, and that was the decision here as well; the optimized single-layer system also had lower defectivity.
Figure 66. Central Composite Design RSM from Ref. 61; note, it is composed by adding the 3-factor full factorial cube and the axial point designs to the left of the figure.
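The construction described in the caption, a full factorial cube plus axial points, can be written out directly. The sketch below generates a three-variable Central Composite design in coded units. The axial distance is set to the rotatable value (2^k)^(1/4), which is an assumption made here for illustration; the source does not state the axial distance used in Fig. 66 or in the five-factor design of Fig. 67.

```python
from itertools import product

def central_composite(n_factors=3, n_center=1):
    """Coded runs of a central composite design: cube + axial + center."""
    alpha = (2 ** n_factors) ** 0.25   # rotatable axial distance (assumed)
    cube = [tuple(float(v) for v in point)
            for point in product((-1, 1), repeat=n_factors)]
    axial = []
    for i in range(n_factors):
        for sign in (-1, 1):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(tuple(point))
    center = [(0.0,) * n_factors] * n_center
    return cube + axial + center

design = central_composite()
levels = sorted({run[0] for run in design})
for run in design:
    print(" ".join(f"{level:+.3f}" for level in run))
print("levels probed per variable:", [round(level, 3) for level in levels])
```

Each variable is probed at five levels (the two axial points, the two cube levels, and the center), which is what allows curvature in a response like the Y vs. A example to be detected.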
Figure 67. Actual 5-factor RSM design for the example. Measured variables: unexposed resist loss, contrast, and exposure time to clear.
Figure 68. (a) Contrast vs. Prebake T, (b) Contrast vs. Prebake Time, (c) Contrast vs. Post exposure bake (PEB) Time, (d) Contrast vs. PEB T, and (e) Contrast vs. Develop Time.
Table 16. Table of Performance Parameters

PROCESS                            CONTRAST    CD CONTROL (3-SIGMA, MICRON)
388 CEM/K 820                      3.4         0.11
K 820 SINGLE LAYER                 1.2         ≥0.25
388 CEM/ASPECT SYS 9               5+          0.11-0.17
ASPECT SYS 9 (LO-HI) CONTRAST      3/5         0.22-0.11
SPC Methods of Process Control. After process optimization, whether for a resist lithographic process, a coat process, or any other process, the process must be monitored to ensure it is operating within the specification limits, which are usually dictated by the device design rules. These methods have been well documented.[106] Two types of examples are provided in Figs. 59, 69, and 70. Figure 59 is a wafer resist coat chart and the other two are CD control charts. In the figures, the spec limits are included for comparison. It must be realized that just employing SPC charting methods does not improve the baseline processes; improvement comes through careful process optimization, as shown in the examples above. The SPC charting methods just provide the data recording format for monitoring the process, and are not able in themselves to influence process quality or stability. Stepper SPC Methods of Process Control. The photo equipment of any fab facility must be maintained under SPC methods and controlled as for resist coating and other applications in Sec. 3. (See Ch. 5.) Examples of stepper SPC reporting are included in Fig. 71. The example fab here is the same as for the resist SPC example; it runs a half fractional factorial every day to monitor the equipment in every combination and verify machine qualification for that day of production. A system like this one is also employed by IBM.[107] This system, called the daily matrix, has detected problems early, before costly rework or scrap, for every system monitored. Example control charts for this system, which depends upon observed Eo values, are located in Fig. 71.
Figure 69. Two micron nominal CD control chart for OFPR-800 process using a Perkin-Elmer 544 projection printer (UV-4) as the lithographic tool.
Figure 70. One and a half micron nominal CD control chart for OFPR-800 process using a Perkin-Elmer 544 projection printer (UV-4) as the lithographic tool.
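Control limits for charts of this kind are computed from the process data themselves and are separate from the spec limits. As a minimal sketch, the Python below sets up an individuals chart, taking the center line from the baseline average and 3-sigma limits from the average moving range (the factor 2.66 is the standard 3/d2 with d2 = 1.128 for moving ranges of two points). The choice of an individuals chart and all CD values are assumptions for illustration; the source does not state which chart type was used for Figs. 69 and 70.

```python
from statistics import mean

def individuals_limits(values):
    """Return (LCL, center line, UCL) for an individuals control chart."""
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width

def out_of_control(values, lcl, ucl):
    """Indices of points falling outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical baseline CDs (um) for a 2.0 um nominal line.
baseline = [2.00, 2.02, 1.98, 2.01, 1.99, 2.00, 2.03, 1.97, 2.00, 2.00]
lcl, center, ucl = individuals_limits(baseline)
print(f"LCL = {lcl:.3f}, center = {center:.3f}, UCL = {ucl:.3f}")

# Hypothetical new production readings checked against those limits.
new_points = [2.01, 1.99, 2.09, 2.00]
flagged = out_of_control(new_points, lcl, ucl)
print("out-of-control points at indices:", flagged)
```

A point can be out of control while still inside the spec limits, which is why the charts carry both sets of lines.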
Figure 71. Daily Photolithography Matrix Eo functional test data used to test illumination levels for a bank of 5x i-line steppers.
Other daily checks for the steppers include simple wafer functional visual checks for lens contamination and stepper wafer chuck contamination. Figures 72 and 73 show the effects of wafer chuck particle fall-on and backside contamination, and Fig. 31 shows a repeating defect from lens fall-on contamination. Figure 72 is a snake pattern with 2 micron pitch, while in Fig. 31 the open frame of the lens is employed for the entire exposure print field. If these problems are observed, the equipment is taken down, and either the chuck is cleaned or vendor service is called in to remove the dirt from the objective lens, as appropriate. SPC Methods of Process Control. Standard SPC methods as outlined in Ch. 5 are necessary for MLM lithography or for any type of lithography, for that matter. There are many statistical tools available to control a process so that the product is made with high quality. Quality is conformance to target performance levels, and statistical process control uses methods that reduce process variation so that continuous improvement is possible. For a complete mathematical treatment of these methods, see Sec. 9.5 of Ch. 5.
Figure 72. Optical micrograph of a wafer with an out of focus condition for a 2 µm pitch snake pattern resulting from stepper wafer chuck contamination under the out of focus die. (See bottom right fields.)
Figure 73. Wafer maps illustrating in the bottom figure an out of focus condition caused by resist buildup on the stepper chuck just near and all the way around the wafer edge from wafers with poor backside edge bead removal. Note, in the top figure the chuck has been cleaned and the wafer map is flat, even at the edges. The test called “chuckit” employs a flat Si wafer and is a standard daily or weekly stepper check.
MLM Resist Process SPC. For resist application, the thickness control is dictated by the swing curve as shown in Sec. 3.1. From the Fig. 27 example of that section, if resist thickness varies over ±500 Å, the full range of fluctuation in CD is possible. To minimize this contributor to the CD control budget, the resist thickness is usually controlled to ±50–100 Å as measured by a Prometrix thin film analyzer. The Cp and Cpk values typical for a developmental device fabrication pilot line are shown in Fig. 74. Notice that the specification is ±100 Å and that, for advanced pilot lines, Cp and Cpk values ranging between 1.0 and 1.5 are very typical. The one process shown in Fig. 74 with a value below that range is an older resist process where lower values are acceptable; i.e., that layer is not a critical layer and does not control the process variability for the actual primary process involved. The other data, however, are for resist layers used at device critical layers; therefore, tight thickness control is appropriate for them and is achieved. In addition to the daily resist thickness qualifications above, the resist application wafer tracks and all of the possible combinations of equipment are also qualified daily for functional Eo as defined in Sec. 2.2. This additional Eo functional test provides a redundant check for thickness and also a functional photospeed check, which is not guaranteed by a resist thickness check alone. The Eo matrix for all lithographic tools and their combinations is a half fractional factorial design, so that if any tool is out of specification, the out-of-spec condition is detected and corrected in real time before the day's production. Example data is found in Fig. 75.
Examples of out-of-control conditions detected by this system are resist past its shelf life, thickness drifts out of spec, batch-to-batch resist photospeed or sensitivity shifts, developer CO2 absorption and strength changes, developer tanks filled with the wrong developer, and the wrong resist placed on a track dispense pump. Resist Postbake and Removal. After the resist images have been developed, it is necessary to remove any residual developer solvents to help prevent image flow during post-lithographic processing steps and to promote adhesion if wet etching steps follow. This thermal treatment is preferably carried out at higher temperatures, as long as image flow is avoided. In fact, deep UV treatments (e.g., Fusion Systems)[108] and other plasma[109] and chemical treatments[110] have been reported for improving the post-development process compatibility of the resist. For example, the higher the postbake temperature, the more resistant the resist is to image flow and reticulation in RIE environments. Deep UV treatments and the other processes usually allow higher postbake temperatures and hence improved post-development process compatibility.
Figure 74. Resist Cp and Cpk values for several example pilot line coat processes.
Figure 75. Daily photolithography matrix results for SVG coat and develop track examples.
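The Cp and Cpk indices reported for the coat processes follow from the spec limits and the measured mean and standard deviation of resist thickness. The sketch below uses the standard definitions; the ±100 Å specification comes from the text, while the 10,000 Å target and the thickness readings are hypothetical.

```python
from statistics import mean, stdev

def cp_cpk(values, lsl, usl):
    """Process capability indices from sample data and spec limits."""
    mu, sigma = mean(values), stdev(values)
    cp = (usl - lsl) / (6 * sigma)               # potential capability
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)  # penalizes an off-center mean
    return cp, cpk

# Hypothetical resist thickness readings (angstroms), target 10,000 +/- 100.
thickness = [10020, 10000, 10040, 9980, 10030, 9990, 10050, 9970, 10010, 10010]
cp, cpk = cp_cpk(thickness, lsl=9900, usl=10100)
print(f"mean = {mean(thickness):.0f} A, sigma = {stdev(thickness):.1f} A")
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

With these made-up readings the mean sits 10 Å above target, so Cpk comes out lower than Cp; both fall in the 1.0 to 1.5 range the text describes as typical for advanced pilot lines.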
Finally, the resist must be quantitatively removed from the substrate after each IC processing step; it does not become an integral part of the vertically layered fabricated device as do the patterned dielectric and metal layers. It functions only as the circuit-specific patterning vehicle for layers which are not directly patternable themselves, and after that processing it must be easily removed. This stripping can be accomplished by a variety of wet and dry oxidizing processes such as chromic-sulfuric acid mixtures, hydrogen peroxide mixtures, organic strippers, and oxygen-rich plasma systems.[111] (Also see Table 12.) At first glance, this final process appears rather unimportant. However, recent results have been reported in which plasma oxidizing removal techniques may deleteriously affect device performance, and many fab lines are discontinuing their use. One must therefore be careful to investigate the effects of these stripping processes upon device performance and upon surface contamination carried into further processing. Most modern fabs have moved away from oxygen plasma stripping in favor of downstream activated oxygen systems.[112]

4.0 LITHOGRAPHIC PROCESSING EQUIPMENT

4.1 Wafer Processes and Equipment (Wafer Tracks)
Wafer processing equipment and processes have progressed from food blenders with vacuum hold-downs for coating resist, and developer-filled beakers for development, to sophisticated $1–2 million robotically controlled cluster tools. Because nearly 50% of the defects observed in processing areas are large, yield-killing, fall-on particles from fab traffic and similar sources, wafer units are now all totally enclosed, as shown for the FSI Polaris 2500 track process unit in Fig. 76. This also allows control of the wafer processing ambient, another important first order consideration for process control, especially for DUV resist processing. Because of the high cost of fab space construction, track manufacturers such as FSI and TEL have also minimized tool footprints by stacking bake and even process modules, as shown in cross section in Fig. 77. Of course, this was all necessary to achieve greater process control so that even denser and faster VLSI/ULSI memory and logic devices could be made in reasonable yields (see Ch. 1). Modern wafer tracks are also capable of processing multiple lots simultaneously, in both cascaded and parallel modes. Most modern tracks have up to four indexers, where separate lots can be loaded and batched where possible for maximum track throughput utilization. This allows these systems to run at levels from 30–100 wafers per hour, which is required for them to
keep up with modern photo exposure tools, such as ASML, SVGL, Nikon and Canon steppers and S&S tools. In the sections to follow, we will break the lithographic processes, or track module processes, down into their individual components and highlight the most important principles and effects, drawing on the author’s experience spanning three decades. The order follows the sequence normally observed when the overall process flow is mapped.
Figure 76. FSI Polaris 2500 wafer track. Note that doors must be opened to access the internal modules; this enclosure prevents fab-traffic particle contamination of the wafers being processed.
Figure 77. Track cross section illustrating module stacking to minimize tool footprint.
Cooling Plate. Cooling plates have numerous uses in resist coat and develop wafer tracks. They quickly lower the temperature of a wafer following any thermal operation, or simply establish a desired wafer temperature for an intended process. A typical wafer track resist coat process flow uses a heat cycle, such as a dehydration or vacuum bake, or a high-temperature HMDS vapor prime operation, to prepare the wafer’s surface to receive resist. The cooling plate makes the wafer temperature predictable following any of these thermal events. This is important to the resist coat process. As will be shown later in the chapter, temperature affects resist film uniformity. For now, it is enough to say that the cooling plate temperature has the greatest effect on resist film uniformity, apart from the temperature differential across the wafer chuck from the center towards the edge of the wafer. Another common cooling plate application occurs after a post-coat bake cycle and prior to the wafer returning to the indexer cassette. Should the wafer temperature be higher than the cassette’s thermal rating (for example, a typical “blue boat” will melt at 125°C, according to personal correspondence with Karl Martin of Empak, the maker of the boats), the cassette could be damaged and the wafer contaminated. Additionally, heat could transfer from a wafer loaded into the cassette to another wafer already there. When troubleshooting a resist thickness or critical dimension (CD) statistical process control (SPC) violation, whether for uniformity or mean shifts, the investigation should include the cooling plates. An example of a catastrophic failure would be the cooling operation being skipped because of an error in the wafer flow, or the cooling loop becoming plugged and failing to circulate water through the plate. In that case, the thickness at the center of the wafer will be disproportionately higher than over the rest of the wafer, because the resist dries too quickly during the dispense cycle.
However, in most cases, the cool plate effects are subtle and predominantly affect the edge of the wafer, as seen in Fig. 78 (cooling plate set at 25°C). Here the edge of the wafer is warmer than its center, which has promoted solvent evaporation and increased the thickness non-uniformly. This problem is corrected by lowering the cool plate temperature set point. The same effect occurs with CDs if cooling is not done prior to dispensing developer. With the wafer still warm from the post-exposure bake, the development rate of the image slows, which in turn can increase the variation of development or even prevent the image from developing out completely. These examples are extreme cases. What is actually seen in a factory environment will likely be subtler.
Figure 78. KLA wafer resist thickness map illustrating the effect of cool plate temperature.
Resist Coat Module. A resist coat recipe begins with the application of a resist fluid and ends with a clean up of the wafer’s edge and backside. Referring to Table 17, our generic resist coat recipe, we see the recipe is initializing hardware in the first step. The motor starts spinning at the dispense speed and the dispense arm moves to the dispense position. If canisters require pressurization or a pump needs a pre-dispense signal, this step is when that happens. Notice the arm mode is at wait—this tells the track module not to move to the next step until the arm is in position.
Table 17. Generic Resist Coat Recipe from TEL MK7
Step  Time (sec)  Speed (rpm)  Accel (krpm/s)  Disp     Disp rate  Arm 1 Pos  Arm 1 Mode  Arm 2 Pos
1     1.0         1800         10              0        1          Ctr        wait        Home
2     5.2         1800         10              1        1          C          Non-w       H
3     2.0         1800         10              0        1          Home       N           H
4     25.0        3000         40              0        1          H          N           H
5     2.0         2000         10              0        1          H          N           In 1
6     2.0         2000         10              6        1          H          N           In 1
7     7.0         2000         10              5, 6, 8  1          H          N           In 2
8     2.0         2000         10              6        1          H          N           In 1
9     7.0         2000         10              0        1          H          N           H
10    2.0         0            10              0        1          H          N           H
The second step is the resist dispense step. The time of the step is determined by the pumping time required to deliver a specified volume of chemical to the wafer. The third step is an extension of step two that allows the resist to get to the edge of the wafer before ramping up to the final spin speed in step four, and the dispense arm is sent to its home position after completion.
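The recipe of Table 17 can also be written down as plain data, which makes quantities such as the total module time easy to check. The following is a minimal sketch only: the class and function names are hypothetical, the arm columns are omitted, a Disp value of 0 is read as "no dispense", and the throughput figure ignores wafer handling time, which is an assumption rather than a property of the TEL MK7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoatStep:
    step: int
    time_s: float       # step duration in seconds
    speed_rpm: int      # spin speed during the step
    accel_krpm_s: int   # ramp rate in krpm per second
    dispense: tuple     # active dispense channel numbers; empty means none


# Values transcribed from Table 17 (arm position/mode columns left out).
RECIPE = [
    CoatStep(1, 1.0, 1800, 10, ()),
    CoatStep(2, 5.2, 1800, 10, (1,)),
    CoatStep(3, 2.0, 1800, 10, ()),
    CoatStep(4, 25.0, 3000, 40, ()),
    CoatStep(5, 2.0, 2000, 10, ()),
    CoatStep(6, 2.0, 2000, 10, (6,)),
    CoatStep(7, 7.0, 2000, 10, (5, 6, 8)),
    CoatStep(8, 2.0, 2000, 10, (6,)),
    CoatStep(9, 7.0, 2000, 10, ()),
    CoatStep(10, 2.0, 0, 10, ()),
]


def total_time_s(recipe):
    """Sum of all step times, in seconds."""
    return sum(s.time_s for s in recipe)


def dispensing_steps(recipe):
    """Step numbers during which any chemical is dispensed."""
    return [s.step for s in recipe if s.dispense]


def max_wafers_per_hour(recipe):
    """Upper bound for a single coat module, ignoring wafer handling time."""
    return 3600.0 / total_time_s(recipe)


if __name__ == "__main__":
    print(f"total recipe time: {total_time_s(RECIPE):.1f} s")
    print(f"dispensing steps:  {dispensing_steps(RECIPE)}")
    print(f"upper bound:       {max_wafers_per_hour(RECIPE):.0f} wafers/hour")
```

Under these assumptions the recipe occupies the coat module for about 55 seconds per wafer, which gives a sense of why coat modules are run in parallel to reach the higher track throughputs.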
Step four is the final spin step. The time chosen is based on the solvent evaporation rate, the speed determines the final film thickness, and the acceleration sets how quickly the wafer ramps to the desired spin speed. Step five slows the wafer down in preparation for dispensing the edge bead remover (EBR) solvent and moves the EBR arm into place, but does not dispense. Step six is the EBR step. It begins with solvent flow through the topside nozzle, but the EBR nozzle has not yet moved to the wafer. This is done as a pre-dispense that pushes out air bubbles. In step seven, the top EBR arm moves slowly to its specified location while dispensing solvent simultaneously with the two backside EBR nozzles. Step eight returns the top EBR arm to its home position while it is still dispensing solvent, but the backside EBR flow has turned off. Step nine spins the wafer to dry it. The last step has no process function; the TEL MK7 requires it at the end of all recipes for closure.
Factors to Consider when Building a Resist Coat Recipe. At the point just before the resist is dispensed, several factors come into play. Choices must be made: where to dispense the resist; whether the wafer will be spinning and, if so, how fast; and how much force should be used to deliver the resist so that it wets the surface of the spinning wafer. How long should the wafer spin after the dispense to spread the resist to the edge of the wafer? What spin speed is required to achieve the final film thickness? After the resist film is cast, will top side EBR be required or will back side EBR be sufficient, and what factors affect the quality of this resist removal process? These questions, and more, will be addressed in the next few sections.
Equipment Set Up. The first step in a resist coat recipe usually initializes the coat module and subsystems in preparation for chemical dispense.
For example, the first step of the generic photoresist coat recipe shown in Table 17 initializes the resist dispense arm by setting the Arm Mode state to “wait” and sets the dispense speed. Consequently, the step’s time varies depending on the dispense arm’s rate of travel to the dispense location. On the TEL track a sub-menu is commonly used to define the arm travel speed, and other equipment manufacturers handle this similarly. An arm’s rate of travel should be quick, but smooth enough that abrupt starts and stops do not cause resist to drip from the nozzle onto the wafer. Using the wait mode in the recipe prevents advancement of the recipe to the next step until the controller has received confirmation that the dispense arm is positioned at the pre-determined location. This is important to ensure the resist dispense always begins at the specified location. The wafer spin speed is stabilized prior to dispense by using a 1.0 or 2.0 second interval that ramps the wafer to the dispense speed using an acceleration of 10 to 20 krpm/second. Although higher rates of acceleration can be used, they provide no benefit and stress the spin motor unnecessarily. If a static dispense is employed, the same methodology is used, but only to the extent needed to prepare the system. While machine software and hardware differ in how the first step initializes the equipment, the objective is always the same. There are exceptions, such as older tracks like the SVG 8100 system, which requires no initialization. However, the first step is still useful for stabilizing the spin motor at the dispense speed.
Dispense Location or Method. Traditionally, photoresist is dispensed in the middle of the wafer. However, limited success in improving film uniformity has been obtained with radially traversing dispense arms that begin at the edge of the wafer and scan in to the center.[113] More obscure coating techniques have been attempted over the years to overcome process and/or mechanical limitations, such as roller coating, dry film laminate, dip coating, spray coating, ring transfer, and a variety of others.[114][115] Of these techniques, spin coating remains the most viable method based on uniformity capability. A case for dispensing resist in the middle of a spinning wafer was presented in a study conducted at Motorola’s Advanced Custom Technology (ACT) center. The study was initiated because the resist coat process had drifted to ≈ 150 Å, 3-sigma, all families of variation, from the historical 75 Å level. A low-level screening experiment identified the movement of the dispense arm during resist dispense as one of two contributors affecting within-wafer uniformity, with exhaust level being the second. Proper surface preparation prior to resist dispense promotes resist solution wetting. However, if the surface energy is too low, the resist does not wet and will slide immediately off the wafer.
In some cases, microshearing is observed and unconventional coating techniques, such as double and static dispenses, are required. Additionally, radial striations may form at the resist/substrate interface, and an increased dispense volume[116] is required to correct the problem. This phenomenon can be overcome to some extent by using dispense pressure to compensate: increasing the applied resist force causes the resist to wet the wafer.
Dispense Speed. Resist can be dispensed with or without the wafer spinning. Neither method is considered superior, although each has unique advantages. The most commonly used method in integrated circuit (IC) manufacturing is with the wafer spinning. This method is called dynamic dispense. Conversely, when resist is dispensed with the wafer at rest, it is called a static dispense. Either method has the single goal of completely coating the wafer with a uniform film. In the case of a dynamic dispense, the dispense speed recommended by the resist supplier is a good place
to start. The speed should be optimized considering these responses: film uniformity, spin-induced defects, surface wettability and resist volume. For the sake of brevity, the remainder of this topic will focus on dynamic dispense. A typical dispense speed is 2 krpm, but it could be higher or lower depending on the resist’s viscosity. The correct dispense speed will carry the resist to the edge of the wafer in five seconds or less. More time than this will allow too much solvent to evaporate, and the film will be too thick toward the wafer’s edge. Also, spin defects, such as the striations shown in Fig. 79 (TEL striation), can result from an incorrect dispense speed or low acceleration.[117] It is best to use a statistically designed experiment, such as a Box-Behnken or Central Composite design, whose results can be used to generate a Response Surface Model (RSM). Use dispense speed and pressure as input variables. The goal would be to optimize those settings for a uniform film with the fewest defects, using the least amount of resist.
Dispense Pressure. A dispense pressure should be chosen based on the amount of force required to wet the surface uniformly. Too low a pressure does not wet the wafer; the resist “balls up” and flies off the wafer, leaving behind partially coated areas. With too high a pressure, the resist strikes the surface with such force that it is atomized and splatters outside the catch cup. Typical dispense pressures range from 7 to 15 pounds per square inch (psi) and depend upon resist viscosity. A common resist dispense system in the industry is the Integrated Dispense System™ (IDS), manufactured by Integrated Designs Incorporated® (IDI). A standard unit has a dispense range between 3 and 15 psi. Dispense pressure is a function of the resist’s viscosity (Fig. 80). For example, a 7 centipoise (cp) resist requires a dispense pressure of 4 or 5 psi to generate a flow rate of roughly 2 milliliters (ml) per second.
The 2 ml/second (sec) target gets the resist to the wafer quickly and efficiently. A 25 cp resist requires a 7 or 8 psi dispense, and a 65 cp resist requires 15 psi or higher to deliver the resist at the 2 ml/sec rate. In most manufacturing environments, a single dispense pressure must accommodate all substrates, because pump technology usually does not allow a recipe-specified dispense pressure. Since surface energy varies with substrate, testing wafers at several different process levels is required to determine the optimum dispense pressure. A full treatment is beyond the scope of this chapter, but a brief explanation of substrate energy variation follows. First, a clean process preceding a photo operation that incorporates HF will leave the surface fluorinated and hydrophobic.[118] On the other hand, an SC1 and SC2 (NH4OH/H2O2/H2O and HCl/H2O2/H2O)[119] clean without HF does not fluorinate the surface, which
Figure 79. Effect of low dispense speed illustrating striation defect formation. The striation signature data of the figure (top photo) was generated using a KLA low-angle laser particle counter.
Figure 80. Resist viscosity effect upon IDI pump dispense pressure.
leaves it hydrophilic. Resist coverage tests are necessary on all dissimilar process layers to ensure wetting problems do not arise in manufacturing as a function of dispense pressure. High-temperature diffusion processes are also known to leave surfaces non-wetting (see Fig. 81), and piranha cleans are successful at curing this problem. The acid treatment is also known to help wetting of p-doped oxides.
Dispense Time. The dispense time in the second step of the recipe affects final film thickness. That is, once the desired volume is dispensed, the recipe should move to the next step. For example, suppose the dispense time in the recipe is 5.0 seconds long, but only 1.0 second is required to deliver the resist. This over-dispense condition should be avoided on the basis of cost and thickness control. Inadvertently prolonging the transition between the dispense step and the subsequent step causes too much of the resist’s spinning solvent to evaporate, and makes it difficult to achieve the resist supplier’s specified thickness for a given viscosity. A long dispense, one that continues past the time the resist has cleared the wafer edge, yields a thicker film, because the solvent has more time to evaporate and the resist tends to coat over itself. This is not the same as double coating, but the film will be thicker.
Dispense Volume. The effect of dispense volume is closely tied to that of dispense pressure (or rate of delivery), and depends directly upon dispense speed, equipment pump capability and resist viscosity.[114][120] Each parameter will be discussed here. Dispense volume is a matter of economics. Moreau et al.[114] and Daou[120] have shown that millions of dollars are at stake and that resist can account for 3% of factory expense costs. Interest in reducing resist consumption is so great because more than 99% of the resist dispensed on the wafer is spun off[120] and goes down the waste drain.
The typical volume used today in coating a 200 mm wafer is between 2 and 4 ml.[114] Considering that the cost of a single gallon of resist can easily exceed $500, coupled with the costs of hazardous waste disposal,[120] one can understand why an extraordinary effort is made to reduce dispense volumes throughout the industry. Moreover, at least five U.S. patents have been issued with the goal of reducing dispense volume without compromising film quality. They cover processes that employ everything from ultrasonic dispense (U.S. Pat. 4,290,384, Perkin Elmer),[121] wafer solvent prewet (U.S. Pat. 5,066,616, Hewlett-Packard),[122] and inverted meniscus application (U.S. Pat. 4,590,094, IBM),[123] to enhanced resist spread and spin cycles (U.S. Pat. 5,013,586, S.E.T; U.S. Pat. 3,695,911, Aleo; and U.S. Pat. 4,748,053, Hoya).[124]
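The economics quoted above can be made concrete with a short calculation. This is a rough sketch, not a cost model: the $500 per gallon price and the 99% spin-off fraction are the lower bounds quoted in the text, the gallon is taken as a US liquid gallon, disposal costs are left out, and the annual coat count is a hypothetical figure chosen only for illustration.

```python
ML_PER_US_GALLON = 3785.41  # millilitres in one US liquid gallon


def resist_cost_per_wafer(shot_ml, price_per_gallon):
    """Purchase cost of the resist dispensed on one wafer."""
    return shot_ml * price_per_gallon / ML_PER_US_GALLON


def wasted_cost_per_wafer(shot_ml, price_per_gallon, spun_off_fraction=0.99):
    """Part of that cost that is spun off the wafer and goes to the drain."""
    return resist_cost_per_wafer(shot_ml, price_per_gallon) * spun_off_fraction


def annual_saving(shot_from_ml, shot_to_ml, price_per_gallon, coats_per_year):
    """Resist purchase saving from a smaller shot size (disposal not included)."""
    per_wafer = resist_cost_per_wafer(shot_from_ml - shot_to_ml, price_per_gallon)
    return per_wafer * coats_per_year


if __name__ == "__main__":
    for shot in (2.0, 3.0, 4.0):
        cost = resist_cost_per_wafer(shot, 500.0)
        waste = wasted_cost_per_wafer(shot, 500.0)
        print(f"{shot:.0f} ml shot: ${cost:.2f} per coat, ${waste:.2f} spun off")
    # Hypothetical fab volume: one million coat operations per year.
    print(f"4 ml -> 2 ml saves about ${annual_saving(4.0, 2.0, 500.0, 1_000_000):,.0f}/year")
```

Even at the lower-bound price, halving the shot size from 4 ml to 2 ml is worth roughly a quarter of a dollar per coat operation, which adds up quickly over the many coat layers in a device flow.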
Figure 81. Resist de-wetting caused by high-T processes where the wafer remains dehydrated with a high surface energy.
Consider then the challenges of coating a nearly defect-free, uniform film with the minimum volume of resist dispensed. The main problem with using the minimum volume is achieving adequate coverage of the wafer. Resist striations and nonuniformity caused by solvent evaporation and airflow dynamics within the catch cup also complicate the equation and generally force a greater shot size to compensate. An insufficient supply of photoresist to the surface produces pie-shaped wedges of missing resist at the periphery of the wafer. The problem is compounded as wafer size increases. Moreau et al. have reported results for experiments with varying shot size and wafer diameter demonstrating that a 200 mm wafer required 50% more resist than a 150 mm diameter wafer to produce a uniform, striation-free coating of resist.[114] If pressure, speed, equipment set up, and resist viscosity are established as previously discussed, then the next consideration is the character of the surface state. Resist will not wet a wafer if it meets too much resistance caused by unbalanced molecular forces.[114] These forces vary with substrate. However, photoresist does easily wet wafers with low surface energy, i.e., low water droplet contact angle (Fig. 82, from Moreau, Ref. 114). The application of HMDS typically lowers the surface energy from about 50 to 25 dynes/cm,[114] providing a surface on which the resist can easily extend itself. The wafer pretreatment process can be optimized to alleviate substrate differences.
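The wafer-size scaling reported by Moreau et al. can be set against simple geometry. The sketch below only compares the reported 50% increase in shot size with the increase in wafer area; the function names are illustrative, and no claim is made about how shot size scales at other diameters.

```python
def area_ratio(d_large_mm, d_small_mm):
    """Ratio of wafer areas for two diameters (area scales with diameter squared)."""
    return (d_large_mm / d_small_mm) ** 2


def relative_volume_per_area(volume_ratio, d_large_mm, d_small_mm):
    """Shot volume per unit area on the larger wafer, relative to the smaller one."""
    return volume_ratio / area_ratio(d_large_mm, d_small_mm)


if __name__ == "__main__":
    ratio = area_ratio(200.0, 150.0)
    per_area = relative_volume_per_area(1.5, 200.0, 150.0)
    print(f"area ratio 200 mm / 150 mm: {ratio:.2f}")
    print(f"reported volume ratio:      1.50")
    print(f"volume per unit area:       {per_area:.2f} of the 150 mm value")
```

The 200 mm wafer has about 78% more area than the 150 mm wafer, so the reported 50% increase in shot size means the required volume grew more slowly than the area to be covered.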
Figure 82. Illustration of water droplet contact angle measurement method to generate data, which can be used with other data to calculate surface energy. At low angles, Si-type wafers wet easily with water and are said to be hydrophilic.
One method used to reduce resist dispense volume is to increase the rate of delivery by employing a very short dispense time at a high pressure. This appears to overcome the surface energy-induced dewetting phenomena at the liquid/solid interface. The objective is to deliver the resist to the spinning surface as quickly as possible to minimize solvent evaporation, airflow, temperature and humidity effects. If the dispense rate is still not high enough to achieve the target thickness, uniformity and resist minimization criteria, there is still another card to play: the dispense line and/or tip diameter can be increased. Most tracks are delivered with 0.25" outside diameter (OD) dispense lines with a wall thickness of 0.047" (medium wall) that are reduced at the point of dispense to 0.125" OD or less. With the dispense tip diameter increased to 0.18" OD, no reduction in the rate of delivery occurs. Additionally, since dispense time is part of the “rate” equation, a recipe time that stops the dispense just prior to the resist reaching the wafer’s edge will further reduce waste. A resist’s viscosity and spinning solvent are part of the dispense volume equation. More often than not, a resist is chosen for its photolithographic resolution, photospeed, etch resistance and so on. The fact is, higher viscosity resists and resists that use solvents with high evaporation rates will increase the dispense volume required to sufficiently cover the wafer.
Dispense Suckback. Once the resist dispense is complete, the resist is drawn (sucked) back into the dispense nozzle by 1 or 2 mm. Sucking the resist back into the nozzle prevents small droplets of resist from falling onto the spinning wafer, which causes striations. It also stops the resist from drying around the outside of the nozzle, where it eventually flakes off and falls onto a wafer, causing pattern defects.[125] Resist can dry inside the nozzle if it is sucked back more than 2 mm.
Small globules of resist will cling to the inside walls of the nozzle, where they can dry between dispenses. The dried resist particles will be carried out of the nozzle with the next dispense and onto a wafer’s surface. Also, too much suckback will separate the resist in the nozzle and form an air bubble, and air bubbles will definitely cause striations.
Resist Spread. An intermediate step is often employed following resist dispense to stabilize the resist for a few seconds before accelerating the wafer to the “final spin” step, where the final film thickness is determined. The wafer spin speed is the same as the dispense speed. This step is optional and is essentially an extension of the resist dispense step; it helps eliminate poor resist coverage of the wafer caused by the very short dispenses used to minimize resist volume. The condition is manifested visually when the last of many concentric rings has not reached the edge of the wafer before dispense stops. A couple of examples, where a “spread step” might
be used, are dispenses involving resist viscosities greater than 65 cp and dispenses onto high surface energy, poorly wetting substrates. The spread step can be too long, making it almost impossible to achieve the desired film thickness. While the wafer is spinning at a slow speed, the spinning solvent is evaporating, raising the ratio of solids to solvent. After more than three seconds, the resist begins to dry and cannot easily be altered. Very high spin speeds will then be required to reach targeted film thicknesses. Conversely, if a thicker film is desired than the resist system was designed for, this spread step is one method of increasing final thickness without purchasing a higher viscosity resist. Unfortunately, this method degrades film uniformity and may be unpredictable because solvent evaporation depends upon other factors such as airflow, humidity, spin speed and so on. The magnitude of the spread-time factor, with respect to wafer-to-wafer mean thickness variation, is put in perspective by Cayton and Williams.[126] Using designed experiments and state-of-the-art equipment, they produced a sensitivity index for the resist coating process. The index is shown in Table 18. Cayton and Williams’ data revealed that the most significant factor is the end-of-dispense (EOD) cast time, referred to in this text as spread time. They found a 68 Å/sec change in mean thickness using OMM 897-9I photoresist. Hence, with this large a contribution, the spread time factor is one to consider when matching resist coat modules.
Acceleration To Final Spin Speed. Acceleration is the rate at which the wafer velocity transitions from one speed of rotation to another.
For example, if the wafer is spinning at 2 krpm during the dispense step, and the final spin step requires a speed of 4 krpm to achieve the targeted thickness, the wafer can be accelerated at a rate of 1 to 50 krpm/sec in increments of 1 krpm/sec. Although most systems are capable of accelerating at 50 krpm/sec, this rate of acceleration is rarely used in the final spin step of the recipe. Not all equipment performs the same. It should be noted that most older equipment sets, such as the SVG 8100 Series tracks, allow acceleration to be increased from 10 to 50 krpm/sec in increments of 10 krpm/sec. Advanced tracks have finer increment control (± 10 rpm vs. 1000 rpm) and a lower starting velocity (100 rpm vs. 10,000 rpm). Typical acceleration values range from 10 to 20 krpm/sec to ensure resist film uniformity and low defectivity. Acceleration to the final spin speed is important because it can affect film uniformity.[127] As shown in Fig. 83, uniformity improves as acceleration is increased. In this example, acceleration was varied from 2 to 10 krpm/sec. There was a 50% improvement in 1-σ at the higher acceleration. The figure
doesn’t show what happens beyond 10 krpm/sec, but personal experience has shown the response remains flat up to 20 krpm/sec. Acceleration’s limiting factor is not uniformity; unfortunately, acceleration rates above 20 krpm/sec increase spin-induced defects.
Table 18. Sensitivity Index, FSI Data (Sensitivity at 1 σ, Å)
Parameter                  Mean    Within-wafer  Wafer-to-wafer
Resist Temperature, °C     -7.50   9.00          -2.25
Exhaust, LFPM              0.04    0.00          0.00
Chuck Temperature, °C      3.24    0.40          -0.62
Dispense Speed, RPM        0.40    0.00          0.00
Dispense Rate, ml/sec      1.19    -1.79         0.00
Dispense Volume, ml        5.65    -0.15         -0.06
EOD to Cast Time, sec      67.88   0.91          0.61
Cast Time, sec             -7.29   -0.04         -0.10
Time on Chuck, sec         -0.69   -0.02         -0.02
Ambient Temperature, °C    28.00   0.13          0.00
Relative Humidity, %       9.20    0.18          0.05
Velocity, LFPM             1.82    0.12          0.00
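To see which factors dominate each column of Table 18, the entries can be ranked by magnitude. The sketch below transcribes the table values as read from the source; the dictionary layout and the function name are illustrative choices, and ranking by absolute value is an assumption about how the index is best compared (the signs only give the direction of the response).

```python
# Table 18 values: parameter -> (mean, within-wafer, wafer-to-wafer), in Å at 1 sigma.
SENSITIVITY = {
    "Resist Temperature, °C": (-7.50, 9.00, -2.25),
    "Exhaust, LFPM": (0.04, 0.00, 0.00),
    "Chuck Temperature, °C": (3.24, 0.40, -0.62),
    "Dispense Speed, RPM": (0.40, 0.00, 0.00),
    "Dispense Rate, ml/sec": (1.19, -1.79, 0.00),
    "Dispense Volume, ml": (5.65, -0.15, -0.06),
    "EOD to Cast Time, sec": (67.88, 0.91, 0.61),
    "Cast Time, sec": (-7.29, -0.04, -0.10),
    "Time on Chuck, sec": (-0.69, -0.02, -0.02),
    "Ambient Temperature, °C": (28.00, 0.13, 0.00),
    "Relative Humidity, %": (9.20, 0.18, 0.05),
    "Velocity, LFPM": (1.82, 0.12, 0.00),
}

COLUMNS = {"mean": 0, "within_wafer": 1, "wafer_to_wafer": 2}


def rank_by(column):
    """Return (parameter, value) pairs sorted by decreasing absolute sensitivity."""
    idx = COLUMNS[column]
    pairs = [(name, values[idx]) for name, values in SENSITIVITY.items()]
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)


if __name__ == "__main__":
    for column in COLUMNS:
        top = rank_by(column)[:3]
        print(column, "->", ", ".join(f"{name} ({value:+.2f})" for name, value in top))
```

Ranked this way, the EOD to cast time leads the mean-thickness column by a wide margin, consistent with the 68 Å/sec figure quoted in the text, while resist temperature leads the within-wafer and wafer-to-wafer columns.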
Figure 83. Effect of ramp rate, or acceleration, upon resist thickness for a typical i-line positive resist, System 9.
Therefore, the second consideration for acceleration is defectivity. The concentration of particles in the cup increases dramatically with acceleration.[128] These particles are created by the high acceleration of the spin motor from the resist apply step to the final spin step. Pratt[128] collected aerosol particle counts with a Met One Particle Counter inside a modified coater cup with roughly 0.17 cu. ft. of controlled space, and showed that even the slow acceleration rate of 3 krpm/sec produced more than 60 thousand particles greater than 0.3 microns in size. The concentration of particles rose about 20% when acceleration was increased from the modest 3 krpm/sec to 10 krpm/sec. Particles continued to rise linearly at a rate of 400 for every 1 krpm/sec of acceleration up to the experimental maximum of 40 krpm/sec. Atomized photoresist particles can find their way back onto the wafer’s surface.[128] The particle shape depends on its solvent content: a spherical shape is consistent with particles that are dry or very low in solvent, while half domes, also called “fish eyes” or “color spots,” are formed when particles rich in solvent redeposit on the surface. The spherical particles block UV radiation and form islands of resist pattern where there should be no resist, and they do not fully develop when imaged. “Fish eye” particles act like microlenses, focusing the UV light into the particle so that it partially develops, leaving a crater-shaped pattern defect. These particles cannot be avoided, but they can be managed or minimized with sufficient and balanced exhaust. Redeposition of both particle types is avoidable when the particles are drawn below the wafer plane into the exhaust port of the coater cup. To minimize both particles and acceleration’s influence over uniformity, a response surface experiment should be run to optimize both acceleration and exhaust settings.
Brown et al.’s[113] optimization of the resist coat recipe at Motorola ACT identified acceleration as one of three controlling variables for film uniformity and spin-induced defect levels. A contour plot (Fig. 84) illustrates that the acceleration operating range for film uniformity (dashed lines) narrows as exhaust decreases. Unfortunately, this was where exhaust was at its optimum setting. Defects, on the other hand, are high when both acceleration and exhaust settings are low (Fig. 85). However, as the two variables rise together, the number of defects falls until it begins to rise again. It is in this saddle that the process was set to operate, because film uniformity was still good there. While film uniformity was best when exhaust was low, there were too many defects. And when acceleration was at its most robust state, film uniformity was poor. Obviously, compromises had to be made in this case study. Not only is this typical, it should be expected as the “rule of thumb.”
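A response surface experiment of the kind described here needs a run list before any wafers are coated. The sketch below generates a rotatable central composite design for two factors, acceleration and exhaust. It is an illustration under stated assumptions, not the design used in the Motorola study: the factor centers and ranges are hypothetical, the exhaust units are arbitrary, and the number of center-point replicates is a free choice. A central composite form is used because a Box-Behnken design needs at least three factors.

```python
import math
from itertools import product


def central_composite_2factor(n_center=3):
    """Coded (x1, x2) runs of a rotatable two-factor central composite design."""
    alpha = math.sqrt(2.0)  # rotatable axial distance for 2 factors: (2**2) ** 0.25
    factorial = [(float(a), float(b)) for a, b in product((-1, 1), repeat=2)]
    axial = [(-alpha, 0.0), (alpha, 0.0), (0.0, -alpha), (0.0, alpha)]
    center = [(0.0, 0.0)] * n_center
    return factorial + axial + center


def decode(coded, center, half_range):
    """Convert a coded level (-1 .. +1 scale) to engineering units."""
    return center + coded * half_range


# Hypothetical factor settings, chosen only so that the axial points stay
# inside the 2 to 20 krpm/sec acceleration window discussed in the text.
ACCEL_CENTER, ACCEL_HALF_RANGE = 11.0, 6.0        # krpm/sec
EXHAUST_CENTER, EXHAUST_HALF_RANGE = 50.0, 20.0   # arbitrary exhaust units

RUNS = [
    (decode(x1, ACCEL_CENTER, ACCEL_HALF_RANGE),
     decode(x2, EXHAUST_CENTER, EXHAUST_HALF_RANGE))
    for x1, x2 in central_composite_2factor()
]

if __name__ == "__main__":
    for run_number, (accel, exhaust) in enumerate(RUNS, start=1):
        print(f"run {run_number:2d}: accel = {accel:5.2f} krpm/sec, exhaust = {exhaust:5.1f}")
```

In practice the run order would be randomized, and the measured responses (film uniformity and defect counts) would be fitted with a second-order model to produce contour plots like those in Figs. 84 and 85.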
Figure 84. 2-D response surface for coat defects as a function of acceleration and exhaust level during the coat process.
Figure 85. 3-D response for the same variables as for Fig. 84, with the RSM modeled optimal operating points shown to the left.
Temperature Introduction. Temperature is by far the most influential factor affecting resist thickness and uniformity. More important than average film thickness is film variation as measured radially across the wafer, center to edge.[129] The temperatures of the wafer, resist and ambient environment make the difference between controlling thickness uniformity to a meager 100 Å and controlling it to 10 Å, 1-σ, total variation across the wafer, wafer-to-wafer, and lot-to-lot. In today’s manufacturing of wafers with sub-half-micron features, this level of temperature control is critical for keeping the total CD budget under control; otherwise the budget will be exceeded in the coating process alone. Using temperature control units, a cooling water loop maintains the resist, wafer and wafer chuck to ± 0.3°C. Separate water loops are employed for each to bring the three into equalization, as they are composed of materials of different heat capacities. Optimization of these temperatures starts by setting the resist, coater cup (Fig. 86), wafer chuck, and cooling plate all to the same temperature. A good rule of thumb is that all temperatures should be within 1.0°C of one another or better,[117] and should also be within 1.0°C of the room temperature.
Chuck Temperature. The wafer chuck temperature, if independent adjustment is available, is adjusted to correct resist film thickness nonuniformity. The greatest effect is on the thickness at the center of the wafer versus that at the wafer’s edge. The 3-D contour plot in Fig. 87 (chuck too cold) shows the subtle effect chuck temperature has on resist uniformity. In this example, the chuck temperature is too low and the 3-D plot exhibits a “Stadium Effect,” where the thicknesses at the edge of the wafer are greater than at the center. Adjustments are made by changing the chuck’s temperature in 0.5°C increments.
While the chuck temperature effect is small, the fine adjustment of the chuck’s temperature maintains uniformity wafer-to-wafer by removing heat that would otherwise be transferred to the chuck from the spinner motor as it heats during normal operation. This final temperature adjustment should bring complete equalization to wafer, hardware and environment.
Temperature of Resist. Resist temperature affects the thickness of the film in the center of the wafer versus the thickness observed radially outward at about two-thirds the diameter of an eight-inch wafer. An example of resist temperature vs. uniformity is presented in Fig. 88. The cup temperature and cooling plate temperature were held constant and the resist temperature was changed ± 8°C from nominal. As the resist’s temperature increased, the thickness at the center of the wafer increased; the inverse is true as the resist’s temperature is decreased.
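The center-versus-edge signatures described above can be summarized numerically from a radial thickness scan. The sketch below is a hedged illustration: the function name, the zone boundaries (center within 20 mm, edge beyond 70 mm), the 5 Å tolerance and the synthetic scan data are all assumptions made for the example, and the labels only describe the profile shape; deciding which temperature to adjust still requires the engineering judgment discussed in the text.

```python
from statistics import mean, stdev


def summarize_profile(radii_mm, thickness_A, center_mm=20.0, edge_mm=70.0, tol_A=5.0):
    """Summarize a radial thickness scan and label its center-to-edge shape."""
    center = [t for r, t in zip(radii_mm, thickness_A) if abs(r) <= center_mm]
    edge = [t for r, t in zip(radii_mm, thickness_A) if abs(r) >= edge_mm]
    delta = mean(edge) - mean(center)
    if delta > tol_A:
        shape = "edge-thick"      # e.g. the "Stadium Effect" seen with a cold chuck
    elif delta < -tol_A:
        shape = "center-thick"    # e.g. the response to a warmer resist
    else:
        shape = "flat"
    return {
        "mean_A": mean(thickness_A),
        "sigma_A": stdev(thickness_A),   # sample standard deviation (1 sigma)
        "edge_minus_center_A": delta,
        "shape": shape,
    }


# Synthetic 16-site scan across a 200 mm wafer, stopping 10 mm from each edge.
RADII = [-90 + 12 * i for i in range(16)]
THICKNESS = [10000.0 + 0.005 * r * r for r in RADII]   # bowl-shaped, thicker at the edge
SUMMARY = summarize_profile(RADII, THICKNESS)

if __name__ == "__main__":
    for key, value in SUMMARY.items():
        print(key, value)
```

A summary like this is convenient for tracking the effect of successive 0.5°C set point changes, since the edge-minus-center difference should shrink toward zero as the temperatures are equalized.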
Figure 86. Effect of coat module cup temperature upon 8" wafer resist thickness uniformity. (Figure courtesy of Frank Fischer of Motorola and taken from Motorola Univ. EDN 030.)
Figure 87. Effect of coat chuck temperature too low creating a “Stadium Effect” for resist thickness across the 8" wafer.
Figure 88. Effect of resist temperature upon resist thickness across the 8" wafer. (Figure courtesy of Frank Fischer of Motorola MOS 12 and taken from Motorola Univ. EDN 030.)
To correct a uniformity problem caused by resist temperature, the resist film’s profile should be mapped. Using thickness measurements collected radially across the wafer to within 10 mm of the wafer’s edge, a map of the film’s profile can be drawn (see Fig. 89, Mexican hat effect). If the data are collected manually, at least sixteen sites should be measured from edge to edge to depict the film’s surface accurately. The data should be entered into a spreadsheet and graphed as a line chart, with the X-axis representing the site location and the Y-axis the thickness measurement. If an automated tool such as a Tencor UV1250 is used, a 49-site measurement scheme using eight concentric rings is appropriate (see Fig. 89). The UV1250 has 3D mapping capability for this job. Evaluate the film as before. Adjust the resist temperature by increasing or decreasing the set point in increments of 0.5°C. Allow at least 15 minutes to pass before rerunning the test so that the resist in the supply line can acclimate to the new setting.
Temperature and Humidity of Coat Chamber. Resist film uniformity, and mean thickness variation wafer-to-wafer and day-to-day, are influenced by coater cup temperature and relative humidity (RH).[117][130] This is explained by the volatility of the resist’s spinning solvent.[114] Basic chemistry tells us that when temperature increases, so does the rate of solvent evaporation, leaving behind more solids and thereby increasing thickness.[114][129] Moreover, the magnitude of change, or slope, is determined by the solvent the resist manufacturer chose when designing the resist. That is, a resist using more volatile spinning solvents will change more quickly. First, we’ll discuss the influence of cup temperature on film coating uniformity, then focus on how to correct the problem. An example of how the resist film coating uniformity is affected when the coater’s cup temperature changes is shown in Fig. 86. The example shows five different cup temperature settings.
When the cup temperature was set to 30°C, which was 8.3°C higher than the resist temperature, the range across the wafer was greater than 900 Å. Note that the center of the wafer is very thin with respect to the edge, and that the profile reversed when the cup temperature was set lower than the resist temperature. If this uniformity problem is encountered in the factory, it can be corrected by generating the same kind of data as shown in Fig. 86. Start the optimization process by setting the cup temperature equal to the resist temperature. Hold the resist and cool plate temperatures constant, and vary the cup temperature ±8°C in increments of 4°C. The resulting plot should resemble the one shown here. Choose the cup temperature that produces the most uniform film.
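The cup-temperature sweep described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the thickness profiles are made-up numbers (not the data of Fig. 86), and "most uniform" is taken to mean the smallest max-minus-min thickness range across the measured sites, which is an assumption about how the plot would be judged.

```python
def cup_setpoints(resist_temp_c, span_c=8.0, step_c=4.0):
    """Cup temperatures from (resist - span) to (resist + span) in equal steps."""
    steps = int(round(span_c / step_c))
    return [resist_temp_c + i * step_c for i in range(-steps, steps + 1)]


def thickness_range(profile_a):
    """Max minus min thickness (in angstroms) across the measured sites."""
    return max(profile_a) - min(profile_a)


def most_uniform_setpoint(profiles_by_setpoint):
    """Return the set point whose profile has the smallest thickness range."""
    return min(profiles_by_setpoint,
               key=lambda sp: thickness_range(profiles_by_setpoint[sp]))


# Made-up center-to-edge profiles (angstroms), one per cup set point.
setpoints = cup_setpoints(22.0)
profiles = dict(zip(setpoints, [
    [10400, 10250, 10100, 10000],  # cup 8 C colder than resist: thick center
    [10200, 10120, 10050, 10000],
    [10010, 10000, 10005, 10015],  # cup equal to resist temperature
    [9900, 9950, 10050, 10200],
    [9600, 9800, 10100, 10550],    # cup 8 C hotter: thin center, thick edge
]))

best = most_uniform_setpoint(profiles)
print(best, thickness_range(profiles[best]))
```

With real data, each list would hold the radial thickness readings collected at one cup set point while the resist and cool plate temperatures are held constant.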
Figure 89. “Mexican Hat Effect” observed when the resist temperature is relatively too hot.
The coater cup RH influences the mean thickness (MT) of the coated resist film, but not its uniformity.[130][131] However, the effect can be minimized by controlling the coater cup’s RH to ±1%. Consider Fig. 90, which shows that mean thickness decreases as humidity increases. The change is roughly 10 to 20 Å/%RH, and even higher if the resist spinning solvent has a high vapor pressure. RH affects MT because there is less exchange of spinning solvent molecules in an environment abundant with H2O. Evaporation is thereby decreased, and the spinning solvent carries the solids to the edge of the wafer with less loss. The result is a thinner final film.[117]
Figure 90. Resist thickness RSM for resist coating non-uniformity versus relative humidity and temperature. Note, when most fabs run at RH of 40% and temperature of ~68°F, they are operating near optimum.
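The humidity sensitivity quoted above translates directly into a mean-thickness band for a given RH control tolerance. The short Python sketch below works through that arithmetic using the 10 to 20 Å/%RH sensitivity and the ±1 %RH tolerance from the text. The function name is hypothetical, and a linear thickness response over the control band is assumed for illustration only.

```python
def thickness_band(sensitivity_a_per_pct_rh, rh_tolerance_pct):
    """Return the +/- mean-thickness excursion (angstroms) for a +/- RH tolerance.

    Assumes thickness responds linearly to RH over the control band,
    which is an illustrative simplification.
    """
    return sensitivity_a_per_pct_rh * rh_tolerance_pct


for sensitivity in (10.0, 20.0):
    band = thickness_band(sensitivity, 1.0)
    print(f"{sensitivity:.0f} A/%RH at +/-1 %RH gives +/-{band:.0f} A mean thickness")
```

Under these assumptions, holding RH to ±1% bounds the humidity contribution to mean thickness at roughly ±10 to ±20 Å.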
Controlling humidity to a specification of ±1 %RH is adequate until line widths drop below half a micron. At that point, the specification must tighten to ±0.5 %RH, and to an even tighter ±0.25 %RH when line widths approach 0.25 µm.[132]
Airflow. We will discuss two applications in which airflow control is needed and is used almost universally. The first is for resist coaters utilizing temperature and humidity (T/H) control systems. The second is for contamination control within the boundaries of the process equipment.
In a process chamber, i.e., a coater module, where a T/H control system is used, airflow refers to the pressure created by the T/H unit. The airflow in the coater chamber is barely above atmospheric pressure because its intent is to bathe the wafer with T/H-controlled air to control solvent evaporation. It is not intended to pressurize the coat process chamber to stop airborne particulates from entering it. A common setting, recommended by TEL Process Engineering, is 0.3 meter/second, with a process margin of ±10%. If the flow is too high, solvent evaporation may increase and affect resist film uniformity. On the other hand, if the flow is too low, no benefit can be realized from the T/H unit.
The second application where airflow is controlled is specific to track equipment not under control of a T/H unit. The intent here is to reduce pattern defectivity by isolating the equipment’s process area from the global factory environment, where airborne particulates could fall onto the wafer.[133] This is done by creating a positive pressure environment using a “mini-environment” that surrounds the process tool. In one of its first uses in the IC industry, an SVG 8100 series coater at Motorola’s MOS 5 was retrofitted with a mini-environment. Overall pattern defectivity was lowered as hoped, but an unexpected change in resist thickness occurred. The average thickness increased 200 Å after the tool modification because the process area had laminar flow air focused onto the wafer during processing, which increased solvent evaporation. The airflow was originally 250 cubic feet per minute (cfm), which is the standard flow for the factory. It was dropped to 90 cfm to bring the thickness back to specification.
Barometric Pressure. Barometric pressure is a variable often overlooked in setting up a process because changes occur over hours rather than minutes.[134] We look to the basic laws of chemistry and physics to explain the phenomenon.
First, solvents are more volatile at lower atmospheric (atm) pressure; this accelerates their evaporation from the resist film and makes the film thicker. Second, high atm pressure reduces spinning solvent evaporation rates, making the resist film thinner. Therefore, as atm pressure increases, thickness decreases. Of course, the amount of change depends on the spinning solvent. Barometric pressure compensation (BPC) for the resist coat module is available only with the most sophisticated, latest-generation track equipment. The POLARIS 2500 Microlithography Cluster utilizes BPC and reduces day-to-day thickness variation by more than 50%.[134] For all other equipment, which lacks BPC, being aware of and understanding the phenomenon is all that can be done.
Final Spin Speed. It is in this step of the coat recipe that the final film thickness is established. Bornside et al.[135] explain this to result from an
equilibrium established by evaporation of solvent, coupled with the thinning of the film by centrifugal force, reaching a point where the film ceases to change. Much work has been published, including articles by Meyerhofer, Middleman and Olin (Hunt), that further defines factors such as shear force and stress, and airflow at the liquid/air interface,[129] to determine the final film thickness and uniformity. Although several factors are involved, the final film thickness for a resist can still be approximated by:[129]
$$t = \frac{K\,C^{2}}{SS^{0.5}}$$
where K is a constant, C is the concentration of solids, and SS is the spin speed in rpm. From a practical point of view, the final spin speed is the “knob” that is used to change the average film thickness. The rule is simple: increase the speed at which the wafer rotates and the final film thickness will decrease. Figure 91 contains an example of a spin curve for a conventional organic-based resist. These data are collected by running several wafers at different spin speeds. Since applications for resist vary as widely as the resists themselves, it’s common for resist suppliers to offer more than one viscosity of the same resist. The suppliers provide a specification sheet that includes a spin curve run on their tool set. These data are guidelines at best. Since all factory equipment sets and processes differ, the user should run a spin curve in-house and not rely quantitatively on vendor data. It should be noted that there are upper and lower limits for the final spin speed due to wafer size limitations. While there are no hard and fast criteria for maximum spin speeds, it’s best to spin 8-inch wafers at ≤ 3500 rpm, 6-inch wafers at ≤ 4000 rpm, and 3- through 5-inch wafers at ≤ 5000 rpm. For example, if the target thickness can only be achieved by spinning the wafer at 6000 rpm, it’s probably because the resist viscosity is too high. On the other hand, if the final spin speed is slower than the speed used in any EBR steps, the viscosity is too low.
The problem with spinning any sized wafer too fast is the generation of aerosol particles—this condition should always be avoided. While the main particle or spin defect contributor is acceleration to the final spin speed step, there is a measurable difference in the number of particles created as a result of high spin speeds alone.[128] On the other hand, spinning the wafer too slowly compromises film uniformity. This is often the case when a single resist viscosity is forced into use to coat multiple film thicknesses. Figure 92 is a spin curve that plots both film thickness and uniformity (standard deviation) as a function of spin speed. At the lower and higher spin speeds, note the thickness standard deviation is higher, which supports our position.
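The spin-speed approximation above (thickness proportional to the square of the solids concentration divided by the square root of spin speed) lends itself to a quick calculation. The Python sketch below calibrates the constant K from one measured point and then estimates the spin speed needed for a new target thickness, checking it against the suggested maximum speeds by wafer size. The function names and the calibration numbers are hypothetical; an in-house spin curve should always take precedence over such an estimate.

```python
import math

# Suggested upper spin-speed limits from the text, keyed by wafer diameter (inches).
MAX_RPM = {3: 5000, 4: 5000, 5: 5000, 6: 4000, 8: 3500}


def fit_k(thickness, solids, rpm):
    """Solve t = K * C**2 / SS**0.5 for K from one measured point."""
    return thickness * math.sqrt(rpm) / solids ** 2


def predict_thickness(k, solids, rpm):
    """Approximate final film thickness at a given spin speed."""
    return k * solids ** 2 / math.sqrt(rpm)


def rpm_for_thickness(k, solids, target_thickness):
    """Invert the approximation to find the spin speed for a target thickness."""
    return (k * solids ** 2 / target_thickness) ** 2


def within_speed_limit(rpm, wafer_inches):
    """True if the spin speed respects the suggested limit for this wafer size."""
    return rpm <= MAX_RPM[wafer_inches]


# Hypothetical calibration: 12,000 angstroms at 3000 rpm with a solids fraction of 0.25.
k = fit_k(12000.0, 0.25, 3000.0)
rpm = rpm_for_thickness(k, 0.25, 10000.0)
print(round(rpm), within_speed_limit(rpm, 8))
```

In this made-up case the required speed exceeds the suggested 8-inch limit, which is the situation where the text recommends a different resist viscosity over a faster spin.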
Figure 91. Resist spin curve, thickness and uniformity, vs. spin rpm.
Figure 92. Resist spin curve data illustrating how uniformity (curve in figure on next page) gets worse at higher and lower rpm values. This means: there is an optimal spin range for each resist, depending upon its viscosity. To change thicknesses, the manufacturer should change or reformulate the resist viscosity, and not have the user changing spin speeds to lower or higher than optimum values.
Final Spin Time. The final spin time is the amount of time it takes for the resist to actually dry. There are at least two different methods to accomplish this: the “Spin-to-Dry” method and the “Quick Spin, Then-Dry” method. Spin-to-Dry is defined as spinning the wafer at the final spin speed long enough for the resist to dry. The drying process usually takes 20 to 30 seconds, but is known to vary depending on the resist’s spinning solvent.[114] The higher the solvent vapor pressure, the more quickly the solvent will evaporate. The “Quick Spin, Then-Dry” method is used in special applications. It is defined as a final spin step that spins the wafer at the final speed for only a few seconds before slowing. The speed will usually decrease by more than 50% in the next event for the drying process. Although not proven with Scanning Electron Microscopy (SEM) cross section micrographs, it’s been hypothesized that this process reduces “piling” of resist over topology by allowing the resist to relax and dry slowly in a settled, rather than stressed, state. Also, extensive testing by Lyons[136] has shown there are no changes to the film’s final solvent content, photoactive compound (PAC), line width, image profile, or thermal deformation characteristics. Drying time, for both casting techniques, is determined by watching the colored bands move across the wafer. The bands are representative of thickness changes occurring as the film densifies as a result of solvent evaporation. The bands move quickly at first, but as solvent concentration decreases the bands almost completely diminish. This simple visual determination provides the time needed before moving to the next program step in the recipe. Remember, longer times adversely affect track throughput.
Exhaust. The role exhaust plays in the resist coat depends on the step in the recipe and the resist in use. For example, an exhaust pressure barely less than atmospheric during a positive resist dispense step increases solvent evaporation, which increases the drying effects that commonly produce nonuniform resist coatings. Where a typical negative resist is dispensed, an exhaust flow of 250 cubic feet per minute is best to prevent bowl cob-webbing. Both the spin-off and EBR steps of the recipe require maximum exhaust flow to pull down the aerosol particles created by the deposition on the spinning wafer.[128][135]
If the exhaust is turned off or made very low during a positive resist dispense step, the evaporation of the resist’s spinning solvent slows and the resulting film is more uniform than could be produced with the exhaust on.[113] The thickening effect of accelerated resist solvent evaporation at the wafer’s edge is seen in Fig. 93 (exhaust too high example). Notice that the film is uniform across most of the wafer’s surface, but takes a sudden upturn toward the wafer periphery. The total observed range of film thickness readings is at least three times higher than that of an optimized process (see Fig. 94, typical good uniformity data). Bornside’s experimental model predicts a 10X film thickness increase at the outer 1/3 of the wafer, which is not realized here, nor confirmed by Bornside empirically. Rather, only a 0.2[135] to 0.4% change of the total film thickness was actually realized. Qualitatively, however, there is agreement between the observed center-to-edge thickness differences and his theoretical predictions.
The pursuit of the uniform resist coating at low exhaust creates an opportunity for spin coat defects. Low exhaust allows atomized resist particles to be redeposited on the wafer’s surface.[113][128][135] The response surface from Brown et al. is shown in Fig. 85, where the trade-off for defectivity is easily seen.
Note that, by working at the optimum points modeled by the RSM, the defect density is reduced to less than 0.1 defects/cm², a greater than 4X reduction (note also that the SVG 8800 optimum settings are listed on the right of Fig. 85). There are two zones of re-circulation in a catch cup (see Fig. 95, taken from Bornside et al.): one large zone below the wafer plane extending to the opening of the exhaust port, and a smaller zone in the exhaust port.[135] The re-circulation zone outside the exhaust channel is of great importance, as Bornside has modeled, and this has been verified empirically by others.[113][128] As exhaust flow decreases, the re-circulation zone moves to a higher plane and expands in both intensity and size. Particles and solvent vapors are no longer pulled down into the exhaust port. Instead, the particles move upward, escaping the catch cup, and some are redirected by laminar airflow above the catch cup.
Figure 93. Example resist thickness data where the exhaust is too high, creating a “Stadium Effect.”
Figure 94. More typical resist coat uniformity results under optimum conditions. This data should be compared to that of Fig. 93.
Figure 95. Modeling data illustrating particle tracking due to exhaust in the coat cup module. (Taken from Bornside et al.) Modified for presentation here.
This downward airflow can then send the particles crashing down onto the spinning wafer below, thus creating spin-induced killer defects. A second scenario is that particles trapped in this zone of re-circulation may come flying back onto the wafer as the airflow dynamics inside the resist catch cup are altered at coat recipe step changes.
Edge Bead. Photoresist solvent evaporates more quickly at the edge of the wafer than at any other place on the wafer because the wafer’s linear velocity is greatest there. Bornside[135] explains this wafer edge phenomenon as a result of both axial and radial airflow dynamics influencing the rate at which solvent is evaporated. The rapid loss of solvent increases viscosity and establishes a higher concentration of solids in the moving liquid, which piles up at the edge of the wafer. Consequently, resist thickness is increased. This thick region or “bead” of photoresist is called an edge bead. The consequences of leaving the piled resist at the edge of the wafer are debatable. Nevertheless, most agree that resist on the front side of the wafer near the edge is likely to be damaged during cassette and wafer handling. As the resist-coated wafers are jostled in the cassette or grabbed by equipment handlers, the resist may be torn and damaged. The resulting particles scatter, with some probability of redepositing on a wafer’s surface and causing a pattern defect, which in turn can block an etch or even an implant. Modern resist coaters remove this bead using an EBR solvent. There is top EBR and bottom EBR. Top bead removal is usually accomplished using an arm other than the one used for resist dispense. The arm is often equipped with a small-diameter dispense tip, on the order of 0.5 mm inside diameter (ID), designed to deliver a fine stream of solvent to the spinning wafer. The
length of time the solvent is dispensed depends on the removal rate of the EBR solvent; a typical time is shown in Table 17 (resist generic recipe). Keep in mind that the photoresist and EBR solvents must be compatible for the process to be effective and efficient. Consulting with the resist vendor or referring to a solubility map chart, such as Fig. 96 (Ref. OCG Solvent Handbook), are both good ways of choosing or confirming an EBR solvent. Bottom EBR is applied to the backside of the spinning wafer with a fine stream, either from a nozzle or by forming a meniscus of EBR solvent. The meniscus is formed by flowing solvent, at a low pressure, through a large orifice placed in a ring fixed to the coater cup and surrounding it. Both of the bottom dispense methods accomplish the task effectively. The choice of application method is made by the track supplier and is usually of little consequence.
During top or bottom EBR solvent application, there are opportunities to add defects, which are usually circular in shape and “device yield killing.” Exhaust flow level, dispense pressure, and spin speed are critical variables to control.[113] Using designed experiments and response surface modeling (RSM), Brown et al. found an optimum point at which the number of EBR defects, i.e., solvent spots on the resist film, was negligible. (See Figs. 97 and 98.) In Fig. 97, the SVG track adjustments are detailed and the EBR spot density has been reduced, as modeled, to ~0 from the original values of ~30 defects/wafer; Fig. 98 is a slice across the 3-D curve of Fig. 97 to show that the model exhibits a classical “saddle optimum.” Fundamentally, the principle is, as one might expect, that very low and very high spin speeds cause more defects. Intermediate levels of these settings provide the best operating condition. The same goes for dispense pressure. Bottom EBR is top EBR’s predecessor, and is used in almost every resist application, even if top EBR is not.
Cleaning the backside of wafers became critical when step and repeat alignment tools with their narrow depth of focus became prevalent. Solvent must be applied to the back side of the wafers to dissolve and remove the photoresist drawn under the spinning wafer and circulated by the exhaust vortices during the resist apply step. The particles attach to the exposed area of the wafer, the area not contacted by the coater’s vacuum chuck. The dried resist particles, if large enough, can cause localized image deformation during the printing process due to stepper system focus loss. These focus spots are the result of the silicon wafer conforming to the resist particle as the wafer is pulled tightly to the stepper vacuum chuck. The spot will be out of focus when the area is imaged causing poor pattern definition. Often these particles are transferred to the stepper chuck from the wafer, which will affect all subsequent wafer exposures.
Figure 96. Positive photoresist solubility map taken from Olin (ARCH Chemical).
Figure 97. EBR spot RSM illustrating the optimum operating conditions for this example resist/track process combination.
Figure 98. 2-D RSM slice from Figure 97 illustrating a perfect “saddle point” operating minimum point for EBR spot defectivity.
There is a third method used to remove the photoresist edge bead that was not discussed earlier. The process is known as Optical EBR (OEBR), and uses an optional piece of hardware that can be added to an exposure tool or as a module to the wafer track. The edge bead is irradiated with a light source. The exposed photoresist is later dissolved in the develop process. The process is clean and reduces chemical usage. However, it was identified as the source of pattern defects at Motorola’s MOS12. There, Miller and Fischer found resist flakes scattered around the wafer that originated from its edge. Their tests pointed to wafer edge contaminants that contributed to the loss of resist adhesion during the OEBR exposure, creating popped resist particle defects.[137] Modern fabs typically do top and back side chemical EBR followed by top OEBR to achieve low coat process defectivity.
Stadium Effect Caused by EBR. The Stadium effect illustrated above in Fig. 87 has also been observed when EBR lines are mistakenly contaminated. In this example, the propylene glycol methyl ether acetate (PGMEA) EBR line was filled with N-methyl pyrrolidone (NMP). This mistake happened because the tracks were plumbed with both EBR fluids and the bulk chemical connections were switched at a bulkhead. Since dilution in the buffer tanks takes days to complete, the Stadium effect gets progressively worse each day until 100% NMP is reached. The effect is caused by swelling of the resist by the NMP during the EBR operation. The thickness uniformity and the Stadium effect get progressively worse, until finally the difference from wafer center to edge approaches hundreds of angstroms and adjusting other track settings does nothing to reduce the magnitude of the effect.
Effect of Resist Properties.
Resist viscosity has an effect on the number of aerosol particles generated during the resist coat process.[128] Pratt observed that resist with a higher concentration of solvent, that is, a lower viscosity, doesn’t produce nearly as many aerosol particles at spin speeds of 3 and 5 krpm. Testing resist viscosities from 3 to 70 cp, Pratt found an 8X jump in the resulting number of particles between 3 and 14 cp resists. Resists with viscosity above 20 cp produced more particles, but the percentage change in aerosol particles tapered off after 20 cp, rising only another 10% for resist viscosities ranging from 20 to 80 cp.[128] Additionally, resists with volatile solvents can exhibit swirl defects and striations.[114]
Softbake. The softbake process, sometimes referred to as “prebake” because it precedes the pattern exposure step, is done primarily to drive the residual spinning solvent out of the resist film. The resist film densifies as the solvent evaporates; this is especially important for positive ESCAP DUV resist. At typical non-DUV softbake temperatures between 90 and 110°C, 8 to 2% solvent by weight remains.[138] Although the solvent’s
influence is small, excessive solvent retained in the film makes the required exposure energy for pattern transfer unpredictable. Additionally, the solvent’s presence lowers the resist glass transition temperature (Tg),[136] which allows pattern deformation to occur during subsequent thermal processing such as ion implant and plasma etch. If solvent content is suspected in a process problem, resist thickness can be measured as a function of time and temperature. The curve produced from the thickness data will flatten when solvent loss has stopped. (See Fig. 99.)
Figure 99. Typical positive resist film thickness vs. softbake time data.
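The flattening of the thickness-versus-bake-time curve can be located numerically once the measurements are in hand. The Python sketch below returns the first bake time after which successive thickness readings stop changing by more than a tolerance. The function name, the 5 Å tolerance, and the sample readings are all illustrative assumptions rather than values from the text.

```python
def plateau_time(times_s, thicknesses_a, tol_a=5.0):
    """Return the first time from which every later step changes by <= tol_a.

    Returns None if the curve never flattens within the data supplied.
    """
    n = len(times_s)
    for i in range(1, n):
        later_steps = (abs(thicknesses_a[j] - thicknesses_a[j - 1])
                       for j in range(i, n))
        if all(step <= tol_a for step in later_steps):
            return times_s[i - 1]
    return None


# Made-up softbake readings: bake time (s) and film thickness (angstroms).
times = [0, 15, 30, 45, 60, 75]
thickness = [10500, 10250, 10100, 10040, 10038, 10037]
print(plateau_time(times, thickness))
```

A tighter or looser tolerance would be chosen to match the repeatability of the thickness measurement tool in use.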
The environmental conditions existing during softbake can influence the resist film uniformity. For example, if the softbake chamber’s exhaust is not uniform across the chamber and creates a low-pressure area, localized solvent evaporation can be accelerated there. This can increase the film thickness in these areas. The 3D map of the resist film will show a grading effect across the wafer. Another common problem that occurs is localized film nonuniformity. Defects of this sort appear as bumps or depressions when illustrated in 3D on a wafer map. The cause is particles on the hotplate.
Particles that do not conduct heat well produce bumps, as opposed to particles that produce depressions. To determine whether softbake exhaust or hotplate defects have an influence on film uniformity, compare results for a softbaked wafer to those of one not softbaked. If the film is uniform without the softbake, then the problem is the softbake process.
Coat Conformality Effect on Alignment Targets. The conformal nature of the stressed-dried resist over large pattern structures yields an asymmetric resist coating, which in turn produces an asymmetric alignment signal.[139] As resist flows over alignment marks and other etched structures, thickness variations of several hundred angstroms are not unusual. If the resist were perfectly planar, no issue would exist. However, some asymmetry occurs because the resist flows radially from the center of the wafer and tends to pile to one side of relief structures such as exposure tool alignment marks. This causes incident light from the alignment system to reflect differently from the air/resist interface than from the resist/wafer interface, which produces an alignment signal that may have more than a single peak.[139] Another complicating factor is that most exposure tools use monochromatic illumination for pattern alignment, which is susceptible to thin film interference produced in the photoresist film. The resist thickness variations cause erratic variations in the intensity of light returned to the exposure tool’s alignment detection system, which can be misinterpreted. The effect is worsened with thick resist films and when large-pitch alignment structures are employed.[139] According to Chen and Eskes of Motorola, this problem leads to a situation where the alignment mark center can’t be determined consistently or accurately, which contributes to larger site-to-site and lot-to-lot alignment variations and lower product quality.
Develop Track Operations.
The process of developing the exposed image in the photoresist is not quite as complicated as the photoresist application process, but is no less important. The process sequence is bake, cool, and develop. The bake process is called “post exposure bake” (PEB) for the obvious reason that it follows the exposure step. PEB is done to reduce, if not entirely eliminate the standing waves that form as a result of reflected light from the wafer surface. (See Fig. 100.) The thermal treatment essentially produces a homogenous transition at the interface of the exposed/unexposed area in the resist, in part by diffusing and distributing exposed PAC molecules with unexposed PAC molecules. After PEB, the wafer is cooled to the approximate temperature of the developer solution. In the last step, the exposed resist image is actually developed. The most widely used chemistry for develop is tetra-methyl-ammonium-hydroxide (TMAH), that is 0.26 normal (N). The developer solution is between twelve
and thirteen on a pH scale. Surfactants can be added to developers to enhance contact hole clearing[140] for even out-of-focus conditions. Develop concentration plays a moderate role in image quality and CD control. However, the two factors of most significance are temperature of the solution, and puddle time.[141]
Figure 100. Standing Waves observed on the edge walls of the photoresist without a post exposure bake (left) and with a PEB (right). Figure courtesy of Shipley Co. taken from P. Trefonas III, B. K. Daniels, M.J. Eller and A. Zampini, “Examination of the Mechanism of the Post Exposure Bake Effect,” SPIE, Vol. 920 Advances in Resist Technology and Processing V, p. 203,(1988).
Eo Testing. Before diving too deep into the develop process, we must discuss the energy-to-clear (Eo) test. The Eo test is a large-area pad exposure easily monitored by the naked eye and should not be confused with the exposure to size for a given feature. However, there is a fixed relationship between these two E values, and they change in concert with process shifts. The test procedure involves coat, exposure, develop, and inspection. A resist-coated wafer is exposed with increasing amounts of E, starting below the E required to completely convert the resist’s PAC. Increments of exposure E are usually set 2 millijoules/square centimeter (mJ/cm²) apart from one another. The type of masking (reticle) and exposure pattern vary from fab to fab. Some fabs use reticles with patterns; others use reticles with no pattern at all. Following exposure, the test wafer is developed, then visually inspected to find the first resist-free exposure field.
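The bookkeeping behind the Eo test is simple enough to sketch in Python: build the exposure ladder in 2 mJ/cm² steps, record for each field whether it was judged resist-free at inspection, and report the lowest clearing dose. The function names, the starting dose, and the inspection results below are hypothetical and serve only to illustrate the procedure.

```python
def dose_ladder(start_mj_cm2, step_mj_cm2=2.0, count=10):
    """Exposure doses for successive fields, increasing by a fixed step."""
    return [start_mj_cm2 + i * step_mj_cm2 for i in range(count)]


def energy_to_clear(inspection):
    """Return the lowest dose whose field was judged resist-free, else None.

    `inspection` is an iterable of (dose, is_resist_free) pairs from the
    visual inspection of the developed test wafer.
    """
    for dose, is_resist_free in sorted(inspection):
        if is_resist_free:
            return dose
    return None


# Made-up inspection results for a six-field ladder starting at 200 mJ/cm2.
doses = dose_ladder(200.0, 2.0, 6)
results = list(zip(doses, [False, False, False, False, True, True]))
print(energy_to_clear(results))
```

Logging this value wafer by wafer gives the kind of run-order and gage data that the following discussion relies on.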
The Eo test is quick and reliable, and is a tool that can be used to troubleshoot process anomalies. The gage study of Fig. 101 is an example showing that test results are repeatable and reproducible, even when multiple people and samples are used in the testing process. However, there can be a “run order” effect, as shown in Fig. 102. Temperature affects Eo. The Eo vs. run-order experiment shows us the “wafer loading” effect of a TEL Mark 7 develop track. The first six wafers do not have the same wait time at the chill plate as all subsequent wafers. The loading effect occurs because the develop process cycle is longer than the previous steps (i.e., baking and cooling), and wafers entered the develop module at a lower temperature, which decreased Eo. As the track was loaded up with more wafers, each wafer eventually had the same amount of wait time for the develop process module and Eo stabilized.
The Eo test is used to monitor established processes, but can also be used for process characterization or optimization. As a process monitor, the test detects changes in fab humidity, resist thickness, resist photospeed, developer concentration, and equipment matching. In the following sections, we’ll discuss each of these effects. From a process understanding point of view, the test can be run in such a way as to generate a contrast curve. A contrast curve is a graph of normalized resist thickness versus the log of exposure time. An example of a contrast curve is shown in Fig. 30b. The data for the chart come from the Eo test. Each exposure field’s resist thickness is measured, normalized to one, and charted on the Y-axis. The slope of the resulting graph is an indirect measure of, and predictive of, the resist image sidewall.
Eo and Humidity. Wafer fab humidity during the exposure process affects Eo. Moisture in photoresist is essential to convert the sensitizer to an acid during the UV exposure step.
Without water, or with less than 20% RH, an ester is formed and the resist’s dissolution rate slows.[142] Notably, it is not the level of humidity at the time the resist is coated that influences Eo. That does not seem to matter because resist humidity equilibrates to ambient RH within 30 seconds or less.[142] Therefore, it is the wafer fab’s RH that affects Eo. As it turns out, lithography areas prefer RH to be above 35% to reduce electrostatic discharge, which damages the chrome on the masking plates. RH above 40% is not good according to Bruce et al. The IBM researchers showed that RH above 40% degraded resist contrast.[142] They attribute the effect to acceleration of the resist’s dissolution rate. From these data, we can derive the fab specification for RH: 35% < RH < 40%.
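Returning briefly to the contrast curve described for the Eo test, the slope of its steep region can be estimated with a simple least-squares fit. The sketch below is a rough illustration, not the method used in the cited studies. It assumes exposure dose as the x variable (proportional to exposure time at fixed intensity), it fits only points whose normalized thickness lies between 0.1 and 0.9, and the data are made up so that the slope is easy to verify.

```python
import math


def contrast_from_curve(doses, norm_thickness, lo=0.1, hi=0.9):
    """Estimate positive-resist contrast from contrast-curve data.

    Fits a least-squares line to normalized thickness vs. log10(dose),
    using only points with lo <= thickness <= hi (the steep part of the
    curve), and returns the negative of the fitted slope.
    """
    pts = [(math.log10(d), t)
           for d, t in zip(doses, norm_thickness) if lo <= t <= hi]
    if len(pts) < 2:
        raise ValueError("need at least two points in the fitting window")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxy / sxx


# Made-up data: thickness falls by 0.2 for every 0.1 decade of dose.
doses = [10 ** (1 + 0.1 * k) for k in range(6)]
thickness = [1.0, 0.8, 0.6, 0.4, 0.2, 0.0]
gamma = contrast_from_curve(doses, thickness)
print(round(gamma, 3))
```

The fitting window is a judgment call; real curves have a rounded shoulder and foot, so the window should bracket only the linear portion.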
Figure 101. Example Eo gage study for a modern Fab.
Figure 102. (a) Eo vs. run-order with a preceding delay and (b) without any process delay, i.e., five dummy wafers prior to the run.
A study conducted by Motorola engineers at COM1 quantitatively showed the effect RH has on Eo. Referring to Fig. 103, Eo increased 10 to 15% as RH dropped from 35 to 15%. As mentioned earlier, there is a relationship between Eo and Esize. As Eo increased due to humidity, the mean CD increased too. As seen in Fig. 104, Eo shifted upward from 210 to 230 mJ/cm2, and CDs increased from 0.4921 µm to 0.5052 µm. The 13 nm increase constitutes a 2.6% change, or almost 10% of the entire CD budget.
Developer Concentration. A response surface analysis conducted at Motorola’s ACT center evaluated the influence of developer concentration on the quality of the resist image, or contrast as it is often called, for the early i-line resist Aspect System 9.[105][141] The test conditions of Malhotra et al. included developer concentrations of 0.20, 0.215, and 0.23 N. As Fig. 105 shows, there is a strong quadratic response for contrast, which indicates that a moderate concentration of developer will produce the best image. This has an added benefit, as Malhotra points out: lower developer concentrations reduce surface roughness[143] and decrease CD variation. Even though the optimum developer concentration was 0.215 N for this example, modern resists are usually designed to develop optimally with 0.26 N TMAH.
Develop Temperature. Develop temperature also affects the dissolution rate and the sidewall angle of the resist image.[141] What is most interesting about this process is temperature’s inverse relationship to resist dissolution rate. That is, as temperature increases, the rate of dissolution decreases.[138][141] A fast develop process may seem desirable. However, if the develop solution is too aggressive, the image quality degrades because the top of a line becomes smaller than the bottom, and more unexposed resist loss is typical for concentrated developers. The resist’s profile, or sidewall angle as it is sometimes called, may drop below a factory’s criteria.
These criteria are elaborated upon in Sec. 3.1. As part of Malhotra’s contrast response surface analysis of the develop process, develop temperatures were investigated. Referring once again to Fig. 105, the develop temperature response is quadratic in nature. The lower develop temperatures did produce poorer contrast. Lower contrast was also observed at the higher temperatures, although the effect was not as significant. These data support our position that a more aggressive solution (i.e., a low-temperature solution) degrades image quality.
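A quadratic response of this kind has a single optimum, which can be located from three settings of one factor. The sketch below fits a parabola through three points and returns its vertex. It is a one-dimensional slice only, not the full response surface analysis of Ref. 141. The three concentrations are those quoted above, but the contrast values are placeholders invented for illustration and are not measured data.

```python
def quadratic_vertex(p1, p2, p3):
    """Fit y = a*x**2 + b*x + c through three (x, y) points.

    Returns (x_opt, y_opt), the vertex of the fitted parabola.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    denom = (x1 - x2) * (x1 - x3) * (x2 - x3)
    a = (x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) / denom
    b = (x3**2 * (y1 - y2) + x2**2 * (y3 - y1) + x1**2 * (y2 - y3)) / denom
    c = (x2 * x3 * (x2 - x3) * y1
         + x3 * x1 * (x3 - x1) * y2
         + x1 * x2 * (x1 - x2) * y3) / denom
    if a == 0:
        raise ValueError("points are collinear; no quadratic optimum")
    x_opt = -b / (2 * a)
    return x_opt, a * x_opt**2 + b * x_opt + c


# Concentrations (N) from the text; contrast values are PLACEHOLDERS only.
x_opt, y_opt = quadratic_vertex((0.20, 2.0), (0.215, 2.5), (0.23, 2.0))
print(round(x_opt, 3), round(y_opt, 2))
```

With only three points the parabola passes through the data exactly, so the result says nothing about experimental error; a designed experiment with replicates is needed for that.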
Figure 103. Eo vs. %RH for a standard positive resist and a fab built circa 1993.
Figure 104. CD vs. Eo for a fab running at 0.5 micron in production.
Figure 105. Resist contrast vs. concentration and temperature for Aspect System 9, immersion developed. (See Ref. 141.)
Develop Time. Choosing a develop time can be complicated because it involves a trade-off between image quality and manufacturability. The problem is that long puddle times are required to produce the sharpest image,[141] but they slow the track down too much. It is best to first optimize both the develop time and concentration with a fixed develop puddle time using a response surface (RSM) experimental design. A common develop puddle time used for experiments, and in production, is 60 seconds. Run the RSM and choose the developer’s concentration and time based on the highest contrast. Since contrast responds linearly with develop time,[141] a simple time versus contrast test can then be run. Decide on the develop time that meets both throughput and image quality requirements.
Develop time for the track recipe is not chosen based on the amount of time required to dissolve exposed photoresist. That time is a function of the developer’s pH. Typical rates of dissolution are between 1500 and 2000 Å/sec for a developer with a pH between 12 and 14.[130] In contrast, dissolution of unexposed resist occurs at roughly 10 Å/sec. Developer dissolution rate can be measured empirically by monitoring the time it takes to completely dissolve the exposed area of resist. However, the test results are not that useful from a manufacturing viewpoint, and can be misleading. For example, if the rate of dissolution is 2000 Å/sec, and the
film is 10,000 Å thick, the exposed film would dissolve in 5 seconds (10,000 Å ÷ 2000 Å/sec). While theoretically this is true, it is just not that simple.
Develop Track/Module Matching. Matching developer tracks is essential to controlling CDs in a factory where more than one stand-alone develop track is used. This is also the case in cell tracks with more than one develop module. The Eo test can be used to match tracks and modules because Eo decreases as develop time increases. For example, if two wafers are coated and exposed exactly the same, then developed in two different modules (or tracks), Eo should be the same. If the two wafers’ Eo results don’t match, changing the develop time will make the two modules match. The test is quick, and requires no metrology tool to read the results. For this reason, the Eo test is often used to monitor stand-alone and cell track system matching.
It may be necessary to match develop tracks with more accuracy than an Eo test can provide. When a fine scale is required, there is no replacement for measuring CDs. Again, for the test to be effective, all test wafers are coated and exposed on the same tools, and in the same manner. The test wafers are distributed to as many develop systems as there are in the factory; at least two wafers for each develop module are recommended. Line widths are measured on at least five sites for each wafer. The results are averaged, then compared to a known reticle size or to another known standard. The difference from one wafer to another developed on the same track, and the difference between modules, should be as small as possible. In a case study conducted at Motorola’s COM1 factory, two developers were in production in each lithographic cell (i.e., not stand-alone), and the engineering staff wanted to make sure the average CD from each develop module matched. An SPC chart was set up that monitored the difference between the two systems (Fig. 106). The chart shows the develop modules matched to within 0.025 µm.
This CD matching is critical to obtain throughput because the develop module is throughput limiting, and lot wafers must alternate between develop modules to maximize it. This is also why there are two or more develop modules per coat module in cell tracks.
Developer Dispense. The method used to dispense developer onto the wafer surface influences defectivity, CD uniformity, and developer consumption. In the next few sections we’ll discuss these three factors in detail, because it is the moment at which the developer solution is dispensed that defines the develop process latitude.
Figure 106. SPC control chart for a TEL Mark 8 develop track illustrating develop module matching for line CD (0.35 micron) control. CD23AVG-CD24AVG is the difference in mean CD for wafers from the two modules.
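The quantity tracked on a chart like Fig. 106 is simply the difference between the mean CDs of the two modules. A minimal sketch follows; the function names and the CD readings are hypothetical, and the 0.025 µm limit is taken from the matching result quoted in the text rather than from any published control limit.

```python
from statistics import mean


def module_cd_delta(cds_module_a, cds_module_b):
    """Difference in mean CD (µm) between two develop modules."""
    return mean(cds_module_a) - mean(cds_module_b)


def modules_match(cds_module_a, cds_module_b, limit_um=0.025):
    """True when the mean CD difference is inside the matching limit."""
    return abs(module_cd_delta(cds_module_a, cds_module_b)) < limit_um


# Hypothetical five-site CD readings (µm) from one wafer per module.
module_a = [0.352, 0.349, 0.351, 0.350, 0.353]
module_b = [0.348, 0.347, 0.349, 0.350, 0.346]
delta = module_cd_delta(module_a, module_b)
print(round(delta, 4), modules_match(module_a, module_b))
```

In practice each point on the chart would pool at least two wafers per module, as recommended above, before the difference is taken.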
Develop Defectivity. The develop solution has a tendency to foam during dispense due to its aqueous composition. Even develop solutions without surfactant foam, just not as much. Foaming must be avoided because the bubbles prevent development of the image in localized areas (see Fig. 107).[144] The circular defect is an artifact of the air bubble trapped in the develop solution. The ideal dispense will produce a bubble-free puddle. There are three commonly used methods for dispensing developer, usually specified by the track supplier: the E2, spray, and stream nozzles. TEL tracks equipped with the E2 nozzle inhibit bubble formation by flooding the wafer with a high-flow, low-pressure dispense that covers the wafer quickly. The alternative spray dispense produces many bubbles during dispense. However, the bubbles eventually stop forming with the continued spray of fresh developer on the wafer. The third method is the stream dispense. Bubbles are not easily formed with this type of developer application. Consequently, the process doesn’t generally suffer from the defects mentioned earlier.
Developer must remain bathed in N2 to prevent formation of “CO2 scum.” Any time the developer is exposed to air, the potential exists to form the scum defect through carbon dioxide absorption. The defects that form are insoluble and are very difficult to rinse off, as seen in Fig. 108 (CO2 scum defect). The particles are semi-transparent and range in size from one to tens of microns. Common places where CO2 scum forms are dispense nozzles and pressurized canisters. In-line filters will keep scum from the canisters from reaching the wafers, but there is no filtering at the point of dispense. Therefore, special attention must be paid to this area. The TEL tracks bathe the E2 dispense nozzle with N2 while it rests in its home position. Tracks that do not have this option should pre-dispense developer before dispensing on the wafer.
However, even the N2 bath cannot prevent formation of CO2 scum if developer is drawn back into the nozzle too far and air is trapped. Figure 109 shows the signature of a contaminated E2 nozzle.
CD Uniformity. The uniformity of CDs across wafers can be affected by the develop dispense. The problem is seen when developer is dispensed in one location. Take, for example, a stream dispense nozzle applying developer at the center of the wafer. The CDs at this location will be different from those at any other location because the solution is constantly being refreshed at that one spot. The developer dispensed in one spot is too aggressive, and creates “hot spots.” Therefore, stream dispenses are done by dispensing developer into a water puddle that is built first. Both spray and E2 nozzle application of developer yield comparable results.[144]
Figure 107. Micrograph of NDO defect from Ref. 117.
Figure 108. Micrograph of CO2 scum defect from Ref. 117.
Figure 109. Contaminated E2 nozzle defectivity wafer signature. The pattern is the develop dispense pattern.
Development Dispense Volume. Although TMAH developers cost hundreds of dollars per liter less than photoresist, the volume used for each dispense is ten to thirty times greater. Dispense volumes should always be the minimum required for a stable process, or chemical costs will be exorbitant. For example, consider a resist that costs $200/liter and a developer that costs $4/liter in use at a high-volume production fab that does 100,000 wafer alignments each week. Assume dispense volumes are the industry average for eight-inch wafers, i.e., 2 ml/wafer for resist and 40 ml/wafer for developer. It would cost $40,000 a week to coat the wafers (100,000 × 0.002 liter × $200/liter), and an additional $16,000 a week to develop them (100,000 × 0.040 liter × $4/liter). As seen in this example, developer is a significant cost of doing business. The most wasteful of the three dispense methods is the stream dispense, which uses 100 ml/wafer for complete coverage. The spray method is second worst, requiring 50 ml/wafer. The most efficient method of dispense is the E2 nozzle, which uses only 30 ml/wafer.[144]
4.2 Resist and Develop Track Fab Qualification
Starting up a new or used piece of track equipment begins before the tool arrives. Regardless of whether it is a new or used purchase, or a transfer from a related company department, an agreement between the supplier and the customer must be made. Criteria and expectations for the tool’s performance, its capabilities, and the requirements for facilities and environmental compliance must be explored and understood before the tool arrives on the production floor. A poorly planned installation or tool qualification can lead to poor tool performance and severe tool availability delays.
Track Configuration. The first step in bringing a new tool on-line is the hardware configuration of the machine. Tracks are not simple purchases; they are built to match hardware, software, and process requirements specified by the customer according to manufacturing requirements. Begin with the process flow, which determines the type of hardware required for the job. A common photolithographic process flow is cassette→adhesion→cool→coat→bake→cool→cassette. If an exposure tool is in-line (i.e., interfaced to the track), then the developer is also required in the track floor plan. The flow would be extended to include exposure→bake→cool→develop→cassette. The throughput requirements specified by manufacturing dictate the number of adhesion units, hot plates, cool plates, coat modules, and develop modules needed to lay out the wafer track.
Next, the type and number of hardware pieces are determined. There is no need to purchase a million-dollar, state-of-the-art, half-micron-capable tool to run a 2.0 micron process; it is not economically feasible. Each piece of hardware and software specified should be verified before the machine is shipped to the user factory by completing a “source inspection” of the tool. This entails visiting the toolmaker and confirming in detail every aspect of the tool. The module types and layout should be verified, as should support systems such as environmental controllers, resist pumps, and chemical cabinets. The tool’s safety features, such as CO2 fire suppression units, should be reviewed by an equipment engineer, as should all electrical and plumbing connections. A thorough source inspection at the factory is critical, because this is where corrections can be made and any misunderstandings resolved.
Track Startup. After the tool arrives, it is installed by field service personnel. They set up and run initial tests confirming functionality and general operations. Often the supplier’s field service representatives use a checklist (e.g., the TEL start-up checklist) that is carefully executed to cover their obligations. Nevertheless, the process engineer and local technicians are required to define temperature settings, flow rates, chemical configuration, canister pressure, and numerous other things for the service personnel. Once this step is complete, the customer verifies that the setup meets the agreed-upon configuration and the customer qualification. The hardware is always tested first. The reasoning is that if the hardware is not capable, then it is likely the process will not meet specification either. Tests are run in a logical order, each test building upon its predecessor. The tool must pass a wafer-handling test. This test uses every mechanical function the machine has to offer.
At least 5,000, sometimes 10,000, wafer transfers must occur, usually with no more than one failure and one assist. Next are tests measuring the accuracy and repeatability of heating and cooling plates, including the adhesion module, followed by the environmental T/H control system, water cooling loops, spin motor functions, and lastly, wafer and nozzle centering. See Table 19, the Motorola COM1 2.0 Qualification Summary;[145] notice that the example’s test order closely matches what was just described as the TEL checklist example, and it could be used as a good checklist model.
Thermal Devices. Temperature control testing requires that all heating plates, including the adhesion module, and all cooling plates meet a temperature range and plate uniformity specification. Data are collected over a 3 to 5 day period using either a wafer with five or more thermocouples (TC), or a hand-held probe that can be placed on at least five locations around the hot
or cool plate. As shown in Table 19, the low oven specification for regulation was 50 to 120°C with a uniformity specification of ± 0.2°C. At a plate temperature setting of 100°C, the unit passed with an average temperature of 99.93°C and a 1-sigma value of 0.069°C for the five TC readings, or a total range of roughly 0.2°C. A high temperature oven specification is not as tight as a low temperature oven specification because the plate and controller are not as capable. The high temperature oven specification is typically five times more liberal for plate uniformity. Cooling plates require the same tight uniformity control as low temperature ovens, generally with a narrower range of temperature regulation, such as 15 to 30°C.
The adhesion module requires more than a high temperature hot plate thermal qualification for uniformity and operating range. It must also be able to uniformly treat a wafer’s surface with HMDS. Contact angle measurements are collected over a 3 to 5 day period, processing at least two wafers, and then analyzed as a multivari study. Three families of variation are evaluated: site-to-site (within-wafer), wafer-to-wafer, and run-to-run. For water-droplet-on-silicon-wafer testing, the contact angle specification is written to include an angle for a given temperature and process time. Referring to the case study in Table 20, a specification can be written to quantitatively specify adhesion performance. A suggested process capability for within-wafer variation might be ± three degrees, 1-sigma, based on personal experience, and possibly looser if the goniometer has poor gage capability. An example of a gage capability study conducted for a Video Contact Angle (VCA) 2000 goniometer at Motorola’s ACT is shown in Fig. 110. It shows that measurements are repeatable and reproducible more than 70% of the time to a specification of ± 10°.
State-of-the-art resist coaters and developers are equipped with an array of temperature control devices (see Table 19). Specific to coaters, environmental units are used to regulate temperature and humidity in the catch cup module. The units are expected to have a temperature operating range from 18 to 24°C with no more than ± 1.0°C variation, and humidity control from 30 to 50% RH with no more than ± 1.0% variation. Water cooling loops maintain the temperature of photoresist and developer supply lines by circulating temperature-controlled water through jackets surrounding the chemical lines. The temperature is measured near the point of dispense by affixing a TC to the line. Collecting data once a day for 3 to 4 days and calculating the average and standard deviation provides a statistical snapshot of the tool’s control performance. The mean temperature measured should be within ± 0.2°C of the set point. If the
day-to-day variation, although not usually specified, exceeds ± 0.2°C, this might indicate a control problem. The system should be checked out; otherwise the resist thickness uniformity tests, run later, will fail a multivari study. Cooling water loops that remove heat from the motor flange should be tested in the same manner as the resist temperature, but here again no specification is established, so the data are used primarily to troubleshoot a resist uniformity issue.
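The cooling-loop check described above reduces to a mean, a standard deviation, and two comparisons. The sketch below is one possible reading of it. The function and key names are hypothetical, the readings are made up, and "day-to-day variation" is interpreted here (as an assumption) as the largest deviation of any daily reading from the multi-day mean.

```python
from statistics import mean, stdev


def check_loop_temperature(daily_readings_c, set_point_c, limit_c=0.2):
    """Summarize daily TC readings (°C) taken near the point of dispense."""
    avg = mean(daily_readings_c)
    offset = avg - set_point_c
    # Assumed definition: worst single-day deviation from the mean.
    worst_day = max(abs(t - avg) for t in daily_readings_c)
    return {
        "mean_c": avg,
        "sigma_c": stdev(daily_readings_c),
        "offset_c": offset,
        "mean_in_spec": abs(offset) <= limit_c,
        "day_to_day_flag": worst_day > limit_c,
    }


# Made-up readings for four days against a 23.0 °C set point.
result = check_loop_temperature([23.05, 22.95, 23.10, 23.00], 23.0)
print(result["mean_in_spec"], result["day_to_day_flag"])
```

A raised day-to-day flag does not fail the tool by itself, since the text notes this variation is not usually specified; it is a prompt to inspect the loop before running thickness uniformity tests.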
Table 19. TEL Mark 8 - SFTRKO1
Table 20. Temperature vs. Contact Angle Using Sixty-second Process Time
Temp (°C)    Contact Angle (degree)
50           63.5
100          67.0
150          68.5
200          72.5
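To turn Table 20 into a number that can be written into a specification, the four points can be summarized with a straight-line fit. This is only a sketch: it assumes the response is roughly linear between 50 and 200°C, which four points cannot confirm, and the helper name is our own.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Data from Table 20 (sixty-second process time).
temps_c = [50, 100, 150, 200]
angles_deg = [63.5, 67.0, 68.5, 72.5]
slope, intercept = fit_line(temps_c, angles_deg)
print(round(slope, 3), round(intercept, 2))  # 0.057 60.75
```

Under this assumption the contact angle rises by about 0.06 degree per °C, so a 10°C error in the adhesion plate setting would shift the angle by roughly 0.6 degree.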
Figure 110. VCA 2000 contact angle measurement tool gage study example for an 8” modern Fab. (This study courtesy of V. Louis and T. Ko of Motorola CDMC.)
Resist coater and developer spin motors are tested for speed control and acceleration. The equipment supplier tests the motor’s performance before shipment, but a simple in-house test of speed control to verify performance is wise. Acceleration is usually verified only through supplier documentation. A motor for a state-of-the-art track system has an operating range from 0 to 6000 rpm with ± 2 rpm variation control, and speed can be changed in 1 rpm increments. The motor’s performance is monitored via the user interface and/or verified using an external calibrated digital tachometer.
Wafer coat centering is critical for resist within-wafer uniformity and EBR quality. Visual inspection of a wafer rotating on the chuck, at 100 rpm or less, is a qualitative method of checking centering. While the bare silicon wafer is spinning, the reflection seen on the surface will be undisturbed if centering is good. If the reflection on the wafer’s surface is moving erratically, centering is not good. A quantitative method requires coating a wafer and using the top EBR to clean approximately 5 mm of resist from the edge of the wafer. This test method is also used for qualifying the top EBR mechanical setup and process. The EBR cut is measured at three or four locations around the wafer perimeter. The difference between measurements should not exceed 0.02 mm. A ruler is not capable of this kind of measurement. Two common production tools are capable of measuring EBR width in microns: optical microscopes with precision digital stages, and the KLA 5XXX overlay inspection station.
Process Multivari Studies. Following the wafer track hardware qualification, and in preparation for the process qualification, the operating conditions for all chemical fluids must be set up.
If the tool is the first of its kind in the manufacturing area or lab, there is a great deal of preliminary work to be done, including designed experiments (DOEs), to establish the operating conditions. In this case, refer to the earlier sections of this chapter that describe in detail the types of factors to include in the experiments. Assuming the track system is a new install from an established equipment base, the resist, EBR, developer, and HMDS dispense parameters (i.e., pressure, time, flow, etc.) are set up according to existing customer specifications. Next, test the equipment’s performance under those set conditions. Resist and developer dispense volumes are tested once a day for 3 to 5 days. The gage capability of a microbalance can be a problem with small dispense volumes, but this is overcome by making multiple dispenses. (Refer to the example gage capability study, Fig. 111, that was run at Motorola’s COM1 wafer fab.) The specifications for resist and developer dispense repeatability are ± 0.2 ml and ± 1 ml, 3-sigma, of the targeted dispense volume, respectively. If the tests pass, then the set points for pressures, times, and flows are repeatable and stable. Up to this point, the equipment was being qualified; it is now time to qualify the process.
Figure 111. Micro-balance gage capability study example for measuring wafer track dispense volumes gravimetrically.
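A gravimetric dispense check of the kind studied in Fig. 111 can be sketched as follows. The function names are hypothetical, the liquid density must be supplied by the user, and the pass criterion (mean ± 3 sigma contained within target ± tolerance) is our interpretation of the "± tolerance, 3-sigma" wording, not a documented fab rule. The sample volumes are made up.

```python
from statistics import mean, stdev


def volume_per_dispense(total_mass_g, n_dispenses, density_g_per_ml):
    """Average volume (ml) per dispense from one weighing of n dispenses.

    Weighing several dispenses together reduces the balance's gage error
    relative to the quantity being measured.
    """
    return total_mass_g / density_g_per_ml / n_dispenses


def dispense_in_spec(volumes_ml, target_ml, tol_ml):
    """True when mean +/- 3 sigma lies inside target +/- tol (assumed rule)."""
    avg = mean(volumes_ml)
    three_sigma = 3 * stdev(volumes_ml)
    return (avg - three_sigma >= target_ml - tol_ml
            and avg + three_sigma <= target_ml + tol_ml)


# Made-up daily developer volumes (ml) against a 40 ml target, +/- 1 ml.
developer_days = [40.1, 39.8, 40.2, 39.9, 40.0]
print(dispense_in_spec(developer_days, target_ml=40.0, tol_ml=1.0))  # True
```

The same two functions apply to resist with a 2 ml target and a ± 0.2 ml tolerance, where the multiple-dispense weighing matters most.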
The resist coat and develop processes are qualified by running multivari studies. If not already done, these processes must first be optimized using the recipe conditions and temperature settings described earlier in this chapter and section. When satisfied, conduct the resist thickness multivari studies and evaluate all three families of variation, i.e., site-to-site, wafer-to-wafer, and day-to-day. For example, run 25 wafers per day for 5 days and measure thickness on at least 13 sites, preferably 49 sites. Start measuring from the center of the wafer, moving outward radially to within ≈10 mm of the wafer edge. Using a statistical software package such as JMP™, calculate the families of variation. An optimized process will meet a 10 Å, 1-sigma, specification for site-to-site, wafer-to-wafer, and day-to-day variation.
A develop process qualification requires that resolution and critical dimensions (CDs) be evaluated. The multivari study for develop follows the same plan as for coat: 25 wafers per day, 17 sites per wafer, for five days. The output is CD uniformity. The specification for this test depends on the exposure tool capability; for the sake of this example, assume the established process is 0.35 µm. As a rule of thumb, a lithography process is capable if no more than 10% of the CD budget is used. In this example, the process specification for a printed 0.35 µm line/space pair is ± 0.035 µm, 6-sigma, for both the site-to-site and wafer-to-wafer families of variation. As shown in Table 19, the tool passed the process specification for resolution and CD uniformity. An example of a test that could be run after the track qualification to monitor track performance is the CDMV test (see Fig. 112). The characteristics of the test are listed in the figure; note that the test qualifies the entire process on a blank silicon wafer, a perfect substrate for qualifying the process daily, or even weekly if the data support that sampling.
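For readers without a statistics package at hand, the three families of variation in a multivari study can be approximated directly from the raw data. The sketch below is a simplified descriptive breakdown, not the variance-components analysis a package such as JMP would perform: it assumes a balanced design, and the wafer-to-wafer and day-to-day figures are computed from means, so they still contain a small contribution from the lower levels. The data layout and the tiny data set are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def families_of_variation(data):
    """data[day][wafer] is a list of site measurements (e.g., thickness in Å).

    Returns 1-sigma estimates for the three families of variation.
    """
    # Site-to-site: pooled variance of sites within each wafer.
    within_wafer = [variance(sites) for day in data for sites in day]
    # Wafer-to-wafer: variance of wafer means within each day, pooled.
    between_wafers = [variance([mean(sites) for sites in day])
                      for day in data]
    # Day-to-day: variance of the daily means.
    day_means = [mean([mean(sites) for sites in day]) for day in data]
    return {
        "site_to_site": sqrt(mean(within_wafer)),
        "wafer_to_wafer": sqrt(mean(between_wafers)),
        "day_to_day": sqrt(variance(day_means)),
    }


# Hypothetical 2 days x 2 wafers x 3 sites of thickness data (Å).
study = [
    [[10000, 10010, 10020], [10010, 10020, 10030]],
    [[10020, 10030, 10040], [10030, 10040, 10050]],
]
sigmas = families_of_variation(study)
print({name: round(value, 2) for name, value in sigmas.items()})
```

Each of the three values would then be compared with the 10 Å, 1-sigma, specification quoted above; the same function applies to CD data from the develop multivari study.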
The EBR process qualification involves a multivari study, as we have seen for the other process qualifications. Additional wafers are not required because the same wafers used for the resist thickness and/or CD uniformity multivari tests can have the EBR measurements taken before reclaim. As described in the discussion of wafer centering, the EBR cut is measured at three or four locations around the wafer perimeter, for example at 0°, 90°, 180°, and 270°. The specification is ± 0.3 mm, 3-sigma. Again, a ruler cannot meet this kind of specification. As mentioned earlier, use a microscope with a precision digital stage and x/y coordinate display, or a KLA 5XXX inspection station, to complete the task.
Figure 112. SPC example control chart for the CDMV test used to monitor track CD performance and control.
Track Defectivity and Contamination. Defectivity, particle, and contamination tests are run to ensure the equipment will produce wafers with high yield. Defectivity tests involve printing a snake pattern, which is either lines in a serpentine pattern or an array of holes in the resist, spaced appropriately for the manufacturing area’s lithographic capability. Run at least two wafers through the coat, expose, and develop process. Then inspect on a KLA or similar inspection station that performs die comparisons. The specification for such a test depends on the fab’s technology level. Using the same example as for the developer qualification, a state-of-the-art fab requires the defect density to be less than 0.05 defects/cm2 at 0.3 µm.
Particle checks are done using at least two wafers per run on both uncoated and resist-coated silicon wafers. Before wafers are processed, pre-counts are taken using the KLA 2XXX tool, Tencor 6100 or 7600 surface scanning tools, or equivalent. Wafers are subjected to either wet or dry processing in the coater and developers. Usually the test is repeated several times. The specification for uncoated (i.e., dry-processed) wafers is fewer than 5 particles greater than 0.3 µm on the front side. Resist-coated wafers should have no more than five particles/defects total. With the polished side facing down, backside particles can be counted. The specification here is fewer than 500 particles larger than 0.5 µm and fewer than 1000 particles larger than 0.3 µm.
Contamination testing, such as for mobile ions and bacteria, is usually done only once. Two virgin wafers are processed through the track systems and sent to an analytical lab with a control wafer, where total reflection x-ray fluorescence (TXRF) is conducted. Elements of concern are Na, K, Fe, Cr, and Zn. For the bacteria testing, samples of DI water are taken from the rinse nozzles on the developer and cultured for 48 hours.
The specification for the number of bacteria colonies varies from one area to another. It is best to consult the Quality Assurance personnel for guidance, but the result sought is no colonies.
4.3 DUV Resist Wafer Tracks
The requirements for DUV wafer track processing systems can be more stringent than those for the non-DUV resists described in Section 4.1. To some degree, this is due to the ever-shrinking resist CD requirement, but it is also related to the extensive use of CA DUV resists and their inherently greater susceptibility to environmental conditions. Table 21 lists the 248 nm DUV track specifications for 300 mm wafers.[146] Although this table is for 300 mm, it is also representative of the requirements for 200 mm wafer DUV tracks.
Table 21. DUV 248 nm KrF Track Specifications
Attribute                                           Units           Metrics
Equipment parameters
  ARC                                                               optional
  Edge bead removal (EBR)                                           optional
Process targets
  Process resist thickness                          µm              0.76
  Coating uniformity total variability (3σ)         nm              1.2
  Coating uniformity within wafer (3σ)              nm              <5
  Coating uniformity wafer to wafer (3σ)            nm              <5
  Develop uniformity total variability (3σ)                         5.5
  Bake uniformity total variability (<110°C) (3σ)   °C              0.2
  Bake uniformity total variability (<150°C) (3σ)   °C              0.35
  EBR Δ radius (3σ)                                 mm              0.2
  EBR Δ radius (3σ)                                 degree          1
Process characteristics
  Contact angle after adhesion process              degree          60–70
  NH3 concentration                                 ppb             <1.0
  Life of chemical filter in 50 ppb cleanroom       months          12
Defect - PWP
  In-film at 0.20 µm                                #/wafer         0.0051/cm2
  On bare Si at 0.09 µm                             #/wafer         0.0252/cm2
  Backside on Si at 0.20 µm                         #/wafer         0.30/cm2
Cost/perform target
  Throughput*                                       wph             79
  Tool capital cost                                 $M              1.6
  MTBF                                              hour            700
  MWBA                                              wafer           2000
  MTTR                                              hour            3
  Preventative maintenance                          hour/week       3
  Consumables+                                      $/wafer pass    4.5
  Area per tool                                     m2              5.0
  Support area per tool                             m2              2.6
COO Target
  COO Objective                                     $/wafer pass    5.81
* At nominal dose for 0.18 µm, including ARC and no EBR
+ Including resists, parts, laser and gases
Central to wafer track needs, in general, is the need for systems to be interfaced to exposure units, or “lithographically configured and controlled as a cell” (see Ch. 6). The cell cluster approach leaves less room for process delay errors due to stand-alone equipment wafer handling, and provides better contamination and isolation control, shortened overall cycle time, and better logistical wafer control. This requires that the track be capable of the throughput of the exposure tool; the track must keep up. FSI and TEL have improved their robot handling track designs to the point where throughput matching is no longer an issue. They have reduced handling movement overheads, improved module layouts, and even added robots to achieve the desired throughputs. The most recent systems now have the throughput capacity to match even the run speeds of the low-shot-count wide-field steppers (e.g., Canon 3000 iW) used for less critical layers like polyimide or bond-pad contact layers.
Throughput and process control are almost always interrelated; i.e., achieving precise process transfer delay control can impact throughput if the main arm is the only transfer arm. Track software must be able to balance the demands of throughput against CD process control while keeping the exposure tool loaded. This software must also be able to run in parallel and cascading modes, where different recipes and flows can run for at least two recipes and new lots can be started before in-process lots are done. The track must also have buffer lot capability in case the exposure tool has a temporary delay. Modern tracks like those of TEL and FSI have these capabilities, which are becoming increasingly necessary as the mix for DUV increases beyond the current level of roughly 5% of photolayers.
Track Footprint and Ambient Control. Compact wafer track designs for throughput and process control can also have a fairly small footprint, another important COO consideration.
FSI, for example, has achieved a cluster track with two robots capable of 100 8" wafers/hr in 80 sq. ft. It is totally enclosed for ambient base contamination protection with Donaldson charcoal filters and is class 1 internally. The airborne bases common to modern MOS wafer fabs (TMAH, NMP, NH3, HMDS, and various other base compounds in lower amounts) must be controlled to less than 1 ppb. If left uncontrolled, these base contaminants can destroy the acid catalyst at the surface of CA DUV resist films, thus destroying image profile development and size control. Surprisingly, large amounts of bases are also found in stepper tools, and sub-fabs typically have very high levels (40–100 ppb) as well.
DUV Track Process Cost. Since DUV track processing costs per wafer are dominated by resist costs (i.e., 62%)[147] due to CA DUV resist material costs of ~$2–4 k/gal., pressure is put on users and track companies to reduce dispense volumes to the minimum. Engineers at Tel are predicting that resist volumes will reach 1.5 ml and below for 8" wafers, and smaller volumes are being targeted.
DUV Track Module Process Effects. Researchers at IBM were involved in DUV DRAM production in the early 90s, and they found the stability of the post-exposure bake (PEB) to be the single most important CA positive DUV resist track parameter.[148] This is so because for CA DUV resists the reaction is actually completed and quenched at the bake and chill plates. Their results improved with hotplate controller design improvements, where temperature range variation was reduced from ±0.4 to 0.15°C; mean plate-to-plate temperature variation went from 1 to 0.4°C as well. This increased track bake stability improved CD variation from ~15 nm to ~5 nm, and reduced wafer rework by ~33%. FSI has also recognized the importance of PEB control and process delay control, and developed an integrated bake/chill station for clustered tools. Because the plates and a buffer position are integrated into the same compact module, every wafer receives the same PEB-to-chill delay to within 1 sec, without sacrificing throughput matching to the DUV exposure tool.[148] As CDs continue to shrink to 0.18 micron and below, adaptive control methods will become more necessary. For CD control, within-wafer variation is governed ~46%, or nearly half, by variation in the wafer track processes. Brown et al. have further determined that 70% of wafer-to-wafer variation can be accounted for by the contribution of the wafer track processes.[113] To reduce overall CD variance in the future, wafer track adaptive systems that monitor process conditions will become more prevalent. 
These systems will monitor temperature (T), % relative humidity (RH), barometric pressure, and so on, to allow families of variation to be corrected for in real time and, hopefully some day, on a wafer-by-wafer basis. In the meantime, environmental control systems for tracks will control more precisely. Local environmental control of ±0.2% or 0.4% RH has recently been specified for DUV tracks.[149] Track process chemicals are being tightly temperature-controlled, and PEB plate temperatures are moving to ±0.1°C control for plates near 100°C; these plates are monitored by wafers with embedded thermocouples like those made by SensArray. In summary, wafer tracks are developing into high technology cluster tools; they are not just simple chemical dispensing units anymore. Productivity in these systems will continue to advance, and CD budgets dictated by track process control sub-systems will continue to shrink as design rules improve to the 0.1 micron and below levels.
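The resist cost pressure described above for DUV track processing can be made concrete with a short calculation. The sketch below is a rough illustration, not a cost model from the cited references: it converts the quoted resist price of ~$2–4 k/gal. and a 1.5 ml dispense into a resist cost per wafer. The function name and the use of the US liquid gallon are assumptions introduced here.

```python
# Illustrative sketch only: resist material cost per wafer from price and dispense volume.
# Assumption: "gal." in the text means the US liquid gallon (3785.411784 ml).
ML_PER_US_GALLON = 3785.411784


def resist_cost_per_wafer(price_per_gallon_usd: float, dispense_ml: float) -> float:
    """Return the resist material cost (USD) for one wafer coat."""
    price_per_ml = price_per_gallon_usd / ML_PER_US_GALLON
    return price_per_ml * dispense_ml


if __name__ == "__main__":
    # Price range and dispense volume quoted in the text.
    for price in (2000.0, 4000.0):
        cost = resist_cost_per_wafer(price, dispense_ml=1.5)
        print(f"${price:,.0f}/gal at 1.5 ml/wafer -> ${cost:.2f} per wafer")
```

Under these assumptions the resist alone costs roughly $0.79 to $1.59 per wafer coat at a 1.5 ml dispense, so every further reduction in dispense volume lowers the per-wafer cost in direct proportion.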
4.4 Photochemical Support to Modern Fabs
Many modern factories have out-sourced their chemical management systems to companies like ARCH Chemicals (formerly Olin Microelectronics Materials) and Air Products. A system developed at Motorola in the early 1990s with ARCH is called TCM, Total Chemical Management. The system allows the semiconductor company to focus upon building devices, the thing it does best, and the chemical company upon chemicals, again, the thing it does best.[150] TCM is also referred to as Chemical Management Services, CMS. With CMS, suppliers assume responsibility for virtually all chemical-related functions within the semiconductor factory.[151] CMS originated with chip manufacturers, as cited in the reference above. A new CMS partnership begins with an audit of the semiconductor manufacturer’s operations and requirements. The supplier assumes an expanded and radically altered role; for example, it may even purchase materials from competitors, if necessary. Supplier responsibilities include verifying chemical purities, designing and operating chemical-delivery systems, and even collaborating with the customer on refinements in process engineering. In any CMS relationship, manufacturers must be free to choose and use products from any sources they desire. Similarly, to be a credible partner, the CMS supplier must ensure the customer has access to the optimal product regardless of source. By being on-site, most often around the clock, the CMS supplier can forecast needs more precisely and stay on top of changing requirements. CMS is a value-added service that provides a high level of chemical-related expertise. Centralized chemical-delivery systems enable CMS professionals to leverage their expertise throughout a fab, ensuring that large volumes of chemicals are delivered to the point of use with maximum purity and efficiency. Having chemical experts on-site also helps manufacturers anticipate new industry questions. 
For example, consider the challenge presented by the bulk delivery and recovery of CMP slurry. The ammonium-hydroxide-based compound readily gels, and requires special piping and handling. CMS suppliers must, therefore, take the lead in developing innovative ways to reclaim or dispose of this widely used material. They must assume ownership of problems that a chip producer may not have the manpower or expertise to solve. Safe, high-purity transport of chemicals from storage vessel to process tools is primary in semiconductor processing, where an uninterrupted supply of chemicals is required. This can best be done with bulk chemical distribution (BCD).[151]
The move to central distribution systems stems from the particle generation inherent in the use of bottled chemicals. Chemicals commonly delivered by BCDs are isopropyl alcohol (IPA), propylene glycol monomethyl ether acetate (PGMEA), n-methyl pyrrolidinone (NMP), and tetramethyl ammonium hydroxide (TMAH) aqueous solutions.[152] Chemical distribution systems consist of a source of chemical or storage vessel, a chemical delivery module (CDM), and a piping system (Fig. 113). CDMs filter, blend, and transport chemicals through tubing/piping to the point-of-use (POU), where controllers regulate the rate and pressure of chemical delivery. Distribution systems are typically controlled by programmable logic or microprocessor-based controllers, and many work through LAN systems for fab-wide networking. Particle contamination generated by BCD components can be limited by:[152]
• Recirculation through filters
• Reduction in pulsations
• Constant flow through filters
• Clean materials
• Micro-porous filtration
For this reason, internal system components are typically made of fluoropolymers such as polytetrafluoroethylene (PTFE) or perfluoroalkoxy (PFA). BCDs are not suitable for all fab chemicals. Point-of-use chemicals, such as all photoresists, are generally distributed from glass gallon bottles or NowPaks,[153] a bag-in-a-bottle container. Design rules to minimize the danger of spills, leaks, vapors, and fires include:
• Secondary containment around the source and dispense vessels
• Double stainless steel cabinet walls for fire retardation
• Fire detection and suppression systems incorporated when required by code
An important component of a fab’s chemical management operation is the handling of chemical waste following wafer processing. As environmental controls become tighter, semiconductor manufacturers will need environmentally safe means for collecting and disposing of waste solvents and chemicals. Waste management systems range from central drains to local point-of-use (POU) collection to mini-waste collection and pumping systems. For volume manufacturing, however, local POU collection is not
possible due primarily to the sheer volumes involved and possibility of spills when handling large numbers of containers over short time periods. In some modern fabs, waste segregation or separate waste collection of recyclable or reactive chemicals is accomplished. Systems such as Semco’s Chemtech 2001 provide continuous automated waste collection into 55 gal drums, and if the drain lines are properly segregated, solvent recovery and reuse.
Figure 113. FSI example chemical delivery system schematic illustrating the CDM points, piping systems, and the control system. (Figure courtesy of FSI, taken from FSI Pipeline, Vol. 1.2, p. 2, Aug. 1997. ChemControls™ is a registered trademark of FSI International.)
5.0 APPLICATIONS AND SPECIAL PROCESSES
5.1 Future Device Demands
Integrated circuit lithographic design rules have historically decreased over time, and will continue to do so—at least to the 0.1–0.07 micron level over the next 2–5 years. The driving forces for these increased circuit packing density demands are cost per function decreases, faster switching speeds, and lower chip power consumption. These new design rules will necessitate higher resist and lithographic tool resolution performance. Since the etching technology exists and tool contrast is only improved by the purchase of expensive new equipment, the goal for some process engineers simplifies to one of improving resist process performance. The processes in this section address this issue. Quoting L.F. Thompson of AT&T Bell Labs, “It’s all in the processing.” Resist processing is really the only variable left to most lithographic engineers with which they can influence device production, because the photolithographic tool aerial image limit is basically fixed by the manufacturer for a given generation of tool. Conventional single-level positive photoresist technology has recently progressed rapidly, especially in the DUV class, but may be incapable of providing the necessary resist imaging required for the next generation of chips, i.e., less than 0.1 micron CD. Here, resist imaging thickness is the key issue to meet RIE etch masking requirements. Therefore, new resist processing technology (for example, surface imaging, like Desire processing) may be needed. Furthermore, processes which extend the performance levels of existing projection exposure tools or provide depth of focus relief are very attractive due to the increasing cost of new higher performance exposure equipment and the return on net asset demands placed upon chip sales of new devices to pay for these tools. 
While a great deal of resist process research has been occurring to extend phototool lifetime, advances in reduction lens design and reflective optical wavelength reductions have also been occurring. As a result, optical lithography continues to relegate direct-write e-beam lithography to a quick turn around Application Specific IC (ASIC) role, and extensive x-ray lithography usage has continued to be delayed for years.
5.2 Introduction to Multilayer Applications
Multilayer processing techniques, where layers of radiation sensitive (top), non-photosensitive organic, and/or inorganic materials are sandwiched together to become the total patterning layer, have become common in semiconductor and computer manufacturing R&D labs (see Fig. 114).[3] Due to their complexity and problems that have appeared at bilayer interfaces, these techniques have not been widely accepted in high volume production. Photolithography image edge and dimension quality is limited by two basic effects, bulk and substrate reflectivity.[13][154] The bulk effect arises when lithography patterns are required at two different topographical layer levels of the vertically-fabricated monolithic circuit. Reflectivity effects occur when patterned areas of the circuit have different reflectivity coefficients, as well as topographical levels. Lithographic exposure tool resolution performance can be influenced by resist processing. Stover et al.[155] have shown that K from the resolution equation, R = Kλ/2NA, is directly influenced by multilayer processing, where λ is the monochromatic light wavelength and NA is the numerical aperture of the projection optics lens system; K is typically 0.8 in manufacturing with single layer resist processes but can be as low as 0.5 with multilayer processes.[156] Linewidth control is also directly affected by resist processing, and greater resist image critical dimension control can be achieved through multilayer processing[13] combined with anisotropic reactive-ion etching technology. Lin[157] has done extensive research in bilayer systems, multilayer processes utilizing a UV sensitive material on top with a DUV absorbing or non-absorbing system underneath. 
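The effect of K on the resolution equation quoted above can be illustrated numerically. The sketch below reads the expression as K times wavelength divided by (2 × NA), which is an assumption about the intended grouping; the wavelength (0.365 µm, i-line) and NA (0.52) are illustrative values chosen here, not values given for this equation in the text. Because the resolution is linear in K, the relative gain from multilayer processing does not depend on those choices.

```python
# Illustrative sketch: how the process factor K scales the resolution equation.
# Assumption: the equation is read as R = K * wavelength / (2 * NA).


def resolution(k: float, wavelength: float, na: float) -> float:
    """Resolution R in the same length unit as `wavelength`."""
    return k * wavelength / (2.0 * na)


def relative_resolution(k_multilayer: float, k_single_layer: float) -> float:
    """Ratio R_multilayer / R_single_layer at fixed wavelength and NA."""
    return k_multilayer / k_single_layer


if __name__ == "__main__":
    wavelength_um = 0.365  # assumed i-line wavelength, for illustration only
    na = 0.52              # assumed numerical aperture, for illustration only
    r_single = resolution(0.8, wavelength_um, na)  # K typical of single layer resist
    r_multi = resolution(0.5, wavelength_um, na)   # K reported for multilayer processes
    print(f"R(multilayer) / R(single layer) = {r_multi / r_single:.3f}")
```

With the K values quoted in the text (0.8 and 0.5), the ratio is 0.625; that is, the printable feature size shrinks by about 37% on the same exposure tool, whichever way the factor of 2 is read.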
This type of multilayer system has seen limited circuit fabrication application, because it is plagued by deleterious interfacial layers formed between layers at coat.[13][158] These problems have been solved by various treatments both before and after image formation, but bilayer technology has taken a backseat primarily to dyed thick single-layer photoresist processes.[159] Some bilayer processes are reported to be free of interfacial mixing problems,[160] and the organic ARC process described is a usable bilayer system. Trilayer systems[161] utilize an intermediate layer between the photosensitive top layer and the virtually developer-insoluble oxygen RIE patterned bottom layer. It is usually deposited by low temperature (< 250°C) thin film deposition techniques, but can also be applied as a liquid spin-on-glass solution.[162]–[164] The middle layer can also be a spun organic polymer (also water soluble) barrier layer coating for the lithographic trilayer processes, as
opposed to RIE trilayer processes, where the middle hard mask material must be RIE etched and cannot be developed by base developers or water. While the former technique is usually free of interfacial mixing layer problems, the latter technique can exhibit this problem intermittently. All of the multilevel processes described are successful to some degree in relieving resolution and linewidth control limitations of current single-layer optical exposure equipment. Future device fabrication requirements and the application of high numerical aperture exposure equipment, however, will most likely create the need for multilayer processes at one or two critical device levels.
Figure 114. Multilayer resist systems illustrating different multilayer processes and PCM techniques.
5.3 Introduction to MLM Lithography
The requirements of VLSI/ULSI multilevel metallization (MLM) systems have led to numerous challenges in the photolithographic and etch areas. These challenges arise from two interrelated sources: the continuing trend to smaller geometries and the new materials that must be used to build a successful MLM. Smaller geometries often create a need for new materials, and the ability to process those materials can limit usable geometries. Photolithography and etch technologies with the capability of producing submicron MLM structures are an integral part of the successful fabrication of the high density, high performance, and highly reliable interconnect technologies of the nineties and beyond. For devices with MLM interconnects, the individual layers are insulating dielectrics or metallic interconnect layers. The ability to produce high quality MLM layers depends upon both the lithographic tool imaging capability, quantified by the Modulation Transfer Function (MTF) of the tool optics (see Ch. 5), which in part depends upon the numerical aperture (NA) of the tool objective lens, and the resist process quality, or CMTF.[3][165] The electrical yield for MLM device layers depends upon these quantities for device element packing density and upon the ability of the lithographic tool to align layer to layer; the combination of CD control and alignment control represents the total overlay capability of the system, which must be better than that required by the design rules for the circuit.
5.4 Applications
It is well known that lithographic image sizes, better known as critical dimensions or just CDs, and CD variation are governed to first order by the quality of the stepper projection lens (i.e., NA and MTF), the resist contrast
or CMTF, and the degree of the resist process optimization.[3][105][165] As CDs get smaller and the lens and resist parameters approach their practical limits, secondary characteristics such as lens astigmatism and proximity, substrate reflectivity, topographical or bulk effects, and local resist thickness variation begin to be significant and have to be controlled along with the primary effects for successful device fabrication, especially at polysilicon gate (or any other layer) dimensions of less than 0.5 µm for advanced CMOS or BICMOS device fabrication.
Dyed and Thinned Single Layer Resist (SLR) Processes. The two most obvious things to do to improve resist imaging performance on the most difficult substrates (i.e., those requiring greatest resolution and with reflective topography, as for device isolation, gate, and MLM metal and via levels) are to reduce the resist thickness and to add dyes to it. Unfortunately, resist thinning is most feasible with multilevel lithography processes,[166] where a thin top resist layer is allowable. Resist thinning in itself accomplishes nothing towards reflective image notching relief, the main observed problem for reflective topographical gate or MLM situations. Furthermore, resist thinning presents a severe problem to step coverage and to metal etching because of poor selectivity, and is in fact usually prohibitive. All of these negatives aside, resist thinning has been shown by IBM researchers[167] to improve linewidth control by 15% and focus control by 35%, when and if it is feasible. Therefore, the full advantages of resist thinning may really be achievable only through bi- and trilayer lithographic processes, further reinforcing the restrictive applicability of resist thinning. The more practical solution to reflective notching problems, those observed on reflective surfaces due to either feature topographical or metal grain boundary relief structures, is provided by resist dyeing. 
Most device fabrication areas will select this option over multilayer Portable Conformable Mask (PCM)[166] processes due to greater simplicity and lower costs. The resulting dyed and usually thicker (~2 microns) material is still a single level resist process, but without the added complexity of multilayer processes; note also that by going to thicker dyed resists, relief from topography induced CD variation, the well-known “bulk effect,” is also achieved. Dyeing resists requires that a price be paid in lower resist contrast[168] and greater exposure time,[169] but dyeing does provide greater process latitude,[97] and reflective notching can be effectively eliminated[76] or at least minimized. Sandia workers[170] have provided a dyed system for H and G line steppers which has a very small exposure penalty, just 15 mJ/cm². Bolson et al.[76] have demonstrated similar results, and an approximately 60% gain in
exposure latitude was achieved, or a reduction in k1 from the Rayleigh resolution equation from 1.1 to 0.6 was effectively obtained. On the negative side, adding dye to the resist formulation may lead to a larger standing wave foot,[171] lower depth of focus (Table 5), and reduced image edge wall angles consistent with the observed bulk contrast reductions reported by Pampalone.[168] Most dramatically, Brown and Arnold[171] have observed a 3-fold increase in CD exposure latitude, a result which explains the wide acceptance of this technique for metal layer lithography by older mature production fab lines, where metal CDs are larger and advanced planarization techniques beyond POT processing are not common (see planarization section).
SLR Image Reversal (IREV). Positive photoresist image reversal (i.e., negative toned imagery from a positive toned resist) is a processing technology which addresses the deficiencies of single-layer dyed resists. Moreover, IREV can be accomplished on dyed material to achieve the best of both worlds, namely, relief from topographical or bulk effects and reflective notching minimization. Over the years, many papers have been published on this subject.[172]–[180] The salient contents of those papers will be reviewed and compared. The main focus of this section is on single level thermally induced reversal processes for AZ 5214. Image reversal is an alternative to conventional positive photoresist technology. Briefly, a single layer of positive photoresist is exposed using a projection aligner, reversed either by a post exposure thermal treatment on a hotplate or by adding a base to the resist, then flood exposed and developed. The result is a negative toned image with a controllable edge wall angle, something dyed resists with conventional processing cannot deliver. 
The IREV processes are capable of printing images previously unattainable with the given exposure tool, thus extending the resolution and focus latitude performance of the alignment tool and its life as a capital asset. Marriott, Garza, and Spak have written the definitive paper on thermal image reversal.[179] Using both theoretical PROSIM simulation and empirical methods, they optimized the AZ 5214 IREV process and obtained good agreement between theoretical and real images. The process was optimized for photospeed, resolution, focus and CD latitude, and vertical edge wall by adjusting the developer concentration to 0.21 N MF-312 (90 secs), setting the PEB temperature to 110°C, and using a flood exposure after PEB. These data have been independently verified at Motorola (see Fig. 115). An improvement of 1.25 micron in focus latitude, a 150% improvement in CD control, and an 8° improvement in image edge wall were also reported.[179] AZ researchers further reported a bulk contrast
performance improvement of roughly 200% for IREV AZ 5214 over that for the positive performance mode. Consistent edge wall imagery improvement can be seen when comparing Figs. 115 and 116, where Fig. 116 portrays images with edge wall angles more typically observed from normal positive tone performance.
Figure 115. AZ-5214 thermal tone reversal process edge wall profiles illustrating vertical edge walls. (Panel labels: 1.2 MICRON/UV-3/B49B; 1.8 MICRON/UV-3/H9P; 4.5 MICRON; 4.5 MICRON.)
Figure 116. Edge wall profiles for conventionally developed positive photoresist images. Note that the edge walls are poorer than those for both IREV processes. (Panel labels: AZ 5214; OFPR-800 Negative Mask; OFPR-800 Positive Mask.)
All in all, it is felt that the image reversal process advantages may overshadow the disadvantages for certain critical device fabrication levels, where standard resist processing technology simply cannot satisfy the future lithographic imaging requirements. Furthermore, work by Gijsen et al. (see Ref. 90 of Ref. 129) has demonstrated image reversal behavior for dyed positive photoresist without degrading edge wall slope advantages, thus providing a process with relief from integrated standing wave and pattern scattering effects in addition to the relief from the bulk effect provided by the undyed reversal process.
Resist Processes For Reflective and Topographical MOS Gate Situations.[181] This section focuses on an example silicided 0.5 µm polysilicon gate layer, and all data presented are for that critical device layer. Resist simulations of CDs and substrate reflectivities were obtained employing version 2.2 of the PROLITH simulation package. The dilemma faced early in the 0.5 µm work dealt with polysilicon CDs which were larger in a packed line device configuration than when they were isolated. This was just the opposite of what was typically observed for metal or for single crystal silicon substrates and even flat polysilicon substrates.[182][183] Furthermore, the results of Ref. 184 have demonstrated that the effect gets larger the smaller the pitch, and those authors actually recommended that this effect be corrected for at design by differential biasing. On flat surfaces, with all other factors being negligible, the isolated CDs can be anywhere from 0.03–0.06 µm larger than the tightly packed features, a bias which is not negligible at 0.5 µm CDs, i.e., it can exceed 10% without any normal processing variation; this bias can actually get as large as 0.08 µm depending upon how large the normal photo/etch bias is. 
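The size of the iso/dense bias relative to the nominal gate CD follows from simple arithmetic. The sketch below uses only the bias values (0.03, 0.06, and 0.08 µm) and the 0.5 µm nominal CD quoted above; the function name is introduced here for illustration.

```python
# Illustrative arithmetic: iso/dense CD bias as a percentage of the nominal CD.


def bias_percent(bias_um: float, nominal_cd_um: float) -> float:
    """Return the CD bias as a percentage of the nominal CD."""
    return 100.0 * bias_um / nominal_cd_um


if __name__ == "__main__":
    nominal_cd_um = 0.5  # nominal polysilicon gate CD from the text
    for bias_um in (0.03, 0.06, 0.08):  # bias values quoted in the text
        pct = bias_percent(bias_um, nominal_cd_um)
        print(f"{bias_um:.2f} um bias -> {pct:.0f}% of nominal CD")
```

The quoted range corresponds to about 6% to 12% of the nominal CD, rising to 16% for the 0.08 µm case, before any normal process variation is added.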
Figure 117 contains PROLITH 2 simulation data consistent with these empirical results, therefore, another mechanism must be controlling the CD behavior on the polysilicon gate device test structures other than the normal iso/dense resist CD bias effect. PROLITH swing curve simulations, plots of resist CD vs. resist thickness, for silicon and polysilicon substrates were also very similar as found in Fig. 118. But, when swing curves for polysilicon resist images on raised field oxides are compared to those for images on topographically lower active device regions, the curves are shifted along the resist thickness axis for both isolated and dense gate features.[185] This result now provided a mechanism, i.e., the resist thickness differences between field and active regions,[185] to allow for the possibility for the resist polysilicon CD to be
larger or smaller depending on polysilicon underlying structures. Notice, however, that although the data of Ref. 185 shows the effect of swing curve shifting for the first time it still cannot account for the observed gate CDs being larger for packed features, so a closer look is required.
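The swing curve shift between field and active regions can be sketched with the standard thin-film interference result that a swing curve repeats every λ/(2n) of resist thickness, where n is the resist refractive index. The code below is a simplified illustration, not the PROLITH model used in the text: the i-line wavelength (365 nm), the resist index (1.7), the thickness difference, and the cosine CD model with its mean and amplitude are all assumed values chosen here.

```python
import math

# Simplified illustration of swing curve phase shifting with local resist thickness.
# Assumptions (not from the text): i-line exposure, resist index 1.7, cosine CD model.


def swing_period_nm(wavelength_nm: float, resist_index: float) -> float:
    """Resist thickness change for one full swing cycle: lambda / (2 n)."""
    return wavelength_nm / (2.0 * resist_index)


def swing_phase_shift_deg(delta_thickness_nm: float, wavelength_nm: float,
                          resist_index: float) -> float:
    """Phase shift between two swing curves whose local thickness differs."""
    period = swing_period_nm(wavelength_nm, resist_index)
    return (360.0 * delta_thickness_nm / period) % 360.0


def cd_on_swing_curve(thickness_nm: float, mean_cd_nm: float, amplitude_nm: float,
                      wavelength_nm: float, resist_index: float) -> float:
    """Toy swing curve: CD oscillates sinusoidally about its mean with thickness."""
    period = swing_period_nm(wavelength_nm, resist_index)
    return mean_cd_nm + amplitude_nm * math.cos(2.0 * math.pi * thickness_nm / period)


if __name__ == "__main__":
    wavelength_nm, n = 365.0, 1.7   # assumed values
    print(f"swing period: {swing_period_nm(wavelength_nm, n):.1f} nm")
    # Hypothetical 25 nm thinner resist over raised field oxide than over active area.
    print(f"phase shift for 25 nm: {swing_phase_shift_deg(25.0, wavelength_nm, n):.0f} deg")
```

Under these assumptions the swing period is about 107 nm, so a local thickness difference of only a few tens of nanometers is enough to move one gate type from a swing maximum toward a minimum relative to another.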
Figure 117. PROLITH generated polysilicon resist CD vs. exposure for isolated and packed lines. The simulations are verified empirically in the SEM picture on the top.
Figure 118. Resist sensitivity vs. resist thickness for gate CDs on polysilicon and silicon wafers. Linewidth vs. thickness would look the same if plotted.
This leads to the consideration of lens astigmatism. For the Canon 0.52 NA lens employed, the astigmatism was =10 nm, so this contributor was negligible. This was a valuable piece of information since the packed CDs were oriented 90 degrees off 0 or horizontal to the isolated transistor gates. Figure 119 further illustrates that the CD distributions for isolated horizontal gates are well separated from those for the vertical isolated gates, as well as smaller. As a sanity check, swing curves were generated for silicon and polysilicon test wafers with absolutely no underlying patterns or topography. The polysilicon CDs for these images were as predicted by normal proximity, and reversed in order from those observed for live product wafers (compare the data of Fig. 120 to that of Figs. 118 and 119). So, when the substrate is flat and unpatterned, there are no swing curve phase shifts and the packed gates are smaller than the isolated gates, as pure “proximity effect” would dictate.
Figure 119. Normal distributions of polysilicon CDs for BIMOS gates showing the effect of gate orientation and gate oxide substrate nature.
Figure 120. CD vs. thickness of resist for the three gate types on bare silicon wafers with no previous patterns.
How can the data of Fig. 119 be rationalized? The answer lies in the data of Fig. 121. Here, the swing curves have been generated for the three different gates on live BIMOS product wafers as in Fig. 119; notice that the packed gates on field oxide, as for Ref. 185, have a dramatically different swing curve phase than the other curves. At a resist thickness of 10,800 Å, the split bridge packed gates are in fact larger than both the horizontal and vertical gates on gate oxide over active area. Hence, a mechanism for the observed gate CD order is obtained, which, as for the results of Ref. 185, is attributed to local resist thickness variation in the respective gate areas. Note further that even though there is little if any lens astigmatism, the horizontal gates are also larger than the vertical gates, again because their swing curves are shifted even though they are at the same topographical level. The final effect described is reflectivity, R. As for swing curves, polysilicon reflectivity varies sinusoidally with resist thickness due to standing wave interference effects. CD variation with thickness is usually just roughly in phase with R; that is, the CD variation on polysilicon is usually a minimum when reflectivity is at a minimum.[186] In conclusion, the large effects reported here must be canceled out by differential CD biasing at device design CAD or with OPC applied corrections. The proximity induced iso/dense CD effect contribution, which can be completely canceled out or dominated by larger swing curve effects, can be minimized by employing improved resists.
Device Isolation Topography Effect upon Resist Gate CD Control.[187] First generation LDMOS devices, laterally diffused RF devices, are characterized by large gate pattern dimensions (i.e., 1 micron and greater) and mature but topography creating single hipox isolation processing. 
Figure 122 illustrates the severe topography created by the “bird’s beak” of this older device isolation process. This topography creates an unwanted constraint upon the resist coating process, which in turn affects CD control in a negative way, both gate to gate and within gate for the RF device. In fact, the gate dimension across the RF device active area can vary as much as 160 nm as verified by Fig. 123, the “wavy gate effect” first reported in this section. If the resist thickness is measured across the active area, it varies in direct correlation to the CD variation consistent with the well known effect on CDs of varying resist thickness, the resist swing curve effect. This is similar to the effect discussed above for polysilicon gates.
Figure 121. CD vs. resist thickness for actual live BIMOS product illustrating gate type effect upon the individual swing curves.
Figure 122. SEM cross section of the “Bird’s beak” topography inherent to the single hipox device isolation process. (Figure labels: SINGLE HIPOX ENCROACHMENT; LDMOS SINGLE HIPOX PROCESS; 1.80 µm BIRD’S BEAK.)
Figure 123. Plot of gate CD vs. position along the gate across the active device area.
Third generation LDMOS devices, however, must have submicron gate dimensions (i.e., < 0.6 microns) and require much less gate image dimensional variation to perform within minimum acceptable device threshold tolerances. The topography minimizing or “bird’s beak” reducing effect of the modified double hipox isolation is shown in Fig. 124. There, the first hipox is followed by oxide removal and another hipox to leave the wafer surface much flatter. The effect of this extra device isolation processing is improved wafer flatness, as manifested by improved resist uniformity across the active area and the resultant gate CD control. The data of Table 22 clearly demonstrate an improvement of about a factor of four due to the less topographical modified double LOCOS isolation process. The gate CD range along the gate for the improved isolation scheme reduces to 22 nm versus the 160 nm observed for the single LOCOS isolation process, and the gate to gate variation is only 39 nm for the new process. These CD improvements are dramatic and more than justify the extra process cycle time incurred.
Figure 124. SEM cross section of the reduced topography inherent to the double hipox device isolation process. Panel labels: double hipox encroachment; bipolar double hipox process.
Table 22. CD Families of Variation for Both Isolation Processes and the ARC Process With the New Isolation Process [187]

Variance Component Estimates

Process      Component       Standard Deviation
2-MLOCOS     wafer           3.1 × 10⁻³ µm
             site [wafer]    6.3 × 10⁻³ µm
1-LOCOS      wafer           7.67 × 10⁻³ µm
             site [wafer]    20.8 × 10⁻³ µm
ARC          wafer           1.01 × 10⁻³ µm
             site [wafer]    6.55 × 10⁻³ µm
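The component estimates of Table 22 can be combined into a single figure per process for comparison. The sketch below assumes the wafer and site-within-wafer components are independent, so that their variances add; that assumption, and all names in the code, are illustrative and not taken from the original study.

```python
import math

# Standard deviations in µm, transcribed from Table 22.
components = {
    "2-MLOCOS": {"wafer": 3.1e-3, "site[wafer]": 6.3e-3},
    "1-LOCOS": {"wafer": 7.67e-3, "site[wafer]": 20.8e-3},
    "ARC": {"wafer": 1.01e-3, "site[wafer]": 6.55e-3},
}


def total_sigma(parts):
    """Combine component standard deviations in quadrature.

    Assumes the components are independent (an assumption of this sketch).
    """
    return math.sqrt(sum(sigma ** 2 for sigma in parts.values()))


totals = {name: total_sigma(parts) for name, parts in components.items()}

for name, sigma in totals.items():
    print(f"{name}: total sigma = {sigma * 1e3:.2f} nm")
```

Under this assumption the totals for the modified double LOCOS process with and without ARC come out close to one another (about 6.6 nm and 7.0 nm), while the single LOCOS total is roughly 22 nm.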
Notice also that the use of ARC, which usually improves CD control where CDs vary due to local resist thickness variation, has little effect upon the multivari CD results, as would be predicted. The total CD variation is essentially the same with or without ARC, again as a result of the improved wafer surface flatness created by the modified double LOCOS process.

In conclusion, we have quantitatively compared the effect of resist process and isolation processing upon RF LDMOS gate image CD variation, and found that the modified double hipox isolation process reduces the gate variation by close to a factor of four. This tighter gate CD variation performance has allowed this device to be manufactured in volume production at high yields and performance levels.

Multilayer Resist Processing: An Optimized Organic ARC Process for I-Line Lithographic Fabrication of 0.5 µm Devices. When dyed resists, which have been widely used to prevent reflective notching in older product line fabs at metal linewidths greater than 1.0 µm, fail to provide the necessary reflective notching relief, multilayer processes may be employed. Dyed resist usage is limited by reduced contrast, resolution loss due to greater film thicknesses, and poorer general effectiveness.[171] An example of the failure of dyed resist processing to prevent metal grain boundary reflective notching is found in Fig. 125; the severe reflective notches are
easily seen along the resist pattern edge. The notching effect caused by the substrate reflectivity of this example can be reduced by employing inorganic or organic antireflective coatings (ARCs) in a bilayer structure, i.e., the ARC layer lies below the resist patterning layer on the wafer surface. Bilayer ARCs have been widely used in optical lithography to prevent or minimize problems of linewidth variation caused by substrate topographical reflective notching, i.e., for resist metal line images running over vertically raised steps on the substrate.[171][188]–[194] Antireflective coatings also help reduce standing wave and thin film interference or swing curve effects,[190] a result similar to the effect of increasing the resist B parameter by resist dyeing, and IBM has further shown that ARCs can also be applied as top coatings.[195]
Figure 125. SEM micrograph of notched resist images on large grained aluminum film.
Inorganic layers, i.e., sputtered thin films such as a-Si, V, TiW,[171][188] TaSi and TiN,[189] and TiON,[190] have been tested and employed as ARC layers. Economic factors at larger feature sizes, however, favor the use of organic ARCs based upon equipment costs.[171] The example organic ARC referred to primarily in this chapter, actually an older material, is ARC-XLT, manufactured by Brewer Science.[191] Bilayer ARCs from this manufacturer have been, and continue to be, widely used, even in volume MOS production lines. Currently, almost all DUV applications involve either anisotropically dry-developed organic ARCs or inorganic ARCs, such as TiN, Si-rich nitride, or an oxynitride film. Very few DUV resists are even tested without an appropriate ARC, which is required because of the greater reflectivity of device layers at DUV wavelengths and below.

ARC Processing Example. Organic ARC bake conditions need to be carefully optimized and controlled for aqueous-base developed organic ARCs.[171][192] The bake latitude for the ARC process is found to be extremely critical to the lithographic results.[191][192] This section outlines a strategy for ARC process optimization, which led to improved lithographic results, as verified by metal electrical snake structure probe data. Most significantly, through ARC process optimization, including the develop process, it is found that the ARC application resolution range can be extended to much smaller geometries than the 0.9–1.0 µm limits previously reported;[171][188][191][192] these references attributed that limitation to development undercut of the ARC layer, caused by the isotropic nature of the ARC development. Furthermore, the greater bake process latitude reported in Ref. 191 has been confirmed.

ARC Process Optimization Strategy.
The optimization strategy uses the following two response variables, which simplify the type of experiments to be performed:
(a) ARC thickness remaining after development: Experiments that use this variable as a response are very simple to perform, in that they do not require the use of photoresist or an exposure tool. The rate of ARC development is probably the most important factor in determining lithographic performance for a fixed ARC/photoresist system. This response variable is used to explore and obtain a broad process window, while the final conditions are determined by further experimentation within this window using the response described in (b).
(b) Minimum CD remaining without lift off: ARC undercutting has been a limiting factor in determining the upper limit of resolution with regard to lithographic performance. A small circular pillar-like pattern is best for this testing because this structure is very susceptible to undercut lifting. This response variable plays a key role in ARC process optimization.
An example of a pillar functional test working specification for 0.5 µm MLM processing would be 0.6 ± 0.1 µm. Furthermore, this simple functional test provides an SPC ARC control test for routine daily qualifications. The following experiments were all integral parts of the optimization strategy:
i. Identification of bake and develop methods: The following variable combinations will be discussed here:
   a. Convection bake/immersion develop
   b. Contact bake on a hot plate/single spray puddle develop
ii. Determination of bake uniformity: An acceptable level of ARC thickness uniformity following bake has to be achieved and verified by ellipsometry.
iii. Screening design experiment to determine significant variables: This experiment was a 2^(7-4) fractional factorial[99] using ARC thickness after development as a response. Wafers were coated with ARC only. Resist softbake and PEB were taken into account in the process simulation. The development time used was 5–10% of the actual develop time for resist coated wafers. Variables used in this design were:
   a. Dehydration bake
   b. Delay between ARC coat/dehydration bake
   c. Delay between convection bake/ARC coat
   d. Bake temperature
   e. Delay between resist coat/ARC bake
   f. Resist softbake
   g. Post exposure bake
   These variables may vary depending on the type of bake used. Variables such as proximity bake or wafer gap from the hot plate may be included as desired.
iv. Determination of development rate of ARC in developer: ARC thickness remaining after develop was determined for a series of development times. From the screening experiment, bake temperature and time were determined to be the only significant variables. Hence, a two factor Central Composite Inscribed (CCI) design[100] was run using these two factors and ARC thickness, post develop, as the response variable. This experiment led to an empirical result with a broad enough process window for further experimentation to take place. The final ARC thickness at the end of the photoresist/ARC develop process has to be zero. However, the ARC thickness should not approach zero early in the develop process because this will lead to severe undercutting; hence the need for process optimization.
v. Minimum feature size remaining without lift off on metal wafers: The two factor Central Composite Inscribed design above was run using a range of ARC bake temperatures and times. Minimum CD without pattern liftoff was determined. Wafers were checked for possible ARC scumming. ARC bake conditions were selected for the lowest possible CD remaining without the presence of ARC scum, which occurs readily at high bake temperatures (greater than 180°C) due to excessive ARC curing.
vi. Process verification using SEM cross-section/electrical probe: An electrical probe structure in the form of a standard snake pattern was used to verify process conditions. CDs may also be determined electrically using linewidth split cross bridge structures. Cross sections using the SEM reveal final ARC/resist profiles. SEM measurement waveforms may also be used to qualitatively determine profiles for the presence of an ARC foot.
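The two experimental designs named in steps iii and iv can be written out in coded units as a rough illustration. The sketch below is not the authors' actual run sheet: the generators D = AB, E = AC, F = BC, G = ABC are a common textbook choice assumed here, the assignment of the seven variables to columns is arbitrary, and the factor names are hypothetical labels.

```python
import math
from itertools import product

# Hypothetical labels for the seven screening variables of step iii.
FACTORS = [
    "dehydration_bake", "delay_coat_to_dehydration", "delay_bake_to_coat",
    "bake_temperature", "delay_resist_to_arc_bake", "resist_softbake", "peb",
]


def fractional_factorial_2_7_4():
    """Return the 8 runs of a 2^(7-4) design in coded units (-1 / +1).

    Columns A, B, C form a full 2^3 factorial; the remaining columns use
    the assumed generators D = AB, E = AC, F = BC, G = ABC.
    """
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        runs.append((a, b, c, a * b, a * c, b * c, a * b * c))
    return runs


def cci_two_factor(n_center=1):
    """Central composite inscribed design for two factors in coded units.

    Axial points sit on the factor limits (+/-1); the factorial points are
    pulled in to +/-1/sqrt(2) so that every run stays inside the limits.
    """
    f = 1.0 / math.sqrt(2.0)
    factorial = [(x, y) for x, y in product((-f, f), repeat=2)]
    axial = [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
    center = [(0.0, 0.0)] * n_center
    return factorial + axial + center


if __name__ == "__main__":
    for run in fractional_factorial_2_7_4():
        print(dict(zip(FACTORS, run)))
    for temperature, time in cci_two_factor():
        print(f"coded bake temperature {temperature:+.3f}, coded bake time {time:+.3f}")
```

Coded levels would be mapped onto real bake temperatures and times before use. The screening design needs only eight wafers for seven variables, at the cost of confounding main effects with two-factor interactions.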
Optimized Process Comparisons. Results of the screening design experiment for convection bake/immersion develop suggested that the only significant variables were bake temperature and bake time from RS1 empirical modeling. Plots of ARC thickness remaining were obtained for different temperature/time combinations using a 2-factor CCI experiment. The statistical analysis confirmed bake temperature, bake time squared, and the product of bake temperature and bake time to be significant factors for this response. A follow up 2-factor factorial experiment was conducted using these conditions and using metal snake probe yield as a response. Neither
temperature nor time was determined to be statistically significant for probe yield; however, the product of temperature and time was still significant. The optimum process conditions were determined to be a 163°C bake for 41 minutes. This is a 1.3 µm line, 0.7 µm space pattern, and ARC undercutting is clearly visible as reported in Refs. 171, 188, 191, and 192. Table 23 clearly illustrates the improvement in CD control with the optimized ARC process compared to the unoptimized process. The ARC process optimized for immersion develop can also be used successfully in conjunction with track development. However, the presence of an ARC foot can be a problem if the same bake conditions are used. The foot problems are most likely related to problems with convection baking, which include (i) temperature variation in the oven as a function of position, (ii) heat loss upon opening of the oven, and (iii) operator errors leading to overbake. The presence of the ARC foot, combined with the problems mentioned above, made it necessary to switch to track baking of the ARC, which provides better temperature and time control and increases wafer process throughput. In addition, wafer track processing is more manufacturable.

Table 23. Mean CD and Total CD Variation for Resist Processes With and Without ARC

Process                          Mean CD (µm)       CD Variation, 3-sigma (µm)
SPR 500/unoptimized ARC          1.10               0.35
SPR 500/optimized ARC            1.04               0.09
SPR 500 (dyed)/optimized ARC     1.08               0.13
SPR 500/no ARC                   severe notching    >0.4
SPR 500 (dyed)/no ARC            notching           >0.4
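The Table 23 entries are easier to compare when each 3-sigma variation is expressed as a percentage of its mean CD. The sketch below does this for the three rows with numeric data and checks them against a ±10% criterion; the threshold is an assumption made here for illustration, and the variable names are hypothetical.

```python
# (mean CD in µm, 3-sigma CD variation in µm), transcribed from Table 23.
table_23 = {
    "SPR 500/unoptimized ARC": (1.10, 0.35),
    "SPR 500/optimized ARC": (1.04, 0.09),
    "SPR 500 (dyed)/optimized ARC": (1.08, 0.13),
}

BUDGET_PERCENT = 10.0  # assumed +/-10% CD criterion, for illustration only


def relative_variation(mean_cd_um, three_sigma_um):
    """Return the 3-sigma variation as a percentage of the mean CD."""
    return 100.0 * three_sigma_um / mean_cd_um


results = {name: relative_variation(*row) for name, row in table_23.items()}

for name, percent in results.items():
    status = "within" if percent <= BUDGET_PERCENT else "outside"
    print(f"{name}: {percent:.1f}% of mean CD ({status} the assumed budget)")
```

On this basis only the undyed resist on the optimized ARC falls inside the assumed budget.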
By employing optimized wafer track contact baking and development, 0.6 µm line/space patterns have been successfully achieved, as verified by actual etched metal probe results and the SEM cross section of Fig. 126. Metal lines from the process ranged between 0.5 and 0.6 µm due to process bias, but these
features were far below the limits established by previous studies.[171][188][191][192] Furthermore, the footed features which plagued the convection baking process have been eliminated. Metal snake yields using a 1.0 µm line/space pattern were determined to be 94.8%. Also, 90% snake yields using a 1.3 µm line/0.7 µm space pattern were obtained. In conclusion, the process optimization of an organic ARC has been accomplished and submicron feature capability demonstrated, as verified by electrical probe and SEM methods (see Fig. 126). The resulting optimized ARC example process is currently being employed to manufacture advanced BiCMOS and CMOS devices with submicron features. Note that the same optimization methodology can be applied to any lithographic process, such as the photoresist process, and this is usually done by the resist manufacturer to define a recommended baseline process prior to customer receipt.
Figure 126. SEM cross section micrograph of immersion developed resist/ARC process on AlCu at 0.6 micron space resolution.
Resist Image ARC CD Control Advances: Inorganic ARC Design.[196] In photolithography, substrate and photoresist reflectivity variations introduce linewidth variations within the wafer or, more importantly, within the field. As feature sizes shrink below 0.35 µm, this variation goes beyond acceptable CD tolerances. To have an advanced CMOS process in manufacturing today, technology to dampen these reflections is necessary: “Without it, i-line gate lithography below 0.4 µm is not possible,” said Dr. John Sturtevant, senior process engineer at Motorola (Austin, Texas).[196] Work has continued in the area of i-line ARC bilayer processing beyond that discussed earlier. A minor controversy developed as to whether organic (dry-developed) or inorganic ARC systems were best. Since both types of systems have been used in volume production, the decision comes down to the performance level required and the cost of ownership (COO). In the rest of this section, we will highlight the literature and present data for both systems with applications at the MOS gate level in mind. Remember that CD control at this level can determine logic or DRAM device performance and can translate to device manufacturing profitability as well (see Ch. 1). As gate dimensions shrink from 0.5 to 0.1 micron, CD variation within the print field can be as large as 0.1 micron, or between 20 and 56% of the nominal CD. Adequate CD control was observed for the ARC data presented in Secs. 5.3.3 and 5.3.4 because substrate reflectivity was controlled to ~10%, which was sufficient for those somewhat larger CDs. Figures 127 and 128 show later swing curve suppression data for JSR 061 resist on silicon gate without ARC, with the Si-rich nitride (SiRN) inorganic ARC, and with the later-generation Brewer Science organic ARC XHRi. If gate CDs had been plotted, they would have varied in proportion to the E0 variation shown.
Note that varying degrees of suppression are obtained depending upon the ARC system employed (see Fig. 129 for percent suppression vs. ARC type). Will Conley has outlined the important issues governing CD control at 0.25 micron CD and below.[198] Suppression of the swing in gate CD must be attained by controlling the gate surface reflectivity to less than 0.5%; swing curve control to less than 1% must be achieved; greater conformality and less planarization from the ARC coating are desirable, which favors inorganic systems; better optical performance, or operation at higher k1 values, is required, which again favors inorganic systems; and improved defectivity and process integration compatibility must be maintained, which again favors the employment of inorganic ARCs. The predominance of inorganic ARC device applications at 248 nm DUV may be the indicator that the application trend is breaking in favor of the inorganic ARC systems.
Figure 127. JSR 061 swing curve showing the suppression created by an Si-rich nitride ARC vs. no ARC. (Data courtesy of Mike Brown of Motorola CS-2.)
Figure 128. JSR 061 swing curve showing the suppression created by an organic ARC XHRi vs. no ARC. (Data courtesy of Mike Brown of Motorola CS-2.)
Figure 129. Swing curve suppression vs. ARC type. Notice also: all of the ARCs are effective at suppressing CD variation, with the worst system providing 60% improvement.
Graca et al. have characterized the use of SiRN as an i-line and DUV ARC.[198] Although organic top antireflection coatings (TARCs) have shown promise in reducing linewidth swings arising from thin film interference effects, they do not effectively eliminate the serious problem of localized exposures arising from reflective substrate topology, i.e., reflective notching. For this reason most efforts focus upon BARCs. Organic bottom antireflection coatings (BARCs) have been widely used to minimize these background reflections and, although effective, they initially proved difficult to integrate while maintaining a low cost of ownership. More recently, however, novel, plasma-deposited, inorganic BARCs have been proposed for 365 nm and 248 nm lithography that appear to offer good control of critical gate patterning, easy integration, and cost effectiveness. To understand the initial effectiveness of the deposited SiRN BARC, the simulation tool Prolith 2, a commercially available software package, can be used to investigate the substrate reflectivity at various ARC film thicknesses. The simulation data show that a 25 nm thick SiRN BARC works best for DUV reflectivity suppression, and a 35 nm thick SiRN ARC is optimum for i-line (see Fig. 130).[199] With the ARC thickness around 27 nm, less than 2% substrate reflectivity at 248 nm DUV and less than 5% substrate reflectivity at i-line should be obtained. The i-line characterization shows that the E0 swing has been reduced from 24 mJ to 10 mJ, without an increase in E0, by using the SiRN ARC. The DUV data, at 248 nm, clearly indicate that a reduction of the CD swing from 40 nm to about 10 nm is achieved for a resist thickness change over a full swing cycle by applying the SiRN ARC.
Based on the measured optical properties of the SiRN ARC and the Prolith simulation results, the 25 nm ARC film is found to be optimal for DUV processing, and it also provides a significant reduction of the substrate reflectivity for the i-line process.[199]
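The swing reductions quoted for the SiRN ARC can be put on the same percentage footing as Fig. 129. The sketch below assumes that percent suppression means the fractional reduction of the peak-to-peak swing relative to the no-ARC case; that definition and the function name are assumptions of this example.

```python
def percent_suppression(swing_without_arc, swing_with_arc):
    """Fractional reduction of the swing amplitude, in percent (assumed definition)."""
    return 100.0 * (swing_without_arc - swing_with_arc) / swing_without_arc


# i-line E0 swing, reduced from 24 to 10 (units as quoted in the text).
i_line_suppression = percent_suppression(24.0, 10.0)

# 248 nm DUV CD swing, reduced from 40 nm to about 10 nm.
duv_suppression = percent_suppression(40.0, 10.0)

print(f"i-line E0 swing suppression: {i_line_suppression:.0f}%")
print(f"DUV CD swing suppression: {duv_suppression:.0f}%")
```

With this definition the i-line figure is about 58% and the DUV figure is 75%.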
Figure 130. Gate polysilicon Prolith reflectivity simulations for Si-rich nitride (left) and CD swing data (right) for the respective ARC layers.
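The way reflectivity depends on ARC thickness in such simulations can be illustrated with a textbook single-film Fresnel calculation at normal incidence. This is only a sketch and not the Prolith model: the optical constants below are hypothetical placeholders rather than measured SiRN or polysilicon values, and the resist is treated as a non-absorbing incident medium.

```python
import cmath
import math


def stack_reflectivity(n_resist, n_arc, n_substrate, thickness_nm, wavelength_nm):
    """Reflectivity seen from within the resist for resist / ARC / substrate.

    Complex indices use the n - ik convention, so absorption attenuates
    the wave reflected from the substrate on its round trip through the ARC.
    """
    r12 = (n_resist - n_arc) / (n_resist + n_arc)        # resist/ARC interface
    r23 = (n_arc - n_substrate) / (n_arc + n_substrate)  # ARC/substrate interface
    beta = 2.0 * math.pi * n_arc * thickness_nm / wavelength_nm
    phase = cmath.exp(-2j * beta)                        # round-trip phase and loss
    r = (r12 + r23 * phase) / (1.0 + r12 * r23 * phase)
    return abs(r) ** 2


# Hypothetical optical constants, for illustration only.
WAVELENGTH_NM = 248.0
N_RESIST = 1.76 + 0j
N_ARC = 2.1 - 0.6j
N_SUBSTRATE = 1.7 - 2.8j

thicknesses_nm = [float(t) for t in range(0, 61)]
curve = [
    stack_reflectivity(N_RESIST, N_ARC, N_SUBSTRATE, t, WAVELENGTH_NM)
    for t in thicknesses_nm
]
best = min(range(len(curve)), key=curve.__getitem__)

print(f"no ARC: R = {curve[0]:.3f}")
print(f"minimum R = {curve[best]:.3f} at {thicknesses_nm[best]:.0f} nm")
```

A full simulator also accounts for the complete film stack, illumination, and resist exposure, so the numbers from this sketch should not be compared directly with Fig. 130.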
The optical film properties for both silicon-rich plasma enhanced nitride (PEN) and silicon-rich silicon oxynitride (SiON) ARCs have been established at advanced fabs for use as potential inorganic ARCs.[200] Additionally, the plasma ARC films give the top resist process increased depth of focus and better resolution. With the installation of the Motorola APRDL 8" AMAT 5200 Centura™ system having the resistively heated DxZ™ chamber, development of the plasma ARC layers was further expanded beyond the Si-rich PEN system to include Si-rich silicon oxynitride. The primary advantage of inorganic BARCs is “pure optical performance,” says Sturtevant of Motorola: “It’s possible to tune these inorganic type films precisely to the optical properties needed.”[197] The key is a high k value approaching 1, which provides a greater tuning latitude. The k value, one of three parameters which determine substrate reflectivity, indicates optical density or film absorption.[201] The following parameters can be tuned by adjusting deposition temperature and pressure:
i. Real refractive index (n)
ii. Imaginary refractive index (k)
iii. Film thickness (t)
Inorganic AR coatings also provide the ability to deposit conformally from plasma over topography. This has meant better performance, which outweighs the inconvenience, longer processing time, and higher expense, according to Sturtevant. For DUV, SixNy and silicon oxynitrides have been used, and for i-line, TiN. In the 1997 timeframe, there were indications that inorganic BARC processing could easily be incorporated into the process flow, and this has proven to be the case. A further consideration is to fine-tune these optical films with a continuously changing refractive index from bottom to top of the AR coating layers to build a gradient effect. If n and k of the ARC layers are perfectly matched to the adjacent layers, technically, there will be no reflections.
The two interfaces, at the substrate and at the photoresist, however, are typically very different in their refractive indices, and this is what gives rise to the large reflectivity. Placing a layer in between that is well matched on the bottom to the substrate and on the top to the resist may be very effective at reducing reflections, according to Sturtevant. For manufacturing, inorganic coating technology is confronted with several issues. First, the parameters must be carefully controlled. Thickness and optical properties require tight control to minimize variations across the wafer and between wafers. Second, inorganic layers have chemistries similar to the underlying layer, which can make removal difficult. However, some
companies are adapting ways to leave the layers in place and thus avoid removal. And third, in terms of capital, this process is more expensive. Although in-house equipment is typically used, to put it in-line means added cost and footprint to increase capacity. Dedicated production tools for inorganic plasma deposition are becoming available, such as Applied Materials’ silane-based dielectric CVD. By eliminating ammonia from the plasma, this system can be used to deposit AR film which is compatible with DUV chemically assisted resists as well as with i-line. Defectivity with organic BARC coatings is a process problem. Interfacial defects appear as small bubbles between the bottom organic AR coat and photoresist layers due to the surface tension mismatch. One way to eliminate them is by adjusting the photoresist or AR coating process. For BARCs, when the layer is spun on and the photoresist is then applied with a dynamic dispense at a minimum of 2500 rpm, the bubbles are thrown off. Interfacial defects have also been addressed chemically. By implementing chemistry to achieve better matching for the AR layer surface energy and photoresist, the need for special resist processes has essentially been eliminated in new BARC products. The TFI effect may be reduced with a flatter wafer surface. Several processing techniques, such as chemical mechanical planarization (CMP), shallow trench isolation and dual damascene, are gaining interest as lithography tools as device features shrink below 0.35 µm. Some researchers have obtained reasonable process control and tight CD distributions in the face of small geometries and wafer topography—the only alternative to AR coatings is the elimination of topography, according to Johnson. 
While ARCs have been used to enhance IC lithography for years, the CD budget allowed in advanced sub-0.35-µm lithography is placing more demands on them.[202] Design specifications of next-generation devices will require ARCs to (a) suppress greater than 99% of substrate-reflected light, (b) meet stringent photoresist and device contamination requirements, and (c) operate at extended ultraviolet wavelengths. Many of these requirements are not met by the conventional ARCs in production today. During photolithography, light from the stepper is passed through a mask and the pattern is transferred to the wafer coated with photoresist. However, when the underlying film is highly reflective, as in metal and polysilicon layers, light reflections can destroy the pattern resolution by three mechanisms: (a) off-normal incident light can be reflected back through the resist that is intended to be masked; (b) incident light can be reflected off device features and expose “notches” in the resist; and (c) thin-film interference effects can lead
to linewidth variations when resist thickness changes are caused by wafer topology or non-flatness. CVD-deposited dielectric ARCs work by phase-shift cancellation of specific wavelengths. This requires the simultaneous specification of three optical parameters: refractive index n, extinction coefficient k, and thickness d. Proper choice of these three parameters ensures that the transmitted wave that passes through the ARC film will, on reflection from the substrate, be equal in amplitude and opposite in phase to the wave reflected from the resist-ARC interface. In practice, phase cancellation requires very tight control of process parameters such that, for example, the thickness of the ARC is maintained to within 15 Å. On the other hand, CVD-deposited dielectric ARCs are thin (200–300 Å), conformal to device features, and usually very selective during the ARC etch, so that CD control is easily maintained during pattern transfer. Typical simulation input variables include stepper parameters, mask pattern details, resist and developer types, and pre- and post-exposure bake conditions. Of these parameters, only the wavelength, substrate film stack, and resist index of refraction are critical to the design of the dielectric ARC. These parameters determine the substrate reflectivity, which can be minimized by adjusting the dielectric ARC optical constants n, k, and d. In the straightforward case of aluminum, neither the underlying film structures nor the aluminum film thickness plays a critical role, because the extinction coefficient of aluminum is high enough (k = 2.35–2.94 at 248 nm) that any DUV light will attenuate rapidly. Accurate prediction of resist profiles and CD control, however, requires attention to all the input parameters. Second order effects, such as the angular distribution of substrate illumination, have not been considered in optimization of the ARC properties.

MLM Resist Issues.
Total device layer to layer overlay is a complex situation, but to first order requires that the reticles stack well, the stepper alignment be good (i.e., three to four times smaller than the CD), and that the CDs for both layers are within specification. Even though the latter terms are usually the smallest terms, they can be large for poor processes and lead to poor circuit yield. Typically, CD control is specified at ±10% of the design rule or better. As of 1992, this requirement translates to approximately ±0.1 µm, 3-sigma, CD control for critical layers. In the late 90s, this budget has reduced by more than three times and some would argue that the 10% CD budget number is now more like ±8%. MLM Metrology Calibration. (Also see Ch. 4.) To prevent resist image shorting of high density interconnects due to photolithographic bridging, the photolithographic metrology must have the ability to measure 10% of the
nominal CD and/or the nominal alignment tolerance ideally. The task then is to establish this type of gauge capability for both the CD and alignment measurement tools used to monitor MLM fabrication verification data. For example, the SPC capability of an Hitachi 7000 to measure 0.5 µm features in resist is between 20–30 nm, and the IVS ACV4 capability is between 20 and 33 nm, depending upon the MLM layer being measured. Clearly, these capabilities are larger than the 10% ideal target, but these numbers are probably adequate until newer metrology equipment can be developed or measurement methods or algorithms improved. Notice that at larger design rules, i.e., 0.8 µm CD/0.25 µm alignment, these capabilities are very adequate. At smaller features approaching 0.1 µm, SEM gauge capabilities must be of the order of a few nanometers, and modern SEMs do achieve these levels. CDs are calibrated either electrically (see Fig. 131) or by SEM cross sections (i.e., the CD at the image base is referenced). The offset in Fig. 131, the difference between the electrical CD data and the Hitachi SEM data, is the sum of the RIE etch bias and the Hitachi bias. For metal layers, split cross bridge electrical structures are routinely used to calibrate either ADI or ACI CDs since they are absolute measurements. For via type structures, via resistance measurement for long via chains may be employed for calibration or electrical test structures recently developed by Lin can be employed.[203] More typically at via, SEM cross section calibrations are done. Here, metrology SEM measurements are typically compared to analytical SEM cross section measurements at 100,000X for the same features. Of course, the pitch for these patterns is well known to facilitate the calibration. Example MLM SEM cross sections for via and metal resist CDs are found in Figs. 132 and 133. 
For alignment, electrical structures are also prevalent like those employed by Prometrix commercial electrical measurement systems, but these systems are hardly ever used on actual circuit reticles due to space issues, or if they are included, they are only found in PC cells. More typically, IVS or KLA optical alignment systems are employed to check alignment of product wafers during fabrication and those tools are SPC controlled vs. secondary standard wafers in a continuous mode as for all the other lithographic tools discussed previously. Substrate Reflectivity Issues. Variations in wafer substrate film stacks can have a significant effect upon the resist critical dimension (CD) and exposure level for layers patterned to fabricate advanced four level metal BiCMOS MLM devices. In the fabrication of these VLSI devices, patterning is frequently performed on film stacks of varying thickness and optical properties.
Figure 131. CD vs. stepper exposure for 0.6 micron metal 1 patterns illustrating CD calibration and the difference between SEM data and electrical probe data.
Figure 132. SEM cross section for a 0.6 micron via printed with a Canon 2000i1 stepper.
Figure 133. SEM cross section for a 0.6 micron metal 1 line on AlCu at 1.2 micron pitch printed with a Canon 2000i1 stepper.
PROLITH 2[204] can and has been used to simulate lithography behavior on actual device film stacks, and the results compare favorably to data collected from actual MLM product wafers. These simulations can be used to accurately predict the exposure changes needed to compensate for changes in film thickness and film stack upon CD. In most cases studied, less than 3% deviation between the experimental and simulated results is observed. Thin film reflectivity is also observed to have a strong influence on CD variation. The significant improvements in CD variation have been generally correlated with reductions and/or optimizations in substrate reflectivity. Electrical probe CD data for backend metal layers has also been evaluated for thin film notching behavior, and the CD variation is minimized by applying the results from PROLITH reflectivity analysis. The constructive and destructive interference of light reflected from each film stack layer gives rise to reflectivity minima and maxima which can have an adverse effect on CD control.[205] Front-end and back-end CDs, printed under fixed processing conditions, can vary up from ±5–25% or a
large part of the usual CD variation budget due to film thickness and reflective notching effects. On metal layers, grain boundaries cause large local reflectivity variations which can cause notching (see Fig. 125).[159] Brunner[206] has developed a simple analytical expression, based upon the Fabry-Perot Etalon optical model for thin film interference, which accurately predicts the effect of thin film interference upon CD behavior. This expression can be used to account for reflectivity effects upon resist CDs, and examples of reflectivity interference related peak to peak amplitude minimization for both resist top surface and substrate antireflective coatings have been demonstrated.[206] Furthermore, detailed example results for the latter case are provided here. In this section, PROLITH 2 was used to simulate reflectivity and CD behavior on actual device film stacks and the results compared to data collected from product wafers. Results are described from a study of thin film lithographic effects in the fabrication of advanced BiCMOS devices employing Shipley System 9 (900 series resist in tables) or SPR 500, or JSR 500 high performance, resists. The TiW inorganic ARC films were sputter deposited and MRC films. Results for Metal and Via MLM Layers. The film stack shown in Fig. 134 is representative of the layers encountered in BiCMOS and Bipolar devices at metal and via photo steps. The only difference between metal and via layers is the oxide thickness, 1.5 kÅ versus 7.5 kÅ. In this case, the aluminum-copper (AlCu) is sufficiently thick to behave as bulk Al which is opaque to the exposing wavelength. Hence, any layer lying beneath the Al layer can be ignored for the purposes of the reflectivity calculation. The PROLITH 2 simulations in Fig. 135 show how the substrate reflectivity varies with TiW thickness in the absence of an oxide overlayer; in Fig. 
136 similar data for TiN deposited from a Varian metal system are also shown (Note: both metal cap systems are efficient at suppressing Al reflectivity, taken as ~100% for the figure). It can be seen that TiW behaves as an inorganic antireflection coating, and can reduce the reflectivity (at 365 nm) of the underlying AlCu from 65% to less than 30%. When an oxide and resist layer are present, the reflectivity behaves as shown in Fig. 137. Interference effects are responsible for the modulation in reflectivity with oxide and TiW layer thicknesses. At a fixed TiW thickness, variations in the oxide thickness can cause a 13% change in R (Note: The reflectivity changes from min. to max. when the oxide thickness changes by 1.25 kÅ). It is also easily seen from Fig. 137 that ARC is a much more efficient reflectivity suppressor than TiW (i.e., see triangle data at bottom). The simulations in Fig. 137 (right hand photo) show that over
this reflectivity range a ±0.05 µm change in CD can occur. When the ARC is included, this CD change becomes more tolerable at ±0.03 µm and the impact of the interference effects is effectively minimized. The theoretical reflectivity curve in Fig. 138 shows that ARC-XLT can reduce the AlCu reflectivity to below 10%. When used in conjunction with TiW, the overall reflectivity and modulation are reduced significantly, as shown in the right hand curve in Fig. 137. Figure 139 shows there is good agreement between theoretical and experimental results.
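The oxide thickness dependence described above comes from thin film interference. As a rough sketch only (this is not the PROLITH 2 calculation), the film thickness change for one full interference cycle at normal incidence is λ/(2n), and an adjacent minimum and maximum are λ/(4n) apart. The oxide index of 1.47 near 365 nm is an assumed typical value for SiO2 and does not come from this study; the function names are illustrative.

```python
# Rough normal-incidence estimate of thin film interference spacing.
# This is NOT the PROLITH 2 calculation: it ignores absorption, the lens
# numerical aperture, and every other layer in the film stack.

I_LINE_NM = 365.0        # exposure wavelength quoted in the text
N_OXIDE_ASSUMED = 1.47   # assumed typical SiO2 index near 365 nm, not from the study


def swing_period_nm(wavelength_nm: float, n_film: float) -> float:
    """Film thickness change that takes reflectivity through one full cycle."""
    return wavelength_nm / (2.0 * n_film)


def extremum_spacing_nm(wavelength_nm: float, n_film: float) -> float:
    """Film thickness change between an adjacent minimum and maximum."""
    return wavelength_nm / (4.0 * n_film)


if __name__ == "__main__":
    period = swing_period_nm(I_LINE_NM, N_OXIDE_ASSUMED)
    spacing = extremum_spacing_nm(I_LINE_NM, N_OXIDE_ASSUMED)
    # 1 kÅ = 100 nm
    print(f"full cycle:       {period / 100.0:.2f} kÅ")
    print(f"adjacent extrema: {spacing / 100.0:.2f} kÅ")
```

Under these assumptions the full cycle comes out near 1.24 kÅ, which is close to the 1.25 kÅ oxide thickness change quoted in the text.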
Figure 134. Film stack of BiCMOS and Bipolar devices at metal and via photo steps.
Figure 135. Dependence of reflectivity on TiW thickness. Substrate is 6.5 kÅ AlCu on silicon.
Figure 136. Dependence of reflectivity on TiN thickness.
Figure 137. Effect of oxide thickness on CD variation; film stack is resist/oxide/TiW/6.5 kÅ AlCu.
Figure 138. Dependence of reflectivity on ARC-XLT thickness. Substrate is 6.5 kÅ AlCu on silicon.
Figure 139. Comparison of experimental vs. theoretical reflectivity; film stack is resist/ ARC/oxide/TiW/AlCu.
The data shown in Table 24 demonstrate the effect of reflectivity on via CD control. In via patterning experiments, vias printed with only resist exhibited high CD variation (~0.25 µm, 3-σ) due to substrate reflectivity and reflective notching. In fact, larger CD variation (greater than 0.30 µm) has been observed on live product, depending upon the lot. In an extensive multivari experiment with notching and non-notching lots, almost the entire via CD variation was attributed to the presence of grain boundaries in the underlying metal. The use of a higher performance Shipley dyed resist (510L) alone did not improve the CD variation. When an optimized ARC was used in conjunction with the resist, CD variation dropped by 35% for the Shipley 915 dyed resist process and notching was minimized. By using 510L, which is a higher contrast dyed resist, together with ARC, a variation of 0.11 µm, 3-σ, was obtained. The significant improvement in CD variation observed with ARC and 510L/ARC has been correlated with the reduction and optimization of substrate reflectivity.
Table 24. Mean CD and Total CD Variation for Via MLM Patterning

Resist Systems        CD Mean     Std. Dev. (3-σ)
915 (no ARC)*         1.09 µm     0.24 µm**
510L (no ARC)***      0.97 µm     0.26 µm
915/ARC               0.94 µm     0.15 µm
510L/ARC              0.93 µm     0.11 µm

*   Shipley dyed resist for I-line.
**  Significant amount of notching appears without ARC. Values can be as high as 0.30 µm.
*** Shipley positive resist generation after 900 series. L means dyed material.
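The relative improvement from adding ARC can be checked directly from the 3-σ values in Table 24. The short sketch below is only arithmetic on the tabulated numbers; the dictionary and function names are illustrative and do not come from the original study.

```python
# 3-sigma CD variation values (micrometers) copied from Table 24.
TABLE_24_3SIGMA_UM = {
    "915 (no ARC)": 0.24,
    "510L (no ARC)": 0.26,
    "915/ARC": 0.15,
    "510L/ARC": 0.11,
}


def percent_reduction(before_um: float, after_um: float) -> float:
    """Relative drop in 3-sigma CD variation, in percent."""
    if before_um <= 0:
        raise ValueError("the starting variation must be positive")
    return 100.0 * (before_um - after_um) / before_um


if __name__ == "__main__":
    for resist in ("915", "510L"):
        before = TABLE_24_3SIGMA_UM[f"{resist} (no ARC)"]
        after = TABLE_24_3SIGMA_UM[f"{resist}/ARC"]
        print(f"{resist}: {percent_reduction(before, after):.1f}% lower with ARC")
```

For the 915 resist the tabulated values give a reduction of about 37.5%, in the neighborhood of the roughly 35% quoted in the text; the difference may simply reflect rounding or a different lot sample.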
The results of linewidth experiments on metal substrates are shown in Table 25. Wafers patterned only with 915 resist showed waviness and notching which contributed to poor linewidth control (0.22 µm, 3-σ). As with the via results described above, the addition of ARC reduced CD variation by 40% and minimized the effect of notching. Our CD control target of ±0.1 µm was exceeded using 511 + ARC.
Table 25. Mean CD and Total CD Variation for Metal 1 MLM Patterning

Resist Systems        CD Mean     Std. Dev. (3-σ)
915 (no ARC)*         1.31 µm     0.22 µm
915/ARC               1.33 µm     0.12 µm
511/ARC**             1.28 µm     0.09 µm

*  Shipley dyed resist for I-line.
** Shipley undyed positive resist generation after 900 series.
Thin film reflectivity has been demonstrated to be a major issue in MLM lithography and has been shown to have a strong influence on device CDs, level exposures, and CD variation. CD control of 0.09 and 0.11 µm, 3-σ, can be achieved on difficult metal and via layers, respectively. The significant improvement, i.e., reduction in CD variation, has been correlated with a reduction and optimization of substrate reflectivity, consistent with the work of Brunner.[206] In patterning the highly reflective layers encountered in BiCMOS backends, the use of a high contrast resist in conjunction with ARC-XLT can significantly improve CD variation by reducing the effect of substrate reflectivity. This bilayer process effectively suppresses the reflective notching and interference swing contributions to CD variation to below the level allowed by device design CD tolerances. An undyed resist/ARC process is currently employed in patterning metal layers where ±0.10 µm or better CD control is required for device yield.

Process Biasing. Metal features are usually biased larger than their design size to allow for size erosion during fairly harsh etch processes (see Sec. 5). Older resists used to fabricate mature devices require that a bias of at least +0.2 µm (here, the total bias for both image sides) be added to achieve the required resist image CD before etch processing. For some very old processes, biases as large as 0.5 µm have even been employed, but these were for fairly isotropic etch processes and much lower contrast resists. Today’s resists are higher contrast materials and photo biases are now zero or near zero, i.e., the printed image is the chromium image on the reticle divided by the reduction ratio of the tool. Metal etch biases may still be ~0.1 µm, but they too are being reduced through improved processing and by implementing next-generation single wafer etchers (for example, Applied Materials 5000 etchers).
Via images were typically biased to be undersized so that greater CD control could be achieved through greater exposures. Recently, however, processes have been improved so substantially that truly zero bias processes are now achievable for both photo and etch biasing. Typical biases for older via/contact processes were of the order of -0.2 to -0.3 µm, but it is fairly apparent that for next-generation dense circuits near-zero photo and etch biases are required.

DOF Budget and Focus Offsets. To successfully print MLM features, the wafer field must stay within a certain focus range or patterns may be missing in parts of the die or print field. The resist image DOF, as shown in Figs. 140 and 141, must be wider than the DOF budget determined by summing wafer, lens, and reticle/pellicle budget contributions.[207] Table 26 contains an example budget for a 0.5 µm metal 2 or via 1 process. Since the DOF window data
of Fig. 140 is greater than this budget, the metal 1 patterns should print across the field with fidelity with an estimated 0.7 µm surplus tolerance. Note that the budget may be somewhat smaller, or better, for via layers because they are all printed to a single metal plane determined by the metal thickness, and they are small square images that do not extend over great lengths as metal line interconnects do. The metal 3 budget may be slightly worse because two layers of planarization errors must be budgeted for instead of just the one contributor encountered at metal 2. Similarly, there may be front-end topographical contributions that diminish the budget at metal 1, but overall the budget estimates should all be less than the resist process DOF. This has been verified on MLM wafer line monitors, i.e., greater than 95% metal snake yields have been routinely achieved with the example DOF budget estimates.
Figure 140. Metal 1 CD vs. defocus for the JSR 500/ARC process.
Figure 141. Metal 1 CD vs. defocus for the JSR 500/ARC process with -0.6 micron focus offset.
Table 26. DOF Budget for 5X Canon 2000i with N-layer Planarization at Metal 2 or Via 1

Contributor                          3-sigma Contribution
Lens Field Curvature                 0.4 µm
Focus Repeatability                  0.1 µm
Field Leveling Repeatability         0.15 µm
Reticle Flatness, etc.               0.1 µm (a)
Wafer LTIR                           0.5 µm
Residual Planarity Topography (b)    0.3 µm (c)
Total DOF =                          1.55 µm (d)

(a) B. J. Lin, Ref 62.
(b) Doubles at Metal 3.
(c) Contribution may be closer to 0.1 µm since the vias all land on metal of the same thickness.
(d) At M1 approx. the same DOF, but at M3 the DOF is ≈ 1.8 µm.
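The Table 26 total can be reproduced as a simple linear sum of the contributors, which is how the 1.55 µm figure appears to have been obtained. The sketch below also applies footnote b (the topography term doubling at Metal 3). The 2.25 µm process window is an assumption, inferred only by adding the 0.7 µm surplus quoted in the text to the 1.55 µm budget; it is not a value read from Fig. 140, and all names are illustrative.

```python
# Linear sum of the Table 26 contributors, in micrometers.
TABLE_26_UM = {
    "lens field curvature": 0.4,
    "focus repeatability": 0.1,
    "field leveling repeatability": 0.15,
    "reticle flatness, etc.": 0.1,
    "wafer LTIR": 0.5,
    "residual planarity topography": 0.3,
}

# Assumed resist process window: 1.55 um budget + 0.7 um quoted surplus.
ASSUMED_PROCESS_WINDOW_UM = 2.25


def total_dof_budget(contributions: dict) -> float:
    """Add the contributors linearly, as the table total does."""
    return sum(contributions.values())


def metal3_budget(contributions: dict) -> dict:
    """Copy of the budget with the topography term doubled (footnote b)."""
    adjusted = dict(contributions)
    adjusted["residual planarity topography"] *= 2.0
    return adjusted


def dof_surplus(process_window_um: float, contributions: dict) -> float:
    """Positive values mean the process window exceeds the budget."""
    return process_window_um - total_dof_budget(contributions)


if __name__ == "__main__":
    print(f"M2/V1 budget: {total_dof_budget(TABLE_26_UM):.2f} um")
    print(f"M3 budget:    {total_dof_budget(metal3_budget(TABLE_26_UM)):.2f} um")
    print(f"surplus:      {dof_surplus(ASSUMED_PROCESS_WINDOW_UM, TABLE_26_UM):.2f} um")
```

A root-sum-square combination would give a smaller total; the linear sum is used here only because it matches the tabulated 1.55 µm.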
Figures 140 and 141 are referenced to demonstrate another principle governing MLM requirements, that is, focus offsets may be, and usually are, required. They are required primarily because the effective thickness of the resist must be corrected for the refractive index of the resist (~1.5); as a result, negative offsets for most layers range from -0.3 µm to -0.6 µm. The offset centers the defocus CD data at machine zero so that wafer flatness or other errors, which can be as large as 0.5 µm and as great as several microns at edge die of severely distorted wafers, do not create a yo-yo effect where the wafer CDs move up and down the steeply varying CD vs. defocus curve on the negative side of defocus. From Fig. 140, it is apparent that without a defocus offset, probably half of the optical fields will be precariously near the CD cliff edge (left hand side of Fig. 141) with little process tolerance or margin available.

MLM Planarization Processes. The DOF budgets for many new devices are very tight, especially those at DUV R&D facilities where the DOF of the resist/tools is currently ~1 µm TIR. Additionally, wafer flatness on an exposure field basis, termed local total indicated range (LTIR), can be much greater than the 0.5 µm usually budgeted for wafer flatness and can have values as high as several microns. Without planarizing steps between metal layers to achieve some degree of global planarization, the topography alone can be as great as 2–3 µm. Obviously, employing advanced four or five level MLM processing dictates some sort of planarization processing from first level ILD to fourth or fifth level metal, or the stepper images that need to be printed at third via or third metal will probably be out of focus and not print with fidelity.
The three processes in current development[208] are Chemical Mechanical Polishing (CMP), and two different 2-layer etchback techniques, I-layer or negative blocked 2-layer[208][209] and N-layer, termed positive resist 2-layer blocked.[210][211] The last technique will be discussed here because it is probably the most mature process from a defectivity point of view and experience with it is extensive.[210] The first two processes provide better global planarity, but require equipment development and negative resist development, respectively. IBM researchers have also reported using both N-layer and CMP together to achieve excellent global and local layer planarization.[212]

Two-Layer Photo Planarization Process. The conventional POT process,[210] i.e., where the dielectric layer is planarized by the application of one sacrificial layer followed by an etchback of that layer, is always considered first as a leading candidate for the planarization of any device MLM backend due to its simplicity. However, it is hampered by the fact that conventional, commercially available organic materials and spin-on glasses
can only planarize underlying features up to a finite distance from the edge of the feature, and neither group of materials is capable of more than step rounding or smoothing in reality. This distance is approximately 10–20 µm, and is known as the planarization length. Thus, for large (greater than 20 µm) isolated features separated by more than 20 µm, such as bonding pads, the degree of planarization achieved with conventional materials spun to workable thicknesses (less than ≈2 µm) decreases dramatically, to the point where vast numbers of features on typical circuitry are left completely unplanarized. For these types of structures, the only thing gained using conventional POT processing is simple rounding of the edges, which does help by maintaining metal line continuity over steps and by eliminating shorting bars or picket fences during metal etch at those locations.

The Two Layer Photo (TLP) process has been described in the literature[213]–[215] as a means, through the use of an extra masking step per ILD layer, to overcome the shortcomings of conventional POT and provide nearly perfect dielectric planarization independent of underlying topology. Figure 142 a-f describes the method. After the given metal layer is etched, the initial TEOS layer is deposited. In the standard POT process, the sacrificial organic would be spun on at this point, giving a large variation in the vertical altitude of the organic surface due to the dependence of the planarizing ability of the coated layer on the dimensions of the underlying topography.[216][217] This problem becomes more severe as feature sizes continue to shrink. With the TLP process, the problem of global planarization is lessened substantially by the use of the additional masking step. A photoresist layer is spun onto the structure (Fig. 142b), and is patterned with a mask that is roughly an inverse image of the underlying metal pattern (Fig. 142c).
This leaves resist in the field areas between the metal lines, effectively plugging these areas and planarizing the structure. In order to achieve planarity, the thickness of this resist layer (known as an “N-layer”) must be equal to that of the step height, which is the sum of the metal thickness and the degree of overetch into the underlying glass layer during the metal etch step. The N-layer is not an exact reverse image of the underlying metal for several reasons. First, the TEOS deposited onto the sides of the etched metal steps effectively widens the features by an amount equal to 60% of the deposited glass thickness, independent of feature size. Second, there is a limit to how small the N-layer “plugs” can be made. Since resist layers planarize small gaps fairly well without TLP, and adhesion and overcoat compatibility concerns exist for small N-layer images, N-layer plugs are usually larger than the lower limit of 1–2 µm. Finally, alignment tolerance comes into play. This tolerance is aligner
Figure 142. (a) Structure after metal etch and initial TEOS deposition. (b) After N-layer resist spin. (c) After N-layer patterning. (d) After overcoat and resist spin. (e) After etchback planarization. (f) After TEOS redeposition.
specific, but is typically set to be ≤ 0.25 µm, meaning that there is a nominal 0.25 µm or smaller gap between the edge of the glass and the edge of the N-layer plug. This is necessary because if the tolerance is made too tight, normal misalignments could result in full-thickness N-layer structures ending up on top of features, to the detriment of surface planarity. Thus, areas on the circuit with gaps between glass sidewalls smaller than 1 + (0.25 × 2) = 1.5 µm will not get plugged. For any thickness (ILDt) used for the initial TEOS deposition for first ILD, areas with a gap between neighboring metal features smaller than the dimension given by Eq. (7), written in terms of the minimum size Nmin, will not be plugged:

Eq. (7)    Max unplugged dim. = Nmin + 2 × (ILDt × sidewall coverage)

For future generations of back end product, the adhesion and alignment tolerances will be tightened, resulting in a larger percentage of the circuit getting N-layer plugs, and N-layer plugs as small as 0.5 µm may be easily obtained.

After the N-layer feature has been patterned, the wafers are coated with another organic layer which is similar to the sacrificial organic used in the POT process. A typical thickness of this organic layer is about 8500 Å for all three ILDs, but thinner layers of the order of 0.3 µm are more attractive for etchback time reduction reasons, and films this thin are still capable of the misalignment gap filling required. A thinner resist overcoat is also attractive out of concern for the overall nonuniformity of the glass thickness after etchback planarization. The thicker the resist/TEOS stack, the larger the magnitude of thickness variation after the etch, assuming the percent nonuniformity remains constant as a function of stack thickness. The only purpose of the overcoat is to fill in the gaps that the N-layer has not plugged due to the photolithographic alignment considerations (Fig. 142d). The overcoat layer does not have to be photosensitive and, for cost reduction reasons, should not be.

Great care must be taken in the treatment of the wafers after completion of the N-layer patterning and before the overcoat layer is spun. The N-layer resist must be either baked to high temperatures[209] or deep UV cured to prevent dissolution of the bottom N-layer resist images. If the N-layer image is not rendered insoluble, the surface will not be planarized because that layer dissolves at the overcoat step. Deep UV curing has been reported[214] to be a necessary component of the TLP process, and is used extensively. A high temperature bake will accomplish the same purpose, but temperatures high enough to prevent image dissolution (greater than 160°C) will also cause greater shrinkage of the N-layer resist to leave larger gaps.
Because this shrinkage occurs to the same degree
in all directions, a gap of several microns can open up between the edge of the N-layer plug and the edge of the glass. For this reason lower temperature baking followed by a deep UV cure is the recommended process. Others have used a layer of sputtered quartz to separate the two resist layers, but the deep UV cure is a simpler technique, without the added deposition, and has been demonstrated to work well.[210]

The example N-layer process depicted is the result of three basic experiments. First, the temperature at which N-layer image shrinkage was zero, or as small as possible, needed to be determined in order to prevent pattern pull-away and film stress. Next, the early literature was unclear as to whether DUV curing, high temperature curing, or both were necessary, or what order was essential for N-layer double coating. By doing the experiments of Figs. 143 and 144, it was determined that DUV was required and that it should occur after baking. Furthermore, baking before DUV is necessary to prevent “eyelash” defects from forming around the N-layer image edges as shown in Fig. 145. “Eyelash” images will transfer to the underlying TEOS layer during etchback and there is a considerable amount of deleterious topography associated with them, so they must be avoided. Any photoresist can be used for the N-layer process; in fact, at least four different resists have been successfully tested. Images for all of these resists are rendered insoluble to the planarization overcoat before etchback by the 120°C baking followed by the several hundred mJ/cm² DUV cure/exposure. Early on, there was concern about the limitation of the N-layer resist’s ability to fill the narrowest crevice left in the glass after initial TEOS deposition. This problem never materialized, however, and the resist overcoat easily fills gaps of approximately 1500 Å without leaving voids. Any gaps that the resist does not fill should be plugged by the TEOS redeposition.
After the overcoat layer is spun on, it is also important not to overbake the wafers, as excessive temperatures cause layer “bubbling,” a phenomenon that occurs near the boundary between N-layer plugs and glass edges and is reminiscent of localized resist reticulation. This problem becomes more severe as the metal thickness increases, which can cause local N-layer build up. The cause is believed to be excessive volatile product build up in the film that cannot escape as quickly as it does from thinner film regions. This is a common DUV curing problem as resist thicknesses are increased. The resist/TEOS sandwich is next etched back to within some targeted distance above the top of the metal using a chemistry that etches resist and TEOS at roughly the same rate (Fig. 142e), and then another TEOS deposition brings the final ILD thickness up to the specified target (Fig. 142f).
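The plug-size arithmetic used earlier in this process description, including Eq. (7), can be sketched as follows. Two points are assumptions rather than statements from the text: Nmin is taken here to be the smallest glass-to-glass gap that still receives a plug (1.5 µm in the worked example), and the 0.5 µm initial TEOS thickness is a purely hypothetical input. The 60% sidewall coverage, the 1 µm plug limit, and the 0.25 µm alignment tolerance are the values quoted in the text; the function names are illustrative.

```python
def min_pluggable_glass_gap_um(min_plug_um: float, align_tol_um: float) -> float:
    """Smallest gap between glass sidewalls that still receives an N-layer plug.

    The plug needs its own minimum width plus one alignment gap on each side.
    """
    return min_plug_um + 2.0 * align_tol_um


def max_unplugged_dim_um(n_min_um: float, ild_thickness_um: float,
                         sidewall_coverage: float) -> float:
    """Eq. (7): metal-to-metal gaps below this dimension are not plugged.

    n_min_um is ASSUMED to be the minimum pluggable glass-to-glass gap.
    """
    return n_min_um + 2.0 * (ild_thickness_um * sidewall_coverage)


if __name__ == "__main__":
    n_min = min_pluggable_glass_gap_um(min_plug_um=1.0, align_tol_um=0.25)
    hypothetical_ild_um = 0.5   # hypothetical TEOS thickness, not from the text
    coverage = 0.60             # sidewall coverage quoted in the text
    limit = max_unplugged_dim_um(n_min, hypothetical_ild_um, coverage)
    print(f"minimum pluggable glass gap: {n_min:.2f} um")
    print(f"metal gaps below {limit:.2f} um stay unplugged")
```

Tightening the alignment tolerance or the minimum plug size lowers both numbers, which is why a larger fraction of the circuit is expected to receive plugs in later generations.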
A great deal of discussion has transpired regarding the need for the complex processing associated with the TLP N-layer technique. Until other techniques such as CMP are established in volume production as cost effective (note: this occurred in the 1990s), the TLP process will be employed extensively, at least in R&D fabs doing advanced MLM processing.
Figure 143. N-layer 3-factor factorial analysis results to determine the minimum full thickness process required to prevent N-layer dissolution at the overcoat planarization process.
Figure 144. 3-factor factorial results of the experiment designed to determine the processing conditions that prevent the “eyelash” problem. Note: no eyelash cells to the right.
Figure 145. Optical micrograph illustrating the “eyelash” problem of improperly processed N-layer images around MLM bonding pads.
Alternate Processes to TLP Etchback. There has also been an ongoing effort to investigate the claims by several photoresist vendors to have perfected materials giving far better planarization lengths than today’s resists. At this writing, impressive degrees of planarization (in excess of 80%) have been achieved with some of these materials, such as Futterex PC2-1500.[208] Problems remain with purity and stability, however. It is hoped that use of these materials in the simpler POT-like process will be a viable alternative to N-layer processing in the near future. Of course, this type of single layer process would require an organic material capable of planarizing substantially large pads, 100 microns square and isolated. A careful survey of the literature turned up several ideas, some involving multiple layers and various processing sequences. These were all tested, and the results are found in Table 27. The results of the table clearly indicate a lack of planarization for large feature geometries and a large discrepancy in the degree of planarization between center and edge of the test wafers. Although this center to edge phenomenon is not well understood, the results are in any case not acceptable, because the maximum planarization value observed was only 27%.
Table 27. Planarization Measurements for Literature Systems Before 1988

Organic Systems                   Planarization* at     Planarization* at
                                  Wafer Center, %       Wafer Edge, %

SINGLE LAYER SYSTEMS
OCG WX-214 (Special Spin)**       1.8                   24.2
OCG WX-214 (Std. spin)            ~0                    18.5
Polystyrene                       0                     8.5
TLS (Polysiloxane)                3.1                   23.6
FUTTEREX (early version)          ~0                    19.1

BILAYER SYSTEMS
OCG WX-214/OFPR-800***            ~0                    25.4
OCG WX-214/OFPR-800               ~0                    21.7
Polystyrene/OFPR-800 (180 C)      ~0                    26.8
OFPR-800/Polyvinyl alcohol        ~0                    26.8
OFPR-800/Polyacrylic acid         1.5                   19.3

*   Calculated from P = 100 × (Pad Step Ht. - Meas. Step Ht. at Coat)/Pad Step Ht.
**  Special spin uses a very short spin of a few seconds, and the wafer dries with the chuck stationary rather than spinning as for the Std. spin. WX-214 is a standard photoresist.
*** Dynachem standard photoresist.
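The planarization figure of merit defined in the first footnote to Table 27 is straightforward to evaluate. The sketch below implements that expression; the 1.0 µm pad step and 0.732 µm residual step are hypothetical inputs chosen only to land on one of the tabulated values, not measurements from the study, and the function name is illustrative.

```python
def planarization_percent(pad_step_um: float, measured_step_um: float) -> float:
    """P = 100 x (pad step height - measured step height after coat) / pad step height.

    0% means the coating simply replicates the step; 100% means the step
    has disappeared from the coated surface.
    """
    if pad_step_um <= 0:
        raise ValueError("pad step height must be positive")
    return 100.0 * (pad_step_um - measured_step_um) / pad_step_um


if __name__ == "__main__":
    # Hypothetical example: a 1.0 um pad step that still measures 0.732 um
    # after coating corresponds to the best value listed in Table 27.
    print(f"{planarization_percent(1.0, 0.732):.1f}% planarization")
```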
Stillwagen[217][219] was also aware of these limitations, and was able to formulate a series of polymeric compounds which were capable of planarization due to highly viscous flow and long spin times. Unfortunately, these compounds are not practical for use in the electronics industry, but they did demonstrate that satisfactory planarization could be achieved, providing an existence proof. Since that work, several polymer systems have been developed by OCG and Futterex which are capable of thermal flow after coating, followed by thermally induced crosslinking reactions to harden them. This creates planarity over 1 micron topography of up to 70–90%
for Hunt and Futterex test materials, respectively. The latter data were obtained for 200°C cures, and should be adequate for initial POT process experiments without N-layer lithographic steps. Even though these advances are impressive, it appears that CMP research has won out over the long term and these POT-like processes have faded away.

Via Layer Processing. Via layer images of an MLM process are small square contact geometries that allow the layers of metal to be connected at specific sites. So, for every layer of metal there is a via level which follows it, including even the last layer of metal, which must be contacted through passivation dielectric in order to place the chip into a package. The via images are usually printed on a TEOS oxide or any type of insulating dielectric layer, then transferred into the dielectric down to the lower metal surface using RIE as in Sec. 3.

Via Layer CD Control. The issues for via CD control are very similar to those for metal layers, i.e., reflectivity, focus, and exposure. As for metal layers, image focus can be very important. In Figs. 146 and 147, the importance of putting in a focus offset to center the CD vs. the range of defocus data is demonstrated. Without an offset, the process is performed precipitously close to the right hand data of Fig. 146, which is going out of CD control quickly at less than 1 µm of positive defocus (see 370 and 420 mJ/cm² data). The CD spec is 0.5 ±0.1 µm; notice that there is an interaction of exposure and defocus at large negative and positive defocus values. When working at the machine defocus of zero in Fig. 146, any disturbance across the wafer or the print field will take the data out of specification quickly. In comparison, the data of Fig. 147 are better centered at zero machine focus after putting in the -0.7 µm offset, and now any changes that occur up to ±1.4 µm in local field planarity will have only a minimal effect on the via CD.
Notice that this empirically determined offset value is very close to the theoretical value calculated to compensate DOF for resist thickness,[207] i.e., the resist thickness of 1.28 µm (12.8 kÅ) divided by the refractive index of 1.6, or -0.8 µm. The data of both figures also allow the etch bias of this example process to be estimated at about +0.05 µm using the fixed 420 mJ/cm² data from both plots, i.e., the via gets a little larger after etch. As via CDs decrease with time, processes with no photo or etch bias will be required. Via CD variation can also be affected by metal reflectivity in the same way metal CDs are affected, because vias too are printed over the same type of grainy metal surface (see ARC section). If reflective notching is a problem at the preceding metal layer, it will most likely also be a problem at the via layer, since the metal has not changed. In fact, studies on 1 µm vias cut to
metal requiring ARC demonstrated greater CD control when the ARC was also used at the via layer. The asymmetric notched via images with no ARC are shown in Fig. 148, along with the heavily notched via CD line aids that are also printed with the vias. After employing ARC at via, the reflective notch induced CD variability is reduced by between 27% and 37%, based on separate studies.
Figure 146. Via 1 CD vs. defocus for the JSR 500 process.
Figure 147. Via 1 CD vs. defocus for the JSR 500 process with -0.7 micron focus offset.
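The theoretical focus offset discussed above for the via layer is the resist thickness divided by the resist refractive index, applied with a negative sign. The sketch below evaluates it, assuming the 1.28 resist thickness is in micrometers; that unit is an assumption, chosen because it is the reading that reproduces the quoted -0.8 µm. The function name is illustrative.

```python
def theoretical_focus_offset_um(resist_thickness_um: float,
                                refractive_index: float) -> float:
    """Negative focus offset that compensates for resist thickness.

    The physical thickness is divided by the resist refractive index
    to obtain the effective optical depth.
    """
    if refractive_index <= 0:
        raise ValueError("refractive index must be positive")
    return -resist_thickness_um / refractive_index


if __name__ == "__main__":
    # Thickness ASSUMED to be 1.28 um; index of 1.6 as quoted in the text.
    offset = theoretical_focus_offset_um(1.28, 1.6)
    empirical = -0.7  # empirically determined offset used for Fig. 147
    print(f"theoretical offset: {offset:.2f} um")
    print(f"difference from empirical: {abs(offset - empirical):.2f} um")
```

The two values agree to within about 0.1 µm, which is small compared with the ±1.4 µm of local field planarity change that the centered process is said to tolerate.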
Figure 148. SEM micrographs of notched vias illustrating the via asymmetry that occurs without an ARC undercoating.
For smaller vias at 0.5 µm, CD variability without ARC can be as large as 0.13 µm (all families of variation for ten lots of real wafer data, 5 wafers/lot, 5 sites/wafer, 3-sigma), with the largest family of variation being wafer via site to site variation. This is really not too surprising because reflective notching does vary via to via, as is seen in Fig. 148. By employing an ARC at via, that variation can be reduced to within the ±0.1 µm specification level. Vias printed with only resist exhibited “reflective notching dominated” 3-sigma CD variation that was 0.10 µm greater, or double the specification, compared with that observed where an optimized ARC process was employed under the resist to minimize substrate reflectivity.

5.5 Summary and Future Predictions

The MLM photo processes of this section were basically for 0.8–0.7 µm Bipolar and 0.5 µm BiCMOS design rule technologies. With further design rule shrinkage to 0.35 µm and below, zero bias processes will be required. For example, the authors of Ref. 220 demonstrated 0.25 µm capability with an i-line stepper with oblique illumination combined with attenuated phase shift reticles, which is one of the easier phase shift techniques to employ. Of course, advances in phase shift technology processing and inspection must also occur before these technology goals can be achieved. Advances in DUV tool technology and phase-shift reticle technology are also pushing the frontiers of lithography and may eventually achieve 0.10 µm production lithography by the year 2000.

5.6 Future Processes
With the extension of optical lithography to shorter wavelengths and higher numerical apertures, and to resolution values approaching 0.1 micron and smaller, future processing needs will surely be in the surface sensitive processing area, termed Desire processes.[221]–[223] This processing has considerable potential, because it would allow for relief of e-beam proximity effects and photolithographic bulk and focus tolerance effects. Desire-type processing allows thick SLR systems to be employed effectively as bilayers for RIE selectivity and bulk effect relief, with no inter-layer mixing problems. Desire processing is accomplished by HMDS treatment of the photoresist after exposure to yield an in-situ silylated bilayer, which renders the system an RIE bilayer system as opposed to the more typical lithographic bilayer systems. The system requires oxygen RIE development of the final image. It allows simplified processing over that
for conventional bilayer processes in that (1) the second resist spin and other top layer processing, (2) the top layer development, (3) the DUV flood exposure of the bottom layer, and (4) a subsequent separate development are eliminated and replaced by a high pressure HMDS treatment and RIE development. Obviously, Desire processes have an even greater advantage for DUV or EUV lithography, where thinner resists are used and resist exposure penetration is limited. Desire processing, like the other special processes of this section, possesses higher process contrast.[222][223] This is achieved because the process images only the thin top portion of the thick resist layer. Furthermore, all feature sizes can be written at nearly the same exposure level, and dye can be incorporated to provide relief from reflective notching effects. The process is not without problems, however, as swelling effects have been observed and were reported as early as 1988.
REFERENCES 1. Thompson, L. F., and Kerwin, R. E., Annual Review of Materials Science, 6:267–299 (1976) 2. Bowden, M. J., and Thompson, L. F., Solid State Technology, 72 (May 1979) 3. Thompson, L. F., Willson, C. G., and Bowden, M. J., Introduction to Microlithography, ACS Symposium Series 219, Washington DC (1983) 4. Bowden, M. J., in: Materials for Microlithography, (L. F. Thompson, C. G. Willson, and J. M. J. Frechet, eds.), 266:39–117, ACS Symposium Series, Washington DC (1984) 5. Sze, S. M., VLSI Technology, McGraw-Hill, New York (1983) 6. McGillis, D. A., “Lithography,”VLSI Technology, (S. M. Sze, ed.) p. 267 (1983) 7. Bowden, M. J., in: Materials for Microlithography, (L. F., Thompson, C. G., Willson, and J. M. J. Frechet, eds.) ACS Symposium Series, 266:39 Washington DC (1984) 8. King, M. C., “Principles of Optical Lithography,” Chapter in VLSI Electronics-Microstructures Science, (N. G. Einsproch, Ed.,) Academic: New York (1981) 9. DeForest, W., “Photoresist Materials and Processes,” McGraw-Hill: New York (1975); Kaplan, M., and Meyerhofer, D., RCA Review, 40:167 (1979) 10. Barrow, G. M., Physical Chemistry, McGraw-Hill, New York (1966)
11. Watts, M. P. C., “A High Sensitivity Two Layer Resist Process for Use in High Resolution Optical Lithography (for VLSI),” SPIE, 469:2 (1984)
12. Calvert, J. G., and Pitts, J. N., Photochemistry, Wiley, New York (1967)
13. Arden, W., Keller, H., and Mader, L., “Optical Projection Lithography in the Submicron Range,” Solid State Technology, pp. 143–150 (July 1983)
14. Trefonas, P., and Daniels, B. K., “New Principle for Image Enhancement in Single Layer Positive Photoresists,” SPIE, 771:194 (1987)
15. Hanabata, M., Furuta, A., and Uemura, Y., SPIE, 771:85–92 (1987)
16. Templeton, M. K., Szmanda, C. R., and Zampini, A., SPIE, 771:136–147 (1987)
17. Pampalone, T. R., Solid State Technology, pp. 115–120 (June 1984)
18. Fahrenholz, S. R., Goldrick, M. R., Hellman, M. Y., Long, D. T., and Pietti, R. C., ACS Org. Coat. Plast. Chem. Preprints, 35:306–311 (1975)
19. Deckert, C. A., and Peters, D. A., Solid State Technology, pp. 76–80 (Jan. 1980)
20. Koshiba, N., IEEE Workshop on Microlithography, Maui, HI (1996)
21. DeJule, R., Semiconductor International, 76 (June 1997)
22. Willson, C. G., Ch. 3, in Intro. to Lith., ACS Prof. Ref. Book, p. 139 (1994)
23. Reiser, A., and Marley, R., Trans. Faraday Soc., 64:64 (1968); Blais, P. D., Private Communication (1978)
24. Neisius, K., and Merrem, H. J., J. Electronic Materials, 11:761–777 (1982)
25. Hayase, S., Onishi, Y., and Horiguchi, R., JECS, 134:2275–2280 (1987)
26. Reichmanis, E., Smith, B. C., Smolinsky, G., and Wilkins, C. W., JECS, 134:653–657 (1987). Also see references therein. Wolf, T. M., Hartless, R. L., Schugard, A., and Taylor, G. N., J. Vac. Sci. Technol. B, 5:396–401 (1987)
27. Turner, S. R., Ahn, K. D., and Willson, C. G., Ch. 17 in: Polymers for High Technology Electronics and Photonics (M. J. Bowden and S. R. Turner, eds.), 346:200–210, ACS Symposium Series, Washington DC (1984)
28. Gipstein, E., Ouano, A. C., and Thompkins, T., JECS, 129:201–204 (1982)
29. Dorsch, J., and Steffora, A., Electronic News, 10 (Feb. 1999)
30. Ito, H., and Willson, C. G., Technical Papers of SPE Reg. Tech. Conf. on Photopolymers, p. 331 (1982)
31. Ito, H., Solid State Technology, p. 164 (July 1996)
32. Willson, C. G., Ch. 3 in Intro. to Microlith., ACS Ref. Books, 2nd Ed. (1994)
33. Strefkerk, B., van Ingen Schenau, K., and Buijk, C., SPIE (1998)
34. Tomiyasu, Y. K. H., Tsukamoto, M., Niinomi, T., Tanaka, Y., Fujita, J., Ochiai, T., Uedono, A., Tanigawa, S., SPIE, 3049:238 (1997)
35. Houlihan, F., Nalamasu, O., and Reichmanis, E., et al., SPIE, 3049:466 (1997)
36. Conley, W., et al., SPIE, 3049:183 (1997)
37. Willson, C. G., et al., SPIE, 3049:39 (1997)
38. Arai, Y., and Sato, K., SPIE, 3049:300 (1997)
39. Kamberg, K., Chen, G., Wiljanen, K., and Marzani, J., Motorola MOS 12, Photolith. Symposium (Oct. 1998)
40. Chen, G., Motorola MOS 12 internal data (1997)
41. Burggraaf, P., Solid State Technology, p. 31 (Feb. 1999)
42. Allen, R. D., Semiconductor International, p. 73 (Sept. 1997)
43. Technology News, Solid State Technology, p. 28 (Jan. 1999)
44. Technology News, Solid State Technology, p. 32 (May 1998)
45. Fok, S., and Hong, G. H. K., in: Proceedings of Kodak Microelectronics Seminar-Interface (1983)
46. Bethe, H. A., and Aschkin, U., in: Experimental Nuclear Physics (E. Segre, ed.), Wiley, New York (1959)
47. Helbert, J. N., Iafrate, G. J., Pittman, C. U., and Lai, J. H., Polym. Eng. and Sci., 20:1076–1081 (1980)
48. Iafrate, G. J., Helbert, J. N., Ballato, A. D., and McAfee, W. S., US Army R&D Tech. Report, Ft. Monmouth, N.J.: ECOM–4466 (1977); Iafrate, G. J., Helbert, J. N., Ballato, A. D., Cook, C. F., and McAfee, W. S., in: Proceedings of Army Science Conference, West Point, NY (June 1980)
49. Hatzakis, M., Ting, C. H., and Viswanathan, N., in: Proceedings of Electron and Ion Beam Sci. and Techn., 6th International Conf., 542–579 (1974)
50. Thompson, L. F., and Bowden, M. J., JECS, 120:1304–1312 (1973)
51. Chen, C-Y., Pittman, C. U., and Helbert, J. N., J. Polym. Sci.: Chem. Ed., 18:169–178 (1980)
52. Lai, J. H., Helbert, J. N., Cook, C. F., and Pittman, C. U., J. Vac. Sci. Technol., 16:1992–1995 (1979)
53. Fahrenholz, S. R., J. Vac. Sci. Technol., 19:1111–1116 (1981)
54. Buiguez, F., Lerme, M., Gouby, P., and Eilbeck, N., in: Proceedings of 1984 International Symposium on Electron, Ion, and Photon Beams (1984)
55. Bowden, M. J., Thompson, L. F., Fahrenholz, S. R., and Doerries, E. M., JECS, 128:1304–1311 (1981)
56. Helbert, J. N., Schmidt, M. A., Malkiewicz, C., Wallace, E., and Pittman, C. U., in: Polymers in Electronics, (T. Davidson, ed.), 242:91–100, ACS Symposium Series, Washington DC (1984)
57. Thompson, L. F., Stillwagon, L. E., and Doerries, E. M., J. Vac. Sci. and Techn., 15:938–943 (1978); Thompson, L. F., Yau, L., and Doerries, E. M., JECS, 126:1703–1708 (1979)
58. Tan, Z. C. H., Daly, R. C., and Georgia, S. S., SPIE, 469:135–143 (1984)
59. Imamura, S., Tamamura, T., Sukegawa, K., Kogure, O., and Sugawara, S., JECS, 131:1122–1130 (1984); Saeki, H., Shigetomi, A., Watakabe, Y., and Kato, T., JECS, 133:1236–1239 (1986); Imamura, S., JECS, 126:1628–1630 (1979); JST News, 3:56–57 (1984)
60. Taylor, G. N., Coquin, G. A., and Somekh, S., in: Technical Papers of SPE Photopolymers Principles - Processes and Materials, pp. 130–151 (1976)
61. Tanigaki, T., Suzuki, M., and Ohnishi, Y., JECS, 133:977–980 (1986)
62. Frechet, J. M. J., Eichler, E., Stanciulescu, M., Iizawa, T., Bouchard, F., Houlihan, F. M., and Willson, C. G., Ch. 12 in: Polymers for High Technology Electronics and Photonics (M. J. Bowden and S. R. Turner, eds.), 346:138–148, ACS Symposium Series, Washington DC (1984)
63. Liu, H., deGrandpre, M., and Feely, W., Proceedings of the 31st International Symposium on Electron, Ion, and Photon Beams, Woodland Hills, CA (May 1987)
64. Brunsvold, W. R., Crockatt, D. M., Hefferson, G. J., Lyons, C. F., Optical Engineering, 26:330–336 (1987)
65. Turner, S. R., Schleigh, W. R., Arcus, R. A., and Houle, C. G., in: Proceedings of Kodak Microelectronics Seminar-Interface (1986)
66. Oldham, W. G., Nandgaonkar, S. N., Neureuther, A. R., and O’Toole, M., IEEE Trans. Electron Devices, ED-26:717 (1979)
67. Monohan, K., Hightower, J., Bernard, D., Cagan, M., and Dyser, D., in: Proceedings of Kodak Microelectronics Seminar-Interface (1986)
68. Electronic News, p. 90 (Sept. 15, 1997)
69. Dill, F. H., Tuttle, J. A., and Neureuther, A. R., “Modelling Positive Photoresist,” Proceedings of Kodak Interface, 24 (1974)
70. Hornberger, W. P., Huge, P. S., Shaw, J. M., and Dill, F. H., “The Characterization of Positive Photoresists,” Proceedings of Kodak Interface, 44 (1974); Esterkamp, M., Wong, W., Damar, H., Neureuther, A. R., Ting, C. H., and Oldham, W. G., “Resist Characterization: Procedures, Parameters, and Profiles,” SPIE, 334:182 (1982)
71. Daniels, B. K., Trefonas III, P., and Woodbrey, J. C., “Advanced Characterization of Positive Photoresists,” Solid State Technology, 105–109 (Sept. 1988)
72. Kitaori, T., Fukunaga, S., Koyanagi, H., Umeda, S., and Nagasawa, K., “A Study of Photosensitizer for I-line Lithography,” SPIE, 1672:242 (1992)
73. Mack, C. A., “Development of Positive Photoresists,” Journal of Electrochemical Society, 134:148 (1987)
74. Waldo, W., and Helbert, J., “Lithographic Process Development for High Numerical Aperture I-Line Steppers,” SPIE, 1088:153 (1989); Way, K., Smith, C., Malhotra, S., and Helbert, J. N., Motorola ACT Technical Report No. 190 (1993)
75. Underhill, J. A., Lunding, D. L., and Kerbaugh, M. L., “Wafer Flatness as a Contributor to Defocus and to Submicron Image Tolerances in Step-and-Repeat Photolithography,” J. Vac. Sci. Technol. B, 5(1):299 (1987)
76. Hanabata, M., “Material Design of Photo Sensitive Agent 0.35µm Single Layer Within Target,” Nikkei Microdevices, 51 (Apr. 1992); Hanabata, M., Oi, F., and Furuta, A., “Novolak Design for High Resolution Positive Photoresists (IV) Tandem Type Novolak Resin for High Performance Positive Photoresists,” SPIE, 1466:132 (1991)
77. DeMuynck, D., Malhotra, S., and Helbert, J., “Next Generation I-Line Photoresist Evaluation for 0.6µm Photolithography,” Proceedings of 1991 Motorola SPS Technical Enrichment Matrix, 1:B(9) (1991)
78. Blais, P. D., Solid State Technology, pp. 76–79 (Aug. 1977)
79. Walker, C. C., and Helbert, J. N., in: Polymers in Electronics, (T. Davidson, ed.), 242:65–77, ACS Symposium Series, Washington DC (1984)
80. Wong, C. P., and Bowden, M. J., in: Polymers in Electronics, (T. Davidson, ed.), 266:285–304 (see Ref. 4); Tarascon, R., Harteney, pp. 359–388, ACS Symposium Series, Washington DC (1984)
81. Micheilsen, M. C. B. A., Marriott, V. B., Ponjée, J. J., van der Wel, H., Touwslager, F. J., Moonen, J. A. H. M., Microelectronic Engineering, 11:475–480 (1990)
82. Bauer, J., Drescher, G., Silz, H., Frankenfeld, H., Illig, M., SPIE, Vol. 3049 (1997)
83. Helbert, J. N., and Hughes, H. G., in: Adhesion Aspects of Polymeric Coatings, (K. L. Mittal, ed.), 499, Plenum Press, New York (1983)
84. Deckert, C. A., and Peters, D. A., in: Proceedings of the 1988 Kodak Microelectronics Seminar, 13 (1977); Circuits Manuf. (Apr. 1979)
85. Helbert, J., Saha, N., and Mobley, P., Adhesion Aspects of Resist Materials, Opportunities and Research Needs in Adhesion Science and Technology, (G. G. Fuller and K. L. Mittal, eds.), pp. 3–17, Hitex Publications (June 1988)
86. Nistler, J. L., “A Simple Technique to Analyze Conditions that Affect Submicron Photoresist Adhesion,” Proceedings of Kodak Interface K–88, Advanced Micro Devices, Austin, Texas, p. 233 (1988)
87. Siegbahn, K., Nordling, C., Johansson, G., Hedman, J., Heden, P. F., Hamrin, K., Gelius, V., Bergmark, T., Werme, L. O., Manne, R., and Baer, Y., in: ESCA Applied to Free Molecules, North Holland, Amsterdam (1969); Carlson, T. A., in: Photoelectron and Auger Spectroscopy, Plenum Press, New York (1975)
88. Helbert, J. N., and Saha, N. C., Ch. 21 in: Polymers for High Technology Electronics and Photonics (M. J. Bowden and S. R. Turner, eds.), 346:250–260, ACS Symposium Series, Washington DC (1984)
89. Mittal, K. L., Solid State Technol., 89 (May 1979)
90. Helbert, J. N., Robb, F. Y., Svechovsky, B. R., and Saha, N. C., in: Surface and Colloid Science in Computer Technology, (K. L. Mittal, ed.), pp. 121–141, Plenum Press, New York (1987)
91. Deckert, C. A., and Peters, D. A., in: Adhesion Aspects of Polymeric Coatings, (K. L. Mittal, ed.), 469, Plenum Press, New York (1983)
92. Stein, A., The Chemistry and Technology of Positive Photoresists, Technical Bulletin of Philip A. Hunt Chemical Corporation
93. Meyerhofer, D., “Characteristics of Resist Films Produced by Spinning,” J. Appl. Phys., 49:3993–3997 (1978)
94. Middleman, S., J. Appl. Phys., 62:2530–2532 (1987)
95. Lai, J. H., paper at American Chemical Society National Meeting, Honolulu (Apr. 1979)
96. Box, G., Hunter, W., and Hunter, J., in: Statistics for Experimenters, Wiley, New York (1978)
97. Helbert, J., Ch. 2 in Handbook of VLSI Microlithography, (W. Glendinning and J. Helbert, eds.) Noyes Publications, Park Ridge, NJ (1991)
98. Helbert, J. N., and Saha, N., “Application of Silanes for Promoting Resist Patterning Layer Adhesion in Semiconductor Manufacturing,” in Silanes and Other Coupling Agents, (K. L. Mittal, ed.) 439 (1992)
99. Sukanek, P., “A Model for Spin Coating with Topography,” J. Electrochem. Soc., 136:3019 (1989)
100. Bryce, G. R., and Collette, D. R., Semiconductor International, 71–77 (1984)
101. Diamond, W. J., Practical Experiment Designs for Engineers and Scientists, Lifetime Learning Publications, Belmont, CA (1981)
102. Alvarez, A., Welter, D., and Johnson, M., Solid State Technology (July 1983)
103. Yates, F., in: The Design and Analysis of Factorial Experiments, Bulletin 35, Imperial Bureau of Soil Science, Harpenden, Herts, England, Hafner (Macmillan)
104. Johnson, M., and Lee, K., Solid State Technology, 281 (Sept. 1984)
105. Waldo, W., and Helbert, J., 1988 IEEE Workshop on Lithography, Jackson Hole, WY (1988)
106. Campbell, D. M., and Ardehali, Z., Semiconductor International, pp. 127–131 (1984)
107. Martin, A., Anastos, L., Ausschnitt, C., Balas, J., Brige, L., Golden, K., Long, D., Marsh, J., Taylor, R., and Thomas, A., “Elimination of Send Ahead Wafers in an IC Fabrication Line,” SPIE, 1673:640 (1992)
108. Matthews, J. C., and Willmott, J. I., Jr., SPIE, 470:194–201 (1984)
109. Ma, W. H. L., SPIE, 333:19–23 (1982)
110. Roberts, E. D., SPIE, 539:124–130 (1985)
111. Peters, D. A., and Deckert, C. A., JECS, 126:883–885 (1979)
112. Sargent, J., Starov, V., Rust, W., Solid State Technology, p. 172 (May 1997)
113. Brown, M., John, J., DeCoursey, R., Malhotra, S., Lee, F., Helbert, J., “Photolithographic Process Defect Density Minimization Using Response Surface Experimental Designs and Modeling,” Microcontamination Conference Proceedings, pp. 330–337 (1993)
114. Moreau, W., Cornett, K., James, F., Linehan, L., Montgomery, W., Plat, M., Smith, R., and Wood, R., “The Shot Size Reduction of Photoresist Formulations,” SPIE, 2438:646–658 (1995)
115. Bruce, J. A., Burn, J. L., Sundling, D. L., and Lee, T. N., in: Proceedings of Kodak Microelectronics Seminar-Interface (1986)
116. Dunn, D. D., Norris, K. C., Somerville, L. K., “DUV Photolithography Manufacturing,” Solid State Technology, p. 54 (1994)
117. Blash, A., TEL Mark 7 Operations Training Class (1997)
118. Tong, Q. Y., Lee, T. H., and Gosele, U., “The Role of Surface Chemistry in Bonding of Standard Silicon Wafers,” J. Electrochem. Soc., 144(1) (Jan. 1997)
119. Singer, P., Semiconductor International, p. 88 (October 1995)
120. Daou, T., Webber, G., “Photoresist Dispense Technology,” Semiconductor International, pp. 81–82 (1995)
121. U.S. Pat., 4,290,384, Perkin Elmer (1981)
122. U.S. Pat., 5,066,616, Hewlett-Packard (1991)
123. U.S. Pat., 4,590,094, IBM (1986)
124. U.S. Pat., 5,013,586, S.E.T (1991); U.S. Pat., 3,695,911, Aleo; U.S. Pat., 4,748,053, Hoya (1988)
125. Bokelberg, E. H., Pariseau, M. E., “Excursion Monitoring of Photolithographic Processes,” OCG Interface ’97 (Nov. 1997)
126. Cayton, J., Williams, D., “The Effects of Resist Processor Parallel Path Matching on Lithographic Process Control,” OCG Interface ’97 (Nov. 1997)
127. Waldo, W., “Box-Behnken Experiment,” Motorola Internal Publication (Oct. 1988)
128. Pratt, L. D., “Photoresist Aerosol Particle Formation during Spin-Coating,” SPIE, Vol. 1262, Advances in Resist Technology and Processing VII:170–179 (1990)
129. Helbert, J. N., Ch. 2, in: Handbook of VLSI Microlithography, (W. B. Glendinning and J. N. Helbert, eds.), pp. 41–140, Noyes Publication, Westwood NJ (1991)
130. Fischer, F., EDN 030, “Microlithography: Intermediate,” Motorola University Internal Course, Version 2.0 (Jan. 1997)
131. Reihani, M., “Environmental Effects on Resist Thickness Uniformity,” Semiconductor International, pp. 120–121 (June 1992)
132. “Lithography News,” “Sub-0.5µm Geometries Require Tighter Humidity Controls,” Semiconductor International, pp. 66–67 (Oct. 1996)
133. Dubno, W., Aspin, C., “Isolating an SVG Coater Track from the Work Environment,” Proceedings from Motorola’s Defect Density Conference, Book 3 (Dec. 1991)
134. FSI Lithographer, “Barometric Pressure Compensation (BPC) Reduces Long-term CD Variation,” 3.3:2 (July 1998)
135. Bornside, D. E., Brown, R. A., Ackmann, P. W., Frank, J. R., Tryba, A. A., and Geyling, F. T., “The Effects of Gas Phase Convection on Mass Transfer in Spin Coating,” J. Appl. Phys., 73(2) (1993)
136. Lyons, D., and Beauchemin, B. T., Jr., “A Unique Spin Coat Process for Positive Photoresists,” SPIE, 2438:726–736 (1995)
137. Fischer, F., and Miller, S., “Resist Popping Phenomena During OEBR,” Internal Publication, Motorola MOS12 (1994)
138. Dammel, R., Ch. 5 in: Diazonaphthoquinone-based Resists, (D. C. O’Shea, ed.), pp. 104–106, SPIE Optical Engineering Press, Bellingham, WA (1993)
139. Chen, X., and Pease, R. W. F., “Minimizing Alignment Error Induced by Asymmetric Resist Coating,” J. Vac. Sci. Technol. B, 14(6):3980–3984 (1996)
140. Shimada, H., Shimomura, S., Au, R., Miyawaki, M., and Ohmi, T., IEEE Transactions on Semiconductor Manufacturing, 7(3):389 (Aug. 1994)
141. Malhotra, S., Helbert, J. N., and Waldo, W. G., “Influence of Development Process Parameters Upon I-Line Resist Lithographic Performance,” KTI Interface, p. 369 (Nov. 1990)
142. Bruce, J. A., Dupuis, S. R., Gleason, R., and Linde, H., “Effects of Humidity on Photoresist Performance,” J. Electrochem. Soc., 144(9) (Sept. 1997)
143. Perera, T., SPIE, 1086:470 (1989)
144. Lee, F., “Translation of Semiconductor World Article,” pp. 58–61 (1992)
145. Steinwall, J., SFTRK02 Qual Report, Internal publication for track purchased under guidelines specified in a Motorola/TEL Equipment Purchase Agreement (1995)
146. Enloe, D., Semiconductor International, p. 82 (Feb. 1998)
147. Burggraaf, P., Semiconductor International, p. 100 (Jul. 1995)
148. Dunn, D., Norris, K., and Somerville, L., Solid State Technology, p. 53 (Sept. 1994)
149. Braun, A. E., Semiconductor International, p. 63 (Feb. 1998)
150. Tolliver, D., and Gallagher, D., Microelectronics Manuf. Tech., p. 17 (Oct. 1991)
151. Anderson, H., Solid State Technology, p. 131 (Oct. 1996)
152. DeJule, R., Semiconductor International, p. 74 (Aug. 1996)
153. Osgar, M., and Waldman, J., “Product News,” Solid State Technology, p. 148 (Oct. 1998); US Patent 5,102,010 (Apr. 7, 1992)
154. Helbert, J., in: “Interfacial Phenomena in the New and Emerging Technologies,” (W. Krantz and D. Wason, eds.), 2–41–2–44, Proceedings of the Workshop held at Department of Chemical Engineering, University of Colorado (May 1986)
155. Stover, H., Nagler, M., Bol, I., and Miller, V., SPIE, 470:22 (1984)
156. Bruning, J., ECS Proceedings of the Tutorial Symposium on Semiconductor Technology, 82–85:119–137 (1982)
157. Lin, B., Solid State Technology, pp. 105–112 (May 1983)
158. Lin, B., SPIE, 174:114 (1979)
159. Bolsen, M., Buhr, G., Merrem, H. J., and van Werden, K., Solid State Technology, pp. 83–88 (Feb. 1986)
160. Vidusek, D., and Legenza, W., SPIE, 539:103–114 (1985)
161. Ting, C., and Liauw, K., SPIE, 469:24 (1984)
162. Gellrich, N., Beneking, H., and Arden, W., J. Vac. Sci. Technol., 3:335–338 (1985)
163. Ray, G., Shiesen, P., Burriesci, D., O’Toole, M., and Liu, E., JECS, 129:2152–2153 (1982)
164. Jones, S., Chapman, R., Ho, Y., and Bobbio, S., in: Proceedings of the 1986 Kodak Microelectronics Seminar (1986)
165. Wolf, S., and Tauber, R. N., “Silicon Processing for VLSI Era,” Silicon Processing, 413, Lattice Press, Sunset Beach, CA (1986); Peterson, J. S., and Kozlowski, A. E., “Optical Performance and Process Characterizations of Several High Contrast Metal-ion-free Developer Processes,” SPIE, 469:46 (1984)
166. Lin, B. J., “Multilayer Resist Systems,” Ch. 6, in: Introduction to Microlithography, (L. F. Thompson, C. G. Willson, and M. J. Bowden, eds.) Vol. 219, ACS Symposium Series, Washington DC (1983)
167. Bruce, J. A., Burn, L. J., Sunding, D. L., and Lee, T. N., “Characterization of Linewidth Variation for Single and Multiple Layer Resist Systems,” in: Proceedings of Kodak Microelectronics Seminar-Interface (1986)
168. Pampalone, T. R., Kuyan, F. A., “Improving Linewidth Control over Reflective Surfaces Using Heavily Dyed Resists,” in: Proceedings of the 1985 Kodak Microelectronics Seminar Interface (1985)
169. Mack, C. A., “Dispelling the Myths About Dyed Photoresists,” Solid State Technology, 31:125 (1988)
170. Renschler, C. L., Stienfeld, R. E., and Rodriquez, J. L., “Curcumin As a Positive Resist Dye Optimized for g- and h-line Exposure,” JECS, 1586 (June 1987)
171. Coyne, R. D., and Brewer, T., “The Use of Anti-Reflection Coatings for Photoresist Linewidth Control,” Proceedings of the Kodak Microelectronics Interface, 83:40 (1983)
172. U.S. Patent No. 4,104,078, Moritz, H., and Paal, G. (1978)
173. MacDonald, S., Miller, R., Willson, C., Feinberg, G., Gleason, R., Halverson, R., MacIntyre, M., and Motsiff, W., “The Production of a Negative Image in A Positive Photoresist,” Proceedings of Kodak Interface (1982)
174. Long, M., and Newman, J., “Image Reversal Techniques with Standard Positive Photoresist,” SPIE Proceedings, Vol. 469, Advances in Resist Technology, p. 189 (1984)
175. Alling, E., and Stauffer, C., SPIE, 539:194 (1985)
176. Klose, H., Sigush, R., and Arden, W., IEEE Transactions on Electron Devices, ED-32:1654 (1985)
177. Moritz, H., IEEE Transactions on Electron Devices, ED-32:672 (1985)
178. Hartglass, C., in: Proceedings of Kodak Interface (1985)
179. Marriott, V., Garza, M., and Spak, M., SPIE, 771:221–230 (1987)
180. Spak, M., Mammato, D., Jain, S., Durham, D., “Mechanism and Lithographic Evaluation of Image Reversal in AZ 5214 Photoresist,” VII Int. Tec. Photopolymer Conf. (1985)
181. Smith, C., Lee, F., Malhotra, S., and Helbert, J., Motorola TEM (Dec. 1992)
182. Flanner, P., III, Subramanian, S., and Neureuther, A., SPIE Optical Microlithography V, 633:239 (1986)
183. Lyons, C., Long, D., Miura, S., Wood, R., and Olson, S., Solid State Technology, p. 95 (Nov. 1990)
184. Shamma, N., Sporon-Fiedler, F., and Lin, E., “A Method for Correction of Proximity Effect in Optical Projection Lithography,” Proceedings of KTI Interface, p. 145 (Oct. 1991)
185. Fitch, J., Saffert, R., Ruhde, A., and Wachman, E., Proceedings of KTI Interface, Vol. 9 (Oct. 1991)
186. Baker, D., Johnson, G., and Bane, R., SPIE, 1261:482 (1990)
187. Steinwall, J. A., Lambson, C., Helbert, J., Windsor, W., and Yanof, A., Motorola/RF Div./COM 1. TEM 96 Phoenix, AZ, p. 121 (1996)
188. Lin, Y.-C., Purdes, A. J., Saller, S. A., and Hunter, W. R., “Linewidth Control Using Anti-Reflective Coating,” IEDM International Electron Device Meeting, San Francisco CA, 399 (1982)
189. Nölscher, C., Mader, L., Schneegans, M., “High Contrast Single-Layer Resist and Antireflection Layers - An Alternative to Multilayer,” SPIE, 1086:242 (1989)
190. Pampalone, T. R., Camacho, M., Lee, B., and Douglas, E. C., “Improved Photoresist Patterning Over Reflective Topographies Using Titanium Oxynitride Antireflection Coatings,” J. Electrochem. Soc., 136:1181 (1989)
191. Martin, B., Odell, A. N., Lamb III, J. E., “Improved Bake Latitude Organic Antireflective Coatings for High Resolution Metallization Lithography,” SPIE, 1086:543 (1989)
192. Lin, Y. C., Marriott, V., Orvek, K., and Fuller, G., “Some Aspects of Anti-Reflective Coating for Optical Lithography,” SPIE, 469:30 (1984)
193. Kaplan, S., “Linewidth Control Over Topography Using Spin-On Ar Coating,” Proceedings of the KTI Microelectronics Seminar, 307 (1990)
194. Harrison, K., and Takemoto, C., “The Use of Anti-Reflection Coatings for Photoresist Linewidth Control,” in Proceedings of the Kodak Microelectronics Seminar, Interface 83 (1983)
195. Miura, S., and Lyons, C., “Reduction of Linewidth Variation Over Reflective Topography,” SPIE, 1674:147 (1992)
196. DeJule, R., Semiconductor International, p. 169 (July 1996)
197. DeJule, R., Semiconductor International, p. 76 (June 1997)
198. Graca, S., Sethi, S., and Fernandex, P., Motorola MOS-11, Austin, Texas: and Roman, B., King, C., and Ong, T. P., Motorola APRDL, Austin, Texas, OCG Interface 95, p. 53 (1995)
199. Schmidt, M., and Chen, G., Motorola TEM, p. 59.1 (1998); Chen, G., Motorola Shipley Photolithography Symposium (Oct. 1998)
200. Filipiak, S., Motorola APRDL Internal Report, Austin, Texas (Apr. 4, 1996)
201. Hilfiker, J., Synowicki, R., and Bungay, D., Solid State Technology, p. 101 (Oct. 1998)
202. Bencher, C., Ngai, C., Roman, B., Lian, S., and Vuong, T., Materials, p. 109 (March 1997)
203. Lin, B. J., Underhill, J. A., Sunding, D., and Peck, B., “Electrical Measurement of Submicrometer Contact Holes,” SPIE, 921:164 (1988)
204. Mack, C. A., “Prolith: A Comprehensive Optical Lithography Model,” Optical Microlith. IV, Proc. SPIE, 538:207 (1985)
205. Cuthbert, J. D., “Optical Projection Printing,” Solid State Technology, 59 (Aug. 1977)
206. Brunner, T., “Optimization of Optical Properties of Resist Processes,” SPIE, 1466:297 (1991)
207. Lin, B. J., “Quarter- and Sub-Quarter-Micron Optical Lithography,” Patterning Science and Technology II, (W. Greene and G. J. Hefferton, eds.), The Electrochemical Society, Inc., Pennington, NJ, 3 (1992)
208. Singer, P., “Searching for Perfect Planarity,” Semiconductor International, 43 (March 1992)
209. Schiltz, A., and Pans, M., “Two-Layer Planarization Process,” J. Electrochem. Soc., 133:178 (1986)
210. Nagy, A., and Helbert, J., “Planarized Inorganic Interlevel Dielectric for Multilevel Metallization - Part I,” Solid State Technology, p. 53 (Jan. 1991)
211. Nagy, A., and Helbert, J., “Planarized Inorganic Interlevel Dielectric for Multilevel Metallization - Part II,” Solid State Technology, p. 77 (March 1991)
212. Comello, V., “Wafer Processing News,” Semiconductor International, p. 28 (March 1990)
213. Saxena, A., and Pramanik, D., “Manufacturing Issues and Emerging Trends in VLSI Multilevel Metallizations,” Proc. V-MIC Conf. IEEE, p. 9, New York (1986)
214. Thoma, M. J., et al., “A 1.0 µm CMOS Two Level Metal Technology Incorporating Plasma Enhanced TEOS,” Proc. V-MIC Conference, IEEE, 20, New York (1987)
215. Wang, D. N. K., Somekh, S., and Maydan, D., “Advanced CVD Technology,” Proc. 1st Int’l Symp. on ULSI Science and Technology, 712 (1987)
216. Stillwagon, L. E., Larson, R. G., and Taylor, G. N., “Spin Coating and Planarization,” SPIE, 771:186 (1987)
217. Stillwagon, L. E., and Larson, R. G., “Fundamentals of Topographic Substrate Leveling,” J. Appl. Phys., 63:5251 (1988)
218. Wilson, R. H., and Piacente, P. A., “Effect of Circuit Structure on Planarization Resist Thickness,” Proc. V-MIC Conf. IEEE, p. 30, New York (1984)
219. Stillwagon, L. E., “Planarization of Substrate Topography by Spin-Coated Films: A Review,” Solid State Technology, 67 (June 1987)
220. Tamechika, E., Matsuo, S., Komatsu, K., Takeuchi, Y., Mimura, Y., and Harada, K., “Investigation of Single Sideband Optical Lithography Using Oblique Incidence Illumination,” J. Vac. Sci. Technol. B, 10:3027 (1992)
221. Greeneich, J., JECS, Extended Abstract, 80-81:261 (1980)
222. Roland, B., Lombaerts, R., Jakus, C., and Coopmans, F., SPIE, 771:69 (1987)
223. Coopmans, F., and Roland, B., Solid State Technology, 93 (June 1987)
3 Lithography Process Monitoring and Defect Detection
Fourmun Lee, Motorola, Inc., Chandler, Arizona
1.0 OVERVIEW
The importance of monitoring the performance of a manufacturing line through real-time, in-line product inspections has been discussed in the literature.[1][2] Such inspections have been shown to be effective for identifying defect and process excursions, finding marginal processes and process integration issues, and identifying improvement opportunities. Control and monitoring of the photolithography processes are critical to achieving and maintaining high device yield. In the manufacture of modern integrated circuits, the wafers in process are subjected to as many as 35 lithography steps. At each of these steps, there are opportunities for yield loss induced by process shifts or defects. These problems can be caused by drift in tool parameters, material problems, or environmental factors, and can result from an unplanned change in a single process parameter or from the interaction of two or more process parameters. Examples of each parameter type and its potential impact on the lithography process are shown in Table 1. The impact of individual process parameters on pattern fidelity is discussed in Ch. 2, Secs. 3 and 4.
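To make the cumulative effect of many lithography steps concrete, the short sketch below compounds a per-step yield over the 35 steps mentioned above. The per-step yield values, and the assumption that every step passes a die independently and with the same probability, are illustrative assumptions and not data from this chapter.

```python
# Illustrative sketch: compound a per-step yield over many lithography steps.
# Assumption (not from the chapter): every step passes a die independently
# and with the same probability.

def cumulative_yield(per_step_yield: float, n_steps: int) -> float:
    """Return the fraction of die that survive n_steps independent steps."""
    if not 0.0 <= per_step_yield <= 1.0:
        raise ValueError("per_step_yield must lie between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return per_step_yield ** n_steps


if __name__ == "__main__":
    # Hypothetical per-step yields; 35 is the step count quoted in the text.
    for per_step in (0.999, 0.995, 0.99):
        total = cumulative_yield(per_step, 35)
        print(f"per-step yield {per_step:.3f} -> yield after 35 steps {total:.3f}")
```

Under these assumptions, a loss of only 1% per step leaves roughly 70% of die after 35 steps, which illustrates why small excursions at individual lithography steps are worth detecting early.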
Table 1. Lithography Parameters and Their Impact on the Process

Parameter Type   Source of Variation   Process Parameters Impacted
Tool             Stepper               Focus; Alignment; Exposure dose; Defectivity
Tool             Track (coat)          HMDS Prime temperature; Temperature; Spin speed; Exhaust; Post-exposure bake temperature; Defectivity
Tool             Track (develop)       Temperature; Develop; Defectivity
Material         Resist                Viscosity; Photospeed; Defectivity
Material         Developer             Developer concentration; Surfactant concentration; Defectivity
Material         Water                 Defectivity
Material         Wafer                 Substrate flatness; Substrate reflectivity; Substrate topography; Defectivity
Environmental    Fab                   Temperature; Humidity; Defectivity
Two key observations can be made from Table 1: (1) The lithography process is sensitive to a large number of parameters, some of which are not under the direct control of the photo process; (2) All parameters have a defectivity component which can directly impact the performance of the photo process.
Figure 1 shows how the minimum critical dimension (CD) required for device fabrication has decreased over time. Although not shown in the figure, it is generally acknowledged that the trend toward higher levels of integration has kept die size the same or even increased it over time. As a result of these trends, devices are becoming more sensitive to defect density and to maximum defect size. To achieve and maintain yields at economically acceptable levels, it is imperative to closely monitor and reduce photo process defectivity. This need, together with general wafer inspection needs, has motivated the development of several wafer inspection technologies, which are described in the next section.
Figure 1. Evolution of minimum critical dimensions (CD) as a function of time.
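The link between die size, defect density, and yield described above can be illustrated with a first-order Poisson yield model, Y = exp(-A·D). This model is a common textbook approximation and is an assumption introduced here for illustration; it is not taken from this chapter, and the die areas and defect densities below are hypothetical.

```python
import math

# Illustrative sketch of a first-order Poisson yield model, Y = exp(-A * D).
# Assumption (not from the chapter): killer defects land randomly and
# independently, and any one of them on a die makes that die fail.

def poisson_yield(die_area_cm2: float, defect_density_per_cm2: float) -> float:
    """Estimate the fraction of die that contain no killer defect."""
    if die_area_cm2 < 0 or defect_density_per_cm2 < 0:
        raise ValueError("area and defect density must be non-negative")
    return math.exp(-die_area_cm2 * defect_density_per_cm2)


if __name__ == "__main__":
    # Hypothetical values: a larger die, or a higher density of defects big
    # enough to matter at a smaller CD, both lower the estimated yield.
    for area, density in ((1.0, 0.5), (2.0, 0.5), (2.0, 1.0)):
        y = poisson_yield(area, density)
        print(f"A = {area:.1f} cm^2, D = {density:.1f} /cm^2 -> yield {y:.3f}")
```

In this simplified picture, shrinking the critical dimension raises the effective defect density because smaller particles become yield-limiting, while constant or growing die area keeps the exponent from shrinking; both effects push the estimated yield down unless defectivity is reduced.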
2.0 DEFECT DETECTION TOOLS
2.1 History
In the 1970s and early 1980s, wafer inspection was performed manually using optical microscopes, sometimes with the assistance of bright lights. Such inspections were possible because critical dimensions were on the order of microns and yield-limiting defects were easily seen. This activity was facilitated by the use of snake patterns and other repeating structures, which allowed fab personnel to find defects more easily. However, it was recognized early on that this work was slow and tedious, and that the results were subject to considerable variance. The reduction in feature
size together with the increase in pattern complexity provided additional motivation for the development of automated inspection tools.
The first automated inspection tools appeared in the early 1980s. The tools offered were based on two different detection techniques: brightfield image comparison and darkfield light scattering. An example of a brightfield tool from this era is the KLA-2020. It was basically an automated microscope that looked for defects by comparing brightfield images and looking for differences between adjacent die. Its sensitivity was better than 2 microns, but it suffered from low inspection speed (~0.025 8-inch wafer equivalents/hour, or roughly 40 hours per wafer). Darkfield inspection tools from this time period, such as the Tencor-3000, relied on detection of light scattered from particles as a laser is scanned across the wafer surface. Inspection speed was quite high (~7.5 8-inch wafer equivalents/hour, or about 8 minutes per wafer), but sensitivity was limited to ~3 microns and the tools could only handle unpatterned wafers. Initial adoption of this new technology was slow due to the poor business conditions in the industry, high tool cost, and the limited capability of the tools themselves. As a result, manual wafer inspection continued to be commonplace.
In the late 1980s, significant advances were made in metrology tools for measuring on-wafer defects on both unpatterned and patterned wafers. Major improvements were made in inspection speed, sensitivity, and overall capability, made possible primarily by advances in chip technologies which these tools helped develop. Several companies introduced new tools for measuring patterned and unpatterned wafers: KLA introduced the 21XX family of patterned wafer inspectors, Tencor introduced the 7XXX series tools, Inspex provided the EX-3000, and OSI provided the IQ-165. The tools from Tencor and Inspex were based on light scattering combined with rudimentary image processing for enhanced defect sensitivity.
The advent of these tools made it possible to perform real-time inline production monitoring for the first time.
Defect detection technology evolved rapidly in the 1990s and is continuing to improve at a rapid pace. The latter part of the 1990s witnessed the merger of the two leading inspection tool suppliers and a convergence of inspection technologies. In 1996, KLA introduced a 400 megapixel version of the 21XX inspector with improved sensitivity. The same year, Tencor introduced the AIT, a laser scattering inspection tool combining high wafer throughput with high speed image processing. This was followed by the merger of KLA and Tencor in 1997. The rapid adoption of chemical mechanical polish (CMP) as a volume manufacturing process drove the development of inspection technologies to handle
die-to-die color variation and low contrast defects. To provide better sensitivity for CMP inspections, KLA-Tencor developed ultra-broadband (UBB) illumination optics, segmented auto-threshold (SAT), and the 2230 platform. The 2230 uses high speed image processing with darkfield illumination. Applied Materials introduced the WF-7XX platform in 1997. The WF-736 is the first tool to combine darkfield and brightfield detection on a single platform.
2.2 Inspection Equipment Requirements
Wafer inspection tools used in production applications are typically judged on the following criteria:[3][4]
• sensitivity
• throughput
• stability
• automation
• reliability
• cost of ownership
• false defect rate
The weight assigned to each factor is dependent on the target application. For example, overall sensitivity might be considered more important for a line monitoring tool while throughput and cost of ownership might be the primary factors for a process monitoring tool.
Sensitivity is a measure of a tool’s ability to capture various types of defects for different process layers. The defects may be located on the wafer surface, partially embedded in a deposited film, or submerged beneath the wafer surface. Different process layers will present different background noise levels which must be distinguished from the actual defects. The physical characteristics of the defects together with the surrounding pattern matrix will determine the difficulty of detecting a defect.
Stability refers to the ability of the tool to repeatably detect the same types of defects in the same size range on similar substrates. Minor changes in the process which are typical of normal process variances should not affect the measured defect density by more than 10%. The false defect rate should be zero using optimized inspection parameters.
Reliability is measured by uptime, time between assists, and the ability of the tool to continuously perform its intended function.
Throughput is a measurement of the number of wafers scanned per hour. In most cases, this value will depend on the inspection sensitivity and area inspected per wafer. Included in this value is the overhead required for loading, alignment, and unloading of wafers under test. In a
typical production fab, sufficient tooling should be purchased to provide the capacity to inspect three to five wafers per lot while sampling 20–30% of the lots in process. Assessing throughput is more complicated when auto-defect classification (ADC) is provided on the tool. ADC reduces throughput by increasing the residence time of each wafer on the inspection tool, by an amount that depends on the number of defects which must be redetected. On the other hand, ADC reduces the need for manual defect review by fab personnel.
Automation refers to the ability of the tool to perform all tasks for which it was designed with minimal assistance from a human operator. A fully automated tool includes automatic wafer/lot identification using an OCR or bar-code reader, automatic recovery from failure, and automated setup. Except for selecting alignment sites and specifying wafer layout information and sample plans, selection of parameters for normal inspections should require minimal manual optimization. However, manual setup modes must be available for jobs requiring special optimization.
Cost of ownership (COO) combines the capital cost of the tool and maintenance costs with considerations for throughput, reliability, and automation. COO is typically calculated as a cost per wafer inspected and is heavily impacted by percent utilization, setup time, and engineering activities.
Selection of the proper inspection tool relies on a detailed understanding of the types and sizes of defects which need to be detected, the processes to be monitored, the types of wafers to be inspected, and the capabilities of the wafer inspection tools. The ability of an inspection tool to detect specific types of defects is highly dependent on the detection technique employed by the tool. Inspection tools are available which employ one or more of the following techniques: digital image processing, Fourier filtering, and laser scattering.
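The capacity and cost considerations above can be illustrated with a short calculation. The sketch below is only a simplified model: the function names, the fab loading numbers, and the cost figures are hypothetical, and a real COO model would include many more terms (setup time, engineering use, consumables).

```python
import math

def wafers_inspected_per_day(lots_per_day, lot_sample_fraction, wafers_per_sampled_lot):
    """Average number of wafers sent to one inspection step each day."""
    return lots_per_day * lot_sample_fraction * wafers_per_sampled_lot

def tools_required(wafers_per_day, throughput_wph, utilization, hours_per_day=24.0):
    """Smallest whole number of tools whose capacity covers the daily load."""
    capacity_per_tool = throughput_wph * hours_per_day * utilization
    return math.ceil(wafers_per_day / capacity_per_tool)

def cost_per_wafer(annual_capital_cost, annual_maintenance_cost, wafers_per_year):
    """Simplified cost of ownership expressed per wafer inspected."""
    return (annual_capital_cost + annual_maintenance_cost) / wafers_per_year

# Hypothetical fab: 200 lots/day reach the inspection step, 25% of lots are
# sampled, 4 wafers are inspected per sampled lot, and each tool runs at
# 5 wafers/hour with 70% utilization.
load = wafers_inspected_per_day(200, 0.25, 4)
n_tools = tools_required(load, throughput_wph=5.0, utilization=0.70)
print(load, n_tools)
```

Because utilization appears in the denominator of the capacity calculation, a tool that spends much of its time in setup or engineering use raises both the tool count and the cost per wafer.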
In the selection of an inspection tool, it is recommended that target tools be evaluated using wafers representative of the process to be monitored to confirm adequate sensitivity and throughput. Evaluation methodology and an example of an actual tool evaluation are provided in Refs. 5 and 6.
2.3 Detection Techniques
Inspection tools have been developed based on the following detection technologies: light scattering, pattern analysis, and optical pattern filtering. Each technology provides some unique advantages depending on the application. Table 2 compares some of the key characteristics of these inspection technologies.
Table 2. Comparison of Inspection Technologies
Detection technique: Pattern analysis
    Strengths: High sensitivity; capable of inspecting unpatterned and patterned wafers; sensitivity to most defects (surface and sub-surface)
    Weaknesses: Slow compared to laser scanning tools; low depth of focus
    Machines: KLA 20XX, 21XX
Detection technique: Light scattering
    Strengths: Fast; capable of inspecting unpatterned and patterned wafers
    Weaknesses: Poor sensitivity to planar and subsurface defects
    Machines: Tencor 6XXX, 7XXX; Inspex 8XXX; KLA-Tencor AIT
Detection technique: Optical pattern filtering
    Strengths: Large depth of focus; high sensitivity
    Weaknesses: Can’t inspect unpatterned wafers; application limited to repetitive geometries
    Machines: OSI IQ-165
The latest patterned wafer inspection tools employ a variety of imaging technologies together with software or hardware filtering, and signal or image processing to achieve good sensitivities at all process layers. Both brightfield and darkfield tools can be used effectively for general line monitoring applications (both CMP and non-CMP layers). Darkfield tools are especially well-suited for inspection of CMP layers due to their superior immunity to color noise. The lower data processing requirements of darkfield detection provide these tools with a distinct throughput advantage over brightfield tools. On the other hand, brightfield tools are superior to darkfield tools for line monitoring applications where there is a need to detect all defects generated within a process module. In some cases, brightfield tools also exhibit better sensitivity. Selection of the optimum tool for an application depends on understanding both the
types of defects to be detected and the characteristics of the wafers at the target inspection steps.
Brightfield Detection. Brightfield detection relies on illuminating a wafer at normal incidence and evaluating the resulting reflected image. The reflected image is digitized and converted into a gray scale image. Using image processing, the gray scale candidate image is compared with the reference image. Differences are then compared with a threshold. When a difference is greater than the threshold value, the event is flagged. To ensure that the defect is assigned to the correct die, an event is only reported as a defect when it is flagged twice in the same location (double detection algorithm).[7]
Examples of tools which use this approach are KLA-Tencor’s model 2135 and 2138 tools (see Fig. 2), representing the latest offerings in its 21XX series of brightfield defect inspection tools. The basic optics setup for the 21XX series tools is shown in Fig. 3. Illumination from a white light source is projected onto the wafer through an objective selected during recipe creation. The reflected image is returned through the objective and passes through a magnification changer onto a high speed time-delay integration (TDI) sensor. Defects must be optically resolved in order to be detected. A neutral density filter is used to adjust the intensity of the light projected on the wafer, which affects the image contrast. Defect sensitivity can be increased by performing inspections at higher magnification at the expense of reduced inspection speed.
In operation, the wafer to be inspected is placed beneath the objective lens and is scanned as the stage is moved continuously along the x-axis. When the last die in the row is reached, the stage is stepped in the y-axis and the scan is repeated along the x-axis in the reverse direction, forming a serpentine scan pattern as shown in Fig. 4.
This scanning is repeated until all the die specified in the sampling plan have been inspected. The standard operating software provides the capability to omit any die from the inspection, a feature which is useful for bypassing embedded test patterns and reducing test time. Inspection time can also be reduced through die subsampling or by eliminating rows of die from the inspection.
The effectiveness of brightfield defect detection has been demonstrated by its ability to reliably capture defects at the current process level as well as defects present from previous processing. As a result, this is a good technique for process module monitoring where there is a need to find defects generated by a sequence of process steps. Since capture of defects is based both on contrast and on size, very small defects can be
detected if they have high contrast compared with their background. This technique provides detection capability for defects located on the wafer surface, partially embedded in a deposited film, or totally submerged beneath the wafer surface. In the 2135, a narrow band light source (Xe-Hg arc lamp) provides good sensitivity on wafers where the local dielectric thickness variations are small (for example, wafers that have not been subjected to chemical-mechanical polishing).
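The threshold comparison and double detection rule described for brightfield detection can be sketched in a few lines. This is an illustrative reading of the idea, not the vendor's algorithm: each die image is compared with both of its neighbors, and a pixel is reported only when it is flagged in both comparisons. The function names and the tiny gray scale images are hypothetical.

```python
def flag_differences(candidate, reference, threshold):
    """Return the set of (x, y) pixels where |candidate - reference| > threshold."""
    flagged = set()
    for y, (row_c, row_r) in enumerate(zip(candidate, reference)):
        for x, (c, r) in enumerate(zip(row_c, row_r)):
            if abs(c - r) > threshold:
                flagged.add((x, y))
    return flagged

def double_detect(dies, threshold):
    """Report (die_index, x, y) only when a pixel differs from BOTH neighboring die."""
    defects = []
    for i in range(1, len(dies) - 1):
        vs_prev = flag_differences(dies[i], dies[i - 1], threshold)
        vs_next = flag_differences(dies[i], dies[i + 1], threshold)
        for (x, y) in sorted(vs_prev & vs_next):
            defects.append((i, x, y))
    return defects

# Five die in a row; only die 2 carries a bright defect at pixel (1, 0).
clean = [[100, 100, 100], [100, 100, 100]]
bad = [[100, 160, 100], [100, 100, 100]]
dies = [clean, clean, bad, clean, clean]
print(double_detect(dies, threshold=30))
```

A single comparison between die 2 and die 3 cannot tell which of the two die holds the defect; requiring the second, independent flag assigns it to die 2. In this simplified sketch the first and last die in the row are used only as references.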
Figure 2. KLA-Tencor 2138. Tool shown is configured to handle SMIF pods.
Figure 3. Detection optics used in the KLA-Tencor 21XX series tools.
Figure 4. Typical die scan pattern used by the KLA-Tencor 21XX series tools.
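The serpentine die scan of Fig. 4, including the option to omit selected die, can be expressed as a simple ordering routine. The sketch below assumes a rectangular die grid indexed by (column, row); the function name and the skip list are illustrative only.

```python
def serpentine_order(n_cols, n_rows, skip=()):
    """Return die (col, row) indices row by row, reversing the x direction on alternate rows."""
    skipped = set(skip)
    order = []
    for row in range(n_rows):
        cols = range(n_cols) if row % 2 == 0 else range(n_cols - 1, -1, -1)
        for col in cols:
            if (col, row) not in skipped:
                order.append((col, row))
    return order

# A 3 x 2 die grid, with the die at (1, 0) omitted (for example, a test pattern).
print(serpentine_order(3, 2, skip=[(1, 0)]))
```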
There are some defect types which may be difficult to detect using brightfield techniques, such as scratches in an oxide layer or surface particles in a densely patterned area. Surface particles can be difficult to detect because they tend to be located out of the plane of focus of the wafer surface. As shown in Fig. 5, a scratch in oxide appears as a transparent defect in a transparent material. Both of these defect types are difficult to detect in brightfield due to low contrast between the defect and the background. Such defects are more easily detected using darkfield detection because they have a significant scattering cross section. (See next section.) When a scratch is filled with metal, the defect has high contrast and the detection sensitivity is reversed: the defect is highly visible in brightfield but may be difficult to locate in darkfield due to weak scattering of grazing incidence illumination.
Figure 5. Examples of CMP scratches. There is a significant difference in brightfield contrast between (a) a CMP scratch in oxide and (b) a CMP oxide scratch filled with metal.
Brightfield inspections of metal layers can be challenging due to false defects and missed detection. Metal features can be very grainy, producing large, noisy difference values where there is no defect present. At the same time, the surrounding dielectric can contain important defects that produce a small difference value. The result is poor capture of important defects (high threshold) or high capture rate with many false defects (low threshold). To reduce false and missed detections in these cases, KLA-Tencor has developed software called segmented auto threshold (SAT). With SAT, the image is segmented based on brightness and noise of each pixel. A dynamic threshold is applied to each segment which tracks with the segment’s noise level. The Inspex tools use a similar technique to aid in the detection of defects.
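The idea of a threshold that tracks the noise of each image segment can be sketched as follows. This is not KLA-Tencor's SAT implementation; it is a minimal illustration that assumes segmentation by reference brightness only, and uses the median absolute difference in each segment as a hypothetical noise estimate. All names, constants, and pixel values are invented for the example.

```python
from statistics import median

def segment_of(gray, n_segments=4, max_gray=255):
    """Assign a pixel to a brightness segment numbered 0 .. n_segments-1."""
    return min(gray * n_segments // (max_gray + 1), n_segments - 1)

def segmented_thresholds(reference, difference, k=4.0, floor=5.0, n_segments=4):
    """Per-segment threshold: k times the median |difference|, never below floor."""
    buckets = {s: [] for s in range(n_segments)}
    for ref_px, diff_px in zip(reference, difference):
        buckets[segment_of(ref_px, n_segments)].append(abs(diff_px))
    return {s: max(floor, k * median(v)) if v else floor for s, v in buckets.items()}

def detect(reference, difference, thresholds, n_segments=4):
    """Indices of pixels whose |difference| exceeds the threshold of their own segment."""
    return [i for i, (r, d) in enumerate(zip(reference, difference))
            if abs(d) > thresholds[segment_of(r, n_segments)]]

# Eight dark dielectric pixels (quiet) followed by eight bright, grainy metal pixels.
reference = [40] * 8 + [220] * 8
difference = [1, -1, 2, -2, 1, -1, 0, 12, 20, -18, 22, -21, 19, -20, 18, -22]
thresholds = segmented_thresholds(reference, difference)
print(detect(reference, difference, thresholds))
```

With a single global threshold, a value high enough to ignore the metal grain (above 22 here) would miss the small dielectric defect at index 7, while a value low enough to catch it would flag every metal pixel.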
SAT can be effectively used to mask out the color variations that result from CMP. CMP-processed dielectric layers can have a great deal of thickness variation, resulting in extensive background color variations when viewed in brightfield. This can make inspection of post-CMP process layers very difficult, as the color variation can produce a large number of false and nuisance defects. The autothresholding provided by SAT tracks the color variation levels on the wafer and increases the sensitivity in areas with no die-to-die color variation. However, SAT only adapts to the color variation levels; it does not eliminate color variation from the collected images. When color differences are severe, they may cause wafer alignment failures and false defects.
On the KLA-Tencor 2138, ultra broad band (UBB) illumination is used to cancel the color variation. The background color is due to interference of the light reflected from the surface of the dielectric with that reflected from the surface beneath it, so variation in dielectric thickness causes variation in color. Figure 6 shows the output spectrum from the standard illuminator and the UBB illuminator. Use of a broader bandwidth of light reduces the coherence length of the light, making the interference less sensitive to the dielectric thickness. The result is a more consistent background color across the wafer. SAT can be used in conjunction with UBB to further increase signal to noise, resulting in increased defect sensitivity.
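The effect of bandwidth on thickness-induced brightness variation can be illustrated with the standard single-film interference formula; the coherence length scales roughly as the square of the center wavelength divided by the bandwidth, so a wider band averages over more interference fringes. The sketch below is a simplified model under stated assumptions: a non-absorbing oxide (n = 1.46) on a non-absorbing substrate (n = 3.9), normal incidence, flat illumination spectra, and hypothetical band edges and thickness range. It does not represent the actual 21XX illuminator spectra of Fig. 6.

```python
import cmath
import math

def reflectance(thickness_nm, wavelength_nm, n_film=1.46, n_sub=3.9):
    """Normal-incidence reflectance of one transparent film on a substrate."""
    r01 = (1.0 - n_film) / (1.0 + n_film)        # air/film Fresnel coefficient
    r12 = (n_film - n_sub) / (n_film + n_sub)    # film/substrate Fresnel coefficient
    beta = 2.0 * math.pi * n_film * thickness_nm / wavelength_nm
    phase = cmath.exp(-2j * beta)                # round-trip phase in the film
    r = (r01 + r12 * phase) / (1.0 + r01 * r12 * phase)
    return abs(r) ** 2

def band_average(thickness_nm, lo_nm, hi_nm, steps=200):
    """Reflectance averaged over a flat spectrum from lo_nm to hi_nm."""
    total = 0.0
    for i in range(steps):
        wavelength = lo_nm + (hi_nm - lo_nm) * (i + 0.5) / steps
        total += reflectance(thickness_nm, wavelength)
    return total / steps

def swing(lo_nm, hi_nm, thicknesses):
    """Peak-to-peak change in band-averaged reflectance over the given thicknesses."""
    values = [band_average(t, lo_nm, hi_nm) for t in thicknesses]
    return max(values) - min(values)

thicknesses = range(1000, 1201, 10)           # assumed oxide thickness range, nm
narrow = swing(540.0, 560.0, thicknesses)     # assumed narrow-band illumination
broad = swing(400.0, 700.0, thicknesses)      # assumed broadband illumination
print(round(narrow, 3), round(broad, 3))
```

Under these assumptions the broadband swing comes out well below the narrow-band swing, which is the behavior the text attributes to UBB illumination.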
Figure 6. Comparison of light output from standard illuminator and broad band illuminator used in KLA-Tencor 21XX tools.
In photo process monitoring, either the 2135 or the 2138 can be used with excellent sensitivity. Since unpatterned silicon wafers are the substrates of choice for normal process testing, the type of illumination employed is not a consideration. These tools are capable of inspecting both patterned and unpatterned wafers. Patterned wafers are preferred because they facilitate offline defect review using an optical defect review station or SEM.
Darkfield Detection. Darkfield detection relies on scattering of laser light from the wafer under measurement and collection of the scattered light using one or more detectors. The technique relies on detecting differences in scattered light intensity, which is a function of the scattering cross-section of the background device pattern and of any defects which may be present. Particles, leftover CMP slurry, and scratches on the wafer are examples of defects which will scatter light. Sensitivity is highest for surface defects or defects with non-zero topography (i.e., defects that extend above the plane of the wafer surface). This technique is designed to obtain information primarily from the wafer surface, either from defects in surface layers or from defects buried just beneath the surface.
In contrast with brightfield techniques, darkfield techniques do not require resolving the image of the defect to detect it. Illumination beam diameters of 1-mm combined with modest data rates provide high speed inspection capability. On a typical darkfield tool, the pixel size can be five to ten times as large as the smallest detectable defect. This pixel size leveraging allows darkfield inspection tools to scan more area per unit time, resulting in significantly higher inspection speed and throughput. Another advantage provided by darkfield detection is the larger depth-of-focus relative to brightfield techniques. Tools employing laser light scattering are capable of measuring unpatterned and patterned wafers.
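The throughput benefit of pixel size leveraging follows from simple geometry: the number of pixels needed to cover a wafer falls as the square of the pixel size. The sketch below assumes square pixels, a fixed pixel data rate, and hypothetical pixel sizes; it ignores stage overhead and edge exclusion.

```python
import math

def pixels_per_wafer(wafer_diameter_mm, pixel_um):
    """Approximate number of square pixels needed to tile a circular wafer."""
    area_um2 = math.pi * (wafer_diameter_mm * 1000.0 / 2.0) ** 2
    return area_um2 / pixel_um ** 2

def pixel_count_reduction(leverage):
    """Factor by which the pixel count drops when the pixel is 'leverage' times larger."""
    return leverage ** 2

# Hypothetical comparison for a 0.25 micron defect on a 200 mm wafer:
# a resolving pixel of 0.25 micron versus a darkfield pixel five times larger.
resolving = pixels_per_wafer(200, 0.25)
leveraged = pixels_per_wafer(200, 1.25)
print(resolving / leveraged, pixel_count_reduction(10))
```

At a fixed data rate, a five- to ten-fold larger pixel therefore corresponds to roughly 25 to 100 times fewer pixels to acquire and process per wafer.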
Unpatterned wafer measurements are typically used to monitor the particle performance of process tools. For unpatterned wafer measurements, optimum sensitivity is obtained by performing the measurement on films whose thickness maximizes the scattered light signal. The film thickness which optimizes sensitivity depends on the refractive index of the material and wavelength of laser light employed by the inspection tool. The inherent immunity of darkfield detection to film thickness variations makes it ideally suited for inspection of post-CMP material. An example of CMP-induced color variation is shown in Fig. 7. The optical photos show the typical die-to-die color difference resulting from 100 to 200Å oxide film thickness variations. Darkfield tools with patterned wafer inspection capability are available from KLA-Tencor, Inspex, and Applied Materials.
Figure 7. Photos illustrating die-to-die CMP-induced color variation.
The AIT system from KLA-Tencor uses a method called double darkfield detection (both illumination and collection angles are greater than 45 degrees).[8] As shown in Fig. 8, illumination provided by a 488 nm argon ion laser is scattered off the wafer at a grazing angle and detected using optics positioned near the wafer horizon that funnel light to two collectors (photomultiplier tubes: PMT). By positioning the collectors near the horizon, 90 degrees away from the direct beam, scatter from surface roughness and pattern is suppressed while allowing scatter from particles to reach the collectors. Each PMT provides a data collection channel which can be independently configured with programmable spatial filters (PSF) and polarizers. This allows for optimization of signal-to-background levels for different defect types and varying pattern densities. Use of a PSF in one or both collection channels can increase the capture rate of defects in memory arrays and other repetitive regions. In the AIT, grazing angle illumination essentially limits detection capability to those defects present on the wafer surface. This is an advantage for unit process monitoring or tool qualification applications where one is only interested in detecting defects generated by the most recent process step.
To determine whether a defect is present, the current signal produced from the PMT is compared to that from an adjacent die or cell. If the difference between the signals is great enough, it is considered to be a defect. A defect only needs to be able to scatter light to be detected. Hence, a scratch in a dielectric layer, which can be difficult to detect in brightfield, can be detected with relative ease in darkfield. Planar defects such as extra or missing pattern produce weak but discernible signal differences. Detection of planar defects is accomplished using real time die-to-die signal comparison.
Compared with brightfield systems, this comparison can be performed more rapidly because less data must be processed. A “larger” pixel can be used because the defect does not have to be resolved. Data acquisition is synchronized to the die size so that no registration of the signals from neighboring die is required.
Figure 8. Optical path in KLA-Tencor AIT wafer inspection tool.
Another darkfield inspection system is the Inspex Eagle (see Fig. 9). In this tool, scattered light is detected by a CCD array instead of a PMT. Using a programmable Fourier mask (PFM), diffracted light from patterned wafers is selectively filtered to reduce background noise. Random scatter from defects passes through the filter while periodic scattering is blocked, resulting in increased capture rate of pattern defects and particles. The PFM operates under software control to adapt filtering for various product layouts and process levels. Advanced pattern suppression (APS) software, which performs functions similar to SAT on the KLA-Tencor tools, provides automatic definition of regional thresholds for optimum sensitivity. Image subtraction is applied to the output from the CCD to identify the defects. On this tool, the ability to detect defects does not depend on laser spot size or precise control of illumination angle. In addition, illumination angle can be adjusted to optimize sensitivity for individual process steps.
The Applied Materials WF-731 and WF-736 systems (see Fig. 10) perform darkfield detection using a normal incidence laser. Like the AIT, illumination is provided by a 488 nm argon ion laser. Figure 11 shows the major components of the detection optics used in the 73X platforms.[9] Common to all 73X tools is a detection method called perspective darkfield imaging (PDI). It is implemented using four PMT detectors positioned at 90 degree intervals, oriented at 45 degrees from the wafer normal. Four concurrent high signal-to-noise images are generated for each location on the wafer. The signals from each detector are compared, with weights assigned to each channel during initial recipe setup. Image processing is used to combine the results from each detector to assign a grade to each defect. Grade is defined as the probability that a defect is real. Use of a normal incidence laser is advantageous because shadowing effects can be avoided.
With grazing incidence illumination, a defect behind a high-aspect-ratio pattern feature may, depending on the illumination direction, scatter less strongly than the same defect in front of the pattern feature. With normal illumination and four evenly distributed detectors, the probability of that defect scattering light into a detector can increase. Like the 21XX tools, inspection speed and sensitivity are closely coupled. Sensitivity is set by selection of the inspection magnification. Increasing the magnification results in a linear increase in the amount of pixel data that must be processed. As a result, scan speed and throughput are inversely proportional to inspection sensitivity.
Figure 9. Inspex Eagle wafer inspection tool.
Figure 10. Applied Materials WF-73X wafer inspection tool.
Figure 11. Schematic of inspection optics used in Applied Materials WF-73X series tools.
In the WF-736, brightfield and darkfield detection are combined in a single system. A method called Integrated Normal Perspective (INP) is made possible through the addition of a fifth detector. The fifth detector is positioned directly above the wafer to provide brightfield collection of the retro-reflected light. Images from all five detectors are analyzed in parallel. Weights are assigned to each detector during recipe setup. If a given layer gives very good detection capability with darkfield, then more weight is given to the darkfield channels. For layers where the brightfield channel can contribute to the confidence of detection, more weight will be given to the brightfield channel. The primary benefit of the fifth detector is enhanced sensitivity to planar defects. Darkfield detection is combined with image processing on the KLA-Tencor 2230. This tool is based on the 2135 hardware and software platform with detection optics redesigned for darkfield operation. A laser darkfield illumination module has also been added. New to this platform is a redesigned autoloader which can be seen in the photo in Fig. 12. This new design provides pipelined off-stage pre-aligner capability to significantly reduce wafer loading time compared with the 21XX series tools. In
the typical monitoring applications, multiple wafers in a lot are inspected. The overhead for the first wafer is the same as before. However, after the first wafer is loaded on the inspection stage, subsequent wafers are automatically prepared for loading while the first wafer is being processed. The layout of the detection optics is shown in Fig. 13. In operation, oblique darkfield illumination is provided by a solid state laser (532 nm). Two optical pattern filtering techniques are used to suppress pattern noise: repetitive pattern filtering (RPF), which blocks repetitive patterns while allowing the collection of the non-repetitive defect signal; and azimuth filtering, which blocks noise from horizontal and vertical lines. This filtering occurs prior to image acquisition to reduce noise. SAT can be applied to further increase sensitivity on wafers with grainy metal and on wafers with the color variation seen in CMP processes. The image processing used on the ILM-2230 is identical to that of the 21XX tools.
Figure 12. KLA-Tencor 2230 wafer inspection tool.
Figure 13. Schematic of KLA-Tencor 2230 optics.
For inspection of unpatterned wafers, Tencor 4500, 5500 and 6XXX series tools are the most commonly used and may be found in most modern fabs. The 7XXX series tools have the ability to inspect patterned wafers but are included here since they are more similar to the 6XXX tools than to the image analysis tools. These tools offer high sensitivity and high throughput at a moderate cost compared with image analysis-based tools. Examples of these tools and their target applications are summarized in Table 3. The 4500, 5500 and 6100 tools all employ a HeNe laser incident at a normal angle to the surface to scan the wafer. On a clean, smooth wafer surface, the laser light is reflected at a very predictable angle. When surface defects or particle contamination are present, the light is scattered. The scattered light is collected by an integrating light collector and amplified using a photomultiplier tube (PMT). Output from the
PMT is an electrical current proportional to the number of incident photons, and hence to the amount of collected light scattered by the defect. By setting a detection threshold (gain), only pulses with levels exceeding the threshold are counted as events while events below threshold levels are rejected. In the evolution from the 4500 to the 5500 and 6100, incremental improvements were made in the collection optics to increase the signal-to-noise ratio, resulting in increased sensitivity. On the 4500 and 5500 platforms, scanning was performed by rastering the laser beam across the wafer while the wafer remains stationary. With the 6100, a considerable gain in throughput was achieved by using beam rastering in conjunction with a moving wafer.
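The thresholding of PMT pulses described above amounts to counting the pulses that exceed a set level. The following sketch is illustrative only: the pulse heights are arbitrary units, the size bins are hypothetical, and a real tool would rely on a calibration curve to convert pulse height to an equivalent particle size.

```python
def count_events(pulse_heights, threshold):
    """Count pulses whose height exceeds the detection threshold."""
    return sum(1 for h in pulse_heights if h > threshold)

def bin_events(pulse_heights, threshold, bin_edges):
    """Histogram the accepted pulses by height; larger scatterers give larger pulses."""
    counts = [0] * (len(bin_edges) - 1)
    for h in pulse_heights:
        if h <= threshold:
            continue  # rejected as background
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= h < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts

# Hypothetical PMT pulse heights from one wafer scan (arbitrary units).
pulses = [0.2, 0.4, 1.5, 0.9, 3.2, 0.1, 2.1]
print(count_events(pulses, threshold=1.0))
print(bin_events(pulses, threshold=1.0, bin_edges=[1.0, 2.0, 4.0]))
```

Lowering the threshold increases sensitivity to small scatterers but also admits more background pulses, which is why improvements in signal-to-noise ratio translate directly into smaller detectable defect sizes.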
Table 3. Examples of Unpatterned Wafer Inspection Tools
Tool: Tencor 4500
    Features: HeNe laser (633 nm); normal angle of incidence
    Sensitivity/Throughput: up to 0.21 micron; 60 wph maximum on 150 mm wafers
    Applications: Polished Si or GaAs, Epi Si, oxides, nitrides, photoresist, reflective metal
Tool: Tencor 5500
    Features: HeNe laser (633 nm); normal angle of incidence
    Sensitivity/Throughput: up to 0.20 micron on smooth surfaces; 77 wph maximum on 150 mm wafers; 200 mm capable
    Applications: Polished Si or GaAs, Epi Si, oxides, nitrides, photoresist, reflective metal
Tool: Tencor 6100
    Features: HeNe laser (633 nm); normal angle of incidence
    Sensitivity/Throughput: up to 0.157 micron on smooth surfaces; 150 wph maximum on 150 mm wafers; 200 mm capable
    Applications: Polished Si or GaAs, Epi Si, oxides, nitrides, photoresist, reflective metal
Tool: Tencor 6200
    Features: Argon-ion laser (488 nm); normal angle of incidence; broad angle collector
    Sensitivity/Throughput: up to 0.10 micron on smooth surfaces; 150 wph maximum on 150 mm wafers; 200 mm capable
    Applications: Polished Si or GaAs, Epi Si, oxides, nitrides, photoresist, reflective metal
Tool: Tencor 6400
    Features: Argon-ion laser (488 nm); oblique angle of incidence; variable polarization and side collection optics
    Sensitivity/Throughput: good for smooth surfaces but especially designed for rough surfaces or non-uniform films; up to 0.12 micron for smooth surfaces; 120 wph maximum on 150 mm wafers; 200 mm capable
    Applications: Metal films, poly-Si, CMP substrates
Tool: Tencor 7600
    Features: Argon-ion laser (488 nm); variable input polarization; programmable spatial filter; variable collection aperture and polarization
    Sensitivity/Throughput: up to 0.15 micron on [...]; up to 0.40 micron on front end etched layers; up to 0.7 micron on etched metal layers; 30 wph maximum on 200 mm wafers
    Applications: All non-CMP layers
Tool: Tencor 7700
    Features: Same as 7600 with the following extra features: dual collection channels; circular polarization
    Sensitivity/Throughput: Same as 7600
    Applications: All layers
A shorter wavelength laser (argon ion: 488 nm) and a broad angle collector were introduced in the 6200 to obtain higher sensitivity relative to the 6100. While this configuration provided excellent sensitivity on smooth films, sensitivity on rough films such as poly and tungsten is limited by noise from surface scattering.[10] Noise originating from surface scattering is of two types: temporal and spatial. Temporal noise arises primarily from random fluctuations in photoelectron generation in the PMT, which can be reduced by time averaging. Spatial noise is a function of scattered light power, dropping off with increasing spatial frequency (deviation from normal incidence). The design of the 6400 takes advantage of this phenomenon through the use of oblique illumination together with variable polarization and side collection optics. In the 7600 and 7700 tools, the addition of signal processing and programmable spatial filter capabilities enables inspection of patterned wafers. All of the 6XXX and 7XXX tools can generate defect coordinate files for use with optical defect review stations and SEMs.
Optical Pattern Filtering. A third technique, called optical pattern filtering (OPF), is used by inspection tools produced by OSI. The basic principles of OPF are illustrated in the schematic drawing in Fig. 14. Tools based on OPF rely on diffraction of laser light in specific directions from the repetitive pattern on the wafer surface. The wafer under inspection is illuminated by laser light at a grazing angle. The diffracted beams are captured by lens L1 and focused into spots of light in the focal plane of the lens. This is known as a Fourier image. When the image is projected through a specially made filter located at the filter plane, all light coming from the wafer pattern is blocked while light coming from the defects is passed through unaffected.
For each wafer pattern to be inspected, a custom filter must be generated containing the inverse Fourier image of the pattern. A second lens L2, located behind the filter, is used to form an image containing only the defects on the wafer surface. This image is detected using a CCD camera. During wafer inspection, a large portion of the wafer is illuminated to generate the Fourier pattern. A much smaller scan area is rastered over the wafer in a serpentine pattern to perform the actual defect detection. The diffraction pattern is determined by the spacing of the repetitive pattern on the wafer as illustrated in Fig. 15. A set of sparse, repetitive features produces a row of closely-spaced dark spots as a Fourier image. As the pattern becomes denser (that is, as the spacing between repeated features decreases), the dark spots become larger and more widely spaced. Therefore a system based on optical pattern filtering shows increased sensitivity as wafer patterns get more dense or complex.
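The principle of blocking the Fourier image of a repetitive pattern can be demonstrated numerically in one dimension. The sketch below is a software analogy, not a model of the OSI optics: a discrete Fourier transform stands in for lens L1, zeroing the pattern's harmonics stands in for the filter, and the inverse transform stands in for lens L2. The signal length, pattern period, and defect position are hypothetical, and the signal length is assumed to be a whole multiple of the period.

```python
import cmath
import math

def dft(signal, inverse=False):
    """Plain O(N^2) discrete Fourier transform; adequate for a short signal."""
    n = len(signal)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t, value in enumerate(signal):
            acc += value * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def filter_periodic(signal, period):
    """Block every Fourier component that belongs to a pattern with the given period."""
    n = len(signal)
    step = n // period                 # spacing between the pattern's diffraction orders
    spectrum = dft(signal)
    masked = [0j if k % step == 0 else x for k, x in enumerate(spectrum)]
    return [abs(v) for v in dft(masked, inverse=True)]

N, PERIOD, DEFECT_AT = 64, 8, 29
pattern = [1.0 if i % PERIOD < PERIOD // 2 else 0.0 for i in range(N)]
wafer = list(pattern)
wafer[DEFECT_AT] += 0.5                # a small non-repeating defect
residual = filter_periodic(wafer, PERIOD)
brightest = max(range(N), key=lambda i: residual[i])
print(brightest, round(residual[brightest], 4))
```

The spacing between blocked orders is the signal length divided by the pattern period, so a denser pattern (smaller period) places its orders farther apart and leaves more of the spectrum available for defect light, consistent with the trend described above.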
Figure 14. Schematic of inspection optics used in OSI inspection tools.
Figure 15. Wafer patterns and their corresponding Fourier Images.
The major advantages of defect detection using OPF are extremely high defect sensitivity to all defects, independence of inspection speed and sensitivity, and freedom from depth of focus limitations. Extremely fast scanning speed is possible because imaging and data processing requirements are minimal. This technique is ideally suited for detecting low contrast defects such as resist thinning, water spots, and resist residue. Despite these advantages, the technique has not been widely accepted because its application is limited to inspection of devices containing repeating geometries such as memory arrays and snake patterns. This technique cannot be applied to peripheral circuitry or logic devices where there is no periodicity. As a result, tools based on this technique have very limited usefulness where devices (such as microprocessors) with nonrepeating geometries must be inspected.
3.0 DATA ANALYSIS AND DEFECT CHARACTERIZATION
Data analysis and defect characterization are integral parts of an effective photo process monitoring program. Collection of monitoring data at regularly scheduled intervals must be a part of the production process. To obtain maximum benefit from this data collection, the data should be reviewed frequently and analyzed for trends and defect signatures. This is facilitated by a defect data management system, which automatically stores wafer inspection data in a database and provides tools to retrieve and visualize the stored data. Some examples of defect data management systems are Tencor SwiftStation, KLA 2550, and Knights Yield Manager. Fab personnel can use trend charts to watch for changes in the manufacturing process and assess whether the process is operating within specifications. A well-behaved process will show variation over time around some average value but no distinct upward or downward trend. An upward trend indicates worsening defect performance which must be investigated and corrected. On the other hand, a downward trend indicates improving defect performance. This also merits investigation because the information may be helpful in improving the performance of other tools. When multiple process tools are in use, it is useful to monitor for both defect trends and defect performance differences between tools. Statistically significant differences between otherwise identical tools should be given prompt attention.
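The trend-chart review described above can be sketched in a few lines. This is a minimal illustration, assuming that control limits are set at the baseline mean plus or minus three standard deviations and that a least-squares slope is an acceptable indicator of drift. Neither choice is prescribed by the text, and the defect counts are invented for the example.

```python
from statistics import mean, stdev

def control_limits(baseline):
    """Center line and 3-sigma limits from a baseline period (assumed rule)."""
    center = mean(baseline)
    sigma = stdev(baseline)
    return center, max(0.0, center - 3 * sigma), center + 3 * sigma

def out_of_control(counts, baseline):
    """Indices of readings that fall outside the baseline control limits."""
    _, lcl, ucl = control_limits(baseline)
    return [i for i, c in enumerate(counts) if c < lcl or c > ucl]

def trend_slope(counts):
    """Least-squares slope of count versus run index (defects per run)."""
    n = len(counts)
    x_bar = (n - 1) / 2
    y_bar = mean(counts)
    sxx = sum((i - x_bar) ** 2 for i in range(n))
    sxy = sum((i - x_bar) * (c - y_bar) for i, c in enumerate(counts))
    return sxy / sxx

# Hypothetical daily defect counts from one tool.
baseline = [12, 15, 11, 14, 13, 12, 16, 13]
recent = [14, 15, 17, 19, 22]

print("limits:", control_limits(baseline))
print("out-of-control runs:", out_of_control(recent, baseline))
print("slope:", trend_slope(recent))
```

A positive slope together with points above the upper limit corresponds to the worsening trend described above. Defect counts are often not normally distributed, so a production system may use a different limit rule.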
In addition to trend analysis, review of wafer maps for defect signatures and classification of detected defects should be performed regularly. The presence of a defect signature indicates a systematic defect issue. The size and orientation of the signature often provide clues for identifying the origin of the problem. Defect classification refers to the process of viewing individual defects in the optical microscope or SEM, and assigning pre-defined codes to the defects based on their physical appearance. These data can then be used to generate defect Pareto charts, which are useful for identifying the top defect types and prioritizing defect reduction programs. Common photo defects and their origins are summarized in Table 4.
Table 4. Common Photo Defects and Their Sources

Defect type | Cause | Source
Bubbles in resist | Air in resist dispense line | Coat module
Particles in resist | Particles in coating environment; contaminated resist; resist precipitation | Coat module
Pattern defect (nonrepeating) | Bubbles in developer; environmental particle | Various
Pattern defect (repeating) | Particle on stepper optics; particle or defect on reticle | Stepper or reticle
Spot | Bubbles in developer; EBR splashback; water droplet from develop process | Various
Lifting | Poor resist adhesion to wafer | Insufficient HMDS prime or wafer surface contamination
Scratch | Wafer handling problem | Any tool component which comes in contact with wafer in process
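The defect Pareto described above is a ranked tally of classification codes with a running cumulative percentage. The sketch below shows one way to compute it. The class codes and counts are hypothetical and do not come from any inspection data in this chapter.

```python
from collections import Counter

def defect_pareto(classified_defects):
    """Return (code, count, cumulative_percent) rows, most frequent first."""
    counts = Counter(classified_defects)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for code, n in counts.most_common():
        cumulative += n
        rows.append((code, n, 100.0 * cumulative / total))
    return rows

# Hypothetical classification results from an optical review session.
review = ["spot"] * 5 + ["lifting"] * 2 + ["scratch"] * 3

for code, n, cum in defect_pareto(review):
    print(f"{code:10s} {n:3d} {cum:6.1f}%")
```

The first few rows of the output identify the defect types that merit priority in a defect reduction program.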
4.0 PROCESS OPTIMIZATION AND QUALIFICATION
Process optimization is an essential part of developing a new photo process or starting up a new process tool. The first task in optimization is identification of the major variables which affect the output from the process. Screening experiments in the form of full or partial factorials are performed to identify the most important variables for the particular process to be optimized. This is followed by more detailed studies of the key process parameters. There are often interactions between key parameters, so it is critical to design experiments which provide information on potential interactions. For this reason, data should be collected to allow the construction of response surfaces involving key parameters. The reader is referred to Chs. 2 and 5 for basic information on experimental design. For a discussion on response surface methodology see Ch. 2. In the optimization of any process which will be used for integrated circuit fabrication, defectivity must be included as a response variable. This point cannot be overemphasized in photo process optimization. The most robust process is of little value if it provides poor end-of-line device yields. Once a process is optimized, it must be monitored to ensure that it continues to generate the desired results. This operation, commonly called process qualification, should be performed at pre-established intervals and after all scheduled or unscheduled maintenance events which are expected to impact the process. Table 5 summarizes typical events which should trigger a process qualification.
Table 5. Events Requiring Photo Process Qualification

Event | Examples | Qualification Frequency
Scheduled monitoring | Not applicable | Daily, weekly, or as needed
Scheduled maintenance | Weekly PM | Immediately after completion of service
Unscheduled maintenance | Coater module serviced, developer module serviced | Immediately after completion of service
Other | Resist bottle change, DI water maintenance | Immediately
Excessive idle time | Tool not processing wafers due to lack of material | As needed
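The screening experiments discussed in this section estimate main effects and interactions from a two-level factorial design. The sketch below builds a full factorial in coded units (-1 and +1) and computes effects as the difference between the mean response at the high and low settings. The factor names and the response values are hypothetical, and the responses were generated from a made-up model purely so that the arithmetic can be checked.

```python
from itertools import product

def full_factorial(factors):
    """All combinations of the coded levels -1 and +1 for the named factors."""
    return [dict(zip(factors, levels))
            for levels in product((-1, 1), repeat=len(factors))]

def effect(runs, responses, *names):
    """Main effect (one name) or interaction effect (two or more names)."""
    def sign(run):
        s = 1
        for name in names:
            s *= run[name]
        return s
    high = [y for run, y in zip(runs, responses) if sign(run) > 0]
    low = [y for run, y in zip(runs, responses) if sign(run) < 0]
    return sum(high) / len(high) - sum(low) / len(low)

# Hypothetical factors and defect-count responses for a coat process.
runs = full_factorial(["spin_speed", "exhaust"])
defects = [12.5, 3.5, 13.5, 10.5]  # same order as `runs`

print("spin_speed effect: ", effect(runs, defects, "spin_speed"))
print("exhaust effect:    ", effect(runs, defects, "exhaust"))
print("interaction effect:", effect(runs, defects, "spin_speed", "exhaust"))
```

A nonzero interaction effect is the signal that the two factors cannot be optimized one at a time, which is the reason given above for moving on to response surface designs.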
The goal of photo process qualification is to ensure the process is performing to specifications with respect to image quality and defectivity (cleanliness). Defectivity monitoring can be performed using patterned or unpatterned wafer tests. The following is a brief discussion of the methodology that can be employed with each type of monitor and the advantages and disadvantages of each approach.
Prior to the development of high speed patterned wafer inspection tools, process qualification was primarily performed using unpatterned wafer tests. The testing can be set up to monitor the defect add-ons contributed by each of the unit operations used in the photo process: prebake, HMDS prime, resist coat, softbake, stepper, post-exposure bake, develop, and post-develop bake. Bare silicon wafers are the most convenient test vehicle, with a Tencor unpatterned wafer scanner used as the measurement tool. A typical test sequence would consist of taking pre-readings on the monitor wafer, cycling the wafer through the module of interest, and rescanning the wafer to get a post-reading. The difference between the pre- and post-counts is used as a measure of the cleanliness of the module. Modern photo tracks can have one or more modules dedicated to each of these functions, so the monitoring procedure must be designed to sample all the critical modules. In practice, it is not efficient or necessary to perform day-to-day testing in this fashion. In a typical fab environment, it is more practical to consolidate these individual tests into two tests: one for the coat process and one for the develop process. Sufficient wafers should be processed to test all coat modules, develop modules, and resist types.
The preferred method of process qualification employs generation of a patterned wafer. This method has several advantages over the bare wafer approach because it is more representative of the actual process used on product wafers. In addition to sampling the defect performance of each module, this testing will also check (1) the nominal performance of the resist and associated process parameters, (2) stepper focus, illumination, and chuck contamination, and (3) the nominal performance of the developer and associated process parameters. Typical problems seen with a patterned wafer qualification and their potential causes are summarized in Table 6.
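The pre-reading and post-reading bookkeeping for the unpatterned wafer test described above can be sketched as follows. The module names, the scan counts, and the pass limit of 10 adders per wafer are hypothetical and are shown only to illustrate the calculation.

```python
def module_adders(pre_counts, post_counts):
    """Per-wafer defect add-ons: post-scan count minus pre-scan count."""
    return [post - pre for pre, post in zip(pre_counts, post_counts)]

def qualify(modules, limit):
    """Return {module: (mean_adders, passed)} for each tested module.

    `modules` maps a module name to (pre_counts, post_counts), one entry
    per monitor wafer cycled through that module.
    """
    results = {}
    for name, (pre, post) in modules.items():
        adders = module_adders(pre, post)
        average = sum(adders) / len(adders)
        results[name] = (average, average <= limit)
    return results

# Hypothetical scanner counts for three monitor wafers per module.
scans = {
    "coat": ([3, 5, 4], [8, 9, 10]),
    "develop": ([2, 4, 3], [22, 30, 25]),
}

for name, (average, passed) in qualify(scans, limit=10).items():
    print(f"{name:8s} mean adders {average:5.1f} -> {'pass' if passed else 'FAIL'}")
```

Using counts alone follows the procedure in the text. A scanner that reports defect coordinates also allows adders to be identified individually by position.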
Table 6. Common Defect Issues and Corrective Actions

Problem | Potential Cause | Corrective Action
High defect counts | Multiple causes | Troubleshoot as needed
Resist lifting | Insufficient HMDS prime | Check HMDS application conditions (oven temp, pressure)
Focus spot | Stepper chuck contamination | Clean stepper chuck
Comet tail | Poor resist coat | Check resist dispense, temperature, spin speed
Repeater | Reticle defect, stepper lens contamination, stepper focus | Inspect reticle; repeat test with different reticle
NDO | Insufficient exposure, insufficient develop, insufficient bake | Troubleshoot as needed

5.0 DEFECT REDUCTION
Defects must be minimized in all process steps used in chip fabrication. Minimizing photo defects is especially important because fabrication of modern integrated circuits can involve as many as 35 photo process steps. Some of these steps produce a pattern which is transferred to an underlying layer (for example, gate, contact, metal) while other steps form an implant pattern (for example, well, source, drain). A defect in the former will produce a pattern defect which can be detected by patterned wafer inspection. A defect in an implant layer pattern will probably not
cause a visual defect. Both types of defects are equally important because they can cause the device to malfunction.
Reducing defects consists of the following basic steps:
i. identifying that a problem exists
ii. finding the source
iii. determining the root cause
iv. determining solutions
v. implementing the solution
vi. monitoring to ensure the problem has been eliminated
Defect problems originating in the photo operations can be detected in one of several ways: (i) equipment qualification, (ii) ADI, (iii) routine in-line product inspections, (iv) correlation of electrical test failures, and (v) physical failure analysis. Issues detected by equipment qualification or ADI are the easiest to address because the problem is already isolated to a specific photo tool. By sending the affected product through “redo,” impact to device yield is minimized. If a problem is detected at routine in-line product inspections, correlation of electrical test failures, or physical failure analysis, the affected material is already compromised. Rapid response by the yield enhancement team is required to identify all affected material and prevent further impact to product in process.
When a defect problem is found in this manner, it is necessary to determine the source of the defects. Partitioning studies are performed to first isolate the source to a process module and then isolate the source to specific process steps and tools. Equipment commonality between bad lots and good lots can provide information to direct troubleshooting efforts to specific steps and tools. Designed factorial experiments are useful in identifying a problem which may involve the interaction of two or more processes. When there are two or more variables to consider, the RSM (response surface methodology) approach helps in understanding the operating range of the process.
In the course of investigating a problem which is believed to be related to process margin, it is instructive to recreate the problem. This can be accomplished by “abusing” the process, that is, by performing experiments which exercise the process through its normal operating range as well as outside it. The results of such experiments can provide valuable insight into the root cause of the problem. Questions that this type of analysis will help answer include:
i. Was the tool operating within normal specifications?
ii. Was the correct process recipe used?
iii. Did the tool “hiccup?”
iv. Was there a random problem?
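The equipment commonality comparison between bad lots and good lots mentioned earlier in this section lends itself to a simple tabulation. The sketch below ranks each step and tool combination by how much more often it appears in the history of bad lots than of good lots. The lot identifiers, step names, and tool names are hypothetical, and the ranking score is an illustrative choice rather than a formal statistical test.

```python
from collections import defaultdict

def commonality(lot_history, bad_lots):
    """Rank (step, tool) pairs by excess usage among bad lots.

    `lot_history` maps lot id -> {step: tool}. Returns rows of
    (step, tool, fraction_of_bad_lots, fraction_of_good_lots),
    sorted so the most suspicious pair comes first.
    """
    usage = defaultdict(lambda: [0, 0])  # [bad count, good count]
    n_bad = n_good = 0
    for lot, route in lot_history.items():
        is_bad = lot in bad_lots
        if is_bad:
            n_bad += 1
        else:
            n_good += 1
        for step, tool in route.items():
            usage[(step, tool)][0 if is_bad else 1] += 1
    rows = []
    for (step, tool), (bad, good) in usage.items():
        bad_frac = bad / n_bad if n_bad else 0.0
        good_frac = good / n_good if n_good else 0.0
        rows.append((step, tool, bad_frac, good_frac))
    rows.sort(key=lambda r: r[2] - r[3], reverse=True)
    return rows

# Hypothetical processing history for four lots.
history = {
    "L1": {"metal_photo": "LC1", "metal_etch": "E1"},
    "L2": {"metal_photo": "LC1", "metal_etch": "E2"},
    "L3": {"metal_photo": "LC2", "metal_etch": "E1"},
    "L4": {"metal_photo": "LC2", "metal_etch": "E2"},
}

for row in commonality(history, bad_lots={"L1", "L2"}):
    print(row)
```

With only a handful of lots, a tool can top the list by chance, so a result of this kind is a pointer for partitioning studies rather than proof of the source.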
After a tool modification or a process or procedure change is implemented, a suitable monitor should be put in place to track defect performance over an extended period of time. The author has observed that problems often go undetected in the limited testing performed to qualify a process, only to emerge in volume manufacturing when the change is applied to a variety of parts processed on many different process tools. The problems may appear as a result of interactions with other processes used before or after the process which was changed. Hence, it is useful to trend-chart the results, watching for drift, accumulation, or fluctuations.
6.0 CASE STUDIES
The following case studies illustrate a few of the defect issues which can be encountered in production photo processes.
6.1 Center Stripe Defects
The center stripe defect, as implied by the name, is a defect signature resembling a stripe. This signature was detected by in-line product inspection during start-up of a 200-mm factory. With notch down, the defects formed a vertical stripe extending from the top of the wafer to the notch. As shown in Fig. 16, the signature of this defect was not obvious from looking at individual wafer defect maps and was not present on all wafers inspected using our sampling plan. However, when a composite wafer defect map is generated by stacking data collected from several wafers, as shown in Fig. 17, the higher density of defects at wafer center becomes more apparent. The signature is also readily apparent if a composite wafer defect map is created which includes only defects classified as pattern defects (Fig. 18).
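Stacking defect data from several wafers, as described above, amounts to accumulating defect coordinates on a common grid. The sketch below assumes wafer-centered coordinates in millimeters with a consistent notch orientation, and a 10 mm bin size chosen only for illustration. The defect coordinates are invented for the example.

```python
import math
from collections import Counter

def composite_map(wafer_maps, bin_mm=10.0):
    """Accumulate defects from many wafers into a grid of square bins.

    Each wafer map is a list of (x_mm, y_mm) positions relative to wafer
    center. Bin (0, 0) is centered on the wafer center.
    """
    grid = Counter()
    for defects in wafer_maps:
        for x, y in defects:
            ix = math.floor(x / bin_mm + 0.5)
            iy = math.floor(y / bin_mm + 0.5)
            grid[(ix, iy)] += 1
    return grid

def column_totals(grid):
    """Sum the grid along y to expose a vertical stripe signature."""
    columns = Counter()
    for (ix, _), n in grid.items():
        columns[ix] += n
    return columns

# Hypothetical maps: each wafer alone shows only two or three defects.
wafers = [
    [(1.0, 50.0), (-60.0, 20.0)],
    [(-2.0, -30.0), (40.0, 70.0)],
    [(0.5, 10.0), (3.0, -80.0), (-75.0, -5.0)],
]

columns = column_totals(composite_map(wafers))
print(sorted(columns.items()))
```

No single wafer in the example shows a pattern, but the column totals of the stacked data concentrate at the center column, which mirrors the behavior seen in Figs. 16 and 17.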
Figure 16. Visual defect wafer map from inspection of a single wafer. No defect signature is apparent. Data was collected from in-line inspection of a product wafer.
Figure 17. Cumulative visual defect wafer map formed by stacking data from multiple wafers. The center stripe defect signature is readily apparent. Data was collected from in-line inspection of product wafers.
Figure 18. Cumulative defect wafer map showing locations of pattern defects. Data was collected from in-line inspection of product wafers.
The impact of this defect on device yield was quantified through correlation of sort data and bitmap data with in-line inspection results. Figure 19 shows a cumulative die yield map for a lot affected by this problem. Each rectangle in the map represents one die on the wafer. The die yield was very uniform across the wafer except in the one column of die (filled rectangles) at wafer center where there is almost no yield. Overlay of in-line visual data with failed bitmap data showed a high correlation between visual defects and electrical failures. Failure analysis indicated that many of the single bit failures were caused by missing contacts and other pattern defects. Figure 20 shows a SEM micrograph of a typical device structure containing a missing contact caused by the center stripe issue.
Figure 19. Cumulative die yield map showing center stripe yield signature. Each rectangle represents one die on the wafer. The dark rectangles represent the locations of low yielding die.
Figure 20. SEM micrograph of a typical device structure containing a missing contact caused by the center stripe issue.
The localized nature of the defects provides a clue as to the cause. There are only a few process tools known to maintain fixed wafer orientation for all wafers during processing. With this key piece of information, all but a handful of tools were eliminated as potential sources of the problem. A module partitioning study suggested the photo develop process to be the origin of the defect. In photo processing using a lithocell, all wafers from the stepper are transported to the develop track with the same notch orientation. It was observed that the orientation of the stripe coincided with the starting position of the developer dispense nozzle. To confirm these findings, one product lot was processed with an intentional 90 degree rotation of wafers at every develop step. As expected, the cumulative yield map showed a 90 degree rotation of the defect signature. It is important to note that the center stripe signature was not observed on wafers processed to monitor and qualify the lithocell for production processing. The defect pattern did not appear on unpatterned wafer monitors or patterned wafer process quals (resist pattern printed on bare silicon wafers). Our data showed that this defect problem was pattern dependent and sensitive to wafer surface conditions. The problem was much more readily detected on product wafers, especially when forming small openings such as contacts. Several designed experiments were used to study the conditions under which the defect occurs. The variables investigated were dispense nozzle offset, wafer rpm during developer dispense, developer temperature, developer dispense pressure, and pre-wet. Wafer rpm, developer temperature, and pre-wet did not generate measurable responses in these studies. It was found that changing the dispense nozzle offset reduced the extent of the problem but did not completely eliminate the problem. It was discovered that insufficient developer dispense pressure was the root cause of the problem. 
When the developer dispense pressure was increased to the minimum level needed to achieve near-vertical streams, the defect signature disappeared. This defect issue has been uniquely associated with E2-type soft-impact nozzles. With this type of nozzle, the nozzle surface is in contact with the developer puddle during resist processing. Regular maintenance is required to prevent resist build-up and nozzle clogging, another potential cause of developer defects.
6.2 Circle Defects
The circle defect was an issue detected by normal in-line monitoring of product material. Automated wafer inspections are typically performed after each critical etch module. The problem was seen on 5 to 10% of the material inspected. These defects were unusual because they were perfectly circular in shape and had a uniform size distribution (three to five microns in diameter). In the optical microscope, the defects appeared as circular islands of unetched metal, and could be seen as isolated defects or attached to metal features. SEM analysis indicated that the defects were full thickness metal. On affected lots, the frequency of these defects typically ranged from 0 to 10 per wafer. In cases where the problem was severe, over 100 circle defects per wafer were observed. The circle defects were randomly distributed on affected wafers and did not appear to be localized to any particular region of the wafer. No discernible defect signature was detected, even when composite defect maps were generated using data from multiple wafers and lots. When these defects appear in circuit areas where they can cause significant narrowing of metal spaces, corrosion of the metal lines can occur by trapping of etch by-product, making this defect a potential reliability risk. No reduction in device yield was observed on material affected by low level circle defects. A 20 to 30% yield depression was seen on material where the problem was severe. Equipment commonality studies on affected material indicated that all lots with the problem were processed through a single lithocell. Based on this evidence, a group of fifteen lots processed on the suspect tool were identified for KLA inspection at ADI and ACI. Due to the intermittent nature of the problem, it was not seen on any of the lots inspected. However, low level circle defects continued to be seen on some product material at regularly scheduled KLA inspections.
To determine the source of the defect, partitioning experiments were performed using product material. All defects captured at ADI were photographed and saved on a video archival-retrieval system (VARS). Inspection of the same wafers at ACI detected some circle defects. Defect overlay and subtraction indicated that some of the defects detected at ADI had evolved into circle defects. This was the first evidence collected which suggested the potential source of the problem. In a follow-up partitioning study, optical and SEM characterization of the ADI and ACI defects was performed. SEM photos of the suspicious ADI defects are shown in Fig. 21. At ACI, circle defects were captured at locations where
sphere defects were detected at ADI. Figure 22 shows the circle defects as they appear at ACI. In a separate study, pre-photo inspections were performed showing that sphere defects seen at ADI were not present on the wafers prior to the photo process.
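The defect overlay and subtraction used in this partitioning study can be pictured as a nearest-neighbor match between two coordinate lists. The sketch below is an illustration only: the 20 micron matching tolerance, the coordinate values, and the greedy matching rule are assumptions, and a commercial data management system may match defects differently.

```python
import math

def overlay(adi, aci, tol_um=20.0):
    """Match ADI defects to ACI defects that lie within `tol_um`.

    Returns (carried, adi_only, aci_only), where `carried` holds
    (adi_xy, aci_xy) pairs found at both inspections.
    """
    unmatched_aci = list(aci)
    carried, adi_only = [], []
    for a in adi:
        best = None
        for c in unmatched_aci:
            d = math.dist(a, c)
            if d <= tol_um and (best is None or d < best[0]):
                best = (d, c)
        if best is None:
            adi_only.append(a)
        else:
            carried.append((a, best[1]))
            unmatched_aci.remove(best[1])
    return carried, adi_only, unmatched_aci

# Hypothetical defect coordinates in microns on the same wafer.
adi_defects = [(100.0, 200.0), (5000.0, 5000.0)]
aci_defects = [(105.0, 198.0), (9000.0, 100.0)]

carried, adi_only, aci_only = overlay(adi_defects, aci_defects)
print("seen at both steps:", carried)
print("ADI only:", adi_only)
print("new at ACI:", aci_only)
```

Defects in the first group are the candidates for review at both steps, which is how an ADI defect can be tied to the circle defect it later becomes.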
Figure 21. SEM photographs of defects at ADI which cause circle defects.
Figure 22. Optical and SEM photographs of the circle defects seen at ACI.
EDS analysis of sphere defects similar to the ones shown in Fig. 21 showed the presence of carbon and sulfur, elements indicative of resist. SEM analysis of a large number of sphere-containing wafers showed spheres located on the wafer surface next to or on top of well-formed resist features. No evidence of distorted resist features was found. This strongly suggested that the spheres were not being formed in the develop or exposure process. Particle counts performed on coater modules on a “good” and a “bad” lithocell revealed substantial differences in particle behavior. A review of the process history of the lithocells indicated that the appearance of the sphere defect problem followed the introduction of a special coat program needed to support a new technology. The sphere problem was also seen only on lithocells running the special coat program. Air flow studies on coater modules later showed that the coater exhaust was very low on tracks affected by the sphere problem. During follow-up inspection of the coater module, we discovered that the exhaust lines were severely constricted by excessive accumulation of resist.
Based on the data collected, the following mechanism was formulated to explain the formation of circle defects:
i. Resist coat recipe A employs spin speed and acceleration conditions which make it susceptible to formation of resist spheres. Under normal exhaust conditions, this is not an issue. It is an issue when the coater exhaust becomes constricted due to resist accumulation.
ii. Under constricted exhaust conditions, resist spheres contaminate the coater module and enclosure, remaining suspended in the air and/or depositing on exposed surfaces in the coater track.
iii. When wafers are processed in the contaminated track, spheres suspended in the air or lightly attached to the enclosure fall on the wafers while they are in transport in the coater track.
iv. The resist spheres flatten out in the subsequent DUV and metal etch processes.
v. The resist spheres cause micromasking during the metal etch process, leaving circular islands of metal.
Once the exhaust problem was identified, the partially blocked exhaust lines were replaced with clean exhaust lines and the exhaust output was adjusted to meet resist coat uniformity requirements. All lithocells in the fab were checked for the problem. A periodic exhaust check was implemented on all lithocells to monitor for exhaust flow issues and prevent the problem from recurring.
6.3 Repeater Defects
The wafer map in Fig. 23 illustrates the signature of a typical repeater defect. Repeater defects are those defects which appear on the wafer with some regular periodicity and show some fixed relationship to the die layout on a reticle or the stepping pattern on a wafer. Reticle defects are one of the most common causes of repeater defects. A reticle defect can be in the form of (1) extra chrome pattern on the mask plate, (2) missing chrome on the mask plate, (3) particulates on the mask plate or on the reticle, and (4) damage to the pellicle.
Figure 23. Wafer map illustrating a repeating defect, common to reticle defects which print in every field stepped.
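Because a repeater occupies the same position in every exposure field, one way to look for it in a defect coordinate file is to fold the wafer coordinates into field coordinates and count how many different fields share each in-field location. The sketch below assumes that the stepping grid origin is at (0, 0), that the field size is known, and that a simple rounding bin is an adequate position match. The field size, tolerance, and coordinates are hypothetical.

```python
from collections import defaultdict

def find_repeaters(defects, field_w, field_h, tol, min_fields=3):
    """Return {in_field_bin: set_of_fields} for locations seen in many fields.

    `defects` holds (x, y) wafer coordinates in the same units as the field
    size. Positions are binned by rounding to multiples of `tol`, so two
    defects that straddle a bin edge may be counted separately.
    """
    hits = defaultdict(set)
    for x, y in defects:
        field = (int(x // field_w), int(y // field_h))
        in_field = (round((x % field_w) / tol), round((y % field_h) / tol))
        hits[in_field].add(field)
    return {loc: fields for loc, fields in hits.items()
            if len(fields) >= min_fields}

# Hypothetical 20 mm x 20 mm field; coordinates in mm.
defect_list = [(3.0, 7.0), (23.0, 7.0), (43.0, 27.0), (11.5, 31.2)]

for loc, fields in find_repeaters(defect_list, 20.0, 20.0, tol=0.1).items():
    print("in-field bin", loc, "seen in fields", sorted(fields))
```

This approach only works on defects that the inspection tool has already reported, so it does not remove the limitations of die-to-die comparison for single-die reticles.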
In a manufacturing fab where many different products are processed on the same line, reticles are subject to frequent handling, which increases the probability of a reticle-induced defect. Under such operating conditions, it is imperative to have a well-established procedure for regularly monitoring the condition of the reticles. Reticle monitoring can be effectively performed using a Horiba, QC Optics, or similar reticle inspection tool. Such tools perform a 4-pass inspection of the reticle in which the upper pellicle, lower pellicle, and top and bottom surfaces of the mask plate are scanned for particles and chrome pattern integrity. The locations of all detected defects are recorded and can be used to perform in-situ review of the nature and size of each defect. Defects below specified critical sizes are ignored. Particles exceeding critical size are removed using a pressurized gas stream (for example, nitrogen). Chrome damage exceeding a certain size, or pellicle damage, requires that the reticle be sent for repair or replacement.
Routine in-line inspection of product lots provides another way to monitor for repeating defects. It can be very effective but several well-known limitations must be recognized. Current patterned wafer inspection tools rely on comparison of images of inspected die. If a die is printed using a single-die reticle, a repeater defect cannot be found by the inspection tool because all die look alike. For reticles containing multiple die, the reticle layout will determine whether special sampling plans must be employed to find repeaters. The size of the repeating defect also plays an important part in determining whether it can be detected. In general, product inspection is not as effective for monitoring for reticle defects as reticle inspection.
Examples of die inspection patterns used for reticle qualification:
2×2
XX-X XXX X-XX XXX
3×3
XXX XXX XXX X XX X XX X XX XX X XX X XX X
Repeater defects can also be caused by stepper focus-tilt issues. As device dimensions decrease and die sizes increase, lithography operations become increasingly sensitive to changes in focus conditions. Modern lithography tools provide the capability to precisely control the focus and tilt conditions. The optimum focus will shift depending on the feature size and circuit layer being printed. Using the same lithography tool, the center of focus for printing a 0.80 micron feature will be different from the center of focus for printing a 0.50 micron feature. Optimum focus for printing a 0.50 micron contact on oxide will differ from optimum focus for printing a 0.50 micron line on metal. For these reasons, focus-exposure experiments must be performed on the target lithography tool to determine the focus offset for printing the specific feature of interest. Tilt offset is another parameter which must be set to ensure proper imaging of the device pattern across the entire reticle field. Focus and tilt are interrelated parameters, and must both be carefully controlled.
6.4 New Process Optimization
Device chip yield depends strongly upon the cumulative defects caused by process integration and processing-induced point defects, which are derived from the surrounding environment or the actual processes themselves. New device designs can require more than thirty photolithography steps, creating the need for photolithographic processes with defect densities of less than 0.1 defects/cm², at less than 0.5 micron particle size, to have any significant probe yield. In developing a new photo process for manufacturing, both CD control and the defectivity of the process must be included as key response variables in process optimization experiments. The photolithographic process requires that the device-yielding wafer be handled by many different pieces of equipment, each with multiple processing modules. The wafer must be resist adhesion promoted, coated, baked, exposed, baked, developed, and inspected, which usually requires two separate tools and manual transfer of the partially processed wafers between tools. We have shown that a large number of point defects result from fab traffic and that their levels can be dramatically reduced by tool enclosures. (See Ch. 2 and references therein.) In older fabs, this is accomplished by retrofitting enclosures to the equipment to keep the environment isolated from the wafers in process. In newer fabs, track and steppers are integrated into an environmentally isolated unit called a
lithocell. The lithocell configuration keeps the wafers separated from the workers and provides completely automated wafer transfer between photo process modules. It has been determined that many “best in class” photo processing equipment types are relatively defect free and not a significant source of point defects. As a result, process-induced sources of defects, which are roughly half of all yield limiting defects, must be identified and minimized for each specific fab equipment/process set. Sometimes these defects are prevented by equipment modifications, but in many instances, they can be eliminated or at least minimized to acceptable levels through methodical detection and reduction techniques. The reduction and minimization of defects induced by two main photolithographic process modules, resist coat and development, are used to illustrate how to achieve a defect optimized (i.e., minimized or eliminated) process capability. The technique of applying statistically designed experiments to photolithographic process optimization has been documented by Wolf and Tauber.[11] The general approach begins with lower resolution, highly confounded statistical screening designs to identify the controlling variables. These are followed by higher order or higher resolution designs to detect important interactions among the controlling variables found in screening. The work culminates in empirically modeled response surfaces (RSM), generated from multilevel, higher order statistical designs capable of detecting non-linear responses of the measured variables. The resulting response surfaces can then be mathematically analyzed by RS1 software[12] to provide optimum (i.e., maximized or minimized) operating points for any general process.
In this work, this general approach has been successfully applied to the systematic detection and resolution of photolithography defects from processes performed in both coat and developer modules. Before the methodology above can be applied, the process module to be evaluated must be characterized statistically by a multivari experiment.[12] This type of statistical experiment treats the nested families of process parameter variability, and defines the process reproducibility problem quantitatively. After the process module or process has been optimized by the methodology above, another multivari must be performed to provide statistical verification that the module or process has actually been optimized for variability reduction, and that stable process performance has been achieved relative to the results for the initial unoptimized process.
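The nested families of variation examined in a multivari experiment can be summarized with a few descriptive statistics. The sketch below assumes three nested families (site within wafer, wafer within lot, and lot to lot) and reports the raw spread at each level along with an overall 3σ figure. These are simple descriptive spreads rather than formal variance components, the family structure is an assumption, and the thickness readings are invented.

```python
import math
from statistics import mean, stdev, variance

def multivari(data):
    """Summarize nested variation for data[lot][wafer] = [site readings].

    Needs at least two sites per wafer, two wafers per lot, and two lots.
    """
    within_wafer = []   # variance of sites on each wafer
    wafer_spread = []   # variance of wafer means inside each lot
    lot_means = []
    readings = []
    for wafers in data.values():
        wafer_means = []
        for sites in wafers.values():
            within_wafer.append(variance(sites))
            wafer_means.append(mean(sites))
            readings.extend(sites)
        wafer_spread.append(variance(wafer_means))
        lot_means.append(mean(wafer_means))
    return {
        "within_wafer_sd": math.sqrt(mean(within_wafer)),
        "wafer_to_wafer_sd": math.sqrt(mean(wafer_spread)),
        "lot_to_lot_sd": stdev(lot_means),
        "total_3sigma": 3 * stdev(readings),
    }

# Hypothetical resist thickness readings in angstroms.
thickness = {
    "lot_A": {"w1": [10000, 10010], "w2": [10020, 10030]},
    "lot_B": {"w1": [10100, 10110], "w2": [10120, 10130]},
}

for name, value in multivari(thickness).items():
    print(f"{name:18s} {value:8.1f}")
```

Comparing the three spreads shows which family dominates the total variation, and repeating the calculation after optimization gives the before-and-after comparison described above.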
An important aspect of the optimization is the defect inspection technique and the procedures employed. The inspection system is a patterned wafer inspection tool that uses digital image processing to provide high sensitivity for all defect types together with excellent inspection speed. At the heart of the system is the image acquisition and image processing hardware. During an inspection, the wafer pattern is scanned for predefined cells, and copies of adjacent cell images are stored electronically in buffers. The image processor performs image subtraction on the stored images and tests for pattern differences. When differences are detected in the same location for two consecutive subtractions, the location is flagged as a defect site. Using such an approach, the tool can detect defects independent of topography. The system is capable of 0.25 micron sensitivity and can be calibrated for a variety of sensitivities, limited only by the combination of available inspection optics.
Coat Module Process Improvement Example. Coat modules must provide resist films of reproducible thickness to the photolithography process area before alignment and exposure in the IC wafer patterning stepper. The films must be free of coating process defects, or circuit yields will be poor. Because resist image critical dimensions (CDs) depend upon resist thickness, thickness control tolerances must be held to less than 100 Å, 3σ, for monochromatic high numerical aperture tools. Poor CD control induced by poor resist thickness control can itself be viewed as a defect if it leads to poor circuit layer-to-layer overlay and yield. The coater track module thickness variability for this example had drifted to ~150 Å (3σ, all families of variation) from the 75 Å level. Therefore, the process program needed to be reoptimized, but without sacrificing low defectivity levels. In fact, a further goal was to reduce the process defectivity.
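The double-subtraction rule described for the inspection tool (flag a location only when two consecutive cell-to-cell subtractions both show a difference there) can be sketched as follows. This is only an illustration of the principle, not the tool's actual image processing; the function name, the threshold, and the synthetic cell images are hypothetical.

```python
import numpy as np

def flag_defects(cells, threshold):
    """Flag defect pixels by comparing each cell with both of its neighbors.

    cells: array of shape (n_cells, height, width) holding images of
    nominally identical adjacent cells. A pixel in cell k is flagged only if
    it differs from cell k-1 AND from cell k+1 by more than the threshold,
    i.e., the difference shows up in two consecutive subtractions.
    """
    cells = np.asarray(cells, dtype=float)
    # differs[i] marks where cell i+1 differs from cell i.
    differs = np.abs(np.diff(cells, axis=0)) > threshold
    flags = np.zeros(cells.shape, dtype=bool)
    # The first and last cells have only one neighbor and cannot be arbitrated.
    flags[1:-1] = differs[:-1] & differs[1:]
    return flags

# Four copies of a hypothetical 5 x 5 cell pattern, with one defect added.
base = np.arange(25, dtype=float).reshape(5, 5)
cells = np.repeat(base[None, :, :], 4, axis=0)
cells[2, 1, 3] += 50.0  # simulated defect in the third cell

flags = flag_defects(cells, threshold=10.0)
print(np.argwhere(flags))  # (cell index, row, column) of each flagged pixel
```

Requiring agreement between two subtractions is what assigns the difference to one particular cell: a single subtraction shows that two cells differ but not which of them carries the defect.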
The main coating-process-induced defects are particle fall-ons from fab traffic, splash-back circular defects, and resist lumps (see developer example below). The experimental screening results for coat uniformity indicated that exhaust levels in the coat module during coating and the dynamic resist dispense arm program (i.e., how the resist dispense arm traverses the wafer during resist dispense) were the variables controlling within-wafer resist thickness variation. The within-wafer family of variation made a large contribution to the total variation in the baseline multivari problem definition phase, with the day-to-day family being the largest contributor. At this point, it seemed plausible that both of these large contributors could be reduced by improving the dispense process of the coat module, and this later proved to be the case. The screen design results for spin induced
defectivity levels indicated that exhaust, dispense acceleration rate in krpm/sec, and dispense speed (krpm) were controlling variables. The RSM results for coat uniformity and defects are overlaid in Fig. 24. Notice that there is a classic saddle point optimum for defect level that does not incur a large thickness variation in resist coating uniformity. Also note that the resist thickness uniformity contours (dashed lines) increase in value to the right. Hence, the RS1-determined operating minimum at the upper left represents a good operating compromise for both responses. The three-dimensional RSM for total coat defects is found in Fig. 25, where the minimized defect level predicted by the model, with an 86% fit, is ~0.84 defects per wafer, or roughly 1. This level of defectivity is less than our 0.1 defects/cm² goal and, if confirmed by a multivari verification experiment, represents an improvement in defectivity levels. Results from the optimization verification multivari yielded 1.6 defects/wafer vs 4.3 for the old process, an improvement by a factor of 2.7, with coat uniformity improving to 48 Å from ~150 Å, or roughly a threefold improvement. The new coat program improved the lot-to-lot or day-to-day variation from 45 Å to 12 Å and the within-wafer variation from 14 Å to ~9 Å. Therefore, the optimized process conditions improved both the coating capability and, more importantly, the defectivity level of the process, as the empirical RSM modeling predicted.
Coater Module Edge Bead Process Improvement. After wafer resist coating, the wafer backside and edge must be rinsed with an organic spinning solvent to remove traces of resist. Such resist traces can cause defects during handling or stepper wafer chucking operations. Resist from the wafer must not come into contact with any wafer transfer arms or wafer chucking surfaces; otherwise, particulate defects and/or stepper local site focus errors may result.
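The response surface step used for the coat process (fit an empirical second-order model, then locate and classify its stationary point) can be sketched with ordinary least squares. This sketch stands in for the RS1 analysis and is not a reproduction of it; the coded factor levels, the "true" coefficients used to generate the data, and all function names are hypothetical.

```python
import numpy as np

def quadratic_terms(x1, x2):
    """Model matrix for y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])

def fit_response_surface(x1, x2, y):
    """Least-squares estimate of the six second-order model coefficients."""
    coeffs, *_ = np.linalg.lstsq(quadratic_terms(x1, x2), y, rcond=None)
    return coeffs

def stationary_point(coeffs):
    """Solve grad(y) = 0 and classify the point from the Hessian eigenvalues."""
    _, b1, b2, b11, b22, b12 = coeffs
    hessian = np.array([[2 * b11, b12], [b12, 2 * b22]])
    point = np.linalg.solve(hessian, -np.array([b1, b2]))
    eigenvalues = np.linalg.eigvalsh(hessian)
    if np.all(eigenvalues > 0):
        kind = "minimum"
    elif np.all(eigenvalues < 0):
        kind = "maximum"
    else:
        kind = "saddle"
    return point, kind

# Hypothetical three-level design in coded units (-1, 0, +1) for two factors,
# e.g., exhaust and dispense acceleration; responses come from a made-up surface.
levels = np.array([-1.0, 0.0, 1.0])
x1, x2 = [grid.ravel() for grid in np.meshgrid(levels, levels)]
true_coeffs = np.array([2.0, 0.5, -0.3, 1.0, -0.8, 0.4])
y = quadratic_terms(x1, x2) @ true_coeffs

coeffs = fit_response_surface(x1, x2, y)
point, kind = stationary_point(coeffs)
print("stationary point (coded units):", point, "type:", kind)
```

When the stationary point is a saddle, as in this made-up surface, the best operating point inside the experimental region generally lies on its boundary, so in practice the fitted model is searched over the allowed factor ranges rather than relying on the stationary point alone.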
Unfortunately, the process used to prevent lithographic defects can also create them if the edge bead removal (EBR) process is not optimized for minimum or zero EBR-induced defect levels. The two types of EBR defects are shown in Fig. 26. Notice that there are two independent EBR processes, and their defects are distinctly different. The same basic methodology was employed to achieve optimized top and bottom side EBR processes. From screening experiments, it was determined that EBR canister pressure, exhaust flow level, and spin dry speed were significant to bottom side EBR spot prevention, and that canister pressure and spin speed were significant variables for top side EBR spot defect formation. The RS1-generated response surface (note the fit is 92%) is found in Fig. 27. The optimum top EBR operating conditions are
Figure 24. RSM contour maps for the response variables, coat uniformity (dashed lines) and coat defectivity (solid lines), plotted as a function of the control variables, acceleration and exhaust.
Figure 25. Three-dimensional RSM for coat induced defect levels.
Figure 26. Optical micrographs of SVG 8800 wafer track coat EBR spot defects (panels: top side EBR spot; bottom side EBR spot).
Figure 27. Three-dimensional RSM for EBR spot defects. Note, the optimum conditions from the RSM analysis are highlighted to the left of the response surface.
located on the figure, and clearly there is an optimum canister pressure and spin speed condition. At the calculated operating point, the model predicts ~0 defects, i.e., total defect elimination, a very desirable result. Because the RSM model minimization efforts failed to reduce the bottom side EBR spots to zero, bottom side EBR spot levels were improved further by employing an improved cup lid.
Developer Module Example. The develop track module evaluated in this study was a modified system; it was equivalent to the coater module system except for the wafer transport and developer dispense systems. At the beginning, the defect multivari results indicated that this develop module was responsible for an average of 400 defect add-ons per wafer, a defect density of 13 defects/cm² vs. 0.08 for the reference system. The major defect observed was a developer lump type defect, as shown in Fig. 28. The variability of the functional test photo speed values (often called Eo values) was also high at 9.6 mJ/cm² (3σ), while the CD performance was at ±0.11 microns, a more reasonable value. Obviously, the defect density and the photo speed reproducibility were very poor for this module, but the wide exposure latitude of the resist process kept the impact of photo speed variability on CD control to a minimum. From the multivaris, the largest family of variation for CD control was within-wafer, or CD site-to-site. For photo speed and defects, both the wafer-to-wafer and day-to-day families were large. Consequently, only minor changes to the developer process were needed for CD control, but a major process change was needed for defects and Eo. Improving these poor developer module numbers required several screening experiments. The controlling variables detected were the developer wetting method for defects, and puddle spray time and pre-wet time for CD control.
By running a 2-factor central composite design, the two CD variables were found to interact with 92% confidence, which required both to be at relatively high levels for best results. Defect density was most influenced by the pre-wet method: a water rinse/developer combination spray at pre-wet, together with a new development program, reduced defect levels to the 0.1 defect/cm² goal level, a dramatic result. CD control improved to 0.04 microns, 3σ, presumably due to the water/developer interrupt contrast enhancement effect. This was accompanied by a 130x reduction in defect levels (see Fig. 29) and almost a 50% improvement in photo speed reproducibility in the functional test (see the Eo SPC data in Fig. 30).
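For readers unfamiliar with the design type, a two-factor central composite design can be generated as in the sketch below. The rotatable axial distance, the number of center points, and the real-unit ranges for puddle spray time and pre-wet time are assumptions chosen for illustration; the settings actually used in the study are not given in the text.

```python
import itertools
import numpy as np

def central_composite(k=2, n_center=3):
    """Coded points of a rotatable central composite design for k factors."""
    alpha = (2 ** k) ** 0.25  # rotatable axial distance; sqrt(2) for k = 2
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    axial = np.array(
        [sign * alpha * np.eye(k)[i] for i in range(k) for sign in (-1.0, 1.0)]
    )
    center = np.zeros((n_center, k))
    return np.vstack([factorial, axial, center])

def to_real_units(coded, centers, half_ranges):
    """Map coded levels to machine settings: center + coded * half-range."""
    return np.asarray(centers) + coded * np.asarray(half_ranges)

design = central_composite(k=2, n_center=3)

# Hypothetical settings: puddle spray time 6 +/- 2 s, pre-wet time 3 +/- 1 s.
settings = to_real_units(design, centers=[6.0, 3.0], half_ranges=[2.0, 1.0])
for coded, real in zip(design, settings):
    print(f"coded {coded}  ->  spray {real[0]:.2f} s, pre-wet {real[1]:.2f} s")
```

The factorial corners estimate the main effects and the two-factor interaction, while the axial and center points supply the extra levels needed to estimate curvature.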
The systematic methodology for general process optimization can be effectively employed for defect level minimization with reasonable empirical modeling coefficients (i.e. 85–92%). The methodology has been successfully applied to several track process module examples. Dramatically improved coating and defectivity performance levels have been achieved for the track module systems evaluated.
Figure 28. KLA 2111 generated optical micrographs showing several examples of photo process induced lump defects.
Figure 29. KLA 2111 printouts of developer module defect levels (a) before and (b) after the systematic defectivity level reduction.
Figure 30. Eo SPC charts for the optimized developer module process. Note, after the process was optimized, the Eo variability is roughly 2X smaller and the data is centered better within the specification range.
REFERENCES
1. Cusson, B., and Emani, I., “Applications of the KLA2111 as an In-line Monitor,” SEMI Japan Technology Symposium Proceedings (1992)
2. Wang, P., Lee, F., Chan, K. M., Goodner, R., and Ceton, R., “Yield Enhancement in a High-Volume 8" Wafer Fab, Part II: Yield Enhancement Programs,” Semiconductor International, pp. 217–222 (July 1996)
3. Schward, R., and Sherman, R., “Processing and Machine Mastering Employing WF-710 Wafer Inspection System,” Metrology, Inspection, and Process Control for Microlithography X, SPIE Proceedings, 2725:242–254 (1996)
4. Lee, F., “Defect Metrology for the 21st Century,” Future Fab, 6:239–244
5. Wang, P., Lee, F., Chan, K. M., Goodner, R., and Ceton, R., “Yield Enhancement in a High-Volume 8" Wafer Fab, Part I: Inspection Equipment Selection,” Semiconductor International, pp. 221–226 (June 1996)
6. Ceton, R., Goodner, R., Lee, F., and Wang, P., “Comparison of Patterned Wafer Defect Detection Tools for General In-Line Monitor,” IEEE/SEMI ASMC Proceedings, pp. 92–99 (1996)
7. Dickerson, G., and Wallace, R., “In-Line Wafer Inspection Using 100 Megapixel per Second Digital Processing Technology,” SPIE Proceedings, 1464 (1991)
8. Stokowski, S., and Vaez-Iravani, M., “Wafer Inspection Technology Challenges for ULSI Manufacturing,” International Conference on Characterization and Metrology for ULSI Technology, NIST (March 23–27, 1998)
9. Skumanich, A., “Advanced Wafer Defect Detection for CMP Process Development,” Semiconductor European, pp. 33–36 (March 1996)
10. Larson, C. T., Gross, K. P., and Stokowski, S. E., “Noise Sources and their Influence on Surface Particle Detection,” ASTM Conference Proceedings, San Jose (Sept. 22–23)
11. Wolf, S., and Tauber, R. N., Silicon Processing for the VLSI Era, Vol. 1, Lattice Press (1986)
12. Brown, M., John, J., DeCoursey, R., Malhotra, S., Lee, F., and Helbert, J., “Photolithographic Process Defect Density Minimization Using Response Surface Experimental Designs and Modeling,” Microcontamination Conference Proceedings, pp. 330–337 (1993)
4
Techniques and Tools for Photo Metrology
Arnold Yanof
Motorola, Inc. Chandler, Arizona
1.0 INTRODUCTION
A series of nested control loops regulates the performance of the photo-etch module. The smallest loops in this control system electronically stabilize individual track, stepper, and etch equipment components, for example, the spin motor speed, the stepper lamp intensity, and the rf power input. Intermediate-level feedback control of an entire system is also employed. Etch endpoint detection, for example, fine-tunes the etch process individually on every wafer. Similarly, photoresist thickness has a very sensitive influence on the exposure process because of optical interference effects, so a tight control loop is placed around the coat process by monitoring resist film thickness. The highest loop level is final lithography output control. Imaging metrology tools make it possible to gauge photolithographic output in-line at the completion of the patterning and etching module. Such final measurements of critical dimension (CD) and overlay not only impose process control over the entire module, but also determine whether product meets specifications. This chapter discusses the CD scanning electron microscope (CD-SEM) and the optical overlay tools which today provide the most
important final measurements for lithographic control. Electrical CD measurement is also discussed, because of its growing importance as a CD control method in cutting edge microprocessor fabrication. This chapter also covers optical metrology tools for film thickness control. The final section of the chapter contains a guide to the statistical interpretation of metrology data.
2.0 CD SCANNING ELECTRON MICROSCOPE (CD-SEM)
The line width is a single-number characterization of an integrated circuit lateral feature dimension. This was a well-defined measurement about twenty years ago, when lateral dimensions were an order of magnitude greater than the corresponding film thicknesses. Today, electrical features often have a high aspect ratio (greater height than width). Line width today is not well defined, because the features to be measured very rarely have a precisely rectangular cross section. The purpose of the CD-SEM is still, however, to gain from a top-down scan or image sufficient information about a complex cross-sectional shape to predict electrical device performance. This has become a difficult task with multiple requirements. The scan or image must have high resolution to represent details of the top surface, the side wall, the foot, and any residual material in between features. The single-number metrology output must be highly reproducible over time and among multiple tools within the fab. The CD result must respond linearly to process changes which affect device electrical performance, but it should be insensitive to process changes that have no electrical effect. The offset between the CD-SEM measurement output and the line width corresponding to electrical properties (for example, the conductance of a metallic connection) has become significant. Offsets, therefore, must be constant, so that offset variation does not become an excessive source of error. This section briefly introduces current equipment and then discusses the characteristics and limitations of the CD-SEM image. The discussion then turns to the validity of the single-number output of CD measurement.
2.1 Basic CD-SEM Equipment and Measurement
The CD-SEM consists basically of an ultrahigh vacuum electron column which produces a tiny electron probe on the wafer surface; an accurate stage for locating the features to be measured on the wafer; a
detector which collects the electrons that arise from the impact of the probe upon the wafer surface (called “secondary” electrons or “SE”); and a sophisticated computer for interpreting the images and controlling the entire system. As with any scanning probe, the beam probe is scanned across the area to be imaged or measured, and the detector signal is recorded and/or displayed as a function of the (x, y) position of the beam. Figure 1 shows a schematic diagram of the CD-SEM.
Figure 1. Schematic diagram of a CD-SEM. (Labeled components: recipe management, system control, pattern recognition, and signal analysis computer; image CRT; anode voltage system; ion pump; field emission gun; scan amplification; condenser lens; signal amplification; optical alignment microscope; objective lens; stage; detectors; vibration isolation system; wafer pre-align/loadlock; robot; cassette; connection to pumps.)
Several aspects distinguish the CD-SEM from the analytical SEM which preceded it: (a) The CD-SEM automates the entire measurement process, minimizing the operator component of variation. This includes the use of sophisticated pattern recognition to locate the measurement feature and to position it properly and repeatably within the scan. It also requires automatic focusing. (b) Highly stable beam control reduces the beam-setup component of measurement variation. (c) Sophisticated wafer handling across the air-vacuum interface minimizes the pump-down contribution to the time required for a measurement. (d) Low beam energy reduces radiation damage to the semiconductor device, and provides an excellent image of the surface of low-Z semiconductor materials such as photoresist and silicon. (e) Low beam current together with high efficiency detection reduce image distortions due to charging and radiation
damage. (f) There is high immunity to the fab environment, including mechanical vibration, air flow and acoustic noise, rf and electromagnetic fields from etch and diffusion equipment, respectively, as well as power line noise and ground loops.
CD-SEM Individual Components. The Electron Gun. SEM performance depends on the small size, high brightness, and low energy spread of the electron beam source.[1][2] The electron source has advanced dramatically during the semiconductor integrated circuit era.[3] Initially, thermionic tungsten hairpin filaments[4] were used as the electron source. At 3000°C, the energy spread is 1–2 eV, leading to chromatic aberration, since the focus depends upon electron energy. The brightness is ~3 A/cm². A better source material such as LaB6 can produce ten times the brightness at 0.5–1.0 eV energy spread, because its work function is 40% smaller than that of tungsten and its operating temperature is considerably lower. For either tungsten or LaB6, a high voltage extractor makes the effective source spot size ~10 microns.
The modern CD-SEM uses a smaller, much brighter, and lower energy-spread source called a field-emission (FE) gun. A tungsten point tip with a radius of 10–100 nm emits electrons by quantum tunneling. The small point radius enhances the electric field at the surface of the tip, resulting in a very narrow tunneling gap. The tip may either be cold or may operate at 1800 K to provide thermally assisted tunneling. In the cold tip, the energy spread is very low: 0.2–0.5 eV. A current density of 2 × 10⁵ A/cm² is achieved from a tip of radius 100 nm. In a heated FE gun, the energy spread is somewhat broadened to 0.5–0.7 eV. The advantage of heating is that no contaminating gas molecules adsorb onto the tip, so that the smallest radius section of the tip supplies all the current.
In the cold FE gun, adhering gas continually degrades the current during operation over several hours, and it is necessary to “flash” the tip, i.e., cycle the tip to a high temperature periodically. The SEM gun consists of a field emission source and several gun electrodes for extracting the beam from the gun area. The gun must work in an environment of 10⁻¹⁰ torr. Down the column from the gun may be one or more demagnifying condenser lenses separated by apertures ~10 microns in diameter, and a final demagnifying lens with a final aperture of diameter ~100 microns. The apertures restrict electrons to the low-aberration core of the optics, block stray electrons, and also serve as vacuum separators permitting differential pumping. The distance between the final aperture and the sample is less than 10 mm.
The Electron Detector. Conventional secondary electron detectors for CD-SEMs are placed between the wafer and the final objective lens from which the beam emerges. A great deal of ingenuity has gone into low-profile detectors, since close detector proximity to the beam spot is needed for detector efficiency, whereas a short working distance is needed for the best resolution. An example of a rather bulky detector is the Everhart-Thornley type. This is made up of an accelerating grid, scintillating material which emits light when struck by an electron, and a photomultiplier tube. Some CD-SEMs have four such detectors arranged to cover four quadrants. Two opposing detectors are used for measuring vertical lines, two detectors are used for measuring horizontal lines. A small electric field sweeps low energy secondary electrons from the impact area of the beam to the detectors. Another detector that has been used in line width measuring systems is a multichannel plate. This detector has small thickness, so it can be placed directly over the impact region of the primary electron beam. This results in a symmetrical image. A disadvantage of the multichannel plate is a limited lifetime. The degradation mechanism may be polymer deposition in the CD-SEM sample chamber due to the volatile components of photoresist and other semiconductor processing materials. Many of the most advanced CD-SEMs avoid the trade-off between detector proximity and working distance by using a through-the-lens detection method. One configuration is depicted in Fig. 2.[4] In the above type of system, the secondary electrons spiral back up through the bore of the final magnetic lens of the CD-SEM. The physics of this technique is based on the cyclotron phenomenon in plasma physics. The Lorentz force on an electron in a magnetic field acts perpendicular to both the velocity of the electron and the magnetic field vector. (See Fig. 3.) 
Therefore, the moving electron is forced to go in a circular orbit or to move in a helical path around a bundle of magnetic field lines. In a magnetic lens, the magnetic field lines form a bundle parallel to the beam in the bore of the lens. In a through-the-lens system, the low energy electrons from the wafer are accelerated back toward the pole face by an electric field. As they approach the pole face each electron is guided by, and spirals around, a continuous bundle of magnetic field lines converging into the bore. Above the magnetic objective lens, the electrons can be attracted by a small electric field to a detector offset from the primary electron beam axis.
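The scale of the helical motion described above can be estimated from the standard cyclotron relations, r = m v⊥ / (eB) and f = eB / (2πm). The sketch below evaluates them for a secondary electron; the 5 eV transverse energy and the 0.1 T field are assumed, illustrative values and are not specifications of any particular lens.

```python
import math

M_E = 9.1093837015e-31  # electron mass, kg
Q_E = 1.602176634e-19   # elementary charge, C

def cyclotron_radius_m(perp_energy_ev, b_tesla):
    """Radius of the circular orbit for a given transverse kinetic energy."""
    v_perp = math.sqrt(2.0 * perp_energy_ev * Q_E / M_E)  # non-relativistic speed
    return M_E * v_perp / (Q_E * b_tesla)

def cyclotron_frequency_hz(b_tesla):
    """Orbit frequency; independent of the electron energy at these speeds."""
    return Q_E * b_tesla / (2.0 * math.pi * M_E)

# Assumed example: 5 eV of transverse energy in a 0.1 T field.
radius = cyclotron_radius_m(5.0, 0.1)
frequency = cyclotron_frequency_hz(0.1)
print(f"orbit radius ~ {radius * 1e6:.0f} micrometers")
print(f"cyclotron frequency ~ {frequency / 1e9:.1f} GHz")
```

Under these assumptions the orbit radius is a small fraction of a millimeter, which is consistent with the picture of slow secondaries staying bound to a narrow bundle of field lines as they travel up the bore.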
Figure 2. Through-the-lens detection system schematic, depicting also the sample feature and the expected line scan signal.
Figure 3. The Lorentz force on an electron traveling in a region of magnetic field causes the electron to be accelerated in a direction perpendicular to both the magnetic field and the electron velocity vector. (Diagram labels: instantaneous velocity; electron trajectory; force; magnetic field, B.)
Through-the-lens detection permits a short working distance. The location of the detector reduces electromagnetic interference. Through-the-lens detection also discriminates against secondary electrons which originate far from the beam spot. Secondaries produced by the primary incident beam are called SE1 and contain the high resolution sample information. Secondaries produced by backscattered electrons exiting the sample (SE2) or impinging on the pole face (SE3) contain low quality information. Thus, through-the-lens detection improves the image quality for several reasons.
A quadrant type of through-the-lens detection is schematically represented in Fig. 4.[5] Here, an electrostatic lens is placed beneath the magnetic pole piece. This particular field configuration preserves the angular relationship between different portions of the image. Hence, a four-quadrant in-lens detector can again be used, like the four-quadrant Everhart-Thornley detector described above, to enhance topographic and orientation contrast. (The magnetic/electrostatic compound lens is also discussed in reference to Fig. 7.)
Figure 4. Combination through-the-lens and quadrant detector configuration. The electromagnetic objective lens preserves the SE emission direction from the target, providing orientation contrast.
2.2 Characteristics and Limitations of Low Voltage SEM Imaging and Metrology
The usual definition of resolution is the minimum discernible separation between sample features under optimum SEM conditions. Current state-of-the-art CD-SEMs quote resolution in the range of ~4 nm. Resolution is complicated both in terms of the physics and in terms of the concept. From the physics aspect, the resolution depends on the actual width of and current in the beam, the interaction of the beam with the sample, and sample changes such as degradation and charging brought about by the beam. The resolution is always quoted for an optimal sample material with tiny artifacts, such as titanium spheres or gold-coated carbon. The resolution is considerably less favorable on low atomic number materials, low bonding energy materials, and low conductivity materials. Resolution is always much poorer than the quoted specifications on the most important material to the photo-metrologist: photoresist. Some metrologists advocate a novel definition of resolution: the minimum detectable line width change on resist or etched structures.[6] This latter concept of resolution is actually line width sensitivity, and it is clearly an important parameter in practice.
Electron beam width in an ordinary SEM can be determined directly by scanning the focused beam across a knife edge connected to a picoammeter. A related practical approach for CD-SEMs is to evaluate beam sharpness in terms of an “apparent beam width.”[7] Here, the acuity of the beam is gauged by scanning a photoresist or other available semiconductor processing feature having nearly ideal 90° side walls, as determined by cross-section SEMs or by atomic force microscope (AFM). The apparent beam width is the rise-distance of the bright image of the feature edge in the CD-SEM scan. Figure 5 illustrates a typical scan on an appropriate resist feature and indicates the apparent beam width calculation.[7]
Beamwidth Contributions from Diffraction, Source Width and Lens Aberrations.
Because the FE source-size is already so small, a single magnetic lens is sufficient to demagnify and project the source down to the final probe on the sample. Even under conditions of best focus and astigmatism, the demagnification process is perfect only for very few electrons with trajectories precisely on the axis of the beam. In order to get enough electrons into the probe spot for good signal-to-noise, the lens must focus a converging cone of electrons onto the sample. As the cone angle, α, widens (effected by increasing the size of the final aperture), more and more current is brought to the sample, which improves the signal-to-noise of the
image. Furthermore, diffraction, which is significant for low voltage electron beams, is reduced by having a larger aperture. Such is the case for any optical system. As the aperture widens, however, spherical aberration and chromatic aberration increase the spot size (see Fig. 6). Thus, there is an optimum cone angle, α, giving sufficient signal-to-noise and acceptable spot size.
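The trade-off that produces an optimum cone angle can be made concrete by sweeping α through the diffraction and aberration expressions given in Eqs. (1)–(3) of this section. The sketch below is only an order-of-magnitude illustration: it takes Cs = 2 cm and E = 600 eV from the text, but the values Cc = 5 mm (read from "a few millimeters") and ΔE = 0.3 eV are assumptions, and the geometric term of Eq. (4) is left out.

```python
import numpy as np

H = 6.62607015e-34      # Planck constant, J s
M_E = 9.1093837015e-31  # electron mass, kg
Q_E = 1.602176634e-19   # elementary charge, C

def electron_wavelength_nm(energy_ev):
    """Non-relativistic de Broglie wavelength, lambda = h / p."""
    momentum = np.sqrt(2.0 * M_E * Q_E * energy_ev)
    return H / momentum * 1e9

def probe_terms_nm(alpha, energy_ev=600.0, cs_nm=2.0e7, cc_nm=5.0e6, delta_e_ev=0.3):
    """Diffraction, spherical, and chromatic contributions to the probe size.

    cc_nm and delta_e_ev are assumed values; cs_nm is the 2 cm quoted in the text.
    """
    lam = electron_wavelength_nm(energy_ev)
    d_diffraction = 1.22 * lam / alpha
    d_spherical = 0.5 * cs_nm * alpha**3
    d_chromatic = 0.5 * cc_nm * alpha * delta_e_ev / energy_ev
    return d_diffraction, d_spherical, d_chromatic

# Sweep the cone half-angle and add the three terms in quadrature.
alphas = np.linspace(0.002, 0.02, 1801)
d_d, d_s, d_c = probe_terms_nm(alphas)
d_total = np.sqrt(d_d**2 + d_s**2 + d_c**2)
best = int(np.argmin(d_total))
print(f"wavelength at 600 eV: {electron_wavelength_nm(600.0) * 10:.2f} angstrom")
print(f"optimum half-angle ~ {alphas[best]:.4f} rad, probe size ~ {d_total[best]:.1f} nm")
```

Diffraction dominates at small α and the aberrations dominate at large α, so the quadrature sum passes through a minimum in between; the exact position of that minimum depends strongly on the assumed Cc and ΔE.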
Figure 5. Apparent beam width determination. The graph depicts the sum of several raw line scans across a photoresist feature having nearly ideal, vertical side walls. The outside distance at the base, less the distance between signal peaks, is equal to the apparent beam width.
Figure 6. Spherical and chromatic aberration arising from magnetic lenses. [8]
The beam diameter, d, is given by $d^2 = d_d^2 + d_s^2 + d_c^2 + d_g^2$. The physics and order of magnitude of the probe size components on the right-hand side of this equation are described below:
a) Diffraction term, $d_d$. The de Broglie relation gives the wavelength of the electron as $\lambda = h/p$, where h is Planck’s constant and p is the momentum of the electron. As the energy of the probe is decreased (e.g., to avoid charging a resist sample), the wavelength increases, and diffraction becomes a factor in the beamwidth. As for any optical system, the Airy diffraction disk diameter is given by
Eq. (1)
$$d_d = \frac{1.22\,\lambda}{\alpha}$$
For a 600 eV electron energy, the wavelength is 0.5 Å. A typical SEM aperture has a cone of half-angle 0.01 radians, so $d_d$ ~6 nm.
b) Spherical aberration term, $d_s$. The spherical aberration of the magnetic lens increases rapidly with the aperture size and the beam current. The spherical aberration is given by
Eq. (2)
$$d_s = \tfrac{1}{2}\, C_s\, \alpha^3$$
where $C_s$ is the spherical aberration coefficient. The “pinhole” type[9] magnetic lens has $C_s$ = 2 cm. For typical α = 0.01, the spherical aberration is $d_s$ ~10 nm.
c) Chromatic aberration term, $d_c$. This is the most important term in the low voltage SEM beam size. Chromatic aberration is due to the spread ΔE in energy of electrons in the beam. The problem is that the magnetic lens does not bring electrons of differing energy to the same focus. The chromatic aberration beam width is
Eq. (3)
$$d_c = \tfrac{1}{2}\, C_c\, \alpha\, \frac{\Delta E}{E}$$
where $C_c$ is the chromatic aberration coefficient. For the low voltage SEM, the FE or TFE sources are essential to obtain a small spread in energy ΔE, as described in the above section on the electron gun. Conventional magnetic objective lenses have a field distribution with inherent chromatic and spherical aberrations. For such an FE-SEM at E = 600 eV, $C_c$ is a few millimeters and $d_c$ ~5 nm.[10]
d) Minimum geometric size, $d_g$. The sample probe must be large enough to contain sufficient current for good image signal-to-noise. The probe size $d_g$ needed to contain probe current $i_p$ from a source of brightness β is given by geometrical optics:
Eq. (4)
$$d_g = \sqrt{\frac{4\, i_p}{\beta\, \pi^2 \alpha^2}}$$
The extremely bright FE source gives β = 2 × 10⁹ A/cm²-sr. For efficient detectors, a 5 pA beam current gives good signal-to-noise. Under these conditions $d_g$ = 2.5 nm.
Resolution Improvements in State-of-the-Art CD-SEM. As discussed in the above section on beamwidth contributions, objective lens aberrations add in quadrature to place a limit on the resolution of a CD-SEM that uses a conventional magnetic “pinhole” lens. The important chromatic aberration coefficient, $C_c$, is proportional to the working distance. For small samples, the sample can be located inside the magnetic objective to reduce working distance and aberrations.[11] This magnetic “immersion” objective is inappropriate for semiconductor wafer metrology. A compound objective lens consisting of a magnetic lens followed by an electrostatic lens offers reduced effective working distance in a configuration compatible with wafer metrology. Some state-of-the-art CD-SEMs capitalize on improved chromatic aberration using a compound objective. One such configuration is depicted[12] in Fig. 7. This lens achieves $C_c$ = 0.8 mm at 600 volts beam energy. At the optimum α = 0.0087 radians, the resulting beam diameter is d = 4.9 nm.
Electron Range Limits on Image and Measurement Scan Acuity. The SEM image characteristics depend upon the range of incident beam electrons, which in turn depends upon electron energy and the sample material. Ideally, if the interaction volume between the primary incident beam and the sample were very small, the resolution would be determined only by the size of the primary beam. Because of the straggling depicted in Fig. 8, the scan samples the material, topography, etc., of a broader region than just the impact area of the beam. The image or measurement scan shape depends, therefore, in a complex way upon the range of the primary beam and the secondaries, and upon the shape and surface topography of the sample. The range, in turn, depends upon the sample material composition.
11/30/00
JMR
Techniques and Tools For Photo Metrology
393
Figure 7. One configuration of compound magnetic/electrostatic objective lens used in state-of-the-art CD-SEMs for high resolution. The electric field at the wafer also extracts secondary electrons from high aspect ratio vias and spaces.
Figure 8. Monte Carlo simulation of electron range versus initial energy for a carbon target of density 1 g/cm³.[13]
Sample Material Dependence of Incident Electron Range. Figure 8 shows a Monte Carlo simulation[13] of electron range versus initial energy for a carbon target of density 1 g/cm³. For energies higher than ~40 eV, the electron range through the solid target increases with increasing energy. The electron undergoes both elastic and inelastic (energy dissipative) collisions along a tortuous path. (Below ~40 eV, the range theoretically increases; however, such low energy electrons do not result in significant SE, which need to be at least 50 eV to exit the sample and reach the detector.) The simulation for resist material[14] depicted in Fig. 9 shows straggling in resist. The depth of the straggling shown is in approximate agreement with the graph in Fig. 8. The straggling in silicon and in tungsten is shown in Figs. 10 and 11, respectively. The higher electron density associated with higher atomic number, Z, results in progressively shorter distances between the collisions that change the trajectory. To image resist clearly requires the CD-SEM beam energy to be much less than 1 keV to avoid a loss of detail. Any higher energy results in a rounded appearance of corners, blurring of sidewall artifacts such as standing waves, and invisibility of resist webbing between lines.

[Figure 9 panels: resist target at 2.0, 1.0, 0.8, and 0.6 keV; scale bar 250 nm]
Figure 9. Monte Carlo simulation[14] of electron straggling in a resist target versus incident beam energy.
[Figure 10 panels: silicon target at 2.0, 1.0, 0.8, and 0.6 keV; scale bar 250 nm]
Figure 10. Monte Carlo simulation[14] of electron straggling in a silicon target versus incident beam energy.
[Figure 11 panels: tungsten target at 2.0 and 0.6 keV; scale bar 250 nm]
Figure 11. Monte Carlo simulation[14] of electron straggling in a tungsten target versus incident beam energy.
Contrast in CD-SEM Images. No matter how small the beam diameter, nor how well localized the electron scattering, successful imaging and CD measurement depend upon contrast to define sample features. This section touches upon contrast in low voltage CD-SEM images. Low voltage SEM images based on secondary electron (SE) detection do not show much contrast between clean, smooth planar materials oriented normal to the beam. The tilt of feature surfaces, however, produces significant contrast. When the beam enters a sharply tilted surface, the straggling paths are close to the surface and scatter more electrons with sufficient energy to escape. Directional orientation of a surface also produces contrast, provided the detection scheme is asymmetrical. This has already been mentioned in connection with split detectors (see the above section on “The Electron Detector”). A steeply sloping edge emits secondaries more strongly in the direction it faces, and protruding sample features block visibility to the opposing detector. The texture of surfaces also can contribute significant contrast. The exact surface condition, or surface contamination, can affect brightness more than the underlying bulk material, since low voltage SEM electrons stop within thin surface layers (see the above section on “Electron Range Limits”). Back-scattered electron (BSE) detection mode in the low voltage SEM offers somewhat more material contrast than SE detection. High-Z materials are much more likely than low-Z materials to scatter electrons back up in the direction of the beam with little loss in energy.
Edge Contrast. The edge effect for SE (both SE1 and SE2) is enhanced secondary emission at a convex corner. There are two locales for emission near a convex corner: (a) at the entry of the beam into the top of the feature and (b) out the adjacent sidewall of the feature. The yield of secondaries is thus enhanced by a factor of ~2.[15] A typical CD-SEM image of a photoresist, polysilicon, or metal line displays extremely bright edges along the feature sidewalls owing to edge contrast. Figure 12 shows the Monte Carlo simulation[14] of an SE image of a 0.2 micron resist feature on silicon, at various working SEM energies. At 0.8 keV, there is a ~2X increase in SE signal at the top resist corner. At this incident energy, the beam penetrates rather far into the resist. Many secondaries are produced well below the top surface and cannot escape out the top. At the top corner, however, the proximity of the sidewall allows a significant number of these secondaries to escape out the sidewall of the feature. At higher energies, the corner effect is greater, because the range of the beam is greater and secondaries are formed even deeper into the resist. Note that there is a corresponding reduction in secondary production at the concave corner where the resist edge meets the silicon, because many secondaries produced in the silicon are captured in the resist and prevented from escaping by the enclosed geometry.
[Figure 12 plot: total SE (relative) versus X (microns) across the edge of a 0.2 micron resist feature on silicon; one curve per e-beam energy: 0.5, 0.6, 0.8, 1, and 2 keV]
Figure 12. Line scans across the edge of a photoresist feature on silicon. The graph shows a family of scans with beam energy as parameter. Enhanced edge contrast causes the secondary electron peak. Scans were simulated by Monte Carlo calculation.[14]
Edge effects are extremely important for CD metrology: the edge response provides the most important part of the signal contrast and a characteristic signature that can be mathematically identified by the CD-SEM computer.

Texture Contrast. Finely textured surfaces, such as resist sidewalls showing a standing wave effect, or etch “grass,” emit SE very efficiently, increasing the bright appearance of the textured surface. Figure 13 shows the Monte Carlo simulation[14] of an SE image of a very small feature. The 20 nm feature gives the brightest image for a 0.8 keV incident beam. Here, the effect of finer geometry is to enhance the secondary production by making multiple sidewall surfaces, in addition to the top surface, available for the escape of secondaries. In a collection of “grass” features, a BSE may pass through a number of surfaces before finally exiting the sample. Secondaries are abundantly produced at each BSE entry and exit from the solid.
Figure 13. Line scans across a very narrow photoresist line on silicon. SE Scans were simulated by Monte Carlo calculation.[14] The graph shows a family of scans with feature width as parameter. At 20 nm width, high SE production results from multiple SE escape paths. Similarly, a finely textured surface has multiple SE escape paths and a bright image.
Material Contrast from SE. The efficiency for SE production is called δ. This is the number of SE ultimately produced by a single incoming primary electron. Material contrast arises because different clean, flat material surfaces produce SE with a different efficiency, δ. Figure 14 shows that vastly different materials exhibit only modest differences in δ.
[Figure 14 plot: secondary electron yields for some IC materials; SE yield (0 to 2.5) versus energy (0.1 to 10 keV) for I-line resist, silicon (average), PETEOS, a-silicon, pure Al (average), silicon dioxide, and titanium]
Figure 14. Secondary electron (SE) yield, δ, for a variety of different integrated circuit materials. Materials with very different Atomic Number, Z, show only modest differences in δ.
The reason for weak material contrast in CD-SEM images is that the SE have low energy and can only arise from collisions that occur near the surface of the material. As shown in Fig. 8, electrons exiting the surface with less than 50 eV energy can penetrate only 10–20 nm through carbon. This surface depth for SE production is known as the escape depth, ΛSE. As Z increases, ΛSE decreases. For higher Z materials, however, the rate of energy deposition by PE or BSE penetrating this skin also increases, as demonstrated by the straggling plots, Figs. 9–11. For increasing Z, the decrease in ΛSE and the increase in the rate of SE production tend to balance out, so that δ is relatively independent of sample material type.

A note on the shape of the yield curve: slightly below 1 keV beam energy, the experimental and theoretical curves for each material show a maximum, as indicated by the data of Fig. 14. This is the energy of maximum δ because the straggling path of the incoming electron approximately equals the SE escape depth ΛSE. The shape of the yield curve is important to the discussion of charging, below.

Material Contrast from Back-Scattered Electrons (BSE). If SE are not aggressively extracted, very few can reach the detector from deep, restricted geometries such as contact holes, narrow resist and etch spaces, etc. In many SEMs, the SE image of the bottom of such important features is essentially black and contains no information. Figure 15 shows compiled data[16] for BSE yield, η, from the same group of IC materials as Fig. 14 above. Note that the material contrast for BSE is somewhat greater than for SE. Higher-Z materials generally appear brighter under BSE detection. Note further that resist has a low BSE yield, η, relative to Si or SiO2. BSE material contrast, therefore, can enhance the number of electrons emitted from the base of resist-patterned and etched contact holes, relative to SE. The use of a detector to capture BSE scattered back up the beam line has been proposed as a method to image and measure contact holes.[17][18] A possible advantage is that the image contrast of BSE is less affected by sample charging. BSE have higher energy than SE, so the sample surface potential has less effect upon their departure and detection.
[Figure 15 plot: BSE yields for some IC materials; yield (0 to 0.3) versus energy (0.1 to 10 keV) for I-line resist, silicon (average), PETEOS, a-silicon, pure Al (average), silicon dioxide, and titanium]
Figure 15. Back-scattered electron (BSE) yield, η, from the same group of IC materials as studied in Fig. 14.
Figure 16 shows the Monte Carlo calculation[14] of the image contrast of a resist space patterned on silicon. In Fig. 16, as the angle subtended by the detector is reduced, the signal from the bottom silicon becomes stronger than that of the resist. The favorable silicon/resist contrast obtained from BSE depends upon BSEs that are scattered back in the direction of the beam line. The same Monte Carlo simulation also shows one disadvantage of BSE for metrology: BSE production is less than 5% of SE production in this example. Because it is built from fewer electrons, the backscattered electron image is noisier than the SE image. Furthermore, in favor of SE detection, recent CD-SEMs use compound magnetic/electrostatic lenses which immerse the sample in an electrostatic field. This field extracts the available SE out of contact holes so that they are adequately imaged by SE. (See above paragraphs on “The Electron Detector” and “Resolution Improvements in State-of-the-Art CD-SEM.”)
[Figure 16 plot: contact hole BSE contrast depends on directionality; silicon/resist BSE contrast ratio (0 to 5) versus angle subtended by back-scatter detector (0 to 90); inset shows a 0.2 um space in 0.5 um resist on silicon, with the detector and detector angle marked]
Figure 16. Back-scattered electron (BSE) contrast for a resist space on silicon, as a function of the angle subtended by the back-scatter detector. Monte Carlo calculation[14] of contrast was used.
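The noise penalty of BSE detection mentioned above can be estimated with a rough worked relation. This is only a sketch: it assumes that the detected signal is limited by Poisson (shot) noise and that both signals are collected with equal efficiency, neither of which is stated in the text. The symbols N_BSE and N_SE (detected electrons per pixel) are introduced here for illustration.

```latex
\mathrm{SNR} \propto \sqrt{N}
\quad\Longrightarrow\quad
\frac{\mathrm{SNR}_{\mathrm{BSE}}}{\mathrm{SNR}_{\mathrm{SE}}}
  = \sqrt{\frac{N_{\mathrm{BSE}}}{N_{\mathrm{SE}}}}
  < \sqrt{0.05} \approx 0.22
```

Under these assumptions, matching the SE signal-to-noise with BSE would take more than 20 times the dose or frame averaging, which is in tension with the need to limit charging and target degradation.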
Charging Effects on Imaging and CD Metrology. When the electron beam impinges on a perfectly clean metallic sample grounded to the SEM chamber, the current is dissipated without developing any appreciable voltage on the sample. Under any other circumstances, charging is likely to occur. Charging causes both bizarre and subtle effects upon images and measurements, due to contrast changes and image distortions.
If electron traps are produced in the sample, the effects dissipate slowly. If damage results from the buildup of charge, the effect can be permanent. Usually, however, the level of charge can change rapidly with changing beam conditions such as beam current, voltage, scan rate, magnification, scan direction, etc. This non-permanence helps to distinguish the effects of charging from the effects of contamination.

Electron Yield Curve Explanation of Charging. Figure 17 shows the total electron yield for silicon when struck by an electron beam of primary energy, PE, ranging from 0.1 to 10 keV. By definition, the total yield is just the sum of secondary yield, δ, and backscattered yield, η. (See above Figs. 14 and 15.) This diagram provides a basis for discussing charging phenomena. Consider the different PE regimes:

E1 < PE < E2: As described in the above section on material contrast, there is a maximum in secondary electron yield when the escape depth ΛSE approximately equals the range of the incoming electron beam. In this situation, the yield is greater than one, and more electrons are leaving the sample than are arriving. The sample moves to a more positive electrical potential, the magnitude of which depends upon the local conductivity and the beam current. This increases the energy of impact in that region.

PE > E2: At impact energies higher than E2, the beam penetrates much deeper than the escape depth ΛSE. Each incoming electron gives rise to fewer than one exiting electron. The local potential tends to become more negative.

PE = E2: At E2, each incoming electron gives rise to one exiting electron. There is no charging at this impact energy. Just below E2, the charging effect increases the impact energy, and just above E2, the charging effect decreases the impact energy. E2 is, therefore, a stable point of operation, which is ideal for avoiding charging effects.
[Figure 17 plot: calculated silicon electron yield; yield, η + δ (0 to 2), versus beam energy, PE (0.1 to 10 keV), with the unity-yield crossover energies E1 and E2 marked]
Figure 17. Total electron yield (η + δ ) as a function of primary beam electron energy.
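The stability of E2 described above can be illustrated numerically. The following is a minimal sketch, not the calculation behind Fig. 17: the yield curve is a hypothetical toy function (its peak value and peak energy are assumptions chosen only so that it crosses unity twice), and the relaxation loop is a simplified charging model in which the surface potential shifts in proportion to the net charge per incoming electron.

```python
import math

# Hypothetical toy yield curve (NOT the calculated silicon curve of Fig. 17):
# total yield = SIGMA_MAX * x * exp(1 - x), with x = E / E_MAX, peaking at E_MAX.
SIGMA_MAX = 1.8   # assumed peak total yield (eta + delta)
E_MAX = 0.4       # assumed energy of peak yield, keV


def total_yield(e_kev):
    x = e_kev / E_MAX
    return SIGMA_MAX * x * math.exp(1.0 - x)


def crossover(lo, hi, tol=1e-9):
    """Bisection for the energy in [lo, hi] where the total yield equals 1."""
    f_lo = total_yield(lo) - 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = total_yield(mid) - 1.0
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


E1 = crossover(0.01, E_MAX)   # yield rises through 1
E2 = crossover(E_MAX, 10.0)   # yield falls through 1


def settle(pe_kev, gain=0.05, steps=2000):
    """Relax the local surface potential until net charging stops.

    Yield > 1 drives the surface positive (higher impact energy);
    yield < 1 drives it negative (lower impact energy).
    """
    potential = 0.0  # shift of the impact energy, in keV
    for _ in range(steps):
        impact = pe_kev + potential
        potential += gain * (total_yield(impact) - 1.0)
    return pe_kev + potential  # final impact energy
```

Starting from a primary energy on either side of E2 (for example `settle(0.8)` and `settle(1.5)`), the impact energy in this toy model drifts to E2. The sketch is only meaningful for primary energies above E1; below E1 the model would run away to ever more negative potential, and real insulators involve conductivity, traps, and detector fields that are not represented here.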
Influence of Charging on Contrast. The discussion above, “Contrast in CD-SEM Images,” indicated that better SE emitters, whether due to surface tilt, material difference, or surface texture, will produce a brighter image than poorer SE emitters. This is true only for grounded, highly conducting samples. If charging occurs, contrast between different portions of the image also depends upon the local electrical potential. E2 is a stable point. Different parts of a low conductivity sample tend towards different local potentials, so that the local impact energy tends toward the local value of E2. The contrast then results from differences in detector efficiency. Positively charged areas have a lower electric field driving SE to the detector and appear dark, even black. Negatively charged areas appear bright or white, so-called “blooming.” Such effects vary with choice of PE, magnification, beam current, and scan rate.

As discussed in the above section on edge contrast, a tilted sample surface produces higher electron yields. Thus the yield curve, Fig. 17, shifts upwards for tilted surfaces. This has been exploited in tilt-stage SEMs to reduce “negative” charging when the beam energy is higher than E2 on the flat surface.[19][20] Figure 18 shows scans of isolated resist lines on silicon under varying conditions of beam current and tilt.[21] In Fig. 18, the high dose scan on the “flat” (top-down) sample with PE > E2 causes a bright resist image due to negative charging. When tilted, as in the middle scan, the same resist sample now has E2 > PE. This results in positive charging, and a darkened resist image. In the third scan, a reduction in dose (beam current) eliminates blooming on a flat sample. The trailing edge (right side) of the third scan, however, exhibits increased brightness due to negative sample charging during the scan.

Figure 19 shows three rotated scans of the same sample of resist on silicon. Charging is responsible for the varying image of the silicon near the ends of the lines. As the beam rasters left-to-right across the sample, the SE production, trajectory, and detection are influenced by charge left behind at previous locations in the scan. Without charging, all three images would be identical, since nothing is changed other than the scan direction.
Figure 18. Scans of isolated resist lines on silicon under varying conditions of beam current and tilt. The beam energy is 1.0 kV.
Figure 19. Evidence of charging effects in a scan rotation.
Effect of Charging on CD Measurement. The changing vertical level of the arrowheads in Fig. 18 above indicates the difficulty of choosing the correct threshold of brightness for determining the line width. Charging alters the contrast or brightness level. This affects the baseline location, which is a fundamental starting point for CD measurement algorithms. Surface charge also bends the actual electron paths, producing a distorted image or measurement scan.[23] Figure 20 illustrates a calculation of the distortion effect for large beam currents. In this model, the beam induces a conducting layer at each surface of the insulator. The electrostatic field due to negative charging deflects the primary beam laterally. In Fig. 20, the CD measured would be smaller than the actual size. The above theory explains only one particular charging effect. In general, however, charging degrades CD-SEM measurement accuracy and reproducibility in a number of ways, most of which are poorly accounted for at the present time.
Figure 20. Illustration of a possible mechanism of image distortion due to negative charging.[23]
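The sensitivity of a measured line width to the chosen brightness threshold can be sketched with a toy line scan. The code below is an illustration under stated assumptions, not the algorithm of any particular CD-SEM: the scan is synthetic (two Gaussian edge peaks on a flat baseline), and the function names, the 50% default style of thresholding, and all numerical parameters are hypothetical.

```python
import math


def synthetic_scan(half_width=100.0, sigma=10.0, step=1.0, span=200.0):
    """Toy line scan: two Gaussian edge peaks on a zero baseline (units: nm)."""
    n = int(round(2 * span / step)) + 1
    xs = [-span + i * step for i in range(n)]
    ys = [math.exp(-(x + half_width) ** 2 / (2 * sigma ** 2))
          + math.exp(-(x - half_width) ** 2 / (2 * sigma ** 2)) for x in xs]
    return xs, ys


def first_crossing(xs, ys, level):
    """Position of the first upward crossing of `level`, by linear interpolation."""
    for i in range(1, len(xs)):
        if ys[i - 1] < level <= ys[i]:
            t = (level - ys[i - 1]) / (ys[i] - ys[i - 1])
            return xs[i - 1] + t * (xs[i] - xs[i - 1])
    raise ValueError("threshold not crossed")


def threshold_width(xs, ys, fraction):
    """Distance between the outermost crossings of
    baseline + fraction * (peak - baseline)."""
    base, peak = min(ys), max(ys)
    level = base + fraction * (peak - base)
    left = first_crossing(xs, ys, level)
    # Mirror the scan so the same helper finds the outer crossing on the right.
    mirrored_xs = [-x for x in reversed(xs)]
    mirrored_ys = list(reversed(ys))
    right = -first_crossing(mirrored_xs, mirrored_ys, level)
    return right - left


xs, ys = synthetic_scan()
widths = {f: threshold_width(xs, ys, f) for f in (0.2, 0.5, 0.8)}
```

On this synthetic profile the reported width shrinks steadily as the threshold fraction is raised, even though the feature itself is unchanged. Because the threshold level is computed from the baseline and the peak, anything that shifts either one, such as the charging-induced brightness changes described above, moves the crossings and therefore the reported CD.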
2.3 CD-SEM Measurement Validity
The fundamental issues in SEM CD metrology are single-tool reproducibility, tool-to-tool reproducibility or matching, and accuracy of the measurements. Statistical process control of single-tool reproducibility is important because when photolithographic or etch processes vary, metrology is always a possible initial suspect. Statistical tool matching is important because any such complex tool will suffer downtime for maintenance or repair, and a certifiably equivalent CD-SEM must be substituted to avoid interruption in the flow of product. Accuracy of measurements would be less important if offsets were absolutely constant. As critical dimensions have decreased, the offsets have become a larger fraction of the measurement (~20% for 0.25 µm technology). Any change in the offset due to process changes such as sidewall profile, surface condition, charging, etc., impacts the reproducibility.

Single Tool CD-SEM Statistical Process Control. A standard metrology tool control chart plots the daily measurement(s) on a “golden wafer” at the same site(s) every day. Statistical process control (SPC) of the CD-SEM is hampered by a lack of repeatability due to sample degradation. This section first discusses the target degradation. An example based upon actual control charts is given to illustrate the standard line width SPC and typical 0.5 micron generation CD-SEM reproducibility performance. Finally, there is a brief discussion of “delta-to-predicted” (DTP) statistical process control. DTP is a more sophisticated chart which compensates for target drift.

Problem of Target Degradation. Degradation can result in measurement feature shrinkage, growth, or both, depending on the number of repeated measurements. Figure 21 shows line broadening after repeated measurements. F. Mizuno et al. have discussed electron beam assisted deposition of hydrocarbons, resist shrinkage due to electron beam induced cross-linking, and charging.[24] The effects have been statistically modeled by K. Monahan et al.[25] They point out that the traditional methods for mitigating such effects are to minimize beam current, sampling time, and magnification. Unfortunately, these methods also reduce signal-to-noise and increase the uncertainty of the measurement. SPC control of the measurement tool relies upon long-term repetition of measurements on the same target. A typical way to circumvent the problem of degradation is to rotate targets. A common method for dealing with degradation on etched targets is to use an O2 plasma clean to remove contamination on a regular basis.[26] W. Keese has suggested the methodology of measuring a different site every day.[27] The SPC wafer contains thirty-one sites; the first site is measured on the first day of the month, the second site on the second day of the month, etc. It then takes ~1 month to establish statistical process control.
Figure 21. Repeated measurements cause substantial increase in measured line width. (See Ref. 29, p. 219)
Example of CD-SEM Single-Tool Statistical Process Control. Figures 22–24 illustrate about nine months of statistical process control on a 0.5 micron generation CD-SEM. The method used in this example was to measure four sites daily, as well as a fifth site for pitch verification. The four sites consisted of two horizontally oriented and two vertically oriented line width targets. In anticipation of target wearout, a large number of sites were measured initially. These initial values became the “target” numbers for checking long-term reproducibility.

[Figure 22 chart: delta to target (−0.02 to 0.02) versus date, 3-Feb to 30-Sep; series: 4 sites avg and sigma DTT; target changes are annotated]
Figure 22. Delta-to-target SPC chart shows target degradation and periodic replacement.
[Figure 23 chart: pitch (2.47 to 2.52) versus date, 3-Feb to 30-Sep]
Figure 23. Pitch SPC plot is a sensitive warning signal against SEM magnification changes.
[Figure 24 chart: stigmation (−0.02 to 0.02) versus date, 3-Feb to 30-Sep]
Figure 24. Stigmation SPC plot charts the difference between vertical and horizontal measurements. Beam tuning errors are detected.
Types of Control Charts for CD-SEM SPC. The SPC approach was to measure daily and plot the average of the four “deltas-to-target” (DTT). (See Fig. 22.) The standard deviation of the four |DTT|’s was placed under SPC as well. The difference between the vertical average DTT and the horizontal average DTT was tracked to provide further information about beam alignment (Fig. 24). The pitch measurement was tracked to provide a very basic accuracy check (Fig. 23).

CD-SEM Reproducibility Performance. The dominant characteristic of the CD-SEM performance revealed in Fig. 22 was target wearout. Within two to eight weeks, the CD measurement grew beyond the allowed limit, and the site required replacement. The fresh site was generally much closer to its target value. However, the next few readings showed a negative trend, due to a reduction in the sidewall image brightness. After several readings, a positive growth trend, ~1 nm per repeat, began to dominate the characteristic. The SPC record for the first ~30 days (Fig. 22) indicated another typical problem. This was a period of start-up, characterized by improper beam maintenance. The stigmation plot was particularly sensitive to beam setup errors. Vertical and horizontal features had a different focus due to astigmatism. The poorly focused orientation measured larger than the sharply focused orientation due to the thresholding line width algorithm.[28] The pitch record of Fig. 23 showed no abnormality. Pitch is very insensitive to focus and other beam problems. Sigma DTT showed some large values due to stigmation errors. Toward the end of the record, the DTT, sigma DTT, and pitch records signaled a scanning amplifier calibration error. This is an unusual, but very serious, problem. A pitch record is basic to assuring scanning accuracy.

Delta-to-Predicted (DTP) Statistical Process Control. The elegant way to handle target degradation is to plot and control the difference between measured and predicted target values.[29] Figure 25 shows a target degradation trend. Figure 26 shows the same plot after a linear correction is made for daily target growth. The target growth in this case was 0.7 nm/reading. Note that the control limits were ±73 nm when the target degradation was not compensated (DTT), whereas they were ±15 nm in the DTP scheme. There was thus a ~5X tightening of control using DTP.
Figure 25. Trend of target degradation in a conventional delta-to-target (DTT) control chart. The step indicates a change in target.
Figure 26. Trend of target degradation is theoretically compensated in a delta-to-predicted (DTP) control chart.
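The delta-to-predicted idea can be sketched in a few lines. This is a minimal illustration, assuming a straight-line growth model fitted by least squares and 3-sigma control limits; the readings are hypothetical (the 0.7 nm/reading growth rate is taken from the text, but the noise pattern is made up), and the fitting and limit conventions of Ref. 29 may differ.

```python
def linear_fit(ys):
    """Least-squares line through (0, y0), (1, y1), ...; returns (slope, intercept)."""
    n = len(ys)
    x_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def three_sigma(values):
    """3-sigma half-width about the mean, using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return 3.0 * var ** 0.5


# Hypothetical delta-to-target (DTT) readings in nm: steady target growth of
# 0.7 nm per reading plus a made-up, repeating measurement-noise pattern.
NOISE = [0.0, 3.0, -4.0, 2.0, -1.0, 4.0, -3.0, 1.0, -2.0, 0.0]
dtt = [0.7 * i + NOISE[i % len(NOISE)] for i in range(60)]

# Delta-to-predicted (DTP): subtract the fitted growth trend from each reading.
slope, intercept = linear_fit(dtt)
dtp = [y - (intercept + slope * i) for i, y in enumerate(dtt)]

dtt_limit = three_sigma(dtt)   # limits inflated by the growth trend
dtp_limit = three_sigma(dtp)   # limits reflect measurement noise only
```

On this synthetic record the fitted slope recovers the assumed growth rate and the DTP limits come out several times tighter than the DTT limits, which is the qualitative behavior reported for Figs. 25 and 26. In practice the trend would have to be refitted after each target change, since the step in Fig. 25 breaks the single-line model.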
Multi-Tool CD-SEM Matching. The complexity of the CD-SEM measurement process limits the throughput: the high vacuum interlock, multiple optical and SEM image pattern search/recognition and alignment steps, automated mechanical and e-beam focusing, multiple low-current measurement scans, and multi-site measurement plans each play a part in limiting the wafer throughput of the tool. The quantity of stepper setup jobs and product CD measurement tasks usually requires the capacity of several CD-SEMs in a semiconductor facility. CD-SEMs are subject to down-time for replacement of e-beam components due to contamination and wear out. Major preventive maintenance and repairs to the e-beam column incur additional downtime (8–24 hours) to bake out the vacuum system before resuming operation. Because of the above circumstances, it is essential that the multiple CD-SEMs within the fab be interchangeable for semiconductor measurements. The measurements must match. Matching is likely to be achieved only when the hardware is identical (same manufacturer and model number), the software recipe is identical (preferably downloaded from a common server), and the beam is maintained and tuned according to a daily or shiftly schedule (focus, astigmatism, aperture alignment). Matching can be defined as the tool-to-tool component of measurement reproducibility. Matching is, therefore, a statistical concept, and statistical process control is used to flag CD-SEMs that do not match.[30] The accepted procedure is to measure the same wafer(s) with each of the CD-SEMs in the fab. Measurement problems are specific to each layer, so wafers should be chosen to represent the critical layers in the manufacturing process. Wafers should be specially patterned with a focus-exposure array so that the different sites on the wafer represent a realistic range of process variations encountered in manufacturing. 
Because most line widths grow a fraction of a nanometer with each repeated measurement, the sequence of tools should be rotated or randomized, and/or the apparent growth can be taken into account.[31] An assessment of the contamination growth can be made by observing the growth trend in the test data over a period of many days, or—what is slightly different—by observing growth in a dynamic repeatability test on a single tool. Matching Statistics. A single matching test on one layer consists of measuring each of several sites on a wafer, using exactly the same recipe, on each of the CD-SEMs. As an example, the test data for five different CD-SEMs on a wafer with five sites is shown in Table 1. The correction for line width growth is assumed to be negligible. The data is plotted in Fig. 27.
Table 1. Example Matching Test Data

SEM #       site 1   site 2   site 3   site 4   site 5   wafer avg.
sem 1       0.207    0.213    0.215    0.211    0.209    0.2110
sem 2       0.212    0.217    0.212    0.213    0.213    0.2134
sem 3       0.214    0.214    0.214    0.215    0.215    0.2144
sem 4       0.219    0.225    0.222    0.219    0.223    0.2216
sem 5       0.207    0.210    0.211    0.215    0.211    0.2108
site avg.   0.2118   0.2158   0.2148   0.2146   0.2142
[Figure 27 plot: matching raw data; line width in microns (0.206 to 0.226) versus site # (1 to 5), one series per tool, sem 1 through sem 5]
Figure 27. Plot of line width measurements at 5 different sites on a test wafer using 5 SEMs to be matched.
In this example, SEM 4 is apparently different from the other tools in the set. The question is which SEMs differ significantly from the others. The following three tests[32] can supply an answer: (1) The ANOVA test can determine whether changing SEMs is a significant effect, but does not determine which SEMs are different. (2) A t-test can be used between each different pair of SEMs to determine whether there is a statistically significant difference between that pair. This test becomes unfairly strict in very large tool sets when testing tools from opposite extremes of the distribution. (3) Duncan’s Multiple Range Test loosens the criterion appropriately at the extremes in large sets of tools.

The ANOVA test for the data of Table 1 is shown in Table 2. In the ANOVA table, column p (probability of a null effect) shows which effects are important. In this case, changing SEMs contributes strongly, and changing sites contributes weakly, to the total variation in the measurement data. The ANOVA table is important because it calculates the mean square error (MSE), i.e., the residual measurement noise remaining after known influences such as SEM differences and site differences are modeled. The MSE will, therefore, be the quantity of importance in any statistical tests. In Table 2, the MSE is found in the “error” row under the column “ms.”

Table 2. ANOVA Table for the Data of Table 1

source         SS         df   ms         F       p          Fcrit
sem-to-sem     0.000386   4    9.65E-05   20.81   3.52E-06   3.01
site-to-site   0.000044   4    1.10E-05   2.38    9.52E-02   3.01
error          0.000074   16   4.64E-06
total          0.000505   24   2.10E-05
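The sums of squares, mean squares, and F ratios of Table 2 can be reproduced directly from the Table 1 readings. The following is a minimal sketch of a two-way ANOVA without replication (one reading per SEM/site cell); the function and key names are illustrative, and the p-values are omitted because they require an F-distribution routine beyond this sketch.

```python
# Table 1 line widths in microns: rows are SEMs 1-5, columns are sites 1-5.
DATA = [
    [0.207, 0.213, 0.215, 0.211, 0.209],
    [0.212, 0.217, 0.212, 0.213, 0.213],
    [0.214, 0.214, 0.214, 0.215, 0.215],
    [0.219, 0.225, 0.222, 0.219, 0.223],
    [0.207, 0.210, 0.211, 0.215, 0.211],
]


def two_way_anova(data):
    """Two-way ANOVA without replication (one reading per SEM/site cell)."""
    n_rows, n_cols = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in data]
    col_means = [sum(row[j] for row in data) / n_rows for j in range(n_cols)]

    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)   # sem-to-sem
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)   # site-to-site
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols                       # residual

    df_rows, df_cols = n_rows - 1, n_cols - 1
    df_error = df_rows * df_cols
    mse = ss_error / df_error
    return {
        "ss_sem": ss_rows,
        "ss_site": ss_cols,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "df_error": df_error,
        "mse": mse,
        "f_sem": (ss_rows / df_rows) / mse,
        "f_site": (ss_cols / df_cols) / mse,
    }


result = two_way_anova(DATA)
```

Comparing `result["f_sem"]` and `result["f_site"]` with the tabulated Fcrit of 3.01 gives the same conclusion as the p column: the SEM effect is significant and the site effect is not.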
A simple t-test for each pair of CD-SEMs is illustrated in Table 3. The left side of the table lists all pairs of SEMs. For each pair, the absolute value of the difference in average reading is shown. On the right side of the table, the difference is divided by (2·MSE/5)^1/2, which is the uncertainty in the difference between two averages of five readings. The t-test can be used to determine the significance of the difference between two average readings. For 0.05 probability of error, the appropriate statistic is the two-sided t-distribution with (degrees of freedom) df = # measurements − # SEMs = 20 and an α of 0.025, for which t.025,20 = 2.086. The significant differences according to this test are marked with an asterisk.

Table 3. T-Test Tabulation for the Above Matching Data

Pair    |difference|   |difference|/sqrt(2*mse/5)
1 & 2   0.0024         1.76
1 & 3   0.0034         2.50*
1 & 4   0.0106         7.78*
1 & 5   0.0002         0.15
2 & 3   0.0010         0.73
2 & 4   0.0082         6.02*
2 & 5   0.0026         1.91
4 & 3   0.0072         5.28*
3 & 5   0.0036         2.64*
4 & 5   0.0108         7.93*

(* exceeds t.025,20 = 2.086)
Figure 28 illustrates the results of the t-tests performed above. A solid line connects the CD-SEMs 1, 2, and 5, which have insignificant differences. CD-SEM 3 is marginally different from 1 and 5, as indicated by the dashed black line. The dot-dashed line shows the main problem: SEM 4 is significantly different from all other SEMs. The appropriate action is to correct SEM 4, followed by SEM 3.
Figure 28. Matching analysis chart. Statistical methods determine which pairs of SEMs are unmatched. SEM 4 shows statistical difference from all other SEMs. It has top priority for corrective action.
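The pairwise comparison behind Table 3 and Fig. 28 can be sketched as follows. The MSE, the wafer averages, and the critical value are copied from Tables 1 to 3; the variable names are illustrative, and the critical value is hard-coded rather than computed from a t-distribution.

```python
import math
from itertools import combinations

MSE = 4.64e-6     # error mean square from Table 2
N_SITES = 5       # readings averaged per SEM
T_CRIT = 2.086    # t(0.025, 20), as quoted in the text
AVERAGES = {1: 0.2110, 2: 0.2134, 3: 0.2144, 4: 0.2216, 5: 0.2108}  # Table 1

# Uncertainty of the difference between two averages of N_SITES readings.
se_diff = math.sqrt(2.0 * MSE / N_SITES)

t_ratios = {
    (a, b): abs(AVERAGES[a] - AVERAGES[b]) / se_diff
    for a, b in combinations(sorted(AVERAGES), 2)
}

# Pairs whose difference exceeds the critical value are flagged as unmatched.
unmatched = sorted(pair for pair, t in t_ratios.items() if t > T_CRIT)
```

The flagged pairs are those involving SEM 4 together with the two marginal pairs 1 & 3 and 3 & 5, which is the pattern drawn in Fig. 28.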
Duncan’s test is slightly more sophisticated, and may be justified for matching large numbers of tools. A tabulation of Duncan’s Test for the above matching data set is shown in Table 4. In the first section of the table, the SEMs are sorted by descending line width average result. In the middle section, all possible pairs of SEMs are listed: first the adjacent pairs in the ordered list (p = 2), then next neighboring pairs (p = 3), et cetera. For each pair, the difference in average reading is tabulated. In the rightmost section, the difference is divided by (MSE / # sites)½, which is the uncertainty of a 5-site average. The quotient in this column is a measure of the significance of the difference, and is to be compared with r.05(p, 25), for p = 2, 3, 4, and 5. These r-values are tabulated significant ranges for Duncan’s Multiple Range Test. Here, 0.05 is the probability of a null effect, and 25 is the number of measurements in the data set. The significant ranges are underlined. Note as p increases, the r-values are increasing slightly, making the test less stringent at the extremes of the distribution. Also note the comparisons made in the t-test for significance are basically equivalent to those in Duncan’s test: in Duncan’s test the columns are in approximately the same ratio as in the t-test, so the comparisons are approximately equivalent.
Table 4. Table Illustrating Duncan’s Multiple Range Test

Sort averages:

SEM#   Average
4      0.2216
3      0.2144
2      0.2134
1      0.211
5      0.2108

Find all pair differences and test significance:

Pair   p   Difference   Difference / sqrt(MSE/5)   r.05(p, 25)
4&3    2   0.0072        7.47 *                    2.92
3&2    2   0.001         1.04                      2.92
2&1    2   0.0024        2.49                      2.92
1&5    2   0.0002        0.21                      2.92
4&2    3   0.0082        8.51 *                    3.07
3&1    3   0.0034        3.53 *                    3.07
2&5    3   0.0026        2.70                      3.07
4&1    4   0.0106       11.00 *                    3.15
3&5    4   0.0036        3.74 *                    3.15
4&5    5   0.0108       11.21 *                    3.225

* Significant range: the quotient exceeds r.05(p, 25).
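The comparisons in Table 4 can be reproduced with a short script. The following is a minimal sketch in Python rather than a full implementation of Duncan’s procedure: it tests every pair against the tabulated r-value, as the table does. The value of sqrt(MSE/5) is not given in the text, so it is an assumption here, back-computed from the first table row; the function and variable names are illustrative.

```python
from itertools import combinations

# SEM averages from Table 4 (SEM number -> average line width).
AVERAGES = {4: 0.2216, 3: 0.2144, 2: 0.2134, 1: 0.2110, 5: 0.2108}

# Tabulated significant ranges r.05(p, 25) as listed in Table 4.
R_05 = {2: 2.92, 3: 3.07, 4: 3.15, 5: 3.225}

# Assumption: sqrt(MSE/5) is not stated in the text; this value is
# back-computed from the first table row (0.0072 / 7.47).
STD_ERR = 0.0072 / 7.47


def duncan_pairs(averages, r_table, std_err):
    """Return (pair, p, difference, quotient, significant) for every pair."""
    # Rank the tools by descending average, as in the first table section.
    ranked = sorted(averages, key=averages.get, reverse=True)
    rows = []
    for i, j in combinations(range(len(ranked)), 2):
        p = j - i + 1  # number of ranked means spanned by the pair
        diff = averages[ranked[i]] - averages[ranked[j]]
        quotient = diff / std_err
        rows.append(((ranked[i], ranked[j]), p, diff, quotient,
                     quotient > r_table[p]))
    return rows


if __name__ == "__main__":
    for pair, p, diff, q, sig in duncan_pairs(AVERAGES, R_05, STD_ERR):
        flag = "significant" if sig else ""
        print(f"{pair[0]}&{pair[1]}  p={p}  diff={diff:.4f}  "
              f"q={q:5.2f}  r={R_05[p]}  {flag}")
```

With the table values, the flagged pairs are those involving SEM 4, plus SEM 3 against SEMs 1 and 5, which agrees with the t-test conclusions summarized in Fig. 28.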
Matching Corrective Actions. The corrective action for a matching error is usually to retune the adjustable beam parameters, change the aperture, or perform some other routine maintenance on the maverick SEM. High-quality recipe management systems, which download recipes from a common server at the time of use, assume that all tools run identical recipes. This rules out recipe-dependent slope-offset corrections for individual tools. Any corrections to the maverick tool must, therefore, be made at the system-wide level of hardware or software.
CD-SEM Calibration. Calibration standards are important for a metrology tool to maintain reproducible and accurate measurements. Whereas a golden standard can provide a basis for determining reproducibility over time within a fab, a certifiable standard can guarantee reproducible processing across geographically and/or temporally separated fabs. Calibration standards are especially important for the CD-SEM because of the strong dependence of line width measurement upon the choice of algorithm used to interpret the line scan. The CD-SEM has excellent sensitivity as a line width comparator: small differences in line width can be detected, provided all other variables are kept constant. This suggests that a good line width reference standard would make possible more accurate and reproducible measurements.
The utility of a line width calibration standard for CD-SEMs is diminished by the fact that the e-beam sample interaction is very material-dependent. Consequently, a different reference standard would be needed
for each different material. The sidewall angle and other details of the profile must be identical between the product sample and the reference standard to give a correct calibration. The charging phenomenon on many materials, including photoresist, is variable from sample to sample. Charging also dictates low voltage and beam current, resulting in relatively poor contrast, resolution, and signal-to-noise. Under low signal-to-noise conditions, the information offered by comparison with a standard is diminished. Finally, target degradation requires that the reference standard be renewed frequently, or certified at a large number of sites.
Pitch Calibration Standards. A pitch standard does not present as many difficulties as a line width standard. The edge location offsets due to beam-sample interaction, sidewall details, charging, and even target degradation are essentially invariant from one line to an adjacent line in a pitch standard. Almost any algorithm can successfully measure the pitch. Although the pitch is not directly important to the operation of most semiconductor circuits, it does contain very significant information about the absolute magnification and any variability in magnification. All CD-SEMs assume correct and constant magnification as a given in the output of a line width measurement. The key requirements of a pitch calibration standard have been pointed out by E. Chain et al.[33] These include traceability, low edge roughness and pitch variation, good contrast, construction out of materials compatible with semiconductor facilities, and a size range commensurate with submicron semiconductor device features. Such a standard can be permanently mounted on the SEM stage. The standard has been used to match magnification among several SEMs to within 0.6% at 60,000X.
Pitch standards are available, including the 0.24 micron pitch standard used by the above authors.[34] This pitch standard was patterned using laser interferometer lithography.[35] The National Institute of Standards and Technology (NIST) developed Standard Reference Material (SRM) 484 in 1977.[36] This material consists of alternating electrodeposited layers of gold and nickel turned on edge and polished. The spacings are viewed using an FE-SEM with a laser interferometer stage to measure the displacements under the beam. The pitch between gold lines starts at ~0.5 micron (spacing uncertainty ~4%) and ranges up to larger sizes, so this material is outdated with respect to advanced lithographic requirements. NIST has developed SRM-2090A as a magnification calibration standard to replace SRM-484.[37] Prototype samples were fabricated by metal lift-off on e-beam-written patterns. The pitch ranges from 3000 to
0.2 micron. Experimental samples are available but not yet certified at the time of this writing.
Line Width Standards. Despite the challenges, certifiable line width standards are being developed. The methodology is to fabricate sub-micron, controlled-geometry, single-crystal silicon electrical line width structures.[38] The structures are patterned on (110) SIMOX and BESOI silicon, which are substrate technologies in use for Silicon-on-Insulator (SOI) devices. As is well known, a wet KOH etch will “stop” on the {111} crystalline faces of silicon, resulting in vertical etch sidewalls on the (110) material. Therefore, it is possible to produce electrically isolated, free-standing rectangular silicon bars with atomically smooth sidewalls. An example of structures fabricated in (110) BESOI is shown in Fig. 29.[39] Measurements by optical, SEM, AFM, and SEM cross-section techniques are all possible on the same material and are relatively easy to model. The physics of conduction should also be easy to understand for these structures. The intent is to compare metrology techniques, to establish the “true” line width, and to take advantage of the ease and precision of electrical resistance measurements to proliferate certified line width standards inexpensively.
Figure 29. Nearly ideal geometry line width test structure fabricated from (110) BESOI. The feature depicted is ~1 micron high and ~1 micron wide, has precisely vertical sidewalls, and rests on an insulating substrate. (See Ref. 39, p. 127.)
3.0
ELECTRICAL CD (ECD) METROLOGY
Electrical CD measurements have always occupied an important place in final device testing. Because the CD-SEM has gauge capability and accuracy issues for 0.18 micron technology and below, ECD measurement has also become an important in-line CD measurement and control technique. ECD is useful only on conducting layers. Polysilicon gate and metal are two critical layers that fall into this category. The advantages of
ECD are superior gauge capability, higher throughput, and lower equipment cost. Furthermore, line conductance directly affects circuit performance, whereas top-down line width per se is not a device parameter. The main disadvantage is that electrical contact with the wafer may cause contamination. ECD can also be used off-line to characterize stepper performance.[40] The additional time required for etch processing is more than compensated by the high gauge capability and speed, which permit many thousands of measurements to be made for a characterization.
3.1
Types of ECD Test Structures
The simplest accurate test configuration for electrical resistance is a four-terminal network, shown in Fig. 30. A current is forced through two of the terminals, and the voltage is read at the other two. Separate terminals for current and voltage allow the measurement to be independent of contact resistance.
Figure 30. Four terminal resistance for accurate line width measurement.
An electrical line width measurement combines a four-terminal resistance measurement with a sheet resistance measurement to compensate for local film thickness and doping variations. The sheet resistance is measured using a coarse line width van der Pauw pattern, Fig. 31. The sheet resistance is given by
Eq. (5)
$$R_{\mathrm{sheet}} = \frac{V}{I} \cdot \frac{\pi}{\ln 2}$$
where $V = |V_1 - V_2|$, and $I = |I_{\mathrm{in}}| = |I_{\mathrm{out}}|$. $R_{\mathrm{sheet}}$ is then used to calculate the number of squares in the four-terminal resistance. The number of squares equals the ratio of length to width of the line between the voltage taps.
Eq. (6)
$$\frac{R_{4\text{-term}}}{R_{\mathrm{sheet}}} = \frac{\mathrm{length}}{\mathrm{line\ width}}$$
Since the length of the network is known with good relative accuracy, the line width can be accurately obtained.
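As an illustration of how Eqs. (5) and (6) combine, the following minimal Python sketch computes the electrical line width from the two measurements. The function names and the readings in the usage note are hypothetical and are not taken from the text.

```python
import math


def sheet_resistance(v_vdp, i_vdp):
    """Eq. (5): van der Pauw sheet resistance in ohms per square."""
    return (v_vdp / i_vdp) * (math.pi / math.log(2))


def electrical_line_width(v_vdp, i_vdp, v_line, i_line, length):
    """Eq. (6) solved for line width; result is in the units of `length`."""
    r_sheet = sheet_resistance(v_vdp, i_vdp)
    r_4term = v_line / i_line      # four-terminal resistance of the line
    squares = r_4term / r_sheet    # number of squares = length / line width
    return length / squares
```

For example, a made-up sheet resistance of 20 ohms per square and a four-terminal resistance of 2000 ohms correspond to 100 squares, so a 50 micron tap spacing would imply a 0.5 micron line.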
Figure 31. Van der Pauw resistor for determining sheet resistance.
The electrical “cross bridge” combines the van der Pauw and the four-terminal network into one basic test structure.[41] Buehler et al. have proposed a more advanced “split-cross-bridge.”[42] The latter combines the van der Pauw and two four-terminal networks in series. One large-width line is placed in series with a second large-width line that has a space cut out. This second conducting network permits both lines and spaces to be measured. (See Fig. 32.) The authors have worked out theoretically all the design rules for this structure to provide accurate CD measurements. The split-cross-bridge concept can be extended to the measurement of contact hole size. (See Fig. 33.) Contact hole geometries are cut out from a wide line, and the hole size is calculated from the increased resistance of the remaining conducting material.[43]
3.2
Gauge Capability and Accuracy of ECD
Typical ECD gauge capability is indicated[44] in Fig. 34. For comparison, contemporaneous CD-SEMs give ~5 nm reproducibility, about an order of magnitude larger than ECD. Several factors favor the greater reproducibility of ECD metrology:
(a) The ECD test structure samples a significantly longer line than a CD-SEM. ECD gives a more precise line width reading because it averages over line width non-uniformities.
(b) The specific location of sampling is absolutely fixed in ECD, whereas the sampling location in the CD-SEM depends upon accurate placement of the measurement gate.
(c) There is no evidence of target degradation in ECD.
Figure 32. The Split-Cross-Bridge[42] is a carefully designed network which provides accurate electrical measurements of both lines and spaces.
Figure 33. The structure shown here consists of a conducting sheet with an array of contact holes etched out. This network can provide electrical measurements of contact hole dimensions.
Figure 34. Daily measurement data demonstrating the excellent reproducibility of electrical line width measurements.[44]
There is usually a substantial offset between ECD and top-down or cross-sectional SEM CD measurements. Figure 35 shows the relationship between electrical and top-down CD-SEM measurement of submicron polysilicon lines;[44] the offset between the two techniques is substantial and variable. A different comparison[45] of electrical and SEM measurements over a wider range of sub-micron nested poly line widths is shown in Fig. 36; here the offset is significant but constant. The offset is due either to the uncertainty in SEM accuracy (beam width, charging, obscuration by overhanging or protruding parts of the feature, inaccurate SEM algorithm, etc.) or to the lack of a correct theoretical model of the actual conducting cross-section and conductivity in ECD. Electrical measurements are smaller than both CD-SEM and cross-section SEM measurements throughout the literature. This holds from small lines of 150 nm[46] up to lines wider than two microns, first studied by Buehler and Hershey.[42] The CD-SEM presumably detects the maximum-width point of the cross-section, which must be equal to or greater than the average electrical cross-section. Advanced IC metrology needs a more complete understanding of the discrepancy.
Figure 35. Relationship between ECD and CD-SEM metrology of submicron polysilicon lines.[44]
Figure 36. Relationship between ECD and SEM metrology of sub-micron nested polysilicon lines.[45] Note that SEM measurements have a positive offset from electrical measurements.
4.0
OVERLAY MEASUREMENT
Overlay measurement detects the shift in positioning between two different lithographic layers which are designed to be perfectly aligned. If circuit elements such as metal lines and contacts, metal contacts and gates, transistor gates and isolation regions, etc., are not properly registered across the entire wafer, the circuits will not function.[47]–[49] The overlay tool has a few main uses in the photo area: (a) to check the alignment of photoresist images on product so that misaligned wafer lots can be re-worked, (b) to check alignment on exposed product so that offsets can be anticipated and reduced on new work, and (c) to set up exposure tool alignment systems on test wafers and maintain registration matching between exposure tools.
Optical overlay tools measure misregistration by scanning the microscope image of an overlay target consisting of marks patterned at two different layers. The marks are cleanly defined on many front-end wafer processes. At such layers, the linearity of the optics and the symmetry of the marks ensure that current overlay tools achieve good measurement accuracy and precision, adequate for the overlay requirements of today’s leading-edge devices. By contrast, there are unavoidable processes at critical layers for which marks are poorly defined. At such layers the overlay measurements can be noisy and unacceptably inaccurate. Unfortunately, the characteristics which compromise measurement marks at such layers frequently compromise the stepper alignment features as well. It then becomes doubly important to check the alignment performance using accurate overlay metrology.
4.1
Basic Optical Overlay Measurement
The overlay feature consists of inner and outer marks patterned at two different layers. A typical feature is illustrated in Fig. 37. The overlay tool maps the scan of each edge of the mark onto the reflected scan of the symmetrically opposing edge. The offset which causes the two scans to overlap with maximum correlation determines the midpoint (or centroid) of the mark. The difference in centroids of the inner and outer marks is the misregistration between layers.
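The centroid search described above can be sketched numerically. The following is a minimal Python/NumPy illustration, not the algorithm of any particular overlay tool: it correlates a one-dimensional scan with its own mirror image and takes the best-matching shift as the axis of symmetry. The synthetic scan generator, the pixel positions, and all function names are assumptions made for the example; a real tool would also interpolate the correlation peak to sub-pixel precision.

```python
import numpy as np


def symmetry_center(profile):
    """Center of symmetry of a 1-D scan, in pixels (half-pixel steps)."""
    p = np.asarray(profile, dtype=float)
    p = p - p.mean()  # remove the background level
    # Correlate the scan with its mirror image at every relative shift.
    corr = np.correlate(p, p[::-1], mode="full")
    # Index j of the best match corresponds to a mirror axis at j / 2.
    return np.argmax(corr) / 2.0


def edge_pair_scan(n_pixels, left, right, width=2.0):
    """Synthetic scan (illustrative only): two Gaussian edge signals."""
    x = np.arange(n_pixels)
    return (np.exp(-((x - left) ** 2) / (2 * width ** 2))
            + np.exp(-((x - right) ** 2) / (2 * width ** 2)))


def misregistration(inner_scan, outer_scan):
    """Difference between the inner-mark and outer-mark centroids (pixels)."""
    return symmetry_center(inner_scan) - symmetry_center(outer_scan)


if __name__ == "__main__":
    outer = edge_pair_scan(201, 50, 150)   # outer mark centered at pixel 100
    inner = edge_pair_scan(201, 82, 124)   # inner mark centered at pixel 103
    print(misregistration(inner, outer))
```

Because the method relies only on the mirror symmetry of each mark, it does not need to locate the individual edges, which is consistent with the remark that mark symmetry is what guarantees accuracy.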
[Figure 37 labels: inner mark, 10 µm; outer mark, 20 µm; outer box scan]
Figure 37. Typical optical overlay mark and optical scans.
4.2
Overlay Metrology Tool Performance
Tool-Induced Shift (TIS). The accuracy of overlay measurement depends upon the symmetry of the hardware. Any imperfection in the optics, including the illuminator, the objective, or the camera/scanner, can cause an asymmetry in the image. This introduces an offset in the results. This offset is called “tool-induced shift” (TIS). Figure 38 shows how a tilt in the optics produces a parallax contribution to TIS.
[Figure 38 labels: tilted optical axis; shifted image]
Figure 38. Illustrates how asymmetry in the overlay measurement optics causes an apparent shift in alignment known as tool-induced shift (TIS).
Large TIS problems are associated with back-end processing, in which the aligned layers can have a large z-separation. In Fig. 39, the TIS has been plotted for a large number of different targets with a variety of processes and z-separations between inner and outer targets.[50] Figure 39 shows the correlation between TIS and z-separation for one particular tilt of the illumination axis. The relationship is 45 nm TIS per micron of δz.
Figure 39. Correlation between measured value TIS and z-separation between the inner and outer overlay target.[50]
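To get a feel for the numbers, the TIS slope can be related to an equivalent tilt angle. The sketch below assumes a purely geometric parallax model, TIS = δz · tan(tilt), which is an assumption of this example and not a model stated in the text; real TIS also includes other optical asymmetries. The function names are illustrative.

```python
import math


def parallax_tis_nm(dz_um, tilt_arcmin):
    """Apparent shift (nm) for a z-separation in microns and a tilt in arc-minutes."""
    tilt_rad = math.radians(tilt_arcmin / 60.0)
    return dz_um * 1000.0 * math.tan(tilt_rad)


def equivalent_tilt_deg(slope_nm_per_um):
    """Tilt (degrees) that would give the stated TIS slope by parallax alone."""
    return math.degrees(math.atan(slope_nm_per_um / 1000.0))


if __name__ == "__main__":
    # Slope quoted for Fig. 39: 45 nm of TIS per micron of z-separation.
    print(equivalent_tilt_deg(45.0))
    # A few arc-minutes of tilt over a couple of microns of topography.
    print(parallax_tis_nm(2.0, 3.0))
```

Under this simplified model, the 45 nm per micron slope corresponds to a tilt of a few degrees, whereas a few arc-minutes of tilt over two microns of topography gives a shift on the order of a nanometer or two.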
In practice, TIS errors can generally be held to within a few nanometers. Since wafer topography is often microns in depth, just a few arc-minutes of tilt or equivalent optical asymmetry can cause this level of TIS. TIS across the whole range of process targets must be minimized by fine adjustment of the optical hardware. TIS depends strongly upon focus. For each individual process layer, optimum focus minimizes TIS. Surprisingly, blurring the image moderately does not adversely affect the repeatability of the overlay measurement. Within limits, there is considerable latitude to choose the focus that optimizes TIS or some other critical parameter. This latitude is extended
even further by focusing separately upon the inner and the outer targets. The additional time and hardware activity have a slightly adverse effect upon repeatability of the measurement, but the trade-off in terms of TIS improvement is significant. The overlay tool must have an extremely repeatable focusing capability. The tool must select the same focus for every target encountered by a given recipe. Certain overlay tools use an interferometric microscope to identify the exact focus.[51] Figure 40 is a diagram of a Linnik interferometric microscope.[52] Other tools insert a knife edge at the optical crossover above the objective to test the exact focus.[53]
Figure 40. Linnik interferometric microscope provides image phase contrast and permits highly repeatable focus.
TIS Correction. The TIS at the backend of the process (metallization) is usually significant, even at the best focus for minimizing TIS. Furthermore, it may be important to adjust focus to optimize site capture, flyer reduction, or target process noise instead of TIS. The accuracy of overlay measurement at the backend of the process, therefore, depends on good TIS correction.
The overlay tool can measure and correct TIS. Rotating the wafer 180° changes the sign of an actual misregistration; the TIS remains unchanged. The formula for TIS is then
Eq. (7)
$$\mathrm{TIS} = \frac{\mathrm{mreg}_{0} + \mathrm{mreg}_{180}}{2}$$
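Eq. (7) and the corresponding corrected misregistration can be written as a short Python sketch. It assumes, as the text states, that the reading at 0° is the true misregistration plus TIS and that the reading at 180° is the negated true misregistration plus the same TIS; the function name is illustrative.

```python
def tis_and_corrected(mreg_0, mreg_180):
    """Return (TIS, TIS-corrected misregistration) from 0 and 180 degree readings."""
    tis = (mreg_0 + mreg_180) / 2.0        # Eq. (7): the sign-invariant part
    corrected = (mreg_0 - mreg_180) / 2.0  # equals mreg_0 - TIS
    return tis, corrected


if __name__ == "__main__":
    # Hypothetical readings in nm: true misregistration 30, TIS 5.
    print(tis_and_corrected(35.0, -25.0))
```

Subtracting the TIS from the 0° reading gives the same result as taking half the difference of the two readings, which is why measuring both orientations at a site is sufficient to remove the tool contribution at that site.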
If tool utilization permits the additional measurement time, it is possible to measure and subtract the TIS offset at every site. If tool utilization does not permit TIS measurement of every wafer, it may be advantageous to include a fixed TIS offset in the recipe. This approach depends on the TIS stability of the tool and upon the TIS stability of the process. This raises the question of TIS variability.
TIS Variability. The TIS varies from site to site on a wafer. Across-wafer process variations such as film thickness and sidewall profile variations cause TIS variability. The key recipe parameters (the numerical aperture of the optics, inner target focus, and outer target focus) must be optimized to minimize the average TIS as well as to avoid excessive TIS due to normal process variation. It is possible to adjust most recipes so that TIS variability is 3σ ≤ 5–10 nm, the upper number applying to conventional backend layers.
Overlay Tool Repeatability. The overlay tool repeatability is typically 3σ ~ 3 nm. Larger values are unusual and indicate very low contrast targets or hardware malfunction.
Overlay Tool Matching. The matching of overlay tools has been studied by Merrill et al.[54] The chosen methodology avoids designating one tool as a “golden standard”; instead, each machine is optimized separately by adjusting the hardware to minimize TIS and calibrating against a pitch standard to minimize linearity error. The standard stage and focus calibrations were also performed on each tool separately. After these calibrations, the system-to-system variations were characterized on fifteen different wafers on five different product layers. Each recipe was carefully duplicated on all machines. Each recipe was offset so as to correct for tool-induced shift (TIS) by performing a TIS calibration. TIS correction is a standard procedure in overlay metrology. TIS is not calibrated on every wafer because of the throughput penalty.
In the context of a multi-machine installation, however, the TIS correction has the disadvantage of being a recipe-level rather than a system-level correction. In order to maintain recipe portability among
multiple machines, TIS correction should be the same for all the matching tools. In the context of total quality, TIS correction must be a documented component of the recipe specification for the process layer in question.
The misregistration was measured at 32 intentionally offset sites on each of the five wafers using three overlay tools. The wafer average x- and y-misregistration was calculated for each overlay tool. The range across the three tools of the wafer average x- and y-misregistrations for each level is plotted in Fig. 41. Also plotted are the pooled standard deviations of the mismatch for each site (mismatch being the difference between the individual tool and the average of the tools at the site in question).
[Figure 41 chart: Matching Characteristics. Vertical axis: nanometers (0 to 12). Series: Range; Pooled 3 Sigma. Categories: Nitride X, Nitride Y, Poly X, Poly Y, Oxide X, Oxide Y, Tungsten Silicide X, Tungsten Silicide Y, Metal X, Metal Y.]
Figure 41. Matching of misregistration measurements between three separate overlay tools. The range is the discrepancy of the wafer average among the three tools. The pooled 3-sigma values represent the across-the-wafer variation in mismatch between tools.
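The two quantities plotted in Fig. 41 can be computed as in the following Python/NumPy sketch. It is one plausible reading of the description, not the procedure of Merrill et al.: in particular, pooling the per-site variances by a simple average and using the sample variance (ddof = 1) among tools are assumptions of this example, as are the function and argument names.

```python
import numpy as np


def matching_stats(mreg):
    """mreg[tool, site]: misregistration along one axis on one wafer and layer.

    Returns (range of wafer averages across tools, pooled 3-sigma mismatch).
    """
    mreg = np.asarray(mreg, dtype=float)
    wafer_avg = mreg.mean(axis=1)              # one wafer average per tool
    avg_range = wafer_avg.max() - wafer_avg.min()
    mismatch = mreg - mreg.mean(axis=0)        # tool minus site average
    site_var = mismatch.var(axis=0, ddof=1)    # variance among tools, per site
    pooled_3sigma = 3.0 * np.sqrt(site_var.mean())
    return avg_range, pooled_3sigma


if __name__ == "__main__":
    # Hypothetical readings (nm): three tools, three sites.
    readings = [[1.0, 2.0, 3.0],
                [2.0, 3.0, 4.0],
                [3.0, 4.0, 5.0]]
    print(matching_stats(readings))
```

A large pooled 3-sigma relative to the range, as reported for the metal layer, indicates that the tools disagree differently from site to site, so a single matching offset cannot remove the mismatch.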
The wafer average range of mismatch values is comparable with, and must be added to, other inaccuracies such as TIS and long-term reproducibility. Figure 41 shows that the overlay mismatch problem is complicated by the fact that mismatch is highly site dependent. For example, at Metal, the standard deviation of the mismatch for each site is considerably larger than the wafer-average mismatch. Although it was possible to “match” the machines by inserting a matching offset (TIS correction) into the recipe of each tool at the metal level, the site variability of the TIS correction is large. The matching at this layer is probably not robust with respect to process variations.
4.3
Plotting Overlay Results
It is often useful to plot misregistration data on a spreadsheet. Typical output of an overlay tool consists of a list of X and Y misregistration pairs, labeled by the die indices and the X-, Y- offset coordinates within the die. This is illustrated after loading into a spreadsheet in Fig. 42.
     A      B      C       D        E        F
1    DieX   DieY   X_OFF   Y_OFF    Mis_X    Mis_Y
2    3      4      -9.33   -11.02   -0.003    0.07
3    1      2      -9.33   -11.02    0.099   -0.129
4    3      0      -9.33   -11.02    0.15    -0.005
5    5      2      -9.33   -11.02    0.042    0.154
6    3      2      -9.33   -11.02    0.061   -0.02
7    ...    ...    ...     ...       ...      ...
Figure 42. Typical registration tool data output in spreadsheet format.
The data can be plotted using simple spreadsheet functions. The X- and Y-transformations for plotting appear in columns G and H, respectively, of Fig. 43. The object of these transformations is to use the spreadsheet chart grid as a representation of the wafer grid. The deviation of the plotted points from the grid vertices is in direct proportion to the registration offsets.
     G                                H
1    Plot_X                           Plot_Y
2    =A2*20+ROUND(C2,-1)+10+E2*40     =B2*20+ROUND(D2,-1)+10+F2*40
Figure 43. Spreadsheet formulas to represent overlay data as a dot-plot. These cells should be appended in the columns to the right of Fig. 42, and Row 2 should be copied down.
In the formula, the X-die index is multiplied by 20, the approximate die size in mm. The offset within the die needs to be rounded off so that a zero-misregistration point would fall exactly on a graph vertex. The misregistration is magnified by a factor of 40, although this number is arbitrary and can be placed in a fixed spreadsheet cell such as $J$1. The result of these transformations is the very revealing plot in Fig. 44. Figure 44 shows the misregistration at the four corners of five dies on a wafer. This particular wafer has a large rotation error of both the grid and the individual dies. There is also a definite offset in the positive x-direction.
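The same transformation can be sketched outside a spreadsheet. The Python function below mirrors the column layout of Fig. 42 (die indices, in-die offsets, misregistration); the function name and default arguments are illustrative. One caveat: Python rounds exact ties to the nearest even value, which may differ from a spreadsheet ROUND for offsets that fall exactly halfway between multiples of 10.

```python
def plot_coords(die_x, die_y, x_off, y_off, mis_x, mis_y,
                die_size=20.0, magnification=40.0):
    """Map one overlay record to chart coordinates for a dot plot."""

    def snap(offset):
        # Same intent as the spreadsheet ROUND(offset, -1):
        # snap the in-die offset to the nearest multiple of 10.
        return round(offset / 10.0) * 10.0

    plot_x = die_x * die_size + snap(x_off) + 10.0 + mis_x * magnification
    plot_y = die_y * die_size + snap(y_off) + 10.0 + mis_y * magnification
    return plot_x, plot_y


if __name__ == "__main__":
    # First data row of Fig. 42.
    print(plot_coords(3, 4, -9.33, -11.02, -0.003, 0.07))
```

Keeping the magnification as a parameter plays the same role as placing it in a fixed spreadsheet cell: the plot can be rescaled without editing the transformation.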
[Figure 44 chart: Dot Plot of Misregistration. X and Y axes run from 0 to 140; scale bar: 500 nm.]
Figure 44. Dot plot representation of overlay data.
4.4
Process-Related Overlay Measurement Errors
There are two main process-related overlay measurement problems: (1) marks are designed to be symmetrical, but the process can produce asymmetrical structures, and (2) local process effects, such as metal grain structure, can distort the overlay target and produce target noise.
Overlay Mark Asymmetry. Figure 45 shows a typical cornerless frame-in-frame overlay feature. This overlay mark is highly immune to process variations. The removal of the corners reduces the influence of the prior mark on the resist flow for the subsequent mark. The overlay measurement structure has a symmetrical design and extends over only a few tens of microns. Most process variations that can affect line width and feature profile, such as exposure, film thickness(es), focus, surface reflectivity, and surface morphology, do not affect the measurement, provided all parts of the mark are affected equally. If neighboring features are placed too close to the mark, however, the symmetry can be broken. Furthermore, some processes are inherently asymmetrical in their action.
Other processes have intrinsically local variations which break the symmetry. Once the mark symmetry is broken, there can be gross inaccuracies in the misregistration measurement.
Figure 45. Cornerless frame-in-frame overlay mark.
At least three local process variations can have a dramatic effect on the symmetry of the mark: local photoresist thickness variations, the asymmetric deposition of sputtered material, and the smearing of material or directionality of over-polish in CMP processes.
Mark Asymmetry due to Resist Thickness Variations. Coleman et al. identify the flow of photoresist over local topography, such as a field oxide step in close proximity to the mark, as a significant source of mark asymmetry.[55] This is depicted in Fig. 46. These authors have identified line width non-uniformity in the resist mark, due to resist thickness variation, as the cause of measurement error. The authors show that the measured centroid of the mark differs depending on whether the inside edges or the outside edges of the outer frame are used to locate the centroid. This is illustrated in Fig. 47, in which the misregistration measurements based on inside edges are plotted against misregistration measurements based on outside edges. The left graph is for the mark located near topography; the right graph is taken from the mark located in a planar environment. Near topography, the line width differs from one side of a frame-in-frame mark to the other. This causes a 63 nm difference between misregistration measurements based on inside edges and those based on outside edges. Furthermore, the resist mark asymmetry is found to transfer to the final layer upon etching.
Figure 46. Nearby topography causes photoresist thickness gradation. The resist asymmetry in turn causes line width and etch profile asymmetries, leading to an apparent overlay offset.[55]
Figure 47. Overlay data taken from targets near topography, as illustrated in Fig. 46. The data based upon inner and outer edges of the same features show a 63 nm offset.[55]
Overlay Mark “Random” Noise. One problem is target distortion due to grainy metal, such as hot aluminum copper. High temperature metal deposition produces large grain size and a highly visible surface morphology due to grain boundaries. This camouflages the underlying target topography. Overlay measurement numbers from a distorted overlay target can be quite repeatable, although highly inaccurate. The target has distorted edges, but remains distinct. The overlay metrology tool hardware and software configuration and target design need to be optimized to obtain the best possible accuracy. Figure 48 shows a target distorted by grainy metal.
Figure 48. Overlay target distorted by grainy metal. Note topography to the left of the target.
The inner target in Fig. 48 consists of resist; the outer target consists of slots in dielectric covered by metal. Note that the resist transmits the image of the underlying grain. This contributes distortion to the resist feature edge. Resist is unevenly removed from the slotted outer target, also contributing image noise. The size of the errors produced by an unoptimized recipe is in the range 100–200 nm (3 sigma).[56]
Overlay Measurement Problem: Distinguishing Alignment Errors from Overlay Measurement Errors. As discussed above, overlay metrology errors can be both unacceptably large and difficult to eliminate. Recipe and/or target optimization are often hampered by the fact that the “true” overlay values are not readily determined, there being no sensitive alternative reference technique. When the metrology targets are poor, the stepper alignment marks are also poor.
Solution 0: More Overlay Targets. This technique can address overlay target deficiencies, whether due to distortion or to low contrast. The approach is to lay out a number of different targets in one small area of the stepper field.[57] The standard deviation among this cluster of targets, assuming local field distortion is small, is then a metric for the noise due to the metrology targets. This metric is then used to optimize the metrology recipe so as to minimize the effects of target distortion or low contrast. One limitation of this method is that it requires sufficient foresight and valuable scribe-grid real estate to lay out the extra targets. The method is only useful for noisy targets. It cannot evaluate global target bias due to directional processes such as sputter deposition and CMP.
Solution 1: Special Test Wafers to Separate Errors. A direct approach, due to Anderson et al., is to perform special processing on test wafers to clear the metal from some of the fields prior to the photoresist step for the second layer.[58] The stepper then performs the alignment using the alignment targets in metal-covered fields only. On the metal-covered metrology targets, alignment and measurement noise contribute to total misregistration noise through the equation:
Eq. (8)
$$\sigma_{\mathrm{TOTAL}}^{2} = \sigma_{\mathrm{STEPPER}}^{2} + \sigma_{\mathrm{METROLOGY}}^{2}$$
On the cleared targets, the variance of the registration measurements, $\sigma_{\mathrm{TOTAL}}^{\prime\,2}$, has a negligible metrology component:
Eq. (9)
$$\sigma_{\mathrm{TOTAL}}^{\prime\,2} = \sigma_{\mathrm{STEPPER}}^{2}$$
Both variances are readily available by measuring the same wafer. This technique is therefore a powerful methodology for evaluating both the alignment noise due to poor stepper alignment targets and the apparent misalignment noise due to poor overlay metrology targets. The study found that the grainy metal targets gave a metrology error component of 84 nm 3σ.[58] The disadvantage is that this method requires considerable investment in test wafers and a test flow. Where applicable, the method gives definitive results for both metrology target noise and bias, whether local or global, due to directional processes.
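Combining Eqs. (8) and (9), the metrology component follows by subtracting the two measured variances. A minimal Python sketch, with an illustrative function name and made-up numbers in the usage example:

```python
import math


def metrology_sigma(sigma_total, sigma_cleared):
    """Metrology noise from metal-covered (total) and cleared-target sigmas.

    Both inputs must use the same convention (for example, both 3-sigma in nm).
    """
    var_metrology = sigma_total ** 2 - sigma_cleared ** 2
    if var_metrology < 0:
        # Sampling error can make the cleared-target variance come out larger.
        raise ValueError("cleared-target variance exceeds total variance")
    return math.sqrt(var_metrology)


if __name__ == "__main__":
    # Hypothetical sigmas in nm: 50 on metal-covered targets, 30 on cleared targets.
    print(metrology_sigma(50.0, 30.0))
```

Because the variances add in quadrature, a metrology component that looks modest on its own can still dominate the total when the stepper contribution is small.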
Solution 2: Lens Distortion is Constant from Field-to-Field. Tanaka et al. have described a method for evaluating metrology target noise based on the idea that the intrafield distortion pattern remains the same from field to field.[59] This assumes the same two steppers are used to pattern all the wafers used for the test. Logically, if the fields were all stacked one on top of the other, then, since the distortion of all fields is the same, the deviations of the registration values in any one corner of the field would all be due to target noise. Their method is outlined in Fig. 49. It would be more correct to subtract the mean offset of each field, rather than the center offset as indicated in the figure.
Figure 49. Hanabi method of distinguishing alignment errors from overlay measurement errors. The method assumes stepper lens distortion remains identical on all wafer stepper fields.[59]
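The stacking idea can be sketched in a few lines. This is an illustrative sketch only, not the published procedure: it assumes one registration value per site, the same site order in every field, and it subtracts the mean offset of each field as suggested above. The function name and data are hypothetical.

```python
from statistics import mean, stdev

def target_noise_per_site(fields):
    """fields[f][s]: registration (nm) at site s of field f.

    Each field is centered on its own mean offset, the fields are
    "stacked", and the spread across fields is taken site by site.
    """
    centered = [[value - mean(field) for value in field] for field in fields]
    n_sites = len(fields[0])
    return [stdev([field[s] for field in centered]) for s in range(n_sites)]

# Synthetic data: one fixed distortion pattern plus a different offset per field.
distortion = [10.0, -10.0, 5.0, -5.0]
offsets = [0.0, 20.0, -15.0]
fields = [[d + o for d in distortion] for o in offsets]
print(target_noise_per_site(fields))   # no target noise was injected

fields[1][0] += 4.0                    # perturb one target
print(target_noise_per_site(fields))   # site 0 now shows the largest spread
```

With only a few sites per field, subtracting the field mean also removes part of the target noise itself, so the estimate is slightly low.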
Figure 50 illustrates how a similar analysis can reveal measurement biases due to local process effects. In this figure, registration measurements were performed at metal photo (ADI) and metal etch (ACI) steps of process #1 and at metal ADI on process #2. The misregistration at each of four field corner measurement sites was averaged over five fields on several wafers in order to reduce random measurement noise. The same two steppers were used to align the contact and metal layers, respectively, for both processes. Any lens distortion mismatch must be a constant for all three measurement sets. Differences between the three measurement sets are, therefore, entirely due to local process effects. ADI and ACI overlay systematic errors differed insignificantly for process #1. Unique aspects of process #2 (such as local topography, resist thickness, metallization, and substrate film stress) result in systematic misregistration errors of up to ~75 nm for this process.
[Figure 50 plot: Systematic Misregistration at Field Corners for Two Different Processes. X and Y axes with a 250 nm scale bar; data series: Process #1 ADI, Process #1 ACI, Process #2 ADI.]
Figure 50. After averaging out random measurement noise and removing grid offset, magnification and rotation errors, only systematic misregistration effects remain. These may be due to lens distortion error, as in Process #1, or process-induced bias, as in Process #2.
Solution 3: The Stepper Grid Model is Uniform. As in the section above, an approach due to Yanof et al. works on live product in cases where only the normal arrangement of overlay targets is present.[60] In their method, a typical sampling plan for optimizing on live product is shown in Fig. 51.
[Figure 51 diagram labels: 4 sites, 5 fields, >4 wafers.]
Figure 51. Sampling plan for grid model method. [43]
This approach makes use of the symmetry of a “modeled” or “enhanced global” alignment approach available on many steppers. In this alignment scheme, the stepper captures several alignment targets across the wafer and calculates an optimum model. The wafer is then stepped out according to this model, rather than site-by-site. For this reason, the stepper grid errors (including scale, offset, rotation, and orthogonality) are predictable across the wafer and can be separated from the metrology errors. In their analysis, the overlay measurement can be analyzed into several registration error components, of which one is the target distortion error. Considering the x-measurement, for example:
Eq. (10)
$$x_{sfwk} = X_{fw} + d_s + r_{sfwk} + m_{sfw}$$
where:
$x_{sfwk}$ = a series of overlay measurement data. Measurements are performed on each site s within the fields f on wafers w. There are k replicates at each measurement site.
$X_{fw}$ = the grid error on field f of wafer w.
$d_s$ = the lens distortion error, including, e.g., scale, offset, rotation, and trapezoid.
$r_{sfwk}$ = the random metrology error of the overlay tool for each individual measurement, after TIS correction.
$m_{sfw}$ = the target error due to grainy metal on site s, field f, and wafer w.
The first step in the analysis is to “center” the data. The wafer-average misalignment is subtracted, on a per-wafer basis, from each datum. The result is:
Eq. (11)
$$x_{sfwk} - \bar{x}_{sfwk,sfk} = X_{fw} - \bar{X}_{fw,f} + r_{sfwk} - \bar{r}_{sfwk,sfk} + m_{sfw} - \bar{m}_{sfw,sf} + d_s - \bar{d}_{s,s}$$
Here, the bar indicates an average taken over the indices repeated to the right of the comma. Equation (11) can be simplified as follows. The grid error averaged over all the fields is the wafer misalignment. Consider just the center field in the sampling plan. Because of the symmetry of the sampling plan, the enhanced global modeling of the stepper, and the accuracy with which the stepper stage executes the model (better than ~20 nm 3σ), the central field grid error, $X_{fw}$, equals just the wafer average misalignment. The grid rotation, grid magnification, and grid skew are all zero for the central field by symmetry of the sampling plan. Hence the two grid terms in Eq. (11) cancel to zero at the central field. The effect of centering the data is shown in Fig. 52.
[Figure 52 panels: Raw Data and Centered Data, each plotted for wafer 1 and wafer 2.]
Figure 52. A dot plot of the raw data and of the centered data, showing grid errors are eliminated in the central field of the centered data.
The second step in the analysis is to take the wafer-to-wafer variance of the data at each of the sites of the central field:
Eq. (12)
$$\langle x_{scwk} - \bar{x}_{sfwk,sfk} \rangle = \langle r_{scwk} - \bar{r}_{sfwk,sfk} + m_{scw} - \bar{m}_{sfw,sf} \rangle$$
Here, the brackets indicate the variance of the enclosed expression, and the subscript c denotes the central field. The distortion terms involving $d_s$ drop out because the lens distortion variations from shot to shot are negligible compared to the other errors. The repeatability error term can be evaluated and subtracted from Eq. (12). In the case of distorted, high-contrast targets, this term is only a few nanometers and can be neglected. The remaining quantity on the right is the desired target noise metric. The final step of the analysis is to pool the target noise result over all available sites in the central field. This gives the target noise in terms of the measurement data set, $x_{sfwk}$.
Once a suitable metric of the target noise is found, screening and optimizing experimental designs are useful in determining the best measurement parameters. The result is separate focus settings for the inner and the outer target. Focus settings must be carefully selected to minimize distortion effects and flyers. The target design is also very significant. A target that is much larger than the grain size is not as subject to target noise problems because the influence of the grain averages out. A frame-in-frame target gives the best results on grainy metal. One explanation is that a frame has as much edge as a box of twice the size. Referring to Fig. 48, a square resist target transmits the grain structure to the image. A resist frame with ~1–2 micron line width is too narrow to transmit the grainy image, eliminating half the variance due to metal grains.
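The centering, variance, and pooling steps of the grid-model analysis (Eqs. 10 to 12) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: replicates are assumed to be already averaged, the central field is identified by an index supplied by the caller, and the repeatability variance is passed in as a known number. All names and data are hypothetical.

```python
from statistics import mean, variance

def target_noise_variance(data, center=0, repeat_var=0.0):
    """data[w][f][s]: x-misregistration (nm) on wafer w, field f, site s.

    Step 1: subtract the wafer-average misalignment from each datum.
    Step 2: take the wafer-to-wafer variance at each central-field site.
    Step 3: pool (average) the site variances and remove the repeatability term.
    """
    n_sites = len(data[0][center])
    centered = []
    for wafer in data:
        wafer_avg = mean([x for field in wafer for x in field])
        centered.append([x - wafer_avg for x in wafer[center]])
    site_vars = [variance([row[s] for row in centered]) for s in range(n_sites)]
    return max(mean(site_vars) - repeat_var, 0.0)

# Synthetic data following Eq. (10) with no target noise (m = 0).
distortion = [3.0, -3.0, 1.0, -1.0]       # d_s, identical in every field
grid = [0.0, 8.0, -8.0, 5.0, -5.0]        # grid error per field; field 0 is central
wafer_offsets = [12.0, -7.0, 2.0]         # wafer-average misalignment
data = [[[w + g + d for d in distortion] for g in grid] for w in wafer_offsets]
print(target_noise_variance(data))        # grid and distortion terms cancel

data[0][0][0] += 6.0                      # inject a target error on one site
print(target_noise_variance(data))        # pooled variance is now nonzero
```

The synthetic grid errors are symmetric about the central field, which mirrors the symmetry argument used above to cancel the grid terms.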
5.0
FILM THICKNESS BY ELLIPSOMETRY AND REFLECTANCE SPECTROMETRY
Ellipsometers and spectrometers play an important part in photolithography. The most important use is photoresist film thickness measurement and control. Once the optical constants of a transparent material are known, the normal incidence reflectance spectrometer performs the
thickness measurement with excellent throughput and gauge capability: 3σ < 2 Å for a 10,000 Å film. The reflectance spectrometer accurately measures the product of thickness and refractive index. In order to obtain the thickness unconfounded by refractive index, it is necessary to develop a good spectral model (Cauchy coefficients) of refractive index for any new materials in the photo area. The ellipsometer is indispensable for this task. Reflectance measurements of underlying layers and anti-reflective (AR) coatings are also critical at the actinic wavelength of a resist to set up optimum resist performance.[61]
5.1
Optical Thin Film Phenomena
It is a part of everyday experience, to the person who wears polarizing sunglasses, that light which reflects obliquely from a surface or even scatters from the atmosphere may become strongly polarized. For a given direction of propagation, the electric field can point in any direction perpendicular to the direction of travel. Light that reflects obliquely from a plane surface tends to be polarized with the electric field parallel to that surface. (See Fig. 53.) For a pure “s-wave,” incident and reflected electric fields are collinear and parallel to the reflecting plane. The other available polarization, which lies in the plane formed by the incident and reflected rays, is called a “p-wave.” A p-wave has a suppressed reflection because the incident electric field is pointing approximately in the reflected direction, and so it cannot strongly excite a reflected wave with transverse electric field. This is the principle of polarizing sunglasses. The ellipsometer measures the differences between s- and p- reflections and uses them to probe the reflecting material.
[Figure 53 labels: p-wave E-fields; reflected beam; s-wave E-field (normal to paper).]
Figure 53. Definition of s- and p-polarization of a reflected light beam.
Everyday experience of the spectral reflectance from films includes the apparent color of soap and oil films; the variety of colors of different thicknesses of oxide on silicon wafers; and the spreading color rings on photoresist during the coating process. Light reflecting from the top surface of the film interferes with light reflecting from the thin film/silicon interface (see Fig. 54). A spectrometer measures the intensity of the reflection as a function of wavelength and uses this information to determine film thickness.
[Figure 54 labels: constructive interference, phase difference = 2nπ; extra path length.]
Figure 54. There is interference between light reflected at the top surface and light reflected from the bottom surface of a transparent film deposited on silicon.
5.2
Light Polarization Basics for Ellipsometry
Light may be linearly polarized in two independent directions—for example, s- and p-waves impinging at an oblique angle on a plane surface. Other linear polarization directions can be composed by combining these two orthogonal polarizations in arbitrary proportions. The resultant polarization will still be linear, however, only if the two components have the same phase. The electric vector of each linearly polarized component oscillates sinusoidally. If the wavefront of one component is ahead of the wavefront of the other component, the combination will be an elliptical rather than linear polarization. That is, a stationary object in the path of the combined wave experiences an electric field which rotates and varies in length so that the head of the electric vector describes an ellipse. Why is the phase of the polarization components important? It is important because the effect of oblique reflection is not only to favor one polarization direction over another, but also to advance the phase of one component relative to the other. A lightwave may be linearly polarized with a fixed ratio of s- and p-wave components and impinge on a surface. The
reflected wave will have a different ratio of s to p. It will be elliptically polarized because s and p components will also undergo a different change in phase upon reflection. Thin films on the surface can have a profound effect upon both the amplitude and the phase of the different reflected wave components. These effects will be analyzed below using basic electromagnetic theory to give structure information such as the index of refraction, absorption coefficient, and thickness of the film structure.
5.3
Basic Ellipsometer
The ellipsometer is so named because it very precisely measures the amplitude ratio and the phase difference between s- and p-wave reflections. Two optical components called polarizers are employed. The first is used to produce pure linearly polarized light with a known direction of polarization. Another component called a quarter-wave compensator is capable of introducing controlled amounts of phase delay between different components of polarization. A second polarizer called an analyzer determines the polarization direction of the reflected wave. The diagram for one basic type of instrument is shown in Fig. 55.[62] This is a nulling type of ellipsometer, in which the polarizer and the analyzer are both adjusted to give a null signal at the detector. The other type of basic instrument utilizes a continuously rotating analyzer, in which the sinusoidal variation in detector amplitude determines the elliptical polarization of the reflected light.[63][64]
Figure 55. Basic ellipsometer for measuring the amplitude ratio and phase difference of the two reflected polarizations of light.
The ellipsometer must measure two quantities which fully characterize the reflection of the two polarizations: (1) tan Ψ, the ratio of the reflectivity for s-wave versus p-wave; (2) ∆, the difference in the phase angle of the two reflected polarizations.
5.4
Film Thickness Instrumentation for Semiconductor Use
Semiconductor use places a particular demand on film thickness tools: a small spot size is needed to probe circuit areas and minimum-size metrology test sites. Additional requirements are high throughput for film thickness mapping, pattern recognition to allow the use of live product instead of unpatterned test wafers, high stage accuracy, and excellent tool-to-tool matching performance to permit flexibility in manufacture. The advent of chemical-mechanical polishing (CMP) has also brought about a special challenge: making unambiguous measurements on complicated stacks of transparent materials. The difficulty is that different orders of interference, in which the phase differs by a multiple of 2π, often produce the same instrument response (so-called “order skipping”).
Spectral Ellipsometry/Reflectance Tool. The available tools meet the above challenges in a variety of different ways. One popular tool, the spectral ellipsometer (SE)/spectrometer, adds a grating to the basic ellipsometry configurations shown above. The grating disperses the light to obtain a complete ellipsometry reading at every wavelength. Focusing optics are used to produce a small spot size in the ellipsometer. The ellipsometer spot is ~30 × 60 µm in size. The normal incidence reflectance spectrometer achieves ~4 micron spot size on production wafers. In the case of a single layer film, ellipsometry at a single wavelength finds the complex refractive index and film thickness, assuming the substrate optical properties are known. In the case of multiple layers, or variable substrates, the response at a multiplicity of wavelengths helps the SE determine an unambiguous measurement. The tool assumes a simple model for the complex refractive index of each layer. Extensive on-board computational capability fits the model(s) to the data as they are collected. Index of refraction, absorption coefficient, and thickness are determined for every layer.
For thicker films, including most resist or planarizing layers, the dual-beam reflectance spectrometer (DBS) provides superior gauge capability to the SE. The SE provides models for the optical properties needed for accurate DBS measurements.
Focusing Ellipsometer. A powerful version of the ellipsometer is the focusing ellipsometer (FE). This tool is presently in widespread use in the semiconductor industry because the beam is focused down to a narrow spot which can be trained onto a small scribe line or circuit feature. Typical spot sizes are 20 × 40 microns. This is larger than the spot size of a DBS as described above. The moderate NA lens system produces a converging beam which impinges on the sample at a variety of angles. On-board computational capability handles the ellipsometry calculation across the multiplicity of incident and reflected angles in real time. (See Fig. 56.) [65]
Figure 56. The focusing ellipsometer achieves a small spot size and collects multi-angle information. This reduces thickness errors due to interference order ambiguities.
The detector in the focusing tool is a diode array. Each angle of incidence is imaged onto a different segment of the detector array. Therefore, the tool functions as a multiple set of independent tools, each operating at a different angle. This provides additional order information which is useful for analyzing complicated films. The FE incorporates several different wavelength lasers to further reduce order ambiguities. Microscopic Ellipsometer. A third version of film thickness tool is the microscope objective beam profile ellipsometer (BPE)/ variable angle beam profile reflectance (BPR). (See Fig. 57.) This tool utilizes a conventional high NA = 0.9 microscope objective to provide a broad cone of incident/reflected angles.
[Figure 57 labels: p-wave; s-wave.]
Figure 57. Axial view (looking down the microscope tube) of a polarized light beam entering microscope objective aperture.
As shown in the diagram, a polarized beam illuminating the aperture of the objective provides both p- and s- wave polarizations, depending on orientation of the slice of illumination. A block diagram of this tool is shown in Fig. 58.[67] The tool performs BPE in a ~1 micron spot size. The NA of the objective restricts ellipsometry to angles near the normal. The tool provides reflectance spectrometry, and reflectance as a function of angle. Recent developments of the tool incorporate an auxiliary conventional ellipsometer head to calibrate the BPE. Multiple capabilities within a single instrument help to determine the interference order correctly.
Figure 58. Schematic of a commercial film thickness tool employing beam profile microscope ellipsometry and other film thickness measurement systems.
5.5
Physics of Optical Film Thickness Measurement
The use of ellipsometry and spectrometry to measure a thin transparent film on a silicon substrate is readily described by the basic electromagnetic theory of optical radiation.[67] As an elementary example of how to apply this theory, consider a plane polarized electromagnetic wave of wavelength λ impinging obliquely at angle θ upon a transparent film such
as silicon dioxide, silicon nitride, or photoresist that is infinitely thick. (See Fig. 59.)
Figure 59. Incident, reflected and transmitted light rays at an interface between materials.
Light Incident on an Infinitely Thick Film. The problem can be resolved into s- and p-waves. For s-waves, the electric field, E, of the incident plus reflected waves must be equal to that of the transmitted wave at the boundary between the dissimilar media:
Eq. (13)
$$E_{i0} + E_{r0} = E_{i1}$$
The magnetic field, H, must be perpendicular to both the electric field E and the direction of propagation, $\vec{k}$. The directions of these three vectors obey a vector right-hand rule. In a plane wave, furthermore, Maxwell’s equations require that the magnitudes of E and H have a definite proportion: |H| = n|E|, where n is the refractive index of the medium through which the wave is traveling. H will have components both in the plane of the interface and perpendicular to the interface. The continuity of these two components of H across the interface requires:
Eq. (14)
$$(E_{i0} - E_{r0})\, n_0 \cos\theta = E_{i1}\, n_1 \cos\phi$$
Eq. (15)
$$(E_{i0} + E_{r0})\, n_0 \sin\theta = E_{i1}\, n_1 \sin\phi$$
Equations (13) and (15) are made identical by Snell’s Law, $n_0 \sin\theta = n_1 \sin\phi$, which is necessitated by the requirement that the wavefronts of the waves all along the interface must vary at the same spatial frequency. Only Eqs. (13), (14), and Snell’s Law are independent equations. When Eqs. (13) and (14) are combined to eliminate the transmitted amplitude, and solved for the s-wave electric field reflectivity, the result is:
Eq. (16)
$$r_{01}^{s} \equiv \frac{E_{r0}}{E_{i0}} = \frac{n_0 \cos\theta - n_1 \cos\phi}{n_0 \cos\theta + n_1 \cos\phi}$$
Reflectance Results For Infinitely Thick Film. As an example of this equation, consider normal incidence, where θ and φ are both zero. The reflectivity, which is the ratio of reflected to incident intensity, then becomes the familiar result:
Eq. (17)
$$R = \left|\frac{E_{r0}}{E_{i0}}\right|^2 = \left(\frac{n_0 - n_1}{n_0 + n_1}\right)^2$$
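As a quick numerical check of Eq. (17), the sketch below evaluates the normal-incidence reflectivity for air over glass and air over silicon. The index values (1.4 for glass, 3.9 for silicon) are the ones quoted later in this section for the reflectance spectra; the function name is an illustrative choice, and absorption is ignored.

```python
def normal_reflectance(n0, n1):
    """Intensity reflectivity at normal incidence, Eq. (17), for real indices."""
    return ((n0 - n1) / (n0 + n1)) ** 2

for name, n in (("glass", 1.4), ("silicon", 3.9)):
    print(f"air/{name}: R = {normal_reflectance(1.0, n):.4f}")
```

The result is symmetric in the two indices, so the same fraction of intensity is reflected whichever side the light arrives from.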
The result is that the reflected intensity increases with the square of the index difference between the two materials. The equations for p-waves are similar. As above, the tangential E component must be continuous across the interface. As for the normal component of E, there can be a discontinuity due to polarization charge at the surface. The continuous quantity is instead the normal component of electric displacement, $D = \varepsilon E = n^2 E$, where ε is the dielectric constant of the medium at the frequency of the light wave. Now the H field is parallel to the plane of the interface and must be continuous across the boundary.
Eq. (18)
$$(E_{i0} - E_{r0}) \cos\theta = E_{i1} \cos\phi$$
Eq. (19)
$$(E_{i0} + E_{r0})\, n_0^2 \sin\theta = E_{i1}\, n_1^2 \sin\phi$$
Eq. (20)
$$(E_{i0} + E_{r0})\, n_0 = E_{i1}\, n_1$$
Now Eqs. (19) and (20) contain Snell’s Law, and if the equations are solved for the ratio of reflected to incoming amplitude, the result is:
Eq. (21)
$$r_{01}^{p} \equiv \frac{E_{r0}}{E_{i0}} = \frac{\dfrac{\cos\theta}{n_0} - \dfrac{\cos\phi}{n_1}}{\dfrac{\cos\theta}{n_0} + \dfrac{\cos\phi}{n_1}}$$
Importance of Brewster’s Angle for Ellipsometry. For normal incidence, where there is no distinction between s- and p-waves, the above result for p-wave polarization is the same as for s-waves. For oblique incidence, however, this equation implies a very important difference between s- and p-wave reflection. Namely, the reflectivity of p-waves can go to zero at a unique angle, known as Brewster’s angle, at which
Eq. (22)
$$\frac{\cos\theta}{n_0} = \frac{\cos\phi}{n_1}$$
The solution to this equation is
$$\theta_B = \sin^{-1}\sqrt{\frac{n_1^2}{n_0^2 + n_1^2}}$$
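The vanishing of the p-wave reflection can be checked numerically from Eqs. (16), (21), and (22). The sketch below assumes real (non-absorbing) indices, with n0 for the incident medium and n1 for the reflecting medium; the function names and the glass index of 1.5 are illustrative assumptions.

```python
import math

def fresnel_amplitudes(n0, n1, theta_deg):
    """Return (r_s, r_p) from Eqs. (16) and (21) for real indices, n0 <= n1."""
    theta = math.radians(theta_deg)
    phi = math.asin(n0 * math.sin(theta) / n1)          # Snell's Law
    cos_t, cos_p = math.cos(theta), math.cos(phi)
    r_s = (n0 * cos_t - n1 * cos_p) / (n0 * cos_t + n1 * cos_p)
    r_p = (cos_t / n0 - cos_p / n1) / (cos_t / n0 + cos_p / n1)
    return r_s, r_p

def brewster_angle_deg(n0, n1):
    """Angle of incidence at which Eq. (22) holds."""
    return math.degrees(math.asin(math.sqrt(n1**2 / (n0**2 + n1**2))))

theta_b = brewster_angle_deg(1.0, 1.5)                  # assumed index for glass
for angle in (0.0, 30.0, theta_b, 70.0, 85.0):
    r_s, r_p = fresnel_amplitudes(1.0, 1.5, angle)
    print(f"{angle:6.2f} deg  Rs = {r_s**2:.4f}  Rp = {r_p**2:.4f}")
```

For an assumed glass index of 1.5 the angle comes out near 56 degrees; the exact value depends on the index chosen.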
For any possible combination of two substances, the p-wave will have a vanishing reflection at one angle of incidence. The s-wave reflection, on the other hand, cannot vanish; the corresponding equation for the s-wave reflected amplitude has no zero between 0 and 90 degrees. Figure 60 shows the reflectivity of glass, for which θB ≈ 57°. For the air/silicon interface, θB ≈ 79°. What is the significance of Brewster’s angle? Ellipsometry measures the difference between s- and p-wave reflection. For rather oblique angles in the vicinity of Brewster’s angle, this difference is large. The ellipsometer has large signals to work with and is highly sensitive. For film thicknesses that are too small for measurement via accurate analysis of spectra, the reflectivity differences at oblique angles make ellipsometry highly effective. In semiconductor applications, the film to be measured is often a uniform transparent or partly absorbing layer deposited on a silicon substrate. The model for this situation is shown in Fig. 61.[68] The electromagnetic theory can be applied exactly as for a single interface treated above. There is a partial reflection both at the air-thin film interface and at the thin film - silicon interface. Both the air layer and the thin film, therefore, contain incident and reflected wave components. The silicon contains only a transmitted wave. Again, the problem resolves into completely separate s-wave and p-wave problems; there is no
mode conversion. The same boundary conditions as above need to be met. The result for the reflected amplitude from the combination is:
Eq. (23)
$$r_{123} = \frac{r_{12} + r_{23}\, e^{2i\beta}}{1 + r_{12}\, r_{23}\, e^{2i\beta}}$$
where $\beta = (2\pi/\lambda)\, n_2 d \cos\phi$ is the phase change of the electromagnetic wave in one traversal of the thin film, so that $2\beta$ is the phase change due to traversal two times through the film. (Here, λ is the wavelength in vacuum, d is the thickness of the film, and φ is the angle between the ray and the normal within the film.)[69]
Figure 60. Dependence of reflectivity upon polarization at all angles of incidence. At Brewster’s angle, the p-component of polarization has zero reflection, whereas the s-wave reflects normally.
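[Figure 61 labels: incident beam i; reflected amplitude r123; angles θ and φ; interface reflection r23; refractive indices n1, n2, n3.]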
Figure 61. Incident, reflected, and transmitted beams at the two interfaces between a thin film (layer 2) and the semi-infinite media (layers one and three) on either side of it.
Normal Reflectance Results on a Thin Film. Consider normal incidence upon a thin, non-absorbing film. Again, θ and φ become zero. In the above equation, reflection amplitudes $r_{12}$ and $r_{23}$ become simply ratios of the difference of refractive indices to their sum, i.e., at normal incidence, $r_{12} = (n_1 - n_2)/(n_1 + n_2)$. The reflectivity, which is the ratio of reflected to incident intensity, then becomes:
Eq. (24)
$$R = \left|\frac{E_{r0}}{E_{i0}}\right|^2 = \frac{r_{12}^2 + r_{23}^2 + 2\, r_{12}\, r_{23} \cos 2\beta}{1 + r_{12}^2\, r_{23}^2 + 2\, r_{12}\, r_{23} \cos 2\beta}$$
Figure 62 plots R for several illustrative cases assuming n’s independent of wavelength.
[Figure 62 plot: Calculated Normal Film Reflectance Spectra. Intensity reflectance ratio (0 to 0.7) versus wavelength (350 to 750 nm); curves for 750 nm resist on Si, 750 nm SiO2 on Si, and 750 nm resist on glass.]
Figure 62. Calculated normal incidence reflectance spectra for a few different thin films on different substrates.
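Spectra of the kind shown in Fig. 62 can be generated directly from Eq. (24). The sketch below is an illustration under the same simplification as the figure (indices independent of wavelength, no absorption) and uses the index values quoted in this section: resist 1.6, oxide 1.45, silicon 3.9, glass 1.4. The function name is hypothetical.

```python
import math

def film_reflectance(n1, n2, n3, d_nm, wavelength_nm):
    """Normal-incidence reflectivity of a film (n2, thickness d_nm) between
    an ambient n1 and a substrate n3, from Eq. (24) with real indices."""
    r12 = (n1 - n2) / (n1 + n2)
    r23 = (n2 - n3) / (n2 + n3)
    beta = 2.0 * math.pi * n2 * d_nm / wavelength_nm
    cross = 2.0 * r12 * r23 * math.cos(2.0 * beta)
    return (r12**2 + r23**2 + cross) / (1.0 + (r12 * r23) ** 2 + cross)

stacks = {
    "750 nm resist on Si": (1.0, 1.6, 3.9),
    "750 nm SiO2 on Si": (1.0, 1.45, 3.9),
    "750 nm resist on glass": (1.0, 1.6, 1.4),
}
wavelengths = range(350, 751, 2)
for name, (n1, n2, n3) in stacks.items():
    spectrum = [film_reflectance(n1, n2, n3, 750.0, w) for w in wavelengths]
    print(f"{name}: min R = {min(spectrum):.3f}, max R = {max(spectrum):.3f}")
```

Printing the envelope extremes for each stack gives a direct way to compare the upper and lower reflectivity envelopes of the three cases.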
The following basic principles emerge for reflectance spectra:
(a) The spectral reflectivity shows oscillations of constant amplitude, if the optical constants are invariant with wavelength.
(b) Higher refractive index or greater thickness results in more rapid oscillations, e.g., resist (n = 1.6) produces more rapid oscillations than oxide (n = 1.45).
(c) The upper envelope of reflectivity is the same for resist on silicon as for oxide on silicon. This upper envelope is higher than that of resist on glass. The reflectivity envelope is higher over Si than glass because of the large refractive index difference between Si (n = 3.9) and glass (n = 1.4).
(d) The lower reflectivity envelope depends on all three of the refractive indices. A perfect AR coating has a lower reflectivity envelope of zero. An AR coating is designed for use at a wavelength where the reflectivity is at the lower envelope value.
(e) If there is significant absorption in the thin film, the top surface will reflect strongly, but there is little amplitude returning from the bottom film interface to cause interference. Under absorption conditions, the envelope closes down, or “pinches,” as illustrated in Fig. 63.[70]
Figure 63. The normal incidence reflectance spectrum of polysilicon, which is absorbing at short wavelengths of light. The envelope of the spectrum narrows down in the absorbing region because the interfering beam is being absorbed.
Ellipsometry Results on a Thin Film. The result (Eq. 23) for $r_{123}$ for a thin film on a substrate applies to either s- or p-waves. Note that $r_{12}^{s}$ and $r_{23}^{s}$ are used for finding $r_{123}^{s}$, whereas $r_{12}^{p}$ and $r_{23}^{p}$ are used for finding $r_{123}^{p}$. The phase term, $e^{2i\beta}$, can be viewed as the change in the phase of the wave as it traverses two times across the thickness of the film. Ellipsometry provides a highly accurate means for measuring the relative magnitudes and phases of the s- and p-wave complex reflection amplitudes, $r_{123}^{s}$ and $r_{123}^{p}$. A convenient way to view the amplitude ratio and the phase difference, ∆, between s- and p-wave reflections is to define Ψ as the angle whose tangent is equal to the amplitude ratio, $\tan\Psi = |r_{123}^{s}/r_{123}^{p}|$. Then any reflection corresponds to a point on the ∆-Ψ plane. Figure 64 shows the locus of such points for all thicknesses of silicon dioxide deposited on silicon. The plotted squares correspond to oxide thicknesses 0, 10, 20,...190 nm. The point of zero film thickness is at the left end of the figure, at ∆ = 180°. The s- to p-wave reflectivity ratio changes rapidly as film thickness increases from zero, while the phase difference is increasing only slightly.
[Figure 64 plot: Silicon Dioxide on Silicon. Axis ticks run from 0 to 300 on one axis and from 0 to 100 on the other; axis label ∆.]
Figure 64. Calculated ellipsometry data for silicon dioxide on silicon. The plot begins at the left end of the oval, and proceeds counter-clockwise with increasing film thickness. The x- and y-axes of the graph relate to the phase difference and amplitude ratio, respectively, in the two reflected polarizations.
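A locus of this kind can be computed from Eq. (23) together with the Fresnel amplitudes of Eqs. (16) and (21). The sketch below is illustrative only: it follows the definition of Ψ given in the text (s-wave amplitude over p-wave amplitude), treats all indices as real, and assumes a 632.8 nm wavelength, a 70° angle of incidence, and indices of 1.45 for oxide and 3.9 for silicon. None of these settings is stated to be the one used for Fig. 64.

```python
import cmath
import math

def psi_delta(d_nm, wavelength_nm=632.8, theta_deg=70.0, n1=1.0, n2=1.45, n3=3.9):
    """Return (Psi, Delta) in degrees for a film of thickness d_nm."""
    sin_t = n1 * math.sin(math.radians(theta_deg))     # conserved by Snell's Law
    c1, c2, c3 = (math.sqrt(1.0 - (sin_t / n) ** 2) for n in (n1, n2, n3))
    phase = cmath.exp(2j * (2.0 * math.pi / wavelength_nm) * n2 * d_nm * c2)

    def stack(r12, r23):                               # Eq. (23)
        return (r12 + r23 * phase) / (1.0 + r12 * r23 * phase)

    r_s = stack((n1 * c1 - n2 * c2) / (n1 * c1 + n2 * c2),
                (n2 * c2 - n3 * c3) / (n2 * c2 + n3 * c3))
    r_p = stack((c1 / n1 - c2 / n2) / (c1 / n1 + c2 / n2),
                (c2 / n2 - c3 / n3) / (c2 / n2 + c3 / n3))
    rho = r_s / r_p
    return math.degrees(math.atan(abs(rho))), math.degrees(cmath.phase(rho)) % 360.0

def thickness_period_nm(wavelength_nm=632.8, theta_deg=70.0, n1=1.0, n2=1.45):
    """Thickness at which 2*beta reaches 2*pi and the locus closes on itself."""
    sin_t = n1 * math.sin(math.radians(theta_deg))
    return wavelength_nm / (2.0 * math.sqrt(n2**2 - sin_t**2))

for d in (0.0, 50.0, 100.0, 150.0, 200.0, thickness_period_nm()):
    psi, delta = psi_delta(d)
    print(f"d = {d:7.2f} nm  Psi = {psi:6.2f}  Delta = {delta:7.2f}")
```

The last row repeats the first, which is the order ambiguity of a single-angle, single-wavelength measurement on a non-absorbing film.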
As the optical path length through the oxide approaches one quarter wavelength (~2600 Å for a 70° angle of incidence), the graph falls back on itself. There is ambiguity in the thickness determination of a non-absorbing sample when using a single-angle, single-wavelength ellipsometer. By using a multiple-angle ellipsometer, some of the ambiguity is removed. This is illustrated in Fig. 65. At steeper angles, the s- and p-wave differences are reduced, and with them the sensitivity of ellipsometry. The locus closes on itself at a thinner minimum film thickness. In this way, the ambiguity between orders can be removed.
[Figure 65 plot: Ellipsometry at Different Angles. Axis ticks run from 0 to 300 on one axis and from 0 to 100 on the other; axis label ∆.]
Figure 65. Calculated ellipsometry data for silicon dioxide on silicon. Here, three different curves are generated by assuming three different angles of incidence in the ellipsometer. Steeper incidence angles give smaller ovals. As in Fig. 64, the points move counterclockwise as the thickness increases.
If multiple wavelengths are available, the ambiguity is also reduced. However, an assumption must be made about the dispersion relation, i.e., the dependence of the complex refractive index upon wavelength. Figure 66 shows how decreasing the wavelength from 700 nm to 400 nm causes more rapid change in polarizations with thickness. The plotted points indicate 20 nm intervals in film thickness. Here, the refractive indices are assumed to be independent of wavelength.
[Figure 66 plot: axis ticks run from 0 to 300 on one axis and from 0 to 100 on the other; axis label ∆.]
Figure 66. Calculated ellipsometry data for silicon dioxide on silicon. Here, two different wavelengths are simulated. The smaller wavelength curve moves around the oval more quickly with thickness. Multiple wavelength ellipsometry reduces order ambiguity.
Reflectance Versus Spectral Ellipsometric Determination of Resist Cauchy Coefficients. Cauchy coefficients are an efficient approximation or model of the refractive index as a function of wavelength for dielectric materials including photoresist. Accurate Cauchys can be combined with the oscillation frequency of the reflectance spectrometer to give an accurate film thickness. The approximation formula which defines the Cauchy coefficients is:
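Eq. (25)
$$n + ik = n_1 + \frac{n_2}{\lambda^2} + \frac{n_3}{\lambda^4} + \frac{n_4}{\lambda^6} + \cdots + i\left(k_1 + \frac{k_2}{\lambda^2} + \frac{k_3}{\lambda^4} + \frac{k_4}{\lambda^6} + \cdots\right)$$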
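Evaluating Eq. (25) is a short polynomial sum in inverse powers of wavelength squared. The sketch below assumes the wavelength is expressed in micrometers and uses made-up coefficients for a resist-like material; neither the units nor the values come from the text.

```python
def cauchy_index(wavelength_um, n_coeffs, k_coeffs=()):
    """Complex refractive index n + ik from Eq. (25).

    n_coeffs = (n1, n2, n3, ...) and k_coeffs = (k1, k2, k3, ...);
    term j of each series is divided by wavelength**(2*j).
    """
    def series(coeffs):
        return sum(c / wavelength_um ** (2 * j) for j, c in enumerate(coeffs))
    return complex(series(n_coeffs), series(k_coeffs))

# Hypothetical coefficients: n1 = 1.60, n2 = 0.01 um^2, no absorption.
for wavelength in (0.35, 0.45, 0.55, 0.65, 0.75):
    print(wavelength, round(cauchy_index(wavelength, (1.60, 0.01)).real, 4))
```

With a positive second coefficient the index rises toward shorter wavelengths, which is the usual behavior of transparent dielectrics.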
As shown in Sec. 5.5, the index of refraction determines the amplitude of normal reflectance oscillations. Reversing the logic, the amplitude of the reflectance oscillations could possibly be used to determine the index of refraction as a function of wavelength. The accuracy of Cauchy determinations from reflectance amplitude, however, is limited. The product of film thickness and index of refraction determines the reflectance oscillation interference maxima and minima, as shown in Fig. 67. The figure demonstrates the ambiguity of fitting a model to the data, since a larger index value combined with a lower thickness value produces very similar curves. In the figure, the index of the 1000 nm film is n = 1.60.
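The ambiguity can be reproduced with Eq. (24) by holding the product of index and thickness constant. The sketch below is illustrative: it assumes a silicon substrate with n = 3.9, wavelength-independent indices, and films of 1000, 1020, and 1040 nm whose indices are scaled from n = 1.60 so that the product stays fixed.

```python
import math

def film_reflectance(n2, d_nm, wavelength_nm, n1=1.0, n3=3.9):
    """Normal-incidence reflectivity from Eq. (24) for real indices."""
    r12 = (n1 - n2) / (n1 + n2)
    r23 = (n2 - n3) / (n2 + n3)
    cross = 2.0 * r12 * r23 * math.cos(4.0 * math.pi * n2 * d_nm / wavelength_nm)
    return (r12**2 + r23**2 + cross) / (1.0 + (r12 * r23) ** 2 + cross)

# (thickness in nm, index) with thickness * index = 1600 nm in every case.
films = [(1000.0, 1.60), (1020.0, 1600.0 / 1020.0), (1040.0, 1600.0 / 1040.0)]
for wavelength in (400.0, 450.0, 500.0, 550.0, 640.0):
    row = [round(film_reflectance(n, d, wavelength), 4) for d, n in films]
    print(wavelength, row)
```

The interference phase depends only on the product, so the maxima and minima fall at the same wavelengths for all three films, and only the depth of the minima separates them.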
Spectral ellipsometer information unambiguously determines the index of refraction as a function of wavelength. Figure 68 illustrates ellipsometer outputs for the same films examined in Fig. 67. Previous ellipsometer plots showed the phase change, ∆, plotted against the reflectance ratio, Ψ. Figure 68 plots both ∆ and Ψ against the spectral wavelength. With the (index times thickness) remaining constant, the curves shift sharply in position even though the oscillation frequency remains constant. Reflectance Spectrometry Cauchy Determination [Thickness x Index = constant]
[Plot: Reflectance Spectrometry Cauchy Determination (Thickness x Index = constant). Intensity reflectance ratio versus wavelength, 350 to 750 nm, for 1000 nm, 1020 nm, and 1040 nm films.]
Figure 67. Calculated normal incidence spectrometry for different film thicknesses. The film thickness decreased, but the index of refraction was increased so as to keep the product constant.
[Plot: Resist Ellipsometry Cauchy Determination (Thickness x Index = constant). Ellipsometer psi and delta versus wavelength, 350 to 750 nm, for 1000 nm, 1020 nm, and 1040 nm films.]
Figure 68. Calculated spectral ellipsometry for the same films as studied in Fig. 67. Although the product of thickness and refractive index is fixed, the curves shift dramatically in phase. This illustrates that ellipsometry is a more sensitive method for simultaneous index and thickness determination.
6.0 STATISTICAL APPLICATIONS TO METROLOGY
Statistics plays a part in every aspect of metrology. The purpose of metrology is to assign numerical values to process parameters of importance. This demands a mathematically correct evaluation of those parameters, a correct evaluation of the measurement capability, and a correct appreciation of the significance of the results.
6.1 Definitions of Accuracy, Precision, Reproducibility and Matching
(a) Accuracy of measurement[71] – closeness of the agreement between the result of a measurement and the “true” value of the quantity being measured. Note that “Accuracy” is a qualitative concept.
(b) Repeatability – closeness of the agreement between the results of successive measurements of the same parameter carried out under the same conditions of measurement. The constant conditions are called “repeatability conditions.” Repeatability conditions include: the same measurement procedure, the same observer, the same measuring instrument used under the same conditions, the same location, and repetition over a short period of time. Repeatability is expressed quantitatively in terms of the standard deviation of the results. Static repeatability is measured under the condition of no wafer stage motion. Dynamic repeatability defines the measurement process to include the wafer loading/unloading sequence.
(c) Reproducibility – closeness of the agreement between the results of measurements of the same parameter carried out under changed conditions of measurement. A valid statement of reproducibility requires specification of the conditions changed. The changed conditions may include: observer, measuring instrument, method of measurement, reference standard, location, conditions of use, and the passage of time. Reproducibility may be expressed quantitatively in terms of the standard deviation of the results.
(d) Matching – the reproducibility component in which the changed condition is the serial number of the measurement tool. Fab metrology tools should match provided they contain the same hardware.
(e) Precision – total variation in the measurement system. This is composed of the repeatability and the reproducibility. The mathematical model for the variation is the well-known formula:
Eq. (26)    $\mathrm{Precision} = \sqrt{\sigma_{rpt}^2 + \sigma_{RPD}^2}$
(f) Tolerance – the specification range allowed for a process. The mathematical formula is T = (Upper Spec Limit) – (Lower Spec Limit).
(g) P/T Ratio – precision-to-tolerance, the percentage ratio of the six-sigma metrology precision to the process tolerance. Table 5 describes metrology quality in terms of P/T.
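The definitions of precision, tolerance, and P/T can be combined into a short calculation. The sketch below assumes that precision is the root-sum-square of the repeatability and reproducibility standard deviations, as in Eq. (26), and that P/T is six times that precision divided by the tolerance. The numerical values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def precision(sigma_rpt, sigma_rpd):
    """Eq. (26): root-sum-square of repeatability and reproducibility."""
    return math.sqrt(sigma_rpt ** 2 + sigma_rpd ** 2)

def p_over_t_percent(sigma_rpt, sigma_rpd, upper_spec, lower_spec):
    """Six-sigma precision as a percentage of the process tolerance."""
    tolerance = upper_spec - lower_spec
    return 100.0 * 6.0 * precision(sigma_rpt, sigma_rpd) / tolerance

# Hypothetical linewidth example (all values in nm, for illustration only).
sigma_rpt_nm = 1.0      # repeatability, 1 sigma
sigma_rpd_nm = 0.75     # reproducibility, 1 sigma
pt = p_over_t_percent(sigma_rpt_nm, sigma_rpd_nm, upper_spec=110.0, lower_spec=90.0)
print(f"precision = {precision(sigma_rpt_nm, sigma_rpd_nm):.2f} nm, P/T = {pt:.1f}%")
```

With these assumed numbers the tool consumes more than a third of the tolerance band, which would place it between the "satisfactory" and "unsatisfactory" rows of Table 5.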
Table 5. Engineering Significance of P/T Ratio

% P/T    Quality of Metrology    Process Control Engineering Input
≤10      excellent               No attention required
20       good                    Excellent control possible
30       satisfactory            Much attention necessary
≥40      unsatisfactory          “Flying blind”

6.2 Analysis of Variance for Metrology Gauge Studies and Process Analysis
In most metrology tool gauge studies, measurement site identity must be maintained so that measurements are repeated and compared on the same site, for every repeat, tool, operator, and day of the study. The analysis is different when analyzing process variance. New wafers may be expended in each day for each trial of a process study. Both analyses are discussed below.
Gauge Study ANOVA. The gauge study is used to determine whether the precision of a metrology tool is suitable for the measurement purpose required. The suitability depends on the conditions under which the tool operates. The gauge study should attempt to “capture” the variability of the tool under actual operating conditions. Such conditions include the normal maintenance schedule, a test period as long, if possible, as the typical time between tool calibrations, and the effect of process variations similar to those encountered on actual product. In the past, tools were manually operated and it was important to determine the effect of having different hands and eyes performing the measurements. On fully automated tools the operator has no influence on the measurement results, so it is unnecessary to rotate operators as part of the gauge study on such tools.
The gauge study is a designed experiment in which each of the important factors is varied. The experimental design is a model of the measurement process. The gauge study is interpreted by performing an “Analysis of Variance” (ANOVA). The ANOVA mathematically determines the important variables and quantifies their individual contributions to measurement error. The mathematics for the ANOVA is straightforward, and can be performed easily on any spreadsheet. This methodology is developed below.
Example Gauge Study. For example, a thin film measurement of light absorption at 610 nm might be performed upon five sample product wafers. The measurement is performed at three locations progressing from the center to the edge of the wafer. The wafers are looped three times through this program each day. This entire set of measurements is performed daily for three days. A spreadsheet of the data is shown in Fig. 69.
[Spreadsheet, columns A to M, rows 1 to 19]
A      B     C        D        E        F        G        H        I        J        K        L           M
             day 1                      day 2                      day 3
wafer  site  repeat 1 repeat 2 repeat 3 repeat 1 repeat 2 repeat 3 repeat 1 repeat 2 repeat 3 wafer avg.  site avg.
             A        A        A        A        A        A        A        A        A
1      1     0.80084  0.80153  0.79926  0.80085  0.80115  0.80113  0.80080  0.80100  0.80092  0.80386     0.80312
1      2     0.80259  0.80204  0.80162  0.80139  0.80308  0.80366  0.80255  0.80222  0.80209              0.80364
1      3     0.80844  0.80866  0.80763  0.80747  0.80840  0.80871  0.80870  0.80822  0.80930              0.80839
2      1     0.80184  0.80118  0.80105  0.79995  0.80084  0.80034  0.80009  0.80031  0.80205  0.80416
2      2     0.80261  0.80268  0.80179  0.80207  0.80288  0.80213  0.80258  0.80279  0.80414
2      3     0.81017  0.80866  0.80777  0.80805  0.80967  0.80897  0.80929  0.80921  0.80918
3      1     0.80840  0.80892  0.80828  0.80924  0.80959  0.80856  0.80956  0.80848  0.80953  0.80874
3      2     0.80672  0.80692  0.80698  0.80684  0.80703  0.80648  0.80794  0.80691  0.80777
3      3     0.81063  0.81022  0.80972  0.81013  0.81037  0.80951  0.81106  0.80951  0.81072
4      1     0.80720  0.80640  0.80452  0.80418  0.80285  0.80280  0.80334  0.80652  0.80554  0.80643
4      2     0.80631  0.80629  0.80576  0.80602  0.80582  0.80524  0.80561  0.80630  0.80572
4      3     0.80945  0.80799  0.80863  0.80853  0.80758  0.80857  0.80862  0.80783  0.81001
5      1     0.80012  0.80136  0.80098  0.80073  0.79858  0.79887  0.79827  0.80095  0.80137  0.80206
5      2     0.80099  0.79943  0.79997  0.80020  0.80028  0.79977  0.80054  0.79948  0.80153
5      3     0.80678  0.80522  0.80527  0.80612  0.80535  0.80532  0.80621  0.80553  0.80630
day avg.              0.80511                    0.80478                    0.80526           grand avg.  0.80505
Figure 69. A spreadsheet of the absorption metrology gauge study performed on five wafers at three sites over a period of three days.
Expected Variability To Be Measured By The Gauge Study. The basic model is as follows. As in many wafer processes, the expected variations are:
i. Radial dependence. It is assumed the same radial dependence is common to all the wafers.
ii. Wafer-to-wafer variation.
iii. Dynamic repeatability noise of the metrology tool. This is a combination of the inherent measurement noise (static repeatability) plus the combination of film non-uniformity and the inability of the metrology stage to return each time to the exact same location (stage repeatability).
iv. Metrology tool drift. An important determination of the gauge study is the measurement reproducibility over several days. Again it is assumed this is a tool effect and is the same for all sites and wafers.
The ANOVA Table. The analysis of variance is conveniently displayed in an ANOVA table. The ANOVA table for the present example is displayed in Fig. 70.
[Spreadsheet, columns B to G, rows 20 to 26]
      B                      C          D      E          F         G
20    Source of Variation    SS         df     MS         F         probability
21    wafers                 7.21E-04   4      1.80E-04   308.63    1.21E-51
22    days                   5.28E-06   2      2.64E-06   4.52      1.34E-02
23    sites                  7.61E-04   2      3.80E-04   651.03    2.99E-54
24    repeatability error    5.26E-05   90     5.84E-07
25    interactions           1.76E-04   36     4.88E-06   8.36      1.95E-16
26    total                  1.72E-03   134    1.28E-05
Figure 70. The ANOVA Table for the thin film absorption example.
The first column of the ANOVA table lists the sources of variation in the data set. For each row in the table, a sum of squares (SS) attributable to that source, the degrees of freedom (df) of that source (the number of entries in the sum of squares minus one), and a mean square deviation (MS) are calculated. The purpose of each row is to compute the ratio (F) between the mean square of the respective factor and the statistical “noise” in the data. In this case, the dynamic repeats designed into the experiment establish a reliable repeatability noise with which to compare any other process or metrology variations. Thus, large F numbers occur in this example when the mean square for an effect is large compared to the dynamic repeatability. The numbers in column F are distributed according to an F-distribution with the appropriate degrees of freedom. The F-distribution function provides a probability, given in the last column, that an F value this large would arise if the effect were statistically insignificant (i.e., under the null hypothesis).
Results of the Example ANOVA. In the example of Fig. 70, all factor effects are most likely real. The validity of the model should be tested by comparing the total variation in the data with the sum of all the factors in the model. In Fig. 70, the “interaction” source of error embodies this difference. Since about 90% of the total SS is explained by the factors, the model is essentially correct. There are some significant interactions, e.g., the radial dependence of the site-to-site variation is somewhat different on different wafers. If the radial dependence is dissimilar on different wafers, a Nested ANOVA analysis is more appropriate, with sites nested within wafers (see below). When a nested analysis is applied to the present data set, ~99% of the total SS is explained by the factors.
The important quantities to the metrologist are the dynamic repeatability and the reproducibility. The dynamic repeatability ($\sigma_{rpt}$) is the square root of the mean square repeatability entry, i.e., $3\sigma_{rpt} = 3 \times \sqrt{5.84 \times 10^{-7}} = 0.0023$. The day-to-day MS term includes day-to-day drift together with a small component of dynamic repeatability noise. Since there are forty-five measurements each day, the reproducibility is obtained from the day-to-day MS ($2.64 \times 10^{-6}$, i.e., the SS of $5.28 \times 10^{-6}$ divided by its two degrees of freedom) as $3\sigma_{RPD} = 3 \times \sqrt{(2.64 \times 10^{-6} - 5.84 \times 10^{-7})/45} = 0.00064$.
Computational Tools for ANOVA Spreadsheets. A table of spreadsheet formulas for the entries in the ANOVA table is given in Fig. 71.
[Spreadsheet, columns B to G, rows 20 to 26]
      B                      C                        D      E          F          G
20    Source                 SS                       df     MS         F          probability
21    wafers                 135*VARP(L6:L18)         4      C21/D21    E21/E24    FDIST(F21,D21,D24)
22    days                   135*VARP(D21:J21)        2      C22/D22    E22/E24    FDIST(F22,D22,D24)
23    sites                  135*VARP(M6:M8)          2      C23/D23    E23/E24    FDIST(F23,D23,D24)
24    repeatability error    C26-135*VARP(D29:J48)    90     C24/D24
25    interactions           C26-C21-C22-C23-C24      36     C25/D25    E25/E24    FDIST(F25,D25,D24)
26    total                  135*VARP(C6:K20)         134
Figure 71. The formulas required for each of the entries in the ANOVA table. The first argument of FDIST is the F ratio in column F.
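The spreadsheet formulas can be cross-checked outside a spreadsheet. The following sketch recomputes the wafer, day, and site sums of squares from the averages printed in Fig. 69, using Python's `statistics.pvariance` as a stand-in for VARP, and then forms MS, F, and the 3-sigma repeatability and reproducibility. The repeatability SS and the day-to-day MS are copied from Fig. 70 rather than recomputed. Because the printed averages are rounded to five decimals, the results agree with Fig. 70 only approximately, most visibly for the small day-to-day term.

```python
import math
from statistics import pvariance  # population variance, same definition as VARP

N_TOTAL = 135  # 5 wafers x 3 sites x 3 days x 3 repeats

# Averages as printed in Fig. 69 (rounded to five decimals).
wafer_avgs = [0.80386, 0.80416, 0.80874, 0.80643, 0.80206]
day_avgs = [0.80511, 0.80478, 0.80526]
site_avgs = [0.80312, 0.80364, 0.80839]

ss = {
    "wafers": N_TOTAL * pvariance(wafer_avgs),
    "days": N_TOTAL * pvariance(day_avgs),
    "sites": N_TOTAL * pvariance(site_avgs),
}
df = {"wafers": 4, "days": 2, "sites": 2}

# Repeatability SS and df taken directly from Fig. 70.
ms_rpt = 5.26e-5 / 90

for source in ss:
    ms = ss[source] / df[source]
    print(f"{source:7s} SS = {ss[source]:.2e}  MS = {ms:.2e}  F = {ms / ms_rpt:.1f}")

# 3-sigma repeatability and reproducibility, using the day-to-day MS of Fig. 70.
ms_days = 2.64e-6
measurements_per_day = 45
three_sigma_rpt = 3 * math.sqrt(ms_rpt)
three_sigma_rpd = 3 * math.sqrt((ms_days - ms_rpt) / measurements_per_day)
print(f"3 sigma rpt = {three_sigma_rpt:.4f}, 3 sigma RPD = {three_sigma_rpd:.5f}")
```

The probability column is not reproduced here because the Python standard library has no F-distribution function; a statistics package would be needed for that step.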
The ANOVA starts with a calculation of the sum of squared deviations due to each effect of the model. For example, the wafer effect is equal to the number of measurements on a wafer (27 in this example) multiplied by the sum of squares of the differences between the wafer averages and the grand average of all measurements. This is the quantity that 135*VARP of the five wafer averages computes, since VARP divides the sum of squares by five. The day-to-day and the site-to-site effects are computed in the same way. There are two valuable shortcuts for computing the needed sums of squares:
(a) The Variance. The variance function is a convenient tool available in any spreadsheet. For example, the VARP spreadsheet function is defined by
Eq. (27)    $\mathrm{VARP}(x_1, x_2, x_3, \ldots, x_n) \equiv \frac{1}{n}\sum_{k=1}^{n}\left(x_k - x_{\mathrm{average}}\right)^2$
VARP is the variance function for arguments that include the entire population, and so has n as the denominator instead of n − 1.
(b) The Nesting Rule. This rule allows, for example, the total variance to be expressed in terms of the within-wafer variances and the variance of the wafer averages. This property is expressed as follows: Let $Y_{ij}$ be a set of n measurements (i = 1...a, j = 1...b, and a × b = n). Form the b different factor averages
$\bar{Y}_j \equiv \frac{1}{a}\sum_{i=1}^{a} Y_{ij}$ and the grand average, $\bar{Y} \equiv \frac{1}{b}\sum_{j=1}^{b} \bar{Y}_j$
Then it can be shown
Eq. (28)    $\sum_{i,j}\left(Y_{ij} - \bar{Y}\right)^2 = \sum_{i,j}\left(Y_{ij} - \bar{Y}_j\right)^2 + a \cdot \sum_{j=1}^{b}\left(\bar{Y}_j - \bar{Y}\right)^2$
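The nesting rule follows from a short expansion, sketched here under the assumption that each factor average $\bar{Y}_j$ is taken over the a members of group j and that all groups are the same size:

```latex
\begin{aligned}
\sum_{i,j}\left(Y_{ij}-\bar{Y}\right)^2
 &= \sum_{i,j}\left[\left(Y_{ij}-\bar{Y}_j\right)+\left(\bar{Y}_j-\bar{Y}\right)\right]^2 \\
 &= \sum_{i,j}\left(Y_{ij}-\bar{Y}_j\right)^2
  + 2\sum_{j=1}^{b}\left(\bar{Y}_j-\bar{Y}\right)\sum_{i=1}^{a}\left(Y_{ij}-\bar{Y}_j\right)
  + a\sum_{j=1}^{b}\left(\bar{Y}_j-\bar{Y}\right)^2 .
\end{aligned}
```

The middle term vanishes because $\sum_{i=1}^{a}\left(Y_{ij}-\bar{Y}_j\right)=0$ for every j by the definition of the group average, which leaves the two terms of Eq. (28). In spreadsheet terms, the last term equals n·VARP of the group averages when the groups are of equal size.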
In Fig. 71, the day, wafer, and site effects are computed using the VARP function. The sum of squares for the repeatability error is computed by applying the nesting rule, Eq. (28): it is the total sum of squares minus 135 times the variance (VARP) of the forty-five averages of three repeats. The averages of each of the sets of three repeats need to be computed in a separate table, as in Fig. 72. There are 45 × 3 measurements, but the variance is taken within triplets, so each triplet contributes only two degrees of freedom, for a total of 90 degrees of freedom.
[Spreadsheet, columns A to J, rows 27 to 43]
                     day 1             day 2             day 3
      wafer  site    avg. of 3 rpts    avg. of 3 rpts    avg. of 3 rpts
      1      site1   0.80055           0.80104           0.80091
      1      site2   0.80208           0.80271           0.80229
      1      site3   0.80824           0.80819           0.80874
      2      site1   0.80136           0.80037           0.80082
      2      site2   0.80236           0.80236           0.80317
      2      site3   0.80887           0.80890           0.80923
      3      site1   0.80853           0.80913           0.80919
      3      site2   0.80687           0.80679           0.80754
      3      site3   0.81019           0.81000           0.81043
      4      site1   0.80604           0.80328           0.80513
      4      site2   0.80612           0.80569           0.80588
      4      site3   0.80869           0.80823           0.80882
      5      site1   0.80082           0.79939           0.80020
      5      site2   0.80013           0.80009           0.80051
      5      site3   0.80576           0.80559           0.80602
Figure 72. The averages of the three repeats for each site are required for the calculation of dynamic repeatability noise.
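The repeatability calculation can be illustrated on a subset of the data. The sketch below uses only the nine triplets of wafer 1 from Fig. 69 (three sites on three days), so its numerical result is not the 90-degree-of-freedom value of Fig. 70. It computes the within-triplet sum of squares two ways: by the nesting-rule shortcut (total SS minus n·VARP of the triplet averages) and by direct summation, which should agree.

```python
from statistics import mean, pvariance

# Wafer 1 readings from Fig. 69: one tuple of three repeats per (site, day).
triplets = [
    (0.80084, 0.80153, 0.79926), (0.80085, 0.80115, 0.80113), (0.80080, 0.80100, 0.80092),  # site 1
    (0.80259, 0.80204, 0.80162), (0.80139, 0.80308, 0.80366), (0.80255, 0.80222, 0.80209),  # site 2
    (0.80844, 0.80866, 0.80763), (0.80747, 0.80840, 0.80871), (0.80870, 0.80822, 0.80930),  # site 3
]

all_values = [x for t in triplets for x in t]
n = len(all_values)                       # 27 readings in this subset

# Nesting-rule shortcut: total SS minus the SS carried by the triplet averages.
ss_total = n * pvariance(all_values)
ss_between = n * pvariance([mean(t) for t in triplets])
ss_repeat = ss_total - ss_between

# Direct evaluation of the same quantity, for comparison.
ss_direct = sum((x - mean(t)) ** 2 for t in triplets for x in t)

df_repeat = len(triplets) * 2             # two degrees of freedom per triplet
sigma_rpt = (ss_repeat / df_repeat) ** 0.5
print(f"SS shortcut = {ss_repeat:.3e}, SS direct = {ss_direct:.3e}, sigma_rpt = {sigma_rpt:.5f}")
```

The shortcut works with a single VARP call on the averages only because every group has the same number of repeats; with unequal group sizes each group average would have to be weighted by its count.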
Grouping the Data: Nested ANOVA. The data in an experiment can be grouped in a variety of ways in order to analyze different effects. For example, in characterizing a process, it may be most important to determine the largest source of variation: do wafer-to-wafer, across-wafer, or day-to-day effects predominate? Some processes should be examined to evaluate a “first-wafer” effect. Some processes show a definite functional dependence from center to edge of the wafer. In many processing experiments, each wafer can be used only once. This is in contrast to metrology gauge studies, where measurement site identity must be maintained so that measurements are repeated and compared on the same site(s). In the following example, different ANOVA tables will be used to display the data for different data groupings. Different arrangements will highlight different effects. In each case the significance of an effect can be compared by finding the ratio of two variances, and the statistical significance of the variance ratio will be established by calculating the F-distribution.
Coater Track Example: Nested ANOVA. In a wafer coating experiment or in a destructive measurement, the wafer may be used only one time. A fresh set of wafers is used each day. The term “nesting” applies to the corresponding grouping of data. In the nested model, each day has its own set of wafers that are distinct from the wafers used on another day. The sites are also considered to be nested within the wafers. An example of a data set which can be analyzed by crossed and nested groupings is shown in Fig. 73, which depicts a coater characterization experiment.[72] The experiment was carried out by coating three wafers on each of four days, and measuring nine sites across the diameter of each wafer.
[Plot: Resist Coat Thickness on 9 Sites of 12 Wafers. Resist thickness (14600 to 15000) versus site number (1 to 9), one curve per wafer, Wafer 1 through Wafer 12.]
Figure 73. Plot of thickness measured at nine sites on each of twelve wafers. Three wafers were coated on each of four days.
Each day fresh wafers were coated in this experiment, so the wafers are nested within each day. The sites are in corresponding positions on all wafers, but they are located on different wafers and are, therefore, not physically identical sites. Thus, the sites are nested within each wafer. The ANOVA table is calculated by finding (1) the variance of the nine thickness measurements across each wafer, then pooling (averaging) the site variance result for all different wafers; (2) the variance of the three wafer mean thickness values, then pooling the wafer-to-wafer variances for the different days; and (3) the variance of the daily average thickness. These variances are then compared by taking their ratios and computing the statistical significance of the ratios as usual by means of an F-test. ANOVA Table 6 displays the results.
Table 6. ANOVA Results for Nested Model of Coat Experiment

Source                              SS        df     MS       F       prob.    sigma
day-to-day                          31409     3      10470    2.48    0.07     20
wafer-to-wafer nested within day    43424     8      5428     1.28    0.26     25
site-to-site nested within wafer    405520    96     4224                      65
total                               480353    107
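A nested decomposition of this kind can be sketched in a few lines. The data below are hypothetical (a made-up radial profile with small day, wafer, and random terms) and are not the measurements plotted in Fig. 73; only the structure, four days by three wafers by nine sites, is borrowed from the experiment. `statistics.pvariance` plays the role of VARP.

```python
import random
from statistics import mean, pvariance

DAYS, WAFERS_PER_DAY, SITES = 4, 3, 9

# Hypothetical thickness data in angstroms, for illustration only.
rng = random.Random(0)
radial = [-90, -40, 0, 30, 45, 30, 0, -40, -90]
data = [[[14850 + 10 * d - 8 * w + radial[s] + rng.gauss(0, 15)
          for s in range(SITES)]
         for w in range(WAFERS_PER_DAY)]
        for d in range(DAYS)]

n_total = DAYS * WAFERS_PER_DAY * SITES                      # 108 readings
all_values = [x for day in data for wafer in day for x in wafer]

# Day-to-day: n_total * VARP(day means).
day_means = [mean(x for wafer in day for x in wafer) for day in data]
ss_day = n_total * pvariance(day_means)

# Wafer-to-wafer nested within day: 27 * VARP(wafer means), summed over days.
ss_wafer = sum(WAFERS_PER_DAY * SITES * pvariance([mean(w) for w in day])
               for day in data)

# Site-to-site nested within wafer: 9 * VARP(sites), summed over wafers.
ss_site = sum(SITES * pvariance(wafer) for day in data for wafer in day)

ss_total = n_total * pvariance(all_values)

df_day = DAYS - 1                                            # 3
df_wafer = DAYS * (WAFERS_PER_DAY - 1)                       # 8
df_site = DAYS * WAFERS_PER_DAY * (SITES - 1)                # 96
ms_site = ss_site / df_site

for name, ss_val, dof in (("day-to-day", ss_day, df_day),
                          ("wafer in day", ss_wafer, df_wafer),
                          ("site in wafer", ss_site, df_site)):
    ms = ss_val / dof
    print(f"{name:14s} SS = {ss_val:10.0f}  df = {dof:3d}  MS = {ms:8.0f}  F = {ms / ms_site:6.2f}")
print(f"{'total':14s} SS = {ss_total:10.0f}  df = {n_total - 1:3d}")
```

Because the three nested levels partition the data completely, the three sums of squares add up to the total, which is a useful check on any spreadsheet implementation.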
In Table 6, the day-to-day sum of squares is found in a spreadsheet calculation by taking 108*VARP (day 1 mean thickness, ...day 4 mean thickness). The nested wafer sum of squares is found as 27*{VARP (wafer 1 mean, wafer 2 mean, wafer 3 mean) + VARP (wafer 4 mean, wafer 5 mean, wafer 6 mean) + ...}. The nested site-to-site sum of squares is found from 9*{VARP (wafer 1 site 1, wafer 1 site 2, ...wafer 1 site 9) + VARP (wafer 2 site 1, wafer 2 site 2, ...wafer 2 site 9) + ...}. The result of this analysis is that the nested site-to-site effects appear important relative to the other families of variation. This is shown by the calculation of sigma and of F. To determine the statistical significance, F is calculated as a ratio in which the site-to-site mean square is the denominator. The small values of F indicate the day-to-day and the wafer-to-wafer variations are barely significant relative to the site-to-site background “noise.” The model looks only at variations within a wafer and does not distinguish between systematic and random site-to-site variations. This model would give the same result with sites randomly located or in a random order.
Coater Track Example: Crossed Variable ANOVA. The above nested analysis provides a clear assessment of the relative importance of different families of variation. It ignores some important aspects of the data, however. For example, the graph of Fig. 73 shows there is a radial dependence shared by all the wafers. Nesting the data restricts the information to the within-wafer variation, and overlooks the site-to-site trends common to all the wafers. From a different viewpoint, wafers are produced to very tight specifications, and it is reasonable to treat the wafers as being basically identical each day. Site 1 on wafer 1 is also identical to site 1 on wafer 2, etc., in the sense that they both are located at the same radius on the wafer.
A crossed, or non-nested, analysis of the data is, therefore, also possible, and may reveal effects differently. The model in this crossed analysis is as follows: each datum can be predicted by starting with the grand average thickness, then adding a differential for the given site position, plus a differential for the particular day, plus a differential for the position of the wafer in the daily sequence, plus a residual error not accounted for by the previous effects. The analysis is summarized in Table 7.

Table 7. ANOVA Results for Crossed Model of Coat Experiment

Source           SS        df     MS       F     prob       sigma
Site position    368428    8      46053    90    1.1E–40    62
day-to-day       31409     3      10470    20    2.7E–10    20
wafer order      32473     2      16237    32    2.9E–11    21
error            48043     94     511
Total            480353    107    4489
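The crossed model can be sketched as a small function that takes readings indexed by day, position in the daily sequence, and site, and returns the main-effect and residual sums of squares. The demonstration data are hypothetical, not those of Fig. 73, and the function name `crossed_anova` is introduced here for illustration only. The approach assumes a balanced design, in which n·VARP of each set of averages gives the main-effect sum of squares.

```python
import random
from statistics import mean, pvariance

def crossed_anova(data):
    """data[day][order][site] -> {source: {"SS", "df", "MS", "F"}} for a balanced design."""
    n_days, n_order, n_sites = len(data), len(data[0]), len(data[0][0])
    n_total = n_days * n_order * n_sites
    values = [x for day in data for wafer in day for x in wafer]

    site_avgs = [mean(data[d][w][s] for d in range(n_days) for w in range(n_order))
                 for s in range(n_sites)]
    day_avgs = [mean(x for wafer in day for x in wafer) for day in data]
    order_avgs = [mean(x for d in range(n_days) for x in data[d][w])
                  for w in range(n_order)]

    ss = {"site position": n_total * pvariance(site_avgs),
          "day-to-day": n_total * pvariance(day_avgs),
          "wafer order": n_total * pvariance(order_avgs)}
    df = {"site position": n_sites - 1, "day-to-day": n_days - 1, "wafer order": n_order - 1}

    # Residual: whatever the three main effects do not explain.
    ss_error = n_total * pvariance(values) - sum(ss.values())
    df_error = (n_total - 1) - sum(df.values())
    ms_error = ss_error / df_error

    table = {name: {"SS": ss[name], "df": df[name], "MS": ss[name] / df[name],
                    "F": (ss[name] / df[name]) / ms_error if ms_error > 0 else float("inf")}
             for name in ss}
    table["error"] = {"SS": ss_error, "df": df_error, "MS": ms_error, "F": None}
    return table

# Hypothetical data: shared radial profile, day offsets, a thin first wafer, noise.
rng = random.Random(1)
radial = [-90, -40, 0, 30, 45, 30, 0, -40, -90]
first_wafer_offset = [-30, 0, 5]
demo = [[[14850 + 10 * d + first_wafer_offset[w] + radial[s] + rng.gauss(0, 15)
          for s in range(9)] for w in range(3)] for d in range(4)]

for source, row in crossed_anova(demo).items():
    f_text = "" if row["F"] is None else f"F = {row['F']:.1f}"
    print(f"{source:14s} SS = {row['SS']:10.0f}  df = {row['df']:3d}  MS = {row['MS']:8.0f}  {f_text}")
```

Because the shared radial profile is assigned to the site-position effect rather than to noise, the residual mean square is small and the F ratios are much larger than in a nested treatment of the same numbers.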
In Table 7, the site-position sum of squares is found from 108*{VARP (average site 1, average site 2, ...average site 9)}. The day-to-day and wafer-order sums of squares are determined similarly. The error sum of squares is just the residual after all the effect sums of squares are subtracted from the total sum of squares. The total sum of squares is given by 108*{VARP (all data)}. Although both the nested and the crossed analyses give similar standard deviations for each effect, the statistical significance is much greater in the crossed analysis. This arises because the denominator in F is now the residual error mean square, rather than the site-to-site mean square. The small value of the error mean square shows the model fits the data well. The large statistical significance indicates the crossed model is sensitive to the similarity in radial dependence among all wafers. This analysis shows wafer order is significant. The first wafer each day (wafers 1, 4, 7, and 10) is systematically thinner (see Fig. 74).
6.3 Chi-Square Test for Variance Comparisons
Each time a set of n measurements is made on the same measurand, and the variance is taken, the variance will have a different value. The chi-squared statistic provides the distribution of variances obtained. The probability distribution is given by:
Eq. (29)    $p\left(\sigma^2\right) = \chi^2_{n-1}\left(n\,\frac{\sigma^2}{\sigma^2_{true}}\right)$
where p is the probability of obtaining the variance σ2 of the sample of n measurements and σ2true is the actual value of the variance that would be obtained for an extremely large value of n. Figure 75 depicts the chi-squared distribution for a sample size n = 21. The significance for metrology is the distribution is broad—so broad that a rather large number of readings is required to obtain a precise value of the variance. This applies to the empirical evaluation of such important metrological quantities as repeatability, reproducibility, and precision in a measurement system.
[Plot: Wafer Average Thickness on Each of 12 Sample Wafers. Resist thickness, A (14760 to 14900), wafer averages for wafer 1 through wafer 12.]
Figure 74. Plot of average thickness measured on each of 12 wafers.
[Plot: probability (0 to 0.07) versus chi-squared value (0 to 50), comparing the chi-squared distribution with 20 degrees of freedom to a normal distribution.]
Figure 75. Comparison of normal and chi-square distributions.
To approximate the uncertainty in the standard deviation, consider that for a large number of degrees of freedom, chi-squared approaches a normal distribution centered at the number of degrees of freedom and with a variance of twice that number. This is also illustrated in Fig. 75, where the normal distribution centered at twenty, with variance equal to 2 × 20 = 40, is shown for comparison to chi-squared. For twenty degrees of freedom, the uncertainty in the value of σ² obtained is $\pm(\sqrt{40}/20)\,\sigma^2 = \pm 0.316\,\sigma^2$. The uncertainty in the value of σ is approximately one half this value, or $\pm(\sqrt{40}/40)\,\sigma = \pm 0.16\,\sigma$. Thus, the ±1-sigma uncertainty in the value of a standard deviation obtained from 21 repeated measurements is approximately ±16%.
The “statistically correct” uncertainty in a variance or standard deviation must be carefully calculated if the qualification of a metrology tool depends upon a repeatability measurement. For example, suppose n measurements are made in a repeatability test and the sample variance is s². The specification states the measurement system must have a true variance less than some specified value, σ². A spreadsheet calculation of the inverse chi-square distribution can be used to determine the highest “passing” ratio of s²/σ². Usually we choose a ratio which, if exceeded, would indicate the equipment is failing the specification with 95% certainty. This corresponds to a significance level α = 0.05. The table in Fig. 76 shows the spreadsheet calculation of the pass-fail variance ratio for various values of n. The formulas for the spreadsheet are shown in Fig. 77.
[Spreadsheet, columns A to D, rows 1 to 5]
      A            B       C        D
1     dof = n-1    α       χ⁻¹      variance ratio
2     10           0.05    18.31    1.664
3     20           0.05    31.41    1.496
4     30           0.05    43.77    1.412
5     50           0.05    67.5     1.324
Figure 76. Spreadsheet calculation of the pass-fail variance ratio for various values of n.
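[Spreadsheet, columns A to D, rows 1 to 5]
      A            B       C                D
1     dof = n-1    α       χ⁻¹              variance ratio
2     10           0.05    CHIINV(B2,A2)    C2/(A2+1)
3     20           0.05    CHIINV(B3,A3)    C3/(A3+1)
4     30           0.05    CHIINV(B4,A4)    C4/(A4+1)
5     50           0.05    CHIINV(B5,A5)    C5/(A5+1)
Figure 77. Spreadsheet functions used to calculate the entries in the spreadsheet of Fig. 76. CHIINV takes the tail probability as its first argument and the degrees of freedom as its second.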
In the above tables, the degrees of freedom (dof) column contains the number of measurements minus one. CHIINV is a spreadsheet function that determines the value on the x-axis such that the tail of the chi-square distribution function to the right of this x-value contains 5% of the total area of the distribution. The value of s²/σ² is given in the last column. For example, if only eleven measurements are taken, the experimental variance must be 66% greater than spec in order to fail the qualification test. If fifty-one measurements are taken, the experimental variance must be only 32% greater than spec in order to fail the tool qualification.
In a similar manner, a measurement system is 95% certain of being within spec only if the measured variance is well below the specified value. The operative probability here is the significance level, β = 0.05. To be 95% certain of being within the specification, find the x-value such that 95% of the area under the chi-square distribution function lies to the right of this x-value. The ratio required between the experimental variance and the specified variance, to guarantee conformance with 95% certainty, is given in Fig. 78.
[Spreadsheet, columns A to D, rows 1 to 5]
      A            B       C        D
1     dof = n-1    α       χ⁻¹      variance ratio
2     10           0.95    3.94     0.358
3     20           0.95    10.85    0.517
4     30           0.95    18.49    0.597
5     50           0.95    34.76    0.682
Figure 78. Spreadsheet calculation of the ratio between experimental variance and required variance to achieve 95% assurance of conformity given a limited number, n, of repeated measurements.
The above table uses the same formulas as Fig. 77. Thus, after thirty-one measurements, the measured variance must be 40% better than the specification to ensure statistically, with 95% assurance, that the “true” variance lies within spec.
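The variance ratios of Figs. 76 and 78 can be approximated without a spreadsheet. The sketch below does not call CHIINV; it substitutes the Wilson-Hilferty cube-root approximation to the chi-square quantile, which needs only the normal quantile available in the Python standard library. The function names are introduced here for illustration. The approximation agrees with the tabulated ratios to about three decimal places for the degrees of freedom shown, but an exact inverse chi-square routine should be preferred when one is available.

```python
from statistics import NormalDist

def chi2_upper_tail_point(tail_prob, dof):
    """Approximate x such that P(chi-square with dof > x) = tail_prob.

    Wilson-Hilferty approximation, not an exact inverse chi-square.
    """
    z = NormalDist().inv_cdf(1.0 - tail_prob)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3

def variance_ratio_limit(tail_prob, n_measurements):
    """Limiting ratio s^2/sigma^2 for n measurements: quantile divided by n."""
    dof = n_measurements - 1
    return chi2_upper_tail_point(tail_prob, dof) / n_measurements

for tail_prob in (0.05, 0.95):
    for n in (11, 21, 31, 51):
        ratio = variance_ratio_limit(tail_prob, n)
        print(f"tail = {tail_prob:.2f}  n = {n:2d}  variance ratio = {ratio:.3f}")
```

With a tail probability of 0.05 the function gives the ratio above which the tool is judged to fail; with 0.95 it gives the ratio below which conformance is assured at the 95% level.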
REFERENCES
1. Reimer, L., “Electron Optics and Instrumentation,” Ch. 2, Image Formation in Low-Voltage Scanning Electron Microscopy, Vol. TT12, SPIE Optical Engineering Press, Bellingham
2. Goldstein, J. I., Newbury, D. E., Echlin, P., Joy, D. C., Romig, A. D., Lyman, C. E., Fiori, C., and Lifshin, E., Scanning Electron Microscopy and X-ray Microanalysis, 2nd Ed., Plenum Press (New York) Ch. 2, “Electron Optics”
3. Goldstein, J. I., Newbury, D. E., Echlin, P., Joy, D. C., Romig, A. D., Lyman, C. E., Fiori, C., and Lifshin, E., Scanning Electron Microscopy and X-ray Microanalysis, 2nd Ed., Plenum Press (New York) Ch. 2, “Electron Optics,” p. 34 for details about the different electron sources
4. Reilly, T. W., “Metrology Algorithms for Machine Matching in Different CD-SEM Configurations,” Integrated Circuit Metrology, Inspection, and Process Control VI, SPIE, 1673:48–56 (1992)
5. Rogers, S. R., “New CD-SEM Technology for 0.25 µm Production,” SPIE, 2439:353–362
6. Marchman, H., “Scanning Electron Microscope Matching and Calibration for Critical Dimension Metrology,” J. Vac. Sci. Technol. B, 15(6):2155–2161 (1997)
7. Allgair, J., Archie, C., Banke, W., Bogardus, H., Griffith, J., Marchman, H., Postek, M., Saraf, L., Schlesinger, J., Singh, B., Sullivan, N., Trimble, L., Vladar, A., and Yanof, A., “Towards a Unified CD-SEM Specification for Sub-0.18 µm Technology,” SPIE 3332:138–150
8. Reimer, L., “Image Formation in Low-Voltage Scanning Electron Microscopy,” Vol. TT12, SPIE Optical Engineering Press, Bellingham, p. 31 (1993)
9. Goldstein, J. I., Newbury, D. E., Echlin, P., Joy, D. C., Romig, A. D., Lyman, C. E., Fiori, C., and Lifshin, E., Electron Optics, Scanning Electron Microscopy and X-ray Microanalysis, 2nd Ed., 2:34, this type of magnetic lens has a small hole in the iron face of the pole piece, Plenum Press, New York
10. Reimer, L., “Image Formation in Low-Voltage Scanning Electron Microscopy,” Vol. TT12, SPIE Optical Engineering Press, Bellingham (1993)
11. Reimer, L., “Image Formation in Low-Voltage Scanning Electron Microscopy,” Vol. TT12, SPIE Optical Engineering Press, Bellingham (1993)
12. Rogers, S., “New CD-SEM Technology for 0.25 µm Production,” SPIE 2439:353–362 (1997)
13. Joy, D. C., “Contrast in High-Resolution Scanning Electron Microscope Images,” J. of Microscopy, 161(2):343–355, Illustration p. 344 (Feb. 1991)
14. The Monte Carlo simulations depicted here were performed using Metrologia™ software, a product of Spectel, Inc., Mountain View, CA
15. Joy, D. C., “Contrast in High-Resolution Scanning Electron Microscope Images,” J. of Microscopy, 161(2):343–355, Illustration p. 344 (Feb. 1991)
16. Joy, D. C., Database of Electron-Solid Interactions and Measurements of Electron Yields, SEMATECH Technology Transfer Document #96063130A-TR (1996). Figure 14 was composed from several graphs in this reference.
17. Monahan, K., Toro-Lira, G., and Davidson, M., “A New Low-Voltage SEM Technology for Imaging and Metrology of Submicrometer Contact Holes and other High-Aspect-Ratio Structures,” SPIE, 1926:336–346 (1993)
18. Litman, A., Pearl, A., and Rogers, S., “CD-SEM Metrology Using BSE Detection,” SPIE Vol. 2196, Integrated Circuit Metrology, Inspection, and Process Control VIII (1994)
19. Joy, D., “Control of Charging in a Low-Voltage SEM,” Scanning, 11:1–4 (1989)
20. Monahan, K., Benschop, J., and Harris, T., “Charging Effects in Low-Voltage SEM Metrology,” Metrology, Inspection, and Process Control V, SPIE, 1464:2–9 (1991)
21. Monahan, K., Benschop, J., and Harris, T., “Charging Effects in Low-Voltage SEM Metrology,” Metrology, Inspection, and Process Control V, SPIE, 1464:5 (1991)
22. Vladar, A., “Measurement of Contamination Rate and Stage Drift in Scanning Electron Microscopes,” Metrology, Inspection, and Process Control for Microlithography VII, SPIE, 3332:192–198 (1998)
23. Davidson, M., and Sullivan, N., “An Investigation of the Effects of Charging in SEM Based CD Metrology,” Metrology, Inspection, and Process Control for Microlithography VI, SPIE, 3050:226–242 (1997)
24. Mizuno, F., Yamada, S., “Effect of Electron Beam Parameters on Critical Dimension Measurements,” J. Vac. Sci. Technol. B, 13(6) (Nov./Dec. 1995)
25. Monahan, K., Khalessi, S., “Application of Statistical Models to Decomposition of Systematic and Random Error in Low-Voltage SEM Metrology,” SPIE Vol. 1673 Integrated Circuit Metrology, Inspection, and Process Control VI (1993) (M. Postek, ed.)
26. Hershey, R., private communication.
27. Keese, W., private communication.
28. Hershey, R., and Elliot, R., “Procedure for Evaluating Measurement System Performance: A Case Study,” Integrated Circuit Metrology, Inspection, and Process Control IX, SPIE, 2439:363–373 (1995)
29. Chain, E., Ridens, M., and Annand, J., “SPC Qualification Strategy for CD Metrology,” SPIE 2876:218–224 (1996)
30. Erickson, D., Sullivan, N., and Elliot, R., “Statistical Verification of Multiple CD SEM Matching,” Proc. SPIE, 3050:93–100 (1997)
31. Bowley, R. R., Beecher, J. E., Cogley, R. M., Dupuis, S. R., and Farrington, D. L., “Matching Analysis on Seven Manufacturing CD-SEMs,” Proc. SPIE, 3332:94–99 (1998)
32. Box and Hunter, Statistics for Experiments, pp. 62–68, 523–524 (table)
33. Chain, E., Kulkins, L., and Harris, T., “Submicron Calibration Strategy for CD Control,” SPIE, 2876:250–256
34. Hitachi, Ltd., Standard Micro Scale, Model HJ-1000
35. Nakayama, Y. and Toyoda, T., SPIE, 2196:78 (1994)
36. Ballard, D., “A Procedure for Calibrating the Magnification of Scanning Electron Microscope using NBS SRM-484, NBSIR 77-1248,” U.S. National Bureau Standards (1977)
Techniques and Tools For Photo Metrology 37.
38.
39.
40.
41.
42.
43.
44.
45.
46.
47. 48.
469
Newell, B., Postek, M., and van der Ziel, J., “Fabrication Issues for the Prototype National Institute of Standards and Technology SRM 2090A Scanning Electron Microscope Magnification Calibration Standard,” J. Vac. Sci. Technol. B, 13(6):2671–2675 (Nov./Dec. 1995) Cresswell, M., Sniegowski, J., Ghoshtagore, R., Allen, R., Guthrie, W., Gurnell, A., Linholm, L., Dixson, R., and Teague, E., “Recent Developements in Elecrical Linewidth and Overlay Metrology for Integrated Circuit Fabrication Processes,” Jpn. J. Appl. Phys., 35(1)12B:6597–6609 (Dec. 1996) Allen, R., Ghoshtagore, R., Cresswell, M., and Linholm, L., “Comparison of Properties of Electrical Test Structures Patterned in BESOI and SIMOX Films for CD Reference-Material Applications,” SPIE, 3332:124–131 (1998) Allgair, J., Sturtevant, J., Barrick, M., Fu, C., Green, K. C., Hershey, R., Litt, L., Maltabes, J., Nelson, C., and Roman, B., “Full-Field CD Controls for Sub-0.20 µm Patterning,” SPIE, Vol. 3051, Optical Microlithography (1997) Buehler, M., Grant, S., and Thurber, W., “Sheet and van der Pauw Sheet Resistors for Characterizing the Line Width of Conducting Layers,” J. Electrochem. Soc., 25(4):650–654 (April 1978) Buehler, M., and Hershey, C., “The Split-Cross-Bridge Resistor for Measuring the Sheet Resistance, Linewidth, and Line Spacing of Conducting Layers,” IEEE Trans. on Elect. Dev., ED-33(6) (Oct. 1986) Lindsay, T., and Orvek, K., “0.5 µm Contact Measurement and Characterization,” Integrated Circuit Metrology, Inspection, and Process Control V, SPIE, 1464:104–118 (1991) Chain, E., and Griswold, M., “In-line Electrical Probe for CD Metrology,” SPIE Microelectronic Manufacturing ‘96, Austin TX, pp. 16–18 (Oct. 1996) Jones, S., Van Asselt, R., Russ, J., Dudley, B., Johnson, G., Cohen, B., Schwartz, M., Besser, P., and Herman, P., “Comparison of SEM, Confocal Light Microscope and Electrical Resistance Measurements of Microelectronic Devices,” J. 
of Computer-Assisted Microscopy, 2(4):211–221 (1990) Nelson, C., Hector, S., Chu, W., Seese, P., Thompson, M., and Pol, V., “Electrical Linewidth Measurements and Simulations Studying the Effects of Dose and Gap on Exposure Latitude in X-ray Lithography,” Electron Beam, X-ray, DUV, and Ion Beam Submicrometer Lithographies for Manufacturing V, 2437:50–61 (May 1995) Lynch, W. T., “The Reduction of LSI Chip Costs by Optimizing the Alignment Yields,” IEDM Technical Digest, 7G-J (1997) Arnold, W., “Overlay simulator for Wafer Steppers,” Optical/Laser Microlithography, SPIE, 922:94–105 (1977)
11/30/00
JMR
470 49.
50.
51.
52.
53. 54.
55.
56. 57.
58. 59. 60. 61.
62. 63. 64.
11/30/00
JMR
Handbook of VLSI Microlithography Eakin, R., Bishop, W., Johnson, J., Liu, W., and Sardella, J., Stagaman, G., A Method of Determining Overlay Effects on Device Functionality, pp. 255–270 Troccolo, P., Smith, N., and Zantow, T., “Tool and Mark Design Factors That Influence Optical Overlay Measurement Errors,” Integrated Circuit Metrology, Inspection, and Process Control VI, SPIE, 1673:148–156 (1992); Graph from p. 153 Davidson, M., Kaufman, K., and Mazor, I., “First Results of a Product Utilizing Coherence Probe Imaging for Wafer Inspection,” Proc. SPIE, Vol. 921 (1988). This technique is employed in KLA-Tencor overlay tools. Merrill, M., Lee, S., Kim, Y., Jung, Y., and Lee, J., “Misregistration Metrology Tool Matching in A One Megabit Production Environment,” SPIE, 1673:205 Shlumberger (formerly IVS Corporation) Overlay Tools Merrill, M., Lee, S., Kim, Y., Jung, Y., and Lee, J., “Misregistration Metrology Tool Matching in A One Megabit Production Environment,” SPIE, 1673:203–212 Coleman, D., Larson, P., Lopata, A., Muth, W., and Starikov, A., “On The Accuracy of Overlay Measurements: Tool and Mark Asymetry Effects,” Integrated Circuit Metrology, Inspection, and Process Control IV, SPIE, Vol. 1261 (1990) Anderson, P. R., private communication. Anderson, P. R., and Monteverde, R. J., “Strategies for Characterising and Optimizing Overlay Metrology on Extremely Difficult Layers,” Integrated Circuit Metrology, Inspection, and Process Control VIII, SPIE Proceedings, Vol. 2196 (1994) Anderson, P. R., private communication. 
Tanaka, Y., Kamiya, M., and Suzuki, N., “New Methodology of Optimizing Optical Overlay Measurement,” SPIE, 1926:429–439 Yanof, A., Windsor, W., Elias, R., Helbert, J., and Harker, C., “Improving Metrology Signal-to-Noise on Grainy Overlay Features,” SPIE Perchard, J., Shaw, K., and Mueller, M., “Characterization of Metal Film Reflectivity for Implementation into Manufacturing,” SPIE, 1926:227–1235 (1993) Runyan, W., Semiconductor Measurements and Instrumentation, McGrawHill, p. 161 (1975) Spanier, R., “Double Film Thickness Measurements in the Semiconductor Industry,” Integrated Circuit Metrology, SPIE, 342:109–115 (1982) Hauge, P., and Dill, F., “A Rotating-Compensator Fourier Ellipsometer,” Optics Communications, 14(4):431–437
Techniques and Tools For Photo Metrology 65. 66. 67. 68. 69.
70.
71.
72.
471
Focus Ellipsometer: Operations and Applications Training Handbook, Rudolph Research, Flanders, NJ Optiprobe 3260 Film Thickness Measurement System Specifications, Therma-wave, Fremont, CA Born, M., and Wolf, E., Principles of Optics, 6th Ed., Pergamon Press, pp. 36–66 (1980) Born, M and Wolf, E., Principles of Optics, 6th Ed., Pergamon Press, pp. 44 (1980) If the film is partially absorbing, then n2 has an imaginary component. Snell’s law remains valid, so sin φ is complex. Cos φ is calculated as the complex square root: cosφ = 1 − sin 2 φ . Clark, W., Keefer, M., and Cook, D., “Film Thickness Measurements of Amorphous Silicon,” SPIE Microlithography Symposium, (1993) reflects intensity versus wavelength for 570 degree amorphous polysilicon (Fig. 5) Taylor, B. N., and Kuyatt, C. E., This and the following definitions appear in Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, NIST Technical Note 1297 (1994) Daou, A., private communication.
5
Techniques and Tools for Optical Lithography
Whit Waldo
Motorola, Inc.
Austin, Texas

1.0 INTRODUCTION
The image field of a step-and-repeat projection aligner (a.k.a. stepper) or of a stepping and scanning projection aligner (a.k.a. step-and-scan) generally is much smaller than the wafer onto which the pattern is transferred. These aligners expose a field on the wafer, then step to the next specified site on the wafer and repeat the operation. The schematic of Fig. 1 shows the most common stepper in use, in which the wafer image is demagnified from the reticle pattern. The light source shown is a mercury arc lamp, but a laser source could be substituted. Another common stepper has unit magnification and a catadioptric design (see Fig. 2), meaning that its construction includes both reflective mirrors and refractive lens elements. The design and construction of the equipment differ from vendor to vendor. However, the advantages to a volume production fab are usually a matter of degree, and the dollar cost of a wafer processed with acceptable quality can be difficult to calculate a priori. A more advanced technology scans the reticle pattern, usually through a small slit or arc, and transfers the demagnified image to a stepped wafer.[1] Figure 3 shows several different stepper configurations.
Figure 1. Schematic of a mercury arc illuminated reduction stepper. (Reprinted with permission of the American Chemical Society, L. F. Thompson and M. J. Bowden, Introduction to Microlithography, American Chemical Society, Advances in Chem. Ser., Vol. 219, 1983.)
Figure 2. Schematic of a unit magnification stepper. (Reprinted with permission of Ultratech Stepper.)
Figure 3. Schematic of different stepper configurations. (Reprinted with permission of Semiconductor International from J. H. Bruning, Semiconductor International, p. 137, Apr., 1981.)
A technical discussion of optical steppers deals with image quality and image placement. Image quality is concerned with the dimensional control of features of a desired size. Metal oxide semiconductor scaling rules[2] generally maintain that final patterned linewidths should be controlled to within 10% of the nominal feature size, and the accuracy of the placement of features should be within 20–25% of the minimum feature size. For a resist image coated to thickness z and of width x, the slope of the resist edge is ∂ z/∂ x. For exact differentials:
Eq. (1)
$$\frac{\partial z}{\partial x} = \frac{\partial z}{\partial E}\cdot\frac{\partial E}{\partial x}$$
The term ∂ z/∂ E reflects the processing of the resist after the latent image exists, while ∂ E/∂ x depends on the object and the imaging system. This chapter addresses issues of both terms as well as image placement.
2.0 FRAUNHOFER DIFFRACTION
Francesco Grimaldi first studied the deviation of light from rectilinear propagation, which he called diffraction. Diffraction is the redistribution of the intensity of light waves resulting from the presence of an object (for example, a mask or reticle feature) that causes variations in either the amplitude or the phase of the waves. For example, as light passes through a narrow slit, it spreads out more than can be accounted for by geometrical optics alone. The simple rules of geometrical optics, which treat light as traveling in rays, are valid only when path differences of the order of a wavelength can be neglected. In the actual wave process, light passes between two points not as a ray but as a beam of appreciable cross section. Geometrical optics assumes that a perfect lens will bring light from a luminous point to a point image, but in reality, diffraction by the aperture through which the light passes produces a distribution of light of a predictable size in the shape of a disk.
The two major classes are Fraunhofer and Fresnel diffraction. Fraunhofer, or far field, diffraction is relevant to collimated (parallel) light and occurs when both the incoming and outgoing waves approach being planar (to within a small fraction of a wavelength) over the extent of the diffracting features. Fresnel diffraction covers the more general cases in which light approaches and leaves an object in other than plane waves.
Fraunhofer effects are described by the amplitude and phase of the light diffracted in a particular direction from the aperture. The diffracted light traveling in a particular direction is brought to a point in the focal plane of a converging lens. Figure 4 illustrates this for a point source. A lens is considered diffraction limited when its residual aberrations and manufacturing errors are negligible compared with the diffraction effects. These aberrations and manufacturing errors will be discussed in more detail below.
Figure 4. System for Fraunhofer diffraction of a point source.
2.1 Diffraction Through a Rectangular Aperture
Fraunhofer diffraction by a rectangular aperture is instructive[3] for understanding stepper performance. These rectangular apertures are openings in a mask or reticle, corresponding to a particular device layer. For collimated light having a wavelength of λ, propagated in the +Z direction through a rectangular aperture in the Z = 0 plane falling upon an image plane at Z = constant, the amplitude of the electromagnetic wave is: Eq. (2)
$$A(a,b)_{\text{rectangle}} = A_0\,\frac{\sin(aA/2\lambda)}{aA/2\lambda}$$
where the rectangular aperture extends from –A/2 to +A/2 in the X direction and from –B/2 to +B/2 in the Y direction, A0 is the amplitude in the +Z direction, and a and b are, respectively, the sines of the projections on the XZ and YZ planes of the angle between the Z direction and the direction of the light emerging from the aperture. The “intensity” is proportional to the scalar amplitude squared. This “intensity” differs from illuminance by a constant coefficient and sometimes is substituted for illuminance when the coefficient is unimportant. This gives:
Eq. (3)
$$I(a,b)_{\text{rectangle}} = I_0\left[\frac{\sin(aA/2\lambda)}{aA/2\lambda}\right]^2$$
The function has zeros at all integral multiples of π except 0. For small angles, the minima are equidistant. The maxima locations are found by differentiating Eq. (3) with respect to aA/2λ and setting the result equal to zero: Eq. (4)
$$\frac{d}{d\left(\frac{aA}{2\lambda}\right)}\left\{I_0\left[\frac{\sin\frac{aA}{2\lambda}}{\frac{aA}{2\lambda}}\right]^2\right\} = 2I_0\,\frac{\sin\frac{aA}{2\lambda}}{\frac{aA}{2\lambda}}\cdot\frac{\frac{aA}{2\lambda}\cos\frac{aA}{2\lambda}-\sin\frac{aA}{2\lambda}}{\left(\frac{aA}{2\lambda}\right)^2} = 0$$
Maxima occur whenever: Eq. (5)
tan(aA/2λ) = aA/2λ
or, at 0, 1.43π, 2.46π, 3.47π, …, which are slightly less than (K + 1/2)π for K = 1, 2, 3, …, although approaching it for the further locations. These maxima are easiest to find by graphical construction, from the intersections of the curves y = tan(aA/2λ) and y = aA/2λ.
2.2 Diffraction Through a Circular Aperture
Fraunhofer diffraction by a circular aperture describes stepper performance for a mask or reticle at the contact or via device layer. The rectangular aperture is replaced conceptually by an approximation for a circular aperture; i.e., the aperture is divided into a series of narrow rectangular strips of equal width to calculate the effect at off-axis points. Since these strips are not of equal length, their amplitudes are unequal too, requiring the use of Bessel functions for their addition. The intensity for a circular aperture of radius r, centered at the origin, in a plane normal to the z axis, is: Eq. (6)
$$I(r)_{\text{circle}} = I_0\left[\frac{2J_1(\rho r/\lambda)}{\rho r/\lambda}\right]^2$$
where J1 is the first-order Bessel function and ρ is sin(θ), the sine of the angle of deviation from the optical axis to the edge of the objective. This is illustrated in Fig. 5.
Figure 5. Amplitude (solid line) and intensity distribution (dashed line) in Fraunhofer diffraction by a circular aperture.
2.3 Airy Disk
The image of an ideal point source through a circular aperture is illustrated schematically. This function has zeros in a single set of concentric rings with differences in the radial parameter (ρ) slightly greater than l/ 2ρ. The bright central maximum first was noted by British mathematician and astronomer Sir George Biddell Airy, and is known as Airy’s disk (see Fig. 6). The bright central disk is surrounded by a number of fainter rings. Neither the disk nor the rings have intensities that are defined sharply but instead are shaded at the edges. The rings are separated by circles of zero intensity. About 85% of the energy entering the optical system is concentrated in the Airy disk, while the other 15% is spread through the rings.[4] Strehl, a contemporary of John Rayleigh, noticed that small aberrations in the lens decrease the proportion of the energy in the central disk while that in the rings increases.[5] The Strehl intensity ratio measures the relative intensity at the principal maximum of the diffraction pattern with and without aberrations (see Fig. 7).
Figure 6. Image of an ideal point source. (Reprinted with permission of Kodak Microelectronics Seminar from K. A. Snow, Kodak Microelectronics Seminar, p. 83, 1975.)
Figure 7. Intensity distribution and its Airy disk.
For diffraction through a circular aperture, the distance from the bright central spot to the first zero is given by 1.22λf/d, where f is the focal length and d is the diameter of the objective lens. Notice the maxima are not located symmetrically about the minima. The energy distribution forming the image of an aperture literally in the air is called the aerial image. Lithographic imaging near the resolution limit produces aerial images with a central peak that is ringed with less intense maxima. If the aperture is much greater than the wavelength, a well focused aerial image will not exhibit a single peak intensity but will have a top with some modulation.
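The coefficient 1.22 in the expression 1.22λf/d, and the share of energy falling inside the Airy disk, can both be checked numerically. The sketch below is illustrative only: it evaluates the Bessel functions from their integral representation, locates the first zero of J1, and uses the standard encircled-energy result 1 − J0² − J1², which is not derived in this chapter. The function names and the lens values in the last lines are hypothetical.

```python
import math

def bessel_j(n, x, steps=2000):
    """Bessel function J_n(x) from the integral representation
    J_n(x) = (1/pi) * integral over [0, pi] of cos(n*theta - x*sin(theta)),
    evaluated with the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += math.cos(n * theta - x * math.sin(theta))
    return total * h / math.pi

def first_zero_of_j1(lo=3.0, hi=4.5, tol=1e-10):
    """Bisection for the first positive zero of J_1, bracketed by lo and hi."""
    f_lo = bessel_j(1, lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j(1, mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = first_zero_of_j1()

# The first dark ring lies at x0 = pi * d * radius / (lambda * f),
# so radius = (x0 / pi) * lambda * f / d.
airy_coefficient = x0 / math.pi

# Fraction of the total energy inside the first dark ring
# (standard result for a circular aperture, assumed here).
energy_in_disk = 1.0 - bessel_j(0, x0) ** 2 - bessel_j(1, x0) ** 2

def airy_radius(wavelength, focal_length, diameter):
    """Distance from the central peak to the first zero, in the units of wavelength."""
    return airy_coefficient * wavelength * focal_length / diameter

print(f"coefficient = {airy_coefficient:.4f}")
print(f"energy inside Airy disk = {energy_in_disk:.3f}")
# Hypothetical lens: 0.365 um wavelength, f = 100 mm, d = 50 mm (f and d in the same units).
print(f"Airy radius = {airy_radius(0.365, 100.0, 50.0):.3f} um")
```

The computed energy fraction comes out slightly below the "about 85%" quoted in Sec. 2.3, which is consistent with that figure being a rounded value.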
3.0 THEORETICAL RESOLUTION LIMIT
Changing the value of the focal length changes the magnification of the image but does not improve the resolution. The relative size of the diffraction patterns and their separations is unchanged. This is illustrated in Fig. 8. Two images are said to be resolved when the intensity in the shaded or dark region between their images falls to some specified value below the intensity at the principal maxima. Rayleigh suggested that the criterion for angular resolution be defined as the angle between two point sources when the principal maximum of the diffraction pattern due to one point source falls on the first minimum, or dark ring, of the other. This is shown in Fig. 9. This means: Eq. (7)
angular resolving power = θmin = 1.22λ/d
When the separation of two images, each consisting of a diffraction pattern, is large compared with the diameter of their Airy disks, each of the individual bright central intensity curves is well defined and separated from the other. As the objects come closer together, the two intensity curves overlap to such an extent that the Airy disks appear to be a single image upon observation and cannot be resolved separately. In Fig. 9, the effective intensity of the two Airy images near their peaks is shown by the dotted line. The minimum of the intensity has a normalized value of 0.735 relative to the peaks' intensity values. If the light is incoherent, there are no interference effects between the two images and the intensities add. For coherent illumination, the electric fields due to the waves diffracted from the neighboring apertures must be summed and then squared to yield the intensity.
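The 0.735 saddle value can be reproduced by adding two Airy intensity patterns separated by the Rayleigh distance. The following is a minimal sketch for the incoherent case only, in which intensities add; the function names are hypothetical, and the Bessel function is evaluated from its integral representation rather than taken from a library.

```python
import math

def bessel_j1(x, steps=2000):
    """J_1(x) from (1/pi) * integral over [0, pi] of cos(theta - x*sin(theta)),
    evaluated with the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += math.cos(theta - x * math.sin(theta))
    return total * h / math.pi

def airy_intensity(x):
    """Normalized Airy intensity [2*J1(x)/x]^2, equal to 1 at x = 0."""
    if abs(x) < 1e-9:
        return 1.0
    return (2.0 * bessel_j1(x) / x) ** 2

# First positive zero of J_1 (about 1.22*pi): the Rayleigh separation
# expressed in the dimensionless argument of the Airy pattern.
X_FIRST_ZERO = 3.8317059702

def incoherent_sum(x, separation=X_FIRST_ZERO):
    """Two mutually incoherent point images: the intensities simply add."""
    return airy_intensity(x) + airy_intensity(x - separation)

peak = incoherent_sum(0.0)                    # at one geometric image point
saddle = incoherent_sum(X_FIRST_ZERO / 2.0)   # midway between the two images
dip_ratio = saddle / peak

print(f"saddle / peak = {dip_ratio:.3f}")
```

For coherent illumination the amplitudes 2J1(x)/x would have to be summed before squaring, so this sketch does not apply to that case.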
Figure 8. Magnification and resolution. The lower lens is twice the focal length of the upper (f2 = 2f1 ), so the images formed at A2 and B2 are twice as far apart as A1 and B1 . The diffraction patterns caused by the equal apertures D simply scale up in linear size so there is no gain in resolution. (Reprinted with permission of Ref. 8.)
Figure 9. Rayleigh’s resolution criterion.
Rayleigh defined the resolution criterion of two images by working backwards so the separation distance between two objects can be found. Let R be the separation distance between two object points O and O´. According to geometrical optics, there should be two point images for the two point objects. However, due to diffraction, the respective images consist of an Airy disk with the angular separation angle defined by Eq. (7), where the principal maximum of one image falls on the first minimum of the other, satisfying Rayleigh’s criterion. The wave from O´ diffracted to I has zero intensity (which is the first dark ring of its disk) and the extreme rays O´BI and OAI differ in path length by 1.22λ . From Fig. 10, O´B is longer than OB or OA by R·sin(i), and O´A is shorter by R·sin(i). This means the path difference of the extreme rays from O´ is 2R·sin(i), and equating this to 1.22λ: Eq. (8)
R = 1.22λ/[2 sin(i)].
Figure 10. Resolving power of a lens. (Reprinted with permission of Ref. 46.)
Since the index of refraction (n) between the object and the objective may not be unity (i.e., of a vacuum): Eq. (9)
R = 1.22 λ/[2n·sin(i)]
which simplifies to: Eq. (10)
R = 0.61λ/[n·sin(i)]
The German physicist, Ernst Abbe, proposed that n·sin(i) be known as the numerical aperture (NA) of the objective. Light diffracted from the mask or reticle is collected by the objective lens for imaging if the beams are within the acceptance angle of the objective. In practice, the largest value of the numerical aperture obtainable is about 1.6, restricted by the limited availability of immersion fluids with an index of refraction greater than 1.5.[6] So, the theoretical resolution limit between two object points is: Eq. (11)
R = 0.61λ/NA
Equation (11) assumes that the light scattered by the two object points is independent in phase. Abbe knew this assumption was inappropriate for two points illuminated by light from a condenser (i.e., they were not self-luminous and, therefore, not incoherent), so the resolution limit was given by: Eq. (12)
R = 0.50λ/NA
Actual descriptions of microlithographic resolution treat the coefficients in Eqs. (11) and (12) as a variable (k1), dependent upon the object feature size and shape, the chemical process used, and the condition of the image plane (e.g., substrate reflectivity, topographical flatness and planarity, defocus of image plane, etc.). A k1 value equal to 0.61 typically is considered the resolution limit under pilot line or advanced manufacturing conditions, although resolution with smaller k1 values has been demonstrated. Eq. (13)
R = k1λ /NA
The typical microlithographic resolution limit in a production environment assumes a k1 value of 0.81. This is shown in Fig. 11, where two points are considered resolved when the maximum of the first diffraction pattern is superimposed on the first secondary maximum of the second diffraction pattern.
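As a worked illustration of Eq. (13), the short sketch below tabulates R for the three coefficients quoted above. The wavelength (365 nm, the mercury i-line) and the numerical aperture (0.50) are assumed example values, not figures taken from this chapter, and the function name is hypothetical.

```python
def resolution(k1, wavelength_nm, numerical_aperture):
    """Eq. (13): R = k1 * lambda / NA, returned in the units of wavelength_nm."""
    return k1 * wavelength_nm / numerical_aperture

WAVELENGTH_NM = 365.0   # assumed example: mercury i-line
NA = 0.50               # assumed example numerical aperture

cases = [
    ("Abbe limit, Eq. (12)", 0.50),
    ("Rayleigh limit, Eq. (11)", 0.61),
    ("volume production", 0.81),
]
for label, k1 in cases:
    r_nm = resolution(k1, WAVELENGTH_NM, NA)
    print(f"{label:26s} k1 = {k1:.2f}  R = {r_nm:.1f} nm")
```

Because R scales linearly with k1, moving from the production value to the Rayleigh value at fixed λ and NA shrinks the resolvable feature by the ratio 0.61/0.81.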
Figure 11. Volume production resolution criterion.
4.0 DIFFRACTION GRATINGS
Diffraction gratings are useful for further understanding projection imaging systems like steppers since they can be thought of as an array of line/space pairs. A diffraction grating has a number of parallel equidistant fine slits located in the same plane through which light passes. A transmissive diffraction grating is shown in Fig. 12 where the zero, first, and second diffracted orders are illustrated. The angle of departure of the orders from the grating depends upon the spatial frequency of the grating. A coarse grating having few slits per unit width will have many orders collected by the objective; a fine grating having many slits per unit width will have fewer orders collected. For high frequency gratings, the principal intensity maxima become higher and more narrow, since the intensity in the principal maxima is proportional to the square of the number of slits due to constructive interference of the light from the two slits (see Fig. 13). However, the secondary maxima between principals are suppressed, since information about these orders is missing due to destructive interference. Destructive interference occurs when certain order spectra have positions corresponding to the minima of the diffraction pattern for a single slit. Simple harmonic motion can be represented by either a sine function or a cosine function: Eq. (14)
y = A sin(ωt + φ) or y = A cos(ωt + φ)
where A is the amplitude of the wave, ω is the angular velocity, t is the time, and φ is the phase (where ω t = φ). These functions can be combined into complex numbers. The two parts can be drawn along orthogonal axes with a real axis, x and an imaginary axis, y. The complex point P has rectangular coordinates (x, y). A can be represented in polar coordinates (t, φ), too. Eq. (15)
sin(φ) = y/A and cos(φ) = x/A
Since sin(φ) = cos (90° – φ), the sine and cosine functions are essentially the same except for a 90° phase difference.
Figure 12. Transmissive diffraction gratings.
Figure 13. Intensity curves of Fraunhofer diffraction from one, two, and four narrow slits. The heights of the curves for two and four slits are on a much smaller scale than those of the single slit. (Reprinted with permission of Ref. 67.)
If A is the resultant of two orthogonal vectors a and b, where i stands for a single counterclockwise 90° rotation, then A can be represented in polar coordinates (t, φ): Eq. (16)
A = a + ib = t cos(φ) + it sin(φ)
Euler’s formula says: Eq. (17)
$$\cos(\phi) + i\,\sin(\phi) = e^{i\phi}$$
Combining Eqs. (16) and (17) gives the complex transmission function of a diffraction grating: Eq. (18)
$$A = t\,e^{i\phi}$$
where t is the amplitude transmittance of the grating and φ is the phase shift of the grating. According to Fourier optics theory,[7] the Fraunhofer diffraction pattern of a diffraction grating is proportional to the Fourier transform of the transmission function. For a unity-amplitude, narrow-bandwidth collimated light source incident on a mask grating of a 1:1 object:image projection system, the light is split into many beams of amplitudes Mn/2, where Mn is the nth Fourier coefficient of the mask diffraction grating. All beams of diffraction order m
5.0 FOURIER SYNTHESIS
The French physicist and mathematician, Baron Jean Baptiste Joseph Fourier, developed a theory stating that any periodic function can be represented as the sum of a number of sine and cosine functions with appropriate amplitudes and frequencies: Eq. (19)
y = A0 + A1 sin ωt + A2 sin 2ωt + A3 sin 3ωt + … + B1 cos ωt + B2 cos 2ωt + B3 cos 3ωt + …
This is known as a Fourier series, where y is the displacement of the resultant wave at a time t, A0 is a constant (required for functions other than sine or cosine waves that have a nonzero average value), the A and B coefficients are the amplitudes of the component waves, and ω is the angular velocity. The resultant wave is represented in sound as the collection of the fundamental note and its various harmonics (integral multiples of the frequency of the original periodic function).
Figure 14. Mask grating with diffracted light projected through the lens and the converging grating orders.
In general, the series:
Eq. (20)
$$\frac{A_0}{2} + \sum_{n=1}^{\infty}\left(A_n\cos nx + B_n\sin nx\right)$$
represents the Fourier series of the function f(x), where:
Eq. (21)
$$A_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx, \qquad n = 1, 2, \ldots$$
Eq. (22)
$$B_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx, \qquad n = 1, 2, \ldots$$
General problems of a periodic nature can be solved with these series, and problems involving nonperiodic waves can be solved by substituting the more complex Fourier integrals. By including a very large number of terms, Fourier's theorem makes it possible to synthesize any waveform, including a square wave.[8] A mask of bar targets could be regarded as such a square wave spatial profile. The Fourier synthesis for this object would use only the odd harmonics. To generate a sawtooth pattern, as illustrated in Fig. 15, the Fourier synthesis would use the fundamental and all of its harmonics. A very good synthesis of a sawtooth object using only a few terms is illustrated. A better approximation would be obtained by adding more terms. The period of the wave is p, so its spatial frequency (ν) is 1/p. For example, the period, or pitch, consisting of a 1.0 µm line and a 1.0 µm space is 2.0 µm. The corresponding frequency, typically expressed in units of cycles per mm, is 500. The appropriate harmonic frequencies are 2/p, 3/p, 4/p, etc.
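The synthesis of a square wave from odd harmonics, and the pitch-to-frequency conversion in the example above, can be sketched as follows. The series used, (4/π) Σ sin(2πkx/p)/k over odd k, is the standard expansion of a unit square wave; the function names and the choice of a ±1 amplitude are assumptions made for illustration.

```python
import math

def square_wave_partial_sum(x, pitch, n_terms):
    """Sum of the first n_terms odd harmonics of a +/-1 square wave of period `pitch`:
    (4/pi) * sum over odd k of sin(2*pi*k*x/pitch) / k."""
    total = 0.0
    for j in range(n_terms):
        k = 2 * j + 1                      # odd harmonics only: 1, 3, 5, ...
        total += math.sin(2.0 * math.pi * k * x / pitch) / k
    return 4.0 / math.pi * total

def spatial_frequency_cycles_per_mm(pitch_um):
    """Spatial frequency 1/p, converted from cycles per micrometer to cycles per mm."""
    return 1000.0 / pitch_um

PITCH_UM = 2.0                 # 1.0 um line plus 1.0 um space, as in the text
x_mid_line = PITCH_UM / 4.0    # center of the bright half-period

for n_terms in (1, 2, 4, 200):
    value = square_wave_partial_sum(x_mid_line, PITCH_UM, n_terms)
    print(f"{n_terms:4d} odd harmonics -> value at line center = {value:.4f}")

print(f"fundamental frequency = {spatial_frequency_cycles_per_mm(PITCH_UM):.0f} cycles/mm")
```

With one harmonic the profile is a pure sine wave that overshoots the flat top; adding harmonics pulls the value at the line center toward 1 and steepens the edges.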
Figure 15. An example of Fourier transformation. The spectrum of the saw tooth periodic function is shown in (a). The individual sine waves are shown in (c), (d), (e), and (f). The sum of these four sine waves is (b). The addition of more sine waves of suitable harmonic frequencies would improve the approximation to the saw tooth. (Reprinted with permission of Ref. 8.)
6.0 ABBE’S THEORY OF IMAGE FORMATION
Abbe’s theory of image formation coupled Fraunhofer’s diffraction theory and the ideas of spatial Fourier synthesis to state that any object can be regarded as the collection of superimposed sine or cosine profile
diffraction gratings. Light from a point source is collimated by passing through a condenser lens. Some of the incident light on the object is passed through and forms the zero-order maximum, containing most of the luminous energy. Other incident light on the object is diffracted, forming the higher orders. The wider angles of diffraction are associated with the finer details, and it is the fine detail which is lost because these are not collected by the limited numerical aperture of the objective lens. The number of diffracted orders collected by the lens limits the information about the object which is transferred to the image. Lower frequency gratings have orders that are diffracted by small angles so many orders can be collected by the objective lens. This will result in faithful recreation in the image of the object. By contrast, higher frequency gratings diffract orders by larger angles, resulting in relatively fewer collected orders for a given numerical aperture objective. This degrades the image synthesis. Light is diffracted at the object plane and is collected by the objective lens that acts as a transform lens. The mask transmissivity function or electric field distribution at the object is transformed by the lens into far field diffraction patterns. The Fraunhofer diffraction pattern of the object is formed at the back focal plane of the lens, which is now known as the transform plane. The phase relationships of the diffracted waves are unchanged since the individual optical path lengths are the same to the transform plane from an object at the front focal plane. The waves continue to travel to the image plane where they overlap and interfere to form the image. The objective is considered a transform lens since it can be considered that the light diffracted by the object is diffracted again by the lens; without the lens, the image plane would display a diffraction pattern of the object instead of the image of the object. 
Also, at the focal plane of the lens, the Fourier transform of the source is formed. Therefore, the image or wafer plane is conjugate to the object or mask/reticle plane and the Fourier transform or back focal plane is conjugate to the source plane.
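The dependence of the number of collected orders on grating frequency can be made concrete with the grating equation, sin θm = mλ/p, a standard relation that is not written out in the text. The sketch below assumes normally incident, collimated illumination and compares sin θm with the object-side numerical aperture; the function names and the numerical values are hypothetical.

```python
import math

def highest_collected_order(pitch_um, wavelength_um, numerical_aperture):
    """Largest order m with sin(theta_m) = m*lambda/p <= NA (normal incidence assumed)."""
    # The small epsilon guards against floating-point round-off at exact boundaries.
    return math.floor(numerical_aperture * pitch_um / wavelength_um + 1e-12)

def collected_orders(pitch_um, wavelength_um, numerical_aperture):
    """All diffraction orders, negative through positive, accepted by the objective."""
    m_max = highest_collected_order(pitch_um, wavelength_um, numerical_aperture)
    return list(range(-m_max, m_max + 1))

WAVELENGTH_UM = 0.365   # assumed example wavelength
NA = 0.50               # assumed object-side numerical aperture

for pitch_um in (4.0, 2.0, 1.0, 0.6):
    orders = collected_orders(pitch_um, WAVELENGTH_UM, NA)
    print(f"pitch {pitch_um:.1f} um -> orders collected: {orders}")
```

Under these assumptions, when only the zero order is collected the image plane receives uniform illumination and the grating is not imaged at all, which is the limiting case of the loss of fine detail described above.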
7.0 INTRODUCTION TO TRANSFER FUNCTIONS
Transfer functions are a measure of the quality of information transfer. This type of information exchange occurs in photolithography by replicating the pattern on a mask in photoresist on a wafer. Generally, there is some loss of information with each transfer operation.
7.1 Spread Functions
Ideally, the image would be replicated as a square wave image for a square wave object. Due to Fraunhofer diffraction, we know that the image will be diffraction limited, at best. The intensity distribution in the image of an incoherently illuminated pinhole is known as the point spread function, S(y, z), of the lens. The point spread function for optical lithography is not Gaussian,[9] but is similar to that shape, as seen in Fig. 16. If the pinhole is replaced by a narrow slit, the distribution of light in the image is known as the line spread function, S(z). The line spread function is the point spread function integrated over the length of the slit, y:
Eq. (23)
$$S(z) = \int_{-\infty}^{+\infty} S(y, z)\,dy$$
If there are aberrations in the lens, the image becomes very complex.[10] Figure 17 shows the images of a narrow slit, including the asymmetrical spread function from a lens with coma. The images are of reduced amplitude relative to the objects. If the lens assembly has an asymmetrical aberration or decentered elements, the image will have lower contrast and exhibit a phase change. This is illustrated in Fig. 18.
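The relation between the point and line spread functions in Eq. (23) can be illustrated numerically. The sketch below uses a Gaussian point spread function purely as a stand-in, since its integral is known in closed form; as noted above, the actual point spread function in optical lithography is only similar to a Gaussian. The width SIGMA_UM and all function names are hypothetical.

```python
import math

SIGMA_UM = 0.25   # hypothetical spread-function width, in micrometers

def point_spread(y, z, sigma=SIGMA_UM):
    """Stand-in point spread function S(y, z): a normalized 2-D Gaussian (an assumption)."""
    return math.exp(-(y * y + z * z) / (2.0 * sigma * sigma)) / (2.0 * math.pi * sigma * sigma)

def line_spread(z, sigma=SIGMA_UM, half_range=8.0, steps=4000):
    """Eq. (23): integrate S(y, z) over y with the midpoint rule.
    The infinite limits are truncated at +/- half_range * sigma."""
    y_max = half_range * sigma
    h = 2.0 * y_max / steps
    total = 0.0
    for k in range(steps):
        y = -y_max + (k + 0.5) * h
        total += point_spread(y, z, sigma)
    return total * h

def line_spread_exact(z, sigma=SIGMA_UM):
    """Closed-form result for the Gaussian stand-in: a normalized 1-D Gaussian."""
    return math.exp(-z * z / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

for z in (0.0, 0.1, 0.25, 0.5):
    print(f"z = {z:.2f} um  numeric = {line_spread(z):.6f}  exact = {line_spread_exact(z):.6f}")
```

The same `line_spread` routine would accept any other sampled or analytic point spread function in place of the Gaussian, including an asymmetrical one such as that produced by coma.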
Figure 16. The image of an extended object represented as the superposition of spread functions. (Reprinted with permission of Ref. 10.)
Figure 17. Image of a narrow slit: (a) object, (b) ideal image, (c) diffraction limited image, (d) asymmetrical spread function. (Reprinted with permission of Ref. 10.)
Figure 18. Intensity distribution in the image of a sine wave test object. (Reprinted with permission of Ref. 10.)
7.2 Modulation
The modulation or contrast is defined by:[5]
Eq. (24)
$$M = (I_{max} - I_{min})/(I_{max} + I_{min})$$
where the intensity values (I) are obtained from Fig. 19. Modulation can also be defined by substituting either exposure or percent transmission for intensity in Eq. (24). Modulation values are associated with both the object and the image, and the modulation transfer factor (TF) relates the two by:
Eq. (25)
$$TF = M_{image}/M_{object}$$
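A minimal sketch of Eqs. (24) and (25) follows. The intensity values are illustrative numbers chosen for this example, not data from the text, and the function names are assumptions.

```python
def modulation(i_max, i_min):
    """Eq. (24): modulation (contrast) from maximum and minimum intensity."""
    return (i_max - i_min) / (i_max + i_min)

def transfer_factor(m_image, m_object):
    """Eq. (25): ratio of image modulation to object modulation."""
    return m_image / m_object

# Illustrative values only: a fully opaque/clear mask and a degraded aerial image.
m_object = modulation(1.0, 0.0)
m_image = modulation(0.8, 0.2)
tf = transfer_factor(m_image, m_object)
```

The same functions apply if exposure or percent transmission is substituted for intensity, as the text allows.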
7.3 Modulation, Phase, and Optical Transfer Functions
An exposure system can be characterized well by plotting the transfer factor as a function of the object spatial frequency, ν. The contrast in the projected image is proportional to the MTF value for a given frequency. A lens cannot be described by a single value of TF, but only by the modulation transfer function, MTF, which gives the transfer factor at every spatial frequency:
Eq. (26)
$$MTF = TF(\nu)$$
The MTF is a relative measure with a maximum value of 1 when there is no loss of contrast, and a minimum value of 0 when there is no contrast.
Figure 19. Modulation relationship to sinusoidal intensity distribution.
The optical transfer function (OTF) of a lens is the Fourier transform of the spread function of that lens describing the amplitude and phase change from the object to the image:[5][11]
Eq. (27)
$$OTF = \int_{-\infty}^{+\infty} S(z)\, e^{2\pi i \nu z}\, dz$$
The phase change mentioned earlier caused by an asymmetrical aberration or decentered lens elements is dependent on the spatial frequency
and contributes to the phase transfer function, φTF. The optical transfer function relates the amplitude change described by the MTF and the phase change described by the φTF:
Eq. (28)
$$OTF = MTF \cdot e^{i\phi(\nu)}$$
where the φTF is described by the exponential term with phase transfer factor φ = f(ν), and the MTF is the modulus of the OTF. This transform can be expressed as a complex function consisting of a real plus an imaginary part:[11]
Eq. (29)
$$OTF = R(\nu) + iI(\nu)$$
Then the MTF can be expressed in the form:
Eq. (30)
$$MTF = \left[R(\nu)^2 + I(\nu)^2\right]^{1/2}$$
and:
Eq. (31)
$$\phi(\nu) = \tan^{-1}[I(\nu)/R(\nu)]$$
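The relationship between the spread function, the OTF, the MTF, and the phase can be sketched numerically. The example below evaluates the Fourier integral of Eq. (27) with the trapezoidal rule and then takes the modulus and argument of the complex result. The Gaussian line spread functions, the integration limits, and the normalization to unity at zero frequency are assumptions made for illustration; a decentered (shifted) spread function is used to show a frequency-dependent phase.

```python
import cmath
import math

def otf(lsf, nu, z_min=-10.0, z_max=10.0, steps=4000):
    """Eq. (27) by trapezoidal rule, normalized (an assumption) so that OTF(0) = 1."""
    dz = (z_max - z_min) / steps
    numerator = 0j
    area = 0.0
    for i in range(steps + 1):
        z = z_min + i * dz
        weight = 0.5 if i in (0, steps) else 1.0
        s = lsf(z)
        numerator += weight * s * cmath.exp(2j * math.pi * nu * z)
        area += weight * s
    return numerator / area

def mtf_and_phase(otf_value):
    """Modulus and argument of the complex OTF: the MTF and the phase."""
    return abs(otf_value), math.atan2(otf_value.imag, otf_value.real)

def symmetric_lsf(z):
    """Hypothetical symmetric spread function (unit-width Gaussian)."""
    return math.exp(-z * z / 2.0)

def shifted_lsf(z):
    """Same shape displaced by 0.5, mimicking a decentered image."""
    return symmetric_lsf(z - 0.5)

mtf_sym, phase_sym = mtf_and_phase(otf(symmetric_lsf, 0.1))
mtf_shift, phase_shift = mtf_and_phase(otf(shifted_lsf, 0.1))
```

In this sketch the displacement leaves the MTF unchanged and adds a phase of 2πν times the shift, which is why the MTF and φTF graphs are both needed to describe the lens.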
It is important to recognize that even though the MTF and φTF are usually presented as separate graphs, they should not be considered independent of each other. Both are required to describe the OTF. The optical transfer function also can be defined as the ratio of the Fourier transform of the light distribution in the image to the Fourier transform of the light distribution in the object:[10]
Eq. (32)
$$OTF = \frac{\text{Fourier transform of image intensity distribution}}{\text{Fourier transform of object intensity distribution}}$$
7.4 Cascading Linear Functions
Multiple information transfer operations occur in photolithography. The information collected by the lens from diffracted orders of light limits the number of terms in the Fourier series, and so limits the image quality. A second information transfer operation takes place in the resist system as
the latent image is formed. A third information transfer step occurs at the resist development operation. Each of these informational transfer operations might, therefore, involve some loss of information so that the developed photoresist image might not be recreated like the mask as a square wave profile, but rather suffer some loss of contrast. One of the greatest advantages of using transfer functions is the cascading property that permits the lens transfer function to be combined with that of the photographic film:[10]
Eq. (33)
$$TF_{total}(\nu) = TF_{lens}(\nu) \cdot TF_{resist}(\nu)$$
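The cascading property of Eq. (33) amounts to multiplying transfer factors at the same spatial frequency. The sketch below assumes hypothetical values for the lens system and resist at a single frequency; they are not measurements from the text.

```python
def cascade(*transfer_factors):
    """Eq. (33): multiply transfer factors evaluated at one spatial frequency."""
    total = 1.0
    for tf in transfer_factors:
        total *= tf
    return total

# Hypothetical transfer factors at one spatial frequency.
tf_lens_system = 0.65   # whole lens assembly, not an individual element
tf_resist = 0.85
tf_total = cascade(tf_lens_system, tf_resist)
```

The product is only meaningful for stages that behave linearly and contribute negligible phase shift, so it should be applied to the complete lens system rather than to its individual elements.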
The transfer function for the lens must be for the lens system. The MTF contributions of individual lens elements do not cascade to form the stepper lens system MTF in a form as described by Eq. (33). The overall correction of the lens system depends on the aberration introduced by one element to compensate for that introduced by another element in the system. The MTF of each element can be low, but the combination of elements results in a much higher MTF value for the lens system. Cascading of elements assumes a negligible contribution of a phase shift, which would make the system nonlinear; description by its MTF would be incomplete (without the φTF). Also, the assumption that the transfer function of the resist is linear in response is only approximate,[12] but this inaccuracy can be insignificant.
7.5 Illumination Degree of Coherence
Figure 20 is a schematic of a unity magnification optical stepper. The chief ray is an off-axis object ray passing through the center of the objective lens aperture stop and appearing to pass through the centers of the entrance and exit pupils. Any element that limits the amount of light reaching the image is known as the aperture stop. The entrance pupil is the image of the aperture stop viewed from an axial point on the object through the lens assembly. The entrance pupil appears to limit the rays entering the system to form the image of the object. The exit pupil is the image of the aperture stop viewed from an axial point on the image plane through the lens assembly. The illuminator assembly provides uniform light directed to the objective lens assembly. The illumination used in optical steppers is approximately Kohler, where each element of the spatially incoherent exposure source is imaged through a condenser in the entrance pupil of the projection lens with emerging plane or almost plane waves.[13][14] This is shown in Fig. 21.
At every point on the mask, the principal illumination direction is nominally toward the same point in the projection lens entrance pupil, so the diffraction patterns from objects at all points in the field coincide.[15] Kohler illumination permits uniform illumination from a source that is not uniform. A different type of illumination is critical illumination, where the source is imaged into the mask plane. This requires that the source be uniform.
Figure 20. Unity magnification system telecentric on the image side. (Reprinted with permission of Prentice-Hall from J. R. Meyer-Arendt, Introduction to Classical and Modern Optics, 2nd Ed., Prentice-Hall, 1984.)
Figure 21. Kohler illumination shown image-side telecentric. (Reprinted with permission of Ref. 15.)
The illumination of optical imaging systems is described as either coherent or incoherent in amplitude depending upon the extent to which the source image fills the entrance pupil of the objective. Steppers are designed so the source only partially fills the pupil of the objective. The pupil filling ratio (s) is the ratio of the diameter of the imaged source at the pupil to the pupil diameter, and is calculated by:
Eq. (34)
$$s = NA_{illumination}/NA_{objective}$$
The coherence is increased by narrowing the illumination slit. For mercury arc lamp sources, this will increase the nominal exposure time for a given feature, degrading wafer throughput on the exposure system. Also, as the coherence is increased (s ≤ 0.7), the edge integrity of isolated features is degraded by a phenomenon known as ringing.[16] Ringing is an undulation in the intensity at the edge, dI/dx, instead of a constant value (see Fig. 22). The partial coherence value influences not only the modulation rate but also the profile itself of the aerial image reconstituted by the lens, and it is necessary to know the intensity distribution [I(x)] exactly in order to compute the edge profile and the size of a line developed in photoresist.[17] Temporal coherence means the emitted light is perfectly monochromatic. The actinic bandwidth is made as wide as the projection optics design will allow so the image quality is acceptable. Wave trains travel in phase for coherent (s = 0) sources, which are point sources. Incoherent sources (s = ∞) are infinite sources. In practice, all the light from a finite source is imaged within the entrance pupil, and therefore, s≤1. The difference between s = ∞ and s = 1 is small.[18] A schematic comparison of coherent and incoherent sources is shown in Fig. 23. For partial coherence, the source size is smaller than the entrance pupil of the projection lens.
Figure 22. Edge intensity. (Reprinted with permission of Ref. 16.)
Figure 23. Schematic of coherent and incoherent projection printers. (Reprinted with permission of Ref. 21.)
The main difference between a laser and mercury arc source is the spatial coherence (see Fig. 24). In contrast to arc sources which emit in 4π steradians, the laser emits a narrow beam with an angular divergence (θ) limited by diffraction:[19]
Eq. (35)
$$\theta = k\lambda/d$$
where λ is the wavelength of the light, d is the output diameter through which the light radiates from the cavity, and k is a constant factor (k = 1.22 for a uniform beam; k = 2/π for a Gaussian beam). This collimation permits focusing of all the emitted energy into a small spot whose size (of approximately a wavelength) is limited only by focusing lens diffraction. In comparison to mercury arc lamps, there is no illuminance penalty for increasing the coherence. The MTF curves for an imaging system with different filling ratios[18][20] are shown in Fig. 25.
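For a sense of scale, Eq. (35) can be evaluated directly. The wavelength and output diameter below are illustrative assumptions, not values given in the text; only the two k factors come from the passage above.

```python
import math

K_UNIFORM = 1.22            # uniform beam, from the text
K_GAUSSIAN = 2.0 / math.pi  # Gaussian beam, from the text

def divergence(wavelength, output_diameter, k=K_UNIFORM):
    """Eq. (35): diffraction-limited angular divergence in radians.

    wavelength and output_diameter must share the same length unit.
    """
    return k * wavelength / output_diameter

# Assumed example: 248 nm light leaving a 10 mm aperture (both in meters).
theta_uniform = divergence(248e-9, 10e-3)
theta_gaussian = divergence(248e-9, 10e-3, k=K_GAUSSIAN)
```

Both results are tens of microradians, which illustrates how much more directional a laser is than an arc source radiating into 4π steradians.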
Figure 24. A schematic of the relative intensity distribution of a mercury arc lamp is shown in (a). A three-dimensional view of the lamp’s output is shown in (b). (Reprinted with permission of SPIE, from J. M. Roussel, SPIE, 275:9, 1981.)
Figure 25. Effect of degree of coherence on MTF. (Reprinted with permission of Ref. 17.)
For coherent illumination, the MTF response is a step function of unity until the cutoff frequency is reached, when the response immediately drops to zero. Coherent systems have sharper image edge gradients and smaller point spread functions, but they also have more image degradation resulting from diffraction.[21] Returning to diffraction gratings,[13] Fig. 26 shows a coherent plane wave normally incident on a mask which contains a grating pattern of pitch p with equal lines and spaces. The frequency (ν) equals the inverse of the pitch, p. The undiffracted component of light passing through the grating contains no information about ν, the frequency of the grating. Information is contained only in the diffracted light. The direction of the first diffraction peak is given by the grating formula:
Eq. (36)
$$n\,p\,\sin(\theta) = N\lambda \qquad (N = 1 \text{ for the first diffracted order})$$
so that:
Eq. (37)
$$\nu = n\sin(\theta)/\lambda$$
Figure 26. Diffraction of coherent and incoherent light by a grating pattern. (Reprinted with permission of Ref. 204.)
If the light diffracted into direction θ is to reach the image plane, it must be collected by the lens, which accepts light at all angles θ ≤ i, since the numerical aperture of the lens was defined previously as n sin(i). Therefore, the highest grating frequency which can be imaged by an optical system with coherent illumination is:
Eq. (38)
$$\nu_{max} = n\sin(i)/\lambda = NA/\lambda$$
For incoherent illumination (s = ∞), the stepper has a linear response to intensity and the cutoff frequency, νmax, is:
Eq. (39)
$$\nu_{max} = 2NA/\lambda$$
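The grating argument above can be sketched in a few lines: compute the first-order diffraction angle from Eq. (36), compare it with the lens acceptance, and evaluate the cutoff frequencies of Eqs. (38) and (39). The wavelength (0.365 µm), numerical aperture (0.5), and pitches are assumed example values, and the function names are hypothetical.

```python
import math

def first_order_angle(pitch, wavelength, n=1.0):
    """Eq. (36) with N = 1: first-order angle in radians, or None if no
    propagating first order exists (wavelength exceeds n * pitch)."""
    sine = wavelength / (n * pitch)
    return math.asin(sine) if sine <= 1.0 else None

def coherent_cutoff(na, wavelength):
    """Eq. (38): highest grating frequency imaged with coherent light."""
    return na / wavelength

def incoherent_cutoff(na, wavelength):
    """Eq. (39): cutoff frequency for incoherent illumination."""
    return 2.0 * na / wavelength

def first_order_collected(pitch, wavelength, na):
    """Normally incident coherent light: n*sin(theta) = wavelength/pitch
    must not exceed the numerical aperture."""
    return wavelength / pitch <= na

# Assumed example values, lengths in micrometers.
WAVELENGTH, NA = 0.365, 0.5
coarse_ok = first_order_collected(1.0, WAVELENGTH, NA)  # 1.0 um pitch
fine_ok = first_order_collected(0.6, WAVELENGTH, NA)    # 0.6 um pitch
nu_coherent = coherent_cutoff(NA, WAVELENGTH)
nu_incoherent = incoherent_cutoff(NA, WAVELENGTH)
```

With these assumed numbers the coarse grating's first order is collected and the fine grating's is not, consistent with its frequency (1/0.6 cycles per micrometer) lying above the coherent cutoff.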
Notice that the cutoff frequency for incoherent illumination occurs at twice the spatial frequency as that for coherent light. The phase of the transfer function can be expected to vary rapidly as the spatial frequency approaches the MTF cutoff.[21] The incoherent MTF for a circular pupil is given by:[7]
Eq. (40)
$$TF(\nu) = \frac{2}{\pi}\left[\cos^{-1}\!\left(\frac{\nu}{\nu_{max}}\right) - \frac{\nu}{\nu_{max}}\sqrt{1 - \left(\frac{\nu}{\nu_{max}}\right)^{2}}\,\right]$$
where νmax is defined by Eq. (39), and this can be approximated by the expression:[18]
Eq. (41)
$$TF(\nu) \approx 1 - \frac{4}{\pi}\sin\!\left(\frac{\nu\lambda}{2NA}\right)$$
for all but the highest spatial frequencies. The angles are expressed in radians. For minimum features, this can be further approximated by:[22]
Eq. (42)
$$TF = 1 - 1/(\pi k_1)$$
where k1 is the Rayleigh resolution coefficient from Eq. (13). The effect of partial coherence, 0 < s < 1, is to widen the spatial extent of each diffracted order by ±sNA/λ so that the highest spatial frequency with the first diffracted order collected by the aperture of the objective lens is:[23]
Eq. (43)
$$\nu_{max} = (1 + s)NA/\lambda$$
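To compare the exact incoherent MTF of Eq. (40) with the approximations of Eqs. (41) and (42), and to evaluate the partially coherent cutoff of Eq. (43), a sketch such as the following can be used. It assumes that Eq. (13) has the usual Rayleigh form w = k1·λ/NA and that the spatial frequency of equal lines and spaces of width w is 1/(2w); the wavelength, NA, k1, and s values are illustrative.

```python
import math

def mtf_incoherent(nu, na, wavelength):
    """Eq. (40): incoherent MTF for a circular pupil (zero beyond cutoff)."""
    x = nu / (2.0 * na / wavelength)
    if x >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(x) - x * math.sqrt(1.0 - x * x))

def mtf_approx(nu, na, wavelength):
    """Eq. (41): approximation valid away from the highest frequencies."""
    return 1.0 - 4.0 * math.sin(nu * wavelength / (2.0 * na)) / math.pi

def mtf_min_feature(k1):
    """Eq. (42): further approximation in terms of the Rayleigh coefficient."""
    return 1.0 - 1.0 / (math.pi * k1)

def partial_coherence_cutoff(s, na, wavelength):
    """Eq. (43): highest frequency whose first order is still collected."""
    return (1.0 + s) * na / wavelength

# Assumed example: i-line (0.365 um), NA = 0.5, k1 = 0.8, s = 0.5.
WAVELENGTH, NA, K1, S = 0.365, 0.5, 0.8, 0.5
feature = K1 * WAVELENGTH / NA   # assumed form of Eq. (13)
nu = 1.0 / (2.0 * feature)       # equal lines and spaces
exact = mtf_incoherent(nu, NA, WAVELENGTH)
approx_41 = mtf_approx(nu, NA, WAVELENGTH)
approx_42 = mtf_min_feature(K1)
nu_partial = partial_coherence_cutoff(S, NA, WAVELENGTH)
```

Under these assumptions the three estimates agree to within about one percent, which suggests the simpler forms are adequate for quick comparisons well below cutoff.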
There are several technical difficulties with Fig. 25.[5][24] When the illumination is partially coherent, the system is no longer linear in either intensity or amplitude, so the imagery of real (square-wave) objects of arbitrary shape cannot be reconstructed by the MTF alone[25][26] since phase information is unaccounted for. MTF curves are shown then only as a point of reference. However, for lithography near the resolution limit, only the fundamental frequency of the mask pattern reaches the image plane,[18] and the MTF is valid for describing the modulation of the fundamental frequency of a bar target of a specified period.[25] When the mask features are relatively large, the MTF becomes a meaningless metric, but the imaging begins to approach ideal so sophisticated prediction methods like the MTF are unnecessary. The MTF applies only to objects which have sinusoidal transmission characteristics[26] such as Fig. 27. It is more difficult to fabricate high quality sinusoidal transmission targets than square wave bar targets.[27] If the square wave response curve (TFsq) is known, the sine wave response curve (TFsine wave) can be calculated by:[28]
Eq. (44)
$$TF_{sine\ wave}(\nu) = \frac{\pi}{4}\left[TF_{sq}(\nu) + \frac{TF_{sq}(3\nu)}{3} - \frac{TF_{sq}(5\nu)}{5} + \frac{TF_{sq}(7\nu)}{7} + \frac{TF_{sq}(11\nu)}{11} - \frac{TF_{sq}(13\nu)}{13} + \cdots\right]$$
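The series of Eq. (44) is straightforward to apply once a square wave response curve is available. The sketch below truncates the series after the 7ν term, and the linear `measured_square_response` is a hypothetical curve invented for the example rather than measured data.

```python
import math

# (harmonic, sign) pairs for the leading terms of Eq. (44), truncated at 7*nu.
LEADING_TERMS = ((1, 1.0), (3, 1.0), (5, -1.0), (7, 1.0))

def sine_from_square(tf_square, nu):
    """Estimate the sine wave response at nu from a square wave response curve."""
    total = 0.0
    for harmonic, sign in LEADING_TERMS:
        total += sign * tf_square(harmonic * nu) / harmonic
    return math.pi / 4.0 * total

def measured_square_response(nu, cutoff=2.0):
    """Hypothetical square wave response: linear fall-off to zero at cutoff."""
    return max(0.0, 1.0 - nu / cutoff)

tf_sine_high = sine_from_square(measured_square_response, 1.0)
tf_sine_low = sine_from_square(measured_square_response, 0.5)
```

Above one third of the cutoff frequency every higher harmonic is beyond cutoff, so only the first term contributes and the sine wave response is simply π/4 times the square wave response.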
Figure 27. Comparison of bar and sinusoidal targets. (Reprinted with permission of Ref. 27.)
For partially coherent imaging, mask patterns of opposite tone do not produce complementary images.[29] The image produced by an isolated line pattern is fundamentally different from that of an isolated space pattern. The latent image is exponentially related to the aerial image for first-order reaction kinetics. It is so highly nonlinear that complementary aerial images do not lead to complementary latent images. Any mask pattern has an optimum tone for maximizing the exposure-defocus latitude.
7.6 Wavelength Effect on MTF
The benefits of shortening the wavelength to improve image quality at higher spatial frequencies are apparent by examining the MTF response[25] in Fig. 28.
7.7 Depth of Focus
The depth of focus (DOF) of an optical system at the Rayleigh resolution limit is referred to as a Rayleigh unit of defocus and was defined by Strehl as:
Eq. (45)
$$DOF = \pm\lambda/(2NA^{2})$$
Figure 28. Incoherent MTF as a function of wavelength. (Reprinted with permission of Ref. 25.)
In practice, this depth of focus at the resolution limit is usually not reached, but the general form of the depth of focus description remains:
Eq. (46)
$$DOF = \pm k_2\lambda/NA^{2}$$
where k2 is determined empirically for the lithography process in use. Equation (46) still has serious deficiencies since it does not recognize the depth of focus latitude increases for features larger than the Rayleigh resolution size, nor the dependence on feature shape or aspect ratio (feature height/width) even at the Rayleigh resolution size, nor the illumination conditions. The effect of defocus on the incoherent MTF as a function of spatial frequency has been calculated[30] and is related to the MTF by the curve DOF2. For a defocus equal to k times the Rayleigh unit of defocus, the incoherent modulation for any feature size is reduced by k2 times the appropriate ordinate of the DOF2 curve. Notice in Fig. 29, the DOF2 curve peaks near an operating resolution factor of k1 = 0.61. For lower spatial frequencies, the DOF2 is linear. Recalling the MTF curve for incoherent illumination is approximately linear in this region, the reduction in MTF due to defocus is inversely proportional to the first power rather than to the square of the numerical aperture.[24] The deleterious effect of defocus for high spatial frequencies can be minimized by increasing the coherence of the light or by shortening the actinic wavelength.
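The scaling in Eqs. (45) and (46) can be illustrated numerically. The wavelength, numerical apertures, and k2 value below are assumptions chosen for the example, since k2 must be determined empirically for a given process.

```python
def rayleigh_dof(wavelength, na):
    """Eq. (45): half-range of one Rayleigh unit of defocus."""
    return wavelength / (2.0 * na ** 2)

def process_dof(wavelength, na, k2):
    """Eq. (46): half-range of the depth of focus with empirical factor k2."""
    return k2 * wavelength / na ** 2

# Assumed example values, lengths in micrometers.
dof_na_050 = rayleigh_dof(0.365, 0.50)
dof_na_060 = rayleigh_dof(0.365, 0.60)
dof_process = process_dof(0.365, 0.50, k2=0.8)
```

Raising the NA from 0.50 to 0.60 in this example cuts the Rayleigh depth of focus by roughly 30 percent, which shows why focus budgets tighten quickly as resolution improves. Setting k2 = 0.5 in Eq. (46) recovers Eq. (45).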
Figure 29. Effect of defocus on MTF. (Reprinted with permission of Ref. 24.)
The MTF calculation with defocus effects[31] depends not only on the spatial frequency (ν), but also on the vertical position (z), according to:
Eq. (47)
$$TF(\nu, z) = \frac{4}{\pi}\int_{\nu_{max}}^{1} \sqrt{1 - \nu^{2}}\,\cos\left[2\pi\nu_{max}(\nu - \nu_{max})\,\Delta(z)\right] d\nu$$
where ∆(z) is the normalized Rayleigh defocus length defined by:
Eq. (48)
$$\Delta(z) = \frac{2NA^{2}}{\lambda}\left(\delta_0 + \frac{z}{n} + d\right)$$
The resist index of refraction is n, δ0 by convention is negative when the focal plane is below the resist surface, the depth variable (z) by convention is z = 0 at the resist-air interface and is positive below the resist surface, and d is an equivalent defocus length to account for aberrations other than defocus.
From a lithographic perspective, the selection of pupil filling factor to set the partial coherence most strongly influences the aerial image stability with defocus. Critical dimension control with coherent illumination could suffer too much variation with defocus to be practical. With decreasing budgets for defocus variation, partial coherence values of s = 0.3–0.6 could be more attractive for defocus tolerance.
7.8 Diffraction Limited Resolution
Previously, it was stated that a lens is considered diffraction limited when it has residual aberrations and manufacturing errors which are negligible compared with the diffraction effects. This will be expanded now, following Rayleigh, to designate a system as diffraction limited if the MTF lies above the curve corresponding to an optical path difference (OPD) of λ/4.[5][21] This corresponds to a Strehl ratio of about 0.8. Using the concept of rays instead of wavefronts in Fig. 30, many rays pass through an optical system from an object point to form an image point. The object and image points are referred to as conjugate points and the length between their respective planes is the conjugate length. Ideally, the optical distance will be the same along each ray. Due to imperfections, the relative paths can differ by many wavelengths. The optical path length is the sum, over the segments of a ray, of the product of the refractive index of each segment and its geometric length.[5] Accurate characterization of the indices of refraction for the batches of glass used for fabricating the elements is critical for lens designers.[32]
Figure 30. Effect of optical wavefront disturbances. (Reprinted with permission of Ref. 104.)
7.9 Minimum MTF Requirement
It is generally accepted that MTF values of at least 40–60% are required for a minimum working feature in positive resist.[13][25] It can, therefore, be seen that at this level of performance there is some advantage for resolving higher spatial frequencies using more coherent light. These MTF limits arise because the minimum size printable feature using any combination of photoresist and lens must satisfy:[33]
Eq. (49)
$$TF_{optical} \geq M_{resist}$$
where:
Eq. (50)
$$M_{resist} = (E_{100} - E_0)/(E_{100} + E_0)$$
which follows Eq. (24). E100 is the minimum exposure energy for 100% exposure (i.e., E100 is the minimum energy to completely solubilize the resist) and E0 is the maximum exposure energy for zero exposure (i.e., E0 is the threshold energy above which resist solubility dramatically increases). Exposure time or intensity can be substituted for energy in the calculation. A performance parameter of resist is its gamma (γ) value defined as:
Eq. (51)
$$\gamma = [\log_{10}(E_{100}/E_0)]^{-1}$$
so Eq. (50) becomes:
Eq. (52)
$$M_{resist} = \left[10^{1/\gamma} - 1\right]/\left[10^{1/\gamma} + 1\right]$$
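The link between Eqs. (50), (51), and (52) can be checked with a short calculation. The doses below are hypothetical, and the base-10 logarithm is assumed because Eq. (52) is written with powers of ten.

```python
import math

def resist_gamma(e100, e0):
    """Eq. (51): resist contrast from the two characteristic doses."""
    return 1.0 / math.log10(e100 / e0)

def resist_modulation_from_dose(e100, e0):
    """Eq. (50): minimum modulation the resist needs, from doses."""
    return (e100 - e0) / (e100 + e0)

def resist_modulation_from_gamma(gamma):
    """Eq. (52): the same quantity expressed through gamma."""
    ratio = 10.0 ** (1.0 / gamma)
    return (ratio - 1.0) / (ratio + 1.0)

def is_printable(tf_optical, m_resist):
    """Eq. (49): the optical transfer factor must reach the resist modulation."""
    return tf_optical >= m_resist

# Hypothetical doses (any consistent unit): threshold 50, full clearing 100.
gamma = resist_gamma(100.0, 50.0)
m_from_dose = resist_modulation_from_dose(100.0, 50.0)
m_from_gamma = resist_modulation_from_gamma(gamma)
```

Both routes give the same required modulation, and a higher gamma lowers it, which is why a higher contrast resist can print with a lower optical transfer factor.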
The theoretical basis for using γ as a figure of merit is presented elsewhere.[34]–[37] Resist γ has been shown to be directly related to resist profile, resolution, and linewidth control.[14][39]–[41] However, the usefulness of γ for process development is questioned[42]–[45] since surface inhibition effects are not always well accounted for. Although γ is an indirect measure of critical dimension control, it is easier to measure than image linewidths. Equally important, critical dimension experiments testing nested variances require equal cell biases to avoid an important confounding factor. This greatly increases the difficulty of linewidth experiments.[46]
The resist characteristic curve is generated by making open frame exposures (i.e., exposures with no pattern) of increasing dose into a resist film coated on a bare silicon wafer. Some data are shown in Fig. 31.
Figure 31. Resist characteristic curve.
The value of γ of the resist also can be calculated from the slope of the least squares fitted line for normalized film thicknesses between 15–80%.[46] These criteria are somewhat arbitrary, but at thicknesses >80%, the shoulder effects adversely affect the calculations, since the shoulder is not part of the linear response. Also, at film thicknesses less than 15%, the data tend to be noisy. This technique is contrary to others.[44]–[47] Another figure of merit that has shown good correlation to mask linearity and exposure latitude is the exposure margin, EM:
Eq. (53)
$$EM = E_{1:1}/E_0$$
The exposure margin is the ratio of the energy to image equal lines and spaces to the resist threshold energy.[48]
7.10 Field Application of Transfer Functions
MTF values can be obtained without performing laser interferometric bench tests. The object contrast is obtained most easily from measurements of the percentage transmission on a mask with a series of resolution structures used for the different spatial frequencies and application of Eq. (24).[49] Otherwise, the object can be scanned with a slit and photoelectric device to find Imax and Imin.[50] The image contrast can be determined experimentally by using a thin layer of photoresist as a threshold detector.[21][26] A very thin resist layer (so the aspect ratio of the features is well below unity) makes this method independent of process parameters like absolute exposure, time, intensity, resist bake times and temperatures, and developer conditions. The image contrast in a pattern of equal lines and spaces can be calculated by noting the exposure (dose, time) at which the spaces begin to clear or open, (E1, T1), and the exposure (dose, time) at which the lines completely disappear, (E2, T2). The image contrast (γ) is calculated by:
Eq. (54)
$$\gamma = (T_2 - T_1)/(T_2 + T_1)$$
Equation (54) follows from the definition of Eq. (24), where Imax = E0/E1 and Imin = E0/E2, for the resist threshold dose to clear E0. Knowledge of both the object modulation and image modulation for different spatial frequencies allows the MTF curve to be generated. Another method of finding the image contrast is to record the image on film and scan it using a microdensitometer to determine the maximum and minimum density values of the image.[50] This method is limited to lower spatial frequencies depending upon the noise introduced to the measurements by light scattered in the film. Alternatively, a small detector could scan the illumination profile at the image plane and determine the image contrast.[51][52] The fluorescent intensity from a scanning wafer of special construction is detected by an off-axis photometer versus the grating position and will reproduce the actual image intensity profile. Defocus will degrade the slope of the aerial image.
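The threshold-detector method can be sketched as follows. The exposure times and threshold dose are hypothetical, and the second function simply repeats the substitution described above (Imax = E0/E1, Imin = E0/E2) to confirm that it reproduces Eq. (54) when doses are used in place of times.

```python
def image_contrast(t_clear, t_gone):
    """Eq. (54): contrast from the exposure at which spaces begin to clear
    (t_clear) and the exposure at which lines disappear (t_gone)."""
    return (t_gone - t_clear) / (t_gone + t_clear)

def contrast_via_intensity(e0, e_clear, e_gone):
    """Eq. (24) with Imax = E0/E1 and Imin = E0/E2."""
    i_max = e0 / e_clear
    i_min = e0 / e_gone
    return (i_max - i_min) / (i_max + i_min)

# Hypothetical measurement: spaces clear at 100 ms, lines vanish at 300 ms.
contrast = image_contrast(100.0, 300.0)
# Same exposures treated as doses, with an assumed threshold dose of 40.
contrast_check = contrast_via_intensity(40.0, 100.0, 300.0)
# Dividing by an assumed object modulation of 1.0 gives one point of the MTF curve.
mtf_point = contrast / 1.0
```

Repeating the measurement on structures of different pitch gives one MTF point per spatial frequency.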
8.0 DESIGN CONSIDERATIONS FOR IMAGING EFFECTS
The application of optical transfer functions is essential to modern optics designers.[5] Rapid and accurate calculation of transfer functions theoretically allows the investigation of several designs to meet a given set of specifications without constructing a prototype lens to test its performance. Reliance on geometrical optics can result in serious discrepancies between the image produced and that predicted by ray optics, primarily caused by the neglect of diffraction effects. Optical transfer functions are related mathematically to the diffraction integral, the wavefront aberration function, and the spread function, so a complete description of the imagery of the optical system can be made. The optical engineer can obtain information about an optical system by tracing the paths of rays from a point object.[4] If all the rays pass through the optical system from the point object and converge to a point image, the system is free of aberrations. In a wave front description of imaging, periodic oscillations leave the point object traveling with equal speed in all directions. On any spherical surface whose center is at the object point, all the oscillations are in phase (see Fig. 32). This surface is called a wave front. Light rays are normal to the wave front. If the system is free of aberrations, the wave front emerging from it is spherical with its center at the image point since all its normals pass through the image point. If the system has aberrations, the departure of the wave front from sphericity can be mapped by tracing rays through the system. B. R. A. Nijboer defined the wave aberration function as the difference between the wave surface and the reference sphere. Aberrations caused by glass inhomogeneities[32] or surface irregularities are responsible for smearing the image or dislocating it from ideal.
Figure 32. The spherical wavefront at the exit pupil.
If the optical rays are close to the axis and almost parallel to the axis, they are called paraxial rays. The paraxial approximation to image formation assumes the sines of the angles of Snell’s law are replaced by the angles. The shortcomings of geometrical optics are most pronounced for rays that are relatively remote from the axis in moderate or high numerical aperture lenses and it is under these conditions that the optical transfer function is most useful.[5] The paraxial approximation can lead to errors in aerial image peak intensity and slope, worsening with increasing objective lens numerical aperture and magnification.[53] Information about the wave front deformations, ∆(x,y), in terms of its statistical properties can be obtained by performing laser interferometric measurements. These measurements can be made on individual lens elements to control the manufacturing process and also on the lens assembly. Lens design can take advantage of some mathematical tools to improve the quality and reduce the cycle time of new designs. Optimization routines can search for designs that optimize a given merit function[54][55] or the response surface can be studied for different design factors.[56] These tools are no substitute for innovative engineering and the “optimized” designs must meet other manufacturability criteria.
8.1 Laser Interferometry
A laser interferometer splits monochromatic light into two beams. Interference, causing bright and dark fringes, takes place when the beams are reunited after traveling over different paths. Changes of position can be measured on a scale in terms of wavelength by counting fringes, allowing precise measurements. The Twyman-Green interferometer shown in Fig. 33 can test the transmission properties of transparent optical components, so it is particularly useful for testing lenses, prisms, and mirrors.[57]–[59] The light source must be well-collimated, coherent, and monochromatic. Light exiting the collimating lens is divided in amplitude by the beam splitter. The lens (L) to be tested is placed in one of the arms and the mirror behind it is chosen so that the waves reflected by the mirror and repassing through the lens are collimated with plane wave fronts. These waves are brought to interference with the reference plane waves from the other arm of the interferometer. In practice, the mirror for the reference light will be tilted slightly so if the lens is optically perfect, the fringe pattern will be uniform and parallel. Any local variation of the optical path will produce fringes of a different pattern (other than uniform and parallel) in the corresponding part of the field. These fringes are essentially the contour lines of the
distorted wave front. The entire field of the test lens can be mapped this way. Errors in the test surface can be measured in the presence of large errors in the interferometric optics by subtraction of the wave front obtained from a perfect standard of the same radius.[57] Ideally, the wavelength of the interferometer matches the actinic wavelength.
Figure 33. Twyman-Green interferometer for testing lens L. (Reprinted with permission of Prentice-Hall from J. R. Meyer-Arendt, Introduction to Classical and Modern Optics, 2nd Ed., Prentice-Hall, 1984.)
To achieve diffraction-limited performance of the lens assembly, the wavefront deviation from spherical in the exit pupil must satisfy the Maréchal condition of λ/14 rms. This places stringent demands on the quality of individual elements in multi-element designs since the errors are assumed to add in quadrature. The Lawrence Livermore National Laboratory has improved conventional interferometry by utilizing a reference wavefront produced from diffraction generating arbitrarily perfect spherical wavefronts. Two single mode optical fibers generate the measurement and reference wavefronts. The first fiber diffracts light. The diffraction forms the spherical measurement wavefront which passes through the optical
system under test, inducing aberrations in the measurement wavefront. This measurement wavefront comes to focus on the endface of the reference fiber and interferes with this second fiber’s diffracted light which has formed a spherical reference wavefront. The resulting interference pattern is captured by a CCD camera for analysis. Because the measurement and reference wavefronts are independently generated, their relative phase and amplitude can be controlled, providing contrast adjustment and phase-shifting diffraction interferometric capability. Examples of fringes for different surfaces with reference to a standard flat are illustrated in Fig. 34.
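Returning to the λ/14 rms wavefront budget stated earlier in this section, the quadrature assumption can be turned into a rough per-element tolerance. The sketch below assumes equal, uncorrelated contributions from each element and uses the widely quoted Maréchal approximation for the Strehl ratio, S ≈ 1 − (2πσ)², with σ the rms wavefront error in waves. That approximation is not given in the text and is included only to connect the budget to the Strehl ratio of about 0.8 quoted for diffraction limited systems. The element count of 14 is taken from the lens design of Fig. 35.

```python
import math

def rms_in_quadrature(element_rms):
    """Combine uncorrelated rms wavefront errors (in waves) in quadrature."""
    return math.sqrt(sum(e * e for e in element_rms))

def per_element_budget(total_rms, element_count):
    """Equal share of a total rms budget, assuming quadrature addition."""
    return total_rms / math.sqrt(element_count)

def strehl_estimate(rms_waves):
    """Marechal approximation (an assumption here), valid for small errors."""
    return 1.0 - (2.0 * math.pi * rms_waves) ** 2

TOTAL_BUDGET = 1.0 / 14.0                       # lambda/14 rms, in waves
budget = per_element_budget(TOTAL_BUDGET, 14)   # 14-element design
recombined = rms_in_quadrature([budget] * 14)
strehl = strehl_estimate(TOTAL_BUDGET)
```

Under these assumptions each element of a 14-element lens is limited to roughly λ/52 rms, which illustrates why the individual element tolerances are so demanding.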
Figure 34. Examples of interference fringes. (Reprinted with permission of Ref. 57.)
Near the end of its manufacture, the reduction lens assembly of a stepper can be tested in a similar fashion (see Fig. 35). The entire lens assembly can be secured kinematically to minimize time variant distortion effects. A kinematic coupling is not overconstrained (i.e., has only six contact points), provides contamination immunity, and can be engineered for suitable stiffness. The normals to the six contact points for two planes are not parallel so the relative position and orientation of the coupled surfaces is uniquely defined. [60] Temperature control is critical with kinematic mounts since thermal expansion after clamping will not be symmetric about the object center due to the kinematic mounts; only one point will stay fixed and it’s not the center. For this reason, individual elements are more constrained in their mounting to preserve the shape of individual elements in the assembly as they were designed and manufactured.[61] Stress to the element is reduced whenever the area of the elastic body contact at the element/retainer interface increases. To evaluate the wave front quality, the shape of the wave fronts emanating from the lens element or assembly is modeled mathematically.[5][57][62][63] The fringes generated by the Twyman-Green interferometer are collected by video, digitized, and analyzed using proprietary or commercially available software.
Figure 35. Schematic of a 5X, h-line, 14 element lens design. (Reprinted with permission of KTI Microelectronics Seminar from I. Friedman, A. Offner, and H. Sewell, KTI Microelectronics Seminar, Nov., 1987.)
8.2
Aberration Modeling
The mathematical expression of the wave aberration function has evolved toward power series expansions and Zernike circle polynomials. Because the Zernike polynomials are orthogonal, both the full series and any individual term can be fitted to the data by least squares regression. This means that any individual coefficient, each related to a particular aberration, can be removed from the data to reveal the residual aberrations.[57] In practice, the infinite series is truncated to a manageable number of terms for calculation, but higher order aberrations must be considered in highly corrected optical systems. The wave aberration function depends on the coordinates where the ray passes through the pupil sphere and on where the object point is located in the object plane.[5] The wave aberration function can be expressed as: Eq. (55)
W = W(r, ρ , θ)
where r is the distance from the axis to the object point, and ρ and θ are the polar coordinates. For an axially symmetric wave front,[59] the wave front is described by:[5] Eq. (56)
$W(r, \rho, \theta) = f\left({}_{a}C_{bc}\, r^{a} \rho^{b} \cos^{c}\theta\right)$
The terms of the wave aberration function series are grouped according to their order:[5] Eq. (57)
order = (sum of the powers of r and ρ) – 1
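The grouping rule of Eq. (57) can be sketched in a few lines. The exponent triples below are the customary Seidel forms of each term and are an assumption for illustration, since only the names of the aberrations are listed in the text; the function and variable names are likewise hypothetical.

```python
def aberration_order(power_r: int, power_rho: int) -> int:
    """Order per Eq. (57): (sum of the powers of r and rho) - 1."""
    return power_r + power_rho - 1

# (name, power of r, power of rho, power of cos(theta)).
# Assumed exponents: the customary Seidel forms, not taken from the text.
TERMS = [
    ("focus", 0, 2, 0),
    ("spherical aberration", 0, 4, 0),
    ("coma", 1, 3, 1),
    ("astigmatism", 2, 2, 2),
    ("curvature of field", 2, 2, 0),
    ("distortion", 3, 1, 1),
]

def group_by_order(terms):
    """Collect the aberration names under their order."""
    groups = {}
    for name, a, b, _c in terms:
        groups.setdefault(aberration_order(a, b), []).append(name)
    return groups

ORDERS = group_by_order(TERMS)
for order in sorted(ORDERS):
    print(order, ORDERS[order])
```

With these exponents, focus falls in the first order group and the five Seidel aberrations fall in the third order group.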
Some of the lens aberrations are listed in Table 1. Another aberration that must be considered in refractive systems is chromatic aberration, or the chromatic spread in best focus. The change of index of refraction of a material with wavelength is called dispersion. For most optical materials, dispersion increases sharply as the wavelength shortens. Dispersion in the lens causes images produced by different colors to come to focus at different planes. Shorter wavelength light comes to focus closer to the lens than longer wavelength light, as shown in Fig. 36. This is called longitudinal chromatic aberration since the error is measured along the axis. Focal lengths vary with wavelength, so the shorter wavelength light is of smaller magnification than the longer wavelength light. This is called transverse chromatic aberration since the displacement of a particular ray is
calculated on a given image surface. No optical illumination is perfectly monochromatic. Image contrast is related linearly to the dispersion rate, d(focal distance)/dλ,[64] so a tolerable bandwidth must be determined that causes a negligible impact on imaging. For example, this bandwidth can be narrowed for a mercury arc lamp by insertion of a set of filters in the light path or by spectral narrowing of the excimer laser output. However, for the highest intensity at the wafer plane, it is desirable that the bandwidth be as broad as possible.
Table 1. Lens Aberrations
First Order Terms
    Focus
Third Order or Seidel Aberrations
    Spherical aberration
    Coma
    Astigmatism and Curvature of field
    Distortion
Fifth Order Terms
    Spherical aberration
    Linear coma
    Elliptical coma
    Oblique spherical
    Astigmatism and Curvature of field
    Distortion
Figure 36. Longitudinal chromatic aberration. (Reprinted with permission of Prentice-Hall from J. R. Meyer-Arendt, Introduction to Classical and Modern Optics, 2nd Ed., Prentice-Hall, 1984.)
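The longitudinal chromatic aberration of Fig. 36 can be illustrated with a single thin lens. This is only a sketch: the two-term Cauchy dispersion formula, its coefficients, and the surface radii are assumed illustrative values and do not describe any lens in this chapter. The wavelengths are the mercury i-, h-, and g-lines.

```python
def cauchy_index(wavelength_um, a=1.458, b=0.00354):
    """Cauchy dispersion n = A + B / wavelength^2 (illustrative coefficients)."""
    return a + b / wavelength_um ** 2

def thin_lens_focal_length(wavelength_um, r1_mm=100.0, r2_mm=-100.0):
    """Lensmaker's equation for a thin lens in air: 1/f = (n - 1)(1/R1 - 1/R2)."""
    n = cauchy_index(wavelength_um)
    return 1.0 / ((n - 1.0) * (1.0 / r1_mm - 1.0 / r2_mm))

# Mercury arc lines used in optical steppers (wavelengths in micrometers).
LINES_UM = {"i-line": 0.365, "h-line": 0.405, "g-line": 0.436}
FOCAL_MM = {name: thin_lens_focal_length(w) for name, w in LINES_UM.items()}

for name, focal in FOCAL_MM.items():
    print(f"{name}: f = {focal:.2f} mm")
```

Because the index rises as the wavelength shortens, the computed focal length is shortest for the i-line and longest for the g-line, which is the behavior described for Fig. 36.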
The effects of chromatic aberration can be seen by changing the spectral bandwidth and noting the change in MTF response.[65] Chromatic aberration also can be measured by comparing peak response versus focus for different wavelengths.[65] Chromatic aberration causes defocus contributions to the aerial image. While Eq. (13) approximates the resolution limit for a single wavelength stepper or an achromatic lens design, the stepper resolution at larger numerical apertures is limited by the spectral bandwidth of the source rather than the lens numerical aperture.[66] The depth of focus is limited by both the laser spectral bandwidth and the lens NA. In particular, the exposure latitude for a given depth of focus is continuously decreased as the spectral bandwidth is increased. Spherical aberration has both longitudinal and transverse properties and is the only monochromatic aberration affecting axial images. Rays passing through the spherical surface at different radial points come to different focus points. Spherical aberration is independent of field angle so it affects all points at the same radial distance in the field similarly.[67] This gives a blur circle centered around the ideal image point, where the size of the blur circle is the same everywhere in the field, but grows with the third and fifth power of the aperture (sin i).[68] Spherical aberration would produce an effect with the same symmetries as defocus and is partially compensated by and difficult to distinguish from defocus.[69] Some distinguishing characteristics are: the outer zones of the lens carry more energy per unit change in y and diffraction effects can become large with respect to the residual geometric aberration.[67] For the special case of parallel incident light, Fig. 37 (a) shows the paraxial focal point F´ and the focal points A, B, and C for zones of increasing image height, h. 
Figure 37 (b) illustrates the difference between longitudinal spherical aberration and lateral spherical aberration for the object point M on axis and its paraxial image point M´. The image distance for an oblique ray traversing the lens at an image height h from the axis is sh´, and sp´ is the image distance for paraxial rays. Whereas spherical aberration is a difference in axial location of the image for different radial zones of the lens, coma appears as a difference in magnification for different parts of the lens.[59][67] Coma causes an asymmetric aberration pattern with a size proportional to the distance from the object point to the axis, and the square of the aperture.[68] Coma derives its name from the comet tail flare apparent in the well-focused image of a symmetric object point located just off the lens axis (see Fig. 38). In practice, asymmetries show up more clearly in the defocused image, especially as the off-axis angle is increased. The direction of the influence changes with
radial position in the field.[69] Sources of the aberration can be the lens design, poor centering of the elements during assembly, asymmetrical polishing of surfaces, or nonuniform distribution of the refractive index in a lens component.[57] As with astigmatism, the tangential coma is three times larger than the sagittal coma.[67] Coma is described easily by monitoring the phase transfer function since it produces a lateral shift of the image, which is a measure of the asymmetry in the line spread function.[65][69] A lens system free of both spherical aberration and coma is said to be aplanatic.
Figure 37. Lateral and longitudinal spherical aberration. (Reprinted with permission of Ref. 59.)
Figure 38. The tangential fan of rays for a lens with coma. The coma shown is negative since the two rays through the margin coming together at B´ are of lower magnification relative to central rays. (Reprinted with permission of Ref. 59.)
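The zonal focus shift of Fig. 37 can be reproduced with an exact ray trace. The sketch below is a deliberate simplification: it traces rays parallel to the axis through a single convex spherical surface (air to glass) rather than a complete lens, and the radius and index are assumed values. It shows that outer zones focus closer to the surface and that the blur radius in the paraxial focal plane grows roughly with the third power of the zone height.

```python
import math

def axial_crossing(h, radius=50.0, n=1.5):
    """Exact trace of a ray parallel to the axis at height h through one
    convex spherical surface (air to glass); returns the distance from the
    surface vertex to the point where the refracted ray crosses the axis."""
    i = math.asin(h / radius)              # angle of incidence
    i_prime = math.asin(math.sin(i) / n)   # Snell's law
    u_prime = i - i_prime                  # slope angle of the refracted ray
    sag = radius * (1.0 - math.cos(i))     # axial position of the hit point
    return sag + h / math.tan(u_prime)

def paraxial_focus(radius=50.0, n=1.5):
    """Small-angle limit of axial_crossing: n R / (n - 1)."""
    return n * radius / (n - 1.0)

def transverse_aberration(h, radius=50.0, n=1.5):
    """Ray height in the paraxial focal plane (blur radius for zone h)."""
    i = math.asin(h / radius)
    u_prime = i - math.asin(math.sin(i) / n)
    longitudinal = paraxial_focus(radius, n) - axial_crossing(h, radius, n)
    return longitudinal * math.tan(u_prime)

for h in (1.0, 2.0, 5.0, 10.0):
    print(h, axial_crossing(h), transverse_aberration(h))
```

Doubling a small zone height multiplies the transverse aberration by approximately eight, consistent with the third-power dependence quoted in the text; the fifth-power contribution appears only at larger heights.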
The size of the aberration pattern for astigmatism and field curvature is proportional to the square of the distance from the object point to the axis and the first power of the aperture.[68] Field curvature is the curvature of best focus planes along different radial positions (from on-axis to the field edge) for sagittal and tangential features. Field curvature is illustrated in Fig. 39, where the best imagery will occur at the nominal focus values corresponding to the zone of least confusion, ΣLC. The degree of coherence of illumination changes the field curvature.[64] At each radial position, sagittal (ΣS) and tangential (ΣT) features with astigmatism come to best focus at different focal planes. The defocus separation between sagittal and tangential peak contrast determines the magnitude of astigmatism. Effects are radial in nature with astigmatism, where the direction of the influence (i.e., whether the maximum of the image contrast for the sagittal features is at a focus plane closer to, or farther from, the lens than the tangential features) will change with position in the field[69] and the magnitude of astigmatism can change with radial position. Astigmatism has a symmetrical optical path difference function. Astigmatism and field curvature reduce the usable depth of focus of the lens, since only the intersection of the resolution-defocus space of the tangential and sagittal features across the entire imaged field is available.
Figure 39. An example of field curvature.
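The statement that only the intersection of the sagittal and tangential focus ranges is usable can be made concrete with interval arithmetic. In this sketch the best-focus offsets and the per-feature depth of focus are hypothetical numbers chosen for illustration, and a fixed focus tolerance is assumed for every feature.

```python
def common_focus_window(best_focus_um, half_dof_um):
    """Intersect the focus intervals [f - half_dof, f + half_dof] of every
    feature orientation and field position; returns (low, high) or None."""
    low = max(f - half_dof_um for f in best_focus_um)
    high = min(f + half_dof_um for f in best_focus_um)
    return (low, high) if low <= high else None

def astigmatism(sagittal, tangential):
    """Sagittal-to-tangential best-focus separation at each field radius."""
    return [s - t for s, t in zip(sagittal, tangential)]

# Hypothetical best-focus offsets (micrometers) at three field radii,
# from the axis to the field edge.
SAGITTAL = [0.00, 0.10, 0.25]
TANGENTIAL = [0.00, -0.15, -0.40]

WINDOW = common_focus_window(SAGITTAL + TANGENTIAL, half_dof_um=0.75)
print("astigmatism by field radius:", astigmatism(SAGITTAL, TANGENTIAL))
print("usable focus window:", WINDOW)
```

For these numbers each feature alone tolerates a 1.5 µm focus range, but the window common to all features and field positions is only 0.85 µm wide.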
The surfaces of the individual elements in a lens assembly are ground to λ/100 of the desired sphericity. Cylindrical lenses have different focal lengths across each axis. Any cylindricity of these element surfaces exacerbates field curvature and astigmatism. Correction of field curvature can require additional compensating lens elements, while astigmatism can be corrected by grinding to tolerances finer than λ/20.[71]
Distortion does not affect the image quality. It is a radially symmetric error that produces stigmatic images in which the location of the image points is in error proportional to the third and fifth power of the distance from the object point to the axis.[68] Lens designs either balance the third and fifth order distortion effects to minimize the composite error or the fifth order distortion is eliminated and the third order distortion is minimized.[72]
The location of the image is measured by finding the centroid of the line spread function at a number of field positions relative to the object point source.[65] The line spread function can be obtained by monitoring the change in intensity of an aerial diffraction image being scanned with a knife edge. The line spread function is the first derivative of the edge trace. The Fourier transform of the line spread function gives the OTF.
Although no one set of orthogonal polynomials can be declared best for the calculation of the OTF from a set of interpolation points,[5] Zernike polynomials[62] are preferred for representation of the optical wave front in the final phase of the optical design[73] and in the analysis of the interferometric test data. Zernike polynomials can be expressed as the product of two functions, one depending only on the radial coordinate and the other depending only on the angular coordinate:[57][62] Eq. (58)
$Z_n^l = R_n^l(\rho)\, e^{il\theta}$
where n is the radial degree of the polynomial, and l is the angular dependence parameter. The coordinate ρ is the normalized radial distance, and θ is the angle from the y axis. The radial polynomials Rnl(ρ) are functions of ρ alone. Zernike polynomials are a set of complete orthogonal polynomials defined on a unit circle with several attractive properties.[62] They are related to the classical understanding of geometrical aberrations developed by Ludwig von Seidel in terms of rays instead of wave fronts. Not only can the wave front be represented faithfully, but the polynomials are able to give the shape of the wave front relative to the reference sphere everywhere over the exit pupil. The mathematical form of the Zernike polynomial is preserved when a rotation with pivot at the center of the circle is applied to the
wave front function.[57] Expansion of the wave front in terms of Zernike polynomials eases the balancing of aberrations of different orders against each other in order to obtain the maximum Strehl intensity.[57] In order to calculate the OTF, knowledge is needed of the wave front aberration. However, particular properties of the optical system can be obtained directly from the wave front aberration coefficients without going to the much greater labor of computing the OTF.[68][73] During final assembly of a reduction lens, minimization of aberrations is obtained by rotating, translating, or tilting selected elements in an iterative process with testing feedback. The unit magnification Wynne-Dyson[74]–[77] system has perfect imagery on-axis and for off-axis points in the sagittal plane. By using a meniscus lens coupled to a material of lower refractive index, correction can be obtained to a higher order and for greater field sizes than with a single component. Practical considerations require an air gap between the mask and the first lens surface. This is achieved by introducing a third component of higher refractive index which introduces aberrations in the opposite sense to those of the air gap. The object and image focal planes are separated by means of folding prisms that reduce the available field to something less than half due to vignetting, or the blocking of some rays. The Wynne-Dyson has advantages in microlithography since the symmetry of its design corrects all odd order Seidel aberrations (i.e., coma, distortion, and lateral chromatic aberration). This makes it possible to achieve excellent correction over a large field with a quarter of the elements of a dioptric lens design. The design is telecentric on both the image and object side. A stepper is image side telecentric if there is no change in magnification as the wafer plane is defocused. This means the exit pupil is at infinity. In Fig.
40, the chief ray for a properly focused condenser (C) intersects the optical axis at the entrance pupil (EP) of the lens (L) and exits approximately parallel to the optical axis, for an image side telecentric lens. A ray originating from a defocused condenser intersects the optical axis either in front of or behind the entrance pupil of the lens for positive and negative condenser defocus, respectively. As this ray exits the lens, it is either converging on the optical axis (positive condenser defocus) or diverging from the optical axis (negative condenser defocus). As the wafer (W) focus is varied, the reticle (R) object field CE is magnified or demagnified, affecting the wafer image size C´E´.[78] Telecentric error is inversely proportional to the square of the focal length of the lens. The introduction of aspherical optical elements in lithography equipment brings a relatively dramatic improvement in imaging quality. Higher
order aberrations become increasingly significant with larger numerical apertures. Aspherical elements permit a high level of aberration correction at the cost of increased manufacturing and testing complexity.
Figure 40. (a) Positive condenser defocus and (b) negative condenser defocus. (Reprinted with permission of Ref. 78.)
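The practical value of orthogonality in the Zernike representation of Eq. (58) can be sketched numerically. The example below uses the real-valued (cosine and sine) form of the polynomials rather than the complex exponential of Eq. (58), computes each coefficient by projection over the unit circle with a simple midpoint rule, and uses a hypothetical test wavefront; all function names are illustrative and do not correspond to any particular interferometer software.

```python
import math

def radial(n, l, rho):
    """Zernike radial polynomial R_n^l(rho), for n >= |l| and n - |l| even."""
    l = abs(l)
    total = 0.0
    for k in range((n - l) // 2 + 1):
        coeff = ((-1) ** k * math.factorial(n - k)
                 / (math.factorial(k)
                    * math.factorial((n + l) // 2 - k)
                    * math.factorial((n - l) // 2 - k)))
        total += coeff * rho ** (n - 2 * k)
    return total

def zernike(n, l, rho, theta):
    """Real-valued form: cosine terms for l >= 0, sine terms for l < 0."""
    angular = math.cos(l * theta) if l >= 0 else math.sin(-l * theta)
    return radial(n, l, rho) * angular

def inner_product(f, g, n_rho=100, n_theta=36):
    """Midpoint-rule integral of f*g over the unit circle."""
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) / n_rho
        for j in range(n_theta):
            theta = 2.0 * math.pi * (j + 0.5) / n_theta
            total += f(rho, theta) * g(rho, theta) * rho
    return total * (1.0 / n_rho) * (2.0 * math.pi / n_theta)

def fit(wavefront, modes):
    """Project the wavefront on each mode; orthogonality makes every
    coefficient independent of which other modes are included."""
    coeffs = {}
    for n, l in modes:
        z = lambda rho, theta, n=n, l=l: zernike(n, l, rho, theta)
        coeffs[(n, l)] = inner_product(wavefront, z) / inner_product(z, z)
    return coeffs

# Hypothetical test wavefront (in waves): defocus, coma, and spherical terms.
TRUE = {(2, 0): 0.30, (3, 1): -0.05, (4, 0): 0.02}

def wavefront(rho, theta):
    return sum(c * zernike(n, l, rho, theta) for (n, l), c in TRUE.items())

COEFFS = fit(wavefront, [(0, 0), (1, 1), (2, 0), (2, 2), (3, 1), (4, 0)])

def residual(rho, theta, remove=(2, 0)):
    """Wavefront with one fitted term (defocus by default) removed."""
    return wavefront(rho, theta) - COEFFS[remove] * zernike(*remove, rho, theta)
```

The fitted coefficients recover the assumed values to within the quadrature error, and `residual` shows how a single term such as defocus can be subtracted to leave the remaining aberrations.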
8.3
Aerial Image Intensity Distribution
The aerial image of the stepper can be simulated or empirically measured. The methods are complementary and each has unique advantages. Photoresist is sensitive to light intensity and not its amplitude. Generally, linewidths are evaluated from aerial images at the 25% normalized intensity contour since imaging intensities for nominal critical dimensions are usually 25–30% that of large clear areas. For completeness, simulations should provide developed images. For simplicity and computational speed, aerial images by themselves can be evaluated. The aerial image creates a latent image in the resist defined by a chemical distribution of photoproducts with concentration m. Exposure and develop latitude depends on the quality of the latent image. The latent image has a slope due to light absorption. The gradient of the latent image, ∂m/∂x, is not proportional to the slope of the aerial image [i.e., ∂I/∂x (µm⁻¹)], but to the slope of the log-aerial image [i.e., ∂ lnI/∂x (µm⁻¹)].[79] A larger gradient
means a larger process latitude. For a given aerial image, the slope of the latent image is a function of the exposure dose, implying there is a particular dose to optimize the latent image. Process latitude is a function of the latent image slope so varying the exposure will vary the process latitude.[80] The maximum depth of focus can be found from the log-image slope versus defocus curve at the point where the log-slope goes to zero.[81] Aerial Image and Resist Profile Simulation. Instead of evaluation by MTF for partially coherent illumination, for any specific pattern the intensity distribution of the aerial image can be computed,[24] as illustrated in Fig. 41. For partially coherent illumination, a quantity S can thus be computed that is comparable to the MTF in the case of incoherent illumination as a measure of the recordability of a particular pattern. The value of S is given by: Eq. (59)
$S = \dfrac{\Delta I}{2 r I_r}$
where Ir is the intensity at which the width of the aerial image is equal to the scaled linewidth of the object. ∆I/Ir is the fractional intensity change corresponding to the fractional dimensional change. Here, r is the fractional recorded linewidth variation. It should be noted that although the quantity S is similar to the MTF as a measure of recordability (since it is the slope of the intensity variation of the aerial image at the recording point), its value can be greater than unity since with partial coherence the intensity variation curve is generally not a sine wave. The MTF measures both slope and contrast, whereas the quantity S measures only slope. Defocus will degrade the image contrast.
Figure 41. Aerial image intensity distribution of the image of a square wave object with partially coherent illumination. (Reprinted with permission of Ref. 24.)
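The log-slope of the aerial image discussed earlier in this section is straightforward to evaluate numerically. The sketch below assumes an idealized sinusoidal aerial image for an equal line/space pattern, which is a simplification since partially coherent images are generally not sinusoidal, and it multiplies the log-slope by the linewidth to give a dimensionless figure; that normalization is a convenience introduced here, not a quantity defined in the text.

```python
import math

def aerial_image(x, pitch, contrast):
    """Sinusoidal aerial image normalized to a mean intensity of 1;
    the bright space is centered at x = 0."""
    return 1.0 + contrast * math.cos(2.0 * math.pi * x / pitch)

def log_slope(x, pitch, contrast, dx=1e-6):
    """Central-difference estimate of d(ln I)/dx."""
    upper = math.log(aerial_image(x + dx, pitch, contrast))
    lower = math.log(aerial_image(x - dx, pitch, contrast))
    return (upper - lower) / (2.0 * dx)

def normalized_log_slope(pitch, contrast):
    """Log-slope magnitude at the nominal edge of an equal line/space
    pattern, multiplied by the linewidth (pitch / 2)."""
    edge = pitch / 4.0
    return abs(log_slope(edge, pitch, contrast)) * (pitch / 2.0)

for contrast in (0.4, 0.6, 0.8, 1.0):
    print(contrast, normalized_log_slope(1.0, contrast))
```

For this idealized image the normalized log-slope equals π times the image contrast, so any loss of contrast (for example, from defocus) lowers the latent image gradient in direct proportion.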
The intensity distribution on the image plane can be calculated with algorithms including the effects of defocus aberration and partial coherence in the illumination.[82]–[84] For simulations of high (≥0.35) numerical aperture lenses or thick (≥1 micron) resist films, it is important to include non-normal directions of propagation of the rays in the resist and thin film layers to account for asymmetries in the image about the nominal focal plane.[12][14][85]–[87] The latent image can be determined with information about the photoresist optical response parameters, which characterize the behavior with exposure,[88][89] and post exposure bake activity, which includes diffusion and crosslinking activities.[87][89][90] A resist development simulator accepts the latent image input file to model the developed image. For diazonaphthoquinone novolak resists, a rate function is used to define the resist dissolution.[91]–[95] Different dissolution rate models have been proposed for chemically amplified or acid catalyzed resists.[89][95][96] Alternatively, stochastic cell models have been proposed[97][98] to account for local chemical activity which cannot be extracted from bulk dissolution data. The empirical methods of dissolution data collection, model parameter extraction, and application to simulations are critical to produce quantitative simulations. These models have significance to the lens designer and process engineer. Lens designs can be accurately simulated including aberrations or the imagery of lens assemblies modeled using Zernike coefficients obtained from OTF testing. Process engineers can simulate the imagery for all their fab’s steppers using individual stepper OTF data and the candidate resist parameters.[99] There are numerous simulators available commercially. For example, the intensity distribution, I(x,y), over the image plane can be calculated by Hopkins’ theory of partially coherent imaging:[84]
Eq. (60)
$I(x,y) = \int\!\!\int\!\!\int\!\!\int_{-\infty}^{+\infty} T(f', g'; f'', g'')\, F(f', g')\, F^{*}(f'', g'')\, \exp\{-2\pi i[(f' - f'')x + (g' - g'')y]\}\, df'\, dg'\, df''\, dg''$
where x and y are related to the geometrical coordinates of the object or mask, the spatial frequency pairs (f´, g´) and (f´´, g´´) represent two sets of diffracted plane waves that are interacting with some degree of coherence, and F(f,g) is the Fourier transform of the object or mask transmittance function F(x,y), where * indicates complex conjugation:
Eq. (61)
$F(f,g) = \int\!\!\int_{-\infty}^{+\infty} F(x,y)\, \exp[-2\pi i(fx + gy)]\, dx\, dy$
There may be a coefficient before the integrals in Eq. (60) acting as a normalization constant so that the resulting intensity equals unity for flood exposure. The function T(f´, g´; f´´, g´´) is the transmission cross-coefficient of the imaging system in the spatial frequency domain. The transmission cross-coefficient includes information describing the effective source and the pupil function of the projection lens. The transmission cross-coefficient can be expressed as:[69] Eq. (62)
$T(f', g'; f'', g'') = \int\!\!\int_{-\infty}^{+\infty} J_0(f,g)\, K(f + f', g + g')\, K^{*}(f + f'', g + g'')\, df\, dg$
where J0(f,g) represents the illumination cone and its value is a constant within a radius proportional to the partial coherence. K(f,g) is the objective pupil function, and is given by: Eq. (63)
$K(f,g) = \exp[-i 2\pi \Phi(f,g)/\lambda], \qquad f^2 + g^2 < 1$
where Φ(f,g) is the wave aberration function which can be expressed as a simple power series in f and g. The objective pupil function also can be derived to avoid making the paraxial approximation, extending the utility of scalar imaging up to lens numerical aperture values of 0.6–0.7.[53] Except for chromatic aberration, the lens aberrations can be described by expanding its wavefront aberration in a power series as: Eq. (64)
$\Phi(f,g) = \sum C_{lmn}\, (x^2 + y^2)^l\, (xf + yg)^m\, (f^2 + g^2)^n$
where x and y refer to the object locations in the field relative to the lens center; f and g are the normalized polar pupil coordinates; l, m, and n are three integers that describe the order of the aberrations (i.e., 3rd order, 5th order, etc.), and Clmn is a constant that determines the magnitude of the aberrations. The third order Seidel aberrations are described as: spherical (l = 0, m = 0, n = 2), coma (0, 1, 1), astigmatism (0, 2, 0), curvature (1, 0, 1), and distortion (1, 1, 0).
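The structure of Eqs. (60) through (63) can be sketched in one dimension for a periodic line/space mask, where the integrals over the mask spectrum reduce to sums over discrete diffraction orders. The sketch rests on several assumptions that are not part of the text: spatial frequencies are expressed in units of NA/λ so that the pupil edge is at 1, the source is uniform with half-width equal to the partial coherence, the pupil phase is supplied in units of wavelength, and the transmission cross-coefficient is evaluated by a simple midpoint rule. All function names are illustrative.

```python
import cmath
import math

def pupil(f, phase_waves=lambda f: 0.0):
    """Eq. (63) in one dimension: unit-modulus phase inside |f| < 1, zero
    outside. phase_waves(f) is the wave aberration in units of wavelength."""
    if abs(f) >= 1.0:
        return 0j
    return cmath.exp(-2j * math.pi * phase_waves(f))

def tcc(f1, f2, sigma, phase_waves=lambda f: 0.0, samples=400):
    """Eq. (62) in one dimension with a uniform source of half-width sigma,
    normalized so that tcc(0, 0) = 1 for an unaberrated pupil."""
    total = 0j
    for k in range(samples):
        f = -sigma + (k + 0.5) * 2.0 * sigma / samples
        total += pupil(f + f1, phase_waves) * pupil(f + f2, phase_waves).conjugate()
    return total / samples

def grating_orders(pitch, clear_width, max_order):
    """Fourier coefficients of a binary line/space mask, space centered at x = 0."""
    duty = clear_width / pitch
    orders = {0: duty}
    for n in range(1, max_order + 1):
        orders[n] = orders[-n] = math.sin(math.pi * n * duty) / (math.pi * n)
    return orders

def intensity(x, pitch, clear_width, sigma, phase_waves=lambda f: 0.0):
    """Eq. (60) reduced to a double sum over the discrete diffraction orders.
    Lengths are in units of wavelength / NA."""
    max_order = int((1.0 + sigma) * pitch)   # higher orders miss the pupil
    a = grating_orders(pitch, clear_width, max_order)
    total = 0j
    for m, am in a.items():
        for n, an in a.items():
            t = tcc(m / pitch, n / pitch, sigma, phase_waves)
            # The mask coefficients are real here, so F* = F.
            total += t * am * an * cmath.exp(-2j * math.pi * (m - n) * x / pitch)
    return total.real

# Equal lines and spaces, pitch of 2 wavelength/NA, partial coherence 0.5.
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, intensity(x, pitch=2.0, clear_width=1.0, sigma=0.5))
```

With this normalization a fully clear mask gives unit intensity, matching the flood exposure normalization mentioned above. A third order spherical term of Eq. (64) (l = 0, m = 0, n = 2) could be supplied as, for example, `phase_waves=lambda f: 0.05 * f ** 4`, where the 0.05 wave magnitude is an arbitrary illustrative value.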
The object or mask/reticle lies in the x,y plane and the z-axis is parallel to or along the conjugate length. All of the advanced optical techniques trying to image at or below the Rayleigh resolution limit try to manipulate one or more of the functions for the effective source (for example, by varying the shape of the source and the pupil filling factor), the complex amplitude transmissivity distribution of the object (for example, by light interference or phase shifting at the mask), or the projection lens pupil function (for example, by insertion of a spatial filter). For numerical apertures ≥0.35, paraxial imaging and thin lens approximations introduce relatively significant errors into aerial image calculations. The simple monotonic scaling for resolution and depth of focus introduced by Eqs. (13) and (46) becomes increasingly invalid. Exact scalar versus paraxial approximation calculations of the aerial image indicate increasing disparity in light intensity predictions with increasing numerical aperture and decreasing object to image magnification ratio, although some of this error is removed due to the conventional optimization of actual lenses to aplanatic surfaces.[53] The scalar model predicts total transmitted power will vary in direct proportion to the width of the mask slit opening, and naturally does not exhibit any polarization effects.[100] Vector diffraction models supplant scalar theory for numerical apertures ≥0.60.[101] Vector image models account for the angular dependence of radiation propagation within the resist film and interference coupling into the film. The vector amplitudes of the waves that interfere to form the image will become significantly non-parallel at high numerical apertures, preventing complete interference and reducing contrast.
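The loss of contrast from non-parallel vector amplitudes can be illustrated with the simplest case of two equal plane waves. The sketch below is an idealization under stated assumptions: the two beams travel at plus and minus the same angle to the optical axis, the calculation is done in air (inside the resist the angles are reduced by refraction), and the TE/TM labels and function name are introduced here for illustration.

```python
import math

def fringe_contrast(theta_deg, polarization):
    """Contrast of the interference of two equal plane waves travelling at
    +theta and -theta to the optical axis (z), in the x-z plane."""
    t = math.radians(theta_deg)
    if polarization == "TE":          # electric field along y for both beams
        e1 = e2 = (0.0, 1.0, 0.0)
    elif polarization == "TM":        # electric field in the x-z plane
        e1 = (math.cos(t), 0.0, -math.sin(t))
        e2 = (math.cos(t), 0.0, math.sin(t))
    else:
        raise ValueError("polarization must be 'TE' or 'TM'")
    overlap = sum(a * b for a, b in zip(e1, e2))
    # I = |E1|^2 + |E2|^2 + 2 (e1 . e2) cos(phase), so contrast = |e1 . e2|
    return abs(overlap)

for na in (0.35, 0.5, 0.6, 0.7):
    theta = math.degrees(math.asin(na))
    print(na, fringe_contrast(theta, "TE"), fringe_contrast(theta, "TM"))
```

For beams at the edge of a 0.6 NA pupil the TM fringe contrast falls to 0.28 while the TE contrast remains unity, which a scalar model cannot reproduce.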
At high numerical apertures, there is a significant variation of the polarization vector across the exit pupil, and the compound angles involved in 3D imaging cause x, y, and z vector amplitudes to be generated in the resist. Also, there is a relatively large difference in x and y polarization with monochromaticity (versus 14 nm width broadband illumination) that becomes important for quarter micron imaging.[102] Empirical Testing. Direct aerial image measurement can be done in real time. In one method, the fluorescent intensity from a scanning wafer of special construction is detected by an off-axis photometer versus the grating position and will reproduce the actual image intensity profile.[51][52] In another method, a sensor array using photodiodes is formed by printing an array of 0.2 micron pinholes on a chromium coated quartz wafer and etching the thin chromium layer.[103] A two-dimensional intensity mapping is achieved by moving the stepper stage with the image sensor across the aerial image produced by a reticle with a matching pattern. Defocus will degrade the slope of the aerial image.
The experimentally measured contrast can be degraded from ideal by many factors, including light scattered within the lens assembly during transmission (flare), residual aberrations from both design and manufacturing, chromatic aberration caused by inadequate source filtering, and wafer plane defocus. Scattered light commonly encountered in optical systems is the result of small, very rapid wavefront disturbances.[104] Reduction of incident illumination reflections is attained by proper optimization of the lens configuration and geometry.[105] Antireflection coatings on lens elements help increase image brightness and eliminate scattering off surfaces of elements. Antireflection coatings on the mask or reticle and wafer surfaces also improve the image contrast. The internal lens barrel can be painted flat black to reduce reflections. Internal knife-edge stops lining the lens barrel can be used to eliminate internal low-angle reflections that no paint alone can stop. Selection of the narrow band filter involves tradeoffs between image contrast and illumination intensity (see Fig. 42). As the quasimonochromatic actinic light is narrowed, exposure times increase and standing wave and thin film interference effects are exacerbated. The importance of image contrast is related directly to linewidth (L) control,[26] as shown in Fig. 43. Image contrast above 90% may be required for submicron lithography. The familiar criterion of 60% MTF for good imaging corresponds to an image contrast of 95% at a partial coherence value of 0.7. Operation at lower image contrast values will require better process control. Image contrasts as low as 60–70% may be useful for volume manufacturing.[106] For example, multilayer resist (MLR) lithography is capable of printing with an image contrast of 70% at s = 0.5. A contrast enhancement material (CEM) with MLR can sustain 50% image contrast at this partial coherence value.[107] Top surface imaging can support 75% image contrast.
8.4
Shaped Illumination Sources and Spatial Filtering
For conventional illumination, light is collected and sent through a uniformer. The condenser aperture then defines the uniform cone of light partially filling the entrance pupil of the objective or projection lens assembly. An iris would permit the pupil filling to be variable. Light diffracts at the object or mask/reticle plane. The 0th order diffracted light travels along the optical axis. The ±1 orders are diffracted by angles according to Eq. (36) and then they are received by the objective or projection lens. The distribution of illumination light on the Fourier transform plane falls within a circular area. The 0th order and higher diffracted orders collected by the lens interfere at
the wafer plane and form the image of the mask or reticle. For 1:1 periodic patterns with linewidths of k1 = 0.5, the image contrast is 70% for defocus of k2 = 0.45 and 60% for k2 = 0.70. For linewidths of k1 = 0.6, the image contrast is 70% for defocus of k2 = 0.70 and 60% for k2 = 0.85.
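The k1 and k2 values quoted above can be converted to physical dimensions for a particular stepper. The sketch assumes that Eqs. (13) and (46), which are not reproduced in this section, have the usual Rayleigh forms (linewidth = k1 λ/NA and defocus = k2 λ/NA²); the wavelength and numerical aperture are hypothetical example values, and whether k2 denotes a total or a one-sided focus range is not settled by this sketch.

```python
def linewidth_um(k1, wavelength_um, na):
    """Rayleigh-type resolution scaling, assumed form of Eq. (13)."""
    return k1 * wavelength_um / na

def defocus_um(k2, wavelength_um, na):
    """Rayleigh-type depth-of-focus scaling, assumed form of Eq. (46)."""
    return k2 * wavelength_um / na ** 2

# Contrast figures quoted in the text for conventional illumination:
# (k1, k2, image contrast)
QUOTED = [(0.5, 0.45, 0.70), (0.5, 0.70, 0.60),
          (0.6, 0.70, 0.70), (0.6, 0.85, 0.60)]

def to_physical(quoted, wavelength_um, na):
    """Convert each (k1, k2, contrast) row to (linewidth, defocus, contrast)."""
    return [(linewidth_um(k1, wavelength_um, na),
             defocus_um(k2, wavelength_um, na), c)
            for k1, k2, c in quoted]

# Hypothetical i-line stepper with NA = 0.5.
ROWS = to_physical(QUOTED, wavelength_um=0.365, na=0.5)
for width, defocus, contrast in ROWS:
    print(f"{width:.3f} um lines: {contrast:.0%} contrast at {defocus:.3f} um defocus")
```

Under these assumptions, k1 = 0.5 corresponds to 0.365 µm lines, and the 70% contrast point lies at 0.657 µm of defocus.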
Figure 42. Illumination bandwidth effect on contrast. (Reprinted with permission of Ref. 26.)
Figure 43. Effect of contrast on linewidth control. (Reprinted with permission of Ref. 26.)
It has been proposed that a nonuniform source (i.e., a nonuniform distribution across the illuminator aperture) such as an annulus (where the center is obscured) or a four spot Gaussian distribution of the condenser aperture illumination can improve imaging. Discretizing the illumination cone of conventional illumination indicates zonal preferences for particularly oriented mask patterns.[108] The symmetry of an annulus relative to the optical axis makes imaging independent of pattern direction since information from each of these zones is included with some background or nondiscriminatory light. An axially symmetric source, such as a unidirectional slit or two point sources that are polarized along an axis about the optical center, provides specific improvement for object features whose length is perpendicularly oriented to the source shape. Another nonuniform source divides the conventional cone into quadrants. In each quadrant there is a small circular region intercepting the annular region where both x and y direction patterns will image well. Therefore, this type of source has four Gaussian spots oriented at 1:30, 4:30, 7:30 and 10:30 o’clock. Orientation of the four spots at 3:00, 6:00, 9:00, and 12:00 o’clock changes the transmission cross-coefficients[84] versus spatial frequency, with some image degradation. Imaging improvement is thought to result by interaction of diffracted orders so any MTF falloff at the field edge[21][71] can be compensated by a local intensity difference. This is based on the hypothesis that high frequency information generated by the annular area will be gained at the expense of the fundamental order from the center. However, simulations have shown[109] that a source with a central obscuration reduces the image contrast in all cases (i.e., for all radii of obscuration). The best fine line imaging for contrast greater than 80% is obtained with a uniform source.
Evidently, the loss of the zero-order component is more critical to the image than any higher-order information that might have been obtained. However, for advanced resist processing where MTF values between 0.4 and 0.55 provide useful imaging, annular illumination offers flatter frequency response, much higher exposure latitude for resolution of high spatial frequency patterns, and improved image quality with defocus compared to conventional Köhler illumination with a uniform circular source.[110][111] The benefits in resolution and depth of focus for periodic patterns using annular illumination occur at low image light intensity contrast. Therefore, high contrast resist processing (for example, ~0.5 µm thick resist films or surface sensitive imaging, etc.) must be used.[112] The exposure latitude for fine periodic patterns can be increased by 10% and the depth of
focus improved by 40% relative to conventional illumination. For 1:1 periodic patterns with linewidths of k1 = 0.5, the image contrast is 70% for defocus of k2 = 0.70 and 60% for k2 = 1.00. For linewidths of k1 = 0.6, the image contrast is 70% for defocus of k2 = 0.95 and 60% for k2 = 1.20. Annular illumination is marginal for improving isolated line imaging. For defocused line/space (L/S) pairs, there are outer line proximity effects and corner rounding at the end of lines. Following Eq. (34), two pupil filling factors (si and so) can define the inner and outer radii of annular illumination, respectively. The illumination angle for si subtends the optical axis to the radius of the central obscuration. This radial point starts the annulus. The illumination angle for so subtends the optical axis to the outside radius of the annulus. The optimum annular source has si = 0.7so for so = 0.65. An early model step-and-scan had an annular field optical system, where the imagery was corrected for a narrow annulus centered on the axis of the optical system so that all points in the corrected ring were at substantially the same distance from the optical axis.[24][113][114] The instantaneous exposure area looked like a shallow arc. Across the length of the arc the illumination was Köhler, while across the small width of the annular arc the illumination was critical. The effect of scanning was to take the average of the images at different field points across the slit width leading to increased depth of focus. By contrast, an all-refractive, or dioptric, lens design must satisfy the variations of aberrations with field angle. The benefits of annular illumination can be extended further with the simultaneous use of a complementary conjugate spatial filter.[115][116] The annular source is defined at the entrance pupil of the illuminator.
The objective lens forms the Fourier transform on the focal plane conjugate to the plane of the source (i.e., at the entrance pupil of the objective lens). This means the image of the source, diffracted by the mask to form –1, 0, and +1 orders with annular shapes partially overlapping in space, forms at this conjugate plane. A filter inserted at this conjugate plane would be known as a conjugate filter (i.e., the conjugate position to the annular source). A filter designed to interact with some or all of a particular order of the light would be known as a spatial filter. It is hypothesized that reduction of the low spatial frequency components of the image intensity distribution relatively emphasizes the high spatial frequency components and improves the image contrast. In this application, the zero order light will be reduced with an annular shaped amplitude filter. The amplitude spectrum is modified but the phase spectrum is unaltered.[117] The filter blocks most of the 0th order plus a small amount of intersecting 1st orders. The zero order light will not be
eliminated completely or missing image patterns will result. The spatial filter has 30–50% less area than the annulus and must be optimized for each feature shape/pattern. The image contrast is 40–50% for features slightly smaller than the wavelength of the illumination light. This is about 150–200% greater compared to a conventional uniform source. The intensity is about 50–70% of normal. The linearity of feature sizes (i.e., imaged versus on mask) can be a problem and needs to be characterized. There are lens assembly design issues, too. A spatial filter is an optical element, so its centering tolerance and effect on aberrations must be considered. Also, light absorption by the filter leads to magnification and nominal focal plane changes, and light scattering by the filter leads to image flare.

Spatial filtering with conventional illumination can be used to improve imaging quality by performing light amplitude superposition of multiple images along the light axis.[118] The amplitude transmission coefficient of a filter is selected with knowledge of the light’s incident and transmitted angles and material indices of refraction. While a clear aperture has an amplitude transmission of unity everywhere, the value for an amplitude filter will change with pupil radius. Obviously, a generally applicable spatial filter represents a compromise design, since the Fourier transform of the mask is modified after transmission through the amplitude filter, changing the spatial frequency composition of the images. Image superposition occurs at zero and ±β defocus intervals. The resolution limit can be extended with this method since the full width at half-maximum of the filtered intensity profile at 40% contrast can be 20% narrower compared with conventional imaging. For a 0.6 value of the resolution coefficient (k1) of Eq. (13), the depth of focus coefficient (k2) of Eq. (46) is about 0.4 with filtering at 60% contrast compared with conventional imaging’s k2 = 0.25 at 80% contrast.
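To give the coefficients quoted above a physical scale, the short sketch below converts k1 and k2 into a linewidth and a defocus range. It assumes the Rayleigh forms linewidth = k1·λ/NA and depth of focus = k2·λ/NA², which are taken here to be the content of Eqs. (13) and (46); the 248 nm wavelength and NA of 0.5 are hypothetical values chosen only for illustration.

```python
def rayleigh_scale(k1, k2, wavelength_nm, na):
    """Convert Rayleigh coefficients into a linewidth and a defocus, both in nm."""
    linewidth_nm = k1 * wavelength_nm / na        # resolution: k1 * lambda / NA
    defocus_nm = k2 * wavelength_nm / na ** 2     # depth of focus: k2 * lambda / NA^2 (assumed form)
    return linewidth_nm, defocus_nm

# Hypothetical exposure tool (not from the text): 248 nm, NA = 0.5.
WAVELENGTH_NM = 248.0
NA = 0.5

# k1 = 0.6 with filtering (k2 = 0.4) versus conventional imaging (k2 = 0.25).
linewidth, dof_filtered = rayleigh_scale(0.6, 0.4, WAVELENGTH_NM, NA)
_, dof_conventional = rayleigh_scale(0.6, 0.25, WAVELENGTH_NM, NA)

print(f"linewidth: {linewidth:.1f} nm")
print(f"depth of focus, filtered: {dof_filtered:.1f} nm")
print(f"depth of focus, conventional: {dof_conventional:.1f} nm")
```

The two depth of focus values apply at different contrast levels (60% versus 80%), so the comparison is only as fair as the resist's ability to work at the lower contrast.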
The benefits of this method come at the cost of lower image contrast, evidenced by an enlargement of the aerial image’s first secondary maximum, or sidelobes. The average light intensity will be much lower, so sensitive resists will be needed to preclude a throughput penalty. The lens assembly design issues mentioned above are important, too. Use of a phase filter instead of an amplitude filter avoids the problem of filter heating. The transparency of this type of spatial filter does not lower the light efficiency of the system. Application specific filters can be designed for particular source shapes and mask patterns, but conventional lens designs do not tolerate exchangeable elements. The optical design of a projection stepper is very similar to that of a microscope. Use of spatial filters in projection lithography is very similar to dark
field microscopy, where the entrance pupil of the objective has a filter or other element to block the 0th order light and pass only the higher orders. This improves the resolution capability.

Returning to nonuniform sources, a four spot distribution of the condenser aperture illumination can improve imaging for 0 and 90 degree oriented periodic features.[108][112][116][119] The source design has been optimized[112] as circular apertures in each quadrant of normalized radius ρ = 0.25 (where 0 ≤ ρ ≤ 1), with symmetric coordinates about the optical axis, where the first aperture is centered at 1:30 o’clock at (ρx, ρy) = (0.38, 0.38). The mask or reticle is illuminated by the oblique light of incident angle φ. At the mask or reticle, the light is diffracted. The 0th order light travels at an angle (φ) relative to the optical axis, collinear with the incident light. The ±1 orders are diffracted an angle θ relative to the 0th order light. The diffraction angle θ is related to the wavelength (λ) and image pattern pitch (p) by Eq. (36). Relative to the optical axis then, the ±1 diffracted orders have the following angles:[119] Eq. (65)
+1st order: sin(θ) – sin(φ) = λ/p
and Eq. (66)
–1st order: –sin(θ) – sin(φ) = λ/p
The 0th order light [sin(φ)] of Eqs. (65) and (66) has a negative value due to a sign convention indicating a clockwise diffraction angle relative to the optical axis. The diffraction angle θ increases with finer patterns. Near k1 = 0.5, the diffractive angle of the –1 order light relative to the optical axis exceeds the collective angle of the objective lens as given by Eq. (38) [i.e., {–sin(θ) – sin(φ) } > sin(i)], so only the 0th and +1 diffracted order light will interfere on the wafer surface and contribute to image formation. Relative to conventional illumination, the angle between the most widely separated orders of diffracted light used in image formation has been halved. This physical attribute is responsible for the improved resolution and depth of focus capability with only marginal attenuation due to the disparity of the 1:2/π amplitude ratio of the 0th order light to the 1st order light.[108][116] The four spot illumination has higher image contrast for L/S pairs than conventional for defocused images of very fine lines at the Rayleigh resolution limit of 0.5λ/NA.[108][116][119] For 1:1 periodic patterns with linewidths of k1 = 0.5, the image contrast is 70% for defocus of k2 = 0.55 and
60% for k2 = 0.75. For linewidths of k1 = 0.6, the image contrast is 70% for defocus of k2 = 0.80 and 60% for k2 = 1.05. However, there is mask linearity only for fine lines. For isolated lines, the mask linearity for fine lines is better with conventional than with nonuniform illumination. Isolated lines are imaged narrower than L/S pairs of the same mask critical dimension, so selective mask biasing would be necessary for correction. Oblique L/S pairs (for example, at 45°) under four spot illumination have slightly worse mask linearity than with conventional illumination, and the reduced contrast keeps resolution and depth of focus from reaching even the levels obtained with conventional illumination. Exposure times are ~2–5X normal. Complicated line patterns do not image well at the line ends or corners due to proximity problems.

Operation with low pupil filling factor values or nonuniform illumination dramatically reduces the flux levels. Sensitive resists, like chemically amplified ones, avoid the throughput penalties due to increased exposure times that conventional resists would experience. The resist system selected must have very high contrast since aggressive optical imaging is at lower and lower contrast. Low contrast resists will not discriminate the principal peak of the aerial image and will exhibit gross bias effects, narrow exposure latitude, or spurious image features.
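The two-beam imaging argument for four spot illumination can be checked with a small geometric sketch. It works in normalized pupil coordinates (pupil radius = 1), treats each source spot as a single point at its center, and assumes the 0th order lands in the objective pupil at the same normalized position as the source point. For a 1:1 grating of linewidth k1·λ/NA the pitch is 2·k1·λ/NA, so adjacent orders are separated by 1/(2·k1) in these coordinates. The function name and the k1 = 0.45 test case are illustrative assumptions, and the sign of the surviving first order depends on which source spot is considered, so it need not match the labeling of Eqs. (65) and (66).

```python
import math

def captured_orders(source_xy, k1, orders=(-1, 0, 1)):
    """Return the diffraction orders of a vertical 1:1 grating that fall inside the pupil.

    source_xy: illumination point in normalized pupil coordinates (assumed point source).
    k1: Rayleigh resolution coefficient of the linewidth (linewidth = k1 * lambda / NA).
    """
    source_x, source_y = source_xy
    order_spacing = 1.0 / (2.0 * k1)   # lambda / (pitch * NA) for a 1:1 grating
    kept = []
    for m in orders:
        x = source_x + m * order_spacing   # the grating shifts orders along x only
        if math.hypot(x, source_y) <= 1.0:
            kept.append(m)
    return kept

# On-axis point source: below k1 = 0.5 both first orders miss the pupil.
print(captured_orders((0.0, 0.0), 0.45))
# Spot centered at (0.38, 0.38): the 0th order and one first order are collected.
print(captured_orders((0.38, 0.38), 0.45))
# The mirror-image spot collects the opposite first order.
print(captured_orders((-0.38, 0.38), 0.45))
```

Because only two orders interfere for each spot, the angle between them is roughly half that of the ±1 pair under on-axis illumination, which is the attribute the text credits for the improved depth of focus.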
9.0 NUMERICAL AND STATISTICAL METHODS

Numerical and statistical methods are invaluable tools of engineering that permit quantification of phenomena.

9.1 Data Regression
Numerical methods can be used to advantage in characterizing the performance of steppers. The method of least squares[120] allows linear and quadratic (or higher order) polynomials to be regressed from data to describe their performance. These two lower order polynomials are useful in many circumstances. The general equation for a straight line has the form: Eq. (67)
y = a + bx
The error criterion to be minimized for the linear fit for N data points would be:
Eq. (68)
$$\sum_{x=1}^{N} e^2(x) = \sum_{x=1}^{N} \left[ y(x) - a - bx \right]^2$$
Simultaneous equations are generated by taking partial derivatives of Eq. (68) with respect to a and b and then setting the two resulting equations to zero. The simultaneous equation solution results in[121] a slope (b) of:
Eq. (69)
$$b = \frac{N \sum xy - \sum x \sum y}{N \sum x^2 - \left( \sum x \right)^2}$$
where the Σ notation represents the summation of terms from x = 1 to N. The intercept (a) is given by: Eq. (70)
a = (Σ y – b Σ x)/N
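As a minimal sketch, Eqs. (69) and (70) can be evaluated directly from the running sums. The function name and the sample data are hypothetical; the data are an exact straight line so the result is easy to check by hand.

```python
def linear_fit(xs, ys):
    """Least-squares straight line y = a + b*x using Eqs. (69) and (70)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # Eq. (69)
    intercept = (sum_y - slope * sum_x) / n                            # Eq. (70)
    return intercept, slope

# Hypothetical data lying exactly on y = 3 + 0.5x.
xs = [0, 1, 2, 3, 4]
ys = [3.0, 3.5, 4.0, 4.5, 5.0]
a, b = linear_fit(xs, ys)
print(a, b)
```

Accumulating the sums in one pass, as here, mirrors the closed-form expressions; for ill-conditioned data a library routine would usually be numerically safer.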
The general expression for a second order (quadratic) polynomial is: Eq. (71)
y = a + bx + cx²
The error criterion to be minimized for the quadratic fit would be:
Eq. (72)
$$\sum_{x=1}^{N} e^2(x) = \sum_{x=1}^{N} \left[ y(x) - a - bx - cx^2 \right]^2$$
Simultaneous equations are generated by taking partial derivatives of Eq. (72) with respect to a, b, and c and then setting the three resulting equations to zero. The simultaneous equation solution is a little awkward to apply, but results in:[122]
Eq. (73)
$$b = \frac{\gamma\delta - \theta\alpha}{\gamma\beta - \alpha^2}$$
where:
γ = (Σ x²)² – N Σ x⁴
δ = Σ x Σ y(x) – N Σ xy(x)
θ = Σ x² Σ y(x) – N Σ x²y(x)
α = Σ x Σ x² – N Σ x³
β = (Σ x)² – N Σ x²
and the Σ notation represents the summation of terms from x = 1 to N. Once b has been determined, c is found from:
Eq. (74)
$$c = \frac{\theta - b\alpha}{\gamma}$$
The intercept a can be found from: Eq. (75)
a = (Σ y(x) – b Σ x – c Σ x²)/N
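The quadratic solution is easier to apply once it is written as code. The sketch below evaluates Eqs. (73) to (75) with the intermediate quantities γ, δ, θ, α, and β, then locates the extremum of the fitted curve at x = –b/(2c), where the slope b + 2cx is zero. The function name and the focus/linewidth data are hypothetical; the data were generated from 0.50 + 0.02x – 0.10x² so the fit can be verified.

```python
def quadratic_fit(xs, ys):
    """Least-squares parabola y = a + b*x + c*x**2 using Eqs. (73)-(75)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x ** 2 for x in xs)
    sum_x3 = sum(x ** 3 for x in xs)
    sum_x4 = sum(x ** 4 for x in xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2y = sum(x * x * y for x, y in zip(xs, ys))

    gamma = sum_x2 ** 2 - n * sum_x4
    delta = sum_x * sum_y - n * sum_xy
    theta = sum_x2 * sum_y - n * sum_x2y
    alpha = sum_x * sum_x2 - n * sum_x3
    beta = sum_x ** 2 - n * sum_x2

    b = (gamma * delta - theta * alpha) / (gamma * beta - alpha ** 2)   # Eq. (73)
    c = (theta - b * alpha) / gamma                                     # Eq. (74)
    a = (sum_y - b * sum_x - c * sum_x2) / n                            # Eq. (75)
    return a, b, c

# Hypothetical linewidth (um) versus focus offset (um) data.
focus = [-1.0, -0.5, 0.0, 0.5, 1.0]
linewidth = [0.38, 0.465, 0.50, 0.485, 0.42]
a, b, c = quadratic_fit(focus, linewidth)
best_focus = -b / (2.0 * c)   # extremum: slope b + 2*c*x equals zero
print(a, b, c, best_focus)
```

A fit of this kind is one common way to extract best focus from a single Bossung curve, although the text does not prescribe that use.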
The maximum or minimum for each curve is found by solving the first derivative with respect to x set equal to zero, i.e.:
Eq. (76)
$$\frac{dy}{dx} = 0 = b + 2cx$$

9.2 F-Test and T-Test
The most common experiments test a process under two different conditions, generally two different levels of a single process factor. These are called B versus C tests. A comparison of variances uses the F-test to determine if the difference between the variances (i.e., the square of the standard deviation values) is statistically significant at a given confidence level. A comparison of means uses the t-test to determine if the difference between the mean values is statistically significant at a given confidence level. Both the F-test and the t-test assume that independent random samples are evaluated from normally distributed populations. Even if a test is statistically significant, a subjective engineering judgement must be made about the importance of the results, i.e., whether the observed differences are important. There are two widely used measures of variability. One is the range and the other is the standard deviation. The range is easy to compute and
is the difference between the high and low observed test values. For small sample sizes, the estimate of the standard deviation (s) is used to approximate the population standard deviation (σ). The estimate of the standard deviation is more complicated to compute than the range:
Eq. (77)
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
where $x_i$ is the i-th value of x, $\bar{x}$ is the average value of all the $x_i$’s, and n is the sample size or number of observations. Small sample sizes have n ≤ 30, although populations are described by sample sizes at least a couple of orders of magnitude larger.

Comparison of standard deviations in a B versus C test is done with an F-test. The shape of the F-test curve is asymmetrical (i.e., the distribution of variances is not a normal distribution) and depends upon the degrees of freedom ($n_i$ – 1, or the sample size less one) of test condition i. The F-test is:
Eq. (78)
$$F_{exp} = s_1^2 / s_2^2$$
where s1 > s2. Fexp is compared against F, the tabulated value (available in any statistics text) dependent upon the degrees of freedom and α, which is the experiment risk (the confidence coefficient is 1 – α). For two tailed tests (where there is a possibility of either a ± variation about a mean), the F factor must reference a risk of α/2 in the tables. If more than two variances are being compared, Fmax tables are referenced instead of F tables. If Fexp ≤ F, there is insufficient evidence to indicate a difference in the population variances.

The t-distribution and the standard normal distribution follow essentially the same curve. The t-test is based on the distribution of $t_{\alpha/2,n-1}$ in:
Eq. (79)
$$\Delta x \ge t_{\alpha/2,\,n-1} \cdot s_{p,\,n-1} \cdot \sqrt{1/n_1 + 1/n_2}$$
This distribution for samples drawn from a normally distributed population was discovered by W. S. Gosset and published in 1908 under the pen name of Student. He referred to the quantity under study as t and it has been known ever since as Student’s t.[123] A normal frequency curve is centered on the population mean (µ), and is symmetrical about this point. In a normal distribution, the area under the curve for µ ±1σ is 0.6826,
for µ ±2σ is 0.9544, and for µ ±3σ is 0.9973, where σ is the population standard deviation. In Eq. (79), ∆x is the difference between the estimates of the population means, $t_{\alpha/2,n-1}$ is a value obtained from a Student t table, α is the experiment risk (the confidence coefficient is 1 – α), the pooled estimate of the standard deviation of x is $s_{p,n-1} \cdot \sqrt{1/n_1 + 1/n_2}$, n is the weighted average of the sample size, and $n_i$ is the sample size of test group i. For two tailed tests (where there is a possibility of either a ± variation about a mean), the t factor must reference a risk of α/2 in the tables; the value of t from the tables also depends on the degrees of freedom, which is the weighted sample size less one, (n – 1). If the sample sizes of the two test groups are different (i.e., n1 ≠ n2), the pooled estimate of the standard deviation ($s_p$) must be derived from the two individual estimates of the standard deviation ($s_i$) by: Eq. (80)
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$$
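A B versus C comparison following Eqs. (78) to (80) might be sketched as below. The tabulated F and t values are not computed here; the t value is passed in as an argument because it comes from a table. The sample data are hypothetical, and the value 2.776 is the two-tailed Student t for α = 0.05 with 4 degrees of freedom, chosen to follow the text's convention of degrees of freedom equal to the weighted sample size less one.

```python
import math
from statistics import mean, variance

def f_ratio(sample_1, sample_2):
    """Eq. (78): ratio of the larger sample variance to the smaller one."""
    v1, v2 = variance(sample_1), variance(sample_2)
    return max(v1, v2) / min(v1, v2)

def pooled_std(sample_1, sample_2):
    """Eq. (80): pooled estimate of the standard deviation."""
    n1, n2 = len(sample_1), len(sample_2)
    numerator = (n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)
    return math.sqrt(numerator / ((n1 - 1) + (n2 - 1)))

def means_differ(sample_1, sample_2, t_table_value):
    """Eq. (79): is the difference of the means at least the t-based threshold?"""
    n1, n2 = len(sample_1), len(sample_2)
    threshold = t_table_value * pooled_std(sample_1, sample_2) * math.sqrt(1 / n1 + 1 / n2)
    return abs(mean(sample_1) - mean(sample_2)) >= threshold

# Hypothetical linewidth measurements (um) for process B and process C.
process_b = [0.50, 0.52, 0.49, 0.51, 0.48]
process_c = [0.55, 0.57, 0.54, 0.56, 0.53]

f_exp = f_ratio(process_b, process_c)   # compare against the tabulated F
significant = means_differ(process_b, process_c, t_table_value=2.776)
print(f_exp, significant)
```

Here the two variances are essentially equal, so Fexp is close to 1 and the variances would not be judged different, while the 0.05 µm shift in the means exceeds the threshold of roughly 0.028 µm.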
The weighted average of the sample sizes (n) can be calculated many ways. One way is the harmonic mean:
Eq. (81)
$$n = \left[ \frac{1}{L} \sum_{i=1}^{L} \frac{1}{n_i} \right]^{-1}$$
where L is the number of different test groups (e.g., L = 2 for a B versus C test). Determining the appropriate sample size is a key question in any engineering test. It should be obvious now that a few iterations with Eqs. (78) or (79) will yield information about a suitable initial value for the sample size if some preliminary information is known or an educated guess can be made about the expected differences between the variances or means, respectively.

9.3 Multifactor Experiments
In practice, it is rare that a single factor dominates a process so that a simple B versus C test over two levels will adequately characterize the process. Usually, there are many factors which must be tested so that a process can be developed with adequate latitude to work under volume manufacturing conditions. Multifactor tests are used to determine
whether one or more factors are acting individually or together to improve the process.

Blocking, Randomization, and Replicates. A block is a group of experimental runs made under similar test conditions. For example, blocking the equipment would call for the same equipment set to be used for all the runs. This eliminates a family of variation caused by equipment-to-equipment variation. Blocking also may call for the same batch of materials to be used or for the tests to be run within the same time frame. The objective of blocking is to reduce the experimental error by making test comparisons under homogeneous conditions. Block what you can and randomize what you cannot. The objective of randomization is to distribute the effects of uncontrolled variables randomly throughout the experimental runs and to reduce or eliminate systematic errors. Random order tables are one source available for randomization of runs. Replication of runs is used to estimate the experimental error by repeating each individual cell of test conditions. Replication increases the precision of estimates and improves the chances of making meaningful statistical comparisons.

Experimental Designs. There are many experimental designs for multifactor tests.[64] Only two will be discussed here. Important features of matrix experimental designs include orthogonality and rotatability. A design is orthogonal if the coefficients of fitted polynomials are uncorrelated. A design is rotatable if every level of every factor is equidistant from the matrix center. For regression, this ensures that there is no sensitivity or dependence to the assignment of independent variables as factors in the matrix. For regression purposes, the levels of the independent variables may be entered as their actual values (e.g., as 90°C and 100°C) or normalized or coded (e.g., –1 and +1). Usually, the former is preferred since this makes the regression easier to understand. However, for statistical analysis, it is imperative to normalize the levels of the independent variables.

Factorial Experiments. A factorial experiment is described as a $2^N$ experiment, for 2 levels and N factors. With only a single replication, there are $2^N$ cells in the factorial design. The total number of experimental runs is (number of replicates) · $2^N$. The simplest full factorial test is a 2 × 2 matrix with two levels tested for two experimental factors, or a $2^2$. The two levels are designated typically +1 and –1. The next simplest full factorial (i.e., all possible cell combinations are tested) is a $2^3$, which can be visualized as a cube. The cube edges represent the +1 and –1 levels (the cube center is 0, and is untested). Each parallel plane of the cube represents one of the three factors.
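A full factorial run list with replicates and a randomized run order can be generated in a few lines. This is a minimal sketch: the function names are hypothetical, levels are coded as –1 and +1, and the cell order is arranged so that the first factor alternates fastest. A seeded pseudo-random shuffle stands in for the random order tables mentioned above.

```python
import itertools
import random

def full_factorial(num_factors):
    """Return all 2**num_factors cells, coded -1/+1, with Factor 1 alternating fastest."""
    # itertools.product varies its last position fastest, so each tuple is reversed.
    return [tuple(reversed(levels))
            for levels in itertools.product((-1, 1), repeat=num_factors)]

def randomized_runs(cells, replicates, seed=None):
    """Repeat every cell `replicates` times and shuffle the run order."""
    runs = [cell for cell in cells for _ in range(replicates)]
    random.Random(seed).shuffle(runs)
    return runs

cells = full_factorial(3)
runs = randomized_runs(cells, replicates=2, seed=0)
print(len(cells), len(runs))
for cell in cells:
    print(cell)
```

With two replicates of a three-factor design the run count is 2 · 2³ = 16, as given by the expression for the total number of experimental runs.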
Some advantages of a $2^N$ experiment include its greater efficiency versus testing a process by manipulating one factor at a time while holding all the other factors constant. A $2^N$ experiment allows interactions between factors to be detected and estimated (i.e., the combined effect on a process of a number of factors at a certain test level might be undetected except at that special combination). Interactions between factors are observed graphically in Fig. 44 as an intersection of the linear responses of two factors. Mathematically, the interaction is a difference in values of the slopes of the responses of one factor tested over two levels of a second factor. Full factorial designs follow the pattern of Table 2.

Response Surface Experiments. One type of response surface experiment tests factors over three levels. These experiments can be saturated and are described by the notation $3^N$. Saturated designs are not usually run except for N = 2 factors. Nonsaturated designs over three levels are called Box-Behnken tests.[44e] In general, this class of second-order designs requires many fewer runs than the complete three-level factorials. Also, three level full factorial designs are orthogonal but not rotatable. Box-Behnken designs are formed by combining two-level factorial designs with incomplete block designs in a particular manner.[66] A Box-Behnken design for three factors can be visualized as a cube with a centerpoint, where the cube edges represent the +1 and –1 levels and the centerpoint is zero. Each plane of the cube represents one of the factors. Box-Behnken designs try to maximize orthogonality and rotatability. (See Table 3.)
Figure 44. Interaction between two factors.
Table 2. Factorial Design Generator

                        LEVELS
Test Cell    Factor 1    Factor 2  |  Factor 3
    1           –           –      |     –
    2           +           –      |     –
    3           –           +      |     –
    4           +           +      |     –
    5           –           –            +
    6           +           –            +
    7           –           +            +
    8           +           +            +

Cells 1–4 with Factors 1 and 2 form the 2² design; cells 1–8 with all three factors form the 2³ design.
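For a 2² design laid out in the cell order of Table 2, the main effects and the interaction reduce to simple differences of the cell responses. The sketch below uses the common convention that an effect is the average response at the + level minus the average at the – level, and that the interaction is half the difference between the Factor 1 effect at the two levels of Factor 2; the function name and the four response values are hypothetical.

```python
def effects_2x2(responses):
    """Main effects and interaction for a 2x2 factorial.

    responses: mean response of cells 1-4 in Table 2 order,
    i.e. (F1, F2) = (-, -), (+, -), (-, +), (+, +).
    """
    y1, y2, y3, y4 = responses
    main_1 = ((y2 + y4) - (y1 + y3)) / 2         # Factor 1: average(+) minus average(-)
    main_2 = ((y3 + y4) - (y1 + y2)) / 2         # Factor 2: average(+) minus average(-)
    interaction = ((y4 - y3) - (y2 - y1)) / 2    # change in the Factor 1 slope with Factor 2
    return main_1, main_2, interaction

# Hypothetical cell means.
main_1, main_2, interaction = effects_2x2([10.0, 14.0, 12.0, 22.0])
print(main_1, main_2, interaction)
```

In this example the response to Factor 1 rises by 4 at the low level of Factor 2 and by 10 at the high level, so the two response lines are not parallel and the interaction is nonzero.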
Table 3. Box-Behnken Design for Three Factors

Cell    Factor 1    Factor 2    Factor 3
  1        –           –           0
  2        +           –           0
  3        –           +           0
  4        +           +           0
  5        –           0           –
  6        +           0           –
  7        –           0           +
  8        +           0           +
  9        0           –           –
 10        0           +           –
 11        0           –           +
 12        0           +           +
 13        0           0           0
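The pattern of Table 3 can be reproduced programmatically: each pair of factors is run through a 2² factorial while the third factor is held at its center level, and a single centerpoint is appended. The sketch below is written only for the three-factor case shown in the table; designs for more factors use specific incomplete block layouts that this simple pairing is not claimed to reproduce. The function name is hypothetical.

```python
from itertools import combinations

def box_behnken_three_factors():
    """Return the 13 coded cells of Table 3 as (Factor 1, Factor 2, Factor 3) tuples."""
    cells = []
    for i, j in combinations(range(3), 2):   # factor pairs (1,2), (1,3), (2,3)
        for level_j in (-1, 1):
            for level_i in (-1, 1):          # the lower-numbered factor alternates fastest
                row = [0, 0, 0]
                row[i] = level_i
                row[j] = level_j
                cells.append(tuple(row))
    cells.append((0, 0, 0))                  # centerpoint
    return cells

design = box_behnken_three_factors()
for number, cell in enumerate(design, start=1):
    print(number, cell)
```

The 13 runs compare with 27 for the complete three-level factorial, which illustrates the reduction in run count noted in the text.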
9.4 Analysis of Experiments
Analysis of factorial and response surface designs is described in Refs. 124 and 56, respectively. Usually, commercial software packages are used for response surface analysis because of their convenience. Analysis of different factors is facilitated by coding the levels [i.e., instead of using the actual units of interest (for example, develop time, seconds) the levels are normalized (for example, factorials use +1 and –1 levels, Box-Behnken tests use +1, 0, and –1 levels, etc.)], so the differences in magnitudes of levels between the factors do not present computational difficulties. Also, analysis of variance (ANOVA) methods are designed to measure differences between means since the assumption is made that the distributions are normal. If the outputs of the test are variances, which have the asymmetrical distribution described by the F-test, analysis of the data using a transformation may be called for (e.g., taking the natural logarithm of the standard deviation values). If successful, the transformed distribution will be normal.

Factorial designs with replications run in random order allow testing of hysteresis (i.e., comparison of cell variation to find if the response, after displacement of factor levels, can reproduce its original performance) and linear effects (i.e., are the values of the intercept and slope reasonable?). Response surface designs have an added advantage of testing for nonlinear effects (i.e., typically, responses are described by second order polynomials) so optimization can be performed by finding the maximum or minimum of the response between levels. Finally, no analysis is complete without graphically analyzing the results (remember the cliché: a picture is worth a thousand words).

9.5 Process Control
There are many statistical tools available to control a process so the product is made with high quality. Quality is the conformance to target performance levels. Statistical process control uses methods to reduce process variations so continuous improvement is possible.

Pareto Charts. Vilfredo Pareto was an Italian sociologist who noted in a study that 20% of the citizens controlled 80% of the wealth. By analogy, this observation applies as a causal generalization to many manufacturing processes. Concentration on a few critical factors gives the most benefit for reducing process variability. Pareto charts can be used to manage engineering priorities. An example is shown in Fig. 45.
Figure 45. Example of a Pareto chart.
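The data behind a Pareto chart are simply the contributions ranked from largest to smallest, with a running cumulative percentage. A minimal sketch follows; the function name, the source names, and their values are hypothetical and are not taken from Fig. 45.

```python
def pareto_table(contributions):
    """Rank contributions from largest to smallest with cumulative percentages."""
    total = sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0.0
    for name, value in ranked:
        running += value
        rows.append((name, value, 100.0 * running / total))
    return rows

# Hypothetical contributions to linewidth variance (arbitrary units).
sources = {
    "develop time": 10.0,
    "focus": 45.0,
    "bake temperature": 5.0,
    "exposure dose": 25.0,
    "resist thickness": 15.0,
}
rows = pareto_table(sources)
for name, value, cumulative in rows:
    print(f"{name:18s} {value:6.1f} {cumulative:6.1f}%")
```

In this made-up case the first two sources account for 70% of the total, which is the kind of concentration that directs engineering priorities.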
Multivariate Studies. Multivariate studies determine the process variation without manipulation of any factors. Data are collected so that several families of variation can be analyzed to determine which family contributes the most variation to the process. The effect of each of these families on the total variation can be described for relative importance using a Pareto chart. One example of a multivariate study is to collect data for critical dimension control using product material. The data can be collected on two wafers out of each lot processed, measuring five sites on a wafer (top, center, bottom, left, and right). The families of variation are site-to-site, wafer-to-wafer, and lot-to-lot. Multivariate data are illustrated in Fig. 46.
Figure 46. Example of a multivariate chart.
Nested Variance. The total process variability can be partitioned into major sources of variability using the concept of nested variances:
Eq. (82)
$$\sigma_{total}^2 = \sigma_l^2 + \sigma_w^2 + \sigma_s^2$$
In this example, the data being collected are critical dimension measurements ($x_{ijk}$) for the i-th lot, the j-th wafer, and the k-th site. The grand average is:
Eq. (83)
$$\bar{x} = \frac{\sum_{i=1}^{L} \sum_{j=1}^{W} \sum_{k=1}^{S} x_{ijk}}{SWL}$$
where L is the number of lots inspected, W is the number of wafers inspected in the lot, and S is the number of inspected sites per wafer. The average of lot i (for i = 1, …, L) is:
Eq. (84)
$$\bar{x}_i = \frac{\sum_{j=1}^{W} \sum_{k=1}^{S} x_{ijk}}{SW}$$
The average of wafer j within lot i (for j = 1, …, W and i = 1, …, L) is:
Eq. (85)
$$\bar{x}_{ij} = \frac{\sum_{k=1}^{S} x_{ijk}}{S}$$
The total variation of lot means is:
Eq. (86)
$$s_l^2 = \frac{\sum_{i=1}^{L} (\bar{x}_i - \bar{x})^2}{L - 1}$$
The pooled variation of wafer means within the lots is:
Eq. (87)
$$s_w^2 = \frac{\sum_{i=1}^{L} \sum_{j=1}^{W} (\bar{x}_{ij} - \bar{x}_i)^2}{L(W - 1)}$$
The pooled within-wafer variability, which has LW(S – 1) degrees of freedom, is:
Eq. (88)
$$s_s^2 = \frac{\sum_{i=1}^{L} \sum_{j=1}^{W} \sum_{k=1}^{S} (x_{ijk} - \bar{x}_{ij})^2}{LW(S - 1)}$$
The variance components for each family of variation are calculated from their estimates after accounting for nested variation:
Eq. (89)
$$\sigma_s^2 = s_s^2$$
Eq. (90)
$$\sigma_w^2 = s_w^2 - s_s^2 / S$$
Eq. (91)
$$\sigma_l^2 = s_l^2 - s_w^2 / W$$
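The nested variance calculation can be sketched for a balanced data set (the same number of wafers in every lot and sites on every wafer). The function name and the data layout are assumptions made for illustration. The within-wafer sum of squares is divided by its degrees of freedom, L·W·(S – 1), which is an assumption of this sketch about the intended denominator. With real data a component estimate can come out negative when a family contributes little; a common practice, not stated in the text, is to treat such a value as zero.

```python
from statistics import mean

def nested_variance(data):
    """Variance components from balanced nested data.

    data[i][j][k] is the measurement at site k of wafer j in lot i.
    Returns the site, wafer, and lot components of Eqs. (89)-(91).
    """
    num_lots = len(data)
    num_wafers = len(data[0])
    num_sites = len(data[0][0])

    wafer_means = [[mean(wafer) for wafer in lot] for lot in data]   # Eq. (85)
    lot_means = [mean(row) for row in wafer_means]                   # Eq. (84), balanced data
    grand_mean = mean(lot_means)                                     # Eq. (83), balanced data

    s_l2 = sum((m - grand_mean) ** 2 for m in lot_means) / (num_lots - 1)
    s_w2 = sum((wafer_means[i][j] - lot_means[i]) ** 2
               for i in range(num_lots)
               for j in range(num_wafers)) / (num_lots * (num_wafers - 1))
    s_s2 = sum((x - wafer_means[i][j]) ** 2
               for i in range(num_lots)
               for j in range(num_wafers)
               for x in data[i][j]) / (num_lots * num_wafers * (num_sites - 1))

    return {
        "site": s_s2,                       # Eq. (89)
        "wafer": s_w2 - s_s2 / num_sites,   # Eq. (90)
        "lot": s_l2 - s_w2 / num_wafers,    # Eq. (91)
    }

# Hypothetical data: 2 lots x 2 wafers x 2 sites, in arbitrary units.
measurements = [
    [[1, 3], [5, 7]],
    [[11, 13], [15, 17]],
]
components = nested_variance(measurements)
total_variance = sum(components.values())   # Eq. (82)
print(components, total_variance)
```

Ranking the three components then shows which family of variation deserves attention first, in the spirit of the Pareto chart.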
Control Charts. There are two principal types of control charts. The first is a variables control chart, for monitoring outputs such as $\bar{x}$, R, and s (for the mean, range, and standard deviation estimate, respectively). The second is an attributes control chart, for monitoring outputs such as P, C, and U (for proportions or percentages, counts, or uniformity of a process, respectively). One type of variables control chart for normally distributed outputs has three zones: a green zone, a yellow zone, and a red zone (see Fig. 47). About the target center, the green zone is ±1.5σ. The yellow zone is outside the green zone, about the target center, to ±3σ. Outside of the yellow zone is the red zone. The rules for using this control chart are to make no adjustments if the outputs are plotted randomly in the green zone. Adjustments are made to the process if six consecutive points are in the green zone on the same side of the centerline, two consecutive points are in the same yellow zone, or a single point is ever in the red zone.
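The three adjustment rules for the zone chart translate directly into code. This is a minimal sketch under stated assumptions: the function names are hypothetical, the target and σ are taken as known, a point exactly on a zone boundary is assigned to the inner zone, and a point exactly on the centerline breaks a same-side run.

```python
def zone(value, target, sigma):
    """Classify a point as green (within 1.5 sigma), yellow (to 3 sigma), or red."""
    distance = abs(value - target)
    if distance <= 1.5 * sigma:
        return "green"
    if distance <= 3.0 * sigma:
        return "yellow"
    return "red"

def needs_adjustment(values, target, sigma):
    """Apply the three rules to a sequence of plotted points, in time order."""
    history = []
    for value in values:
        side = (value > target) - (value < target)   # +1 above, -1 below, 0 on the line
        history.append((zone(value, target, sigma), side))
        if history[-1][0] == "red":
            return True                               # a single point in the red zone
        if (len(history) >= 2 and history[-1][0] == "yellow"
                and history[-1] == history[-2]):
            return True                               # two in a row in the same yellow zone
        last_six = history[-6:]
        if (len(last_six) == 6 and side != 0
                and all(entry == ("green", side) for entry in last_six)):
            return True                               # six green points on one side
    return False

# Hypothetical deviations from target, in units of sigma.
print(needs_adjustment([0.1, -0.2, 0.3, -0.1, 0.2, -0.3], target=0.0, sigma=1.0))
print(needs_adjustment([0.4, 0.2, 0.6, 0.1, 0.3, 0.5], target=0.0, sigma=1.0))
```

The first sequence scatters randomly about the centerline and calls for no action; the second stays in the green zone but on one side for six points, which signals a shift in the mean.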
Figure 47. Example of a variables control chart.
Cp and Cpk. Two process capability indices are Cp and Cpk. These indices are used to judge the capability of the process to meet specifications after the process is under statistical control. Standard deviation and mean values are used in the calculations and it is assumed the distributions are normal. The capability index Cp describes the process variation with respect to the existing specification window.
Eq. (92)
$$C_p = \frac{USL - LSL}{6\sigma}$$
where USL and LSL are the upper and lower specification limits, respectively, and σ is the standard deviation of the total process variation. The capability index Cpk describes how well the process is centered relative to the specification window.
Eq. (93)
$$C_{pk} = \frac{\min(USL - \bar{x},\ \bar{x} - LSL)}{3\sigma}$$
where $\bar{x}$ is the mean value of the process distribution. Ideally, Cpk should be as large as Cp. A typical minimum standard of manufacturability is Cpk ≥ 1.33. Process capability indices are valuable tools to ensure designs are manufacturable, but care must be taken in a manufacturing environment that they are not used to slow continuous improvement.
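Both capability indices are one-line calculations once the mean and standard deviation are known. The sketch below uses hypothetical numbers: a 0.50 µm target with ±10% specification limits, a process mean of 0.51 µm, and a standard deviation of 0.01 µm. The function names are assumptions for illustration.

```python
def cp(usl, lsl, sigma):
    """Process capability Cp: specification window divided by six sigma."""
    return (usl - lsl) / (6.0 * sigma)

def cpk(usl, lsl, process_mean, sigma):
    """Process capability Cpk: distance to the nearer limit divided by three sigma."""
    return min(usl - process_mean, process_mean - lsl) / (3.0 * sigma)

# Hypothetical linewidth process, in micrometers.
USL, LSL = 0.55, 0.45
PROCESS_MEAN, SIGMA = 0.51, 0.01

cp_value = cp(USL, LSL, SIGMA)
cpk_value = cpk(USL, LSL, PROCESS_MEAN, SIGMA)
print(f"Cp = {cp_value:.2f}, Cpk = {cpk_value:.2f}")
print("meets Cpk >= 1.33:", cpk_value >= 1.33)
```

The gap between Cp (about 1.67) and Cpk (about 1.33) in this example comes entirely from the 0.01 µm offset of the mean from the center of the window; recentering the process would raise Cpk to the Cp value.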
10.0 PRACTICAL IMAGING QUALITY

The total overlay must combine the families of variation of critical dimension control with the families of variation of registration. The simplest method describing the composite effect assumes the total overlay variation is the sum of the nested variances of critical dimension control and registration.[126] The families of variation for critical dimension control and registration are across-field, field-to-field, wafer-to-wafer, and lot-to-lot. This assumes the variation of the families is distributed normally. However, there are systematic and random errors. By definition, systematic errors are not distributed normally. Systematic errors can contribute to distributions in which there are fewer data in the extreme tails of the distribution than in a normal distribution.[127] In this case, performances should be described in terms of distributions of 95% and 99.7%, instead of 2σ and 3σ, respectively. Critical dimension control is dependent on the quality of the exposure, the resist process including development of the latent image, and the substrate including its topography.

10.1 Field Diameter and Resolution

The design of a lens determines how many pixels (N) can be imaged. The image field and resolution of this design trade off equally according to:[71]
Eq. (94)
$$N = \pi \left[ \frac{D \cdot NA}{2\lambda} \right]^2$$
where D is the field diameter. Chemical support restrains the transition to shorter wavelengths, but for a given number of pixels and no limitations on the quality and availability of lens materials, the use of a shorter exposure wavelength simplifies lens designs and matching because of the lower numerical aperture. The stitching of multiple fields together allows very large dice to be manufactured, up to wafer scale integration. The lens constrains the resolution and depth of focus latitude but the maximum area becomes a design and wafer fabrication cleanliness issue. Butted fields are joined either with lines that overlap at staggered, enlarged pad areas or with seamless stitching.[128] The latter technique overlays lines which have been vignetted by a defocused edge on the mask or reticle. Superposition of the complementary tapered exposure profiles yields a full exposure, with minimal linewidth variation. Economics will determine when field stitching comes into vogue.

10.2 Exposure-Defocus Diagrams

Process latitude can be quantified by Exposure-Defocus (ED) diagrams[129][130] like Fig. 48. Plots of linewidth versus defocus, with each curve associated with a different exposure dose, are known as Bossung curves.[131] The ED area describing process latitude is derived from these curves, with the acceptable processing latitude defined by the area in which the feature size ±10% is imaged. If the processing latitude is given by a ± value for exposure and defocus, the largest rectangle that will fit in the ED area will be described. Actual construction of an ED curve requires some type of normalization of exposures, to avoid the erroneous appearance of a latitude advantage to high exposure processes. Methods of normalization include taking the natural logarithm of the exposure doses [so a ln (ED) curve is generated] or describing the exposure doses of all features as a percentage relative to a reference feature.
The area under the ED window decreases with pitch (the width of the line/space pair). Generally, the depth of focus of an isolated space is greater than an isolated line for the same critical dimension and the depth of focus of a contact or via opening is greater than an opaque island.
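The ±10% acceptance criterion for the ED area can be applied to tabulated Bossung data with a short routine. The sketch below reports, for each exposure dose, the defocus range over which the linewidth stays within tolerance. It assumes the in-specification points of each curve form one contiguous run, and it does not attempt the largest-rectangle fit or the dose normalization described above. The function name and all data values are hypothetical.

```python
def in_spec_focus_range(bossung, target_cd, tolerance=0.10):
    """For each dose, return (min, max) defocus with linewidth within target +/- tolerance.

    bossung maps dose -> list of (defocus, linewidth) pairs.
    Doses with no in-specification point map to None.
    """
    lower = target_cd * (1.0 - tolerance)
    upper = target_cd * (1.0 + tolerance)
    result = {}
    for dose, curve in bossung.items():
        good = [defocus for defocus, linewidth in curve if lower <= linewidth <= upper]
        result[dose] = (min(good), max(good)) if good else None
    return result

# Hypothetical Bossung data: dose in mJ/cm2, defocus and linewidth in micrometers.
bossung_data = {
    60: [(-1.0, 0.62), (-0.5, 0.58), (0.0, 0.56), (0.5, 0.57), (1.0, 0.60)],
    80: [(-1.0, 0.40), (-0.5, 0.47), (0.0, 0.50), (0.5, 0.48), (1.0, 0.42)],
    100: [(-1.0, 0.44), (-0.5, 0.46), (0.0, 0.47), (0.5, 0.46), (1.0, 0.43)],
}
window = in_spec_focus_range(bossung_data, target_cd=0.50)
for dose, focus_range in window.items():
    print(dose, focus_range)
```

Plotting the resulting focus ranges against dose (or its logarithm) outlines the ED area from which a rectangular process window can then be chosen.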
Figure 48. Bossung Exposure-Defocus diagram. Energies are in mJ/cm².
Mask Bias. A mask bias means the (linewidth, spacewidth) on the mask is (larger, smaller) than the desired nominal feature on the wafer when the wafer processing uses positive tone photoresist. As the resolution nears the limit of a lens, the choice of the magnitude of a mask bias becomes more restricted (by the Rayleigh resolution limit), until zero bias is the only processing choice. Zero bias masks reduce process latitude in an obvious way since a two tailed distribution about the nominal wafer size is impossible. The luxury of a mask bias helps enlarge the latitude for both exposure and usable depth of focus.[132] The selection of mask bias has an effect on the process latitude as described by the ED curve,[130] but it does not expand the ultimate resolution capability.[133] The process conditions, particularly develop, can be responsible for geometry dependent feature size bias.[133][134] To print a mask of more than a single feature shape and size with the most process latitude requires all the feature ED curves to overlap. For features of the same shape but different size, this can mean applying a fixed mask bias. For features of different shapes or densities, this can mean applying selective mask biases. Aspect Ratio of Features. The aspect ratio (here length/width) of features has a dramatic effect on the process latitude window, especially for depth of focus. The aspect ratio is defined here as the ratio of the length to
the width of a feature. Features of high aspect ratios are favored over small ones because the process latitude is greater. Isolated small aspect ratio features have the worst latitude, while periodic features (i.e., gratings or line/space pairs) are the best. For example, contacts or vias typically have unit aspect ratio. If the design is accommodating and the contacts or vias can be stretched to 1.5:1 (i.e., the length is 1.5 times the width), the exposure and depth of focus latitude can be improved substantially. The optimum aspect ratio can be found through simulation.
Conjugate Lithography. The exposure dose which yields a flat linewidth response with defocus on a Bossung curve is termed the isofocal, or conjugate, exposure. In Fig. 48, the isofocal exposure is between 70 and 105 mJ/cm². Generally, the process latitude is not maximum at the isofocal point because the exposure latitude varies so sharply. However, if the exposure control is acceptable, the isofocal point offers maximum depth of focus latitude.[135]
Proximity Effects and Degree of Coherence. Proximity effects caused by scattering, diffraction, and interference of light impact the size and shape of a feature due to the geometry of the feature itself or neighboring features. There are four general classes of proximity effects: linewidth differences between isolated and packed lines or spaces, linewidth differences between clear field line and dark field space, line length shortening of rectangular spaces or lines, and corner rounding.[136] Packed lines have more rounded corners and smaller linewidths than isolated lines at zero defocus. Under relatively large defocus, packed lines have more resist loss, accentuated linewidth loss, and more pronounced corner rounding. As the numerical aperture increases, the size at which proximity effects become apparent decreases.[109] Proximity effects are relatively small for features larger than one micron.
The optical proximity effect results in different biases for the different features (e.g., grating, line, space, contact hole, island). This bias variation is sufficient to reduce or eliminate the process latitude overlap of the windows,[130] as shown in Fig. 49. This means a single exposure dose might not image all the features of interest on the mask or reticle to the desirable feature size ±10% on the wafer. As feature size is reduced, it becomes increasingly difficult to find a single exposure condition which is satisfactory for different feature sizes and geometries.[33] Selective biasing across the mask or reticle becomes important to maintain equal linesizes across circuits, depending on the aspect ratios (i.e., the ratio of the features’ length to width).[136] The process latitude should describe the performance of the most difficult feature.
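The loss of overlap between feature windows can be illustrated with a short Python sketch. It assumes each feature's ED window has already been reduced to a rectangle (dose limits as a percentage relative to a reference dose, focus limits in microns); the feature names and all numbers below are hypothetical, not values read from Fig. 49.

```python
def common_window(windows):
    """Intersect rectangular ED windows.

    Each window is (dose_lo, dose_hi, focus_lo, focus_hi), with dose in
    percent relative to a reference dose and focus in microns.
    Returns the shared rectangle, or None when no overlap remains.
    """
    dose_lo = max(w[0] for w in windows.values())
    dose_hi = min(w[1] for w in windows.values())
    focus_lo = max(w[2] for w in windows.values())
    focus_hi = min(w[3] for w in windows.values())
    if dose_lo > dose_hi or focus_lo > focus_hi:
        return None
    return (dose_lo, dose_hi, focus_lo, focus_hi)

# Hypothetical per-feature windows, shifted in dose by proximity bias.
features = {
    "grating": (-10.0, 8.0, -1.0, 1.0),
    "isolated line": (-4.0, 14.0, -0.8, 0.9),
    "contact hole": (5.0, 25.0, -0.5, 0.5),
}
shared = common_window(features)

# Adding a feature whose window sits at much lower dose removes the overlap.
with_island = dict(features, island=(-30.0, -12.0, -0.6, 0.6))
no_overlap = common_window(with_island)

print(shared)      # shared rectangle, set by the most restrictive features
print(no_overlap)  # None: no single dose images every feature in spec
```

The shared rectangle is always bounded by the most difficult features, which is why the process latitude is quoted for them. A selective mask bias amounts to shifting an individual window in dose until the rectangles overlap again.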
Figure 49. E-D windows for 0.75 micron objects imaged through a 0.35 NA i-line lens with σ = 0.7. (Reprinted with permission of Ref. 130.)
Proximity effects depend on the partial coherence value and feature size and shape.[109] Isolated lines have approximately the correct dimensions in all sizes when the imaging is done with partially coherent light. However, diffraction effects cause deviations from the design size for isolated spaces. For spaces (0.6–1.0)λ/NA wide, the light intensity in the center of the space is larger than the intensity of the incident light, causing the resist to develop faster than in a very large area.[133] The undesirable necking and bulging at elbow and pad corners can be reduced greatly by going to a partial coherence (σ) of 0.5 and almost eliminated entirely using σ = 0.7.[137] However, the contrast and peak intensity are reduced. For very small, unit aspect ratio contact masks, reducing σ to 0.3 from 0.7 will increase the edge slope and give about 50% higher intensity at the center of the contact (but lower total illumination intensity, requiring longer exposures). This higher partial coherence (lower σ) improves contact hole image quality and improves the sensitivity to optical defocus aberrations.[137][138] Unfortunately, this means there is a higher sensitivity of the optical transfer process to local defects and dust on the mask or reticle and lens.[17]
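As a worked example of the (0.6–1.0)λ/NA range quoted above for isolated spaces, the short Python sketch below converts it to microns. The 0.365 micron i-line wavelength and the 0.35 NA are taken as illustrative inputs (they match the lens of Fig. 49); the function name is hypothetical.

```python
def space_width_range_um(wavelength_um, numerical_aperture,
                         k_low=0.6, k_high=1.0):
    """Width range (microns) spanning k_low to k_high times lambda/NA."""
    unit = wavelength_um / numerical_aperture
    return k_low * unit, k_high * unit

# i-line (0.365 micron) through a 0.35 NA lens, as an illustration.
narrow, wide = space_width_range_um(0.365, 0.35)
print(f"{narrow:.2f} to {wide:.2f} micron")
```

For this lens the affected isolated spaces are roughly 0.63 to 1.04 micron wide, so the 0.75 micron objects of Fig. 49 fall inside the range.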
The general trend is that when the degree of partial coherence is increased (i.e., σ becomes smaller), the intensity tolerance becomes wider but the depth of focus becomes shallower.[129] However, for features near half micron size, the loss in depth of focus is minimal, but the increase in exposure tolerance is substantial. Illumination uniformity is relatively constant for 0.4 ≤ σ ≤ 0.65, but worsens quadratically for increased coherence.[139][140] Proximity effects depend strongly on the resist processing. In particular, the resist thickness is an important factor since the resist contrast depends on thickness.[46][142] High contrast resist processes replicate the aerial image diffraction interference effects more faithfully than lower contrast ones.
Numerical Aperture. There is an optimum numerical aperture at which the depth of focus is greatest for printing features in a particular pitch,[110][132] i.e., for relatively large features (k1 ≥ 0.8) the imaging quality under defocused conditions can be higher with a lower numerical aperture lens. High contrast resist allows lines with lower contrast in the aerial image to print. At the smaller NA, the aerial image degrades slower with defocus, giving increased latitude.[143] There is no benefit to the use of a high resolution lens when only moderate resolution is required because of the loss of depth of focus. Resolution increases with numerical aperture only without defocus, which is an ideal condition. A variable NA is necessary to achieve the maximum depth of focus for an arbitrary feature size greater than the resolution limit. Optimizing the depth of focus as a function of NA for a given feature size requires setting the derivative of depth of focus with respect to numerical aperture equal to zero [using Eqs. (46) and (13)]. However, the depth of focus for different feature types (specifying density, grating, isolated line, isolated space, etc.)
may not change monotonically with feature size,[144][145] making optimization difficult. A numerical aperture of 0.45 is close to the ceiling for microfabrication purposes because the depth of focus requirements become increasingly stringent for any higher NA.[129][143][144][146] For a fixed depth of focus, decreasing the actinic wavelength results in increasingly finer resolution but at smaller optimum numerical apertures.
10.3 Depth Of Focus Issues
The resolution of features always must be qualified by the usable depth of focus. The depth of focus latitude directly affects the critical
dimension control. In reality, there are many factors that must be monitored carefully to ensure the highest quality imaging. Resolution and depth of focus issues include substrate topography, substrate reflection, resist thickness and variation, the quality of the resist process, chuck flatness, wafer flatness, the orthogonality of the wafer plane with the optical axis, exposure wavelength, lens aberrations (for example, field curvature), lens numerical aperture, the feature size being imaged, pellicle flatness, nominal focus selection, and autofocus precision. The best exposure latitude (or equivalently, develop latitude) corresponds to a nominal focus position near the base of the resist, but moves to the middle of the resist layer as the exposure dose increases and the nominal cleared linewidth decreases.[86] One limit to increasingly higher numerical aperture lenses is that the image focused by a lens with higher NA tends to be slightly broadened by resist absorption,[147] attenuating the effective NA. When the angles of the plane waves are below 18° in the resist, which is equivalent to an NA of 0.46 in air, the increase of light absorption can be neglected. Also, the focus effects for high numerical apertures, where the depth of focus becomes comparable to the resist layer thickness, depend upon the oblique direction of propagation of light in the resist.[12][86] These effects include an asymmetry in resist profile on either side of best focus and an asymmetry in the curves on either side of best focus of a Bossung focus-exposure plot, which cannot be explained by lens aberrations.[12][86][148] This latter asymmetry is shown in Fig. 50, which was imaged with a very high numerical aperture (0.60) g-line lens. These asymmetries worsen with decreasing feature size, increasing numerical aperture, and thickening resist. 
The resist profile asymmetry on one side of best focus effectively halves the usable depth of focus if subsequent processes include plasma or reactive ion etching with significant resist erosion or high energy implants. For thick resist, part of the space within which rays remain adequately converged will be occupied by the resist.[132] If DOF is the depth of focus for thin resist (a film of thickness substantially smaller than DOF, where the lateral light distribution is essentially that within a single plane), then the focal range, DOF′, for thick layers is:[132]
Eq. (95)
DOF′ = DOF − t/n
where t is the resist thickness and n is the index of refraction of the resist.
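A quick numerical reading of Eq. (95) may help. The Python sketch below simply evaluates the equation; the 2.0 micron thin-film depth of focus, the 1.0 micron resist thickness, and the resist index of 1.65 are assumed values for illustration, not data from the text.

```python
def thick_resist_dof(dof_thin_um, resist_thickness_um, refractive_index):
    """Eq. (95): focal range left over once the resist occupies t/n of it."""
    return dof_thin_um - resist_thickness_um / refractive_index

# Assumed example: 2.0 micron thin-film DOF, 1.0 micron resist, n = 1.65.
dof_prime = thick_resist_dof(2.0, 1.0, 1.65)
print(f"DOF' = {dof_prime:.2f} micron")
```

With these assumed inputs about 0.6 micron of the focal range is consumed by the resist itself, leaving roughly 1.4 micron for all other focus errors.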
Figure 50. Asymmetric resist profiles as a function of defocus for a 0.60 NA g-line lens with σ = 0.5. (Reprinted with permission of Ref. 86.)
Resolution Versus Defocus Plots. A preferred orientation for a resolution versus defocus plot is an X (versus a +) pattern over the lens field, so that an inscribed square field has its corners characterized. Data can be collected by image contrast[51][52] or linewidth measurement. The data are organized typically with the resolution plotted as the ordinate and the defocus condition as the abscissa (see Fig. 51). The data for sagittal and tangential features as a function of defocus each can be regressed to a second order polynomial using the method of least squares. Typical acceptable resolution of linewidths is considered to be ±10% of the nominal target [for example, k1 = 0.8, referencing Eq. (13)]. A resolution versus defocus plot should reveal the best imagery at the optical axis; otherwise, a translation error might exist in the platen (i.e., the holder of the mask or reticle) alignment to the center of the optical axis.
Astigmatism. The difference in resolution between sagittal and tangential features at each defocus position indicates an astigmatism error. There is no single preferred test structure; line/space pairs, isolated squares, and checkerboard patterns are all recommended. The difference between the sagittal maximum and tangential maximum at a certain field position
(for example, the right edge of the lens) describes the astigmatism error in terms of defocus units at that field position. Astigmatism can be found anywhere in the lens field, even though the design generally will show this focal error only at the field edges. Figure 51 illustrates an astigmatism error.
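The regression described above can be sketched in plain Python: each data set is fitted to a second order polynomial by least squares, the maximum of each fitted curve is located, and the astigmatism at that field position is taken as the difference between the two maxima in defocus units. The data here are synthetic (an image contrast metric generated from known parabolas), and the function names are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit y = a*x^2 + b*x + c via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]               # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    for i in range(3):                       # Gauss-Jordan with pivoting
        p = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(3):
            if r != i:
                f = m[r][i] / m[i][i]
                m[r] = [vr - f * vi for vr, vi in zip(m[r], m[i])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c

def best_focus(defocus, metric):
    """Defocus at the extremum of the fitted parabola (first derivative = 0)."""
    a, b, _ = fit_quadratic(defocus, metric)
    return -b / (2.0 * a)

# Synthetic data at one field position (defocus in microns).
defocus = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
sagittal = [0.9 - 0.2 * (z - 0.1) ** 2 for z in defocus]
tangential = [0.9 - 0.2 * (z + 0.2) ** 2 for z in defocus]

astigmatism = best_focus(defocus, sagittal) - best_focus(defocus, tangential)
print(f"astigmatism = {astigmatism:.2f} micron")
```

Because the synthetic curves peak at +0.1 and −0.2 micron, the sketch recovers an astigmatism of 0.3 micron. Repeating the fit at several radial positions gives the data from which field curvature is drawn.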
Figure 51. A resolution versus defocus plot for different radial image heights.
Field Curvature. Many radial positions must be characterized to determine the field curvature. At each radial position, the curves for the sagittal and tangential resolution as a function of defocus must be found, as well as their respective maxima. Field curvature is the connection of the
maxima to form a curve from the lens center to the field edge. Two curves will be created, one for the tangential features and the other for sagittal features. Field curvature changes as a function of degree of coherence, so MTF testing must be corroborated by testing at the partial coherence at which the stepper will normally operate. Astigmatism and field curvature can degrade the average lens performance by up to one half a Rayleigh unit, of which about 30–40% is in the design and the rest is in the manufacture of the lens.[132]
Column Tilt. Resolution versus defocus plots should characterize the imagery at the lens top, center, bottom, left, and right as well as different radial positions along an axis. Column tilt indicates the wafer plane is not orthogonal with the optical axis and is characterized as a difference in value between the best focus position at the lens top versus bottom or left versus right. Figure 51 illustrates column tilt along one radial arm. The effect of column tilt as feature sizes become more aggressive is that there will not be enough depth of focus latitude to resolve the features everywhere over the field. Correction of column tilt is either through adjustment of the chuck defining the plane of the wafer, or physical adjustment of the column position. Determining the best focus positions at just the opposite edges of the lens (e.g., top and bottom) allows the column tilt to be found directly by difference. If data are collected from more than the two extreme positions across the field (for example, at the lens center or other radial positions), a linear least squares fit will describe the column tilt by the slope of the line.
Coma. It is difficult to separate the effects of true coma from decentered lens elements.
From the earlier discussion of lens aberrations, recall that the tangential coma is three times larger than the sagittal coma.[67] This is apparent from the resolution versus defocus plots, where the curve connecting the tangential features will appear to always lag the sagittal features in resolution at each defocus position. Coma causes asymmetrical imagery so a qualitative test structure should be symmetrical on the mask or reticle.[69] These test structures should be positioned at several radial positions. One candidate test structure is a long line of width equal to 0.7λ/NA. At the middle of its length, on either side of the line, are two small squares. The squares cannot be resolved since they have sides equal to 0.3λ/NA. The separation distance of the squares from the line, equal to 0.1 micron, also cannot be resolved. These test structures preferably are oriented tangentially. Although it is characteristic that the asymmetric images are present at best focus, the comatic asymmetries show up more clearly in the defocused image. Coma can also cause what appears to be a local distortion effect.
Another qualitative test for coma images L/S pairs with a conventional transmission mask. The presence of coma will cause bridging of the internal L/S pairs but not the outside ones.[70] One stepper design has a lens assembly with a bottom flat element. It is possible to tilt this element and rotate it about its optical axis to vary the optical path length. This provides the means to correct for coma that is systematic across the exposure field. Examination of long lines for the difference in resist profile angle from one side to the other is one means of objective test. The comatic optical path length deviation can be correlated to this relative difference in resist profile angle.
Autofocus Systems. Autofocus systems are responsible for bringing each field to the nominal focus position. Typically, the autofocus system functions after stepping to a new field prior to exposure.
Types. One type of autofocus system is a grazing incidence optical autofocus system. Usually, multiple wavelengths are used to eliminate interference effects. Infrared light (700–900 nm) is common since there is no resist sensitivity at these wavelengths. A grazing angle (of 2–4°) of reflection from the light source across the wafer surface to the detector is preferable for elimination of substrate reflections, since an oblique angle causes greater reflection. The autofocus light is usually incident at 30–45° to the field to avoid interference with topographical patterns. An air gauge autofocus system[149] emits a pressurized jet of air to sense the wafer surface. Multiple jets are positioned at edge positions of an average field size so die leveling can be sensed. Advantages of air gauge autofocus systems are their high precision and immunity to thin film interference effects. The largest disadvantage is the particulate issue, which is revisited with each decrease in feature size.
The air or nitrogen can be filtered to remove particles, but the pressurized air can still create an area of turbulence that stirs up local particles. Usually, the pressurized jet is always on to avoid cycling (on again, off again) bursts, which exacerbate the problem. A third type of autofocus system uses capacitance gauges. The sensors are several millimeters in diameter so data averaged over the device area are collected. A capacitive sensor needs a conductive electrode for its sensing path. Doped silicon wafers work well. The back of the wafer can be grounded through the chuck or capacitively through a large contact area, or a floating electrode with no contact can be used. Wafer substrates like gallium arsenide (GaAs) and silicon on insulator (SOI) with higher resistivity than silicon do not work well with this sensor. The capacitive sensor operates relatively close to the wafer. Dielectric films behave differently
than conductive films since the plane that is sensed depends on the thickness of the dielectric and the dielectric constant. The robustness and sensitivity of this type of electrical sensor are attractive.
Leveling. Global leveling of a wafer occurs before alignment or exposure begins. The autofocus system is used to map three (to define the plane) or more sites on a wafer. The wafer wedge (i.e., the planes of the polished and back side of a wafer are not parallel) is removed by tilting the chuck upon which the wafer is vacuum clamped. Local leveling improves the usable depth of focus further by compensating for the local focal plane deviation of a wafer. Local leveling[113][114][149]–[151] requires information to determine the local plane of the field. Correction to nominal focus is made by adjusting the two axes that determine the image field tilt and the third (z) axis for height separation of the lens and wafer field. A typical local leveling sensor is a four-quadrant optical detector. The wafer plane is adjusted until the signals are balanced. Local leveling also can be done optically using a crossed pair of dual quadrant detectors. This design matches the leveling signal sensitivity across each axis better. Usually, there is a small penalty in throughput for local leveling. Also, a five-axis laser interferometer is needed to track accurately the wafer’s position after local leveling so stage yaw, pitch, and roll do not contribute to misregistration (if there is any post-alignment stage motion before exposure) due to Abbe errors.
Enhancing Focus Latitude. One method used to enhance the depth of focus is called Focus Latitude Enhancement eXposure (FLEX).
In FLEX, several focal planes are created at different positions along the light axis, and exposures are made using each focal plane at the same field position on the wafer.[107][152]–[154] According to SEM measurements of features at the resist/substrate interface, the focus latitude is increased about three times using FLEX, although the edge slope of the features is degraded relative to a conventional, fixed focus method. In FLEX, defocused images in one focal plane are superimposed on the sharply focused images in another plane. Since the image intensity distribution is flatter with defocus, the contribution of this exposure lowers the image contrast of the composite image. Best results seem to be obtained on isolated low aspect ratio features, which is where depth of focus latitude usually is shallowest.
Nominal Focus. Nominal focus position depends on the ambient temperature and pressure since a change in the index of refraction of air causes changes in the magnification and focus of the reduction lens. Temperature is controlled either by enclosure of the stepper inside an environmental chamber where the temperature is controlled to at least ±0.1°C, or by close monitoring of a thermocouple sensor and compensation
to the autofocus system. The amount of focal shift resulting from a 0.1°C shift in temperature depends on the glass characteristics of the lens elements (i.e., the change in glass index of refraction with temperature, dn/dT). A barometrically controlled stepper chamber has never been sold commercially, so the ambient pressure is monitored at an environmental station and compensation for the effect on nominal focus is made to the autofocus system. For a 0.3 NA g-line lens, a change in atmospheric pressure of 1 mm Hg causes a focal shift of ~0.15 microns. A lens is a thin lens if its thickness is much less than the radii of curvature of its two surfaces. Assuming thin lens theory holds, the change in focus (∆f) with pressure is:[155]
Eq. (96)
∆f = ∆P (k δ n_l f² / P0)
where ∆P is the difference between the ambient pressure, P, and the standard pressure, P0 = 760 mm Hg (i.e., ∆P = P – P0), k is a lens constant, δ is a function related to the index of refraction of air, n_l is the index of refraction of the lens, and f is the focal length of the lens. Equation (96) predicts a linear dependence of focal position on ambient barometric pressure. Nominal best focus is found by evaluating data from a defocus array. The defocus increments are selected based upon the confidence level of the position of best focus and time constraints. Data can be collected from a linewidth defocus array such as Fig. 48, aerial image quality,[51][52] or alignment signal contrast from a latent image.[156] Data regression to a quadratic equation allows the best focus to be found after the first derivative is taken, set equal to zero, and solved. Offsets between the different methods can exist and these must be correlated (preferably to a linewidth defocus array) so the most convenient procedure is available for production use. Convenience should reflect the gauge capability of the method and the time spent on the stepper (versus off-line). The gauge capability is the total variance measured from a multivary study, where families of variation may include test-to-test and operator-to-operator (this family should be small for objective metrology and mathematical analysis of data).
Asymmetric Imaging Effects. As previously mentioned, the focus effects for numerical apertures greater than ~0.3, where the depth of focus becomes comparable to the resist layer thickness, depend upon the oblique direction of propagation of light in the resist.[12][86] These effects include an asymmetry in resist profile on either side of best focus.[148] This asymmetry occurs for the top of the resist profile but not its base. The effect this would have on the device performance depends on the resist erosion in the dry etch
or the momentum of the dopant during implant. Also, there is an asymmetry in the curves on either side of best focus of a Bossung focus-exposure plot, which cannot be explained by lens aberrations. This effect would evidence itself most strongly in critical dimension control. These asymmetries worsen with decreasing feature size, increasing numerical aperture, and thickening resist. Asymmetrical resist profiles frequently occur on the outboard lines of arrays. This is the result of an asymmetrical aerial image caused by proximity effects.[109] Depth of Focus Dependence Upon Exposure Duty. Thermal absorption in the lens assembly due to the exposure duty can affect the nominal focal plane and cause defocus errors without a scaling or offset compensation.[157] The change in nominal focus with time is an exponential decay.[158] Satisfactory modeling of this lens heating effect can require multiple exponential terms.[159] The exposure duty is a function of the transmission of the mask or reticle (i.e., how much of the mask or reticle pattern is glass versus chrome) and exposure dose. The change in glass index of refraction with temperature (dn/dT) causes the effect. Absorption of light by the lens materials at the actinic wavelength becomes important for this reason. Generally, i-line lenses use glasses of higher thermal coefficient of refractive index than g-line,[161] so the effect probably would be slightly greater at the shorter wavelength. However, selection of suitable glasses results in only a 0.2 micron focal shift after a one hour flood exposure of one commercial 0.40 NA i-line lens, which is very competitive. Wafer Flatness. Wafers imaged using full field scanners are specified most appropriately for flatness using global criteria. 
One global flatness metric is the Total Indicated Reading (TIR), which is the difference in elevation between the highest and lowest points on the surface of a wafer (i.e., the range).[162] Wafers imaged using steppers, which print one field at a time, more appropriately are quantified for flatness using local criteria. One local flatness metric is the Local Focal Plane Deviation (LFPD), which is the peak absolute deviation of the wafer surface from the focal plane of a representative size exposure field. Global leveling of the wafer by the stepper, prior to exposure, compensates for wafer wedge (the frontside and backside planes of the wafer are not parallel). Local leveling of the wafer by the stepper prior to exposure compensates for random polishing errors, bow, and edge effects (e.g., domed or cupped wafers). The flatness variation is most severe at the edge of a wafer. The local thickness variation directly affects the usable depth of focus so wafer specifications can be referenced to the minimum feature that must be imaged during construction of the product die.
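The two flatness metrics can be illustrated with a small Python sketch over a grid of surface heights. Two simplifications are assumed here: the heights are a hypothetical 4 × 4 map in microns, and the focal plane of each field is approximated by the mean height of the field (a piston-only reference; a stepper with local leveling would instead fit a tilted plane). The function names are illustrative.

```python
def total_indicated_reading(heights):
    """TIR: highest minus lowest point over the whole wafer map."""
    flat = [z for row in heights for z in row]
    return max(flat) - min(flat)

def local_focal_plane_deviation(heights, row0, col0, size):
    """LFPD: peak |height - focal plane| inside one size x size field.

    The focal plane is approximated by the field's mean height
    (an assumption made for this sketch).
    """
    field = [heights[r][c]
             for r in range(row0, row0 + size)
             for c in range(col0, col0 + size)]
    focal_plane = sum(field) / len(field)
    return max(abs(z - focal_plane) for z in field)

# Hypothetical surface heights in microns; the lower right corner
# mimics the larger flatness variation found near a wafer edge.
heights = [
    [0.0, 0.1, 0.2, 0.3],
    [0.1, 0.2, 0.3, 0.4],
    [0.2, 0.3, 0.4, 0.9],
    [0.3, 0.4, 0.5, 1.0],
]

tir = total_indicated_reading(heights)
lfpd_center = local_focal_plane_deviation(heights, 0, 0, 2)
lfpd_edge = local_focal_plane_deviation(heights, 2, 2, 2)
print(f"TIR = {tir:.2f}, LFPD = {lfpd_center:.2f} (interior), "
      f"{lfpd_edge:.2f} (edge)")
```

The same wafer yields a single global number but a different local number for every field, which is why the local metric is the one matched to a stepper's depth of focus.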
Chucks. Wafer chucks are constructed from materials such as aluminum (which can be anodized to improve hardness), stainless steel, metal alloys, or ceramics. There is a strong correlation between average wafer flatness and clamping vacuum.[162] Unfortunately, the surface contact with the wafer preferably is minimized so that trapped particles under the wafer do not create a local focal plane deviation. Low contact chucks include designs of raised concentric circles or pins (which are an array of elevated pyramidal structures) which maintain vacuum contact with the backside of the wafer. Chuck flatness is measured by fringe analysis with a laser interferometer. Substrate Planarization. The depth of focus is maximum on planar substrates. The deposition of conductive and dielectric films and their etching after patterning to form device layers produces nonplanar topography. A spin coating of resist[163] over topography will be neither perfectly conformal[164]–[166] nor perfectly planarizing.[167]–[170] Thin film interference effects influence most resist processes producing local variations in exposure requirements. Also, topography can be responsible for scattered or reflected light, producing notching.[171] Doped atmospheric CVD low temperature oxide (LTO) is a common dielectric applied to isolate first metal and the gate level. High temperature thermal reflow is used to planarize local (range less than 10 micron) structures. Improved planarizing results are possible using borophosphosilicate glass (BPSG) followed by an anneal and a thermal reflow. High temperature processing after first metal is limited by the metal’s thermal diffusion coefficient so LTO and BPSG are not considered candidates since thermal reflow is not possible. Spin-on-glass (SOG) has been used extensively since glass annealing temperatures are low. The local planarization capability is modest. 
Plasma enhanced tetraethylorthosilicate (TEOS) is one of the most attractive backend glasses since it can be deposited with very good uniformity. Planarization is achieved iteratively through deposition and anisotropic selective etch back. Conventional planar oxide techniques rely on a sacrificial organic coating (usually resist, although more specialized films are available commercially) for planarization, followed by anisotropic etch back with 1:1 resist:oxide selectivity.[172]–[174] An advanced planarization method[175]–[178] uses photolithography to image resist features in wide spaces. The resist is the same height as the surrounding etched features. This method recognizes that uniform narrow spaces will planarize but wide spaces will not, so resist images are used to fill the wide spaces. With this better substrate preparation in place, the conventional sacrificial etch back method described above is then carried out.
Depth of Focus Error Budget. There is a minimum depth of focus required for consistent imaging quality. An error budget can be used to describe the individual contributions. Some of the individual contributors produce a systematic error and others a random variable error. An error budget with sample values is shown in Table 4. In some cases, errors are described as across-field errors. For the sample values used, the (mean + 3σ) depth of focus error budget is ~0.75 microns. The total focal range requirement is twice this value, or 1.5 microns. The exercise of forming error budgets is useful to understand better the limitations and prospects of specific stepper equipment configurations and lithographic processes. A Pareto chart (Table 4) of the individual errors defines their relative importance better and suggests priorities for vendors and process engineers. The last component in the error budget is most important for excimer laser sources since focus can change by 0.1–0.2 microns per picometer of wavelength drift. Astigmatism and field curvature are simultaneously affected as well as stigmatic image placement errors like magnification and distortion.
10.4 Illumination Sources
Mercury vapor lamps are the illumination sources for broadband and narrow band steppers, as well as the deep ultraviolet (UV) step and scan machine. Mercury lamps contain a starting gas like argon. HgXe lamps emit virtually the same spectrum as mercury lamps with the addition of some infrared radiation and a slight increase in the continuum. A typical spectrum is illustrated in Fig. 52. Xenon mainly assists in the lamp ignition and warming quicker to equilibrium. A discharge arc of the high pressure vapor emits a characteristic spectrum for the element mercury as its electrons relax from an excited state to the ground state. Infrared (IR) radiation is emitted due to the heating of the lamp during its operation.
Infrared radiation is filtered with a cold mirror, which passes the IR to a radiator and reflects UV. The mercury spectrum is not of uniform intensity, nor does its intensity vary smoothly as the wavelength changes. As the lamp ages, the relative intensities at each wavelength change. Steppers incorporate an integrator (a light sensor coupled to the shutter servo) to vary the nominal exposure time so that the actual exposure dose remains constant over the life of a lamp. The high intensity peaks at the shorter wavelengths in the ultraviolet and deep ultraviolet have been of the most interest in microlithography since the resolution improves with shorter wavelength.
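The role of the integrator can be sketched in Python. Dose in mJ/cm² is intensity in mW/cm² multiplied by time in seconds, so as the lamp output falls the shutter must stay open longer. The sketch below shows both the closed-form exposure time and a simple sampled integrator that closes the shutter once the accumulated dose reaches the target. The 100 mJ/cm² target and the intensity values are illustrative assumptions, and the function names are hypothetical.

```python
def exposure_time_s(target_dose_mj_cm2, intensity_mw_cm2):
    """Exposure time for a steady intensity: dose = intensity x time."""
    return target_dose_mj_cm2 / intensity_mw_cm2

def integrate_until_dose(target_dose_mj_cm2, intensity_samples_mw_cm2, dt_s):
    """Accumulate sampled intensity until the target dose is reached.

    Returns (shutter_open_time_s, delivered_dose_mj_cm2).
    """
    dose = 0.0
    steps = 0
    for intensity in intensity_samples_mw_cm2:
        if dose >= target_dose_mj_cm2:
            break
        dose += intensity * dt_s
        steps += 1
    return steps * dt_s, dose

# New lamp versus aged lamp for the same (assumed) 100 mJ/cm^2 dose.
t_new = exposure_time_s(100.0, 800.0)
t_aged = exposure_time_s(100.0, 600.0)

# Sampled integrator with a steady 500 mW/cm^2 reading every millisecond.
open_time, delivered = integrate_until_dose(100.0, [500.0] * 1000, 0.001)
print(f"{t_new:.3f} s, {t_aged:.3f} s, integrator: {open_time:.3f} s")
```

The integrator reaches the same dose regardless of the lamp's age; only the shutter-open time, and therefore the throughput, changes.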
Table 4. Focus Error Budget Error Source set focus set wafer tilt set lens tilt lens heating substrate topography wafer local focal plane deviation
Mean value (µm) 0.10 0.05 0.05
0.10 0.02 0.02 0.10 0.20 0.30
chuck local focal plane deviation
0.20
lens field curvature lens astigmatism reticle (object plane) nonflatness
0.10 0.02
finite resist thickness
0.00 for 1 µm film
λ variation effect on lens aberrations Total
3σ value (µm)
0.02 0.02 0.05
0.05 0.32
0.44
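The arithmetic behind the budget described in the text can be sketched as follows. The totals of 0.32 µm (mean) and 0.44 µm (3σ) are an assumed reading of the Total row of Table 4, and the 0.25 pm drift is a hypothetical value chosen only for illustration. The 0.1–0.2 µm/pm sensitivity is the range quoted in the text.

```python
def focus_budget_um(mean_total_um, three_sigma_total_um):
    """Return (one-sided budget, total focal range) in microns."""
    one_sided = mean_total_um + three_sigma_total_um
    return one_sided, 2.0 * one_sided


def focus_shift_um(drift_pm, sensitivity_um_per_pm):
    """Focus change caused by a given laser wavelength drift."""
    return drift_pm * sensitivity_um_per_pm


# Totals as read from Table 4 (assumed: 0.32 um mean, 0.44 um 3-sigma).
one_sided, total_range = focus_budget_um(0.32, 0.44)
print(f"mean + 3 sigma = {one_sided:.2f} um, focal range = {total_range:.2f} um")

# Hypothetical 0.25 pm drift at the sensitivities quoted in the text.
for sensitivity in (0.1, 0.2):
    print(f"{sensitivity} um/pm -> {focus_shift_um(0.25, sensitivity):.3f} um")
```

With these inputs the one-sided budget comes to roughly the ~0.75 micron figure quoted in the text, and the focal range to roughly 1.5 microns.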
Figure 52. Illumination spectrum of a mercury arc lamp.
A cool mercury arc lamp has an internal pressure of 0.3 atmospheres. Intensity is raised by increasing the equilibrium operating pressure, which is in excess of 40 atmospheres.[179] This high pressure makes lamp explosions possible, which are both dangerous and expensive. Thermal control of the lamp helps minimize the possibility of a lamp explosion and is related directly to the useful lifetime.[180] Short term fluctuations in lamp temperature can cause a variation in illumination uniformity, adversely affecting intrafield linewidth control.[181] Energy output is nonlinear with lamp wattage. Past a certain point, there are diminishing returns for raising the wattage. Stepper lamps of 1000 watts are common and the early Step-and-Scan used a 2400 watt lamp.[114] 3500 W lamps for the latter provided even better throughput, enhanced even more by avoiding the use of shaped sources which would reduce the light intensity. The illuminance at the wafer plane for g-line is ~800 mW/cm² and for i-line is ~400–500 mW/cm² (these values depend on illuminator and reduction lens transmission). Actual exposure times depend on the resist sensitivity and the process. Commercial volume photoresist development has paralleled the emphasis on these peaks, 436 nm (g-line), 405 nm (h-line), 365 nm (i-line), and 250 nm, in chronological order.
An alternative source for deep UV at 248 nm and 193 nm is the pulsed gas excimer laser. These lasers use rare gas halides such as KrF (at 248 nm) and ArF (at 193 nm). Excimer lasers offer high average power, absence of speckle (i.e., a stationary interference pattern of small dots randomly scattered throughout the field), and a spatial coherence that is higher than that of mercury arc lamps but relatively low for a laser. The excimer laser can permit the use of shaped sources in step and scan systems, which can be advantageous for imaging performance. The word “excimer” comes from excited dimer.
Excimer molecules or exciplexes exist only in the electronically excited state, since the atoms are unbound or only weakly bound in the lower energy level. The rare gas and halogen atoms reach the upper electronic excited level by chemical reaction in a high voltage (e.g., 15–20 kV) pulsed discharge.[182] The average power of a standard commercial excimer laser is above 100 watts, but for microlithography it is as low as 3–6 watts.[183]–[188] Unnarrowed single pulse energies can be greater than 500 mJ, while narrowed emissions are ~10–20 mJ/pulse. Maximum repetition rates are ~400–600 Hz. Spectral narrowing is required to control chromatic aberration and power is reduced appreciably. Spectral narrowing can be accomplished by injection locking, gratings, intracavity prisms, or etalons. The last method extends the lasing cavity beyond the excitation region and
inserts two etalons (an etalon is used to measure distances in terms of wavelengths of spectral lines). The etalons act as narrow bandpass filters; one has high resolution but has several transmission peaks while the second etalon has lower resolution and is used to select one of the peaks.[189] Typical energy densities are 0.1–1 mJ/cm² at the wafer. Maintenance is more complicated for laser sources than mercury arc lamps. The discharge unit can be replaced as easily as a mercury arc lamp. The excimer laser is housed in a separate chamber and the window requires cleaning after ~5 × 10⁸ pulses to correct a drop in transmittance caused by polymer deposition onto the optical components of the illumination system. It is believed residual organic vapors in the atmosphere of the clean room are polymerized onto the surface of the optical components, since they have absorbance at 248 nm.[190] Activated charcoal filters in the air purification and handling system can be used to reduce organic vapor levels. Gas refills are needed after ~5 × 10⁷ pulses. A laser head can be completely rebuilt in less than a day, including replacement of the optics, preionization electrodes, and preionizer. Preionization prevents electrical breakdown in the laser gas from occurring in the form of arcs by providing an initial population of electrons or weakly bound negative ions.[191] This type of major overhaul is performed after ~1 × 10⁹ pulses.
Bandwidth. Historically, the volume commercial use of optical steppers at shorter and shorter wavelengths has been slowed by lack of chemical support (for example, high contrast resists, contrast enhancement materials, etc.), since the resolution benefits are indisputable.
Generally, lower numerical aperture lenses of shorter wavelengths are easier to design, manufacture, and match than higher NA lenses of longer wavelengths.[71] However, the selection and availability of high quality lens materials for shorter wavelength use can be a severe hindrance, especially for correction of image smear due to chromatic aberration.[161] Filter technology is becoming more important as the actinic wavelength is reduced since the acceptable bandwidth of illumination is narrowing. High pressure lamps used to increase intensity also raise the continuum and this can significantly degrade the image contrast (by up to 5–10%) without adequate filtration.[192] The effective energy is defined as the integral of the product of the spectral intensity, the filter transmittance, and the resist sensitivity. The spectrum of an excimer laser is not symmetrical with respect to the peak wavelength, but it has a higher tail on the shorter wavelength side.[193] The shape resembles a Lorentzian waveform with truncated wings. This distribution can be expressed as:[194]
Eq. (97)    f(λ) = (Γ/2π) / [(∆λ)² + Γ²/4]
where Γ is the FWHM, and ∆λ is the deviation of the wavelength from center. The bandwidth of commercial g-line lenses of 0.45 NA is ±2.5 nm, compared to ±4 nm for less aggressive 0.35 NA designs. The bandwidth of 0.40 NA i-line lenses is ±1.5 nm, illustrating the stricter bandwidth requirement to control chromatic aberration. It is too difficult to achromatize lenses at 248 nm due to difficulties working with complementary crystalline halide materials and cements. Therefore, the only choice of glass is fused silica. A chromatic (i.e., not achromatized) lens requires substantial reduction of the natural bandwidth of about 0.5 nm for the KrF excimer laser. The required laser bandwidth for fused silica lenses depends on the field size and numerical aperture. The narrowed, Lorentzian-distributed bandwidth is still kept relatively wide to avoid speckle while minimizing attenuation of the OTF. The OTF decreases generally and the value added by a higher numerical aperture is less if the bandwidth is not sufficiently narrow. Excimer laser wavelength drift errors are evidenced as focus drifts, changes in distortion values, field curvature variation, and astigmatism shifts. The minimum spectral bandwidth of a 0.35 NA lens using a KrF excimer laser source is 5 pm (i.e., picometer) FWHM and the total spectral bandwidth including the spectral swing is 8 pm FWHM. FWHM stands for Full Width at Half Maximum and is the width of the spectral line at the 50% peak intensity point. Lens designs of ~0.5 NA require spectral bandwidths of 1–2 pm FWHM and a wavelength stability of ±0.15–0.25 pm. For lasers line narrowed to ~1.5 pm, speckle can become an issue since the high spatial frequency component now has enough contrast to image. This can show itself as high frequency linewidth variation. Separately, as the light becomes increasingly monochromatic, any imperfection or point defect (for example, a dust particle) in the optical path diffracts the light to produce a distinct image pattern.
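The line shape of Eq. (97) can be checked numerically with a minimal sketch. It assumes the untruncated, area-normalized Lorentzian form, so the truncated wings mentioned in the text are ignored. The 2 pm linewidth and the function names are hypothetical.

```python
import math


def lorentzian(delta_lambda_pm, fwhm_pm):
    """Eq. (97): area-normalized Lorentzian; inputs in pm, result in 1/pm."""
    return (fwhm_pm / (2.0 * math.pi)) / (delta_lambda_pm ** 2 + fwhm_pm ** 2 / 4.0)


def fraction_within(window_pm, fwhm_pm):
    """Closed-form fraction of the line's energy within +/- window_pm of center."""
    return (2.0 / math.pi) * math.atan(2.0 * window_pm / fwhm_pm)


fwhm_pm = 2.0  # hypothetical narrowed linewidth
peak = lorentzian(0.0, fwhm_pm)
at_half_width = lorentzian(fwhm_pm / 2.0, fwhm_pm)
print(f"intensity at half-width offset / peak = {at_half_width / peak:.2f}")
print(f"energy inside the FWHM = {fraction_within(fwhm_pm / 2.0, fwhm_pm):.2f}")
```

The first check confirms that Γ is the full width at half maximum. The second shows that only about half of the energy of an ideal Lorentzian lies inside the FWHM, which is one reason the wings matter for chromatic aberration.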
There are several processes available to introduce incoherence to reduce speckle. One choice is a three mirror speckle reduction system. Light from the laser strikes a beamsplitter, so some of the light is immediately reflected while some is passed into a pair of 45° mirrors. This light is sent to a normal incidence mirror and then out via the beamsplitter. The mirrors act as a delay multiple pulse generation system. A single pulse goes in, while
three come out. The three pulses are uncorrelated to each other so speckle is reduced. Another choice with Kohler illumination, which can be implemented in parallel with other systems, is a pair of counter-rotating diffusers.[195] The diffusers can be a material like ground glass. The two diffusers are moving off-axis, relative to each other. The speckle noise reduction is proportional to the diffuser disk rotation speed. The Gaussian distribution of the diffuser plate thickness and the molecular roughness of the surface effectively produce a phase modulator. However, rotating diffusers pose a vibration problem, which generally is undesirable in illuminators. Also, speckle noise is reduced with increased surface roughness or scattering angle but this means less total energy will be collected by the illuminator. For long enough exposure times, time averaging aids the illumination uniformity at the mask plane. Optimization of the bandwidth by manipulation of the mercury arc actinic narrow-band line filter (for example, adjusting its type or its physical position) can be done empirically by monitoring the effect on astigmatism of a resolution versus defocus plot. Most of the optical power of the step and scan resides in its achromatic spherical mirrors so average values of optical path differences as a function of field radius are about 0.06 waves for an exposure bandwidth of 235–260 nm.[114] This wide bandwidth is favorable for reducing standing wave and thin film interference and improving wafer throughput by reducing exposure times. Wavelength Limitations. As the illuminating wavelength shortens, there are dramatically fewer choices of glasses to construct lenses, and in the deep ultraviolet there are some unique problems.[196]–[198] Candidate deep UV sources and lens materials are listed in Table 5. Fused silica is the glass or amorphous state of silicon dioxide. In the amorphous state, only a short range order exists. 
The density of thermally grown fused silica is about 20% less than that of one of the crystalline phases of silicon dioxide known as quartz. The properties of fused silica are changed by introducing impurities that are substitutional (e.g., boron or phosphorus) or interstitial (e.g., oxides of Na, K, Pb, and Ba) to the silicon dioxide lattice. Interstitial diffusion of impurities is inversely proportional to the glass density. The transmission of 248 nm light through fused silica is reasonably efficient at ~92%. However, excimer laser illumination by ArF at 193 nm has a transmission through fused silica of only ~88%. Calcium fluoride (CaF2) and lithium fluoride (LiF) have slightly more favorable transmission values at 193 nm at ~92%, but the technology to work with these materials
is immature. Lithium fluoride is hygroscopic and is difficult to polish to the exacting surface tolerances required in projection lenses. A transition to 213 nm is less risky than 193 nm since the transmission in fused silica is ~91% and the depth of focus benefit relative to 248 nm is the same as that which the industry considered favorable in going from g-line (436 nm) to i-line (365 nm).
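The comparison between the two wavelength transitions can be made concrete with a small sketch. It assumes Rayleigh scaling (resolution proportional to λ/NA and depth of focus proportional to λ/NA²), under which the depth of focus at a fixed resolution varies as 1/λ. That scaling assumption and the function name are illustrative and are not stated in the text.

```python
def dof_gain_at_fixed_resolution(old_wavelength_nm, new_wavelength_nm):
    """Depth of focus ratio (new/old) at constant resolution, assuming
    Rayleigh scaling so that depth of focus varies as 1/wavelength."""
    return old_wavelength_nm / new_wavelength_nm


transitions = {
    "g-line (436 nm) to i-line (365 nm)": (436.0, 365.0),
    "KrF (248 nm) to 213 nm": (248.0, 213.0),
    "KrF (248 nm) to ArF (193 nm)": (248.0, 193.0),
}
for name, (old, new) in transitions.items():
    print(f"{name}: x{dof_gain_at_fixed_resolution(old, new):.2f}")
```

Under this assumption the move from 248 nm to 213 nm gives a gain of about the same size as the g-line to i-line move, while the step to 193 nm is larger.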
Table 5. DUV Sources and Lens Materials

Wavelength (nm)   Source                       Lens materials
250               Hg arc lamp                  Fused silica, CaF2, LiF
248               KrF excimer laser            Fused silica, CaF2, LiF
213               5th harmonic Nd-YAG laser    Fused silica, CaF2, LiF
193               ArF excimer laser            Fused silica, CaF2, LiF
157               F2 laser                     CaF2, LiF
The transmission of relatively impure CaF2, LiF, and fused silica can degrade dramatically after ~2 × 10⁶ pulses at 193 nm, worsening as the optical path lengthens. For nonhomogeneous and impure materials, irradiation in the ultraviolet can induce formation of absorptive color centers, accompanied by a reduction in transmission. Reduced transmission and transient heating cause a transient change in the material index of refraction. Impurities can play a role, as well as intrinsic point defects. Damage is caused by absorption at resident defects and by two-photon absorption.[159][160] When two photons are absorbed simultaneously, there is color center formation and physical compaction. Laser induced compaction induces stress birefringence and a change in optical path length from densification. Crystalline materials that are optically anisotropic, with axially different indices of refraction, are birefringent. Also, there can be polarization scattering of incoming light. Annealing of bulk starting materials can change the absorption coefficient to reduce the likelihood of color center formation but compaction stays constant. There is evidence that UV-induced compaction rates are not intrinsic to fused silica, but rather depend on fused silica chemistry and processing history, suggesting more damage resistant materials can be developed. It is projected that at low fluence (for
example, exposures of 20–50 mJ/cm²), using 193 nm light, the change in optical path length of high quality fused silica is negligible after ten production years. Solarization is highly nonlinear in response and is dose dependent. No reliable accelerated test is known. At the very least, this can limit the refractive lens thickness. A low absorption oil (e.g., silicon-based) can be used to couple optical elements.[200] The oil can provide a good optical index match between elements of disparate materials to reduce stray and scattered light. Potential problems include compatibility of the oil with the structural sealants used for the lens mount and oil transmission degradation due to DUV absorption.
Uniformity. The uniformity of illumination commonly is calculated by:
Eq. (98)    U(%) = (±100%) × (IHigh – ILow)/(IHigh + ILow)
where I is the intensity measured at different points in the field, with IHigh and ILow the highest and lowest readings. The measurements usually are done with an illumination meter and appropriate wavelength sensitive probe, but equivalent information also could be obtained by measuring resist film thickness after exposure and development (this is not preferred since the resist process can confound the results). Typical specifications call for uniformity to be within ±2.5%. Uniform illumination is obtained by passing light through a fly’s eye integrator or mixing light in a fiber bundle or kaleidoscope. For excimer laser sources, exposures also require multiple pulses. There are pulse-to-pulse variations in the energy per pulse, so there is a minimum desirable number of pulses for exposing each field.[196] Some sources of laser fluctuation noise include the discharge voltage, gas mix recipe, age of the gas fill, pulse repetition rate, head temperature, immediate operating history, and the state of maintenance of the laser discharge system.[201] This lower limit on the pulse count is related to the desired tolerance on linewidth control, which is a function of the aerial image quality and the material response of the photoresist. Because of the shorter wavelength of excimer lasers as well as their high peak intensity, a growing number of photo processes have been found to be nonreciprocal (i.e., the linewidth for a given total exposure energy is different for a single pulse of all the energy versus multiple pulses of equal fractions). Exposure control strategies which minimize the number of laser pulses can improve system throughput while allowing simpler, lower repetition rate lasers to be used. More importantly, laser maintenance intervals can be extended and reliability improved.[201] The drift of the mean
shot energy can be controlled well by feedback control of the laser discharge voltage. Continuous sampling of the pulse energy allows an updated energy-voltage transfer function to be calculated so any exposure dose requirement can be executed with exactly n (10 ≤ n ≤ 20 seems practical) pulses being fired. Systematic nonuniformity can be improved by insertion of an apodizing filter, which introduces a compensating gradient. Illumination uniformity depends on the partial coherence and is relatively constant for 0.4 ≤ s ≤ 0.65, but worsens quadratically for increased coherence.[139][140] However, the uniformity of the light can be held constant if the illuminator collimator is refocused onto the entrance pupil of the reduction lens after the partial coherence is adjusted.[141]
Partial Coherence. The pupil filling ratio (s) for the reduction lens can be determined empirically by measuring the condenser lens spot size at the entrance pupil plane.[72] The illuminator is moved aside and a resist covered wafer positioned at the entrance pupil plane is exposed. The developed spot size then can be used, along with the entrance pupil diameter, to calculate the ratio which is a metric for the partial coherence for the reduction lens. Most steppers operate with a partial coherence value of 0.5. Laser sources can emit light in a nearly coherent illumination mode due to the low divergence of the beam while keeping the energy efficiency high. Partial coherence can be obtained by moving the point source in the conjugate plane of the entrance pupil with respect to the condenser,[17] for example, by scanning or by divergence at the source (for example, by means of a fly’s eye element, which is an array of short focusing lenses).
Flare.
Flare, or background exposure, is the result of stray reflections and scattering from the mask or reticle, lens, and wafer combinations.[26][49][202] Flare increases the overall exposure level in low exposure areas and decreases it in high exposure areas. Reduction of incident illumination reflections is attained by proper optimization of the lens configuration and geometry. Antireflection coatings on lens elements help increase image brightness and eliminate scattering off surfaces of elements. Antireflection coatings on the mask or reticle and wafer surfaces also improve the image contrast. The internal lens barrel can be painted flat black to reduce reflections. Internal knife-edge stops lining the lens barrel can be used to eliminate internal low-angle reflections that no paint alone can stop. A method of measuring flare uses positive resist as a radiometer.[202] Theoretically, there should exist no exposure in large opaque regions of the mask or reticle, where large is assumed to mean at least 100 times the size of the Airy disk pattern. A brightfield mask (i.e., a chrome feature surrounded by a clear glass field) is used to create a plot of developed thickness
within opaque areas as a function of exposure. The plot should be constant if no flare exists.
10.5 Thin Film Interference and Standing Waves
Light has many complex interactions in thin film stacks due to reflectance at interfaces, transmittance, and absorbance, that are important in optical lithography for image formation. The photoresist system and its processing are critical since the exposure latitude depends on the latent image formation. Roughly, single layer resists, single layer resists with antireflection layers, and multilayer resists or top surface imaging require exposure latitudes of 30, 20, and 10%, respectively, for ±10% CD control. Imaging with the tighter exposure budget is gained at the expense of more costly or complicated processing and increased susceptibility to defects and particulates.
Thin Film Interference. Energy is coupled in the resist by the coherent interference effects under the air/resist interface due to a partially reflecting substrate which result in periodic intensity distributions in the direction perpendicular to the plane of the resist with a period λr/2 where λr is the exposure wavelength in the resist[13][203] (see Fig. 53). Where λ is the exposure wavelength in air and n is the real part of the resist index of refraction (N = n + ik), λr = λ/n. A film attenuation term in the imaginary part of the resist index of refraction is represented by k, and for transparent films is zero. This thin film interference results in a sinusoidal dependence of intensity upon the film thickness where the maximum intensity envelope (destructive interference) is given at a resist thickness (z) of:
Eq. (99)    zmax = (2m + 1)λ/4n
and the minimum intensity envelope (constructive interference) is given at:
Eq. (100)    zmin = mλ/2n
where m is an integer.[46][204] The variation of exposure energy needed to clear a resist film is described by the same equations and since the linewidth of resist patterns is related directly to the exposure dose, linewidths vary with a similar dependence upon resist thickness. The exposure time needed to just clear a resist film (E0) is at a maximum for a thickness corresponding to zmax. The exposure time-to-clear for a resist film is at a minimum for a thickness of zmin (interference data are illustrated in Fig. 54). Data for these
coupled energy curves are actually fit to a function containing a sinusoidal term (representing thin film interference) and a linear term (representing bulk film absorption) to account for both the swings and the rising nominal dose-to-clear as the resist thickens, as seen in Fig. 54.[205][206] Although the exposure dose requirement is less at a resist thickness corresponding to zmin, most processes are controlled near a nominal thickness of one micron around a centerpoint of zmax ± tens of angstroms. This ensures that all features will receive a dose adequate to clear the film, even with surface topographical variations, where the spun resist film thins out slightly at the shoulder of a nonplanar feature[207] to a thickness of zmin ≤ zlocal ≤ zmax. However, for resist films corresponding to thicknesses described by Eq. (100), the amplitude of the intensity distribution is minimized. The phase of the light formed by the superposition of the primary wave and the first reflected wave is opposite the phase of the light formed by the superposition of the second and third reflected waves at any resist position z (see Fig. 55). Resist thickness variations of λ/4n from zmin can alter the intensity of the coupled energy by the ratio Imax/Imin of 2.5 for a perfect reflector.[204] Regardless of the exact choice of resist film thickness, a significant increase in MTF is obtained by thinning the resist.[9] A comparison of broadband exposure in deep UV about 250 nm versus the very narrow bandwidth of a KrF excimer laser exposure indicates a relatively large difference in the reflectance and energy coupling in the resist due to thin film interference as a function of resist thickness.[208] The wavelength induced intensity smoothing reduces the amplitude of the energy coupling curve with thickness.
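The thicknesses given by Eqs. (99) and (100) can be tabulated with a short sketch. The g-line wavelength and the resist index of 1.65 are sample inputs, and the helper that picks the Eq. (99) thickness nearest a one micron target is a hypothetical convenience, not a procedure from the text.

```python
def interference_period_nm(wavelength_nm, n_resist):
    """Period of the coupled energy curve versus thickness, lambda/(2n)."""
    return wavelength_nm / (2.0 * n_resist)


def z_eq99_nm(m, wavelength_nm, n_resist):
    """Eq. (99): z = (2m + 1) * lambda / (4n)."""
    return (2 * m + 1) * wavelength_nm / (4.0 * n_resist)


def z_eq100_nm(m, wavelength_nm, n_resist):
    """Eq. (100): z = m * lambda / (2n)."""
    return m * wavelength_nm / (2.0 * n_resist)


def nearest_eq99_thickness_nm(target_nm, wavelength_nm, n_resist):
    """Eq. (99) thickness closest to a target film thickness."""
    quarter_wave = wavelength_nm / (4.0 * n_resist)
    m = max(0, round((target_nm / quarter_wave - 1.0) / 2.0))
    return z_eq99_nm(m, wavelength_nm, n_resist)


wavelength_nm, n_resist = 436.0, 1.65  # sample g-line values
print(f"period: {interference_period_nm(wavelength_nm, n_resist):.1f} nm")
print(f"Eq. (99) thickness nearest 1 um: "
      f"{nearest_eq99_thickness_nm(1000.0, wavelength_nm, n_resist):.1f} nm")
```

Adjacent Eq. (99) and Eq. (100) thicknesses are separated by λ/4n, about 66 nm for these inputs, which gives a sense of how tightly the coat thickness has to be held to stay near one extreme of the curve.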
Figure 53. A schematic representation of a typical wafer substrate is shown in (a). A schematic representation of incident and reflected waves during exposure of a photoresist layer deposited on a reflective substrate is shown in (b). (Reprinted with permission of Ref. 207.)
Figure 54. Thin film interference effect on the exposure time-to-clear.
Figure 55. Intensity distribution in photoresist layers (a) of thickness λ and (b) of thickness 5λ/4. The incident light intensity in both cases is I0. (Reprinted with permission of Ref. 207.)
Nonplanar surface topography causes local variations in the resist thickness and spurious reflections, leading to local critical dimension control problems. In general, the resolution of the resist pattern is degraded by using highly reflective films like Al, AlCu, Mo, and Ti. Films like SiO2, Si3N4, and poly-Si are much more transparent. At an exposure wavelength of 436 nm, the backward reflection from a flat substrate for silicide, poly-Si, and Al is 33, 23, and 80%, respectively. A 40% linewidth change for nominal 0.30 µm line/space pairs for a 200 Å resist film thickness variation on bare silicon is possible at an exposure wavelength of 248 nm, unless reflectivity is reduced.[143] The effect of reducing film reflectivity is reduction of the amplitude of the coupled energy curve versus thickness and of standing waves. The refractive indices of common resists are very close to those of SiO2 and Al2O3 so the reflections which occur at the resist/oxide interface are very weak, and to a good approximation, can be ignored.[204] This means the resist and oxide combination of films can be treated as a single film described by the equations for zmin and zmax above. The suppression of reflections is critical for maximizing imaging latitude. Variations in reflectivity among substrates affect the MTF with a decrease in resolution caused by diffuse reflection.[9][209][210] This adds significance to the grain size and hillock density of the metal module used in processing. The standing wave effect also depends on the exposure wavelength and the substrate since indices of refraction vary with wavelength. For example, standing waves are more pronounced at the silicon-resist boundary using i-line exposure relative to g-line because of a reflection of 38% versus 23%, respectively.[161] Silicon layers are expected to produce increased reflections at the 260 nm wavelength.[211] Less reflection of actinic radiation is expected off of aluminum at the 260 nm wavelength.
Also, the period of the thin film coupled energy curve with thickness is longer for longer wavelengths so there is more latitude in controlling film thicknesses before the interference effect is seen. The reflectance (R) at the interface of two nonattenuating films without an antireflection layer (ARL) for normally incident light depends only on the two films’ indices of refraction:
Eq. (101)    R = [(n1 – n2)/(n1 + n2)]²
For example, at the air/resist interface, n1 = nair = 1.0 and n2 = nresist = 1.65, so the reflectance (R) is 6%.
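Eq. (101) is simple enough to verify directly. The sketch below reproduces the air/resist value and, as an added illustration, evaluates a resist/oxide interface. The SiO2 index of 1.47 is an assumed typical value and is not taken from the text.

```python
def interface_reflectance(n1, n2):
    """Eq. (101): normal-incidence reflectance between two nonabsorbing films."""
    return ((n1 - n2) / (n1 + n2)) ** 2


# Air/resist interface with the indices used in the text.
print(f"air/resist: {interface_reflectance(1.0, 1.65):.1%}")

# Resist/oxide interface; n = 1.47 for SiO2 is an assumed typical value.
print(f"resist/SiO2: {interface_reflectance(1.65, 1.47):.2%}")
```

With these inputs the resist/oxide reflectance is more than an order of magnitude below the air/resist value, in line with the earlier statement that reflections at the resist/oxide interface can be ignored to a good approximation.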
The application of a quarter wave optical thickness (i.e., λ/4nARL) antireflection layer over the resist[212][213] can reduce the reflectance significantly:
Eq. (102)    R = [(n1n2 – nARL²)/(n1n2 + nARL²)]²
The reflectance will equal zero when:
Eq. (103)    nARL² = n1n2
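Eqs. (102) and (103) can be evaluated with a minimal sketch. The index values are those of the text's worked example, while the function names and the choice of 365 nm for the quarter wave thickness are illustrative assumptions.

```python
import math


def quarter_wave_arl_reflectance(n1, n2, n_arl):
    """Eq. (102): reflectance with a quarter wave ARL between media n1 and n2."""
    return ((n1 * n2 - n_arl ** 2) / (n1 * n2 + n_arl ** 2)) ** 2


def ideal_arl_index(n1, n2):
    """Eq. (103): ARL index giving zero reflectance, sqrt(n1 * n2)."""
    return math.sqrt(n1 * n2)


def quarter_wave_thickness_nm(wavelength_nm, n_arl):
    """Physical thickness of a quarter wave layer, lambda / (4 * n_ARL)."""
    return wavelength_nm / (4.0 * n_arl)


n_air, n_resist, n_arl = 1.0, 1.65, 1.4
print(f"R with n_ARL = 1.4: {quarter_wave_arl_reflectance(n_air, n_resist, n_arl):.2%}")
print(f"ideal n_ARL: {ideal_arl_index(n_air, n_resist):.3f}")
print(f"quarter wave thickness at 365 nm: {quarter_wave_thickness_nm(365.0, n_arl):.1f} nm")
```

The ideal index for a resist of n = 1.65 is near 1.28, so a practical coating with n = 1.4 leaves a small residual reflectance rather than driving it to zero.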
For example, a commercially available organic antireflection layer spun on resist for i-line applications has nARL = 1.4. For n1 = nair = 1.0 and n2 = nresist = 1.65, the reflectance (R) is 0.7%. This is a marked improvement compared to a process without this film. The value of candidate antireflection layers at any film interface can be found using Eqs. (101)–(103). The effective light intensity contributing to the exposure of the resist is expressed by:
Eq. (104)    I = I0 · (1 – R)
where I0 is the incident intensity and R is the reflectance.[214] For each feature shape, the critical dimension (CD) changes with exposure dose (E), according to the empirical relationship:[40]
Eq. (105)    CD = A · ln E + B
This leads to a simple comparative expression for different resists, processes, etc., based on photospeed:
Eq. (106)    ∆CD = k · ln(E1/E2)
The critical dimension variation ∆CD depends on energy coupling with thin film thickness as shown in Fig. 54. This is expressed as:
Eq. (107)    ∆CD = A · ∆E/E
where ∆CD depends on the exposure maximum (Emax) and exposure minimum (Emin). Therefore, the maximum CD variation, ∆CDmax, is:
Eq. (108)    ∆CDmax = A · {(Emax – Emin)/[(Emax + Emin)/2]}
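The link between the logarithmic CD model of Eq. (105) and the linearized swing expression of Eq. (108) can be checked with a brief sketch. The coefficient A = 0.2 µm and the doses of 90 and 110 mJ/cm² are hypothetical numbers chosen only to exercise the formulas.

```python
import math


def cd_um(dose, a_um, b_um):
    """Eq. (105): CD = A * ln(E) + B."""
    return a_um * math.log(dose) + b_um


def max_cd_variation_um(e_max, e_min, a_um):
    """Eq. (108): A times the peak-to-valley dose change over the average dose."""
    return a_um * (e_max - e_min) / ((e_max + e_min) / 2.0)


a_um, b_um = 0.2, 0.0           # hypothetical fit coefficients
e_min, e_max = 90.0, 110.0      # hypothetical dose extremes, mJ/cm^2

exact = cd_um(e_max, a_um, b_um) - cd_um(e_min, a_um, b_um)
linearized = max_cd_variation_um(e_max, e_min, a_um)
print(f"exact from Eq. (105): {exact:.4f} um")
print(f"linearized Eq. (108): {linearized:.4f} um")
```

For a dose swing of this size the two results agree to within about half a percent, which suggests why the simpler linear form is adequate for comparing processes.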
The swing ratio (S) of the curve shown in Fig. 54 is defined as the peak-to-valley change in exposure dose divided by the average exposure dose. Reducing the swing ratio minimizes the difference between Emax and Emin so the maximum CD variation is reduced.
Eq. (109)    ∆CDmax = A · S
The swing ratio (S) is:[213]
Eq. (110)    S = 4 (R1R2)^(1/2) · exp(–αz)
where R1 is the reflectance at the interface at the top of the resist and R2 at the bottom of the resist, α is the resist absorption coefficient (i.e., the sum of the molar extinction coefficients of the bleachable and nonbleachable components), and z is the resist thickness. Lowering the reflectance at resist interfaces lowers the amplitude of the coupled energy curve so the difference in effects seen between maxima and minima are minimized. Processes accomplishing this include, using a resist that remains highly absorbing after bleaching,[210][215] increasing the photoresist thickness, adding to the resist a nonbleaching dye absorbing at the actinic wavelength,[216] using a process incorporating an antireflective layer over the resist film,[212][213] using a process incorporating an antireflective layer under the resist film, or using a surface sensitive resist process.[217][218] As the resist contrast increases, the benefit of adding a dye decreases until it is negligible.[219] Unfortunately, addition of a dye increases the required exposure dose (lowering the stepper throughput) and degrades the resist profile. This increases the importance of the selection of the dye concentration. Multilayer lithography using a portable conformable mask[220] makes fine resolution possible with high aspect ratio features by combining an absorbing, planarizing bottom polymeric layer over which a thin imaging resist is coated. Bottom antireflective layers can be either organic coatings or inorganic films. Organic ARLs are convenient since they can be applied during the resist coat process, and then require no special processing other than what the resist receives. 
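The effect of the options listed above on the swing ratio of Eq. (110) can be compared with a small sketch. The reflectance values loosely follow figures quoted elsewhere in the text (about 6% at the air/resist interface, 0.7% with a top ARL, 80% for aluminum), but using them together here, along with the absorption coefficient of 0.5 per micron, is an illustrative assumption.

```python
import math


def swing_ratio(r_top, r_bottom, alpha_per_um, thickness_um):
    """Eq. (110): S = 4 * sqrt(R1 * R2) * exp(-alpha * z)."""
    return 4.0 * math.sqrt(r_top * r_bottom) * math.exp(-alpha_per_um * thickness_um)


baseline = swing_ratio(0.06, 0.80, 0.5, 1.0)      # no ARL, hypothetical alpha
with_top_arl = swing_ratio(0.007, 0.80, 0.5, 1.0)  # top ARL lowers R1
with_dye = swing_ratio(0.06, 0.80, 1.0, 1.0)       # dye doubles alpha

print(f"baseline:     S = {baseline:.3f}")
print(f"with top ARL: S = {with_top_arl:.3f}")
print(f"with dye:     S = {with_dye:.3f}")
```

Because the reflectances enter under a square root while absorption enters exponentially, a large cut in R1 and a moderate rise in α can give comparable reductions in swing, at the cost of the exposure dose penalty the text notes for dyed resists.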
The spin-on antireflective material is basically an actinic light-absorbing dye dissolved in a polyimide silane type resin.[221]–[223] Polymeric ARL benefits are limited when topography becomes very severe and undercutting of lines (causing lifting if the adhesion loss is severe enough) can be a process dependent problem for
features near half micron feature size. Inorganic ARLs include amorphous semiconductors such as silicon and selenium,[224] Ta-Si or TiN, and sandwiches of SiO2 and Si3N4.[225] For example, a quarter wave coating of either silicon nitride or silicon dioxide is applied to minimize the reflectance and a half wave coating of resist is used because it preserves the low reflectance and is easily controlled to have less than 100Å thickness variation.[26] Some antireflection data are graphed in Fig. 56.
Figure 56. Comparison of the effect on reflectivity for three antireflection layers on aluminum as a function of varying ARL thickness, t. (Reprinted with permission of Ref. 225.)
Candidate antireflective layers should have complex refractive indices n + ik with large ratios of n/k.[226] The imaginary part (ik) of the index of refraction gives rise to an absorption of light waves.[203] Reflectivity modeling is helpful for designing ARLs with optimized complex indices of refraction.[227] Inorganic ARLs require application of thin films on equipment that is not usually found in a photo bay, adding to process flow complexity but mitigated with cluster tool processing. Compatibility of the ARL over the reflective substrate can be a device reliability question (for example, amorphous silicon over aluminum, raising diffusion and precipitation issues, or TiN over an Al/Si/Cu alloy, which can aggravate corrosion after metal etch in a chlorine chemistry). Ta-Si or TiN are typical ARLs over Al
films since they are usually not removed. The stoichiometry of the sputtered TiN film can be adjusted so that a common thickness can be found for acceptable imaging using multiple actinic wavelengths (for example, both g-line and i-line).

There is a marginal improvement in linewidth variation versus resist thickness if contrast enhancement materials[228] are coated over either a low or high contrast resist.[219] Contrast enhancement materials (CEM) increase the aerial image contrast photochemically.[229] CEM is an optically dense polymer that is extremely photobleachable. A great advantage is obtained using a thermal image reversal process[230]–[232] since there is no change in the exposure time-to-clear as resist thickness changes. This process benefits layers with sharply varying topography where the resist thickness will change.

Standing Waves. The scalloped surface of resist profiles is due to standing waves. Standing waves in the resist film are caused by interference between the incoming light and light reflected from the substrate.[233] The standing waves along the resist edge cause a gradual transition in resist thickness from the base of the resist to its top. This lateral transition zone is proportional to λ/(2NA) since the magnitude of the oscillation reflects the extent to which the diffraction pattern penetrates into the geometrical shadow.[224] Reduction lenses of high numerical aperture can be used to reduce this transition zone.
High numerical aperture lenses reduce line width fluctuations versus resist thickness relative to low numerical aperture lenses, near the resolution limit.[219] The amplitude of the standing wave intensity is not dependent on z, the depth into the resist.[215] However, the effects of interference can be moderated by the resist process, by minimizing the reflectance at one of the two resist interfaces, by increasing the resist absorption coefficient through the addition of actinic dyes,[234] or incorporating a post exposure bake.[235] A post exposure bake is most effective at a temperature greater than or equal to the softbake temperature (i.e., the bake immediately after resist coat and before exposure), with an appropriately long bake time. The effectiveness of the post exposure bake for reducing the effects of standing wave interference depends upon the diffusion of the photoactive compound (it is the photoactive compound that inhibits unexposed diazonaphthoquinone, DNQ, resist dissolution in a developer). Resists that bind the photoactive compound to the resin will not receive optical interference relief from a post exposure bake. Standing wave effects are negligible in chemically amplified (a.k.a. acid catalyzed) resists[236] since a post exposure bake is incorporated. Although standing waves are minimized with a post exposure bake
process, the coupled energy curve with thin film thickness is unchanged. If there is sufficient mask bias, overexposure accompanied by overdevelopment can reduce standing wave linewidth modulations to less than 0.05 microns per edge.[129][215] Another simple way to minimize standing wave effects is to minimize the exposure time (for example, by increasing the develop time or developer concentration, to substitute chemical etching for photoactivity). Polychromatic exposure light can result in some wavelength induced intensity smoothing of standing waves, but for dimensions of most interest the benefits could be considered marginal.[204] It has been theorized that increased coherence of illumination increases the severity of the standing wave effect in resist.

10.6 Vibration

Resolution and depth of focus performance of steppers are affected adversely by excessive floor vibration. Sources of this vibration include building motion, building structural dynamics, acoustic buffeting from clean room air handling, and the dynamic interaction between the stepper and its vibration isolators.[25][237] Stepper resonant frequencies are usually designed around 100 Hz so there is no interference from servos. A vibration isolator is typically a resilient element, such as a metal spring or a rubber mount. However, reduction of machine vibration takes place only above the resonant frequency of the vibration isolator. Most steppers use pneumatic isolators. These isolators can achieve a low resonant frequency (~2–3 Hz), but they do not null out vibration actively, so around the resonance peak vibrations are actually amplified. Advanced isolators have electronic feedback control using servo control components that operate over an extended frequency range to actively damp vibrations.[237] Mask-to-wafer vibration has the effect of averaging out the intensity distribution over a given range of vibration amplitude.[238][239] The frequencies of interest are usually between 1 and 100 Hz.
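The behavior of a passive isolator described above can be illustrated with the textbook single-degree-of-freedom transmissibility formula. This is a minimal sketch and not a model of any particular stepper: the 2.5 Hz natural frequency is simply picked from the 2–3 Hz range quoted above, and the damping ratio of 0.1 is an assumption. In this idealized model the floor motion is amplified near resonance, and attenuation only begins above about √2 times the resonant frequency.

```python
import math


def transmissibility(f_hz, f_n_hz, zeta):
    """Transmitted/floor amplitude ratio for a damped spring mount.

    f_hz   : excitation frequency
    f_n_hz : isolator natural (resonant) frequency
    zeta   : damping ratio (dimensionless)
    """
    r = f_hz / f_n_hz
    num = 1.0 + (2.0 * zeta * r) ** 2
    den = (1.0 - r * r) ** 2 + (2.0 * zeta * r) ** 2
    return math.sqrt(num / den)


F_N = 2.5    # Hz, assumed value within the 2-3 Hz range quoted in the text
ZETA = 0.1   # assumed damping ratio, for illustration only

# Sweep the 1-100 Hz band of interest.
for f in (1.0, 2.5, 5.0, 10.0, 100.0):
    print(f"{f:6.1f} Hz  T = {transmissibility(f, F_N, ZETA):.4f}")
```

Values of T above 1 mean the isolator amplifies the floor motion at that frequency, which is the regime an active (servo controlled) isolator is intended to address.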
There is no unique MTF curve since the exposure time and vibrational frequency are coupled. The blur radius changes from the peak-to-peak sinusoidal displacement at high frequencies to a random process that depends on the time the exposure takes place at low frequencies. The frequency characteristics of the stepper site can be determined using a very sensitive accelerometer and performing a spectrum analysis of the signal. Measurements must be made for the x and y horizontal axes and
the vertical axis. The variability of the amplitude and phase of the frequency spectrum makes it difficult to describe vibration concisely. For sinusoidal motion, acceleration measurements are converted to displacements by dividing by the square of the angular frequency, (2πf)². Alternatively, a vibration histogram can be created which shows the percentage dwell time of the object at each position during the vibration.[240] The histogram data are collected from a given time function of the object by summing and normalizing the time the object stays within small position intervals. Vibration displacement also can be monitored with an inductive sensor, but the sensitivity is limited to about 10 nm. An inductive sensor is a rigid rod coupled to the outside frame that protrudes into a coil, disturbing its magnetic field and generating a signal.

Simulations[240] show that the depth of focus remains fairly constant with increasing amplitude of vibration until a threshold is crossed and the depth of focus suddenly ceases to exist. The optimum exposure changes rapidly with vibration amplitude. The depth of focus is highest with a partial coherence of σ = 0.4. However, σ = 0.6 is most tolerant to vibrations. The effects of vibration can be studied directly by observation of the aerial image.[52] The real-time nature of the technique allows the monitoring of image placement errors versus time with frequency response limited only by the bandwidth of the detector. It is possible that an image quality error assigned to astigmatism is actually due to vibration.

10.7 Miscellaneous Processing Issues

The resolution depends on both the aerial image quality and the absorption of light in the resist, which is wavelength dependent.[241][242] High exposure independent absorption (i.e., due to the base resin) contributes to degraded pattern profiles and loss of resolution. Photoabsorption is evidenced by decreased development rate with time in positive tone resist films.
However, increasing the resist transmission increases the standing wave effect in DNQ resists, necessitating relief by a post exposure bake or other means. After coating and baking, DNQ resists can achieve a resolution benefit by pre-exposure immersion in an aqueous alkaline developer.[243][244] A solubility inhibiting layer only a few molecules thick at the resist surface is formed of tri-ester PAC and high molecular weight novolak resin by the selective removal of low molecular weight phenolic species. The absence of a phenolic group on the DNQ tri-ester results in developer insolubility; typical photoactive compounds are of the DNQ mono- and di-esters since the free phenolic groups will dissolve in aqueous base.
The development rate can be determined by using the principle of thin film interference to monitor the sinusoidal changes in reflectivity.[245]–[247] Interference between light reflected from the top surface of the resist and the light reflected from the substrate creates fringes which are an in-situ record of the resist loss. The intensity is a function of thickness since the reflections add coherently and their relative phase depends on the remaining resist thickness. The thickness (t) of resist loss between a neighboring maximum and minimum (i.e., during the time of one half of a fringe) is: Eq. (111)
t = λ/(4n)
where λ is the inspection wavelength and n is the resist refractive index. The total thickness for a number of fringes divided by the respective development time for these fringes gives the develop rate. After the resist film has developed to the substrate, the reflected intensity becomes nearly constant. This is the end point. A constant or proportional over-development time is applied and then the develop is quenched. Unfortunately, standing waves in the latent image cause large changes in the instantaneous development rate, which can confound results. Under favorable conditions, the end point is more sensitive to process parameter variation than critical dimensions are, so improved control of parameters is possible, as well as an indirect method of monitoring critical dimension variation.

Application of multiple development sequences, called interrupted development, can improve DNQ novolak resist imaging.[248]–[250] Each sequence has a develop, rinse, and dry step. The mechanism is believed to involve creation of a thin insoluble film by a base-induced diazo coupling reaction between the resin and the unreacted photoactive compound, permitting more anisotropic developer etching. Replacing the dry step in the interrupted development process with a warm water intermediate development bake step can bring added latitude, with the mechanism believed to be similar to that induced by the dry step above.[250]

10.8 Industrially Accepted Designs

Figure 3 illustrates the lithographic equipment designs most commonly accepted by industry. In fact, the industry installed base is in the order of the figures shown. Fig. 35 shows more detail of the many elements in a refractive reduction lens assembly for a system shown in Fig. 1. The unity magnification Wynne-Dyson lens system is illustrated in more detail in Fig. 2. Discussions of the imaging theory behind these systems are given in previous sections.
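Returning briefly to the fringe-counting relation of Eq. (111) in Section 10.7, the develop rate calculation can be sketched as follows. The function names are hypothetical, and the inspection wavelength (633 nm), resist index (1.64), and extremum times are assumed values chosen only to make the arithmetic concrete. The input is the list of times at which successive reflectivity extrema (alternating maxima and minima) are observed, so each interval corresponds to one half fringe, i.e., a thickness loss of λ/(4n).

```python
def thickness_per_half_fringe_nm(wavelength_nm, n_resist):
    """Resist loss between a neighboring maximum and minimum, Eq. (111)."""
    return wavelength_nm / (4.0 * n_resist)


def develop_rate_nm_per_s(extremum_times_s, wavelength_nm, n_resist):
    """Average develop rate over a run of successive reflectivity extrema."""
    if len(extremum_times_s) < 2:
        raise ValueError("need at least two extrema to define a rate")
    half_fringes = len(extremum_times_s) - 1
    elapsed_s = extremum_times_s[-1] - extremum_times_s[0]
    if elapsed_s <= 0:
        raise ValueError("extremum times must be increasing")
    total_loss_nm = half_fringes * thickness_per_half_fringe_nm(wavelength_nm, n_resist)
    return total_loss_nm / elapsed_s


# Assumed example: four extrema seen at 2, 5, 8, and 11 s with a 633 nm probe.
rate = develop_rate_nm_per_s([2.0, 5.0, 8.0, 11.0], 633.0, 1.64)
print(f"average develop rate: {rate:.1f} nm/s")
```

Because standing waves in the latent image modulate the instantaneous rate, averaging over several half fringes, as done here, is likely to be more robust than using a single interval.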
The step-and-scan systems are experiencing rapid acceptance by the industry as state of the art fabrication of integrated circuits extends below a quarter micron in feature size. These systems can accommodate increasingly large die without sacrificing resolution via the pixel tradeoff for field size discussed at the start of this section. A field length of 32.5 mm is scanned using an image slit for standard 6" × 6" reticles, and field lengths of 50 mm are possible with 9" reticles. An aspheric mirror is used with a beam splitting cube in the imaging optics. The polarizing beam splitter cube is sized in proportion to the numerical aperture. The purpose of the cube is to conserve light. Light goes through the beamsplitter, through a quarter wave plate to polarize the light, to the aspheric mirror, back through a quarter wave plate to polarize it in the other orientation, and again through the beamsplitter, with almost 99% efficiency.

11.0 PRACTICAL IMAGE PLACEMENT

The placement or registration of a mask or reticle die pattern to a previous wafer die pattern is critical to integrated circuit performance. The main sources of registration errors are alignment, stepper induced field errors (only some of which are correctable), mask errors, and process induced errors (typically a magnification error caused by a high temperature or stressful film deposition). The results obtained from bench testing a lens do not necessarily guarantee the practical imaging quality obtained when a stepper is used daily.

11.1 Alignment

Alignment is the registration of one field defined by the mask or reticle pattern to the previous fields on the wafer after a survey of the locations of alignment targets on the previous layer. Alignment errors include a nonzero mean and a finite field-to-field distribution of the registration errors. Alignment variation is caused by stage imprecision and signal processing errors.
Signal processing errors are being reduced through different choices of algorithms that determine the alignment target position, digital signal processing,[253] various noise reduction techniques, improved understanding of the statistical error in position measurement, and higher resolution hardware.

Mapping or Enhanced Global Alignment and Field-By-Field Alignment. Alignment variation is the sum of the nested variances of stage imprecision and signal processing errors.
Eq. (112)
$\sigma_{\text{align}}^2 = \sigma_{\text{stage prec}}^2 + \sigma_{\text{signal proc}}^2$
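A small numerical sketch of Eq. (112) may help show how the two contributions combine. The function name and the sigma values are assumptions for illustration only; the point is that the variances add, so the standard deviations combine as a root sum of squares and the larger term dominates.

```python
import math


def alignment_sigma(sigma_stage_prec, sigma_signal_proc):
    """Combined alignment sigma from Eq. (112): variances add."""
    return math.hypot(sigma_stage_prec, sigma_signal_proc)


# Assumed example values in nm (not from the text).
baseline = alignment_sigma(30.0, 40.0)
better_stage = alignment_sigma(15.0, 40.0)      # halve the smaller term
better_signal = alignment_sigma(30.0, 20.0)     # halve the larger term

print(f"baseline:                 {baseline:.1f} nm")
print(f"stage precision halved:   {better_stage:.1f} nm")
print(f"signal processing halved: {better_signal:.1f} nm")
```

Under these assumed numbers, halving the larger contribution reduces the combined sigma more than halving the smaller one.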
Stage precision (including mechanical, electrical, and metrological noise contributions) is important since most steppers survey only a few sites on the wafer, calculate a wafer grid model, and blind step the pattern. This mapping procedure minimizes the alignment time and helps improve wafer throughput. An advantage of this procedure is that the signal processing noise (including algorithm errors, signal asymmetry, bit-wise errors in signal intensity detection, and other detector and electrical noise) is averaged out over the wafer. Alignment of every field (field-by-field) is an advantage only if there is no post-alignment motion (so stage errors are zero) and there is low signal processing noise. Mapping only two sites on a wafer is called global alignment. Mapping more than two sites is called enhanced global alignment. For optimum device performance, it is usually desirable to align a current layer directly to the previous layer (for example, aligning first metal to contact). In this case, alignment to some earlier layer (but not the immediately previous one) would introduce an alignment variation since a translation error of the desired previous pattern would be unaccounted for. In circumstances where there are strong interaction effects, it is desirable to align several sequential layers to the same earlier layer.

Stage Precision. Stage precision is improved by more accurate and stable laser interferometers, referencing an increased number of stage axes, making a transition from analog to digital stages for improved mechanical motion control of acceleration, velocity, and damping, reducing column and stage vibrations, and improving the mechanical stage design and motion efficiency. A displacement measuring interferometer is used to measure the position of the stage.
There are many error components of laser interferometer system accuracy and repeatability.[254][255] The laser has some type of frequency stabilization to maintain the wavelength accuracy and repeatability. The electronics error is equal to the least resolution count and is the measurement resolution set by the quantization error of the electronic counter in the system. Optics nonlinearity introduces an error since the polarized components of the light cannot be separated perfectly, causing an optical path length change. Atmospheric compensation is best provided by a differential refractometer, where changes in the air’s index of refraction are measured. This is superior to measuring the air temperature and pressure and the relative humidity and relying on a model expression for the index of refraction of air. The material thermal expansion is compensated with information about temperature changes and the material coefficient of
linear thermal expansion. Optics thermal drift is a change in optical path length with temperature. The deadpath error is the difference in the optical path length of the reference and measurement beams at the baseline position. An Abbe error is a parallax error which can be minimized by referencing the stage laser interferometer closer to the actual wafer plane reducing tilt (θ) induced translation errors: Eq. (113)
Abbe error = offset distance × tan (θ)
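To give a feel for the magnitudes in Eq. (113), the sketch below evaluates the Abbe error for an assumed offset and tilt. The function name, the 10 mm offset, and the 1 µrad tilt are illustrative assumptions, not values from the text.

```python
import math


def abbe_error_nm(offset_mm, tilt_urad):
    """Abbe error from Eq. (113): offset distance x tan(theta), returned in nm."""
    offset_nm = offset_mm * 1.0e6          # mm -> nm
    theta_rad = tilt_urad * 1.0e-6         # microradians -> radians
    return offset_nm * math.tan(theta_rad)


# Assumed example: measurement axis 10 mm from the wafer plane, 1 urad stage tilt.
print(f"Abbe error: {abbe_error_nm(10.0, 1.0):.2f} nm")

# Moving the interferometer reference closer to the wafer plane shrinks the error.
print(f"Abbe error: {abbe_error_nm(1.0, 1.0):.2f} nm")
```

For small tilts tan(θ) ≈ θ, so the error scales linearly with both the offset distance and the tilt angle.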
A cosine error is a misalignment of the measurement axis defined by the laser beam relative to the mechanical axis of motion. This error results in a scale length error between the measured distance and the distance actually travelled. The earliest steppers used two axis interferometers to track stage position. The two axes monitored were the x and y axis. Three axis interferometers permit measurement of x and y displacement and stage yaw (θz), which is an azimuthal rotation. Five axis interferometers can measure x, y, stage yaw (θz), stage pitch (θx), and stage roll (θy). A five-axis laser interferometer is needed to track accurately the wafer’s position after local leveling so stage motions don’t contribute to misregistration (if there is any post-alignment stage motion before exposure) due to Abbe errors. A sixth axis can be used to track mirror nonflatness, which can represent a relatively large systematic error source. Stages use some form of rail and bearing combination to provide guidance along the x and y axes, or electromagnetic drive,[114] or servocontrolled friction drives.[256] Needle roller bearings and crossed roller bearings are common and have many performance advantages. Many vendors have experienced very good success with air bearings since they have frictionless motion, exceptional linearity, and avoid lubrication, but they can be susceptible to dust and particulates and maintenance is relatively difficult. Motors are usually brushless torque motors or linear motors, although thermal shielding must be provided for the latter. It is important that all heat sources be placed in isolation or downstream of the interferometer. 
Fine stage motion is usually made by a flexure stage driven by linear variable differential transformers (LVDTs) or piezoelectric transducers.[257] These stages are of increasingly monolithic design to lower the mass, raise stiffness, elevate the first resonant frequency, and improve the reliability by reducing the number of internal components. Increasing the natural stage resonance allows higher servo bandwidths. Mounted mirrors are being
replaced with in-situ mirrors on the stage surfaces,[114][256][258][259] and their flatness is being incorporated into stage positioning calculations. Low thermal expansion materials, such as Invar and Zerodur,[260]–[263] are candidate stage materials, along with magnesium and aluminum metals. Invar and Zerodur have similar coefficients of thermal expansion. Invar is a metallic alloy that is more easily machinable, while highly polished surfaces can be formed in the glass ceramic Zerodur. Magnesium has lower density than aluminum with better damping properties.[259] An air shower placed over the laser interferometer and the wafer stage area minimizes temperature variations and gradients in the stage environment.[264] This reduces turbulence and stage noise.

Interfield Model. One interfield model describing the overlay of the current wafer grid to a previous one is:[265]–[267]

Eq. (114)
$d_{xf} = d_{xw} + x_w M_{wx} - y_w \phi_{wx} + y_w^2 D_{2x} + R_{wx}$
Eq. (115)
$d_{yf} = d_{yw} + y_w M_{wy} - x_w \phi_{wy} + x_w^2 D_{2y} + R_{wy}$
Eq. (116)
$\phi_{fs} = \phi_f + y_w M_{w\phi} - 2 x_w D_{2y} + R_{w\phi}$
where dxf and dyf are field translation values, φfs is intrafield rotation, dxw and dyw are wafer to field translation values, xw and yw are the coordinates in the wafer axis system, Mw is wafer scaling, φw is wafer rotation, D2 is stage bow, φf is die rotation, and Rw are the interfield residuals. With a software controlled x-y-θ stage, every systematic grid error is correctable. Figure 57 diagrams some of these errors.

The stages of all the steppers are matched relative to an artifact wafer. A reference stepper is selected and its interfield errors are minimized. A first layer pattern is exposed on a wafer, this wafer is rotated 90°, and the wafer is aligned. Software corrections are made to make the stage motions orthogonal (the rotation will double the real orthogonality error), and the x and y steps are made equidistant by adjustment of the y scaling software value. All other stepper stages are referenced to an artifact wafer exposed on this reference stepper.

Alignment Signal Collection. Alignment system configurations are about as varied as the choice of stepper models.[268][269] The most common alignment types use either brightfield, darkfield, or phase grating methods (see Fig. 58).
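The interfield model of Eqs. (114)–(116) can be evaluated directly once the grid terms are known. The sketch below is a forward evaluation only (it does not fit the terms to measurements), and the class name, field names, and unit choices are assumptions made for illustration: wafer coordinates in mm, translations and residuals in nm, scaling in ppm, rotations in µrad, and stage bow in nm/mm². With these units each product in Eqs. (114) and (115) comes out in nm, since 1 mm × 1 ppm = 1 nm and 1 mm × 1 µrad = 1 nm.

```python
from dataclasses import dataclass


@dataclass
class GridTerms:
    """Systematic interfield terms; names and units are illustrative assumptions."""
    dxw: float = 0.0      # wafer-to-field translation in x, nm
    dyw: float = 0.0      # wafer-to-field translation in y, nm
    m_wx: float = 0.0     # wafer scaling in x, ppm
    m_wy: float = 0.0     # wafer scaling in y, ppm
    phi_wx: float = 0.0   # wafer rotation term for x, urad
    phi_wy: float = 0.0   # wafer rotation term for y, urad
    d2x: float = 0.0      # stage bow in x, nm/mm^2
    d2y: float = 0.0      # stage bow in y, nm/mm^2
    phi_f: float = 0.0    # die rotation, urad
    m_wphi: float = 0.0   # rotation scaling term, urad/mm


def field_errors(xw_mm, yw_mm, t, r_x=0.0, r_y=0.0, r_phi=0.0):
    """Return (dxf, dyf, phi_fs) at wafer position (xw, yw) per Eqs. (114)-(116)."""
    dxf = t.dxw + xw_mm * t.m_wx - yw_mm * t.phi_wx + yw_mm ** 2 * t.d2x + r_x
    dyf = t.dyw + yw_mm * t.m_wy - xw_mm * t.phi_wy + xw_mm ** 2 * t.d2y + r_y
    phi_fs = t.phi_f + yw_mm * t.m_wphi - 2.0 * xw_mm * t.d2y + r_phi
    return dxf, dyf, phi_fs


# Assumed example: 1 ppm x scaling, 2 urad rotation, slight stage bow in x.
terms = GridTerms(m_wx=1.0, phi_wx=2.0, d2x=0.01)
print(field_errors(100.0, 50.0, terms))
```

In practice the terms would be estimated from the surveyed sites, typically by a least squares fit, with whatever the model cannot absorb left in the residuals.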
Figure 57. Interfield errors.
Figure 58. Schematic of (a) partially coherent bright field, (b) darkfield, and (c) first order diffraction interferometric alignment systems. F is the field stop at the reticle/wafer image and I is the imaging lens to the detector. (Reprinted with permission of Ref. 269.)
Darkfield. A wafer target scatters the incident light. The brightfield specular light, which returns within the same cone as the incident illumination, is not collected by the darkfield alignment optics. Darkfield light scattered off the target edges outside this cone of illumination is collected by the reduction lens, and some is directed to a detector such as a photodiode or photomultiplier tube for signal processing.[270]–[272] When diffraction gratings are used,[273]–[275] a beam of light incident on the mask or reticle transmission grating is split into several beams which are collected by the projection lens and reimaged on the wafer reflection grating target. Besides the specular reflection, light will be diffracted in many directions. If a periodic structure (multiple line/space pairs) is used for the alignment target, the light is diffracted only into a few well confined directions, which are the diffraction orders of the grating. Generally, only the first diffracted order is collected. The higher orders have increasingly lower signal-to-noise ratios, so it is difficult to improve the information content by collecting them. Darkfield alignment can be adversely affected by large, reflective grain boundaries. Darkfield alignment errors are relatively small for resist film asymmetries over the target, and there is reasonable insensitivity to alignment optical path aberrations.

Brightfield. Brightfield alignment collects all reflected light. Brightfield is attractive due to advancing camera technology. Resolution can be improved by increasing alignment objective magnification and decreasing the pixel size of the charge coupled device (CCD) cameras. Optical fiber bundles permit remote location of any heat sources away from the align optical path so thermal effects don’t contribute to systematic or variable offsets.
Brightfield can suffer on wafers with topography covered by resist.[271] Simulations[269] indicate that brightfield alignment systems typically have high intensity, but the signals can be low contrast, and brightfield can be susceptible to relatively large errors caused by small angular asymmetries of the resist coat and to uncorrected alignment optical path aberrations.

Interference or Phase Grating Method. A periodic structure (multiple line/space pairs) is used for the alignment target, so the light is diffracted only into a few well confined directions, which are the diffraction orders of the grating. If the zero order and diffracted orders higher than +1 or –1 are blocked by a spatial filter, the image always has a sinusoidal shape with
100% contrast. If the marks are degraded by planarization or thin film effects or are physically impaired, the amplitude of the diffracted signal can be lowered, but the positional information determined by the period of the grating is the same. Alignment systems which use the first diffracted orders of the light scattered by a phase grating for heterodyne alignment are relatively insensitive to resist coat asymmetry and to defocus and coma effects.

Through the Lens. Most alignment systems pass light through the lens assembly of the stepper. Off axis objectives have a difficult time maintaining the distance to the center of the optical axis, so relatively frequent compensation checks must be performed if they are used. Differences in material thermal expansion coefficients and physical displacement are the most common problems in stabilizing the position of the alignment objective. Any of the three types of alignment systems (darkfield, brightfield, or diffraction) can be used with a through the lens stepper design.

Direct Mask or Reticle Reference. It is desirable for the stepper to reference the mask or reticle directly to the alignment target on the wafer so that mask or reticle translation errors are normalized. Because of symmetrical field errors, the reference should either be on the optical axis (which is rarely practical from a semiconductor manufacturer’s perspective since this area is preferentially occupied by a product die) or symmetrical about the field at two or more sites. A single reference point at the mask or reticle field edge can be susceptible to relatively large field errors. Alignment translation errors may arise since the center of the mask or reticle position (corresponding to the optical axis) may not be accurately known (minimizing the advantages of direct mask or reticle alignment). Any of the three types of alignment systems (darkfield, brightfield, or diffraction) can be used with a direct mask or reticle reference stepper design.
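For the phase grating method described earlier, the statement that a weaker diffracted signal still carries the same positional information can be illustrated with an idealized sinusoidal image. In this sketch the position is recovered from the phase of the fundamental spatial frequency; the function names, the image period, the sampling, and the mark offset are all assumptions, and the signal is noise free. The sampling window is chosen to span a whole number of periods so that the constant term drops out of the sum.

```python
import cmath
import math


def grating_signal(xs, period, x0, amplitude):
    """Idealized sinusoidal image with 100% contrast, shifted by x0."""
    return [amplitude * (1.0 + math.cos(2.0 * math.pi * (x - x0) / period)) for x in xs]


def position_from_phase(xs, samples, period):
    """Mark position (modulo one period) from the phase of the fundamental."""
    acc = sum(s * cmath.exp(-2j * math.pi * x / period) for x, s in zip(xs, samples))
    return (-cmath.phase(acc) / (2.0 * math.pi) * period) % period


PERIOD = 8.0                            # assumed image period, arbitrary units
xs = [0.25 * k for k in range(128)]     # exactly four periods of samples
strong = grating_signal(xs, PERIOD, 0.3, 1.0)
weak = grating_signal(xs, PERIOD, 0.3, 0.2)    # degraded mark: lower amplitude

print(position_from_phase(xs, strong, PERIOD))
print(position_from_phase(xs, weak, PERIOD))
```

Both calls should return essentially the same position, since scaling the amplitude changes the magnitude of the fundamental but not its phase. With real signals, noise would make the weaker case less precise.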
Wavelength. Alignment using the actinic wavelength has an advantage since the exposure optics are already corrected chromatically, which helps stabilize alignment mean shifts. Alignment with monochromatic light subjects the signal to thin film interference effects, so photoresist thickness and topographical variations exert an influence. Asymmetry of the alignment signal can result in an alignment grid error. Process optimization can improve the target visibility. On substrates that are neither highly reflective nor strongly scattering, removal of the resist by local exposure and development or by laser photoablation[182] can improve alignment accuracy and precision. Alternatively, prior to actinic alignment, bleaching of the resist covering the target can improve the
signal-to-noise ratio.[276] On substrates that are highly reflective or that scatter the alignment light strongly, addition of a dye that absorbs at the alignment wavelength can help absorb scattered light so the signal-to-noise ratio is raised.[277] If this dye is not absorbing at the actinic wavelength, there will be no deleterious effects on the resist profile. To improve target visibility by reducing resist absorption, the alignment wavelength can be lengthened. However, since this alignment wavelength is nonactinic, chromatic aberration and astigmatism will appear and only sagittal aberrations can be corrected.[151][268][278] The alignment optical focal length must be corrected separately, usually with a separate alignment optical path. Residual chromatic aberration of a wideband alignment source blurs the images for the wavelengths farthest from correction in a through-the-lens design. The composite of blurred and sharp images results in attenuated edge definition and causes alignment inaccuracies. For nonactinic alignment wavelengths, radial orientation of the alignment targets is preferred since this orientation remains well focused. The use of multiple wavelengths can improve the composite signal symmetry by minimizing the signal intensity fluctuations from the individual wavelengths caused by optical interference effects.[154][278]

Alignment Signal Processing. The quality of the alignment signals influences the quality of the registration directly.

Symmetric Signals. Analysis of symmetric signals can be done accurately by many methods.
Some of these include: peak intensity, correlation methods, digitization of the time varying signal followed by integration to find the centroid, taking one or more threshold slices of the image to find the midpoint and averaging those to find the centroid, first derivative analysis of the edges followed by centroid analysis of the two derivative distributions and averaging to find the center of the signal, second derivative analysis, and analysis for maximum symmetry by averaging the signal and the signal folded upon itself.

Asymmetric Signals. Alignment errors are strongly rooted in asymmetric signals. Without additional information about the source of the asymmetry, the true position of the alignment mark is uncertain. Most signal processing assumes symmetry for analysis, and registration analysis records any error as an offset error (i.e., a mean shift), as a relatively large residual error in the models of Eqs. (114)–(116) for enhanced global alignment, or otherwise as a source of variation. Generally, algorithms for asymmetries simply don’t exist. Feedback on the effect of asymmetries is obtained from overlay measurements after alignment is completed (for example, from verniers or box-in-box structures), but this doesn’t prevent reworks.
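Two of the symmetric-signal methods listed above, the integrated centroid and the threshold slice midpoint, can be sketched on a synthetic signal to show why asymmetry is troublesome. The triangular signal shape, its edge widths, and the function names are hypothetical and chosen only for illustration; real alignment signals are more complex. On the symmetric signal both estimators agree, while on the asymmetric one they disagree with each other and with the peak location, which mirrors the uncertainty described in the text.

```python
def centroid(xs, ys):
    """Intensity-weighted mean position of a sampled signal."""
    total = sum(ys)
    return sum(x * y for x, y in zip(xs, ys)) / total


def threshold_midpoint(xs, ys, level):
    """Midpoint between the first rising and the last falling crossing of `level`."""
    left = right = None
    for i in range(len(xs) - 1):
        x0, x1, y0, y1 = xs[i], xs[i + 1], ys[i], ys[i + 1]
        if left is None and y0 < level <= y1:      # first rising crossing
            left = x0 + (level - y0) / (y1 - y0) * (x1 - x0)
        if y0 >= level > y1:                       # keep the last falling crossing
            right = x0 + (y0 - level) / (y0 - y1) * (x1 - x0)
    if left is None or right is None:
        raise ValueError("signal does not cross the threshold on both sides")
    return 0.5 * (left + right)


def triangular_signal(xs, center, left_width, right_width):
    """Hypothetical mark signal: a peak at `center` with independent edge widths."""
    ys = []
    for x in xs:
        width = left_width if x < center else right_width
        ys.append(max(0.0, 1.0 - abs(x - center) / width))
    return ys


xs = [float(i) for i in range(21)]
symmetric = triangular_signal(xs, 10.0, 5.0, 5.0)
asymmetric = triangular_signal(xs, 10.0, 5.0, 2.0)   # steeper right edge

for name, ys in (("symmetric", symmetric), ("asymmetric", asymmetric)):
    print(name, centroid(xs, ys), threshold_midpoint(xs, ys, 0.5))
```

Neither estimate on the asymmetric signal can be called correct without knowing what caused the asymmetry, so the sketch only demonstrates the disagreement, not a remedy.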
Optical interference techniques, which analyze information in the spatial frequency domain (i.e., by using pitch information) instead of collecting edge information, have some advantages with asymmetries but not complete immunity.

Implementation of an expert system technique known as case-based reasoning can improve alignment with asymmetric signals. Case-based systems borrow from a wide range of disciplines including Boolean logic, rule-based technology, pattern matching, statistical weighting, and natural language processing. Case-based systems are expert systems that can enable alignment signal processing to learn. Using this technique, a signal is processed conventionally, generating initial estimates of the mark position. The alignment signal is then compared to an indexed case library of signals, each corresponding to registration offset data. The most closely matched set of signals in the library is retrieved. If there is an exact signal match, the historical offset is added to the initial estimate of the mark position and the alignment is concluded; in this case there is no new learning by the system, i.e., no new signal is stored in the library. If there is no exact signal match, the closest signal is matched and the historical offset is added to the initial estimate of the mark position. The alignment continues, the signal is stored in an engineering file pending feedback on the registration, and then the new signal and offset information are added to the library. Alignment signals can be compared to historical case signals by analog or digital means such as pattern recognition, or Fourier analysis can reduce the signal to its frequency components. Either recognition scheme provides the means to objectively correlate a new signal with a library of old signals, so the accompanying offset information can be applied prior to alignment.

Target Design. Alignment target designs are of almost all imaginable types.
The shapes can be crosses, chevrons, bars, islands, concentric circles (for example, Fresnel zone targets that form a point focus which moves with the wafer), or diamonds. The general shape depends upon the design of the stepper’s alignment system that illuminates the target and the method by which diffracted or scattered light is collected for signal processing. In some cases, the density of the target (i.e., mesa or raised features versus trench or depressed features) may not be negotiable (for example, device structures of deep isolation trenches would preclude mesa targets since these would act like picket fences, interfering with the resist coat) or could depend upon the reflectivity of the deposited substrate and scattering of light off the field. Aspect Ratio. The aspect ratio is defined here as the ratio of the length to the width of a feature. If it is possible, features of high aspect ratios are
favored over low ones because there is more focus and exposure latitude. For example, targets consisting of small trench unit aspect ratio (for example, contact) features (that are square on the mask) will image as something between a square and a circle under good focus conditions. Under defocused conditions, the images of the contact features will degrade to football or pear shapes. Light scattered off these target features can yield an asymmetric signal which will cause an alignment inaccuracy. However, an advantage of the unit aspect ratio targets is the favorable resist flow characteristics, since there is no long edge to block resist flow and allow asymmetric resist buildup. By comparison, alignment targets of the same density made of dashes (i.e., the length of the target is stretched so the aspect ratio is raised) will have improved focus latitude since a defocused image will yield cigar shaped targets. Although the target ends may not be ideal, the edges will remain straight and light scattered off these edges will continue to yield symmetrical signals. Another benefit is that the higher aspect ratio of dash targets relative to contact targets results in more exposure latitude. Dashes permit favorable resist flow also.

Resist Coat and Planarization. If resist is coated over a row of mesa features (i.e., features that protrude above the substrate), the resist film thickness is symmetrical with respect to the mesa feature edges for all of the features except the edge ones.
The edge features tend to have film thicknesses that are thicker on one side than the other.[279] Oscillations in the steady state resist film coat profile originate in the shape of the underlying feature since it is the underlying feature shape that determines the nature of the capillary force when profiles are approximately conformal.[280] Asymmetry at coat is due to the different sign of the capillary force on one side of the feature relative to the other side, which produces a dip on the upstream side and a bump near the downstream side. Asymmetry in baked resist film profiles is caused by a combination of the asymmetry at coat and an amplification of these distortions due to baking. Changes in the phase component of the intensity dominate, and the resulting asymmetric alignment signal causes an alignment shift.[269][281][282] Careful characterization of the resist spin process can yield a planarizing coating over alignment target features.[281] Alternatively, planarization structures in proximity to alignment targets act to minimize alignment variation caused by asymmetrical photoresist coating of alignment target features.[283] Asymmetric resist pileups on target features are prevented by dummy border pattern placement (which can be a shape identical to one of the target features), helping to stabilize alignment shifts caused by
asymmetric alignment light absorption, asymmetric signal reflection, and local variations in the resist’s index of refraction after partial bleaching.

Dependence on Field Errors. Alignment data must be collected by the stepper for signal processing so that field errors do not interfere. For this reason, the target site on the field should be blocked (i.e., a scattergram of locations in the field that are aligned should indicate either a single point or symmetrically located misregistration vectors in the field). Since field errors are minimized at the center of the lens, the center of the field is optimum for placement of alignment targets.[284] This may be an option if there are multiple die in the exposure field since there may be a central scribe lane. If it is impossible to position the wafer alignment target at the center of the field, a pair of targets located at the field edge on a diagonal through the optical axis is a good compromise. Symmetric field errors will be compensated for so they don’t adversely influence the translation value. Alternatively, the field error contribution to wafer y alignment can be minimized if the alignment target for y is positioned at the edge of the field in x but near the origin of the field’s y axis.[285] The x alignment target is positioned similarly in the field. Collection of alignment data from across the wafer or across a field allows in-situ correction of some field errors. Wafer scaling, or magnification, caused by high temperature processing or stressful film deposition represents a physical expansion or contraction of the field, also. Isotropic effects can be corrected easily. Field rotation is correctable with data from only two points in the field. With a correction for translation and these two field errors corrected in-situ, total registration is improved.

11.2 Field Errors

Field errors become a very important family of image placement variation for feature sizes near a micron or smaller.
The ideal image field is rectangular and described perfectly by the mask or reticle. Under real conditions, field errors contribute to a distortion of the rectangular field.[286][287] Some of the field errors are correctable and others are fixed by the design or manufacture of the stepper. Some examples of field errors are shown in Fig. 59. Geometric Model. The analysis of multiple image field placement deviation values using a least squares fit to a geometrical model allows characterization of the individual errors.[265][267][286]–[288] One such model is:
Eq. (117)
$dx = T_x - R y + M x + K_x x^2 + K_y x y + x r^2 D_3 + x r^4 D_5$
Eq. (118)
$dy = T_y + R x + M y + K_y y^2 + K_x x y + y r^2 D_3 + y r^4 D_5$
where:
$r^2 = x^2 + y^2$
Figure 59. Examples of (a) symmetrical and (b) asymmetrical intrafield errors.
In the model, dx and dy are the component image displacement values at various field positions, T is the field translation, R is the field rotation, M is the magnification of the field, K is keystone, D3 is third order distortion, and D5 is fifth order distortion. The signs preceding the rotation (R)
coefficients in Eqs. (117) and (118) are opposite for the x and y components since their directions are orthogonal to each other. It is reasonable to expect the translation, rotation, and magnification values after regression would be identical. If Eqs. (117) and (118) are solved separately, this probably will not result and some type of averaging will be necessary.[267] If the equations are coupled then a unique solution will result. The matrix for coupled equations is shown in Table 6.

Lens matching compares the intrafield errors of different lenses and is defined generally as the correctable errors subtracted from the measured errors. The unit magnification folded catadioptric Hershel-Wynne-Dyson [165]–[167] design is telecentric on both the image and object side. In a real lens, there are random field errors, magnification errors, and trapezoid errors. There is no provision for the correction of magnification and trapezoid errors, so these must be treated as uncorrectables. The only correctable field errors are translation and rotation. Distortion of the prism surfaces due to stress induced during the stepper’s assembly can be responsible for the magnification and trapezoid errors.
Table 6. Matrix for Coupled Equations
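The coupled solution of Eqs. (117) and (118) can be illustrated with a short least squares sketch. It is not the handbook's Table 6 matrix; it is a minimal illustration that assumes a single shared set of coefficients (Tx, Ty, R, M, Kx, Ky, D3, D5), field coordinates normalized to roughly ±1, and a plain normal-equation solve. The function names and the column ordering are hypothetical choices made for this example.

```python
NAMES = ("Tx", "Ty", "R", "M", "Kx", "Ky", "D3", "D5")  # assumed column order


def design_rows(x, y):
    """Rows of the coupled design matrix for dx and dy at field point (x, y)."""
    r2 = x * x + y * y
    row_dx = [1.0, 0.0, -y, x, x * x, x * y, x * r2, x * r2 * r2]  # Eq. (117)
    row_dy = [0.0, 1.0, x, y, x * y, y * y, y * r2, y * r2 * r2]   # Eq. (118)
    return row_dx, row_dy


def predict(coeffs, x, y):
    """Evaluate Eqs. (117)-(118) at (x, y) for a dict of coefficients."""
    row_dx, row_dy = design_rows(x, y)
    values = [coeffs[name] for name in NAMES]
    dx = sum(a * c for a, c in zip(row_dx, values))
    dy = sum(a * c for a, c in zip(row_dy, values))
    return dx, dy


def solve_linear(a, b):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-300:
            raise ValueError("singular system: too few or degenerate field points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = s / m[i][i]
    return sol


def fit_intrafield(points, displacements):
    """Coupled least squares fit: dx and dy rows share one coefficient vector."""
    rows, rhs = [], []
    for (x, y), (dx, dy) in zip(points, displacements):
        row_dx, row_dy = design_rows(x, y)
        rows += [row_dx, row_dy]
        rhs += [dx, dy]
    k = len(NAMES)
    # Normal equations (A^T A) p = A^T b.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(k)]
    return dict(zip(NAMES, solve_linear(ata, atb)))


if __name__ == "__main__":
    # Hypothetical coefficients, used only to synthesize test displacements.
    true = {"Tx": 0.02, "Ty": -0.01, "R": 0.003, "M": -0.004,
            "Kx": 0.001, "Ky": -0.002, "D3": 0.005, "D5": -0.003}
    grid = [(i / 2.0, j / 2.0) for i in range(-2, 3) for j in range(-2, 3)]
    data = [predict(true, x, y) for x, y in grid]
    for name, value in fit_intrafield(grid, data).items():
        print(name, round(value, 6))
```

Because both displacement components feed one coefficient vector, the fit returns a single rotation and a single magnification, which is the property the coupled formulation is meant to provide. With real measurements, the difference between measured and predicted displacements would be the residual (uncorrectable plus random) error.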
Application of Eqs. (117) and (118) yields the information necessary to align the mask or reticle exposure field with the symmetric center of the reduction lens assembly.

Mask or Reticle Design and Testing. The two tests of interest are the test of a single lens and the test from stepper-to-stepper. The test of a single lens is facilitated with in-situ stepper metrology.[51][52][267][289] This in-situ test mask or reticle has an array of alignment targets in the usable field that are imaged and then individually aligned with the stepper’s local alignment system. The local displacement in x and y for each target is measured using the laser stage as a yardstick and these data are modeled for intrafield errors. Alternatively, a comparison can be made at selected sites in the field relative to the center of the lens, which is assumed to be ideal. Registration is determined from electrical probe structures,[265] optical verniers,[290] or box-in-box structures.[291] The registration of these structures is found from interlocking two parts, referred to here as the male and female halves (see Fig. 60). A test mask or reticle has an array of male structures in the usable field that is imaged. Before the wafer is removed from the chuck, a second exposure pass is made where the mask or reticle is framed down with opaque blades so only the center female structure is exposed, and the laser stage is stepped to interlock the female half with each of the male halves. Stage imprecision shows up in the intrafield model as a residual error since it is a random error. Systematic stage errors such as x and y scaling and orthogonality must be minimized for uniform comparison of results between steppers. Stage scaling errors directly affect field magnification and stage orthogonality shows up as field anamorphism.
Stepper-to-stepper tests can be either in-situ or remote also, but since lens matching is defined as the measured errors minus the correctable errors, stage imprecision must be eliminated as a source of residual error or it will be a confounding effect. Test masks or reticles for inter-stepper matching have adjacent male and female structures in pairs in an array in the usable field. The nominal separation of the structures is known a priori. One stepper exposes fields on a wafer and the second stepper introduces a shift prior to exposure so the male and female structures interlock (for in-situ metrology, a shift is needed to avoid overlay of the alignment targets, so they can be read later). Testing all random pairs of steppers for matching becomes overwhelming quickly since the number of random pairs increases quadratically as n(n – 1)/2.

Symmetric Errors. Symmetric errors are the same value at a given radial distance regardless of position in the field.
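The radial-only dependence of symmetric errors can be checked numerically against the D3 and D5 terms of Eqs. (117) and (118). The sketch below is an illustration under stated assumptions: it isolates those two terms, uses normalized field coordinates, and the coefficient values and function names are hypothetical. The radial displacement reduces to D3·r³ + D5·r⁵, so when D3 and D5 have opposite signs there is a nonzero radius, sqrt(–D3/D5), where the error vanishes.

```python
import math


def distortion_displacement(x, y, d3, d5):
    """Return (dx, dy) from only the D3 and D5 terms of Eqs. (117)-(118)."""
    r2 = x * x + y * y
    scale = r2 * d3 + r2 * r2 * d5
    return x * scale, y * scale


def radial_error(r, d3, d5):
    """Signed radial displacement at radius r: D3*r^3 + D5*r^5."""
    return d3 * r ** 3 + d5 * r ** 5


def zero_crossing_radius(d3, d5):
    """Nonzero radius where the radial error vanishes, or None if there is none."""
    if d3 == 0 or d5 == 0 or d3 / d5 > 0:
        return None
    return math.sqrt(-d3 / d5)


if __name__ == "__main__":
    d3, d5 = 0.05, -0.08  # hypothetical coefficients in normalized field units
    for angle_deg in (0, 45, 90, 200):
        a = math.radians(angle_deg)
        dx, dy = distortion_displacement(0.6 * math.cos(a), 0.6 * math.sin(a), d3, d5)
        # The magnitude is the same at every angle for a fixed radius.
        print(angle_deg, round(math.hypot(dx, dy), 6))
    print("zero crossing radius:", zero_crossing_radius(d3, d5))
```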
Figure 60. Schematic of an intrafield test reticle showing the layout of the complementary male and female registration halves.
Translation. Translation of the field, theoretically, is not a problem for steppers that reference the mask or reticle directly. Steppers that do not reference the mask or reticle directly have procedures that try to reference the mask or reticle to some fiducial (usually on the stage near the chuck), and then reference the alignment system to the same or a related fiducial, so the alignment system is referenced indirectly to the mask or reticle.[285] As overlay requirements tighten, the stability of indirect mask or reticle reference systems becomes a larger issue, since translation instability requires the use of send-ahead wafers to center alignment mean values before the lot is committed for processing. Magnification. A stepper is image side telecentric if there is no change in magnification as the wafer plane is defocused. Typical telecentric values are ±0.05 µm or less of magnification change for ±2 µm wafer plane
defocus. Experimental failure of a test for telecentricity usually is caused by condenser lens defocus, but the effect can be caused by a condenser aberration such as spherical aberration.[15] A double telecentric design calls for no magnification change if either the object (i.e., the reticle) or image (i.e., the wafer) is defocused. For steppers that are image side telecentric only, a reticle z stage can be used to correct for magnification errors. Reticle z motions can correct for trapezoid errors quickly, too. Generally, there are six degrees of freedom on a reticle stage: x and y translation, z motion, azimuthal rotation, and front-to-back and left-to-right tilt. Magnification corrections are made by:
Eq. (119)
∆M = M · ∆z/EP
where ∆M is the desired magnification change, M is the nominal lens magnification, ∆z is the reticle z displacement, and EP is the entrance pupil distance on the reticle side of the projection lens.[267] It is apparent that reticle z motions become more critical as the reduction ratio of the lens is lowered. Very high resolution micropositioning of the reticle stage is possible using piezoelectric translators.[257] Double telecentric lenses have the disadvantage that magnification and distortion adjustments cannot be made by reticle plane manipulation so more complex adjustments of individual lens elements have to be made for reduction steppers.[267] No correction is available for unit magnification double telecentric steppers. Wafer defocus can induce distortion and magnification errors even in cases of wafer side telecentric lenses, but requires severe (i.e., improbable) levels of condenser and wafer plane defocus. Alternatively, the magnification can be adjusted by changing the refractive strength of the lens elements n_glass(λ)/n_medium(P,T,H), where P is the pressure of the gas medium, T is its temperature, and H is its humidity. Typically, the gas is nitrogen and pressure changes of only a couple of psi make a significant (e.g., tens of ppm) magnification change. This low pressurization avoids element flexure, which could change aberration values. Lens assembly pressurization is relatively slow since a change requires equilibration, so it may not lend itself as well as a reticle z stage to in-situ correction of process induced magnification errors (caused by high temperature processing or the deposition of stressful films). The dependence of magnification on environmental factors is:[275]
Eq. (120)
$\Delta M_y = 0.5[(L_{FL} + L_{FR}) + (L_{RL} + L_{RR})] - 10[\alpha_w(T_{m,w} - T_{exp,w}) - \alpha_r(T_{exp,r} - 20) + \alpha_p(760 - P_{exp})]$
where ∆My is the magnification error in the y direction, LFL and LFR are the y direction lengths on the left and right side of the field, respectively, LRL and LRR are the corresponding lengths on the left and right side of the reticle, respectively, αw and αr are the coefficients of thermal expansion for the wafer and reticle, respectively, Tm,w is the temperature of the wafer during measurement, Texp,w is the temperature of the wafer during exposure, Texp,r is the temperature of the reticle during exposure, αp is the magnification correction factor for barometric pressure, and Pexp is the barometric pressure during exposure. The coefficients of thermal expansion for silicon, Hoya low expansion LE-30 glass, and fused silica are 2.5 × 10⁻⁶, 3.7 × 10⁻⁶, and 0.55 × 10⁻⁶ mm/°C, respectively.[293] It is important that the focus walk be small (for example, ±0.2 µm) for a relatively large change in magnification (±10 ppm). This check is most important in systems that pressurize the reduction lens to achieve magnification changes. Also, the exposure duty should have a negligible impact on magnification change for either the across wafer or wafer-to-wafer families of variation (unless compensation is applied). Although there is a stronger effect on focus from this lens heating effect, absorption of light in the lens assembly changes the glass index of refraction, dn/dT, causing a magnification shift. Systematic stage errors such as x and y scaling must be minimized between random stepper pairs for uniform comparison of results in intrafield testing. Stage scaling errors are a confounding effect for field magnification. Also, testing for magnification can be confounded by the feature size of the object so either the test vehicle must be selected carefully or the aerial image should be sampled.[158]

Rotation.
Field rotation is measured for the current layer being imaged relative to the previous one.[290] Systematic sources of field rotation include a rotational misalignment of the mask or reticle on the platen (the platen vacuum clamps the mask or reticle in position at the object plane), nonparallelism between the mask or reticle alignment marks and the stage motions, and asymmetrical keystoning of the system. Random sources of rotation errors include twisting of the optical column from one exposure to the next and stage yaw. Field rotation is an error that is correctable with wafer grid errors during the alignment sequence.
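The earlier remark that field rotation is correctable with data from only two points in the field can be made concrete with a small sketch. It assumes that only the translation and rotation terms of Eqs. (117) and (118) are present (dx = Tx – R·y, dy = Ty + R·x) and that the rotation is a small angle in radians; the function name and the sample values are hypothetical.

```python
def rotation_from_two_points(p1, d1, p2, d2):
    """Estimate rotation R and translation (Tx, Ty) from two field points.

    p1, p2 are (x, y) field positions; d1, d2 are measured (dx, dy)
    displacements at those positions, in the same length unit.
    """
    sx, sy = p2[0] - p1[0], p2[1] - p1[1]
    sep2 = sx * sx + sy * sy
    if sep2 == 0:
        raise ValueError("the two field points must be distinct")
    ddx, ddy = d2[0] - d1[0], d2[1] - d1[1]
    # Differencing removes translation: ddx = -R*sy and ddy = R*sx.
    rot = (sx * ddy - sy * ddx) / sep2
    # Remove the rotation term at each point and average the remainder.
    tx = 0.5 * ((d1[0] + rot * p1[1]) + (d2[0] + rot * p2[1]))
    ty = 0.5 * ((d1[1] - rot * p1[0]) + (d2[1] - rot * p2[0]))
    return rot, tx, ty


if __name__ == "__main__":
    # Hypothetical measurements at opposite field corners, in microns.
    p1, p2 = (-10000.0, -10000.0), (10000.0, 10000.0)
    d1, d2 = (0.05, -0.03), (0.01, 0.01)
    rot, tx, ty = rotation_from_two_points(p1, d1, p2, d2)
    print("R (rad):", rot, "Tx:", tx, "Ty:", ty)
```

Placing the two points far apart, for example on a diagonal through the optical axis, lengthens the baseline and so reduces the sensitivity of the rotation estimate to measurement noise.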
Distortion. Distortion is an uncorrectable error for reduction stepper users for fixed illumination conditions. Pincushion distortion occurs when the distortion parameters are positive, with each radial image point displaced outward symmetrically from the center, as seen in the left-hand cartoon of Fig. 59 (a). The right-hand cartoon of Fig. 59 (a) shows barrel distortion, which occurs when the distortion parameters are negative. In most practical cases, the third and fifth order parameters will have opposite signs leading to partial barrel and partial pincushion distortion.[267] Relative to a perfect grid, distortion errors along a radial line can have both positive and negative errors, with a shape like a distorted sine wave. Including the optical axis, there can be multiple radial positions with zero distortion errors, and the largest absolute error is usually at the full image height (R) at the end of the radial arm at the field edge. Field matching of lenses is principally concerned with how well these distortion curves match at each radial position.

Asymmetric Errors. Asymmetric errors have different absolute values at the same radial position on all diagonals through the optical axis.

Trapezoid. Trapezoid is a correctable error, except with unit magnification steppers. If the stepper is not telecentric on the object side, a mechanically fine z axis tilt of the reticle will introduce a geometric trapezoidal distortion from an ideal square shape. From the coefficients of the intrafield model, if Kx or Ky ≠ 0 the trapezoid error is keystone; if Kx ≠ Ky and Kx ≠ 0 ≠ Ky, there is an irregular quadrilateral; if Kx = ±Ky ≠ 0, there is a kite.[287]

Anamorphism. Anamorphic errors are uncorrectable except with the Step-and-Scan machine where relative differences in mask and wafer motions allow correction.[114] Anamorphism is a non-rotationally symmetrical distortion term which is caused by cylindricity in lens elements[267] or by distortion of the plane of the reticle.
Cylindricity of lens elements is due to improper lens manufacture. Anamorphism causes a difference in magnification across the axes. Systematic stage errors for orthogonality must be minimized for uniform comparison of results in intrafield testing. Stage orthogonality is a confounding effect for field anamorphism. Referencing the coefficients of the intrafield model, distortions from an ideal square shape are rectangular if Mx ≠ My and field orthogonality = 0; a rhomboid results if Mx ≠ My and field orthogonality ≠ 0; and there is a rhombus if Mx = My and field orthogonality ≠ 0.[287]

Non-Concentric Imaging and Total Registration Analysis. Non-concentric imaging exposes and registers two or more fields imaged with a small diameter lens assembly to a single field exposed with a larger diameter lens assembly.[288] The larger diameter stepper lens typically is used for
increased throughput. This may be restricted to imaging just noncritical layers but increased overlay errors might be a concern. The magnification errors of the larger diameter lens result in subfield translational errors, unique to each subfield. Total registration analysis of concentric or non-concentric imaging systems can be done by creating representative distributions of the total error vectors by convolving the intrafield and interfield geometrical models together. For simple linear models, variances can be summed. For the multiple linear models, Monte Carlo simulations can be performed for each correctable error.[294] For these, nested variance analysis is performed to account for all families of variation to calculate the mean values of each correctable error and their respective standard deviations. Similarly, total overlay distributions can be simulated by convolving the registration and critical dimension data. Illuminator Issues. Condenser lenses are manufactured with certain aberrations that are complementary to those in the reduction lens assembly.[15][72] Control of the magnitude of the aberrations is much less critical, and can be one or two orders of magnitude larger than the reduction lens assembly requires. The cost and quality of a condenser can be much less than the reduction lens assembly. The z axis position of the condenser lens affects illumination uniformity, the partial coherence value (i.e., the degree to which the pupil of the reduction lens is filled), and the degree of image side telecentricity. Condenser aberrations are important because they change the directional distribution of illuminating radiation, which affects both the shape and telecentricity of the resulting images. 
If the wafer is in focus and the projection lens is aberration-free, the mask image (especially for small features) does not depend critically on the shape of the source image.[15] Bench testing of the condenser lens is preferred since the interaction of condenser aberrations and projection lens aberrations is complex and difficult to sort out with testing on wafers. The effect of condenser tilt on imaging depends on the degree of partial coherence and the type of mask used (i.e., conventional transmission or phase shifting).[292] A test vehicle for condenser tilt is two butted L/S pairs, one a conventional transmission mask and the other a Levenson phase shifting mask (discussed in more detail in the phase shift mask section of this chapter). When a defocus error is introduced, resist patterns of 0.4 micron L/S pairs produced by the phase shifting mask and the transmission mask shift in the opposite direction from each other, so a butting error will be apparent. For example, a 10 mrad condenser tilt will produce a 0.2
micron difference in the direction of image shift between the phase shifting mask and the transmission mask if a ±1 micron defocus error is present. When there is a condenser tilt, there is no degradation of image contrast with a conventional transmission mask. In fact, image contrast can be greater. Images with small features are less sensitive to condenser tilt.
12.0 MASK ISSUES

A few particular issues are treated here since they have special application to optical steppers. By convention, unit magnification patterned plates are usually called masks and reduction plates are called reticles, although no strong distinction is drawn here.

12.1 Particulate Protection

The frequency of particles increases quadratically as their size decreases. The population of particles is related to their size (x) by:
Eq. (121)
$\text{Population (area)} \propto \int_{x=n}^{x=\infty} \frac{1}{x^2}\,dx = \left[-\frac{1}{x}\right]_{n}^{\infty} = \frac{1}{n}$
so the particle population increases inversely linearly. This is important in considering the choice of lens reduction since the chip area on the mask of reduction steppers relative to unit magnification increases by the inverse square of the magnification.

Pellicles. Pellicles are thin transparent membranes, mounted on a frame offset from the mask surface, that form hermetically sealed, dust-free enclosures. The use of pellicles prevents particles from falling onto the mask surface where they can print as defects. Particle size immunity to printing is proportional to the standoff distance of the pellicle frame. The standoff distance requirement also depends on the illumination wavelength since scattering of light is inversely proportional to the fourth power of the wavelength. The most common pellicle films for g-, h-, or i-line exposures are manufactured from nitrocellulose with antireflection coatings. There is little degradation of a nitrocellulose film, as a function of lifetime exposure duty for wavelengths at 365 nm or longer, if the thickness is selected and manufactured carefully.[295] Transmission spectra for nitrocellulose as a
function of wavelength exhibit standing wave interference. The oscillatory behavior of reflectivity (R) can be approximated by:[296]
Eq. (122)
$R = r[1 - \cos(X)]$
where:
$r = 2[(n - 1)/(n + 1)]^2$
and
$X = 4\pi n t/\lambda$
where n is the index of refraction of the film (~1.5 for nitrocellulose), and t is the film thickness. The transmission reaches a theoretical maximum of 1.0 when the optical thickness (nt) is an even integer multiple of quarter wavelengths and a minimum when it is an odd integer multiple. Selection of the pellicle thickness with good transmission depends on the required mechanical strength but affects the periodicity of the waves (at 436 nm, both 0.865 and 2.85 micron thick pellicles have ≥99% transmission, but the thinner, more fragile film has a longer period). Raising the transmission of a single pellicle slows the degradation of the film. A laser reflectometer can be used to ensure transmission uniformity to less than one fringe in the active area. Application of an antireflection coating on the nitrocellulose film maintains image contrast and minimizes the amplitude of the waves. Typical antireflection coatings are organic Teflon-like polymers (Teflon is a trademark of E.I. DuPont de Nemours Co.) or inorganic MgF2. At 436 nm illumination, many antireflective coatings are available that do not degrade with cumulative exposure dose, but at 365 nm, the selection of a durable antireflective coating is less trivial.[295] The useful lifetime of a pellicle has expired when its transmission varies by more than 1%, corresponding to a shift of the interference fringe to the short wavelength side. Material selection for pellicles that have acceptable transmission properties, lifetime characteristics, and durability is much more limited for wavelengths shorter than 365 nm, but there are some suitable candidates[295][297][298] of Teflon type or fluoropolymer films. The heating effect in the pellicle film from deep UV excimer laser exposure is negligible compared to the photochemical change due to oxidation.
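The quarter-wavelength condition for Eq. (122) can be verified with a short numerical sketch. It assumes an uncoated, nonabsorbing film, so that transmission is taken as 1 – R; that simplification, the function names, and the sample thicknesses are assumptions made for illustration and are not taken from the cited references.

```python
import math


def pellicle_reflectivity(n, t, wavelength):
    """Reflectivity from Eq. (122); t and wavelength share the same length unit."""
    r = 2.0 * ((n - 1.0) / (n + 1.0)) ** 2
    x = 4.0 * math.pi * n * t / wavelength
    return r * (1.0 - math.cos(x))


def pellicle_transmission(n, t, wavelength):
    """Transmission of a film assumed to be lossless: T = 1 - R."""
    return 1.0 - pellicle_reflectivity(n, t, wavelength)


if __name__ == "__main__":
    n, wavelength = 1.5, 0.436  # nitrocellulose index, g-line wavelength in microns
    # Optical thickness n*t equal to an even multiple of a quarter wavelength.
    t_even = 12 * (wavelength / 4.0) / n
    # Optical thickness n*t equal to an odd multiple of a quarter wavelength.
    t_odd = 13 * (wavelength / 4.0) / n
    print("even multiple:", t_even, pellicle_transmission(n, t_even, wavelength))
    print("odd multiple: ", t_odd, pellicle_transmission(n, t_odd, wavelength))
```

For n = 1.5 the peak reflectivity from Eq. (122) is 2r = 0.16, so under these assumptions an uncoated film swings between full transmission and about 84% as the thickness or wavelength changes.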
Selection of a pellicle film with acceptable transmission stability and uniformity still requires matching an antireflective coating material that won’t decompose and that has a compatible index of refraction. Soft defects are defects that are not imaged consistently. In many cases, these printed defects are caused by floating, unsecured particles that attach to, and then release from, the mask surface. If the mask is pelliclized,
the volume of air under the pellicle of a reduction stepper relative to unit magnification increases by the cube of the reduction ratio. The probability of soft defects increases with increasing ratios of lens reduction. The focus shift (∆f) resulting from the insertion of a membrane in the optical path is approximately:[299]
Eq. (123)
$\Delta f = t(n - 1)/n$
where t is the film thickness and n is the refractive index of the pellicle film. The depth of focus at the object plane is relative to the objective lens magnification, M (M ≥ 1):
Eq. (124)
$DOF_{\text{object plane}} = M \times DOF_{\text{image plane}}$
so there is less sensitivity to the choice of pellicle thickness for reduction steppers relative to unit magnification steppers. To ensure less than a 10% variation in exposure, the minimum standoff distance, D (mm) required for a particle size p (µm) is:[296]
Eq. (125)
$D = (np)/(560\,NA)$
where n is the refractive index (1.0 for air and 1.5 for glass) and NA is the numerical aperture. Particles on the glass side of the mask appear ~33% closer because of the refractive index of the mask glass, so an equidistant glass standoff is not as effective as a pellicle, but there is an obvious benefit to using 250 mil thick masks instead of 90 mil thick ones. Common pellicle standoff distances to keep particles out of focus at the object plane range between 3–10 mm.

Glass Coverplates. An alternative to pellicles is a glass coverplate.[300] Materials of construction include soda lime, borosilicate, or quartz glass. Advantages of glass coverplates include their greater durability relative to pellicles and the elimination of chrome damage from electrostatic discharge. Like pellicles, addition of a planar glass element to the optical path can introduce optical aberrations.

Voting Lithography. Voting lithography is a technique that superimposes multiple images of nominally identical mask fields and exposes each with a suitable fraction of the total exposure energy.[301] A random defect unique to one of the mask fields will be averaged out, reducing the influence of the defect on the final image. Typically, voting occurs with two or three masks. This procedure reduces wafer throughput and assumes
precise alignment. Simulations[301] indicate that, since the image intensity rises roughly linearly over a distance of ~0.3λ/NA, total displacements between the individual images less than this will result in essentially no linewidth change. Increased coherence (i.e., small σ) gives a higher intensity slope and yields a smaller linewidth variation. For larger values of σ, the interactions with neighboring features are less coherent. Therefore, there are smooth linewidth variations and defect suppression with voting with results ≥300% better than a nonvoted image (since voting with three masks effectively reduces the intensity coming through the defect to one third). When three votes are used, defects which bridge on the mask do not bridge when printed, and the size of defect which can be tolerated for a 10% linewidth variation doubles from 0.24λ/NA to 0.5λ/NA.

12.2 Phase Shifting Masks

Special construction of the masks can increase the resolution and depth of focus of features being imaged. A conventional transmission mask limits resolution because the electric field corresponding to the intensity pattern has the same phase at every aperture (i.e., opening in the chrome of the mask). This causes a finite intensity in the region between features that ideally should be completely darkened (i.e., conventional masks have binary light amplitude values with values of either zero or one). Constructive interference between waves diffracted by adjacent apertures enhances the field between them.[302] The intensity pattern is proportional to the square of the electric field. There are quite a few different phase shift configurations which manipulate the mask transmissivity function to improve imaging performance, with light amplitude values anywhere between 1 and –1. Phase shifting applies light interference effects to reduce the spatial frequency of the object or to improve its edge contrast. There are a finite number of ways to create phase shifting effects.
Some of the phase shifting approaches improve exposure or depth of focus latitude more than others, depending on the feature shape and spatial density. The highest contrast imagery occurs when the zero order component of the spatial frequency spectrum vanishes[303] providing frequency doubling, as is the case with the Levenson and chromeless phase shifting methods discussed below. Depending on the degree of illumination coherence, Levenson and chromeless phase shifting techniques double the frequency range whose first order diffracted wave components are collected by the objective lens, providing enough contrast for imaging. In these methods, destructive interference eliminates the 0th order light. These
techniques improve the resolution and depth of focus significantly but they are difficult to apply to many integrated circuit patterns. Also, it is virtually impossible to print equal lines and spaces with a frequency doubling system.[23] Phase shifting techniques that improve the edge contrast such as rim, outrigger, or attenuating shifters accommodate real device layouts better but their imaging improvement is less dramatic. There are no light interference effects in the quartz blank itself, since the coherence length of the stepper light source is much shorter than the quartz thickness. However, thin film interference effects from reflections caused by refractive index mismatches at transmissive film boundaries can introduce transmission errors when extra films are used in the mask construction for electron beam charging, or etch stop control.[304]–[306] The transmittance swing for i-line, caused by interference for a mask with a conductive antistatic layer in between the quartz substrate and the phase shift layer, has an amplitude of 20% compared to an amplitude of 2% for a mask without this film. The thickness requirement for a nonabsorbing phase shift layer to induce 180° of phase shift in most cases does not match the thickness requirement of 100% transmittance. Since thin film interference depends on the film thickness, the transmittance value depends on thickness. There are only a few specific shifter refractive index values that will yield a π phase shift at that thickness. Even then, any film thickness variation from ideal will result in phase and transmission errors. If antistatic or etch stop layers are used during mask construction, they should be removed from transmissive mask regions before entering lithographic service. Phase shifting is attractive for imaging near and below the Rayleigh resolution limit since the relative image contrast is so much greater than with conventional imaging. For resolution coefficient k1 values from Eq. 
(13) greater than ~0.6, the process latitude benefits of phase shifting are negligible and the mask costs are much greater compared to a standard transmission mask. This means that a photo section must run under statistical control with adequate exposure and defocus latitude, with biased conventional imaging at k1 = 0.6, before any benefits of applying phase shifting can be gained under the same production standards (for example, cycle time, scrap rate, and rework rate). The image contrast produced by a phase shifting mask depends upon the partial coherence value of the illumination as well as the mask spatial frequency and the numerical aperture of the imaging system.[302] The benefit of a phase shifting mask appears as the feature sizes decrease to the Rayleigh resolution limit, since large features are not susceptible to
proximity effects. Phase shifting masks can increase the exposure latitude significantly by increasing the image contrast.[302] The expected improvement is greatest for more coherent light; a partial coherence value of 0.3 gives much greater image contrast for fine features than a value of 0.7. As the phase of the shifter increases from 0 to π, the contrast becomes more focus tolerant.[307] Phase shifting masks can give results equivalent to a numerical aperture increased by ≥33%, but without the large reduction in depth of focus that the higher NA lens would yield.

The design of phase shift masks is nontrivial since arbitrary pattern shapes and sizes must be accommodated.[308]–[310] Usually, optimized patterns only vaguely resemble intuitive designs. An automated CAD tool is highly desirable, especially one that can consider bias effects, proximity effects, focus plane shifts, and maximizing depth of focus. It is not yet clear whether some of these optimized designs are practical from a mask manufacturing perspective.

Quartz, SiO2, and polymer films transparent to actinic light have served as phase shifting layers. The index of refraction and thickness of these films must be controlled over the exposure field and from mask to mask so their influence on the phase shifted imaging does not decrease the nominal exposure-dose (ED) response curve unacceptably. Initial film control limits can be studied most easily with simulations. Defectivity and stability over time are factors in selecting materials for volume applications. Also, transmission errors caused by shifter absorption can become important as the actinic wavelength shortens and the imaginary part of the shifter refractive index increases. Generally, the aerial images produced by phase shifting have pronounced first secondary maximum side lobe intensities. High contrast resist processes can amplify the side lobes due to relatively higher exposure margins, as defined by Eq. (53). 
The problem is minimized by using resist systems with high surface inhibition and high transparency at the actinic wavelength.[311][312]

Alternating Aperture Phase Shifting. An alternating aperture phase shifting mask has a transparent phase shifting layer covering adjacent apertures of a normal transmission mask. The light coming from one of the features is delayed so that it arrives 120–180° out of phase. Figure 61 shows two light waves passing through a transparent material. The light waves begin in phase in air and stay in phase through the thick material. As illustrated, the light passing through the phase shifting layer has its phase shifted by π or 180° relative to the non-phase shifted light. A π phase shift is equivalent to changing the sign of the light amplitude. The shifter edge is
never imaged since the chrome obscures it. There are no profile or surface roughness tolerances of concern. If there is not a shifter overlap, a nonorthogonal side wall of the phase shifter will introduce linewidth differences in alternating lines.[313]
Figure 61. Phase shift principle showing an alternating aperture phase shifting mask. (Reprinted with permission of Ref. 328.)
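A short numerical sketch can make the sign-reversal argument concrete. It samples one double period of an idealized equal line/space mask and prints the magnitudes of its lowest Fourier orders. The eight-point sampling, the helper name fourier_coefficient, and the perfectly sharp amplitudes of 0, +1, and –1 are illustrative assumptions, not part of the referenced work.

```python
import cmath

def fourier_coefficient(samples, order):
    """Discrete Fourier coefficient of one period of mask amplitude samples."""
    n = len(samples)
    total = sum(a * cmath.exp(-2j * cmath.pi * order * i / n)
                for i, a in enumerate(samples))
    return total / n

# One double period (two apertures, two opaque lines), idealized amplitudes.
conventional = [1, 1, 0, 0, 1, 1, 0, 0]    # both apertures in phase
alternating = [1, 1, 0, 0, -1, -1, 0, 0]   # second aperture shifted by pi

for name, mask in (("conventional", conventional), ("alternating", alternating)):
    # Orders 0, 1, 2 of the double period (order 2 is the mask pitch itself).
    magnitudes = [abs(fourier_coefficient(mask, k)) for k in range(3)]
    print(name, ["%.3f" % m for m in magnitudes])
```

Under these assumptions the alternating mask has no zero order, and its lowest nonzero order falls at half the spatial frequency of the conventional mask's first order. The squared field (the intensity) still repeats at the original pitch.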
A tilt in the condenser displaces the source images formed in the entrance pupil of the projection lens and can produce lateral positional shifts in periodic line patterns when the image is out of focus.[292] For fine features near the resolution limit of the projection lens, the direction of the image shift with an alternating aperture phase shifting mask is opposite the shift that occurs with a transmission mask. For these fine features, condenser tilt diminishes the image contrast obtained with a phase shifting mask but increases the contrast of a conventional transmission mask. The partial coherence, the defocus magnitude, mask feature size, and the condenser tilt magnitude are coupled in their effect on image shift. Partially coherent light reduces the amount of the image shift.

Levenson. The earliest phase shifting mask[302] is called a Levenson phase shifting mask. The light passing through the phase shifting layer has
its phase shifted by π or 180° relative to the non-phase shifted light. The phase difference between shifter elements and the transmission field is π, so there are transmissive elements of two different phases. With a phase shift mask, the two diffracted beams from adjacent apertures will cancel by destructive interference and the desired dark area between images will be obtained,[307] as shown in Fig. 62. A π phase shifting transparent layer of thickness d, given by

Eq. (126)
$$d = \frac{\lambda(2m + 1)}{2(n_1 - n_2)}$$

(where n1 is the index of refraction of the phase shifter, n2 typically equals 1 for air exposures, λ is the wavelength, and m = 0, 1, 2, …, so any odd multiple of π thickness is permitted), reverses the sign of the electric field corresponding to the phase shifted aperture. The intensity pattern at the mask is unchanged.[302] Film thicknesses (d) of arbitrary phase difference (φ) are found from:

Eq. (127)
$$d = \frac{\varphi\lambda}{2\pi(n_1 - n_2)}$$
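As a worked illustration of Eqs. (126) and (127), the following sketch evaluates the shifter thickness for the first few odd multiples of π. The function name shifter_thickness, the i-line wavelength of 365 nm, and the shifter index of 1.5 are assumed values chosen only to make the arithmetic easy to follow.

```python
import math

def shifter_thickness(wavelength, n_shifter, n_ambient=1.0, phase=math.pi):
    """Film thickness giving the requested phase difference, per Eq. (127).

    With phase = (2m + 1)*pi this reduces to Eq. (126). The result has the
    same length unit as the wavelength.
    """
    return phase * wavelength / (2.0 * math.pi * (n_shifter - n_ambient))

# Illustrative values only: i-line (365 nm) and an assumed shifter index of 1.5.
wavelength_nm = 365.0
n_film = 1.5
for m in range(3):
    d = shifter_thickness(wavelength_nm, n_film, phase=(2 * m + 1) * math.pi)
    print(f"m = {m}: d = {d:.1f} nm")
```

For these assumed numbers the thinnest π shifter equals one wavelength in thickness, and each higher odd multiple adds two more wavelengths.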
Figure 62. Levenson phase shift imaging. (Reprinted with permission of Ref. 328.)
Destructive interference between waves diffracted from adjacent apertures minimizes the electric field and intensity at dark lines between bright images at the wafer plane. Imaging contrast is improved by a marked decrease in the minimum intensity and a marginal increase in the maximum intensity, leading to an increase in resolution of >25% for near-periodic structures when there is a π phase shift between neighboring clear mask regions.[307][314] This is seen in Figs. 63 and 64.
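The interference argument can be sketched with a deliberately simple model: two very narrow apertures imaged with fully coherent light by an ideal one-dimensional lens, so each aperture images as a sinc-shaped amplitude. The model, the 0.7λ/NA spacing, and the function names are assumptions for illustration; they ignore partial coherence, finite aperture width, and resist effects.

```python
import math

def sinc(t):
    """Normalized sinc, sin(pi t)/(pi t)."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def image_intensity(x, separation, second_amplitude):
    """Coherent image intensity of two narrow apertures.

    Positions are in units of lambda/NA. Each aperture images as the
    amplitude point spread function sinc(2x) of an ideal one-dimensional lens.
    second_amplitude is +1 for an unshifted aperture and -1 for a pi shifter.
    """
    field = (sinc(2 * (x + separation / 2))
             + second_amplitude * sinc(2 * (x - separation / 2)))
    return field ** 2

separation = 0.7  # aperture spacing in lambda/NA (arbitrary illustrative value)
for label, amp in (("same phase", 1.0), ("pi shifted", -1.0)):
    at_aperture = image_intensity(separation / 2, separation, amp)
    midpoint = image_intensity(0.0, separation, amp)
    print(f"{label}: at aperture {at_aperture:.3f}, midpoint {midpoint:.3f}")
```

In this idealized case the fields add at the midpoint when both apertures have the same phase and cancel exactly when one is shifted by π, while the intensity at the aperture position rises.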
Figure 63. Comparison of the diffraction optics of an ordinary transmission mask with a phase shifting mask. E is the electric field and I is the intensity. (Reprinted with permission of Ref. 302.)
Figure 64. Image intensity for a periodic series of lines and spaces 0.75 microns wide, for no phase shift (B), and 180° phase shift (A). (Reprinted with permission of Ref. 307.)
An approximation for the spatial frequency (line pairs/mm) at an MTF of 60% is:[302][314]

Eq. (128)
$$\nu_{60\%} = \left[2 - \tfrac{6}{7}s\right]\frac{NA}{\lambda}$$

where the spatial frequency ν = 1000/[2 × linesize (µm)], s is the partial coherence value, and λ is the wavelength in mm. There is no bias difference between conventional and Levenson imaging.[315] There is improvement in imaging with this type of phase shifting since the spatial bandwidth required to transmit an intensity pattern of a given periodicity is halved.[302] For coherent illumination, Levenson phase shifters double the spatial frequency response of the imaging system, so that Eq. (39) actually describes the performance instead of Eq. (38).[23] For 0.5λ/NA L/S pairs illuminated with s = 0.5, the depth of focus using a Levenson mask doubled the conventional mask value of k2 = 0.32 and improved the exposure latitude from 22% to 27%.[306] While the defocus latitude decreases for conventional imaging with increased coherence, there is an interaction between the objective numerical aperture value and the partial coherence value for the defocus latitude of Levenson masks.[316] As the i-line illumination becomes more coherent, the depth of focus improves for a 0.48 NA lens but decreases for a 0.54 NA lens. However, the ultimate resolution is finer for any partial coherence value with a Levenson mask compared to conventional imaging.[315] In fact, the ultimate resolution of Levenson phase shifting is described by Eq. (39) regardless of the degree of coherence, but the image contrast is much higher for more coherent illumination.[317] Also, increased coherence improves the exposure latitude for both Levenson and conventional transmission masks. This type of phase shifting application has no effect on the imaging of isolated features. Levenson phase shifters are most relevant to dense or periodic device layers, such as the gate level or for the bit or word lines. This implies that exposures of the shrinking mask feature sizes must be in negative resist. 
The Levenson approach will resolve these fine apertures and negative resist will leave the proper tone resist image for etch protection. The log slope of the aerial image using Levenson masks is maximum for equal line/space pairs. For line/space ratios of 1.35/1 and 1.5/1 with partial coherence (s) values of 0.3 and 0.5, respectively, the second side-lobe of the phase shifted adjacent apertures destructively interferes with the main lobe of the primary feature, creating imagery worse than the conventional transmission mask.[318]
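The 60% MTF approximation of Eq. (128) is easy to evaluate numerically. The sketch below does so for an i-line, 0.48 NA case at several partial coherence values; the function names and the chosen stepper parameters are illustrative assumptions, and the wavelength must be entered in millimeters as the text specifies.

```python
def nu_60(na, wavelength_mm, s):
    """Spatial frequency (line pairs/mm) at 60% MTF from Eq. (128)."""
    return (2.0 - (6.0 / 7.0) * s) * na / wavelength_mm

def linesize_um(nu):
    """Line size in microns for equal lines and spaces at frequency nu (lp/mm)."""
    return 1000.0 / (2.0 * nu)

# Illustrative i-line case: 365 nm = 365e-6 mm, 0.48 NA.
for s in (0.3, 0.5, 0.7):
    nu = nu_60(0.48, 365e-6, s)
    print(f"s = {s}: nu60 = {nu:.0f} lp/mm, linesize = {linesize_um(nu):.3f} um")
```

The printed line size grows as s increases, in line with the statement that Levenson image contrast is higher for more coherent illumination.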
Levenson masks create a few inherent imaging problems.[306][319] Levenson phase shift masks suffer from an asymmetry of image intensity profiles that is especially acute for an odd number of parallel opaque lines. At perfect focus, the interior lines in a grating are identical but one of the two exterior lines suffers proximity problems. With ±0.5 microns of defocus, adjacent spacewidths can be 5–10% different in size.[320] Defocus or coupling with phase errors can exacerbate this. It will be virtually impossible to focus both of these outside lines at the same time for arrays of parallel lines with an odd number of lines.[321] An array of parallel lines with an even number of lines results in both outside lines having the same phase shift relative to their backgrounds. However, the outside lines still have an intensity profile different than the interior lines. The root of this asymmetry and the imaging of edge artifacts is the sharp π phase transition between shifter elements and the transmission field causing skewed interference effects. This π phase transition at line ends on Levenson structures can cause anomalous artifacts at line ends (such as looping bridges) to print under certain small defocus conditions. One way to solve the bridging problem is to use tapered shifters at the shifter ends, although this may be a difficult manufacturing problem.[306]

Mask linearity to the resolution limit is worse for the Levenson phase shift mask relative to a conventional transmission mask.[306] Contrary to the conventional mask, it will be difficult to find an exposure energy to image a range of linewidths in equal L/S pair arrays from 0.4λ/NA to 1.7λ/NA with ±10% linewidth control. There is linearity above 0.6λ/NA, but these are relatively large features. There is a tendency toward more non-linearity with increased coherence. 
[315] Image contrast with Levenson phase shifting at s = 0.77 is only marginally better than conventional imaging but the Levenson mask linearity is best at this partial coherence value.[317] Phase shift errors and transmission errors cause the aerial images from alternating apertures to be different, resulting in spacewidth differences in positive tone resists and more sensitive linewidth differences in negative tone resists.[305][322] Intensity errors can cause asymmetric ED trees reducing latitude, while phase errors produce symmetric ED responses about best focus. Phase errors reduce the depth of focus for a given exposure budget since image intensity does not vary symmetrically with defocus.[320] The change of the image intensity at the phase shifted part is opposite to that at the non-shifted one, and the defocus dependence of positive phase errors is opposite that of negative phase errors of the same magnitude. Also, the
phase shift deviation becomes more influential with increased coherence and finer patterns. As a result, defocus introduces alternating linewidth differences. For 0.35 micron line/space pairs exposed with λ = 365 nm, 0.48 NA, s = 0.38, a phase error of 20° reduces the depth of focus from 3.5 microns to 1.2 microns with an exposure budget of 10%.[305] A transmission error of 10% has little impact on the difference between alternating aperture linewidths within ±0.5 microns of defocus for 0.25–0.50 micron L/S pairs, if there are no phase errors. At 100% transmission, a phase error of 20° introduces differences that change with defocus and are a function of feature size. Transmission errors of 12% and ±6% phase errors can be tolerated for ±0.75 microns defocus latitude using a 0.4 NA KrF excimer laser stepper imaging L/S pairs at k1 = 0.35.[322] The linewidth difference response with defocus is linear for 0.25 micron L/S pairs and more hyperbolic shaped for 0.50 micron L/S pairs (with relatively large differences occurring even at ±0.5 microns defocus). These responses are shown in Fig. 65.
Figure 65. Transmission and phase error effects on Levenson linewidths. (Reprinted with permission of Ref. 305.)
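To relate the feature sizes quoted above to the resolution coefficient k1, the following sketch assumes the usual Rayleigh form, linewidth = k1·λ/NA; the exact form of Eq. (13) is not reproduced in this section, so this is an assumption. The function names are hypothetical, and the 248 nm wavelength used for the excimer laser stepper is likewise an assumed value.

```python
def k1_factor(linewidth_um, wavelength_um, na):
    """k1 = linewidth * NA / wavelength, assuming the Rayleigh form R = k1*lambda/NA."""
    return linewidth_um * na / wavelength_um

def linewidth_from_k1(k1, wavelength_um, na):
    """Linewidth in microns for a given k1 under the same assumed Rayleigh form."""
    return k1 * wavelength_um / na

# 0.35 micron L/S pairs, i-line (0.365 um), 0.48 NA, as quoted in the text.
print(f"k1 = {k1_factor(0.35, 0.365, 0.48):.2f}")
# k1 = 0.35 on a 0.4 NA stepper; 0.248 um is an assumed excimer wavelength.
print(f"linewidth = {linewidth_from_k1(0.35, 0.248, 0.4):.3f} um")
```

Both cases fall well below the k1 of ~0.6 above which the text describes the benefits of phase shifting as negligible.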
Unfortunately, the benefits of greater resolution also extend to unwanted defects on the mask.[307] Phase shifts of 180° cause defects half the size to print. Phase angles less than 180° but greater than 120° give similar improvement in resolution. The lower phase angle reduces the susceptibility of printing defects. A phase defect of π/2 prints about the same as an opaque defect.
Phase Transitions and Aberration Effects on Imaging. Phase transitions may be needed to bridge clear regions of π phase difference in some device designs.[323] The simplest transition is a boundary line of 90°; another is a continuous sloped edge going from 0 to π.[324][325] Without aberrations, either construction gives good imagery, but discrete segments of different phases image some high frequency noise or notching at boundaries. The impact on device performance with this variation of CD is unknown. By using a four phase transition (180°, 120°, 60°, and 0°) versus three (180°, 90°, and 0°), the intensity reduction at boundaries was virtually eliminated.[141] The depth of focus with the three phase transition can actually be worse than conventional lithography, while the four phase transition represents a meaningful improvement. With λ/14 root mean square (RMS) coma, there is a slight image movement but little influence on image quality. In particular, when the L/S pairs are radial, image quality is similar to aberration free imagery. Coma tends to pile up the intensity inward along the radial line, so there is a slight difference in intensity profile depending on the field height. However, the degree of image degradation is small compared to λ/14 RMS astigmatism. Astigmatism results in no image movement, but the image intensity at the phase boundaries varies depending on the field position and the actual phase values at the boundaries. The effect of defocus is similar to astigmatism, except there is no field height or pattern orientation dependence. Defocus tends to degrade and separate the images. There is an asymmetric intensity distribution at the phase boundaries depending on the magnitude of the defocus, whether the defocus is positive or negative, and the actual phase values at the boundaries. A 90° phase boundary can also cause sharp intensity dips with defocus. 
This intensity dip can be moderated sharply with a continuous phase boundary instead of a 90° boundary.

Conjugate Twin-Shifter. In practice, the many serious deficiencies with the Levenson approach are addressed with a similar technique called the conjugate twin-shifter phase shift method[319] which utilizes transmissive elements of three different phases. This third phase difference brings several advantages but it also represents added mask manufacturing complexity. The two technical features of this method include a π phase shift between alternating apertures and also a phase difference between shifter elements and the transmission field of π/2. Figure 66 shows some candidate constructions.
Figure 66. Conjugate twin-shifter constructions. (Reprinted with permission of Ref. 319.)
The first feature of conjugate twin-shifting satisfies the Levenson criterion to gain the necessary destructive interference for frequency doubling; i.e., the resolution of this method is identical to the Levenson phase shift approach. The second feature of conjugate twin-shifting sharply decreases the sensitivity to phase defects. Phase transitions of π reduce intensity by ~95% while π/2 transitions reduce intensity by only ~50%. Conjugate twin-shifting has the additional advantage of providing intensity profiles that are symmetric in three dimensions, including the z-axis, which indicates an improvement versus conventional imaging in depth of focus for fine patterns. Figure 67 compares the intensity profiles of the Levenson and conjugate twin-shifting methods. The asymmetry of the two interior intensity peaks with the Levenson technique is apparent. One design constraint with the conjugate twin-shifter is that the outside of the exterior lines must not be an immediate π/2 phase transition to the field, but there should be a transition shifter of half pitch width (this shifter is, of course, π phase shifted to the adjacent aperture). The inclusion of this extra outside shifter provides the exterior lines with the same π phase difference seen by the interior lines and raises the contrast of imaging the exterior lines to avoid proximity problems (i.e., critical dimension variation different than its neighbor).
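The relative darkness of different phase transitions can be estimated with a simple idealized model, offered here only as a sketch. For in-focus, fully coherent imaging with a symmetric point spread function, the image field at an abrupt phase step is taken as the average of the unit amplitudes on either side, which gives an intensity of cos²(φ/2). This model and the function name edge_intensity are assumptions; partial coherence, defocus, and aberrations are ignored, so the output only approximates the percentages quoted above.

```python
import cmath
import math

def edge_intensity(phase_deg):
    """Relative intensity at an abrupt phase step under ideal coherent imaging.

    The image field at the step is taken as the average of the unit amplitudes
    on either side, (1 + exp(i*phi))/2, so the intensity is cos^2(phi/2).
    """
    phi = math.radians(phase_deg)
    field = (1 + cmath.exp(1j * phi)) / 2
    return abs(field) ** 2

for phase in (60, 90, 120, 180):
    print(f"{phase:3d} deg step: intensity {edge_intensity(phase):.2f} of clear field")
```

In this model a π/2 step retains half of the clear field intensity while a π step is fully dark, and smaller steps such as 60° lose progressively less light.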
Figure 67. Conjugate twin-shifter and Levenson aerial images. (Reprinted with permission of Ref. 319.)
Figure 68 is a Bossung curve for a conjugate twin-shifting mask. The vertices of the CD versus focus curves will lie on the same focal position if there is a π phase difference between apertures. Phase errors caused by an offset or variation of the shifter thickness from nominal or by a shifter refractive index error will introduce a relative shift in vertices. Also, alternating aperture phase errors from π will introduce skewed interference effects so that with defocus even interior lines will experience apparent proximity problems. The conjugate twin-shifter mask must also control the extra relative phase of the field. Relative field phase errors can introduce anomalous artifacts at line ends such as looping bridges, changing with focus conditions. Unfortunately, the π/2 phase transitions produce dim regions in the image with only about 40% of the flood exposure intensity which move with defocus. This narrows the ED process window. Phase steps of 60° do not seem to print but a phase mask with four phases (0°, 60°, 120°, and 180°) plus chrome patterning is a complicated manufacturing task.[326]

Double Edge Unattenuated or Chromeless Phase Shifting. Conventional masks pattern openings in chrome to make line/space pairs. A double edge chromeless phase shift mask uses both edges of relatively small width phase shifters for image formation and neither a chrome or molybdenum silicide absorber film nor an amplitude attenuating film is on the
mask. A single intensity drop is produced from the two edges of the phase shifter. Patterns of phase shifters are placed directly on the quartz substrate to produce images,[327] as Fig. 69 illustrates. The phase shifters are π thick films. It is important to control the lateral dimension and edge profile of these shifters.[328] Phase errors of ±10° and shifter edge profiles from 75–90° result in negligible degradation of the image contrast at best focus.[329] Any magnitude of phase error still yields an aerial image with lateral positional symmetry when defocused, since the shifter itself is geometrically symmetric.[321] However, any phase error results in intensity profiles which are much different for defocused positions on opposite sides of nominal focus. Defocus in the negative focal direction for 0 to +π phase shifters initially causes a decrease in contrast, followed by phase reversal and line splitting. This effectively means there is focus latitude only in the positive direction. Image contrast increases as the phase shift approaches π. The degree of intensity asymmetry with defocus peaks at 2π/3 and decreases to zero at π. The defocused aerial image response is strongly coupled to both the phase error and the width of the shifter.
Figure 68. Bossung curve for adjacent apertures with π phase shift. (Reprinted with permission of Ref. 319.)
Figure 69. Double edge chromeless imaging. (Reprinted with permission of Ref. 328.)
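The single intensity drop produced by the two edges of a narrow π shifter can be sketched with a one-dimensional, fully coherent imaging model. The mask amplitude is taken as +1 in the clear field and –1 over the shifter, and the lens is represented by an ideal sinc amplitude point spread function. The model, the midpoint-rule integration, and the function names are all illustrative assumptions; real partially coherent exposures will differ in detail.

```python
import math

def psf(u):
    """Coherent amplitude point spread function with unit area; u in lambda/NA."""
    t = 2.0 * u
    return 2.0 if t == 0 else 2.0 * math.sin(math.pi * t) / (math.pi * t)

def center_intensity(width, steps=2000):
    """Intensity at the center of an isolated pi shifter of the given width.

    Mask amplitude is +1 in the clear field and -1 over the shifter, so the
    center field is 1 - 2 * (area of the PSF under the shifter). The area is
    found by midpoint-rule integration.
    """
    du = width / steps
    area = sum(psf(-width / 2 + (i + 0.5) * du) for i in range(steps)) * du
    return (1.0 - 2.0 * area) ** 2

for width in (0.05, 0.10, 0.20, 0.26, 0.35, 0.50):
    print(f"width {width:.2f} lambda/NA: center intensity {center_intensity(width):.3f}")
```

Within this model the center intensity stays low for shifter widths of roughly 0.2λ/NA to 0.35λ/NA and climbs again for wider shifters, where the two edges begin to image separately.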
Imaging Line/Space Pairs. Mask linearity is preserved only for relatively large features, where this technology is unneeded. Conventional imaging produces mask linearity near the Rayleigh resolution limit but unbiased double edge chromeless phase shifting introduces significant and varying deviation between mask and image feature sizes.[327] A substantial bias is required between the mask and final resist dimension, limiting the range of linewidths over which this technique will work.[330] For example, to image 0.3 micron line/space pairs using a g-line, 0.54 NA stepper, the mask phase shift lines are 0.15 microns wide with clear conventional spaces of 0.45 microns. Figure 70 compares aerial images of line/space pairs using Levenson and double edge chromeless phase shifting versus a conventional mask. Similarly, if the mask is constructed of 0.35 micron shifter line/space pairs, the exposure latitude to image equal line/space pairs is nil; however, overexposure to produce linewidths of 0.10–0.15 microns in that 0.70 micron pitch is on a much flatter part of the response curve where there is acceptable exposure latitude.[23] These problems can limit applications without careful design planning. There are a couple of other limitations.[331] At a defocus value of 1.5 Rayleigh units, the width of the outermost space of the line/space pairs varies significantly. The common ED space is limited for printing isolated lines and L/S pairs since they have different focus conjugate intensities, i.e., the exposure levels where the defocus tolerance is greatest are much different.
Figure 70. Aerial image comparison imaging line/space pairs. (Reprinted with permission of Ref. 330.)
Imaging Isolated Lines. Line/space pairs image because of differences in intensity across the features. In the example above, the intensity is decreased at the linear phase shifted features. This suggests applications for subresolution double edge chromeless phase shifters. A single line shifter on a mask with a feature size significantly below the Rayleigh resolution limit produces an intensity dip that can image as a fine dark line.[307][332][333] Shifter linewidths are between 0.2λ/NA and 0.35λ/NA to provide sufficient destructive interference in the center of the line to keep the intensity low,[23] producing resist linewidths approximately twice as wide at 50% of the normalized intensity. As shifter widths approach the Rayleigh resolution limit, the optics begin to resolve both edges. The aerial image for a dark line with this technique will not be quite as sharp as that produced by the single edge chromeless method discussed below. The developed image is approximately twice as large as the mask feature.

Imaging Opaque Areas. An array or grating of linear phase shifters with a pitch below the Rayleigh resolution limit acts as an opaque area by inhibiting light transmission due to destructive interference.[332]–[335] Gratings with linewidths between 0.3λ/NA and 0.5λ/NA produce zero contrast
imagery. Equal shifter lines and spaces in a chromeless mask eliminate the zero order. For small enough pitches, the first diffracted order passes outside the collection angle of the objective lens so no light from the mask pattern will be transmitted through the lens.[23] At least five lines must be used in a grating to get reasonable results. For partially coherent illumination, Eq. (43) describes the maximum spatial frequency or minimum pitch which will image. The spatial frequency of double edge chromeless shifters will have to be greater than this to eliminate light transmission. If the spatial frequency of the grating is higher than νmax and there are phase errors in the shifter, only the zero order will pass through the lens with an intensity (I) in the dark grating area of:[23]

Eq. (129)
$$I = \sin^2(\Delta\varphi/2)$$
This equation indicates that a relatively large tolerance exists for phase errors to produce virtually no light transmission. A limitation to this method for masks requiring large opaque areas is the relatively large increase in pattern count impacting defectivity control and mask making data management.

Imaging Isolated Spaces. Two sets of subresolution gratings, which decrease the intensity by destructive interference, can be brought into proximity, permitting some light to transmit, producing isolated spaces.[331]

Single Edge Unattenuated or Chromeless Phase Shifting. Single edge chromeless phase shifting produces a narrow dark line at the boundary of the shifter.[336] Neither an absorber film nor an amplitude attenuating film is on the mask. The shifter occupies a relatively large area since single edge effects are desired. The mask construction, the electric field on the mask, the electric field on the wafer, and the intensity on the wafer are shown in Fig. 71. The abrupt change in phase from 0 to π at the shifter edge means the light amplitude passes through zero, causing a dip in the intensity and producing a dark line on the wafer as illustrated in Fig. 71. The shape and width of the intensity null that images the isolated resist line is largely independent of the width of phase shifters wider than 1.5λ/NA.[331] It is important to control the lateral dimension and edge profile of these shifters.[328] Defocused images with π phase shifters have lateral symmetry; however, if there is any phase error, the image will shift laterally with defocus away from its proper location.[321] This is a result of the geometric asymmetry of the mask pattern as shown in Fig. 72.
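The phase error tolerance implied by Eq. (129) for dark chromeless gratings can be tabulated directly. The sketch below evaluates the leaked zero order intensity for a few phase errors; the function name and the sample error values are chosen for illustration only.

```python
import math

def dark_grating_intensity(phase_error_deg):
    """Zero order intensity leaking through a chromeless grating, Eq. (129)."""
    return math.sin(math.radians(phase_error_deg) / 2.0) ** 2

for err in (5, 10, 20, 30):
    print(f"phase error {err:2d} deg: I = {dark_grating_intensity(err):.4f}")
```

A 10° phase error leaks less than 1% of the clear field intensity, which is consistent with the relatively large tolerance noted for Eq. (129).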
Figure 71. Single edge chromeless imaging. (Reprinted with permission of Ref. 328.)
Figure 72. Defocus position asymmetry of single edge chromeless phase shifter due to phase errors. (Reprinted with permission of Ref. 319.)
Light absorption of 20% will produce a 20% intensity mismatch between the phase shifters and the clear areas. The image formed at the phase shifter edge would then become asymmetrical, resulting in asymmetrical lines (i.e., the profile angles of the two line edges will be different) being printed on the wafer.[331]
Imaging Line/Space Pairs. Line/space pairs are formed from a periodic array on the mask of chromeless phase shifters with a width of 1.5λ/NA and a pitch double that. Since each edge will image as a dark line, the image of line/space pairs is doubled in frequency relative to the mask. The linewidths are approximately 0.4λ/NA so equal line/space pairs are impossible. Shallow wall angles on the mask phase shifters cause the resist lines to be narrower than the printed spaces. Asymmetries in the shifter edge profiles produce two lines with different aerial image intensity profiles.[331] This phase shift technique has the same image contrast as the Levenson technique. One way to introduce linewidth control in L/S pairs is to change the construction of the L/S phase shifters so the shifter lines now have an edge defined by shifter comb structures.[337] These comb structures are subresolution rectangles of width 0.25λ/NA (followed by a space of the same dimension on the edge) and a length tailored to produce the line size required. Resist patterns of equal lines and spaces can be formed using comb shaped shifters with comb teeth lengths for positive resist of ~0.5λ/NA. This construction reduces the contrast somewhat, but it still exceeds that of a conventional transmission mask.

Imaging Isolated Lines. Aerial images for isolated lines using single edge chromeless phase shifting can be seen in Fig. 73. Coherent illumination is compared with partial coherence of s = 0.5 for different defocus conditions.[23] The general shape of the aerial image for coherent light does not change much with defocus, but the nominal linewidth changes sharply. For the partial coherence case, the nominal linewidth changes less with defocus although the aerial image itself changes shape. For s = 0.45, the isofocal linewidth is 0.25λ/NA with ±10% exposure latitude over a focus range of 1.2 µm for ±10% linewidth control.[23] The linewidth can be controlled by the exposure dose. 
Exposure latitude data using a 0.42 NA, i-line, σ = 0.5 stepper indicate that the linewidth changes by 0.02 µm with 10% overexposure for linewidths ranging from 0.15 to 0.35 µm.[336] Also, as discussed above, linewidth control can be introduced by changing the construction of the shifter so that its edge is defined by shifter comb structures.[337]

Imaging Contact Holes. Fine hole patterns can be produced by a double exposure method using a pair of single edge chromeless masks and a negative working resist.[336] The phase shifter edge line on the second exposure mask is at a right angle to that on the first mask. Holes of size 0.25λ/NA are resolved over a wide 1.5 micron focus range and can be imaged at pitches down to 0.9λ/NA at σ = 0.5. However, it is unclear whether this is a practical process.
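The dimensions quoted for chromeless phase shifting are normalized to λ/NA. As a rough aid, the following minimal sketch converts a few of them to microns for a 0.42 NA i-line stepper like the one cited above. The 365 nm i-line wavelength is standard; the function name and the list of features are illustrative assumptions and do not come from the cited references.

```python
I_LINE_UM = 0.365  # i-line wavelength in microns (standard value)


def normalized_to_um(k, wavelength_um, na):
    """Convert a dimension given as k * (lambda / NA) to microns."""
    return k * wavelength_um / na


# Normalized dimensions taken from the text; the labels are illustrative.
features = {
    "shifter width (1.5 lambda/NA)": 1.5,
    "printed linewidth (0.4 lambda/NA)": 0.4,
    "comb tooth width (0.25 lambda/NA)": 0.25,
    "contact hole pitch (0.9 lambda/NA)": 0.9,
}

for name, k in features.items():
    size_um = normalized_to_um(k, I_LINE_UM, 0.42)
    print(f"{name}: {size_um:.3f} um")
```

The same helper applies to any other wavelength and numerical aperture, which makes it easy to compare the normalized figures across exposure tools.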
Figure 73. Single edge chromeless defocused aerial image for imaging isolated lines. (Reprinted with permission of Ref. 23.)
Phase Transitions. Without further modification, lines will print as closed loops at the boundary of each phase shift edge. For most device designs, this limits the applicability of the technique. Eliminating extraneous or useless patterns requires more complicated chromeless phase shifter designs.[338] An array of subresolution linear chromeless phase shifters in proximity to a single edge shifter produces diminished light intensity up to the single edge chromeless shifter, as seen in Fig. 74. Preventing another line image at the remote edge of the single edge shifter requires a multistage chromeless phase shifter, which gradually changes the phase at the shifter edge until it is adequately matched to the background.[337] Resist patterns at the multistage shifter were eliminated for phase transition widths of more than 0.5 microns. Figure 75 shows that multistage shifters are simply a series of single edge shifters with increasing or decreasing phases.
Figure 74. Phase transition construction. (Reprinted with permission of Ref. 338.)
Figure 75. Three step multistage shifter. (Reprinted with permission of Ref. 338.)
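To make the multistage idea concrete, the sketch below lists the phases of a stepped shifter and the shifter thickness each step would need. It assumes equal phase increments and uses the standard thin-film relation, thickness = (phase/360°)·λ/(n − 1). The refractive index of 1.47 is an approximate value for fused silica near 365 nm; it and the function names are assumptions for illustration, not values from Ref. 338.

```python
def multistage_phases(n_steps, total_phase_deg=180.0):
    """Phases (degrees) of a stepped shifter, assuming equal increments."""
    return [total_phase_deg * i / n_steps for i in range(1, n_steps + 1)]


def shifter_thickness_um(phase_deg, wavelength_um, n_shifter, n_ambient=1.0):
    """Shifter thickness producing phase_deg relative to the ambient medium."""
    return (phase_deg / 360.0) * wavelength_um / (n_shifter - n_ambient)


WAVELENGTH_UM = 0.365  # i-line
N_QUARTZ = 1.47        # assumed approximate index of the shifter material

for phase in multistage_phases(3):
    depth = shifter_thickness_um(phase, WAVELENGTH_UM, N_QUARTZ)
    print(f"{phase:5.1f} deg -> {depth:.3f} um")
```

A three step shifter under these assumptions uses phases of 60°, 120°, and 180°, each step being one third of the full π thickness.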
The thickness and profile of the single edge chromeless shifter introduce phase errors that affect the linesize, so the phase edge could instead terminate on top of a subresolution chrome line (a “chromeless” Levenson extension, i.e., a Levenson design at a single shifter edge; since the pitch is so large, there is no imaging contribution from the second edge). This gives the same effect with some trade-off in mask manufacturing.[326] However, if volume production requires at least 30% exposure dose latitude for process control and a depth of focus of k2 = 0.5, slightly finer isolated lines can be printed with an optimally biased conventional transmission mask than with an optimally biased Levenson “chromeless” mask (k1 = 0.55 versus 0.60).

Phase Shifting on the Substrate. A wafer level phase shifter operating on a principle similar to that of the chromeless phase shifting mask can serve to image isolated lines with high resolution.[339] Using a conventional transmission mask, the photoresist on the substrate is exposed with the resist threshold energy, E0. The develop process removes a small thickness of resist according to Eq. (127). A second blanket exposure is then performed in which the top surface of the resist itself serves as a phase shifting mask. A sharp drop in light intensity occurs at the top resist pattern edge, yielding a fine line after development.

Subresolution Outrigger or Assistant or Auxiliary Feature Phase Shifting. Both isolated and periodic features on a mask can be imaged with suitable surrounding phase shifted apertures that are themselves beyond the resolution limit of the system.[307] Isolated features are targeted most often since the design doesn’t lend itself well to high density line/space pairs. To obtain a narrow bright line for printing a fine isolated space, two companion apertures are placed on each side of the main aperture on the mask. 
The companion apertures are sized below the Rayleigh resolution limit of the stepper so they do not image as spurious features and they are of opposite phase to the main aperture. Interference of light between these subresolution outrigger phase shifters and the main aperture contributes to improved edge contrast. The first diffraction side-lobe of the subresolution companion aperture constructively interferes with the center lobe of the primary aperture.[318] Similarly, for printing fine contact holes or vias, the main aperture is surrounded by four companion apertures.[340][341] Obviously, the pattern count increases tremendously with this phase shifting approach. To print L/S pairs, the depth of focus below the Rayleigh limit can be improved over a conventional mask by placing a single assistant phase aperture of 0.1λ/NA–0.15λ/NA width between the chrome lines.[306] Relative to a conventional mask, a shifter width of 0.15λ/NA improves the depth
of focus at k1 = 0.5 from k2 = 0.3 to k2 = 0.5 with an attendant improvement in exposure latitude from 22% to 28%. Above the Rayleigh limit, a conventional mask has superior focus and exposure latitude. Placing multiple assistant apertures between the chrome lines of equal L/S pairs does not improve the imagery over that of a conventional mask, since the interference between two neighboring shifters can create residue or print spurious images. For isolated spaces and holes, the resolution limit with this phase shift method appears to be k1 = 0.35, and the exposure latitude down to that level has the same slope response as conventional imaging well above the Rayleigh limit.[341]

The optimum sizing of the outrigger shifters and their separation from the main aperture are found empirically or through simulation. Increased sizing of the outrigger apertures improves the constructive interference. However, there is a limit since the apertures must remain subresolution or they will print as spurious features. The main criterion is to increase the intensity profile at the wafer. Simulations are helpful to evaluate the size of the secondary lobes on the aerial image. For example, to image 0.6λ/NA square holes, the outrigger shifters should be sized 0.2λ/NA with 0.65λ/NA separation.[342] An outrigger position at 0.6λ/NA yields a higher log slope of the aerial image but produces an exposure level drop as the side lobe strength rises with increasing outrigger sizing. A position at 0.7λ/NA produces a flat exposure level over a wide range of outrigger sizings. 
Outrigger sizings greater than 0.15λ/NA decrease the exposure-defocus (ED) latitude.[318] The resolution limit for contact holes with this phase shift method appears only to match that of conventional lithography, but the depth of focus for 0.75λ/NA contacts is 30% better at σ = 0.5 and 40% better at σ = 0.3.[316] The change in critical dimension with exposure dose is smaller near the Rayleigh resolution limit for this type of phase shifting than for conventional lithography, indicating greater process latitude.[340] However, Fig. 76 shows that when the optical phases of the outrigger phase shifters differ from π, the developed images become asymmetric with respect to focus. Holding the focal shift variation to ±0.5 µm requires shifter phase control of ±15°. This indicates that some of the improved depth of focus latitude available under ideal conditions could be lost during mask fabrication. It is not obvious from Fig. 76, but the phase shifting mask with 210° optical phase shifting gives a narrower focus latitude than 150° phase shifting.[341] Also, the shape of the ED response curve will vary with misalignment of the outrigger structures relative to the main aperture. A mask field populated with these designs must image within the common ED domain of all the individual features.
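The k1 and k2 values quoted for assistant-aperture L/S imaging can be translated into physical units. The sketch below assumes the usual Rayleigh scaling relations, feature size = k1·λ/NA and depth of focus = k2·λ/NA². The 0.42 NA i-line tool is only an illustrative choice and the function names are hypothetical.

```python
def rayleigh_feature_um(k1, wavelength_um, na):
    """Feature size from the Rayleigh scaling k1 * lambda / NA."""
    return k1 * wavelength_um / na


def rayleigh_dof_um(k2, wavelength_um, na):
    """Depth of focus from the Rayleigh scaling k2 * lambda / NA**2."""
    return k2 * wavelength_um / na ** 2


WAVELENGTH_UM = 0.365  # i-line
NA = 0.42              # illustrative numerical aperture

feature = rayleigh_feature_um(0.5, WAVELENGTH_UM, NA)
dof_conventional = rayleigh_dof_um(0.3, WAVELENGTH_UM, NA)
dof_assisted = rayleigh_dof_um(0.5, WAVELENGTH_UM, NA)

print(f"k1 = 0.5 feature size: {feature:.3f} um")
print(f"k2 = 0.3 depth of focus: {dof_conventional:.2f} um")
print(f"k2 = 0.5 depth of focus: {dof_assisted:.2f} um")
```

Under these assumptions, the quoted change from k2 = 0.3 to k2 = 0.5 corresponds to roughly 0.4 µm of additional focus range for features of about 0.43 µm.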
Figure 76. Focal shift dependency on phase errors of outrigger phase shifters. (Reprinted with permission of Ref. 330.)
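The ±15° phase tolerance can be restated as a shifter thickness tolerance. The sketch below assumes a shifter in air and uses the relation thickness error = (phase error/360°)·λ/(n − 1). The refractive indices (about 1.47 near 365 nm and about 1.51 near 248 nm for a silica-like shifter) are approximate assumptions, and the function name is hypothetical.

```python
def thickness_tolerance_um(phase_tol_deg, wavelength_um, n_shifter):
    """Shifter thickness error corresponding to a given phase error, in air."""
    return (phase_tol_deg / 360.0) * wavelength_um / (n_shifter - 1.0)


# (label, wavelength in microns, assumed approximate refractive index)
cases = [
    ("i-line, 365 nm", 0.365, 1.47),
    ("DUV, 248 nm", 0.248, 1.51),
]

for label, wavelength_um, n_shifter in cases:
    tol_nm = 1000.0 * thickness_tolerance_um(15.0, wavelength_um, n_shifter)
    print(f"{label}: +/-15 deg is about +/-{tol_nm:.0f} nm of shifter thickness")
```

With these assumed indices, the allowed thickness error shrinks from about 32 nm at i-line to about 20 nm at 248 nm, which illustrates why shifter fabrication control tightens at shorter wavelengths.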
Rim or Self-Aligned Phase Shifting. Rim phase shifting[343] can be applied to any arbitrary feature shape or density, but it is especially appealing for contact hole imaging. These shifters are self-aligned and improve imaging by edge contrast enhancement, with the shifters typically overhanging the absorber patterns. Contrast enhancement occurs by destructive interference between the phase shifted light and the unshifted light from the feature edge. Rim phase shifting of nominally sized contact holes improves the depth of focus at the expense of a much higher exposure.[328] A large positive mask bias can reduce the exposure times, but they are still considerably higher than for a conventional mask. This biased rim shifter yields a larger exposure-defocus latitude for a given side lobe strength than a corresponding outrigger phase shift mask.[318][345] Increasing the lateral width of the shifters decreases the total light amplitude. This decreases the intensity, raising the nominal exposure.
The lateral dimension and edge profile of both the absorber and the rim shifter must be controlled.[328] Also, a 10% phase change produces a 5% CD deviation and halves the ED process window. A 10% change in lateral shifter size closes the rim shifter process window.[318] The shifter width should be ≤0.2λ/NA, since 0.07λ/NA variations about this value produce flat aerial intensity width variations, whereas larger widths increase the side lobe intensity sharply.[345] Conventional self-aligned isotropic processing yielding the overhanging shifters means that all feature shapes and sizes will have the same lateral shifter width, although that width is optimum for exposure latitude for only one of them.

Imaging Isolated Lines. For isolated linewidths around k1 = 0.7, the rim shifter mask with a rim width equivalent to k1 = 0.15 gives the best exposure window compared to single edge chromeless, subresolution double edge chromeless, or biased conventional transmission masks, and since it transmits more light than a biased transmission mask, it images at lower exposure doses.[326] Control of the chromium undercut is critical to CD control. The aerial images show relatively modest improvement with small undercuts up to 0.2 microns per side but significant improvement with larger undercuts of 0.4–0.6 microns per side, if the pattern pitch is large enough to tolerate them.[343] Rim phase shifting can produce the same image contrast as double edge chromeless phase shifting below the Rayleigh resolution limit.[344]

Imaging Line/Space Pairs. For equal L/S pairs, a shifter width of 0.07λ/NA gives 50% more exposure latitude than a width of 0.14λ/NA and extends the mask linearity down to 0.5λ/NA from 0.7λ/NA.[306] For DUV exposures, this critical tolerance is only 0.04 microns. Relative to a conventional mask, a shifter width of 0.07λ/NA improves the depth of focus at k1 = 0.5 from k2 = 0.3 to k2 = 0.5 with an attendant improvement in exposure latitude from 22% to 26%. 
Below the Rayleigh limit, the process latitude of L/S pairs using a Levenson mask is greater than that using rim shifting. Rim phase shifting is only effective on periodic patterns having wider spaces than lines.[345] The rim shifter technique gives ~30% higher image contrast than conventional imaging.[344] The log slope of the aerial image improves with increasing coherence (σ = 0.5 → 0.3), but the side lobes of the aerial image can be twice as large, making this impractical.[318]

Attenuated Phase Shifting. An attenuated phase shift mask replaces the opaque part of the mask with an absorbing π (pi or 180°) phase shifter.[328][346] This mask is sometimes called a transmission-pi, or t-pi
mask. The transmission of the phase shifter is adjusted to <10% to prevent imaging of spurious features. A multilayer structure can provide the absorption and the phase shifting as separate functions. Unbiased mask features are impractical since the exposure times are extremely high and there are strong proximity effects. Features of different shapes will not share the same exposure domain. Exposure times and proximity effects are reduced with a uniform bias of 0.1λ/NA. This construction gives better depth of focus, smaller proximity effects, and higher packing density potential than a biased rim shifter. Attenuated phase shifting can produce the same image contrast for isolated lines as rim or double edge chromeless phase shifting below the Rayleigh resolution limit.[344]

12.3 Serifs

Serifs allow improvements in terms of reduced radius of corner rounding, reduced area loss, and better preservation of the aspect ratio of rectangles.[347]–[349] Serifs are optical defects (i.e., their size is well below the resolution limit) placed in proximity to the corners of a feature of interest (e.g., a square contact on a mask will have subresolution squares placed at the tip of each corner). The slope of the image intensity distribution is lower as the radius of corner rounding is reduced with serifs. However, the dose-related size variability is increased with serifs in the corner area. Serifs help maintain better resist profiles in the corner region than rectangles imaged without serifs. This translates into increased depth of focus latitude. However, in the context of linewidth control, the serifs make matters worse.[328] Serifs can be additive or subtractive relative to the original design structure and are an important tool for addressing proximity effects and expanding overall process latitude. 
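As an illustration of additive serifs, the following minimal sketch places one subresolution square at each corner of a rectangular mask feature. Centering each serif on the corner point is an assumption made here for simplicity; the text does not specify the exact placement or size, and the function name, coordinate convention, and example dimensions are hypothetical. Subtractive serifs are not covered.

```python
def corner_serifs(feature, serif_size, resolution_limit):
    """Return four square serifs, one centered on each corner of a rectangle.

    feature is (x_min, y_min, x_max, y_max); all dimensions share one unit.
    Centering on the corner is an illustrative assumption, not a rule
    taken from the cited references.
    """
    if serif_size >= resolution_limit:
        raise ValueError("serif must stay below the resolution limit")
    x_min, y_min, x_max, y_max = feature
    half = serif_size / 2.0
    return [
        (cx - half, cy - half, cx + half, cy + half)
        for cx in (x_min, x_max)
        for cy in (y_min, y_max)
    ]


# Hypothetical 0.5 um square contact (wafer scale) with 0.1 um serifs.
contact = (0.0, 0.0, 0.5, 0.5)
for serif in corner_serifs(contact, serif_size=0.1, resolution_limit=0.4):
    print(serif)
```

The size check mirrors the requirement that serifs remain well below the resolution limit so that they modify the corner image without printing as separate features.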
12.4 Excimer Laser Irradiation Damage

When the mask or reticle pattern is formed by excimer laser projection processing, chromium mask damage results for metal films of 70–80 nm thickness.[350] Damage of the chromium film on quartz is a result of poor heat transfer following optical absorption, which causes metal stress and fatigue. Assuming no transmission, a chromium film absorbs 79% of the incident light at 248 nm. The degree of damage is pattern size dependent (smaller features are more
desirable) and ranges from erosion of pattern edges to total ablation of the chromium film, depending on the fluence. At lower fluences, chromium films crack to various degrees and the damage can be cumulative. This is the major concern for excimer laser wafer steppers, although the energy at the object plane is reduced by the square of the reduction ratio (which presents no relief for unit magnification steppers). Damage thresholds are ~25–50 mJ/cm² at 248 nm and total ablation can happen at ~800 mJ/cm². Increasing the chromium film thickness appears to improve the heat transfer, so damage is avoided. Chromium films of 2 microns thickness offer relief, but these must be etched in reactive ion etchers because the isotropic undercut of wet etching becomes prohibitive. Films of this height are not a problem for reduction lenses with their large object plane depth of focus. Focused air cooling of the mask or reticle can allow the more conventional chromium masks to continue in use, or alternative mask materials could show better resistance to damage.

12.5 Registration Error Contributions

It is desirable that the mask or reticle form as nearly an ideal grid as possible so that the image placement errors will be dominated by the distortions of the stepper optical system. For 10X reduction lenses, this may be a reasonable assumption for random errors on the reticle. Unfortunately, 4–5X reduction systems must consider the quality of systematic and random placement errors present on the reticle as registration requirements approach the design limits of the stepper (even though their magnitude at the wafer plane is reduced by the demagnification). Unit magnification steppers processing submicron features absolutely must demand the highest quality masks, since mask errors degrade registration and critical dimension control capability directly. 
It is preferable that all masks or reticles in a mask or reticle set be generated on a single electron beam machine or pattern generator, so that equipment variation is blocked. For a particular device mask set, absolute errors are not as important as relative ones. In fact, intentional introduction of a magnification offset or correction on selected mask or reticle layers can be used to compensate for systematic process induced errors (e.g., high temperature deposition of films).
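The two reduction-ratio scalings mentioned in Sections 12.4 and 12.5 can be sketched numerically: fluence at the reticle scales as the wafer-plane fluence divided by the square of the reduction ratio, and reticle placement errors are divided by the reduction ratio at the wafer. The sketch assumes an ideal lossless lens, and the wafer fluence of 100 mJ/cm² and the reticle error of 0.1 µm are hypothetical inputs chosen only for illustration.

```python
def reticle_fluence(wafer_fluence_mj_cm2, reduction_ratio):
    """Fluence at the reticle for a given wafer-plane fluence (lossless lens)."""
    return wafer_fluence_mj_cm2 / reduction_ratio ** 2


def wafer_placement_error(reticle_error_um, reduction_ratio):
    """Reticle placement error as seen at the wafer plane."""
    return reticle_error_um / reduction_ratio


DAMAGE_THRESHOLD_MJ_CM2 = 25.0  # lower end of the range quoted in the text
WAFER_FLUENCE_MJ_CM2 = 100.0    # hypothetical value
RETICLE_ERROR_UM = 0.1          # hypothetical value

for ratio in (1, 5, 10):
    fluence = reticle_fluence(WAFER_FLUENCE_MJ_CM2, ratio)
    error = wafer_placement_error(RETICLE_ERROR_UM, ratio)
    status = "above" if fluence >= DAMAGE_THRESHOLD_MJ_CM2 else "below"
    print(f"{ratio}X: reticle fluence {fluence:.1f} mJ/cm2 ({status} threshold), "
          f"placement error at wafer {error:.3f} um")
```

For these inputs only the unit magnification case exceeds the damage threshold, and it is also the case in which the full reticle placement error appears at the wafer.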
REFERENCES
1. Shafer, D. R., Offner, A., and Singh, R., U.S. Patent, number 4,747,678, assigned to Perkin-Elmer Corp., “Optical Relay System with Magnification,” (May 1988)
2. Tasch, A. F., SPIE, 333:68 (1982)
3. Condon, E. U., and Odishaw, H., Handbook of Physics, 2nd Ed., McGraw-Hill (1967)
4. Offner, A., Perkin-Elmer Symposium on The Practical Application of Modulation Transfer Functions, Perkin-Elmer, p. 240 (Mar. 6, 1963)
5. Williams, C. S., and Becklund, O. A., Introduction to the Optical Transfer Function, John Wiley & Sons (1989)
6. Kawata, H., Carter, J. M., Yen, A., and Smith, H. I., Microelectronic Eng., 9:31 (1989)
7. Goodman, J. W., Introduction to Fourier Optics, McGraw-Hill (1968)
8. Smith, F. G., and Thomson, J. H., Optics, John Wiley & Sons (1971)
9. Nakase, M., Photog. Sci. and Eng., 27(6):254 (1983)
10. Abbott, F., Optical Spectra, p. 54 (Mar. 1970)
11. Berkovitz, M. A., SPIE, 13:115 (1969)
12. Yeung, M. S., SPIE, 922:149 (1988)
13. Bowden, M. J., J. Elect. Chem. Soc., p. 195C (May 1981)
14. Yeung, M., Proc. Kodak Microelectronics Seminar (Oct. 1985)
15. Goodman, D. S., and Rosenbluth, A. E., SPIE, 922:108 (1988)
16. O’Toole, M. M., and Neureuther, A. R., SPIE, 174:22 (1979)
17. Lacombat, M., and Dubroeucq, G. M., SPIE, 174:28 (1979)
18. King, M. C., Proc. Kodak Microelectronics Seminar, p. 33 (Oct. 1980)
19. Lacombat, M., Dubroeucq, G. M., Massin, J., and Brevignon, M., Solid State Technology, p. 115 (Aug. 1980)
20. Lacombat, M., Dubroeucq, G. M., Massin, J., and Brevignon, M., Solid State Technology, 23:115 (Aug. 1980)
21. King, M. C., and Goldrick, M. R., Solid State Technology, p. 37 (Feb. 1977)
22. Anon.
23. Mack, C. A., Proc. KTI Microelectronics Seminar, p. 23 (Oct. 1991)
24. Offner, A., Optical Eng., 26(4):294 (Apr. 1987)
25. King, M. C., IEEE Trans. Electron Devices, ED-26:711 (1979)
26. Oldham, W. G., Jain, P., and Neureuther, A. R., Proc. Kodak Microelectronics Seminar (Oct. 1981)
27. Rosenau, M. D., Perkin-Elmer Symposium on The Practical Application of Modulation Transfer Functions, Perkin-Elmer, p. 252 (Mar. 6, 1963)
28. Coltman, J. W., J. Opt. Soc. Am., 44(6):468 (1954)
29. Mack, C. A., and Connors, J. E., SPIE, 1674:328 (1992)
30. Offner, A., Photogr. Sci. Eng., 23(6):374 (1979)
31. Crisalee, O. D., Keifling, S. R., Seborg, D. E., and Mellichamp, D. A., IEEE Trans. Semic. Manuf., 5(1):14 (1992)
32. Marker, A. J., SPIE, 1535:60 (1991)
33. Tai, K. L., Vadimsky, R. G., Kemmerer, C. T., Wagner, J. S., Lamberti, V. E., and Timko, A. G., J. Vac. Sci. Technol., 17(5):1169 (Sept./Oct. 1980)
34. Wake, R. W., and Flanigan, M. C., SPIE, 539:291 (1985)
35. Babu, S. V., and Srinivasan, V., J. Imaging Technology, 11(4):168 (1985)
36. Babu, S. V., and Srinivasan, V., SPIE, 539:36 (1985)
37. Srinivasan, V., and Babu, S. V., JECS, 133:1686 (1986)
38. Trefonas, P., and Daniels, B. K., SPIE, 771:194 (1987)
39. Blais, P. D., Solid State Technology, 20(8):76 (1977)
40. Flanigan, M. C., and Wake, R. W., SPIE, 539:44 (1985)
41. Arden, W., and Mader, L., SPIE, 539:219 (1985)
42. Arnold, W. H., and Levinson, H. J., Proc. Kodak Microelectronics Seminar, p. 80 (Nov. 1983)
43. Daniels, B. K., Trefonas, P., and Woodbrey, J. C., Solid State Technology, p. 105 (Sept. 1988)
44. Mack, C. A., SPIE, 922:135 (1988)
45. Watts, M. P. C., Semiconductor International, p. 124 (Apr. 1985)
46. Waldo, W. G., and Helbert, J. N., SPIE, 1088:153 (1989)
47. Taylor, G. N., Solid State Technology, 27(6):105 (1984)
48. Hansen, S. G., Dao, G., Gaw, H., Qian, Q. D., and Spragg, P., SPIE, 1463:230 (1991)
49. Arden, W., Klose, H., and Krause, A., Proc. Kodak Microelectronics Seminar, p. 11 (Oct. 1982)
50. Scott, F., Perkin-Elmer Symposium on The Practical Application of Modulation Transfer Functions, Perkin-Elmer, p. 248 (Mar. 6, 1963)
51. Brunner, T. A., and Allen, R. R., SPIE, 565:6 (1985)
52. Brunner, T. A., and Allen, R. R., IEEE Elect. Device Letters, EDL-6(7):329 (July 1985)
53. Cole, D., Barouch, E., Hollerbach, U., and Orszag, S. A., J. Vac. Sci. Technology B (Nov./Dec. 1992)
54. Beveridge, G. S. G., and Schechter, R. S., Optimization: Theory and Practice, McGraw-Hill (1970)
55. Dobrowolski, J. A., and Kemp, R. A., Appl. Opt., 29(19):2876 (July 1990)
56. Box, G. E. P., and Draper, N. R., Empirical Model Building and Response Surfaces, John Wiley & Sons (1987)
57. Malacara, D., Optical Shop Testing, John Wiley & Sons (1978)
58. Yoder, P. R., Grosso, R. P., and Crane, R., SPIE, 330:84 (1982)
59. Jenkins, F. A., and White, H. E., Fundamentals of Optics, 4th Ed., McGraw-Hill (1976)
60. Slocum, A. H., Precision Eng., 10(2):84 (Apr. 1988)
61. Yoder, P. R., SPIE, 1533:2 (1991)
62. Wang, J. Y., and Silva, D. E., Appl. Opt., 19:1510 (May 1980)
63. Berggren, R., Optical Spectra, p. 22 (Dec. 1970)
64. Kano, I., SPIE, 174:48 (1979)
65. Webb, J. E., SPIE, 480:133 (1984)
66. Yan, P., Qian, Q., Langston, J., and Leon, P., SPIE, 1674:316 (1992)
67. Fincham, W. H. A., and Freeman, M. H., Optics, 9th Ed., Butterworths (1980)
68. Van Heel, A. C. S., Advanced Optical Techniques, John Wiley & Sons (1967)
69. Toh, K. K. H., and Neureuther, A. R., SPIE, 772:202 (1987)
70. Eib, N., Barouch, E., Hollerbach, U., and Orszag, S. A., SPIE, 1674:105 (1992)
71. Bruning, J. H., President, Tropel, Rochester, NY, personal communication.
72. Peters, D., Proc. Kodak Microelectronics Seminar, p. 66 (1985)
73. Sewell, H., and Friedman, I., SPIE, 922:328 (1988)
74. Goodall, F., and Lawes, R., Proc. KTI Microelectronics Seminar (Nov. 1987)
75. Hershel, R. S., SPIE, 174:54 (1979)
76. Dyson, J., J. Opt. Soc. Amer., 49:713 (1959)
77. Stephanakis, A. C., and Rubin, D. I., SPIE, 772:74 (1987)
78. Dunbrack, S. K., and Langston, J. C., Proc. Kodak Microelectronics Seminar, p. 62 (Nov. 1983)
79. Mack, C. A., Stephanakis, A., and Hershel, R., Proc. Kodak Microelectronics Seminar, p. 228 (Nov. 1986)
80. Mack, C. A., Proc. KTI Microelectronics Seminar, p. 153 (Nov. 1987)
81. Mack, C. A., and Kaufman, P. M., SPIE, 1088:304 (1989)
82. Subramanian, S., Appl. Opt., 20(10):1854 (May 1981)
83. Matsumoto, K., Konno, K., and Ushida, K., Proc. Kodak Microelectronics Seminar, p. 74 (1983)
84. Born, M., and Wolf, E., Principles of Optics, 5th Ed., Pergamon Press (1975)
85. Yeung, M., Proc. Kodak Microelectronics Seminar, p. 115 (1985)
86. Bernard, D. A., IEEE Trans. Semic. Manuf., 1(3):85 (1988)
87. Barouch, E., Hollerbach, U., Orszag, S. A., Allen, M. T., and Calabrese, G. S., SPIE, 1463:336 (1991)
88. Hornberger, W. P., Hauge, P. S., Shaw, H. M., and Dill, F. H., Proc. Kodak Microelectronics Seminar (1974)
89. Ferguson, R. A., Hutchinson, J. M., Spence, C. A., and Neureuther, A. R., J. Vac. Sci. Technol. B, 8(6):1423 (Nov./Dec. 1990)
90. Ziger, D., Mack, C. A., and Distasio, R., SPIE, 1466:270 (1991)
91. Kim, D. J., Oldham, W. G., and Neureuther, A. R., IEEE Trans. Electron Devices, ED-31:1730 (1984)
92. Trefonas III, P., Fisher, T. A., and Lachowski, J., Microelectronic Eng., 13:109 (1991)
93. Ushirogouchi, T., Onishi, Y., and Tada, T., J. Vac. Sci. Technol. B, 8(6):1418 (Nov./Dec. 1990)
94. Flanner, P. D. III, Proc. KTI Microelectronics Seminar, p. 231 (Nov. 1987)
95. Fukuda, H., and Okazaki, S., J. Electrochem. Soc., 137(2):675 (Feb. 1990)
96. Mack, C. A., Capsuto, E., Sethi, S., and Witowski, J., J. Vac. Sci. Technol. B, 9(6):3143 (Nov./Dec. 1991)
97. Turner, S., Babcock, C., and Cerrina, F., J. Vac. Sci. Technol. B, 9(6):3440 (Nov./Dec. 1991)
98. Szmanda, C. R., Shipley Co. Inc., Newton, MA, personal communication.
99. Waldo, W. G., Motorola Techn. Devel., 15:133 (1992)
100. Haller, W. E., and Neureuther, A. R., SPIE, 922:168 (1988)
101. Flagello, D. G., and Rosenbluth, A. E., J. Vac. Sci. Technol. B (Nov./Dec. 1992)
102. Kuyel, B., Barouch, E., Hollerbach, U., and Orszag, S. A., SPIE, 1674:376 (1992)
103. Pfau, A. K., Hsu, R., and Oldham, W. G., SPIE, 1674:182 (1992)
104. Hufnagel, R. E., Perkin-Elmer Symposium on The Practical Application of Modulation Transfer Functions, Perkin-Elmer, p. 244 (Mar. 6, 1963)
105. Vollrath, W., SPIE, 1138:166 (1989)
106. Chien, P., Liauw, L., and Chen, M., SPIE, 538:197 (1985)
107. Fukuda, H., Imai, A., and Okazaki, S., SPIE, 1264:14 (1990)
108. Noguchi, M., Muraki, M., Iwasaki, Y., and Suzuki, A., SPIE, 1674:92 (1992)
109. Robertson, P. D., Wise, F. W., Nasr, A. N., Neureuther, A. R., and Ting, C. H., SPIE, 334:37 (1982)
110. Mack, C., Proc. KTI Microelectronics Seminar, p. 209 (Nov. 1989)
111. Fehr, D. L., Lovering, H. B., and Scruton, R. T., Proc. KTI Microelectronics Seminar, p. 217 (Nov. 1989)
112. Tounai, K., Tanabe, H., Nozue, H., and Kasama, K., SPIE, 1674:753 (1992)
113. Chandra, S., and Wu, F. Y., SPIE, 772:86 (1987)
114. Buckley, J. D., Galburt, D. N., and Karatzas, C., J. Vac. Sci. Technol. B, 7(6):1607 (1989)
115. Noguchi, M., Yoshitake, Y., and Kembo, Y., SPIE, 1674:662 (1992)
116. Matsuo, S., Komatsu, K., Takeuchi, Y., Tamechika, E., Mimura, Y., and Harada, K., Intl. Elect. Dev. Mtg. Tech. Digest, p. 970 (1991)
117. Gaskill, J. D., Linear Systems, Fourier Transforms, and Optics, John Wiley & Sons (1978)
118. Fukuda, H., Terasawa, T., and Okazaki, S., J. Vac. Sci. Technol. B, 9(6):3113 (1991)
119. Shiraishi, N., Hirukawa, S., Takeuchi, Y., and Magome, N., SPIE, 1674:741 (1992)
120. Hornbeck, R. W., Numerical Methods, Quantum Publishers (1975)
121. Averill, E. W., Elements of Statistics, John Wiley & Sons (1972)
122. Bedworth, D. D., and Bailey, J. E., Integrated Production Control Systems, John Wiley & Sons (1982)
123. Mendenhall, W., Introduction to Probability and Statistics, 4th Ed., Duxbury Press (1975)
124. Box, G. E. P., Hunter, W. G., and Hunter, J. S., Statistics for Experimenters, John Wiley & Sons (1978)
125. Box, G. E. P., and Behnken, D. W., Technometrics, 2:455 (1960)
126. Yeung, M., Langston, J., and Sparkes, C., SPIE, 565:32 (1985)
127. Arnold, W. H., SPIE, 922:94 (1988)
128. Rominger, J. P., SPIE, 922:188 (1988)
129. Lin, B. J., IEEE Trans. Electron Devices, ED-27(5):931 (1980)
130. Rosenbluth, A. E., Goodman, D., and Lin, B. J., J. Vac. Sci. Technol. B, 1(4):1190 (Oct./Dec. 1983)
131. Bossung, J. W., Proc. Soc. Photo-Optical Instrum. Eng., 100:80 (Apr. 1977)
132. Arnold, W. H., and Levinson, H. J., SPIE, 772:21 (1987)
133. Meyerhofer, D., SPIE, 922:174 (1988)
134. Bruce, J. A., Leidy, R. K., and Cole, D. C., Proc. KTI Microelectronics Seminar, p. 205 (Nov. 1991)
135. Lis, S. A., Proc. Kodak Microelectronics Seminar (Nov. 1986)
136. Chien, P., and Chen, M., SPIE, 772:35 (1987)
137. Flanner, P. D., Subramanian, S., and Neureuther, A. R., SPIE, Vol. 633 (1986)
138. White, L. K., SPIE, 772:239 (1987)
139. Horiuchi, T., and Suzuki, M., Symp. on VLSI Technol., Digest of Technical Papers (May 1985)
140. Canestrari, P., Degiorgis, G., De Natale, P., Gazzaruso, L., and Rivera, G., SPIE, 1463:446 (1991)
141. Nistler, J. L., Preil, M., and Singh, B., Proc. KTI Microelectronics Seminar, p. 295 (Oct. 1991)
142. Wolf, T. M., Fu, C. C., Eisenberg, J. H., and Fritzinger, L. B., Proc. KTI Microelectronics Seminar, p. 335 (Nov. 1989)
143. Spence, C. A., Ferguson, R. A., Yeung, M., Das, S., Hutchinson, J. M., and Neureuther, A. R., J. Vac. Sci. Technol. B, 8(6):1735 (1990)
144. Lin, B. J., SPIE, 1463:42 (1991)
145. Feldman, M., Wong, G. G., and Cheng, M., J. Vac. Sci. Technol. B, 5(1):241 (1987)
146. Okazaki, S., J. Vac. Sci. Technol. B, 9(6):2829 (1991)
147. Lin, B. J., IEEE Trans. Electron Devices, ED-25(4):419 (1978)
148. Lee, W., Davis, R., Miller, R., and McCoy, J., Proc. KTI Microelectronics Seminar, p. 179 (Nov. 1989)
149. Buckley, J. D., Solid State Tech. (Jan. 1987)
150. Suwa, K., and Ushida, K., SPIE, 922:270 (1988)
151. Mayer, H. E., and Loebach, E. W., SPIE, 221:9 (1980)
152. Fukuda, H., Hasegawa, N., Tanaka, T., and Hayashida, T., IEEE Trans. Electron Devices, EDL-8(4):179 (1987)
153. Hayashida, T., Fukuda, H., Tanaka, T., and Hasegawa, N., SPIE, 772:66 (1987)
154. Sugiyama, S., Tawa, T., Oshida, Y., Kurosaki, T., and Mizuno, F., SPIE, 922:318 (1988)
155. Hale, K., and Luehrmann, P., Proc. Kodak Microelectronics Seminar (Nov. 1986)
156. Edmark, K. W., and Ausschnitt, C. P., SPIE, 538:91 (1985)
157. Suzuki, A., Yabu, S., and Ookubo, M., SPIE, 772:58 (1987)
158. Brunner, T. A., Cheng, S., and Norton, A. E., SPIE, 922:366 (1988)
159. Bouwhuis, G., and Wittekoek, S., IEEE Trans. Electron Devices, ED-26(4):723 (1979)
160. Schenker, R., Eichner, L., Vaidya, H., and Oldham, W. G., SPIE, 2440:118 (1994)
161. Kano, I., SEMI Technol. Symposium ’87, Tokyo, Japan, p. 54 (Dec. 9–10, 1987)
162. Guidici, D. C., SPIE, 174:132 (1979)
163. Daughton, W. J., and Givens, F. L., J. Electrochem. Soc., 129(1):173 (Jan. 1982)
164. White, L. K., Electrochem. Soc. Ext. Abstracts, 84-1:138 (May 1984)
165. White, L. K., SPIE, 539:29 (1985)
166. Peurrung, L. M., and Graves, D. B., J. Electrochem. Soc., 138(7):2115 (July 1991)
167. Wilson, R. H., and Piacente, P. A., Electrochem. Soc. Ext. Abstracts, 84-2:609 (Oct. 1984)
168. Schiltz, A., Abraham, P., and Dechenaux, E., J. Electrochem. Soc., 134(1):190 (Jan. 1987)
169. Stillwagon, L. C., Larson, R., and Taylor, G. N., J. Electrochem. Soc., 134:2030 (1987)
170. Stillwagon, L., and Taylor, G. N., Polymers in Microlithography, (E. Reichmanis, S. A. MacDonald, and T. Iwayanagi, eds.), ACS Symp. Series 412, Amer. Chem. Soc., Washington, DC (1989)
171. Barouch, E., Bradie, B., Hollerbach, U., and Orszag, S. A., J. Vac. Sci. Technol. B, 8(6):1432 (Nov./Dec. 1990)
172. Adams, A. C., and Capio, C. D., J. Electrochem. Soc., 128:423 (1981)
173. Ellwanger, R. C., Broadbent, E. K., and Bril, T. W., Electrochem. Soc. Ext. Abstracts, 84-2:603 (Oct. 1984)
174. Wilson, R. H., and Piacente, P. A., Semiconductor International, p. 116 (Apr. 1986)
175. Schiltz, A., and Pons, M., J. Electrochem. Soc., 133:178 (1986)
176. Sheldon, D. J., Gruenshlaeger, C. W., Kammerdiner, L., Henis, N. B., Kelleher, P., and Hayden, J. D., IEEE Trans. Semic. Manuf., 1(4):140 (Nov. 1988)
177. Nagy, A., and Helbert, J., Solid State Technology, p. 53 (Jan. 1991)
178. Daubenspeck, T. H., DeBrosse, J. K., Koburger, C. W., Armacost, M., and Abernathey, J. R., J. Electrochem. Soc., 138(2):506 (Feb. 1991)
179. Bettes, T. C., Semiconductor International, p. 83 (Apr. 1982)
180. Peters, D. W., Proc. Kodak Microelectronics Seminar (Nov. 1986)
181. Gear, G., Proc. Kodak Microelectronics Seminar, p. 104 (1985)
182. Elliot, D. J., Proc. KTI Microelectronics Seminar (Nov. 1987)
183. Ruckle, B., Lokai, P., Rosenkranz, H., Nikolaus, B., Kahlert, H. J., Burghardt, B., Basting, D., and Muckenheim, W., SPIE, 922:450 (1988)
184. Znotins, T. A., McKee, T. J., Gutz, S. J., Tan, K. O., and Norris, W. B., SPIE, 922:454 (1988)
185. Oesterlin, P., Lokai, P., Rosenkranz, H., Kahlert, H. J., and Basting, D., SPIE, 1138:113 (1989)
186. Elliot, D. J., Pennelli, C. P., and Sengupta, U. K., J. Vac. Sci. Technol. B, 9(6):3122 (Nov./Dec. 1991)
187. Lokai, P., Rebhan, U., Stamm, U., Bucher, H., Kahlert, H. J., and Basting, D., SPIE, 1674:669 (1992)
188. Mizoguchi, H., Wakabayashi, O., Itoh, N., Kowaka, M., Fujimoto, J., Kobayashi, Y., Ishihara, T., Amada, Y., and Nozue, Y., SPIE, 1674:532 (1992)
189. Pol, V., Bennewitz, J. H., Escher, G. C., Feldman, M., Firtion, V. A., Jewell, T. E., Wilcomb, B. E., and Clemens, J. T., SPIE, 633:6 (1986)
190. Tipton, M., Misium, G., and Garza, C., J. Vac. Sci. Technol. B, 8(6):1740 (Nov./Dec. 1990)
191. Mace, P. N., SPIE, 247:108 (1980)
192. Jain, P. K., Neureuther, A. R., and Oldham, W. G., IEEE Trans. Electron Devices, ED-28(11):1410 (1981)
193. Ishihara, T., Sandstrom, R., Reiser, C., and Sengupta, U., SPIE, 1674:473 (1992)
194. Lokai, P., Rebhan, U., Osterlin, P., Kahlert, H. J., and Basting, D., SPIE, 1264:496 (1990)
195. Partlo, W. N., and Oldham, W. G., J. Vac. Sci. Technol. B, 9(6):3126 (Nov./Dec. 1991)
196. Rothschild, M., and Ehrlich, D. J., SPIE, 922:466 (1988)
197. Bobroff, N., Rev. Sci. Instrum., 57(6):1152 (1986)
198. Goodall, F. N., and Lawes, R. A., SPIE, 922:410 (1988)
199. Rothschild, M., J. Vac. Sci. Technology B (Nov./Dec. 1992)
200. Ruff, B., Tai, E., and Brown, R., SPIE, 1088:441 (1989)
201. Tracy, D. H., and Wu, F. Y., SPIE, 922:437 (1988)
202. Flagello, D. G., and Pomerence, A. T. S., SPIE, 772:6 (1987)
203. Widmann, D. W., Applied Optics, 14(4):931 (Apr. 1975)
204. Cuthbert, J. D., Solid State Technology, p. 59 (Aug. 1977)
205. Mack, C. A., Appl. Opt., 25(12):1958 (June 1986)
206. Lyons, C. F., Long, D. T., Miura, S. S., and Wood, R. L., Solid State Technology, p. 95 (Nov. 1990)
11/30/00 JMR
638
Handbook of VLSI Microlithography
207. Widmann, D. W., and Binder, H., IEEE Trans. Electron Devices, ED22(7):467 (1975) 208. Kuyel, B., and Sewell, H., J. Vac. Sci. Technol. B, 8(6):1385(Nov./Dec. 1990) 209. Casalnuovo, S. A., SPIE, Vol. 771 (1987) 210. Bolsen, M., Buhr, G., Merrem, H. J., and van Werden, K., Solid State Technology, p. 83 (Feb. 1986) 211. White, L. K., Proc. Kodak Microelectronics Seminar (Oct. 1981) 212. Tanaka, T., Hasegawa, N., Shiraishi, H., and Okazaki, S., J. Electrochem. Soc., 137(12):3900 (Dec. 1990) 213. Brunner, T. A., Lyons, C. F., and Miura, S. S., J. Vac. Sci. Technology B, 9(6):3418 (Nov./Dec. 1991) 214. Ohtsuka, H., and Kanamori, J., OKI Technical Review 124, p. 33 (July 1986) 215. Mack, C. A., Solid State Technology, p. 125 (Jan. 1988) 216. Brown, A. V., and Arnold, W. H., SPIE, 539:259 (1985) 217. Coopmans, F., and Roland, B., SPIE, 631:34 (1986) 218. Garza, C. M., Misium, G. R., and Doering, R. R., SPIE, Vol. 1086 (1989) 219. Hashimoto, T., Yamanaka, H., Iino, T., and Takahashi, S., SPIE, Vol. 920 (1988) 220. Lin, B. J., SPIE, 174:114 (1979) 221. Brewer, T., Carlson, R., and Arnold, J., J. Appl. Photog. Eng., 7(6):184 (1981) 222. Coyne, R. D., and Brewer, T., Proc. Kodak Microelectronics Seminar (Nov. 1983) 223. Lin, Y., Marriott, V., Orvek, K., and Fuller, G., SPIE, 469:30 (1984) 224. van den Berg, H. A. M., and van den Berg, P. M., IEEE Trans. Electron Devices, ED-28(12):1535 (1981) 225. Nolscher, C., Mader, L., and Schneegans, M., SPIE, 1086:242 (1989) 226. van den Berg, H. A. M., and van Staden, J. B., J. Appl. Phys., 50:1212 (1979) 227. Yen, A., Smith, H. I., Schattenburg, M. L., and Taylor, G. N., J. Electrochem. Soc., 139(2):616 (Feb. 1992) 228. Griffing, B. F., and West, P. R., Electron Dev. Letters, EDL-4:14 (1983) 229. Brown, T., and Mack, C. A., SPIE, 920:390 (1988) 230. Spak, M., Mammoto, D., Jain, S., and Durham, D., Proc. 7th Int. Techn. Conf. Photopolymers, 247 (1985) 231. Balch, E. W., Weaver, S. E., and Saia, R. J., SPIE, 922:387 (1988) 232. 
Vollenbroek, F. A., and Geomini, M. J. H. J., SPIE, Vol. 920:419 (1988)
11/30/00 JMR
Techniques and Tools for Optical Lithography
639
233. Middelhoek, S., IBM J. Res. Dev., 14:117 (1970) 234. Neureuther, A. R., and Dill, F. H., Optical and Acoustical Microelectronics, Polytechnic Press, pp. 233–249 (1974) 235. 236. 237. 238. 239.
Walker, E. J., IEEE Trans. Electron Devices, ED-22(7):464 (1975) Ito, H. and Willson, C. G., Polym. Eng. Sci., 23:1012 (1983) Burggraaf, P., Semiconductor International, p. 23 (Dec. 1987) Lin, B. J., Solid State Technology, p. 63 (Jan. 1987) Rudoler, S., Hadar, O., Fisher, M., and Kopeika, N. S., Optical Eng., 30(5):577 (May 1991)
240. Lin, B. J., SPIE, 1088:106 (1989) 241. Hornberger, W. P., Hauge, P. S., Shaw, J. M., and Dill, F. H., Kodak Microelectronics Seminar (1974) 242. Tanaka, T., Fududa, H., Gasegawa, N., Hashimoto, M., and Okazaki, S., J. Vac. Sci. Technol. B, 7(2):188 (Mar./Apr. 1989) 243. Asaumi, S., and Nakane, H., J. Electrochem. Soc., 137(8):2546 (Aug. 1990) 244. Jian-Ping, H., Kwei, T. K., and Reiser, A., Macromolecules, 22:4106 (1989) 245. Flanner, P. D., III, Proc. KTI Microelectronics Seminar, p. 231 (Nov. 1987) 246. Bruce, J. A., and Lin, B. J., Proc. KTI Microelectronics Seminar, p. 1 (Nov. 1987) 247. Sautter, K. M., Ha, M, and Batchelder, T., Proc. KTI Microelectronics Seminar, p. 99 (Nov., 1988) 248. Grace, K. G., Petrillo, K., Hohn, F. J., Wilson, A. D., and Moreau, W. M., J. Vac. Sci. Technol. B, 6(6):2238 (Nov./Dec. 1988) 249. Yoshimura, T., Murai, F., Shiraishi, H., and Okazaki, S., J. Vac. Sci. Technol. B, 6(6):2249 (Nov./Dec. 1988) 250. Asaumi, S., and Furuta, M., J. Electrochem. Soc., 139(3):889 (Mar. 1992) 251. Samarakone, N., Jaenen, P., Van den hove, L, Thirsk, M., and Daraktchiev, I., Microelectronic Eng., 13:85 (1991) 252. Sommargren, G. E., SPIE, 1088:268 (1989) 253. Cote, D. R., Lazo-Wasem, J. E., and Rahmlow, T. D., SPIE, Vol. 921 (1988) 254. Steinmetz, C. R., Precision Eng., 12(1):12 (Jan. 1990) 255. Wayne, K. J., SPIE, 60:124 (1975) 256. Kendall, R., Doran, S., and Weissmann, E., J. Vac. Sci. Technol. B, 9(6):3019 (Nov./Dec. 1991) 257. Spanner, K., Marth, H., and Gutheil, W., SPIE, 565:41 (1985) 258. Ebert, E., SPIE, 1087:415 (1989)
11/30/00 JMR
640
Handbook of VLSI Microlithography
259. Cameron, J. F., Seto, J. A., and Wise, L. A., SPIE, 1674:435 (1992) 260. Jacobs, S. F., Johnston, S. C., and Schwab, D. E., Applied Optics, 23(20):3500 (Oct. 1984) 261. 262. 263. 264.
Zerodur is a registered trademark of Schott Glaswerke, Mainz, Germany. Marx, T. A., SPIE, 1535:130 (1991) Pepi, J. W., and Golini, D., SPIE, 1533:212 (1991) Luehrmann, P. F., de Mol, C. G. M., van Hout, F. J., George, R. A., and van der Putten, H., SPIE, 1463:434 (1991) 265. Perloff, D. S., IEEE Journ. Sol. St. Circ., SC-13(4):436 (1978) 266. Arnold, W., SPIE, Vol. 394 (1983) 267. van den Brink, M. A., de Mol, C. G. M., and George, R. A., SPIE, 921:180 (1988) 268. Heflinger, B., SPIE, 334:70 (1982) 269. Kirk, C. P., SPIE, 772:134 (1987) 270. Abraham, G., Kirk, J. P., Tibbetts, R. E., and Wilczynski, J. S., IBM Tech. Disclosure Bull., 19(9):3417 (Feb. 1977) 271. 272. 273. 274. 275.
Wilczynski, J. S., J. Vac. Sci. Technol., 16(6):1929 (Nov./Dec. 1979) Beaulieu, D. R., and Hellebrekers, P. P., SPIE, 772:142 (1987) Kleinknecht, H. P., SPIE, 174:63 (1979) Trutna, W. R., and Chen, M., SPIE, 470:62 (1984) Wittekoek, S., Linders, H., Stover, H., Johnson, G., Gallagher, D., and Fergusson, R., SPIE, 565:22 (1985)
276. Ohtsuka, H., Funatsu, H., Kushibiki, G., and Koikeda, T., SPIE, 470:70 (1984) 277. Sawoska, D. A., SanGiacomo, K. D., Jacovich, E. C., and Cordes, W. F., SPIE, 922:217 (1988) 278. Yao, S., SPIE, 772:118 (1987) 279. Sheldon, D. J., Gruenschlaeger, C. W., Kammerdiner, L., Henis, N. B., Kelleher, P., and Hayden, IEEE Trans. Semic. Manuf., 1(4):140 (1988) 280. Manske, L. M., and Graves, D. B., SPIE, 1463:414 (1991) 281. Chivers, K. A., Proc. Kodak Microelectronics Seminar, p. 44 (Oct. 1984) 282. Gallatin, G. M., Webster, J. C., Kintner, E. C., and Wu, F., SPIE, 772:193 (1987) 283. Waldo, W. G., and Helbert, J. N., Motorola Techn. Devel., 9:15 (1989) 284. Fujiwara, K., Tokui, A., and Uoya, S., Proc. Kodak Microelectronics Seminar (1985) 285. Suwa, K., Nakazawa, K., and Yoshida, S., Proc. Kodak Microelectronics Seminar (Oct. 1981)
11/30/00 JMR
Techniques and Tools for Optical Lithography
641
286. MacMillen, D., and Ryden, W. D., SPIE, 334:78 (1982) 287. Armitage, J. D., SPIE, 921:207 (1988) 288. 289. 290. 291. 292.
Preil, M. E., Manchester, T., and Minvielle, A., SPIE, 2197:753 (1994) Tan, R. V., and Ausschnitt, C. P., SPIE, 565:45 (1985) Schneider, W. C., SPIE, 174:6 (1979) Cote, D. R., Clayton, R. H., and Lazo-Wasem, J. E., SPIE, 772:124 (1987) Terasawa, T., Hasegawa, N., Hama, K., and Katagiri, S., SPIE, 1674:127 (1992)
293. Cowan, M. J., Mattis, D. G., and Chapman, E. W., Proc. Kodak Microelectronics Seminar (Oct. 1981) 294. Waldo, W. G., Selinidis, K., and Espenscheid, A., Proc. 1997 IEEE/SEMI Advanced Semiconductor Manufacturing Conference and Workshop, Cambridge, MA, p. 305 (Sept. 10–12, 1997) 295. Yan, P., and Gaw, H., Proc. KTI Microelectronics Seminar, p. 261 (Nov. 1989) 296. Hershel, R., SPIE, 275:23 (1981) 297. Ward, I. E., and Duly, D. L., Proc. Kodak Microelectronics Seminar, p. 35 (Oct. 1984) 298. Partlo, W. N., Oldham, W. G., and Flynn, S., Proc. KTI Microelectronics Seminar, p. 107 (Nov. 1989) 299. Brunner, T. A., Ausschnitt, C. P., and Duly, D. L., Solid State Technology, p. 135 (May 1980) 300. Zavecz, T. E., and Banks, E. L., SPIE, 772:224 (1987) 301. Toh, K. K. H., Fu, C. C., Zollinger, K. L., Neureuther, A. R., and Pease, R. F. W., SPIE, 922:194 (1988) 302. Levenson, M. D., Viswanathan, N. S., and Simpson, R. A., IEEE Trans. Electron Devices, ED-29(12):1828 (1982) 303. Fukukda, H., Imai, A., and Okazaki, S., SPIE, 1264:14 (1990) 304. Pfau, A. K., Oldham, W. G., and Neureuther, A. R., SPIE, 1463:124 (1991) 305. Ronse, K., Jonckheere, R., Baik, K. H., Pforr, R., and Van den hove, L., J. Vac. Sci. Technol. B, 10(6):3012 (Nov./Dec. 1992) 306. Op de Beeck, M., Tokui, A., Fujinaga, M., Yoshioka, N., Kamon, K., Hanawa, T., and Tsukamoto, K., SPIE, 1463:180 (1991) 307. 308. 309. 310. 311.
Prouty, M. D., and Neureuther, A. R., SPIE, 470:228 (1984) Newmark, D. M., Neureuther, A. R., and Pfau, A. K., SPIE, 1674:2 (1992) Liu, Y., Pfau, A. K., and Zakhor, A., SPIE, 1674:14 (1992) Chang, C., Schaper, C. D., and Kailath, T., SPIE, 1674:65 (1992) Nolscher, C., and Mader, L., SPIE, 1463:135 (1991)
11/30/00 JMR
642
Handbook of VLSI Microlithography
312. Spragg, P. M., Dao, G. T., Hansen, S. G., Leonard, R. F., Toukhy, M. A., Singh, R., and Toh, K. K. H., SPIE, 1674:650 (1992) 313. Kusunose, H., Aoyama, S., Hosono, K., Takeuchi, S., Matsuda, S., Op de Beeck, M., Yoshioka, N., Watakabe, Y., SPIE, 1674:230 (1992) 314. Levenson, M. D., Goodman, D. S., Lindsey, S., Bayer, P. W., and Santini, H. A. E., IEEE Trans. Electron Devices, ED-31(6):753 (1984) 315. Katz, B., Greeneich, J., Rogoff, R., Dao, G., Gaw, H., Toh, K., and Sager, C., Proc. KTI Microelectronics Seminar, p. 179 (Oct. 1991) 316. Katz, B., Greeneich, J., Rogoff, R., Dao, G., Gaw, H., Toh, K., and Sager, C., Microelectronics Manuf. Technol., p. 28 (Dec. 1991) 317. Ronse, K., Jonckheere, R., Goethals, A. M., Baik, K. H., and Van den hove, L., Microelectronics Eng., 17:69 (1992) 318. Garofalo, J. G., Kostelak, R. L., and Yang, T. S., SPIE, 1463:151 (1991) 319. Ohtsuka, H., Abe, K., Onodera, T., Kuwahara, K., and Taguchi, T., SPIE, 1463:112 (1991) 320. Yasuzato, T., Iwasaki, H., Nozue, H., and Kasama, K., SPIE, 1674:241 (1992) 321. Ohtsuka, H., Onodera, T., Kuwahara, K., and Taguchi, T., SPIE, 1674:53 (1992) 322. Katnani, A. D., and Lin, B. J., SPIE, 1674:264 (1992) 323. Terasawa, T., Hasegawa, N., Imai, A., Tanaka, T., and Katagiri, S., SPIE, 1463:197 (1991) 324. Pfau, A. K., Scheckler, E. W., Newmark, D. M., and Neureuther, A. R., SRC Pub C91753, Contract 88-MC-500 (Oct. 1991) 325. Pfau, A. K., Scheckler, E. W., Newmark, D. M., and Neureuther, A. R., SPIE, 1674:585 (1992) 326. Levenson, M. D., Microlithography World, p. 6 (Mar./Apr. 1992) 327. Hanyu, I., Asai, S., Kosemura, K., Ito, H., Nunokawa, M., and Abe, M., SPIE, 1264:167 (1990) 328. Lin, B. J., 10th BACUS Symposium (Sept. 1990) 329. Nakatani, M., Nakano, H., Kusunose, H., Kamon, K., Matsuda, S., Watakabe, Y., Takano, H., and Otsubo, M., SPIE, 1674:543 (1992) 330. Kemp, K., Lee, F., Gelatos, C., and Wu, W., Proc. KTI Microelectronics Seminar, p. 67 (Oct. 1991) 331. Toh, K. K. 
H., Dao, G., Singh, R., and Gaw, H., SPIE, 1463:74 (1991)
11/30/00 JMR
Techniques and Tools for Optical Lithography
643
332. Nakagawa, K., Taguchi, M., and Ema, T., Intl. Elect. Dev. Mtg. Tech. Digest, p. 817 (1990) 333. Jinbo, H., Yamashita, Y., and Sadamura, M., J. Vac. Sci. Technol. B, 8(6):1745 (Nov./Dec. 1990) 334. Toh, K. K. H., Dao, G., Singh, R., and Gaw, H., 10th BACUS Symposium (Sept. 1990) 335. Watanabe, H., Todokoro, Y., and Inoue, M., Intl. Elect. Dev. Mtg. Tech. Digest, p. 821 (1990) 336. Jinbo, H., and Yamashita, Y., Intl. Elect. Dev. Mtg. Tech. Digest, p. 825 (1990) 337. Watanabe, H., Todokoro, Y., Hirai, Y., and Inoue, M., SPIE, 1463:101 (1991) 338. Watanabe, H., Takenaka, H., Todokoro, Y., and Inoue, M., J. Vac. Sci. Technol., 9(6):3172 (Nov./Dec. 1991) 339. Tabuchi, H., Taniguchi, T., Moriwaki, H., Tanigawa, M., Uda, K., and Sakiyama, K., SPIE, 1674:626 (1992) 340. Terasawa, T., Hasegawa, N., Kurosaki, T., and Tanaka, T., SPIE, 1088:25 (1989) 341. Terasawa, T., Hasegawa, N., Tanaka, T., and Katagiri, S., J. Vac. Sci. Technol. B, 8(6):1300 (Nov./Dec. 1990) 342. Pfau, A. K., University of California at Berkeley, Berkeley, CA., personal communication 343. Nitayama, A., Sato, T., Hashimoto, K., Shigemitsu, F., and Nakase, M., Intl. Elect. Dev. Mtg. Tech. Digest, p. 57 (1989) 344. Palmer, S., Garza, C. M., Sager, C., and Reynolds, P., SPIE, 1674:73 (1992) 345. Yanagishita, Y., Ishiwata, N., Tabata, Y., Nakagawa, K., and Shigematsu, K., SPIE, 1463:207 (1991) 346. Lin, B. J., Solid State Technology, p. 43 (Jan., 1992) 347. Casteel, C. M., unpublished work, Motorola, Inc., Phoenix, AZ (1986) 348. Zollinger, K. L., O’Mahoney, C. P., and Chang, M. S., SPIE, 633:122 (1986) 349. Starikov, A., SPIE, 1088:34 (1989) 350. Yeh, J. T. C., SPIE, 922:461 (1988)
11/30/00 JMR
644
Handbook of VLSI Microlithography
6 Microlithography Tool Automation Charles T. Lambson ASML Hong Kong, China
1.0 AUTOMATION BASICS
1.1 Introduction
The word “automation” is derived from the Greek “auto-matos,” meaning self-acting. Automation of a process has the effect of making the process more self-acting by reducing the manual tasks required to run the process. In most cases, automation is accomplished by implementing machines that perform tasks previously performed by a human. A familiar example is the automatic transmission. The driver of a car having an automatic transmission must put the car into drive to begin the commute to work. Once that is done, the automatic transmission replaces all of the clutching and shifting that is required with manual transmissions. The most readily observed result of automation is labor savings. While labor savings is typically one of the more visible benefits of automation, the cost of most automation systems could not be justified on this basis alone. Very often, the more important considerations for automating a process are: improved reproducibility of the process, reduction in scrap and rework, and improved safety. The overall goal of most
automation is to improve the profitability of the operation. Automobile makers did not put automatic transmissions in automobiles simply to save labor on the part of the drivers. They knew that people would pay more for this feature, and, therefore, they could make more money by automating the shifting function. There are various ways that automation can contribute to the profitability and safety of an operation. They should all be considered, prioritized, and weighed against costs when considering the automation of a particular operation.
1.2 Automation Is a Gradual Process
As previously stated, automation has the effect of making a process more self-acting. Rarely does a process move from being totally manual to totally automated in one step. Typically, automation is a gradual process that occurs over many iterations of the process and equipment. With each iteration, another group of manual tasks is eliminated and the process becomes a little “more self-acting.” Over the years, microlithography processes have gradually migrated to increasingly higher levels of automation. In the early days of wafer processing, wafers were manually placed onto coater chucks, resist was manually dispensed, and the operator manually set the spin speed. The operator placed the wafers onto the exposure tool one at a time, manually aligned them, and manually removed them from the tool. The develop process was also accomplished manually, by dunking a cassette of wafers into a bath of develop solution, timing, then removing the wafers from the develop bath, and taking them through the rinse in a similar fashion. The industry has seen the gradual implementation of speed controllers, automatic timers, temperature controllers, automatic wafer handlers, resist pumps, developer dispense systems, and many other advances. All of these advances were incremental steps in automation of the lithography operation.
1.3 Cluster Tools
With the advent of cluster tools, several operations that were once performed in separate pieces of equipment have been grouped together into a single tool. This has made it possible to eliminate much of the transit time and queue time accrued by lots between processing steps. Modern random-access wafer tracks are the most prominent example of the application of cluster tools to microlithography.[1] In these tools, wafers may be routed
through the modules in any sequence desired, thus providing considerably more process flexibility than the serial tracks of previous equipment generations. An extension of the cluster tool approach, and a significant milestone in microlithography automation, was brought about with the advent of linked lithocells.[1] The linking of steppers and tracks was a particularly challenging milestone because it required making unlike pieces of equipment (often from two different manufacturers) work together as a single machine. Progress in microlithography automation has resulted in significant benefits. Today’s critical-dimension control from run to run, and across several shifts, could not be accomplished without removing operator variability by automating the processes. Cycle time of the photo process is dramatically better on a modern lithocell than it was when lots sat queued on shelves prior to the coat, then again prior to expose, and again prior to develop.[1] In recent years, the time-delay sensitivity of chemically amplified resists used in deep-UV processing has reinforced the importance of linked lithocells.[14] This chapter will deal with those areas of microlithography automation that are continuing to move forward, and will present proven methods and practices for achieving an increasingly higher level of automation.
2.0 CELL CONTROLLERS
2.1 Motivation for Cell Controllers
There remain numerous opportunities for further automation of microlithography operations. In the realm of software alone, much can be done to automate operator tasks. These opportunities are of particular interest when considered from the standpoint of reducing operator error, thereby reducing misprocess-related scrap and rework. Common causes of scrap and rework that plague photo processes are: wrong reticle, wrong track recipe, wrong stepper job, wrong focus setting, and wrong exposure dose. Such problems are sometimes lumped into the category of “operator error.” There are two common approaches to solving the problem of operator error. One approach is to minimize scrap and rework by improving training and focusing on providing robust manual systems and procedures. This is often a gradual evolution wherein the culture and circumstances of the workplace evolve into an environment where each mistake is seriously analyzed by cross-functional teams. Team solutions are worked out that reduce the likelihood of repeating the error.
Another approach is to eliminate these misprocesses entirely by automating the tasks that lead to common errors. In other words, take the opportunity for manual error out of the critical tasks of the process by turning these tasks over to machines. Cell controllers may be advantageously employed for reduction of scrap and rework in a microlithography process.[2]
2.2 Work Cells
Let us consider the origin of the term “cell controller.” The usage of the word “cell,” in this context, comes from the manufacturing concept of work cells.[3] Complex manufacturing processes, such as integrated circuit manufacturing flows, are typically broken down into many individual processing steps. Each step typically consists of a process that is performed at a single piece of equipment. There are often hundreds of steps in process flows of integrated circuits. Steps, in turn, are grouped into stages. Each stage consists of a number of process steps that can be performed by a team having specialized skills and equipment. A work cell is a group of equipment that is laid out in an arrangement that complements a particular stage, or sequence, of process steps. The work cell allows specialization of the operators and promotes ownership of a particular slice of the manufacturing process by the cell team. The cell controller was originally envisioned as a software system that would serve as a controlling host to equipment grouped into a work cell.[4] Over time, the term became generalized to denote any host software system that controls manufacturing equipment, whether a group or a single tool. Other names, such as equipment server and remote controller, have been applied to this type of automation. For purposes of this chapter, a host software system that controls one or more pieces of manufacturing equipment will be called a cell controller.
2.3 Model Cell Controller
Consider the model cell controller depicted in Fig. 1. Such a cell controller is capable of automating many operator functions associated with the operation of a linked stepper-track lithocell. There are many conceivable designs for such a cell controller. This model will represent only one of the many possible designs but will allow illustration of some basic principles that apply to cell controllers in general.
Figure 1. Model cell controller. (Diagram labels: Other Factory Systems; MES, Manufacturing Execution System; Factory LAN, Local Area Network; Model Cell Controller; Terminal Server; User Interface; Endpoint Unit; Stepper; Track.)
Figure 1 shows the model cell controller interfaced with a linked stepper and track. In addition to the stepper and track, the cell controller also interfaces with a developer endpoint detection unit that has been mechanically integrated into the track. (The endpoint detection unit would normally be controlled via the track controller unless it was not a part of the original track. In this case, assume the unit has been retrofitted by the owner of the track. The endpoint unit needs its own communication interface in order to be useful.) The cell controller acts as host to all three pieces of equipment: track, stepper, and endpoint detection system. In the model cell controller, all of these connections to equipment are made via a terminal server that is connected to the factory LAN (Local Area Network). Figure 1 also shows a connection from the cell controller to the factory LAN. The cell controller will communicate with its host, the Manufacturing Execution System (MES), over the factory LAN. Although Fig. 1 shows only one cell controller, in reality, there may be many instances of this cell controller running simultaneously, controlling any number of lithocells in the factory.
A typical lot processing scenario for this model cell controller is as follows:
i. The operator selects and tracks-in the lot. At this time the cell controller receives the lot processing information from the MES.
ii. The cell controller prompts the operator which track port the lot should be placed on.
iii. The operator places the lot on the appropriate track port.
iv. When the lot has been sensed on the appropriate port, the cell controller receives a message from the track, via the factory LAN, that the lot is present.
v. The cell controller selects the appropriate track recipe based on the lot processing information it received from the MES at the time of track-in.
vi. The cell controller starts the lot on the track.
vii. After the lot is started on the track, the reticle is pre-staged within the stepper. At this time, the identity of the reticle is known to the stepper by means of the stepper’s internal barcode reader. A dialogue of messages between the stepper and the cell controller verifies that the reticle being pre-staged is the one that is called for in the lot processing information previously received from the MES.
viii. When the first wafer of the lot arrives at the stepper, and the previous lot finishes, the cell controller will start the stepper with the appropriate job file and exposure parameters according to the information it previously received from the MES.
ix. When the lot has completed on the track, the cell controller will receive a message from the track that the lot is complete. At this point, the cell controller will inform the operator (via the user interface) that the lot is complete. The cell controller may also store appropriate collection data to the MES or other factory systems in accordance with the practices of the particular factory.
x. The operator removes the completed lot from the track.
xi. When the lot is removed from the track, the cell controller receives a message that the lot has been removed. At this time, the cell controller communicates with the MES to track-out the lot.
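The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and field names (LotInfo, ModelCellController, run_lot), the in-memory dictionary standing in for the MES, the lot and recipe names, and the dose value are all hypothetical. Only the reticle and job names are borrowed from Table 2. Equipment messages are reduced to log entries so that the control flow of steps i through xi is visible.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LotInfo:
    """Lot processing information returned by the MES at track-in (hypothetical fields)."""
    track_port: int
    track_recipe: str
    reticle_id: str
    stepper_job: str
    exposure_dose: float  # units are site-specific; the value used below is made up


class ReticleMismatch(Exception):
    """The pre-staged reticle is not the one the MES calls for."""


@dataclass
class ModelCellController:
    mes: dict                                # stand-in for the MES: lot id -> LotInfo
    log: list = field(default_factory=list)  # actions taken, in order

    def run_lot(self, lot_id: str, staged_reticle: str) -> list:
        info = self.mes[lot_id]                                               # step i
        self.log.append(f"prompt: place {lot_id} on port {info.track_port}")  # step ii
        # Steps iii-iv: the operator loads the lot and the track reports it present.
        self.log.append(f"track: select recipe {info.track_recipe}")          # step v
        self.log.append("track: start lot")                                   # step vi
        if staged_reticle != info.reticle_id:                                 # step vii
            raise ReticleMismatch(
                f"staged {staged_reticle}, MES requires {info.reticle_id}")
        self.log.append(                                                      # step viii
            f"stepper: start job {info.stepper_job} at dose {info.exposure_dose}")
        self.log.append(f"prompt: {lot_id} complete, remove from track")      # step ix
        # Steps x-xi: the operator removes the lot and the track reports it gone.
        self.log.append(f"MES: track-out {lot_id}")
        return self.log


mes = {"LOT42": LotInfo(2, "COAT_A", "SSEM601", "SSEM01", 120.0)}
controller = ModelCellController(mes)
for action in controller.run_lot("LOT42", staged_reticle="SSEM601"):
    print(action)
```

In this sketch the operator appears only in the two prompts and in the comments for steps iii, iv, x, and xi. Recipe selection, reticle verification, and job selection never pass through the operator's hands, which is the point the next section develops.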
2.4 Cell Controller Benefits
What are the benefits that might be expected from implementation of such a cell controller, relative to operation of the lithocell without the cell controller? Table 1 shows the operator tasks required to run a lot on the lithocell with and without the cell controller. Those tasks labeled “critical” are tasks where operator error could potentially create scrap or rework.[13] With the cell controller, the operator tasks for running a production lot have been reduced from eleven tasks to five tasks, and all three of the critical tasks have been automated. Since all critical tasks have been automated, we expect manual-error related scrap and rework to decrease after implementation of this cell controller. An additional benefit is the freeing up of the operator’s time by eliminating more than half of the operator’s tasks. This allows the operator to dedicate more attention to such tasks as statistical process control, equipment issues, long term problem solving, and other efforts that positively impact quality and profitability.

Table 1. Operator Tasks Without and With the Cell Controller

Without Cell Controller                            Critical Tasks   With Cell Controller
Track-in a lot                                                      Track-in a lot
Place lot on track                                                  Place lot on track
Select track recipe                                Critical
Start lot on track
Obtain qualified reticle                                            Obtain qualified reticle
Insert reticle at stepper                                           Insert reticle at stepper
Verify reticle matches with MES required reticle   Critical
Select stepper job file and exposure parameters    Critical
Start lot on stepper
Remove completed lot from track                                     Remove completed lot from track
Track-out completed lot
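The task counts quoted for Table 1 can be checked by encoding the table directly. The short Python sketch below is offered only as a bookkeeping aid. The tuple layout is an assumption made for illustration, and the "critical" flags are those read from the table.

```python
# Each row: (task, is_critical, remains_manual_with_cell_controller),
# transcribed from Table 1.
TASKS = [
    ("Track-in a lot", False, True),
    ("Place lot on track", False, True),
    ("Select track recipe", True, False),
    ("Start lot on track", False, False),
    ("Obtain qualified reticle", False, True),
    ("Insert reticle at stepper", False, True),
    ("Verify reticle matches with MES required reticle", True, False),
    ("Select stepper job file and exposure parameters", True, False),
    ("Start lot on stepper", False, False),
    ("Remove completed lot from track", False, True),
    ("Track-out completed lot", False, False),
]

tasks_without = len(TASKS)
tasks_with = sum(1 for _, _, manual in TASKS if manual)
critical_total = sum(1 for _, critical, _ in TASKS if critical)
critical_still_manual = sum(1 for _, critical, manual in TASKS if critical and manual)

print(tasks_without, tasks_with, critical_total, critical_still_manual)  # 11 5 3 0
```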
3.0 EQUIPMENT COMMUNICATION INTERFACES
Each piece of equipment communicates with the host computer through a special software and hardware system called an interface. The interface supports a set of messages to be sent and/or received by the equipment that enables it to be controlled and monitored remotely. In order to facilitate this communication, the semiconductor industry has adopted equipment communication standards. These standards have become very important because without them each attempt to automate would necessarily start from the ground up. Standards allow re-use of hardware and software and greatly simplify the task of automation. These standards have not come about all at once. Like the automation they support, they are constantly evolving. This section will provide a background for understanding current communication standards. Details may be found in the SEMI standard documents. There are three levels of SEMI standards for equipment communication. They are: SECS-I (or HSMS), SECS-II, and GEM. A fourth level (SEM) is under consideration at the present time for adoption as a standard.
3.1 SECS-I Protocol
SECS-I (Semiconductor Equipment Communication Standard #1, Message Transfer)[5] is the most basic level and was first adopted as a standard in 1980. SECS-I defines the protocol for constructing and transferring messages at the lowest level but does not specify the content of the messages. Some of the items defined in the SECS-I protocol are:
i. The physical connector to the equipment interface.
ii. The voltage levels of the signals.
iii. The range of baud rates that may be used.
iv. How characters are digitally formed and delimited.
v. How characters are lumped together into discrete blocks of information that will be sent sequentially across the interface.
vi. The handshake dialogue that occurs between the equipment and the host each time a block of information is sent.
vii. The time-outs and re-try limits associated with sending blocks of information across the interface.
viii. The maximum number of blocks that may be included in a single message.
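Items v and vi can be made concrete with a small sketch of how one block might be framed. The layout below is not taken from this chapter. It follows the block structure commonly described for SECS-I (SEMI E4): a length byte, a 10-byte header, up to 244 data bytes, and a 2-byte checksum over header and data, with an ENQ/EOT/ACK handshake around each block. Treat the bit positions, the constants, and the function names build_block and checksum_ok as assumptions to be verified against the standard itself.

```python
import struct

# Handshake characters assumed for the block transfer dialogue:
# sender ENQ -> receiver EOT -> sender block -> receiver ACK (or NAK).
ENQ, EOT, ACK, NAK = 0x05, 0x04, 0x06, 0x15
MAX_DATA = 244  # assumed maximum data bytes per block


def build_block(device_id, stream, function, block_no, system_bytes,
                data=b"", to_host=False, wants_reply=False, last_block=True):
    """Frame one block: length byte + 10-byte header + data + 2-byte checksum."""
    if len(data) > MAX_DATA:
        raise ValueError("data does not fit in a single block")
    header = struct.pack(
        ">HBBHI",
        (0x8000 if to_host else 0) | (device_id & 0x7FFF),   # direction bit + device ID
        (0x80 if wants_reply else 0) | (stream & 0x7F),      # reply-wanted bit + stream
        function & 0xFF,                                     # function
        (0x8000 if last_block else 0) | (block_no & 0x7FFF), # end bit + block number
        system_bytes & 0xFFFFFFFF,                           # ties a reply to its request
    )
    body = header + data
    checksum = sum(body) & 0xFFFF  # 16-bit sum of header and data bytes
    return bytes([len(body)]) + body + struct.pack(">H", checksum)


def checksum_ok(block):
    """Receiver-side check before answering ACK instead of NAK."""
    length = block[0]
    body, tail = block[1:1 + length], block[1 + length:]
    if len(body) != length or len(tail) != 2:
        return False
    return (sum(body) & 0xFFFF) == struct.unpack(">H", tail)[0]


block = build_block(device_id=1, stream=1, function=1, block_no=1,
                    system_bytes=0x12345678, wants_reply=True)
print(block.hex(" "), checksum_ok(block))
```

A message longer than one block would be split across several such blocks with increasing block numbers, with the end bit set only on the last one.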
SECS-I may be considered the bottom layer, or foundation, of the equipment interface pyramid shown in Fig. 2. All messages must conform to the SECS-I message transfer protocol.
Figure 2. Communications standards hierarchy. (Pyramid levels, top to bottom: SEM; GEM; SECS-II; SECS-I / HSMS.)
HSMS (High-Speed SECS Message Services) is a SEMI standard that is intended as a much faster replacement for SECS-I. The HSMS standard is particularly useful for metrology and test equipment where large volumes of data are uploaded to the host for each lot. At the time of this writing, many equipment suppliers have already made the switch from SECS-I to HSMS.
3.2 The SECS-II Standard
SECS-II (Semiconductor Equipment Communication Standard #2, Message Content)[6] is the second level of the pyramid in Fig. 2. SECS-II contains a standard set of messages and specifies the methods for creating user-defined messages that are compliant with the standard. All the messages described in SECS-II are built from the tools provided by SECS-I. Another way of looking at the relationship between SECS-I and SECS-II is to use the metaphor of a written language. SECS-I is analogous to the rules for making letters and combining them into words, whereas SECS-II is analogous to the grammar for constructing sentences and paragraphs. In SECS-II, messages are classified into broad categories of communication called streams. Each stream contains specific messages called
functions. A function is a single message for a specific activity. An individual message is referred to by its stream and function. For example, Stream 1, Function 1 (commonly written S1F1) is the “Are you there?” request. Some SECS-II messages request a reply. Such a message, together with its reply, is referred to as a transaction. In a transaction, the first message is referred to as the primary message and is always an odd numbered function. The reply to a primary message is referred to as a secondary message and is always an even numbered function (the reply function is always designated by the number of the request function incremented by one). For example, the reply to the S1F1 “Are you there?” message, above, is the S1F2 message “Acknowledge.” These two messages are a transaction. The SEMI standard reserves some stream and function code combinations for messages defined in the standard. In addition, a large block of stream and function codes is allocated for user-defined messages (the user, in this case, being the equipment supplier). For more detailed information on the SECS-II communication protocol, refer directly to the SEMI standards. Table 2 is an example of a SECS-II dialogue that might occur between a stepper and a host computer for the running of a normal error-free lot when no other lots are queued to run.
3.3 The GEM Standard
GEM (Generic Equipment Model) is the second level from the top of the pyramid of Fig. 2. GEM came after SECS-I and SECS-II as a means of attaining more complete and consistent implementation of SECS-II. Prior to GEM, the extent to which interfaces could support factory automation requirements varied widely from supplier to supplier. GEM requires a minimum implementation, wherein many of the automation functions that are common to all semiconductor processing equipment are done in a standard way. The GEM standard is the SEMI E30 document (Generic Model for Communications and Control of Semiconductor Equipment).[7] This standard details common manufacturing communication scenarios that must be supported by GEM compliant equipment. A list of required data collection events is specified along with examples of trace data collection, equipment alarms, and other examples pertinent to semiconductor manufacturing. GEM also requires that each time a machine changes state, an event message must be sent to the host so that the host can keep track of the state of the equipment.
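The last requirement, that the host follow the equipment state through event messages, can be sketched as a small host-side tracker. This is an illustration only. The state names are those that appear in the Table 2 stepper dialogue, but the transition table is inferred from that single error-free lot and is not taken from the GEM standard. The names ALLOWED, HostStateTracker, and lot_events are hypothetical.

```python
# Transitions seen in the error-free lot of Table 2 (an assumption, not the full model).
ALLOWED = {
    "IDLE": {"SETTING UP"},
    "SETTING UP": {"LOAD"},
    "LOAD": {"WORKING"},
    "WORKING": {"UNLOAD"},
    "UNLOAD": {"LOAD", "STOPPING"},  # next wafer, or lot finished
    "STOPPING": {"IDLE"},
}


class HostStateTracker:
    """Mirrors the equipment processing state from reported state-change events."""

    def __init__(self, initial="IDLE"):
        self.state = initial
        self.history = [initial]

    def on_state_event(self, new_state):
        # A transition the host does not expect is flagged rather than silently accepted.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"unexpected transition {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)


def lot_events(wafer_count):
    """State-change events the stepper would report for one lot."""
    yield "SETTING UP"
    for _ in range(wafer_count):
        yield from ("LOAD", "WORKING", "UNLOAD")
    yield from ("STOPPING", "IDLE")


tracker = HostStateTracker()
for event in lot_events(wafer_count=2):
    tracker.on_state_event(event)
print(" -> ".join(tracker.history))
```

Because the host rejects transitions it does not expect, a missed or out-of-order event shows up immediately instead of leaving the host with a stale picture of the equipment.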
Table 2. SECS-II Dialogue Example

Initial conditions: ONLINE. The stepper is in the IDLE processing state and in the REMOTE control state. Reticle “SSEM601” is qualified and is called for in Process Program “SSEM01.” Chaining is off.

Host          Equipment     Comments
S2,F141 →                   PP-SELECT (job file)
              ← S2,F142     Positive Acknowledge
              ← S6,F11      Transition to SETTING UP
S6,F12 →                    Positive Acknowledge
              ← S6,F11      Reticle “SSEM601” loaded event to exposure stage.
S6,F12 →                    Positive Acknowledge
S2,F41 →                    START
              ← S2,F42      Positive Acknowledge
              ← S6,F11      Transition to LOAD state
S6,F12 →                    Positive Acknowledge
              ← S6,F11      Wafer loaded to exposure chuck. Transition to WORKING state.
S6,F12 →                    Positive Acknowledge
              ← S6,F11      Transition to UNLOAD state. Substrate complete event.
S6,F12 →                    Positive Acknowledge
              ← S6,F11      Wafer unloaded from exposure chuck. Transition to LOAD state.
S6,F12 →                    Positive Acknowledge
                            {WHILE} Not last wafer
                              1) LOAD → WORKING
                              2) WORKING → UNLOAD
                              3) UNLOAD → LOAD
                            {END WHILE}
              ← S6,F11      Last wafer loaded to exposure chuck. Transition to WORKING state.
S6,F12 →                    Positive Acknowledge
              ← S6,F11      Transition to UNLOAD state. Lot complete event.
S6,F12 →                    Positive Acknowledge
              ← S6,F11      Last wafer unloaded from exposure chuck. Transition to STOPPING state.
S6,F12 →                    Positive Acknowledge
              ← S6,F11      Transition to IDLE state
S6,F12 →                    Positive Acknowledge
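The odd/even numbering convention for primary and secondary messages, and the request/acknowledge pairing seen in the dialogue of Table 2, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`parse_sf`, `expected_reply`, `open_transactions`) are hypothetical and are not part of any SEMI standard or vendor library, and the sketch assumes that every primary message in the dialogue expects a reply, which SECS-II does not require in general.

```python
import re


def parse_sf(name):
    """Split a name such as 'S6F11' or 'S6,F11' into (stream, function)."""
    match = re.fullmatch(r"S(\d+),?F(\d+)", name.strip())
    if match is None:
        raise ValueError(f"not a stream/function name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def is_primary(name):
    """Primary messages carry an odd function number."""
    _, function = parse_sf(name)
    return function % 2 == 1


def expected_reply(name):
    """Name of the secondary message for a primary: same stream, function + 1."""
    stream, function = parse_sf(name)
    if function % 2 == 0:
        raise ValueError(f"{name!r} is a secondary message and has no reply")
    return f"S{stream}F{function + 1}"


def open_transactions(dialogue):
    """Return the replies still awaited after walking a dialogue in order.

    Assumption for this sketch: every primary in the dialogue wants a reply.
    """
    pending = []
    for name in dialogue:
        stream, function = parse_sf(name)
        canonical = f"S{stream}F{function}"
        if function % 2 == 1:
            pending.append(expected_reply(canonical))
        elif canonical in pending:
            pending.remove(canonical)
    return pending
```

For example, `expected_reply("S1F1")` gives `"S1F2"`, and walking the first five messages of Table 2 up to the START command leaves only the S2F42 acknowledge outstanding.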
GEM specifies a standardized method for documenting state models for equipment. The next section will include information on the Harel Statechart notation used by GEM. Harel notation is a method of documenting equipment behavior from the viewpoint of the host. A communications state model is specified in GEM that defines the behavior of the equipment with respect to existence or absence of communication links to the host. These and other features of GEM make it a very important standard that has contributed significantly to the advancement and ease of implementation of semiconductor equipment automation.
3.4 The SEM Standards
The next level of standardization of equipment interfaces, the SEM (Specific Equipment Model) standards, is being developed and balloted as industry standards at the time of this writing. The SEM standards provide individual models specific to each class of semiconductor equipment (e.g., SSEM for steppers, TSEM for tracks, MSEM for metrology). For example, the SSEM (Stepper Specific Equipment Model) is developed specifically for exposure tools. It contains state models, variables, remote commands, etc., that are applicable to all steppers. The intent of the SEM standards is to simplify and shorten the development cycle for the host software and to obtain more efficient reuse of the software. Once the host software is created to control stepper model A, the same software should work on stepper model B with very little modification, provided both steppers are SSEM compliant.
4.0 STATE MODELS
State models are a convenient tool for visualizing the operation of a machine from the perspective of the host controller. The convention for creating state models that is widely used in the semiconductor industry is Harel state model notation.[7] In Harel state model diagrams, the various states of the machine are represented as boxes. Each box is labeled with the name of the state that it represents. Figure 3 shows the state model diagram for a simplified stepper. This simplified model is adapted from a more comprehensive version in the SSEM document.[8] An arrow from one box to another indicates a transition of the machine from one state to another. Each arrow is labeled with a transition number. A box inside of a box represents a substate. A circle with an H inside represents a “return to history” transition. This implies that the machine remembers how it got into the state that is being exited and it can get back to that state when this transition is taken. In the stepper state model of Fig. 3, the transition to history will bring the machine back to the state it was in before being paused.
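Figure 3. Simplified stepper state model. (States shown: INIT, IDLE, and PROCESSING ACTIVE. PROCESSING ACTIVE contains SETTING UP, READY, EXECUTING, PAUSING, PAUSED, and STOPPING. EXECUTING contains EXCHANGING, with sub-states LOAD and UNLOAD, and WORKING. Transitions are numbered 1 through 12, and an H marks the return-to-history transition.)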
A state model diagram must be accompanied by a list of definitions and a transition table. In the definitions (Table 3), the conditions of each state are clearly defined. The transition table (Table 4) describes each of the transition arrows in the state model diagram. For each transition, information is provided describing the state being exited, the state being entered, the “trigger” or the set of conditions that caused the transition, and the machine actions that accompany the transition. The GEM standard requires that an event message be sent to the host each time any transition of the state model is taken. This allows the host to constantly maintain a knowledge of the state of the equipment.

Table 3. Stepper State Model Definitions
EXCHANGING (EXECUTING Sub-state): The stepper is in the process of loading and unloading substrates from the exposure chuck. If the first reticle is not loaded and aligned in the SETTING UP state, then this is performed upon entry to this state from READY.

EXECUTING (PROCESSING ACTIVE Sub-state): The stepper is processing material automatically and can continue to do so without external intervention. This state may include interaction with the host or operator.

IDLE: Awaiting a command. IDLE is free of ALARM and error conditions.

INIT: Stepper initialization is occurring. If an error or alarm condition occurs during initialization, the stepper cannot exit this state until the condition is cleared.

LOAD (EXCHANGING Sub-state): A substrate is transferred to the exposure chuck. The material transferred to the exposure chuck shall be prealigned before this state can be exited.

PAUSED (PROCESSING ACTIVE Sub-state): The PROCESS state has been suspended and the stepper is waiting for a command. In this state, the operator may correct error conditions that do not affect the reticle or Process Program.

PAUSING (PROCESSING ACTIVE Sub-state): The PROCESS state will be suspended at the completion of the current substrate or next opportunity. The stepper cannot transition to the PAUSED state until the current substrate is completed and the stepper is in a “safe state.”

EXECUTING (PROCESSING ACTIVE Sub-state): This state is the parent of those sub-states which refer to the preparation and execution of a process program.

PROCESSING ACTIVE: This state is the parent of all sub-states where a process program exists.

READY (PROCESSING ACTIVE Sub-state): The stepper is ready to begin processing and is awaiting a START command from the operator or host.

SETTING UP (PROCESSING ACTIVE Sub-state): The stepper is satisfying conditions so that processing can begin. This includes the receipt of any process programs and material to be processed and their validation. All reticles required for the execution of all Process Build Groups within the current selection are required to be available within the stepper before exiting this state. The loading and aligning of the first reticle in this state is based on a user-defined equipment constant.

STOPPING (PROCESSING ACTIVE Sub-state): The stepper has completed a Process Program or has been instructed to stop processing and will do so gracefully at the next opportunity. All necessary cleanup is completed within this state with regard to material, data, control system, etc. Data is preserved. Any alarm or error condition is cleared before exit from this state.

UNLOAD (EXCHANGING Sub-state): The substrate is being removed from the exposure chuck and the stepper determines which transition to take.

WORKING (EXECUTING Sub-state): The stepper is processing a specific substrate.
Table 4. Stepper State Model Transitions Table

Transition 1: INIT → IDLE
  Trigger: All stepper initialization is complete with no alarms or error conditions.
  Actions: None
  Comments: None

Transition 2: IDLE → SETTING UP
  Trigger: A Process Program is selected.
  Actions: Stepper dependent
  Comments: Commit has been made to set up.

Transition 3: SETTING UP → READY
  Trigger: All setup activity has completed and the stepper is ready to receive a START command.
  Actions: The stepper is waiting for a START command.
  Comments: The selected Process Program is available for execution and material is present at the input port.

Transition 4: READY → LOAD
  Trigger: The stepper receives a START command.
  Actions: Transfers the next substrate to the exposure chuck.
  Comments: LOAD is an EXECUTING sub-state.

Transition 5: LOAD → WORKING
  Trigger: A substrate has completed prealign and is loaded to the exposure chuck.
  Actions: The substrate is being processed.
  Comments: None

Transition 6: WORKING → UNLOAD
  Trigger: The processing of the current substrate has completed normally.
  Actions: This substrate is transferred from the exposure chuck.
  Comments: Completion of the substrate.

Transition 7: UNLOAD → LOAD
  Trigger: The material unload is complete.
  Actions: Transfers the next substrate to the exposure chuck.
  Comments: None

Transition 8: UNLOAD → STOPPING
  Trigger: The processing of the lot is complete.
  Actions: A graceful end. All substrates are removed from the stepper.
  Comments: None

Transition 9: STOPPING → IDLE
  Trigger: The stepper clean up is complete and the stepper is free of alarms.
  Actions: None
  Comments: None

Transition 10: EXECUTING → PAUSING
  Trigger: The stepper has received a PAUSE command.
  Actions: The EXECUTING state will be suspended at the completion of the current substrate. Any necessary actions to put the stepper in a safe state will be performed.
  Comments: Any events that are normally generated during the completion of the current substrate are still generated.

Transition 11: PAUSING → PAUSED
  Trigger: The stepper has completed processing the current substrate in the WORKING state and achieved a safe condition.
  Actions: The stepper is waiting for a command to resume executing where it left off.
  Comments: None

Transition 12: PAUSED → Previous EXECUTING State
  Trigger: The stepper has received a command to resume execution.
  Actions: Proceeds from the point where processing was previously suspended.
  Comments: None
The state model is also useful for defining valid states for remote commands. Table 5 shows the valid states for remote commands for this simplified stepper state model. If an X is found at the intersection of a particular state and remote command, then the remote command is valid for that state. This means that the stepper will be able to accept this remote command if it is received while the stepper is in that state. For example, our model stepper is able to receive a remote command to pause while it is in the WORKING state but not while it is in the IDLE state. State models are very useful tools when creating a cell controller. They provide the designers of the host software with a visual summary of how the equipment will behave under host control.
Table 5. Valid States for Remote Commands COMMAND PAUSE PP-SELECT PP-UPDATE RESUME START PROCESSING STATE INIT IDLE
X
PROCESSING ACTIVE ....SETTING UP
X
....READY
X
....EXECUTING ....EXCHANGING ....LOAD
X
....UNLOAD
X
....WORKING
X
....PAUSING ....PAUSED ....STOPPING
X
X
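As a minimal sketch of how host software might use these tables, the Python below transcribes the initial and new states of Table 4 and the parent/sub-state relationships of Table 3, then follows the event messages that GEM requires on every transition. The names `TRANSITIONS`, `PARENT`, `is_within`, and `HostStateTracker` are hypothetical. The handling of the history transition is a simplifying assumption: the tracker simply remembers the state that was active when transition 10 was taken and returns to it on transition 12.

```python
# Transition number -> (initial state, new state), transcribed from Table 4.
TRANSITIONS = {
    1: ("INIT", "IDLE"),
    2: ("IDLE", "SETTING UP"),
    3: ("SETTING UP", "READY"),
    4: ("READY", "LOAD"),
    5: ("LOAD", "WORKING"),
    6: ("WORKING", "UNLOAD"),
    7: ("UNLOAD", "LOAD"),
    8: ("UNLOAD", "STOPPING"),
    9: ("STOPPING", "IDLE"),
    10: ("EXECUTING", "PAUSING"),
    11: ("PAUSING", "PAUSED"),
    12: ("PAUSED", "HISTORY"),  # return to the previous EXECUTING sub-state
}

# Sub-state -> parent state, from the definitions in Table 3.
PARENT = {
    "LOAD": "EXCHANGING",
    "UNLOAD": "EXCHANGING",
    "EXCHANGING": "EXECUTING",
    "WORKING": "EXECUTING",
}


def is_within(state, ancestor):
    """True if `state` is `ancestor` or one of its nested sub-states."""
    while state is not None:
        if state == ancestor:
            return True
        state = PARENT.get(state)
    return False


class HostStateTracker:
    """Host-side copy of the stepper state, updated from transition events."""

    def __init__(self):
        self.state = "INIT"
        self.history = None  # assumption: state remembered when pausing began

    def on_event(self, transition):
        source, target = TRANSITIONS[transition]
        if not is_within(self.state, source):
            raise ValueError(
                f"transition {transition} is not valid from state {self.state}"
            )
        if transition == 10:
            self.history = self.state
        if target == "HISTORY":
            target = self.history
        self.state = target
        return self.state
```

Feeding the tracker transitions 1, 2, 3, 4, and 5 leaves it in WORKING; transitions 10 and 11 then take it through PAUSING to PAUSED, and transition 12 returns it to WORKING. A pause event (transition 10) reported while the tracker is in IDLE raises an error, which mirrors the rule that PAUSE is valid in the WORKING state but not in the IDLE state.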
5.0 LADDER DIAGRAMS
A useful tool for documenting communication scenarios is the “ladder diagram.” The ladder diagram details each communication event of a given scenario. The chronological order of the messages, message content, point of origin, and recipients are shown. Figure 4 shows a ladder diagram for the scenario of starting one lot on a coater track under control of the model cell controller. Each vertical line represents an entity that is capable of communication (either one-way or two-way). While there is no scale for time, it is understood that time begins at the top of the diagram and moves forward as the scenario progresses downward. Any horizontal arrow on the diagram represents a message from the entity where the arrow originates to the entity where the arrow terminates. Horizontal arrows, representing messages, are numbered sequentially and labeled with an abbreviated version of the message. The intent is to provide enough information on the diagram so that the automation designers can walk through the scenarios, understand the communication, and check for inconsistencies and weaknesses. The ladder diagram should be accompanied by a detailed document containing descriptions of each of the messages.

Figure 4. Ladder diagram of lot start. (Entities shown as vertical lines: MES, Cell Controller, Track. Events and messages, in order: Lot Tracked-in; 1) Track-in (Lot Process Data); 2) Reply; 3) Load Port Request; 4) Load Ports Available (list); Load Lot on Port Designated by Cell Controller; 5) Load Port Occupied; 6) Reply; 7) Select Process Program (PPID); 8) Selection Successful; 9) Start Lot (Load Port ID); 10) Start Successful.)
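A ladder diagram can also be kept as plain data so that simple consistency checks run automatically during design reviews. The Python sketch below encodes the ten numbered messages of Fig. 4. The message labels come from the figure, but the sender and receiver of each message are assumptions inferred from the labels, since the arrow directions are not reproduced here. The names `Message`, `LOT_START`, `check_ladder`, and `render` are hypothetical.

```python
from typing import NamedTuple


class Message(NamedTuple):
    number: int
    sender: str
    receiver: str
    label: str


ENTITIES = {"MES", "CELL CONTROLLER", "TRACK"}

# Labels are from Fig. 4; sender/receiver directions are assumed for illustration.
LOT_START = [
    Message(1, "MES", "CELL CONTROLLER", "Track-in (Lot Process Data)"),
    Message(2, "CELL CONTROLLER", "MES", "Reply"),
    Message(3, "CELL CONTROLLER", "TRACK", "Load Port Request"),
    Message(4, "TRACK", "CELL CONTROLLER", "Load Ports Available (list)"),
    Message(5, "TRACK", "CELL CONTROLLER", "Load Port Occupied"),
    Message(6, "CELL CONTROLLER", "TRACK", "Reply"),
    Message(7, "CELL CONTROLLER", "TRACK", "Select Process Program (PPID)"),
    Message(8, "TRACK", "CELL CONTROLLER", "Selection Successful"),
    Message(9, "CELL CONTROLLER", "TRACK", "Start Lot (Load Port ID)"),
    Message(10, "TRACK", "CELL CONTROLLER", "Start Successful"),
]


def check_ladder(messages, entities):
    """Return a list of problems: bad numbering, unknown or identical endpoints."""
    problems = []
    for position, message in enumerate(messages, start=1):
        if message.number != position:
            problems.append(f"message {message.number} is out of sequence")
        for name in (message.sender, message.receiver):
            if name not in entities:
                problems.append(f"message {message.number}: unknown entity {name}")
        if message.sender == message.receiver:
            problems.append(f"message {message.number}: sender equals receiver")
    return problems


def render(messages):
    """One text line per arrow, top of the ladder first."""
    return [f"{m.number:>2}) {m.sender} -> {m.receiver}: {m.label}" for m in messages]
```

Calling `check_ladder(LOT_START, ENTITIES)` returns an empty list for this scenario, and `render(LOT_START)` produces a text walk-through that can be reviewed alongside the detailed message descriptions.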
The automation standards and tools mentioned in this chapter are those that are considered most relevant to the lithographer who may be participating in a cell controller project. Additional general topics that are relevant to automation projects are:
i. Software quality control practices.
ii. Software documentation.
iii. Training of the system administrators, operators, and engineers.
These topics will not be covered in this chapter but are important to the success of a cell controller automation project.
6.0 MATERIAL TRANSPORT
Another area that offers an opportunity to continue the trend toward higher levels of automation within microlithography is material transport. Automated transport of chemicals required by the microlithography process is available in current generation wafer fabs and is known as “bulk delivery.” Automated transport of wafers and reticles to and from the equipment, however, remains an area that has not been addressed in most fabs. There are some important considerations that make this area of automation particularly appealing. The first arises with the advent of 300 mm wafers and 9-inch reticles: manual handling of these heavy and expensive substrates presents an increased risk, both of dropping these precious payloads and of injury to the operators from repeatedly handling heavy loads. Second, automated material handling presents a means for reducing the number of people within the cleanroom, thereby reducing particulates that impact yields. Since people are the primary source of contamination within the cleanroom environment, any reduction in the number of people in the cleanroom is expected to have a positive impact on particulate-limited yields. Studies have demonstrated that automated material handling reduces wafer contamination.[9] There are three important points to make about these types of automation systems:
i. These systems generally must be planned in advance of the construction of the fab so that the fab layout will accommodate the stockers, routing pathways, and robotic systems.
ii. Equipment must be purchased with appropriate mechanical and software interfaces for robust hand-off of materials.
iii. A CIM (Computer Integrated Manufacturing) architecture must exist for the factory that is capable of supporting this level of automation.
These will be discussed in more detail in the following sections.
6.1 Fab Layout Considerations for Automated Material Transport
Reticle and wafer stockers will be distributed in the fab and will function as ultra-clean storage receptacles, where wafers and reticles will reside while not on the equipment. The capacity and locations of these stockers will have to be decided upon and mapped into the fab layout by the appropriate Equipment and Facilities engineers. Depending on the type of transport system, additional floor space may be required for dedicated routes. There are generally four types of transport systems for lots and reticles in wafer fabs:[10]
i. Overhead systems of conveyors or vehicles that travel on rails (OHCs or OHVs).
ii. Automatic guided vehicles (AGVs), which are self-propelled robotic vehicles that travel on the fab floor.
iii. Rail guided vehicles (RGVs) that travel on rails placed at the level of the fab floor.
iv. Stationary conveyor systems that are located on the fab floor, but are able to convey and route material to programmed destinations.
OHVs have the advantages of not occupying much floor space in the fab and not interfering with human traffic in the fab. These systems also provide some flexibility in that they are readily extendible by adding more conveyor track and may be reconfigured at relatively low cost in the event fab layouts change. For these reasons, overhead systems have been a popular choice of chipmakers, particularly for interbay transport of lots.
AGVs, on the other hand, have seen extensive use for intrabay transport of lots, that is, the transport of lots from stockers, or intrabay loading ports, to the equipment and back to stockers. AGVs operate at the same level as the operator and are well suited for placing cassettes onto equipment and removing cassettes from equipment. They have the disadvantage of requiring dedicated or semi-dedicated pathways on the floor of the fab.
RGVs have been used in much the same way as AGVs to provide a means of moving lots from stockers to equipment and back. Instead of traveling across the floor, these vehicles move along rails that are at the level of the floor. The rails enable very accurate positional control for handing off lots to equipment but add complexity to the routing of the vehicles in the event the fab layout, or process flow, changes.
Stationary conveyor systems on the fab floor have been used for rapid transport of lots down a central corridor to positions where they can be retrieved by AGVs or operators.
Although relatively little work has been done in this area, two types of systems have been used to load reticles directly to and from steppers: AGVs and Stepper Loading Robots. AGVs are used for reticle transport the same way as they are used for lots. They transport reticles from stockers and load them directly into the manual port of the stepper in the same way as an operator would do it manually. Another method for reticle delivery is the combination of an OHV system with a robotic elevator that removes the reticle from the OHV and loads it into the stepper.[11] Stepper Loading Robots (or SLRs) are specially designed small-footprint robots that stand adjacent to each stepper and exchange reticles between a dedicated automation port in the stepper and an OHV system. These robots are pinned in place on rails that are positioned below the surface of the fab floor so that they can be moved away from the stepper when service access is required.
It is clear that none of these systems can be readily retrofitted into a fab once it is in operation without prior planning. To obtain maximum benefit, the automation plan must be an integral part of the original fab design.
6.2 Interface Considerations for Automated Material Transport
A well-designed and well-implemented interface is critical to the success of systems that transfer materials between different pieces of equipment. Fortunately, standards have been developed and proven for transfer of cassettes of wafers between equipment. The SEMI E23-91 Specification for Cassette Transfer Parallel I/O Interface provides a standard for the communication that takes place between two pieces of equipment when a cassette is handed off.[12] Although the standard was developed for wafer cassette transfer, it has also been used successfully for hand-off of reticles. The SEMI E23 standard specifies a logical dialogue between the robot and the equipment as material is handed to and from the equipment. It ensures that a move is never attempted when the equipment is not ready for the move. It also ensures that both the receiver and the sender know that every step of the transaction is completed in the proper sequence. Figure 5 is a communication diagram taken from the SEMI E23 specification. This figure shows only one of four dialogues. This and similar diagrams in the standard illustrate the dialogues that take place between the equipment and the robot when a tool is being loaded or unloaded. Figure 5 illustrates the cassette load sequence for a wire-connected interface. The SEMI E23 standard also contains the cassette unload sequence for a wire-connected interface and both the load and unload sequences for a photo-connected interface.
Figure 5. Cassette load sequence for a wire-connected interface. (Signals shown: READY (Passive to Active), LOAD REQ (Passive to Active), UNLOAD REQ (Passive to Active), TRANSFER REQ (Active to Passive), BUSY (Active to Passive), COMPLETE (Active to Passive). Annotated events: Transfer Starts, Cassette is Detected, Transfer is Complete.)
In this type of interface, the cassette transfer robot is designated as the ACTIVE equipment and the process equipment is designated as the PASSIVE equipment. The following list explains the scenario for loading a cassette onto process equipment as illustrated in Fig. 5:
i. Both the active and passive equipment start out in an initial state having all six wires set to the logical Lo state. There is often a panel of LEDs associated with these interfaces. In the initial state all six LEDs are off.
ii. When the passive equipment is ready to receive a cassette, it activates the LOAD REQUEST by setting the second wire to a logical Hi state.
iii. When the active equipment has a cassette ready to transfer, it will activate the TRANSFER REQUEST by setting the fourth wire to Hi.
iv. The passive equipment responds to the TRANSFER REQUEST with a READY by setting the first wire to Hi.
v. After the passive equipment has signaled it is READY, the active equipment signals BUSY by setting the fifth wire to Hi, and the robot starts to move the cassette to the passive equipment.
vi. When the cassette is placed on the passive equipment, it is detected by a sensor and the passive equipment signals it has the cassette by setting the second wire (which was earlier set Hi for LOAD REQUEST) back to Lo. This indicates the request has been met and a load has been received.
vii. After the active equipment has finished transferring the cassette and its end effector has been retracted from the passive equipment, it signals COMPLETE by setting the sixth wire to Hi. At the same time, the active equipment will set the fourth and fifth wires back to Lo, indicating it is no longer busy.
viii. At this point, the passive equipment acknowledges by setting the first wire back to Lo.
ix. The load sequence is closed out by the active equipment setting the sixth wire back to Lo. This returns the interface to the initial condition.
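The nine steps above can be walked through in software to see how the six wires change. The Python sketch below is an illustration based only on the description in this section, not an implementation of the SEMI E23 specification: the signal names with underscores, `LOAD_SEQUENCE`, `HandshakeError`, and `run_load` are hypothetical, wires are modeled as 0 (Lo) and 1 (Hi), and any step that departs from the listed order is simply treated as an error.

```python
# Wire order follows Fig. 5: wires 1-3 are driven by the passive equipment,
# wires 4-6 by the active equipment (the transfer robot).
SIGNALS = ("READY", "LOAD_REQ", "UNLOAD_REQ", "TRANSFER_REQ", "BUSY", "COMPLETE")
OWNER = {
    "READY": "passive", "LOAD_REQ": "passive", "UNLOAD_REQ": "passive",
    "TRANSFER_REQ": "active", "BUSY": "active", "COMPLETE": "active",
}

# Steps ii through ix of the load scenario: (who acts, wire changes).
LOAD_SEQUENCE = [
    ("passive", {"LOAD_REQ": 1}),                               # ii
    ("active", {"TRANSFER_REQ": 1}),                            # iii
    ("passive", {"READY": 1}),                                  # iv
    ("active", {"BUSY": 1}),                                    # v, transfer starts
    ("passive", {"LOAD_REQ": 0}),                               # vi, cassette detected
    ("active", {"COMPLETE": 1, "TRANSFER_REQ": 0, "BUSY": 0}),  # vii
    ("passive", {"READY": 0}),                                  # viii
    ("active", {"COMPLETE": 0}),                                # ix
]


class HandshakeError(Exception):
    """Raised when a step is out of order or drives a wire it does not own."""


def run_load(steps):
    """Apply steps to six wires that start Lo; return the wire state after each."""
    wires = dict.fromkeys(SIGNALS, 0)  # step i: all six wires Lo
    snapshots = []
    for index, (actor, changes) in enumerate(steps):
        for signal in changes:
            if OWNER[signal] != actor:
                raise HandshakeError(f"{actor} equipment cannot drive {signal}")
        if index >= len(LOAD_SEQUENCE) or (actor, changes) != LOAD_SEQUENCE[index]:
            raise HandshakeError(f"step {index + 1} is out of order")
        wires.update(changes)
        snapshots.append(dict(wires))
    return snapshots
```

Running `run_load(LOAD_SEQUENCE)` returns eight snapshots, the last of which has every wire back at Lo, matching the return to the initial condition in step ix. Swapping any two steps raises `HandshakeError`.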
If either the passive or active equipment attempts any action that is out of order, the transfer is halted and an error condition results. Standards for the mechanical configuration of cassette ports are specified in the SEMI E19 Standards. Here again, it is clear that the groundwork for this level of automation must be laid early in the planning of the wafer fab. Since current generation lithography equipment is not typically provided with material transfer interfaces, joint development work may be required between the equipment supplier and purchaser to prepare and test interfaces. The negotiation, development, and testing of new subsystems such as this take considerable time, which must be anticipated in the factory automation plan.
6.3 CIM (Computer Integrated Manufacturing) Architecture Considerations
There are many approaches to CIM architecture for manufacturing environments, and it is not the intent here to advocate any particular approach over any other. It is important, however, to mention that in the design of the CIM architecture, consideration must be given to the automation plan. Some important considerations are listed below:
i. The Factory LAN must have the capacity to handle the volume of traffic that will be generated by the automation messages.
ii. The MES must be capable of supporting the messages that will allow the cell controller to have sufficient knowledge of lot processing parameters to be able to process lots correctly.
iii. The Engineering Database for the factory must be capable of supporting data uploads directly from the equipment or from the cell controller if this is a desired function.
iv. If recipe management is to be a function of the CIM architecture, the capability to store, manage, upload, and download recipes must be supported.
v. If interbay or intrabay lot and/or reticle transport are to be deployed, the messaging between the Material Control System (MCS) and other systems such as the MES and Cell Controller must be worked out and supported.
REFERENCES
1. Dick, S., and Greenstein, B., “Improved Microlithography Process Performance Through the Use of an Integrated Photosector,” KTI Interface (1991)
2. Burggraaf, P., “Lithography CIM Cell Drives Reworks Down, Feedback In Less Time,” in: Semiconductor International 62 (September, 1989)
3. Englisch, A., and Deuter, A., “Automated Lithocell,” SPIE Vol. 1261 Integrated Circuit Metrology, Inspection, and Process Control IV (1990)
4. “SCC1 Product Scope and Objectives,” Technology Transfer #91050541BENG (Jan. 1992)
5. SEMI Equipment Communications Standard 1 Message Transfer (SECS-I), SEMI E4-91 (1995)
6. SEMI Equipment Communications Standard 2 Message Content (SECS-II), SEMI E5-94 (1994)
7. Generic Model for Communications and Control of SEMI Equipment (GEM), SEMI E30-94 (1994)
8. Stepper Specific Equipment Model, SEMI Draft Doc. #2561 (1997)
9. Chrisos, J., and Horan, J., “Automation Reduces Contamination and Costs within Semiconductor Fab Cleanrooms,” Cleanrooms 50 (July 1996)
10. DeJule, R., “Further Fab Automation: Intrabay Handling,” Semiconductor International 79 (May 1997)
11. Lambson, C., Choudhury, H., and Davis, R., “Automated Reticle Transport and Stepper Loading,” Solid State Technology 97 (October 1997)
12. Specification for Cassette Transfer Parallel I/O Interface, SEMI E23-91 (1994)
13. Endsley, M. R., “The Integration of Humans and Advanced Manufacturing Systems,” Journal of Design and Manufacturing (March 1993)
14. Bruning, J. H., “Optical Lithography—Thirty Years and Three Orders of Magnitude: The Evolution of Optical Lithography Tools,” SPIE Vol. 3051 Optical Microlithography X (1997)
7
Electron-Beam ULSI Applications
Allen Lepore
Army Research Laboratory, Adelphi, Maryland
1.0 INTRODUCTION
Electron-beam lithography is a technology for the transfer of computer-aided design pattern data from a digitally stored format (i.e., data file) to a high-resolution spatial reality on a nominally flat substrate. Its main characteristics are high resolution, due in part to the short electron wavelength, and flexibility, due to the easily modified pattern source (a computer data file). The electron beam is almost exclusively used to pattern a temporary imaging layer (electron-beam resist) applied to the substrate rather than directly modifying the substrate or patterning a permanent imaging layer. The patterned electron-beam resist is subsequently used as a mask for the permanent transfer of the pattern image from the resist to the substrate. For the overall lithographic process to be viable, it must be suited to this ultimate pattern transfer step.
The most common application for electron-beam lithography is in the fabrication of masks and reticles used for projection optical lithography. For this application, the electron-beam lithography process generates the master mask primarily in a serial manner. This mask is subsequently used in projection optical lithography equipment (steppers) to replicate the mask pattern (usually demagnified) directly onto resist-coated substrates in a parallel manner. The electron-beam lithography process is highly suitable for mask fabrication and provides sufficient resolution for increasingly demanding mask requirements, such as optical proximity correction, phase shifting, and 1X magnification x-ray masks. For these applications, the principal challenge for tool manufacturers is feature placement accuracy, as overlay error budgets become significant fractions of ever-decreasing feature sizes.
A less common application for electron-beam lithography is direct writing, where the electron-beam lithography process is used to directly transfer the pattern to the resist-coated substrate without the intervening mask fabrication and optical lithography steps. Electron-beam lithography is compatible with direct writing, as the electron beam can precisely measure the pre-existing wafer pattern position and distortion and compensate for irregularities. For high-volume applications, throughput limitations may dictate alternate technologies. In the resolution regime where electron-beam direct writing and optical lithography overlap, the parallel nature of optical lithography provides a clear throughput advantage over the primarily serial electron-beam lithography process. An additional throughput advantage for optical lithography is exposure at atmospheric pressure without the vacuum system pumpdown overhead of electron-beam lithography. These limitations are important for throughput-intensive tasks such as dynamic random access memory (DRAM) fabrication, where both high resolution and high-volume production are necessary. Currently, electron-beam direct writing is capable of higher resolution than projection optical lithography, making electron-beam direct writing useful for prototyping next-generation DRAM wafers, especially when the availability of the expected production lithography tools is initially limited. A distinct advantage of electron-beam direct writing over masked optical lithography is flexibility.
Pattern design changes are implemented by modifying the digital file without intervening mask fabrication, a useful advantage for low-volume prototyping and circuit customization. Equipment-specific exceptions to the serial exposure process will be noted later in this chapter.
Electron beams are useful for patterning a substrate as they are readily formed and shaped and are easily deflected using electromagnetic or electrostatic means.[1] For high-resolution patterning, electron beams typically have effective wavelengths significantly smaller than the pattern minimum feature size and are therefore not diffraction limited.[2] This contrasts with optical lithography, in which the minimum resolvable feature size is comparable to the exposing light wavelength, resulting in resolution-limiting diffraction effects. Difficulties arise with electron-beam patterning as the low effective mass electrons are easily deflected from their intended position on the substrate surface by scattering from collisions with the substrate materials and by charge built up on the substrate surface.
The basic electron-beam lithography system is shown schematically in Fig. 1. It consists of an electron source that provides an energetic electron beam and associated components to control beam size, shape, and current. At the receiving end of the beam is a stage that holds the substrate to be exposed. The stage control system positions the substrate in a plane orthogonal to the incident beam axis. A fundamental requirement for pattern exposure and calibration operations is the ability to measure stage position in this plane with high resolution and accuracy. A deflection system controls the movement of the beam across the substrate according to the user-specified pattern data. A blanking system provides beam intensity modulation to prevent exposure until the beam and substrate are properly positioned. A detection system permits the location of features (reference marks) on the stage or substrate by detecting electrons emitted when the beam is scanned across the reference mark. Mark detection is used to calibrate the electron-beam deflection relative to the stage reference (a laser interferometer) and to align the exposed pattern to pre-existing substrate patterns (pattern registration). A vacuum system provides the necessary vacuum level at different points in the system and provides for substrate exchange at atmospheric pressure. A system control computer integrates all of these functions, establishing beam parameters, executing calibrations and alignments, sequencing stage movement, controlling pattern data flow to the deflection system, and controlling substrate exchange. Typically, one or more separate computers provide computer-aided design (CAD) layout and CAD data translation functions.
Improvements in exposure tool performance demand increasingly more stringent environmental controls for precise temperature regulation, minimal stray magnetic fields, and minimal structural, mechanical, and acoustic vibrations. The development process completes the transformation of the resist coating into the three-dimensional representation of the CAD pattern. Although the CAD pattern is specified only in the plane of the substrate, the resist sidewall profile (normal to the substrate) can significantly impact pattern transfer and dimensional control. Development is most commonly performed using organic solvent or aqueous-based chemistry. Less commonly used are materials that ablate directly under exposure to the electron beam and are subsequently self-developing. Also less common are materials that provide plasma etch rate variations due to deposited electron energy which are developed using plasma etching, often referred to as dry development.[3]
Figure 1. Block diagram of typical electron-beam lithography system.
A wide variety of equipment is available for performing electron-beam lithography. The electron beam may be a finely-focused Gaussian spot, a variable-sized spot or rectangle, or a complex shape determined by a pattern-dependent mask. The writing approaches also differ greatly, with some machines writing with the substrate stationary and others with the substrate moving. Operating conditions also vary widely for many parameters, such as beam energy, beam current, and beam deflection rate. All of these parameters are important considerations and possible limitations for the end result, the pattern on the substrate. The following discussions explain the connections between the machine designs, the applications, the exposure approach, and the resultant pattern.
2.0 THE LITHOGRAPHY PROCESS
2.1 Logistics of Exposure
The basic lithography process is a sequence of pattern design, substrate preparation, pattern exposure, latent image development, and
pattern transfer.[4] Although it can be argued that the terminology of lithography does not include pattern transfer, it is included here because it significantly impacts the choice of lithography process conditions and materials. Pattern transfer is the ultimate requirement for the lithography, since it permanently records the pattern image on the substrate. The following section provides a very basic description of each element in this process sequence. Details of specific operating conditions, machine-specific limitations, and process-related tradeoffs are provided in later sections. The beginning of the lithography process is the digital definition of the pattern using CAD tools, termed “layout.” The layout of the geometric elements of the pattern is best accomplished with some knowledge of specific machine and exposure-dependent parameters in order to achieve the desired result. For example, in the case of a phase-coherent grating with 0.12564-µm lines and spaces and a phase-shifting region in the center, it must be known that the exposure system can be calibrated to the required resolution and that the grating pattern does not cross any machine-dependent pattern exposure boundaries that might introduce a grating phase error. Intermediate pattern conversion software exists that will provide the translation (fracturing) from the desired user-specified pattern to the basic shapes that the exposure system is capable of replicating. This intermediate software may allow user-specified corrections for material, process, and, to some degree, pattern-dependent effects that deviate from the imaged pattern specified in the CAD layout. The next part of the lithography process, substrate preparation, begins with the application of an electron-sensitive coating, called electron-beam resist, on the substrate.
It is the electron-beam resist that is modified by the deposition of the energy from the electron beam to form a latent image analogous to the latent image formed in photographic film. This latent image is subsequently processed to yield a physical representation of the computer-aided design pattern as a three-dimensional resist pattern. This resist is most commonly a polymer that originally is dissolved in a solvent to form a liquid. The liquid resist is applied to the substrate using spin-casting techniques to form a uniform-thickness coating. The solvent is then driven from the coating by baking of the substrate, yielding a durable polymer film on the substrate surface. The application, spinning, and baking of the resist are referred to as the coating process. Exposure of the electron-beam resist follows the coating process. In an exposure, the exposure tool will deposit energy in the resist coating in the areas specified by the CAD pattern, forming a latent image. There are
many variations in the exact way that the exposure is completed, depending on the pattern and the exposure equipment, as discussed in Sec. 3.0, “Electron Beam Lithography Equipment.” The electron-beam lithography system is limited in how it accesses portions of an arbitrarily large CAD pattern, because often the CAD pattern exceeds the range accessible through beam deflection only. In these cases, the pattern is exposed through a combination of beam deflection and substrate translation, either sequentially or simultaneously. From a global perspective, the pattern is exposed as a mosaic of rectangles (fields), as shown in Fig. 2. For a system that uses beam deflection simultaneously with substrate translation, the rectangles of the mosaic are high aspect ratio stripes. For a system that uses beam deflection with the substrate fixed, the rectangles of the mosaic are square fields. Relative misalignment of the boundaries of these distinct pattern areas (fields or stripes) is known as stitching error.
Figure 2. Fixed versus moving stage exposure strategies.
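The field and stripe mosaic described above can be sketched in a few lines of Python. This is a minimal illustration and not any vendor's fracture algorithm: the function names, the 500 µm field size, the 250 µm stripe width, and the 10 mm pattern extent are assumptions chosen only for the example. The sketch counts the fields or stripes needed to cover a pattern and lists the field boundaries that a feature crosses, which are the locations where stitching error can appear.

```python
import math

def count_fields(pattern_w_um, pattern_h_um, field_um):
    """Number of square fields (columns, rows) needed to tile the pattern."""
    return (math.ceil(pattern_w_um / field_um),
            math.ceil(pattern_h_um / field_um))

def count_stripes(pattern_w_um, stripe_w_um):
    """Number of high-aspect-ratio stripes needed across the pattern width."""
    return math.ceil(pattern_w_um / stripe_w_um)

def crossed_boundaries(start_um, end_um, field_um):
    """Field boundary positions strictly inside a feature spanning
    [start_um, end_um] along one axis (candidate stitching locations)."""
    first = math.floor(start_um / field_um) + 1
    last = math.ceil(end_um / field_um) - 1
    return [k * field_um for k in range(first, last + 1)]

# Hypothetical 10 mm x 10 mm pattern written with 500 um fixed-stage fields.
cols, rows = count_fields(10000.0, 10000.0, 500.0)
print("fields:", cols * rows)
# The same width written as hypothetical 250 um wide moving-stage stripes.
print("stripes:", count_stripes(10000.0, 250.0))
# A line running from x = 450 um to x = 1250 um crosses two field boundaries.
print("boundaries crossed:", crossed_boundaries(450.0, 1250.0, 500.0))
```

A feature that returns an empty list lies entirely within one field and is not subject to field butting along that axis.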
The resist pattern is temporary and functions solely for the purpose of pattern transfer. Pattern transfer uses the resist as a mask or stencil (to provide for addition or subtraction of material through the developed openings) thus providing a permanent functional image transfer to the substrate. The actual resist material is subsequently stripped from the sample before further processing. The specific nature of the resist material (for example, adhesion, etch resistance, thermal stability) and its developed profile (for example, undercut or overcut sidewall slope) are thus crucial in realizing the desired permanent substrate pattern.
2.2 Physics of Exposures
Introduction to Electron-Solid Interactions. The physics of the electron-beam exposure process is dictated by the nature of electron-solid interactions, both in the resist and in the underlying substrate materials. The main parameters influencing these interactions are those of the beam (primarily the beam energy) and those of the materials (the resist and substrate types), and the resist thickness. The electrons in the beam interact with the resist-coated substrate through elastic and inelastic scattering by the resist and substrate atoms. This results in the deposition of energy in the resist and substrate, and in the spreading of the electrons from the point of incidence. The inelastic scattering events are the means by which the electrons deposit energy in the resist and substrate and determine the absorbed energy distribution. It is this absorbed energy distribution in the resist that determines the developed resist pattern.[5] A consequence of the energy loss of the primary beam is the generation of secondary electrons. The bulk of these secondary electrons have characteristically low energy that is readily absorbed by the resist, contributing significantly to the exposure. Fortunately, the low energy also limits their effective range in the resist to a few nanometers, thus limiting the secondary electron contribution to beam broadening to approximately 10 nm.[6] Because of the severely limited range of secondary electrons in resist, they are of limited use for the detection of resist-encapsulated substrate alignment marks. Secondary electron generation can include significant numbers of fast secondary electrons (i.e., 1–2 keV). These higher energy secondaries can travel tens of nanometers in the resist, resulting in limited resolution in the case where larger order resolution-limiting effects have been reduced (for example, by increasing the accelerating voltage to diffuse the backscatter distribution). 
In the case of a substrate with a significantly higher atomic
number than the resist, the bulk of the energy loss and the associated fast secondary generation will be in the substrate. In this case, an organic or inorganic layer between resist and substrate can serve to absorb these secondary electrons from the substrate before they reach the resist, improving resolution.[7]
Electron Scattering. The scattering events are classified as forward scattering and backscattering. Forward scattering is characterized by primarily small-angle scattering, less than 90° from the primary beam direction. The main impact of forward scattering on the lithography process is broadening of the incident beam as it passes from the resist surface to the substrate. The amount of beam broadening is inversely related to the accelerating voltage. Since the scattering angles are characteristically small, the forward-scattered electrons that reach the substrate do not have further statistically significant interactions with the resist. The beam-broadening effects introduced by forward scattering are reduced as resist thickness decreases or accelerating voltage increases. Backscattering is characterized by large-angle scattering, to nearly 180° from the primary beam direction. Unlike forward scattering, these large scattering angles make it possible for electrons reaching the substrate to return to the resist. Backscattered electrons may also originate from the resist layer without ever reaching the substrate. This contribution is usually much smaller than that from the substrate, since the magnitude of this scattering is strongly dependent on the atomic number of the material, and typically the ratio of resist to substrate atomic numbers is small. The practical implications are that subsequent scattering events from substrate-backscattered electrons can deposit energy in the resist away from the primary beam (i.e., away from the pattern), leading to pattern distortions.
Electrons backscattered from the substrate have significantly higher energy than secondary electrons and can thus escape out of the resist and be detected. The detected backscatter signal serves a useful purpose for the location of resist-covered alignment marks on the surface of the substrate. The lateral distribution of deposited energy in the resist from the point of incidence is often approximated by Gaussian distributions, assuming initially that the primary beam is a delta function. The forward-scattered incident (primary) beam is best represented by its own Gaussian distribution and the contribution from backscattered electrons is represented by a separate Gaussian distribution, as shown in Fig. 3.[8] The forward-scattered Gaussian distribution is dependent on the beam energy, resist type, and resist thickness. It increases in width with either increasing resist thickness or atomic number, and decreases in width with increasing beam energy. The
total quantity of backscattered electrons is characterized by a backscattering coefficient η. This coefficient is strongly dependent on the atomic number of the substrate and only weakly dependent on the incident beam energy. The backscattered Gaussian distribution is therefore highly dependent on the substrate atomic number, increasing in magnitude with increasing atomic number. Although not indicated by η, the width of the backscattered distribution is highly dependent on incident beam energy. The radial spread of backscattered electrons is comparable to the Bethe range, which is the characteristic path length for an electron in the solid that has given up all of its energy. The Bethe range increases with increasing incident beam energy. Thus, the characteristic width of the backscattered electron Gaussian distribution increases with increasing beam energy. For commonly used substrates and resists, the backscattered Gaussian characteristic widths vary from the order of 0.8 µm at 10 kV incident energy to 8 µm at 50 kV.[9] Since η is weakly dependent on the beam energy, as the characteristic widths of the backscattered electron Gaussian distribution increase with beam energy, the nearly constant value of η dictates that the integrals of these distributions remain approximately constant. This subsequently implies a decrease in the backscattered electron Gaussian peak height with increasing beam energy. In considering a real primary beam, for the case of a spot beam, the actual energy distribution must be approximated by convolving the Gaussian electron distribution in the spot with the scattering distributions obtained for the assumed delta-function beam.
Figure 3. Forward scattering and backscattering distributions.
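The two-Gaussian picture of Fig. 3 can be written out numerically. The sketch below assumes the commonly used normalized form in which a narrow forward-scatter Gaussian of width alpha and a broad backscatter Gaussian of width beta are weighted by 1 and η, respectively; the symbols alpha and beta, the value η = 0.7, and alpha = 0.05 µm are illustrative assumptions, and identifying the quoted characteristic widths (0.8 µm and 8 µm) with beta is also an assumption. It shows that the backscatter peak height falls as beta grows while the integrated energy stays fixed.

```python
import math

def double_gaussian(r_um, alpha_um, beta_um, eta):
    """Assumed two-Gaussian energy density at radius r from a delta-function
    beam, normalized so that its integral over the plane equals 1."""
    forward = math.exp(-(r_um / alpha_um) ** 2) / alpha_um ** 2
    back = eta * math.exp(-(r_um / beta_um) ** 2) / beta_um ** 2
    return (forward + back) / (math.pi * (1.0 + eta))

def backscatter_peak(beta_um, eta):
    """Height of the backscatter term alone at r = 0."""
    return eta / (math.pi * (1.0 + eta) * beta_um ** 2)

def radial_integral(alpha_um, beta_um, eta, r_max_um, steps=16000):
    """Midpoint-rule integral of the density over a disc of radius r_max."""
    dr = r_max_um / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        total += double_gaussian(r, alpha_um, beta_um, eta) * 2.0 * math.pi * r * dr
    return total

ETA = 0.7          # hypothetical backscatter coefficient
ALPHA_UM = 0.05    # hypothetical forward-scatter width
for beta_um in (0.8, 8.0):   # widths quoted in the text for 10 kV and 50 kV
    print(beta_um, backscatter_peak(beta_um, ETA),
          radial_integral(ALPHA_UM, beta_um, ETA, 10.0 * beta_um))
```

Because the peak height scales as 1/beta², a tenfold increase in width lowers the backscatter peak a hundredfold in this model.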
Substrate Beam Damage. For direct-write exposures, consideration may be necessary for possible damage of the substrate due to electron-energy loss. In considering the deposition of energy by the incident beam, it is noted that, for practical exposures, the bulk of the energy is deposited in the substrate. At lower energies, the deposited energy is concentrated nearer to the surface of the substrate, closer to the active layers of most device layer structures. A reduction in damage to active device layers has been demonstrated for compound-semiconductor quantum-well structures as the beam energy is increased due to the increase in depth of the deposited energy, away from the surface active layers.[10] An alternate approach to minimizing beam-induced substrate damage is the use of a low-energy beam that is completely absorbed in the resist, although such low-energy systems are less common than higher energy systems.[11] The complete absorption of the beam energy in the resist requires a multiple-layer resist approach where the top layer images the pattern and is exposed through its entire thickness. Electrons passing through the top layer are subsequently absorbed in the lower layer before reaching the substrate. This approach adds complexity to the resist processing. In addition, it requires the top layer to be thin enough to permit complete low-energy electron transmission and simultaneously durable enough to permit pattern transfer to the lower layer. The pattern transfer technique must not introduce significant substrate damage or pattern degradation (for example, via energetic ion-bombardment due to plasma processing). In addition, charge deposited above the substrate must be conducted away to prevent charge buildup and associated pattern distortion. This requires the use of a conductive lower layer or the addition of a thin conducting layer above or below the imaging layer.
Proximity Effects.
The pattern distortions due to backscattered electrons are termed proximity effects and are best understood by considering specific examples. First, it is noted that the resist serves to integrate the deposited energy from all sources (i.e., from all locations of the exposed pattern due to either primary beam or backscattered electron exposures). It is also noted that the approximate developed resist pattern will match a contour of constant deposited energy that is dependent on developer strength and development time. In considering an arbitrary point in the resist, this point will be developed out as part of the resist image, if the deposited energy at the point exceeds a certain accumulated dose. It is desirable for the primary electron Gaussian distribution to dominate the energy deposition at the point. For this condition, the point is part of the developed image only if it lies within the limited forward scattering range
of the exposed pattern. A more difficult situation arises when the sum of backscattered electron contributions dominates the energy deposited at the point. This can occur significant distances from the pattern due to the larger range of backscattered electrons relative to forward-scattered electrons. Furthermore, it is important to note that the total contribution at a point in the resist is pattern-dependent. Consider a specific example of the exposure of an isolated square pattern of dimensions comparable to or larger than the Bethe range for the given exposure conditions. The exposure conditions will be fixed such that the primary energy deposited in the resist will be equal for all points in the pattern, a typical condition. At the center of the pattern, there is a total contribution of deposited energy that is the sum of the primary energy plus backscattered energy contributions from nearby points exposed in all directions. However, in the corners of the square there are nearby backscattered contributions only over one quadrant, reducing the total deposited energy relative to the center point. The result, after development to a constant energy contour, is the rounding of the corners. This is one example of the intraproximity effect, the variation of total deposited energy within the pattern. Next consider the exposure of an isolated cross pattern of dimensions comparable to or larger than the Bethe range for the given exposure conditions, as shown in Fig. 4. At the tips, the outside corners will have backscattered contributions integrated over one quadrant (similar to the square example above) and will thus exhibit the characteristic intraproximity rounding. Now consider a point just outside of the pattern area, at one of the inside corners where the two cross arms intersect. At some points sufficiently close to but still outside the pattern, the integrated backscattered electron contribution will exceed the energy threshold for development.
This will result in a fillet or web across the corner. This is one example of the interproximity effect, the variation of total deposited energy outside the pattern. In considering the impact of exposure energy on proximity effects, note that although the range for backscattered electrons increases with increasing energy, the peak intensity of the backscatter is reduced with increasing energy. This is because the backscatter coefficient η is only a weak function of energy, which renormalizes the backscatter distribution curve to maintain a nearly constant area. This reduction in intensity can reduce the total integrated energy at a point outside the pattern to below the development threshold, improving pattern integrity by reducing interproximity effects (as demonstrated by reduced inside corner filleting). Since the backscatter
contribution is made more diffuse, the variation of dose within the pattern is also reduced with increasing energy, leading to reduced intraproximity effects as demonstrated by reduced corner rounding. Examples of these energy-dependent proximity effects are shown in Fig. 5. The above considerations are all made considering an arbitrarily thick substrate where all incident electrons either lose all their energy close to or escape from the substrate front surface. If the substrate thickness is less than the range of the electrons, then the backscatter contribution and associated proximity effect are significantly reduced as electrons pass through the substrate. This is the case for x-ray proximity masks, which require patterns formed on thin supporting membranes, such as Si, Si3N4, or SiC. The patterned resist defines an x-ray absorber material either by addition of the absorber to the membrane after patterning by electroplating or by subtraction of refractory absorber material deposited before patterning. A proximity disadvantage exists for the subtractive refractory process relative to the additive plating process, due to the presence of the thick, high-atomic-number refractory absorber during patterning.
Figure 4. Pattern distortion due to proximity effects.
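The quadrant-counting argument for the square and the cross can be checked with a small calculation. The sketch below is an illustration under stated assumptions: the backscatter distribution is taken as a single Gaussian of width beta (here 2 µm, a hypothetical value), forward scattering is ignored, the features are uniformly exposed, and the function names are invented for the example. Each result is the backscattered energy at a point expressed as a fraction of the level at the center of a very large exposed area.

```python
import math

def _axis_fraction(u, lo, hi, beta_um):
    """Integral of the normalized 1D Gaussian line-spread over [lo, hi],
    evaluated at position u."""
    return 0.5 * (math.erf((hi - u) / beta_um) - math.erf((lo - u) / beta_um))

def rect_backscatter(x, y, rect, beta_um):
    """Backscatter fraction at (x, y) from a uniformly exposed rectangle
    given as (x0, x1, y0, y1)."""
    x0, x1, y0, y1 = rect
    return _axis_fraction(x, x0, x1, beta_um) * _axis_fraction(y, y0, y1, beta_um)

def cross_backscatter(x, y, half_length, half_width, beta_um):
    """Backscatter fraction from a cross: two arms minus their overlap."""
    horizontal = (-half_length, half_length, -half_width, half_width)
    vertical = (-half_width, half_width, -half_length, half_length)
    overlap = (-half_width, half_width, -half_width, half_width)
    return (rect_backscatter(x, y, horizontal, beta_um)
            + rect_backscatter(x, y, vertical, beta_um)
            - rect_backscatter(x, y, overlap, beta_um))

BETA_UM = 2.0                             # hypothetical backscatter width
square = (-10.0, 10.0, -10.0, 10.0)       # 20 um square, large compared with beta
print("square center:", rect_backscatter(0.0, 0.0, square, BETA_UM))
print("square corner:", rect_backscatter(10.0, 10.0, square, BETA_UM))
print("cross tip corner:", cross_backscatter(20.0, 5.0, 20.0, 5.0, BETA_UM))
print("cross inside corner:", cross_backscatter(5.0, 5.0, 20.0, 5.0, BETA_UM))
```

In this model the outside corners receive about one quarter of the center backscatter level (one quadrant), while a point at an inside corner of the cross receives about three quarters of it without any primary exposure, which is why a fillet can develop there.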
Figure 5. Impact of exposure energy on proximity effects: (a) reduction of intraproximity corner rounding with 20 kV and 100 kV exposure energy at tip of 0.4 µm cross, and (b) reduction of interproximity corner filleting with 20 kV and 100 kV exposure energy at center of 0.4 µm cross.
Proximity Effect Minimization. For an arbitrary pattern, the exposure of any part of the pattern can be significantly impacted by exposure of adjacent pattern features. The interproximity effect will lead to narrowing or even closure of small gaps between features. The range of this interaction is that of the backscattered electron distribution, approximated by the Bethe range. In considering the general nature of the approximating backscattered electron Gaussian distributions as a function of energy, at lower energies the distribution is narrow and peaked relative to higher energies, where it is broader and flatter. The resulting interproximity effect at lower energies is subsequently a strong function of distance relative to that at higher energies. From a practical standpoint, there is an advantage in using higher energies to expose closely spaced features as the backscattered electron distribution is spread out over a larger area and is less concentrated in the inter-feature gap relative to lower energy exposures. The reduced concentration in the gap increases the possibility of the total dose in the gap being below the development threshold, allowing the separate features to be resolved, as shown in Fig. 6.
Figure 6. Preservation of 0.1 µm gap between large area features using 100 kV exposure energy.
If the beam energy is reduced sufficiently, then the electron range can be made narrower than the minimum feature size and the proximity effect eliminated. The limited electron range requires correspondingly thin resist
layers to provide complete exposure through the entire thickness of the resist. Systems with these low accelerating voltages are less common than higher voltage systems and suffer from difficulties in electron-optical design. Scanning probe instruments can provide very low exposure energy, but these systems are currently far from commercial viability due to very low throughput. At low enough energy and with thick enough resist, the beam can be completely absorbed in the resist, as discussed above (see Substrate Beam Damage). The generation of backscattered electrons in the substrate suggests that a reduction of the associated proximity effect may be possible by separating the imaging resist layer from the substrate via a thick intermediate layer. Introducing this layer produces negligible additional backscattered electron generation, provided that the atomic number of the intermediate layer is low relative to the substrate, as for an organic film on a semiconductor. The imaging layer placed on top of the intermediate layer can be made thin, thereby reducing forward scattering. If the intermediate layer is sensitive to the beam, then selective solvents can develop it away without compromising the top layer image. Otherwise, the top imaged layer must be used to transfer the pattern through to the substrate by etching. This approach can significantly increase the processing overhead relative to a single-layer resist approach. The preceding discussions describe variations in exposure and resist conditions that impact proximity effects in a pattern-independent manner and are best described as proximity-reduction techniques. In contrast, proximity correction techniques are pattern-dependent approaches. One approach to proximity correction involves variation of the exposure dose within the pattern to compensate for the excessive or insufficient dose contributions of interproximity and intraproximity, respectively (see Fig. 
7).[12] This approach requires an exposure system capable of modulating the dose within a pattern. To accomplish dose modulation, additional clock speed bits are added to shape address words during fracture. These bits are used during shape exposure to select the exposure clock speed and thus vary the dose. Software is used to analyze the pattern data and determine pattern regions requiring dose variation, which can be time consuming for patterns with large numbers of elements. The software performs any additional fracturing of the pattern shapes required for dose modulation (for example, to create separate edge or corner shapes from a single initial CAD element). The approaches taken by these programs are numerous and constantly changing, and the details are beyond the scope of this chapter.[13][14] These programs often require test exposures
of specific calibration patterns to determine correction parameters. The pattern areas where dose modulation will be implemented may be fractured into separate shapes, which may require increased fracture and exposure resolution that can significantly impact overall shape count and data processing overhead. Note that for a 1-Gb DRAM pattern’s approximately 10¹⁰ rectangular shapes, a pattern conversion including full two-dimensional proximity correction could take on the order of days to execute.[15]
Figure 7. Impact of proximity effect and correction techniques on dose and resist profiles.
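The idea of dose modulation can be illustrated with a deliberately simplified one-dimensional model. The sketch below is not one of the correction programs cited above; it assumes that each exposed pixel receives its own primary dose plus a backscatter term equal to η times a Gaussian-weighted sum of neighboring doses, ignores forward scattering, and uses a plain fixed-point iteration to find doses that give every exposed pixel the same total deposited energy. The pixel size, beta, η, the pattern, and all function names are hypothetical.

```python
import math

def line_spread(distance_um, beta_um, pixel_um):
    """Share of one pixel's backscatter reaching a pixel at this distance."""
    return (pixel_um * math.exp(-(distance_um / beta_um) ** 2)
            / (beta_um * math.sqrt(math.pi)))

def backscatter(doses, i, pixel_um, beta_um):
    """Gaussian-weighted sum of all assigned doses as seen from pixel i."""
    return sum(d * line_spread((i - j) * pixel_um, beta_um, pixel_um)
               for j, d in doses.items())

def corrected_doses(exposed, pixel_um, beta_um, eta, target=1.0, iterations=60):
    """Iterate dose = target - eta * backscatter until self-consistent."""
    doses = {i: target for i in exposed}
    for _ in range(iterations):
        doses = {i: target - eta * backscatter(doses, i, pixel_um, beta_um)
                 for i in exposed}
    return doses

def deposited(doses, i, pixel_um, beta_um, eta):
    """Total energy at pixel i: primary dose plus backscatter contribution."""
    return doses.get(i, 0.0) + eta * backscatter(doses, i, pixel_um, beta_um)

PIXEL_UM, BETA_UM, ETA = 0.1, 2.0, 0.7   # hypothetical values
pad = list(range(20, 100))               # 8 um wide feature
line = [170, 171]                        # isolated 0.2 um line
doses = corrected_doses(pad + line, PIXEL_UM, BETA_UM, ETA)
print("pad center dose:", doses[60])
print("pad edge dose:", doses[20])
print("isolated line dose:", doses[170])
```

In this model the isolated line and the pad edges are assigned a higher dose than the pad center, which mirrors the compensation for insufficient and excessive backscatter contributions described in the text.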
If dose modulation is not available, overlapping shapes can be placed at the regions of low dose, such as small squares in the corners of large rectangles. The size of overlapping shapes may need to be limited as the dose in the overlap region is doubled, for example, by using a short 0.25 µm wide line at the tips of an isolated 0.5 µm line. If fine exposure resolution is available, the duty cycle of repeated small features can be varied to effect some dose modulation between a factor of one and two. Although most proximity correction techniques rely on complex pattern analysis, the GHOST technique requires only a simple pattern tone reversal and re-exposure using a highly defocused beam (see Fig. 7). The GHOST technique uses a defocused beam that mimics the long-range backscatter dose distribution to expose the regions outside the pattern and provide a more uniform background dose.[16] The GHOST exposure dose is defined as the largest expected backscatter dose contribution, which is determined at the center of a large area feature. Conventional implementations have a throughput disadvantage because of the need for the second (reverse tone) pattern exposure, although the data file processing is simplified relative to algorithmic approaches to correcting the initial pattern exposure. In the specific case of the SCALPEL system, where the inverse pattern is projected simultaneously with the pattern, the defocused inverse (GHOST) exposure is generated without an additional pattern exposure through a modification of the scattering selection aperture (see Sec. 3.6).[17]
2.3 Lithography Process Issues and Parameters
CAD Considerations. Commercial lithography systems must interpret the desired CAD pattern as a digital approximation, since the hardware limits beam placement to discrete points on the substrate. This hardware-specific software task, called fracturing, is part of the process of converting the generic CAD data format to the hardware-specific binary exposure format. The discrete exposure points are called picture elements, or pixels. For simple shapes and with fine digital resolution, this digital approximation is an adequate representation of the desired pattern. However, arbitrary pattern shapes may lead to significant errors in the digital approximation. The following section explains some of the conditions of machine limitations, pattern configurations, and user-specified parameters that require special consideration in preparation of the CAD pattern. The interpretation of a given shape within the CAD pattern is accomplished by a subsystem called the pattern generator. The pattern generator is restricted to producing basic shapes, typically a trapezoid. The
basic shape is composed of pixels that nearly always are restricted to a rectangular Cartesian grid. This restriction can lead to limitations in the pattern generator’s ability to represent arbitrary feature shapes, including lines of arbitrary angle, especially in the case of larger pixel sizes (see Fig. 8). Exact edge slope representation is possible only for angles with tangents that are the ratio of integral numbers of pixels. This limitation is of greatest consequence for lines that are less than a few pixels wide, leading to a noticeable sawtooth representation unless the pixel size is made very small. This error can be reduced by making the resolution finer (smaller pixels), which may in turn restrict field size and reduce throughput. Knowledge of the intended exposure pixel size during CAD digitization can avoid error during fracture, where points digitized off the exposure grid are snapped to legitimate pixel positions.
Figure 8. Digital approximation of CAD shapes.
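Both effects described above, the snapping of off-grid vertices and the restriction of exact edge slopes to ratios of whole pixels, can be illustrated with the Python standard library. This is a sketch under stated assumptions: the function names, the 25 nm pixel, and the limit of 10 pixels on the run of a slope step are hypothetical choices, and real pattern generators may round differently.

```python
import math
from fractions import Fraction

def snap_to_grid(value_um, pixel_um):
    """Move a coordinate to the nearest exposure-grid position."""
    return round(value_um / pixel_um) * pixel_um

def nearest_exact_slope(angle_deg, max_run_pixels):
    """Closest slope expressible as rise/run in whole pixels, with the run
    limited to max_run_pixels."""
    tangent = math.tan(math.radians(angle_deg))
    return Fraction(tangent).limit_denominator(max_run_pixels)

def angle_error_deg(angle_deg, max_run_pixels):
    """Difference between the representable edge angle and the drawn angle."""
    slope = nearest_exact_slope(angle_deg, max_run_pixels)
    return math.degrees(math.atan(float(slope))) - angle_deg

# A vertex drawn at 0.137 um lands on the 0.125 um grid point for 25 nm pixels.
print(snap_to_grid(0.137, 0.025))
# A 45 degree edge is exactly representable; a 30 degree edge is not.
for angle in (45.0, 30.0):
    slope = nearest_exact_slope(angle, 10)
    print(angle, slope, angle_error_deg(angle, 10))
```

Allowing a longer run, or equivalently using smaller pixels for the same physical step length, reduces the angular error at the cost of more pixels per feature.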
It is also noted that for narrow lines there is a variation of linear pixel density as a function of line angle due to the Cartesian geometry. This density variation implies a variation in exposure dose as a function of angle. For example, a single-pixel-wide line at 45° to the coordinate axes has a linear pixel density that is reduced by a factor of 1/√2 from the pixel density along either coordinate axis. This effect is most pronounced for a narrow circular annulus where the linear dose varies continuously as a function of position along the annulus. Fracturing is part of the data conversion process that converts the generic CAD data into machine-specific and exposure-condition-specific data. The fracture process provides for the digital approximation of the CAD pattern based primarily on the following two requirements: first, CAD data must be approximated using the basic shapes that the pattern generator is capable of exposing. Second, machine limitations combined with user-determined parameters, such as pixel size, require that the pattern is exposed as a mosaic of rectangles, either high-aspect-ratio stripes or square fields. Fracturing breaks down, or fractures, the specified CAD shapes into the basic shapes compatible with the pattern generator and determines the boundaries of the mosaic of stripes or fields. Additional hardware requirements may necessitate further subdivision of the stripe or field into subunits, introducing additional fracture boundaries. These boundaries lead to the division of CAD features that cross the boundaries into separate shapes that are exposed under differing conditions of deflection or stage position. Errors in how the fields or stripes are butted, or stitched, together lead to discontinuities in shapes crossing these boundaries, known as field or stripe butting errors for fixed or moving stage approaches, respectively. Butting errors are also referred to as stitching errors.
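The angle dependence of linear pixel density noted earlier in this discussion for single-pixel-wide lines can be verified by rasterizing a line and counting pixels per unit length. The sketch assumes a simple rasterization that places one pixel per step along the longer axis of the line; the function names are invented for the example and lengths are measured in units of the pixel pitch.

```python
import math

def rasterize_line(dx_pixels, dy_pixels):
    """Single-pixel-wide line from (0, 0) to (dx, dy), with one pixel per
    step along the longer axis."""
    steps = max(abs(dx_pixels), abs(dy_pixels))
    return [(round(i * dx_pixels / steps), round(i * dy_pixels / steps))
            for i in range(steps + 1)]

def linear_pixel_density(dx_pixels, dy_pixels):
    """Exposed pixels per unit length along the line."""
    pixels = rasterize_line(dx_pixels, dy_pixels)
    length = math.hypot(dx_pixels, dy_pixels)
    return (len(pixels) - 1) / length

print("along axis:", linear_pixel_density(1000, 0))
print("at 45 degrees:", linear_pixel_density(1000, 1000))
print("1/sqrt(2):", 1.0 / math.sqrt(2.0))
```

Since each pixel receives the same dose, the lower pixel density at 45° translates directly into a lower dose per unit length of line unless the exposure is compensated.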
In addition to stitching between fields or stripes, stitching can occur between subfields and between variable-shape and cell-projection exposure regions. The field or stripe butting error is usually much larger than the subfield butting error as the field or stripe butt involves significant stage stepping. This usually requires the matching of patterns written at the limits of beam deflection, where distortions are greatest. The hardware addressing the beam over a stripe or field using beam deflection is limited in deflection resolution by the number of bits used for the deflection digital-to-analog converter (DAC). The maximum possible deflection range is limited by the deflection hardware, but for fine resolution, the pixel size may limit the exposure field to less than the hardware maximum range. For n bits, the field or stripe size is pixel size × 2ⁿ. To reduce the number of field or stripe butts in a pattern, a larger pixel size can be chosen. However,
choosing a larger pixel size may lead to larger errors in the fractured representation of the CAD data (if the CAD shape vertices are digitized on grid points that are smaller than the fracture pixels) and can limit the ability to introduce fine increments of pattern bias. In choosing fracture parameters, consideration should also be given, if possible, to minimizing the field or stripe butts across critical features, such as the gate of a field-effect transistor or the lines of a coherent grating. In some cases, such as integrated-optics, it may be desirable to overlap fields to distribute the butting errors more uniformly across the pattern. Throughput considerations are also involved in choosing fracture parameters beyond the basic consideration of total pixel count. Increasing field or stripe size will reduce the total number of major stage steps involved and reduce the overhead associated with these moves. If registered writing is involved, increasing field or stripe size will reduce the total number of alignment sequences needed and reduce the associated registration overhead. In some lithography levels, there can be widely differing fracture requirements that may be better addressed by separating the CAD data into two separate fracture groups. A good example of such a situation is the gate level of a field-effect transistor. In this case, the CAD data are a mixture of large shapes (interconnections) suitable for fracturing with relatively coarse pixels and very fine shapes (gate lines) requiring much finer pixels. A good solution here is to fracture these shapes separately using the fracture parameters best suited for each group of shapes as a coarse-fine split. This will result in two binary exposure files that are overlaid by exposing separately using separate calibrations, a larger pixel size, and large beam current and spot size for the pads, and a smaller pixel size and smaller beam current and spot size for the gate lines. 
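The trade-off between pixel size, field size, and the number of field butts can be put into rough numbers. The sketch below uses the relation that the field size equals the pixel size times 2 to the power n for an n-bit DAC; the function names, the 16-bit DAC, the hardware limit, and the pattern dimensions are all hypothetical values chosen for illustration.

```python
import math

def field_size_um(pixel_um: float, dac_bits: int,
                  hardware_max_um: float = float("inf")) -> float:
    """Addressable field (or stripe width): pixel size times 2**n,
    capped by the maximum deflection range of the hardware."""
    return min(pixel_um * 2 ** dac_bits, hardware_max_um)

def field_count(width_um: float, height_um: float, field_um: float) -> int:
    """Number of square fields needed to tile a rectangular pattern."""
    return math.ceil(width_um / field_um) * math.ceil(height_um / field_um)

if __name__ == "__main__":
    # Hypothetical case: 10 mm x 10 mm pattern, 16-bit DAC, 2 mm hardware limit.
    for pixel in (0.0125, 0.025, 0.05):
        field = field_size_um(pixel, 16, hardware_max_um=2000.0)
        n = field_count(10_000.0, 10_000.0, field)
        print(f"pixel {pixel} um -> field {field:.1f} um, {n} fields")
```

A larger pixel gives fewer fields and therefore fewer butts and stage steps, at the cost of a coarser digital approximation of the CAD data.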
The separate binary exposure files can be generated by filtering the data according to polygon size or aspect ratio during fracture. Alternatively, the separate shapes can be assigned separate CAD layers during layout, allowing the layers to be fractured independently without filtering during fracture. A variation on the coarse-fine split technique is sleeving. Sleeving provides for the exposure of the perimeter of the coarse level of a coarse-fine split along with the fine level. This provides a continuous pattern outline without overlay. The center of the coarse level is filled typically with a small overlap of the fine level that allows for misregistration. This tolerance of misregistration permits the selection of faster, less accurate calibration and registration of the coarse level for improved throughput. The data file for the coarse exposure can be generated by fracturing while filtering for larger shapes and applying a negative bias. The data file for the
fine exposure can be generated by first fracturing while filtering for larger shapes and applying a negative bias to generate an intermediate core pattern. The final fine exposure file is then created by fracturing all data and performing a Boolean subtraction of the core pattern. The difference in bias chosen for the generation of the coarse data and the core data provides for the coarse-fine overlap. Typically sleeves are a few microns and coarse-fine overlaps are a few tenths of a micron.
For direct-write applications, CAD layout must also provide registration marks at the appropriate levels and positions. Reference mark levels must provide sufficient contrast to the scanning beam to provide accurate mark location, which may require the use of indirect referencing to a third, high-contrast level when aligning to a low-contrast level. Typical low-contrast levels include ion implantation and shallow mesa etches. High contrast is possible using high-atomic-number materials on low-atomic-number substrates or with deep, high-aspect-ratio etching. Note also that contrast decreases with increasing accelerating voltage. Placement of reference marks for direct-write involves some knowledge of exposure strategy. In-field registration will require marks to lie within the deflection field range minus the distance for mark overscan. Marks must be sufficiently isolated to allow for the length of alignment scans plus possible scan placement error, in order to avoid detection of adjacent feature edges. Additional special marks, possibly even an entire die, may be required for global substrate alignment. Multiple electron-beam lithography levels are best implemented using redundant alignment marks, as discussed below.
Placement Accuracy and Registration. Pattern placement accuracy is the accuracy of placement of an arbitrary pattern element with respect to an ideal coordinate grid.
This figure of merit is specified for an exposure that does not reference existing marks on the substrate, as is typical for mask or reticle exposure. Placement accuracy is ultimately tied to the laser interferometry system, as are all calibrations; however, many other factors limit placement accuracy. The total placement error of a pattern element can be specified as an error budget that includes errors in measurement of stage position, errors in beam position, errors due to thermal effects, and errors due to physical substrate distortion. These errors are uncorrelated and are thus added in quadrature. Errors in beam position can arise due to deflection noise and drift, deflection nonlinearity, column and/or substrate charging, and environmental factors, such as mechanically and acoustically coupled noise and stray magnetic fields. Thermal effects arise from temperature variations of mechanical components (for example,
stage, column, substrate, interferometer mounting) as well as electronic components (for example, amplifiers, lenses, power supplies). Stage positioning is accurate to within the resolution of the laser interferometer, for example, λ/1024 or 0.6 nm. The interferometer measures the position of the stage to which the substrate fixture is attached. Additional substrate position errors can arise from displacements of the fixture or substrate relative to the stage due to time-dependent temperature variations of these components during exposure. This error can be minimized by equilibrating substrate and fixture temperature and by using fixture materials with low thermal coefficients of expansion. Pattern distortions due to beam-placement effects are discussed later (see the section labeled Beam Deflection). Careful design of fixtures is required to minimize added substrate stress when fixtured and exposed. This stress will be relieved after removal from the fixture, and the resultant substrate relaxation will distort the pattern. Consideration in the error budget must also be given for substrate processing after exposure, which can change substrate stress, distorting the pattern. Registration uses the electron beam to locate a pre-existing reference mark on the substrate. The location of this mark is used for the relative placement, or registration, of the pattern. Factors leading to errors in registration include poor definition of the registration mark edges, poor mark-substrate contrast, charging of the substrate during mark acquisition, and interference of the resist covering the mark. Thick resist layers may reduce mark-substrate contrast and may result in local transient outgassing of residual resist solvent during mark acquisition. Poor mark contrast may be due to an insufficient difference in atomic numbers of mark and substrate materials or the use of high accelerating voltage. 
Pre-existing errors in registration mark placement must also be considered in a registration error budget. These errors may arise from placement errors of the lithography tool used for registration mark patterning, as well as subsequent substrate distortions due to processing. The process of mark registration necessarily exposes the alignment mark. This registration exposure is typically a sufficient dose to clear the resist where the mark has been scanned. This clearing leaves alignment marks vulnerable to destruction during pattern transfer. Some resists may also exhibit highly cross-linked behavior at alignment scan positions that results in residual resist following normal resist-stripping procedures. These mark-degrading effects indicate the need for redundant alignment marks where multiple levels of electron-beam lithography are performed on a substrate.
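The placement error budget described above combines uncorrelated contributions in quadrature, and a registration error budget can be tallied the same way when its terms are independent. The sketch below shows the arithmetic. The individual budget values are hypothetical placeholders, not measured figures, and the He-Ne wavelength of 632.8 nm is an assumption chosen because it is consistent with the λ/1024 ≈ 0.6 nm resolution quoted above.

```python
import math

def quadrature_sum(errors_nm) -> float:
    """Root-sum-square of uncorrelated error contributions."""
    return math.sqrt(sum(e * e for e in errors_nm))

# Assumption: He-Ne laser interferometer, wavelength 632.8 nm.
HE_NE_WAVELENGTH_NM = 632.8
interferometer_resolution_nm = HE_NE_WAVELENGTH_NM / 1024

# Hypothetical placement error budget (illustrative values only).
budget_nm = {
    "stage position measurement": 5.0,
    "beam position": 20.0,
    "thermal effects": 10.0,
    "substrate distortion": 15.0,
}

if __name__ == "__main__":
    print(f"interferometer resolution: {interferometer_resolution_nm:.2f} nm")
    total = quadrature_sum(budget_nm.values())
    print(f"total placement error: {total:.1f} nm")
```

Because the terms add in quadrature, the total is dominated by the largest contributor, so reducing a minor term has little effect on the overall budget.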
An important consideration for device fabrication is level-to-level registration accuracy, the registration accuracy of one level of lithography to a preceding level. The preceding lithography level may be exposed by means other than electron-beam lithography, for example, via optical lithography, referred to as mix-and-match lithography. An example is the use of optical lithography for a coarse feature level with subsequent higher-resolution pattern levels exposed using electron-beam lithography registered to alignment marks defined on the coarse feature level.
Resolution and Dimensional Control. Process resolution refers to the minimum feature dimension that can be fabricated on the substrate in a reproducible manner. This statement implies a measurable degree of process latitude, as well as an acceptable level of dimensional tolerance. The size of the smallest feature is known as the critical dimension. Usually parameters are chosen based on requirements to achieve the critical dimension, as these smallest features are most difficult to produce and are often the most important features of a pattern. Process resolution also implies that the resultant resist pattern is usable for pattern transfer and has the required profile, adhesion, thickness, and any other necessary condition. This is distinctly different from the requirements for achieving the minimum lithographic resolution, that is, the smallest feature that can be patterned in the resist. Minimum lithographic resolution dictates thinner resist due to scattering of electrons, while process resolution often dictates thicker resists for pattern transfer requirements such as liftoff, etch resistance, or planarization. Deviation of actual feature size from the design value occurs due to effects such as proximity, finite probe size, processing for pattern transfer, development, resist swelling, resist thickness variation, and improper exposure dose.
Dose variations can occur due to beam current drift or pattern-dependent proximity effects. Biasing is the process of incrementally increasing or decreasing the perimeter of the shape to compensate for feature size deviations. A negative bias can be used to compensate for probe size by reducing the shape perimeter by one half the probe size on all sides. For a positive resist, a negative bias can compensate for undercutting and erosion due to etching. For a negative resist, a positive bias can compensate for undercutting and erosion due to etching. The effectiveness of bias is reduced where the bias is a significant fraction of the overall feature size. The choice of a large pixel size during fracture can limit the ability to achieve fine bias, since the bias increments are integral pixels. The proper choice of exposure dose is necessary for both resist profile control and feature size control. Very large and very small features have
significantly different dose requirements, with smaller features requiring significantly higher doses. To properly accommodate these differing requirements, dose modulation can be associated with specific CAD features. Specific dose assignments can often be made at fracture time by filtering shapes according to size. Alternately, shapes with specific dose requirements can be placed on separate CAD layers or assigned unique CAD datatypes. Where supported by CAD tools, layers and datatypes can be specified independently for a given CAD feature. Corrections for proximity effects are necessarily more complicated than simple biasing, as they are dependent on the specific pattern and the backscattered dose interactions between adjacent pattern elements. Software is available to analyze CAD patterns and provide specific fracture requirements for segmenting of pattern elements for subsequent dose modulation. The software also calculates specific exposure dose modulation requirements for the specified pattern segmenting and may require a separate test pattern exposure to determine the condition-specific and material-specific backscatter contributions. The result is an approximation of uniform total dose within the pattern equal to the sum of exposure and backscatter doses. In pattern areas where interproximity locally increases backscatter dose, pattern exposure dose is reduced to compensate. Similarly, in pattern areas where intraproximity locally reduces backscatter dose, pattern exposure dose is increased. Throughput. A practical consideration of throughput must consider the time required in the exposure tool, as well as pre- and post-exposure time (for example, post-exposure baking, development, etc.), and fracture time. For simplification, this section concentrates on exposure tool considerations. Specific notation is made throughout the chapter where throughput is significantly affected by other considerations. 
The time required by the exposure tool is a combination of pattern writing time and overhead time, during which no writing occurs. Overhead time is incurred due to factors such as pumpdown and substrate exchange, thermal equilibration, calibration, machine and pattern-dependent settling times, and registration mark acquisition time. Different exposure tools with different exposure strategies can significantly impact throughput. For example, neglecting most overhead factors and assuming common exposure parameters, the rate of exposure per unit area is directly related to pattern density for vector scan systems and not at all related for purely raster scan systems. A comparison of different exposure strategies is shown in Fig. 9 and discussed in detail in Sec. 3.2 (see the subsection entitled “Sources”).
Figure 9. Comparison of pattern exposure strategies.
Raster scan systems can suffer a significant throughput disadvantage relative to vector scan systems for the case of low pattern density. Vector scan systems are ideally suited for lower pattern densities, as only defined pattern areas are addressed by the beam. Raster scan systems must address all possible exposure areas, specifically the entire grid of pixels, regardless of pattern density. Since the number of pixels in the grid is inversely proportional to the square of the pixel size, reducing pixel size can seriously impact throughput if additional clock speed is not available. Shaped beam systems (variable-aperture contiguous beam cross-section, as well as cell/mini-reticle projection) can provide significant throughput improvements over Gaussian beam systems through parallel pixel exposure. Moving-stage systems offer a throughput advantage over fixed-stage systems by eliminating the stage-move settling time in one axis. For a fixed-stage system, choosing the largest exposure field will reduce the total number of stage moves to expose the pattern, reducing the number of associated stage move settling times. The overhead impact of limiting field size occurs in both axes for fixed-stage systems, with the overhead time increasing proportionally to the inverse square of the deflection range. A small deflection range has significantly less throughput impact for a moving stage system, where stepping overhead is incurred only in one axis, increasing proportionally to the inverse of the range. Choosing the largest possible field is additionally efficient for registered writing, as it reduces the number of alignments per unit area, reducing the total associated alignment overhead.
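The scaling relations in this throughput discussion can be checked with rough numbers. The sketch below is a simplified model under stated assumptions: the writing time is taken as dose-limited (total charge delivered divided by beam current, so the dwell time per pixel is dose times pixel area divided by current), all overheads are ignored, and the dose, current, pixel, and field values are hypothetical. The function names are illustrative only.

```python
import math

def dwell_time_s(dose_uC_cm2: float, pixel_um: float, current_nA: float) -> float:
    """Dwell time per pixel needed to deliver the dose: t = D * p**2 / I."""
    pixel_area_cm2 = (pixel_um * 1e-4) ** 2
    return dose_uC_cm2 * 1e-6 * pixel_area_cm2 / (current_nA * 1e-9)

def raster_pixel_count(area_cm2: float, pixel_um: float) -> float:
    """A raster scan addresses every grid site, whatever the pattern density."""
    return area_cm2 / (pixel_um * 1e-4) ** 2

def dose_limited_time_s(dose_uC_cm2: float, exposed_area_cm2: float,
                        current_nA: float) -> float:
    """Lower bound on writing time: total charge divided by beam current."""
    return dose_uC_cm2 * 1e-6 * exposed_area_cm2 / (current_nA * 1e-9)

def fixed_stage_steps(width_um: float, height_um: float, field_um: float) -> int:
    """Stage moves for a fixed-stage system: fields tile in both axes."""
    return math.ceil(width_um / field_um) * math.ceil(height_um / field_um)

def moving_stage_passes(width_um: float, stripe_um: float) -> int:
    """Stage passes for a moving-stage system: stepping in one axis only."""
    return math.ceil(width_um / stripe_um)

if __name__ == "__main__":
    dose, current, area, density = 10.0, 10.0, 1.0, 0.10  # hypothetical values
    for pixel in (0.1, 0.05):
        t = dwell_time_s(dose, pixel, current)
        n = raster_pixel_count(area, pixel)
        print(f"pixel {pixel} um: dwell {t * 1e9:.0f} ns, "
              f"clock {1e-6 / t:.0f} MHz, {n:.1e} raster pixels")
    print(f"raster (all sites): {dose_limited_time_s(dose, area, current):.0f} s")
    print(f"vector (10% density): "
          f"{dose_limited_time_s(dose, area * density, current):.0f} s")
    for rng in (500.0, 250.0):
        print(f"range {rng} um: {fixed_stage_steps(1e4, 1e4, rng)} fixed-stage "
              f"steps, {moving_stage_passes(1e4, rng)} moving-stage passes")
```

In this model, halving the pixel size quadruples the raster pixel count and requires four times the clock rate for the same dose and current, while halving the deflection range quadruples the fixed-stage step count but only doubles the moving-stage pass count.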
One possible approach to reducing registration overhead is performing registration mark acquisition only in the center of a local matrix of fields and applying the measured corrections to all fields in the matrix.[18] The many machine and pattern-dependent factors affecting throughput (for example, pattern density, data exchange rates, stage motion, beam deflection rate) make estimation of throughput for various systems difficult. However, a limiting throughput value is readily obtained from available beam current versus resist sensitivity. For maximum throughput, the beam current should be the maximum value that has an associated spot size suitable for the desired feature size. Resist sensitivity is a significant factor in throughput, since ultimately a beam of a given current density can only be stepped as fast as the resist can be exposed. Continuing improvements in high-sensitivity resists can permit operation at or near maximum clock speed, while permitting lower beam currents and associated lower spot sizes compatible with high-resolution requirements. Increases in resist sensitivity for improved throughput are especially useful with
cell-projection systems, where beam current density is restricted to limit electron-optical-related pattern distortions. For field emission systems, with their superior brightness, higher beam currents are possible for a given spot size relative to thermal emitters, providing for higher exposure throughput if sufficient deflection speed is available. Defects and Contamination. A defect can be characterized as any unwanted feature in a pattern and can be further classified as random or nonrandom and fatal or nonfatal. Defects can arise from particulate contamination, exposure errors, adhesion, and pattern transfer. Typical random defects in resist patterns include voids, spots, edge intrusions, edge protrusions, bridges between features, breaks, and larger areas of residual resist. Typical nonrandom, repeating defects are due to fracture errors or dosing errors. Particulate contamination may be introduced on the substrate surface before resist coating by the various equipment, environments, and processes to which substrates are subjected. These encapsulated particles can result in local nonuniformities in the resist coating thickness, introducing critical dimension-control problems. Encapsulated particles may also impede pattern transfer processes. Particulates in general can contribute to additional electron scattering and associated dose variations. Particles may be introduced to the resist surface after coating by the external wafer handling environment, as well as the vacuum environment of the exposure tool. These surface particles may block the exposure entirely, leading to voids, intrusions, or breaks in the pattern with negative resists and spots, breaks, or intrusions in the pattern with positive resists. Surface particles may also be subject to charging, leading to pattern distortions. 
Exposure errors can arise from stitching errors, fracture errors, insufficient settling times, improper proximity correction, incorrect dose, noise, surface charging, data errors, calibration errors, or registration errors. Stitching and fracture errors are discussed in Sec. 2.3 (see subsection “CAD Considerations”). Insufficient settling times are indicated by oscillations or bends in specific pattern features, often seen repeatedly at the first exposed elements in a field. Beam-deflection settling times can be pattern-dependent, depending on whether there are large jumps between pattern elements. Improper proximity correction can lead to unacceptable tapering or linewidth variation, webbing of inside corners of features, or bridging of narrow gaps between features. Bridging and webbing may also occur due to improper dosing of the overall pattern when proximity correction is not employed. Excessive edge roughness in narrow features may be evidence of deflection system noise. Averaging of this noise is possible by
employing multiple scans, with the feature size increasing by the root mean squared value of the noise since the position errors of multiple scans are uncorrelated. Surface charging can lead to extreme pattern distortion with highly insulating substrates such as quartz or lithium niobate. Charging can be controlled by application of either organic or inorganic conductive overlayers. Random data errors may occur due to glitches in the pattern generator system or errors in calibration. Registration errors can lead to unacceptable pattern alignment, scaling, or rotation. Fractional field stepping can be employed to average deflection field and stitching errors, where each section of the pattern is written several times using different parts of the deflection field.
Defects due to resist adhesion to the substrate are often manifested as post-development distortions of narrow lines that relieve intrinsic resist stress by delaminating. Poor resist adhesion under conditions of pattern transfer may lead to defects in the transferred pattern that are not detectable in the developed resist image. A common example of this is ragged feature edges due to nonuniform adhesion failure during etching. Improvement of resist-to-substrate adhesion is typically accomplished through pre-resist-coating treatments such as hexamethyldisilazane (HMDS) vapor priming with silicon substrates, oxygen plasma cleaning, organic solvent cleaning, chemical etching, or UV-ozone cleaning. The resist may be hard baked after development to improve adhesion, but caution must be exercised, as many electron-beam resists plastically deform at relatively low temperatures. In some cases, it may be necessary to either dilute an etchant or switch etch chemistry.
For example, for polymethyl methacrylate (PMMA) or the copolymer of methyl methacrylate and methacrylic acid [P(MMA-MAA)] on GaAs, adhesion problems using a highly basic 1:5 NH4OH:H2O pre-evaporation oxide etchant were eliminated by further dilution or by switching to HF or HCl-based acidic etchants.
Process Latitude. Process latitude is an indication of the sensitivity of the end process result to variations in parameters. In the case of lithography, the result of interest is the final transferred pattern. A large process latitude is desirable as it reduces the requirement for parameter control for a given amount of process control. Lithography process latitude is highly dependent on the choice of resist. Pre- and post-exposure latency times, and latency times following post-exposure bake, if applicable, can be parameters that must be controlled with certain resists. Increasing accelerating voltage improves process latitude. Resists exposed with higher accelerating voltages are less affected by variations in dose. In addition, higher accelerating voltages reduce proximity effects that would otherwise further restrict the acceptable range of dose at lower voltages.
Pattern Transfer. Pattern transfer techniques are categorized as additive or subtractive. Additive processes add material to the substrate, while subtractive processes remove material. Examples of additive processes are ion implantation and liftoff. Examples of subtractive processes are wet etching and plasma etching, otherwise referred to as dry etching. Subtractive processes require good resist adhesion for the case of wet etching, or good resist etch selectivity over the etched material in the case of dry etching. Additive processes require thick layers in the case of ion implantation, or undercut profiles for the case of liftoff. The liftoff process may also demand good adhesion or dry etch selectivity of the resist, where the resist opening is used additionally for subtractive processing before the additive processing, as in the case of microwave transistor gate fabrication. The bias introduced by pattern transfer may be a resolution-limiting factor, as in the case of a narrow opening in a positive resist that is subject to undercut etching. In other cases, the bias introduced by the process may extend the resolution, as in the case of a bar of negative resist that is undercut etched to define an etched feature smaller than the resist feature.
3.0
ELECTRON-BEAM LITHOGRAPHY EQUIPMENT
3.1
Introduction
The two major subsystems of an electron-beam lithography system are the stage, used to position the substrate, and the electron optics, used for beam shaping and deflection. Ultimately, it is the electron beam that scans the substrate to generate the exposed image. However, practical design issues limit the scanning range of the beam using electron optics, requiring a stage that translates the substrate beneath the electron optics so that all areas are accessible to the beam. Two approaches exist for stage motion, fixed and moving. Commercially available systems have three types of beam shape and control strategies: Gaussian-spot raster scan, Gaussian-spot vector scan, and variable-spot/cell projection vector scan. Approaches other than these are discussed in Sec. 3.6, “New Technologies.”
The most critical component of the stage subsystem is a laser interferometer that can determine the absolute position of the stage in two axes. Orthogonal mirrors affixed to the stage permit position readout of both axes independently. The requirements for stage positioning precision are much less than for stage measurement precision, as small differences between desired and measured stage positions are easily corrected by minor
beam deflection offsets. The stage is driven by separate motors controlling each axis. For a fixed-stage exposure strategy, the stage motors have characteristic high acceleration to minimize the overhead time associated with stage movement. For a moving stage exposure strategy, the stage motor in one axis must provide smooth, continuous motion for the full scan length. Stage materials must be nonmagnetic or unwanted beam interactions can occur. Additionally, stage materials must have low thermal expansion coefficients to provide stable referencing between the mirror mounting points, the point of actual measurement, and the substrate fixture. Incorporated into the stage or substrate fixture are markers that permit interferometer-based calibrations and a Faraday cup for beam current measurement. The stage must be made such that the substrate remains orthogonal to the electron-optical axis, and the distance from the substrate to the electron optics along the electron-optical axis does not vary with stage position.
A backscattered electron detector system provides a signal level in response to scanning the electron beam across the substrate or marker, operating similarly to a scanning electron microscope. This signal can be analyzed for edge detection and subsequent acquisition of mark positions for both calibrations and registered writing. The detector can be either a solid-state diode or photomultiplier tube. In either case, the detector must be close to the beam-substrate intersection to have sufficient solid angle and resultant signal level. In the case of photomultiplier tubes, scintillators and light pipes are necessary to convey the signal to the remote, bulky tube body. Marks may be etch pits or mesas relying solely on topographical contrast in the case of a homogeneous mark and substrate material.
Since the backscattered yield is strongly related to atomic number, a mark of material with a significantly different atomic number than the substrate will provide significant additional atomic number contrast. This contrast will improve the overall alignment signal contrast significantly in the case of high atomic number marks on a low atomic number substrate.
The vacuum system is typically a mixture of ion pumps for the column and gun and turbomolecular pumps or diffusion pumps for the chambers. Turbomolecular pumps are preferred because a reduction of oil backstreaming into the system is necessary to limit the amount of hydrocarbon deposits that can build up charge and affect beam quality. In addition, the vacuum quality at the gun directly influences the life of the cathode. Typically a multiple-substrate load-lock is provided for substrate exchange without disturbing the vacuum level in the exposure chamber. The substrate exchange subsystem may also incorporate means for thermal equilibration of the substrate and fixture before loading on the stage.
A master control computer integrates vacuum control, substrate exchange, calibration and electron-optics control, and data handling and exposure control. Fractured data are streamed from the data file to the pattern generator buffer as the stage is stepped or scanned. The pattern generator output drives the deflection electronics and the beam blanker. The key components of the deflection electronics are the DACs that convert the digital pattern data to analog deflection signals. A master clock provides synchronization of beam blanking to pattern data transfer to the DACs. The clock speed determines the dwell time for the beam at an exposure site, allowing dose control.
3.2
Electron-Optical System
Introduction. The electron-optical system is arranged as a column, with the gun and emission chamber at the top. Together, the gun and emission chamber provide a source of electrons, accelerated to a well-defined energy and roughly collimated. The emission chamber is followed by beam-shaping optics, beam-blanking components, and different combinations of focusing lenses, deflection coils, and beam-limiting apertures. These components result in a beam that travels from source to substrate, with various divergences and convergences and associated crossovers. Variations in arrangement of these components can serve specific applications or reflect a design preference. Mechanical rigidity of the column is essential for control of the beam position at the substrate, since only deviations of the position of the stage are monitored and corrected. Typically, the parts of the column adjacent to the electron beam are fitted with removable platinum liners, especially where insulating materials are present that may charge. These liners can be flame-cleaned during maintenance to remove organic contamination that can charge and cause beam drift.
Sources. From a physical viewpoint, the electron source is composed of the emission chamber, which is a high-vacuum vessel, a gun cap that seals the top of the emission chamber, a high-voltage insulator suspended from the gun cap inside the emission chamber, a Wehnelt assembly suspended from the insulator, and an anode at the base of the emission chamber. From an electrical viewpoint, the electron source is a triode circuit of cathode, grid, and anode. The term electron gun best describes the components of this triode circuit. The Wehnelt assembly houses the electron emitter, which acts as the cathode, and a spray aperture, or Wehnelt electrode, that serves as the grid. The cathode is held at a negative voltage
with respect to the grounded anode, typically ranging from –10 to –100 kV. The grid is held at a negative bias relative to the emitter that serves to focus and regulate the electron current passing through its aperture. The focusing effect results in a crossover of the electron beam, as shown in Fig. 10. The roughly focused beam of electrons is accelerated by the accelerating voltage between cathode and anode, and passes through an aperture in the anode and into the electron-optical column. A narrow spread in accelerating voltage is required to minimize the effects of chromatic aberrations in the electron optics.
Figure 10. Electron gun schematic. For clarity, emitter heating bias is not shown.
Brightness is a figure of merit that, for a source of charged particles, is defined as the current density per unit solid angle.[19] For the electron gun, brightness indicates the number of electrons passing through a limited area spot with a certain convergence angle. The convergence angle is significant, because current caused by electrons passing through an area (for example, the first beam crossover) at a large angle is of little use due to limitations in the electron optics. The brightness β is expressed as $\beta = 4I_0/(\pi^2 d_0^2 \alpha^2)$, where $I_0$ is the current through the spot with diameter $d_0$ and
convergence half angle α. The brightness is conserved throughout the electron-optical column and is affected by thermal emitter operating temperature, cathode to Wehnelt electrode spacing, and bias voltage. One definition of $d_0$ is the distance between the points where the beam current distribution tails drop to 80% of the peak value. For a Gaussian distribution, the area defined by this diameter contains 85% of the total available current.[20] Practical limitation of the brightness arises from limitations on convergence half angle resulting from lens aberrations. A practical measurement of spot diameter is often made by deflecting the beam across a high contrast knife edge. Since the backscattered electron signal intensity is proportional to beam current, measurement of this signal as a function of beam position across the knife edge gives a first order approximation of the radius $d_0/2$ by measuring the distance required for the signal to go from the maximum value to the desired percentage drop, for example 80% of maximum intensity. Note that this measurement includes the distance introduced by any finite slope of the knife edge. The edge slope δ can be approximated by measuring the beam radius $R$ at two different lens demagnifications. $R_1$ is measured at a low demagnification $M_1$ where $R_1 \gg \delta$. $R_2$ is measured at a high demagnification $M_2$ where $R_2$ is of the same order of magnitude as δ. The edge slope δ is independent of demagnification and the demagnification ratio $M_1/M_2$ is known from the electron optics design, thus δ can be determined from the approximation $(R_1 + \delta)/(R_2 + \delta) = M_1/M_2 \cong R_1/(R_2 + \delta)$. Electron emitters may be field emitters or thermal emitters. Thermal emitters are available using W, LaB6, or CeB6 materials. All are commonly referred to as filaments, although only the W emitter has the characteristic hairpin filament shape using a bent W wire.
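As a worked illustration of the brightness expression given above, the following minimal sketch evaluates β for a given spot and inverts the same relation to estimate the current available in a spot at fixed brightness. The function names and the numerical values (1 nA, 10 nm, 5 mrad) are assumptions chosen for illustration and are not taken from the text.

```python
import math

def brightness(i0_amps, d0_cm, alpha_rad):
    """Brightness in A/cm^2/steradian: beta = 4*I0 / (pi^2 * d0^2 * alpha^2)."""
    return 4.0 * i0_amps / (math.pi ** 2 * d0_cm ** 2 * alpha_rad ** 2)

def spot_current(beta, d0_cm, alpha_rad):
    """Invert the same relation: current through a spot at brightness beta."""
    return beta * math.pi ** 2 * d0_cm ** 2 * alpha_rad ** 2 / 4.0

# Assumed example: 1 nA through a 10 nm (1e-6 cm) spot at 5 mrad half angle.
beta = brightness(1e-9, 1e-6, 5e-3)
print(f"brightness ~ {beta:.2e} A/cm^2/steradian")

# With brightness conserved, halving the convergence half angle at the same
# spot diameter leaves one quarter of the current in the spot.
ratio = spot_current(beta, 1e-6, 2.5e-3) / 1e-9
print(f"current ratio after halving alpha: {ratio:.2f}")
```

Because the current scales with the square of the half angle, an aperture that limits the angle to reduce aberrations also reduces the usable current, which is the practical brightness limitation noted above.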
LaB6 and CeB6 cathodes use a crystal that is typically polished to a sharp cone for Gaussian-beam systems or a truncated cone for large-area beams. In order of increasing vacuum requirements, operating lifetime, brightness, spot resolution, and cost, the emitter types are W thermal emitters, LaB6 and CeB6 thermal emitters, and field emitters. Resistance heating is used to achieve the high temperatures required for useful thermal emission. This high operating temperature, 2700 K for W and 1800 K for LaB6 and CeB6, leads to evaporation of the thermal emitter material, which limits operating lifetime. W emitters are poor overall choices due to their short lifetimes (200–400 hr), low brightness ($10^5$ A/cm$^2$/steradian), and poor mechanical stability. LaB6 and CeB6 emitters provide increased lifetime (1000–3000 hr), improved brightness ($10^6$ A/cm$^2$/steradian), and better mechanical stability using rigid emitter-crystal mounting techniques. CeB6 emitter crystals provide for reduced
evaporation rates and improved lifetimes relative to LaB6 emitters. Operating lifetime of thermal emitters is also improved by increasing tip radius or using a flattened tip, with a penalty of slightly increased minimum spot size. Field emitters provide the longest lifetime (over 8000 hr) and the highest brightness ($10^8$–$10^9$ A/cm$^2$/steradian). Gun vacuum requirements range from $10^{-7}$–$10^{-8}$ Torr for W, LaB6, or CeB6, to $10^{-10}$ Torr for field emitters. A bakeable emission chamber is a requirement for the vacuum levels for field emitters and a useful feature for LaB6 and CeB6. Source lifetime can be significantly improved through better vacuum. As with the overall column, mechanical stability of the tip is essential for accurate beam positioning. For some thermal emitter guns, the Wehnelt electrode bias is automatically controlled through a resistor connection to the cathode. In this mode, an emission current variation impresses a bias change at the Wehnelt electrode that counteracts the current variation and stabilizes emission. The level of emission can be varied by varying the bias resistor. The setup of thermal emitters for stable operation is possible using the proper selection of Wehnelt electrode bias voltage where this bias is independently controlled. As Wehnelt bias voltage is made increasingly negative, eventually a cutoff bias is reached where the gun emission is completely eliminated. As the bias voltage is varied slightly above cutoff, a very stable beam current and spot size minimum can be found at a bias point that is largely independent of cathode temperature. As cathode temperature is increased, the beam current at this stable bias point will also increase, allowing the brightness to be adjusted. Unfortunately, this stable bias point is also a brightness minimum for a given operating temperature.
The brightness can be maximized at a given temperature by making the bias voltage less negative until the beam current just peaks, although the spot size will be less stable and slightly larger.[21] Field emitters are characterized by finely sharpened emitters with submicron tip radius.[22] The sharpened tip concentrates the electric field at the tip sufficiently to overcome the surface vacuum work function and extract electrons. The resulting emission current is strongly affected by surface work function variations that occur due to contaminant adsorption, making emission stability strongly related to vacuum quality. The high field at the tip makes it susceptible to sputtering by beam-ionized gas atoms, also dictating operation in ultra-high vacuum. Reduced susceptibility to contaminants is possible through heating of the tip, which reduces contaminant adsorption. The terms thermal field emission and cold field emission
are used to distinguish between heated and nonheated tips respectively. As with other heated emitters, the thermal field emission tip suffers from evaporative loss of material, which must be replenished for extended operating life. General characteristics of field emitters are nanometer-scale source size, high brightness, and low thermal energy spread.[23] The small source size is an advantage for spot beam systems since less source demagnification is required to achieve a small exposure spot, but it is a disadvantage for shaped-beam systems, where broad, uniform beam current distribution is necessary. The low thermal energy spread of field emitters relative to thermal emitters (0.2–0.9 eV versus 2–3 eV) reduces the beam broadening due to chromatic lens aberrations (see Sec. 2.3, subsection “Defects and Contamination”). The configuration of the thermal field emitter gun differs from a thermal emitter gun, as it includes a suppressor electrode to limit beam current due to thermionic emission from regions other than the tip and an intermediate anode to generate the high required extraction field.[24] Apertures. Apertures are used in various positions throughout the electron-optical column. A spray aperture placed in the diverging beam of the electron source limits the solid angle of the beam and thus limits the extreme off-axis beam components that are subject to the largest lens aberrations. Further beam-limiting apertures may be placed at beam divergence points following a crossover. A common configuration is a provision for an externally selectable final aperture that permits the selection of ranges of beam currents, depending on aperture size. The final aperture selector allows for alignment of the selected aperture with the optical axis of the column and the removal of the aperture set for cleaning. Shaped-beam systems may employ one or more beam-shaping apertures defining a specific beam cross-section.
Shaping apertures may be used in combination with shaping deflectors and lenses to provide a variable beam shape and size. A grounded aperture may be used as a beam stop in a beam-blanking system. Apertures may be heated in situ to reduce the condensation of contaminants that can degrade beam quality through charging. Beam Blanking. A beam blanking system providing 100% modulation of beam current is required to prevent unwanted exposure when the beam is moved between noncontiguous areas of the pattern, during electronic or mechanical settling times, while the stage is being positioned correctly, or simply when the exposure is completed. Most beam blanking systems use electrostatic deflector plates that are biased during blanking such that the beam is directed away from the electron-optical axis and into a grounded stop, such as an aperture. The blanking must be carefully timed
to coordinate with the beam deflection. As deflection speeds increase, so do requirements for beam-blanking response time and phase lag of the blanking signal. Additionally, the blanking response should not result in any imaged beam deflection at the substrate. This condition, called conjugate blanking, is maintained by placing the blanking deflection pivot point at a crossover point. This works because the electrons traveling at an angle from the electron-optical axis at the crossover are already focused at the sample by the electron optics. A typical blanker calibration involves adjusting the current of the beam blanker crossover lens to minimize beam spot motion at the substrate while the blanker is modulated. Lenses. Lenses are critical components of the electron-optical system, serving several functions. Magnifying lenses used with shaped-beam systems help provide a more uniform distribution of current density within the beam cross-section. Shaping lenses help form variable size or shape beam cross-sections, even for Gaussian-beam systems where a multiple-sector shaping lens is typically used to correct for beam astigmatism. Demagnifying lenses control the ultimate spot size and current at the substrate, with multiple lenses for systems with large demagnifications. Unlike optical systems, where lens properties are fixed by lens shape and material (refractive index), electron optics use electromagnetic fields that can vary in strength, allowing for electronically variable properties, such as focal length and deflection angle. The typical electromagnetic lenses used suffer from various aberrations that result in undesired variations in the expected shape and position of the beam.
Spherical aberration is an inherent characteristic of magnetic lenses due to the radial variation in magnetic field strength inside the lens bore, resulting in a variation in focal plane as the electron trajectory through the bore is moved off axis.[25] The impact of this variation at the plane of the substrate is an increased beam diameter. Spherical aberrations can be reduced by using a pre-lens aperture to reduce the off-axis beam component at the expense of reduced beam current. Chromatic aberrations arise from the dependence of the Lorentz force of the lens magnetic field on the electron velocity. Variations exist in the energy, and hence in the wavelength and velocity, of the emitted electrons due to ripple in the accelerating voltage power supply, as well as due to intrinsic thermal effects. Chromatic aberrations manifest these velocity variations as variations in focal length with a ∆V/V dependence, where V is the accelerating voltage. Other aberrations exist due to geometrical errors in the shape of the pole piece, mechanical errors due to lens misalignment, and errors due to instabilities and nonlinearities of lens excitation. An inherent feature of the Lorentz force
in a magnetic lens is the rotation of the electron trajectories about the optical axis. This effect results in a substrate height-dependent image rotation. For low-energy exposure systems (5 kV and below), a low-energy column is particularly susceptible to chromatic aberrations that are ∆V/V dependent since V is reduced. The lower energy beam in such a column is also more susceptible to stray magnetic fields. One method of reducing these low-voltage column difficulties is to use a higher voltage column followed by a retarding field which reduces the electron energy at the substrate.[26] Beam Deflection. Beam deflection is possible using magnetic coils or electrostatic plates, both affected by aberrations that result in deviations from a linear response. The placement of the deflection unit can be ahead of, within, or following the final lens. Pre-lens deflection allows for the shortest final lens to substrate separation, or working distance, which permits the lowest aberrations and spot sizes but increases deflection-dependent spot distortion, limiting usable deflection range. This configuration is most common for scanning-electron microscopes, where distortion requirements are less demanding than for electron-beam lithography and where resolution can be maximized by using working distances approaching zero. As the deflection unit is moved to in-lens and then post-lens, the distortions and spot size increase, but this is a tradeoff for an increase in deflection range. Dynamic response of the deflection unit must also be considered as this determines the maximum pattern writing speed. Hybrid approaches are possible, allowing the use of slower (lower distortion) magnetic deflection for sub-field placement over large deflections, combined with faster electrostatic deflection for shape writing over a distortion-limited deflection range. The total displacement of the beam is the superposition of the separate deflections.
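The superposition of a slow, large-range deflection and a fast, small-range deflection can be sketched as follows. This is a minimal one-axis illustration under stated assumptions: the address is a plain binary word, the low `minor_bits` bits drive the fast (for example, electrostatic) deflector, the remaining bits drive the slow (for example, magnetic) deflector, and all function names and bit widths are hypothetical.

```python
def split_address(address, minor_bits):
    """Split an absolute pixel address into a slow major word (subfield index)
    and a fast minor word (pixel within the subfield)."""
    return address >> minor_bits, address & ((1 << minor_bits) - 1)

def beam_position_nm(major, minor, minor_bits, pixel_nm):
    """Total displacement is the sum of the two separate deflections."""
    major_deflection = major * (1 << minor_bits) * pixel_nm  # slow, large range
    minor_deflection = minor * pixel_nm                      # fast, small range
    return major_deflection + minor_deflection

def update_counts(total_bits, minor_bits):
    """Count how often each level changes during a full raster of one axis."""
    major_updates = 0
    minor_updates = 0
    last_major = None
    for address in range(1 << total_bits):
        major, _ = split_address(address, minor_bits)
        minor_updates += 1
        if major != last_major:
            major_updates += 1
            last_major = major
    return major_updates, minor_updates

major, minor = split_address(1000, 4)
print(major, minor, beam_position_nm(major, minor, 4, 10))
print(update_counts(10, 4))
```

In this sketch the slow deflector is updated once for every 2^4 updates of the fast deflector, which is the source of the relaxed frequency-response requirement on the lower deflection level.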
The required frequency response of lower levels of deflection is reduced by a factor as large as $2^n$ (maximum when higher order deflection is raster scan, addressing all possible high-order pixels) where the higher-level deflection addresses $2^n$ pixels. The commonly corrected errors in a deflection, or scan, field are shown in Fig. 11. These are x and y gain (i.e., x position is a function of x only), x and y rotation (i.e., x position is a function of y position), x and y keystone (i.e., x position is a function of xy), and pincushion (i.e., x position is a function of $x^2$, $y^2$, $x^2y^2$, $xy^2$, and $x^2y$). The ability to adjust the gain, rotation, and keystone properties of the deflection field is necessary for the matching of pre-existing substrate coordinate systems in direct-write applications.
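The error terms listed above can be written as a single polynomial in the field coordinates. The sketch below is a minimal illustration, assuming normalized field coordinates in the range -1 to 1 and hypothetical coefficient names and values; it is not the correction scheme of any particular system.

```python
def x_error(x, y, c):
    """x-position error at normalized field coordinates (x, y).

    Each coefficient in the dict c (names are illustrative) weights one of
    the commonly corrected terms: gain, rotation, keystone, pincushion.
    """
    return (c.get("gain", 0.0) * x
            + c.get("rotation", 0.0) * y
            + c.get("keystone", 0.0) * x * y
            + c.get("pin_x2", 0.0) * x ** 2
            + c.get("pin_y2", 0.0) * y ** 2
            + c.get("pin_x2y2", 0.0) * x ** 2 * y ** 2
            + c.get("pin_xy2", 0.0) * x * y ** 2
            + c.get("pin_x2y", 0.0) * x ** 2 * y)

def corrected_command(x, y, c):
    """First-order correction: subtract the predicted error from the command."""
    return x - x_error(x, y, c)

# Assumed coefficients, expressed as a fraction of the half field size.
coeffs = {"gain": 1e-3, "rotation": -5e-4, "keystone": 2e-4, "pin_xy2": 3e-4}
for corner in [(-1, -1), (1, -1), (-1, 1), (1, 1)]:
    print(corner, x_error(corner[0], corner[1], coeffs))
```

An equivalent function with its own coefficients would describe the y-position error. Evaluating the errors at the field corners, as done here, shows how each term distorts the square field outline of Fig. 11.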
The simplest beam deflection systems result in increasing beam angle relative to the substrate normal as the beam is deflected off the electron-optical axis. A telecentric deflection system provides for normal beam incidence both on and off axis, i.e., under deflection. Telecentric deflection reduces or eliminates the requirements for deflection gain corrections as a function of substrate height variation by eliminating the geometrical effect (see Fig. 12), although the total available deflection range is less than for many non-telecentric systems.
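The geometrical effect of substrate height on deflection gain can be illustrated with a simple straight-ray model. This is a sketch under stated assumptions: the non-telecentric beam is treated as pivoting about a single point on the axis at a height `pivot_height` above the nominal substrate plane, and the numerical values are hypothetical.

```python
def landing_position(x_nominal, pivot_height, dh, telecentric=False):
    """Beam landing position for a substrate raised by dh above the nominal plane.

    Non-telecentric model: a straight ray from a single on-axis pivot point,
    so the landing position scales with the remaining distance to the pivot.
    """
    if telecentric:
        return x_nominal  # normal incidence: no height dependence
    return x_nominal * (pivot_height - dh) / pivot_height

def gain_correction(pivot_height, dh):
    """Factor applied to the deflection command to restore the nominal position."""
    return pivot_height / (pivot_height - dh)

# Assumed numbers: 40 mm pivot height, 250 um deflection, substrate 10 um high.
pivot_um, x_um, dh_um = 40_000.0, 250.0, 10.0
error_nm = (landing_position(x_um, pivot_um, dh_um) - x_um) * 1000.0
print(f"uncorrected placement error: {error_nm:.1f} nm")

corrected = landing_position(x_um * gain_correction(pivot_um, dh_um), pivot_um, dh_um)
print(f"after gain correction: {corrected:.3f} um")
```

In this model the placement error grows linearly with both deflection and height deviation, which is why a measured substrate height can be used to rescale the deflection gain.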
Figure 11. Deflection field distortions.
Figure 12. Deflection gain and focus dependence on substrate height.
In some systems, the scan field size is scalable. Usually, at least a separate scan field size is available for each range of accelerating voltage (decreasing with increasing accelerating voltage). If the maximum scan field at a given accelerating voltage can be reduced, the pixel resolution will be improved. If the adjustment is continuous, then arbitrary pattern grids can be accommodated. Scalable scan fields are particularly useful for grating patterns of arbitrary pitch, for example, for determining the precise lasing wavelength of a distributed-feedback laser. Separate deflection systems are used for beam shape and size variation in variable-shape beam systems. The beam is typically shaped by an initial aperture and then deflected to have a varying amount of overlap onto a second aperture. There is a necessary settling time associated with a change in beam shape, as well as additional calibrations for beam sizing. Deflection-Based Corrections. Deflection of electrons over a flat substrate poses many pattern integrity challenges due to nonlinearities and aberrations of electron optics. This leads to deflection-dependent focus and to pincushion distortion of the uncorrected projected image (see Figs. 11 and 12). Interactions of the deflected beam with downstream lenses can cause deflection-dependent field rotation. Also, deflection axes orthogonality errors may exist if the respective deflection coil or plate pairs for the two axes are not perpendicular to each other. Additionally, the stigmation is a function of beam deflection. Deflection-based distortion, focus, and stigmation errors are largest for large deflections. The simplest approach to minimizing these errors is to restrict deflection range, although this may impact throughput, as discussed in Sec. 2.3, in the subsection “Throughput.” The small-field approach is less desirable where the additional field boundaries may restrict coherent patterns, such as gratings.
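For a grating, the scan field size sets the pixel grid on which the pitch must fall. The following minimal sketch, assuming a hypothetical 16-bit deflection DAC and a continuously scalable field, chooses the largest field size not exceeding a given maximum for which the grating pitch is an integer number of pixels. The function name and the numerical values are illustrative only.

```python
import math

def field_for_pitch(pitch_nm, max_field_nm, dac_bits=16):
    """Return (field_nm, pixel_nm, pixels_per_period) such that the pitch is an
    integer number of pixels and the field does not exceed max_field_nm."""
    n_addresses = 1 << dac_bits
    # Fewest pixels per period that keeps the scaled field within the maximum.
    pixels_per_period = math.ceil(pitch_nm * n_addresses / max_field_nm)
    pixel_nm = pitch_nm / pixels_per_period
    return pixel_nm * n_addresses, pixel_nm, pixels_per_period

# Assumed example: 240.3 nm grating pitch, 500 um maximum field.
field_nm, pixel_nm, n = field_for_pitch(240.3, 500_000.0)
print(f"field {field_nm / 1000:.1f} um, pixel {pixel_nm:.4f} nm, {n} pixels per period")
```

With a fixed field size, the same pitch would have to be rounded to the nearest pixel, which shifts the realized grating period.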
Corrections to these deflection-dependent errors allow the expansion of the usable deflection range while maintaining the desired error budget for pattern placement. Advanced systems provide for deflection-based corrections of some or all of distortion, focus, image rotation, and stigmation. An average value for a correction parameter may be measured in the center of each subfield, or a matrix of points may be measured for each correction parameter to determine the coefficients of a correction polynomial. Height Measurement. Nontelecentric deflection systems, by definition, have a nonorthogonal intersection of beam and substrate that leads to variations in scan field size as the height of the substrate is changed. Height measurements of the substrate allow for the scaling of deflection gain to compensate for the height-dependent field size variation, as well as
height-dependent focus variation (static) for a limited range of height variations (see Fig. 12). Measurements are taken with capacitive probes, which are scanned over the substrate before exposure, or with laser beams reflected off the substrate surface. The capacitive probe interferes with the beam and requires a mapping of substrate height prior to pattern writing, which adds throughput overhead. Laser height measurement systems do not interact with the electron beam and can operate simultaneously with pattern exposure. Because the substrate must reflect the laser beam, difficulties may be encountered with certain resist or substrate materials. One possible solution to reflection difficulties is the use of a thin (20 nm) inorganic reflection layer such as Ge, although this may introduce unacceptable increases of forward scattering and reductions of mark acquisition contrast. The laser wavelength must be chosen such that the resist is not exposed, which is the case for the commonly employed 670 nm laser.
3.3 Writing Strategies and Architecture
Stage Motion. In a fixed-stage approach, pattern writing occurs with the stage fixed. The beam is scanned to expose all parts of the pattern that lie within the exposure field, the range that is accessible using only beam deflection. When a field exposure is completed, writing is suspended and the stage is stepped to the next field position and the process repeated. The exposure fields are square, and the complete pattern is exposed as a mosaic of fields stepped out in both axes (see Fig. 2). Stages for these systems operate with large accelerations and have associated settling times following deceleration before pattern writing may proceed. Fixed-stage exposure systems are capable of a much larger scan range, or field size, relative to moving-stage systems. The large field capability helps to limit the total amount of stage stepping and field registration overhead time incurred, but at the same time is subject to larger deflection-based errors. Deflection-based corrections to focus, stigmation, and distortion may be provided to the electron optics to minimize these errors. Corrections can be implemented in software as either lookup tables or correction polynomials. For a given deflection value, a digital correction word is either retrieved from the table or calculated using the polynomial coefficients. These correction data are provided to an associated correction DAC, resulting in a correction signal that may be fed to the major deflection subsystem via summing amplifiers or that may be fed to a separate correction subsystem. An example of this latter case would be a focus correction signal that is fed to a separate dynamic focus correction coil,
capable of only a limited range of focus variation. Separate correction subsystems with significantly limited operating ranges provide improved operating bandwidth and linearity relative to the associated major subsystem with its corresponding larger operating range, an important factor for dynamic corrections. In the typical moving-stage approach, the pattern is written as the stage is scanned in one axis across the entire pattern (see Fig. 2). The primary beam deflection is orthogonal to the stage scan direction; the secondary beam deflection is in the direction of stage travel for the purpose of tracking substrate position. At the end of the scan, writing is suspended and the stage is stepped in the other axis to the next scan position and the process repeated. In this case, the complete pattern is made up of a linear array of stripes. The width of the stripe (i.e., the deflection range) is typically smaller than the deflection field size for a fixed-stage system, resulting in lower deflection-dependent distortions. Unique approaches to moving stage systems exist for application to the curvilinear extended shapes of integrated optics devices.[27] System approaches implement a nominally on-axis beam with the stage motion tracing out the optical element. An advantage of such systems is that the position errors are averaged over the entire feature, rather than being concentrated at field or stripe boundaries. Deflection. Deflection systems or subsystems perform pattern exposure with either vector scan or raster scan. In a raster scan system, the beam addresses all points in a two-dimensional exposure grid along a raster path, with the pattern data sorted according to the raster path and fed to the beam blanker with beam deflection controlled independently of pattern shape. In a vector scan system, the beam addresses only points within the pattern, with the pattern data determining both the beam blanking and beam deflection control.
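The difference between the two scan modes can be made concrete with a small grid. In this minimal sketch the pattern is a list of hypothetical non-overlapping rectangles `(x0, y0, width, height)` in pixel units; the raster function produces the blanking stream in raster-path order, while the vector function produces only the addressed points.

```python
def in_pattern(x, y, rects):
    """True if grid point (x, y) lies inside any pattern rectangle."""
    return any(x0 <= x < x0 + w and y0 <= y < y0 + h for x0, y0, w, h in rects)

def raster_blanking_stream(width, height, rects):
    """Raster scan: every grid point is visited row by row; the pattern only
    decides whether the beam is unblanked (True) at each point."""
    return [in_pattern(x, y, rects) for y in range(height) for x in range(width)]

def vector_points(rects):
    """Vector scan: only points inside the pattern are addressed."""
    return [(x, y)
            for x0, y0, w, h in rects
            for y in range(y0, y0 + h)
            for x in range(x0, x0 + w)]

rects = [(2, 1, 3, 2), (10, 5, 4, 1)]  # hypothetical pattern on a 16 x 8 grid
stream = raster_blanking_stream(16, 8, rects)
points = vector_points(rects)
print(len(stream), sum(stream), len(points))
```

Both modes expose the same pixels, but the raster scan spends time on every grid point regardless of pattern density, whereas the vector scan time depends on the pattern.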
Hybrid combinations of vector and raster scanning are possible where an exposure area reference point is positioned in a vector fashion and the shapes contained in this area are exposed in a raster fashion. A raster scan covering the entire deflection range provides simplification of moving stage exposure logistics, as the deflection coverage rate per unit area is pattern independent and allows the stage to move at a nearly constant rate. Improvements in throughput and hardware requirements are possible using hierarchical deflection strategies. In a multiple-level deflection hierarchy, the total deflection signal to place the beam is a result of multiple DAC outputs generated from multiple address words. These outputs may be summed electronically and fed to a common deflection system, or they
may be directed to entirely separate deflection subsystems. The lowest-level address word is responsible for the largest range of deflection, with higher levels responsible for increasingly smaller ranges. Limiting the deflection range of the higher levels permits higher operating bandwidth and linearity, and a given number of bits will address this range with higher resolution. Addressing of lower levels of deflection can then be done at coarser resolution, with a given number of addressing bits providing a larger scan range. Lower level addressing is also done at lower data rates, since these data are updated only after all higher level addressing is completed for the given lower level input (for example, all elements of the subfield are addressed before addressing the next subfield). Resolution and bandwidth requirements of lower levels are thus significantly reduced relative to the higher levels. Only the highest-level addressing DAC must operate at the dose-limited clock speed. Throughput is improved by separation of deflection-dependent settling times. The largest times are needed only for the lower deflection levels, where large deflection steps are made. Figure generation by the highest level of beam deflection is possible using either vector scan or raster scan. An example of a two-level hierarchical deflection system uses the lower level deflection DAC to position a figure corner and the higher level DAC to generate the figure. Another example of a two-level hierarchical deflection system is the use of the lower level deflection DAC to position a subfield and the higher level DAC to generate all subfield figures. An example of a three-level hierarchical deflection uses the lowest level to position a subfield, the middle level to position a figure in the subfield, and the highest level to generate the figure. Beam Shape. Vector scan systems can use either a Gaussian spot or a shaped beam, while raster scan systems are limited to a Gaussian spot beam.
A shaped-beam system can rapidly change the size and shape of the beam during patterning. These changes expose multiple pixels simultaneously, achieving a significant throughput advantage over spot beam systems, where large shapes are patterned. The advantages of a shaped beam over a Gaussian beam are compounded in considering the ability to increase beam current, as the beam cross-section is increased from a spot to an extended area. Since more current can be made available in the larger beam, the dwell time for a given dose does not increase proportionally to the beam area. In a variable shape approach, the pattern generator must provide beam size and shape information in addition to figure placement information. However, if larger numbers of pixels are exposed for a given shaped-beam address word, then data exchange and deflection bandwidths
are correspondingly lower than for Gaussian beam systems. The throughput advantage of variable beam shape is reduced when the pattern elements become smaller than the maximum beam size, since the beam shape is restricted to a contiguous region. The beam shape and size are typically defined by sequential apertures. The beam from the first aperture is projected onto the second aperture with an electronically variable overlap position, providing for rapid size and shape control. Gaussian spot beam systems are capable of the finest resolution, with beam diameters as low as 5 nm possible with currently available field-emission sources. Systems with suitable electron optics can vary the beam current, increasing the beam diameter by more than an order of magnitude. This spot range can help offset spot beam throughput limitations through a coarse-fine exposure split strategy. This strategy requires the separation of pattern data into higher throughput and high-beam-current coarse features, with only the fine features exposed with the low beam current at lower throughput (see Sec. 2.3, subsection “CAD Considerations”). Currently produced shaped-beam systems are limited to producing minimum shape dimensions of 150 nm. Cell (also known as character, block, or mini-reticle) projection approaches provide a shaped-beam cross-section, using a specific shaped stencil mask in an aperture position in the column (Fig. 13). Unlike the more common variable-shaped beam system, the cross-section of the beam produced by cell projection can be a complex combination of noncontiguous shapes. Multiple cross-sectional patterns are available by deflection of the beam to different patterns or “cells” on the stencil mask. The stencil pattern of the cell is demagnified by 25–100 times, simplifying high-resolution fabrication of pattern details in the stencil.
The demagnified cell dimensions, which define the beam cross-section, are limited to the order of a few microns to minimize the geometrical aberrations of the off-axis beam components. In addition, electron-electron interactions (Coulomb interaction) at high current densities (image crossover points) also lead to image distortions that can limit the total usable current density and throughput.[28] Reduction of the illuminated area of the cell for each exposure shot or reduction of the stencil linewidth with a corresponding increase in shot dose are demonstrated approaches to reducing Coulomb effects.[29] The stencil mask is readily produced with existing semiconductor processes and materials. The patterns in the stencil mask cells are chosen to match repeating patterns in the CAD data. A rectangular stencil cell allows operation like a conventional variable shape beam system for exposure
of nonrepeating pattern elements. The throughput advantage of cell projection over shaped beams is reduced greatly for CAD designs with limited repeating pattern elements. However, for repetitive patterns such as DRAMs, cell projection provides a significant throughput advantage.
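The relative shot counts of the three beam-shape approaches can be compared with a simple tally. This is a rough sketch under stated assumptions: the pattern is a hypothetical repeating cell of rectangles given as `(width, height)` in nanometers, the spot beam exposes one pixel at a time, the variable-shaped beam tiles each rectangle with shots up to a maximum size, and the whole cell fits in one stencil cell. Dose, settling, and overhead times are ignored.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)

def spot_exposures(rects_nm, pixel_nm):
    """Spot beam: one exposure per pixel inside each rectangle."""
    return sum(ceil_div(w, pixel_nm) * ceil_div(h, pixel_nm) for w, h in rects_nm)

def vsb_shots(rects_nm, max_shot_nm):
    """Variable-shaped beam: each rectangle is tiled by shots up to max_shot_nm."""
    return sum(ceil_div(w, max_shot_nm) * ceil_div(h, max_shot_nm) for w, h in rects_nm)

def cell_shots(repeats):
    """Cell projection: one shot per repetition of the stencil cell."""
    return repeats

cell = [(200, 1000), (200, 1000), (600, 200)]  # hypothetical repeating cell, nm
repeats = 1000
print("spot beam:       ", spot_exposures(cell, 10) * repeats)
print("shaped beam:     ", vsb_shots(cell, 2000) * repeats)
print("cell projection: ", cell_shots(repeats))
```

If the design contained few repetitions of the cell, the cell-projection count would approach the shaped-beam count, consistent with the reduced advantage noted above for designs with limited repeating pattern elements.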
Figure 13. Schematic of cell-projection system.
The choices of specific machine architecture (fixed versus moving stage writing, levels of deflection hierarchy, and shaped or Gaussian spot) all directly dictate the nature of the pattern data fracturing. Fixed-stage writing will dictate fracture into fields, while moving stage writing dictates fracture into stripes. A moving stage system will require the sorting of exposure regions according to the stage scan path. Deflection hierarchy can
introduce further fracture boundaries in order to generate subfields corresponding to mid-level deflection. Finally, figure generation will determine whether the pattern elements are divided as fundamental shapes for a shaped-beam system or have boundaries corresponding to the highest-order deflection range. Systems with multiple deflection clock frequencies may assign deflection frequency bits to pattern elements during fracture to implement dose modulation (for example, for proximity correction). Figures generated by raster scan must have the shape-dependent blanking information serially sorted to correspond to the raster path.
3.4 Calibrations
All electron-optical system calibrations are ultimately referenced to the laser interferometer standard. The most fundamental calibration maps the deflection gain to the laser reference. This begins with the detection of a nominally on-axis mark position by a nominally on-axis beam, here referring to the electron-optical axis. The stage is then used to move the mark off-axis a known amount, as measured by the laser interferometer. The required beam deflection to reposition the beam at the mark is measured for mark displacements in both stage axes, providing an accurate calibration of deflection input versus beam displacement. Note that this calibration must be performed for all deflection subsystems where deflection hierarchy or deflection correction subsystems are employed. Another fundamental calibration is a drift calibration that is performed at intervals during exposure to correct for variations in beam current and position drift with time. A marker on the stage is located by beam scanning before exposure and then subsequently at the drift calibration intervals. Variations in measured mark position indicate beam drift and are corrected by an opposing pattern shift, added via beam deflection. Beam current variations are corrected by adjustment of exposure frequency to maintain desired dose. Stage runout calibrations correct for stage position errors due to mirror imperfections. The resulting errors can be averaged out by exposing a matrix of marks on a substrate and then acquiring the mark positions as the orientation of the substrate axes is changed. Nonorthogonality of the stage axes can be measured and applied as a software coordinate correction to the measured stage position. Variable shape beam systems require additional calibrations for beam dimensioning. Cell projection/mini-reticle systems require additional calibrations to properly stitch nonrepeating portions of the
pattern exposed with the variable-shaped beam with repeating shape areas exposed by the various cells. Measurements can be made for focus, stigmation, deflection field rotation, and deflection nonlinearity (distortion) corrections in a matrix of points covering the entire deflection range. These correction measurements can then be used to determine coefficients of correction polynomials, or simply stored in a lookup table for each parameter. Since the behavior of the lenses and coils responsible for these errors is strongly dependent on beam energy, a set of calibrations must be done for each accelerating voltage. For a description of deflection field distortions and deflection-based corrections, see Sec. 3.2, subsections “Beam Deflection” and “Deflection-Based Corrections.”
3.5 Examples of Commercial Equipment
Electron Microscope Conversions. Scanning electron microscopes (SEMs) pioneered the fundamental electron-optical components used for electron-beam lithography. For limited-throughput device prototyping or resist/process evaluation, electron microscopes (including scanning transmission electron microscopes, STEMs) can be modified to perform electron-beam lithography. The standard electron microscope lacks several key lithographic system components that can be added at varying costs. A fundamental addition to the basic electron microscope is a pattern generator, composed of a computer and DAC that translate the pattern data into analog scan coil signals to replace the existing raster scan beam deflection signals. Electron microscope scan coil nonlinearity will limit the deflection range that is usable without appreciable pattern distortion. Electron microscopes also lack beam blanking, although it is possible to do exposures without a blanker by “parking” or “dumping” the beam after shape exposure in an unused pattern area. Exposures without a blanker are subject to exposure artifacts due to scan coil settling and dose limitations, if the required pixel stepping rate exceeds the limited scan coil bandwidth. The addition of a beam blanker and pattern generator can be done for moderate cost and provide limited lithography capabilities, including registration of the exposure to pre-existing substrate patterns. The most significant limitation of a basic electron microscope is the lack of a precision stage position measurement system that precludes accurate stitching of the small, usable scan fields. The normal sample stage is poorly equipped to precisely fixture the substrate in a plane orthogonal to the electron-optical axis or to maintain a constant working distance with stage movement. The inability to accurately stitch exposure
fields limits the applications of SEM/STEM conversions mainly to small-area device prototyping and resist evaluation. Components for SEM/STEM conversions, such as beam blankers and pattern generators, or complete systems, are commercially available.[30] Conversions may be applied to STEMs in a fashion similar to a SEM conversion, taking advantage of the higher accelerating voltages (greater than 100 kV) of STEMs over SEMs. A significant drawback of a STEM conversion is the limitation on maximum sample size, perhaps only a few millimeters square, due to the placement of the sample between lens pole pieces. The Leica Nanowriter system is available in a configuration that provides a turn-key, fixed-stage, vector scan SEM conversion lithography system, combining column components for beam generation and deflection from the EBPG-Series lithography system and the 400-Series SEM, respectively. The lithography beam source provides up to 100 kV accelerating voltage in either thermal emitter or thermal field emitter (TFE) configurations. A laser interferometer stage option is available that permits system performance and applications significantly beyond the average SEM conversion.[31]
Gaussian Spot Systems. Leica EBPG-5. The Leica EBPG-5 system is a Gaussian-spot, fixed-stage, vector-scan system that evolved from the Philips EBPG-3, EBPG-4, and EBPG-5 systems. The EBPG-5 uses a thermal emitter at 20, 50, or 100 kV, has a minimum spot size of 12 nm, and has demonstrated 20 nm features over a 400 µm scan field. Stitching error is 80 nm or better and in-field distortion is 60 nm or better, over a 1 mm scan field. The stage is capable of five inch travel in both axes with λ/120 (5 nm) measurement precision. Stage position deviations from expected position are corrected by offsetting the main deflection up to ±20 µm. Four separate photomultiplier tube backscatter detectors are provided, one at each quadrant surrounding the final lens.
This configuration permits optimization of the signal from topographical markers, where the backscatter orientation is facet and scan-direction dependent. Temperature control is by air cooling to remove excess heat, and temperature-controlled closed-loop water cooling for component temperature regulation. The first of four electromagnetic lenses provides a beam crossover at the center of the electrostatic beam blanking plates. The next two electromagnetic lenses provide exposure spot diameter and current control while maintaining nearly constant focus, acting together as a zoom lens. The final electromagnetic lens is an asymmetrical objective lens that provides coarse focus at a working distance of 40 mm. Fine focus is provided by fine focus
coils that provide both static (substrate-height-dependent) and dynamic (deflection-dependent) focus correction. Double quadrupole stigmator coils (diagonal and axial) control spot shape for both static (on-axis) and dynamic (deflection-dependent) corrections. A manually selectable, three-position final lens aperture selector permits selection of current density range. The deflection system is pre-final lens and hierarchical, with each level driving its own magnetic coil pair. The lower level (or main deflection) system positions the corner of a primitive figure within the scan field using 15-bit DACs. Corrections are made in the main deflection system for x and y gain, orthogonality of scan axes (dynamic correction), rotation of scan field due to focus changes (due to pre-lens deflection), pincushion distortion (dynamic correction), and keystone distortion. The dynamic corrections to main deflection are implemented in hardware, while the remaining corrections are implemented in software. In addition, separate software-supplied gain and rotation corrections are provided for calibration of the stage tracking signal that is incorporated into the main deflection signal. The maximum main deflection is 1600, 800, or 512 µm for 20-, 50-, or 100-kV accelerating voltages, respectively. The main deflection range is continuously scalable to allow improved figure placement resolution. The higher level (or trapezium) deflection writes a primitive figure (trapezoid) using one of the eight available stepping frequency channels as encoded with the primitive figure data (maximum 10 MHz). Corrections to the trapezium deflection are made for gain, rotation, a main field deflection-dependent orthogonality (dynamic) factor, and a main field deflection-dependent pincushion distortion (dynamic) factor. The dynamic corrections are derived directly from the main deflection circuitry, while the others are provided by the software.
The maximum trapezium deflection is 8.125 µm and may be reduced by a factor of two or four for improved figure-writing resolution. A laser height sensor provides for deflection gain correction due to substrate height variations of ±50 µm. Additional stage-position-dependent rotation compensation is provided (during pattern writing) to correct for the effects of residual stage magnetism.
Leica Vectorbeam. The Vectorbeam series evolved from the EBPG-5 instrument and shares many similarities with it. Only key improvements and distinctions from the EBPG-5 series are noted in this section. A thermal field emission source is provided with up to 100 kV accelerating voltage. The stage travel is increased to six inches and the stage measurement accuracy is improved to λ/1024, or 0.6 nm. The pattern generator speed is
increased to 25 MHz, and the pattern generator incorporates a repeating pattern buffer for higher throughput. A new software platform supports exposure control and system operation.
Leica LION. The LION series is a Gaussian beam, vector scan system that was developed by Jenoptik Technologie GmbH. It uses a thermal emitter with a unique low accelerating-voltage range of 1 to 20 kV. The lowest voltages allow for minimal proximity effects and simultaneous high resolution, with a minimum spot size of 5 nm or less at 1 kV. Variable exposure energy at the low end of the range allows for variable beam penetration for three-dimensional resist patterning. A second unique feature is an alternate pattern writing mode from the usual fixed-stage or linear-scan moving stage. In this mode (termed continuous path pattern generation by Leica), continuous stage motion is used with a nominally on-axis beam to trace out Bezier and spline curves typically found in integrated optics applications. The advantage of this mode of operation over a conventional stitched large-area pattern is that the spatially localized stitching error is eliminated, and, instead, the associated errors are distributed over the length of the curve.
Leica EBMF and EBML. The Leica EBMF and EBML series are spot-beam, fixed-stage vector scan systems that are a continuation of the Cambridge Instruments EBMF product line, following its merger with Leitz (Leica is a convolution of Leitz and Cambridge). The gun uses a thermal emitter and has the unusual capability of continuously variable accelerating voltage in the range of 1–40 kV (50 kV on improved systems). Also uncommon is a tilted gun assembly that provides an initial triode-generated beam at an angle to the column electron-optical axis. The required magnetic deflection to align this beam down the column acts as a mass filter to screen any gun-produced ions or photons. Three electromagnetic lenses are used, providing a broad range of beam current.
Post-lens deflection is used, providing a scalable field size from 0.25 to 5 mm. Deflection is two-level hierarchical, with the lower-level, slower DAC determining a subfield location using a 5-bit word (32 subfields), and the higher-level, high-speed DAC filling the subfield using a 10-bit word. The address grids on the two DACs are equal, giving a total scan field resolution of 15 bits (0.1 µm for a 3.2768 mm field). The output signals from the two deflection DACs are electronically summed and fed to a common, nontelecentric deflection coil. The maximum pixel exposure rate is 10 MHz, and the pattern generator mode can be selected to skip 1, 4, or 8 pixels, for higher throughput writing on fine address grids. Dynamic (deflection-dependent) corrections are provided (through lookup tables
by subfield) for deflection gain, field shift, field rotation, focus, and stigmation. Focus is controlled by three separate components: manually by the third lens (coarse focus), and automatically by two separate focus correction coils, one for static (on-axis) correction and one for dynamic (off-axis) correction. The stage on older systems has 26 nm measurement precision, but upgrades now provide 5 nm. A capacitive height sensor allows height mapping separate from exposure, to provide height-dependent deflection gain correction. A channel electron multiplier provides backscattered electron detection. The vacuum system uses ion pumps on the column and diffusion pumps on the chambers. The software is versatile and provides for two independently calibrated coordinate modes in addition to stage coordinate mode.
JEOL JBX-5DII, JEOL JBX-5FE, and JBX-5000LS. The JEOL JBX-5DII system is a Gaussian-spot, fixed-stage vector-scan system that uses a thermal emitter at 25 or 50 kV, has a minimum spot size of 8 nm, and is capable of generating 30 nm or smaller features. Stitching error and overlay accuracy are 40 nm or better over an 80 µm scan field or 60 nm or better over a 1500 µm scan field. The stage is capable of 120 mm travel in both axes with λ/1024 (0.6 nm) measurement precision. A silicon P-N junction backscatter detector surrounds the final lens. The column has five electromagnetic lenses in total, using the first three in conjunction with either the fourth or fifth lens to provide two separate working distances. The fourth lens has the longer working distance and provides a 1500 µm or 75 µm scan field at 25 and 50 kV, respectively. When used with the shorter working distance fifth lens, the scan field sizes are 160 µm and 80 µm at 25 and 50 kV, respectively. The scan field ranges are not continuously scalable within the selected field size. The pattern generator can be directed to increase the exposure stepping increment to as large as twenty times the pixel size.
Electromagnetic coils are used for initial beam alignment and stigmation. Beam deflection is accomplished by two-stage electrostatic octupoles, with separate systems for the fourth and fifth lenses. The deflection is two-level hierarchical, with a lower-level 16-bit DAC specifying subfield position and an upper-level 12-bit DAC providing vector figure generation within the subfields. The separate DAC outputs are summed electronically and fed to either the fourth or fifth lens deflection system. Dynamic distortion corrections are provided for each subfield. The maximum pixel writing rate is 6 MHz, with provision for figure dose modulation.
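The address-grid arithmetic behind the hierarchical DAC schemes described above can be sketched as follows. This is a minimal illustration, not any vendor's implementation: it assumes the deflection field is divided evenly into 2^n addresses by an n-bit word, and the function names address_grid_um and split_address are hypothetical. It reproduces the figure quoted earlier for the Leica EBMF (a 5-bit subfield word plus a 10-bit fill word, that is, 15 bits over a 3.2768 mm field, giving 0.1 µm).

```python
def address_grid_um(field_um: float, bits: int) -> float:
    """Size of one DAC address step, assuming the field is divided
    evenly into 2**bits steps (an assumption for illustration)."""
    return field_um / (1 << bits)


def split_address(address: int, low_bits: int) -> tuple:
    """Split a combined address into (subfield index, offset in subfield).

    Assumes both DACs share one address grid, so the upper bits select
    the subfield and the lower `low_bits` bits fill it.
    """
    return address >> low_bits, address & ((1 << low_bits) - 1)


# EBMF figures from the text: 5-bit subfield word + 10-bit fill word
# over a 3.2768 mm (3276.8 um) field.
ebmf_grid = address_grid_um(3276.8, 5 + 10)
# The address value 20000 is arbitrary and only illustrates the split.
subfield, offset = split_address(20000, 10)

print(f"EBMF address grid: {ebmf_grid:.3f} um")
print(f"address 20000 -> subfield {subfield}, offset {offset}")
```

The same division applies to any of the DAC widths and deflection ranges quoted for the other systems, provided the range is mapped linearly onto the full address word.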
The JBX-5000LS has nearly the same configuration and specifications as the JBX-5DII, except that the cassette loader configuration is changed. The JBX-5FE is similar to the JBX-5DII, with the substitution of a thermal field emitter gun.[32]
JEOL JBX-6000FS. The JEOL JBX-6000FS is a spot-beam, fixed-stage vector scan system targeting direct-write applications, such as microwave and millimeter-wave device patterning. A thermal field emission gun is used at 25 or 50 kV with minimum spot sizes of 8 and 5 nm, respectively. The electron-optical specifications are similar to those of the JBX-5DII, except that only the last four electromagnetic lenses are required with the TFE gun. The maximum writing speed is increased to 12 MHz to realize throughput advantages of the higher TFE gun brightness over thermal emitters. The stage is also improved with an addressable range of 150 × 150 mm at λ/1024 (0.6 nm) position readout resolution.
ETEC MEBES 4500. The ETEC MEBES 4500 series is a spot-beam, moving-stage raster scan system. It is the latest version of the MEBES line that has been a staple of mask shops around the world. The 4500 series targets mask and reticle fabrication for 256 Mb DRAMs, phase shift masks, and optical proximity corrected masks. The system uses a thermal field emitter for high current density and resolution, with a minimum spot of 80 nm. The system uses 10 kV accelerating voltage for maximum resist sensitivity and throughput; combined with a pixel exposure rate as high as 160 MHz, a minimum dose of 2 µC/cm² is possible. Throughput limitations from high pattern-generator data-rate transfer requirements, incurred from the raster scan writing, are avoided by using dual-pattern memory buffers. Mask geometries of 0.25 µm are supported through a minimum pixel size of 25 nm and a stripe butting accuracy of 50 nm. The system is capable of proximity correction and can align a phase-shift mask level to a base mask pattern.
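The relationship between pixel rate, beam current, and dose implied by the figures above can be sketched as follows. This is an illustration under stated assumptions, not ETEC's specification: dose is taken as the charge delivered per pixel divided by the pixel area, D = I / (f · p²), the pixel is assumed square, and the 100 nm pixel size used in the example is a hypothetical value chosen only to show the arithmetic. The function names are likewise hypothetical. The same relation underlies the drift calibration of Sec. 3.4, in which exposure frequency is adjusted to hold the dose constant as beam current varies.

```python
def dose_uC_per_cm2(current_nA: float, rate_MHz: float, pixel_nm: float) -> float:
    """Areal dose per pixel, D = I / (f * p^2), assuming a square pixel."""
    current_A = current_nA * 1e-9
    rate_Hz = rate_MHz * 1e6
    pixel_cm = pixel_nm * 1e-7                 # 1 nm = 1e-7 cm
    dose_C_per_cm2 = current_A / (rate_Hz * pixel_cm ** 2)
    return dose_C_per_cm2 * 1e6                # C/cm^2 -> uC/cm^2


def current_nA_for_dose(dose_uC_cm2: float, rate_MHz: float, pixel_nm: float) -> float:
    """Beam current needed to deliver a given dose at a given pixel rate."""
    pixel_cm = pixel_nm * 1e-7
    current_A = dose_uC_cm2 * 1e-6 * rate_MHz * 1e6 * pixel_cm ** 2
    return current_A * 1e9                     # A -> nA


def rate_MHz_for_dose(dose_uC_cm2: float, current_nA: float, pixel_nm: float) -> float:
    """Pixel rate that keeps the dose constant for a given beam current."""
    pixel_cm = pixel_nm * 1e-7
    rate_Hz = current_nA * 1e-9 / (dose_uC_cm2 * 1e-6 * pixel_cm ** 2)
    return rate_Hz / 1e6                       # Hz -> MHz


# Quoted dose (2 uC/cm^2) and rate (160 MHz); the 100 nm pixel is hypothetical.
beam_nA = current_nA_for_dose(2.0, 160.0, 100.0)
print(f"required beam current: {beam_nA:.1f} nA")

# If the measured current drifts 5 % low, the rate is lowered to compensate.
print(f"compensated rate: {rate_MHz_for_dose(2.0, 0.95 * beam_nA, 100.0):.1f} MHz")
```

Because the required current scales with the pixel area, halving the pixel size at fixed dose and rate cuts the needed current by a factor of four.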
A real-time substrate height sensor allows measurement and correction of substrate height errors. The system incorporates dynamic focus, gain, and rotation corrections. The 4500S version includes additional pattern exposure modes and component temperature regulation for improved stripe-butting accuracy.[33]
Lepton EBES4. The Lepton (now Ultrabeam Lithography) EBES4 is a spot-beam, moving-stage modified vector scan system. The design began with the acquisition of rights to the EBES 4A technology developed by AT&T Bell Laboratories. A thermal field-emitter source at 20 kV provides brightness of at least 1600 A/cm², with a 0.125 µm spot and expected lifetime greater than 8000 hr. The column uses three lenses and employs three-level hierarchical telecentric deflection.[34] The lowest
deflection level is electromagnetic and provides for the placement of a subfield within a 256-µm-wide stripe. The middle level of deflection is electrostatic and places a microfigure within the 32 µm subfield, while correcting for second-order distortions. The highest level is electrostatic and fills in the microfigure using up to a 2 × 2 µm “raster” scan with a 16 nm address grid. The microfigure scanning is accomplished with stepping and scanning DACs for both axes that can be used concurrently for angled lines. Microfigure filling time is minimized through the use of a microfigure library, rather than individual pixels. The microfigure pixel stepping rate is 75–500 MHz, with dose modulation available on a per-pixel basis. A dual-pattern buffer memory is used to enhance data transfer rate. A fast focus coil provides a dynamic (deflection-dependent) correction and height correction of ±32 µm.[35] Stripe butting accuracy is 30 nm, and placement and overlay accuracy are both 50 nm. The stage uses two stepping axis interferometers to correct for yaw with 125 × 125 mm stage addressing range at λ/128 (5 nm) measurement resolution. Stage out-of-plane (height) variation is on the order of 0.1 µm, and in-plane vibration is on the order of 1 µm with the stage scanning.[36] Thermal stability is addressed thoroughly through component temperature control, use of low-expansion Zerodur™ glass-ceramic material, and substrate temperature equilibration before loading into the vacuum chamber. Kinematic clamping provides minimal substrate distortion. Robotic handling of substrates enhances throughput and carrier-loading reproducibility. Substrate particulate defects introduced by the system are controlled by minimizing sliding surfaces inside the vacuum to reduce particle generation and using soft vent and pumpdown cycles to reduce particle redistribution. An unusual vacuum system configuration uses ion pumps with cryopumps instead of the more common turbomolecular pumps.
Overall throughput for a 64 Mb DRAM 5 in. mask exposure is two plates per hour.[37]
Lawrence Berkeley National Laboratory (LBNL) Nanowriter. Sponsored by the U.S. Defense Advanced Research Projects Agency (DARPA), the LBNL Nanowriter project will develop a next-generation electron-beam lithography system, with resolution and placement accuracy commensurate with sub-0.1 µm features. Specific goals are a 2.5 nm beam at 100 kV that will provide stitching accuracy of 20 nm over a 1 cm field. The column is a modified Leica VB6 with thermal field emission, dynamic focus and stigmation, and substrate height sensing. Pattern generator speed is 30 MHz and stage measurement precision is 5 nm with three-axis metrology (x, y, θ).
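The stage measurement precisions quoted in this section as fractions of the interferometer wavelength (λ/120, λ/128, λ/1024) can be checked with a few lines of arithmetic. The sketch below assumes a helium-neon laser at 632.8 nm; the text does not state the laser wavelength, so this value and the helper name interferometer_lsb_nm are assumptions made for illustration only.

```python
HENE_WAVELENGTH_NM = 632.8  # assumed He-Ne line; not stated in the text


def interferometer_lsb_nm(divisor: int,
                          wavelength_nm: float = HENE_WAVELENGTH_NM) -> float:
    """Smallest stage position increment for a readout of lambda/divisor."""
    return wavelength_nm / divisor


# Divisors quoted for the systems described in this section.
for divisor in (120, 128, 1024):
    print(f"lambda/{divisor}: {interferometer_lsb_nm(divisor):.2f} nm")
```

Under this assumption the increments come out at roughly 5.3, 4.9, and 0.62 nm, consistent with the rounded 5 nm and 0.6 nm values quoted for the individual systems.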
A significant effort is placed on the pattern generator system. Parallel array processors are used for figure generation, with equivalent capabilities for both rectilinear and nonrectilinear patterns. Image processing software will be implemented for alignment, calibration, and metrology functions, interpreting mark detection signals from both backscattered and transmitted electron detectors.[38]
Variable Shaped Beam. JEOL JBX-7000 MVII. The JEOL JBX-7000 MVII is a variable-shaped-beam, fixed-stage, vector scan system targeting mask patterning. Some key performance specifications are 30 nm or better field stitching and in-field distortion, and 40 and 50 nm overlay accuracy on 5 and 6 inch mask plates, respectively (up to 7 inch plates are accommodated). The accelerating voltage is 20 kV with a maximum 2 A/cm² current density. The maximum scan area is 1500 µm with a 25 nm pixel, with field size scalable from 0.75 to 1.2X in 0.0000001 increments. The maximum beam size is 6 × 6 µm, with a beam size increment of 25 nm and a beam position address grid of 25 nm. Dynamic corrections are provided for deflection field distortions and aberrations, and static measurement and correction are provided for substrate height. The stage is capable of addressing a total writing area of 165 × 165 mm with λ/1024 (0.6 nm) position readout resolution. A field-shift writing feature is provided that allows the averaging of larger off-axis exposure errors with on-axis errors via fractional field stepping and multiple exposures. The predecessor to this model is the JBX-6AIIIMV.
JEOL JBX-8600DV. The JEOL JBX-8600DV is a variable-shaped-beam, fixed-stage, vector scan system targeting direct-write applications for application-specific integrated circuits (ASIC) and DRAM. The minimum feature size is 150 nm with positioning, field stitching, and overlay accuracies of 50 nm. The gun is a thermal emitter operating at 40 kV. The maximum scan field is 1 × 1 mm with 10 nm pixels.
The maximum beam size is 2.5 × 2.5 µm with a 10 nm beam size increment. The stage is capable of addressing a full 6 inch wafer at λ/120 (5 nm) measurement precision. Dose modulation is provided via 64 shot-modulation increments for each of the various primitive shapes (rectangle, x trapezoid, y trapezoid).
IBM EL-4. The IBM EL-4 is a variable-shaped-beam, fixed-stage, vector scan system developed for internal IBM use.[39] A thermal emitter operates at 75 kV. The column employs a unique variable axis immersion lens that provides extremely wide telecentric field coverage with minimal aberrations. Three-level hierarchical deflection is employed for optimum throughput. The stage operates without guide rails, relying entirely on the stage metrology feedback.[40] One axis has two servo motor drives and two
measurement positions on its stage mirror to provide yaw control. Wafers are held electrostatically for improved flatness and thermal contact. This back-side-referenced wafer mounting provides sufficient front (exposure) surface orientation given normal wafer thickness runout specifications and the real-time surface height measurement range of ±75 µm. The EL4/XEP0 tool has demonstrated validated 0.25 µm and prototype 0.18 µm x-ray mask writing. Demonstrated pattern size control is 28 nm, and placement accuracy is 29 nm, using a multiple-pass writing strategy to average pattern errors.[41]
Leica ZBA23H, ZBA31, and ZBA32. The Leica ZBA31 and ZBA32 systems are moving-stage, shaped-beam systems intended for maskmaking and direct-write applications, respectively, with throughput goals driven by 256 Mb DRAM complexity and density. The ZBA23H is a lower cost, Step-and-Scan variation of the above systems intended for mixed applications with lower throughput requirements. These systems are based on technology originally developed by Jenoptik Technologie GmbH.
ETEC Excaliber. The ETEC Excaliber system is under development to meet needs for 0.18 to 0.13 µm design rules. A wide array of applications is supported by the diverse substrate handling capability, including photomask blanks, proximity x-ray 1X membrane masks, and 200 mm wafers. The system uses a moving-stage, shaped-beam approach and incorporates the column design of the ETEC AEBLE-150 and the three-axis stage design of the IBM EL-4 with λ/1024 (0.6 nm) measurement precision. High-bandwidth electronics provide corrections for deflection-dependent distortion, stigmation, and focus, as well as height-dependent corrections to deflection and focus. To accommodate the high-resolution goals, the system will use a 24-bit address word to divide the 500 µm lowest-level deflection range, and 15 bits to divide the 32 µm subfield deflection range.
The resulting large data volume is handled by data hierarchy preservation, parallel processing, data compression, and data path electronics employing field-programmable gate arrays to allow software-reconfigurable writing strategy optimization. Demonstrated performance at 50 kV indicates that 0.18 µm geometries can be patterned with 30 nm placement accuracy.[42] Planned extensions of this platform include 100 kV accelerating voltage and a compatible column with telecentric deflection.
Toshiba EBM-800. The EBM-800, only available in Japan, is a unique system designed specifically for the production of large-scale LCD masks with high throughput. Because of this application, the EBM-800 can accommodate a maximum substrate size of 812 × 812 mm with an addressable area of 780 × 780 mm. Performance specifications are also
geared for this application, with placement accuracy of 0.5 µm (3σ) and stitching accuracy of 0.1 µm.[43]
Hitachi HL-800M and HL-800D. The Hitachi HL-800M is a moving-stage, shaped-beam vector scan system designed for mask/reticle production, including phase-shift designs. The stage can accommodate up to a 7 in. mask. Stitching and dimensional accuracy of 30 nm and absolute position accuracy of 40 nm represent significant improvements over the preceding HL-700 generation. The beam is generated using a thermal emitter and is deflected using a three-level hierarchy for maximum throughput. A unique high-speed hardware-based proximity correction function is used to modify shot duration in real time, based on a stored pattern density map generated (quickly) before exposure. The 800D is also a variable shaped-beam system, but it targets ASIC and microcomputer production by direct write. Performance is specified as 0.3 µm minimum linewidth, 50 nm CD accuracy, and 70 nm overlay and stitching accuracy. Wafers up to 200 mm can be patterned. Throughput is specified at eleven 150 mm wafers per hour. Note that the 800D can be configured for cell projection.[44]
Cell Projection Systems. Hitachi HL800D. Hitachi’s HL800D variable-shape beam instrument, the first cell projection system to become commercially available, is optionally configured as a moving-stage cell-projection exposure system targeting applications in ULSI direct-write mass production. Each cell aperture contains five cell patterns that are selectable by high-speed deflection to expose repeating pattern features. Each cell aperture also contains a square aperture for formation of a rectangular variable-shape beam to expose nonrepeating patterns. A mechanical aperture drive allows selection of 25 cell apertures, for a total of 125 cell patterns. Cell patterns are reduced by a factor of 25 at the wafer. A 500 µm² LaB6 cathode is used to provide a beam of 10 A/cm² at 50 kV.
At 0.25 µm resolution, throughput is up to twenty 6 in. wafers per hour for a 64 Mb DRAM benchmark (wafers up to 8 in. can be exposed). Stitching and overlay accuracies are both 70 nm (3σ), CD accuracy is 50 nm (3σ), and minimum linewidth is 0.2 µm. Stitching errors between cell projection patterns and variable-shape beam patterns are 50 nm or better. The stage velocity is modulated according to pattern density. The pattern-writing deflection system is three-level hierarchical, using three separate in-lens deflector stages. The lowest level of deflection addresses a 5-mm-wide stripe with 20 bit resolution (5 nm address grid). The middle level of deflection addresses a subfield up to 500 × 500 µm with 17 bit resolution, and the highest level of deflection addresses a
sub-subfield up to 80 × 80 µm with 14 bit resolution. Hardware-implemented proximity correction is provided in real time, as in the HL800M.[45]
Advantest F5120. The Advantest F5120 is a cell-projection system that addresses throughput issues by employing two electron-optical columns for simultaneous exposure of two wafers. The performance accuracy specifications are a minimum feature size of 0.13 µm with dimensional control of ±20 nm, stitching accuracy of 40 nm, and overlay accuracy of 60 nm. Throughput is specified at fifteen 8 in. wafers per hour for a 256 Mb DRAM hole layer with 0.25 µm geometries. The cell mask contains 100 different patterns for repeating geometries. The target application is high-volume production of memory and logic devices with 0.18 µm design rules.[46]
Leica WePrint 200. The WePrint 200 is based on the technology of the Leica ZBA-31/32, with the addition of mini-reticles (cell mask apertures). The mini-reticles permit character (parallel element) exposure with about one order of magnitude reduction in shot number relative to a vector scan strategy for improved throughput. The mini-reticles are 25 × 25 µm metal-coated Si stencils, with features as small as 1 µm. Mini-reticle electron-optical reduction to minimum patterned dimensions at the substrate of 0.1 µm allows direct-write of 256 Mb DRAM features. The column uses two beam-shaping apertures and associated deflection systems to (i) define beam extent and (ii) provide either character generation or beam shaping (similar to conventional vector scan strategies). Two levels of electromagnetic deflection provide a deflection range of 2.4 mm. High-speed subfield deflection is accomplished with a 12-pole electrostatic deflector. The writing strategy uses a continuously moving stage with velocity modulated according to pattern density. Dual data storage disks are linked to dual 32 MB data buffers, with each buffer capable of holding data for an entire stripe.[47]
3.6 Novel Electron-Beam Technologies
Projection-Reduction Electron-Beam Lithography. A unique masked projection-reduction electron-beam lithography technology called scattering with angular limitation in projection electron lithography (SCALPEL™)[48] is under development at Lucent Technologies.
Pattern data are encoded in a scattering layer of a special membrane mask, which is scanned by a wide electron beam and projected and demagnified onto the substrate using a step-and-scan exposure strategy (see Fig. 14). Electrons passing through the scattering layer suffer wide-angle scattering and are efficiently screened from the projection lenses by a limiting aperture. Because the electrons are not absorbed in the mask, the concern of mask heating, present with stencil (i.e., absorbing) masked electron-beam exposure strategies, is greatly reduced. The use of a membrane support eliminates the need for complementary mask pairs required to produce closed annular shapes with stencil masks, and reduces backscatter and related proximity effects during mask patterning.[49] A unique aspect of SCALPEL is that a proximity-correcting GHOST exposure can be generated from the same pattern mask by allowing some of the scattered electrons to project to the substrate. The exposure strategy involves the simultaneous scanning of mask and wafer under the nominally stationary rectangular beam. The mask is divided into 1-mm-wide membrane regions, with support ribs in between. The mask will move at four times the substrate speed to accommodate the 4X reduction. Electrostatic beam deflection corrects measured stage position errors and permits the precise stitching of scan regions. A relatively small (0.25 mm) field is used to minimize off-axis aberrations. Resolution and accuracy are targeted for direct writing of wafers at 0.18 µm design rules.[50]
Masked Electron-Beam Lithography. Cathode projection systems use a photocathode mask that emits electrons upon UV irradiation. The photoemitter material is patterned on the mask to provide patterned photoelectron emission that is focused and accelerated to the wafer by electric and magnetic fields. This approach provides a massively parallel exposure, but to date has not demonstrated resolution better than steppers.
Ultraviolet stimulation of electron emission allows for the use of a conventional photomask plate that is UV-transparent, with 1X photoemitter patterning on the side adjacent to the substrate.[51] Stencil mask proximity printing uses a 1X stencil mask in close proximity to the substrate and a scanned electron beam. An advantage of this approach is the ability to vary the angle of beam incidence during the scan to correct for mask distortions. Although this relaxes some mask requirements, the overall need for a 1X stencil mask is still demanding. An additional limitation, common to all stencil masks, is the need for complementary mask pairs where closed annular features are present.[52]
Figure 14. SCALPEL operation principle and Step-and-Scan writing strategy.
Multiple-Beam Technology. The need to improve throughput in electron-beam lithography has led to various multiple-beam techniques. These approaches consist of permutations of configurations using either single or multiple sources and either single or multiple columns. Common to all approaches are beam-to-beam interactions that limit maximum beam current, as well as limits to beam array extent due to off-axis aberrations.
A single-column approach can generate multiple beams using either separate sources or a single source and a lens (for example, a fly's eye lens). Both variants suffer from limitations due to the nonuniformity of the beam-generating components, and the use of a single column does not provide for individual electron-optical beam optimization to correct for these nonuniformities. A distinct advantage of this approach over multiple-column approaches is the greatly reduced complexity of beam control.
Numerous proposals and developments exist for the use of Si microelectromechanical systems (MEMS) fabrication technology for the construction of miniaturized electron-beam lithography columns. The aim is the low-cost fabrication of a matrix of miniature exposure columns for improvement of direct-write throughput via parallel exposure. In principle, all elements can be fabricated with this technology, including sources, blankers, lenses, deflectors, and detectors. The potential exists to fabricate an array of columns using wafer-scale integration. The columns can be scaled in operating voltage to work at a few kilovolts, with resulting proximity-free exposure. The stage movement range is greatly reduced, to 1/N of that of a single-column system for an array of N × N columns.
Difficulties with the miniature column approach arise from assembly and alignment of components, column-to-column interactions, individual control and calibration requirements, and mark detection, which is hampered by both the short working distance and the limited resist penetration at low voltage.[53] Related low-voltage resists and process strategies will be required to implement this technology.
Scanning Tunneling Microscope Exposure. Although not a beam technique, scanning tunneling microscopes are capable of resist exposure.[54] The exposure occurs by field emission from a sharp tip maintained in close proximity to the surface during scanning. The proximity of the tip to the sample limits the lateral displacement of electrons along field lines diverging from the tip, allowing nanometer resolution. This proximity also limits the usable tip voltage, allowing only low-energy exposure in thin resists. Throughput is extremely limited due to low-bandwidth piezoelectric scanning mechanisms and limited scanning range.
4.0 RESIST
4.1 Introduction
Resists are applied to a substrate as thin surface layers for the purpose of recording the latent image of an exposed pattern. The final result of the lithography process is a relief structure in the resist layer representing the exposed pattern. However, the ultimate utility of lithography is its role in pattern transfer, the permanent transfer of the resist image to the substrate (or any permanent films applied to the substrate), as the resist is ultimately stripped from the substrate. The requirements of pattern transfer often dictate the substrate preparation, choice of resist, or exposure conditions. The term resist will be used to indicate an electron-beam resist (i.e., any resist that is sensitive to electron-beam exposure), with additional notation made to indicate sensitivity to other radiation. It is noted that this section is intended to provide only enough detail about resists to understand the following discussions and applications. Resists serve as a useful tool for patterning; however, their necessity can be questioned. Since pattern transfer to the substrate is the usual goal of a resist, the possibility exists for the elimination of the overhead required for the processing of resist by direct modification of the substrate.[55][56] The potential exists for the improvement of lithographic yield through reduction in processing steps, but consideration must also be given to possible reduced exposure throughput, as well as the possibility of increased device damage where significantly higher exposure doses are required. Electron-beam resists are most commonly liquid solutions composed of organic polymers cast in a solvent. The liquid is applied to a substrate by spin coating to form a film of uniform thickness. Following spin casting, the casting solvent is driven from the film by baking, referred to as a softbake. The softbaking leaves a durable polymer film on the substrate that is ready for electron exposure. 
Often resist is applied to the substrate in multiple layers or in layered combinations with inorganic materials or other organic materials that are not sensitive to electron-beam exposure. The purposes of multi-layer resist are varied and include planarization, improvement of pattern transfer, reduction of backscatter, and creation of specific feature cross-sections. The electron-beam pattern exposure deposits energy in the resist and modifies it to create the latent pattern image. The nature of the modification can be one or more of various mechanisms, such as polymer chain scission,
polymer chain cross-linking, or acid catalysis. The modification may be directly suitable for the development step to distinguish between exposed and unexposed regions of the resist, or it may require an intermediate step such as a post-exposure bake. Ultimately, the solubility of the exposed region in a developer is modified by these steps.
Development completes the transformation of the resist coating into the three-dimensional representation of the CAD pattern, revealing the latent image. For a positive resist, the exposed regions are made more soluble and are selectively removed by the developer. For a negative resist, the exposed regions are rendered insoluble in developer and the unpatterned area (the field) is removed by the developer. Development is most commonly performed by wet processing with organic solvent or aqueous-based chemistry. Less common are self-developing materials that ablate directly under exposure to the electron beam, and materials that are developed using plasma etching, referred to as dry development. For wet processes, the development rate is often very sensitive to temperature, so development should be performed under temperature control. Aqueous developers are typically strong base solutions that can etch underlying materials, such as aluminum, during overdevelopment.
The resist pattern is ultimately used as a template for pattern transfer. Pattern transfer to the substrate can be additive or subtractive. In an additive transfer, material is added to the substrate through the resist openings. In a subtractive transfer, the resist protects the underlying material while material in the resist openings is removed. In either case, the resist film is removed from the substrate following pattern transfer. Commercially available acid-based strippers, especially when heated, can remove a wide variety of resist materials, provided that the stripper is compatible with the substrate materials.
PMMA and P(MMA-MAA) are readily removed with chlorinated solvents such as trichloroethylene (TCE) and methylene chloride, which are increasingly less acceptable in manufacturing. Novolac-based resists may be stripped with acetone, depending on the level of crosslinking. An increasingly popular, broadly effective, and reduced-hazard solvent for resist stripping is N-methyl-pyrrolidinone. Organic resist films, especially heavily cross-linked materials, may be effectively stripped using oxygen plasma cleaning. As device active region thicknesses are scaled down, consideration must be given to device damage generated by energetic plasma species.
Common additive pattern transfer processes are ion implantation and liftoff. Ion implantation requires a resist thickness sufficient to prevent penetration of the given ion species and energy into the substrate.
Difficulties can arise from cross-linking of the resist surface due to ion bombardment, leading to problematic resist removal. Additional problems can occur due to resist edge slope, leading to poor definition of the implantation region edges and to knock-on of organic contaminants into the substrate at thin resist tails.
The liftoff process is shown schematically in Fig. 15. First, material is deposited on the patterned substrate, condensing on the resist top surface and on the substrate surface where there are resist openings. Following deposition, the resist is dissolved in a solvent, and the excess metal on top of the resist is washed away, or lifted off, leaving only the deposited material on the substrate in the patterned openings. Liftoff requires an undercut resist profile to prevent deposition on the sidewall, which can lead to ragged edges, tearing of material from the substrate, or total failure of liftoff due to complete encapsulation of the resist. Liftoff works best for deposition that is normal to the surface, such as evaporation, and is more difficult for angular deposition, such as sputtering.
Figure 15. Standard and tape-assisted liftoff processes.
For a liftoff process, the nonzero sticking coefficient for deposited material on the sides of the resist opening leads to a natural reduction of transferred linewidth with increasing material thickness. This self-closing effect significantly limits the cross-sectional area achievable for narrow lines in single-level resist and may impose resistance limitations for devices and circuits. Multi-layer resist approaches can overcome this limitation where relaxed line-to-line spacing requirements permit mushroom-type line cross-sections (see Sec. 4.4).
The dissolution of the resist after deposition is limited by the penetration of the solvent through the narrow gaps in deposited material at the perimeter of the pattern. Tape-assisted liftoff eliminates this problem by removing the excess metal mechanically before dissolution. Tape is applied to the substrate following deposition using a roller, limiting pressure such that the tape contacts only the high points, i.e., the excess metal on top of the resist. The tape is then peeled off with the adhering metal, and the full resist area is then totally accessible to the solvent.
Liftoff can be used to implement a tone reversal in certain applications. A specific example is a low-density microwave transistor gate level that is to be patterned as a clear-field MoSi2 on quartz mask plate. In the absence of a suitable negative resist (now readily available) for dry etching of the MoSi2, a positive resist (EBR-9) is first patterned. This patterned resist is used for liftoff of a thin Ni film that is subsequently used as the plasma etch mask and is retained in the final plate. Note again that the liftoff of the Ni etch mask provides for the etching of the field of the mask, a tone reversal of the original positive-tone exposure.
Common subtractive pattern transfer processes are wet etching and dry etching. In wet etching, liquid solutions etch away material in the resist openings.
This process places great demands on resist-to-substrate adhesion. Any adhesion failure here can lead to significant feature size variation or even total resist delamination. Removing contaminants thoroughly and possibly removing native oxides immediately before resist coating can greatly improve adhesion, depending on the substrate. Bulk contamination should be removed with organic solvents or oxygen plasma cleaning (reactive ion etching will protect the substrate backside where it is etched in oxygen plasmas). Trace organic contamination can be removed effectively with minimal substrate damage using UV-ozone cleaning.[57] Consideration must also be given to the nature of chemicals used for the etch process. For example, with PMMA and P(MMA-MAA) on GaAs or InP, acidic etchants are preferred over basic etchants for reduced undercutting.
Dry etching uses energetic gas atoms to etch via physical sputtering alone or sputtering combined with chemical reaction. The primary resist concern with dry etching is selectivity, the ratio of the substrate etch rate to the resist etch rate. Low selectivity implies rapid erosion of resist during etching, leading to loss of feature size control and rough edges. Resist sidewall profiles should be vertical for anisotropic dry etching to maintain critical dimension control as the top surface of the resist is eroded.
4.2 Resist Properties
The key parameters that characterize resists are tone, sensitivity, contrast, resolution, viscosity, process latitude, and dry etch resistance. Tone is either positive or negative, as discussed above.
Sensitivity indicates the total number of electrons required to expose the resist per unit dimension and may be a strong function of exposure energy. This number is referred to as the dose and is smaller for higher sensitivity resists. For a point exposure with a beam of current $I_b$ (nA) and an exposure, or dwell, time of $t_d$ (s), the point dose is $D_p = I_b t_d$ (nA·s = nC). Since the dwell time per point is given by the reciprocal of the beam stepping frequency $f$, the point dose is often expressed as $D_p = I_b/f$ (pC), where $I_b$ is in nanoamperes and $f$ is in kilohertz. For a line exposure, each point or pixel along the line contributes an exposure to a unit of length equal to the pixel spacing $d$, for a line dose of $D_l = I_b/(f d)$ (pC/cm), where $I_b$ is in nA, $f$ is in kHz, and $d$ is in cm. For an area exposure, each point or pixel contributes an exposure to a unit of area equal to the pixel spacing squared, for an area dose of $D_a = I_b/(f d^2)$ (pC/cm²; 1 µC/cm² = 10⁶ pC/cm²), where $I_b$ is in nA, $f$ is in kHz, and $d$ is in cm. For extremely sensitive resists, the total charge required for an exposure dose may be low enough that the exposure becomes nonuniform due to statistical fluctuations in the number of electrons in the beam. This translates into a fundamental tradeoff between speed and resolution.
Contrast indicates the ability of the developed image to distinguish between areas of different doses and is defined in terms of the variation of developed resist thickness as a function of dose. This behavior is best represented by a contrast curve, plotting normalized resist thickness as a function of dose, as shown in Fig. 16.
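The point, line, and area dose relations given above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, all quantities are kept in base units (amperes, hertz, centimeters, coulombs) to avoid prefix errors, the beam parameters in the example are invented, and the shot-noise estimate assumes simple Poisson statistics for the number of electrons per pixel, which the text does not state explicitly.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs


def point_dose(beam_current_a, step_freq_hz):
    """Point dose D_p = I_b / f, in coulombs per exposure point."""
    return beam_current_a / step_freq_hz


def line_dose(beam_current_a, step_freq_hz, pixel_cm):
    """Line dose D_l = I_b / (f d), in C/cm."""
    return beam_current_a / (step_freq_hz * pixel_cm)


def area_dose(beam_current_a, step_freq_hz, pixel_cm):
    """Area dose D_a = I_b / (f d^2), in C/cm^2."""
    return beam_current_a / (step_freq_hz * pixel_cm ** 2)


def electrons_per_pixel(area_dose_c_per_cm2, pixel_cm):
    """Mean number of electrons delivered to one d x d pixel."""
    return area_dose_c_per_cm2 * pixel_cm ** 2 / E_CHARGE


def relative_shot_noise(n_electrons):
    """Relative dose fluctuation 1/sqrt(N), assuming Poisson statistics."""
    return 1.0 / math.sqrt(n_electrons)


if __name__ == "__main__":
    # Hypothetical example: 1 nA beam, 10 MHz stepping rate, 25 nm pixel.
    i_b, f, d = 1e-9, 1e7, 25e-7  # A, Hz, cm
    d_a = area_dose(i_b, f, d)
    n = electrons_per_pixel(d_a, d)
    print(f"point dose  = {point_dose(i_b, f) * 1e15:.3f} fC")
    print(f"line dose   = {line_dose(i_b, f, d) * 1e9:.3f} nC/cm")
    print(f"area dose   = {d_a * 1e6:.2f} uC/cm^2")
    print(f"electrons   = {n:.0f} per pixel")
    print(f"shot noise  = {relative_shot_noise(n) * 100:.1f} %")
```

For the invented parameters above, the area dose works out to about 16 µC/cm² with roughly 620 electrons per pixel. Reducing the dose (a more sensitive resist) or the pixel size lowers the electron count and raises the relative fluctuation, which is the speed versus resolution tradeoff described in the text.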
From these curves, the concept of a minimum exposure dose is understood as the dose required to achieve the maximum modification of the resist by the exposure, indicated by point $D_3$ in Fig. 16. The contrast $\gamma$ is defined as $\gamma = |\log_{10}(D_2/D_1)|^{-1}$, where $D_1$ and $D_2$ are the doses defined at the points indicated in Fig. 16. From a practical standpoint, a high-contrast resist is desirable as it develops along contours
of constant dose. This leads to steeper sidewalls and larger process latitude than for lower contrast materials. A high-contrast resist may also permit the precise adjustment of fine feature size and sidewall slope by varying the exposure dose. The contrast of a given resist is strongly related to the choice of developer chemicals, concentration, and temperature. A very-low-contrast developer can lead to significant unexposed resist thickness loss during development.
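Once the two doses have been read from a measured contrast curve, the contrast definition above reduces to a one-line calculation. The sketch below assumes that D1 and D2 are the two doses marked in Fig. 16; the function name and the numerical doses are hypothetical and serve only as an illustration.

```python
import math


def contrast(d1, d2):
    """Resist contrast gamma = 1 / |log10(D2 / D1)|.

    d1 and d2 are the two doses read from the contrast curve
    (any consistent unit, e.g. uC/cm^2); they must be positive
    and different from each other.
    """
    if d1 <= 0 or d2 <= 0:
        raise ValueError("doses must be positive")
    if d1 == d2:
        raise ValueError("doses must differ; contrast would be infinite")
    return 1.0 / abs(math.log10(d2 / d1))


if __name__ == "__main__":
    # Hypothetical doses: a factor of two between D1 and D2.
    print(f"gamma = {contrast(50.0, 100.0):.2f}")
    # A narrower dose window (factor of 1.25) gives a higher contrast.
    print(f"gamma = {contrast(80.0, 100.0):.2f}")
```

Because of the absolute value, the same expression serves for positive and negative resists: the order in which the two doses are supplied does not change the result.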
Figure 16. Examples of resist contrast curves.
The definition of resolution is somewhat subjective due to the dependence on exposure conditions, development conditions, resist thickness, and substrate composition (atomic number), as well as the need to meet requirements of critical dimension control, sidewall profile, process latitude, and, ultimately, the required yield.[58] A practical definition of resolution is the minimum line/space period that can be resolved or the radius of an inside corner of a pattern feature.[59] Resolution may be improved by choosing the thinnest resist compatible with pattern transfer,
allowing forward scattering and development times to be minimized. Resolution may also be improved by increasing exposure energy, which can decrease the backscattered electron dose peak below the threshold for development and further reduce forward scattering.
Resist viscosity may be varied by altering the ratio of solid to liquid. The ability to alter viscosity is useful for achieving different ranges of film thickness versus coating spin speed, while keeping the spin speed within a preferred range of 4000 to 6000 rpm for thickness uniformity. In some cases, it is possible to vary the type of solvent used to cast the resist. For multiple resist layers, it is preferable to use a solvent for the layer to be applied that is a very weak solvent for the layer to be covered, in order to minimize interlayer dissolution.
Process latitude concerns the control of the critical dimension (CD) as a function of parameter variations. One measure of process latitude is the slope of the linewidth versus dose curve. A steep slope implies greater sensitivity to exposure variation and reduction of CD control. Process latitude considerations ultimately must encompass all parameters whose variation leads to a variation in CD control. If process latitude is to be quantified by linewidth versus dose slope, then a family of curves must be considered for each significant parameter, as determined from a parameter screening. Some typically sensitive parameters are resist age, developer strength, development environment (temperature and humidity), development method (dip versus puddle versus spray), and post-bake and post-exposure latency time and environment (for example, atmosphere versus vacuum).
4.3 Positive Electron-Beam Resists
PMMA and P(MMA-MAA). PMMA has remained a useful resist since its introduction into electron-beam lithography in 1968.[60] It has extremely high resolution (below 10 nm), excellent shelf life, and does not suffer from swelling or post-exposure latency effects. Its biggest drawbacks are its poor sensitivity, poor dry etch resistance, and poor thermal stability. Historically cast in chlorobenzene, it is now commercially available cast in safer anisole solvent.[61] It is available in molecular weights from 50 k to 1 M, with sensitivity decreasing slightly with increasing molecular weight. PMMA is also sensitive to exposure by deep ultraviolet light and x-rays, permitting hybrid exposure strategies.[62] A hybrid exposure exposes the finest features of a given lithography level by electron beam with coarser features of the same lithography level exposed
separately by a high-throughput optical or x-ray technique, all in the same resist and with a common development.
The primary exposure mechanism for PMMA is scission of the polymer chain. Scission in the exposed region reduces the molecular weight, with the lighter weight fragments selectively removed by developer. Commonly used developers are 1:3 methyl isobutyl ketone (MIBK):isopropanol (IPA) for highest contrast and 1:1 MIBK:IPA for highest sensitivity.[63] Useful area dose values are 100, 250, and 500 µC/cm² at 20, 50, and 100 kV, respectively. At roughly an order of magnitude higher dose, PMMA will crosslink and exhibit negative tone, an effect commonly observed where alignment marks are scanned for registration. This negative behavior of PMMA can be exploited to realize resolution comparable to that demonstrated in positive tone.[64] The crosslinked PMMA can persist as a source of organic contamination if an aggressive stripping method such as oxygen plasma cleaning is not used. The poor thermal stability of PMMA is problematic for electron-beam inspection of developed images, where the resist deforms rapidly under observation. In addition, elevated temperatures during material deposition onto patterned PMMA can result in CD variation and pattern distortion.
P(MMA-MAA) provides a three- to four-fold improvement in sensitivity relative to PMMA and has improved image thermal stability to 160°C. P(MMA-MAA) is readily developed in the MIBK:IPA mixtures used for PMMA, permitting single-step development of mixed multiple layers of PMMA and P(MMA-MAA). Historically cast in ethylene glycol monoethyl ether acetate (EGMEA), P(MMA-MAA) is now commercially available cast in safer ethyl lactate solvent.[61]
Poly(Butene-1-Sulfone). Poly(butene-1-sulfone) (PBS) has high sensitivity suitable for high-speed patterning, from less than 1 µC/cm² at 10 kV to 1–2 µC/cm² at 20 to 25 kV.
Because of its speed, it has been a staple of United States mask shops for high-throughput photomask production. However, it has very limited process latitude, with high susceptibility to proximity effects, feature-size-dependent swelling, dependence on development fluid dynamics, and extreme sensitivity to temperature and humidity during development, which often necessitates iterative development.[65] Plasma etch resistance is also poor, unless special measures are taken to cool the substrate and minimize oxygen content in the etch gas.[66] The most common development of PBS is by spray application of 5-methyl-2-hexanone/2-pentanone (MIAK/MPK) solutions ranging from 60 to 90% MIAK concentration. Evaluations of PBS
indicate that additional process control may extend its suitability to 0.25 µm design rule mask fabrication for 256 Mb DRAM production.[67][68]
ZEP. ZEP is a copolymer of chloromethacrylate and methylstyrene that provides high resolution comparable to PMMA, but with improved speed and dry etch resistance. Area dose requirements are approximately 40 µC/cm² at 30 kV and 60 µC/cm² at 50 kV. Development is accomplished with xylene or methyl ethyl ketone (MEK):MIBK. Some problems with cracking at feature corners of thick (greater than 1.5 µm) films are noted, but thinner films show no problems using 100% xylene development. ZEP has been used successfully to pattern 240 nm pitch gratings with pattern transfer, using CH4/H2 reactive ion etching for InP-based distributed-feedback laser arrays.[69]
EBR-9. EBR-9 is a copolymer of trifluoroethyl α-chloroacrylate and tetrafluoropropyl α-chloroacrylate that provides sensitivity and speed comparable to PBS, but with greater process latitude.[70] The hydrophobic nature of the baked EBR-9 film and the high glass transition temperature provide excellent coated film stability, with minimal variation of exposure sensitivity as a function of time after coating for up to six months and minimal post-exposure latency effects. Sensitivity to humidity variation during development and variation of linewidth as a function of development time are both about half the values for PBS. Dry etch resistance is poor for the high-speed formulations, comparable to or slightly worse than that of PMMA, and better than that of PBS. EBR-9 has been optimized for development in pure MIBK with minimum swelling and has a demonstrated resolution of 0.1 µm. Exposure sensitivity can be increased by more rapid cooling following prebake, with some deterioration of developed profile. The addition of water to the MIBK developer (up to a few percent by weight) also improves sensitivity.
EBR-9 may be post-develop baked up to 140°C to promote adhesion when used with wet chemical etching.[71]
UV Photoresist as a Positive Electron-Beam Resist. Conventional diazoquinone sensitizer/novolac resin ultraviolet photoresists can be employed as both positive-tone and negative-tone electron-beam resists that provide superior dry etch resistance relative to PMMA, P(MMA-MAA), PBS, and EBR-9.[72][73] The electron-beam exposure occurs in a vacuum and does not result in the water-dependent conversion of the photoactive compound to a developer-soluble product, as for the optical exposure for which photoresists are designed. Instead, the electron-beam exposure interacts with the sensitizer and the resin. At low doses, the sensitizer is broken down, increasing the dissolution rate and providing positive action. The electron-beam exposure will also cause cross-linking. At high enough
doses this effect will outweigh the sensitizer destruction, leading to negative behavior. For example, KTI-brand 895i-series resist[74] (a conventional i-line photoresist) will change from positive to negative acting at a dose of 35 µC/cm² at 20 kV, without introducing any additional steps such as flood exposure.[75] KTI 895i has been used to demonstrate 64 Mb DRAM pattern direct writing at a 20 kV sensitivity of 8 µC/cm².[76]
The advantages of using photoresists as electron-beam resists include well-documented compatibility of novolac polymers with IC manufacturing, aqueous development, availability of dry etch behavior data from published photoresist applications, volume-driven industry concern for meeting the highest filtration and purity requirements, and overall reduced toxicity. A large and constantly changing variety of resists optimized for UV exposure and suitable for electron-beam exposure is available from such manufacturers as OCG, Hoechst Celanese, Japan Synthetic Rubber, Shipley, and Toray.[77]
Chemically amplified resists have been developed for high sensitivity, high contrast, and high resolution in both negative and positive tones. Positive-acting chemically amplified resists work using a photobase generator that generates a base compound upon irradiation by the electron beam. An exposure-independent thermal acid generator creates an acid compound in all regions of the resist. The acid is neutralized by the base compound in the exposed regions and assists in crosslinking in the unexposed regions. A soluble salt is formed in the exposed region that is washed away by the developer, resulting in a positive-tone exposure.
4.4 Multiple-Layer Resist Strategies
Multiple-layer resist strategies allow the combination of materials with different properties to be used to simultaneously achieve multiple effects, such as high resolution and planarization, high resolution and undercut, or combinations of low and high resolution for special cross-sections. The strategy for all multiple-layer combinations involves the selection of materials with compatible exposure, chemistry, and process requirements. In all cases, each coated layer is baked before successive coatings to minimize intermixing. Consideration should be given to the bake temperature sequence, which ideally will progress from the highest to the lowest temperature for successive layers.
The relative sensitivity differential for PMMA of differing molecular weights can be used to achieve a natural undercut suitable for liftoff. Figure 17 shows a cross-section of a bilayer of 950k molecular weight PMMA over 496k molecular weight PMMA developed in 1:3 MIBK:IPA, with a
controllable undercut suitable for resolving closely spaced features. The upper layer PMMA was cast in MIBK, a very weak PMMA solvent, to minimize dissolution and intermixing with the lower level during coating.
Figure 17. High/low-molecular-weight PMMA bilayer resist cross-section.
When P(MMA-MAA) is used in a bilayer system underneath PMMA, the differential sensitivities realized with a single-step development lead to a natural undercut. This undercut is larger than that using PMMA of differing molecular weights, and is also suitable for liftoff. The fact that P(MMA-MAA) is nearly insoluble in common nonpolar solvents for PMMA makes it ideally suited for multiple-layer applications using selective developers.[78] Still larger undercut is possible using either PMMA over P(MMA-MAA) or P(MMA-MAA) over PMMA with separate selective solvents (see Fig. 18). The use of P(MMA-MAA) as the top layer in multiple-layer systems can improve thermal stability during material deposition for improved CD control.
Figure 18. P(MMA-MAA) over PMMA, developed selectively using ethyl-cellosolve acetate:ethanol followed by chlorobenzene.
Electron-beam resist can be combined with nonimaging layers, relying upon reactive ion etching for image transfer from the resist. A metal-on-polymer process has been used as an alternative to negative resist for the patterning of mesh structures.[79] In this process, nonimaging polyimide is applied first and baked at a temperature sufficient to render it essentially insoluble in the solvents to be used for subsequent PMMA processing steps. PMMA is then applied, exposed, developed, and used to lift off a positive-tone metal pattern on top of the polyimide. The metal pattern is then used as a mask for oxygen reactive ion etching, undercutting the metal and providing an excellent liftoff profile for deposition of material on the substrate. The metal-on-polymer and excess deposited material are removed by liftoff in methylene chloride.
A trilayer process using a nonimaging polymer layer and an intermediate inorganic pattern transfer layer has demonstrated 25 nm features on a thick Si substrate while simultaneously providing planarization capability.[80] In this strategy, thin PMMA is applied over thin Ge that has been evaporated on a thick, planarizing, nonimaging polymer layer. The PMMA provides high resolution owing to its limited thickness and its distance from the substrate and the associated backscattering. The PMMA is used to pattern the Ge with CF4 reactive ion etching; the patterned Ge then acts as a high-selectivity mask for O2 reactive ion etching of the polymer, providing an ideal liftoff profile with high aspect ratio.
A specific niche application of multiple-layer resist has emerged for the electron-beam direct writing of microwave/millimeter-wave field-effect transistor gates with "T" and "Γ" cross-sections.[81] In the liftoff process, the deposition of evaporated material on the resist sidewall creates a self-closing effect that limits the aspect ratio of narrow lines in single-layer resist (see Fig. 19).
The desired gate structure is an evaporated metal gate with a small footprint to realize short electrical length and a large top to minimize resistance at high frequencies, as shown in Fig. 20. The simplest way to achieve this cross-section is a bilayer approach, with a low-sensitivity, high-resolution lower layer and a high-sensitivity, lower-resolution upper layer, such as P(MMA-MAA) over PMMA. The sensitivity/resolution combinations lead to development of a profile with a narrow bottom opening and a wider top opening. Metal deposited in this opening forms the desired T cross-section. Bilayer systems can produce excellent "T" and "Γ" cross-section gates ("T" gates and gamma gates, respectively) but require careful dose control for large-area features; in addition, the required liftoff profile of the upper layer can be lost due to intraproximity overdosing, as shown in Fig. 21.
Figure 19. Impact of resist self-closing effect during evaporation, showing height limitation for narrow line in single-layer resist. A Ti/Pt/Au transistor gate in an etched recess is shown crossing the mesa edge.
Figure 20. Cross-section of InP-based microwave transistor and low-resistance “gamma” gate formed using PMMA/P(MMA-MAA)/PMMA multilayer resist.
Figure 21. Cross-section of large pad, showing loss of bilayer resist undercut due to overdosing.
The addition of an upper layer of lower sensitivity resist on top of the bilayer results in a trilayer system of PMMA/P(MMA-MAA)/PMMA that maintains a liftoff profile for large-area features over a wider range of dose.[82] The top layer will develop out to a larger opening than the bottom layer, even if the same resist material is used in both layers, since the top layer is made thinner than the bottom layer and is subject to additional development while the developer penetrates the intermediate layer. Additional cross-sectional sculpting by partial exposure makes it possible to generate wider or asymmetric top openings. In this approach, a primary exposure defines the footprint with a dose sufficient to clear the substrate, while side exposures of a lower dose open down through only the top two layers (see Fig. 22). This approach is particularly useful for higher energy exposures (≥ 50 kV), where the top opening for a single exposure tends to be too narrow.
As with any multiple-layer system, inorganic barrier layers can be introduced to provide compatibility or isolation of layers. Insertion of a Ge layer above the bottom PMMA layer in a PMMA/P(MMA-MAA)/PMMA trilayer system allows for completely separated development of the critical lower level from the wider upper level, resulting in improved process latitude.[83] The Ge layer is readily removed in a potassium iodide/iodine solution that is compatible with the resist layers.
Figure 22. Sidebeam exposure technique used in multilayer resist T- and Γ-gate formation.
4.5 Negative Electron-Beam Resists
Shipley Advanced Lithography (SAL). The Shipley Advanced Lithography (SAL) product line consists of chemically amplified, triple-component resists. The resist components are novolak resin, crosslinker, and photo-acid generator. Electron-beam interaction with the photo-acid generator during exposure creates an acid that catalyzes a reaction with the crosslinker upon post-exposure baking (so-called acid-hardening), crosslinking the resist in the region of exposure to provide a negative tone. The post-exposure bake temperature is a critical parameter. Minimal pattern webbing is achieved with post-exposure bake temperatures of 90–95°C and with minimal latency time between exposure and post-exposure bake. Development is via standard aqueous developers available
for development of optical resists. Since the base polymer is a novolak resin, the associated benefits noted in the subsection on UV photoresist above also apply to SAL products. SAL provides high contrast, resolution of at least 0.1 µm, and excellent process latitude, accommodating significant overdevelopment with minimal variation in linewidth. The process latitude of SAL has been investigated for the fabrication of photomasks for 0.25 µm design rules. With proper control of post-exposure bake conditions, SAL-601 has demonstrated CD variation of ±40 nm and CD uniformity of ±16 nm (3σ). It provides suitable dry-etch resistance for the etching of chrome by Cl2/O2 reactive ion etching. Together, these properties make SAL-601 suitable for use in 0.25 µm mask fabrication with sufficient control for implementing the detailed figures required for optical proximity correction.[84]
CMS. CMS is a partially chloromethylated polystyrene resist that exploits the high sensitivity of the chloromethyl group to electron-beam, x-ray, and deep ultraviolet (DUV) radiation. CMS possesses modest resolution (0.2 µm for isolated lines) at modest sensitivity (16 µC/cm2 at 20 kV) but is capable of 1.1 µC/cm2 at 20 kV for formulations with poorer resolution. It has excellent shelf life in the bottle, up to three months stability for coated films before exposure, and at least seven days stability after exposure with no film swelling. It provides a CCl4 reactive ion etching selectivity of 5 over Al.[85]
UV and DUV Photoresists as Negative Electron-Beam Resists. As previously mentioned in the discussion of UV photoresist for positive electron-beam exposure, the ability to crosslink these materials allows their use in negative tone. This is accomplished by first exposing and crosslinking the pattern areas with the electron beam. A UV flood exposure then renders the non-electron-beam-exposed, noncrosslinked areas soluble in a weak developer that does not remove the crosslinked region, resulting in negative tone.
The transition from UV to DUV chemically amplified resists includes the replacement of novolak polymer with the common DUV poly-4-hydroxystyrene (PHS) polymer matrix, ARCH3, manufactured by OCG. As with novolak-based photoresists, these DUV PHS-based resists are also electron sensitive and have relatively higher glass transition temperatures, providing increased thermal stability. In the case of DX-1179 from Hoechst Celanese, the glass transition temperature is seen to increase even further over novolak-based materials through flood electron-beam irradiation at doses of 500 to 2000 µC/cm2.[86]
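The resist sensitivities quoted in this section translate directly into beam-on time through the relation t = D·A/I (dose times written area divided by beam current). The following is a minimal sketch of that arithmetic, not a model of any particular exposure tool. The doses are those quoted in the text, while the 1 cm2 written area and 10 nA beam current are assumed values chosen only for illustration, and stage, settling, and data-transfer overheads are ignored.

```python
def exposure_time_s(dose_uC_per_cm2, area_cm2, beam_current_nA):
    """Ideal beam-on time in seconds from t = D * A / I (no overheads)."""
    charge_C = dose_uC_per_cm2 * 1e-6 * area_cm2   # total charge to deliver
    current_A = beam_current_nA * 1e-9             # beam current in amperes
    return charge_C / current_A


# Doses quoted in the text (µC/cm2).
QUOTED_DOSES = {
    "CMS, high-sensitivity formulation (20 kV)": 1.1,
    "CMS, 0.2 µm resolution (20 kV)": 16.0,
    "DX-1179 flood irradiation, low end": 500.0,
    "DX-1179 flood irradiation, high end": 2000.0,
}

# Assumed values for illustration only; they do not come from the chapter.
ASSUMED_AREA_CM2 = 1.0
ASSUMED_CURRENT_NA = 10.0

for label, dose in QUOTED_DOSES.items():
    t = exposure_time_s(dose, ASSUMED_AREA_CM2, ASSUMED_CURRENT_NA)
    print(f"{label}: {t:.0f} s")
```

Because the time scales linearly with dose, the factor of roughly 15 between the two CMS sensitivities appears unchanged in the writing time, which is the practical cost of choosing the higher-resolution formulation.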
4.6 Conductive Overlayers
A unique requirement of charged-particle lithography relative to optical lithography is the need to dissipate surface charge during exposure to prevent pattern distortion. For conductive substrates (as with Cr-coated photomasks), the conductivity of the substrate beneath the resist is usually a sufficient discharge mechanism. For highly insulating substrates such as quartz, lithium niobate, or glass, a discharge path is obtained by using an organic or inorganic overcoat above conventional insulating resists. At higher energies, thin (100–150 Å) layers of Au, Cr, or Al deposited on the resist surface provide sufficient conductivity.[87] The limited thickness and higher beam energy (above 10 kV) result in minimal beam broadening from forward scattering, with some reduction of the backscattered electron signal used for alignment mark acquisition. Deposition processes for these materials add process complexity and must not contribute significant unintentional exposure energy, such as from stray electrons from electron-beam evaporation or radiation from sputter system plasmas. The thin metal layers are removed before development using commercially available wet chemical etchants.
Conducting organic layers provide for charge dissipation using standard resist spin-coating methods and equipment, greatly simplifying substrate preparation relative to metal overlayers. Conductive polyaniline films are radiation sensitive and can be used directly for both imaging and charge dissipation.[88] The conducting varnish TQV can be applied directly over resists such as PMMA and P(MMA-MAA), but the cyclohexanone in TQV will dissolve novolak resin-based resists such as SAL. When a novolak resin is to be overcoated, a layer of polyvinyl alcohol (PVA) provides a barrier both to intermixing of TQV and novolak resist during coating and to dissolution of the novolak resist during the solvent strip of the TQV before development.
The TQV layer is readily removed in MIBK and IPA and thus requires no additional processing with PMMA and P(MMA-MAA), only added development time for TQV dissolution. With novolak resists, MIBK:IPA is used to first strip the TQV, then the water-soluble PVA barrier layer is removed by either a separate water rinse or by standard aqueous developer. Where post-exposure baking is required, as with SAL, the TQV and PVA must first be stripped.[89] Materials such as Showa Denko’s ESPACER100 and IBM’s poly[3-(ethanesulfonate)thiophene] are water soluble and thus simplify coating compatibility and removal considerations.[90]
4.7 Inorganic Resists and Self-Assembled Monolayers
Technological development of thin resist layers is driven by a general need to scale lithographic features for nanometer-scale device fabrication and by interest in limited-penetration, low-voltage electron exposure. In PMMA, the secondary electron range is thought to limit resolution to approximately 10 nm even for smaller electron beam sizes, but with other materials, such as Al2O3, patterns of 1 nm in the imaging layer are possible.[91] A large number of inorganic materials may be patterned at the nanometer level using high beam energy, including AlF3, Al2O3, C60, MgF2, NaCl, SiO2, and Langmuir-Blodgett films.[92-95] Most of these materials require doses four orders of magnitude higher than that required for PMMA and are ablated directly. Realization of nanometer-scale device geometries from these thin films is limited by difficulties of pattern transfer, as well as by film defects. Self-assembled monolayers (SAMs) are under investigation as alternatives to these inorganic materials that can provide thin layers (~1–2 nm) with sensitivities comparable to PMMA at higher voltages.[96] Conventional polymer materials are difficult to apply as continuous defect-free films of thickness less than 50 nm and are thus limited in their application to low-voltage, low-penetration-depth exposures. SAM materials are of interest for their potential use at the low exposure energies used in low-proximity-effect exposures and scanning probe lithography.
5.0 COMPETING TECHNOLOGIES
The future requirements of ULSI lithography as predicted by Moore’s Law are challenging, with each three-year generation resulting in a doubling of chip complexity and a scaling of feature size by a factor of 1/√2.[97] The primary technologies foreseen to be competitive for the next several generations of high-throughput lithography are optical reduction projection step-and-repeat (i.e., steppers), projection electron-beam lithography (for example, SCALPEL), high-throughput electron beam (for example, cell/mini-reticle projection), proximity x-ray, reduction projection ion-beam, and extreme ultraviolet (EUV) (formerly known as projection x-ray). The distinction between projection electron beam and cell projection approaches is made because of the radically different mask/aperture requirements and exposure strategies as described in the preceding sections.
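The 1/√2 scaling per generation can be checked against the design rules discussed in this section. The sketch below simply applies the factor repeatedly, starting from the 0.35 µm design rule mentioned in the text; the function name and the choice of starting point are illustrative assumptions.

```python
import math


def feature_sizes(start_um, generations):
    """Feature size after each generation, shrinking by 1/sqrt(2) each time."""
    shrink = 1.0 / math.sqrt(2.0)
    return [start_um * shrink ** n for n in range(generations + 1)]


YEARS_PER_GENERATION = 3  # three-year generation, as stated in the text

for n, size in enumerate(feature_sizes(0.35, 4)):
    print(f"generation {n} (+{n * YEARS_PER_GENERATION} yr): {size:.3f} µm")
```

The computed sequence (about 0.35, 0.25, 0.18, 0.12, and 0.09 µm) tracks the 0.35, 0.25, 0.18, 0.13, and 0.1 µm design rules discussed in this section; every two generations halve the feature size.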
An evaluation of competing lithographic technologies may be organized according to intended application. The major application areas considered here are mask/reticle generation and high-volume DRAM patterning. Historically a two-generation lead time is required for prototype lithographic technology to meet a manufacturing generational insertion point, including the buildup of supporting infrastructure. Since the driving force for the implementation of new lithography techniques is fabrication of DRAMs, a high-volume and low-margin market, it is possible that economic considerations and not technological capability will steer the eventual path.
For mask/reticle generation, the predominating technology is electron beam, with competition emerging from scanned laser systems. High throughput in scanned laser systems is possible through multiple scanned laser-beam channels. Multiple electron-beam approaches are under development, primarily as multi-micro-column approaches. These approaches are less mature than laser systems and need significant development of resist and process transfer infrastructures to deal with low accelerating voltages and associated thin resists. Laser exposure systems operate at atmospheric pressure, allowing for backside vacuum mask fixturing for reduced plate clamping distortion and loading or unloading of substrates without vacuum pumpdown overhead for improved throughput over electron-beam systems.[98] The latest laser systems are designed for 0.25 µm design-rule 4X mask/reticle generation to support 256 Mb DRAM fabrication, with address grids as small as 5 nm to accommodate phase shift and optical proximity correction mask details.[99] Electron beam is foreseen to be the only viable technology to realize the 1X mask patterning requirements for proximity x-ray lithography.
For wafer patterning, the momentum of the incumbent, multiple-generational optical stepper continues to drive this technology toward further sophistication in exposure tools (for example, off-axis illumination) and mask implementation (for example, phase-shifting and optical proximity correction). Steppers currently provide the highest throughput for this application (fifty 200-mm wafers/hour at 0.25 µm design rules) due to the highly parallel exposure. At 0.25 µm design rules for 256 Mb DRAM patterning, the emerging technology path is evolutionary, targeting 248 nm DUV steppers, which are already in partial implementation for 0.35 µm design rules. For 0.18 µm design rules for 1 Gb DRAM patterning, industry speculation abounds as to the extendibility of stepper optics. Many technical challenges exist for lens and resist materials development for the
required 193 nm operating wavelength. Process latitude is also reduced with subsequent optical generations, largely due to limitations in depth of focus. Electron-beam technology can accommodate local height variations due to its small beam convergence angle and can accommodate global average substrate height variations using substrate height measurement. The tremendous investment cost for 193 nm DUV stepper technology may ultimately be limited to one generation because of the formidable challenges in its application to sub-wavelength patterning of 0.13 µm design rule 4 Gb DRAMs. This possible one-generation cost is part of the argument for insertion of proximity x-ray for 0.18 µm generation production since its cost can be amortized over succeeding generations. At present, the path to 0.13 µm and smaller high-throughput lithography is unclear.[100]
X-ray proximity printing uses a 1X mask held in close proximity (10–20 µm) to the wafer.[101] High-throughput configurations use a synchrotron x-ray source that generates a narrow rectangular beam that is scanned across the mask for a step-and-scan exposure strategy. The lack of mask demagnification in x-ray proximity printing provides a significant mask fabrication challenge relative to 4X-reduction optical steppers. The exposure wavelength (10–15 Å) and proximity mask provide excellent depth of focus, large process latitude, and simplified resist processing. Current industry projections are for proximity x-ray insertion at 0.1 µm design rules, or at 0.13 µm design rules if 0.13 µm optical approaches are not feasible. Key issues are pattern placement accuracy and feature size control. Additional considerations are needed for the extremely large data handling (pattern generator speed) and data processing (for example, proximity correction) requirements for mask patterning, as well as defect reduction through reduced contamination.
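The wavelength and gap values quoted for x-ray proximity printing can be related to the projected 0.1 µm insertion point with the common Fresnel-diffraction rule of thumb W ≈ k√(λg), where g is the mask-to-wafer gap. This relation is not given in the chapter; it is introduced here as an assumption, and the factor k (taken as 1.0 below) is process dependent. The sketch is only an order-of-magnitude check.

```python
import math


def proximity_resolution_um(wavelength_nm, gap_um, k=1.0):
    """Rule-of-thumb minimum feature W = k * sqrt(lambda * g), in micrometers.

    k is an assumed, process-dependent factor; it is not taken from the text.
    """
    wavelength_um = wavelength_nm * 1e-3  # nm -> µm
    return k * math.sqrt(wavelength_um * gap_um)


# Wavelengths (10-15 Å = 1.0-1.5 nm) and gaps (10-20 µm) quoted in the text.
for wavelength_nm in (1.0, 1.5):
    for gap_um in (10.0, 20.0):
        w = proximity_resolution_um(wavelength_nm, gap_um)
        print(f"lambda = {wavelength_nm} nm, gap = {gap_um} µm: W ~ {w:.2f} µm")
```

Under these assumptions the shortest wavelength and smallest gap give about 0.1 µm, consistent with the projected insertion point, and the estimate degrades toward 0.17 µm at the long-wavelength, large-gap end of the quoted ranges.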
There are no apparent technological barriers for insertion of proximity x-ray lithography, with complex Si device production demonstrated by NTT, IBM, and Mitsubishi. A large investment in development of the infrastructure components has been made in the U.S. and Japan. The nature of the equipment (most likely based on a superconducting synchrotron storage ring as an x-ray source supplying 10–20 steppers) will require radical changes in the layout of clean rooms due to space and safety requirements.
Electron-beam approaches to high-volume direct-writing of wafers do not face resolution limits for the next several generations of lithography. The key factors limiting electron-beam implementation are throughput, placement accuracy, and CD control. Electron-beam cell/mini-reticle projection systems can provide limited-volume prototyping capability for 0.25
and 0.18 µm generations, but throughput advantages of steppers will limit large-scale implementation. Cell/mini-reticle projection systems have relatively simple aperture shape requirements that are simplified by electron-optical reduction, but complicated by the need for thick stencils. Mask requirements for projection electron-beam systems are also simplified due to image reduction, but complicated due to membrane processing, although the membrane requirements are significantly less than for proximity x-ray. Multiple-beam approaches are currently very immature, but hold promise for high throughput and high resolution without requirements for intermediate masks. It is noted that high-throughput electron-beam cell projection and multiple-column systems are key items in a list of research projects for Japan’s Association for Super-Advanced Electronics Technologies (ASET).[102]
Ion-beam lithography has inherent advantages over electron beams due to the higher mass of its charged particles, realizing higher resist sensitivity and significantly less scattering and related proximity effects. This mass also proves to be a disadvantage, as ion damage effects can appear electrically in devices and optically in masks, introducing unintentional phase shift. Mainstream interest in focused ion beam systems is currently limited to mask repair and destructive feature inspection by cross-sectional milling. Reduction projection ion-beam lithography has been investigated as a highly parallel wafer exposure technology, but there are currently no industrial champions for this technology in the U.S. or Japan. Only minor investments have been made relative to the infrastructure of optical or projection x-ray technologies. Although reduction reduces mask resolution requirements, a stencil mask is required.
This mask poses challenges in fabrication and use, with issues such as fabrication and illumination-induced distortion, lifetime, and cleaning/damage due to the lack of a protective pellicle. As with all stencil-mask techniques, complementary masks are required for closed annular features, reducing throughput.
EUV lithography uses reflective optics and radiation in the relatively unexplored region between the lower limit of DUV excimer laser sources and the extremely short wavelengths of synchrotron sources (~13 nm).[103] Although there is design simplification relative to conventional optical steppers and step-and-scan systems due to the reduced wavelength, many challenges exist in all areas of EUV implementation. Reflectivity at EUV wavelengths requires complicated multiple-layer substrates for masks and lenses, with difficult tolerances for all optical components (as low as λ/1000 for a full-field EUV system).[104] Low reflectivity of these
components limits available exposure power at the wafer, limiting throughput. The limited penetration depth of EUV radiation requires novel and complex resist approaches.
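The effect of component reflectivity on exposure power compounds with every reflective surface in the optical train: if each surface reflects a fraction R of the incident power, the fraction reaching the wafer after N surfaces is R^N. The sketch below illustrates this with purely hypothetical values of R and N; the chapter quotes neither a reflectivity nor a surface count, and absorption elsewhere in the system is ignored.

```python
def transmitted_fraction(reflectivity, n_surfaces):
    """Fraction of source power remaining after n identical reflections."""
    if not 0.0 <= reflectivity <= 1.0:
        raise ValueError("reflectivity must lie between 0 and 1")
    if n_surfaces < 0:
        raise ValueError("n_surfaces must be non-negative")
    return reflectivity ** n_surfaces


# Hypothetical per-surface reflectivities and surface counts (mask plus mirrors).
for reflectivity in (0.60, 0.65, 0.70):
    for n_surfaces in (5, 7, 9):
        fraction = transmitted_fraction(reflectivity, n_surfaces)
        print(f"R = {reflectivity:.2f}, N = {n_surfaces}: "
              f"{100.0 * fraction:.1f}% of source power at the wafer")
```

With these assumed numbers only a few percent of the source power survives, which shows why modest gains in multilayer reflectivity matter disproportionately for throughput.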
6.0 ACKNOWLEDGEMENTS
The assistance of the following people is acknowledged in the preparation and acquisition of data included in this chapter: John Helbert, Rich Tiberio, Woody Windsor, Ron Thompson, Steve Brown, Bruce Geil, Louis Poli, Christine Kondek, John Pong, and Charles Cook. Special thanks go to Dr. Joanne Stellato for data-gathering efforts on activities in Japan.
REFERENCES
1. Brewer, G., Electron-Beam Technology in Microelectronic Fabrication, (G. Brewer, ed.), pp. 7–16, Academic Press, New York (1980)
2. Column design is diffraction-limited, with an optimal convergence semiangle located for the minimum beam diameter at the crossover between diffraction-limited and aberration-limited behavior.
3. Kerber, T., and Koops, H., Microelectronic Engineering, 21(1):275–278 (1993)
4. Hatzakis, M., IBM J. Res. Develop., 32(4):441–453 (1988)
5. Greeneich, Electron-Beam Technology in Microelectronic Fabrication, (G. Brewer, ed.), pp. 64–74, Academic Press, New York (1980)
6. Broers, A., J. Electrochem. Soc.: Solid State Sci. & Technol., 128:166 (1981)
7. Dobisz, E., Marrian, C., Salvino, R., Ancona, M., Rhee, K., and Peckerar, M., SPIE Optical Engineering, 32(10):2452–2458 (1993)
8. Chang, T., Kern, D., Kratschmer, E., Lee, K., Luhn, H., McCord, M., Rishton, S., and Vladimirsky, Y., IBM J. Res. Develop., 32(4):464–466 (1988)
9. Broers, A., IBM J. Res. Develop., 32(4):504–506 (1988)
10. Smith, D., SPIE Dry Processing for Submicrometer Lithography, 1185:278–283 (1989)
11. Houli, B., Umansky, V., and Heiblum, M., Semiconductor Science & Technology, 8:1490–1492 (1993)
12. Otto, O., and Griffith, A., J. Vac. Sci. Technol. B, 6(1) (Jan/Feb 1988)
13. Marrian, C. R. K., Chang, S., and Peckerar, M. C., Optical Engineering, 35:2685–92 (1996)
14. Harafuji, K., Misaka, A., and Nomura, N., IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems, 12:1508–1514 (1993)
15. Yamada, Y., Tamura, T., Nakajima, K., and Nozue, H., Proceedings of SPIE: Emerging Lithographic Technologies, 3048:63–68 (1997)
16. Owen, G., and Rissman, P., J. Applied Phys., 54:3573 (1983)
17. Watson, G., Berger, S., and Liddle, J., J. Vac. Sci. Technol. B, 13(6):2504–2507 (1995)
18. Bass, J., and Butler, M., Proceedings of the SPIE Symposium on Microlithography, Electron-Beam, X-Ray, and Ion-Beam Technology: Submicrometer Lithographies IX, 156 (1990)
19. Herriot, D., and Brewer, G., Electron-Beam Technology in Microelectronic Fabrication, (G. Brewer, ed.), pp. 141–216, Academic Press, New York (1980)
20. Hohn, F., SPIE Symposium on Microlithography, Santa Clara, CA, 28, Sec. T6 (Feb 1988)
21. Roelofs, T., and Meuwissen, A., SPIE Electron-Beam, X-Ray, and Ion-Beam Technology for Submicron Lithographies VII, Santa Clara, CA, pp. 274–280 (1988)
22. Chisholm, T., Wallman, B., and Romijn, J., Proceedings of SPIE, 2522:31–42 (1995)
23. Gesley, M., Abboud, F., Colby, D., Raymond, F., and Watson, S., Japanese Journal of Applied Physics, 32(12B):5993–6005 (1993)
24. Thompson, M., Liu, R., Collier, R., Carrol, H., Doherty, E., and Murray, R., J. Vac. Sci. Technol. B, 5(1):53–56 (1987)
25. Herriot, D., and Brewer, G., Electron-Beam Technology in Microelectronic Fabrication, (G. Brewer, ed.), p. 160, Academic Press, New York (1980)
26. Polasko, K., Yau, Y., and Pease, F., Proceedings of the SPIE: Submicron Lithography, 333:76–82 (1982)
27. Ogata, S., Tada, M., and Yoneda, M., Applied Optics, 33:2032–2038 (1994)
28. Sohda, Y., Someda, Y., Saitou, N., and Itoh, H., J. Vac. Sci. Technol., 13(6):2419–2423 (1995)
29. Takahashi, K., Yamazaki, S., Ohno, M., Watanabe, H., Sakakibara, T., Sato, M., Nagata, T., Yamada, A., Yasuda, H., Nara, Y., and Sasaki, N., Proceedings of SPIE: Emerging Lithographic Technologies, 3048:44–53 (1997)
30. Pattern generator systems available from: Raith USA, 6 Beech Rd., Islip, NY 11751, 516 224-1764, [email protected] and J. C. Nabity Lithography Systems, Box 5354, Bozeman, MT 59717, phone (406) 587-0848, [email protected]
31. Leica Lithography Systems Ltd., Clifton Rd., Cambridge, CB1 3QH, UK, phone (44) 1223 411123, [email protected]
32. JEOL Semiconductor Electronics Sales Division, Mr. Shihei Imai, General Manager, phone (030)284–1432
33. Pearce-Percy, H., Prior, R., Abboud, F., Benveniste, A., Gasiorek, L., Lubin, M., and Raymond, F., J. Vac. Sci. Technol. B, 12(6):3393–3398 (1994)
34. The highest level of deflection is not telecentric, but its limited scan range of 2 µm provides a very small error.
35. Alles, D., Biddick, C., Bruning, J., Clemens, J., Collier, R., Gere, E., Harriot, L., Leone, F., Liu, R., Mulrooney, T., Nielsen, R., Paras, N., Richman, R., Rose, C., Rosenfeld, D., Smith, D., and Thompson, M., EBES4, J. Vac. Sci. Technol. B, 5(1):47–52 (1987)
36. Peters, D., Fowlis, D., von Neida, A., Rose, C., Waggener, H., and Wilson, W., SPIE Proc. Electron-Beam, X-Ray, and Ion-Beam Submicrometer Lithographies for Manufacturing III, 1924 (1993)
37. Waggener, H., BACUS News, 6(4):1–7 (1990)
38. Anderson, E., Boegli, V., and Muray, L., 39th Int. Conf. Electron, Ion, Photon Beam Technol. and Nanofabrication, Scottsdale, AZ, 63–63 (1995)
39. Groves, T. R., Hartley, J. G., and Pfeiffer, H. C., IBM Journal of Research & Development, 37:411–19 (1993)
40. Kendall, R., Doran, S., and Weissman, E., J. Vac. Sci. Technol. B, 9:3019 (1991)
41. Rockrohr, J., Butsch, R., Enichen, W., Gordon, M., Groves, T., Hartley, J., and Pfeiffer, H., Proc. SPIE Electron-Beam, X-Ray, EUV, and Ion-Beam Submicrometer Lithographies for Manufacturing V, 2437:160–167 (1995)
42. Gesley, M., Mulera, T., Nurmi, C., Radley, J., Sagle, A., Standiford, K., Tan, Z., Thomas, J., and Veneklasen, L., Proc. SPIE, 2437:168–184 (1995)
43. Toshiba Machine, Semiconductor Equipment Sales Department, Mr. Sugita, phone (035)250-3557 or -3561
44. Hitachi Industrial Component and Equipment Dept. XS, International Operations Div. II, Mr. Naoyuki Eguchi and Mr. Tsubosaki, phone (033)258–1111
45. Saitou, N., Sakitani, Y., Proc. SPIE, 2194:1489–1490 (1994)
46. Advantest Corporation, International Sales Department (Saitama), Ms. Mickey Iijima, phone (048)556-6500
47. Fortagne, O., Hahmann, P., and Erlich, C., Microelectronic Engineering, 27:15–154 (1995)
48. SCALPEL™ is a registered trademark of Lucent Technologies, with development supported by DARPA under contracts MDA 972-94-C-0013 and MDA 972-95-C-0013.
49. Liddle, J., Harriot, L., and Wasckiewicz, W., Microlithography World, 6(2):15–18 (1997)
50. Berger, S., and Gibson, J., Appl. Phys. Lett., 57(2):153 (1990)
51. Ward, R., J. Vac. Sci. Technol., B8:1830 (1990)
52. Bohlen, H., Greschner, J., and Nehmiz, P., J. Vac. Sci. Technol., B8:1834 (1990)
53. Fresser, H., Prins, F., and Kern, D., J. Vac. Sci. Technol., B13(6):2553–2555 (1995)
54. Kragler, K., Guenther, E., Leuschner, R., Falk, G., von Seggern, H., and Saemann-Ischenko, G., Thin Solid Films, 264(2):259–263 (1995)
55. Drouin, D., Beauvais, J., and Lemire, R., Applied Physics Letters, 70:3020–3022 (1997)
56. Yoshida, N., and Tanaka, K., Applied Physics Letters, 70:779–781 (1997)
57. Vig, J., Treatise on Clean Surface Technology, Vol. 1, (K. L. Mittal, ed.), Plenum Press, NY, pp. 1–26 (1987)
58. Mack, C., Microlithography World, 6(1):16–17 (1997)
59. Veneklasen, L., Handbook of VLSI Microlithography, (W. Glendenning and J. Helbert, eds.), Noyes Publications, New Jersey, p. 414 (1991)
60. Haller, I., Hatzakis, M., and Srinivasen, R., IBM J. Res. Develop., 12:251 (1968)
61. Microlithography Chemical Corp., 1254 Chestnut St., Newton, MA 02164-1418
62. Haller, I., Feder, R., Hatzakis, M., and Spiller, E., J. Electrochem. Soc., 126(1):154–161 (1979)
63. Bernstein, G. H., Hill, D. A., and Liu, W., Journal of Applied Physics, 71:4066–4075 (1992)
64. Hoole, A. C. F., Welland, M. E., and Broers, A. N., Semiconductor Science & Technology, 12:1166–1170 (1997)
65. Dean, R., BACUS Photomask News, 9(7):1–9 (1993)
66. Bell, G., and Spierer, H., BACUS News, 6(4):1–7 (1990)
67. Shen, W., Marra, J., and van den Broeke, D., SPIE Proc. 16th Annual Symp. on Photomask Technology and Management, 2884:48–66 (1996)
68. Kobayashi, H., Higuchi, T., Yamashiro, K., and Asakawa, K., SPIE Proc. 16th Annual Symp. on Photomask Technology and Management, 2884:67–82 (1996)
69. Lee, T., Zah, C., Bhat, R., Young, W., Pathak, B., Favire, F., Lin, P., Andreadakis, N., Caneau, C., Rahjel, A., Koza, M., Gamelin, J., Curtis, L., Mahoney, D., and Lepore, A., J. Lightwave Tech., 14(6):967–976 (1996)
70. EBR-9 is available in several formulations from Toray Industries America, Inc., 1875 South Grant St., San Mateo, CA 94402, phone (415) 341-7152
71. Kataoka, M., and Tokunaga, A., Polymers for Microelectronics - Science and Technology, (Kodansha, ed.), pp. 327–342 (1990)
72. Shaw, J., and Hatzakis, M., IEEE Transactions on Electron Devices, ED25(4):425 (1978)
73. Tritchkov, A., Jonckheere, R., and van den Hove, L., J. Vac. Sci. Technol., B13(6):2986–2993 (1995)
74. KTI Chemicals Inc., 1170 Sonora Ct., Sunnyvale, CA 94086-5385, phone (408) 733-3500
75. Liu, H., Proceedings of the Microlithography Seminar Interface ’91, San Jose, CA, pp. 307–319 (1991)
76. Kugelmass, S., Mitchell, J., and Poreda, J., Proc. SPIE Photomask and X-Ray Mask Technology III, 2793:464–472 (1996)
77. Hoechst Celanese Corp., AZ Photoresist Products, 70 Meister Ave., Somerville, NJ 08876, phone (908) 429-3500; Shipley Inc., 455 Forest St., Marlboro, MA 01752, phone (800) 343-3013; Toray Industries America, Inc., 1875 South Grant St., San Mateo, CA 94402, phone (415) 341-7152
78. Hatzakis, M., J. Vac. Sci. Technol., 16(6):1984–1988 (1980)
79. Byrne, D., Brouns, A., Case, F., Tiberio, R., Whitehead, B., and Wolf, E., J. Vac. Sci. Technol., B3(1):268–271 (1985)
80. Tennant, D., Jackel, L., Howard, R., Hu, E., Grabbe, P., Capik, R., and Schneider, B., J. Vac. Sci. Tech., 19(4):1304–1307 (1981)
81. Bandy, S., Chai, Y., Chow, R., and Zdasiuk, G., IEEE Electron Device Letters, EDL-4:42–44 (1983)
82. Chao, P., Smith, P., Wanuga, S., Hwang, J., and Perkins, W., IEEE Electron Device Letters, ED-32:1042–1046 (1985)
83. Lepore, A., Levy, M., Tiberio, R., Tasker, P., Lee, H., Wolf, E., Eastman, L., and Kohn, E., Electronics Letters, 24(6):364–366 (1988)
84. Katsumata, M., Kawahira, H., Sugawara, M., and Nozawa, S., Proc. SPIE Photomask and X-Ray Mask Technology III, 2793:96–104 (1996)
85. Toyo Soda USA Inc., 1700 Water Pl., Suite 204, Atlanta, GA, 30339, phone (404) 956-1100
86. Ross, M., Livesay, W., and Petrillo, K., SPIE Advances in Resist Technology and Processing, 3049:676–691 (1997)
87. Tan, Z., and Sauer, C., Proc. SPIE 14th Annual BACUS Symposium on Photomask Technology and Management, Santa Clara, CA, pp. 141–148 (1994)
88. Angelopoulos, M., Shaw, J., Lee, K., Huang, W., Lecorre, M., and Tissier, M., Photopolymers: Principles - Processes and Materials: 9th International Technical Conference on Photopolymers, Ellenville, NY, pp. 165–182 (1991)
89. Kondek, C., and Poli, L., SPIE 1994 International Symposium on Microlithography, 27, San Jose, CA (1994)
90. Huang, W., Polymer, 35(19):4057–4064 (1994)
91. Mochel, M., Humphreys, C., Mochel, J., and Eades, J., Proceedings of the 41st Annual Meeting of the Electron Microscopy Society of America, San Francisco Press, San Francisco, CA, pp. 100–101 (1983)
92. Muray, A., and Isaacson, M., J. Vac. Sci. and Technol., B1(4):1091–1095 (1983)
93. Broers, A., IBM J. Res. and Develop., 32(4):502–513 (1988)
94. Ferry, D. K., Khoury, M., and Pivin, D. P., Jr., Semiconductor Science & Technology, 11(11S):1552–1557 (1996)
95. Tada, T., and Kanayama, T., Japanese Journal of Applied Physics, 35(1A):L63–L65 (1996)
96. Lercel, M., Tiberio, R., Chapman, P., Craighead, H., Sheen, C., Parikh, A., and Allara, D., J. Vac. Sci. Technol., B11(6):2823–2828 (1993)
97. Moore, G., Proceedings of the SPIE: Electron-Beam, X-Ray, EUV, and Ion-Beam Submicrometer Lithographies for Manufacturing V, 2437:2–17 (1995)
98. Grenon, B., Hamaker, H., and Buck, P., Microelectronic Engineering, 27(1):225–230 (1995)
99. Product documentation for ALTA 3500 scanned laser exposure system, ETEC Systems, Inc., 26460 Corporate Ave., Hayward, CA 94545, phone (510) 887–2870.
100. Service, R. F., Science, 274:1834–1836 (1996)
101. Smith, H., and Cerrina, F., Microlithography World, 6(1):10–15 (1997)
102. ASET Atsugi Research Center, c/o NTT Laboratories, 3-1 Morinosata Wakamiya, Atsugi-shi, Kanagawa Prefecture 243-01, Superfine SR Lithography Laboratory, Mr. Yoshio Gomei, phone (046)247-3721
103. Sommargren, G., Vernon, S., Gaines, D., Cohen, S., and Shafer, D., 39th Int. Conf. Electron, Ion, Photon Beam Technol. and Nanofabrication, Scottsdale, AZ, 11 (1995)
104. Kania, D., Gaines, D., Hermann, M., Hostetler, R., Levesque, R., Sommargren, G., Spitzer, R., and Vernon, S., 39th Int. Conf. Electron, Ion, Photon Beam Technol. and Nanofabrication, Scottsdale, AZ, 14–15 (1995)
8
Rational Vibration and Structural Dynamics for Lithographic Tool Installations
Kenneth Medearis
Kenneth Medearis Associates
Fort Collins, Colorado
1.0 INTRODUCTION
In order to address vibration and structural dynamic concepts for lithographic tool installations, it is essential to discuss and evaluate the overall dynamic scenario. As the writer pointed out in 1995,[1] the advanced theoretical means for doing so have been available for a number of years. Modern vibration and structural dynamics methods were actually utilized in 1974[2] to determine how to successfully stabilize an existing vibration-prone floor that was used for the development of the Motorola MC68000 computer chip. Such methods have since been utilized by the writer for hundreds of semiconductor industry studies. The industry, unfortunately, has been willing to accept and reference inappropriate, poorly defined theories and criteria. The result is somewhat of a morass that will take years to emerge from, but progress is being made. A not-unique example is the past recommendation of the Semiconductor Industry Association’s (SIA) Table 43,[3] which called for an
unsubstantiated maximum floor velocity of 100 micro-inches/second for 0.25 micron geometry in the year 1998. This provably incorrect SIA recommendation has sometimes resulted in grossly oversized fab structural systems. It is believed to have been quite costly to the semiconductor industry through being referenced by companies that, unfortunately, believed it to have some validity, which it does not. Fortunately, all such recommendations have been eliminated from the latest 1997 version. This is a positive step, but much remains to be done. The fact of the matter is that 0.25 micron geometry has been achieved on rational fab floors having total velocities at least 10 times that value. Even if velocity were an appropriate criterion, which it is not, the 100 micro-inches/second would need to be far more precisely defined. For example, although recognized by virtually no one in the semiconductor industry, that value does not even refer to a floor total velocity (i.e., that which might affect tool operations); rather, it simply refers to frequency-related components of the total velocity. The items cited above will, hopefully, provide part of the stimulus for recognizing the importance of addressing the vibration and structural dynamics scenario in a more scientific fashion. As previously indicated, the now obsolete SIA Table 43 recommendations were clearly thought to have some credence by a number of its members. One of the end results, as indicated, has been unnecessarily costly production facilities having excessively large structural members. One example is a recently constructed computer-chip manufacturing floor framing system which has 30" square columns at 12' centers, supporting a grid work of 14" and 30" wide by 48" deep beams. It is intuitively apparent (even to technical editors) that such a system represents gross structural overkill with excessive costs, yet it was apparently never even questioned by those involved.
The direct and indirect costs due to loss of space for routing utilities, ducts, etc., are far more significant. When the writer pointed out the questionable aspects of this overall scenario in an award-winning paper,[1] the typical justification offered for building such bunker-like structures was that they seem to work. This is not always true since “bigger has proven to not always be better.” Regardless, it should also be noted that one may indeed use a 30' square × 10' wide concrete base to support a small compressor (actual case), and then proclaim the installation to be a success from a vibration standpoint; but such a costly design is surely not a testament to modern engineering or analysis.
2.0
STRUCTURAL DYNAMICS, VIBRATION, AND STRUCTURAL ENGINEERING
A significant contributing factor in the costly, structural-overkill scenario is the apparent lack of understanding of the differences in background between structural dynamics, vibration, and structural engineering. There is clearly a lack of comprehension in that regard, especially by those not familiar with university curricula; that they have not taken time to understand the terms is most unfortunate, especially in view of the high-dollar ramifications. However, excessive costs, and failures, have now provided the required impetus for such understanding. To elaborate, structural dynamics is typically a graduate-level (M.S. or Ph.D.) set of courses in an engineering curriculum, requiring at least several undergraduate courses in structures, structural design, and vibration as prerequisites. As the name implies, it is the study of the dynamics of structures, such as fab floor and tool framing systems. Conversely, vibration is a quite basic undergraduate course in an engineering or science curriculum. As such, it provides no background whatsoever for structural engineering expertise. Neither do, for example, acoustics, electrical engineering, etc. Finally, a structural engineering curriculum almost never includes even an elementary vibration course, much less structural dynamics. Consideration of this discussion clearly indicates that only a background in structural dynamics, excluding the on-the-job-training variety, is really satisfactory for making recommendations for fab structural systems. It is also quite unlikely that a vibration consultant, with no background in structural design and engineering, can team up with a structural engineering consultant, having no background in vibration, to achieve efficient, effective solutions. Such a scenario obviously has no possibility of the required checks and balances since neither really understands what the other is talking about, except in quite general terms.
3.0
TOOL EXCITATION SOURCES AND LEVELS
The sources of excitation in semiconductor facilities are relatively well known, but the levels of excitation do not appear to be well understood at all. The latter is partially because the greatest excitation levels have not infrequently been lost in inappropriate and incorrect FFT (Fast Fourier Transform) and one-third octave acoustic band spectra instrument
measurement representations, as well as a basic lack of familiarity with rotating machinery dynamics and its unbalanced forces. The sources of excitation include personnel, cart, and other in-plant traffic, tool operations, rotating machinery unbalance, etc. It should be noted, however, that rotating machinery excitations, contrary to popular thinking, are typically not the major consideration. This may be easily understood by noting the unbalance force information given in Fig. 1, the balancing standard for a major U.S. rotating machinery manufacturer. It may be seen therein that the total unbalance force for a relatively large fan (500 lb impeller), when it leaves the factory, is only 10 lbs or less. Even that low level is further attenuated by the structural system supporting a tool. Such force levels are minor compared to those of the footfall excitation caused by personnel and other traffic. Indeed, the maximum floor motions (total, not component) measured at a distance of 10' from some 20 major air handlers (having 800 lb impellers and no spring “isolators”) were found to be only 15 micro-inches, peak-to-peak.
It is relevant to discuss ambient and impulse excitations and motions, as typically defined by structural dynamicists.[1] Ambient floor motions are those resulting from all of the mechanical excitations present at the measurement location (i.e., those associated with rotating machinery unbalance, tool operations, etc.), but excluding impulse excitations such as foot traffic and cart movement. Ambient motions can be readily addressed with routine mathematical analyses; impulse motions require considerably more care. The cited ambient motions are clearly those associated with an operational fab, excluding those due to tool operators, etc. Interestingly, the writer has actually encountered fab vibration specifications that exclude both tool operation and personnel traffic excitations, i.e., an essentially “empty” facility specification.
Such is clearly geared to minimizing potential liability rather than really addressing fab performance. All existing, relevant excitations must be considered, but those due to impulse will almost always be the greatest. An appropriate structural dynamics study for impulse excitations will adequately address any typical floor motions associated with tool-induced excitations. Such studies are beyond the scope of this treatise but, when accomplished by a qualified structural dynamicist, provide quite precise results and insight. A tool may excite itself, as will be discussed later, but that is a function of the overall, total vibratory system, which includes a number of constituent components.
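The order of magnitude of the fan unbalance force quoted above for Fig. 1 can be checked with the standard rotating-unbalance relation F = (W/g)·e·ω². The following is a minimal sketch only: the 1200 rpm fan speed is a hypothetical value chosen for illustration (the text does not state one), and the function names are not from the original. It back-calculates the residual eccentricity of the impeller center of mass that would produce a 10 lb force on a 500 lb impeller.

```python
import math

G_IN_PER_S2 = 386.09  # standard gravity in inch/sec^2


def unbalance_force_lb(rotor_weight_lb, eccentricity_in, speed_rpm):
    """Rotating unbalance force F = (W/g) * e * omega^2, in pounds."""
    omega = speed_rpm * 2.0 * math.pi / 60.0  # shaft speed in rad/sec
    return (rotor_weight_lb / G_IN_PER_S2) * eccentricity_in * omega ** 2


def eccentricity_for_force_in(rotor_weight_lb, force_lb, speed_rpm):
    """Mass-center eccentricity (inches) that produces the given force."""
    omega = speed_rpm * 2.0 * math.pi / 60.0
    return force_lb * G_IN_PER_S2 / (rotor_weight_lb * omega ** 2)


# ASSUMPTION: 1200 rpm is a hypothetical operating speed, not a value from the text.
ASSUMED_SPEED_RPM = 1200.0

# 500 lb impeller and 10 lb residual unbalance force, as quoted for Fig. 1.
e = eccentricity_for_force_in(500.0, 10.0, ASSUMED_SPEED_RPM)
print(f"implied eccentricity: {e * 1000.0:.2f} thousandths of an inch")
print(f"force check: {unbalance_force_lb(500.0, e, ASSUMED_SPEED_RPM):.1f} lb")
```

Under that assumed speed, the implied eccentricity is a fraction of a thousandth of an inch, which is consistent with the point that factory-balanced machinery is a small force source compared with footfall.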
Figure 1. Fan residual unbalance.
3.1
Tool Excitation Sources and Levels—Case Study 1
Impulse-induced excitations, as indicated, are those resulting from personnel, in-plant traffic, cart movements, etc. Such movements are typically the primary contributor to the maximum floor motions that exist in an operational, functioning laboratory or high-tech area. This may be illustrated by consideration of the floor acceleration response
time-histories given in Fig. 2. One time-history (a) was recorded on an office area floor system; the other (b) on a fab floor system. Note that the ambient floor motions in both traces are far smaller than those due to heel impulse. Possibly more surprising, the recorded office area ambient motions are actually less than those of the fab floor, despite the fact that the latter is much stiffer. This situation is not infrequently encountered. Note, however, that the office area impulse-induced motions are about 6.5 times those of the fab floor. It follows that consideration of only the ambient excitations can clearly be misleading as far as the floor system vibrational adequacy evaluations are concerned.
Figure 2. Floor motion time-histories for impulse and ambient excitations. (a) Office floor vertical motions and (b) fab floor vertical motions.
The given scenario is of the real-world variety in that it was proposed to install a scanning electron microscope on the office floor, a not-that-unusual proposal in an existing, crowded facility. The initial ambient motion evaluation (by others) indicated no problem with doing so, but that proved not to be the case. Subsequent consideration of impulse results indicated why. The installation was ultimately accomplished, but only after structural dynamics study results indicated how the floor should be stabilized such that success of the installation was assured.
3.2
Tool Excitation Sources and Levels—Case Study 2
When a tool is not performing properly, it is typically assumed that the problem can be traced to vibratory motions of the supporting floor system. However, it has already been shown that machine unbalance excitations are usually of secondary consequence and, if the floor system is adequate from the standpoint of impulse excitations, it should not be a major player. Indeed, the floor is only one component of the overall, total dynamic system[1] depicted in Fig. 3. That system also includes the tool, its support pedestal, the columns supporting the floor, lateral bracing, building foundation, and the supporting soil, piles, caissons, etc.
Figure 3. Tool-support pedestal-structure-foundation components, including damping, for evaluating the dynamic motions. The system must be modeled in three dimensions, i.e., x, y, and z responses computed and evaluated.
The tool merits major attention for more than one reason. It may be contributing to its own vibration problems through its operations. Such contributions have frequently been ignored when trouble-shooting problem situations. One of the reasons for this neglect is the acceptance by some of unsubstantiated, generic criteria for the categorization of various tool types. That tool categorization is based on specifying floor velocity component levels for undefined excitations. However, for such velocity component levels to have any real credibility, a number of items would need to have been addressed, not the least of which is each tool’s internal dynamic system. This, of course, has not been done, so the categorization is completely without basis. It is akin to the previously cited specifications which exclude both tool and personnel traffic excitations, i.e., it is misleading.
The fact of the matter is that most tool manufacturers have concentrated their analysis efforts in the areas of tool optics and electronics, not vibrational dynamics. They have, with minimal or no theoretical or experimental justification, incorporated granite “inertia” blocks and rubber “isolation” mounts in their tools’ internal dynamic systems. The result is that such systems can be a contributor to, or even the root cause of, operational difficulties. For example, Fig. 4 depicts vibration traces for a not untypical situation. The two parts of the figure, (a) and (b), reveal that the motions recorded at the microscope level are on the order of 10–12 times those of the floor system, i.e., the tool “isolation” is magnifying, rather than isolating.
A recent paper[4] describes a situation in which a stepper experiencing a vibration-related production problem was found to be a prime, but not the only, contributor to that problem. This case illustrates the value of considering the total vibratory system.
Specifically, the stepper’s operating stage is located on top of a granite block, which then is supported on rubber supports. Neither component appears to be effective in regard to the system dynamics. Indeed, the stage movement forces are enough to cause movement of the heavy granite block on its soft, low stiffness supports. The total system response picture is completed by noting the forces generated by the movements of the block are sufficient to cause motions of the steel pedestal supporting the tool. Those motions are then transmitted back to the tool; almost a perpetual motion scenario. Thus, tool operations can be a source of major concern.
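How a soft “isolation” mount can magnify rather than isolate may be illustrated with the textbook single-degree-of-freedom base-excitation transmissibility formula, T = sqrt((1 + (2ζr)²) / ((1 − r²)² + (2ζr)²)), where r is the ratio of excitation frequency to mount natural frequency and ζ is the damping ratio. This is a generic idealization and not the analysis used for the cases described here; the damping ratio of 0.05 and the frequency ratios below are assumed values for illustration only.

```python
import math


def transmissibility(freq_ratio, damping_ratio):
    """Motion transmissibility of a base-excited single-degree-of-freedom mount.

    freq_ratio is excitation frequency divided by mount natural frequency.
    Values above 1.0 mean the mounted mass moves more than its support.
    """
    two_zeta_r = 2.0 * damping_ratio * freq_ratio
    numerator = 1.0 + two_zeta_r ** 2
    denominator = (1.0 - freq_ratio ** 2) ** 2 + two_zeta_r ** 2
    return math.sqrt(numerator / denominator)


# ASSUMPTION: a lightly damped rubber mount, damping ratio 0.05 (hypothetical).
ASSUMED_DAMPING = 0.05

for r in (0.5, 0.9, 1.0, math.sqrt(2.0), 3.0):
    t = transmissibility(r, ASSUMED_DAMPING)
    label = "magnifies" if t > 1.0 + 1e-9 else "does not magnify"
    print(f"frequency ratio {r:4.2f}: transmissibility {t:6.2f} ({label})")
```

In this idealized model, a mount attenuates only when the excitation frequency exceeds about 1.41 times the mount natural frequency; near a frequency ratio of 1 a lightly damped mount amplifies the support motion roughly tenfold.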
Figure 4. Typical floor and aligner ambient vibration recording trace segments. (a) Floor system lateral motions, dominant frequency ~20 Hz, and (b) aligner microscope lateral motions, dominant frequency ~15 Hz.
This case also demonstrates the lack of validity of generic tool classifications. Specifically, a stepper requires a quite stable floor system. In this case, it had one, yet there were still vibration problems. The solution to the problem was to design and install a dynamically adequate, stiffer,
structural pedestal. It should be noted that such pedestals have frequently been given minimal dynamics attention. They are often designed for static loadings by structural and mechanical engineers, contractors, etc., rather than for dynamic loadings by structural dynamicists. Note that the solution involved a stiffer pedestal. It should be intuitively clear that some form of less-stiff “isolation” would provide no vibratory improvement whatsoever. That was essentially what had already existed; only it was in the form of a less-stiff, dynamically inadequate pedestal.
Finally, air-flow and acoustic excitations are occasionally brought up in conjunction with vibration problems. It is rare that either is of any consequence. However, air-flow in a poorly designed clean room may merit consideration when its effects are coupled with a low stiffness tool internal dynamic system. Several tool manufacturers have also stipulated acoustic specifications. Those appear to have minimal scientific basis. A rare example of air-flow excitation is given in the next section.
3.3
Tool Excitation Sources and Levels—Case Study 3
One of the most perplexing tool vibration situations the author has encountered involved the unacceptable vibratory motions of one of two virtually identical inspection station microscopes. Numerous vibration measurements were taken in order to assess the performance of the total system components (Fig. 3) of both inspection stations. It was found that the vibrations of corresponding components were essentially identical. For example, Fig. 5 depicts the computed system response spectra[4] for the vibratory motions of both microscope platens. Consideration of those spectra reveals that the frequencies and responses of both microscope platens were essentially the same; yet one system was working and one was not. It was noticed, however, that a unique, out-of-phase, low-frequency vibratory motion was observed through the microscope of the non-working inspection station. It was ultimately concluded that the motion might be caused by air-flow excitations of the minimal-thickness, low-stiffness soda-lime glass, which was supported at five points on the platen. This hypothesis was subsequently verified via ingenious experimental efforts conducted by site personnel. Further experimental efforts revealed relatively non-uniform air flows in the general vicinity of the inspection station. Those flows were found to be the result of the incorrect installation of one of the nearby walls. Subsequent correction of that installation rectified the air-flow situation. The result was a working inspection station.
Figure 5. Inspection station platen-system response spectra.
A pertinent analytical model of the platen glass was subsequently generated and structural dynamics studies conducted. The soda-lime glass essentially vibrates as a low-stiffness plate supported at five points. Its theoretical vertical fundamental frequency was found to be on the order of 5 Hz or less, i.e., quite low. That satisfied one of the previously observed microscope motion requirements. The responses of the plate to potential air-flow excitations were also computed. Such excitations are also typically low-frequency in nature. The computed maximum glass motions were found to be in the 500–600 micro-inch, peak-to-peak, range, which would indeed be readily noticeable. The cited theoretical results thus verified the observed situation dynamics.
4.0
DISPLACEMENT, VELOCITY, OR ACCELERATION CRITERIA
Displacement is the most appropriate floor vibratory motion evaluation criterion. The most important reason for this is the fact that motion results are not improperly biased by irrelevant high-frequency components (as is the case with velocity or acceleration). To elaborate, it is known that, for harmonic motions, velocity is a function of displacement times frequency, and acceleration is a function of displacement times frequency squared. Thus, for example, an insignificant displacement can result in a seemingly significant velocity or acceleration value when multiplied by a large frequency.[4] This pertinent rationale is based simply on the system vibrational dynamics.
There are a number of structural dynamics research studies and publications[1] that address vibratory motion criteria. One of the most detailed and extensive is a blasting-induced ground motion and structural response research study conducted in 1976.[5] Significantly, the research results were subsequently validated by independent studies conducted at the University of Leeds, UK. The impetus for the 1976 study, which was funded by twenty-three United States and Canadian organizations, was the numerous scientifically unjustified damage claims that were resulting from the use of the U.S. Bureau of Mines vintage velocity criterion of 2 inch/sec peak ground motion. One of the primary conclusions of the research study, accepted and endorsed by qualified structural dynamicists on two continents, was that specifying a velocity criterion is inappropriate for vibratory motions. The primary reason for this is the scenario distortion caused by multiplying minimal, non-damaging ground displacement values by the relatively high frequencies of the ground motions, resulting in high peak velocities that are simply of no consequence to the structural system. That scenario is quite similar to that which is currently plaguing the semiconductor industry.
Specifically, the industry is now, by necessity, studying the high-cost fabs that have been constructed as a result of the use of velocity component vibratory motion criteria; just as the blasting industry was forced to address the numerous scientifically-unjustified damage claims and associated, costly, legal entanglements. A simple example illustrates why the constant velocity blasting ground motion criterion is inappropriate. A similar example could easily be formulated with regard to a microelectronics floor system, using different parameter values. Suppose that a structure is subjected to a peak
ground motion displacement of 0.020" at a frequency of 10 Hz, as well as an in-phase motion displacement of only 0.004" at a frequency of 50 Hz. It can easily be shown that the total displacement of 0.024" would almost always be non-damaging. The total associated velocity, however, would be (0.020 × 2π × 10 = 1.26 inch/sec) + (0.004 × 2π × 50 = 1.26 inch/sec) ≈ 2.51 inch/sec. That value, being greater than the previously cited 2 inch/sec peak ground motion criterion, would be incorrectly judged to be damage producing. This is due to the irrelevant, frequency-distorted 50 Hz motion. Typical structures, especially residences, simply do not respond significantly to such high frequencies. Neither do fab floor systems.
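The arithmetic of this example can be reproduced with the harmonic relation v = 2πfd. The short sketch below uses only the displacement and frequency values given in the example; the function and variable names are illustrative. It also reports each component's share of the total displacement and of the total velocity, which makes the frequency bias visible.

```python
import math


def peak_velocity(displacement_in, frequency_hz):
    """Peak velocity (inch/sec) of a harmonic motion: v = 2*pi*f*d."""
    return 2.0 * math.pi * frequency_hz * displacement_in


# (peak displacement in inches, frequency in Hz), taken from the example above.
components = [(0.020, 10.0), (0.004, 50.0)]

total_displacement = sum(d for d, _ in components)
velocities = [peak_velocity(d, f) for d, f in components]
total_velocity = sum(velocities)

for (d, f), v in zip(components, velocities):
    print(f"{f:5.1f} Hz: {d / total_displacement:6.1%} of displacement, "
          f"{v / total_velocity:6.1%} of velocity")
print(f"total displacement = {total_displacement:.3f} inch")
print(f"total velocity     = {total_velocity:.2f} inch/sec")
```

The 50 Hz component supplies one-sixth of the displacement but half of the velocity, which is what pushes the total over the 2 inch/sec criterion.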
4.1
Floor Displacement Criteria
A majority of microelectronics tool manufacturers specify peak, or peak-to-peak (PTP), floor displacement criteria. The displacement hypothesis is intuitively justifiable. It is well known that a structural element or structure can be subjected to high velocities and/or accelerations without distress, the previously cited frequency-dependence scenario being of significant importance. The key item is whether or not the structure's total displacements are of any consequence. Unfortunately, the vast majority of the typical floor vibration criteria provided by tool manufacturers are not based on scientific studies. Examples of these criteria are depicted in the plots of Fig. 6. Consideration of these quite different plots for tools having similar vibration sensitivities essentially confirms the already-stated conclusion: they have minimal theoretical or experimental bases and are, at best, only very basic guidelines. This, in some respects, is not completely without reason. The structural floor is only one component (site-dependent) of the total vibratory system previously given in Fig. 3. Its member sizes are almost never directly controlled by the tool manufacturer; thus the specified peak floor motions should be presented in simple, pertinent, realistic terms. That is rarely the case. For example, note that the curve of Fig. 6 (c) stops at 10 Hz. This means that it does not address the floor vertical motions at all, but rather the tool’s internal, low-frequency “isolation” system. This is because fab floor system vertical frequencies are typically much higher, being in the 16–40 Hz range. It might be deemed relevant for floor horizontal motions, but those are typically small compared to the vertical motions. Such structural dynamics facts appear to have been given little attention by tool manufacturers.
Figure 6. Typical floor vibration specifications provided by tool manufacturers.
A somewhat unique tool manufacturer testing effort is described in Ref. 6, pertinent results being given in Fig. 7. The segments of that figure have been broken into three parts for purposes of interpretation. The 2–16 Hz portion is certainly associated with the tool’s low-frequency, internal dynamic system, which typically consists of some form of “isolation” mounts and/or a granite “inertia” block. As previously indicated, that range is simply too low to be relevant for the vertical motions of a typical microelectronics floor. There may be 2–16 Hz building horizontal motion frequencies but, typically, not significant excitations. Earthquakes are a possible exception, but not a daily operational consideration. Floor vertical frequencies rarely exceed 40 Hz. Even if they do, the associated displacements are insignificant. The key 16–40 Hz range is thus seen to be the most relevant for microelectronics floor framing. Test results for that range have
been re-plotted in Fig. 8. The associated displacements vary from 90–110 micro-inches, peak-to-peak (PTP), i.e., they are virtually constant; velocities vary from 5000–15000 micro-inches/sec; and accelerations vary from 2–10 milli-g’s. The PTP displacements are thus seen to be an excellent vibration motion criterion because they are relatively constant over the pertinent floor frequency range. This constancy is, of course, as it should be considering structural dynamics theory.[7] Velocities and accelerations, being frequency-dependent, are far from constant and increase in accordance with their frequency multipliers.
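The frequency dependence described above can be sketched by converting a constant 100 micro-inch PTP displacement to velocity and acceleration across the 16–40 Hz range. This is an illustration under stated assumptions, not a reconstruction of the Ref. 6 test data: the motion is taken as purely harmonic, the zero-to-peak amplitude is taken as half the PTP value, and standard gravity is taken as 386.09 inch/sec².

```python
import math

G_IN_PER_S2 = 386.09  # standard gravity in inch/sec^2


def harmonic_levels(ptp_displacement_uin, frequency_hz):
    """Return (peak velocity in micro-inch/sec, peak acceleration in milli-g).

    Assumes a single harmonic motion whose zero-to-peak amplitude is half
    of the peak-to-peak displacement.
    """
    amplitude_uin = ptp_displacement_uin / 2.0
    omega = 2.0 * math.pi * frequency_hz
    velocity_uin_s = omega * amplitude_uin
    accel_in_s2 = omega ** 2 * amplitude_uin * 1.0e-6
    accel_milli_g = accel_in_s2 / G_IN_PER_S2 * 1000.0
    return velocity_uin_s, accel_milli_g


for f in (16.0, 25.0, 40.0):
    v, a = harmonic_levels(100.0, f)
    print(f"{f:4.0f} Hz: displacement 100 micro-in PTP, "
          f"velocity {v:7.0f} micro-in/sec, acceleration {a:5.2f} milli-g")
```

Under these assumptions the same displacement corresponds to a velocity that grows by a factor of 2.5 and an acceleration that grows by a factor of 6.25 between 16 and 40 Hz, of the same order as the ranges quoted for Fig. 8.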
Figure 7. Allowable tool displacements from floor vibration.
Figure 8. Relevant frequency range values interpreted and re-plotted (see Fig. 7).
These results have considerable value. Although necessarily generated for one specific tool, its basic construction is reasonably representative of a number of others. It may be intuitively postulated that peak floor displacements of 100 micro-inches or less for the more stringent impulse excitations will generally be acceptable. Considerable experience over the past twenty years, relating to floors supporting typical micron and submicron tools, high energy laser facilities, electron and tunneling microscopes, etc., has thus far indicated no exceptions to that conclusion. Further discussion and conclusions follow. The dynamic responses of a fab floor structural system to footfall impulse are directly related to its generalized stiffness.[7] Thus, it is deemed reasonable to use static stiffness values (which structural engineers understand and should be able to compute), to develop the preliminary designs of structural floors. Figure 9[1] was generated using bay center dynamic responses and stiffness values compiled for several hundred buildings analyzed and studied by the writer. The computed theoretical floor dynamic responses were always validated by comparison with on-site measurement values, the motion difference typically being less than 15–20 %. Such differences can result from material strength variations alone, thus it is clear
that significant precision can be attained with appropriate structural dynamics analyses. The cited categories (office areas, high-tech and science laboratories, and microelectronics) constitute a relatively broad spectrum of activities having somewhat different vibratory motion requirements and characteristics. Responses greater than 1000 micro-inches (PTP) are readily noticeable to office occupants. Responses greater than 300 micro-inches (PTP) have typically proven to be undesirable for science laboratory activities. As previously indicated, microelectronics floor impulse-induced motions of 100 micro-inches (PTP) will usually pose no vibration-related difficulties. As shown in Fig. 9, increases in the framing structural stiffness for the microelectronics range result in relatively minor floor response reductions. Therefore, little is gained by using the excessively large structural members previously cited. This is especially true when it is recognized that the tool dynamic system may be the controlling factor.
Figure 9. Impulse excitation-structural dynamics response results-theoretical and field measurements—100 representative KMA-analyzed facilities—(response-stiffness values used to generate lines).
Total system dynamics provide results that are typically more precise than tool manufacturer vibration guidelines. As indicated, Fig. 9 can be readily used by structural engineers for preliminary static designs. This is quite important. Most simply have no idea as to why they are being asked to provide excessively large structural members, cope with acoustic terminology, etc. To facilitate that use, the microelectronics portion has been enlarged for clarity and re-presented in Fig. 10. The associated least squares equation clearly shows (via the 0.004 multiplier) that the slope of the criteria line is quite small. In other words, the decreases in floor response are relatively minimal for even major increases in floor stiffness.
Figure 10. Impulse excitation structural dynamics criteria—theoretical and field measurement data; response vs. stiffness points used to generate the solid line.[2]
It is generally acknowledged that tool installations on slab-on-grade floors have had few vibration-related problems. On some occasions, there has been minor subsidence under such floors, but even that is easily detected, and avoided, with impulse excitation vibration measurements.
The vertical stiffness of a well-constructed, appropriate-thickness slab-on-grade floor system can be evaluated using the dynamics concepts given in Ref. 8. That stiffness is typically on the order of 4000 kips/inch, while that of the cited excessively large member systems is on the order of 2–4 times that value. The acceptance of such systems, with minimal theoretical justification, if any, appears to be a testament to the old adage that “when in doubt make it stout.” There has clearly been considerable doubt.
As indicated, the writer has not encountered vibration-related problems where the impulse-induced floor motions were less than 100 micro-inches, PTP. However, it is apparent from Fig. 10 that lesser floor responses are both easily achievable and rational. It is thus recommended that new fabs have floor stiffness values equal to that of a typical slab-on-grade. As indicated, that results in a required stiffness on the order of 4000 kips/inch. The associated response is about 45 micro-inches, PTP. These values clearly provide a significant margin with regard to potential vibratory motion problems, as well as for accommodating possible future-vintage tool innovations. They essentially represent a realistic, rational solution to a scenario that has been questionably addressed in a number of fab designs and tool vibration situations. For purposes of comparison and information, 20–24' bays, having 30–40" deep beams, are both appropriate and adequate. Specific fab floor designs, however, should always be validated by structural dynamics studies.
4.2
Tool Manufacturers Floor Vibration Specifications
The tool manufacturer vibration specification study[6] discussed in the previous section is, unfortunately, quite unique and certainly an exception to the norm. Indeed, it is important to recognize that the majority of the specifications provided by tool manufacturers typically have minimal scientific basis. The writer has, over the years, shown that numerous tool specifications are inappropriate. They often represent a liability rather than an asset. The manufacturer has often not addressed the dynamics of the tool support system in other than a cursory manner. There is scant possibility that it has addressed another key component of the total vibratory system, i.e., a supporting floor structure that it did not design. For the most part, basic guidelines are provided because they have been requested by the purchaser. Those guidelines must really be viewed in that light. This scenario is often further complicated by the numerous vibration measurement takers, whose primary qualifications may be simply having a vibration instrument and a basic knowledge of how to use it. Their typical
charge is to evaluate, by simplistic measurements, whether or not a floor system will meet a manufacturer’s specification, the latter often having questionable correctness. It is impossible, in this brief treatise, to discuss the multitude of meaningless vibration specifications that have been provided by tool manufacturers. However, a representative example consists of four vibration specifications provided for similar-type tools by the same manufacturer. They offer an interesting variety; two are based on displacement, one on velocity, and one on acceleration. All are inadequately defined. Specifically, the following items may be noted:
i. The direction of the motions is not specified.
ii. The motions are not defined as to whether they are zero-to-peak, peak-to-peak, root mean square (RMS), etc.
iii. One specifies acceleration motion levels for a tool that is not ultra-sensitive to vibrations; yet simple conversion of those frequency-biased acceleration levels gives displacements that are simply unattainable on any fab floor subjected to typical operational excitations.
iv. The excitation is not defined. If that is not defined in some fashion, there is no rational basis for the specification.
v. Fast Fourier Transform (FFT) measurements are stipulated in the specifications. Although that makes it simple for the previously cited vibration measurement takers to use their instruments, the obtained results may be of questionable value. It has infrequently been recognized that such results may be erroneous, especially in the low frequency range. Also, as previously indicated, the maximum fab floor motions are typically those resulting from personnel and vehicle traffic. The difficulties in determining accurate peak responses for such excitations using ordinary FFT methods are well known to mathematicians and structural dynamicists. A three-minute time period has been specified for the measurements. It is likely that any peak responses would be essentially “averaged out” over that duration.
vi. The specified frequency range of 1–100 Hz indicates the specification is inappropriate for a typical fab floor system. A structural dynamicist would readily recognize that the former is far too low to be transmitted by such floors; the latter is far too high. In the rare event there are any significant floor system frequencies even as great as 50 Hz, the associated displacements would be minuscule and of little importance. The low and high frequencies are necessarily related only to the tool internal support system or individual tool components, not the floor system.
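The “averaging out” of a peak response over a long FFT record can be demonstrated numerically. The sketch below builds a three-minute record containing a single footfall-like transient and compares its time-domain peak with the amplitude an ordinary amplitude spectrum would report at the floor frequency. All signal parameters (20 Hz floor frequency, 3% damping, 100 micro-inch nominal peak, 200 samples/sec, transient at 60 sec) are hypothetical values chosen for illustration, and the single-bin DFT stands in for one line of an FFT analyzer display.

```python
import cmath
import math


def footfall_record(duration_s, fs_hz, f_floor_hz, damping, peak, t_hit_s):
    """Record that is zero except for one decaying sinusoid starting at t_hit_s."""
    omega = 2.0 * math.pi * f_floor_hz
    samples = []
    for i in range(int(duration_s * fs_hz)):
        t = i / fs_hz - t_hit_s
        if t < 0.0:
            samples.append(0.0)
        else:
            samples.append(peak * math.exp(-damping * omega * t) * math.sin(omega * t))
    return samples


def spectral_amplitude(samples, fs_hz, f_hz):
    """Single-bin DFT amplitude, scaled so a steady sinusoid returns its peak."""
    step = -2j * math.pi * f_hz / fs_hz
    total = sum(x * cmath.exp(step * k) for k, x in enumerate(samples))
    return 2.0 * abs(total) / len(samples)


# ASSUMPTION: every numeric value below is hypothetical, for illustration only.
FS_HZ = 200.0
record = footfall_record(duration_s=180.0, fs_hz=FS_HZ, f_floor_hz=20.0,
                         damping=0.03, peak=100.0, t_hit_s=60.0)

time_peak = max(abs(x) for x in record)
fft_amp = spectral_amplitude(record, FS_HZ, 20.0)

print(f"time-domain peak:            {time_peak:8.2f} micro-inches")
print(f"3-minute spectrum amplitude: {fft_amp:8.2f} micro-inches")
print(f"spectrum / time-domain peak: {fft_amp / time_peak:8.4f}")
```

Because the transient occupies well under a second of a 180-second record, the spectral line reports only a small fraction of the actual peak motion, which is why time-history peaks rather than long-record spectra are the relevant quantity for impulse excitation.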
Possibly the most troublesome aspect of the tool manufacturer’s specifications is the statement, “The customer must supply verification of compliance with the vibration specification upon request by the manufacturer. Failure to do so will void the warranty.” Since the specifications have obvious deficiencies, this non-constructive statement may result in a “catch-all” situation for the buyer. Conversely, it may also result in the tool not being purchased at all. In any case, it has resulted in a number of little-value reports by the vibration measurement takers, who sometimes do not even recognize the limitations and deficiencies of their instruments. Such reports frequently add to the confusion rather than helping to address, and solve, tool vibration scenarios. Finally, warranties are of questionable value when a tool is inoperative for an extended period of time, even if it is being worked on by the manufacturer.
5.0 VIBRATION-RESISTANT SUPPORT PEDESTALS FOR TOOLS
Vibration-resistant support pedestals (Fig. 3) are key items in controlling tool vibrations, as shown in the case example of Sec. 3.2. They have frequently been given minimal attention, resulting in numerous vibration problems. Such pedestals usually bridge the gap between the top of a raised floor system and the primary structural floor system. Current raised floor designs are typically not adequate for supporting vibration-sensitive tools. It thus follows that the pedestal must have, as a minimum, essentially the same dynamic response characteristics as the vibrationally-adequate structural floor system. The pedestal responses should be determined for impulse excitation because that represents a worst case scenario. Further, the pedestal should be functionally adequate even if it is subjected to foot traffic excitation. “Isolating” the pedestal from the raised floor is ineffective and unnecessary. It should be intuitively apparent that traffic excitation simply follows a path from the raised floor down to the structural floor and back up to the top of the pedestal. As long as the responses of the pedestal are similar to those of the supporting floor, it will be satisfactory. Support pedestal frequencies are almost irrelevant and should never be used as a criterion for design.
The case study example in Sec. 3.2 illustrates a not atypical situation. Consider carefully the vibration traces given in Fig. 11.[4] Note that the motions of the supporting floor in Fig. 11 (a) and those of a properly designed pedestal in Fig. 11 (c) are quite similar. In fact, the pedestal motions are actually slightly less. Conversely, Fig. 11 (b) depicts the much greater motions of the dynamically inadequate pedestal that was replaced.
Pedestals should be carefully designed and analyzed to achieve good dynamic characteristics and an optimal weight/stiffness ratio. It is typically desirable for the design and response analyses to be done by a qualified structural dynamicist. Dynamic rather than static loadings are, by far, the most significant; thus there is little need for major structural engineering involvement in most cases. The many different types of pedestal structures that have been installed almost defy imagination. A sizable number of these have been of the type that satisfies only the basics of bridging the gap between the structural floor and the top of the raised floor. Those have often been dynamically inefficient yet costly, relatively difficult to install (heavy), and sometimes have not functioned as desired. Illustrative examples of such designs are described in Fig. 12. An efficient design must be based on good structural concepts.
The pedestal’s overall geometry should be relatively basic and such that it can be easily analyzed from a structural dynamics standpoint. Impulse excitation responses, not frequencies, should be utilized as the basis for acceptance. Examples of relatively lightweight, easy to construct and install, tool pedestal designs are given in Figs. 13, 14, and 15. These are presented for the purpose of providing those in charge of installing tool pedestals with examples of what is recommended as opposed to that which is not.
Figure 11. Section 3.2 case study recording trace segments. (a) Floor vertical motion, (b) dynamically inadequate pedestal vertical motions, and (c) dynamically adequate pedestal vertical motions.
Pedestal 1: Description
i. Install 12" × 12" precast concrete beams on top of the main floor grid.
ii. Pour a 10" thick reinforced concrete slab on top of the beams.
iii. Epoxy coat all of the concrete.
iv. Attach a 1/4" thick stainless steel cover plate using concrete anchors and spacers.
Pedestal 2: Description
i. Fabricate a reinforced concrete waffle slab pedestal (complete unit: slab plus support beam grid) having the same outside dimensions as the tool to be supported.
ii. Install a 1/4" thick stainless steel cover plate using concrete anchors and spacers.
iii. Epoxy the pedestal to the supporting floor system.
Pedestal 3: Description
i. Install 12" × 12" × 1/2" thick tube steel on top of the main floor grid.
ii. Place 10" × 10" × 1/2" thick tube steel on top of, and perpendicular to, the 12" × 12" tubes.
iii. Attach a 1/4" thick stainless steel cover plate to the tubes using stainless steel bolts and screws.
Pedestal 4: Description
i. Install 12" × 12" precast concrete beams on top of the main floor grid.
ii. Place 10" × 10" × 1/2" thick tube steel on top of, and perpendicular to, the 12" × 12" cast concrete beams.
iii. Attach a 1/4" thick stainless steel cover plate to the tubes using stainless steel bolts and screws.
Figure 12. Basic descriptions of inefficient cost/weight pedestal designs (24" high raised floor).
Figure 13. Pedestal 1. Three-dimensional view—pedestal framing—(no scale).
Figure 14. Pedestal 2. Three-dimensional view—pedestal framing—(no scale).
Figure 15. Pedestal 3. Three-dimensional view—pedestal framing—(no scale).
6.0 SYSTEM “ISOLATION”
There is a still-existing pre-computer era belief that if an item vibrates, or causes vibrations, the solution is simply to attempt to “isolate” it in one way or another. Much of the time, however, that is a far too simplistic approach based on elementary textbook vibration theory. Recall that it was shown in Sec. 3.2 that a tool’s internal dynamic system can be the cause of some operational difficulties, and that the solution to the problem involved a stiffer pedestal rather than a softer “isolation” system. Following are some of the reasons why isolation supports are often ineffective, if not detrimental. The most basic reason is that the “total vibratory system” is typically not considered; rather, only a single component of that system is. Further, the item to be isolated is generally considered to be rigid. However, assuming an item to be rigid, and not considering the total system, can be major mistakes:
i. Even “rigid” units move with six degrees of freedom: three translations (up/down, side-to-side, and front-to-back) and three rotations (roll, pitch, and yaw). Virtually all “isolation” literature is based on “rigid,” single (one) degree-of-freedom theory, which is often inappropriate.
ii. Isolation mounts are frequently quoted as providing about 95% efficiency, which is almost never the case. The 95% figure pertains only to “rigid” floor systems and single-degree-of-freedom (unidirectional) theory, rather than typical real-world elastic floor systems having multi-degree-of-freedom (usually at least 6) motions.
iii. The “isolation” center of elasticity must coincide as closely as possible with the unit mass center of gravity if it is to be effective. This is frequently not feasible with, for example, tools or fans where the isolation mounts are placed near the base and the mass center of gravity is at or near the operational level.
iv. The isolation should, ideally, be at the same level as the excitation forcing functions. This is usually not feasible for tools or fans.
v. Detaching a unit from its supporting structure via an isolation system will change both its rigidity and associated natural bending frequencies, which may be quite undesirable.
vi. Because of the situation described in item iii, the primary tool motions are side-to-side translation coupled with roll, i.e., horizontal movements. Typical “isolation” is based on unidirectional vertical motions with no rotation, which is an inappropriate representation.
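For reference, the idealized theory behind the 95% figure quoted in item ii can be written down in a few lines. The sketch below uses the standard textbook transmissibility relation for a single-degree-of-freedom mass on a spring and damper attached to a rigid base; the function names and the damping value are illustrative assumptions. It deliberately omits everything the list above identifies as important (floor elasticity, six degrees of freedom, rocking), which is precisely why its output should not be trusted for real installations.

```python
import math


def transmissibility(freq_ratio, damping_ratio=0.0):
    """Textbook single-degree-of-freedom transmissibility on a rigid base.

    freq_ratio is excitation frequency divided by the mount natural frequency.
    The undamped case is singular at freq_ratio == 1 (division by zero).
    """
    r = freq_ratio
    z = damping_ratio
    numerator = 1.0 + (2.0 * z * r) ** 2
    denominator = (1.0 - r * r) ** 2 + (2.0 * z * r) ** 2
    return math.sqrt(numerator / denominator)


def isolation_efficiency(freq_ratio, damping_ratio=0.0):
    """Fraction of the motion nominally removed by the idealized mount."""
    return 1.0 - transmissibility(freq_ratio, damping_ratio)


# The quoted 95% corresponds to an undamped frequency ratio of sqrt(21).
ratio_for_95 = math.sqrt(21.0)
print(f"frequency ratio {ratio_for_95:.2f}: "
      f"efficiency {isolation_efficiency(ratio_for_95):.2%}")

# Near resonance the same idealized mount amplifies rather than isolates.
print(f"at resonance, 5% damping: transmissibility {transmissibility(1.0, 0.05):.1f}")
```

Even in this idealized form, the mount only reduces motion for frequency ratios above the square root of 2, and it amplifies motion near resonance, which is consistent with the amplification reported for soft supports in this section.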
Reference 9 contains further discussion of the difficulties in isolating multi-degree-of-freedom systems. The key item, in many cases, is whether or not the isolation is effective in “filtering out” detrimental high frequency components of excitation. The softness of the isolation may allow excessive response motions that are detrimental to the components of the tool or fan. That merits careful consideration. One tool manufacturer has significantly improved vibration and production performance by using a stiffer, but not too stiff, internal dynamic system. The best recommendation, however, is to analyze and modify, as may be required, the internal dynamic system using modern structural dynamics methods. Nothing positive is typically achieved by providing an excessively stiff, massive floor structure to support a tool having a low stiffness internal dynamic system. Figure 16 gives vibration recording trace segments taken to assess the performance of a typical commercial “isolation” table. The traces again indicate that amplification rather than isolation is occurring. This not atypical result is actually due to the excitation of one of the table’s multimode vibration components and the table’s air-bag system “softness,” especially in its horizontal planes. Sophisticated, comprehensive analyses of the table system structural framing dynamics must be accomplished if the desired motion reduction effects are to be obtained. Situations should not be prejudged, however, but rather addressed carefully, with isolation performance being evaluated by a qualified structural dynamicist. The evaluation should include physical vibration measurements, with intuitive judgments being unacceptable. Response component measurements alone should be deemed equally unacceptable. It is recommended that such a third-party evaluation be made prior to acceptance of any installation.
Figure 16. Typical floor and “isolation” table ambient vibration trace segments. (a) Astronomy floor system lateral motions and (b) astronomy “isolation” table lateral motions (“floating”).
7.0 CONCLUSIONS AND COMMENTS
It is imperative that tool manufacturers reevaluate their practice of providing floor vibration specifications such as those discussed in preceding paragraphs. If a tool vibration specification has essentially no scientific basis,
it is of little or negative value and should not be included in the tool information package. The heavily-promoted, inappropriate, frequency-related component motion concepts are a part of the problem. Rather than specify easily usable, technically valid vibration criteria, such as floor system maximum total displacement, some tool manufacturers have presented their specification criteria in terms of frequency-related velocity or acceleration components (FFT, one-third octave band, power spectral density, etc.). This practice is also partially due to some of the measurement instrumentation that is commercially available. Such representations give no consideration to the following facts:
i. Vibratory motion, frequency-related, component criteria have gained no known acceptance by qualified structural dynamicists.
ii. Although such criteria are purported to be applicable to structural systems, no structural or structural dynamics textbook mentions, or even references, them. Fourier analysis is briefly addressed in elementary vibration textbooks; one-third octave bands are not. Even if a floor system vibratory situation were truly random and broadband, which it is not, frequency-related motion components must still be appropriately summed for dynamic response evaluations.[10]
iii. System frequencies may not be of importance at all. Dynamic response evaluations are often made without even computing frequencies. However, those can be easily determined if deemed necessary.
iv. Equipment engineers, as well as the structural engineers doing the static designs for advanced technology facilities, typically have no understanding of FFT, one-third octave acoustic band, power spectral density, etc., criteria or concepts, or the means to utilize such for preliminary designs. Since structural dynamicists don’t typically use those either, there is actually no reason they should have an understanding. All do understand total displacement, however.
v. The inadequacy of using only motion components to analyze vibratory situations was proven many years ago via numerous earthquake dynamic studies. As shown in Ref. 11, summations of mathematically-valid response spectra[12] components do not provide precise solutions, much less FFT representations.
vi. The minuscule, less than 1 micro-inch, displacements associated with the previously cited 100 micro-inches/sec velocity component criteria are known to be unachievable for typical fab excitations, including personnel traffic, etc. This is easily proven with an example. Consider a typical fab floor having a fundamental frequency of 30 Hz. Its displacement for a 100 micro-inch/sec velocity is 100/(2 × pi × 30) = 0.53 micro-inches zero-to-peak (or about 1.1 micro-inches, peak-to-peak). That value cannot even be achieved for a slab-on-grade subjected to the cited excitations. This fact also helps explain the excessively large structural systems sometimes found in fabs. The goal was to attempt to achieve the relatively unachievable, i.e., stiffness-of-a-rock for a suspended floor system.
Nevertheless, the stability and vibration-resistivity of the supporting floor system must be addressed in some fashion by both the tool manufacturers and equipment engineers. This is imperative because there are certainly limber floor systems and large bay sizes that would not be adequate for supporting many tools. Such addressing, however, must be easily understandable and usable by all parties. This includes the facility designers and users, virtually none of whom really understands the previously cited frequency-related, motion component criteria.
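The arithmetic in item vi above relies on the relation between velocity and displacement amplitudes for sinusoidal motion at a single frequency f, namely d = v/(2πf). A short sketch of that conversion follows. It treats the 100 micro-inch/sec value as a zero-to-peak amplitude, which is an assumption, since the cited criteria do not state how the velocity is defined; the function name is likewise illustrative.

```python
import math


def displacement_from_velocity(velocity_amplitude, freq_hz):
    """Displacement amplitude for sinusoidal motion, in the same length
    unit as the velocity amplitude (per second)."""
    return velocity_amplitude / (2.0 * math.pi * freq_hz)


# Values from the example in the text: 100 micro-inches/sec at 30 Hz.
zero_to_peak = displacement_from_velocity(100.0, 30.0)
peak_to_peak = 2.0 * zero_to_peak
print(f"zero-to-peak: {zero_to_peak:.2f} micro-inches")
print(f"peak-to-peak: {peak_to_peak:.2f} micro-inches")
```

Because displacement scales inversely with frequency, the same velocity criterion implies an even smaller displacement at higher frequencies.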
8.0 RECOMMENDED TOOL AND FLOOR VIBRATION CRITERIA
Considering the preceding sections, which have presented pertinent discussions of the subject matter, it may be concluded that the difficult-to-understand criteria and categorizations of tools (such as scanning electron microscopes, steppers, probers, etc.) are essentially baseless and misleading. A relevant example is the minimal attention given to probe and polisher systems. Both are not only affected by floor motions, but also generate considerable excitation through their operational processes. They should be placed on quite stable floors (certainly not “isolation”) so at least that part of the excitation is appropriately addressed.[4] It is relevant that the writer has studied more vibration-related problems for probers than for any other tool.
There is a continuing, instrument-output-related tendency to give floor vibration criteria in terms of frequency-related components for a broad frequency range. Doing so completely ignores basic floor system structural dynamics response and frequency fundamentals. Specifically, component measurements do not accurately depict floor responses for impulse-type excitations, while most of the 0–100 Hz frequency range is totally irrelevant. It is well known to structural dynamicists that floor systems act as “response filters.”[10] Specifically, they only respond significantly at their fundamental frequencies and filter out the rest, i.e., they simply don’t respond to much of the 0–100 Hz range. In other words, floor systems function in a fashion similar to properly-functioning “isolation” systems, the difference being that floor systems always work. Their frequencies and time-history responses can be accurately computed using modern finite element procedures;[1] thus random vibration, broadband considerations, which might be used for complex airplane structural systems, are quite inappropriate.
The means for specifying allowable floor vibrations have already been presented in the easy-to-use plots of Figs. 9 and 10. For the least vibration-sensitive tools, the floor system bay center total, impulse-induced, vertical displacement motion must not exceed 100 micro-inches, peak-to-peak (PTP). The associated bay center stiffness must be at least 1000 kips/inch. Structural engineers should be able to calculate the latter for preliminary static designs. For the more vibration-sensitive tools, the stringent 45 micro-inches, PTP, response and 4000 kips/inch stiffness values should be used.
It is certainly reasonable to utilize those values since they correspond to a typical slab-on-grade floor system rather than structural over-kill. Horizontal motions and stiffnesses should be based on the same values. Even with a minimal, but adequate, shear wall system, the floor horizontal motions are typically less than vertical motions. Such motions still merit appropriate consideration, however. The final theoretical response analyses, and evaluations, should always be made by a qualified structural dynamicist.
Displacement-based criteria, unlike frequency-related criteria, are very easy to use and have sound practical and theoretical bases. The writer has used the criteria for several hundred tool installations. Tool manufacturers have been reasonable to work with and all have given their approval to those criteria, or appropriate conversions of the same. Most important, there have been no known vibration-related difficulties with any of the installations.
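Because the recommended criteria reduce to two numbers per tool class, a preliminary screening check is simple to express. The sketch below encodes the two sets of values given in this section (100 micro-inches PTP with 1000 kips/inch, and 45 micro-inches PTP with 4000 kips/inch). The class and function names are hypothetical, and the check is no substitute for the response analyses by a qualified structural dynamicist that the text calls for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloorCriteria:
    """Bay-center limits for impulse-induced vertical motion."""
    max_ptp_displacement_uin: float    # micro-inches, peak-to-peak
    min_stiffness_kips_per_in: float   # kips/inch


# Values stated in the text; the constant names are illustrative.
LEAST_SENSITIVE_TOOLS = FloorCriteria(100.0, 1000.0)
MORE_SENSITIVE_TOOLS = FloorCriteria(45.0, 4000.0)


def floor_is_adequate(ptp_displacement_uin, stiffness_kips_per_in, criteria):
    """True if both the displacement and stiffness limits are satisfied."""
    return (ptp_displacement_uin <= criteria.max_ptp_displacement_uin
            and stiffness_kips_per_in >= criteria.min_stiffness_kips_per_in)


# Hypothetical bay: 80 micro-inches PTP response, 1500 kips/inch stiffness.
print(floor_is_adequate(80.0, 1500.0, LEAST_SENSITIVE_TOOLS))
print(floor_is_adequate(80.0, 1500.0, MORE_SENSITIVE_TOOLS))
```

Since the text recommends basing horizontal motions and stiffnesses on the same values, the same check could be applied to horizontal results.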
REFERENCES
1. Medearis, K., Rational Vibration and Structural Dynamics Evaluations for Advanced Technology Facilities, Journal of the Inst. of Environmental Sciences, Feature Paper (Sept.–Oct. 1995); Maurice Simpson Outstanding Technical Paper Award (1996)
2. Medearis, K., A Study of Wafer Building Structural and Soil Vibrations, A Report to Motorola Semiconductor Products Division, Phoenix, AZ (1974)
3. Semiconductor Industry Association (SIA), The National Technology Roadmap for Semiconductors (1994)
4. Medearis, K., Analyzing, Correcting Stepper System Vibration: A Case Study, MICRO-Cleanroom Technologies (Sept. 1997)
5. Medearis, K., Rational Damage Criteria for Low-Rise Structures Subjected to Blasting Vibrations, Institution of Civil Engineers, London (Sept. 1978)
6. Perkin-Elmer Corp. Microlign M500 Sensitivity to Floor Vibration and Acoustic Disturbances, MLD002356 (1986)
7. Medearis, K., Static and Dynamic Properties of Shear Structures, Proceedings of the RILEM International Symposium, Mexico City (1966)
8. Medearis, K., Fan Foundation Systems-Analysis and Design Guidelines, Electric Power Research Institute (EPRI) (1986)
9. Harris, C. M., and Crede, C. E., Shock and Vibration Handbook, McGraw-Hill, New York (1961)
10. Crandall, S. H., Random Vibrations, MIT Press (1963)
11. Clough, R., Earthquake Analysis by Response Spectrum Superposition, Bulletin of the Seismological Society of America (1962)
12. Medearis, K., Dynamic Characteristics of Ground Motions Due to Blasting, Bulletin of the Seismological Society of America (Apr. 1979)
9
Applications of Ion Microbeams Lithography and Direct Processing
John Melngailis
University of Maryland, College Park, Maryland
1.0 INTRODUCTION
Because ions incident on a substrate deliver both mass and energy, they can be used in many ways in microfabrication. For example, they can be used to image the surface, to implant and dope semiconductors, to mill the surface, or to induce chemical changes such as in exposing resist. The first requirement in using ions for microfabrication is the ability to modulate the dose on a surface (ideally in an on-off fashion) with submicrometer resolution. The three ways in which this modulation can be achieved are shown in Fig. 1. A point beam can be focused to a fine spot (~10 nm) from a bright “point” source and deflected on the surface. This kind of focused ion beam (FIB) is now widely used in the integrated circuits industry for failure analysis and for prototype circuit repair.[1] It may also be useful for lithography, i.e., resist exposure, and for direct implantation of dopants. The second way to form a modulated dose is to place a stencil mask in close proximity to a surface and irradiate the mask with a collimated beam of ions. This kind of shadow printing has been considered for lithography[2] and may have some special advantages for surface acoustic devices and for flat panel displays.[3]
Figure 1. Three ways of producing a patterned dose of ions on the surface: (a) focused (point) ion beam deflected in a programmed fashion, (b) masked ion lithography, stencil mask in close proximity to a wafer, and (c) ion projection lithography, stencil mask pattern projected (demagnified) on the wafer.
One can combine the first two techniques and use an ion optical column to image the pattern of a stencil mask onto a wafer. This is also being considered as a technique for lithography at ~100 nm minimum dimensions.[4][5] Ion projection lithography has been pioneered by a small company in Vienna, Austria (IMS) and is currently being pursued by a European consortium, which includes Siemens, IMS, ASML, and Leica. All of these ways of using ion microbeams depend on ion-surface interactions.
2.0 ION-SURFACE INTERACTION
Figure 2 illustrates the phenomena that characterize an energetic ion incident on a solid surface. All of them are exploited in some way in the various applications of ion microbeams. The main effects are:
i. Implantation. The ion penetrates into the solid and the new imbedded atom alters the properties of the solid. The most common example is semiconductor doping. The depth of penetration into the solid decreases with the increasing mass of the ion and the increasing density of the solid. The ion loses energy both to the electrons in the solid and to the atoms or nuclei of atoms in the solid by displacing them from their normal lattice sites. These two rates of energy loss are referred to as the electronic stopping power and nuclear stopping power. Generally, electronic stopping power is dominant for fast, light ions, and nuclear stopping power is dominant for heavier, slower ions. Since the scattering by nuclei can cause large changes in the momentum vector of the incoming ion, the final resting place of the ion in the solid is distributed in depth and laterally. This is illustrated in Fig. 3. The half-width of the distribution in depth around the average penetration depth, Rp (or range), is called the range straggle (∆Rp), while the half-width of the distribution parallel to the surface is called the lateral or transverse straggle (∆Rt). These quantities are tabulated[6] and are convenient for the initial design of semiconductor devices. More detailed models exist as well, for example, Monte Carlo simulation (TRIM)[7] or the so-called Marlowe simulation.[8]
Figure 2. Schematic of an ion penetrating the surface, showing sputtering of neutral atoms, emission of electrons, lattice damage, heat generation and implantation. In addition, the beam can produce chemical effects (bond breaking).
Figure 3. The distribution of a large number of ions entering the solid at the point z = 0, r = 0 in cylindrical coordinates. Due to random collisions, the ions are distributed in a cloud of density N(r,z). The profiles in both width and depth are modeled as Gaussians. Values of Rp, ∆Rp, and ∆Rt are tabulated[6] for various ions incident on Si and GaAs.
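The Gaussian cloud described for Fig. 3 can be written explicitly. The sketch below is one common way to do so, under two assumptions that are not stated in the text: the straggles ∆Rp and ∆Rt are treated as standard deviations of the depth and lateral Gaussians, and the presence of the surface is ignored so that the depth Gaussian is not truncated at z = 0. With those assumptions, N(r,z) = N0 / ((2π)^(3/2) ∆Rp ∆Rt²) × exp(−(z − Rp)²/(2∆Rp²)) × exp(−r²/(2∆Rt²)), where N0 is the number of ions entering at the point. The numerical values used are illustrative only and are not taken from the tabulated data.

```python
import math


def ion_density(r, z, n_ions, rp, d_rp, d_rt):
    """Gaussian ion density at radius r and depth z (ions per unit volume)."""
    norm = n_ions / ((2.0 * math.pi) ** 1.5 * d_rp * d_rt ** 2)
    depth_term = math.exp(-((z - rp) ** 2) / (2.0 * d_rp ** 2))
    lateral_term = math.exp(-(r ** 2) / (2.0 * d_rt ** 2))
    return norm * depth_term * lateral_term


def integrated_ions(n_ions, rp, d_rp, d_rt, steps=300):
    """Midpoint-rule integral of the density over r and z (sanity check)."""
    r_max = 6.0 * d_rt
    z_min, z_max = rp - 6.0 * d_rp, rp + 6.0 * d_rp
    dr = r_max / steps
    dz = (z_max - z_min) / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        ring_area = 2.0 * math.pi * r * dr   # annulus of the cylindrical shell
        for j in range(steps):
            z = z_min + (j + 0.5) * dz
            total += ion_density(r, z, n_ions, rp, d_rp, d_rt) * ring_area * dz
    return total


# Illustrative values only (nm); not taken from the tabulated data.
RP_NM, D_RP_NM, D_RT_NM = 100.0, 30.0, 25.0
check = integrated_ions(1.0e6, RP_NM, D_RP_NM, D_RT_NM)
print(f"ions recovered by integration: {check:.0f} of 1000000")
```

Integrating the density over the whole cloud should return the number of ions that entered, which is a quick way to confirm that the normalization factor is consistent with the cylindrical coordinates of Fig. 3.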
ii. Damage. As the ions enter the solid, they displace some of the host atoms from their normal crystal lattice sites. This damage is due to the nuclear stopping power. In fact, if the dose of ions is high enough, the crystal lattice order is completely destroyed and the surface layer penetrated by the ions becomes amorphous. The damage is generally distributed in a manner similar to the implanted ions as shown in Fig. 3, although there may be more damage near the surface.
iii. Chemical changes in the bulk. In addition to implantation and damage, the incident ions can also produce chemical changes in the solid. For example, in resist exposure for lithography, the ions alter the structure of the resist by producing cross linking between molecules in negative resist, or producing chain scission in positive resist. These chemical changes, at least for the positive resist PMMA, correlate with the electronic stopping power of the resist rather than the nuclear stopping power of the ions.[9] This is shown in Fig. 4, where the sensitivity (or dose needed for resist exposure) is plotted vs. nuclear and vs. electronic stopping power. A strong correlation with electronic stopping power is evident.
iv. Sputtering. Incident energetic ions will cause atoms to be ejected from the surface, leading to an erosion of the surface. This “ion milling” technique is used in various areas of microfabrication. For example, using a focused ion beam, as in Fig. 1 (a), one can mill trenches in the surface of a fabricated integrated circuit to uncover defects under the surface or to cut conductors, thereby altering the functioning of the circuit. The sputtering process, particularly for energetic ions, is thought to occur via collision cascades. The incoming ion may impart energy and momentum to a host atom, which will send it in some direction, perhaps toward the surface. This atom may, in turn, impart energy and momentum to other atoms of the host crystal lattice. If an atom in the cascade reaches the surface, or imparts enough energy to a surface atom to remove it from the surface (generally ~5 eV), a host atom is ejected. The sputtering yield, the number of atoms ejected per incident ion, is in the range of 2–30 atoms/ion. Some typical values are shown in Table 1. If a focused ion beam is used [Fig. 1 (a)], this erosion can be controlled with very high resolution, and trenches with dimensions as small as 20 nm can be milled.[10][11] The milling yield generally increases as the angle of incidence goes from normal toward glancing incidence.[13] This can be exploited to enhance the material removal rate.[12][14]
Figure 4. Minimum exposure dose of PMMA vs. stopping power (nuclear or electronic) for various ion species and ion energies. Each point is labeled with the ion energy in keV and the stopping power (n-nuclear or e-electronic) which is plotted. The correlation is seen to be much better with electronic stopping power than with nuclear.[9]
Table 1. Ion Milling Yields for Focused Ga (Atomic Mass 69) and Broad Beam Kr Ions (Atomic Mass 84)
Substrate: Si, Si, Si, Si, Si, Au, Au, Au (plated), Au (evap.), Au, W (RF sputt.), W, SiO2, SiO2, SiO2
Ion: Ga+, Ga+, Ga+, Ga+, Kr+, Ga+, Ga+, Ga+, Ga+, Kr+, Ga+, Kr+, Ga+, Ga+, Ga+
Energy (keV): 30, 30, 25, 25, 25, 100, 40, 25, 25, 25, 45, 25, 22, 68, 25, 30
Yield (atoms/ion): 3.1 ±0.8, 2.1, 2.6, 3.9 ±0.4, 3.1, 32, 15.7 ±1.3, 18 ±3, 23 ±5, 20, 28, 5 ±0.7, 4.1, 2.0, 0.84, 0.85
Reference: [a], [b], [c], [d], [e], [f], [g], [d], [d], [e], [d], [e], [h], [d], [b]
[a] Yamuguchi, H., J. De Physique Colloque C6, Suppl. 11(48):C6–165 (Nov. 1987)
[b] Santamore, D., Edinger, K., Orloff, J., and Melngailis, J., J. Vac. Sci. Technol. B, 15:6 (1997)
[c] Pellerin, J. G., Shedd, G. M., Griffis, D. P., and Russell, P. E., J. Vac. Sci. Technol. B, 7:1810 (1989)
[d] Xu, X., Della Ratta, A. D., Sosonkina, J., and Melngailis, J., J. Vac. Sci. Technol. B, 10:2675 (1992)
[e] Anderson, H. H., and Bay, H. L., Sputtering by Particle Bombardment I, Physical Sputtering of Single Element Solids, (R. Behrisch, ed.), p. 145, Springer Verlag, Berlin-Heidelberg (1981)
[f] Muller, K. P., and Petzold, H. C., SPIE, 1263:12 (1990)
[g] Blauner, P. G., Butt, Y., Ro, J. S., and Melngailis, J., J. Vac. Sci. Technol. B, 7:609 (1989)
[h] Melngailis, J., Musil, C. R., Stevens, E. H., Urlaut, M., Kellogg, E. M., Post, R. J., Geis, M. W., and Mountain, R. W., J. Vac. Sci. Technol. B, 4:176 (1986)
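A sputtering yield translates directly into a volume removal rate once the ion dose and the atomic density of the substrate are known. The sketch below shows that conversion under several assumptions not made in the text: the ions are singly charged, the yield is constant (normal incidence, no redeposition), and silicon has an atomic density of about 5.0 × 10^22 atoms/cm³. The yield of 2.1 atoms/ion is an illustrative value within the 2–30 atoms/ion range quoted above, and the function names are hypothetical.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19
SI_ATOMIC_DENSITY_PER_CM3 = 4.99e22   # approximate value for crystalline Si
UM3_PER_CM3 = 1.0e12


def milled_volume_um3(yield_atoms_per_ion, dose_nc, atomic_density_per_cm3):
    """Volume removed (cubic micrometers) by a dose of singly charged ions."""
    ions = dose_nc * 1.0e-9 / ELEMENTARY_CHARGE_C
    atoms_removed = yield_atoms_per_ion * ions
    return atoms_removed / atomic_density_per_cm3 * UM3_PER_CM3


def milling_time_s(volume_um3, current_na, yield_atoms_per_ion,
                   atomic_density_per_cm3):
    """Time to remove a volume at constant beam current (1 nA = 1 nC/s)."""
    rate_um3_per_nc = milled_volume_um3(yield_atoms_per_ion, 1.0,
                                        atomic_density_per_cm3)
    return volume_um3 / (rate_um3_per_nc * current_na)


rate = milled_volume_um3(2.1, 1.0, SI_ATOMIC_DENSITY_PER_CM3)
time_needed = milling_time_s(1.0, 1.0, 2.1, SI_ATOMIC_DENSITY_PER_CM3)
print(f"removal rate: {rate:.3f} cubic micrometers per nC")
print(f"time for 1 cubic micrometer at 1 nA: {time_needed:.1f} s")
```

In practice the effective rate differs from this estimate because the yield depends on the angle of incidence, as noted in the text, and because sputtered material can redeposit in deep or narrow features.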
v. Ion induced deposition. Incident ions can be used not only to remove material but also to add material to the surface. This generally requires a precursor gas. (To add material directly from the incident ion beam, the ion energy has to be reduced to around 10 eV or below; otherwise the sputter yield exceeds deposition. This can be, and has been, done but is not as useful as the induced deposition.) The precursor gas molecules, such as W(CO)6, adsorb on the surface, usually as a monolayer,[15] and the incident ion beam causes the adsorbed gas molecules to dissociate, leaving the metal behind. Of course, the incident ions will also sputter the newly deposited material, but net deposition will occur if the dissociation yield is higher than the sputter yield. This is generally the case if the gas supply is not exhausted by the dissociation process. In focused ion beam induced deposition, the gas is supplied to the surface by aiming a fine capillary tube at the area scanned by the ion beam, as shown in Fig. 5. Thus, a pressure in the millitorr range is produced locally under the mouth of the tube, while in the rest of the chamber, and in the ion column, the pressure is many orders of magnitude lower (as needed for ion beam propagation). So far, most of the precursor gases used are organometallics, and as a result the “metal” deposited on the surface contains a substantial percentage of carbon, which raises the resistivity. In some cases (for example, gold and copper), the deposited material segregates into islands of metal in a background of carbon.[16]–[18] The carbon incorporation can be minimized by heating the substrate during deposition to around 100°C.[18][19] Ion induced deposition is widely used in conjunction with focused ion beams, for example, to deposit conductors in integrated circuit rewiring, to deposit absorbers on lithography masks, or to deposit a protective layer for sample cross sectioning. Deposition of insulators, namely SiO2, has also been developed (see Sec. 4.1). Various precursor gases have been tried, and the best ones yield oxide layers with resistivities of order 8 × 10¹¹ Ω·cm. Insulating films are needed in circuit restructuring if a contact has to be made through or across an existing conductor film.
Figure 5. Top. Schematic of the FIB column and chamber with gas feed system. Bottom. Close-up of the capillary gas feed tube.
vi. Ion assisted etching. When an adsorbed gas can be caused to react with the substrate, and the reaction products are volatile, material removal at rates much higher than sputtering alone can be achieved. Thus, for example, H2O will cause rapid etching of organic films and diamond,[20] and Cl2 or XeF2 will cause rapid etching of Si or SiO2.[21][22] In general, the use of a reactive gas increases the material removal rate by a factor of 10 or so. The apparatus used is the same as shown in Fig. 5. In some cases, one can take advantage of selectivity. For example, H2O will increase the removal rate of an organic film such as polyimide by a factor of 20, while the removal rate of Al or Si is reduced, compared to milling. This is thought to be due to the fact that oxides build up on the surfaces of Al and Si as they are milled in the presence of water vapor, thus slowing the removal rate.
vii. Electron emission. When an energetic ion is incident on a surface, a few electrons are also typically emitted. These electrons have an energy distribution peaked at a few eV even though the ions may have many keV of energy. The electrons are used in imaging of the sample with the focused ion beam in the same way that the secondary electrons are used in a scanning electron microscope (SEM). In effect, we have a scanning ion microscope.
Of the three ways of generating ion microbeams shown in Fig. 1, the focused ion beam is the most flexible and exploits most of these ion-surface interactions.
3.0 FOCUSED ION BEAMS
Driven mainly by applications in the semiconductor device industry and the discovery of the liquid metal ion source,[23] focused ion beam systems have been developed extensively over the past 15 years. A number of reviews of this technology have been written,[24]–[26] as well as two books.[1] Only the main features of FIB systems will be summarized here.
3.1 Machinery
By use of electrostatic lenses, ions emanating from a “point” source are refocused to a point on the sample which can be deflected over some chosen area (typically up to about 200 × 200 µm). A schematic of a typical FIB system is shown in Fig. 6. The three main parts of the FIB system are the source, the ion column, and the beam writing mechanism.
Figure 6. Schematic of the focused ion beam column in use at MIT. The accel lens can be raised up to 120 kV, the pre-accel lens, an additional 30 kV. The extraction voltage adds another 6–10 kV. (The einzel lens does not change the energy of the beam.)
3.2 Point Sources of Ions
The important property of the source of ions in focused ion beam systems is its brightness (measured in A/cm^2 steradian). Thus a large current of ions emitted into a small solid angle from a “point” source is desirable. Since charged particles repel one another, the implied confinement of such a configuration is hard to achieve. An extreme environment is needed, such as that provided by a sharp tip with a high electric field. The two types of ion sources, liquid metal[23][27]–[30] and gaseous (cryogenic) field ion sources,[31]–[35] both use this configuration. Of the two, the liquid metal source has so far proven to be the more important one technologically.
Liquid Metal Sources. In the case of liquid metal sources, the sharp tip is largely achieved by the fact that the electric field pulls the conducting fluid into a cusp from which the ions are emitted (see Fig. 7). The remarkable fact is that this structure works well at the low currents desirable in focused ion beam applications. Many sources can operate for long periods of time with stable extraction currents. In some cases, a servo system is installed which varies the potential on the emission tip to maintain constant current. This voltage is controlled by an auxiliary electrode so that the energy of the ions entering the rest of the ion column remains constant.
Figure 7. Schematic of the liquid metal ion source in cross section. The enlarged view of the tip shows the metal being pulled into a cusp by the electric field.
The ions emitted by the liquid metal ion source have an energy spread in the range of 5–10 eV. This appears to be unavoidable[36] and is the most important factor in limiting beam diameter and beam current density. The mechanism for this limitation is chromatic aberration, i.e., ions of different energies are focused at different points in the column, in effect increasing the beam diameter.
The energy spread in general increases with increasing ion current from the source.[36] The Coulomb repulsion between ions which are at random spacings from one another has been proposed as an explanation.[37] However, this model does not appear to account for the observation that the energy spread does not go to zero as the source ion current is reduced. The energy spread for a Ga source decreases as the current is lowered to 1 µA but then saturates and remains flat at a value of 4.5 eV (FWHM) even down to currents of 5 nA.[36] Another related fundamental property of the liquid metal sources is the virtual source size. Just as energy spread is thought to be caused by Coulomb repulsion along the direction of travel, virtual source size is thought to be caused by the Coulomb repulsion normal to the direction of travel.[37] The virtual source size will ultimately limit the beam diameter, i.e., the ion optics of the column can do no better than to image the source (albeit demagnified) onto the sample. The random transverse velocities make it appear that the source is 50–100 nm in diameter, which is larger than the emission point (see Fig. 8). Both the energy spread and the virtual source size are expected to decrease with decreasing ion extraction current. (This has been clearly documented only for the energy spread.)[38] In focused ion beam columns, one accepts only the very central part of the ion emission cone, a few milliradians from the axis (see Fig. 8). This central part of the distribution is a weak function of the total extraction current. Increasing the extraction current mainly spreads out the emission cone rather than increasing the current at the center.[39] For example, increasing the total extraction current of a Ga+ ion source by a factor of 10, from 0.75 to 7.5 µA, only increases the current at the center of the cone by 50%.
Thus, liquid metal ion sources should be operated at as low a current as possible, both to prolong their life and to decrease the energy spread. Although there are still unanswered questions in the details of source operation (for example, to what extent source current is limited by space charge effects vs. hydrodynamic effects), many useful liquid metal sources have been built. Besides Ga, other frequently used sources are AuSi, AuSiBe,[40]–[42] PdAs,[43] PdAsB,[43] NiB,[44] and NiAs,[45] since they provide the principal dopants of GaAs and Si. Many of these sources have long lifetimes. Commercially available Ga+ sources operate in excess of 1000 hours.
Figure 8. A further enlarged view of the tip of the cusp (Fig. 7) showing the extrapolation of the perturbed ion trajectories back to the source. The waist formed by these trajectories defines the virtual source size.
Although, at this point, source operation is adequate for many focused ion beam applications, improvements which continuing research brings will be welcome; in particular, increased lifetime, reduced energy spread, increased stability, increased brightness, and new ion species. Gaseous Field Ion Sources. This type of source is operated at cryogenic temperatures (liquid helium) and condenses either He, H2, or Ne on the tip.[31]–[35][46] (see Fig. 9). Note that emission can also occur by field ionization directly from the gas. In that case, cryogenic cooling is not essential. However, much higher currents and source brightnesses are observed when sources are cooled.[35][47][48] The main problems with the gaseous field ion source are the uncertainty of the emission point which leads to a non-axial emission cone,[47] and the instability of the emission current due to contamination.[49]
Figure 9. Schematic of gaseous field ion source. The emission tip and the entering gas are cooled to near liquid helium temperatures. The atomically sharp tip is thought to ionize gas atoms directly as well as gas condensed on the needle.
Progress toward a reliable field ion source has been reported. Use of a specially constructed tungsten (111) emitter tip and highly purified He gas has permitted stable operation of the source for up to 170 hrs.[50] In addition, the field ion source has an energy spread of the ions of only ~1 eV. The available current density in the beam spot on the sample is, therefore, expected to be 10–100 times higher than in the case of liquid metal sources. Work is continuing to make the gas field ion sources practical. The motivation is the increased current density, which would speed up nanometer scale lithography, and the possibility of having non-contaminating rare gas ions for integrated circuit on-line examination.
3.3 Ion Column
The ion column mainly contains electrostatic ion lenses, but also may contain mass separators if alloy sources are used (see Fig. 6). It focuses the ions from the liquid metal source on the sample surface. In most operating regimes, the beam spot size and current density are limited by chromatic aberration and the virtual source size. Thus the total beam diameter focused on a sample is given by[51] (see the books cited in Ref. 1 or other texts[52]):

Eq. (1)    $d = \left(d_o^2 M^2 + d_c^2\right)^{1/2}$

where d_o is the virtual source size, M is the magnification, and d_c is the contribution due to chromatic aberration, given by:[30]

Eq. (2)    $d_c = 2\, C_{co}\, M\, \alpha_o\, \dfrac{\Delta E}{E}$

where C_co is the chromatic aberration coefficient referred to the source side, α_o the half angle of the beam accepted by the beam defining aperture (see Fig. 8), M the magnification of the system (usually less than 1), E the final energy of the ions, and ∆E the energy spread. Unless one is interested in the ultimate in minimum beam diameter, the d_o M term is small compared to d_c and the beam diameter is determined by d_c. The current density J in the focal spot is then constant since J = 4I/πd_c^2 (I is the beam current), and I ~ α_o^2 and d_c ~ α_o, so that α_o cancels. In practice, α_o is a few milliradians. For larger α_o, spherical aberration, which is proportional to α_o^3, may begin to play a role. Underlying these simple expressions for the beam diameter is the complicated dependence of the chromatic aberration coefficient C_co on the layout of the electrodes and the values of the electrode potentials in all of the electrostatic lenses in an ion column. Ray tracing computer programs are used to calculate the aberration coefficients.[53] Generally, these programs start with some existing or envisioned electrode/potential configurations. The reverse calculation has also been implemented,[54] i.e., given certain constraints, to compute the electrode and potential configuration which will minimize the aberration coefficients. These optimized configurations predict increases of current density of as much as a factor of thirty-three, compared to conventional lenses with the same source parameters and working distances.[54] This prediction has not been verified experimentally.
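As a rough numerical illustration of Eqs. (1) and (2), the following Python sketch evaluates the beam diameter for several acceptance half angles and checks that the current density does not depend on α_o when the chromatic term dominates. The chromatic aberration coefficient, the magnification, and the source angular intensity used here are assumed values chosen only for illustration; they are not taken from the text or from any particular column.

```python
import math

def chromatic_diameter(c_co, mag, alpha_o, delta_e, energy):
    """Eq. (2): d_c = 2 * C_co * M * alpha_o * (dE / E)."""
    return 2.0 * c_co * mag * alpha_o * delta_e / energy

def beam_diameter(d_o, mag, d_c):
    """Eq. (1): quadrature sum of the demagnified source and the chromatic term."""
    return math.sqrt((d_o * mag) ** 2 + d_c ** 2)

def current_density(current, d_c):
    """J = 4 I / (pi d_c^2), the chromatic-limited case discussed in the text."""
    return 4.0 * current / (math.pi * d_c ** 2)

# Assumed, illustrative inputs (SI units)
C_CO = 0.5             # chromatic aberration coefficient, m (hypothetical)
MAG = 0.5              # magnification, less than 1 (hypothetical)
D_O = 50e-9            # virtual source size, m (text quotes 50-100 nm)
DELTA_E = 10.0         # energy spread, eV
ENERGY = 100e3         # final ion energy, eV
ANG_INTENSITY = 20e-6  # source angular intensity, A/sr (hypothetical)

def spot(alpha_o):
    """Return (d, J) in SI units for an acceptance half angle alpha_o (rad)."""
    current = ANG_INTENSITY * math.pi * alpha_o ** 2   # I scales as alpha_o^2
    d_c = chromatic_diameter(C_CO, MAG, alpha_o, DELTA_E, ENERGY)
    return beam_diameter(D_O, MAG, d_c), current_density(current, d_c)

for alpha in (1e-3, 2e-3, 4e-3):
    d, j = spot(alpha)
    print(f"alpha_o = {alpha * 1e3:.0f} mrad: d = {d * 1e9:.1f} nm, "
          f"J = {j / 1e4:.2f} A/cm^2")
```

In this sketch the printed current density is the same for every α_o, while the beam diameter grows with α_o, which is the trade-off between spot size and beam current described above.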
A low aberration lens system designed in a more conventional fashion also predicts up to 20 times higher current density than hitherto reported.[55] However, this is obtained in part by reducing the working distance (in effect, decreasing M in the above equations). All of these calculations consider the focused ion beam going down the axis of a cylindrically symmetrical column. There are two ways in which this assumption is violated in practice. The first is the unintentional misalignment of the column. Lens element displacement, tilt, and ellipticity may occur and have been analyzed.[56] For a permitted blurring of an originally 0.1 µm beam by 0.05 µm, a few µm of misalignment or ellipticity and 0.5–1 mrad of tilt are permitted. Thus, as expected, lens elements must be very precisely machined and aligned. The other departures from cylindrical symmetry are intentional, namely E×B mass separation (needed if alloy liquid metal sources are used) and beam deflection.[57] In both cases, the energy spread is again responsible for additional beam blurring. The mass filter is, in effect, a velocity filter, and the energy spread will result in astigmatism. In addition, beam deflection is proportional to the ratio of the applied transverse voltage to the acceleration voltage of the beam. The energy spread translates into an uncertainty of the accelerating voltage. A simple calculation [see Eq. (3)] shows that a 10 eV energy spread in a 100 kV column results in 20 nm blurring in the direction of deflection for a beam deflected by 100 µm. This can be minimized by dynamic stigmation or by using only small deflection fields. Within the recognized bounds of these limitations, there has still been progress in column performance. Beam diameters down to 0.05 µm are quoted routinely even for columns with mass separation. With Ga+ ions, beam current densities up to 10 A/cm^2 are possible and beam diameters in the 10 nm regime have been obtained.
With alloys, the current density is lower, often by a factor of 10. With special efforts, including reducing the magnification M to 0.14, features down to at least 15 nm width have been written in PMMA using Ga ions.[58]
3.4 Beam Writing
In any practical applications of focused ion beams, we must have the ability to deliver a given dose in desired patterns on the surface. To do this, the system must be able to deflect the beam, to turn it on and off (beam blanking), and to align to existing features. In addition, since most patterning is likely to be over entire wafers, or at least over a chip, and since the
maximum practical deflection of the ion beam is only a few hundred micrometers, the stage holding the sample must be moved, and the stage motion and beam deflection must be aligned to one another.
Beam Deflection. Beam deflection is accomplished by a transverse electric field generated by suitable electrodes. Simple parallel plates or octopoles are typically used. Although deflection appears to be straightforward, it is responsible for some of the system limitations such as writing speed and field size. This can be illustrated by a simple example as shown in Fig. 10. If we consider a transverse electric field V_x/d_o acting on the beam over a distance L, and an ion of mass m, charge e, and axial velocity v_z, then simple classical mechanics permits one to derive an expression for the angle of deflection θ (assumed small):

Eq. (3)    $\theta = \dfrac{e V_x L}{d_o\, m\, v_z^2}$

If the entering ion has been accelerated by a voltage V_z, then eV_z = m v_z^2/2 and we have:

Eq. (4)    $\theta = \dfrac{V_x L}{2 V_z d_o}$
Some typical numbers are: d_o = 5 mm, L = 25 mm, W = 30 mm, V_x = 100 V, and V_z = 150 kV. Then we get x ~ Wθ = 50 µm, which is a typical deflection. Ideally, the deflection signal should be a square wave with rise and fall times of a few nsec for, say, 10 MHz writing speeds. Also the deflection signal must be free of noise or ringing to one part in 10^4 (or 20 mV), if the degradation of the beam position is to be less than 10 nm. Note that noise on the deflection plates simply looks like an increase in beam diameter in most cases. The generation of a fast, clean high voltage deflection signal is a serious electronics challenge. Possible design trade-offs are clear from Eq. (4). Lower deflection voltage can be used if L is increased and d_o is decreased. But in practice, there are limits to the range of these dimensions, and in fact, they cannot deviate much from the sample values we have picked. For example, if the beam deflection is located below the final lens, as shown in Fig. 6, then increasing the length L will increase the working distance, increase the magnification M, and lead to larger beam diameters as can be seen from Eq. (2). Decreasing the spacing of electrodes (d_o) makes beam alignment more critical.
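The numbers quoted above can be checked with a short Python sketch of Eq. (4). It uses only the typical values given in the text; the interpretation of W as the lever arm over which the deflected beam drifts to the sample is an assumption based on the relation x ~ Wθ.

```python
def deflection_angle(v_x, length, v_z, gap):
    """Eq. (4): small-angle deflection theta = V_x L / (2 V_z d_o), in radians."""
    return v_x * length / (2.0 * v_z * gap)

# Typical numbers quoted in the text (SI units)
D_O = 5e-3        # plate spacing d_o, m
L_PLATE = 25e-3   # plate length L, m
W = 30e-3         # lever arm W, m (assumed meaning: drift distance to the sample)
V_X = 100.0       # deflection voltage, V
V_Z = 150e3       # accelerating voltage, V

theta = deflection_angle(V_X, L_PLATE, V_Z, D_O)
x = W * theta                          # beam displacement at the sample, m
sensitivity = x / V_X                  # displacement per volt on the plates, m/V
noise_for_10nm = 10e-9 / sensitivity   # plate voltage noise that moves the beam 10 nm

print(f"theta = {theta * 1e3:.2f} mrad, x = {x * 1e6:.0f} um")
print(f"voltage noise giving 10 nm of beam motion: {noise_for_10nm * 1e3:.0f} mV")
```

Because θ is linear in V_x, the tolerable noise voltage scales directly with the tolerable position error, which is why a small field of view relaxes the requirement on the deflection amplifier.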
Figure 10. The parameters used to discuss beam deflection. An ion of velocity v_z is incident on the axis halfway between the deflection plates and ends up being deflected through an angle θ.
Note also, the energy spread ∆E of the ions will lead to a smearing of the beam in the direction of deflection, $\Delta x / x = -\Delta E / (e V_z)$. For ∆E = 10 eV, V_z = 100 kV, x = 150 µm, we get ∆x = 15 nm. This is not negligible. Nevertheless, within these limitations, focused ion beam writing at a few MHz has been demonstrated.
Beam Blanking. Beam blanking is usually done further up in the column, where the beam has lower energy. Together with deflection on the sample, it is an essential part of any writing. Blanking is done by deflecting the beam above an aperture, preferably where the beam has a crossover. For the system depicted in Fig. 6, the blanking plates are between the E×B mass
separator and the mass separating aperture, and they deflect the beam perpendicular to the deflection direction of the E×B mass separator. Because the blanker has a finite length and displaces the beam laterally before sweeping it off the aperture, the beam has “tails” both in time and in space.[26][59] The blanking pulse needs to rise fast from zero and return cleanly and quickly to zero, but can have a noisy maximum.
Transit Time. The transit time of the ion in a column must also be considered. The beam blanking and beam deflection in general do not occur in the same place along the beam path; for example, the blanking may be done early in the column and the deflection just before the beam is incident on the substrate. The distance between these two locations might be 30 cm. Sample transit times are given in Table 2. This transit time has to be taken into account when the blanking and deflection are synchronized.
Table 2. Time to Travel One Centimeter (ns)

Species      Energy (keV)
             30       100      300
Electron     0.097    0.053    0.031
Proton       4.17     2.28     1.32
H2           5.90     3.23     1.86
He           8.32     4.56     2.63
Be           12.5     6.84     3.95
Ga           34.7     19.0     11.0
The transit time of the beam through the deflector and the blanker itself presents a more fundamental limit to writing time. Scaling the numbers from Table 2, for example, we find that a 10 keV Au ion travels 1 cm in 0.1 µsec. Thus since the blanker and deflector are about 1 cm long in the direction of beam travel, writing speeds of 10 MHz would not be possible in this case. For lighter ions at higher energy, this limitation is not yet serious.
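The scaling used above follows from the nonrelativistic relation E = mv^2/2, so that the time to travel a fixed distance varies as the square root of m/E. The Python sketch below reproduces the ion entries of Table 2 and the 10 keV Au estimate under that assumption. The atomic masses are approximate standard values supplied here, not taken from the text, and electrons are left out because the nonrelativistic formula becomes a poor approximation for them at these energies.

```python
import math

AMU = 1.66053907e-27         # atomic mass unit, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C (joules per eV)

# Approximate masses in atomic mass units (supplied for illustration)
MASSES = {"Proton": 1.00728, "H2": 2.016, "He": 4.0026,
          "Be": 9.0122, "Ga": 69.723, "Au": 196.967}

def transit_time_ns_per_cm(mass_amu, energy_kev):
    """Nonrelativistic time in ns for a particle of the given kinetic energy to travel 1 cm."""
    energy_j = energy_kev * 1e3 * E_CHARGE
    velocity = math.sqrt(2.0 * energy_j / (mass_amu * AMU))   # m/s
    return 1e-2 / velocity * 1e9

for species in ("Proton", "H2", "He", "Be", "Ga"):
    times = [transit_time_ns_per_cm(MASSES[species], e) for e in (30, 100, 300)]
    print(species.ljust(8), "  ".join(f"{t:6.2f}" for t in times))

# Delay between a blanker and a deflector 30 cm apart, 30 keV Ga
delay_ns = 30 * transit_time_ns_per_cm(MASSES["Ga"], 30)
print(f"30 keV Ga over 30 cm: {delay_ns:.0f} ns")

# Transit through a 1 cm long blanker or deflector, 10 keV Au
t_au = transit_time_ns_per_cm(MASSES["Au"], 10)
print(f"10 keV Au over 1 cm: {t_au:.0f} ns")
```

For the 30 cm separation mentioned above, a 30 keV Ga ion arrives at the deflector roughly a microsecond after passing the blanker, which is the offset that the synchronization of blanking and deflection has to absorb.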
Alignment and Imaging. Alignment to existing features is essential in most cases for focused ion beam writing, circuit modification, or failure analysis. Alignment is achieved by using the focused ion beam itself to image. In this mode, the machine is used as a scanning ion microscope, which in principle operates in the same way as a scanning electron microscope (SEM). The beam is raster scanned over the sample and secondary electrons emitted from the point of ion incidence are collected (see Fig. 11). The secondary electron signal in turn modulates the intensity of a synchronously scanned cathode ray tube. Image contrast can arise from differences in topography or from differences in materials. Figure 12 shows an example of a scanning ion microscope image as well as the same feature taken by an SEM. Note that focused ion beam imaging erodes the surface. In most cases, an image can be formed at the sacrifice of a few monolayers. Use of frame storage will reduce unnecessary erosion.
Figure 11. Schematic of the secondary electron collection with either a multichannel plate (top) or a channel electron multiplier (bottom).
Figure 12. Example of the same aluminum conductor being imaged in the scanning ion microscope mode (left) and an SEM (right). The two concentric square vias and the shallow square were all milled with the same 68 kV Ga+ ion beam that was used for the imaging.
For accurate alignment mark observation, the nature of the secondary electron detector is important. If the detector is a channel electron multiplier (as shown in Figs. 6 and 11 bottom) or a photomultiplier/scintillator detector, then the imaging is not symmetrical, and a step will look different depending on its orientation with respect to the detector. For this reason, most focused ion beam systems now use microchannel plates, which are circularly symmetrical about the incident beam and therefore detect the secondary electrons symmetrically (see Fig. 11 top). A typical alignment mark, which might be a raised (or indented) cross on the surface, would therefore appear as shown[60] in Fig. 13. As the ion beam scans over the edge of a step, the secondary electron emission increases sharply because the electrons can escape more readily from an edge than from a flat surface. The secondary electron signal for a scan across one arm of the cross has two sharp peaks, one for each edge, as shown in Fig. 13. Note that, as discussed above, for the microchannel plate the two peaks are equal, while for the asymmetric detector (in this case a scintillator and photomultiplier) they are not.[60] The emission of electrons has, in fact, been calculated as a function of the step wall angle for 160 keV Si ions incident on GaAs. An inclination of 85° was found to be optimum, i.e., on the step wall, the ion beam is at a 5° grazing angle.[61][62] Under favorable conditions, alignment to 0.1 µm was reported.[60] The theoretical resolution of the
alignment mark position is calculated to be 0.01 µm.[62] The reason the achieved alignment is not as impressive has been attributed to vibration. Other factors which affect the ability to align include the edge roughness in the alignment mark and the erosion of the step during scanning.
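One simple way to turn the two edge peaks of such a line scan into a mark position is to take the midpoint of the two peak centroids. The Python sketch below illustrates this on a synthetic secondary-electron signal. The signal shape, the pixel size, and the centroid method itself are assumptions made for illustration and are not the procedure of the cited work; the sketch also ignores noise, vibration, edge roughness, and erosion of the step.

```python
import math

def synthetic_scan(n=400, pixel_nm=10.0, left_edge=1200.0, right_edge=2600.0,
                   edge_width_nm=40.0):
    """Hypothetical line scan: flat background plus one Gaussian peak per step edge."""
    xs = [i * pixel_nm for i in range(n)]

    def peak(x, x0):
        return math.exp(-0.5 * ((x - x0) / edge_width_nm) ** 2)

    signal = [1.0 + 3.0 * peak(x, left_edge) + 3.0 * peak(x, right_edge) for x in xs]
    return xs, signal

def edge_centroid(xs, signal, lo, hi, background):
    """Intensity-weighted centroid of the signal above background within [lo, hi]."""
    pts = [(x, s - background) for x, s in zip(xs, signal)
           if lo <= x <= hi and s > background]
    total = sum(w for _, w in pts)
    return sum(x * w for x, w in pts) / total

def mark_center(xs, signal):
    """Midpoint of the two edge-peak centroids; assumes one edge in each half of the scan."""
    background = min(signal)
    mid = 0.5 * (xs[0] + xs[-1])
    left = edge_centroid(xs, signal, xs[0], mid, background)
    right = edge_centroid(xs, signal, mid, xs[-1], background)
    return 0.5 * (left + right)

xs, sig = synthetic_scan()
print(f"estimated mark center: {mark_center(xs, sig):.1f} nm")
```

A centroid of this kind can locate an edge more finely than the pixel spacing on clean data, which is consistent with the calculated resolution being much better than the beam diameter; in practice the factors listed above set the achievable accuracy.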
Figure 13. (a) Examples of signals from an alignment mark as the focused ion beam is scanned in a line across a silicon pedestal. If a photomultiplier detector is used, the signal is seen to be asymmetrical. If a microchannel plate is used, the signal from the two sides of the pedestal is seen to be the same. (b) Image of a cross formed with a microchannel plate detector. (The signals in the upper part of the figure would be obtained by scanning the focused ion beam across one arm.)
In some circumstances, where a focused ion beam step is the first of a sequence, alignment marks can be milled into the surface by scanning with an appropriately heavy dose (~10^18 ions/cm^2). The ion beam profile is Gaussian over 2–3 orders of magnitude, but below that it has “tails” where the fall-off is slower. Clearly, alignment marks must be located at an appropriate distance from the fabricated features. Otherwise, the tails of the beam, combined with the heavy doses needed to mill alignment marks or even to view them (~10^15 ions/cm^2), may produce unwanted doses in the active areas.
Writing Strategies. How the beam is deflected over the surface depends on the application.
i. In the Ga ion columns widely used in the semiconductor industry for integrated circuit failure analysis and for repair, a large dose of ions is typically needed (10^16–10^19 ions/cm^2) to erode or deposit a significant volume of material. The typical strategy here is to image an area on the screen, outline an area with a box using the computer mouse, and then program the beam to scan over this area to deliver a desired dose. The beam is usually scanned digitally, i.e., stepped from point to point, and the operator specifies the dwell time and the step size. If deposition is desired, a precursor gas is supplied (see Fig. 5). Otherwise the area scanned is milled.
ii. If implantation or lithography is carried out with focused ion beams, the strategy is different. A low dose, typically 10^11–10^15 ions/cm^2, is delivered, but with high positional accuracy and often in a complex pattern. A laser-interferometrically controlled stage is used which reads its position to better than 20 nm. The wafer on the stage is registered with respect to the ion beam by imaging alignment marks as discussed above, and reading their location on the stage using the interferometer. Then beam blanking, beam deflection, and stage motion are computer controlled to deliver the desired dose to the desired locations. This writing is quite similar to the schemes used in e-beam lithography.
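The difference between the two writing strategies can be made concrete by relating dose to beam current, dwell time, and step size. The Python sketch below assumes singly charged ions and ignores blanking, settling, and stage-motion overhead; the beam currents, scan area, dwell time, and step size are illustrative assumptions rather than values from the text.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C

def dose_per_pass(beam_current_a, dwell_s, step_cm):
    """Ions/cm^2 delivered in one pass of a digital scan on a square grid of pitch step_cm."""
    return beam_current_a * dwell_s / (E_CHARGE * step_cm ** 2)

def time_for_dose(dose_ions_cm2, area_cm2, beam_current_a):
    """Beam-on time in seconds needed to deliver a dose over an area."""
    return dose_ions_cm2 * area_cm2 * E_CHARGE / beam_current_a

AREA = (10e-4) ** 2   # a 10 um x 10 um box, in cm^2 (hypothetical)

# Milling or deposition: heavy dose, assumed 100 pA beam
t_mill = time_for_dose(1e17, AREA, 100e-12)

# Implantation or lithography: light dose, assumed 10 pA beam
t_litho = time_for_dose(1e13, AREA, 10e-12)

# One pass with 1 us dwell on a 50 nm grid, 100 pA beam (assumed values)
per_pass = dose_per_pass(100e-12, 1e-6, 50e-7)
passes_for_mill = 1e17 / per_pass

print(f"milling dose 1e17 ions/cm^2: {t_mill:.0f} s beam-on time")
print(f"lithography dose 1e13 ions/cm^2: {t_litho:.2f} s beam-on time")
print(f"dose per pass: {per_pass:.2e} ions/cm^2, passes needed: {passes_for_mill:.0f}")
```

With these assumed numbers the milling job is dominated by beam-on time and requires thousands of repeated passes, whereas the lithography dose is delivered in a fraction of a second, so that positional accuracy and pattern complexity, rather than dose, govern the writing scheme.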
4.0 FOCUSED ION BEAM APPLICATIONS
The two types of applications, which use different writing strategies, also exploit different aspects of the ion-surface interaction (Fig. 2). Imaging is fundamental to both classes of applications. Sputtering and induced surface chemical reactions are used in repair, failure analysis, and circuit restructuring, while ion doping, damage, and bulk chemical change play a role in focused ion beam implantation and lithography (i.e., resist exposure).
4.1 Low Energy Ga Ion Beam Applications
The repair, failure analysis, and circuit restructuring machines are built with simple columns with maximum ion energy of about 50 keV using Ga+ liquid metal ion sources. No mass separation is needed and, as discussed above, the writing scheme is also relatively simple. The main processes needed are material removal, material addition, and imaging. Imaging, as discussed above, is needed so that the other two processes can be carried out in the desired location. Material removal can be achieved either by sputtering, also called ion milling, or by ion induced etching, also called ion assisted etching.
i. Ion Milling. The fundamental quantity that determines milling rate at a given ion current is sputter yield, i.e., the average number of substrate atoms (or molecules in the case of compound substrates) removed from the surface by each incident ion. The sputter yield of a number of materials is shown in Table 1; the yield for Kr+ ions is also included for comparison since Kr+ is the noble gas ion closest to Ga+ in mass. While the sputter yield would seem to be a straightforward quantity that is easily measured, this is not always the case, particularly with focused ion beams. Possible complications are: a. Due to ion channeling, the sputter yield depends on the orientation of the crystal axis with respect to the incident ion beam. Thus, in a polycrystalline sample, each grain may mill at a different rate and a milled surface may become very rough. This is illustrated[63] in Fig. 14 and may make the measurement of yield difficult for metal films, which are usually polycrystalline.
Figure 14. Gold film in which a pit has been FIB milled. The apparent roughening of the bottom of the pit is due to the fact that the individual grains mill at different rates depending on orientation.
b. The sputter yield depends on the angle of incidence. Thus a surface that has some topography may mill at a different rate than a flat surface. The increase in milling yield can be several-fold in going from normal toward grazing incidence as shown[63] in Fig. 15. c. The yield may depend on the focused ion beam scan rate. If the beam is scanned slowly and the thickness of material removed per scan is comparable to the beam diameter[64][65], then locally under the beam the angle of incidence will not be normal, as illustrated in Fig. 16. This effect may increase the observed sputter yield by as much as a factor of 2.[65]
Figure 15. Relative milling yield vs. angle of incidence. The solid lines represent the results from a Monte Carlo simulation (TRIM). The dotted lines serve only to guide the eye. Results for SiO2 were obtained by milling a quartz fiber with 35 keV Ga+ at a beam current of 100 pA and an average current density of 2.9 pA/µm2. Milling conditions for the rest: beam 25 keV Ga+, beam current 283 pA, dwell time 0.3 µs/pixel, average current density 7 pA/µm2 for W and Si, 2.9 pA/µm2 for Au.
d. Redeposition of the sputtered material may play a role. The sputtered atoms have been shown to be emitted from the surface in a cosine squared distribution,[66] i.e., with maximum normal to the surface and going to zero at grazing angles. Thus, if a substantial thickness is removed per scan, then redeposition may affect the milling rate and the measured sputter yield. In order to avoid these complications in either measuring the milling yield or in milling a sample to a desired predictable depth, the scan speed of the beam should be high enough so that the last three of these effects do not
come into play. In spite of these possible complications, focused ion beam milling is widely and successfully used as we will see later.
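The relation between sputter yield and milling rate can be written out explicitly: each ion removes Y atoms, and each atom occupies a volume M/(ρN_A). The Python sketch below applies this to silicon. The sputter yield of 2 atoms/ion is an assumed round number used only for illustration (the measured values are those of Table 1), the ions are assumed singly charged, and the complications listed above (channeling, angle of incidence, scan rate, redeposition) are ignored.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C
AVOGADRO = 6.02214076e23     # 1/mol

def milled_volume_per_charge(sputter_yield, molar_mass_g, density_g_cm3):
    """Volume removed per unit ion charge, in um^3 per nC."""
    atom_volume_um3 = molar_mass_g / (density_g_cm3 * AVOGADRO) * 1e12   # cm^3 to um^3
    ions_per_nc = 1e-9 / E_CHARGE
    return sputter_yield * ions_per_nc * atom_volume_um3

def mill_time_s(volume_um3, beam_current_na, volume_per_nc):
    """Beam-on time in seconds to remove a volume at a given beam current."""
    return volume_um3 / (beam_current_na * volume_per_nc)

ASSUMED_YIELD_SI = 2.0   # atoms/ion, hypothetical value for illustration
rate_si = milled_volume_per_charge(ASSUMED_YIELD_SI, 28.086, 2.33)

# A 5 um x 5 um x 1 um pit milled with a 1 nA beam (assumed job)
t_pit = mill_time_s(5 * 5 * 1, 1.0, rate_si)

print(f"Si removal rate: {rate_si:.2f} um^3/nC")
print(f"time for a 25 um^3 pit at 1 nA: {t_pit:.0f} s")
```

Since the removal rate is proportional to the yield, any of the effects that change the effective yield, such as non-normal incidence under a slowly scanned beam, changes the milled depth by the same factor.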
Figure 16. Illustration of how a slow beam scan will result in non-normal ion incidence and an increased milling rate.
ii. Ion Induced Etching. If a chemically active precursor gas such as Cl2, XeF2, or H2O is beamed onto the surface (as in Fig. 5) where the ion beam is incident, then a surface chemical reaction may be induced by the ion beam so that the removal rate is higher. For example, Si + 2Cl2 → SiCl4. One important condition for this to occur is that the reaction product is volatile (SiCl4 is a gas). Another condition is that the reaction not proceed spontaneously, i.e., where the ion beam is not incident. In most cases, the substrate removal rate is enhanced by factors of about 10 over ion milling alone. In addition, the undesirable variable milling rate of polycrystalline material may be minimized. Selectivity of one film on the surface relative to another is another highly desirable feature. Some sample etch rates and gases are given in Table 3.
Table 3. Ion Beam Assisted Etching

Substrate   Ion (Energy)   Gas Flux (or pressure   Etching Rate Enhancement             Reference
                           on substrate)           (rate with gas/rate without gas)
Si          65 keV (Ga)    Cl2 (4 mtorr)           20                                   [a]
InP         35 keV (Ga)    Cl2 (1.3 mtorr)         20–30 (140°)                         [a]
SiO2        50 keV (Ar)    XeF2 (20 mtorr)         100                                  [b]
W           20 keV (Ga)    XeF2 (2 torr)           15–75                                [c]
Al          -              Cl2                     5–10                                 [d]
PMMA        25 keV (Ga)    H2O (70 mtorr)          -                                    [e]
Diamond     25 keV (Ga)    H2O (70 mtorr)          10                                   [e]
Si          25 keV (Ga)    H2O (70 mtorr)          0.3                                  [e]
[a] Ochiai, Y., et al., J. Vac. Sci. Technol. B, 5:423 (1987)
[b] Xu, Z., Gamo, K., and Namba, S., J. Vac. Sci. Technol. B, 6:1039 (1988)
[c] Kola, R. R., Celler, G. K., and Harriott, L. R., MRS Symposium, (M. A. Nastasi, L. R. Harriott, N. Herbots, and R. S. Averback, eds.), 279:593 (1993)
[d] Casey, J. D., Doyle, A. F., Lee, R. G., and Stewart, D. K., Microelectronic Engineering, 24:43 (1994)
[e] Stark, T. J., Shedd, G. M., Vitarelli, J., Griffis, D. P., and Russell, P. E., J. Vac. Sci. Technol. B, 13:2565 (1995)
iii. Ion Induced Deposition. Many of the applications, such as mask repair or integrated circuit rewiring, require material to be deposited. Ion induced deposition is complementary to ion induced etching.[16] Deposition is achieved by creating a flux of a precursor gas, such as an organometallic or W(CO)6, on the surface where the ion beam is incident. A fine nozzle such as a hypodermic needle is pointed at the surface and creates a local gas pressure in the mtorr range;[67] in the rest of the chamber, the pressure is 2–4 orders of magnitude lower, as needed for ion source operation and beam propagation. The focused ion beam dissociates the gas molecules adsorbed on the surface, creating a deposit. Examples of deposited materials are shown in Table 4. Because the precursor gases used to deposit metals contain carbon as a constituent, the deposits contain carbon as an impurity. Gases that do not contain carbon, such as WF6, have been used to deposit fairly pure tungsten with Ar+ ions,[68] but deposition with Ga+ focused ion beams has not been reported. In the case of gold and copper depositions (Table 4), the carbon content can be greatly reduced if the substrate is heated to about 100°C during the deposition. Presumably, this is due to the fact that the carbon bearing reaction products of the precursor gas dissociation can more easily desorb at higher temperatures. However, even though the resistivity of the “metal” films in most cases is one to two orders of magnitude higher than that of pure metal, the deposits are still useful for making connections in integrated circuit repair. Lower resistance can be achieved by increasing the thickness of the deposited material and, in addition, the connections generally are not made over large distances.
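The remark that a high-resistivity deposit can still make a usable connection follows from R = ρL/(wt) for a short, thick line. The Python sketch below evaluates this for a resistivity of 200 µΩ cm, which lies within the range listed in Table 4 for tungsten deposited from W(CO)6. The line dimensions are assumptions chosen for illustration, and a uniform rectangular cross section is assumed.

```python
def line_resistance_ohm(resistivity_uohm_cm, length_um, width_um, thickness_um):
    """R = rho * L / (w * t) for a uniform rectangular conductor."""
    rho_ohm_um = resistivity_uohm_cm * 1e-6 * 1e4   # micro-ohm cm to ohm um
    return rho_ohm_um * length_um / (width_um * thickness_um)

RHO_W_DEPOSIT = 200.0   # micro-ohm cm, within the 150-225 range of Table 4

# Hypothetical repair strap: 20 um long, 0.5 um wide
for thickness in (0.5, 1.0, 2.0):
    r = line_resistance_ohm(RHO_W_DEPOSIT, 20.0, 0.5, thickness)
    print(f"thickness {thickness:.1f} um: R = {r:.0f} ohm")
```

Doubling the deposited thickness halves the resistance, and because repair connections are typically only tens of micrometers long, the total resistance stays modest even though the resistivity is far above that of the pure metal.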
Table 4. Ion Induced Deposition

Gas            Ion, Energy            “Yield”        Deposit Composition          Resistivity                 Reference
                                      (atoms/ion)                                 (µΩ cm)
WF6            Ar+ 500 eV & 2 keV     -              W:F:C 93.3:4.4:2.3           15                          [a][b]
W(CO)6         Ga+ 25 keV             2              W:C:Ga:O 75:10:10:5          150–225                     [c]
C7H7F6O2Au     Ga+ 40 keV (room T)    3–8            Au:C:Ga 50:35:15             500–1500 (Bulk Au = 2.44)   [d]
C7H7F6O2Au     Ga+ 40 keV (100°C)     3              Au:C:Ga 80:10:10             3–10                        [e]
C9H17Pt        Ga+ 35 keV             0.2–30         Pt:C:Ga:O 45:24:28:3,        70–700 (Bulk Pt = 10.4)     [f]
                                                     25:55:19:2
(CH3)3NAlH3    Ga+ 20 keV             4–6            Al:Ga:C:N                    900                         [g]
Cu(hfac)TMVS   Ga+ 25–35 keV          10–30          Cu:C 60:50 (25°C),           100, 5                      [h]
                                                     95:5 (100°C)
TMOS + O2      Si 60 kV               1 mol/ion      SiOx                         2.5 × 10^6 Ω cm             [i]
Table 4. (Cont’d.)

Gas: TEOS; PMCPS; OMCTS + O2
Ion, Energy: Ga 30 kV; Ga 30 kV; Ga 50 kV
Deposit Composition: SiOx; Si:O:Ga 27:56:17
Resistivity: 10 Ω cm*; 8 × 10¹¹ Ω cm*; 1.2 × Ω cm*
Reference: [j]; [k] [l]

*The resistivity of the FIB deposited oxides is nonlinear and depends on the applied electric field. The values quoted should be viewed as very approximate and cannot be directly compared since they were not measured at the same applied electric field.

[a] Xu, Z., Kosugi, T., Gamo, K., and Namba, S., J. Vac. Sci. Technol. B, 7:1959 (1989)
[b] Gamo, K., and Namba, S., Proc. 1989 Intern. MicroProcess Conf., p. 293 (1989)
[c] Stewart, D. K., Stern, L. A., and Morgan, J. C., SPIE, 1089:18 (1989)
[d] Blauner, P. G., Ro, J. S., Butt, Y., and Melngailis, J., J. Vac. Sci. Technol. B, 7:609 (1989)
[e] Blauner, P. G., Butt, Y., Ro, J. S., Thompson, C. V., and Melngailis, J., J. Vac. Sci. Technol. B, 7:1816 (1989)
[f] Tao, T., Melngailis, J., Xue, Z., and Kaesz, H. D., J. Vac. Sci. Technol. B, 8:1826 (1990)
[g] Gross, M. E., Harriott, L. R., and Opila, R. L., J. Appl. Phys., 68:4820 (1990)
[h] Della Ratta, A., Melngailis, J., and Thompson, C. V., J. Vac. Sci. Technol. B, 11:2196 (1993)
[i] Komano, H., Ogawa, Y., and Takigawa, T., Japan J. Appl. Phys., 28:2372 (1989)
[j] Campbell, A. N., et al., 23rd International Symposium for Testing and Failure Analysis (Oct. 27–31, 1997)
[k] Young, R. J., and Puretz, J., J. Vac. Sci. Technol. B, 13:2579 (1995)
[l] Edinger, K., Melngailis, J., and Orloff, J., EIPBN Symposium, May 1998, J. Vac. Sci. Technol. B, 16 (Nov/Dec. 1998)
Some applications in circuit rewiring require an insulator to be deposited. Accordingly, deposition of SiO2 has been developed with acceptable oxide quality,[69][70] i.e., resistivities in the range of 10⁹ to 10¹² Ω cm. Although these oxide deposits are much poorer than pure SiO2, which has a resistivity of ~10¹⁵ Ω cm, they are still useful for permitting one conductor to cross over another, for example, or for a via to be “lined” so that it can penetrate a conducting layer to make contact to a lower level.
The successful development of these focused ion beam material removal and material deposition techniques has led to a number of practical applications.
Mask Repair. The earliest practical application of focused ion beams, conceived in the mid-1980’s, was the repair of photomasks.[71] Already, masks were sufficiently complicated that the probability of making a mask without a defect was low. Since the masks were also expensive, a repair process was practical. In the repair of a photomask, an unwanted absorber, such as chrome, is milled off, while a missing absorber is replaced by ion induced deposition of a suitable metal, or even carbon. In the removal process, some implantation of the quartz mask plate by Ga+ ions is inevitable. This leads to an undesirable reduction in the transmission of the UV radiation used for exposure. This staining can be minimized or avoided by using a reactive gas to increase the etch rate, thus reducing the Ga ion dose needed to remove material and reducing the density of Ga ions implanted. Alternatively, the entire mask can be reactive-ion-etched using a gas that selectively etches quartz and not Cr (such as CHF3). Only a small amount of quartz needs to be removed since the penetration of the Ga ions is shallow. These staining issues may become even more critical if phase shift masks need to be repaired.
In phase shift masks, there may be areas that are transparent but that shift the phase of the transmitted light (by π, for example) relative to neighboring areas. Moreover, these phase shifted areas have submicron tolerances in their dimensions. The repair of phase shift masks is a developing area, but for conventional opaque/clear masks, focused ion beam repair is well established and all major mask making facilities use these repair tools.
Circuit Rewiring. In the prototyping phase, an integrated circuit may be designed and built which does not work. Unlike an old-fashioned circuit, which is an assembly of discrete components wired together, an integrated circuit cannot be probed easily in its interior. So the designers have the problem of attempting to figure out what is wrong and then redesigning and refabricating the circuit. In the early days of IC’s, this iterative process may have needed to be repeated several times before a satisfactory working circuit was achieved.
The focused ion beam can be used in several ways to greatly shorten the process. Via holes can be milled through the oxide passivation layer that covers the conductors of an IC to make an opening to a specific conductor. Then a contact probe pad can be deposited on top of the passivation so that the particular point in the circuit can be probed or a test signal applied. If a wiring defect is uncovered, the focused ion beam can be used to cut conductors by ion milling a trench through the particular metal film conductor. In addition, if two conductors need to be connected, vias can be milled down to them and a metal “jumper” can be deposited by ion induced deposition to create a conducting path between them. This is illustrated schematically in Fig. 17(a) and on an actual circuit in Fig. 17(b).
Figure 17. (a) Schematic of a rewiring process where vias are first milled down to the two underlying metal “wires.” Using FIB induced deposition the wires are connected. (b) An example of a cut in a wire and a deposited connector about 2.5 µm wide (from FEI Co.).
Failure Analysis. The focused ion beam provides a way of looking below the surface of microfabricated structures that is only locally destructive. For example, when an open contact or a non-functioning transistor is identified in an integrated circuit, it can be examined and analyzed using focused ion beam sectioning. Using the scanning ion microscope mode and the circuit layout information, the faulty site is positioned under the focused ion beam. A pit is milled into the circuit with the defect site as one wall. The pit has to be deep enough and large enough so that the defect can be seen in a tilted sample either with the ion beam or with an SEM. (In some instruments, the SEM is built into the same chamber as the FIB and one can section and “look” at the same time.) The pit is usually milled with a coarse beam of a few nA current. Then the current is reduced to achieve a finer focus and closely controlled sectioning near the defect [see Fig. 18(a)]. One can, of course, make successive cuts through the defect to see different cross sections. An example of a sectioned device is shown in Fig. 18 (b).
Figure 18. A cross section of a part of an integrated circuit made by milling along two axes to form a corner (from FEI Corp.).
How long does it take to mill out the material? Assume the material is Si and the milling yield is 3 atoms/ion (Table 1). Si has a density of 5 × 10¹⁰ atoms/µm³. So, 1.7 × 10¹⁰ ions are needed to remove 1 µm³. A 5 nA beam delivers 3.1 × 10¹⁰ ions/sec and will remove 1 µm³ of Si in 0.53 sec. Thus, a sloping pit 10 µm × 50 µm which is 10 µm deep at the sectioned end would take 22 minutes to mill in Si. If a reactive gas is used, such as Cl2, the Si removal rate can be increased by a factor of up to 20 (see Table 3) and the time to mill the pit is reduced to 1 minute. Since a device is typically made up of slices of different materials, such as SiO2, Al, and Si, different reactive gases would, in principle, be needed to optimize the material removal rate. With or without gas enhancement, the removal rate in most cases is reasonable, since the information gained is valuable. Alternate methods, such as cutting with a diamond saw and sectioning the entire device, are far more difficult, and in most cases would not be able to guarantee that the cross section will intersect the defect.
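The milling-time estimate above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, redeposition and scan overhead are ignored, and the sloping pit is assumed to be a triangular wedge (half of the 10 µm × 50 µm × 10 µm box), which is the reading that matches the 22 minute figure.

```python
E_CHARGE = 1.602e-19  # elementary charge in coulombs

def ions_per_second(beam_current_a, charge_state=1):
    """Number of ions delivered per second by a beam of the given current."""
    return beam_current_a / (charge_state * E_CHARGE)

def milling_time_s(volume_um3, atoms_per_um3, yield_atoms_per_ion, beam_current_a):
    """Time to sputter a volume, ignoring redeposition and scan overhead."""
    ions_needed = volume_um3 * atoms_per_um3 / yield_atoms_per_ion
    return ions_needed / ions_per_second(beam_current_a)

SI_DENSITY = 5e10     # atoms per cubic micrometer, value quoted in the text
SI_YIELD = 3.0        # atoms per ion, value quoted in the text
BEAM_CURRENT = 5e-9   # amperes (5 nA)

# Assumption: the sloping pit is a wedge, i.e. half of a 10 x 50 x 10 um box.
pit_volume_um3 = 0.5 * 10.0 * 50.0 * 10.0

time_per_um3 = milling_time_s(1.0, SI_DENSITY, SI_YIELD, BEAM_CURRENT)
pit_time_min = milling_time_s(pit_volume_um3, SI_DENSITY, SI_YIELD, BEAM_CURRENT) / 60.0
enhanced_time_min = pit_time_min / 20.0  # factor-of-20 enhancement quoted for Cl2

print(f"{time_per_um3:.2f} s per cubic micrometer")
print(f"{pit_time_min:.0f} min without gas, {enhanced_time_min:.1f} min with Cl2")
```

With these inputs the sketch gives about 0.53 s per cubic micrometer, about 22 minutes for the pit, and roughly 1 minute with gas enhancement, in line with the figures in the text.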
Other Applications of the Ga+ FIB Systems. Aside from the main applications outlined above, any microfabrication task where a small amount of material needs to be removed or added at submicrometer dimensions can be performed with a simple FIB tool. Some examples follow:
i. Secondary Ion Mass Spectroscopy. SIMS is a well known technique for material analysis. A surface is sputtered and the sputtered ions are analyzed in a mass spectrometer, thereby identifying and measuring the constituents of the substrate. If this is done with a FIB, the material identification can be carried out with submicrometer resolution.[51]
ii. TEM Sample Preparation. To examine a sample in cross section with transmission electron microscopy (TEM), a thin sliver (less than 100 nm thick) has to be prepared. The conventional technique is to pot the sample, cut it with a saw, then polish, and finally mill with a broad ion beam. With an FIB, one can simply cut pits on both sides of the section to be examined. This is similar to the technique shown in Fig. 18(a), except that two pits are milled back-to-back. The sliver left standing between the pits can be cut loose around the edges, lifted out with a drawn Pyrex probe, and placed on a TEM grid.[72]
iii. Tunneling Microscopy Tips. The FIB can be used to sharpen or otherwise shape tips used in various forms of tunneling microscopy. It can even be used to grow fine needles by ion induced deposition. Since the tips have a three-dimensional character, the shaping is more challenging than milling a flat surface.[73]
iv. Sectioning Resist. With the shrinking of dimensions in IC’s, resist profiles have higher aspect ratios and the sidewall angles have become more critical. Using FIB etching with water vapor,[20] the resist can be rapidly and preferentially sectioned so that sidewalls can be better examined with SEM’s. This is particularly true of vias.
v. Trimming of Magnetic Disk Drive Read/Write Heads. The part of the head that faces the magnetic data storage disk consists of two poles. The dimensions of these poles in part determine the data storage density on the disk. Although much of the fabrication is done with conventional lithography, the final dimensions on this three-dimensional structure are hard to control. The FIB is uniquely suited to trim the dimensions of these heads with high (~10 nm) precision.
vi. Examining Crystal Structure. The penetration of ions into a surface depends on crystal orientation and, in fact, so does the number of secondary electrons emitted due to the ion impact. Thus, in a polycrystalline film, each grain will have a different contrast in the ion imaging mode. This is particularly useful for examining the crystal structure in FIB cross-sectioned vias.[74]
One can imagine many other applications of this unique material micromanipulation technique, particularly with the increased interest in MEMS (micro-electro-mechanical systems) and other structures that extend the chip-making technology into areas like sensors and actuators.
4.2
Applications of the High-Voltage Mass-Separated FIB Systems
As discussed above (see Sec. 3.4), these systems are configured more like e-beam lithography systems. They have sophisticated beam deflection and pattern generation software coupled with precision stage motion. The main applications of these systems have been in research, for example, direct maskless implantation of semiconductor devices, and resist exposure (i.e., lithography).
Direct Implantation. Focused ion beam implantation provides unprecedented flexibility to fabricate semiconductor devices. The dose of ions can be controlled, point-by-point, with 50 nm resolution, in a maskless, resistless process. This permits devices to be made side by side, each with different implant doses, or it permits lateral gradients of doping to be generated within a device. In addition, when combined with local ion induced deposition, we can envisage devices made entirely by FIB’s with no lithography. This would permit simple circuits to be built literally on the side of a needle or on other non-planar geometries. The machines used for implantation in general use alloy sources such as Pd/As/B or Au/Si/Be, instead of the Ga used for all of the practical applications discussed above. Thus the systems are more complicated and have to include a mass filter as shown in Fig. 6.
So far, the applications of implantation have been demonstrated as part of conventional fabrication by substituting FIB implantation for broad area, lithographically-defined implantation. Some examples are: a fast 1 GHz flash analog-to-digital converter made by implanting thirty-two transistors (each with a different dose)[75] to yield different threshold voltages;[76] transistors with doping gradients in the channel;[77][78] tunable Gunn diodes;[79] faster CCD’s;[80] and transistors made by lateral confinement of a 2D electron gas in modulation doped GaAs/AlGaAs.[81] So far, these are research results. They have not found their way into production because the FIB implantation is slow, and the alloy sources are hard to use and not as reliable as the Ga+ sources. Additionally, in some cases, once the advantage of a certain geometry is demonstrated by FIB implants, ways to achieve the same result with modifications of conventional processes have been found. Nevertheless, many FIB implanted devices are unique, and the fabrication time is not prohibitive as long as the number of special transistors on a chip is small (i.e., hundreds, not millions). With a 30 pA beam, a dose of 10¹² ions/cm², typical of transistor and channel implants, can be delivered over 1 mm² in 50 sec. In addition, the ability to fabricate devices on unique geometries has not been explored. The need to do this may increase as the development of MEMS progresses.
Lithography with Ions. Ions can be and have been used for lithography, i.e., resist exposure, by all three of the methods shown in Fig. 1, namely by focused (point) ion beams, by proximity masked shadow printing lithography, and by ion projection lithography. The unique advantages of ions for resist exposure are largely associated with the mechanism of energy loss of ions. The penetration depth of ions in the energy range of interest (~100 keV) is comparable to the resist thickness, as shown in Table 5.
In addition, light ions lose the overwhelming percentage of their energy to the electrons in the resist (the so-called electronic stopping power), and relatively little to scattering by nuclei (the so-called nuclear stopping power). This means that the tracks of ions in resist are relatively straight. This straightness has a number of important consequences.
i. High Resolution. The minimum linewidth exposed with ions is of the order of 15 nm. Where an ion passes through resist, it is thought to “expose” a cylinder of about 10 nm diameter. Although there is some evidence to suggest this,[82][83] these cylinders have not been clearly observed. They would represent an ultimate limit of about 10 nm on the minimum linewidth that can be exposed.
ii. No Proximity Effect. The fact that the ions deposit energy in a narrow cylinder surrounding their paths into the resist means that the exposure is strictly localized. This is not the case with electrons. High energy electrons scatter widely as they penetrate a solid and, therefore, the exposure of a given feature is influenced by the presence of neighboring features (see Fig. 19).
Table 5. Ion Range in PMMA Resist

Ion | Energy (keV) | Range (µm) | Reference
H+ | 40, 120, 240 | 0.52, 1.12, 1.85 | [a]
He+ | 40, 120, 240 | 0.44, 0.96, 1.54 | [a]
Be | 100, 200 | 0.45, 0.72, 1.2 | [b] [c][d]
Si | 200 | 0.65 | [c][d]
Ga | 100 | 0.074 | [e]

Note: The range will be the same for most other organic resists since they have the same density, ~1.2 g/cm³.
[a] Ryssel, H., et al., J. Vac. Sci. Technol., 19:1358 (1981)
[b] Matsui, S., et al., J. Vac. Sci. Technol. B, 5:853 (1987)
[c] Matsui, S., et al., J. Vac. Sci. Technol. B, 9:2622 (1991)
[d] Huh, J. S., et al., J. Vac. Sci. Technol. B, 9:173 (1991)
[e] Matsui, S., et al., J. Vac. Sci. Technol. B, 4:845 (1986)
Figure 19. Monte Carlo simulation of ion paths in resist compared to electron paths. (From G. Cho, et al., Kwangwoon University, Seoul, Korea.)
iii. Fast Exposure. Since the ions deposit most of their energy in the resist, the exposure process is effective. The dose of ions needed is usually in the 10¹²–10¹³ ions/cm² (0.16–1.6 µC/cm²) range. Moreover, the minimum dose needed to expose resist, when measured over a range of ion energies and ion species, correlates with the electronic stopping power rather than the nuclear stopping power. Thus, light ions are likely to be as effective in exposing resist as heavier ions, and they have the added advantage of penetrating in straight lines.
iv. Good Exposure Latitude. The absence of proximity effect, and the fact that ions can be focused fairly tightly into the feature that needs to be exposed with little leakage outside the feature, mean that the feature dimensions will not have a strong dependence on the dose.
v. Low Energy Input to the Substrate. Again, because the ions deliver most of their energy to the resist, minimum energy is delivered to the substrate. This is not the case with high energy electrons, which penetrate relatively deep into the substrate and lose only a small fraction of their energy in exposing the resist. Therefore, heating of the resist is a consideration in e-beam lithography and may ultimately limit the exposure rate.[84][85]
The nature of ion exposure of resist leads to some potential concerns for ion lithography:
i. Shot Noise. The fact that resist is so sensitive to ions means that only a finite number of ions expose a given area. For example, if the minimum dose for exposure is 6 × 10¹² ions/cm² (1 µC/cm²), then an area of 50 nm × 50 nm will be exposed by 150 ions. This number can fluctuate statistically by (150)^1/2, i.e., about 12 ions. If the minimum dose to expose resist is 10¹² ions/cm², then this 50 × 50 nm pixel will receive only 25 ions. The probability that the pixel receives considerably fewer is finite and would lead to unexposed pixels. Somewhat contrary to this statistical argument, well formed lines have been exposed with as few as 25–35 ions/pixel.[86][87] Shot noise should not affect exposure as long as the resist is not too sensitive, and finding less sensitive resists is easy.
ii. Damage to the Substrate. Since energetic ions in principle carry enough energy to displace crystal lattice atoms, damage to the substrate needs to be considered. As seen from Table 5, the range of ions in resist can be chosen to be close to the resist thickness. Ion penetration into the substrate can be limited to the immediate surface. In many applications, such as gate definition in Si IC’s, the resist is over insensitive layers such as polysilicon or oxide, and the sensitive gate oxide and transistor channel region are not irradiated. And even if the channel is
irradiated, the damage anneals out during sintering (450°C for 30 min). In GaAs MESFET’s, however, the resist is generally directly over the channel and more care would need to be taken to avoid or mitigate the effects of damage (for example, a sacrificial film between the resist and the substrate; an etch of the GaAs to remove some material before the gate metal is deposited, which is often done anyway; or an appropriate anneal).
iii. Finite Ion Velocity. This is a concern only for focused ion beam writing and may limit how fast a beam can be deflected or blanked. The blanking or deflection electrodes in a system usually extend about 1 cm along the direction of travel, and the time that the ion takes to traverse this 1 cm limits the beam manipulation speed. For example, a 100 keV proton travels 1 cm in 2.28 nsec, while a 30 keV Ga+ ion takes 34.7 nsec. An analysis of the resist sensitivities and the beam currents available leads to the conclusion that for light ions, the finite velocity is not the limiting factor at this time.
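The transit times quoted for the finite ion velocity follow from the non-relativistic relation v = (2E/m)^1/2. The sketch below is a minimal check under stated assumptions: the function name is hypothetical, the ion masses (1.00728 u for the proton, 69.723 u for Ga) are standard values not given in the text, and the 1 cm electrode length is the figure from the text.

```python
import math

E_CHARGE = 1.602176634e-19   # joules per electron volt
AMU = 1.66053906660e-27      # kilograms per atomic mass unit

def transit_time_ns(energy_ev, mass_amu, length_m=0.01):
    """Time for a non-relativistic ion of the given kinetic energy to travel length_m."""
    velocity = math.sqrt(2.0 * energy_ev * E_CHARGE / (mass_amu * AMU))
    return length_m / velocity * 1e9

proton_ns = transit_time_ns(100e3, 1.00728)   # 100 keV proton
gallium_ns = transit_time_ns(30e3, 69.723)    # 30 keV Ga+ ion

print(f"100 keV proton: {proton_ns:.2f} ns per cm")
print(f"30 keV Ga+:     {gallium_ns:.1f} ns per cm")
```

The results agree with the 2.28 nsec and 34.7 nsec figures above; the heavy Ga+ ion is about fifteen times slower than the proton.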
The considerations concerning ion lithography, so far, in general apply to all the ways of generating a patterned dose of ions as shown in Fig. 1.
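The dose conversions and the shot-noise estimate in the list of concerns above can be sketched as follows. This is an illustration, not the authors' calculation: the function names are hypothetical, singly charged ions are assumed, and the number of ions per pixel is assumed to follow Poisson statistics, which is the usual model behind the (N)^1/2 fluctuation quoted in the text.

```python
import math

E_CHARGE = 1.602e-19  # elementary charge in coulombs

def dose_to_uc_per_cm2(dose_ions_per_cm2, charge_state=1):
    """Convert an ion dose (ions/cm^2) to a charge dose (microcoulombs/cm^2)."""
    return dose_ions_per_cm2 * charge_state * E_CHARGE * 1e6

def mean_ions_per_pixel(dose_ions_per_cm2, pixel_side_nm):
    """Average number of ions landing in a square pixel of the given side."""
    side_cm = pixel_side_nm * 1e-7
    return dose_ions_per_cm2 * side_cm ** 2

def poisson_cdf(k, mean):
    """P(N <= k) for a Poisson-distributed ion count with the given mean."""
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total

for dose in (6e12, 1e12):
    mean = mean_ions_per_pixel(dose, 50.0)
    sigma = math.sqrt(mean)
    # Probability that the pixel receives no more than half the intended dose.
    p_half = poisson_cdf(int(mean // 2), mean)
    print(f"dose {dose:.0e} ions/cm2 = {dose_to_uc_per_cm2(dose):.2f} uC/cm2: "
          f"{mean:.0f} ions/pixel, sigma {sigma:.1f}, P(<= half) = {p_half:.1e}")
```

The means of 150 and 25 ions per 50 nm pixel match the text. The last column shows how quickly the chance of a badly underexposed pixel grows as the dose is lowered.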
5.0
FOCUSED ION BEAM LITHOGRAPHY
Electron beam lithography and focused ion beam lithography are the only means for generating an original pattern at submicrometer dimensions. The other ion lithography techniques, i.e., masked ion beam lithography and ion projection lithography can only replicate an existing pattern. Focused ion beam lithography has been used to write some impressive patterns, such as 15 nm wide features in 60 nm thick PMMA written by 50 keV Ga ions,[86] 50 nm features written in 300 nm thick PMMA,[88] (see Fig. 20), and 100 nm wide features in 600 nm thick SAL 601 resist.[89] Comparable features can also be written by e-beam lithography. Electron beam lithography has been developed far more extensively. Thus, in spite of the advantages of FIB lithography, such as no proximity effect, and negligible heating, e-beam lithography has remained dominant. One other reason for this is that, so far, the only really stable
and reliable ion source is the Ga liquid metal source. The alloy sources that emit the light ions such as Be and B are usable in the laboratory but are not mature enough for production use. If a stable gas field ion source is developed, then FIB lithography will be much faster than point (Gaussian) beam e-beam lithography. This is shown in Table 6.
Figure 20. Focused ion beam exposure of PMMA and electroplating of features used to fabricate x-ray lithography masks. First, PMMA is exposed with 280 keV Be2+ ions at 1 × 10¹³ ions/cm² and developed. Then, gold is plated up from a plating base (a thin film of gold under the PMMA). Finally, the PMMA is dissolved, resulting in the features shown in the photo. 150 sites were exposed with pairs of lines 40 µm long. (Schematic labels: Ion Beam; PMMA; Gold; Develop & Plate 0.3 µm; Dissolve PMMA.) (From Ref. 88, Chu.)
Table 6. Electron Beam vs. Focused Ion Beam Writing Time for Ultrafine Features

 | e-beam[a] | e-beam[b][c] | Ion beam[d] | Ion beam (projected)[e] | Ion beam (projected)[f][g]
Source | Field emission e-gun 25 kV | Field emission 100 kV | Be2+ at 280 keV | Ga+ 30 keV | H2+ 100 keV
Features written | 75 nm lines | 50 nm lines | 75 nm lines | 75 nm lines | 75 nm lines
Beam current | 1.17 nA | 5 nA | 20 pA | 176 pA | 3.6 nA
Resist sensitivity | 150 µC/cm² (9 × 10¹⁴ e/cm²) (PMMA 0.1 µm thick) | 600 µC/cm² (PMMA 0.75 µm thick); 12 µC/cm² (negative chemically amplified) | 1 × 10¹³ ions/cm² (PMMA 0.3 µm thick over thin Si membrane) | 8 × 10¹¹ (positive SAL 601 silylation) | 1.5 × 10¹² (negative resist SAL 601); 1.5 × 10¹³/cm² (positive resist PMMA)
Writing time | 1300 s/mm² PMMA | 1200 s/mm², or 750 s/mm² if scaled to 75 nm features; 24 s/mm² for negative resist | 1700 s/mm² | 7.6 s/mm² | 0.69 s/mm² for negative resist; 6.9 s/mm² for PMMA

Note: In the case of ion beams, the minimum linewidth is written with a single pass of the beam. This has been shown to yield well formed lines. The linewidth can be adjusted by varying the dose. In the case of electrons, the 75 nm minimum linewidth features are written with five passes of a 15 nm diameter beam or three passes of a 28 nm diameter beam in order to have good control of the linewidth. If multiple passes are used with ion beams, the writing time will be increased, e.g., using two passes will increase the writing time by about a factor of 4.
[a] Gesley, M. A., Hohn, F. J., Viswanathan, R. G., and Wilson, A. D., J. Vac. Sci. Technol. B, 6:2014 (1988)
[b] McCord, M. A., Viswanathan, R. G., Hohn, F. J., and Wilson, A. D., EIPBN Conf., 1992, to be published, J. Vac. Sci. Technol. B, 2754 (Nov./Dec. 1992)
[c] McCord, M. A., private communication.
[d] Chu, W., Yen, A., Ismail, K., Shepard, M. I., Lezec, H. J., Musil, C. R., Melngailis, J., Ku, Y. C., Carter, J. M., and Smith, H. I., J. Vac. Sci. Technol. B, 7:1583 (1989)
[e] Hartney, M. A., Shaver, D. C., Shepard, M. I., Melngailis, J., Medvedev, V., and Robinson, W. P., J. Vac. Sci. Technol. B, 9:3432 (1991)
[f] Wilbertz, C., Maisch, T., Hüttner, D., Böhringer, K., Jousten, K., and Kalbitzer, S., Nucl. Instr. and Meth. B, 63:120 (1992)
[g] Kalbitzer, S., Max Planck Inst. für Kernphysik, private communication.
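The writing times in Table 6 appear to follow from a simple relation: time per unit area equals the required charge per unit area divided by the beam current. The sketch below recomputes them as a plausibility check. It is not the authors' calculation: the function names are hypothetical, the beam currents are assumed to be electrical currents (so a Be2+ ion carries two elementary charges), and the pairing of currents, doses, and sources is my reading of the table. The same relation is also applied to the direct-implantation example of Sec. 4.2 (30 pA, 10¹² ions/cm², 1 mm²).

```python
E_CHARGE = 1.602e-19   # elementary charge in coulombs
MM2_IN_CM2 = 0.01      # one square millimeter expressed in square centimeters

def time_from_ion_dose(dose_ions_per_cm2, beam_current_a, charge_state=1):
    """Seconds needed to deliver an ion dose over 1 mm^2."""
    charge_c = dose_ions_per_cm2 * MM2_IN_CM2 * charge_state * E_CHARGE
    return charge_c / beam_current_a

def time_from_charge_dose(dose_uc_per_cm2, beam_current_a):
    """Seconds needed to deliver a charge dose (uC/cm^2) over 1 mm^2."""
    return dose_uc_per_cm2 * 1e-6 * MM2_IN_CM2 / beam_current_a

# (label, computed s/mm^2, value quoted in the text or table)
estimates = [
    ("e-beam 25 kV, PMMA", time_from_charge_dose(150, 1.17e-9), 1300),
    ("e-beam 100 kV, PMMA", time_from_charge_dose(600, 5e-9), 1200),
    ("e-beam 100 kV, negative resist", time_from_charge_dose(12, 5e-9), 24),
    ("Be2+ 280 keV, PMMA", time_from_ion_dose(1e13, 20e-12, charge_state=2), 1700),
    ("Ga+ 30 keV, SAL 601", time_from_ion_dose(8e11, 176e-12), 7.6),
    ("H2+ 100 keV, SAL 601", time_from_ion_dose(1.5e12, 3.6e-9), 0.69),
    ("H2+ 100 keV, PMMA", time_from_ion_dose(1.5e13, 3.6e-9), 6.9),
    ("30 pA implant, 1e12 ions/cm2", time_from_ion_dose(1e12, 30e-12), 50),
]

for label, computed, quoted in estimates:
    print(f"{label:32s} computed {computed:8.2f} s/mm2   quoted {quoted}")
```

Under these assumptions every computed value lands within about 10% of the quoted one, which suggests the table entries were obtained from dose and current in this way.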
6.0
MASKED ION BEAM LITHOGRAPHY
A patterned ion dose can be delivered to a surface by using a stencil mask in close proximity to the sample. This is similar in principle to so-called contact optical lithography, and also to x-ray lithography. Masked ion beam lithography is illustrated in Fig. 21. Usually protons are used in the 100 keV range. (Summaries of the earlier work have been published.[2][90]) The dose required to expose most organic resists is in the range of 5 × 10¹¹ to 3 × 10¹³ ions/cm². The source of protons is also easily available. Almost any implanter is capable of producing the desired exposure over, say, 1 cm² in well under a second.
6.1
The Mask
The main difficulty in masked ion beam lithography has been the mask. Most materials, including the least dense, are highly absorbing to protons and also have the effect of scattering an incident collimated beam. While for photons one can speak of attenuation by materials (i.e., decrease of intensity or number of photons as a function of depth in a material), for ions the situation is more complicated. A collimated beam of ions passing through a thin film will emerge diminished both in number and in energy and to some extent uncollimated (see Table 7). The difference in the energy loss rate and range for protons in various materials is seen to be barely over a factor of two. For photons (x-ray or UV), the difference in attenuation can be several orders of magnitude. Thus, the design of a mask for ion beam lithography, which ideally should be made of alternate areas of absorbing and transmitting material, presents a special challenge.
Two classes of masks have been considered: the open stencil mask and the Si channeling mask. The open stencil mask has the advantage of not affecting transmitted ions. However, certain geometries are forbidden. The center will fall out of an “O,” and very dense, open areas are likely to be difficult to make due to distortion caused by stress relief. Membranes with stress down to 10 MPa and below have been made, and the stress induced distortion has been limited to 0.15 µm over a 20 mm × 20 mm area in specially compensated masks.[91]
A recently reported practical application of masked ion beam lithography is the fabrication of infrared filters. A surface that is covered with metal (in this case gold) crosses of 500 nm arm length and 50 nm arm width acts as a bandpass filter at 1.5 µm wavelength and reflects longer wavelengths.[92] The crosses are shown in Fig. 22. The advantage of masked ion beam lithography in this case is cost. Features of 50 nm can be made over relatively large areas at reasonable rates (20 sec/cm²).[92] To write such features with e-beam lithography would take a prohibitive amount of time. The original mask is, of course, written with e-beam lithography.
Figure 21. Schematic of masked ion beam lithography, showing the mask consisting of a support membrane, transparent to protons, and absorber regions.
Table 7. Range and Energy Loss of Protons in Matter for Two Incident Energies

Material | Range (µm) at 100 keV | Range (µm) at 200 keV | Initial energy loss rate (keV/µm) at 100 keV | Initial energy loss rate (keV/µm) at 200 keV
Resist (KTFR) | 0.88 | 1.4 | 164 | 206
Au | 0.40 | 0.79 | 203 | 206
Si | 1.20 | 2.24 | 98.1 | 87.6
SiO2 | 1.04 | 2.19 | 109 | 96.7

Reference: Smith, B., Ion Implantation Range Data for Silicon and Germanium Device Technologies, Research Studies Press, Inc. (1977)
Figure 22. Photomicrographs of (a) the electron beam lithography fabricated infrared filter crosses; (b) the same fabricated by masked ion beam lithography. The overall width of each cross is 500 nm.[91]
The silicon channeling mask does not have the disadvantage of forbidden geometries. It consists of a single crystal silicon membrane about 2.5 µm thick which has been thinned to 0.5 µm where transmission is desired.[93][94] The disadvantage is that the protons lose ~50 keV of energy passing through the 0.5 µm thick silicon and also lose some collimation.[94] This limits the resolution and the ultimate usefulness of the technique.
The source of ions is relatively easy to obtain. Almost any implanter can deliver a reasonably collimated beam of ions at the desired (~100 keV) energy, and collimation can be improved by moving farther away from the source. The most impressive source of ions, however, has been demonstrated by IMS in Vienna.[95] Modifying the system originally used for ion projection lithography, workers at IMS were able to achieve an exposure field of 150 × 150 mm with a deviation from telecentricity of -0.5 mrad and a penumbral blur of 30 µrad (see Fig. 23). The energy range of ions available was 75–150 keV with H+, H2+, H3+, or He+ ions. The current density at the mask was 0.1 µA/cm², i.e., resist would be exposed in about 10 sec. The small penumbral blur means that even at a 1 mm gap between mask and substrate, the image in the resist would be blurred by only 30 nm. The fact that the lithography is insensitive to the mask-to-substrate gap is an attractive feature of this technology. Issues of mask erosion, mask distortion by ion implantation, and mask heating have to be considered but do not appear to prevent the application.[92] These issues are common to ion projection lithography and will be addressed in the next section.
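The two numerical claims about the IMS source, the 30 nm blur at a 1 mm gap and the exposure time of about 10 sec, can be checked with small-angle geometry and a dose-over-current-density ratio. The sketch below is illustrative: the function names are hypothetical, and the resist sensitivity of 1 µC/cm² is an assumed value (the text gives only the resulting time).

```python
def penumbral_blur_nm(gap_mm, blur_angle_urad):
    """Image blur = gap x blur angle (small-angle approximation)."""
    gap_m = gap_mm * 1e-3
    angle_rad = blur_angle_urad * 1e-6
    return gap_m * angle_rad * 1e9

def exposure_time_s(sensitivity_uc_per_cm2, current_density_ua_per_cm2):
    """Exposure time = required charge per area / current density."""
    return sensitivity_uc_per_cm2 / current_density_ua_per_cm2

blur_nm = penumbral_blur_nm(1.0, 30.0)     # 1 mm gap, 30 microradian blur
time_s = exposure_time_s(1.0, 0.1)         # assumed 1 uC/cm2 resist, 0.1 uA/cm2

print(f"blur at 1 mm gap: {blur_nm:.0f} nm")
print(f"exposure time:    {time_s:.0f} s")
```

Because the blur scales linearly with the gap, even a gap of 100 µm would contribute only about 3 nm, which is why the text describes the process as insensitive to the mask-to-substrate gap.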
Figure 23. Sketch defining penumbral blur (angle %) and deviation from telecentricity (angle ε).
7.0
ION PROJECTION LITHOGRAPHY
So far, we have examined focused ion beam lithography (and implantation) and masked ion beam lithography. In some sense, ion projection lithography is a cross between them. Like the focused ion beam, it uses ion optics in an ion column, and like masked ion lithography, it uses a reticle, i.e., an open stencil mask, to define the pattern. As shown schematically in Fig. 24, ions are incident on the reticle and some pass through the open spaces of the reticle. The image of this reticle is projected through the ion column and demagnified by, say, 4×.
Figure 24. Cross-section of the ALG-1000 IPL stepper. The ion lenses are shown approximately to scale. The mask to wafer distance is approximately 2 m. (Labels in the figure: Ion Source; Zoom Lenses; E×B; Stencil Mask; Diverging Lens; Field Lens; Asymmetric Einzel Lens; Wafer; X-Y Stage. Energies marked: 10 keV, 200 keV, 150 keV.)
Ion projection systems have been built by Ion Microfabrication Systems in Vienna, Austria (a recent detailed review is available[96]). The main motivation has been lithography, i.e., exposure of resist, although in some cases direct resistless processing has been demonstrated, for example, exposure of SiO2 with 80 keV He+ ions to a dose of 6.6 × 10¹⁵ ions/cm² followed by wet chemical etching in 5% HF.[97] Most inert ions appear to be usable in this system (H, He, M, Ne, Ar, Xe).[98] If species such as B, As, Si, or Be could be projected, then direct implantation of semiconductors would be an attractive possibility. Three generations of systems have been built by IMS in Vienna, and the results have been sufficiently promising that a large European program has been launched to develop a tool that would demonstrate production worthiness.[99] Ion projection lithography is one of the candidates for sub-0.13 µm patterning. The other candidates are 1:1 x-ray lithography, soft x-ray projection (also called extreme ultraviolet, EUV), and electron projection. Each of the parts of the ion projection system has critical requirements.
7.1
Ion Source
For the fourfold-demagnification system being built in Europe, the ion current density on the stencil mask needs to be 0.15 µA/cm² at about 10 keV of energy. This translates to a 0.4 s exposure time per die for a resist sensitivity of 1 µC/cm². If the usable mask area is 200 mm in diameter, the total current on the mask needs to be about 50 µA. Moreover, the current density needs to be uniform over the mask. Ideally, the ions should appear to be coming from a point on the axis and to be collimated within 3° on the mask. The other important parameter is the energy spread of the ions. The energy spread leads to chromatic aberration (as discussed above in connection with FIB systems) and produces blurring of the image. The sources that have been used all generate a plasma of the desired species from which ions are extracted by suitably biased and shaped extraction electrodes. The sources for ion projection lithography are being developed by Lawrence Berkeley National Laboratory and have yielded energy spreads as low as 0.7 eV.[100] With a source that has a 2 eV energy spread, 65 nm features were readily resolved in an ion projection system.[100] The key feature that enables low energy spreads to be achieved is magnetic energy filters placed in front of the extraction region. Without such filters, the energy spread is in the 10–15 eV range.
7.2
Mask
The mask in this case is demagnified and, therefore, easier to fabricate from the point of view of resolution. However, it has to have a larger area. The masks made so far have almost all been made of single crystal silicon wafers which have the central area, for example, 200 mm diameter, thinned to a membrane 4 µm thick. The outer perimeter of the
wafer which is not thinned acts as a frame which in turn can be attached to an even more rigid frame. The membrane is etched electrochemically up to a predefined p-n junction 4 µm below the surface.[101] The tensile stress of the membrane is adjusted by suitable implantation and is kept below 10 MPa.[91] The pattern of openings is defined in an oxide film on the membrane using electron beam lithography, and the holes are etched in the membrane with reactive ion etching using Br2 gas.[102] An example of the quality of this etching process is shown by the 50 nm slot etched in 1 µm thick Si (see Fig. 25). Clearly, vertical sidewalls can be produced in the mask openings. In addition, the mask is covered by an absorbing carbon layer on the side that is exposed to the ions.[103]
Figure 25. A 50 nm wide slot etched through a 1 µm thick membrane by the University of Houston. The slot width narrows by only about 5 nm from front to back side.
The key issue with these masks is the distortion induced by stress relief. If a pattern of holes is cut in a given area of the mask, the stress in that area will be relieved and there will be a lateral displacement of parts of the mask. This will translate into a placement error in the features exposed on the wafer. Thus, if this placement error is to be kept below 10 nm, the distortion of the mask must be kept below 40 nm. This distortion has been modeled for various simple geometries.[104][105] In some of the worst case situations, for example, a patterned area over one half of the mask only, distortions are of order 200 nm, at least four times larger than permitted. There are several ways of reducing this distortion:
i. Calculate the expected distortion and predistort the pattern to correct for it. This may be difficult for complex patterns.
ii. Reduce the initial stress in the membrane. This may present some fabrication challenges.
iii. Build a stress relief pattern around the perimeter of the mask which in effect reduces the stress after the openings are etched.[105] (The pattern will uniformly shrink, but this is automatically taken care of by adjusting the magnification of the system.)
iv. Choose a different membrane material. Larger elastic constants will result in lower distortion. Diamond has about a factor of six larger elastic constants than Si, while Si₃N₄ is more favorable by a factor of 3.
Another source of mask distortion is thermal expansion due to mask heating. The membrane has energy incident at 0.15–0.33 mW/cm² and is in vacuum. Heat is lost only by conduction to the rim and by radiation. Thus, the temperature of the center of the membrane will be higher than the edges. A temperature difference of about 10 K between the edge and the center (which is calculated to occur under some circumstances) will result in an unacceptable non-uniform radial distortion of about 0.2 µm.[106] (Some of this can be corrected by a change in magnification, but at least 0.1 µm distortion will remain.) However, the radial temperature nonuniformity can be reduced to an acceptable level (~ ±1 K) by introducing a cooled cylinder surrounding the ion beam incident on the mask. This has been verified both by modeling[106] and by measurement.[107][108]
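The stress-relief distortion budget discussed earlier in this section can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (4× demagnification, 10 nm wafer placement budget, 200 nm worst-case distortion for Si, and the quoted ratios of elastic constants). The function names are illustrative, and the inverse scaling of distortion with elastic constant is a simplifying assumption made here, not a result taken from the cited modeling.

```python
# Rough mask-distortion budget for a demagnifying ion projection system.
# All numbers are the ones quoted in the text; names are illustrative only.

DEMAGNIFICATION = 4.0                 # 4x reduction from mask to wafer
WAFER_PLACEMENT_BUDGET_NM = 10.0      # allowed placement error at the wafer
WORST_CASE_SI_DISTORTION_NM = 200.0   # modeled worst case for a Si membrane

# Elastic constants relative to Si, as quoted in the text.
RELATIVE_STIFFNESS = {"Si": 1.0, "Si3N4": 3.0, "diamond": 6.0}


def allowed_mask_distortion_nm(wafer_budget_nm, demag):
    """Mask-plane distortion that maps onto the wafer placement budget."""
    return wafer_budget_nm * demag


def wafer_error_nm(mask_distortion_nm, demag):
    """Placement error at the wafer caused by a given mask distortion."""
    return mask_distortion_nm / demag


def scaled_distortion_nm(si_distortion_nm, material):
    """ASSUMPTION: distortion scales inversely with the elastic constant."""
    return si_distortion_nm / RELATIVE_STIFFNESS[material]


if __name__ == "__main__":
    limit = allowed_mask_distortion_nm(WAFER_PLACEMENT_BUDGET_NM, DEMAGNIFICATION)
    print(f"allowed mask distortion: {limit:.0f} nm")
    for material in RELATIVE_STIFFNESS:
        d = scaled_distortion_nm(WORST_CASE_SI_DISTORTION_NM, material)
        w = wafer_error_nm(d, DEMAGNIFICATION)
        print(f"{material:8s} mask {d:6.1f} nm -> wafer {w:5.1f} nm")
```

Under the stated assumption, the script reproduces the 40 nm mask-plane limit and shows how the quoted stiffness ratios would shift the worst-case figure; it says nothing about the geometry dependence treated in the cited models.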
Since the mask is cooled the same way it is heated, namely by radiation, the cutting of holes in the mask will not affect the temperature distribution, i.e., once the temperature profile is flat there is no lateral heat flow.
7.3
Ion Optical Column
After the ions pass through the mask, the ion optical column demagnifies the image and increases the ion energy from the 10 keV range to the 75–150 keV range. The image of the pattern that is projected is subject to several sources of distortion and possibly drift in the ion column. Distortion can be caused by the inherent properties of the ion optical elements, ion-ion global repulsion, or ion column misalignment. Drift can be caused by external (mainly magnetic) fields, temperature changes, or charging of the elements inside the column.
7.4
Pattern Lock System
A number of these sources of drift and distortion can be compensated by the pattern lock system[109]–[111] (see Fig. 26). In this scheme, extra beamlets are defined in precise locations around the perimeter of the die pattern on the mask. These beamlets surround the image that is projected through the ion optics, and they fall on permanent alignment marks that are located in a plate just above the wafer. The positions of the beamlets with respect to the alignment marks are measured, and the various lens voltages, multipoles, and axial magnetic poles are adjusted to bring (and keep) the beamlets on the alignment marks. This scheme, which is crucial to ion projection lithography, has been tested and found to permit high resolution exposure even in the presence of external magnetic field noise. In addition to the external fields, the pattern lock is expected to be able to correct for drift and distortion due to charging, some mask distortion, and ion-ion global repulsion, which to lowest order leads to a change in magnification. The degree to which non-uniform distortion can be corrected depends on the number of beamlets used to measure the distortion and on the variety of multipole fields that can be adjusted in the column to shape the image.
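To make the measurement step of the pattern lock concrete, the following sketch fits the lowest-order image errors (shift, magnification error, and rotation) to a set of measured beamlet offsets by least squares. It is only an illustration of the idea: the function names, the eight-mark layout, the 20 mm field, and the numerical values are all hypothetical, and the actual IMS control scheme is described in the cited references, not here.

```python
# Illustrative pattern-lock fit; not the IMS control algorithm.

def _centroid(marks):
    n = len(marks)
    return sum(x for x, _ in marks) / n, sum(y for _, y in marks) / n


def fit_pattern_lock(marks, offsets):
    """Least-squares fit of shift, magnification error, and rotation.

    marks   -- nominal (x, y) positions of the alignment marks
    offsets -- measured (dx, dy) of each beamlet relative to its mark
    Small-angle model, with (xc, yc) measured from the mark centroid:
        dx = tx + m*xc - r*yc
        dy = ty + m*yc + r*xc
    """
    n = len(marks)
    cx, cy = _centroid(marks)
    tx = sum(dx for dx, _ in offsets) / n
    ty = sum(dy for _, dy in offsets) / n
    num_m = num_r = den = 0.0
    for (x, y), (dx, dy) in zip(marks, offsets):
        xc, yc = x - cx, y - cy
        num_m += xc * dx + yc * dy
        num_r += xc * dy - yc * dx
        den += xc * xc + yc * yc
    return tx, ty, num_m / den, num_r / den


def residuals(marks, offsets, fit):
    """Offsets left over after removing the fitted low-order terms."""
    tx, ty, m, r = fit
    cx, cy = _centroid(marks)
    out = []
    for (x, y), (dx, dy) in zip(marks, offsets):
        xc, yc = x - cx, y - cy
        out.append((dx - (tx + m * xc - r * yc), dy - (ty + m * yc + r * xc)))
    return out


# Hypothetical layout: eight marks on the perimeter of a 20 mm field (units: nm).
HALF = 1.0e7
MARKS = [(-HALF, -HALF), (0.0, -HALF), (HALF, -HALF), (HALF, 0.0),
         (HALF, HALF), (0.0, HALF), (-HALF, HALF), (-HALF, 0.0)]
# Hypothetical errors: shift (nm), magnification error, rotation (rad).
TRUE = (5.0, -3.0, 2.0e-6, -1.0e-6)
OFFSETS = [(TRUE[0] + TRUE[2] * x - TRUE[3] * y,
            TRUE[1] + TRUE[2] * y + TRUE[3] * x) for x, y in MARKS]

if __name__ == "__main__":
    tx, ty, m, r = fit_pattern_lock(MARKS, OFFSETS)
    print(f"shift = ({tx:.2f}, {ty:.2f}) nm, mag error = {m:.2e}, rotation = {r:.2e} rad")
```

With only four fitted terms, any residual returned by `residuals` represents non-uniform distortion that this simple model cannot absorb, which is the point made in the text about needing more beamlets and more adjustable multipole fields.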
Figure 26. The IPL “pattern-lock” system.
7.5
Optical Column Design
Of course, the initial layout for the ion lenses and lens voltages is carefully designed to minimize the inherent pattern placement distortion in the column. The computations are complicated but, for one model, were independently verified by three laboratories who found that the pattern placement distortion could be kept below 20 nm over the entire die field.[112] To verify that the level of distortion is acceptable, a metrology plate will be inserted at the position of the wafer and a corresponding mask which consists of an array of alignment masks inserted in the system. This scheme has been devised by IMS and is described in the review.[96] The system verification and adjustment with the metrology plate is expected to be done periodically. With the pattern lock system operating, and the image quality verified, the image of a pattern to be exposed is rigidly positioned with respect to the pattern lock alignment plate just above the wafer. The alignment of the wafer to the bottom of the column is a task that is, in one way or another, common to almost all lithography schemes. Off axis optical alignment can achieve the desired performance and is being used in optical steppers. In this scheme, the alignment module (light source and detector)
is mounted at a precise position with respect to the ion beam. Since the wafer stage is on a laser-interferometrically controlled x-y table, the wafer can be precisely positioned under the ion beam once the optical alignment has been carried out.
7.6
Stochastic Blur
The Coulomb interaction between ions manifests itself in two ways. The global effect is the displacement of a given ion due to the combined effect of all of the other ions travelling in the beam. In addition, individual, random ion-ion scattering will mean that some ions suffer enough trajectory displacement that they do not arrive at the intended location. This is referred to as stochastic blur. Most of this effect will occur where the beam is densest, namely at the beam cross-over point, usually in a field-free region between the two lenses of the column. To minimize the stochastic blur, the ion energy in the cross-over should be as high as possible. Measurements of stochastic blur carried out with H⁺ ions with 7 keV energy at the cross-over indicate that if the column is scaled up to higher voltage (for example, 200 keV), then 3 µA of ion beam current can be passed through the column before the blur becomes unacceptable.[113] This is sufficient beam current to achieve the desired (~0.5 s/die) exposure time.
7.7
Resist Exposure
As discussed above, ions have a number of advantages for exposing resist. If the nominal dose is 1 µC/cm² (6 × 10¹² ions/cm²), then shot noise should not be an issue. (See Sec. 4.2.) Damage to the substrate with H⁺ ions is also either unimportant (i.e., the ions stop in insensitive layers) or it anneals out. Since light ions lose most of their energy to electrons rather than to scattering with the nuclei of the substrate, the paths of the ions in the resist and the substrate will be more or less straight and the range will be well defined. This is shown by the simulated paths of H⁺ and H₂⁺ ions.[114] Many resists work well with ions; contrast values between 2.7 and 30 have been reported.[96] Well formed features down to 65 nm have been exposed in DUV resist with a very early model ion projection machine[115][116] (see Fig. 27). Also, 0.18 µm wide lines have been exposed over 0.5 µm high steps in negative resist (see Fig. 28).
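The dose figures quoted in this section and in Sec. 7.1 can be reproduced with simple arithmetic. The sketch below converts a charge dose to an ion dose, estimates the exposure time per die from the mask current density, and counts the ions landing in one pixel as a rough indicator of shot noise. The function names are illustrative. Two assumptions are made here and are not taken from the text: the current density at the wafer is taken to scale with the square of the demagnification, and the pixel is taken to be a square with the side of the smallest feature.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per singly charged ion


def ions_per_cm2(dose_uC_per_cm2, charge_state=1):
    """Convert a charge dose (uC/cm^2) to an ion dose (ions/cm^2)."""
    return dose_uC_per_cm2 * 1.0e-6 / (charge_state * ELEMENTARY_CHARGE_C)


def exposure_time_s(dose_uC_per_cm2, mask_current_density_uA_per_cm2, demag):
    """Exposure time per die.

    ASSUMPTION: the current density at the wafer is the mask current
    density multiplied by demag**2 (the image area shrinks by demag**2).
    """
    wafer_density = mask_current_density_uA_per_cm2 * demag ** 2
    return dose_uC_per_cm2 / wafer_density


def ions_per_pixel(dose_uC_per_cm2, pixel_nm):
    """Mean number of ions landing in a square pixel of side pixel_nm."""
    area_cm2 = (pixel_nm * 1.0e-7) ** 2  # 1 nm = 1e-7 cm
    return ions_per_cm2(dose_uC_per_cm2) * area_cm2


def relative_shot_noise(mean_count):
    """Poisson estimate of the relative dose fluctuation, 1/sqrt(N)."""
    return 1.0 / math.sqrt(mean_count)


if __name__ == "__main__":
    print(f"ion dose: {ions_per_cm2(1.0):.2e} ions/cm^2")
    print(f"exposure time: {exposure_time_s(1.0, 0.15, 4.0):.2f} s per die")
    n = ions_per_pixel(1.0, 65.0)
    print(f"ions in a 65 nm pixel: {n:.0f}, relative noise {relative_shot_noise(n):.1%}")
```

The first two results agree with the values quoted in the text (about 6 × 10¹² ions/cm² and about 0.4 s per die). The per-pixel count is only a Poisson estimate and should be read against the shot noise discussion in Sec. 4.2.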
Figure 27. Features down to (left) 65 nm in 140 nm thick Shipley DUV positive resist (UVII HS 0.6) and (right) 80 nm spaces to 120 nm spaces in 370 nm thick resist. This resist is quite sensitive (~10¹² ions/cm²) and becomes negative only at a dose two orders of magnitude above the positive exposure threshold.
Figure 28. 0.18 µm wide lines on 0.4 µm pitch in Ray-AZ-PN resist over oxide steps 0.5 µm high. Dose = 5 × 10¹² H⁺ ions/cm².
From the results demonstrated with ion projection lithography and the calculations carried out, the technology looks quite promising for future sub-100 nm lithography. However, much work still needs to be done to prove its capabilities. This will be done with the systems now being built by the European project.
8.0
CONCLUSION
Progress in a field of technology is often governed by the interplay of several factors: the perceived need, useful results obtained, useful results envisioned, competing technologies, and resources devoted to it. In the case of focused ion beam development for photomask repair, and integrated circuit diagnosis, a clear need existed, useful results were readily demonstrated, no competing technologies were available, and the resources were expended to make commercial machines. In the case of ion lithography and focused ion beam implantation the issues are less clear. Focused ion beam lithography was first demonstrated more than twenty years ago, and many impressive results have been obtained since then. However, it has always had to compete with the earlier and more mature e-beam lithography. In spite of the fact that future advantages have been demonstrated and projected, the resources devoted to focused ion beam lithography development have been limited. The near term lithography needs have been satisfied by e-beams. As the demands on resolution in lithography increase, we expect the interest in focused ion beams, particularly with light ion species, to increase. Focused ion beam implantation, in contrast to lithography, can clearly perform unique functions which cannot be duplicated by conventional fabrication using broad beam implantation. The ability to vary the dose from point to point within a device and to vary the dose from device to device has permitted unique devices to be fabricated. What has limited the investment in this seemingly powerful technique is the fact that focused ion beam implantation is serial and therefore slow and that the devices fabricated, though unique and with superior performance, are not eagerly sought by the market place. In addition, some of the ion sources needed for implantation are difficult to operate. This may change. 
In the meantime, focused ion beam implantation is an exciting research tool and is a useful prototyping tool. Many implant doses, energies and unique geometries can be explored on one chip or wafer.
In the case of masked ion lithography, despite impressive results, there are heavily funded competing efforts. One can, at this point, imagine special cases where the limitations of stencil masks are overshadowed by the advantages of the simpler source of radiation, and the fact that a large mask-to-substrate gap can be used. Projection ion lithography is competing with other technologies in the post optical arena and further developments will determine the outcome.
REFERENCES 1. See for example Handbook of Charged Particle Optics, (J. Orloff, ed.), (CRC) Press (1997); or Prewett, P. and Maier, G., Focused Ion Beams from Liquid Metal Ion Sources, Research Studies Press (1991) 2. Bartelt, J. L., Solid State Technology, 29:215 (May, 1986) 3. Wolfe, J. C., Pendharkav, S. V., Ruchhoeft, P., Sen, S., Morgan, M. D., Horne, W. E., Tiberio, R. C., and Randall, J., J. Vac. Sci. Technol. B 14:3896 (1996) 4. Mondelli, A. A., Berry, I. L., Melngailis, J., and Gross, G., Microlithography World, 6(4):12 (Autumn 1997) 5. Melngailis, J., Mondelli, A. A., Berry, I. L., and Mohondro, R., J. Vac. Sci. Technol. B, 16, 927 (1998) 6. Ghandi, S., VLSI Fabrication Principles, J. Wiley, p. 316 and references therein (1983); Also 2nd Ed., p. 384 (1994) 7. Ziegler, J. F., Handbook of Ion Implantation Science and Technology, North Holland (1992) 8. Robinson, M. T., and Torrens, I. M., Phys. Rev. B 9, 5008 (1974); Robinson, M. T., Phys. Rev. B, 27:5347 (1983) 9. Mladenov, G. M., and Emmoth, B., Appl. Phy. Lett., 38:1000 (1981) 10. Chen, C. H., Ph.D. Thesis, Univ. of Md. 11. Stanishevsky, A., Univ. of Md., private communication. 12. Xin, X., DellaRatta, A. D., Sosonkin, J., and Melngailis, J., J. Vac. Sci. Technol. B, 10:2675 (1992) 13. Ishitani, T., and Ohnishi, T., Jpn. J. Appl. Phys., 28:L320, (1989) 14. Santamore, D., Edinger, K., Orloff, J., and Melngailis, J., J. Vac. Sci. Technol. B, 15:2346 (1997) 15. Dubner, A., and Wagner, A., J. Appl. Phys., 66:870 (1989). This has been studied for dimethyl gold hexafluoracetyl acetoneate.
16. Melngailis, J., SPIE, 1465:36 (1991). For a review of ion induced deposition. 17. Ro, J. S., Thompson, C. V., and Melngailis, J., Thin Solid Films, 258:333 (1995) 18. Della Ratta, A. D., Melngailis, J., and Thompson, C. V., J. Vac. Sci. Technol. B, 11: 2195 (1993) 19. Blauner, P. G., Butt, Y., Ro, J. S., Thompson, C. V., and Melngailis, J., J. Vac. Sci. Technol. B, 7:1816 (1989) 20. Stark, T. J., Shedd, G. M., Vitarelli, J., Griffis, D. P., and Russell, P. E., J. Vac. Sci. Technol. B, 13:2565 (1995) 21. Ochiai, Y., et al., J. Vac. Sci. Technol. B, 5:423 (1987) 22. Xu, Z., Gamo, K., and Namba, S., J. Vac. Sci. Technol. B, 6:1039 (1988) 23. Clampit, R., Aitken, K. L., and Jefferies, D. K., J. Vac. Sci. Technol., 12: 1208 (1975); Clampit, R., and Jefferies, D. K., Nucl. Instrum. Methods, 149:739 (1978); Krohn, V. E., and Ringo, G. R., Appl. Phys. Lett., 27:479 (1975) 24. Harriott, L. R., and Temkin, H., Integrated Optoelectronics. Ch. 6, Academic Press, ISBN 0-12 200420-5 (Lehaney, Crow and Dagenair, eds.) (1994) 25. Orloff, J., Rev. Sci. Instrum., 64:1105 (1993) 26. Melngailis, J., J. Vac. Sci. Technol. B, 5:469 (1987) 27. Clark, W. M., Seliger, R. L., Utlaut, M. W., Bell, A. E., Swanson, L. W., Schwind, G. A., and Jergensen, J. B., J. Vac. Sci. Technol. B, 5, p. 197 (1987) 28. Seliger, R. L., Ward, V. W., Wang, V., and Kubena, R. L., Appl. Phys. Lett., 34:510 (1979) 29. Wang, V., Ward, J. W., and Seliger, R. L., J. Vac. Sci. Technol., 19:1158 (1981) 30. Shiokawa, T., Kim, P. H., Toyoda, K., and Namba, S., J. Vac. Sci. Technol. B, 1:1117 (1983) 31. Levi-Setti, R., Proceedings of the 7th Annual Scanning Electron Microscope Symposium, p. 125, IIT Research Institute, Chicago, IL (1974) 32. Orloff, J. H., and Swanson, L. S., J. Vac. Sci. Technol., 12:1209 (1975); J. Vac. Sci. Technol., 15:845 (1978); J. Vac. Sci. Technol., 50:6026 (1979) 33. Schwoebel, P. R., and Hanson, G. R., J. Vac. Sci. Technol., 133:214 (1985) 34. Itakura, T., Horiuchi, K., and Nakayama, N., J. 
Vac. Sci. Technol. B, 9:2596 (1991) 35. Kalbitzer, S., and Knoblauch, A., J. Vac. Sci. Technol. B, 16:2455 and references therein (1998)
36. Bell, A. E., Rao, K., Schwind, G. A., and Swanson, L. W., J. Vac. Sci. Technol. B, 6:927 (1988) 37. Ward, J. W., J. Vac. Sci. Technol. B, 3:207 (1985); Yan, Y. W., Groves, T. R., and Pease, R. F. W., J. Vac. Sci. Technol. B, 2:1141 (1983) 38. Ishitani, T., Umemura, K., and Kawanami, Y., J. Vac. Sci. Technol. B, 6:931 (1988) 39. Marriott, P., J. de Physique Colloque C6, Suppl. Au no, 11 Tome 48, p. C6-189 (Nov. 1987). Presented at Field Emission Symp., Osaka (July 1987) 40. Kurihara, K., J. Vac. Sci. Technol. B, 3:41 (1985) 41. Miyauchi, E., Hashimoto, H., and Utsumi, T., Jpn. J. Appl. Phys., 22: L225 (1983) 42. Gamo, K., Matsui, T., and Namba, S., Jpn. Appl. Phys., 22:L692 (1983) 43. Clark, W. M., Seliger, R. L., Utlaut, M. W., Bell, A. E., Swanson, L. W., Schwind, G. A., and Jergensen, J. B., J. Vac. Sci. Technol. B, 5:197 (1987) 44. Swanson, L. W., Bell, A. E., Schwind, G. A., J. Vac. Sci. Technol. B, 6:491 (1988) 45. Umemura, K., Kawanami, Y., Ishitani, T., Mucl. Instr. And Methods in Physics Res. (1988) 46. Edinger, K., Yun, V., Melngailis, J., and Orloff, J., J. Vac. Sci. Technol. B, 15:2365 (1997) 47. Lewis, G. N., Paik, H., Mioduszewski, J., and Siegel, B. M., J. Vac. Sci. Technol. B, 4:116 (1986) 48. Levi-Setti, R., Advances in Electronics and Electron Physics, Suppl., 13A:261, Academic Press (1980) 49. Horiuchi, K., Itakura, T., and Ishikawa, H., J. Vac. Sci. Technol. B, 6:937 (1988) 50. Börret, R., Jousten, K., Böhringer, K., and Kalbitzer, S., J. Phys. D. Applied Phys., 21:1835 (1988) 51. Levi-Setti, R., Advances in Electronics and Electron Physics, Suppl. 13A:261, Academic Press (1980) 52. Murray, J., and Brodie, I., Physics of Microfabrication, Ch. 2, Plenum Press (1982) 53. Munro, E., Image Processing and Computer-aided Design in Electron Optics, (P. W. Hawkes, ed.), p. 284, Academic Press, New York, (1973) 54. Szilagyi, M., and Szep, J., J. Vac. Sci. Technol. B, 6:953 (1988) 55. Tsumagari, T., Ohiwa, H., and Noda, T., J. Vac. Sci. Technol. 
B, 6:449 (1988) 56. Munro, E., J. Vac. Sci. Technol. B, 6:941 (1988)
57. Thompson, W., Honjo, I., and Utlaut, M., J. Vac. Sci. Technol. B, 1:1125 (1983) 58. Kubena, R. L., Stratton, F. P., Ward, J. W., Atkinson, G. M., and Joyce, R. J., J. Vac. Sci. Technol. B, 7:1798 (1989) 59. Paik, H., Lewis, G. N., Kirkland, E. J., and Siefel, B. M., J. Vac. Sci. Technol. B, 3:75 (1985) 60. Morimoto, H., Sasaki, Y., Onda, H., and Kato, T., Appl. Phys. Lett., 46:898 (1985) 61. Morita, T., Miyauchi, E., Arimoto, H., Takamori, A., Bamba, Y., Hashimoto, H., J. Vac. Sci. Technol. B, 5:236 (1987); Miyauchi, E., Morita, T., Takamori, A., Arimoto, H., Bamba, Y., and Hashimoto, H., J. Vac. Sci. Technol. B, 4:189 (1986) 62. Mogren, S., and Berry, I. L., J. Vac. Sci. Technol. B, 16:2469 (1998) 63. Xu, X., Della Ratta, A. D. , Sosonkin, J. A., and Melngailis, J., J. Vac. Sci. Technol. B, 10:2675 (1992) 64. Ishitani, T., and Ohnishi, T., Jpn. J. Appl. Phys., 28:L320 (1989) 65, Santamore, D., Edinger, K., Orloff, J., and Melngailis, J., J. Vac. Sci. Technol. B, 15:2346 (1997) 66. Muller, K. P., and Petold, J. C., Proc. SPIE, 1263:12 (1990) 67. Blauner, P. G., Ro, J. S., Butt, Y., and Melngailis, J., J. Vac. Sci. Technol. B, 7:609 (1989) 68. Xu, Z., Kosugi, T., Gamo, K., and Namba, S., J. Vac. Sci. Technol. B, 7:1959 (1989) 69. Young, R. J., and Puretz, J., J. Vac. Sci. Technol. B, 13:2576 (1995) 70. Edinger, K., Melngailis, J., and Orloff, J., J. Vac. Sci. Technol. B, 16:3311 (1998) 71. Wager, A., Barr, D. L., Atwood, D., and Bruning, J., Electron Im & Photon Beam Technology Symopsium (May 1983) 72. Overwijk, M. H. F., van der Heuvel, F. C., and Bulle-Lieuwma, C. W. T., J. Vac. Sci. Technol. B, 11:2021 (1993) 73. Vasile, M. J., Biddick, C., and Schwalm, S. A., J. Vac. Sci. Technol. B, 12:2388 (1994) 74. Nikawa, K., J. Vac. Sci. Technol. B, 9:2566 (1991) 75. Lee, J. Y., and Kubena, R. L., Appl. Phys. Lett, 48:668 (1986) 76. Walsen, R. H., Schmitz, A. E., Larson, L. E., Kramer, A. R., and Pasiecznik, J., Proc. IEEE 1988 Custom Integrated Circuits Conf., P. 
18.7.1, IEEE 88 CH 2584-1 (May 1988) 77. Evason, A. F., Cleaver, J. R. A., and Ahmed, J., IEEE Electron Dev. Lett., 9:281 (1988)
78. Shen, C., Murugia, J., Goldsman, J., Peckerar, M., Melngailis, J., and Antoniadis, D. A., IEEE Transactions on Electron Devices, 45(2):453–459 (Feb. 1998) 79. Lezec, H. J., Ismail, K., Mahoney, L. J., Shepard, M. I., Antoniadis, D. I., and Melngailis, J., IEEE Electron Dev. Lett., 9:476 (1998) 80. Lattes, A. L., Munroe, S. C., Seaver, M. M., Murugia, J. E., and Melngailis, J., IEEE Transactions on Electron Devices, 39(7):1772–1774 (July 1992) 81. Wieck, A. D., and Ploog, K., Appl. Phys. Lett., 56:928 (1990) 82. Kubena, R. L., Ward, J. W., Stratton, F. P., Joyce, R. J., and Atkinson, G. M., J. Vac. Sci. Technol., 139:3079 (1991) 83. Stumbo, D. P., and Wolfe, J. C., J. Vac. Sci. Technol. B, 11:2432 (1993) 84. Babin, S., and Kuzmin, I. Y., J. Vac. Sci. Technol. B, 16:3241 (1998) 85. Kratschmer, E., and Groves, T. R., J. Vac. Sci. Technol. B, 8:1898 (1990) 86. Kubena, R. L., Stratton, F. P., Ward, J. W., Atkinson, G. M., and Joyce, R. J., J. Vac. Sci. Technol. B, 7:1798 (1989) 87. Matsui, S., Moro, K., Saigo, K., Shiokawa, T., Toyoda, K., and Namba, S., J. Vac. Sci. Technol. B, 4:845 (1986) 88. Chu, W., Yen, A., Ismail, K., Shepard, M. I., Lezee, H. J., Musil, C. R., Melngailis, J., Ku, Y. C., Carter, J. M., and Smith, H. I., J. Vac. Sci. Technol. B, 7:1583 (1989) 89. Matsui, S., Kojima, Y., Ochiai, Y., and Honda, T., J. Vac. Sci. Technol. B, 9:2622 (1991) 90. Randall, J. N., J. Vac. Sci. Technol. A, 4:777 (1986) 91. Mauger, P. E., Shimkunas, A. R., Wolfe, J. C., Sen, S., Löschner, H., and Stengl, G., J. Vac. Sci. Technol. B, 10:2819 (1992) 92. Morgan, M. D., Horne, W. E., Sundaram, V., Wolfe, J. C., Pendharkan, S. V., and Tiberio, R., J. Vac. Sci. Technol. B, 14:3903 (1996) 93. Atkinson, G. M., Bartelt, J. L., and Middleton, P. L., J. Vac. Sci. Technol. B, 5:219 (1987) 94. Atkinson, G. M., Bartelt, J. L., Weureuther, A. R., and Chang, N. W., J. Vac. Sci. Technol. B, 5:232 (1987) 95. Löschner, H., et al., unpublished result. 96. Melngailis, J., Mondelli, A. 
A., Berry, I. L., III, and Mohondro, R., J. Vac. Sci. Technol. B, 16:927–957 (1998) 97. Stengl, G., Löschner, H., and Wolf, P., Nucl. Instr. and Methods in Physics Research B, 19/20:987 (1987) 98. Stengl, G., Löschner, H., Maurer, W., and Wolf, P., J. Vac. Sci. Technol. B, 4:194 (1986) 99. Kaesmaier, R., Loeschner, H., Stengl, G., Wolfe, J. C., and Ruchhoeft, P., J. Vac. Sci. Technol. B, 17:3091 (1999)
100. Lee, Y., Gough, R. A., Leung, K. N., Vulic, J., Williams, M. D., Zahir, N., Fallman, W., Torkler, M., and Brunger, W., J. Vac. Sci. Technol. B, 16:3367 (1998) 101. U.S. Patent No. 4,919,749, Mauger, P. E., Shimkunas, A. R., and Yen, J. J.; U.S. Patent No. 4,966,663, Mauger, P. E. 102. Wolfe, J. C., Pendharkar, S. V., Ruchhoeft, P., Sen, S., Morgan, M. D., Horne, W. E., Tiberio, R. C., and Randall, J. N., J. Vac. Sci. Technol. B, 14:3896 (1996) 103. Ruchhoeft, P., Wolfe, J. C., Wasson, J., Torres, J., Wu, H., Nourin, H., Liu, N., Morgan, M. D., and Tiberio, R., J. Vac. Sci. Technol. B, 16:3599 (1998) 104. Fisher, A. H., Laudon, M. F., Engelstad, R. L., Lowell, E. G., and Cerrina, F., presented at the EIPBN’97, San Diego, May (1997); J. Vac. Sci. Technol. B, 15:2249, (Nov./Dec. 1997) 105. Didenko, L., Melngailis, J., Löschner, H., Stengl, G., Chalupka, A., and Shimkunas, A., presented at MNE ’96 Glasgow, Scotland; Microelectronic Engineering, 35: 443–446 (1997) 106. Birman, A., Levush, B., Melngailis, J., Löschner, H., and Stengl, G., J. Vac. Sci. Technol. B, 13:2584 (1995) 107. Riordon, J., Didenko, L., and Melngailis, J., EIPB ’96 Atlanta (May 1996) J. Vac. Sci. Technol. B, 14:3900 (Nov./Dec. 1996) 108. Riordon, J., et.al., Univ. of Maryland unpublished result. 109. Chalupka, A., Fegerl, J., Fischer, R., Lammer, G., Löschner, H., Malek, L., Nowak, R., Stengl, G., Traher, C., Wolf, P., Proceedings of the ISIAT’91— 14th Symposium Ion Sources and Ion Assisted Technology, (T. Takagi, ed.), The Ion Engineering Society of Japan, Tokyo (1991) 110. Chalupka, A., Fegerl, J., Fischer, R., Lammer, G., Löschner, H., Malek, L., Nowak, R., Stengl, G., Traher, C., and Wolf, P., Microelectron. Eng., 17:229 (1992) 111. Stengl, G., Bosch, G., Chalupka, A., Fegerl, J., Fischer, R., Lammer, G., Löschner, H., Malek, L., Nowak, R., Traher, C., and Wolf, P., Microelectron. Eng., 21:187 (1993) 112. Chalupka, A., Stengl, G., Buschbet, H., Lammer, G., H. 
Vonach, Fischer, R., Hammel, E., Löschner, H., Nowak, R., Wolf, P., Finkelstein, W., Hill, R. W., Berry, I. L., Harriott, L. R., Melngailis, J., Randall, J. N., Wolfe, J. C., Stroh, H., Wollnik, H., Mondelli, A. A., Petillo, J. J., Leung, K., J. Vac. Sci. Technol. B, 12:3513 (1994) 113. Hammel, E., Chalupka, A., Fegerl, J., Fischer, R., Lammer, G., Löschner, H., Malek, L., Nowak, R., Stengl, G., Vonach, H., Wolf, P., Brünger, W. H., Buchmann, L. M., Torkler, M., Cekan, E., Fallmann, W., Paschke, F., Stangl, G., Thalinger, F., Berry, I. L., Harriott, L. R., Finkelstein, W., and Hill, R. W., J. Vac. Sci. Technol. B, 12:3533 (1994)
114. Hammel, E., Monte Carlo simulations, Ionen Mikrofabrikations Systeme, Vienna, Austria, private communication. 115. Brunger, W., Buchmann, L. M., Torkler, M., and Sinkwitz, S., J. Vac. Sci. Technol. B, 14:3924 (1996) 116. Brunger, W. H., Torkler, M., Buchmann, L. M., and Finkelstein, W., EIPBN ’97, San Diego, CA (May 27–30, 1997); J. Vac. Sci. Technol. B, 15:2355 (1997)
10
X-Ray Lithography
William B. Glendinning
Consultant
South Bristol, Maine
Franco Cerrina
University of Wisconsin
Madison, Wisconsin
PART I
1.0
INTRODUCTION
Scrutiny of well-established optical (visible) IC chip lithography practice and principles led to the x-ray radiation-based alternative. Close examination of the optical-mask shadow details at the mask's opaque pattern edges raised perplexing questions.[1] More than two decades ago, it was concluded that visible (400 nm to 600 nm) proximity and projection optical printing was diffraction limited and would not provide high quality delineation at IC device minimum feature size (MFS) dimensions of less than 2.0 µm.[2] Conversely, x-ray-mask-produced shadows with only small Fresnel diffraction limitations (wavelengths of 0.4 nm to 1.5 nm) would extend high quality IC production lithography to 1.0 µm MFS and less. X-ray flux could provide high volume, high density IC device production as well as meet the necessary favorable throughput, yield, and processing cost criteria. With time, the x-ray lithography method, although lacking
even a few sizable production-use examples, retained support by virtue of its fairly unencumbered diffraction feature. In addition, x-ray's associated large depth of focus, wide process latitude, and excellent line width control furthered arguments favoring the x-ray alternative. The relative dust-and-defect immunity of x-ray printing, an elegant and additional plus for a high quality/high yield IC process, has not as yet been fully established.[3] The text in Secs. 1.0–10.0 of this chapter will present the growth and development of the present day x-ray lithography printing methods, mostly from a process and user viewpoint. The x-ray printing method will be defined and described as a system in terms of source, mask, aligner, and resist. The important component design error budget factors are detailed to provide both insight and clarity regarding the x-ray system complexity. The strengths and weaknesses of the major x-ray printing system sources and aligners are cited. Some emphasis is then given to mask technology in terms of fabrication, membrane and absorber materials, and absorber patterning. Finally, the extent of IC device damage from x-ray lithography process steps is shown. The chapter ends with a few brief conclusions and directs attention towards the subject of Sec. 11.0 of Part II, namely, the synchrotron as a unique and technically ideal x-ray source.
2.0
X-RAY PRINTING METHOD—SYSTEM APPROACH
The x-ray method of lithography, like its spawning predecessor (visible optical proximity printing), can be defined in a system approach manner. The x-ray system’s major components are source, mask, aligner, and resist, as seen in Fig. 1(a). In Fig. 1(b), x-ray photons emitted from the radiation source of finite dimensions, d, travel through distance, D, to impinge on the mask of field size, F. The x-ray flux continues its path through the mask membrane of thickness, t. The photons either pass through clear mask areas, including the smallest feature width, W, or are absorbed in the absorber areas of height, h. The exiting aerial image of flux from the bottom of the mask travels the gap spacing distance, s, to be finally absorbed in the resist, creating photoelectrons. A latent image in the resist, corresponding to the geometric mask absorber pattern, instantly forms by the photoelectron-induced changes to the molecular weights within the resist layer (see Ch. 2). This simple physical model is applicable to all x-ray lithography systems regardless of size, power, fixed or scanned flux, path direction of flux (vertical to horizontal), or stepped or fixed (full wafer) field.
Figure 1. (a) X-ray lithography system concept; (b) x-ray exposure physical model.
2.1
X-Ray System Definitions
Since the x-ray system is employed to perform key lithography process steps in the IC chip manufacturing line, severe lithography constraints are levied in the context of the complexity of the particular circuit device being fabricated. Four important parameters (MFS, also referred to as critical dimension; line width control, sometimes called critical dimension control; overlay accuracy; and throughput) dictate the fine details of the x-ray system component performances and the respective component allowable design error budget factors.
2.2 Minimum Feature Size and Line Width Control
The first parameter, MFS, concerns the capability of imaging a narrow line width (or space) of some minimum size into a resist. The second parameter, line width control, concerns the capability of maintaining line width within tolerance across the field, F, from field to field, and from wafer to wafer. An image is transferred into a negative resist on a silicon wafer (other substrates, such as gallium arsenide, apply equally throughout the text) in accordance with the characteristic resist response curve shown in Fig. 2. The x-ray aerial image flux non-uniformity directly affects the line width control. Also affecting these parameters is the penumbral blur, δ. This blur is a geometric shadow factor of consequence in achieving and controlling dimensions, as illustrated in Fig. 3 for the case of a spot source. Here D, d, h, and s are as defined previously, R is the distance from the source axis to the absorber line of width Lm, and Lw is the transferred Lm image in the resist.[4] The penumbral blur is:
Eq. (1)    $\delta = \dfrac{s\,d}{D}$
and the transferred line width is:
Eq. (2)    $L_w = L_m\,[\,1 + s/(D + s) + Rh/(D + s)\,]$
Reducing penumbral blur reduces line edge roughness and increases line width resolution and control capability. As seen, equivalent source size, d, mask-to-wafer gap spacing, s, and the source-to-wafer distance, D, all can be used to control the penumbral blur.
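The trade-off among d, s, and D in Eq. (1) can be explored numerically. The following is a minimal sketch only; the input values (a 20 µm gap, a 1.5 mm spot, and an 18 cm source distance) are illustrative assumptions chosen to resemble a rotating-anode spot source, and the function names are hypothetical.

```python
def penumbral_blur(gap, source_size, source_distance):
    """Eq. (1): delta = s * d / D, with all lengths in the same unit."""
    return gap * source_size / source_distance


def max_gap_for_blur(blur, source_size, source_distance):
    """Eq. (1) solved for the gap: s = delta * D / d."""
    return blur * source_distance / source_size


# Illustrative inputs in metres (assumed for this sketch, not a tool spec).
s = 20e-6    # mask-to-wafer gap
d = 1.5e-3   # equivalent source spot size
D = 0.18     # source distance

print(f"blur = {penumbral_blur(s, d, D) * 1e9:.0f} nm")
print(f"gap for 170 nm blur = {max_gap_for_blur(170e-9, d, D) * 1e6:.1f} um")
```

With these assumed inputs the blur comes out near 167 nm, and a 170 nm blur budget corresponds to a gap of roughly 20 µm.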
Figure 2. Transfer of mask absorber pattern into negative resist in accordance with the resist’s transfer curve.
Figure 3. Penumbral blur and its effect upon the absorber pattern image transfer for a spot source.
To evaluate the x-ray system’s capability to print fine lines, exposures of lines are made in resist and further processed through resist development and oven bake. Customarily, the thin resist layer is spun on a
bare or oxidized silicon wafer. The dimensions of the lines after resist baking are carefully measured by special optical microscopes or by electron-optics (scanning electron microscope, SEM) methods. As an example, in high density IC device work such lines would be made at a minimum feature size of 0.5 µm.[5] High quality line width control of 75 nm (3σ) is expected. To fully qualify the x-ray system, x- and y-directed lines must be evaluated throughout the entire exposure field, F, across the entire wafer, and from wafer to wafer. The line edge quality is also of importance and is measured as part of the line width artifacts. An edge roughness per line side of 75 nm (3σ) is acceptable. Table 1 shows the design error budget numerical values levied against the spot source, aligner, mask, and resist of an example x-ray system in order to achieve the required MFS and both the MFS and the edge roughness controls (the 2σ range is compatible with the tolerances cited above).
Table 1. Elements of the linewidth control design error budget for a spot source x-ray system example. MFS = 0.5 µm; Linewidth Control = 150 ± 50 nm (2σ); Penumbral Blur = MFS/3 = 170 nm.
Component
Error Element
Source
Source-to-wafer (40 µm to 50 µm)
Budget Allowance (± nm) 10
X-ray flux uniformity: Across Field (± 2%) Exposure to exposure (± 2%)
5 5
Aligner
Mask-to-wafer stability
10
Mask Edge profile <150 nm/side Roughness <50 nm/side
Absorber projected angle Linewidth uniformity: E-beam pattern Across field Mask-to-mask
10
Resist Thickness >300 nm Edge roughness <50 nm
10 10
Processing Across field Mask-to-mask
10 10
X-ray transmission Across mask (± 2%)
4
Processing development control
30
2.3 Overlay Accuracy
The next parameter for judging the capability of an x-ray system is the overlay accuracy. The manufacturing of IC devices requires many process steps involving the lithographic delineation of geometries. To make IC chips to exacting electrical performance specifications and achieve production yields, each lithographic mask level must be aligned precisely to all of the other lithographic levels. In terms of the device MFS, acceptable production yields occur if the x- and y-direction overlay error is no more than 1/5 of the MFS. An MFS of 0.5 µm therefore requires that the overlay error be within ±0.1 µm (2σ). Overlay accuracy values represent measured x- and y-directed data sample populations with spreads of 2 or 3 sigma from the mean. Sample data are obtained by the fabrication and test of simple electrical devices, optically read verniers, and optoelectronically read bench marks (for example, a matrix of marker crosses) (see Fig. 4).[6][7] Table 2 shows the design error budget numerical values levied against the source, aligner, and mask components in order to achieve the desired overlay accuracy for a spot source x-ray system example. Mask-to-mask (±80 nm) and aligner (±60 nm) are the major error components. The wafer contributes to the overlay error (distortion) and is, therefore, included in the error budget.
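The overlay rule of thumb and the two major error components quoted above can be checked with a short calculation. This is a hedged sketch: combining budget elements in quadrature (root-sum-square) is an assumption made here for illustration, since the text does not state how the elements are summed, and the function names are hypothetical.

```python
import math


def overlay_limit_nm(mfs_nm, fraction=0.2):
    """Overlay allowance taken as a fixed fraction (1/5) of the MFS."""
    return mfs_nm * fraction


def combine_in_quadrature(errors_nm):
    """Root-sum-square of independent error terms (an assumed combination rule)."""
    return math.sqrt(sum(e * e for e in errors_nm))


limit = overlay_limit_nm(500.0)                  # MFS = 0.5 um
combined = combine_in_quadrature([80.0, 60.0])   # mask-to-mask and aligner terms

print(f"overlay allowance: +/-{limit:.0f} nm")
print(f"two major terms combined: +/-{combined:.0f} nm")
```

Under the quadrature assumption, the two major terms alone already reach the ±100 nm total, which suggests the remaining elements must be comparatively small.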
Figure 4. Y-direction overlay error data—two different mask levels printed on the same wafer. (Courtesy Perkin-Elmer.)
Table 2. Elements of the overlay design error budget for a spot source x-ray system example. MFS = 0.5 µm; Total x,y Errors = ±100 nm (2σ); Penumbral Blur = MFS/3 = 170 nm.
2.4 Throughput
The last parameter for judging the x-ray system performance capability is throughput. The throughput bears directly on the essential value of any lithography method, i.e., the throughput value must favorably affect the manufacturing cost per chip equation. Hence, the throughput should be as high as possible. It is evaluated by lithographically printing a series of wafer levels for a measured period of time. The time interval must be long enough to include the prospects and consequences of downtime (equipment breakdown) should it occur.[8] Table 3 shows the design budget numerical values levied against a spot source and a planar-motor driven aligner stage with physical optoelectronic alignment detection/positioning control (zone-plate/grating method). Source, mask, aligner, and resist components are all throughput determinants. The source and resist especially are shown in other chapter sections as the salient throughput determining
factors. A throughput of forty-five 100 mm (diameter) wafer levels per hour for 0.5 µm MFS lithography is acceptable, as is an equivalent exposed area rate (mm²/hour) when larger diameter wafers are used (for example, 150 mm or 200 mm).
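The Table 3 allowances can be combined into a wafer-level throughput estimate. The sketch below assumes that the individual subfield steps (accelerate, move, decelerate, fine align, gap adjust) are incurred once per field, which is consistent with 7 fields giving the tabulated 12.6 seconds/wafer; the function name is hypothetical.

```python
def wafer_time_s(overhead_s, subfield_steps_s, exposure_s, fields):
    """Total time per wafer level: fixed overhead plus per-field stepping and exposure."""
    per_field = sum(subfield_steps_s) + exposure_s
    return overhead_s + fields * per_field


overhead = 2.5 + 2.8 + 15.2            # transport, gap pre-align, transport to/from exposure
steps = [0.1, 0.5, 0.2, 0.5, 0.5]      # accelerate, move, decelerate, fine align, gap adjust
t_wafer = wafer_time_s(overhead, steps, exposure_s=5.5, fields=7)

print(f"time per wafer level: {t_wafer:.1f} s")
print(f"throughput: {3600.0 / t_wafer:.1f} wafer levels/hour")
```

The estimate comes to about 71.6 seconds per wafer level, or roughly fifty wafer levels per hour, in line with the figure given in the Table 3 caption.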
Table 3. Elements of the design throughput budget for a spot source x-ray system example. Fifty wafer levels (100 mm diameter) per hour, 7 fields (3 cm × 3 cm) per wafer level.
Component | Error Element | Design Budget Allowance
Source | Power input | 10 kW
Source | Source-to-wafer distance | 18 cm
Source | Penumbral blur | 170 nm
Source | Beryllium window | 15 µm
Mask | X-ray transparency | 50%
Aligner | Overhead (total) | 20.5 seconds/wafer
Aligner | Overhead: transport | 2.5 seconds
Aligner | Overhead: gap pre-align | 2.8 seconds
Aligner | Overhead: transport to/from exposure | 15.2 seconds
Aligner | Subfield time (total) | 12.6 seconds/wafer
Aligner | Subfield: accelerate | 0.1 seconds
Aligner | Subfield: move | 0.5 seconds
Aligner | Subfield: decelerate | 0.2 seconds
Aligner | Subfield: fine lateral align | 0.5 seconds
Aligner | Subfield: gap adjust | 0.5 seconds
Resist | Exposure time/field (resist sensitivity 8 mJ/cm²) | 5.5 seconds/field
3.0 X-RAY SYSTEM COMPONENTS
Understanding the x-ray lithography method requires familiarity with the features, materials, configurations, physics, electronics, and principles of operation of the x-ray system components: source, mask, aligner, and resist. Only broad views of these components can be presented here. Some depth in mask technology is provided later in the chapter. The reader is directed to Ch. 2 and Part II of this chapter for in-depth discussions of photoresist and x-ray synchrotron source subject matter, respectively.
3.1 Sources for X-Ray Flux
The selection of a source type levies several important design constraints: source size, power emitted, mask-to-source distance, characteristic and broadband radiation, flux scanning, aligner orientation, mask membrane and absorber material, and photoresist spectral sensitivity and contrast. In practice, three categories of sources have shown lithographic quality to varying degrees, as noted in Table 4: electron impact (fixed, rotating), plasma (laser, gas-puff), and the electron storage ring or synchrotron (warm magnet, cold [superconducting] magnet). Ideally, the x-ray emitter should be of minimal size (less than 1 mm), provide collimated flux (approximated by a large source-to-mask distance [2 m to 4 m] or by grazing-angle scanning), and emit copious photons of complementary energy for a per-field exposure time of a few seconds. The storage ring is close to an ideal emitter source. Its awkward features are its physical size and cost, which will be mentioned later (also see Ch. 1).
Table 4. X-Ray Source Types and Related Parameters
Parameter | Electron Impact: Stationary | Electron Impact: Rotating | Plasma: Z-Pinch | Plasma: Laser | Storage Ring (a): Warm | Storage Ring (a): Superconducting
Material | Palladium, 5–6 kW | Tungsten, 10 kW | Krypton, Ipk 400–500 kA | Steel, Cu; Nd:Glass | 1.0 GeV | 0.7 GeV
Distance source/mask (cm) | 4.0 | 18.0 | 18.0 | 1.0 | 1500 | 1500
Source size (d) (mm) | 4 | 1.5 | 1 | 0.1–0.2 | x: 0.5, z: 0.075 | x: 0.7, z: 0.7
Penumbral blur (nm) | 400 | 170 | 170 | 300 | x: 5–10, z: 0.05 | x: 1.3, z: 1.3
Wavelength (nm) | 0.44 | 0.7 | 0.6–0.8 | 0.4–2.2, 1.2–1.6 | 0.4–2.0, λc: 8.45 | λc: 1.0
Irradiance at resist (mW/cm²) | 0.3 | 1.5 | 2.5–12.5 | 5.0–25 (b) | 50 | 100
Isolation window (µm) | Be, 50 | Be, 15 | Be, 15–25 | no Be, 0 | Be, 25 | Be, 25
(a) See Table 8, this chapter. (b) With 100 J/pulse laser output.
Electron Impact Source. Electron impact x-ray sources emit the smallest quantities of photons. Electrons bombarding a solid target produce both characteristic line x-rays (from the stripping of inner-shell electrons) and continuous Bremsstrahlung x-rays (from the deceleration of electrons).[9] Emitted power, for example, from a 10 kW rotating tungsten (M-line) anode is absorbed en route to the wafer/resist by a beryllium window (25 µm, T = 0.61), ambient gas (He, 18 cm, T = 0.96), a boron nitride mask (3.0 µm, T = 0.72), and a polyamide layer (2 µm, T = 0.88), where T is the flux fraction transmitted through the medium. The total transmission coefficient is TT = 0.38. For optimum x-ray system performance, the mask, resist, and anode materials are customarily selected for compatibility with respect to the characteristic radiation. If a resist with a sensitivity of Di = 8 mJ/cm² is used and a field exposure time texp = 5.0 sec is assumed, the incident on-resist x-ray power irradiance φ must be:
Eq. (3)    $\phi = D_i / t_{exp} = \dfrac{8\ \mathrm{mJ/cm^2}}{5.0\ \mathrm{sec}} = 1.6\ \mathrm{mW/cm^2}$
The tungsten anode x-ray generation efficiency, which relates irradiance to input power, P,[10][11] is (with D the 18 cm source-to-wafer distance):
Eq. (4)    $\eta = \phi D^2 / (P\,T_T) = 1.6 \times 10^{-3}\ \mathrm{W/cm^2} \times (18\ \mathrm{cm})^2 / (10\ \mathrm{kW} \cdot 0.38) = 136\ \mu\mathrm{W/W\text{-}sr}$
The maximum power dissipation of a rotating anode is:
Eq. (5)    $P = (\pi/4)^2\,(\rho\,C_v\,k\,d^3\,v)^{1/2}\,\Delta T$
where ρ is the density of the material, Cv is the specific heat, k is the thermal conductivity, d is the spot size, v is the surface velocity, and ∆T is the maximum allowable temperature rise. For tungsten:
$P = (\pi/4)^2\,[(19.3)(0.155)(1.99)(0.001)(2.2 \times 10^3)]^{1/2}\,(3345\,^{\circ}\mathrm{C})$
and
$P = 7.5\ \mathrm{kW}$
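The electron impact source arithmetic in Eqs. (3) to (5) can be reproduced as a short script. This is a sketch under stated assumptions: the tungsten inputs are read as density in g/cm³, specific heat in J/(g·°C), thermal conductivity in W/(cm·°C), spot size in cm, and surface velocity in cm/s, a unit reading that the text does not spell out. The function names are hypothetical.

```python
import math


def total_transmission(fractions):
    """Product of the per-medium transmitted fractions along the beam path."""
    return math.prod(fractions)


def required_irradiance_mw_cm2(sensitivity_mj_cm2, exposure_s):
    """Eq. (3): irradiance needed to deliver the resist dose in the given time."""
    return sensitivity_mj_cm2 / exposure_s


def generation_efficiency(irradiance_w_cm2, distance_cm, power_w, transmission):
    """Eq. (4): x-ray watts per input watt per steradian."""
    return irradiance_w_cm2 * distance_cm ** 2 / (power_w * transmission)


def max_anode_power_w(rho, c_v, k, d, v, delta_t):
    """Eq. (5): P = (pi/4)^2 * sqrt(rho * Cv * k * d^3 * v) * dT."""
    return (math.pi / 4) ** 2 * math.sqrt(rho * c_v * k * d ** 3 * v) * delta_t


t_path = total_transmission([0.61, 0.96, 0.72, 0.88])   # Be, He, BN mask, polymer layer
phi = required_irradiance_mw_cm2(8.0, 5.0)
eta = generation_efficiency(phi * 1e-3, 18.0, 10e3, 0.38)
p_max = max_anode_power_w(19.3, 0.155, 1.99, 0.1, 2.2e3, 3345.0)

print(f"product of listed transmissions: {t_path:.2f}")
print(f"required irradiance: {phi:.1f} mW/cm^2")
print(f"generation efficiency: {eta * 1e6:.0f} uW/W-sr")
print(f"maximum anode power: {p_max / 1e3:.1f} kW")
```

The product of the four listed transmission fractions evaluates to about 0.37, slightly below the 0.38 used in Eq. (4); the script keeps 0.38 for the efficiency so that it matches the worked value.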
The incorporation of water cooling enables a P value increase from 10 kW to 20 kW to be realized.[12][13] High density, high melting point metals such as molybdenum and tungsten have high figures of merit for high power anodes. Rotation velocities of 7000 rpm to 8000 rpm are required. The vacuum condition of electron impact anodes is usually maintained via high transmission beryllium windows, which absorb very few of the emitted photons en route to the resist/wafer.
Plasma Sources. Those who work with plasma physics have known for some time that plasmas of multimillion-degree temperatures can emit intense x-ray pulses.[14]–[16] Such sources (beam and electron discharge heated types) deliver several to ten times more average irradiance (5 mW/cm² to 30 mW/cm²) on the resist/wafer than the electron impact anode type. As is desirable, characteristic x-ray lines (0.4 nm to 2.0 nm) can be generated through selection of target and gas materials and of the laser or electron discharge energy and properties. Some of the plasma sources tend to be relatively bulky and higher in cost; however, their potential for three to five times higher wafer level throughput makes them economically attractive. The laser plasma source spot is a fraction of the size of the source spot of either the electron discharge plasma or the impact anode types.
Laser-Beam Heated Plasma Source. In the most advanced laser plasma source system, a Nd³⁺:glass slab laser, pumped by an array of xenon flash lamps, pulses at up to a 2 Hz rate during exposures and delivers a pulse energy of 25 joules at a 1.054 µm wavelength. The focused laser beam impinges upon and vaporizes a thin iron alloy film target. The targets are indexed by a tape cassette drive to supply fresh metal for each pulse impact. The focused laser beam further couples its energy (1 × 10¹³ W/cm²) into the metal vapor, creating a dense microplasma at million-degree temperatures.
About 10% of the laser pulse energy is converted to a soft x-ray energy spectrum, producing copious photons (λ = 1.2 nm to 1.6 nm) suitable for photoresist absorption from a plasma spot size of only 150 µm to 200 µm diameter. The small spot size reduces the penumbral blurring factor and provides a design trade-off for the source-to-wafer distance factor. For a practical laser plasma x-ray system, the source-to-wafer distance is D = 10 cm, the gap is 20 µm, the pulse repetition rate is Rp = 2 Hz (pulses/s), and the total windowless (no beryllium membrane) transmission fraction, which includes both helium ambient and mask (1.0 µm silicon) absorption at a wavelength peak of λ = 1.2 nm, is TT = 0.59. The source x-ray energy release per pulse is Ep = 2.5 J/pulse. For a spot-type x-ray source flux distribution at the resist/wafer plane, the incident energy intensity per pulse, Ew, is:
Eq. (6)    $E_w = \dfrac{E_p\,T_T\,\omega}{4\pi D^2} = \dfrac{(2.5\ \mathrm{J/pulse})(0.59)\,\omega}{4\pi\,(10)^2\ \mathrm{cm^2}} = 1.17\,\omega\ \mathrm{mJ/cm^2/pulse}$
where ω is a factor to compensate for a nonuniform atmospheric helium distribution. For ω = 1, a uniform distribution, Ew = 1.17 mJ/cm²/pulse. For a resist with a sensitivity S = 24 mJ/cm², the number of high intensity laser pulses, Np, to achieve proper exposure is:
Eq. (7)    $N_p = \dfrac{S}{E_w} = \dfrac{24\ \mathrm{mJ/cm^2}}{1.17\ \mathrm{mJ/cm^2/pulse}} = 20.51\ \mathrm{pulses}$
Practically speaking, Np = 20 pulses. Since the laser pulsing rate is Rp = 2 Hz, the time for exposure, texp, would be:
Eq. (8)    $t_{exp} = N_p / R_p = \dfrac{20\ \mathrm{pulses}}{2\ \mathrm{pulses/sec}} = 10\ \mathrm{sec}$
The average irradiance φ on the resist is:
Eq. (9)    $\phi = E_w N_p / t_{exp} = \dfrac{24\ \mathrm{mJ/cm^2}}{10\ \mathrm{sec}} = 2.4\ \mathrm{mW/cm^2}$
Achieving an average irradiance φ of 15 mW/cm² necessitates multiple burst pulsing and/or increased laser power. With an irradiance of 15 mW/cm², an exposure time, texp, of less than 2 seconds per printing field results, which conforms to the criterion of high throughput. The windowless nature of the x-ray transmission path described above is unique and makes possible the delivery of almost half of the entire soft-type flux spectrum to the resist. Various resists, for example novolak, absorb a relatively high fraction (50%) of the soft x-rays. The x-ray output energy spectra and yield are controllable via target composition, laser pulse shape, peak power, focused spot size, target geometry, and the background gas.[17] Notice the extremely small penumbral blur δ [Eq. (1)] value of:
$\delta = \dfrac{s\,d}{D} = (25 \times 10^{-4}\ \mathrm{cm})(0.01\ \mathrm{cm})/(12\ \mathrm{cm}) = 21\ \mathrm{nm}$
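The laser plasma exposure chain of Eqs. (6) to (9), together with the blur estimate above, can be recomputed in a few lines. This is a sketch using the values quoted in the text; the isotropic point-source spreading over 4πD² follows Eq. (6), and the function name is hypothetical.

```python
import math


def energy_per_pulse_mj_cm2(pulse_energy_j, transmission, distance_cm, omega=1.0):
    """Eq. (6): per-pulse energy density at distance D from an isotropic spot source."""
    return 1e3 * pulse_energy_j * transmission * omega / (4 * math.pi * distance_cm ** 2)


e_w = energy_per_pulse_mj_cm2(2.5, 0.59, 10.0)   # about 1.17 mJ/cm^2/pulse
n_exact = 24.0 / e_w                             # Eq. (7) before rounding
n_p = 20                                         # practical pulse count used in the text
t_exp = n_p / 2.0                                # Eq. (8) at 2 pulses/s
phi = e_w * n_p / t_exp                          # Eq. (9), mW/cm^2
blur_nm = (25e-4 * 0.01 / 12.0) * 1e7            # Eq. (1) evaluated in cm, shown in nm

print(f"E_w = {e_w:.2f} mJ/cm^2/pulse, N_p = {n_exact:.1f} pulses")
print(f"t_exp = {t_exp:.0f} s, average irradiance = {phi:.2f} mW/cm^2")
print(f"penumbral blur = {blur_nm:.0f} nm")
```

Carrying unrounded intermediate values gives a pulse count near 20.4 and an average irradiance near 2.35 mW/cm², close to the rounded 20.51 pulses and 2.4 mW/cm² quoted in the equations.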
Electron Discharge Heated Plasma Source. Two types of electron discharge heated plasma sources have produced high density x-ray photon energies (1.0 keV to 2.0 keV). These photons are compatible with the component materials necessary to accomplish IC chip lithography.[18]–[20] These imploding plasma discharges (x-ray pulse duration = 20 ns) take place in large work chambers (0.1 mm3 to 0.2 mm3) coupled via special transmission lines to a very large oil-cooled inductance-capacitance tank circuit having shaping, compression, and output stages. The gas-puff Z-pinch configuration shown in Fig. 5 has been driven in a repetitive mode (1 Hz to 10 Hz) at 400–650 kA discharge currents of about 1.0 ms duration. The puff diode gas breakdown that releases the narrow x-ray photon pulse occurs during the upper leading edge of the high tank output current pulse. The krypton or neon gas "puffed" into the diode electrode region builds up and forms a hollow gas cylinder that is synchronized with the tank current discharge pulse as described above. After discharge, the residual gas is pumped from the chamber to allow the implosion cycle to repeat. The cylindrical plasma region, with an axial spot size of 1 mm to 2 mm, emits x-ray pulses of 100 J to 200 J amplitude and 20 ns width at repetition rates of 3 Hz to 10 Hz.[21][22] The photon energy depends upon the noble gas: krypton emits 1.6 keV to 2.0 keV at 0.6 nm to 0.8 nm wavelength, and neon gives 1.4 keV to 0.9 keV at 0.9 nm to 1.4 nm wavelength. Transmission through the confining beryllium window (10 µm to 15 µm), 18 cm of helium, the silicon nitride or boron nitride mask (1 µm to 2 µm), and the polyamide electron stopper (2 µm) delivers 2.0 to 4.0 mJ/cm²/pulse energy into the resist. For a resist sensitivity of 8 to 80 mJ/cm², the resist exposure is achieved by as few as two to forty pulses.
The printing of one mask field, therefore, requires a total exposure time ranging from 0.2 sec to 4.0 sec. By using a longer series of pulses, the undesirable effects arising from pulse-to-pulse non-uniformities can be statistically compensated. Hence, exposure times of a few seconds or more are required. The best imaging work has been reported at repetition rates of several Hz. Although some high quality resist imaging has been demonstrated, the size, cost, transmission tank and line durability, overall performance stability, and debris factor cause these intermediate power sources to require engineering enhancements to become IC production viable.
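The pulse counts and exposure times quoted for the Z-pinch source follow from the resist sensitivity and the delivered energy per pulse. The sketch below also illustrates the statistical compensation argument under a loudly stated assumption: pulse-to-pulse fluctuations are treated as independent, so the relative dose spread falls as one over the square root of the pulse count. The 10% per-pulse spread is a hypothetical figure, as are the function names.

```python
import math


def pulses_needed(sensitivity_mj_cm2, energy_per_pulse_mj_cm2):
    """Whole number of pulses required to reach the resist dose."""
    return math.ceil(sensitivity_mj_cm2 / energy_per_pulse_mj_cm2)


def exposure_time_s(pulses, repetition_hz):
    """Exposure time for a pulse train at a fixed repetition rate."""
    return pulses / repetition_hz


def relative_dose_spread(per_pulse_spread, pulses):
    """Spread of the summed dose, assuming independent pulse-to-pulse variation."""
    return per_pulse_spread / math.sqrt(pulses)


fast = pulses_needed(8.0, 4.0)     # most sensitive resist, strongest pulse
slow = pulses_needed(80.0, 2.0)    # least sensitive resist, weakest pulse

print(fast, exposure_time_s(fast, 10.0))
print(slow, exposure_time_s(slow, 10.0))
print(f"{relative_dose_spread(0.10, slow):.3f}")   # hypothetical 10% per-pulse spread
```

At a 10 Hz repetition rate the two cases give 2 pulses in 0.2 s and 40 pulses in 4.0 s, matching the range stated in the text.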
4.0 MASK TECHNOLOGY
In general, the x-ray lithography method requires a sharp photon aerial image to be formed at the resist using the irradiance from the x-ray system's source.[23][24] Large process latitude can result from masks with high contrast, steep absorber walls, and smooth absorber line edges. For overlay accuracy the mask needs accurate pattern placement, accurate alignment marks, high laser or light transmission for alignment, minimal distortion, and planarity. In Fig. 6(a) the quantities d, D, s, h, and R are as defined previously. The magnification factor is ∆R/R = s/D, and the penumbral blur varies as a function of R:
Eq. (10)    $\delta(R) = \dfrac{s\,d}{D} + \dfrac{h}{D}\left(R - \dfrac{d}{2}\right)$
Some flux transmission actually takes place through the top and bottom corners of the absorber pattern edges. Figure 6(b) shows that out-of-plane distortions δs of the mask or the wafer produce lateral x,y shifts of the transferred mask pattern data on the wafer/resist by an amount δr. An alignment error δalign results such that:
Eq. (11)    $\delta_{align} = \dfrac{R}{D}\,\delta_s$
For R = 2.0 cm, D = 18 cm, and δs = 100 nm, δalign = 11 nm.
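Eqs. (10) and (11) are simple enough to evaluate directly. In this sketch the alignment example uses the values from the text, while the inputs to the position-dependent blur (20 µm gap, 1.5 mm spot, 0.7 µm absorber height) are illustrative assumptions; the function names are hypothetical.

```python
def blur_at_radius(s, d, D, h, R):
    """Eq. (10): penumbral blur at field position R (all lengths in the same unit)."""
    return s * d / D + (h / D) * (R - d / 2)


def lateral_shift(R, D, delta_s):
    """Eq. (11): lateral error caused by an out-of-plane displacement delta_s."""
    return R / D * delta_s


# Alignment example from the text: R = 2.0 cm, D = 18 cm, delta_s = 100 nm.
print(f"alignment error = {lateral_shift(2.0, 18.0, 100.0):.1f} nm")

# Position-dependent blur with assumed inputs, all in cm.
blur_cm = blur_at_radius(s=20e-4, d=0.15, D=18.0, h=0.7e-4, R=2.0)
print(f"blur at R = 2 cm: {blur_cm * 1e7:.1f} nm")
```

The same R/D scaling applies to a gap positioning tolerance: a ±100 nm gap error at R = 2.0 cm maps to about ±11 nm of lateral run-out.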
Figure 5. Gas-puff Z-pinch diode; pinched plasma located at the center of the diode measures about 1 mm diameter and 2 mm axial length.
Figure 6. (a) Magnification and penumbral blur variation with position R in the mask field. (b) Alignment errors from out-of-plane mask or wafer distortions.
Absorber defect density must be ≤1 defect/cm² (including mask repair). In present-day thinking, for an MFS of less than 0.5 µm, defect densities should be less than 0.1 defect/cm². Through the life of the mask, the extent of in-plane dimensional distortions and the changes in membrane transmission coefficients must remain within close tolerance.
4.1 Minimum Line Width and Control
The major mask factors affecting minimum line width and control are contrast, mask transmission and uniformity, source energy spectra and uniformity, absorber line edge profile, pattern generation accuracy, distortion, geometric magnification, and, indirectly, the resist properties, including secondary electron range. Sharp aerial image (line edge) profiles depend chiefly upon the absorber line edge profile (Fig. 7). In addition, both the location of the absorber line edge and its direction within the exposure field cause small line width variations. Not only can line widths increase toward peripheral field locations, but a tilting of the resist walls also occurs, i.e., resist towers lean radially inward towards the source.[25] The absorber wall angles should be greater than 70° to keep the wall-smearing blur effect as small as possible and much smaller than the simple penumbral blur, δ, defined earlier in the chapter.
Achieving high mask contrast and high, uniform mask transmission depends directly upon material absorption coefficients and the physical thickness of the mask membrane, absorber, and electron stopper (if used). Absorption coefficients for some of the typical x-ray mask materials are shown in Fig. 8. Since the x-ray absorption in a material is proportional to its electron density, x-ray mask membranes need low atomic numbers. However, such low density elements/compounds lack other needed properties (Young's modulus, tensile strength, etc.), and hence intermediate density materials are made to suffice (silicon, silicon nitride, boron nitride, etc.). Window materials for source vacuum integrity and protection are low density/low atomic number elements (beryllium). Absorbers are high density/high atomic number elements (tungsten, gold, tantalum).
The mask contrast factor, Cm, can be thought of simply as the ratio of photon transmission through the clear mask membrane (non-absorber area), Tclear, to the transmission through the absorber area, Tabsorber. Usually the ratio is calculated on the basis of absorption coefficients corresponding to the x-ray source's characteristic radiation. Here:
Eq. (12)    $C_m = T_{clear}/T_{absorber}$
and assuming that:
λ = λwL = 6.98 Å, tBN = 2.5 µm, tAu = 0.7 µm. then:
Cm = e - 800 (2.5 × 10-4) /e · 5 × 10-4 (0.7) = 9.97
Mask contrast values ≥10 are adequate for the resolution of MFS and line width control.[26]
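The contrast ratio of Eq. (12) can be sketched with exponential (Beer-Lambert) attenuation, T = exp(−μt). The membrane inputs below (an 800 cm⁻¹ coefficient and a 2.5 µm thickness) are taken as read from the worked example and should be treated as assumptions; the script only derives what absorber-area transmission a contrast of 10 would require and does not assume a gold absorption coefficient. Function names are hypothetical.

```python
import math


def transmission(mu_per_cm, thickness_cm):
    """Beer-Lambert transmitted fraction through a uniform layer."""
    return math.exp(-mu_per_cm * thickness_cm)


def mask_contrast(t_clear, t_absorber):
    """Eq. (12): ratio of clear-area to absorber-area transmission."""
    return t_clear / t_absorber


def absorber_transmission_for_contrast(t_clear, target_contrast):
    """Absorber-area transmission needed to reach a target contrast."""
    return t_clear / target_contrast


t_clear = transmission(800.0, 2.5e-4)                        # membrane only
t_abs = absorber_transmission_for_contrast(t_clear, 10.0)    # contrast target of 10

print(f"clear-area transmission: {t_clear:.3f}")
print(f"absorber-area transmission for Cm = 10: {t_abs:.4f}")
print(f"check: Cm = {mask_contrast(t_clear, t_abs):.1f}")
```

With these inputs the clear-area transmission is about 0.82, so the absorber area must pass no more than about 8% of the incident flux for a contrast of 10.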
Figure 7. Absorber line profile; h = height, W = MFS (e.g.), and θ is the angle of the wall’s slope.
Figure 8. Absorption coefficient vs. photon wavelength for various x-ray mask window materials.
4.2 Overlay
The five major mask factors which contribute to the system overlay error were listed in Table 2. The design error budget allowance presented for each factor assumed the applicable masks were used in the example x-ray system cited. Differential run-out error due to gap variation from one printing level to a subsequent level produces an overlay error in level-to-level printing. However, capacitive gap-sensing alignment/positioning control on a field-by-field basis has shown gap positioning to a tolerance of ±100 nm. The run-out error is reduced by such precision to ±11.0 nm.
4.3 Throughput
The mask component affects the throughput by virtue of the absorption property of its membrane (plus the polyamide layer if it is used as an electron stopper and/or for strength). The photon energy must be transmitted through the non-absorbing mask areas (membrane) to provide an irradiance at the resist/wafer surface sufficient to yield the required throughput. In general, the field exposure time should range between 2 and 5 seconds to achieve economical throughput. For a resist with a 10 mJ/cm² sensitivity, an irradiance of 2 mW/cm² is required to complete proper exposure within 5 seconds.
5.0 MASK CONSTRUCTION
One of the current x-ray mask construction features has survived from the inception of the x-ray lithography method itself, i.e., the use of a starting material of an epitaxial n-n+ silicon wafer which is subsequently oxidized and then thinned by the etch back method.[27] From the beginning, the choice for x-ray membranes and absorbers has been to use intermediate-Z and high-Z materials, respectively.[28] The two major divisions of mask fabrication are the blank and the absorber/patterning. The absorber/patterning is performed either as a subtractive (etch) or additive (plating) process. The blank assembly typically starts with a silicon wafer which is oxidized, epitaxially layered, or coated with a deposited boron nitride layer, as shown in Fig. 9. The silicon wafer is etched back and epoxy mounted to a Pyrex or glass ring. After depositing a tantalum/gold plating base
(additive), a thick stencil resist is spun and baked, followed by a thin chromium etch mask layer. The blank is then completed by spinning on and baking an imaging resist, poly(butene-1-sulphone) (PBS). Variations from this particular procedure and materials have included stretching a transparent membrane over a silica or silicon mounting frame and the use of Mylar,[29][30] Kapton,[30] and titanium[31] as membrane materials. Other membrane forms include aluminum oxide,[32] silicon nitride,[33] silicon nitride/silicon dioxide/silicon nitride,[34][35] silicon oxynitride,[36] and triple layer membranes (tensile silicon-doped boron nitride with compressive layers of boron nitride).[37]
Figure 9. Mask blank fabrication process.
It is imperative to consider the interactions of the thermal and mechanical properties of the membrane and its support ring.[38][39] Considering Table 5 and the blank assembly described above, the thick Pyrex support ring, with an expansion coefficient of 3.3 × 10⁻⁶/°C, matches the silicon substrate closely (2.6 × 10⁻⁶/°C). Matching these coefficients with the membrane coefficient itself assures that the present membrane tension will change only a small amount as temperature fluctuates. Membrane coefficients for silicon nitride, silicon, silicon carbide, and boron nitride hydride indicate compatibility with the Pyrex/silicon combination, and these materials should therefore be used together. Since membranes with high elastic moduli can reduce local pattern distortions from high absorber stress, silicon carbide appears as the leading membrane candidate. The additive absorber patterning process starts with the exposure and development of patterns in the PBS resist, as in Fig. 10. The thin chromium
layer is etched to become a mask for reactive ion etching (RIE) of the stencil resist. After stripping the chromium layer, gold is plated onto the stencil resist mask. The resist is then stripped, followed by removal of the tantalum/gold plating base. The resulting gold x-ray mask absorber patterns have steep line edge wall profiles (Fig. 11). Numerous references exist which describe the electroplating process.[40]–[48]
Table 5. X-Ray Mask and Window Material Properties
Material | Coefficient of Thermal Expansion (°C⁻¹) | Young's Modulus (dynes/cm²)
BN | 1 × 10⁻⁶ | 1.33 × 10¹²
Si3N4 | 2.7 × 10⁻⁶ | 1.55 × 10¹²
Al2O3 | 9 × 10⁻⁶ | 3.73 × 10¹²
SiO2 | 0.4 × 10⁻⁶ | 7.38 × 10¹²
SiC | 4.7 × 10⁻⁶ | 4.57 × 10¹¹
Si | 2.6 × 10⁻⁶ | 1.62 × 10¹²
Be | 12.3 × 10⁻⁶ | 3.02 × 10¹²
Pyrex | 3.3 × 10⁻⁶ |
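The thermal matching argument can be made concrete by comparing expansion coefficients from Table 5. The sketch below estimates the differential free expansion between two materials over a given span; treating the parts as freely expanding, and using a 3 cm span and a 1 °C temperature change, are simplifying assumptions for illustration only. The names are hypothetical.

```python
# Coefficients of thermal expansion from Table 5, per degree C.
CTE = {
    "BN": 1e-6, "Si3N4": 2.7e-6, "Al2O3": 9e-6, "SiO2": 0.4e-6,
    "SiC": 4.7e-6, "Si": 2.6e-6, "Be": 12.3e-6, "Pyrex": 3.3e-6,
}


def mismatch_strain(material_a, material_b, delta_t_c):
    """Differential thermal strain between two materials for a temperature change."""
    return (CTE[material_a] - CTE[material_b]) * delta_t_c


def differential_shift_nm(material_a, material_b, delta_t_c, span_cm):
    """Differential free expansion over a span, in nm (1 cm = 1e7 nm)."""
    return abs(mismatch_strain(material_a, material_b, delta_t_c)) * span_cm * 1e7


for name in ("Si", "Si3N4", "SiC", "BN", "Be"):
    shift = differential_shift_nm(name, "Pyrex", delta_t_c=1.0, span_cm=3.0)
    print(f"{name:6s} vs Pyrex: {shift:6.1f} nm per degree C over 3 cm")
```

On this simplified basis silicon and Pyrex differ by about 21 nm per °C over 3 cm, whereas beryllium and Pyrex differ by about 270 nm, which illustrates why closely matched coefficients are preferred.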
Figure 10. Additive mask absorber patterning process via gold electroplating. (Courtesy Perkin-Elmer.)
Figure 11. Electroplated gold absorber lines: (a) dense 0.4 µm line/space arrays (0.8 µm pitch) on 1 µm boron doped silicon; (b) histogram of horizontal and vertical features on the reticle in (a) (Courtesy Hampshire Instruments); line/space (0.7 µm gold) on 5 µm boron nitride: (c) 0.5 µm and (d) 0.3 µm, deviation 0.040 µm from mean (3σ). (Courtesy Perkin-Elmer ALO.)
The subtractive absorber/patterning process shown in Fig. 12 starts with a blank prepared in a manner similar to that of preparing an additive absorber blank. Only one resist layer is exposed, developed, and used to
delineate absorber geometries in three etching steps: RIE of tantalum, sputter etch of gold, and RIE of tantalum.[49] A polymer is finally deposited over the entire mask area to act as a photoelectron/Auger-electron suppressor. Electron beam lithography is commonly used to write the submicron MFS absorber patterns. The RIE and sputter etch combination has been used to pattern a variety of multilayers, such as tantalum/gold/tantalum,[50] titanium/gold/titanium,[51] and chromium/gold/tantalum.[52] The sputter etch patterning suffers from redeposition of gold onto the pattern feature side walls, producing wall angles of 65–73°.[45][53] Vertical walls in 7000 Å gold were reported from reactive ion milling.[54] The additive process in general has the advantage of producing vertical wall features and less local mask pattern distortion as a result of the low stress electroplated gold.
5.1 Mechanical and Optical Distortions
Local membrane distortions occur within the mask's exposure field from the stresses caused by the placement of resist and absorber layers and the subsequent patterning of these layers.[47][55] Adjusting the membrane residual tensile stress between 5 × 10⁸ and 1 × 10⁹ dynes/cm² gives membranes flat to ≤1 µm and provides a control mechanism for compensating absorber induced tensile stresses of 2 × 10⁸ dynes/cm².[56] The low pressure chemical vapor deposition method (LPCVD) and its plasma enhanced counterpart enable close control of the boron nitride hydride and silicon nitride membrane layer properties. Varying the reactant gas ratios and substrate temperature affects absorption, density, stoichiometry, residual stress, and refractive index. In Fig. 13, the residual stress of boron nitride hydride deposited by LPCVD is made to vary between 1 × 10⁹ dynes/cm² tension and 8 × 10⁸ dynes/cm² compression by changing the diborane-to-ammonia flow ratio at various temperatures.[57] Increases in refractive index (2.0 to 3.0) and in optical absorption resulted. Using CVD methods, membranes of silicon nitride have been made with tensile stresses ranging from 2.5 × 10⁹ to 1 × 10¹⁰ dynes/cm².[35][44] An accepted way to measure film stress on a wafer is to measure the amount of curvature before and after removing the film coating from only one side of a wafer (see Fig. 14).[45][58][59] Measuring the bulge displacement of the membrane center point under differential pressure conditions also enables the calculation of stress. For least distortion of mask patterns, the membrane stress should be the dominant stress.[60] Larger membrane-to-film thickness ratios also provide a means of minimizing local mask distortions.
Figure 12. Subtractive mask absorber patterning process via tungsten on epitaxial silicon. (Courtesy Hampshire Instruments.)
Figure 13. Mask membrane stress vs. B2H6 /NH3 ratio as a function of temperature in an LPCVD process.
Figure 14. Mask membrane stress measurement using the radius of curvature technique: +13.44 µm (convex) bow with 4.55 µm film of BN on both sides; after BN removal from one side, the curvature became +16.32 µm (convex). (Courtesy Westinghouse.)
Local in-plane stress distributions create distortions of mask membranes and can shift line edge positions by small fractions of a micron.[59][60] Lateral movements of line edges and out-of-plane bowing (concave and convex) of membranes are generated in the subtractive absorber process.[63][64] Various mechanical spring models have been used to analyze specifically prepared absorber-membrane configurations, as shown in Fig. 15.[65] Here, L is the distance (mid-membrane), S is the substrate (membrane) stress, A is the absorber stress, and δw is the worst-case distortion. Then:
Eq. (13)    $\delta_w = \dfrac{(1-\nu^2)\,\sigma_a t_a}{E_m t_m}\,(1.25 \times 10^{-3})\ \mu\text{m}$
where ν is Poisson’s ratio, E_m is Young’s modulus of the membrane, σ_a is the absorber stress, and t_m and t_a are the thicknesses of the membrane and absorber.
Figure 15. (a) Distortion due to stress in gold covering half of the mask membrane. (b) Simple spring model used to estimate distortion due to worst-case pattern loading.
Fizeau and other interferometric evaluation methods show that out-of-plane distortions of silicon oxynitride membranes vary by as much as 2.0 µm.[36] Measurements taken over four months for gold patterns on boron nitride membranes (4.0 µm, tensile stress = 1.2 × 10⁹ dynes/cm²) indicate a mask stability with an overlay error of σ = 30 nm. Thermal and stress cycling of these bonded boron nitride membranes shows no fatigue, loss of tension, or bond slippage.[66] High integrated synchrotron fluxes of the order of 130 kJ/cm³ produce in-plane x,y stripe (1 cm wide absorber) pattern distortions of hydrogenated boron nitride masks ranging up to 1 µm. At doses of 200 J/cm², the optical absorption coefficient of the hydrogenated membrane (4 µm thick) increases and its transmission (λ = 632.8 nm) falls from 65% to 40%.[67] However, hydrogen-free mask membranes of BN/B3N/BN suffer only slight distortions of 10 nm, and transmission (λ = 632.8 nm) changes of less than a few percent, for doses of 200 kJ/cm³.[68]
5.2 Defects
An attractive feature of the x-ray lithography method is the possibility of immunity from dust particles[69] and certain defects (Fig. 16). Energy absorbing particles and defects are considered transparent (non-printable) if their size is within two times the size of the penumbral blur; for example, for a blur of 160 nm, particles or defects of size ≤320 nm do not print. This applies only to the clear portions of the mask, including clear areas adjacent to absorber edges. In the opposite case of non-absorbing defects, such as pinholes in absorber areas, immunity does not exist, and pinholes of penumbral size remain printable defects. In the subtractive gold patterning process on boron nitride hydride, printable defects fall into three types: missing gold absorber caused by pinholes in the resist, unetched gold absorber caused by an inadequate patterning mask, and opaque silicon residues caused by incomplete silicon backside edge removal.[70] Since dust particle (low-Z) immunity is a photon energy dependent effect, higher energy sources have the relative advantage of ignoring even larger particles (several µm). In addition to the lithographic process, wafer processing (for example, deposition, ion implant, and etching) and handling can result in defects that also reduce device manufacturing yield. Defects that repeat in the exposure field are undoubtedly due to the lithographic process. Randomly distributed defects result from other process steps or from wafer handling. Airborne dust particles are found to be predominantly low density, carbon-based particles that emanate from people and clothing. The large exposure latitude of the x-ray lithography method allows overexposure to be used to reduce the impact of lower contrast defects (particles) without appreciable line width change. In fact, the selection of exposure, resist, and processing can result in the elimination of both soft and certain hard defect printing errors.
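The printability rule stated above can be written as a small check. This is a minimal sketch under simplifying assumptions: the function name and arguments are illustrative, and pinholes in absorber areas are treated as always printable (no immunity), which is a simplification of the statement that pinholes of penumbral size remain printable.

```python
def is_printable(defect_size_nm, penumbral_blur_nm, absorbing=True):
    """Return True if a mask defect is expected to print.

    Absorbing particles in clear areas are treated as transparent when their
    size is within twice the penumbral blur. Non-absorbing defects (pinholes
    in absorber areas) get no immunity in this simplified model.
    """
    if not absorbing:
        return True
    return defect_size_nm > 2.0 * penumbral_blur_nm


# Example values from the text: blur of 160 nm, threshold of 320 nm.
for size in (200, 320, 400):
    print(size, "nm particle prints:", is_printable(size, 160))
```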
Figure 16. (a) Reticle mask with 0.4 µm line/space array in 0.4 µm thick gold absorber on 1.0 µm silicon membrane; a 0.4 µm soft (latex) defect partially obscures a space. (b) Wafer printed, without showing the defect, into 0.81 µm thick Hitachi RE500P resist. (Courtesy Hampshire Instruments.)
It has been argued that the production of 0.5 µm MFS IC chips, such as a 16 Mbit DRAM, will require printable defect densities of less than 0.1 defect/cm².[71]
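To give a feel for what a defect density of 0.1 defect/cm² implies, the sketch below applies the simple Poisson yield model, Y = exp(−D·A). This model is an assumption introduced here for illustration and is not taken from the text; the chip areas in the loop are hypothetical.

```python
import math


def poisson_yield(defect_density_per_cm2, chip_area_cm2):
    """Fraction of chips with zero printable defects under a Poisson model."""
    return math.exp(-defect_density_per_cm2 * chip_area_cm2)


# 0.1 defect/cm^2 is the figure quoted in the text; the areas are hypothetical.
for area_cm2 in (0.5, 1.0, 2.0):
    y = poisson_yield(0.1, area_cm2)
    print(f"chip area {area_cm2:.1f} cm^2 -> lithography-limited yield {y:.3f}")
```

This covers a single masking level only; a multi-level process would compound the loss at each level.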
5.3 Inspection
In the inspection of x-ray masks with a pattern MFS of 1.0 µm, the density and location of printable defects (0.3 µm–0.35 µm) were first evaluated manually by highly magnified optical means. These expensive and time-consuming inspections, made on a sample basis at rates of 1.0 cm² per 45 min, were too costly to be extended into the x-ray printing realm at 0.5 µm MFS. Comparison microscopes with synchronously driven stages were used to compare the mask and the printed wafer in order to determine printable defects attributable to the mask.[70] Commercial optical instruments, which compare the database input pattern to the absorber mask pattern, exhibit enhanced and broadened defect analysis capabilities, with detectable defects ranging down in size to slightly below 0.5 µm. Although defect detection, sorting, and mapping for any particular patterned mask set allowed the prediction of lithography process limits and yield to be used for IC product quality control, manufacturing technology was enhanced even more as mask repair techniques emerged. Laser ablation was adapted for removing isolated high-Z opaque defects. Focused ion beam milling provides high resolution (100 nm) trimming of excess gold from absorber features.[70] Pinholes in opaque absorber areas can be filled selectively by photochemical deposition using laser CVD and laser-enhanced plating techniques.[72][73] In the vein of standardization, defect map data recorded on tape and disk by an optical inspection machine have been made compatible with the input requirements for the drive control of an ion beam repair tool.[74]
5.4 Pattern Generation
The generation of high accuracy mask absorber patterns is absolutely critical to high resolution (0.5 µm MFS) x-ray printing. Special electron beam addressing (100 to 50 nm), electron spot size (150 nm) and intensity, and substrate handling/positioning are required to convert a standard commercial electron beam pattern generator (see Ch. 5)[75] to meet the requirements of 0.5 µm MFS x-ray mask patterning: ±50 nm to 74 nm placement accuracy and ±40 nm to 80 nm line width control. The most difficult criterion to meet is placement accuracy. In addition, for pattern transfer, tri-level resist processing (described in Ch. 2) is essential, in which a 400 nm NPR imaging resist layer, a 100 nm silicon dioxide layer, and a 1.4 µm HPR204 resist layer are stacked on the absorber. It is imperative to use proximity correction software to reduce the pattern edge smearing caused by back- and forward-scattered electrons from the impinging high energy (20 kV–30 kV) writing beam. Since a low intensity writing beam is used in order to achieve minimal spot size, the pattern exposure time suffers considerably; in some cases, two-pass superimposed exposures have been attempted to achieve the required absorbed resist energy density.
6.0 ALIGNMENT
The performance of the x-ray system aligner and mask components essentially determines the overlay capability (for example, ±100 nm, 2σ) of the overall x-ray system. Other relatively small overlay errors are contributed by the wafer and the source/alignment interface. The alignment task, if performed ideally, would position the mask and wafer in exactly parallel planes at the specified design wafer-mask separation, with the planes normal to the axis of radiation, and would contribute an aligner overlay error of ±60 nm maximum. Figure 17 shows the aligner coordinate system related to the six degrees of freedom of movement required to compensate for the major misalignments: lateral shift (εx, εy), in-plane rotation (εθz), out-of-plane tilts (εθx, εθy), gap setting (εz), and magnification. An inspection of Figs. 18 and 19 shows how all misalignment errors resolve into ∆X and ∆Y quantities. The aligner’s closed-loop servo features automatically compensate (during the x-ray mask field exposure) for these misalignments. Both x-ray and visible-optical (coherent and non-coherent) radiation have been used for sensing misalignment. Highly sensitive subsystems detect the relative position of the mask with respect to the wafer.[76] All receive optoelectronic signals from matched pairs of alignment marks placed at three or four orthogonal peripheral positions outside of the mask field, F, and at the corresponding locations on the wafer.[77] The use of x-ray flux in an alignment sensing scheme required special back etching of the silicon wafer to transmit the signal flux into a null x-ray detector that controlled the servo. Schemes subsequently shifted toward coherent or incoherent light for the generation of alignment signals.
Figure 17. Aligner coordinate system related to the six degrees of freedom of movement: translations occur in the x, y, and z directions and rotations occur about the x, y, and z axes.
Figure 18. In-plane overlay errors: x and y axis translation errors (a) εx, (b) εy, and (c) rotation error, εθz.
Figure 19. Out-of-plane overlay errors with source-to-wafer distance D: (a) mask-to-wafer gap error, εz, and tilts about the (b) x-axis, εθx, and (c) y-axis, εθy.
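How the individual misalignments of Figs. 18 and 19 resolve into ∆X and ∆Y at a given field point can be sketched as follows. This is a minimal illustration under stated assumptions, not the aligner's actual algorithm: it uses small-angle geometry, a point source on the z-axis at distance D, and it omits the two tilt terms. The function and parameter names are hypothetical.

```python
def overlay_error(x, y, eps_x, eps_y, eps_theta_z, eps_z, source_distance):
    """Return (dX, dY) at field point (x, y), all lengths in the same unit.

    eps_x, eps_y   : lateral shift errors
    eps_theta_z    : in-plane rotation error in radians (small-angle assumed)
    eps_z          : gap error; with a point source it acts like a
                     magnification (run-out) error proportional to radius
    source_distance: source-to-wafer distance D
    """
    runout = eps_z / source_distance
    d_x = eps_x - eps_theta_z * y + runout * x
    d_y = eps_y + eps_theta_z * x + runout * y
    return d_x, d_y


# Hypothetical numbers in micrometers: a point 10 mm off-axis, 1 um gap error,
# source 200 mm away, no shift or rotation.
dx, dy = overlay_error(1.0e4, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0e5)
print(f"dX = {dx * 1000:.0f} nm, dY = {dy * 1000:.0f} nm")
```

The run-out term shows why gap control matters more toward the field edge: the error grows linearly with distance from the axis.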
6.1 Interferometric Schemes
The Fresnel zone plate and grating method using laser radiation appeared first in a linear form, as shown in Fig. 20(a).[78] The relative positions of linear zone-plate patterns on the mask are measured with respect to linear diffraction gratings on the wafer [Fig. 20(b)]. The one-dimensional Fresnel-type lens focuses incident laser radiation into a line focus at the surface of the wafer. When the focused line aligns with the fine linear grating etched into the wafer surface, the grating diffracts the laser beam, causing negative and positive orders to emanate off the normal to the wafer surface (Fig. 21). The diffracted laser energy from the +1st and −1st orders is sensed by a photodetector. Exact alignment, i.e., the grating directly beneath the zone plate, is indicated when the photodetector senses the maximum amount of diffracted light. Three orthogonal zone-plate grating pairs are used for in-plane and θz positioning. A fourth orthogonal zone-plate grating pair is added for measuring expansion or contraction of the mask or wafer. A compatible capacitive gap sensing scheme works in conjunction with the interferometric alignment method, as in Fig. 22. Closed-loop positioning using this method has given an accuracy of 15 nm. Alignment by linear zone plates and gratings has required the incorporation of auxiliary coarse alignment techniques to handle initial misalignments of up to ±1 mm.[80]
Figure 20. (a) Linear zone-plate-grating alignment model. (b) Lateral alignment marks on the wafer and the mask. (Courtesy Perkin-Elmer.) [Panel labels: mask zone plate; wafer grating line; mask; wafer; overlaid alignment marks.]
Figure 21. Schematic of linear zone-plate-grating; alignment signals are diffracted from grating at the Littrow angle.
Figure 22. Incorporation of a linear zone-plate-grating alignment method with a capacitance gap-sensing scheme. (Courtesy Perkin-Elmer.)
A circular zone-plate scheme demonstrated[81] that focusing by zone-plate lenses on the mask, in conjunction with a third lens located on the wafer (etched in oxide) (Fig. 23), produced signals with a very large signal-to-noise ratio for various surfaces, oxide thicknesses, and resists.[82][83] The principle, as shown in Fig. 23, is that the condition of alignment generates a single resultant spot from the coincidence of the focusing action of the mask lens with that of the wafer lens. The circular zones are formed with radii $R_N = (fN\lambda)^{1/2}$, where f is the zone target focal length, N is an integer (maximum 15), and λ is the laser radiation wavelength. The focal length of the wafer lens (240 µm) is greater than the focal length of the mask lens (200 µm) by an amount equal to the mask-to-wafer gap distance (40 µm), as in Fig. 23(b). This allows the two-lens system to converge concentrically to the same spot. Using three sets of orthogonally placed zone targets, the measured alignment accuracy of ±250 nm (2σ) represents the global (full wafer) positioning capability for 100 mm fields in Fig. 23(c).[84] By digitizing television camera scans of the focused zone-plate spot (2.4 µm), center coordinate data are processed immediately and enter the servo control loop to drive five piezoelectric motors that move the x,y stage system. Six degrees of freedom of movement are maintained inside a constant temperature housing (±0.2°F). Magnification compensation for wafer or mask linear distortions is also rendered immediately with this aligner/stage system[85] via the z-axis gap spacing adjustment.
Other alignment approaches use only gratings. Matching gratings on the mask and on the wafer can be of different spatial period and diffract symmetrically from a normally incident laser beam.[86] The experimental step-and-repeat x-ray lithography system aligns mask to wafer when the symmetrically diffracted beam intensities are equal.
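The circular zone radii relation quoted earlier for the zone-plate targets, R_N = (fNλ)^(1/2), can be evaluated directly. The sketch below is illustrative only: the focal lengths, the gap, and the maximum N of 15 come from the text, but the laser wavelength is not stated for the alignment system, so a He-Ne value of 632.8 nm is assumed. The function name is hypothetical.

```python
import math


def zone_radii(focal_length_um, wavelength_um, n_max=15):
    """Radii R_N = sqrt(f * N * lambda) for N = 1..n_max, in micrometers."""
    return [math.sqrt(focal_length_um * n * wavelength_um)
            for n in range(1, n_max + 1)]


WAVELENGTH_UM = 0.6328           # ASSUMED He-Ne alignment laser; not given in the text
F_MASK_UM = 200.0                # mask lens focal length (from the text)
GAP_UM = 40.0                    # mask-to-wafer gap (from the text)
F_WAFER_UM = F_MASK_UM + GAP_UM  # wafer lens focal length, 240 um as in the text

mask_radii = zone_radii(F_MASK_UM, WAVELENGTH_UM)
wafer_radii = zone_radii(F_WAFER_UM, WAVELENGTH_UM)
print(f"mask  lens: R_1 = {mask_radii[0]:.2f} um, R_15 = {mask_radii[-1]:.2f} um")
print(f"wafer lens: R_1 = {wafer_radii[0]:.2f} um, R_15 = {wafer_radii[-1]:.2f} um")
```

Because the wafer lens has the longer focal length, each of its zones is slightly larger than the corresponding mask zone, consistent with the larger center target on the wafer in Fig. 23(b).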
The initial coarse alignment (±1 mm), achieved with Moiré fringe patterns, requires operator control and provides gap spacing information.[87] The theoretical fine alignment accuracy is 200 Å. Other work achieves dual grating detection of lateral position to within 10 nm and simultaneous gap detection to within 100 nm.[88] At a specific gap $s = P^2 m/\lambda$, where P is the grating pitch, λ is the wavelength of the laser radiation, and m is an integer, the sum of the +1st and −1st diffracted orders is at maximum intensity for a lateral displacement of P/2. Achieving lateral alignment therefore consists of detecting the maximum of the diffracted intensities.
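The gap condition for the dual grating method, s = P²m/λ, is easy to tabulate. In this minimal sketch the grating pitch and the wavelength are hypothetical values chosen for illustration (the text gives neither), and the function name is an assumption.

```python
def alignment_gap_um(pitch_um, wavelength_um, m=1):
    """Gap s = P^2 * m / lambda at which the summed first orders peak."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return pitch_um ** 2 * m / wavelength_um


PITCH_UM = 2.0          # hypothetical grating pitch
WAVELENGTH_UM = 0.6328  # assumed He-Ne laser wavelength

for order in (1, 2, 3):
    gap = alignment_gap_um(PITCH_UM, WAVELENGTH_UM, order)
    print(f"m = {order}: gap s = {gap:.2f} um")
```

The quadratic dependence on pitch means a modest change in grating period moves the usable gaps substantially, which is one reason the pitch has to be chosen together with the design gap.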
Figure 23. (a) Schematic of circular zone-plate x,y alignment and gap-sensing. (b) Alignment targets: center target placed on wafer, smaller targets placed on mask. (c) Orthogonal placement of targets on wafer/mask.
6.2 Non-Interferometric Schemes
Automatic mask alignment has also been achieved by non-interferometric light-optical methods requiring image processing and pattern recognition. One such system processes optical images to halftones in 256 grayscale levels.[89] Using incoherent illumination reduces phase contrast irregularities due to photoresist thickness or gap spacing variations; however, sufficiently sharp optical images of alignment marks in various planes (mask or wafer) must be realized. Automatic position detection requires the detection of straight and frequently low-contrast edges of the alignment targets. Image converters (video cameras) face the burden of contrast differences at edges that may supply data signals no larger than the noise amplitude. Using operational amplifiers, real time integration provides line-by-line integrals to a microprocessor for additional halftone processing. Experiments with the combination of a square-shaped annular target (~2.0 µm annular rim) on a wafer and a cross-shaped target (5.0 µm line width) on a mask show a repeatability of ±125 nm, whereas for alignment via multiples of the same alignment targets (3 × 3 matrix) a repeatability of ±40 nm is determined (with a television spatial resolution of 0.5 µm per line). Using a television spatial resolution of 0.25 µm per line and the same multiple targets, the measuring uncertainty is found to be a minimum of ±30 nm. The reduction of the measuring uncertainty ∆K is in accordance with $\Delta K = [2t(t-1)]^{-1}$, where t is the number of averaged single alignment targets. This light-optical alignment system, used with a synchrotron source, yielded a measured overlay accuracy (using one mask) of ±40 nm (3σ) in x and y.
The alignment mark targets consisted of 0.5 µm of silicon dioxide, 2 µm of pyrolytic silicon dioxide, and 1.5 µm of resist.[90]
Yet another light-optical non-interferometric alignment system, similar to the one above, incorporates television camera image detection of the matching alignment mark pairs (wafer, mask) (Fig. 24) via bifocal objectives, light path converters, differentiation of the video scan across the image line edges to derive control signals for a null (zero-detection), and auto-focusing and auto-positioning.[11] The distance between the double focal points is set with high accuracy to equal the gap spacing, s (10 µm). Setting the specified s value at the three orthogonal alignment target locations almost eliminates alignment errors caused by wafer flatness. The alignment system, with multiple servo controls for multiple-axis alignment, is driven by stepping the x,y wafer-stage motor and by electromagnetic-force-operated elastic plates for micromovement positioning of both the wafer and mask stages. The θ and z positioning is performed at the mask stage, and x,y lateral positioning at the wafer stage. Figure 25 shows lateral positioning data, ∆x, obtained from signal processing of the combined mask-wafer mark image according to $\Delta x = K\sum_n (a-b)$, where K is a conversion constant, n is the number of averaged scan lines, and a and b represent bilevel signal widths. Fine x,y lateral displacement control of the stage to 7 nm, combined with an alignment servo stability of 22 nm, enabled a step-and-repeat overlay accuracy of ±150 nm (2σ) to be achieved over a 100 mm wafer (two different masks were used). The total alignment time under microprocessor/computer control is 0.5 seconds.
Figure 24. Single-line marks are on the wafer surface and double-lined marks are on the mask surface.
Figure 25. Schematic of mark positioning error detection.
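The mark positioning error detection of Fig. 25 reduces to summing, over the scan lines, the difference between the two bilevel signal widths a and b and scaling by the conversion constant K. The sketch below follows that relation literally; the function name, the sample widths, and the value of K are hypothetical.

```python
def lateral_offset(a_widths, b_widths, k):
    """Delta-x = K * sum over scan lines of (a - b).

    a_widths, b_widths: bilevel signal widths measured on each scan line
                        (same length, same units, e.g. pixels)
    k                 : conversion constant from summed width to distance
    """
    if len(a_widths) != len(b_widths):
        raise ValueError("a and b must be measured on the same scan lines")
    return k * sum(a - b for a, b in zip(a_widths, b_widths))


# Hypothetical widths (pixels) on four scan lines and a hypothetical K (nm/pixel).
a = [12, 13, 12, 13]
b = [10, 10, 11, 10]
print(f"delta-x = {lateral_offset(a, b, 2.5):.1f} nm")
```

A null (a equal to b on every line) gives zero offset, which is the condition the servo drives toward.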
7.0 RESIST
The photoresist component of the x-ray lithography method is of great importance in determining the x-ray system’s achievable capability: minimum feature size (resolution), line width and edge control, and maximum throughput. The reader is directed to Ch. 2 for photoresist materials and processing concepts. The photoresist, flux intensity, and mask are related in the following way: high contrast positive resists make good line width control possible with lower contrast masks (i.e., the profile slopes of the x-ray flux intensity aerial image within the gap space can be less steep). These lower contrast masks are less difficult to manufacture. Such resists can also permit a smaller source-mask distance, with a resultant flux intensity increase, since the accompanying increase in penumbral blur would be compensated. The use of single level resists of appreciable thickness for more complete wafer topography coverage is also a trade-off. Positive resists in general are ten or more times less sensitive than negative resists; hence, throughput becomes a deciding factor in the selection of a resist for production use. If the relatively fast negative resists are used, then the aerial image profile slopes must be made steeper; hence, the mask absorber wall angles must be sharper, and mask fabrication once again becomes difficult. The penumbral blur should be made smaller, either by extending the source-mask distance or by reducing the gap spacing. The resulting flux reduction can be tolerated because it is compensated by the high resist sensitivity. The customary swelling of negative resists requires caution in lithography practice at MFS ≤ 0.5 µm, and as a result, complex multiple layer resist (MLR) processing is considered a worthy alternative.
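The trade-off described above between penumbral blur and flux can be made concrete with simple point-source geometry. This is a sketch under assumptions not spelled out in this section: blur is taken as gap × source diameter / source-to-mask distance, and flux as falling with the inverse square of the distance. All numeric values and function names are hypothetical.

```python
def penumbral_blur(gap, source_diameter, source_to_mask):
    """Penumbral blur for a finite source, in the same length unit as the inputs."""
    return gap * source_diameter / source_to_mask


def relative_flux(source_to_mask, reference_distance):
    """Flux at the mask relative to the flux at a reference distance (1/D^2)."""
    return (reference_distance / source_to_mask) ** 2


GAP_UM = 40.0          # hypothetical mask-to-wafer gap
SOURCE_UM = 3000.0     # hypothetical 3 mm source spot
REFERENCE_UM = 3.0e5   # hypothetical 300 mm reference distance

for distance_um in (1.5e5, 3.0e5, 6.0e5):
    blur = penumbral_blur(GAP_UM, SOURCE_UM, distance_um)
    flux = relative_flux(distance_um, REFERENCE_UM)
    print(f"D = {distance_um / 1000:.0f} mm: blur = {blur:.2f} um, relative flux = {flux:.2f}")
```

Doubling the distance halves the blur but quarters the flux, which is why a fast negative resist can absorb the flux penalty while a slow positive resist favors the shorter distance.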
Besides being judged by its sensitivity and resolution, an x-ray resist must be judged by its resistance to the reactive ion etching and other dry etches used in delineating the layers (silicon dioxide, polysilicon, silicon nitride, aluminum, etc.) that are deposited or grown on silicon wafer surfaces during IC fabrication. Resist contrast, as defined in Ch. 2, is a measurable resist parameter that indicates the resolution capability of a resist. A comparison of various x-ray resists is shown in Table 6. Although x-ray absorption in resists can be very uniform, the deposited energy density profile may not be. This nonuniformity can be caused by electron escape from the top resist surface and by backward energy dispersion into the resist from the silicon substrate. Higher energy x-rays, such as those from the Bremsstrahlung range, can produce more than five times the effective dose in the bottom of the resist as
Table 6. Resists Used in X-Ray Lithography

Resist            Symbol              Sensitivity   Resolution   Contrast   Dry Etch     Comments
                                      (mJ/cm²)      (µm)         (1)        Resistance
—                 PBS (+)             167           <1           1.27       Poor         Poor adhesion; Mead Tech., KTI Chem.
—                 COP (-)             35–200        <2           1.07       Poor         Mead Tech., Hunt
—                 PMMA (+)            5600–8500     <0.1         1.7–2.0    Poor         Dupont, Tokyo Ohka
—                 PCMS (-)            8             0.5          1.8        Good         Toyo Soda, HP, Mead Tech
Ray-PF            (+)                 50–200        <0.3         —          Good         Hoechst
WX-214            (+)                 500           <1           —          Good         Olin-Hunt
RX-242            (+)                 200           0.3          —          Good         Olin-Hunt (BESSY)
                                      400           4            —          Good         Olin-Hunt, Perkin Elmer
HPR-204           (+)                 8300          0.1          —          Good         Olin-Hunt
ECX-1029          (-)                 200–250       0.3–0.4      —          Good         Rohm & Haas, Novolak
ECX-1092          (-)                 10–20         0.2          Low        Good         Rohm & Haas, Novolak
EK-88 (771)       (-)                 9             1.0          1.3        Good         Kodak
Kodak 571         —                   —             0.75         High       Poor         —
—                 DCOPA (-)           14            1            —          —            Trilevel, BTL, 0.3% O2/N2 amb.
—                 DCPA + BABTDS (-)   4             0.5          Low        —            BTL, plasma dry develop
(Novolak)         IBM (+)             750           0.5          —          —            NMOS devices, IBM
RD-2000N          (-)                 1200          0.5          —          —            NMOS devices, IBM
PR1024MB          —                   150           0.5–0.75     —          —            MacDermid, Hampshire
RE-5000P Hitachi  —                   14–38         —            —          —
compared with the resist’s surface.[91] The unique resist tilting effect described previously, which arises at the field periphery with point sources (electron impact, plasma), can be controlled or minimized by the use of MLR processing. Placing the imaging resist layer in the MLR structure at an appreciable distance from the silicon substrate (i.e., increasing the bottom layer thickness) remedies the substrate backscattered dispersion effect. The MLR method also achieves steep resist profiles and MFSs of less than 0.5 µm via a uniformly thin and planarized imaging resist layer.[92]
8.0 METROLOGY
The progressive development and production of IC chips into the submicron range via x-ray lithography (as well as electron beam and optical lithography) required the extension and enhancement of the concomitant metrology base. The reader is directed to Ch. 3 for a description of the related issues affecting the growth of finer line metrology with respect to line width measurement. Overlay accuracy and line edge roughness are also important in determining the full capability of any lithography system. The three quality factors (line width control, overlay accuracy, and edge roughness) can be measured in terms of the printed resist image or in terms of its subsequent transferred likeness etched into a silicon surface layer (silicon dioxide, polysilicon, etc.). Obviously the latter method places some constraints upon the resist material and its processing. At submicron dimensions, edge roughness measurement (≤50 nm) lends itself more to the scanning electron microscope (SEM) type of measuring tool. In some cases, expensive and extremely precise electron beam pattern generators (MEBES II, III) have been used for the assessment of the three quality factors. Overlay accuracy is also accessible through optical vernier methods.[93][94] The reading of verniers via microscope viewing is slow and operator dependent, and necessitates high image magnification and double interpolation to achieve a measuring resolution of 25 nm (vernier increments of 100 nm). The optical vernier technique has some advantage as a nondestructive procedure. Overlay accuracy and line width can each be measured via simple electrical test devices (see Ch. 3). These fast and fully automated electrical techniques involve the deposition and delineation of single or double layer conductive films, which adds some complexity but enables accurate determination of lithography system performance status.[95][96]
9.0 X-RAY SYSTEM
The characteristics of the components of a near ideal x-ray system for submicron (≤0.5 µm) MFS are as follows.
Source: (i) high intensity, uniform, collimated, and stable photon flux, and (ii) energy compatible with both high mask contrast and resist absorption.
Masks: (i) high contrast and abundant clear area transmission, (ii) vertical and smooth absorber pattern walls, (iii) high absorber pattern accuracy, (iv) long term freedom from spatial or transmission distortions (from exposure radiation, mechanical creep, or otherwise), and (v) near zero opaque and/or pinhole defects.
Aligner: (i) rapid step-and-repeat movement, (ii) alignment and maintenance of mask and wafer parallel to each other and normal to the radiation flux, (iii) gap spacing maintained within close tolerances, and (iv) alignment in the x,y directions, to close tolerance, of the corresponding pattern geometries of the mask and wafer.
Resist: (i) high contrast and high sensitivity at the source energy, (ii) low defects, (iii) very good adhesion, (iv) very good dry etch resistance, and (v) compatible development and bake processes.
The source collimation criterion is also realized via the synchrotron storage ring features coupled with the use of favorable radiation reflective material properties (scanning mirror scheme) at the 0.6 nm to 1.5 nm wavelengths. Achieving some degree of collimation lessens some of the x-ray system design constraints. This relaxation simplifies gap spacing control, the mask-pattern absorber wall-edge profile, the source-to-mask distance, and the size of the mask field. These simplifications accrue merely by reducing the source-caused penumbral blur and the combined run-out and gap-variation wafer-to-mask misalignments.
The noticeable disadvantage of the collimated source is the loss of magnification compensation control of wafer and/or mask linear distortions.[97] Also, some Fresnel diffraction effects have been noticed.[98] The photon flux from electron storage rings (synchrotron) is unique in terms of its low dispersion. Such flux renders all of the above advantages, besides providing an intense photon density. Hence, although not completely collimated, the storage ring appears as the outstanding source candidate for achieving very high quality IC chip patterning[99] with unmatched throughput.[100] Table 7 summarizes various x-ray system development and manufacturer efforts. Examples of resist exposures are shown in Fig. 26.
Table 7. X-Ray Systems for IC Chip Lithography
Component
University and National Laboratories Univ. Wisconsin Brookhaven NL Stanford Univ.
Sources: Electron Impact: Stationary Rotating Plasma: Laser: Storage Ring: Conventional (warm)
0.8-1.0 GeV 125 mA Opt. & Mech. Scan
0.6-1.0 GeV
2.0 GeV 10-20 mA
Opt. Scanb
Superconductor (cold) Aligner: Orientation x,y detection Gap detection Masks: Full Field Step & repeat Resists: Positive Negative
Vert. LZP Capacitance Bn/B3 N/Bn,B3Nd B4NH/W(-)g B4NH/W(-)g Si, SiN
Vert.b Opt.b
Vert.
B-Si/Au(+), BN
MMA and, c,d
Prop. Novolakb
c, d
RD-2000Nb
PMMA,PBS and, c
a. No longer manufacturing
b. IBM leased line
c. Intel
d. BTL
e. Oxford Instruments Helios cold magnet ring for IBM site
f. Normal warm synchrotron ring built by Maxwell-Brobeck and installed at Louisiana State University
g. Hewlett Packard
h. CZP and LZP are circular and linear zone plate, respectively
Table 7. (Cont’d.) Component
IC Industrial Companies - X-ray System Developers IBM BTL Hughes Hewlett Packard
Sources: Electron Impact: Stationary Al, 0.83 nm
Pd, 0.44 nm; Pd, 0.44 nm; Al, 0.83 nm
Rotating
Plasma: Laser: Storage Ring: Conventional (warm) Superconductor (cold) Aligner: Orientation x,y detection Gap detection
GeVe
Horiz./Vert. Opt.
Horiz. CZP CZP
Horiz. Opt. Opt.
Horiz. Opt. Opt.
Masks: Full Field Step & repeat
B 3NH Mylar B-Si, BN SiC BN/B3N/BN
B4NH
Resists: Positive Negative
Novolak RD-2000N
COP, DCOPA
PMMA COP
PCMS
Table 7. (Cont’d.) Component
Hampshire
Equipment Manufacturers Micronixa Perkin-Elmera
Sources: Electron Impact: Stationary Rotating Plasma: Laser:
Pd,6kW,0.44 nm W,10kW,0.7 nm Nd:glass, iron alloy, 1.2– 1.4 nm
Storage Ring: Conventional (warm)f Superconductor (cold) Aligner: Orientation x,y detection Gap detection
Horiz. Optical
Horiz. LZPh Capacit. (Dyn.)
Horiz. CZPh CZP
Ti,SiC,BN/Au(-/+)
BN/Au(+/-)
B-Si/W(-),Au(+)
B-Si,BN/Au(+),
BN/Au(+)
AZ1350
ECX-1029 EK-88
Masks: Full Field Step & repeat Resists: Positive Negative
Figure 26. Resist exposures via: (a) electron impact source, tungsten M-line (1.0 µm ECX1029 resist); (b) laser plasma, 0.45 µm lines/spaces, ±0.05 µm (3σ), 1.1 µm Hoechst RAY-PF over 1.0 µm Al topography; and (c) synchrotron, 0.5 µm lines, ECX 1125, 38 mJ/cm². (Courtesy (1) Perkin-Elmer, (2) Micronix, (3) Hampshire Instruments, and (4) Rohm and Haas with Center for X-ray Lithography (Univ. Wisconsin), respectively.)
9.1 X-Ray Radiation Damage to IC Devices
Various works have evaluated the radiation sensitivity of MOS capacitors and MOSFET device structures.[101]–[103] IC device fabrication processes in general have long been questioned as to their role in inducing or enhancing a device’s radiation sensitivity. The energy of the x-ray lithography process has likewise been suspect enough to provoke a study in which n- and p-channel MOSFET transistors and MOS capacitors were exposed to 10 kW Al Kα radiation.[104] The devices were made using only photolithographic processes and subsequently dosed with various levels (0.375 J/cm² to 8.24 J/cm²) of accumulated x-ray flux. Capacitance-voltage (CV) plots[105] taken before and after device annealing indicated that the radiation-induced damage (fixed oxide charge and silicon/silicon dioxide interface states) could be removed by annealing. Similarly, plots of source-drain current vs. gate voltage revealed that radiation damage was present but curable by annealing. A 400°C anneal in hydrogen/nitrogen gas for thirty minutes was used. In other work, trapped charge damage has been detected via CV plots on MOS capacitor and p-channel MOSFET devices that received an accumulated x-ray dose from the four x-ray lithography exposure steps used in their fabrication.[8] Annealing removed the device damage caused by copper (0.9 keV) and aluminum (1.5 keV) radiation: immobile positive oxide charges (5 × 10¹¹ charges/cm²) and fast surface states (7 × 10¹¹ charges/cm²·eV at mid-band gap).[106] The annealing required ten minutes of exposure to a hydrogen/nitrogen or pure nitrogen atmosphere. The gate threshold (onset of source-drain current) shift from −20 volts to −3.2 volts produced by annealing agreed with the voltage shift in the CV plots. Another work evaluated the damage from the generation of neutral traps in the silicon dioxide of MOS devices fabricated with tri-level resist and x-ray lithography.
No annealing was performed. Such neutral traps, if present within the completed device oxides, can capture holes generated in a subsequent radiation environment (for example, gamma rays), giving rise to a negative gate voltage shift. The extent or rate of trapping and the threshold shifts depend upon the electric field in the gate oxide (Leff 0.3 µm to 5.5 µm).[107] In Fig. 27(a), the NMOS threshold shift is presented as a function of γ-ray dose in rads with gate voltage (electric field in oxide) as a
parameter. The performance of the above NMOS devices was compared with that of identically designed and fabricated devices made using tri-level resist photolithography [Fig. 27(b)].
Figure 27. Threshold shift of NMOS devices versus gamma dose: (a) with gate voltage as parameter; and (b) comparison of x-ray versus optical lithography damage.
One might expect the threshold voltage shift vs. γ-dose in rads for these identical devices to be very similar. The disparity in the plots shown can be attributed to the slight (accidentally produced) oxide thickness difference. Notice that if a correction factor for the oxide thickness difference is applied and plotted against the right-hand ordinate, a near-unity value emerges, i.e., an equal radiation sensitivity exists whether x-ray or optical lithography is used. In the first-mentioned radiation damage experiments above, lots of devices made by photolithographic methods were similarly evaluated for comparison purposes, and little or no difference in the extent of radiation damage was observed. Other energetic MOSFET fabrication process steps (plasma etching, etc.) appear to cause more radiation sensitivity than the x-ray lithography step. The use of direct-write electron beam lithography steps, on the other hand, has been reported to enhance the device radiation sensitivity.[108]
10.0 CONCLUSION FOR PART I
Within the IC device manufacturing industry, the design performance goals for x-ray systems in achieving a high-quality x-ray lithography manufacturing process are well established. Selecting the IC device type inherently specifies the MFS dimension, which in turn imposes the line width, line edge, and overlay accuracy dimensional tolerances of ±MFS/10 (3σ), ±MFS/7 (3σ), and ±MFS/5 (3σ), respectively. The interdependence of the x-ray system components (source, mask, aligner, and resist) and their design trade-offs in achieving the x-ray system performance goals are furthermore understood. The quest for the high-quality x-ray system specified above is made more difficult by the complicating factors of adequate IC wafer production, throughput, and reasonable costs for capital equipment, maintenance, and downtime. These factors complete the matrix of constraints within which the x-ray system designer must select and merge the components.
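The tolerance rule quoted above (line width ±MFS/10, line edge ±MFS/7, overlay ±MFS/5, all at 3σ) can be tabulated for the MFS nodes discussed in this chapter. The following is only an illustrative sketch; the function name `tolerances_nm` and the choice of nodes are assumptions made for the example, not part of the original text.

```python
def tolerances_nm(mfs_um):
    """Return the 3-sigma tolerances (nm) implied by a minimum feature size in micrometers."""
    mfs_nm = mfs_um * 1000.0
    return {
        "line_width": mfs_nm / 10.0,  # +/- MFS/10
        "line_edge": mfs_nm / 7.0,    # +/- MFS/7
        "overlay": mfs_nm / 5.0,      # +/- MFS/5
    }


# Illustrative nodes only (assumed for this example).
for mfs_um in (0.35, 0.25, 0.18):
    tol = tolerances_nm(mfs_um)
    summary = ", ".join(f"{name} +/-{value:.1f} nm" for name, value in tol.items())
    print(f"MFS {mfs_um:.2f} um: {summary}")
```

For a 0.25 µm MFS the line width entry evaluates to 25 nm, the same overall error budget quoted later in the discussion of the radiation distribution at the mask.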
Presently, optical lithography is the exclusive and crucial lithography method for the mainstream IC device production of 64 Mbit DRAMs and >500 MHz microprocessors requiring device MFS to 0.25 µm. During the quest to reach this plateau, major concerted x-ray-patterned device fabrication work (IBM and Motorola) and mask/resist efforts (ATT) were aimed at competing with optical lithography in the future production contest for MFS work extending to 0.18 µm and below. University (Wisconsin and Louisiana State) and National Laboratory (Brookhaven) programs also
supported this venture. However, the IBM Corporation’s successes at its Advanced Lithography Facility (ADF) have been the key in determining the future deployment of x-ray lithography in the forthcoming 0.18 µm production era. ADF, with its 22-beamline superconducting Oxford Helios synchrotron, has made: production runs of 64 Mbit DRAMs (defect free) on 200 mm wafers at a throughput of 4 wafers per hour with device MFS of 0.35 µm, 1 Gbit 4-level test sites at MFS of 0.18 µm, memory and logic chips with MFS of 0.15 µm, and R&D devices with MFS of 0.1 µm. Wafers of 300 mm have been adapted to production lithography. These ADF achievements, in concert with the efforts of partners ATT and Motorola, culminate a decade of endeavor on a national scale to establish x-ray lithography production feasibility. Production lithography advances cited by the 1999 International Technology Roadmap for Semiconductors group depicted a sequence of MFS nodes as 0.35 µm (1995), 0.25 µm (1997), and 0.18 µm (1999). Over many years, x-ray systems with electron impact, laser/plasma, pinch plasma, and other sources have not in the least challenged the production feasibility of optically based lithography systems. The throughput criterion has always weighed in favor of optical systems. Certainly a synchrotron source cost of $15M or more has prohibitive aspects. In addition, the emergence of a robust synchrotron-based lithography platform is thought necessary by the year 2001 in order to establish a production niche as the device developers make tool-selection decisions to fulfill the overall process requirements for 0.18 µm MFS. It appears now that scattering with angular limitation projection electron beam lithography (SCALPEL), laser-driven DUV, EUV, or other optical lithography tools (193 nm, 157 nm, or 130 nm) will dominate with continued and exclusive production acceptability.
Other synchrotron x-ray lithography efforts, either partially or totally offshore (especially in Japan), may help decide the future utilization of x-ray lithography. In the meantime, x-ray lithography efforts per se are presently minimized at ADF. Small-scale microwave/millimeter-wave IC work carries on, in which laser plasma/ablation and dense pinch plasma x-ray source lithographies are employed. The IC device designers’ conventional demand for rectangular device geometries (vias, source/drain regions, gates, etc.) may help determine the preferable lithography method (x-ray, optical, electron beam, ion beam) as the IC device MFS is extended below 0.25 µm. In this realm, the corners of geometric rectangles are difficult to maintain as right angles (90°). Perhaps
geometric corners with finite radii might even remedy some device performance instability, and such devices could work just as well with π/4-sized smaller vias, etc. Some expectancy exists also regarding wafer alignment mark stability and degradation under varied and hostile IC device process environments. Lithography methods for 0.25 µm MFS IC device production (64 Mbit DRAM) exist now; lithography methods and IC device processes for 0.18 µm MFS (256 Mbit DRAM) have been developed. An assessment of present nonconventional IC device research, such as quantum-well structures, points out that the need for MFS resolution below 0.25 µm is to be expected and will also include the 1 Gbit DRAM (0.15 µm, ca. 2001–2004). The fascinating high-powered synchrotron source text follows in Part II of this chapter. The electron-physics-oriented subject matter provides a scientific awareness of the full technical prospects and the potential limits of x-ray lithography. The author regrets that the above text does not include the very important data processing subject matter concerning the automated control networking of the x-ray printing tool (source, wafer input/output, and aligner integration): data flow electronics, software, data storage, diagnostics, etc.
PART II 11.0 SYNCHROTRON RADIATION SOURCES
11.1 Introduction
This part of Ch. 10 is organized as follows: we first review the basic principles of the emission of x-rays from electron storage rings; the properties of the radiation which are most relevant to x-ray lithography are then presented; and finally, we discuss the system used to relay the radiation from the storage ring to the exposure station. Without entering into a detailed discussion, it is necessary to mention that synchrotrons are considered by some to be the best sources to power the x-ray lithography steppers needed for advanced lithographies. These new
technologies are needed for the manufacturing of the high-density and high-volume semiconductor integrated circuits based on devices with dimensions of 0.35 µm and smaller.
11.2 Properties of Synchrotron Radiation
A Basic Accelerator. Synchrotrons and Electron Storage Rings (ESR) are accelerators capable of producing stable beams of particles of very high energy, in the million (MeV) to billion (or giga, GeV) electron volt range. The machines were first designed for basic high energy and nuclear physics research, but later found application in the industrial and medical fields. Yet another application is in the area of semiconductor processing, where other types of accelerators are commonly employed (for example, ion implanters). Figure 28 shows the concept of an accelerator from which several beam lines extract the radiation and direct it to the exposure stations. With some impropriety, we commonly refer to ESRs as synchrotrons. Their basic structure is quite different, but the properties of the radiation are the same; hence the terminology.
Figure 28. Basic accelerator structure.
Radiation Emission: Basic Process. An electron storage ring performs the function of capturing and storing electrons which are generated by an injector system. The electrons are kept in stable orbits that close upon themselves; these orbits are defined by the action of a series of magnets that are used to (a) define the orbit and (b) focus the electrons. Let us at first consider the simple case of a perfectly circular orbit of an electron moving in a completely uniform magnetic field, B. No work is performed by the magnetic field on the electrons, since the Lorentz force is perpendicular to the velocity, and the electrons would be moving at constant speed in a circular path. However, accelerated charges radiate electromagnetic energy at the expense of their kinetic energy.[109] This is indeed what happens in the case of electrons moving in a storage ring, where the losses must be compensated in order to achieve a stable system. The radiation emitted by low-energy electrons captured in a circular motion can be decomposed into two linear oscillations with angular frequency ω. The two oscillations are exactly 90° out of phase. Each one will give rise to a dipole emission pattern described in the far field by the classical equation:
Eq. (14)
$dP = \frac{e^2 R^2 \omega^4}{4\pi c^3}\,\sin^2\theta\,d\Omega$
where dP is the power radiated in the solid angle dΩ at an angle θ from the dipole axis.[109] In other words, electrons orbiting in a circular path are equivalent to a dipole antenna which generates the familiar pattern shown in Fig. 29. The total power radiated is obtained by integrating over the angles, obtaining:[109][110] Eq. (15)
$P = 2e^2 (2R)^2 \omega^4 / 3c^3$
When the speed of the electrons becomes very large, approaching c (the speed of light), these formulae lose validity as a new effect sets in. This is the Lorentz contraction of the emitted radiation as seen by an observer at rest in the lab reference frame. Following the treatment of A. A. Sokolov and I. M. Ternov, let us consider the situation shown in Fig. 30, and more graphically in Fig. 31, where an observer, O, looks tangentially to the electron trajectory. First, we note that the wavelength of the radiation as seen by the observer will be blue shifted by a Doppler effect factor:
Figure 29. Emission pattern from a classical dipole.
Figure 30. Energy-angle plot for SR source.
Figure 31. Synchrotron radiation emission cone. Emission pattern from a highly relativistic electron.
Eq. (16)
$\frac{1}{1 - v^2/c^2}$
i.e., γ² in machine physics terminology (note: the value of γ is typically around 1000, so that the electrons are moving with a velocity v = 0.999999 c). Second, another important consequence of the relativistic electron speed is the “folding” of the radiation pattern, whereby the angles are compressed by a factor of γ along the instantaneous electron velocity. These two effects, combined, result in a profound change in the radiation spectrum, i.e., the way in which the power emitted is distributed among the different wavelengths. The folding of the radiation will give rise to a narrow emission cone, of aperture Δθ = 1/γ (Fig. 31). To the observer, O, the electron will appear to sweep by for a duration τ = RΔθ/c = R/γc. We recall that the frequency spectrum of a time-dependent signal (such as a pulse) is obtained by taking the Fourier transform of the signal itself; in the case of synchrotron radiation, the Fourier transform of such a pulse will extend to frequencies as high as 1/τ, i.e., proportionally to γ. We can conclude then that the revolution frequency of the electron, ωR = c/R, is shifted toward the higher frequencies by a factor γ³, coming about because of the narrow angle of emission (contributing a factor of γ) and of the blue shift (contributing γ²). The radiation becomes a continuum (the pulse’s Fourier transform) centered around ωC ≈ ωRγ³; the frequency bandwidth extends up to ω ≈ 5ωC. These simple arguments are substantiated by a more formal analysis,[110] leading to the definition of a “critical energy,” εC, which is very close to the value obtained above: Eq. (17)
$\varepsilon_C = \hbar\omega_C$
Eq. (18)
$\omega_C = \frac{3c}{2R}\,\gamma^3$
In practical units: Eq. (19)
$\lambda_C\,(\text{Å}) = 5.59\,R/E^3$, and
Eq. (20)
$\varepsilon_C\,(\text{eV}) = 12{,}398/\lambda_C$
with E in GeV, I in A and R in m. The critical energy is the median energy in the total power spectral distribution curve and, as such, an important figure of merit of an electron storage ring; half of the power is radiated at energies larger than εC. The amount of energy radiated by the electrons is also very large because of the very efficient coupling between high energy
electrons and the electromagnetic field. Since the electrons emit incoherently with respect to each other, the total radiated power is simply obtained by adding together the radiation from each electron. We obtain: Eq. (21)
$P = 88.5\,E^4 I/R$ kW
The power is radiated over the entire orbit, so that the power per milliradian of horizontal orbit is: Eq. (22)
$P = 14.1\,E^4 I/R$ W/mrad
An electron storage ring is essentially a very efficient x-ray antenna; this explains why it is all but impossible to outperform the conversion efficiency of these machines. Since the spectrum emitted depends only on the basic laws of electrodynamics, it is possible to predict exactly the energy and photon flux from these sources. (Note: ESRs are used as primary standards for the calibration of detectors in the XUV. The National Institute of Standards and Technology maintains an ESR, SURF II, primarily for calibration purposes.) By inspecting the above equations, it can be seen that the properties of the radiation depend only on any two of the three variables (E, B, R), since: Eq. (23)
$R = 33.35\,E/B$ m
for relativistic electrons, with B in kG. The spectral distribution of the radiation can be obtained analytically and is a fairly complex expression involving a Bessel function of fractional order;[110] however, it is possible to rewrite the emission spectrum in terms of the reduced variable λ/λC and, by normalizing the power with the electron energy, to obtain a universal curve. For example, we can rewrite Eq. (21) as: Eq. (24)
$P = 15.83\,\varepsilon_C E I$ kW
with the usual meaning of the variables. It is important to note how it is possible to obtain the same radiation (both in spectrum and in total power) from two very different machines, as long as εC and the product EI are the same. Thanks to these scaling properties, it is then possible to compute the flux only once and then re-scale it for the particular case being considered. Computer programs have been developed to exactly model the source and to easily generate tables and plots.[111]
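The scaling property described above can be checked numerically from Eqs. (19) to (21) alone. The sketch below compares two hypothetical rings (the parameter values and the names `ring_a`, `ring_b` are assumptions chosen for illustration) that share the same E³/R, and therefore the same critical energy, and the same product EI.

```python
def critical_energy_ev(e_gev, r_m):
    """Critical photon energy in eV, from Eqs. (19) and (20)."""
    lambda_c = 5.59 * r_m / e_gev**3  # critical wavelength in angstroms
    return 12398.0 / lambda_c


def total_power_kw(e_gev, i_a, r_m):
    """Total radiated power in kW, from Eq. (21)."""
    return 88.5 * e_gev**4 * i_a / r_m


# Two hypothetical rings chosen so that E^3/R and E*I coincide.
ring_a = {"e_gev": 1.0, "i_a": 0.10, "r_m": 2.0}
ring_b = {"e_gev": 2.0, "i_a": 0.05, "r_m": 16.0}

for name, ring in (("A", ring_a), ("B", ring_b)):
    eps_c = critical_energy_ev(ring["e_gev"], ring["r_m"])
    power = total_power_kw(ring["e_gev"], ring["i_a"], ring["r_m"])
    e_times_i = ring["e_gev"] * ring["i_a"]
    print(f"ring {name}: eps_c = {eps_c:.0f} eV, E*I = {e_times_i:.2f} GeV*A, P = {power:.3f} kW")
```

Both rings report the same critical energy and the same total power, even though their energies and radii differ considerably.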
A reduced spectral distribution is shown in Fig. 32 and can be used to compute the actual flux from any machine. The power radiated is given per milliradian of orbit. For example, let us consider a machine with E = 1 GeV, R = 2 m and I = 0.1 A. Such a ring will have εC = 1109 eV, λC = 11.18 Å and will radiate 0.705 W/mrad. If we keep in mind that an exposure beam line will accept around 30 mrad of radiation, more than 21 watts will be delivered to the beam line. It is important to notice that while the median energy corresponds to the wavelength 11.18 Å, the spectrum extends well into the visible (and even infrared) on the longer wavelength side. The synchrotron radiation spectrum falls off more rapidly toward shorter wavelengths (higher energies) than it does toward the longer ones. It can be shown[110] that the power emitted behaves as ω^(1/3) for ω → 0 and ω^(1/2) e^(−ω) for ω → ∞. A question of importance in x-ray lithography is the average number of photons that are generated in a given window of energy. It can be shown that for the whole spectrum the average photon energy is given by:[112] Eq. (25)
$\langle\varepsilon\rangle = \frac{8\,\varepsilon_C}{15\sqrt{3}}$
Eq. (26)
$N = 2.03 \times 10^{19}\,P/\langle\varepsilon\rangle$
with P in watts and ε in eV. For the above case, we would then have 1.3 × 1017 photons per second of average energy 341 eV. In general the average number of photons depends on the bandwidth selected.
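The worked example above (E = 1 GeV, R = 2 m, I = 0.1 A, 30 mrad accepted by the beam line) can be reproduced with a few lines of code. This is a minimal sketch using only Eqs. (19), (20), (22), and (25); the function name `ring_summary` and the dictionary keys are assumptions introduced for the example.

```python
import math


def ring_summary(e_gev, r_m, i_a, acceptance_mrad):
    """Evaluate the practical-unit formulas for one storage ring."""
    lambda_c = 5.59 * r_m / e_gev**3                      # Eq. (19), angstroms
    eps_c = 12398.0 / lambda_c                            # Eq. (20), eV
    p_per_mrad = 14.1 * e_gev**4 * i_a / r_m              # Eq. (22), W/mrad
    p_beamline = p_per_mrad * acceptance_mrad             # power into the beam line, W
    mean_energy = 8.0 * eps_c / (15.0 * math.sqrt(3.0))   # Eq. (25), eV
    return {
        "lambda_c_angstrom": lambda_c,
        "eps_c_ev": eps_c,
        "power_w_per_mrad": p_per_mrad,
        "power_beamline_w": p_beamline,
        "mean_photon_energy_ev": mean_energy,
    }


# Machine parameters taken from the example in the text.
summary = ring_summary(e_gev=1.0, r_m=2.0, i_a=0.1, acceptance_mrad=30.0)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

The critical wavelength, critical energy, power per milliradian, beam line power, and average photon energy returned by this sketch agree with the values quoted in the text to the precision given there.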
Figure 32. Radiation spectrum emitted by a storage ring.
In summary, the spectral distribution of the radiation emitted by a synchrotron or by an electron storage ring is characterized by a very wide frequency distribution, extending from the infrared to the x-rays. The power radiated is also large, making the electron storage ring a very efficient x-ray source.
Angle Distribution. Another important property of the radiation emitted by ESRs is its angular distribution. As we have briefly discussed above, the Lorentz transformation contracts the “figure 8” shape typical of the dipole pattern into a narrow beam of aperture 1/γ. The motion of the electrons along the orbit makes the narrow cone “sweep” horizontally, thus leading to a very uniform horizontal distribution. Vertically, the beam is instead characterized by a distribution of width ~1/γ, as shown for our typical machine in Fig. 33, where results of a Monte Carlo simulation of the radiation generation are compared with an experimentally measured power profile. The shape closely resembles that of a Gaussian function. The distribution in energy of the photons generated in the process is not uniform either, as illustrated in Fig. 34, which shows in more detail the relation existing between photon energy and emission angle. From this model we can note: (a) how many more photons are generated at low energies, (b) how close they are to the orbit plane, and (c) that the ones of lower energy subtend a wider vertical angle. The strong chromaticity of the source is typical of the synchrotron radiation process, and in first approximation we can write that photons of wavelength λ will be emitted with a distribution of standard deviation given by:[110][113]
Eq. (27)
$\theta = \frac{1}{\gamma}\left(\frac{\lambda}{\lambda_C}\right)^{0.435}$
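To give a feel for the chromaticity expressed by Eq. (27), the sketch below evaluates the vertical angular spread at a few wavelengths. The value γ = 1957 is an assumption corresponding to 1 GeV electrons (E divided by the 0.511 MeV electron rest energy), λC = 11.18 Å is taken from the example ring discussed earlier, and the function name and sample wavelengths are illustrative only.

```python
def vertical_spread_mrad(wavelength_a, lambda_c_a, gamma):
    """Eq. (27): standard deviation of the vertical emission angle, in mrad."""
    return 1000.0 / gamma * (wavelength_a / lambda_c_a) ** 0.435


GAMMA = 1957.0    # assumed: 1 GeV electrons, E / 0.511 MeV
LAMBDA_C = 11.18  # critical wavelength of the example ring, angstroms

# Sample wavelengths in angstroms, from hard x-rays to the soft x-ray/XUV range.
for wavelength in (2.0, 11.18, 50.0, 500.0):
    spread = vertical_spread_mrad(wavelength, LAMBDA_C, GAMMA)
    print(f"lambda = {wavelength:7.2f} A -> sigma = {spread:.3f} mrad")
```

At the critical wavelength the spread reduces to 1/γ, about half a milliradian for this ring; shorter wavelengths are confined more tightly to the orbit plane and longer ones spread more widely.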
The narrow vertical emission angles have led to considering synchrotron radiation as being “collimated.” This statement is an approximation, as shown above. It is, however, true that the opening angles are very small (a fraction of a milliradian), particularly in comparison with other sources. Horizontally, these effects exist as well, but are averaged out by the motion of the electrons. Scaling relationships exist as well for the angle distribution, and the dependence of the photon energy spectrum on the observation angle is quite apparent in Fig. 35, which shows the power emitted at different (reduced) angles from the orbit plane as a function of the photon energy, while Fig. 34 shows the same data, but as a function of the
Figure 33. Vertical beam (50 mA) power distribution as observed at a target located at 4 m. The histogram shows experimental data and the smooth curve results from a theoretical calculation.
Figure 34. Photon flux as a function of elevation angle for different wavelengths.
Figure 35. Angle of emission vs. photon energy for a synchrotron radiation source.
emission angle for different photon energies. The curves can be scaled to any machine by making the appropriate substitutions in the units. The most important observation is that harder radiation is always emitted closer to the orbit plane. We briefly note the strong linear polarization of the radiation. To an observer located exactly in the orbit plane, the radiation would appear to be 100% polarized horizontally. This can be understood if we recall the separation of the circular motion into two harmonic oscillations exactly 90 degrees out of phase: to the observer exactly in the plane, the second oscillation (vertical from the observer’s point of view) will be all but invisible. Formally, θ = 0 for this perpendicular component (cf. Fig. 29), so that no emission is observed on axis. As soon as the observer moves off-axis, the second oscillation becomes “visible,” so that a perpendicular component is detected; however, because of the narrow width of the overall opening angle, the radiation quickly fades. By integrating over the whole spectrum[110] we find that the degree of polarization can be obtained from the ratio between parallel emission and perpendicular emission: Eq. (28)
$\alpha = W_\pi / W_\sigma = 1/7$
This would give a polarization degree of exactly 75%. Although polarization effects are important in several instances, they do not appear to play any significant role in XRL, so we will not consider the argument further.
Equivalent Source. The emission of radiation is a random process governed by Poisson statistics, controlled by the deterministic laws described above. The results we have so far derived apply to the case of an isolated electron moving on the central orbit. The electrons stored in an ESR do not all have exactly the same parameters, but rather are distributed in a random way around some average values. The distribution of the electrons’ trajectories is governed by statistical laws, so that, for example, the distribution describing the deviation of the electrons from the standard orbit is to a very good approximation a Gaussian function of the form: Eq. (29)
$N(x, x') = \frac{N_0}{2\pi\sigma_x\sigma_{x'}}\exp\left(-\frac{x^2}{2\sigma_x^2} - \frac{x'^2}{2\sigma_{x'}^2}\right)$
where (x, x′) are the electron transverse coordinate and direction in the horizontal plane, σx (σx′) is the standard deviation in x (x′), and N0 is the number of electrons per second in the bunch. (Note: In statistical mechanics, one should use a coordinate q and its canonical conjugate momentum p = ∂L/∂q̇.) For a mono-energetic beam of small aperture, if x is the coordinate chosen,
then px = p sin(x′) ≈ p x′, where x′ is the Cartesian angle. Since we assume that p = const., we can drop it and consider only the pair (x, x′). A similar equation applies in the vertical direction. Equation (29) is valid at a waist location, i.e., at a position along the orbit where the electrons come to a focus (of dimension σx, usually called the waist); at a position s removed from the waist we must substitute x → x + x′s to obtain the new distribution (a wider Gaussian shape). The standard deviation of the beam at the new position is given by: Eq. (30)
$\sigma_x^2(s) = \sigma_x^2(s_{xo}) + (s - s_{xo})^2\,\sigma_{x'}^2$
Eq. (31)
$\sigma_z^2(s) = \sigma_z^2(s_{zo}) + (s - s_{zo})^2\,\sigma_{z'}^2$
where the variables sxo, szo are the locations of the electron beam waists in x, z measured along the orbit (s) (in general different). An excellent discussion of the beam optics is found in the work of G. K. Green.[114] If we assume that the x-rays are emitted by the electrons exactly along their instantaneous velocity, then from the point of view of an observer located at some distance, D, from the tangent point, the x-rays will appear to form a patch with extension σx in the horizontal and σz in the vertical exactly as in Eqs. (30) and (31), with s = D. In general we can use a form similar to Eq. (29) to represent as well the propagation of the x-ray beam. Equation (30) can then be used to describe the shape of the x-ray beam during its propagation. In a real machine, the photons are not generated exactly along the electron trajectory, but rather at some random angle whose distribution is consistent with the radiation distribution (for example, Fig. 34).[112][115] The radiation angles always refer to each individual electron’s instantaneous orbit plane,[115] so that if the electron orbit forms an angle α with the reference or central orbit at the emission point, then the radiation lobe is oriented along the orbit at the same angle α. The radiation pattern is then the convolution of the radiation fan and of the electron directions at that orbit location. If we approximate the radiation with a Gaussian distribution, we can make use of the fact that the convolution of two Gaussians is still a Gaussian, of standard deviation given by: Eq. (32)
$\sigma_T^2 = \sigma_R^2 + \sigma_{x'}^2$
where σR is the standard deviation of the radiation. The angle distribution used in Eq. (31) is thus augmented by the photon angular distribution. For machines of low beam energy, the photon energy spread normally dominates.
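Equations (30) to (32) combine into a short recipe for the apparent beam size at a distance from the source: add the radiation and electron angular spreads in quadrature, then add the resulting divergence term to the waist size in quadrature. The sketch below illustrates this; the numerical values are hypothetical and chosen only for the example, and the function names are not from the original text.

```python
import math


def total_angle_sigma(sigma_radiation, sigma_electron_angle):
    """Eq. (32): convolution of two Gaussian angular distributions (same units in and out)."""
    return math.hypot(sigma_radiation, sigma_electron_angle)


def beam_sigma(sigma_waist_mm, sigma_angle_mrad, s_m, s_waist_m=0.0):
    """Eqs. (30)-(31): rms beam size in mm at position s (mrad times m gives mm)."""
    drift_mm = (s_m - s_waist_m) * sigma_angle_mrad
    return math.hypot(sigma_waist_mm, drift_mm)


# Hypothetical vertical-plane values, for illustration only.
sigma_z_mm = 0.3         # electron beam size at the waist
sigma_z_angle_mrad = 0.1  # electron angular spread
sigma_r_mrad = 0.5        # radiation opening angle, roughly 1/gamma
distance_m = 10.0         # source-to-observer distance D

sigma_t = total_angle_sigma(sigma_r_mrad, sigma_z_angle_mrad)
size_at_d = beam_sigma(sigma_z_mm, sigma_t, distance_m)
print(f"total angular spread: {sigma_t:.3f} mrad")
print(f"vertical beam size at {distance_m:.0f} m: {size_at_d:.2f} mm")
```

With these assumed numbers the radiation opening angle dominates the total divergence, and the beam size at 10 m is set almost entirely by the divergence term rather than by the waist size.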
In conclusion, the physical aspect of the synchrotron radiation source is that of a Gaussian distribution of point sources with a distribution given by Eq. (31) and with an angle aperture following Eq. (32). An example is illustrated in Fig. 36.
Figure 36. A typical x-ray source, illustrating the original cross-section and the phase space at the origin and after a 10 m propagation.
A very important figure of merit of any optical source is brightness, the density in phase space of the ensemble of rays generated by the source. In general, the higher the brightness the “better” the source. For example, both point (σx = 0) and collimated (σx′ = 0) sources occupy zero phase space, albeit for opposite reasons. For a given beam of rays propagating through an optical system, the well-known Smith-Helmholtz invariance holds (in the paraxial approximation x′ ~ sin x′):[116] Eq. (33)
x x´ = const.
When extended to an ensemble of No rays, this reduces to the Liouville theorem stating the conservation of the phase space density of trajectories.[117] If we now define the brightness: Eq. (34)
$B(x, x') = \frac{N(x, x')}{2\pi\sigma_x\sigma_{x'}}$
where N is the photon flux (photons/s) of Eq. (29), it is easy to convince ourselves that B(x, x′) must be conserved following the Liouville theorem, since it is nothing other than the density of the photons in the space of the coordinates (x, x′). The flux N(x, x′) is proportional to the electron beam density in phase space and includes the source radiation angle standard deviation. These effects are easily observed in Fig. 36, where the evolution of phase space is easy to follow. The ellipses describing the (x,z) spaces are changed after propagation, but the total area is constant. The radiation emitted by storage rings has very high brightness in comparison to other sources because of both (a) small source size and (b) small included angle. This effect is further enhanced in wiggler and undulator sources.[113] Another important issue is that of the source coherence. The degree of coherence essentially measures the capability of two radiation beams to interfere; a fully coherent source will exhibit strong modulations in the interference region that will be absent in the incoherent case. The details are somewhat complicated and beyond this discussion, but we would like to note that because of the broad spectrum the radiation is temporally very incoherent. Conversely, the small source size reduces the spatial coherence only slightly because of the long optical distances involved; for all practical purposes the radiation can be considered as being emitted from a small volume containing independent radiators. A typical x-ray optics calculation would then start with the computation of the electric field due to a
monochromatic point source to obtain the monochromatic intensity, including diffraction and physical optics effects; this calculation must then be repeated for all the wavelengths involved and the intensities added together. Finally, the calculations should be repeated for all the points defining the source, a quite formidable task. Since the source is formed by an array of independent (thus incoherent) radiators, we can use the convolution theorem to add the processes together. The convolution theorem allows a separate calculation of the diffracted image and of the geometrical source effect, followed by a simple convolution integral, thus greatly simplifying the computational steps necessary. These steps are implemented in existing codes that produce a full 2-D or 3-D image calculation.[118]
Radiation Distribution at the Mask. On the basis of the above discussion, we can define exactly the influence that the peculiarities of the SR source may have on the formation of the image in the lithographic process. Because of the very high resolution involved, it is important to verify the extent of these effects. For example, a pattern feature must be defined to within 1/10 of a critical dimension overall.[119] Several error sources contribute to this value, so that the amount allocated to each component is much less than the typical 1/10 of the smallest feature size. For example, at 0.25 µm line widths, only 25 nm form the budget allocated to the overall error. One-fifth (5 nm) may be the fraction allocated to the exposure. The simple optical system sketched in Fig. 37 can be used to describe in detail the effects of the electron beam on the image formation.[120] The system we discuss is of the non-focusing type, but any optical system can be accommodated with few changes. An optically active system (i.e., focusing or defocusing) will result in a different value for the spatial and angle beam extent, but the discussion remains otherwise valid.
We recall that in order to achieve a uniform intensity at the mask position some form of scanning must be provided. This can be achieved by keeping the mask-wafer stationary and rastering the x-ray beam, or by keeping the radiation fixed and mechanically scanning the mask-wafer assembly. Other schemes, based on oscillating the electron beam itself, are unlikely to gain widespread acceptance. The optical system shown can indeed be used to study the effects of any type of scanning by choosing suitable values for the parameters. For simplicity we limit the discussion to the case of a plane scanning mirror, although this does not affect the conclusions. If we consider the photon beam to be a Gaussian beam, the propagation from the source point to the mask via the reflection from the mirror gives rise to a beam with:
Eq. (35)
$\sigma_x^2 = \sigma_x(0)^2 + D^2\sigma_{x'}^2$
Eq. (36)
$\sigma_z^2 = \sigma_z(0)^2 + D^2\sigma_{z'}^2$
The beam waist is located exactly at the tangent point; the size will be independent of whether the (flat) mirror is being scanned or not. In these equations, D (P) is the source-mask (mirror-mask) distance. We can safely neglect the effect of the small reflection angle 2θ. The effect of the “painting” of the beam vertically over the mask will be that of introducing a “bias” in the vertical velocity distributions because of the linear scanning rate. From the point of view of XRL, what is important is the penumbra which is created by the finite size of the electron beam. It can be shown that the penumbra is given by: Eq. (37)
$\sigma_{g,x} = \frac{g}{D}\,\sigma_x$
Eq. (38)
$\sigma_{g,z} = \frac{g}{P}\sqrt{\sigma_z^2 + (D - P)^2\,\sigma_{z'}^2}$
so that the scanning action introduces, quite unexpectedly, the effect of angle divergence as well as the beam size.[112][120] Even a collimated beam (σz′ = 0) or a point source (σz = 0) will introduce penumbral effects when scanned. The case of electron beam wobbling corresponds to having the emitter coincident with the source, so that P = D. Similarly, the scanning of the mask-wafer assembly leads to the same D = P condition. We notice that we can put the equation describing σg,z in the form of an ellipse in the plane (σz, σz′) for a given combination of (D, P). We can thus draw lines of constant resolution (more precisely, constant penumbra) and lines of constant emittance (hyperbolas) as shown in Fig. 38.
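The penumbra expressions of Eqs. (37) and (38) are simple enough to evaluate directly. The sketch below is illustrative only: g is assumed here to be the mask-to-wafer gap, all lengths are in meters and angles in radians, and the numerical values are hypothetical rather than taken from Table 8.

```python
import math


def penumbra_x(g, d, sigma_x):
    """Eq. (37): horizontal penumbra; lengths in m."""
    return g / d * sigma_x


def penumbra_z(g, d, p, sigma_z, sigma_z_angle):
    """Eq. (38): vertical penumbra with a scanning mirror at distance p from the mask."""
    return g / p * math.sqrt(sigma_z**2 + ((d - p) * sigma_z_angle) ** 2)


# Hypothetical geometry and beam parameters (assumptions for this example).
GAP = 40e-6            # assumed mask-wafer gap, m
D = 10.0               # source-mask distance, m
P = 5.0                # mirror-mask distance, m
SIGMA_Z = 0.3e-3       # vertical source size, m
SIGMA_Z_ANGLE = 0.5e-3  # vertical angular spread, rad

scanned = penumbra_z(GAP, D, P, SIGMA_Z, SIGMA_Z_ANGLE)
stage_scan = penumbra_z(GAP, D, D, SIGMA_Z, SIGMA_Z_ANGLE)  # P = D case
point_source = penumbra_z(GAP, D, P, 0.0, SIGMA_Z_ANGLE)

print(f"mirror scanning (P < D): {scanned * 1e9:.1f} nm")
print(f"mask-wafer scanning (P = D): {stage_scan * 1e9:.1f} nm")
print(f"point source, mirror scanning: {point_source * 1e9:.1f} nm")
```

With P = D the divergence term vanishes and Eq. (38) reduces to the form of Eq. (37), while for P < D even a point source yields a nonzero penumbra, as stated in the text.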
Figure 37. Model optical system for resolution studies.
Figure 38. σz vs. σz′. The dashed lines are lines of constant emittance, while the ellipses are lines of constant resolution.
The volume of phase space occupied by the electrons (not the radiation) is proportional to the product εx = π σxσx′, called emittance in accelerator physics terminology, so that the smaller the emittance, the brighter the beam. It is easy to see that for a given resolution there is a maximum value of emittance above which it is impossible to achieve said resolution. This can be used to define exactly the tolerances on the ESR, since the beam emittance is one of the typical figures of merit of an accelerator and may strongly influence its cost. In general, the “best” (i.e., smallest) value achievable is determined by the lattice design and by the number of electron optical elements used (see below). Coming back to the case of “optically active” beam lines, we notice that the main effect will be that of modifying the values in Eq. (38) by a magnification factor, M, and moving the source location to a virtual source position, thus changing D and P as well.
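The existence of a maximum emittance for a given resolution can be made explicit for the vertical plane. The short derivation below is a sketch under two assumptions that the text does not spell out: the vertical emittance is defined by analogy with εx as εz = π σz σz′, and D ≠ P. Squaring Eq. (38) gives the constant-penumbra ellipse, and the arithmetic-geometric mean inequality bounds the product σz σz′ on it:

$$\sigma_z^2 + (D-P)^2\,\sigma_{z'}^2 = \left(\frac{P\,\sigma_{g,z}}{g}\right)^2$$

$$\sigma_z^2 + (D-P)^2\,\sigma_{z'}^2 \;\ge\; 2\,|D-P|\,\sigma_z\,\sigma_{z'}$$

$$\varepsilon_z = \pi\,\sigma_z\,\sigma_{z'} \;\le\; \frac{\pi}{2\,|D-P|}\left(\frac{P\,\sigma_{g,z}}{g}\right)^2$$

Equality holds when σz = |D − P| σz′ = P σg,z / (√2 g), which is the point where a constant-emittance hyperbola is tangent to the constant-penumbra ellipse. As P approaches D the bound grows without limit, consistent with the observation that only the beam size matters in that case.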
Table 8 shows some parameters that are typical of two machines, a large “research” ring and a “compact” one, together with the performance that can be expected in both cases.

Table 8. Research and Compact Synchrotron Parameters

Parameter           Units     Research   Compact
Beam Energy         GeV       1.0        0.65
γ                   -         1950       1270
Radius              m         2.0        0.5
Magnetic Field      kG        16         43
Critical Energy     Å         10         10
Radiation Angle     mrad      0.5        0.8
Current             mA        100        200
σR                  mrad      1          2
σx                  mm        1.2        2
σx′                 mrad      -          -
σz                  mm        0.3        1
σz′                 mrad      0.5        0.6
Penumbra x          nm        1.6        2.7
Penumbra z          nm        0.4        1.3
Power/mrad          W/mrad    0.7        1.0
Power (Beamline)    W         21         30
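Several entries of Table 8 follow from the beam energy, bending radius, and current alone, so they can be cross-checked with standard synchrotron radiation relations. The sketch below assumes the usual textbook formulas (γ = E/mc², opening angle about 1/γ, critical wavelength 4πρ/3γ³, and total radiated power of 88.46 E⁴I/ρ in kW for E in GeV, I in A, and ρ in m); the function names are illustrative, and the compact-ring energy is taken to be 0.65 GeV.

```python
import math

ELECTRON_REST_ENERGY_GEV = 0.000510999  # m_e c^2

def lorentz_gamma(energy_gev):
    return energy_gev / ELECTRON_REST_ENERGY_GEV

def opening_angle_mrad(energy_gev):
    # Natural vertical opening angle of the radiation, roughly 1/gamma.
    return 1000.0 / lorentz_gamma(energy_gev)

def critical_wavelength_angstrom(energy_gev, radius_m):
    # lambda_c = 4*pi*rho / (3*gamma^3), converted from meters to angstroms.
    return 4.0 * math.pi * radius_m / (3.0 * lorentz_gamma(energy_gev) ** 3) * 1e10

def power_per_mrad_w(energy_gev, radius_m, current_a):
    # Total power P[kW] = 88.46 * E^4 * I / rho, spread uniformly over
    # the 2*pi*1000 mrad of horizontal orbit angle.
    total_w = 88.46e3 * energy_gev**4 * current_a / radius_m
    return total_w / (2.0 * math.pi * 1000.0)

for name, e_gev, rho_m, i_a in (("research", 1.0, 2.0, 0.100),
                                ("compact", 0.65, 0.5, 0.200)):
    print(name,
          round(lorentz_gamma(e_gev)),
          round(opening_angle_mrad(e_gev), 2),
          round(critical_wavelength_angstrom(e_gev, rho_m), 1),
          round(power_per_mrad_w(e_gev, rho_m, i_a), 2))
```

The printed values should land close to the γ, radiation angle, critical wavelength, and power/mrad rows of the table; small differences reflect rounding in the tabulated figures.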
12.0 TYPES OF MACHINES
The first type of accelerator useful for spectroscopic studies was the electron synchrotron. All accelerators share the same block structure, illustrated in Fig. 28.
Source → Pre Accel → Injector → Booster → Machine
The source is typically a high brightness electron gun, followed by the electrostatic acceleration and focusing optics. The next step requires
acceleration to relativistic speed, and this is done in the injector stage. Two choices are possible for this stage, normally called low-energy and full-energy injection. The low-energy injector accelerates the electrons only to a fraction of the final energy, leaving to the electron storage ring itself or to an intermediate booster the achievement of full energy. Full-energy injectors are typically LINACs; their relatively small energy gradient results in long machines. Intermediate schemes (non-accelerating ring) based on short injectors followed by a booster synchrotron are more cost effective and compact in size. The least expensive approach is certainly that of a low-energy injection scheme coupled to an accelerating ring. Classical examples of the three types of injection are found at Stanford (SSRL, full energy with LINAC), Brookhaven (NSLS, short LINAC and booster synchrotron), and Wisconsin (SRC, microtron and accelerating ring). The choice between the different schemes is dictated by the cost, efficiency, and reliability required by a given application. For XRL applications, reliability (i.e., uptime) is of the utmost importance. No rings have yet been built and used in an industrial environment; the lessons learned from the physics experience do indicate that all of those approaches are viable provided that they are correctly implemented and that maintenance is regularly performed. Storage rings are well behaved, predictable machines with few critical components; maintainability should be relatively easy to design in. Safety issues are also very important, since the injection subsystem is a region of intense radiation that must be fully contained. An important difference between an accelerating ESR and a non-accelerating one is the “filling” procedure.
In a non-accelerating ring it is possible to replenish the electrons lost by attrition due to the residual gas molecules without affecting the normal operation of the machine, while in an accelerating machine this is not possible, leading to an operation pause, for example, every eight hours at shift change times. Here is where the lifetime of the injected current plays a critical role; the (almost) exponential decay of the current should be such that the average beam current is large enough to sustain the steppers’ throughput. When calculating production figures, one should always use the average current:
Eq. (39)
$$\langle I \rangle = I_{inj}\,\frac{1}{T}\int_0^T e^{-t/\tau}\,dt$$
Eq. (40)
$$\langle I \rangle = I_{inj}\,\frac{\tau}{T}\left(1 - e^{-T/\tau}\right)$$
Eq. (41)
$$\langle I \rangle \approx I_{inj}\left(1 - \frac{T}{2\tau}\right)$$
for long lifetimes, τ. We can easily verify that if we require the average current to be ⟨I⟩ = 0.8 Iinj, then we need a lifetime in excess of 20 hours for an eight hour shift. The “filling-up” process normally lasts 15–20 minutes for a well operated machine.
The ESR itself is essentially a lattice of magnetic fields used to steer and control the electron beam in its path through the vacuum chamber. An essential component is the radio frequency cavity, which is used to restore the energy lost by the synchrotron radiation emission process. The RF cavity is a key component and the subject of extensive studies.[121] Dipole magnets are used to define a closed orbit, while quadrupoles, sextupoles, and octupoles are used as active elements in controlling the beam position and shape.[122] Because of the steady state requirement, only some beam orbits and configurations are stable (as in an optical resonator). The electrons are characterized by a Hermite-Gaussian distribution of trajectories.[113][114] The intrinsic circularity of the orbit is broken down into sectors equivalent to each other. Each sector contains one bending magnet and some other focusing magnets; several arrangements are possible, leading to the different types of lattice design. The type and the complexity of the lattice have a direct impact on the quality of the electron beam, i.e., its emittance. Basically, a simple lattice without higher order elements will make it more difficult to achieve a beam of small dimensions and good collimation. Conversely, sophisticated schemes allow the achievement of extremely compact electron beams. Fortunately, the requirements on the beam size and divergence of XRL are quite relaxed, greatly simplifying the ring design and construction.[123] This has led to the proposal of “compact” ESRs, i.e., scaled down versions of the large accelerators used in high energy physics and in spectroscopy.
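The average-current estimate of Eqs. (39) to (41) can be checked numerically. The sketch below evaluates the exact expression and the long-lifetime approximation, and solves the exact expression for the lifetime by bisection; the function names are illustrative. With the linear approximation the 0.8 requirement gives exactly 20 hours for an eight hour shift, while the exact exponential form is somewhat less demanding.

```python
import math

def average_fraction(T, tau):
    """Exact <I>/I_inj over a fill of duration T with lifetime tau, Eq. (40)."""
    return (tau / T) * (1.0 - math.exp(-T / tau))

def average_fraction_linear(T, tau):
    """Long-lifetime approximation, Eq. (41)."""
    return 1.0 - T / (2.0 * tau)

def lifetime_for_fraction(T, target, lo=1e-3, hi=1e6):
    """Bisection for the lifetime at which the exact fraction equals target.

    Relies on average_fraction increasing monotonically with tau.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if average_fraction(T, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

shift_hours = 8.0
print(average_fraction(shift_hours, 20.0))          # exact value at tau = 20 h
print(average_fraction_linear(shift_hours, 20.0))   # linear estimate at tau = 20 h
print(lifetime_for_fraction(shift_hours, 0.8))      # exact lifetime for <I> = 0.8 I_inj
```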
Besides the distinction between “large” and “small” rings, there is also another distinction between “warm” and “cold” rings, depending on the use of resistive or superconducting magnets. The rationale of the two choices is the following. For XRL applications, a small footprint is highly desirable. The minimum footprint is, in the final analysis, determined by the maximum magnetic field achievable and by the distance between bending magnets (straight sections). For example, our 1 GeV ring will require a magnetic field of 1.6 tesla to bend the electrons into a 2 m radius circle. This gives a minimum size of some 6–8 m when we take into account the straight sections. Normal magnets are made of water-cooled copper coils with a laminated iron yoke used to shape and confine the magnetic field. The technology is simple and robust, and the system limitations come (a) from the maximum field that can be achieved and (b) from the large power that is dissipated, leading to a sizable addition
to the ring operating costs. Conversely, superconducting magnets allow high fields and by definition use essentially no power. If we had 4 tesla magnets available, our 1 GeV machine could have a bending radius of about 0.8 m, considerably smaller than the normal magnet version. The superconducting magnets are, however, of complex design and should still be considered a high-risk option. The cost of the helium system can also add considerably to the cost of operation. The focusing magnets are always at room temperature. The main advantage of cold rings is a smaller footprint at equivalent energy. An inspection of Eq. (24) shows that trade-offs are possible between the parameters that define the main aspects of a storage ring. As a rule, large currents are the most difficult to obtain and to keep in an ESR. It is then convenient to trade beam energy vs. current in order to achieve a given throughput for a fixed radius, and this is clearly where one of the advantages of the cold ring lies. Against cold rings are several engineering issues, in particular the non-triviality of magnet construction as well as the difficulty of locating the focusing (electron) optics and the injection line in a small area. The warm ESRs are the ones traditionally used in SR work. Their design and construction are well established. Several commercial vendors are actively working in this area, particularly in Japan.[124] In the United States, Maxwell Laboratories is designing and constructing a warm storage ring. At the time of this writing several projects were underway for cold rings, the most notable ones being COSY (Germany), Oxford (England), and Sumitomo (Japan) in the area of commercial vendors, and Brookhaven National Laboratory in the area of U.S. based efforts. The German and British efforts have very close ties with their respective national laboratories.
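The link between field strength and footprint in this section comes from the magnetic rigidity of the beam. A minimal sketch, assuming ultra-relativistic electrons (momentum taken equal to E/c) and the standard relation ρ = p/(0.2998 B) with p in GeV/c, B in tesla, and ρ in meters; the function name is illustrative.

```python
def bending_radius_m(energy_gev, field_t):
    """Bending radius from the magnetic rigidity, B * rho = p / 0.299792458."""
    return energy_gev / (0.299792458 * field_t)

# Warm (resistive) versus cold (superconducting) dipoles for a 1 GeV ring.
for field_t in (1.6, 4.0):
    rho = bending_radius_m(1.0, field_t)
    print(f"B = {field_t} T -> radius = {rho:.2f} m, orbit diameter = {2 * rho:.2f} m")
```

The 1.6 tesla case reproduces the radius of roughly 2 m quoted above; adding straight sections to the bare orbit diameter gives the overall ring size.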
13.0 BEAM TRANSPORT SYSTEMS
The problem of delivering a high intensity radiation beam to the mask requires a careful engineering study in order to ensure cost-effectiveness, reliability, and safety. The system used to accomplish this is called a beam line. It can be divided into subsystems performing the different tasks required for the overall functionality, that is, the relay of the x-rays generated by the ESR to the mask-wafer assembly.
First and foremost, the beam line must deliver the x-rays to the mask-wafer assembly in a controlled and convenient way. This is dictated by the exposure tool being used (stepper) and by the mask-resist combination. An exposure window can be easily defined by forming the product of the mask carrier transmission and the resist absorption:
Eq. (42)
$$W(\hbar\omega) = T_{carrier}(\hbar\omega)\,A_{resist}(\hbar\omega)$$
The physical meaning can be easily grasped, since the window represents the part of the spectrum that will have useful photons, i.e., photons that will be absorbed in the photoresist. Figure 39 shows how the window peaks around 10 Å. Softer radiation is absorbed by the carrier, creating heat and possibly radiation damage; harder radiation is not absorbed by the photoresist and is useless. A well designed beam line will then reject photons emitted outside of the exposure window, delivering radiation that is well matched to the system in use.
Second, the beam line must provide a bridge between the different environmental conditions at the source and at the exposure station. The environments of the exposure station and of the storage ring are very different: the former is kept at reduced pressure (P ~ 20–200 torr) or at atmospheric pressure, while the ring is always in ultrahigh vacuum, i.e., at pressures of 10⁻⁹ torr or less. These low pressures are obtained only after prolonged outgassing and conditioning of the ring, since the radiation itself desorbs adsorbed gases from the ring walls: it takes a fairly long time (weeks) before the walls are scrubbed clean and a satisfactory pressure is achieved. Exposure of the ring to high pressures in an uncontrolled way may result in long down times due to the need for reconditioning the vacuum. Thus, means of maintaining the pressure differential without affecting the delivery of the radiation must be devised; at the same time, the safety of the ring must be maintained even in case of catastrophic failures.
Third, the beam line must provide a data channel for communications between the stepper and the storage ring.
Finally, the hard radiation generated in the ring during injection must be fully contained for the operators’ safety.
A large body of literature exists on this subject[125][126] and we will not dwell on it save for noticing that radiation safety is achieved by carefully placed shields located along the beam lines and by heavier, fixed shielding around the ring itself. The shielding along the beam lines is needed not because of the x-rays, which are efficiently stopped by the vacuum vessel walls, but rather because of the harder gamma rays that may
be generated by the collision between the high energy electrons and the residual gas molecules. This Bremsstrahlung may happen to be oriented along the beam line if by chance a region of (relatively) large gas trapping exists at the tangent point.
Figure 39. Process exposure window.
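The way the exposure window of Eq. (42) acquires a peak can be illustrated with a toy model. The sketch below is not based on measured data: it assumes that both the carrier and the resist absorb with an optical depth that scales as the cube of the wavelength (a rough rule away from absorption edges), and the optical depths a and b at 10 Å are arbitrary illustrative values. It only shows how the product of a falling transmission and a rising absorption peaks at an intermediate wavelength.

```python
import math

def carrier_transmission(wavelength_a, a=0.5):
    """Toy carrier: optical depth a at 10 A, scaling as wavelength cubed (assumed)."""
    return math.exp(-a * (wavelength_a / 10.0) ** 3)

def resist_absorption(wavelength_a, b=0.3):
    """Toy resist: optical depth b at 10 A, same assumed cubic scaling."""
    return 1.0 - math.exp(-b * (wavelength_a / 10.0) ** 3)

def exposure_window(wavelength_a, a=0.5, b=0.3):
    """Eq. (42): product of carrier transmission and resist absorption."""
    return carrier_transmission(wavelength_a, a) * resist_absorption(wavelength_a, b)

def peak_wavelength(a=0.5, b=0.3, lo=1.0, hi=30.0, steps=2901):
    """Brute-force search for the window maximum on a uniform wavelength grid."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda wl: exposure_window(wl, a, b))

peak = peak_wavelength()
print(peak, exposure_window(peak))
print(exposure_window(3.0), exposure_window(25.0))  # hard and soft tails are both small
```

For a real design the two factors would be computed from tabulated absorption coefficients of the actual membrane and resist materials.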
Breaking down the beam transport system into subsystems, we find that we can define the following blocks: vacuum, optics, communication, and safety. We will address these points in turn.
13.1 Vacuum Requirements
From the point of view of the lithographic process, operation in vacuum is a nuisance that is better avoided because of all the mechanical complications it imposes on the scanning stage. Exposure at atmospheric pressure (of He gas) is also required by modern XRL steppers. A typical beam line layout is shown in Fig. 40. We notice how the beam line itself can be divided into sections around the optical systems, thus forming well-defined vacuum subunits.
Front End. The part of the beam line that directly interfaces to the storage ring is conventionally called the front end and provides some of the basic functionality. A front end will typically include:
i. A manual isolation valve, all-metal, normally open. It is used to isolate the beam line for maintenance or installation.
ii. A photon shutter, i.e., a water-cooled metal plate used to shut out the radiation beam.
iii. An electro-pneumatic gate valve, used for normal operation. This valve is slaved to the beam line computer and also interlocked to a “grant” line from the ring and to a vacuum ion gauge located immediately downstream.
iv. A heavy-metal shutter, used to eliminate the risk of radiation exposure for the operators during maintenance.
v. A fast acting valve, with a closure time of ~ 10 msec, slaved to a corresponding fast pressure monitor system.
vi. A beam diagnostic chamber, with tools for beam location and monitoring, where the first beam line ion pump is also located.
The details of the implementation may vary from facility to facility, but the basic functions are those listed above. The front end should be kept simple in order to increase reliability. As discussed below, the power densities typical of a lithography ring are relatively modest and do not require excessive precautions from a cooling point of view.
Figure 40. Layout of an XRL beamline.
Extraction Windows. Located at the other extremity, near the exposure station, the exit (or extraction) window is one of the most critical subsystems. A possible alternative, the use of a differentially pumped vacuum system where the pressure is reduced in stages, cannot be implemented efficiently. These systems are normally built as a succession of apertures or pinholes with pumps installed in between. The efficiency of such a system depends on the ratio between the conductance of the apertures (C ∝ d²) and the pump speeds S.[128] In order to get an efficient pressure differential, it is necessary to have at least S/C = 100 per stage. In an x-ray lithography beam line it is difficult to achieve such a ratio because of the large (relatively speaking) optical cross section of the x-ray beam, which mandates the use of large apertures (and thus large C). Windows must then be used in order to guarantee vacuum integrity while allowing the extraction of the x-ray beam into air. These windows must be able to withstand the force generated by the pressure differential as well as the thermal stresses that may be generated by the absorption of radiation. Of all the materials, beryllium is the best from essentially all points of view, except for its well-known toxicity. Beryllium has an excellent x-ray transmission, a high Young’s modulus, and can be machined into thin foils. A thickness of 25 µm is sufficient to hold a full atmosphere while transmitting a good fraction of the spectrum; the foil is usually clamped on a metal or Viton gasket, while thicker foils can be welded or brazed to the support. Thinner foil can be used if the metal is preformed in a convenient shape so that the stress caused by the pressure is considerably reduced.
In any event, because of the large power density absorbed by the Be foil if exposed directly to the white beam, it is necessary to carefully evaluate (and reduce) thermal stresses in the film.[128] The Be windows are sometimes coated with a thin layer of polyamide to seal possible porosity and, more importantly, to avoid disintegration into small flakes in case of failure. Other window materials are possible, and the list would be very similar to that of x-ray mask membranes. Most often Si:B membranes are used as filters before the actual window in order to minimize thermal stresses and to seal against low level leaks. Another approach makes use of a thin (1–2 mm) Be pre-filter whose only function is that of absorbing the softer part of the spectrum, thus minimizing the dose absorbed by the metal window. The same result can be achieved in yet another way, that is, by the use of horizontal slits. From the discussion above, it is clear that the softer part of the radiation is emitted more off-axis; thus, a slit will have essentially the same effect as an (adjustable) filter.
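The argument against differential pumping can be put in rough numbers. The sketch below makes two simplifying assumptions: each stage reduces the pressure by the ratio S/C (valid only when S is much larger than C), and the aperture conductance is the standard molecular-flow value for air at 20 °C, about 11.6 L/s per cm² of opening. The aperture size and the pressures are illustrative, not values from the text, and the high-pressure stages would in practice be in the viscous regime, where conductances are larger still.

```python
import math

AIR_APERTURE_L_PER_S_PER_CM2 = 11.6  # molecular flow, air, 20 C (thin aperture)

def stages_needed(p_high_torr, p_low_torr, ratio_per_stage=100.0):
    """Number of stages if each one drops the pressure by S/C."""
    return math.ceil(math.log(p_high_torr / p_low_torr) / math.log(ratio_per_stage))

def pump_speed_needed_l_per_s(aperture_cm2, ratio_per_stage=100.0):
    """Pump speed per stage required to reach the given S/C."""
    return ratio_per_stage * AIR_APERTURE_L_PER_S_PER_CM2 * aperture_cm2

# Assumed beam cross section of 50 mm x 5 mm at the aperture.
aperture_cm2 = 5.0 * 0.5
print(stages_needed(100.0, 1e-9))                # from ~100 torr down to ring pressure
print(pump_speed_needed_l_per_s(aperture_cm2))   # L/s per stage for S/C = 100
```

Several stages, each needing thousands of liters per second of pumping speed, illustrate why a window is the practical choice.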
Valves and Delay Lines. The vacuum beam line can be sectioned by valves located at several positions; these valves are automatic and interlocked to vacuum gauges against possible wrong openings. The closing time of these valves is normally 1–2 sec, so that they are not suitable as safety valves, since the catastrophic failure of a component creates a pressure shock wave moving at the speed of sound. In helium, this is 965 m/s, giving a progression time of about 1 msec/m of beam line. A typical beam line would be 10–13 m long, so that a fast-acting valve (typically with a closure time of 10–15 msec) would have to be located next to the storage ring itself. In order to improve the odds of successfully safeguarding the ring, an “acoustic delay line” can also be installed.[129] The acoustic delay line slows the progression of the front, providing the time needed by the fast valve to close and intercept the shock wave. Nothing can be done about the tail of fast molecules that are in direct line of sight with the beam line. Apart from the fast-closing valve, the other components are Viton-sealed gate valves for the sectioning valves, and all-metal angle valves for other purposes (fore vacuum connections, etc.).
Pumps and Construction. The construction of the beam lines follows standard UHV practices, based on the use of all-metal flanges and ion pumps all the way to the Be window itself. This construction normally gives highly reliable systems with little maintenance; other pumps, such as turbomolecular pumps, should not be used for these applications, but can be used for pumping down. An excellent choice is the air-bearing-based oil-less pumps manufactured by Alcatel; they can be started at atmospheric pressure and be used to provide the pumping during the bake out procedure. After this necessary stage, the beam line must be conditioned to the x-ray beam; no matter how careful the bake out procedure, the energetic x-rays will always force desorption of chemisorbed species.
When the beam line is initially opened to the ESR, the initial pressure rise may force the shutdown of the front valve (interlocked to the vacuum gauges), so that a gradual conditioning may prove necessary. This may require careful planning in an industrial environment. Particular attention must be dedicated to the risk of contaminating the surfaces of the mirrors used in the beam line during any of the preparation and installation steps. A base vacuum of 1 × 10⁻¹⁰ torr or better must be maintained at all times when in operation to avoid cracking of residual gases at the mirror surface and the buildup of a carbonaceous layer that may significantly degrade the mirror reflectivity. Vacuum of this quality is possible with standard practices, if strictly enforced. Downstream of the Be window the requirements are much more relaxed. An important consideration is the protection of the window from
the atmosphere. No oxygen or water vapor should be allowed near the Be window, since under the intense x-ray beam, ozone and other active species are formed and they quickly corrode the metal of the window, with potentially catastrophic results. Another issue that has not been fully developed is that of the window needed for a scanning beam line; existing solutions are not fully satisfactory.[130]
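The timing argument made for the fast valve under Valves and Delay Lines can be written as a simple race between the pressure front and the valve. A minimal sketch, using the 965 m/s sound speed in helium quoted there; the extra delay contributed by an acoustic delay line is a hypothetical parameter, and sensor and trigger latencies are ignored.

```python
def front_arrival_ms(length_m, speed_m_per_s=965.0):
    """Time for a sonic pressure front to travel the beam line, in msec."""
    return 1000.0 * length_m / speed_m_per_s

def valve_closes_in_time(length_m, closure_ms, delay_line_ms=0.0, speed_m_per_s=965.0):
    """True if the valve at the ring end closes before the front arrives."""
    return closure_ms <= front_arrival_ms(length_m, speed_m_per_s) + delay_line_ms

for length_m in (10.0, 13.0):
    print(length_m, round(front_arrival_ms(length_m), 1))

print(valve_closes_in_time(10.0, closure_ms=10.0))                     # marginal
print(valve_closes_in_time(13.0, closure_ms=15.0))                     # too slow
print(valve_closes_in_time(13.0, closure_ms=15.0, delay_line_ms=5.0))  # assumed delay helps
```

The margins are a fraction of a millisecond at best, which is why the valve has to sit next to the ring and why a delay line improves the odds.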
13.2 Optical
The optical system, if installed, performs the function of distributing the radiation uniformly and in a controlled way across the mask. There are several reasons for installing one or more mirrors in a beam line. First, the presence of one mirror provides a break in the machine-stepper line of sight, allowing one to locate a radiation stop for full containment. Second, a collecting mirror system may increase the amount of radiation delivered to the mask-wafer assembly, lightening the requirements on the machine itself. Third, mirrors act as low-pass filters that remove the harder radiation that would not be absorbed by the resist and would end up causing radiation damage to the underlying wafer. A careful assessment is necessary in order to avoid design pitfalls while delivering maximum system performance. While discussing alternatives one must keep in mind the exposure window defined above as well as the speed of the exposure tool itself. An important parameter is the horizontal width of the field that must be filled by the beam line. A full field exposure is not practical, so that one must resort to forming a uniform (horizontal) line image and to a scanning system to generate the required uniformity in the other (vertical) direction. It is always possible to use suitably shaped apertures to improve the beam uniformity, but this option should be used with caution because it may prove too expensive in terms of power loss if the beam profile is strongly nonuniform to start with. Aperturing should be used for fine-tuning the beam rather than for providing the main shaping. In the following discussion we assume a field of 50 mm (H) by 25 mm (V) as our exposure area, with a requirement of ± 3% uniformity.
One Mirror: Collecting Radiation.
A single mirror can be used to increase the amount of radiation delivered to the mask-wafer assembly by increasing the beam line aperture.[131] For example, for a beam line 15 m long and a horizontal field of 50 mm (f/300), a 100 mm wide collecting mirror located at 5 m from the tangent point can provide a gain of 6 (from f/300 to f/50) in comparison to a straight-through system. However, the image
formed by such a mirror will not be well suited to the requirements of the exposure system; the image is not a line, but rather forms the characteristic “smile” of glancing optics. Furthermore, the variation in incident angle across the sagittal direction will force a nonuniform distribution at the image. If the system design can tolerate these shortcomings (usually much larger than the typical ± 3%) then the option is a good one. The mirror can be optimized to form a focus at the mask or past it; typically the use of a toroidal mirror will allow one to set the tangential (vertical) focus at the mask and the sagittal (horizontal) one well past it in order to fill the required width. The main problem here is that the optical designer does not have enough free parameters to adjust in order to fulfill the requirements of line image and uniformity while increasing the beam line aperture. For example, one can form an excellent line image by using a tangentially cylindrical mirror (i.e., non-focusing in the sagittal direction), but with no gain in luminosity.
Two Mirrors: Focusing and Shaping. The use of two mirrors[132] gives increased flexibility to the optical design, since now one can compensate aberrations. Several designs have been proposed in the past for monochromator beam lines,[133] but the requirements of an XRL system are different to the point that new designs are necessary. If one considers the combinations offered by the use of two toroidal mirrors, there are several possible choices that may provide the required line image. Great care must be exercised in the selection, since the uniformity may be irreparably compromised by the reflectivity changes across the surface of the mirrors, both from a power and from a spectral content point of view. Proprietary designs overcome this shortcoming by providing an excellent line image and beam uniformity.[134] In general, the luminosity can be easily increased to f/20 without compromising the uniformity, both power and spectral.
From several points of view such a beam line is ideal. All the requirements (uniformity, power, image, modularity) are fulfilled or exceeded. The extra cost of the optical system is well compensated by the increased performance. A beam line based on two mirrors requires a more complex vacuum system, since it must be possible, for example, to replace one of the mirrors without affecting the other. This can be achieved by a judicious use of sectioning valves, modular design, and flexible optical design.
Three Mirrors: Scanning. The vertical scanning action can be provided by a mirror. On the plus side, a scanning mirror provides a well controlled beam and allows the design of a fixed mask-wafer assembly with dynamic alignment during exposure. This greatly simplifies the design of the stepper (as in the Perkin Elmer approach). On the minus side, the
scanning mirror adds one more reflection and may harm the beam uniformity. For example, a single mirror beam line, where the mirror is used for collection and scanning, will have unacceptable uniformity. This is true also for a two-mirror beam line, where the change in incidence angle due to the scanning action appreciably changes the system magnification. At a glancing angle of 2° or 1° the scanning range of 25 mm/8 m = 3.1 mrads is definitely not negligible. The best approach appears, then, to be a three-mirror system where the scanning action is fully separated from the beam-shaping (or image-forming) action. A flat mirror following the pair of focusing mirrors above will provide the ideal solution, since it will raster the beam across the field without affecting the image shape and the spectral power. This is true if the scanner is operated at a larger incident angle, since it is the mirror at the smaller incident angle that determines the overall system low-pass cutoff. The use of a flat scanner has another appealing facet, upgradability. For example, the two-mirror beam line designed for a Karl Suss stepper (mechanical rastering, fixed beam) can be adapted for use with a Perkin Elmer stepper by simply adding one more stage, the scanning option. Everything else remains the same. This considerably reduces complexity, inventory, and maintenance costs.
Optical Components Specifications. The mirrors used in an XRL beam line must deliver the maximum power in the prescribed image. There are two main sources of potential problems, figure errors and finish. The first causes the rays not to be focused in the correct position, thus degrading the image quality. Because of the reduced requirements of the optics (after all, the beam line is a condenser, not an image forming system) the tolerances on the figure are quite relaxed.
An error of one wavelength (HeNe testing laser) across a mirror 0.25 m long gives a slope error of 1.3 µrad, leading to an error of 13 µm after a 10 m long beam line, which is quite acceptable. The requirement on the surface finish is more stringent. The reflectivity of a mirror at glancing angles is affected by the surface roughness of standard deviation σ via an exponential term:
Eq. (43)
$$R(\lambda;\sigma) = R(\lambda;0)\,\exp\left[-\left(\frac{4\pi\sigma\cos\theta}{\lambda}\right)^2\right]$$
hence at θ = 88° and λ = 10 Å, it is necessary to keep the roughness to less than about 0.74λ for the reflectivity not to degrade by more than 10%. This sets it at about 7.4 Å for x-ray lithography, a value not at all unreasonable by today’s optical industry standards. The mirrors used for focusing and scanning are typically manufactured from fused silica. The material is well known to the optical industry
and can be polished with high figure and finish, well within the tolerances described above (1 wavelength figure, 20 Å finish). The power density encountered on a typical XRL beam line does not warrant the use of cooled optics. If the power per milliradian Po is of the order of 1 W, on the basis of simple geometrical arguments we can verify that the power density at the first mirror will be of the order of:
Eq. (44)
$$P = \frac{(1000\,P_o)\cos\theta}{\theta_V\,D^2} = \frac{(1000\,P_o)\,\gamma\cos\theta}{D^2}$$
where D is the source-mirror distance (m), θV the radiation opening angle (radians), and the factor of 1000 takes into account the conversion from mrads to rads. For D = 5 m, γ = 1000, and θ = 88°, we obtain a power density of about 140 mW/cm². Hopefully, the mirror will reflect about 80% of this, leading to an absorbed power density of 28 mW/cm² that can easily be dissipated without any special techniques. For example, SiO2 (versus Al) has a thermal conductivity of 0.013 (2.37) W/(cm·K), so that a 5 cm thick mirror will have, at most (neglecting radiation and lateral conduction), a ∆T of the order of 10 (0.06) °C between front and back surfaces. Clearly, higher beam energy machines (larger γ) will present more problems. The reflectivity of the mirrors depends strongly on the coating material. Reflective optics in the x-ray region is based on the fact that the index of refraction is less than 1.0 in that energy region, so that total external reflection is possible. However, the reflectivity is strongly affected by the material characteristics within a skin depth (of the order of a few λ), so that the surface conditions play a very important role. A few monolayers of carbon (Z = 6) can significantly reduce the reflectivity of a gold mirror (Z = 79). As discussed above, roughness may decrease the reflectivity as well. Care must be exercised when depositing the material in order to have a smooth surface; Au/Cr coatings (500 Å/50 Å) are often used and can be deposited from thermal sources, electron guns, or sputter guns. Since the materials are all strongly absorbing, in the x-ray region we do not observe a well defined critical angle as we do for (non-absorbing) dielectrics in the visible. This means that the selection of the design reflection angle is dictated by a system design based on the increase in cost with larger incidence angles (longer mirrors) weighed against the reflectivity increase.
The decrease in reflectivity may often be compensated by an increased acceptance angle (wider mirror). Once more, a system design based on a full ray-tracing[135] is necessary in order to balance the various requirements. Practically
speaking, angles of less than 88° make for “easy” mirrors, 88–89° for difficult ones, and more than 89° for very difficult ones. An initial decrease in reflectivity of a newly installed mirror is to be expected, because of the formation of an initial graphitic carbon monolayer on the mirror surface. The decrease should quickly saturate at a value of about 10% less than the original (fresh surface) value. This is where the requirements of UHV are the most stringent. Mirrors kept at vacuums of the order of 1.0 × 10⁻¹⁰ torr have shown essentially no degradation after more than two years of continuous operation at SRC. If a mirror shows signs of degradation, there are only a few choices. The first is that of stripping, cleaning, and recoating; most of the time the original values can be restored. Cleaning schemes (both in-situ and ex-situ) based on the use of plasma discharges have been proposed with mixed success.[136] The idea is obviously very attractive, but needs further development. In summary, mirrors must be manufactured in a clean environment and kept clean during use. Standard optical shop techniques, augmented by UHV procedures, can provide satisfactory results.
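The mirror tolerances quoted in this section (roughness, power density, and front-to-back temperature difference) can be reproduced with a few lines of arithmetic. The sketch below assumes the Gaussian roughness factor exp[−(4πσ cos θ/λ)²] for Eq. (43), takes the opening angle in Eq. (44) as 1/γ, and uses one-dimensional conduction for the temperature difference; the function names are illustrative.

```python
import math

def roughness_factor(sigma, wavelength, incidence_deg):
    """R(sigma)/R(0) from Eq. (43); sigma and wavelength in the same unit."""
    x = 4.0 * math.pi * sigma * math.cos(math.radians(incidence_deg)) / wavelength
    return math.exp(-x * x)

def max_roughness(wavelength, incidence_deg, max_loss=0.10):
    """Largest sigma that keeps the reflectivity loss below max_loss."""
    x = math.sqrt(-math.log(1.0 - max_loss))
    return x * wavelength / (4.0 * math.pi * math.cos(math.radians(incidence_deg)))

def power_density_mw_cm2(p0_w_per_mrad, gamma, incidence_deg, distance_m):
    """Eq. (44) with opening angle 1/gamma; result converted to mW/cm^2."""
    w_per_m2 = (1000.0 * p0_w_per_mrad * gamma
                * math.cos(math.radians(incidence_deg)) / distance_m**2)
    return 0.1 * w_per_m2  # 1 W/m^2 = 0.1 mW/cm^2

def front_to_back_delta_t(absorbed_mw_cm2, thickness_cm, k_w_per_cm_k):
    """One-dimensional conduction: delta T = q * t / k."""
    return absorbed_mw_cm2 * 1e-3 * thickness_cm / k_w_per_cm_k

sigma_max = max_roughness(10.0, 88.0)             # angstroms, for a 10% loss
incident = power_density_mw_cm2(1.0, 1000.0, 88.0, 5.0)
absorbed = 0.2 * incident                         # mirror reflects about 80%
print(sigma_max, incident, absorbed)
print(front_to_back_delta_t(absorbed, 5.0, 0.013))  # fused silica
print(front_to_back_delta_t(absorbed, 5.0, 2.37))   # aluminum
```

The results should agree with the figures in the text to within rounding: about 7.4 Å, 140 mW/cm², 28 mW/cm², and temperature differences of roughly 10 and 0.06 degrees.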
13.3 Data Communication
The beam line must provide a communication bridge between the stepper and the ESR. From an operational point of view, there is no reason for a stepper to communicate directly with the ring. The beam line replaces, or actually is, a “light bulb”; the stepper may request the beam line to deliver a dose to expose a given field, and the beam line system simply grants or refuses the request, depending on the system’s status. In the same way, the ring should not interrogate the stepper, but simply grant (or refuse) the request of the beam line to open the front end valves. Of course, essentially infinite variations are possible in the implementation of the hardware and software. In general, the most important requirement is a judicious implementation of a system based on a high-speed link and hardware interrupts, as well as hardware backup systems. Watchdog timers are an absolute necessity. In one implementation, an Ethernet backbone is used for fast data transfer, while handshake lines are used to verify the granting of valve and shutter opening or closing actions.
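The request/grant relationship described above (stepper asks the beam line, beam line asks the ring, and neither end talks to the other directly) can be sketched as a small software model. This is a hypothetical illustration only: the class names, the `Reply` values, and the watchdog timeout are assumptions made for the example, and a real installation would rely on hardware interrupts and handshake lines rather than on Python objects.

```python
import time
from enum import Enum

class Reply(Enum):
    GRANTED = "granted"
    REFUSED = "refused"

class Watchdog:
    """Software stand-in for a watchdog timer: expires unless kicked in time."""
    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_kick = clock()

    def kick(self):
        self.last_kick = self.clock()

    def expired(self):
        return self.clock() - self.last_kick > self.timeout_s

class Ring:
    """The ring only answers beam line requests; it never queries the stepper."""
    def __init__(self, beam_available=True):
        self.beam_available = beam_available

    def request_front_end_open(self):
        return Reply.GRANTED if self.beam_available else Reply.REFUSED

class BeamLine:
    """Bridge between stepper and ring; grants or refuses dose requests."""
    def __init__(self, ring, watchdog):
        self.ring = ring
        self.watchdog = watchdog
        self.front_end_open = False

    def open_front_end(self):
        self.front_end_open = self.ring.request_front_end_open() is Reply.GRANTED
        return self.front_end_open

    def request_dose(self, dose_mj_cm2):
        # Refuse if the watchdog lapsed, the front end is closed, or the dose is invalid.
        if self.watchdog.expired() or not self.front_end_open or dose_mj_cm2 <= 0:
            return Reply.REFUSED
        return Reply.GRANTED

# Usage with a fake clock so the example is deterministic.
now = [0.0]
watchdog = Watchdog(timeout_s=1.0, clock=lambda: now[0])
beam_line = BeamLine(Ring(beam_available=True), watchdog)
beam_line.open_front_end()
print(beam_line.request_dose(50.0))   # front end open, watchdog alive
now[0] = 5.0                          # watchdog not kicked for 5 s
print(beam_line.request_dose(50.0))   # refused until the watchdog is kicked
```

The point of the design is that each layer only exposes a grant-or-refuse answer to the layer above it, so a fault in the stepper software cannot directly command the ring.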
13.4 Safety Issues
As briefly mentioned above, two different types of safety issues must be considered around a storage ring source: personnel and machine. When speaking of safety, the radiation we speak of is the hard radiation generated by the high-energy electrons when they interact with the vacuum chamber walls and/or residual gas. The radiation consists typically of gamma rays, although neutrons generated by nuclear reactions induced by the electrons are also present. A storage ring is a very “clean” machine, with very little (if any) residual radioactivity; most of the radiation is generated (a) at well defined locations, or (b) by collisions with the residual gas molecules. The injection procedure is always a source of higher radiation levels (because of less than 100% efficiency). From a personnel point of view, the shielding of radiation must be complete. Specifications vary from laboratory to laboratory and are as a rule much more stringent than the federally approved minimum radiation levels.[125] In the case of compact industrial rings, the reduced size of the machine simplifies the problem by reducing the size (and weight) of the main shielding. In a laboratory environment, where skilled personnel are normally at work, procedures based on non-full containment are acceptable as long as the “hot” areas are clearly marked and/or interlocked. In an industrial setting this is not acceptable and full radiation shielding must be implemented. This essentially forces the use of a beam-line-break mirror so that a radiation stop can be inserted. The machine itself is typically fully enclosed in a concrete vault so that no radiation may leak out. The beam lines have lead shielding arranged in a telescopic fashion so that nowhere can an operator be in a direct path to the machine. The shielding design must always be reviewed by an authorized medical physics team before implementation.[125]
The questions of instrument safety follow quite different guidelines. In a synchrotron radiation x-ray lithography system, several beam lines share the use of a single light source, the synchrotron.
Any accident happening on any beam line should not be capable of bringing the ring down or affecting the other beam lines. This means that the beam line systems must be capable of fully containing the accident. The main concern is that of an accidental catastrophic break of the final vacuum isolation window. As discussed above, acoustic delay lines may be used to provide time for the safety valves to close, isolating the faulty beam line. Experience has shown that catastrophic failures are extremely rare on a “mature” system, where the safety subsystems have been thoroughly developed. An XRL source will require extra protection in the areas where breakdowns are otherwise most likely to occur: electrical feedthroughs, windows (which should be avoided unless absolutely necessary), and bellows. Most of the time these breakages do not lead to catastrophic vacuum failure. Residual gas analyzers must be used permanently to monitor the vacuum composition by checking mass 32 (O₂⁺, indicative of a beam line leak) and mass 4 (He⁺, likely to indicate a window leak). The safety interlocks on the beam lines follow standard approaches, with a microcomputer managing the opening and closing of valves as well as the pumping system. Manual overrides allow nonstandard operations and maintenance. The vacuum gauges used throughout the beam line must be capable of generating interrupt signals to the valves’ interlock system independent of the computer operation. While a completely foolproof system does not exist, a careful engineering study will reduce the likelihood of non-containable accidents to an acceptable minimum.
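The residual gas analyzer check and the gauge-driven interlock described above can be illustrated with a short sketch. Everything here is hypothetical: the threshold, the units of the readings, and the function names are assumptions for illustration, and in a real beam line the gauge interrupt would act in hardware, independently of any computer.

```python
# Minimal sketch of the leak diagnosis and interlock logic described in the text.
# Readings are partial pressures in arbitrary units; the threshold is an assumption.

LEAK_SIGNATURES = {
    32: "beam line leak (O2+ at mass 32)",
    4: "window leak (He+ at mass 4)",
}

def diagnose(rga_readings, threshold):
    """Return the suspected leak types for readings above the threshold."""
    return [
        LEAK_SIGNATURES[mass]
        for mass in sorted(LEAK_SIGNATURES)
        if rga_readings.get(mass, 0.0) > threshold
    ]

def valves_may_open(gauge_ok, computer_request):
    """The gauge signal always wins: the computer cannot override a bad vacuum."""
    return gauge_ok and computer_request

# Usage: a helium peak with normal oxygen points to the window.
print(diagnose({4: 5.0, 32: 0.1}, threshold=1.0))
print(valves_may_open(gauge_ok=False, computer_request=True))
```

Keeping the gauge signal as a separate input, rather than folding it into the computer's request, mirrors the requirement that the interlock must work even if the microcomputer fails.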
13.5 Machines and Lithography
A synchrotron-based x-ray lithography system comprises three main subsystems: the source, the beam line, and the exposure station. We have described the source and the beam line, while the exposure tools are covered in the first part of this chapter. At this point, we want to emphasize that the design of a successful lithography process requires a careful systems engineering study. Many trade-offs are possible, delivering approximately the same performance but at widely different costs. For example, beam line acceptance may be used to compensate for a low-power machine; modularity is essential in order to accommodate tools with different scanning requirements, and so on. From a lithography point of view, there are two parameters that will determine the viability of the approach: power density and cost. Resolution is certainly not an issue, at least until features of less than 0.1 µm are needed. The economics have been addressed in several papers;[137] here we limit ourselves to the discussion of the lithographic parameters, excluding cost. We have chosen this position because the cost of a process has only relative meaning. The absolute cost is not a very significant figure of merit; cost-effectiveness is. Cost-effectiveness must not be confused with net cost. To make a trivial example, a large bulldozer is certainly more expensive than a garden spade, but few would argue that the spade is a more cost-effective tool to build highways. Synchrotron-based x-ray lithography is certainly expensive, but is also capable of exceedingly large economies of scale. A well designed x-ray lithography synchrotron radiation system can easily deliver in excess of 50 mW/cm² over a field of 50 mm (H) by 25 mm (V) of collimated radiation in the lithographically useful exposure window. (We notice, en passant, that most photoresist systems have a sensitivity which is quoted by referring to a reading from a reference instrument, such as a calorimeter; these readings do not measure the lithographically useful dose since they respond to the whole incoming spectrum. A tailored beam line will deliver only useful photons, so that an efficiency factor must be considered.) Resist systems with sensitivities in the range of 50–100 mJ/cm² are available in experimental formulations and should be reaching a stable formulation in the market in early 1991. This makes it easily possible to achieve the goal of 1 sec/field exposure; assuming an overhead of 1 sec for stepping and alignment, a 6-inch diameter wafer would be exposed in 30 sec. Including wafer loading/unloading, the system should be capable of delivering more than sixty 6-inch diameter wafers per hour and will most likely be limited by the exposure tool overhead. For a manufacturing plant using thirteen beam lines, this would be equivalent to 60 × 23 × 13 = 17,940 wafer starts a day. A storage ring should support a very large silicon operation. Clearly, even a large initial capital cost can be amortized quickly by such a production rate. The technology to build efficient sources and beam transport systems is here today, and work is progressing on the implementation of several such sources in an industrial environment. Time will tell if synchrotrons do provide an economical as well as technical answer to the challenges of lithography for the end of the century.
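The throughput arithmetic above can be checked with a few lines of code. The sketch below is a rough illustration only: the field count is estimated from a simple area ratio (an assumption, since the text does not give the field layout), the 23 operating hours per day are taken from the product in the text, and all function names are hypothetical.

```python
import math

def exposure_time_s(sensitivity_mj_cm2, power_mw_cm2):
    """Exposure time per field: dose (mJ/cm^2) divided by power density (mW/cm^2)."""
    return sensitivity_mj_cm2 / power_mw_cm2

def fields_per_wafer(wafer_diameter_mm, field_w_mm, field_h_mm):
    """Crude area-ratio estimate of the number of fields (assumption, no edge layout)."""
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    return math.ceil(wafer_area / (field_w_mm * field_h_mm))

t_exp = exposure_time_s(50.0, 50.0)              # 50 mJ/cm^2 resist at 50 mW/cm^2
n_fields = fields_per_wafer(152.4, 50.0, 25.0)   # 6-inch wafer, 50 mm x 25 mm field
overhead_s = 1.0                                 # stepping and alignment per field
wafer_time_s = n_fields * (t_exp + overhead_s)

wafers_per_hour = 60     # figure quoted in the text, including load/unload
hours_per_day = 23       # implied by the product in the text
beam_lines = 13
wafer_starts_per_day = wafers_per_hour * hours_per_day * beam_lines

print(t_exp, n_fields, wafer_time_s, wafer_starts_per_day)
```

Under these assumptions the estimate gives 15 fields and 30 sec of exposure and stepping per wafer, matching the figure in the text, and 17,940 wafer starts a day for thirteen beam lines.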
14.0 ACKNOWLEDGMENT
This chapter is based on the outcome of many years of activity and it would be difficult to acknowledge all the contributions to its content. In particular, Barry Lai wrote a large section of SHADOW and F. Baszler of TRANSMIT, codes on which many of the conclusions are based. Among CXrL staff, R. K. Cole provided many ideas in the beam line optical design and other areas; C. Welnak wrote most of the documentation of SHADOW. The continuous support of the University of Wisconsin, of the Synchrotron Radiation Center and Center for X-Ray Lithography staff is also acknowledged. This work would have been impossible without the support of DARPA/NRL to the Center for X-Ray Lithography. The operation of the Synchrotron Radiation Center is supported by the National Science Foundation.
REFERENCES
1.
Spears, D. L., and Smith, H. I., “High Resolution Pattern Replication Using Soft X-Rays,” Electron. Lett., 8:102–104 (1972)
2.
Gordon, E., and Herriott, D. R., “Pathways in Device Lithography,” IEEE Trans. Electron Devices, 22:371–375 (1975)
3.
Plotnik, I., “Metrology Applied to X-Ray Lithography,” Solid State Tech. J., 32:102 (1989)
4.
Jaeger, R. P., and Hefflinger, B. L., “Linewidth Control in X-Ray Lithography: The Influence of the Penumbral Shadow,” Proc. SPIE - Int. Soc. Opt. Eng., 471:110–126 (1984)
5.
Viswanathan, R., Acosta, R. E., Seeger, S. D., Voelker, H., Wilson, A. D., Babich, I., Maldonado, J., Warlaumont, J., Vladimersky, O., Hohn, F., Crockatt, D., and Fair, R., “Fully Scaled 0.5 µm Metal-Oxide Semiconductor Circuits by Synchrotron X-Ray Lithography: Mask Fabrication and Characterization,” J. Vac. Sci. Technol. B, 6:2196–2201 (1988)
6.
Harrell, S., and Alexander, D., “Characterization Techniques for X-Ray Lithography Submicron Metrology,” Proc. SPIE - Int. Soc. Opt. Eng., 471:103–109 (1984)
7.
Fay, F., and Hasan, T., “Electrical Measurement Techniques for the Characterization of X-Ray Lithography Systems,” Solid State Tech. J., 29:239–243 (1986)
8.
Wilson, A. D., “X-Ray Lithography: Can it be Justified?” Solid State Tech. J., 29:239–243 (1986)
9.
Bernacki, S. E., and Smith, H. I., “Characteristic and Bremsstrahlung X-Ray Radiation Damage,” IEEE Trans. Electron. Devices, 22:421–428 (1975)
10.
Blais, P. D., “A Practical System for X-Ray Lithography,” Proc. Tech. Symp., Tokyo (Jan. 7–11, 1982)
11.
Ostercamp, W. J., “Efficient X-Ray Generation,” Phillips Rev. Rpt., 3:303 (1948)
12.
Hayasaka, T., Ishihara, S., Kinoshita, H., and Takeuchi, N., “A Step-and-Repeat X-Ray Exposure System for 0.5 µm Pattern Replication,” J. Vac. Sci. Technol. B, 3:1581–1597 (1985)
13.
Kreuzer, J. L., Hughes, G. P., and LaFlandra, C., “Precision Alignment for X-Ray Lithography,” Proc. SPIE - Int. Soc. Opt. Eng., 471:84–89 (1984)
14.
Nagel, D. J., Whitlock, R. R., Greig, J. R., Pechacek, R. E., and Peckerar, M. C., “Developments in Semiconductor Microlithography,” Proc. SPIE - Int. Soc. Opt. Eng., 135:46–53 (1978)
15.
Matthews, S. M., and Cooper, R., “Plasma Sources for X-Ray Lithography,” Proc. SPIE - Int. Soc. Opt. Eng., 333:136–139 (1982)
16.
Guchek, R. A., and Murray, J. J., “High Resolution Soft X-Ray Optics,” Proc. SPIE - Int. Soc. Opt. Eng., 316:196–202 (1982)
17.
Frankel, R. D., et al., Proc. Kodak Microel. Seminar, p. 82 (1986)
18.
Burkhalter, P. G., Shiloh, J., Fisher, A., and Cowan, R. D., “X-Ray Spectra from Gas-Puff Z-Pinch Device,” J. Appl. Phys., 50:4532–4550 (1979)
19.
Peacock, N. J., Speer, R. J., and Hobby, M. G., J. Phys. B, 2:798 (1969)
20.
Nagel, D. J., Dozier, C. M., Klein, B. M., and Mather, J. W., Bul. Am. Phys. Soc., 18:1363 (1973)
21.
Pearlman, J. S., and Riordan, J. C., “X-Ray Lithography Using a Pulsed Plasma Source,” J. Vac. Sci. Technol., 19:1190–1193 (1981)
22.
Okada, I., Saitoh, Y., Itabashi, S., and Yoshihara, T., “A Plasma X-Ray Source for X-Ray Lithography,” J. Vac. Sci. Technol. B, 4:243–247 (1986)
23.
Smith, H. I., Spears, D. L., and Bernacki, S. E., “X-Ray Lithography: A Complementary Technique to Electron Beam Lithography,” J. Vac. Sci. Technol., 10:913–917 (1973)
24.
Feder, R., Spiller, E., and Topolian, J., “Replication of 0.1 µm Geometry with X-Ray Lithography,” J. Vac. Sci. Technol., 12:1332–1335 (1975)
25.
Neukermans, A. P., “Current Status of X-Ray Lithography, Part II.,” Solid State Tech. J., 27:213–219 (1984)
26.
Fencil, C. R., and Hughes, G. P., “Submicron Lithography,” Proc. SPIE - Int. Soc. Opt. Eng., 333:100–110 (1982)
27.
Spears, D. L., and Smith, H. I., “X-Ray Lithography - A New High Resolution Replication Process,” Solid State Tech. J., 15:21–26 (1972)
28.
Shimkunas, A. R., “Advances in X-Ray Mask Technology,” Solid State Tech. J., 27:192–199 (1984)
29.
Maydan, D., Coguin, G. A., Maldonado, J. R., Somekh, S., Lou, D. Y., and Taylor, G. N., “High Speed Replication of Submicron Features on Large Areas by X-Ray Lithography,” IEEE Trans. Electron. Devices, 22:429–433 (1975)
30.
Greeneich, J. S., “X-Ray Lithography: Part I - Design Criteria for Optimizing Resist Energy Absorption; Part II - Pattern Replication with Polymer Masks,” IEEE Trans. Electron. Devices, 22:434–439 (1975)
31.
Buckley, W. D., Nester, J. F., and Windischmann, H., “X-Ray Lithography Mask Technology,” J. Electrochem. Soc., 128:1116–1120 (1981)
32.
Funayama, T., Takayama, Y., Inagaki, T., and Nakamura, M., “New X-Ray Mask of Al-AIO Structure,” J. Vac. Sci. Technol., 12:1324 (1975)
33.
Spears, D. L., Smith, H. I., and Stern, E., “X-Ray Replication of Scanning Electron Microscope Generated Patterns,” Proc. SPIE - Int. Soc. Opt. Eng., (1972/3)
34.
Suzuki, K., Matsui, J., Kadota, T., and Ono, T., “Preparation of X-Ray Lithography Masks with Large Area Sandwich Structure Membrane,” Jpn. J. Appl. Phys., 17:1447–1448 (1978)
35.
Ebata, T., Sekimoto, M., Ono, T., Suzuki, K., Matsui, J., Ulitchi, C., and Nakayama, S., “Transparent X-Ray Lithography Masks,” Jpn. J. Appl. Phys., 21:762–767 (1982)
36.
Csepregi, L., and Heuberger, A., “Fabrication of Silicon Oxynitride Masks for X-Ray Lithography,” J. Vac. Sci. Technol., 16:1962–1964 (1979)
37.
U.S. Patent 4,171,4989, Adams, A. C., Caplo, C. D., Levinstein, H. J., Sinha, A. K., and Wang, D. N.; Assigned to Bell Telephone Laboratories (1979)
38.
Hofer, D., Powers, J., and Grobman, W. D., “X-Ray Lithographic Patterning of Magnetic Bubble Circuits with Submicron Dimensions,” J. Vac. Sci. Technol., 16:1968–1972 (1979)
39.
Blais, P. D., O’Keefe, T., Tremere, D., and Creswell, M., “A Practical System for X-Ray Lithography,” Semicon-West Tech. Prog. Proc. (May 28, 1982)
40.
Parrens, P., Tabouret, E., and Tacussei, M. C., “Preparation of X-Ray Lithography Masks with 0.1 µm Structures,” J. Vac. Sci. Technol., 16:1965–1968 (1979)
41.
Georgiou, G. E., Jankoski, C. A., and Palumbo, T. A., “DC Electroplating of Submicron Gold Patterns on X-Ray Masks,” Proc. SPIE - Int. Soc. Opt. Eng., 471:96–99 (1984)
42.
Yamagishi, F., Kimura, Y., and Furukawa, Y., “Fabrication of Silicon Polyamide Complex X-Ray Masks,” Fujitsu Sci. Tech. J., p. 85 (1980)
43.
Brors, D. L., “X-Ray Mask Fabrication,” Proc. SPIE - Int. Soc. Opt. Eng., 333:111–112 (1982)
44.
Suzuki, K., and Matsui, J., “SiN Membrane Masks for X-Ray Lithography,” J. Vac. Sci. Technol., 20:191–194 (1982)
45.
Acosta, R. E., Maldonado, J. R., Towart, L. K., and Warlaumont, J. R., “B-Si Masks for Storage Ring X-Ray Lithography,” Proc. SPIE - Int. Soc. Opt. Eng., 488:114–116 (1983)
46.
Gong, B. M., and Ye, Y. D., “Fabrication of Polyamide Masks for X-Ray Lithography,” J. Vac. Sci. Technol., 19:1204 (1981)
47.
Ono, T., and Ozawa, A., “High Contrast X-Ray Mask Preparation,” J. Vac. Sci. Technol. B, 2:68–72 (1984)
48.
Bassous, E., Feder, R., Spiller, E., and Totalian, J., “High Transmission X-Ray Masks for Lithographic Applications,” Solid State Tech. J., 19:55–58 (1976)
49.
U.S. Patent 3,975,252; Fraser, D. B., and Lou, D. Y. K. (1976)
50.
Maydan, D., Coguin, G. A., Levinstein, H. J., Sinha, A. K., and Wang, D. N. K., “Boron Nitride Mask Structure for X-Ray Lithography,” J. Vac. Sci. Technol., 16:1959–1961 (1979)
51.
Garrettson, G., and Neukermans, A. R., “HP Gives Peek at X-Ray Aligner,” Semi. Intl., p. 17 (1983)
52.
Triplett, B. B., and Hollman, R. F., “X-Ray Lithography for VLSI,” IEEE Proc., 71:585–588 (1983)
53.
Neukermans, A. P., “Status of X-Ray Lithography at HP,” Proc. SPIE - Int. Soc. Opt. Eng., 393:93–98 (1983)
54.
Bartelt, J. L., Slayman, C. W., Wood, J. E., Chen, J. Y., McKenna, C. M., Minning, C. P., Coakley, J. F., Hollman, R. E., and Perrygo, C. M., “Mask Ion-Beam Lithography: A Feasibility Demonstration for Sub-Micrometer Device Fabrication,” J. Vac. Sci. Technol., 19:1161–1171 (1981)
55.
Lepselter, M. P., Alles, D. A., Levinstein, H. J., Smith, G. E., and Watson, J. A., “A Systems Approach to 1-µm NMOS,” IEEE Proc., 71:640–656 (1983)
56.
Plotnik, I., Porter, M. E., Toth, M., Akhtar, S., and Smith, H. I., “Ion-Implant Compensation of Tensile Stress in Tungsten Absorber for Low Distortion X-Ray Masks,” Microel. Engrg., 5:51–59 (1986)
57.
Adams, A. C., and Capio, C. D., “The Chemical Deposition of Boron-Nitrogen Films,” J. Electrochem. Soc., 127:399 (1980)
58.
Bomley, E. I., Randall, J. N., Flanders, D. C., and Mountain, R. W., “A Technique for the Determination of Stress in Thin Films,” J. Vac. Sci. Technol. B, 1:364–1366 (1983)
59.
Klokholm, E., “An Apparatus for Measuring Stress in Thin Films,” Rev. Sci. Instrum., 40:1054–1058 (1969)
60.
Garrettson, G., and Neukermans, A. R., Mkt. Engrg. Academic, p. 247, London (1983)
61.
Glang, R., Holmwood, R. A., and Rosenfield, R. L., “Determination of Stress in Films on Single Crystalline Silicon Substrates,” Rev. Sci. Instrum., 36:7–10 (1965)
62.
Yanof, A. W., Resnick, D. J., Jankoski, C. A., and Johnson, W. A., “X-Ray Mask Distortion: Process and Pattern Dependence,” Proc. SPIE - Int. Soc. Opt. Eng., 632:118–132 (1986)
63.
Karnezos, M., “X-Ray Mask Distortions,” Solid State Tech. J., 30:151–156 (1987)
64.
Frankel, R. D., and Peters, D. W., “Engineering of Reticles for Laser-Based-Plasma Sources,” Microel. Manuf. Test., 10:8–9 (1987)
65.
Ruby, R., Baldwin, D., and Karnezos, M., “The Use of Diffraction Techniques for the Study of In-Plane Distortions of X-Ray Masks,” J. Vac. Sci. Technol. B, 5:272–277 (1987)
66.
Karnezos, M., “Effects of Stress on the Stability of X-Ray Masks,” J. Vac. Sci. Technol. B, 4:226–229 (1986)
67.
Johnson, W. A., Levy, R. A., Resnick, D. J., Saunders, T. E., and Yanof, A. W., “Radiation Damage Effects in Boron Nitride Mask Membranes Subjected to Synchrotron X-Ray Exposure,” J. Vac. Sci. Technol. B, 5:257–261 (1987)
68.
Levy, R. A., Resnick, D. J., Frye, R. C., Yanof, A. W., Wells, G. M., and Cerrina, F., “An Improved Boron Nitride Technology for Synchrotron X-Ray Masks,” J. Vac. Sci. Technol. B, 6:154–161 (1988)
69.
Peters, D. W., Dardzinski, B. J., and Frankel, R. D., “Defect Printability for Soft X-Ray Microlithography,” Proc. SPIE - Int. Soc. Opt. Eng., 1263:99–109 (1990)
70.
Atwood, D. K., Fisanick, G. J., Johnson, W. A., and Wagner, A., “Defect Repair Techniques for X-Ray Masks,” Proc. SPIE - Int. Soc. Opt. Eng., 471:127–134 (1984)
71.
Burggraaf, P. S., “X-Ray Lithography and Mask Technology,” Semi. Intl., (1985)
72.
Ehrlich, D. J., Osgood, M., Silversmith, D. J., and Deutsch, T. F., “One-Step Repair of Transparent Defects in Hard-Surface Photolithographic Masks via Photo-deposition,” Electron Device Lett., 1:101–109 (1980)
73.
Weigmann, U., et al., “Repair of Electroplated Au Masks for X-Ray Lithography,” J. Vac. Sci. Technol. B, 6:2170–2173 (1988)
74.
Economou, N. P., and Cambria, T. D., “Mask and Circuit Repair with Focused-Ion Beams,” Solid State Tech. J., 30:133–136 (1987)
75.
Herriott, D. R., Collier, R. J., Alles, D. S., and Stafford, J. W., “EBES: A Practical Electron Lithographic System,” IEEE Trans. Electron. Devices, 22:385–392 (1975)
76.
Flanders, D. C., and Smith, H. I., “A New Interferometric Alignment Technique,” Appl. Phys. Lett., 31:426–428 (1977)
77.
Austin, S., Smith, H. I., and Flanders, D. C., “Alignment of X-Ray Lithography Masks Using a New Interferometric Technique - Experimental Results,” J. Vac. Sci. Technol., 15:984–986 (1978)
78.
Fay, B., Trotel, J., and Frichet, A., “Optical Alignment System for Submicron X-Ray Lithography,” J. Vac. Sci. Technol., 16:1954–1958 (1979)
79.
Kouno, E., Tanaka, Y., Iwata, J., Tasaki, Y., Kakimoto, E., Okada, K., Suzuki, K., Fujii, J. K., and Nomura, E., “An X-Ray Stepper for Synchrotron Radiation Lithography,” J. Vac. Sci. Technol. B, 6:2135–2138 (1988)
80.
Nelson, D. A., diMilia, V., and Warlaumont, J. M., “A Wide Range Alignment System for X-Ray Lithography,” J. Vac. Sci. Technol., 19:1219–1223 (1981)
81.
Feldman, M., White, A. D., and White, D. L., “Application of Zone Plates to Alignment in Microlithography,” J. Vac. Sci. Technol, 19:1224–1228 (1981)
82.
Feldman, M., White, A. D., and White, D. L., “Application of Zone Plates to Alignment in X-Ray Lithography,” Proc. SPIE - Int. Soc. Opt. Eng., 333:124–130 (1982)
83.
Lavine, J. M., Mason, M. T., and Beaulieu, D. R., “The Effect of Semiconductor Processing Upon the Focusing Properties of Fresnel Zone Plates used as Alignment Targets,” Proc. SPIE - Int. Soc. Opt. Eng., 470:122–135 (1984)
84.
Fay, B., and Novak, W. T., “Automatic X-Ray Alignment System for Submicron VLSI Lithography,” Solid State Tech. J., 28:175–179 (1985)
85.
Novak, W. T., “A Lithography System for X-Ray Process Development,” Proc. SPIE - Int. Soc. Opt. Eng., 393:106–113 (1983)
86.
Kleinknecht, H. P., “Diffraction Gratings as Keys for Automatic Alignment in Proximity and Projection Printing,” Proc. SPIE - Int. Soc. Opt. Eng., 174:63–69 (1979)
87.
Lyszczarz, T. M., Flanders, D. C., Economou, N. P., and DeGraff, P. D., “Experimental Evaluation of Interferometric Alignment Techniques for Multiple Mask Registration,” J. Vac. Sci. Technol., 19:1214–1218 (1981/2)
88.
Kinoshita, H., Une, A., and Iki, M., “A Dual Grating Alignment Technique for X-Ray Lithography,” J. Vac. Sci. Technol. B, 1:1276–1279 (1983)
89.
Doemens, G., and Mengel, P., Automatic Mask Alignment for X-Ray Microlithography, Siemens Forsch. und Entwickl - Ber., Bd 13:47–47 (1984)
90.
Heuberger, A., “X-Ray Lithography,” Solid State Tech. J., 29:93–101 (1986)
91.
Semenzato, L., Eaton, S., Neukermans, A., and Jaeger, R., “Monte Carlo Simulation of Line Edge Profiles and Linewidth Control in X-Ray Lithography,” J. Vac. Sci. Technol. B, 3:245–252 (1985)
92.
Maydan, D., “X-Ray Lithography for Microfabrication,” J. Vac. Sci. Technol., 17:1164–1168 (1980)
93.
David, J. E., and Stover, H. L., “Optical Test Structures for Process Control Monitors, Using Wafer Stepper Metrology,” Solid State Tech. J., 25:131–141 (1982)
94.
Glendinning, W. B., and Goodreau, W. M., “Direct-Write Electron Beam Patterning Re-Registration and Metrology,” Proc. SPIE - Int. Soc. Opt. Eng., 480:141–144 (1984)
95.
Bay, B., and Alexander, D., “Recent Printing and Registration Results with X-Ray Lithography,” Proc. SPIE - Int. Soc. Opt. Eng., 537:57–68 (1985)
96.
Stemp, I. J., Nicholas, K. H., and Brockman, H. E., “Automatic Testing and Analysis of Misregistrations Found in Semiconductor Processing,” IEEE Trans. Electron. Devices, 26:729–732 (1979)
97.
Muller, K. H., Brehm, K., and Werner, K., “Magnification Corrected Imaging in Synchrotron Radiation X-Ray Lithography,” J. Vac. Sci. Technol. B, 6:2139–2141 (1988)
98.
Atoda, N., Kawakatsu, H., Tanino, H., Ichimura, S., Hirata, M., and Hoh, K., “Diffraction Effects on Pattern Replication with Synchrotron Radiation,” J. Vac. Sci. Technol. B, 1:1267–1270 (1983)
99.
Silverman, J. P., diMilia, V., Katakoff, D., Kwietniak, K., Seeger, D., Wang, L. K., Warlaumont, J. M., Wilson, A. D., Crockatt, D., Devenuto, R., Hill, B., Hsia, L. C., and Rippstein, R., “Fabrication of Fully Scaled 0.5 µm n-Type Metal Oxide Semiconductor Test Devices Using Synchrotron X-Ray Lithography: Overlay, Resist Processes, and Device Fabrication,” J. Vac. Sci. Technol. B, 6:2147–2152 (1988)
100.
Takahashi, N., “SHI Group Compact Storage Ring Light Source for X-Ray Lithography,” Proc. SPIE - Int. Soc. Opt. Eng., 923:47–54 (1988)
101.
Aitken, J. M., “1 µm MOSFET’s VLSI Technology: Part VI,” J. Solid State Circuits., 14:294 (1979)
102.
Davis, R. T., Woods, M. H., Will, W. E., and Measel, P. R., “High-Performance MOS Resists Radiation,” Electronics, (1982)
103.
Gdula, R. A., “The Effect of Processing on Radiation Damage in SIO,” IEEE Trans. Electron. Devices, 26:644–647 (1979)
104.
Stover, H. L., Hause, F. L., and McGreevy, D., “X-Ray Lithography for One Micron LSI,” Solid State Tech. J., 22:95–100 (1979)
105.
Grove, A., “Physics and Technology of Semiconductors,” Wiley, New York (1967)
106.
Kuhn, M., “A Quasi-Static Technique for MOS CV and Surface State Measurements,” Solid State Elect., 13:873–885 (1970)
107.
Manchanda, L., “Radiation Effects on MOSFET’s Fabricated with NMOS Submicrometer Technology,” Electron Device Lett., 5:412–414 (1984)
108.
Chen, J. Y., Henderson, R. C., Patterson, D. O., and Martin, R., “Radiation Effects of e-Beam Fabricated Submicron NMOS Transistors,” Electron Device Lett., 3:13–15 (1982)
109.
See, for example, Jackson, J. D., Classical Electrodynamics, p. 654, Wiley, New York (1975)
110.
Sokolov, A. A., and Ternov, I. M., “Radiation from Relativistic Electrons,” Amer. Inst. of Phys., p. 82, New York (1986)
111.
One such Program is Transmit, available from the Center for X-Ray Lithography, 3731 Schneider Dr., Stoughton, WI 53589.
112.
So, D., Lai, B., and Cerrina, F., SPIE., 30:6 (1987)
113.
Krinsky, S., in: “Handbook on Synchrotron Radiation,” (E. Koch, ed.)., North Holland (1985)
114.
Green, G. K., Brookhaven National Lab Report, BNL 50522 (1973)
115.
Chapman, K., Lai, B., and Cerrina, F., Nucl. Instr. and Meth. in Physics Review, A283 (1989)
116.
Born, M., and Wolf, E., Principles of Optics, p. 165, Pergamon Press, Oxford (1980)
117.
Goldstein, S., Classical Mechanics, Addison-Wesley (1980)
118.
Available from CXrL, see (Ref. 111) above.
119.
See, for example, VLSI Processing, (S. Sze, ed.), McGraw Hill (1983)
120.
So, D., Lai, B., Wells, G. M., and Cerrina, F., J. Vac. Sci. and Tech., 6:2190–5 (1988)
121.
RF Cavity.
122.
More Accelerators.
123.
For a recent review of Japanese activity, See H. Winick, in the Proceedings of the 6th Synchrotron Radiation Instrumentation Conf., Berkeley, (1989); Nucl. Instrum. and Methods, AA291 (1990)
124.
See, for Example, the Report from the Brookhaven Workshop on X-Ray Lithography.
125.
National Council on Radiation Protection and Measurements, Rep. 39, (1971b)
126.
Aladdin Safety Guidelines.
127.
Dushman, S., Scientific Foundations of Vacuum Technique, Wiley, New York (1967)
128.
Brodsky, E., Synchrotron Radiation Center, Private Comm.
129.
Acoustic Delay Line.
130.
Be and Other Windows.
131.
IBM Beam Lines.
132.
NTT Japanese Ring.
133.
See, for example, Johnson, R. L., in, Handbook on Synchrotron Radiation, (E. Koch, ed.), North Holland (1985)
134.
Cole, R. K., and Cerrina, F., CXrL, unpublished.
135.
Lai, B., Chapman, K., and Cerrina, F., Nucl. Instrum. and Methods, A266:544 (1988). SHADOW is available from CXrL, Univ. Wisc., see (Ref. 111).
136.
Johnson, E. D., Hulbert, S. L., Garrett, R. F., Williams, G. P., and Knotek, M. L., Rev. Sci. Instru., 58:1042 (1987)
137.
See, for example, Hill, R. H., in: J. Vac. Sci. and Techn. B, 7:1387 (1989); also, Wilson, A., Solid State Tech., 29:249 (1986)
Index
957
Index
Abbe errors 558, 584 Abbe’s theory 489 Aberration 475, 496, 511, 516, 528, 551, 705, 708 characteristics 518 chromatic 516, 528, 705 coefficients 805 condenser 600 effects 613 lens 492, 516 lens system 806 modeling 516 pattern 520 sources 519 spherical 518, 597, 705 Ablate 672 Abscissa 554 Absorbed energy distribution 676 Absorber 822 patterning 857 Absorption 560 coefficients 872 gas rations and substrate temperature 879 Accelerating voltage 697 Acceleration 767, 775
Accelerometer 579 Acceptance angle 483 Accuracy of measurement 453 Acetal system 100 Acetone quantum efficiency 79 ACI circle defects 363 Acid catalysis 730 Acid-catalyzed deprotection 100 Acid-hardening 743 Acoustic delay line 939 Acoustic excitations 765 Across-field errors 562 Actinic bandwidth 498 Actinic radiation 574 Actinic wavelength 505, 513, 552, 578 Additive process 698 Additive transfer 730 Addressing 711 Adhesion 676, 732 development step 148 improvements 164 promotion 151 resist method 156
957
958
Handbook of VLSI Microlithography
Adhesion failure original processes 159 rework processes 159 Adhesion module 243 ADI sphere defects 364 Advantest F5120 725 Aerial image 480, 504, 523, 524, 527, 569, 606 Aerosol particles 214 AFM 389. See also Atomic force microscope cross section techniques 414 AGV 664 Air bearings 584 Air shower 585 Air-bearing-based oil-less pumps 939 Air-flow 765 control 212 excitation 765 studies 366 Air/resist interface 574 Airy disk 478, 480 pattern 571 Airy image 480 AIT system 340 Alignment 582, 583, 810, 812 brightfield 587 darkfield 587 data collection 592 enhanced global 434, 582 field-by-field 582 modeled approach 434 objective 588 through the lens 588 variation 582 Alignment mark 691, 811, 813, 844 observation 811 Alignment shifts 591 Alignment signal processing 589
Alignment system configuration 585 Alignment targets 591, 592 coat conformality effect 227 Alignment tolerance 303 All-metal angle valves 939 Alternative advanced technologies off-axis illumination 37 OPC 37 PSM 37 Ambient control 254 Ambient environment temperature 206 Ambient excitations 759, 761 Ambient floor motions 759, 761 Amortized mask costs 65 Amplitude 475 Amplitude filter 532 Amplitude ratio 533 Analysis of Variance 455 Anamorphism 599 Angle of departure 484 Angle of incidence 815 Angular distribution 920 Angular resolution criterion 480 Annual marketing report sales volume 22 Annular illumination benefits 531 Annulus 530 Anode 700 ANOVA 458, 542. See also Analysis of Variance degrees of freedom 456 mean square deviation 456 sum of squares 456 table 410 test 409 Antireflection data 577 Antireflective coatings 277, 570. See also ARC
Index Antireflective layer 574 compatibility 577 Aperture 550 beam-shaping 704, 725 cell mask 725 circular 477, 478, 480 externally selectable 704 grounded 704 numerical 483, 527 pre-lens 705 rectangular 477 spray 704 stop 496 Apodizing filter 570 Application focused ion beam 814 specific integrated circuits 722 specific-type resists 102 Aqueous base developers 84 ARC 276, 277. See also Antireflective coatings alternatives 288 bake latitude process 278 development rate 278 film thickness 285 inorganic systems 283 optimization strategy process 278 organic systems 283 thickness plots 280 ARC-XLT 294 ARL 574 AS-type resist design categories 89 ASET 749 Asian foundry operations 20 ASIC 722 Aspect ratio 549, 590 Aspheric mirror 582 Aspherical optical elements 522 Assistant apertures 624, 625 Association for Super-Advanced Electronics Tech. 749
959
Astigmatism 519, 520, 554, 613, 806 Asymmetric errors 599 Asymmetric imaging effects 559 Asymmetric lens aberrations coma 38 three-leaf clover 38 Asymmetric signals 589 Asymmetry 591 Atomic force microscope 389 Atomized photoresist particles 203 Attenuated-PSM resists 101 Attenuation 836 Auto-defect classification 332 Autofocus system 557 air gauge 557 capacitive sensor 557 grazing incidence optical 557 Automated wafer inspections 363 Automatic Guided Vehicles 664 Automatic mask alignment 897 Automatic transmission 644 Automatic wafer handling 48 Automation 332, 644, 645 considerations 663 designer 662 general topics projects 663 standards 663 tools 663 Auxiliary electrode 801 Axially symmetric source, 530 Azimuth filtering 345 Azimuthal rotation 584
Back-end processing TIS problems 422 Backscatter 726, 729 coefficient 680 distribution 676 Backscattered electrons 42, 684 detection mode 395 secondary electrons 388
Backscattering implications 677 Backward reflection 574 Baking 580 Bandwidth 517, 565, 711 optimization 567 Bar targets 489 BARC 285. See also Bottom antireflection coatings Bare wafer approach 355 Barometric pressure compensation 213 BCD 256. See also Bulk chemical distribution particle contamination components 257 Be window 939 Beam blanking 700, 710, 715, 806, 808 Beam crossover 716 Beam current 846 density 696 Beam deflection 675, 706, 710, 807 Beam diameter 391, 805, 806 Beam energy 676, 683 Beam line 42, 934 communication bridge 944 construction 939 delay line 939 exit window 938 front end 936 lead shielding 945 mirrors 940 optically active 929 pumps 939 requirements 941 single mirror 942 three mirror 942 two mirror 941, 942 typical layout 936 valves 939 Beam optics 924
Beam profile ellipsometer 441 Beam shape throughput advantage 712 Beam sharpness evaluation apparent beam width 389 Beam spot size 805 Beam transport system subsystems 935 Beam writing 806 Beamlets 844 Beams large-area 702 Beamsplitter 566 Beryllium window 869 Bessel functions 477 Bethe range 678, 680, 683 Bias resistor 703 Biasing 692 Bilayer approach 740 Bilayer ARCs 278 Bilayer systems 260 BIMOS product wafers 271 Binary exposure file 689 Biphotonic average 92 Bis-arylazide sensitizer 92 Bistrimethylsilylacetimide 151 Blanking 808, 809, 832 pulse 809 Blocked 2-layer process 302 Blocking 539 Blurring 806 Boolean subtraction 690 Bornside’s experimental model 217 Boron nitride residual stress 879 Borophosphosilicate glass 150, 561 Bossung curve 548, 550 Bossung focus-exposure plot 553, 560 Bottom antireflection coatings 285 Bottom EBR 221
Box-Behnken design 176, 540 Box-Behnken tests 540 Box-Behnken response surface design 173 Box-in-box structure 595 BPE. See Beam profile ellipsometer BPSG 561. See also Borophosphosilicate glass Bremsstrahlung 935 Brewster’s angle significance 445 Bridge 604 Bridging 696 Bright resist image negative charging 402 Brightfield detection 344 defect 334 Brightfield inspections metal layers 337 Brightfield mask 571 Brightfield techniques darkfield techniques 339 defect types 337 Brightfield tools 330, 333 Brightness 701 Broad angle collector 349 Broadband exposure 574 BSA. See Bistrimethylsilylacetimide BSE silicon/resist contrast 400 Bulging 551 Bulk chemical change 814 distribution 256 Bulk delivery 663 Bulk effect 122, 263 Bulk film absorption 572 Bulky detector example Everhart-Thornley type 386 Butting errors field 688 stripe 688 subfield 688
CA resists 193 nm 109 CAD 606, 672. See also Computer aided design pattern preparation conditions 686 systems capability 23 Calibration 693, 714 drift 714 fundamental 714 stage runout 714 Canister pressure 242 Capacitance voltage plots 908 Capacitive probe 709 Cartesian angle 924 Cartesian geometry 688 Cartesian grid 687 Cascading linear functions 495 Case-based reasoning 590 Cassette load/unload sequence 666 Cassette loading scenario 667 Cassette port standards of configuration 668 Cassette transfer robot 667 Catadioptric 31 design 472 Catch cup re-circulation zones 217 Cathode 700 Cauchy accuracy 451 Cauchy coefficients 451 CCD 587 camera 349, 514 CD 735 biasing 271 budgets 39 calibrations 290 exposure latitude 264 latitude 131 matching 235 requirements 68 thin film interference behavior 293
CD control 25, 29, 32, 255, 369, 377, 748 ARC 276 chart 181 enhancement 33 important issues 283 influencing factors 39 reflectivity effect 297 via layers 310 CD measurement contrast 395 electrical 383 CD metrology charging effects 400 edge effects 397 CD RPL developer composition effect 164 CD scanning electron microscope 382 CD SPC violation troubleshooting 189 CD uniformity 235, 237, 250 CD variability 313 CD variation 262, 276, 292 significant improvements 292 thin film reflectivity 292 via 310 CD vs dose curves 143 CD-critical layers 89 CD-SEM calibration standards 412 distinguishing aspects 384 excellent sensitivity 412 image 396, 398 measurement process 408 performance 406 resolution 392 simple t-test 410 Cell biases 508 Cell cluster approach 254 Cell controller 646, 647 benefits 650 definition 647
Cell projection 712 Cell team 647 CEM 528, 578 Census prediction 30 Center of elasticity 783 Center of gravity 783 Center stripe defect 358 Center stripe signature 362 Center to edge phenomena 308 Centering the data effect 435 Central composite design 173, 176 Centralized chemical-delivery systems 256 Centroid 521 Certification of process 19 Chain scission 794 Channel electron multiplier 811 Charge buildup 679 Charge coupled device 587 Charge dissipation 745 Charging 696, 697, 844 phenomenon 413 Charter 22, 67 Chemical changes 794 Chemical configuration 242 Chemical distribution systems 257 Chemical fluids operating conditions 248 Chemical Management Services 256, 744 Chemical mechanical polishing 440. See also CMP process 302 Chemical treatments 185 Chi-Square distribution 464 function 465 CHIINV 465 Chip area distribution vs. minimum geometry 22 Chip area increases 22 Chip fabrication process steps 356
Choice between schemes cost 931 efficiency 931 reliability 931 Chromatic aberration 526, 565, 701, 706, 801, 805, 841 coefficient 805 Chromatic lens 566 aberration 704 Chrome patterning 615 Chromium mask damage 628 Chromium undercut 627 Chucks 561 CIM 664, 668 architecture 668 architecture considerations 668 Circle defect 363 Circuit element definition 75 Circuit rewiring 822 Clariant resist performance results 97 Classes of applications 18 Clock speed bits 684 Cluster tools 645 CMP 440. See also Chemical mechanical polishing inspections 331 layers inspection 333 processed dielectric layers 338 research 310 CMS. See Chemical Management Services CO2 scum defect 237 Coat modules test 355 Coat process 355 Coat uniformity and defects 372 experimental screening results 371 Coater cup RH 212 temperature 210
Coating 580 antireflection 602 process 674 processes optimization 163 Coherence 498, 550, 551, 570, 604 illumination 604 partial 570 Coherent grating 689 Coherent illumination 480, 498, 501 Coherent light 508 Coherent plane wave 501 Cold field emission 703 Cold rings main advantage 933 Collection channels 340 Collimated beam 836 Collimated light 475 Collimation 499 Collision cascades 794 Color differences 338 Column performance 806 Column tilt 556 Coma 492, 518, 556, 613 correction 557 sagittal 519 tangential 519 Commercial equipment examples 715 Commercial optical instruments 886 Communication interface 651 standards 651 Compact ESRs 932 Compaction 568 Complete IC process turnaround time 58 Complex conjunction 525 Complex point 485 Compound objective lens 392 Computer aided design 23, 672. See also CAD
Computer aided manufacturing 23, 49 Computer Integrated Manufacturing 664, 668 Computing power 22 Condenser 570 properly focused 522 tilt 600, 607 Conductive overlayers 745 Confidence coefficient 537 Configuration control 19 Conformal 561 Conjugate blanking 705 Conjugate length 507 Conjugate lithography 550 Conjugate points 507 Conjugate spatial filter 531 Conjugate twin-shift 613 advantages 614 Bossung curve 615 Consistent edge wall imagery improvement 265 Constructive interference 571 Contact 548 Contact angle measurement 243 water droplet 157 Contact hole imaging 621, 626 Contact optical lithography 836 Contact printing 28 primary disadvantage 28 Contact probe pad 823 Contact/proximity printing 29 Contaminant adsorption 703 Contamination 696 growth assessment 408 Contour plot 203 Contrast 120, 493, 524, 733 charging influence 402 enhancement material 528, 578 values 846 Contrast control study results 176 Control charts 545
Conventional secondary electron detectors 386 Convergence angle 701 Conveyors 664 Convolution theorem 927 COO analysis 66 Cool plate effects 189 Cooling plate application 189 Cooling plates 189 Cooling water loops 244 Corner rounding 681 Correction data 709 Correction measurements 715 Correction parameter 708 Correction polynomial coefficients 708 Corrections deflection-based 708 deflection-dependent 718 Cost of equipment amortized 59 Cost of ownership 35, 332 Cost-effectiveness 946 Coulomb interaction 712, 846 Coulomb repulsion 802 Counter-rotating diffuser 567 Cr-patterned glass e-beam fabrication 111 Critical dimension 2, 75, 329, 371, 508, 547, 575, 625, 692, 735 control 51, 547, 552, 646 results 167 variable influences 164 variations 52 Critical energy 917 Critical illumination 497 Critical tasks 650 Cross linking 794 Crossed analyses 462 Crossed Variable ANOVA 461 Crosslink 736 Cryogenic cooling 803 Crystal structure 827
Current density 804, 805 Customer demands 22 Cutoff frequency 501, 502 Cylindricity 521
DAC 688, 700, 709 Daily matrix 181 Damage 794, 814 thresholds 629 Darkened resist image positive charging 402 Darkfield detection 344 defect types 337 depth-of-focus 339 inherent immunity 339 Darkfield imaging 342 Darkfield inspection tool 330 Darkfield tools 333 pixel size 339 DARPA 721 Data analysis 352 collection benefit 352 communications channel 934 crossed and nested groupings 460 errors 697 events collection 653 handling 700 regression 534 DBS. See Dual-beam reflectance spectrometer measurements 440 De Broglie wavelength 42 Deadpath error 584 DEATS 151 Decision-making process charter 16 marketing 16 production requirements 16 Deep UV light responsive resists 96
radiation 744 treatments 185 Defect characterization 352 Defect classification 353 Defect density 57, 377 Defect detection 330, 886 using OPF 352 Defect free x-ray masks fabricating costs 40 Defect frequency 363 Defect grade 342 Defect inspection technique 371 Defect overlay and subtraction 363 Defect Pareto charts 353 Defect pattern 362 Defect sensitivity inspection 334 Defect signature 353, 363 Defectivity 203, 235, 606 monitoring 355 test 252 Defects 213, 363, 696 minimized 356 process coating induced 371 random 696 Defense Advanced Research Projects Agency 721 Deflection 706, 710, 717, 807, 808, 832 corrected errors 706 dynamic response 706 electronics 700 field properties 706 gain 714 hardware 688 hierarchical strategies 710 hierarchical system 711 hierarchy 713 noise 690, 696 resolution 688 signal 807 voltage 807 Deflection-based corrections 709 Deflector 809
Defocus 505, 507, 510, 518, 533, 612, 613 aberration 525 array 559 latitude 610 plot 554 separation 520 wafer plane 528 Defocused aerial image response 616 Defocused beam 686 Degradation mechanism 386 Degradation problems 404 Degree of coherence 520, 556 Degrees of freedom 465, 537 Delaminating 697 Delphi project 18 roadmap 35 Delta-to-predicted statistical process 404 Deltas-to-target 406 Demagnification 702, 704 process 389 Densification 568 Density gas ratios and substrate temperature 879 Deposited layers 159 Depth of focus 10, 552. See also DOF Depth variable 506 Design and response analyses 777 Design IP rules 7 Designed factorial experiments 357 Desire processing 259 problems 314 Desire-type processing 313 Destructive interference 484, 571, 608, 614 Detection technique tools brightfield image comparison 330
darkfield light scattering 330 Detection threshold 347 Detector 699 BSE electrons 399 Develop 549 Develop defectivity 237 Develop latitude 553 Develop modules test 355 Develop process 355 Develop process qualification 250 Develop puddle time 234 Develop solution ideal dispense 237 Develop temperature response 232 Develop time 234 Developed positive resists aqueous-based 100 Developer concentration 232 Developer consumption 235 Developer dispense nozzle 362 parameters 248 pressure 362 repeatability 248 volume testing 248 wafer rpm 362 Developer process characteristics 167 Developer strength 735 Developer temperature 362 Developers 736 cost 241 Development 730 Development method 139, 735 Development process 672 Development rate 581 Development time 581 Deviation effects 692 Device annealing before and after effects 908 Device chip yield 369 Device cores 7 Device density 4
Device fabrication pilot line Cp value 185 Cpk value 185 Device fabrication requirements 329 Device implantation 159 Device Isolation Topography Effect 271 Device manufacturability 75 Dewetting and popping problems over-priming results 155 Dialogue 666 Diazide-novolak resists positive type 26 primary limitations 26 Diazo coupling reaction 581 Diazonaphthoquinone 578 Diazonaphthoquinone novolak resists 525 Die yield 360 Dielectric thickness variation 338 Differential refractometer 583 Differential scanning calorimetry 145 Diffracted orders 487 Diffraction 26, 475, 477, 480, 550, 587 angle 533 effects 507, 551, 671 Fraunhofer 475, 476 Fresnel 475 gratings 484, 487, 491, 501, 587 integral 511 pattern 349, 480 Diffusers 567 Digital signal processor chips 6 Digital tachometer external calibrated 248 Digital-to-analog converter 688 Dill values 122 Dimension control 507 Dimensional control 692
Dipole emission pattern 914 Dipole magnets 932 Direct implantation 827 Direct mask reference 588 Direct step on a wafer 29 Direct writing 671 Direct-write application 690 Direct-write e-beam 45, 259 Dispense developer 235 Dispense nozzle offset 362 Dispense pressure 194, 197, 221 range 194 Dispense resist 193 Dispense speed 194 Dispense time 197 Dispense volume 197 Dispensing developer nozzles 237 Dispersion 516 Displacement 767, 775, 787 Displacement-based criteria 788 Dissociation yield 797 Distortion 521, 599, 715, 843, 844 reduction 843 Distributed-feedback laser 708 DOF 10. See also Depth of focus DOF budget 299, 302 Doped polysilicon 52 Doping 792 Doping atom variations 8 Doppler effect factor 914 Dose modulation 684, 686 Dose requirements 693 Dose variations 692 Double darkfield detection 340 Double detection algorithm 334 Double resist application 155 Down-stream activated oxygen systems 187 Downward trend 352 DRAM 671, 713 dependency 4 developments 21
development cycle 17 factories 20 patterning 747 Drift 690, 844 Dry etch compatibility 77, 112 Dry etching 51, 698, 732, 733 Dry-process compatible resists 115 Drying time 216 DTP 404. See also Delta-to-predicted statistical process DTT 406. See also Deltas-to-target Dual quadrant detectors 558 Dual-beam reflectance spectrometer 440 Dumping 715 Duncan’s test 411 DUV 744 193 nm 108 ARC SiRN 285 curing 306 DRAM production 255 effects 133 ESCAP resists 103 exposure tool shipments 8 lithography 6, 7 optical stepper 68 photoresists 744 scanner costs 34 scanners vs. steppers 39 DUV resist 846 future 119 material costs 255 negative 108 positive 104 target process performance goals 104 DUV resists positive 96 DUV step & scan tools 36 vs. reduction steppers 32
DUV track process effects 255 processing costs 255 wafer processing 252 Dye 576 Dyed resist usage 276 Dynamic corrections 717 Dynamic random access memory 671 Dynamic repeatability important quantities 457 Dynamic stigmation 806 Dynamic system 762 Dynamic threshold 337
E-beam direct-write 68 E-beam gate processes 143 E-beam lithography 2, 42, 43, 63, 111 direct write on a wafer 25 direct write tools 50 printing 25 E-beam positive resists direct-write 113 novolac-based 113 E-beam proximity effects 313 E-beam resist application low-volume 108 E-beam stencil printing 7 E-beam/stepper critical resolution 65 E2 nozzle method most efficient 241 EBR customer specifications 248 measured 250 operating conditions 372 processes 372 solvent application 221 solvents 221
two types of defects 372 width 248 EBR-9 737 ECD advantages 415 disadvantage 415 gauge capability 416 metrology 416 Economics 67 ED 548 Edge acuity 25 Edge bead 220 process 372 Edge profile 620 control 627 Edge roughness 696 measurement 901 Edge slope 702 Edge wall angles 264 EDS analysis 366 Elastic constants 843 Elastic scattering 676 Electric displacement 444 Electric field distribution 491 Electrical CD measurements 414 Electrical cross bridge 416 Electrical features high aspect ratio 383 Electrical measurements conductive films 52 Electrical probe structure 595 Electrical test devices 901 Electrically conductive film 54 Electrode potentials 805 Electromagnetic energy source wavelength 75 lenses 716, 719 theory 445 wave amplitude 476 Electron 395 advantages 42 discharge heated plasma sources 869
emission 799 emitter 700, 702 gun 700 impact x-ray sources 866 optics 698 range 394 source 672, 700 storage rings 912, 913, 918 volume of phase space 929 Electron beam 671 exposure 737 implementation 748 lithography 56, 670, 672, 673, 879 resist 670, 674 source 385 technology 725, 747, 748 Electron detector 811 Electron synchrotron 930 Electron-energy loss 679 Electron-optical column 701 Electron-optical system components 700 Electron-optics control 700 Electron-sensitive coating 674 Electron-solid interactions 676 Electronic stopping power 792, 794, 828 Electrostatic deflection 706 Electrostatic lenses 799, 805 Electrostatic octupoles 719 Electrostatic plates 704, 706 Element flexure 597 Ellipsometer 436, 439 multiple angle 450 single angle, single wavelength 450 Ellipsometry 440 Brewster’s angle 445 single layer film 440 thin films 449 EM 509
Emission angle 923 current 703 of radiation 923 tip 801 Emission-chamber 700 Emitted radiation Lorentz contraction 914 Emitters field 702 thermal 702 Empirical resist process development 74 Empirical testing 527 Empirically modeled response surfaces 370 Encapsulated particles 696 Energy absorption considerations 111 Energy distribution 480 Energy input 831 Energy loss 676 rate 836 Energy output 564 Energy spread 802, 804, 808, 841 Energy-to-clear (Eo) test 228 Entrance pupil plane 570 Environment 735 Eo 377 Eo test 229, 235 Eo vs. Run Order experiment 229 Equipment alarms 653 commonality studies 363 communication standards 651 development 59 issues 650 market growth pause 17 performance 201 servers 647 Error 475 anamorphic 599 budget 562, 690
component aligner 862 cosine 584 criterion 534 deflection axes orthogonality 708 mask-to-mask components 862 micron defocus 601 trapezoid 597 ESR accelerating 931 non-accelerating 931 Etalons 564 Etch 159 Etch masking 126 Etch rate 822 Etch resistance 131, 676, 692 Etch technologies 262 Etchback time reduction 305 Etching technology 259 ETEC Excaliber 723 ETEC MEBES 4500 720 Ethernet backbone 944 Euler’s formula 486 EUV 15, 45, 746 development 119 lenses 45 lithography 749 technology 7, 119 tools 46 Evaporation rates 703 Event message 657 Evolution of phase space 926 EXB mass separation 806 Excimer laser 564 exposure 602 irradiation damage 628 spectrum 565 Exciplexes 564 Excitation 759, 775, 784, 788 sources 759 Exhaust defect 203 Exhaust flow level 221 Experiment risk 537 Experimental designs 539
matrix 539 Exposed photoresist develop time 234 Exposure 674, 676, 684, 830 comparison strategies 693 conditions 680 conjugate 550 control 700 dose 572, 606, 684, 692, 733 duty 560, 598 energy 508, 571, 680 environment station 934 errors 696 isofocal 550 latitude 530, 553, 610, 830 margin 509 strategy 726 timer reduction 69 Exposure tool 693 capability 250 throughput 50, 56 time required 693 Exposure-Defocus Diagrams 548 Exposure/focus plot 52 Extensive computational capability 440 Extraction current 802 Extraction electrodes 841 Extreme UV 746 lithography 2, 26 projection 841
F-test 460, 536, 537 Fab 69 airborne bases 254 floor designs 774 future trend 9 layout considerations 664 photo capability impact 22 Fabless semiconductor companies 9 Fabrication 23, 857 cost 24
Fabrication and yields throughput vs. cost 19 Fabry-Perot Etalon optical model 293 Fabs large 30 new 34 older 33 Factorial design 539, 542 Factorial experiment 539 Factorial surface designs 542 Factors interactions 540 Failure analysis 360, 824 Faraday cup 699 Fast Fourier Transform 758, 775 FE 441. See also Focusing ellipsometer source-size 389 Feature size reduction 329 FFT 758 FIB 790, 799 implanted devices 828 system schematic 799 Fiber bundle 569 Fiducial 596 Field 675, 688, 730 anamorphism 595 curvature 520, 521, 555 diameter 547 emitters 703, 704 errors 592 ion source 804 ionization 803 phase errors 615 rotation 708 size 807 Field-effect transistor gate level 689 Field-emission gun 385 Fifth detector 344 primary benefit 344 Figure-of-merit 690 cost-effectiveness 65 Filaments 702
Fillet 680 Filling-up process 932 Film attenuation 571 Film coating uniformity cup temperature 210 Film nonuniformity 163 Film thickness 213, 605 Film uniformity 201, 203 Fine features 607 Fine spot 790 First automated inspection tools 330 First task in optimization 354 Fish eye particles 203 Five-axis laser interferometer 558 Fixed wafer orientation 362 Fixed-stage exposure 699 Fixed-stage writing 713 Fixtures 691 Flare 528, 570 Flash analog-to-digital converter 828 FLEX 558 Flood exposure 526 intensity 615 Floor displacement criteria 768 motions 759 responses 788 stiffness 774 system 788 total velocity 757 vertical frequencies 769 vibrations 768, 784, 788 Flow rates 242 Fluence 568 Fluorescent intensity 510 Flux level 534 Flux reduction 899 Fly’s eye integrator 569 Focal length 480, 521 Focal plane 491
Focus correction 717 correction signal 709 depth 504, 530 nominal 558, 559 Focus latitude 558, 625 Focus offsets 299 Focus walk 598 Focused ion beam 790, 799, 806 applications 803 implantation 848 lithography 848 scan rate 815 sectioning 824 writing 808 Focusing ellipsometer 441 Forward scattering 677 Four phase transition 613 Four-quadrant optical detector 558 Four-terminal network accurate test configuration 415 Fourier analysis 590, 786 Fourier coefficient 487 Fourier image 349 Fourier integrals 489 Fourier optics theory 487 Fourier series 487, 489 Fourier synthesis 487, 489 Fourier transform 487, 491, 494, 521, 525, 528, 531, 917 Fourier’s theorem 489 FPA 5000ES2 31 Fracture 684 parameters 689 process requirements 688 requirements 689 Fracturing 674, 686, 688 Frame storage 810 Framing structural stiffness 772 Fraunhofer diffraction 487, 491, 492 Fraunhofer’s diffraction theory 489
Frequency curve 537 Fresnel zone plate and grating method 889 targets 590 Fringe analysis 561 Fringes 581 Front end beam line basic functions 936 Full energy injectors 931 Full wafer projection printing 29 Full width at half maximum 566 Function code 653 line spread 521 optical transfer 512 phase transfer 519 wave aberration 516 Functional IC cost of manufacturing 57 Fundamental 489 Fused silica 567 light transmission 567 Fusion deep ultraviolet radiation 131 Future device fabrication application and requirements 262 Future resists new advances 119 Future volume fabs 20
G-line lens bandwidth 566 Ga+ FIB system applications 826 Ga+ ions 46 GaAs 42 Gage capability study 248 Gain 706 Gas-puff Z-pinch configuration 869 Gate CDs 283 Gate definition 831
Gate lines 689 Gate transistor level CD variation 39 Gauge capability 559 Gaussian beams 43 Gaussian distribution 530, 567, 677, 683, 702, 924 Gaussian peak height 678 Gaussian shaped beam systems Raster scan 43 Vector scan 43 Gaussian spot 673, 711 General process optimization 378 Generic equipment model 653 Generic tool classifications 764 Geometric model 592 Geometrical optics 475, 512 GeV ring 932 GHOST 726 exposure dose 686 technique 686 Glancing optics characteristic smile 941 Glass coverplate 603 Glass standoff 603 Glass transition temperatures 145 Global repulsion 844 Goniometer 243 Gradient 523 Grating 564, 619, 708 frequency 502 phase error 674 Grazing angles 816 incidence illumination 337, 342 Grid 700 error 435 Ground motion criterion 767 Gun 700 cap 700 thermal emitter 703 vacuum requirements 703
H+ ions 46 Hard contact printing 28 Hard radiation 934 Hardware configuration 241 Hardware requirements 710 Hardware source inspection 242 Harel state model notation 655 Harel statechart notation 655 Harmonic mean 538 Harmonic motion 484 Harmonics 489 Heating advantages 385 Heating and cooling plates accuracy tests 242 repeatability test 242 Heel-impulse 761 Height measurement 708 Helium system cost 933 Hershel-Wynne-Dyson lens 594 Hexamethyldisilazane 146, 697. See also HMDS Hexopoles 932 High beam energy 746 High end processes 164 High intensity radiation beam 933 High ortho-bonding 84 High quality recipe management systems 412 High resist pre-exposure bakes 98 High speed SECS message service 652 High volume products class capital cost 19 dominant factors 19 economic factors 19 process adjustments 19 High-throughput electron beam 746 Higher resolution designs 370 Hipox isolation topography 275
Hitachi 7000 SPC capability 290 Hitachi HL-800M and HL-800D 724 Hitachi HL800D 724 HMDS 697. See also Hexamethyldisilazane application 199 dispense parameters 248 processing 146 substrates 156 treatment 49 Hopkins’ theory 525 Horiba XXX 368 Host 647 Hotplate defects 227 Hysteresis 542
I-line ARC 283, 285 lens bandwidth 566 lithography 3, 88 optical stepper 59, 68 resist 89 reticle 15 stepper 29 technology 86 IBM EL-4 722 IC chip lithography 869 metrology 418 processing types 18 product quality control 886 production 23 products 17 technology 7 wafer patterning stepper 371 IC device designers 911 high density work 861 market value 60 type 910
Ideal x-ray source 857 Illuminance 476 Illumination 562 4 spot 533 beam diameters 339 coherent 503 cone 526 emissions spectra 88 meter 569 nonuniform 534 profile 510 slit 498 sources 562 uniform 569 uniformity 600 Illuminator assembly 496 collimator 570 issues 600 ILM-2230 345 Image 845 acquisition and image processing hardware 371 analysis tools 346 contrast 510, 528, 605 formation 489 placement 474, 475, 582, 629 plane 483, 491, 502 point 507 processing 334, 342 quality 474 reversal alternative 264 reversal process 266 slit 582 subtraction 342 Imaging 810, 814 contrast 608 fabrication tools 3 improvement 530 isolated lines 627 line/space pairs 627 metrology tools 382
resist layer 684 technologies 333 Impact to device yield minimized 357 Implant layer pattern defect 356 Implantation 792, 841 applications 828 Impulse excitation 759 responses 777 vibration 773 Impulse-induced excitations 760 In-line inspection results 360 priming modules 146 product inspections 357 visual data OVERLAY 360 In-situ test mask 595 Incident beam axis 672 Incident electrons 681 Incident energetic ion 794 Incoherence 566 Incoherent illumination 498, 502 Incoherent light 480, 483 Independent variables normalized 539 Index of refraction 482, 577 Induced surface chemical reactions 814 Inductive sensor 580 Industrial designs 581 Ineffective isolation supports 783 Inelastic scattering 676 Influence 520 Infrared filters 836 Injection locking 564 Inorganic BARC primary advantage 287 processing 287 Inorganic coating technology manufacturing 287 Inorganic layers 278 Inorganic plasma deposition 288
Inspection in-line 368 magnification 342 speed improvements 330 system 371 technology convergence 330 time 334 tool selection 332 wavelength 581 Inspex Eagle 342 Inspex tools 337 Insulating film 797 Insulators deposition 797 Integrated circuit diagnosis 848 lithographic design 259 Integrated circuit fabrication 1, 75, 354 Integrated circuit manufacturing technology 78 wafer spinning 193 Integrated circuit rewiring 819 Integrated Normal Perspective 344 Integrated-optics 689 Intensity 476, 581 Intensity curves 480 Intensity dip 613 Intensity distribution 525 Inter-feature gap 683 Inter-level registration measurements 54 Inter-machine lithography 54 Interaction 540 Interaction source of error 457 Interconnections 689 Interface 651, 665 Interference 587, 604 light 550 pattern 514 Interferometer axis 584 displacement measuring 583 Interferometric metrology 45
Interferometric microscope 423 Interferometry 513 Interfield errors 585 Interfield model 585 Internal dynamic system 784 Internal system components 257 Interproximity 680, 683, 693 Interrelated parameters focus and tilt 369 Interrupted development 581 Interstitial diffusion 567 Intracavity prisms 564 Intraproximity effect 680 Intraproximity rounding 680 Invar 585 Investigating a problem 357 Ion assisted etching 798, 814 Ion beam lithography 2 advantages 749 direct write on wafer 25 focused 25 masked 25, 836 reduction projection 749 reduction stepper 25 Ion beams 46 milling 817 profile 813 Ion channeling 814 Ion column 805 Ion current density 841 Ion doping 814 Ion emission cone 802 Ion energy spread 841 Ion exposure 831 Ion implantation 698 Ion induced deposition 797, 819 Ion induced etching 817 Ion microbeams 792 Ion milling 794, 814 Ion optical column 844 Ion projection 15 lithography 46, 841 system requirements 841
Ion sources 800, 839 gaseous field 803 liquid metal 801 Ion surface interaction 792 Ion velocity 832 Ion-implantation 690, 730 Iso/dense line bias 104 Isolated CDs 266 Isolated lines imaging 618, 621 Isolated spaces imaging 619 Isolation 765 mounts 783 softness 784
JEOL JBX-5000LS 719 JBX-5DII 719 JBX-5FE 719 JBX-6000FS 720 JBX-7000 MVII 722 JBX-8600DV 722
Kaleidoscope 569 Karl Suss stepper 942 Kelvin bridge structure 52 Ketal system 100 Keystone 706 Kinematic coupling 515 Kinematic mounts temperature control 515 Kite 599 KLA-Tencor 2138 338 KLA-Tencor 2230 344 Kohler illumination 496, 530, 567
Labor direct cost 58 indirect cost 58 savings 644
Ladder diagram 662 LAN 648. See also Local Area Network Large format photomask materials next generation 15 Laser darkfield illumination module 344 fluctuation noise 569 light scattering 339 photoablation 588 plasma source spot 867 pulse energy 867 radiation 889 reference 714 reflectometer 602 scattering inspection tool 330 sources 565 Laser interferometer 512, 672, 690, 698 standard 714 Laser-based steppers vs. scanning steppers 33 Laser-beam heated plasma source 867 Latency time 697, 735 Latent image 504, 523, 674, 729 development 673 Lateral dimension 620 Lateral positional shift 607 Layout 674 LBNL 721 LDMOS devices 271 Least squares equation 773 Least squares method 534 Leica EBMF and EBML 718 LION 718 Vectorbeam 717 WePrint 200 725 ZBA23H, ZBA31, and ZBA32 723 Lens numerical aperture values 6
Lenses 475 aberrations 705 aplanatic system 519 astigmatism 268 contamination 183 costs 31 demagnifying 705 designs 512, 525 distortion 56, 432 electromagnetic 705 elements 597 magnetic 705 magnifying 705 matching 594 reduction 601 shaping 705 Lepton EBES4 720 Leveling 558 agents 78 Levenson imaging 610 Levenson mask 600 aerial image 610 imaging problems 611 linearity 611 LFPD 560 Lift off metallization 52 Lift preventing processes 156 Lifting 576 Liftoff 692, 698, 730, 731, 732, 739 process 731 tape-assisted 732 Light interference effects 605 Line and products qualification 19 Line edge roughness 901 Line monitoring tool 331 Line shortening effects 86 Line spread function 492 Line width standards 414 Line/space pairs imaging 617, 621 imaging limitations 618 line width control 621
Linear pixel density 688 Linear polarization 438 Linear voltage differential transformers 584 Linearity 127 Linearly polarized component electric vector 438 Liners 700 Linewidth 534, 621 control 508, 859 Linked lithocells 646 Liouville theorem 926 Liquid metal ion sources 814 Lithocells 362, 367 configuration 370 periodic exhaust check 367 process history 366 Lithographic advancement 6 Lithographic automation 49 Lithographic defects process to prevent 372 Lithographic image sizes 262 Lithographic processes 75, 936 optimization methodology 282 technology 74 Lithographic relief images 76 Lithographic resolution 692 Lithographic technique factors 2 Lithographic throughput 9 Lithographic tools 15 Eo matrix 185 higher resolution 259 installation 756 reusable 7 Lithography 8, 12, 25, 35, 38, 57, 59, 80, 159, 229, 327, 645, 790 charged-particle 745 circuit integration 37 cost 61 dielectric layers 150 domination 1 DRAM technology 6 driver 3
e-beam 25, 813, 832 e-beam systems 827 electron beam 692, 726, 832 five players 61 focused ion beam 832 higher resolution 3, 6 ion 828 ion beam 25 ion projection 792, 840 multilayer resist 528 operations 369 optical 25, 692, 745 output control 382 pattern density 37 patterns 260 process 250, 327, 328, 354, 673, 859 rapid advances 61 resolution region 65 system 16, 25, 47 technical requirements 16, 20 tool selection 68 total labor cost 58 types 25 variety of systems 25 x-ray 25, 836, 841, 857 Lithography tools 67, 96, 369 chemical mechanical planarization 288 dual damascene 288 selection 70 selection influence 63 set up time 30 shallow trench isolation 288 Local Area Network 648. See also LAN Local focal plane deviation 560 Local membrane distortions 879 Local total indicated range 302 LOCOS isolation process 275 Logic device 6 developments 21 production 34
Long-term reproducibility 405 Looping bridges 611, 615 Lorentz force 705, 914 Lorentz transformation 920 Lorentzian waveform 565 Lot processing scenario 649 Low absorption oil 569 Low contrast defect detecting 352 Low contrast level 690 Low energy injector 931 Low energy system 679 Low level circle defects 363 Low molecular weight resins 84 Low pressure chemical vapor deposition method 879 Low temperature oxide 561 Low voltage electron beams diffraction 390 Low voltage SEM chromatic aberration term 391 essential sources 391 Low volume/multiple processes 19 LPCVD. See Low pressure chemical vapor deposition method LPIII process 156 LTIR 302. See also Local total indicated range LVDT 584
Machine architecture 713 limitations 688 physics terminology 917 Magnetic coils 706 Magnetic data storage disk 826 Magnetic energy filters 841 Magnetic field lines 386 Magnetic field noise 844 Magnetic lens spherical aberration 391 Magnetic/electrostatic compound lens 388
Magnification 596 Mainstream IC production 15 Manual error 647 Manual wafer inspection 330 Manufacturing communication scenarios 653 Manufacturing execution system 648 Manufacturing fab 368 Manufacturing facility economic health 60 Manufacturing operation class 19 Mapping 582, 583, 886 Marechal condition 513 Mark contrast 691 Mark degrading effects 691 Mark detection 672 Mark registration 691 Mark symmetry 428 Marketing 20, 67 Marlowe simulation 792 Mask 836 bias 549, 579, 626 contrast factor 872 cooling 844 current 841 defects 612 demagnified 841 design and testing 595 distortion 839, 843, 844 erosion 839 fab facility 15 fabrication 23, 42 heating 726, 839, 843 inspection 42 inspection and repair equipment 47 linearity 534, 611 open stencil 836 pattern generation 39 patterning 722 placement 582 registration 582
repair 42, 819, 822 Si channeling 836, 839 transmissivity 491 writing capability 45 Mask making resists 114 techniques 39 Mask/reticle 527 aligners 81 generation 747 Masked ion beam lithography application 836 Masked ion beam systems 46 Masked ion lithography 849 Masks 601, 670 classes of 836 phase shifting 604, 606 Masks and mask making equipment 47 Maskshop glass plate fabrication 10 Mass filter 806 Mass separators 805 Master clock 700 Matching 408, 454 developer track 235 Matching error corrective action 412 Matching test 408 Material contrast secondary electron yield 401 Material index of refraction 568 Material thermal expansion 583 Material transfer interface 668 Material transport 663 Maxima 480, 484 locations 477 Maximum chip area 56 Maxwell’s equations 443 Mean square error (MSE) 410 Measured variance 466 Measurement reproducibility tool-to-tool component 408
Mechanical rigidity 700 Membrane and absorber materials 857 Membrane processing 749 Membrane-to-film thickness ratios 879 MEMS 728, 827 Meniscus 221 lens 522 Mercury arc lamp 564 vapor lamps 562 MES 648 MESFET 832 Message transfer 651 Metal etch biases 299 Metal image adhesion 150 Metal line corrosion 363 Metal substrates linewidth experiment results 298 Methacrylic acid 697 Methyl methacrylate 697 Metrology 289, 463 plate 845 single number output 383 statistics 453 targets 430 Metrology tools 330 control chart 404 gauge studies 454 suitability 455 Mexican hat effect 210 MFS 859 MFS IC chips 886 MFS IC device production lithography methods 912 MFS x-ray printing 886 Micro-electro-mechanical systems 827 Microchannel plates 811 Microcontrollers 6 semiconductor business 8 Microdensitometer 510 Microfabrication 790
Microlithographic resolution limit 483 Microlithography 645 Micromanipulation 827 Microprocessor fabrication 61 Microprocessors 6, 352 Microwave transistor gate fabrication 698 Miniature column approach difficulties 728 Mirrors 584 Misprocesses 647 Misregistration 54, 689 measurements 428 MLM 262. See also Multilevel metallization device layers 262 layers 262, 293 lithography 183 photo processes 313 requirements 302 MLR 528 structure 901 Model cell controller 647, 648 Model validity 457 Moderate volume class capital cost 19 process sequence 19 Modern factories 256 Modulation 493, 494, 503, 505 rate 498 transfer factor 493 Module partitioning study 362 Molar extinction coefficient 80 Monochromatic high numerical aperture tools 371 Monotonic scaling 527 Monte Carlo simulation 396, 600, 792 Moore’s Law 21, 746 MOS capacitor 908 MOSFET fabrication process steps 910
Moving stage exposure 699 Moving stage systems advantages 710 Moving stage writing 713 MPU technology 4 MTF curve 510 minimum requirement 508 Multifactor experiments 538 Multilayer processes 276 techniques 260 Multilevel metallization 262 Multilevel processes 262 Multiple process tools 352 Multiple wavelengths 450, 589 Multiple-beam technology 728 Multistage shifter 624 Multivari CD control study results 176 Multivari experiment 370 Multivari study 174 Multivariate studies 543
n-layer 303, 306 NA 483 NA/WL ratio 3 Nanometer-scale device geometries 746 lithography 804 Nanowriter 721 National Institute of Standards and Technology 413, 918 Negative e-beam resists design advances 118 direct-write compatible 114 limitations 118 Negative i-line resist advances 92 Negative photoresists required adhesion promotion 148 Negative resists 92, 846, 899
Negative tone 736 mid-UV photoresist 90 Nested analysis 462 ANOVA 457 Nested control loops 382 Nested variation 545 Neutral density filter 334 Neutral traps 908 NIST 413. See also National Institute of Standards and Technology Noise 807 averaging 696 Non-birefringent optical lens materials 36 Non-concentric imaging 599 Non-interferometric alignment system 897 Non-interferometric light-optical methods 897 Non-optical lithography tool 34 Non-optical resists sensitivity and contrast 134 Non-optical tools 69 Non-photoactive dissolution inhibitors 78 Non-random defects 62 Non-zero topography 339 Nonactinic wavelength 589 Nonimaging polyimide 740 Nonlinear effects 542 Nonlinearities 708 Nonsaturated designs 540 Nonuniform sources 530, 533 Normal in-line monitoring 363 Normal incidence laser 342 Normalization constant 526 Normalization methods 548 Notching 561 Novolac systems base developed 118 Novolak dissolution 88 Novolak polymers 738
Novolak resist 745 Novolak-based photoresists 744 Nuclear stopping power 792, 794, 828 Numerical aperture 552 Numerical methods 534
OAI technology application 35 Objective 483 Oblique illumination 349 implementation 10 Octopoles 932 Off-axis beam components 712 Off-axis illumination 10 Off-axis object ray 496 Off-axis optical alignment 845 Off-axis reticle illumination 11 OHC 664 OHV 664 One-third octave bands 786 Opaque area imaging 619 OPC and PSM technologies cost of applying 36 OPC techniques 96 OPC technology application 35 Open frame exposures 509 Operating software 334 Operational amplifiers 897 Operator error 646 solving 646 Operator functions 647 Operator tasks 646 Optical and SEM photos suspicious ADI defects 363 Optical axis 533 Optical column design 845 Optical cross section technique measurements 414 Optical defect 628 review station 339 Optical EBR 225
Optical elements resist out-gassing contamination 110 Optical exposure tools 3 Optical inspection systems 47 Optical lithography 2, 35, 63, 259, 313 advances 26 advantage 671 barrier 2 cost effective 26 direct-step wafer 25 economics questions 36 full wafer reduction projection 25 future 119 major thrust 26 physical limitations 8 proximity/contact printing 25 survival 3 technical questions 36 wavefront engineering 10 Optical metrology tools film thickness control 383 Optical microscopes 329, 363 Optical overlay tools 420 Optical pattern filtering 345, 349 Optical projection systems minimum linewidth 29 Optical proximity 3, 108 correction 12, 15 printing 39 Optical step-and-repeat 746 Optical steppers 4, 50, 51, 474, 747 Optical system 940 extra cost 941 lenses 31 Optical thickness 602 Optical tools 10, 20, 69 Optical transfer function 494, 495 application 510 Optical verniers 595, 901
Optically active system 927 Optics full wafer scanning type 30 thermal drift 584 Optimization routines 512 Optimization verification multivari results 372 Organic ARC bake conditions 278 process optimization 282 Organic BARC coatings defectivity 288 Organometallics 797 Orthogonality 539 Oscillation 578 OTF 494, 521 Outrigger shifters sizing 625 Oven specification 243 Overhead cost 59 Overhead time 693 Overlay accuracy 859, 862, 901 error 883 feature 420 layer to layer 289 mark asymmetry 427 mismatch problem 425 target deficiencies 431 tool 420, 423, 424, 426 Overlay measurement 420 analysis 434 process problems 427 TIS correction 423 tool-induced shift 421 Overlay metrology 420 errors 430 Oxide substrates 150 Oxygen plasma stripping 187
P(MMA-MAA) 697, 730, 735, 736, 739
p-channel MOSFET devices 908 p-wave equations 444 PAC 216. See also Photoactive compounds composition 81 molecules 227 PolyDAQ substitution 81 PAG effects 100 Parallax error 584 Parallel writing advantage 26 Paraxial focal point 518 Paraxial imaging 527 Paraxial rays 512 Pareto charts 542 Partial coherence 503, 552 Partially coherent imaging 504 Particle and contamination test 252 Particle checks 252 Particulate contamination 696 Particulate protection 601 Passivation 823 Pattern alignment monochromatic illumination 227 Pattern complexity increase 330 Pattern degradation 679 Pattern design 673 Pattern distortions 677, 691 Pattern edge smearing 887 Pattern exposure 673 Pattern generator 629, 686, 711, 715 Pattern lock alignment plate 845 Pattern lock system 844 Pattern placement accuracy 690 Pattern placement distortion 845 Pattern recognition 590 Pattern registration 672 Pattern suppression software 342 Pattern transfer 674, 676, 698, 729, 730 processes 730 subtractive 732 technique 679
Pattern writing time 693 Pattern-dependent settling times 693 Patterned ion dose 836 Patterned wafers 330, 339 inspection tools 330, 333, 356 Patterning 729, 806 resist layers 303 PBS 736 Peak floor displacements 771 PEB 227. See also Post exposure bake process 164 PED bake delay 98 Pedestal 776, 777 geometry 777 motions 777 structures 777 Pellicle films 601 selection 602 thickness 602, 603 PEN. See Silicon-rich plasma enhanced nitride Penumbra 928 Penumbral blur 859, 899 Percent exposure latitude 52 Percent transmission 493 Periodic structure 587 Periumbral blur 839 Perkin Elmer stepper 942 PGMEA 225. See also Propylene glycol methyl ether acetate Phase 475, 494, 502 cancellation requirements 289 change 494 defects 614 errors 611, 615 filter 532 grating auto-alignment system 31 grating method 587 interferometry 6
lag 705 modulator 567 sensitive mask technology 10 Phase shift 604, 605 alternating aperture 606 attenuated 627 auxiliary feature 624 chromeless 604 configurations 604 Conjugate Twin-Shifter 613 deviation 612 double edge chromeless 615 layers 606 Levenson 604, 607, 610 mask technology 3 masks 12, 606, 608, 822 rim 626 self-aligned 626 single edge chromeless 619 subresolution outrigger 624 substrate 624 technology 313 Phase transfer function 495 Phase transitions 613, 623 Photoacids 100 Photo defects minimizing 356 problems 357 Photo process 328, 355, 369 defectivity 329 equipment 370 lithocell 362 monitoring 339, 352 optimization 354 qualification goal 355 Photo speed reproducibility 377 Photo tracks 355 Photo-etch module 382 Photoabsorption 580 Photoacid generator 96 Photoactive compound 78, 216, 578. See also PAC
Photoactive molecular organic additives 78 Photobleachable 578 Photocathode mask 726 Photochemistry first law 79 second law 79 Photoelectron generation 349 Photoengineer resist process parameter 136 Photolithographic bulk and focus tolerance effects 313 Photolithographic metrology 289 Photolithographic process 241, 369, 370 Photolithography 75, 81, 262, 288, 327, 436, 561 control and monitoring 327 defects 370 dimension quality 260 image edge 260 linewidth variations 283 steps 369 Photomask repair 848 Photometer off-axis 510 Photomultiplier tube 340 output 346 Photoresist 77, 88, 199, 221, 523, 569, 624, 737 advantage 131 CD latitude 131 conventional 81 design 81, 89 developing 227 dispense location 193 edge bead 225 moisture 229 non-HMDS promoters 150 process 165 thermal stability process 131 thick region 220 thickness 126, 382
Phototool hardware changes 89 PHS 744 equivalent base resin 109 Picture elements 686 Piezoelectric transducers 584 Piezoelectric translators 597 Pillar functional test example 279 Pitch 548 Pitch calibration standard key requirements 413 Pitch record 407 Pixels 686 Placement accuracy 748 Placement error 843 Planar defects detection 340 Planar substrate 561 Planarization 302, 309, 561, 591, 729 Planarizing coating 591 Plane waves 475 Plasma 841 Plasma discharges cleaning schemes 944 Plasma etch rate 672 Plasma etching 698 Plasma physics 867 cyclotron phenomenon 386 Plasma treatments 185 Plots resolution versus defocus 554, 556 PMMA 697, 730, 735, 738, 740, 746. See also Polymethyl methacrylate Point beam 790 Point dose 733 Point source ideal 478 Point spread function 492 Poisson’s Equation 56 Polarization 437 effects 923 Polarized beam 442
Polarizers 340 optical components 439 Polarizing beam splitter cube 582 Poly(butene-1-sulfone) 736 Polymer chain cross-linking 730 Polymer chain scission 730 Polymethyl methacrylate 697. See also PMMA Polyphotolysis 83 Polysilicon CDs problem 266 Polysilicon reflectivity swing curves 271 Polyvinyl alcohol 745 Poor resist image adhesion 158 Popping 155 Positive i-line resist advances 86 Positive photoresist 80 conventional technology 119 film thickness 161 image reversal 264 novolac resin composition 84 OFPR-800 164 technology 259 Positive pressure environment 213 Positive resists 2-layer blocked process 302 classic examples 111 dispense step 217 effects 164 exposure and development conditions 163 high contrast design 83 prebake conditions 163 processes 164 successful design 83 Positive system sensitivity 134 Post exposure bake 97, 227, 578. See also PEB Post lithographic processing image flow 185 POT process 302 sacrificial organic 303
Pre- and post- counts 355 Pre-development bake process 164 Pre-lens deflection 706 Pre-photo inspections 364 Pre-wet 362 Precision 454 Precursor gas 797 Primary beam 677 secondary electrons (SE1) 388 Primary message 653 Problem of degradation circumvention 404 Process flows 647 issues 580 latitude 524, 548, 697, 735, 736, 742, 744 limits 8 module monitoring 334 overhead 684 resolution 692 sequence 227 Process adhesion prerequisite 148 Process automation considerations 644 Process biasing 299 Process capability indices 546 Process chamber 213 Process compatibility 131, 145 Process conditions 281, 549 Process control 254, 542 Process fabrication line development 18 Process optimization 165, 166, 354, 588 Process qualification 354 preferred method 355 preparation 248 several advantages 355 Product development 22 responsibility 24 Product inspections 327 Product volume 19
Production functional requirements 25 lithography 176 manufacturing space 24 maximum capital appropriation 24 photo processes 358 Profitability 645 Programmable Fourier mask 342 Programmable spatial filters 340 Projection electron-beam lithography 746 Projection x-ray barrier 42 Projection-reduction electron-beam lithography 725 PROLITH swing curve simulations 266 PROLITH 2 simulations 292 Prometrix System 54 LithoMapR 52 Propylene glycol methyl ether acetate 225. See also PGMEA Protons range 836 Proximity bias 86 Proximity correction 684, 696, 714 techniques 686 Proximity effect 43, 268, 550, 551, 679, 680, 726, 829, 830 corrections 693 minimization 683 optical 550 Proximity printing 28 stencil mask 726 Proximity systems 45 Proximity x-ray 746 lithography 747 Pseudo evaluation technique 57 PSM artifacts 12
techniques 96 technology 35 PTP 770 Pulsed gas excimer lasers 564 Pump technology 194 Pumpdown 693 Pumps diffusion 699 ion 699 turbomolecular 699 Pupil entrance 496 exit 496 filling ratio 570 PVA 745
Quadrupoles 932 Quality 542 Quantum efficiency 78 Quartz 567 Quick-Spin, Then-Dry method 216
Radiation 749, 844 distribution 924 linear polarization 923 propagation 527 relay system 912 resist 111 spectral distribution 918 Radii of obscuration 530 Radiometer 571 Rail Guided Vehicles 664 Random order tables 539 Random point defects 62 Random-access wafer track 645 Randomization 539 Range 536 straggle 792 Rare earth gas ions 804 Ray optics 510
Rayleigh resolution 505 coefficient 503 limit 504, 527, 549, 605, 617 Rayleigh unit 556 Rayleigh unit of defocus 504, 505 Rayleigh’s criterion 482 Reactive ion etching 732 Real time in-line production monitoring 330 Rectangular aperture 476 Rectilinear propagation 475 Redeposition 816 Reduction and minimization of defects 370 Reduction lens assembly 515, 522 Reduction projection ion beam 746 Reference marks 672, 690 Reference stepper 585 Reflectance 574, 575 measurements 437 Reflectance spectra basic principles 448 Reflectance spectrometer 437 oscillation frequency 451 Reflection angle 943 Reflections suppression of 574 Reflective interference 122 Reflective notching problems practical solution 263 Reflective optics 749 Reflectivity 447, 574 of mirrors 943 substrate 26, 290 Reflectivity decrease 943 Reflector 572 Refractive index 450, 574, 705 gas ratios and substrate temperature 879 Refractory metals thin films 52
Registration 547, 690, 691 error contributions 629 errors 691, 697 level-to-level accuracy 692 marks 690 measurements 54, 433 overhead 695 Regression 539 Relative humidity 210 Reliability 331 Remote command 661 Remote controllers 647 Repeatability 453 error term 436 Repeater defect 367, 368 Repeating defect size 368 Repetitive pattern filtering 345 Replicates 539 Reproducibility 453, 644 important quantities 457 Research and development applications 18 Residual stress gas ratios and substrate temperature 879 Resin design and mechanism 100 Resin molecular weights 78 Resin system synthesis 101 Resist 78, 133, 684, 729 absorption 80, 120, 553 absorption coefficient 576 acid catalyzed 578 adhesion 146, 155, 697 age 735 asymmetrical profile 560 characteristic curve 509 chemically amplified 738, 744 coated wafer 220, 228, 252 consumption reduction 197 contrast 126, 172, 576 coverage tests 197 customer specifications 248 delamination 732
depth of focus (DOF) data 127 developer spin motor 248 development 109, 525 dissolution 120, 732 dyeing 263 electron-beam 729, 740 environment temperature 206 erosion 559 exposure 846 film uniformity 210, 226 gate CD control 271 higher tool resolution 259 index of refraction 506 inorganic 746 lithographic properties 77 lithographic requirement 122 low contrast 534 materials 112, 394 mechanism 90 multiple-layer strategies 738 negative 217, 730 negative electron-beam 743 non-linear response 50 parameter characterization 120 pattern 676, 730 positive 730 positive electron-beam 735 process contrast 136 profile 51, 508, 524 properties 733 refractive index 581 repeatability specification 248 screening 120 sectioning 826 sensitivity 81, 134, 164 sidewalls 397 solutions 155, 161 surface intensity 42 swing curve 126 thinning 263 type test 355 uniformity 206 viscosity 225, 735 volume 200, 248
wetting 164 Resist applications 161 front end device layers 163 metal MLM processing layers 163 thickness control 185 variables 161 wafer tracks 185 Resist coat 163, 591 dispense challenges 199 EBR solvent 220 exhaust 217 process 181, 193 recipe 191, 192 sensitivity index 201 spin motor 248 surface preparation 193 topography 271 uniformity requirements 367 Resist coat and develop 47, 243 multivari studies 250 Resist dispense 371 speed 193 suckback 200 system 194 volume 200 Resist image 474 adhesion 148, 150, 151 CD 136 contrast 400 depth of focus (DOF) 299 edge wall 128, 136, 164 high quality 869 shorting prevention 289 transfer layers 75 Resist process 164, 259 data base 145 dependence of proximity effects 552 lithographic 181 manufacturing 74 optimization 77 substeps 165 technology 266
Resist system 52, 534 adhesion 50 critical dimension control 50 post process compatibility 50 resolution 50 sensitivities 947 throughput 50 track support 49 Resist technology 51, 74 Resist thickness 271, 553 depth of focus (DOF) 310 lithographic testing 122 temperature variable 206 troubleshooting 189 Resist-free exposure field 228 Resist/TEOS sandwich 306 Resistance heating 702 Resistivity 797, 819 Resolution 25, 166, 480, 508, 530, 547, 552, 554, 567, 692, 711, 734, 828 coefficient 532 criterion 482 limit 483 novel definition 389 physics 389 range 25 regime 671 Resonance peak 579 Response filters 788 Response spectra components 787 Response surface designs 542 Response surface experiment 540 Response surface methodology 357 Reticle 472, 601, 649, 670, 840 defects 367 delivery 665 design and testing 595 layout 368 monitoring 368 optical assist features 34 reference 588 technology 12, 23, 34
Reverse tone 686 Rework 646 adhesion effects 159 RF cavity key component 932 RFLDMOS gate image CD variation 276 RGV 664, 665 RIE and sputter etch combination 879 etch rates 109 Rigid process control 52 Rigid units 783 Rimshifter mask 627 Ringing 498 RMS 613 Roller bearing 584 Rotatability 539 Rotation 598, 706 RPL results 139 RSM model minimization efforts 377 Run out error 874
S & S applications final roadblock 34 S & S systems 37 low aberration optics 33 S & S tools 230 mm square reticles 35 S-wave equations 445 Safety issues two types 944 Safety subsystems 945 Sagittal 554 SAL 743 Sample size determining 538 SAMs 746 Saturated designs 540 Scalar amplitude 476 Scaling 598
SCALPEL 15, 45, 686, 726, 746 Scan field size 708 Gaussian-spot raster 698 Gaussian-spot vector 698 speed 816 variable-spot/cell projection vector 698 Scanning electron microscope (SEMS) 51, 715, 799, 901 Scanning electron microscopy 216 Scanning ion microscope 810 Scanning transmission electron microscopes 715 Scanning tunneling microscope exposure 728 Scattered light defects 339 Scattering 550, 725, 846 Scenario distortion 767 Scission 736 Scrap 646 Screening design experiment results 280 Screening experiments 354 SE. See Spectral ellipsometer edge effect 396 production 398 Secondary electron detection low voltage SEM images 395 Secondary electron generation 676 Secondary electrons 386, 676 Secondary ion mass spectroscopy 826 Secondary message 653 SECS-I protocol 651 SECS-II 652 SECS-II dialogue example 654 Sectioned device 824 Segmented auto-threshold 331, 337 Seidel aberrations 522, 526 Selectivity 733, 799
Self-assembled monolayers 746 Self-closing effect 732 SEM CD metrology accuracy of measurements 404 single-tool reproducibility 404 tool-to-tool reproducibility or matching 404 SEM(s) 655, 715, 799. See also Scanning electron microscope (SEMS) accuracy uncertainty 418 analysis 363, 366 cross section techniques 414 image characteristics 392 methods 861 micrograph 360 performance 385 significant differences 409 SEMI E23 standard 666 standards 651 Semiconductor 3 applications 445 device 8, 75 device costs 59 equipment communication standard 652 film thickness tools 440 lithography production 32 manufacturing 158 manufacturing device 90 processing 8 top ten companies 8 Semiconductor Industry Association 756 Sensitive positive e-beam resist 112 Sensitivity 120, 342, 733, 738, 794 measurement 331 Sensitizers 78 Separation distance 482 Serial processing equipment throughput 49
Serial track 646 Serifs 628 Settling times 696 Shadow printing 790 Shallow mesa etch 690 Shaped beam advantages 711 systems 44 Shaped illumination sources 528 Shaping deflectors 704 Sharp photon aerial image 870 Shear wall system 788 Shifter comb structures 621 Shifter edge 606 Shin Etsu resist performance results 97 Shipley acid-catalyzed resists 118 Shipley advanced lithography 743 Shipley I-300 resist 92 Shot noise 831, 846 Shoulder effects 509 Si wafer contact angle 155 SIA 756 SIA TABLE 43 757 Sidelobes 532 Sigma DTT 407 Signal processing errors 582 Signal-to-noise images 342 Signature size and orientation 353 Silicon 445, 557 device size 22 electron yield 401 wafer fabrication 159 Silicon-rich plasma enhanced nitride optical film properties 287 Silicon-rich silicon oxynitride optical film properties 287 SIMS 826 Simulation input variables 289 Single and bilayer resists formulations 110 Single column approach 728 Single die reticle 368 Single edge chromeless shifter 624
Single electron beam machine 629 Single layer resist 52 dyed and thinned 263 Single mirror 940 Single resist viscosity 214 Single tool reproducibility statistical process control 404 Single-layer resist 684 Sinusoidal dependence of intensity 571 Sinusoidal transmission characteristics 503 SiON. See Silicon-rich silicon oxynitride: optical film properties Site-to-site effects 458 Site-to-site sum of squares 461 Site-to-site variation 250 Skin absorption effect 81 Sleeving 689 Slope 524 SLR 665 Smith-Helmholtz invariance 926 Snell’s Law 443, 512 Sodalime glass 766 Soft defects 602 Soft x-ray projection 841 Soft-impact nozzles E 2-type 362 Softbake 729 exhaust 227 process 225 Softer radiation 934 Software 646, 647 SOG 561 SOI 557 Solarization 569 Sorting 886 Source coherence 926 Source lifetime 703 Source operation 802, 803 Spatial coherence 499 Spatial filter 531, 587 Spatial filtering 528, 532
Spatial Fourier synthesis 489 Spatial frequency 484, 502, 505, 508, 610 Spatial noise 349 SPC 183 approach control charts 406 SPC chart 235 SPC charting methods 181 SPC control measurement tool 404 SPC methods 181, 183 SPC record 406 Special coat program 366 Specific equipment model 655 Specification window 546 Speckle 564, 566 Spectral distribution 919 Spectral distribution curve critical energy 917 Spectral ellipsometer 440 information 452 Spectral model cauchy coefficients 437 Spectral narrowing 564 Spectral reflectance everyday experience 438 Spectrometers 436 Spherical aberration 805 Spherical particles 203 Spin coating 561, 729 Spin curve 214 Spin defects 194 Spin speed 213, 221 acceleration 201 Spin time 216 Spin-casting technique 674 Spin-on-glass 561 Spin-to-Dry method 216 Split detectors 395 Split-cross-bridge concept 416 Spot diameter 702 Spray aperture 700
Spray method second wasteful 241 Spread function 511 Spread step 200 Spun resist film 572 Sputter etch patterning 879 Sputter yield 797, 814, 815 Sputtered thin films 278 Sputtering 703, 733, 794, 795, 814 SR source peculiarities 927 SSEM 655 Stability 331, 606, 787 Stadium effect 206 Stage 647, 698 Stage motion 709 Stage orthogonality 599 Stage positioning 691 Stage precision 583 Stage scan path 713 Stage stepping 688 Staining issues 822 Standard deviation 464, 536 statistically correct uncertainty 464 Standing waves 578, 581 effect 574 intensity 578 Standoff distance 601 Stapper’s Model 56 State models 655, 656 State-of-the-art technical difficulty 19 Static 709 designs 773 stiffness values 771 Stationary conveyor systems 665 Statistical methods 534 Statistical process control 172, 404, 405, 650 Stencil mask 712, 790, 792, 836
Step and scan 30, 472, 567, 582. See also S & S systems advantages 32 dynamic field flatness 32 dynamic lens distortion 32 lenses 31 model 531 Stepper(s) 472, 484, 584, 746 double telecentric 597 focus-tilt issues 369 lenses 110 linking 646 loading robots 665 performance 476, 477 projection 532 resolution 518 resonant frequency 579 site 579 SPC 181 SPC reporting 181 Specific Equipment Model 655 unity magnification optical 496 wafer chuck contamination 183 Stepper-to-stepper test 595 Steradians 499 Sticking coefficient 732 Stiffness 771, 788 Stigmatic images 521 Stigmation 708 Stitching 548 error 675 Stochastic blur 846 Stochastic cell models 525 Stoichiometry gas ratios and substrate temperature 879 Storage ring 945, 947 environment 934 Strategy coarse-fine exposure split 712 Stream code 653 dispense method 241
Strehl intensity ratio 478, 507, 522 Stress birefringence 568 relief 843 substrate 691 Stripe butting accuracy 721 Stripes 675, 688 Strippers 730 Structural dynamics 756, 758 research studies 767 Structural engineering 758 Student’s t 537 Submicrometer resolution 790 Submicron lithography 43 Submicron polysilicon lines electrical and top-down CD-SEM measurement 418 Subsidence 773 Substrate 656, 681 alignment mark 676 beam damage 679 conductive 745 damage 831 energy variation 194 exchange 693, 700 pattern 673 planarization 561 position errors 691 preparation 673, 674 pretreatment 147 translation 675 W and Al 147 Subtractive process 698 Subtractive transfer 730 Sums of squares computing shortcuts 458 Suppressor electrode 704 Surface charging 697 Surface state characteristics 199 Surface vacuum work function 703 Swing curve 122
Swing ratio 576 Symmetric errors 595 Symmetric signals 589 Synchrotron 42, 912 radiation source 925 x-ray source 748 System backscattered electron detector 699 beam blanking 704 bilayer 740 blanking 672 case-based 590 cathode projection 726 cell projection 724 cell projection/mini-reticle 714 coherent 501 deflection 672, 707, 708, 717 detection 672 dynamics 773 electron beam lithography 675, 698 electron-optical 700 exposure 494 field emission 696 fixed-stage 695, 709 frequencies 786 Gaussian beam 695, 702 Gaussian spot 712, 716 isolation 783 laser exposure 747 laser height measurement 709 Leica EBPG-5 716 Leica Nanowriter 716 micro-electro-mechanical 728 moving-stage 695, 709, 710 nonlinear 496 optical 511 overlay error 874 pattern generator 722 pattern-writing deflection 724 projection imaging 484 raster scan 695, 710, 711
response 763, 765 shaped beam 695, 711 telecentric deflection 707 trilayer 742 vacuum 672, 699 variable shape beam 708, 712, 714 vector scan 695, 710, 711 Wynne-Dyson 522 System-on-a-chip 7 Systematic errors 547 Systematic methodology 378 Systematic stage errors 595, 598
T-distribution 537 T-pi mask 627 T-statistic 169 T-test 167, 536 Tangential 554 TARC 285. See also Top anti-reflection coating Target degradation 404, 407 Target designs 590 Target topography 430 TBOC resists 108 early uses 108 TCM 256. See also Total Chemical Management TCPS. See Trichlorophenylsilane Technical evaluation 51 Technical requirements 67 Technical understanding 51 Teflon 602 Telecentric design 522 Telecentric error 522 Telecentricity 597 TEM 826 sample preparation 826 Temperature control testing 242 final adjustment 206 plate-to-plate variation 255 rule of thumb 206
Temporal coherence 498 Temporal noise 349 Tensile stress 842 TEOS 561 Terminal server 648 Terpolymer resists 100 Test sequence typical 355 Test wafers special processing 431 Testing contamination 252 Tetra-Methyl-Ammonium-Hydroxide 227. See also TMAH Tetraethylorthosilicate 561 TF 493 TFE 716 Theoretical reflectivity curve 294 Thermal absorption 560 Thermal effects 690 Thermal emitters 703 Thermal equilibration 693, 699 Thermal expansion 843 Thermal field emission 703 Thermal field emitter 716 configuration 704 Thermal gravimetric analysis 145 Thermal image flowing 131 Thermal image reversal 264, 578 Thermal stability 77, 676, 721 Thermionic emission 704 Thermionic tungsten hairpin filaments 385 Thermocouples 242 Thickness measurements X-axis 210 Y-axis 210 Thin film 439, 746 deposition techniques 260 interference 571, 572 normal incidence 447 reflectivity 299 Thin lens 559 Thin resist layer 860
Thin transparent film measurement ellipsometry 442 spectrometry 442 Three mirrors 941 Three-factor Box-Behnken design critical dimension vs. exposure results 170 Threshold detector 510 Threshold voltage shift 910 Threshold voltage-nonscaling 8 Through-the-lens detection 388 quadrant type 388 Throughput 29, 30, 44, 254, 693, 710, 748, 863, 874 factors affecting 695 mask component 874 measurement 331 parameters 859 Throughput and process control wafer track designs 254 Tilt offset 369 Tilt-stage SEMs 402 Time-delay integration 334 Time-history 761 Tip radius 703 TIR 560 TIS calibration 424 correction 424 errors 422 focus 422 formula 424 stability 424 variability 424 TiW 295 TLP etchback alternate processes 308 TMAH 227. See also Tetra-Methyl-Ammonium-Hydroxide cost 241 TMSDEA. See Trimethylsilyldiethylamine vs. HMDS 146 Tone 732, 733
Tool manufacturers specifications 776 testing 769 vibration specification 774 Tools categorizations 763, 787 challenges 9 contribution to vibration 763 excitation levels 758 excitation sources 758, 762, 765 illumination wavelength 3 internal dynamic system 763 modification 358 operations contributors 763 qualification applications 340 utilization 424 vibration scenarios 776 vibrational dynamics 763 Top anti-reflection coating 285. See also TARC Top bead removal 220 Top EBR 221 Top surface imaging 528 Toshiba EBM-800 723 Total Chemical Management 256. See also TCM Total displacement 786 Total electron dose 44 Total fabrication costs 60 Total Indicated Reading 560 Total registration analysis 600 TQV 745 Trace data collection 653 Track and steppers integrated 369 Track equipment qualifications 241 Track footprint 254 Track process control sub-systems 255 Track software 254 Track startup 242 Traffic excitation 777 Transaction 653
Transfer function 491, 502 field application 509 lens 496 resist 496 Transform lens 491 Transform plane 491 Transistor gates 740 Transit time 809 Translation 596 Transmission 568 Transmission cross-coefficient 526 Transmission electron microscopy 826 Transmission errors 611 Transmission function 487 Transmission mask 557 Transmittance swing 605 Transport systems types 664 Transverse straggle 792 Trapezium 717 Trapezoid 599 Trend analysis 353 Trend charts 352, 358 Trichlorophenylsilane 151. See also TCPS Trigger 657 Trilayer process 740 Trilayer systems 260 TRIM 792 Trimethylsilyl groups relative surface coverage 155 Trimethylsilyl silanol reaction species 151 Trimethylsilyldiethylamine 146. See also TMSDEA Trimming 826 Triode circuit 700 Tungsten filaments thermionic emission 43 Tunneling microscopy tips 826 Turnaround time 58 Two layer photo process 303
Two level Stickman structure 54 Twyman-Green interferometer 512, 515
ULSI lithography future requirements 746 Ultra broad band illumination 331, 338 Ultrabeam Lithography 720 Ultrahigh vacuum electron column 383 Ultratech broad band steppers 88 Unbalance force 759 Uncertainty principle 391 Undercut 739 Underetch 150 Uniform resist coating low exhaust pursuit 217 Uniformer 528 Uniformity 569 Unit aspect ratio advantages 591 Unit process monitoring 340 Unity 482 Unpatterned wafers 330, 339 inspection 346 measurements 339 monitors 362 UV photoresist 737
Vacuum chamber line priming modules 146 Vacuum control 700 Vacuum system 26, 721 Valid states 661 van der Pauw cross 52 van der Pauw pattern sheet resistance measurement 415 Vapor temperature 158 Variability 536
Variable screen design 165, 173 procedure 166 results 170 Variable screen experiment 165 Variable shaped beam 722 Variables control chart zone 545 Variance components 545 Variance ratio 466 Vector diffraction models 527 Vector scan systems 44 Velocity 757, 767, 775 criterion 767 filter 806 Vertical emission angles 920 Vertical sidewalls 842 Very high volume product class 20 Via 310, 822, 826 holes 823 images 311 layer image processing 310 opening 548 Vibration 579, 756, 758, 784, 812 consultant 758 criteria 786 effects 580 histogram 580 isolator 579 mask-to-wafer 579 measurements 765, 774, 784 motion criteria 770 simulations 580 specifications 775 traces 777 Vibration-resistant support pedestals 776 Vibration-resistivity 787 Vibration-sensitive tools 788 Vibratory motion 767, 772, 786 Vibratory system 763, 768 Video archival-retrieval system 363 Vignetting 522 Virtual source size 802, 805
Viscosity 735 Visible-optical radiation 887 Visual contact angle 243 Viton-sealed gate valves 939 VLSIC fabrication lines 78 Volume manufacturing problems 358
Wafer inspection technologies 329 tools 331 Wafer processes 187, 199 expected variations 456 Wafer results rework/reprimed 159 Wafer substrate film stacks variations 290 preparation 151 Wafer track 187, 254 hardware qualification 248 high technology Cluster tools 255 processing 281 random-access 645 resist coat process 189 Wafer-handling test 242 Wafer-to-wafer variation 250 Wafers 748 200-mm factory 358 cassette transfer 666 centering 248, 250 chuck temperature 206 cost of manufacturing 57 defect maps 358 fabrication 24, 60 flatness 560 layer bubbling 306 patterning 747 plane 491 processing 645 processing equipment 187 resist coat chart 181
resist coating 372 scaling 592 spin speed 200 temperature 206 thermal image flow testing 133 throughput 9 topography 422 treatment 305 visual inspection 248 Warm ESRs 933 Warranty 776 Waste management systems 257 Watchdog times 944 Wave aberration function 511, 526 Wave trains 498 Waveform 489 Wavefront 511 aberration 522 aberration function 511 axially symmetric 516 deformations 512 description 511 engineering techniques 38 optical 521 quality 515 Wavelength 504, 588 limitations 567 Wavy gate effect 271 Web 680 Webbing 696 Weighted average 538 Wet etching 698, 732 Wet processing problem underetch 150 Wehnelt assembly 700 Wehnelt bias voltage 703 Wehnelt electrode 700, 702, 703 Within-wafer variation process capability 243 Work cells 647 World semiconductor chip business 8 Writing speed 807
Writing strategies 813 Writing time 809 Wynne-Dyson lens system 581
X-ray blank cost 15 flux 856, 887 lines 867 photons 857 printing 857 projection technique 15 proximity masks 681 proximity printing 748 X-ray emitter 865 X-ray lithography 2, 39, 47, 63, 65, 259, 856, 870 development 40 important question 919 photoresist component 899 step-and-repeat 894 synchrotron radiation 945, 946 synchrotron-based 946 understanding 864 X-ray masks performing inspection 886 writing capability 45 X-ray radiation 887 large mask-resist gap 40 X-ray resist 899 dry etch resistance 899 resolution 899 sensitivity 899 X-ray source flux distribution spot-type 867 X-ray stencil printing 7 X-ray steppers 66, 68 X-ray synchrotron source 864 X-ray systems 859 characteristics 902 design performance goals 910 plasma point sources 41
X-ray tools 20 X-ray transmission 938 X-ray/stepper technology critical resolution 65 X-rays 40 XRL applications 931 beam line 942, 943 source 945 steppers 936 system 941
Yates algorithm 171 Yates analysis 171 Yaw 721 Yield 61 budget 24 curve shape 398 enhancement team 357 Yield map 360 Young’s modulus 938
Z atomic number 394 ZEP 737 Zernike coefficients 525 polynomials 516, 521 Zerodur 585