OPTICAL NETWORKING STANDARDS: A COMPREHENSIVE GUIDE
Edited by Khurram Kazi
Springer
Khurram Kazi, Ph.D.
[email protected]
Optical Networking Standards: A Comprehensive Guide
Library of Congress Control Number: 2006921777 ISBN 0-387-24062-4 ISBN 978-0-387-24062-6
e-ISBN 0-387-24063-2
Printed on acid-free paper. "The materials in Chapters 10 and 12 have been reproduced by Springer with the permission of Cisco Systems, Inc. COPYRIGHT © 2006 CISCO SYSTEMS, INC. ALL RIGHTS RESERVED." © 2006 Springer Science+Business Media, LLC. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, Inc., 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed in the United States of America. 9 8 7 6 5 4 3 2 1 springer.com
Dedication
This book is dedicated to my wife Sameema, my family and friends, and all the folks who have spent countless hours developing networking standards.
Contents
Foreword
Preface
Acknowledgements
About the Authors
CHAPTER 1 OVERVIEW
1.1. Optical Transport Network Infrastructure
  1.1.1 Functional Modeling Specification Technique
  1.1.2 Multiservice Optical Transport Network Infrastructure
  1.1.3 Global Optical Transport Network Timing
1.2. Carriage of Services over Transport Networks
  1.2.1 Ethernet Services Architecture and Definitions
  1.2.2 Storage Area Services Over SONET
1.3. Control and Management of Optical Transport Networks
1.4. Intra-Network Element Communication and Component-centric Standards
  1.4.1 Intra-Network Element Communication
  1.4.2 Optical Interfaces
  1.4.3 High-Speed Serial Interconnects
1.5. Standards Development Process
PART 1 Optical Transport Network Infrastructure
CHAPTER 2 ARCHITECTURE OF TRANSPORT NETWORKS
2.1. Introduction
2.2. Transport Functional Modeling
  2.2.1 Basic Concepts
  2.2.2 Functionality
  2.2.3 Connections and Points
  2.2.4 Connection Dimension Model
  2.2.5 Sublayers and Function Decomposition
  2.2.6 Examples
  2.2.7 Equipment Packaging
  2.2.8 Application Examples
  2.2.9 Equipment Control
  2.2.10 Equipment Supervisory Process
  2.2.11 Modeling Connectionless Layer Networks
  2.2.12 Summary
2.3. Notes
2.4. References
CHAPTER 3 INTERFACES FOR OPTICAL TRANSPORT NETWORKS
3.1. Introduction
3.2. OTN Standards
3.3. Standardized Interfaces
3.4. Forward Error Correction
  3.4.1 Theoretical Description
  3.4.2 Coding Gain
3.5. Tandem Connection Monitoring
3.6. OTN Hierarchy Overview
3.7. OTN G.709 Frame Structure
3.8. G.709 Overhead Bytes: In-Depth Analysis and Processing
  3.8.1 OPUk Overhead Bytes and Client Mapping Structure
  3.8.2 Similarly Valued/Formatted Fields within G.709 Frame
  3.8.3 ODUk Overhead and Processing
  3.8.4 Tandem Connection Monitoring (TCM)
3.9. OTUk Overhead and Processing
  3.9.1 Scrambling
  3.9.2 Frame Alignment Overhead
  3.9.3 Section Monitoring Byte Descriptions
  3.9.4 General Communication Channel 0 (GCC0)
3.10. ODUk Multiplexing
  3.10.1 Multiplexing Data Rates
  3.10.2 4 × ODU1 to ODU2 Multiplexing
  3.10.3 ODU1/ODU2 to ODU3 Multiplexing
  3.10.4 Summary
3.11. References
CHAPTER 4 MULTIPLEX STRUCTURES OF THE OPTICAL TRANSPORT NETWORK
4.1. Introduction
4.2. The Situation in the Previous Century
  4.2.1. SDH structure details
4.3. The Evolution of the Bandwidth
4.4. New Clients
4.5. Virtual Concatenation
  4.5.1. Differential Delay
  4.5.2. Payload Distribution and Reconstruction
  4.5.3. Additional Benefits
  4.5.4. Restrictions
  4.5.5. VCAT Details
4.6. Link Capacity Adjustment Scheme (LCAS)
  4.6.1. Link Capacity Increase
  4.6.2. Link Capacity Decrease (Planned)
  4.6.3. Temporary Link Capacity Decrease
  4.6.4. LCAS Details
4.7. Advantages of Using VCAT, LCAS, and GFP
4.8. Implementers Guide for VCAT and LCAS
  4.8.1. Detection of Differential Delay
  4.8.2. Compensation of Differential Delay
  4.8.3. Structure and Management of Differential Delay Buffers
  4.8.4. Differential Delay Buffer Overview
  4.8.5. Alignment within a VCG
  4.8.6. Sizing the Delay Buffers
  4.8.7. Processing Time
  4.8.8. Controlling Distribution/Reconstruction Order
  4.8.9. Member Status
4.9. References
CHAPTER 5 GENERIC FRAMING PROCEDURE (GFP)
5.1. Introduction
5.2. Background
  5.2.1 Packet Transport on Public Networks
  5.2.2 Other Traffic Adaptation Approaches
  5.2.3 Other Design Considerations
5.3. Formats and Procedures
  5.3.1 GFP Frame Formats
  5.3.2 GFP Control Frames
  5.3.3 Client-Independent Procedures
  5.3.4 Client-Dependent Procedures
5.4. Implementation Considerations
  5.4.1 Virtual Framer Management
  5.4.2 Scrambler Options
5.5. Performance
  5.5.1 Probability of GFP Frame Delineation Loss (FDL)
  5.5.2 Probability of False Frame Synchronization (FFS)
  5.5.3 Probability of Frame Unavailability (FUA)
  5.5.4 Frame Acquisition Delay
  5.5.5 Scrambler Resynchronization Delay
  5.5.6 Link Efficiency
5.6. Applications
  5.6.1 Ethernet Private Lines
  5.6.2 Virtual Leased Lines
  5.6.3 Packet Rings
5.7. Future Directions
5.8. References

CHAPTER 6 SYNCHRONIZATION OF OPTICAL NETWORKS
6.1. The Field of Network Synchronization Engineering
  6.1.1 Introduction
6.2. Background on Timing, Synchronization, and Jitter
  6.2.1 Basics of Digital Transmission, Timing Jitter, and Alignment Jitter
  6.2.2 Jitter Tolerance, Transfer, Generation, and Network Limit
  6.2.3 Mapping and Multiplexing
  6.2.4 Pointer Adjustments
  6.2.5 Timing Signal Imperfections
  6.2.6 Characterization of Timing Performance
  6.2.7 Wander Network Limits and Wander Performance
6.3. Roadmap of Current ITU-T Recommendations on Timing and Jitter for OTN, SDH, and PDH
6.4. Timing and Jitter Requirements for SONET/SDH and OTN
  6.4.1 SEC and ODC Frequency Accuracy, Clock Modes, Pull-in and Pull-out/Hold-in Ranges
  6.4.2 STM-N and OTUk Jitter Network Limit and Tolerance, STM-N Regenerator and ODCr Jitter Generation and Transfer, and STM-N and OTUk Jitter Accumulation
  6.4.3 Jitter and Wander Accumulation for PDH Clients of SDH Networks and SDH Clients of OTN
6.5. Reliable Distribution of Synchronization
  6.5.1 The Need for Synchronization
  6.5.2 Synchronization Areas
  6.5.3 Reference Duplication and Reference Selection
  6.5.4 Synchronization Status Messages
  6.5.5 Satellite Timing
  6.5.6 Synchronization Network Engineering
6.6. Conclusions and Closing Remarks
  6.6.1 Conclusions
  6.6.2 Closing Remarks
6.7. Notes
6.8. References
CHAPTER 7 SYNCHRONIZATION ARCHITECTURES FOR SONET/SDH SYSTEMS AND NETWORKS
7.1. Synchronization Concepts
7.2. Timing Traceability
  7.2.1 Source Traceability
7.3. Synchronization Distribution
7.4. Network Element (NE) Architecture
  7.4.1 Timing Engine (TE) Functions
  7.4.2 Timing Distributor (TD) Functions
  7.4.3 Network Element System Architecture
  7.4.4 Small Network Element Architecture
  7.4.5 Medium Network Element Architecture
  7.4.6 Large Network Element Architecture
7.5. External Timing Configurations
  7.5.1 Direct-Source Timing Method
  7.5.2 Bridged-Source Timing Method
  7.5.3 Line/External Timing Method
  7.5.4 Mult Timing Method
7.6. Clock Backup Modes and Implications
7.7. Synchronization Guidelines
7.8. Notes
7.9. References
CHAPTER 8 NETWORK SURVIVABILITY
8.1. Introduction
8.2. Network Survivability Techniques
8.3. Survivability Offered by Protection
  8.3.1 Network Objectives
  8.3.2 Protection Switching Architectures
  8.3.3 Protection Switching Parameters
  8.3.4 Protection Switching Classes
  8.3.5 Hold-off Timer
  8.3.6 Protection Switching Trigger Criteria
  8.3.7 Null Signal
  8.3.8 Automatic Protection Switching (APS) Protocol
  8.3.9 Examples
  8.3.10 Optical Transport Networks (OTN) Survivability
8.4. Survivability Offered by Restoration
  8.4.1 Network Restoration Techniques
  8.4.2 Restoration Time
  8.4.3 Interoperability
8.5. Link Capacity Adjustment Scheme (LCAS)
8.6. Multilayer Survivability
8.7. References

PART 2 Services Offered over Transport Networks

CHAPTER 9 METRO ETHERNET OVERVIEW AND ARCHITECTURE
9.1. Metro Ethernet Demand and Requirements
  9.1.1 Network Resiliency
  9.1.2 Traffic and Performance Management
  9.1.3 Circuit Emulation Services
9.2. Metro Ethernet Forum Charter
9.3. Metro Ethernet Network (MEN) Architecture
  9.3.1 MEN Reference Model
  9.3.2 MEN Layer Network Model
  9.3.3 MEN Reference Points
  9.3.4 MEN Architectural Components
  9.3.5 MEN Layer Relationship to the Architecture Model Components
9.4. References

CHAPTER 10 ETHERNET SERVICES OVER METRO ETHERNET NETWORKS
10.1. Introduction
10.2. Services Model
  10.2.1 Customer Edge View
  10.2.2 User Network Interface
  10.2.3 Service Frame
  10.2.4 Ethernet Virtual Connection
  10.2.5 Identifying an EVC at a UNI
10.3. Service Features
  10.3.1 CE-VLAN ID Preservation
  10.3.2 All-to-One Bundling Map
  10.3.3 Service Multiplexing
  10.3.4 Feature Constraints
  10.3.5 E-Line and E-LAN Service
  10.3.6 Class of Service
  10.3.7 Bandwidth Profiles
  10.3.8 Layer 2 Control Protocols
10.4. Conclusion and Future Work
10.5. Appendix A: Ethernet Basics
  10.5.1 Ethernet Physical Layers
  10.5.2 Ethernet Media Access Control Layer
  10.5.3 Ethernet VLANs
10.6. Notes
10.7. References
CHAPTER 11 ETHERNET SERVICES OVER PUBLIC WAN
11.1. Introduction
  11.1.1 Why Ethernet over the public WAN?
  11.1.2 Organization of the chapter
  11.1.3 Related standards activity
  11.1.4 Definition of some technical terms in this chapter
11.2. Service Types and Characteristics
  11.2.1 Ethernet connection (EC) attributes
  11.2.2 Ethernet Private Line (EPL) service
  11.2.3 Ethernet virtual private line service (EVPL)
  11.2.4 Ethernet private LAN (EPLAN) service
  11.2.5 Ethernet virtual private LAN service
11.3. Transport Network Models in Support of Ethernet Connectivity Services
11.4. Ethernet Client Interfaces
  11.4.1 Multiplexed access
  11.4.2 VLAN mapping
  11.4.3 Bundling
  11.4.4 Bandwidth profile
  11.4.5 Layer 2 Control Protocol processing
  11.4.6 Summary of UNI Service Attributes for Different Services
11.5. Ethernet Transport Network to Network Interface (NNI)
11.6. OAM
11.7. Protection and Restoration
  11.7.1 Service Protection or Restoration Provided by the Transport Network
  11.7.2 Service Restoration at Layer 2
11.8. Conclusion
11.9. Notes
11.10. References
CHAPTER 12 ETHERNET SERVICES OVER MPLS NETWORKS
12.1. Virtual Private Networks
  12.1.1 Traditional Layer 2 Virtual Private Networks
  12.1.2 Classification of VPNs
  12.1.3 Multiservice Converged Packet Switched Backbone
12.2. L2VPNs over MPLS Backbone
  12.2.1 L2VPN Architecture Generic Components
12.3. Metro Ethernet Services
  12.3.1 Ethernet Virtual Connection (EVC)
  12.3.2 E-Line Service
  12.3.3 E-LAN Service
12.4. Metro Ethernet Services over MPLS
  12.4.1 Emulation of E-Line Services using VPWS
  12.4.2 E-Line Service Emulation Walk-Through Example
  12.4.3 Emulation of E-LAN Services using VPLS
  12.4.4 E-LAN Service Emulation Walk-Through Example
12.5. Importance of VPLS for Metro Ethernet Services
12.6. Summary
12.7. Appendix A: MPLS Basics
  12.7.1 Forwarding Equivalence Class
  12.7.2 Labels
  12.7.3 Label Encoding
  12.7.4 Label Switched Router (LSR)
  12.7.5 Label Stack Operations — Imposition, Disposition, Swapping
  12.7.6 MPLS Control Plane
  12.7.7 MPLS Forwarding Plane
  12.7.8 Label Switched Path (LSP)
  12.7.9 Benefits of MPLS Technology
12.8. References
CHAPTER 13 METRO ETHERNET CIRCUIT EMULATION SERVICES
13.1. Metro Ethernet Circuit Emulation Services
  13.1.1 Circuit Emulation Service Definition
  13.1.2 Circuit Emulation Service Framework
13.2. References
CHAPTER 14 METRO ETHERNET NETWORK RESILIENCY AND TRAFFIC MANAGEMENT
14.1. Metro Ethernet Network Resiliency
  14.1.1 Introduction
  14.1.2 Protection Terminology
  14.1.3 Discussion of Terminology
  14.1.4 Protection Reference Model
  14.1.5 Requirements for Ethernet Services protection mechanisms
  14.1.6 Framework for Protection in the Metro Ethernet
14.2. Metro Ethernet Traffic and Performance Management
  14.2.1 Ethernet Traffic Management Overview
14.3. References
CHAPTER 15 SONET SERVICES FOR STORAGE AREA NETWORKS
15.1. Data Growth
15.2. Storage Networking
15.3. Storage Area Networks
  15.3.1 Factors Driving SAN Extension
  15.3.2 Fibre Channel: The Storage Protocol of Choice
15.4. Distance Extension Requirements
15.5. Distance Extension Alternatives
  15.5.1 Legacy Private Line
  15.5.2 WDM
  15.5.3 Storage over IP
  15.5.4 SONET/SDH
15.6. SONET - An Ideal Distance Extension Protocol
  15.6.1 Making SONET Fit - The Role of Standards
15.7. Summary
15.8. References

PART 3 Control and Management of Transport Networks

CHAPTER 16 ARCHITECTING THE AUTOMATICALLY SWITCHED TRANSPORT NETWORK
16.1. Introduction
16.2. Network Requirements (G.807)
  16.2.1 Architectural Context
  16.2.2 Call and Connection Control
  16.2.3 Business and Operational Aspects
  16.2.4 Reference Points and Domains
  16.2.5 Architecture Principles
  16.2.6 Supporting Functions and Requirements
  16.2.7 Signaling Communications Network Requirements
  16.2.8 Support for Transport Network Survivability
16.3. Architecture (G.8080)
  16.3.1 The Control Plane View of the Transport Network
  16.3.2 Identifying Components
  16.3.3 General Component Properties and Special Components
  16.3.4 Component Overview
  16.3.5 Interlayer Modeling
  16.3.6 Distribution models
  16.3.7 An Example of Components in Action
  16.3.8 Identifier Spaces
  16.3.9 Restoration Architecture
16.4. Signaling Communications Network Architecture (G.7712)
  16.4.1 Signaling Methods
  16.4.2 Delivery of Control Plane Messages
  16.4.3 DCN Topologies
  16.4.4 DCN Reliability Considerations
  16.4.5 DCN Security Considerations
16.5. Service Activation Process Elements
16.6. Discovery (G.7714)
  16.6.1 Discovery and Connectivity Verification
  16.6.2 Discovery Architecture
  16.6.3 Types of Discovery
  16.6.4 Discovery Considerations across Administrative Boundaries
16.7. Routing (G.7715 and G.7715.1)
  16.7.1 Requirements
  16.7.2 Architecture
  16.7.3 Hierarchy in Routing
  16.7.4 Routing Information Exchange
16.8. Signaling (G.7713)
  16.8.1 Call and Connection Management Operations
  16.8.2 Basic Call and Connection Control Sequences
  16.8.3 Signaling Attributes
  16.8.4 Signaling Application Example
16.9. Control Plane Management
16.10. Protocol Analysis
  16.10.1 Analysis Approach
  16.10.2 Requirements Implications on Protocol Solutions
16.11. Methods and Protocols — Discovery
  16.11.1 Layer Adjacency Discovery Methods
16.12. Methods and Protocols — Signaling
  16.12.1 G.7713.1 PNNI Signaling
  16.12.2 G.7713.2 GMPLS RSVP-TE Signaling
  16.12.3 G.7713.3 GMPLS CR-LDP
  16.12.4 Interoperability and Interworking
16.13. Methods and Protocols — Routing
16.14. Signaling Communications Network — Mechanisms (G.7712)
16.15. Futures
16.16. Acknowledgements
16.17. References

PART 4 Intra-Network Elements and Component-Centric Standards

CHAPTER 17 INTRA-NETWORK ELEMENTS COMMUNICATION
17.1. Introduction
17.2. Requirement Placed on the Network Elements by the Network
17.3. Network Element Design and Interface Architecture
  17.3.1 Packet Based Network Elements
  17.3.2 TDM Based Network Elements
  17.3.3 Hybrid (TDM + Cell/Packet Based) Network Element Architecture
17.4. 2.5 Gbits/s Systems
  17.4.1 SPI-3 Signal Descriptions
17.5. 10 Gbits/s Systems
  17.5.1 System Framer Interface-4 Phase 1 (SFI-4 Phase 1)
  17.5.2 SPI-4 Phase 1 (OC-192 System Packet Interface)
  17.5.3 System Framer Interface-4 Phase 2 (SFI-4 Phase 2)
17.6. SPI-4 Phase 2 (OC-192 System Packet Interface)
17.7. 40 Gbits/s Systems
  17.7.1 SerDes Framer Interface-5 (SFI-5)
  17.7.2 SPI-5 (OC-768 System Packet Interface)
  17.7.3 TFI-5 (TDM Fabric to Framer Interface)
17.8. Acknowledgements
17.9. References
CHAPTER 18 ITU OPTICAL INTERFACE STANDARDS
18.1. Introduction
18.2. ITU Optical Interface Standards
  18.2.1 Historical Perspective
  18.2.2 Transverse versus Longitudinal Compatibility
  18.2.3 Overview of Optical Fiber Types and Associated Recommendations
  18.2.4 Overview of Optical Interface Recommendations
  18.2.5 Application Code Terminology Related to Distance
  18.2.6 Power Budget Design Considerations and Limitations
18.3. Optical Interface Implementations
  18.3.1 General
  18.3.2 140 Mbit/s - 2.5 Gbit/s Technology
  18.3.3 10 Gbit/s Technology
  18.3.4 40 Gbit/s Technology
18.4. Considerations on Optical Fault and Degradation Detection
  18.4.1 General
  18.4.2 Faults in Conventional Transmitters and Receivers
  18.4.3 Faults in Optically Amplified Systems
18.5. Notes
18.6. Acknowledgments
18.7. References
CHAPTER 19 HIGH-SPEED SERIAL INTERCONNECT
19.1. Introduction
  19.1.1 Chip-Chip Interconnect
  19.1.2 Backplane Interconnect
19.2. High-Speed Interconnect System Architecture
  19.2.1 Topologies
  19.2.2 Printed Circuit Board (PCB) Interconnects
19.3. Compliance Test Methodology
  19.3.1 Eye Mask
  19.3.2 Jitter modeling conventions for high-speed interfaces
  19.3.3 Bathtub curve analysis of jitter
19.4. Interconnect Extension Using De-Emphasis and Equalization
  19.4.1 De-emphasis at the Transmitter
  19.4.2 Equalization at the Receiver
  19.4.3 Usage Models
19.5. Standards-Based High-Speed Interconnect
  19.5.1 OIF SxI-5
  19.5.2 OIF TFI-5
  19.5.3 IEEE® 802.3ae™ Clause 47, XAUI
  19.5.4 Backplane Ethernet
  19.5.5 Summary of Standards-Based High-Speed Interconnect
19.6. Higher and Higher Speeds
19.7. Summary
19.8. Notes
19.9. References
PART 5 Standards Development Process
CHAPTER 20 STANDARDS DEVELOPMENT PROCESS
20.1. Introduction
20.2. The International Telecommunication Union (ITU)
  20.2.1 Hierarchy
  20.2.2 Membership
  20.2.3 Standards Development
20.3. Technology-Specific Industry Forums
  20.3.1 Message
  20.3.2 What is involved? Election/hierarchy
  20.3.3 The History behind Standards Groups: Why join?
  20.3.4 Membership
  20.3.5 Reality of human nature
  20.3.6 Teamwork
20.4. Conclusion
INDEX
FOREWORD
Khurram Kazi
SMSC
"O mankind! We have created you from a single (pair) of a male and female, and have made you into nations and tribes, so that you may know each other…" [Quran 49:13]. When one ponders how we get to know each other, certain thoughts come to mind. As we venture outside our own region or domain, we tend to follow certain protocols that allow us to communicate with each other. Those protocols have diverse flavors; for example, the first thing we try is to communicate in a common language that both parties understand. If that fails, we use gestures or sign language, or even resort to drawing pictures to get our message across. In short, we find a common ground, a similar footing on which to build our communication platform, even though we may come from diverse cultures and backgrounds. Just as we have diversity in mankind, we have disparate, ever-evolving communications networks. These networks are evolving towards providing seamless connectivity between different platforms and applications so that they cater to our insatiable need to communicate with each other in many different ways. Evolutionary technologies, including Dense Wavelength Division Multiplexing (DWDM), advances in optics/optoelectronics, highly versatile electronic integrated circuits, and control and management software, have provided an excellent foundation for present-day networks to emerge as robust and scalable networks with ever-increasing intrinsic intelligence. These advances have been enabled by the relentless activities taking place within scores of technical standards committees, industry fora, and consortia across the globe. In this comprehensive volume, we seek to give an overview of the converged multiservice optical transport networking development activities occurring within standards development
organizations and industry fora, including the International Telecommunication Union (ITU-T), the Internet Engineering Task Force (IETF), the Institute of Electrical and Electronics Engineers (IEEE), the Metro Ethernet Forum (MEF), and the Optical Internetworking Forum (OIF). Some of the issues these bodies are addressing are:

• Multiservice and data-optimized SONET/SDH and OTN transport infrastructure
• Ethernet and MPLS in converged transport networks spanning the enterprise, access, and core network realms
• Flexible and efficient support for a diverse set of services over existing and emerging next-generation transport network architectures
• Enhanced service provisioning, enabling more dynamic and responsive connection services, via automatically switched transport networks
• Equipment functional block specifications that enable multivendor interoperability without constraining equipment implementation
• Physical-layer specifications that enable multi-carrier network deployments and interconnection of equipment from multiple vendors
• Network and equipment management specifications encompassing FCAPS (fault, configuration, accounting, performance, and security management) that assure common behavior and minimize the need for human intervention, reducing operational and management expenses
• Timing and synchronization in global transport networks
• Backplane and component specifications encompassing optical, electrical, and mechanical characteristics that impact, e.g., optoelectronic modules, Very Large Scale Integration (VLSI) devices, and backplanes utilized in existing and next-generation equipment
PREFACE
In the late 1980s and 1990s I was exposed to ANSI, ITU-T, IEEE, and ATM Forum standards and implementation agreements while developing ASICs and systems for T1, SONET/SDH, ATM, and Ethernet applications. During architecture, design, and verification, I had to go through the standards documents to ensure that my designs were standards compliant. While designing, I was always on the lookout for a comprehensive source that could give me a broad perspective on all the relevant standards dealing with optical networking. Suffice it to say, I did not find a single book that comprehensively covered the work being done at the major standards bodies, and I ended up going through quite a few books, standards documents, and technical papers. Developing this bigger picture proved quite useful in my design process, as I started to understand which components my ASICs would be interfacing with, what system and service features these ASICs would provide, and so on. In short, I started to understand the design partitioning and hierarchical abstractions: from ASICs, standard VLSI products, and optoelectronic components to system and network architecture. With the desire to share these thoughts with the rest of the networking community, I embarked upon this project. With the help of Eve Varma, I was able to assemble a world-renowned team of over twenty-five leading contributors to and editors of the standards, from networking powerhouses such as Tellabs, PMC-Sierra, Nortel, Marconi, Lucent Technologies, Cisco, Ciena, British Telecom, Atrica, AMCC, and Agere Systems, as well as independent consultants, who have come together to make this work a reality. From our collective efforts we have put together Optical Networking Standards: A Comprehensive Guide, which provides a single-source reference for the specifications of networks at all levels: from components (optics, optoelectronics, and VLSI
devices) through systems to global transport networks infrastructure, their management, and the services they offer. While going through the standards documents, especially the designers and implementers who ensure standards-compliant products should keep in mind that generally standards documents are not easy to read. There are several reasons behind the statement. Typically, the process of defining the requirements for specific sets of functionalities kickstarts the development of a particular standard or suite of standards. During the early phases of the development of standards technologies or services, the norm is to define a generic architecture. Details are subsequently added either to the same document or to different ones to get into specifics. Generally speaking, every effort is made to ensure that the standards are written in such a way that they are technology and implementation independent. This approach results in careful usage of the language that at times makes it difficult to read. From my personal experiences as a designer, I felt anguish while going through these documents. However, persistent reading made things clearer. One lesson that I learned is that one needs to spend the time going through the documents with patience and full concentration, along with having discussion with the colleagues, to fully appreciate the subtleties of the recommendations. It is always helpful if one develops some background information base prior to going through standards documents. Every effort is made in this book to give a reader some background information so that going through the respective recommendations is not as painful as when one starts to read them cold. The chapters are written such that they can be read as standalone chapters or can be combined to get a better understanding of the different aspects of optical networking standards. 
The reader should, of course, always use the actual standards documents, implementation agreements, or RFCs as the definitive source of information.
Acknowledgements
I would like to thank ITU-T, Metro Ethernet Forum and Optical Internetworking Forum for allowing us to use appropriate information from their respective standards and the implementation agreements.
Contributing Authors
Ghani Abbas has spent over fifteen years in the SDH and optical networks business, initially with GPT and later with Marconi Communications, U.K. He is currently international standards manager in the Network Engineering and Technology department, and previously held various engineering development and management posts. He is currently the rapporteur for ITU-T SG15 Q9, which develops standards for transport equipment and network protection and restoration, and is an active member of OIF, ETSI, and ITU-T SG13 and SG15. Ghani received a B.Sc. (Honours) degree in Electrical Engineering from Manchester University and a Ph.D. degree in Electronics from Southampton University, U.K.

P. Stephan Bedrosian is a Distinguished Member of Technical Staff in the Standards and Advanced Technology organization at Agere Systems. He has worked in the research, design, and development of synchronization systems, networks, and devices over the last two decades. At Bell Laboratories, his focus was on telecommunications synchronization systems, including the design and development of building integrated timing supplies (BITS). At Lucent Technologies, he was involved in the design and development of both telecom and datacom synchronization systems and subsystems for use in SONET/SDH, xDSL, and ATM networks. At Agere Systems, he is closely involved with the development of computer timing devices as well as the standardization of packet timing protocols. He has published several articles, including "Timing Synchronization Speeds Network Access," and holds several synchronization-related patents and patents pending. Mr. Bedrosian holds a Bachelor of Science in Electrical
Engineering from Worcester Polytechnic Institute and a Master of Science in Electrical Engineering from the Georgia Institute of Technology.

Nan Chen is the Vice President of Marketing at Strix Systems (www.StrixSystems.com), a leading provider of mesh wireless Ethernet solutions enabling rapid networking without wires. Mr. Chen is also the President of the Metro Ethernet Forum (www.MetroEthernetForum.org), a worldwide standards organization for carrier-class Ethernet networks and services. Before Strix, Mr. Chen was the Vice President of Marketing at Atrica Inc. (www.Atrica.com), where he successfully drove Ethernet's metro vision in the industry and its wide adoption in carrier networks worldwide. Prior to joining Atrica, Mr. Chen was the Director of Product Management and Product Marketing at Force10 Networks while serving as a founding member of the Board of Directors of the 10 Gigabit Ethernet Alliance (10GEA). Mr. Chen also spent four years at Nortel/Bay Networks/SynOptics. While serving as a Director of Technology at the Nortel Technology Center, Mr. Chen drove Nortel's 10 Gigabit Ethernet strategy and served as a founding member of the IEEE 802.3ae Task Force for the development of 10 Gigabit Ethernet standards. Mr. Chen holds two M.S. degrees from the University of Arizona and a B.S. degree from Beijing University, China, where he was also a record holder in the pole vault.

Carmine Daloia ([email protected]) is a senior communications engineer and consultant at Washington Group International. He holds an M.S. degree in Electrical Engineering from Columbia University and a B.S. degree in Electrical Engineering from The Cooper Union, New York. He has expertise in transport and data network architecture, planning, and design covering a wide range of technologies, including SONET/SDH, OTN, ATM, MPLS, and IP. Beginning in 1995, he worked at Telcordia Technologies as a senior communications engineer, where he was responsible for SONET and ATM network design and planning projects and led the development of various Generic Requirements (GR) documents, including the SONET UPSR, SONET BLSR, and DWDM OADM GRs. While at Telcordia he represented both Telcordia and the Regional Bell Operating Companies in national and global standards bodies. He joined Lucent's Optical Networking Group in June 2000, where he continued his standards activities by contributing to the development of the OTN architecture, equipment, and protection specifications as well as ASON specifications, and was editor of G.7712, "Architecture and Specification of the Data Communications Network." He joined Metro Tech Consulting Services in September 2003, where he provided network planning consultation to New York City Transit
(NYCT) for the future Second Avenue Subway line communications network.

Mimi Dannhardt is a consultant who received her M.S. degree in Electrical Engineering from Virginia Tech. In her career, she has designed numerous networking and telecommunications chips for ATM, SDH, PDH, and Ethernet over SONET applications.

Tracy Dupree is a public relations professional who has worked in the telecom and networking industries for over a decade. Ms. Dupree operated her own consulting agency for several years, where she worked with the Metro Ethernet Forum, among other clients. She has been employed by a variety of communications companies, including Tekelec and Nortel Networks, and is currently employed at Alcatel.

Geoffrey Garner received an S.B. degree in Physics from M.I.T. in 1976, S.M. degrees in Nuclear Engineering and Mechanical Engineering from M.I.T. in 1978, and a Ph.D. in Mechanical Engineering from M.I.T. in 1985. He is currently a consultant in telecommunications, specializing in network timing, jitter, and synchronization; network performance and quality of service; systems engineering; and standards development. Since 2003 he has worked on a variety of projects, including simulation of network-level jitter and wander performance, development of a simulator for Optical Burst Switching network performance, and development of new standards for carrying time-sensitive traffic over Residential Ethernet. Prior to his work as a consultant, he was a Distinguished Member of Technical Staff in the Transport Network Architecture Department of Lucent Technologies. Beginning in 1992, his work at AT&T and then Lucent included the development of international and national standards for jitter and synchronization performance and transmission error performance of OTN and SONET/SDH networks, and for Quality of Service of ATM networks.
He was the Rapporteur of the Transmission Error Performance Question in ITU-T SG 13 from 2001 to 2004, and the Editor for the ITU-T Recommendation specifying jitter and wander in the Optical Transport Network (G.8251) in SG 15. He joined AT&T in 1985, went with Lucent Technologies upon its divestiture from AT&T in 1996, and became a consultant in 2003. Steven Scott Gorshe is a Principal Engineer with PMC-Sierra's Product Research Group. He received his B.S.E.E. (University of Idaho) and M.S.E.E. and Ph.D. (Oregon State University) in 1979, 1982, and 2002, respectively. He has been involved in applied research and the development
of transmission and access system architectures and ASICs since 1982, including over five years at GTE and over 12 years with NEC America, where he became Chief Architect for NEC Eluminant Technologies. His current work at PMC-Sierra involves technology development for application-specific standard product ICs, including those for Ethernet WAN transport over telecommunications networks. Dr. Gorshe is a Senior Member of the IEEE and Co-Editor for the regular Broadband Access series and guest editor for multiple Feature Topics in the IEEE Communications Magazine. He has also been involved in telecommunications network standards continuously since 1985 and serves as Senior Editor for OPTSX (formerly T1X1, responsible for North American SONET and optical network interface standards); technical editor for multiple standards within the SONET series; and a technical editor for multiple ITU-T Recommendations including G.7041 (GFP), G.8011.1 (Ethernet Private Line Service), and G.7043 (Virtual Concatenation of PDH Signals). Areas in which he has made key contributions include architectures for multiple generations of SONET/SDH equipment and much of the transparent GFP protocol. He is a recipient of the Committee T1 Alvin Lai Outstanding Achievement Award for his standards contributions. He has 27 patents issued or pending and multiple published papers. Adam Healey is a Distinguished Member of Technical Staff at Agere Systems and is responsible for the definition of subsystems and components required for access and enterprise networks. Adam joined Lucent Microelectronics / Agere Systems in 2000. Prior to joining Agere Systems, he worked for seven years at the Interoperability Lab at the University of New Hampshire, where he developed many of the test procedures and systems used to verify interoperability, performance, and compliance with standards of 10, 100, and 1000 Mb/s electrical and optical links. 
Adam is a member of the IEEE and contributes to the development of international standards as a member of the IEEE 802.3 working group. He currently serves as chair of the IEEE P802.3ap Backplane Ethernet Task Force. He received a B.S. and M.S. in Electrical Engineering from the University of New Hampshire. Huub van Helvoort is a Standards Consultant at Huawei Technologies Co., Ltd. In 1977 he received his MSEE degree from the Technical University in Eindhoven, the Netherlands. In his 26-year career he has gained extensive experience in public switching systems and ISDN, PDH, and SDH technology. He represents Huawei Technologies in the standards bodies ITU-T (SG15) and ANSI (T1X1.5) and is the editor of several ITU-T recommendations. He is a senior member of the IEEE. He can be contacted at tel: +31 36 5315076; e-mail:
[email protected]
Enrique Hernandez-Valencia is a Consulting Member of the Technical Staff at Lucent Technologies' Bell Laboratories. He received his B.Sc. degree in Electrical Engineering from the Universidad Simon Bolivar, Caracas, Venezuela, and his M.Sc. and Ph.D. degrees in Electrical Engineering from the California Institute of Technology, Pasadena, California. He has over 15 years of experience in the design and development of systems architectures and protocols for high-speed communications networks. Dr. Hernandez-Valencia is a Bell Labs Fellow and a member of the Institute of Electrical and Electronics Engineers, the Association for Computing Machinery, and the Sigma Xi society. Iftekhar Hussain is a technical leader with the Internet Technologies Division at Cisco Systems. For the past several years, Iftekhar has been involved in the design of high-availability-related aspects of IP/MPLS networks. He brings extensive industry experience to the subject of networking and telecommunications, including switching, traffic management, and voice delivery over packet-switched networks. Dr. Hussain's current interests are in the area of IP/MPLS networks, network security, and mobile wireless architectures. He holds a Ph.D. degree in electrical and computer engineering from the University of California, Davis. Nevin Jones is a Consulting Member of Technical Staff (CMTS) with the Advanced Technology and Standards Development Group of Agere Systems (formerly Lucent Microelectronics). He has worked for approximately 20 years in the field of communications engineering at AT&T Bell Laboratories, Lucent Technologies, and Agere Systems. He continues to work in a multidisciplinary communications engineering capacity encompassing switching and transport network modeling and planning, systems engineering, and software and hardware development of integrated circuits for PDH and SONET/SDH systems. 
His primary applied research activities currently include optical networking systems and architectures, client signal adaptation protocols, and integrated circuit physical-layer interface specifications for backplanes and chip-to-chip interconnects. He is an active contributor and member at ATIS OPTXS, PTSC, the Optical Internetworking Forum (OIF), ITU-T SG 15 & SG 13, the IETF, and several other industry standards fora. He holds a B.S.E.E. (SUNY), M.S. (CUNY), and Ph.D. (CUNY). Khurram Kazi has over 19 years of industrial hands-on expertise in the computing, data, and telecommunication networking field. He is a senior systems architect at SMSC working on trusted computing platforms. Prior to
SMSC, he concentrated on the architectural studies and designs of a 40+-Gbps Next Generation Secure Transport Network, where he performed detailed trade-off analyses of crucial optical networking methodologies, devised new techniques, and recommended implementation methods within the intended architecture of the global network, including implementations of network elements and custom ASICs. Prior to this, he developed numerous ASICs for IP switching, SONET, Ethernet, ATM, and PDH applications. His extensive ASIC and systems work experience ranges from mid-size to venture-backed startup companies (Safenet-Mykotronx, General DataComm, TranSwitch, and Zagros Networks) and world-class research organizations like Bell Laboratories, Lucent Technologies. His work has resulted in over a dozen published papers and conference tutorials on topics ranging from optical components to ASICs to optical networks. Khurram received his B.S. from the University of Bridgeport, and his M.S. and Ph.D. from the Department of Electrical and Systems Engineering at the University of Connecticut. He can be reached at
[email protected]. Bob Klessig is a Director of Engineering at Cisco Systems. He is the Vice President of the Metro Ethernet Forum, where he is also the Co-Chair of the Technical Committee and a member of the Board of Directors. Before joining Cisco, Dr. Klessig was a founder of Telseon, an early competitive metro Ethernet service provider in North America. Before Telseon he was with 3Com, where he developed and helped execute the corporate ATM strategy. At 3Com he was the lead representative to the ATM Forum Technical Committee, where he focused on standards for data networking with ATM. He has held lead positions at Bellcore and Bell Laboratories. While at Bellcore he led the conception, design, and specification of Switched Multi-megabit Data Service (SMDS), the first high-speed metropolitan area data service. His Bellcore responsibilities also included leading RBOC participation and serving as Vice Chair of the IEEE 802.6 committee that wrote the IEEE Standard for Metropolitan Area Networks. Dr Klessig has a Ph.D. in Electrical Engineering and Computer Sciences from the University of California at Berkeley and is a co-author of the book SMDS: Wide Area Data Networking with Switched Multi-Megabit Data Service. Gert Manhoudt received a Masters Degree in Electrical Engineering from Delft University of Technology in 1986 in the area of Integrated Optics. He then worked for 17 years for Lucent Technologies and its predecessors and has a broad experience in optical transport networking as a hardware designer and systems engineer and in technical marketing. He has been active in the development and marketing of SDH/SONET equipment, specializing in optics, high-speed electronics, synchronization, and network performance aspects. He represented Lucent Technologies in ITU-T and
ETSI in the synchronization working groups; in this capacity he was editor of two ETSI documents, namely ETS 300 462 parts 3 and 5. He has been one of the pioneers involved with packet over SDH/SONET technology, as the responsible system engineer for the implementation of Ethernet transport and protocols over SDH/SONET networks. He has contributed to papers and studies on the evolution of packet-based transport in today's public networks. Since 2003 he has worked as a network consultant for AimSys, www.aimsys.nl, a startup company that designs and manufactures equipment for metro optical networks. Alan McGuire is a Principal Engineer in the Network Technology Centre of BT. He leads a multidisciplinary team working on next-generation transport networks. Alan is also active in the standards arena, where he has made numerous contributions and acted as editor of numerous ITU Recommendations concerned with network architecture, control and management, and optical networking. He graduated from the University of St. Andrews in 1987 with a first in Physics and received an M.Sc. in Medical Physics one year later from the University of Aberdeen. Alan is a member of the IEEE and the Institute of Physics and is a Chartered Physicist. George W. Newsome is currently a Systems Engineering Consultant whose primary work includes Control Plane management, architecture, and global standards. In previous assignments, George was a significant contributor to the ASON architecture standards, and has both created and managed the development of Network Element software. He has been involved with functional modeling since its inception in the ITU and has also worked on information modeling for network management. Mr. Newsome is a Chartered Engineer, Member of the IEE, and Senior Member of the IEEE, and holds a B.Sc. degree in Electrical Engineering from University College, London. 
Lyndon Ong is currently Director of Network Control Architecture in the CTO organization of Ciena Corporation, a supplier of intelligent optical and data transport equipment. Dr. Ong joined Ciena in 2001 after working at Nortel Networks, Bay Networks, and Bellcore/Telcordia. He received his doctoral degree in Electrical Engineering from Columbia University in 1991. He has had a long career in the area of control protocols, starting with the original team defining Signaling System 7 standards for North America, then working on ATM networking, IP QoS and transport protocols, and finally working on the optical control plane. Dr. Ong has chaired groups in ITU-T Standards, currently chairs the Signaling Transport WG in IETF, and is the editor of the OIF E-NNI Signaling Implementation Agreement.
Richard Orgias currently works as a product marketing manager in the Broadband Networks business at Nortel. Mr. Orgias has worked in a number of different business units in various roles spanning operations, finance, and marketing since joining Nortel in 1992. Prior to his current role, in which he has marketing responsibility for Nortel's broadband access solutions, Mr. Orgias had responsibility for marketing Nortel's optical storage connectivity solutions, which included DWDM-based solutions as well as solutions based on SONET and Ethernet. Mr. Orgias received a B.Sc. degree from McMaster University in 1985 and also holds Master of Science and MBA degrees from McMaster. Mr. Orgias resides in Alpharetta, Georgia, with his wife and two children. Jonathan Sadler is a Staff Engineer in the Advanced Technologies group at Tellabs. With over 20 years of data communications experience as a protocol implementer, network element designer, carrier network operations manager, and carrier network planner, Jonathan brings a broad set of knowledge and experience to the design and analysis of carrier network technology. Currently, he is involved in the development of technologies to provide the efficient transport of packet-oriented services in carrier networks. He is the Chairman of the Optical Internetworking Forum's Architecture and Signaling Working Group and an active participant in the IETF and ITU. Jonathan studied Computer Science at the University of Wisconsin - Madison. Stephen Shew is an Architect in the Optical Networks group at Nortel Networks. In his career at Nortel, Stephen has participated in the development and specification of distributed routing protocols, traffic engineering optimization tools, ATM PNNI, and MPLS signaling. His standards involvement has included the ATM Forum and IETF. He is currently participating in ITU-T Study Group 15 and the OIF, and contributes to the architecture and protocols for the Automatic Switched Optical Network (ASON). 
Stephen received his Bachelor of Computer Science from Carleton University and M.Sc. from the University of Toronto. Peter J. J. Stassar holds a Masters Degree in Electrical Engineering from the Technical University in Eindhoven, the Netherlands. Between 1980 and 2003 he was employed at Lucent Technologies Bell Labs, Hilversum, the Netherlands, lastly as senior technical consultant for optical technology and strategy in the R&D organization of Lucent's Optical Networking Group. He has well over 20 years of working experience in virtually all aspects of optical transmission, ranging from research to development, manufacturing, and deployment, in particular on SDH/SONET and FTTH
optical technologies. Since 1989 he has played a key role in ITU SG 15's activities on optical interface specifications. He is currently engaged as Product Manager for FTTH products at Genexis BV, Eindhoven, the Netherlands, and furthermore he represents the interests of Finisar Corporation in ITU SG 15 in the field of optical interface technologies. Stephen J. Trowbridge received his Ph.D. in Computer Science from the University of Colorado at Boulder. He has worked for Bell Laboratories since 1977, and is currently a member of the Bell Laboratories Standards and Intellectual Property Department. He is a vice-chairman of ITU-T Study Group 15 (Optical and other Transport Network Infrastructures), where he chairs Working Party 3 of Study Group 15 (Optical transport network structure), which is responsible for standards related to PDH, SDH, and OTN Transport, Transport Equipment, Frame Formats, Network Synchronization, Equipment Management, and Automatically Switched Optical Networks (ASON). He chairs Working Party 3 of the ITU-T Telecommunication Standards Advisory Group (TSAG) on Electronic Working Methods and Publication Policy. He is also vice-chairman of the Optical Hierarchical Interfaces (OHI) subcommittee of the ATIS Optical Transport and Synchronization Committee (OPTXS). Eve L. Varma is a Technical Manager at Lucent Technologies, part of Lucent's Optical Networking Group. Ms. Varma has 26 years of research experience within the telecommunications industry. Her primary research areas are currently focused upon the automatically switched transport network (ASON/GMPLS) and converged multiservice transport networks. She is actively engaged in supporting the development of the associated specifications within global standards and industry fora spanning the ITU-T, ATIS, and OIF. 
From 1995 to 1997, she led the team responsible for designing and prototyping a distributed optical network management system as part of the Multiwavelength Optical NETworking (MONET) Consortium (partially supported by DARPA). Previous research experience includes characterization, analysis, and development of transmission jitter requirements; systems engineering and standards for SDH/SONET and OTN transport and network management applications; as well as associated enabling technology and methodology development/assessment. Ms. Varma has been an active contributor to global standards since 1984 and has coauthored two books: Achieving Global Information Networking, Artech House (1999), and Jitter in Digital Transmission Systems, Artech House (1989). She holds an M.A. degree in Physics from the City University of New York. Ms. Varma is a 2004 Bell Labs fellow.
Tim Walker received his BS EE from Rensselaer Polytechnic Institute and his MS EE from the University of Illinois at Urbana-Champaign. He is a member of the IEEE, Tau Beta Pi, and Eta Kappa Nu. Mr. Walker is currently a Systems Engineer at Applied Micro Circuits Corporation (AMCC) in Andover, Massachusetts. Prior to AMCC he spent 16 years at Bell Labs/Lucent in various hardware and software design positions. Before that, he was an RF Engineer for four years at Teradyne. His main areas of expertise are OTN (G.709), SONET/SDH, and PON. He has presented numerous proposals to standards bodies (ITU SG15 and T1X1).
Chapter 1 OVERVIEW
Khurram Kazi SMSC
The late 1990s heralded the need for optical transport networks to transition from catering mostly to voice traffic to converged voice and multiservice data-centric traffic. This transition led to several innovative standardized solutions, developed primarily within the International Telecommunication Union (ITU-T), to create new transport infrastructures and to leverage existing ones more efficiently in support of the delivery of conventional and emerging services. Associated with this trend has been the cross-fertilization of concepts and technologies between traditionally data-oriented standards organizations, such as the Institute of Electrical and Electronics Engineers (IEEE) and the Internet Engineering Task Force (IETF), and the ITU-T. Coupled with this cross-fertilization has been an industry focus upon defining the multitude of services that such networks can offer, as evidenced by diligent work in industry forums such as the Metro Ethernet Forum (MEF). Industry forums, including the MEF, the Optical Internetworking Forum (OIF), and the TeleManagement Forum (TMF), currently play important roles in the realization and deployment of new technologies. Last but not least, given the large number of component vendors whose standard products are used in the design of the network elements that provide this diverse set of services, the need for standardized backplane and component interfaces has never been greater. For example, the OIF and IEEE facilitate the development and deployment of optical and electrical component/backplane technologies. The book is organized into five categories:
1. Optical Transport Network infrastructure
2. Services offered over Transport Networks
3. Control and management of Transport Networks
4. Intra-Network Elements and component-centric standards
5. Standards development process
The chapter distribution is illustrated in Figure 1-1.
Figure 1-1. Categorical distribution of chapters (each category and the chapters covering its standards):
- Optical Transport Network Infrastructure: Chapters 2-8
- Services offered over public and private Transport Networks: Chapters 9-15
- Control and Management of the Transport Networks: Chapter 16
- Network Elements and components: Chapters 17-19
- Standards development process: Chapter 20
1.1. OPTICAL TRANSPORT NETWORK INFRASTRUCTURE

1.1.1 Functional Modeling Specification Technique
Optical transport networks have evolved into intelligent and complex networks that provide global connectivity and increasingly diversified services. Such networks need to interoperate among numerous carriers and service providers to ensure seamless connectivity across the globe. This requirement leads to the daunting task of representing and specifying equipment and network functionality and behavior for transport networks in a coherent and implementation-independent manner. The ITU-T took the lead in defining a functional modeling specification technique that can be used as a requirements capture and analysis tool. Chapter 2 describes the fundamental concepts of the functional modeling approach used in multiservice optical transport networks. In reviewing the basics, it describes how the complexity of network functions can be reduced by a "divide and conquer" approach using the concepts of layers and partitions. The simplicity and versatility of the functional modeling scheme is illustrated by demonstrating how a handful of symbols can represent various networking functions, whether at the network level or in detailed descriptions of functions within the networking gear (even down to the ASIC level). To illustrate its usage, several examples are provided. For
example, functions like the mapping of a Plesiochronous Digital Hierarchy (PDH) client signal onto a Synchronous Optical Network (SONET)/Synchronous Digital Hierarchy (SDH) server signal, or the characteristics of Add Drop Multiplexing (ADM) equipment, are shown in simple diagrams. Numerous other examples illustrate the usage of the functional modeling techniques. The fundamental concepts developed in this chapter are also used in subsequent chapters where services, management, and control of multiservice transport networks are described. For the digital design engineer, this platform can be compared to the US Department of Defense's Very High Speed Integrated Circuit (VHSIC) project, which resulted in the development of the VHSIC Hardware Description Language (VHDL). The intent in creating VHDL was to provide a standard format for writing the specifications of electronic devices and systems. Since then, however, the scope of VHDL has expanded significantly; it is now used for the design, simulation, and synthesis of Application Specific Integrated Circuits (ASICs). Similarly, the functional modeling techniques developed by the ITU-T may be utilized in the design and development process of hardware and software for multiservice transport networks.
1.1.2 Multiservice Optical Transport Network Infrastructure
Since the wide deployment of the Internet, the transport networks that once catered to low-bandwidth Time Division Multiplexing (TDM) centric services have required evolutionary features to simultaneously support TDM traffic and multiservice data-centric traffic (variable-length packets or frames) with either "best effort delivery" or "time and jitter sensitive" characteristics. In the late 1990s, based upon the tremendous growth in data traffic, ITU-T experts determined that a new set of standards was needed, optimized for this ultra-high-growth, data-driven capacity demand. Based upon a set of requirements predicated upon carrier globalization, support for gigabit-level bandwidth granularity for scaling and managing multiterabit networks, and leverage of dense wavelength-division multiplexing (DWDM) technology, work started on a next-generation optical transport network. This infrastructure was intended to complement the SONET/SDH infrastructure, which was designed around a mux structure starting from VT1.5/VC-12 that filled the payload with voice-centric 64 kb/s traffic. Lessons learned in the establishment of the SONET/SDH standards were also leveraged.
1.1.2.1 Optical Transport Hierarchy
Chapter 3 describes in detail two key ITU-T standards, G.709 and G.798, that specify the implementation of the Optical Transport Hierarchy (OTH). It goes through the rationale and requirements behind the development of this suite of standards. For example, it covers the OTH's role as an efficient transport networking layer that supports very high rate services at 2.5, 10, and 40 Gb/s; strong Forward Error Correction (FEC); support for multiple nested and overlapping Tandem Connection Monitoring (TCM) layers; and the management of frequency slots (Optical Channels (OChs); single or multiple wavelengths) instead of time slots (STS-1s, etc.). Furthermore, it extensively covers the details of the G.709 frame structure, how various client signals can be mapped onto it, and the maintenance and management of networked DWDM signals. Prior to the bursting of the telecom bubble, it was widely anticipated that SONET/SDH signals would become clients of the OTN. However, in the time frame during which the OTN standardization effort came to fruition, the market need for Gbit-level networking did not materialize, and OTN deployment has been slower than anticipated. Currently, one of the major driving factors in its deployment is the availability of strong FEC functionality at high speeds. This factor will become even more important with the deployment of 40 Gb/s line rates, due to the power savings and lower Bit Error Rate (BER) that strong FEC affords.
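To make the line rates concrete, the following sketch (not from the book's text; it follows the nominal rate formula published in ITU-T G.709) computes the approximate OTUk line rates. The 255/(239 − k) factor reflects the RS(255,239) FEC codeword layout of the frame, applied to the corresponding STM-16/64/256 client rates.

```python
# Illustrative sketch: nominal OTUk line rates per ITU-T G.709.
# The RS(255,239) FEC adds 16 parity bytes per 255-byte codeword,
# giving the 255/(239 - k) rate expansion relative to the SDH client.

STM_RATES_KBPS = {1: 2_488_320, 2: 9_953_280, 3: 39_813_120}  # STM-16/64/256

def otu_rate_kbps(k: int) -> float:
    """OTUk nominal bit rate in kbit/s (k = 1, 2, 3)."""
    return STM_RATES_KBPS[k] * 255 / (239 - k)

for k in (1, 2, 3):
    print(f"OTU{k}: {otu_rate_kbps(k) / 1e6:.3f} Gb/s")
# OTU1 ~ 2.666 Gb/s, OTU2 ~ 10.709 Gb/s, OTU3 ~ 43.018 Gb/s
```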
1.1.2.2 Data and TDM-Friendly Next-Generation SONET/SDH
Leveraging the worldwide investment in SONET/SDH network infrastructure and equipment to support the delivery of new services continues to be an important consideration for carriers. The advantages of this leveraging include incurring only incremental investments for deploying new services and enabling the generation of revenue from new services delivered via standardized mappings into SONET/SDH (e.g., transport of non-SONET/SDH bit-rate payloads). As the saying goes, "Necessity is the mother of invention," and this truth led to simple and elegant solutions that changed the status of "legacy SONET/SDH" to "the next big thing." Recent ITU-T standards added efficient multiservice data-centric transport capabilities to the Time Division Multiplexed, voice-centric SONET/SDH networks without requiring a "forklift" upgrade. Two main issues needed to be addressed in achieving the desired features: (1) efficient utilization of bandwidth in transporting data traffic and (2) seamless encapsulation of any data-centric traffic into SONET/SDH pipes.
Chapters 4 and 5 provide extensive details of the underlying technologies that breathed new life into SONET/SDH. From the networking point of view, the majority of data-centric traffic across the globe originates in the form of Ethernet frames. Almost everyone who uses a networked computer knows something about Ethernet and feels comfortable with it. This familiarity created huge momentum behind the evolution of Ethernet, which emerged from the confines of local area networks to carry high-speed Internet access, IP Virtual Private Networks (IP VPNs), video distribution, and varied other services. This emergence created a requirement for the efficient transport of Ethernet services over the SONET/SDH and OTN transport network infrastructures. Chapter 4 describes the details of the ITU-T's work on Virtual Concatenation (VCAT) and the Link Capacity Adjustment Scheme (LCAS), which add data-centric capabilities to the existing Time Division Multiplexed (TDM) transport structures to ensure that future networking needs are met. The concept behind VCAT is that the primitive payload containers that carry the client data can be virtually "glued," or concatenated, to build a pipe of the desired bandwidth. This is achieved by using pointers in the overhead field of the containers. Theoretically, bandwidth can be added or deleted on demand with a granularity as low as approximately 1.5 Mb/s (VT1.5 or VC-11), and there are no restrictions on changing the desired bandwidth in increments of any container size (e.g., VT-x, VC-n, STS-n, or STM-n). LCAS defines the protocol by which bandwidth can be changed dynamically. In essence, VCAT in conjunction with LCAS allows network operators to provide bandwidth on demand without disrupting service while effectively utilizing the SONET/SDH pipes.
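The bandwidth-efficiency argument for VCAT can be illustrated with a small sizing sketch (an illustration, not from the book; the member payload capacities are the commonly cited G.707 container rates):

```python
# Illustrative sketch: sizing a VCAT group (member-Xv) for a given client
# rate. Approximate member payload capacities in Mb/s (per ITU-T G.707).
import math

MEMBER_MBPS = {"VC-11": 1.600, "VC-12": 2.176, "VC-3": 48.384, "VC-4": 149.760}

def vcat_group(client_mbps: float, member: str) -> tuple[int, float]:
    """Return (members needed, link efficiency) for carrying client_mbps."""
    rate = MEMBER_MBPS[member]
    x = math.ceil(client_mbps / rate)
    return x, client_mbps / (x * rate)

# Gigabit Ethernet over VC-4-Xv: 7 members at ~95% efficiency, versus
# roughly 42% if forced into a contiguous VC-4-16c.
x, eff = vcat_group(1000.0, "VC-4")
print(f"GbE -> VC-4-{x}v, efficiency {eff:.0%}")
```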
A nimble and efficient encapsulation protocol known as the Generic Framing Procedure (GFP), which supports fixed- and variable-length frames/packets over transport networks, was also recently standardized within the ITU-T (post-2000). Chapter 5 describes how a diverse set of protocols can be efficiently mapped into a single transport protocol (GFP) without having to go through multiple layers of encapsulation before being transported over TDM-based core networks. As an example, native IP traffic, Ethernet, Fibre Channel, and Storage Area Networking (SAN) protocols, along with others, can be mapped into GFP with minimal overhead. GFP offers three main modes of client frame/packet adaptation: Frame Mapped, Transparent Mapped, and Asynchronous Transparent Mapped. The Frame Mapped adaptation mode is mostly utilized for encapsulating data traffic such as Ethernet, IP, and MPLS. In this mode, the entire client data frame/packet has to be buffered locally prior to encapsulation. This process causes delay
and may not be suitable for time-sensitive traffic. The Transparent Mapped modes of GFP provide low jitter and delay by encapsulating the client signal in fixed-length encoded GFP frames without waiting for the entire client frame/packet to be buffered.
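The "minimal overhead" of GFP comes largely from its small core header. As a sketch (an illustration based on ITU-T G.7041, not code from the book): the core header is a 16-bit Payload Length Indicator (PLI) protected by a 16-bit core HEC (cHEC), a CRC-16 with generator polynomial x^16 + x^12 + x^5 + 1. A zero initial remainder is assumed here; the normative procedure (including header scrambling) is in G.7041.

```python
# Illustrative sketch of the 4-byte GFP core header (PLI + cHEC).
# cHEC generator polynomial x^16 + x^12 + x^5 + 1 -> 0x1021.

def crc16(data: bytes, poly: int = 0x1021, init: int = 0x0000) -> int:
    """Bitwise CRC-16 over data, MSB-first, zero initial remainder assumed."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

def gfp_core_header(payload_len: int) -> bytes:
    """Build the core header: 2-byte PLI followed by its 2-byte cHEC."""
    pli = payload_len.to_bytes(2, "big")
    return pli + crc16(pli).to_bytes(2, "big")

hdr = gfp_core_header(1500)   # e.g., a frame-mapped Ethernet payload
print(hdr.hex())
```

The single-bit-error-correcting property of the cHEC is what lets a receiver delineate frames directly from the byte stream, without flags or byte stuffing.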
1.1.3 Global Optical Transport Network Timing
As the saying goes, "Timing is everything." The phrase is an exact fit for transport networking: timing and synchronization among the network elements that make up the global transport network infrastructure play a critical role in ensuring the proper operation and performance of these networks. Chapter 6 gives a comprehensive overview of the timing and synchronization techniques utilized within optical transport network infrastructures. It covers the fundamental concepts of jitter and wander and their relationship to network synchronization. These basic topics lead into the jitter, wander, and synchronization requirements for SONET/SDH networks and the differences between these requirements and those for OTN infrastructures. Moreover, topics including client signal mapping into SONET/SDH or OTN signals, and the accommodation of timing variation to ensure seamless operation, are extensively covered. Architectural solutions for accommodating the effects of jitter and wander within components/blocks, whether these be Clock and Data Recovery (CDR) circuits or SONET/SDH or OTN framers, are provided. The fundamental concepts covered in Chapter 6 have a high degree of abstraction, since they are described in mathematical terms. To illustrate these points explicitly, it is prudent to follow the discussion with some technology-specific concepts and examples. Chapter 7 complements the content of Chapter 6 by addressing these issues within SONET/SDH systems and networks. It starts out by reviewing why synchronization is needed, how it is administered, and why tracing the timing source is important. Numerous timing extraction mechanisms utilized by the network elements are shown. Chapter 7 also reviews hierarchical timing distribution across the network. Several architectural examples of timing distribution within the various types of network elements are also presented.
1.1.3.1 Transport Network Survivability
A very important aspect of providing uninterrupted networking services is to ensure connectivity, while maintaining an acceptable level of service quality, under conditions of natural or man-made disasters. This topic is extensively addressed in Chapter 8 under the category of network
survivability. There are two predominant techniques utilized for achieving survivable networks: (1) network protection and (2) network restoration. Network protection implies that there are dedicated resources that have already been deployed as backups and that "take over" when the primary resources have failed. Network restoration refers to the use of backup resources from a pool of resources, as opposed to dedicated backup resources. Chapter 8 starts by providing the objectives of the protection schemes and covers the details of the five major protection architectures, namely, (1) 1+1, (2) 1:n, (3) m:n, (4) (1:1)^n, and (5) rings. Protection switching parameters, protection switching classes, protection switching trigger criteria, the automatic protection switching protocol, and examples are discussed. Survivability through restoration, implemented by either preplanned or on-the-fly routing using centralized or distributed route-calculation techniques, is also covered.
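The shared-backup idea behind the 1:n architecture can be sketched in a few lines. The class and method names below are ours, and the APS signalling, priorities, and wait-to-restore timers of the real protocols are omitted:

```python
class OneForN:
    """Sketch of 1:n protection switching: n working channels share a
    single protection channel; the first detected failure claims it."""

    def __init__(self, n):
        self.n = n                      # number of working channels
        self.protected = None           # working channel now on protection

    def fail(self, ch):
        if not 0 <= ch < self.n:
            raise ValueError("no such working channel")
        if self.protected is None:      # protection channel is free
            self.protected = ch
            return "switched"           # traffic restored on protection
        return "unprotected"            # backup already in use elsewhere

    def clear(self, ch):
        if self.protected == ch:        # fault repaired: revert, freeing
            self.protected = None       # the protection channel
```

With three working channels, the first failure switches onto the backup; a second simultaneous failure finds the backup occupied and stays unprotected until the first fault clears.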
1.2.
CARRIAGE OF SERVICES OVER TRANSPORT NETWORKS
The MEF, ITU-T, and IETF have taken the lead in defining technical specifications, standards and RFCs for extending the reach of Ethernet from its traditional LAN environment to metro- and wide-area networks. The intent of the effort by these standards and industry forums is to ensure that enhanced Ethernet-based services can be deployed reliably and are scalable at lower capital and operating expenses.
1.2.1
Ethernet Services Architecture and Definitions
In describing Ethernet services, it is prudent to describe reference models that can be used in defining terms and services. Chapter 9 lays the foundation for describing the reference models used in defining Ethernet services over metro area networks. These services are defined in such a way that they can be carried over any of the prevailing transport technologies, like IEEE 802.3 PHY, IEEE 802.1 bridged networks, SDH VC-n/VC-n-Xc, SONET STS-n/STS-4n-Xc, ATM VC, OTN ODUk, PDH DS1/E1, MPLS, dark fiber, etc., or possibly different future networks.
1.2.1.1
Ethernet Services over Metro Networks
Traditionally, services are defined in observable terms, with clear demarcation points between the subscriber and the service provider's
equipment. The subscriber equipment is referred to as the Customer Edge (CE), at which observable service-level parameters are defined that become the basis for a Service Level Agreement (SLA) between the subscriber and the service provider. The physical demarcation point between the Service Provider and a single subscriber is termed a User to Network Interface (UNI), across which such SLAs are made. Beyond the UNI, the types of technology and the architecture inside the metro and wide-area networks are invisible to the subscriber. This transparency allows the services to be defined and observed from UNI to UNI. Moreover, the definition of the services allows the service providers to offer metro- and wide-area Ethernet services to over 100 million existing devices capable of using the services. Chapter 10 describes the service definitions and characteristics of the Ethernet Virtual Connection (EVC) as defined by the MEF. Details of point-to-point, point-to-multipoint, and multipoint-to-multipoint EVCs enabled by VLAN tags are provided, along with details of how these can be used in offering E-Line and E-LAN services. Traffic and performance management is an integral part of ensuring that the SLAs are met for such services. Traffic policing becomes essential in monitoring compliance with the SLAs and is based on parameters like Committed Information Rate (CIR), Committed Burst Size (CBS), Excess Information Rate (EIR), and Coupling Flag (CF). Performance-monitoring parameters such as frame delay, frame delay variation, and frame loss can be used in defining different classes of service. The means whereby these parameters are used in service delivery and performance monitoring/assurance are also covered.
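The CIR/CBS and EIR parameters drive a token-bucket bandwidth profile. The sketch below is a simplified two-rate, three-color policer in the spirit of the MEF bandwidth profile; it is not the normative algorithm (the coupling flag is omitted, and an excess burst size, EBS, is assumed alongside EIR), and all names and numbers are illustrative:

```python
def police(frames, cir, cbs, eir, ebs):
    """Classify frames green/yellow/red with two token buckets:
    a committed bucket (filled at CIR, depth CBS) and an excess
    bucket (filled at EIR, depth EBS), all in bytes and bytes/s."""
    bc, be = cbs, ebs              # current bucket fills, start full
    last = 0.0
    colors = []
    for t, size in frames:         # frames: (arrival_time_s, length_bytes)
        # replenish both buckets for the elapsed time, capped at depth
        bc = min(cbs, bc + cir * (t - last))
        be = min(ebs, be + eir * (t - last))
        last = t
        if size <= bc:
            bc -= size
            colors.append("green")     # conforms to the committed rate
        elif size <= be:
            be -= size
            colors.append("yellow")    # exceeds CIR but within EIR
        else:
            colors.append("red")       # non-conformant, discard eligible
    return colors
```

For instance, three back-to-back 1000-byte frames against a 1500-byte CBS and 1200-byte EBS are marked green, yellow, and red in turn.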
1.2.1.2
Ethernet Services over Public Wide-Area Networks
In Chapter 11, we see that the ITU-T has taken its lead from the work done by the MEF and extended the scope by defining Ethernet services over public wide-area transport networks. The ITU-T uses traffic management and performance parameters consistent with and complementary to those of the MEF in defining its services standards. In addition to the services description, Chapter 11 covers the transport network models that support Ethernet connectivity. These include Ethernet private line services that leverage the existing connection-oriented, circuit-switched TDM networks. Several service scenarios that provide diverse sets of applications are provided. Chapter 11 goes on to describe the Ethernet-based user-to-network interfaces (UNIs) and network-to-network interfaces (NNIs). To facilitate reliable Ethernet services, Ethernet operations, administration and management, and survivability are also discussed. It should be noted that the ITU-T's data-centric transport technologies like Virtual Concatenation (VCAT), Generic Framing Procedure (GFP), and Link Capacity Adjustment Scheme (LCAS)
have become the enablers of Ethernet services over SONET/SDH and OTN networks.
1.2.1.3
Ethernet Services over MPLS Networks
Leveraging the extensive work done on Multiprotocol Label Switching (MPLS) networks, along with its expertise in packet-based networks, the IETF has developed RFCs that are the enablers for providing Ethernet services over MPLS networks. Chapter 12 details Ethernet services over MPLS networks. It starts by describing the fundamental concepts and architectures of layer 2 virtual private networks (VPNs) over an MPLS backbone, setting the stage for Ethernet services over MPLS. E-Line and E-LAN functions are subsequently discussed, as well as how virtual private wire and virtual private LAN services can be offered. A walk-through example of E-Line emulation over MPLS is given to clarify the concepts and outline the steps needed to provide the service.
1.2.1.4
Circuit Emulation over Metro Ethernet
Traditional voice and other TDM services have been the core of our communication needs and over time have been offered on different technology platforms as these evolve. Ethernet as a service platform has gained significant momentum over the past couple of years. Carriers worldwide are deploying metro Ethernet networks to cater to the ever-demanding business customers' requirements for faster and cheaper data and voice transport. Moreover, these carriers are finding increased demand for their existing lucrative TDM traffic, whether PBX trunks or private line services. Chapter 13 describes the details of the MEF's recommendations regarding Ethernet circuit emulation services, such as N x 64 kbit/s, T1, E1, T3, E3, OC-3, and OC-12, across a Metropolitan Ethernet Network. It provides numerous service reference models that can be used in implementing TDM services.
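As a back-of-the-envelope illustration of circuit emulation, a continuous T1 octet stream can be sliced into fixed-size packet payloads for carriage over Ethernet. The 193-octet payload size below is purely illustrative, not a normative MEF value; it is chosen only because it makes each packet carry exactly 1 ms of T1:

```python
def packetize_t1(octet_stream, payload_octets=193):
    """Sketch of structure-agnostic TDM circuit emulation: slice a
    continuous T1 octet stream into fixed-size packet payloads.
    A T1 delivers 1,544,000 / 8 = 193,000 octets/s, so 193-octet
    payloads yield exactly 1000 packets per second (1 ms each).
    Any trailing partial payload is held for the next call."""
    return [octet_stream[i:i + payload_octets]
            for i in range(0, len(octet_stream) - payload_octets + 1,
                           payload_octets)]
```

For example, 1930 buffered octets (10 ms of T1) yield ten 193-octet payloads.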
1.2.1.5
Metro Ethernet Network Resiliency
We are seeing tremendous momentum behind Ethernet services that are being offered by carriers to businesses. The availability of the network services to the business is very critical, as is the Quality of Service (QoS). The combination of service availability and QoS has become a crucial aspect of the Service Level Agreement (SLA) between the user and the service provider. Service availability is tightly coupled with network resiliency.
Chapter 14 covers the metro Ethernet network resiliency as recommended by the MEF. Topics like protection types, failure types, resource selection, and event timings are covered. Chapter 14 also discusses timing issues along with service-level specification commitments. Protection reference models are described such that consistent descriptions of protection capabilities are applied on these services across various transmission technologies, network topologies, and policies. This approach enables the description of protection of services in the ETH layer (Ethernet Line Service, Ethernet LAN service, etc.). The models lend themselves nicely to a definition of requirements for the Ethernet services protection mechanism. This definition subsequently provides the framework for the protection in the metro Ethernet and the implementation of the protection schemes.
1.2.2
Storage Area Services over SONET
The data-centric upgrades to SONET/SDH networks and key developments in Fibre Channel, the widely used protocol for storage-area networks, have provided a strong platform for connecting storage-area networks across town or across the globe. Chapter 15 provides an overview of the trends for tremendous data growth across the globe and the need to access stored data at remote sites. It briefly covers the basics of storage-area networks and the different technologies that are used therein. It reviews the stringent requirements that storage networks place on wide-area networks when they are connected over wider distances, as well as the need for network resiliency, scalability, and performance. It gives various options for how storage data can be moved across various sites using WDM, storage over IP, Fibre Channel over IP, and SONET/SDH. It makes a strong case for why SONET/SDH (using VCAT and GFP), along with the work being done at the ANSI Technical Committee on Fibre Channel Backbone (FC-BB-3), provides a strong platform for extending storage-area networks across long distances.
1.3.
CONTROL AND MANAGEMENT OF OPTICAL TRANSPORT NETWORKS
Chapter 16 provides a treatise on the activities of the ITU-T in specifying the architecture and requirements for automatic switched optical networks (ASON), concentrating on defining the critical functions and the required components for the optical control plane. Since the 2000 time frame, the ITU-T has been engaged in developing a suite of Recommendations
encompassing control plane architecture, auto-discovery, signaling, routing, and control plane management requirements and specifications related to optical transport networks. The optical control plane enables rapid establishment of connection services across heterogeneous networks by supporting intelligence that enables transport networks to be dynamically managed for traffic engineering and bandwidth-on-demand applications, particularly in the areas of QoS, connection management, and traffic restoration after network failures. To achieve this goal, it was considered essential to first establish the general networking requirements related to the service control and management functions that are essential elements of the solution. These requirements included the fundamental concepts of separation of call and connection control; relationships among the management, control, and transport planes; and establishment of a flexible control-plane component architecture. One of the goals of this work is to assure that the optical control plane may be gracefully deployed into existing and new transport network infrastructures, and varied network management environments, in an evolutionary manner. This process would allow network operators to harness the advances in optical transport networking technologies while ensuring that the existing deployed infrastructure is not rendered obsolete. The ultimate goal of the ASON suite of Recommendations is to enable automated cross-domain connection management supporting multivendor and multicarrier interoperability on a global scale. The methodology for the development of the ASON suite of Recommendations involves a foundation of protocol-neutral specifications that facilitate common requirements and behaviors for various technology and implementation options, hence lending themselves nicely to future growth.
Chapter 16 also describes the relationships among the various standards and industry forums involved in the development of control plane specifications (IETF, ITU-T, OIF, ATM Forum), including utilization of associated protocol specifications (GMPLS, PNNI).
1.4.
INTRA-NETWORK ELEMENT COMMUNICATION AND COMPONENT-CENTRIC STANDARDS
1.4.1
Intra-Network Element Communication
The earlier sections reviewed the diverse sets of networking functions and services that are presently being
offered or will be offered in the near future. In building the networks and offering such services, numerous different types of network elements are used. Functions within network elements are primarily implemented in optoelectronic modules and ASICs. Chapter 17 starts with architectural examples of packet- and TDM-based network elements that can be used in providing multiservices. Architectural blocks described in the network element architectures can be mapped to ASICs or standard VLSI products offered by semiconductor firms or developed in-house by the system vendor. The OIF has developed a number of implementation agreements that allow ASICs or standard VLSI products from different firms to communicate and interoperate with each other. Agreements such as the serializer/deserializer (SERDES) to framer interface and the system packet interface, operating at different rates, are covered.
1.4.2
Optical Interfaces
Chapter 18 covers a diverse set of topics on the optical interface standards developed by the ITU-T. It starts by giving the history and rationale behind the evolution of optical interface standards relating to PDH, SONET/SDH, DWDM, OTN, CWDM, and all-optical networks (AON). This chapter covers the general concepts and reference models, along with illustrative examples, so that the reader can subsequently get further details from the relevant standards documents. This approach was deemed necessary due to the large number of standards and the intricate details each respective standard provides. Chapter 18 gives an overview of optical fiber types and optical interface recommendations. It reviews the power budget design considerations and the limitations to overcome worst-case scenarios. Uses of certain coding schemes to achieve the required bit error rates are also covered. Subsequently, examples of discrete and integrated optoelectronic solutions at operating speeds from 140 Mb/s to 10 Gb/s are highlighted. Finally, the chapter covers the elusive topic (from the standardization point of view) of fault and degradation detection in optical transmitters, detectors, and amplifiers.
1.4.3
High-Speed Serial Interconnects
In highly integrated network elements with ever-increasing port speeds and port card densities, the pressure is on to reduce printed circuit board (PCB) traces, layer count, and routing complexity. This situation has led to the usage of serial interconnects using serializer/deserializer devices, commonly known as SERDES. The high-speed interconnects operating at
Gb/s rates pose some interesting challenges. Chapter 19 discusses the high-speed serial interconnects that are used in the communication between devices within the same card, along with card-to-card communication using the backplane architecture. It reviews the architectural considerations and signal-integrity design challenges. The discussion on topologies and the effects of material loss, layer connection, and the environment leads into the topic of de-emphasis and equalization, a powerful method for achieving highly reliable and robust chip-to-chip interconnect solutions. The chapter subsequently gives an overview of the work of the OIF, IEEE, and PCI Industrial Computer Manufacturers Group (PICMG) on interconnect standards. Finally, it considers some challenges and possible solutions for 6 Gb/s and 10 Gb/s applications that are anticipated in the near term.
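Transmitter de-emphasis can be illustrated as a simple 2-tap FIR filter: the output level is reduced whenever a bit repeats, so the low-frequency content is attenuated relative to transitions and, after the channel's high-frequency loss, the received eye stays more open. The tap weights below are illustrative only, not taken from any OIF specification:

```python
def de_emphasize(bits, main=1.0, post=-0.25):
    """2-tap FIR sketch of transmitter de-emphasis on an NRZ stream:
    out[n] = main * x[n] + post * x[n-1], with a negative post-cursor
    tap so repeated bits are driven at reduced amplitude."""
    x = [1.0 if b else -1.0 for b in bits]   # map bits to NRZ levels
    prev = 0.0
    out = []
    for s in x:
        out.append(main * s + post * prev)   # full swing on transitions,
        prev = s                             # reduced swing on runs
    return out
```

For the bit pattern 1, 1, 0 this produces levels 1.0, 0.75, -1.25: the repeated 1 is de-emphasized, while the transition to 0 gets the full (boosted) swing.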
1.5.
STANDARDS DEVELOPMENT PROCESS
The standards development process within the networking field has a long history in which innovative technologies are shaped into value-added products and services. Over the course of many decades, numerous standards organizations and fora have been working diligently to provide innovative standards and recommendations that have shaped the ever-evolving networking field. To appreciate this work, one needs to understand the behind-the-scenes dynamics of what makes these standards bodies and fora "tick." Chapter 20 provides a snapshot of the practices and procedures that are used within the ITU-T in developing global networking standards. In this chapter, practices in the ITU-T and fora such as the MEF and OIF are discussed. It is interesting to note how the cultures in these organizations develop based on their policies and practices. For example, in the ITU-T, the approval of Recommendations requires unanimous agreement. This fundamental premise has fostered a culture at the ITU-T where civility and a spirit of cooperation prevail even against a background of fierce competition in the marketplace. Within industry fora, however, the culture is quite different, since a majority vote (of a certain percentage) governs who "wins." This chapter highlights some interesting insights into the whole recommendations/standards development process that are useful in understanding the "systems." It also highlights the "behind the scenes" hard work of the personnel who make such venues possible, so that we can appreciate their efforts in making these meetings and gatherings successful.
PART I
Optical Transport Network Infrastructure
Chapter 2
ARCHITECTURE OF TRANSPORT NETWORKS
The Functional Modeling Approach
Eve Varma (Lucent Technologies) and Carmine Daloia (Washington Group International)
2.1.
INTRODUCTION
Transport networking has steadily grown more complex as a consequence of more sophisticated customer needs, the convergence of data and transport networking, and conditions imposed by external market and regulatory forces. In the evolution of embedded core transport infrastructures or in building new core transport networks, efficient cost-effective transport capacity expansion, network reliability, flexible and dynamic bandwidth management, and quality-assured service management are of paramount importance to service providers. Given the wide range of technology choices, there is a trend for networks to employ heterogeneous technology equipment. Whereas in the past, transport networks only supported plesiochronous digital hierarchy (PDH) equipment, current networks may utilize equipment employing various technologies, including SONET/SDH, DWDM/OTN, IP/MPLS, Ethernet, and ATM. Current technologies and hierarchies are being designed to facilitate interoperation of equipment produced by different manufacturers, a process that further widens the competitive aspects of equipment purchase. Equipment suppliers may support a number of operators, possibly within a single nation, and may be presented with a number of different equipment specifications. At best, this situation leads to duplication of effort, with several, often very comprehensive, specifications relating to the same piece of equipment. In many cases, especially in a period of standards evolution, the specifications each require slightly different functionality, which may reduce competition and increase the price an operator must pay.
From an operator's perspective, the necessity to fully specify particular equipment in order to avoid confusion and misinterpretation by a number of different manufacturers has led to increased specification complexity, which can make it difficult to judge among competing suppliers. Adoption of a common methodology for describing such equipment is therefore intended to simplify the specification process, to prevent misunderstanding, and to ensure fair competition. It should also present a set of common basic equipment requirements, facilitating inter-operation of multivendor equipment and driving down costs to both the operator and the end user [1]. Spurred by the above factors, motivation arose for establishment of standardized model-based approaches to • Enable description of the generic characteristics of networks, using a common language, at a level that can transcend technology and physical architecture choices; • Provide a view of functions or entities that may be distributed among many types of equipment; and • Concurrently specify transport and management functionality. Accomplishing the above allows us to • Design and plan networks prior to investments, including selection of the most appropriate types of equipment, to support telecommunications services; and • Facilitate development of new transport services and their associated management. As discussed above, the transport network is a large, complex network with various components, and a network model with well-defined functional entities is essential for its design and management. Within this chapter, we introduce and describe transport functional modeling standards, which provide the foundation for equipment control and management.
2.2.
TRANSPORT FUNCTIONAL MODELING
Transport functional modeling can be thought of as a requirements capture and analysis tool. Its objective is to describe the information transfer capability of transport networks in a manner that is independent of networking technology and to provide a set of "tools" for describing, in a common, consistent manner, the technology-specific transport functionality contained within a complex network. It enables • A flexible description of transport network and equipment functional architectures; • A means to identify functional similarities and differences in heterogeneous technology architectures;
• A means to derive equipment functional architectures that are traceable to and reflective of the transport network requirements; and • Formation of the basis for a rigorous and consistent relationship between these functional architectures and their associated management specifications. ITU-T Recommendation G.805 [2] was the first transport functional modeling specification developed, and was specifically designed to address the connection-related characteristics of transport networks and equipment. It has been used to provide the methodology and basic concepts that are the foundation for other ITU-T Recommendations for technology-specific network architectures, including: • Synchronous Digital Hierarchy - G.803 [3], the functional architecture of SDH networks - G.783 [4], the functional architecture of SDH equipment - G.841 [5], the SDH network protection functional architecture - G.842 [6], SDH protection architecture interworking • Optical Transport Networking - G.872 [7], the functional architecture of Optical Transport Networks (OTN) - G.798 [8], the functional architecture of OTN equipment - G.873.1 [9], OTN linear protection The above specifications address connection-oriented networks. In connection-oriented networks, a connection must be set up within the data plane by either the management plane or the control plane prior to the transfer of information across the network. The connection setup process includes a routing process, which determines an appropriate path through the network, and a resource allocation process, which assigns network resources along the calculated path to support the connection. The focus of this chapter is on connection-oriented networks. In addition to connection-oriented networks, connectionless networks are also being deployed in service provider networks. In connectionless networks, datagrams are transferred through the network without any prior negotiation of routes or resources.
The datagram itself contains sufficient address information for network nodes to route the datagram from its source to its destination. As connectionless networks such as IP and Ethernet have become more heavily deployed within service provider networks, in conjunction with the increase in IP and Ethernet service offerings, service providers and equipment suppliers within the ITU-T saw a need to develop a functional modeling specification, namely Recommendation G.809 [10], designed to address connectionless networks much in the same way Recommendation
G.805 addressed connection-oriented networks. It has been used to provide the methodology and basic concepts that are the foundation for other ITU-T Recommendations for technology-specific connectionless network architectures, including G.8010 [11], the functional architecture of Ethernet networks.
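The two-step connection setup described above (a routing step followed by a resource allocation step) can be sketched as follows. The graph encoding, function names, and hop-count routing are ours, purely illustrative and not drawn from G.805:

```python
from collections import deque

def setup_connection(links, capacity, src, dst, demand):
    """Sketch of connection-oriented setup: route by hop count over
    links with spare capacity, then commit capacity along the path.
    links: adjacency dict; capacity: {(u, v): spare units}."""
    # routing process: breadth-first search for a feasible path
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in links.get(u, []):
            if v not in prev and capacity[(u, v)] >= demand:
                prev[v] = u
                queue.append(v)
    if dst not in prev:
        return None                     # no feasible route: call blocked
    path = []
    node = dst
    while node is not None:             # rebuild path from predecessors
        path.append(node)
        node = prev[node]
    path.reverse()
    # resource allocation process: commit capacity on each hop
    for u, v in zip(path, path[1:]):
        capacity[(u, v)] -= demand
    return path
```

A connectionless network, by contrast, would perform no such reservation: each datagram would be forwarded independently using its own address information.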
2.2.1
Basic Concepts
The G.805-based modeling approach has allowed us to analyze the transport network and to identify generic functionality that is independent of implementation technology. This outcome has provided a means to describe network functionality in an abstract way in terms of a small number of architectural components, which include topological components, transport entities, transport processing functions, and reference points. These are typically defined by the function they perform in terms of transformations applied to the signal or by the relationships they describe between other architectural components. In general, these functions act on a signal presented at one or more inputs and present processed signals (i.e., transformed signals) at one or more outputs, and are defined and characterized by the information processed between their inputs and outputs. Architectural components may also be associated together in particular ways to form the equipment from which real networks are constructed. Without such a model, patterns and structure in the network can be rapidly obscured in a cloud of complex relationships. From a connection perspective, two separate concepts are involved: topology and function. The topology of a network is essentially the set of relationships between nodes (which will later be seen as subnetworks), and it defines the available connectivity. Application of this concept simplifies network description by keeping logical connections distinct from their actual routing in the network and the resources that physically support them. Thus, the logical pattern of interconnection of elements in the network is established without concern for the associated signal processing functions, and this outcome allows an operator to easily establish connections as required. On the other hand, the concept of function refers to how signals are transformed during their passage through the network versus how elements of the network are interconnected.
Recommendation G.805 provides elements that support the modeling of both topological and functional concepts. Within the topology domain, the two fundamental concepts that relate to the organization of the network are layering and partitioning.
2.2.1.1
Layering
We have already introduced the concept of topology, which allows us to separate logical connections from the physical routes and resources used in their carriage. This logical separation is well represented by the client/server paradigm, where the client refers to the signal being carried and the server refers to the entity providing its carriage; i.e., client signals are transported by servers. To utilize this paradigm, we consider the client and server to be two layers, where the client layer is supported by the server layer. The client/server paradigm is recursive, in that any particular server layer could itself be considered a client of another server layer. If we elaborate this paradigm, a network can be represented in terms of a stack of client/server relationships (a stack of layers). It's also useful to note that server layers are relatively more permanent than their clients. This outcome follows from the observation that a server connection must exist both before and after a client connection carried by that server. Layering therefore enables decomposition of a transport network into a number of independent transport layer networks, and this independence provides the required separation between its logical topology and physical routes and resources. In particular, the process for setting up connections becomes layer independent. Network management may also be simplified because each layer's properties can be handled in the same way (e.g., each layer can be assigned a quality of service, monitored for its performance independent of the other layers, and assigned an identification to help in fault isolation). A layer is defined (characterized) in terms of its set of signal properties, which form what is called the characteristic information of the layer (e.g., 2.048 Mb/s and its format). These properties are chosen in such a way that any access points having the same characteristic information can be interconnected. 
This term emphasizes the abstract properties of the stream in order to avoid the connotations of a physical signal in a medium, though the properties most often chosen tend to be related to the way a particular stream is represented, e.g., the rate and format at which information is transported. Conventionally, lower-order client layer networks use transport services provided by underlying higher-order server layer networks. The notion of higher- and lower-order layers follows the assumption that a server has a higher capacity than its client does, but this terminology is sometimes confusing because higher order layers are conventionally drawn at the bottom of the page. The complete set of access points in the layer that can be associated for the purpose of transferring information defines the boundary of a layer network.
For purposes of clarification, we provide an example of layering utilizing PDH (e.g., a DS3 client) and SONET/SDH. The SONET [12] and SDH [13] standards define a hierarchy of signal layer networks (see Figure 2-1), as do the PDH and FDM standards that preceded them. Each layer network requires the services of a higher-order layer network to perform the required transport functions. We will discuss the exact location of the layer boundaries in more detail later in this chapter. For this example, we describe the primary signal layer networks below: • The logical client signal layer represents the logical DS3 signal, i.e., the DS3 signal rate and format, irrespective of physical media characteristics (e.g., line coding, etc.). • The logical SONET STS-1 / SDH VC-3 path layer network deals with the transport of the DS3 client signal (which may be considered as a "service"). The main function of the path layer network is to provide end-to-end supervision capabilities for the signal, which traverses a series of SDH Multiplex Sections. Additionally, the layer maps its client into the format required by the SONET Line layer network, on whose services it relies. • The logical SONET Line or SDH Multiplex Section layer network deals with the reliable transport of path layer network payload and its overhead across the physical medium. The main functions of this layer network are to provide alignment (e.g., frequency or phase) and multiplexing for the path layer network. It relies on the services provided by the SONET Section/SDH Regenerator Section layer network. • The logical SONET Section/SDH Regenerator Section layer network deals with the transport of an STS-N/STM-N frame across the physical medium, and uses the services of the physical layer network to form the physical transport. Functions in this layer network include framing, scrambling, section error monitoring, etc.
• The Physical Media Layer network (photonic or electrical), identified as either the STM-N Optical Section (OSn) or STM-1 Electrical Section (ES1), deals with the transport of bits across the physical medium. For example, in the case of photonic media, issues dealt with at this layer network might include optical pulse shape, power levels, and wavelength. This layer is required whenever equipment is to be represented; i.e., physical equipment description is incomplete without provision of physical interfaces. Thus, using the client/server model recursively, a logical DS3 signal would act as the client layer network while being transported by a server logical STS-1/VC-3 path layer network, the logical STS-1/VC-3 path layer network would be the client layer network to the server logical SONET Line/SDH Multiplex Section layer network, etc.
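The recursive client/server layering just described can be expressed as a tiny data structure. The class and method names here are ours, used only to illustrate the recursion, not part of the Recommendation:

```python
class Layer:
    """Sketch of G.805's recursive client/server layering: each layer
    network may itself be the client of the server layer beneath it."""

    def __init__(self, name, server=None):
        self.name = name
        self.server = server            # the server layer, if any

    def stack(self):
        # walk from this client down through successive server layers
        return [self.name] + (self.server.stack() if self.server else [])

# the DS3-over-SDH example from the text, built bottom-up
phys = Layer("STM-N Optical Section")
rs   = Layer("Regenerator Section", server=phys)
ms   = Layer("Multiplex Section", server=rs)
vc3  = Layer("VC-3 Path", server=ms)
ds3  = Layer("DS3 client", server=vc3)
```

Calling `ds3.stack()` walks the client/server chain from the DS3 client down to the physical media layer, mirroring the recursion in the text.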
[Figure 2-1 shows two parallel stacks. On the SONET side: DS3 payload mapping into the logical STS Path layer (STS-1 path overhead insertion), alignment and multiplexing into the logical STS-N frame with Line overhead generation, Section overhead generation, and conversion into the OC-N physical interface. On the SDH side: the corresponding VC-3 path, Multiplex Section, and Regenerator Section operations, ending in conversion into the STM-N physical interface.]
Figure 2-1. SONET/SDH signal hierarchy examples: DS3 client carried on an OC-N/STM-N signal
The architecture of the Optical Transport Network (OTN), specified within G.872, has layering characteristics analogous to those for SDH. This similarity should not be surprising, as the OTN was similarly developed to provide fully featured transport networking functionality optimized for high-capacity path networking in a multidomain environment. The defined OTN layers are:
• the Optical Channel (OCh) layer, which supports end-to-end networking of optical channels for transparently conveying digital client information;
• the Optical Multiplex Section (OMS) layer, which provides functionality for networking of a multiwavelength optical signal; and
• the Optical Transmission Section (OTS) layer, which provides functionality for transmission of optical signals on optical media.
Recommendation G.872 specifies maintenance requirements for each of the defined OTN layers listed above. During the development of G.709 [14], it was realized that only digital techniques were available to meet the continuity, connectivity, and signal-quality supervision requirements specified in G.872 for the OCh layer. The use of digital techniques within the OTN was not considered to be a serious limitation for the following reasons:
• The scope of G.872 is limited to the support of digital client signals.
• Due to limitations in current optical technology, it is not possible to build a worldwide all-optical network (i.e., 3R regeneration of the optical channel is required after a certain distance).
• 3R regeneration will be used at domain boundaries to decouple the domains with respect to optical signal impairments.
Therefore, G.709 specifies an implementation of the OCh utilizing a digital framed signal with digital overhead. The use of a digital framed signal to implement the OCh allowed for the use of Forward Error Correction to enhance performance within the OTN. Recommendation G.709 therefore defines two additional digital layer networks: the Optical Channel Data Unit (ODU) layer network and the Optical Channel Transport Unit (OTU) layer network. Characteristics of the OTN will be elaborated in Chapter 3.

2.2.1.2 Partitioning
As discussed earlier, the concept of layering helps us manage the complexity created by the presence of different types of characteristic information in current networks, which utilize multiple technologies supporting a wide range of bandwidths. However, even within a single layer, complexity is introduced by the presence of many different network nodes and the connections between them. In order to manage this complexity, we introduce the partitioning concept, which also uses the principle of recursion to tailor the amount of detail that needs to be understood at any particular time according to the need of the viewer. Partitioning refers to the division of layer networks into separate subnetworks that are interconnected by links representing the available transport capacity between them. The role of the subnetwork is to describe flexibility of connection, with no notion of distance being traversed; traversal of distance is the role of the link. Subnetworks may be delimited according to a wide range of criteria, including those related to network infrastructure, network services, administrative and/or management responsibility, or even geography. Just as a layer network is bounded by access points that can be associated with each other, a subnetwork is bounded by ports that can be associated with each other. (It is important to note that while an access point can only be associated with one layer network, a port may be a member of one or more subnetworks.) Just as layers enable the management of each layer to be similar, so does partitioning allow the management of each partition to be similar. If we consider that a layer network is actually the largest possible subnetwork bounded by access points, it should not be surprising that subnetworks themselves can also be recursively partitioned into sets of still smaller subnetworks and interconnecting links until the last level of recursion is reached (i.e., a fabric within a piece of equipment). Figure 2-2 below illustrates recursive partitioning of a layer network, focusing on the principle of partitioning rather than the reasons for creating each partition. As each level of partition is created, it is important to understand that the original set of ports around the largest subnetwork neither increases nor decreases in number. The inner subnetworks are intentionally drawn touching the outer subnetworks to indicate that the ports are members of all the touching subnetworks. As more partitions are created, the inner links that become exposed have their own ports on the inner subnetworks. An interesting consequence is that, at any particular level of partitioning, the layer network can be considered as a graph whose vertices are the subnetworks and whose edges are the links. In this view, subnetworks provide for flexible connectivity, while links bridge physical distance.
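Recursive partitioning can also be captured in a few lines of code. The following is an illustrative sketch (the data model is our own, not G.805's): a subnetwork is either atomic — ultimately, a fabric inside a piece of equipment — or partitioned into smaller subnetworks plus the links that interconnect them.

```python
# A subnetwork either has no children (atomic) or is partitioned into inner
# subnetworks plus the links exposed at that level of partition.
class Subnetwork:
    def __init__(self, name, children=(), inner_links=0):
        self.name = name
        self.children = list(children)  # inner subnetworks, if partitioned
        self.inner_links = inner_links  # links exposed at this partition level

def exposed_links(sn):
    """Count every link revealed by descending through all partition levels."""
    return sn.inner_links + sum(exposed_links(c) for c in sn.children)

# A layer network partitioned into two domains; domain A is itself
# partitioned into two fabrics connected by one inner link.
layer_network = Subnetwork("layer network", inner_links=2, children=[
    Subnetwork("domain A", inner_links=1,
               children=[Subnetwork("fabric A1"), Subnetwork("fabric A2")]),
    Subnetwork("domain B"),
])
```

At any chosen depth, the visible subnetworks and links form the vertices and edges of the graph view mentioned above; descending one level simply exposes more of both.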
Figure 2-2. Recursive partitioning of a layer network
As might be expected, the rationale for employing a recursive description of subnetworks also applies to links. Recalling that a link can represent available transport capacity between a pair of subnetworks, link connections have been defined within G.805 as representing the smallest granularity of capacity (supported on a server layer) that can be allocated on a link. Thus, a link may be considered as composed of (partitioned into) a bundle of link connections. However, the concept of link partitioning can be further extended; specifically, we can consider partitioning a link into a set of links of equivalent aggregate capacity (illustrated in Figure 2-3 below).
[Figure 2-3 shows a link with a capacity of y (y > x1 + x2 + x3 + ...) partitioned into links with capacities of x1, x2, x3, ..., respectively.]
Figure 2-3. Partitioning a link into a set of links
This type of link partitioning allows us to assign server capacity to several links, rather than to just one. It thus allows us to assign server capacity to several subnetworks, which is necessary for modeling the sharing of a common server layer by several networks. This link-partitioning concept is particularly relevant to the modeling of variable capacity technology networks. From a terminology perspective, links that have been partitioned into a bundle of smaller links in parallel may be considered as compound and component links, respectively (Figure 2-4).
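A minimal sketch of this parallel partitioning follows (the function and names are illustrative only): a compound link of capacity y is split into component links x1, x2, ..., whose aggregate must not exceed the server capacity being shared.

```python
# Partition a compound link into component links, each of which can then be
# assigned to a different client subnetwork sharing the same server layer.
def partition_link(compound_capacity, component_capacities):
    if sum(component_capacities) > compound_capacity:
        raise ValueError("component links exceed the compound link's capacity")
    return [{"capacity": c} for c in component_capacities]

components = partition_link(40, [10, 10, 10])  # e.g., capacities in Gbit/s
```

The check mirrors the constraint in Figure 2-3: the compound link's capacity bounds the sum of the component capacities, which is what makes the shared server layer safe to allocate.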
Figure 2-4. Parallel partitioning of link into links
Links may also be serially partitioned into an arrangement of link-subnetwork-link, illustrated in Figure 2-5; such links may be designated as serial-compound and component links, respectively.
Figure 2-5. Serial partitioning of a link
The concepts of layering and partitioning are brought together in Figure 2-6, which illustrates a "vertical" arrangement of the layering example described earlier. As illustrated, each layer may be thought of in terms of a layer network, which can be "horizontally" partitioned into subnetworks to reflect infrastructure or equipment organization, such as self-healing rings, or to reflect convenient management or administrative boundaries. As mentioned earlier, the same network can be partitioned differently for different purposes. For example, the partitioning for connection management of various services, network administration, and maintenance may all be different.
Figure 2-6. Illustration of layering and partitioning
We will close this section with an example of how partitioning enables a network management application to abstract the topology of a layer network (Figure 2-7), which is particularly relevant to the connection management domain. This builds on the example provided in Figure 2-2, and provides a more detailed view of the final stage of partitioning of this particular network to set up a connection from access point A to access point B.
Figure 2-7. Enabling abstraction of the topology of a layer network
The topology of the layer network is modeled as an interconnected set of links and subnetworks. The connection management domain utilizes the abstraction of the layer network topology to determine the appropriate set of links and subnetworks required to support a connection between two points. Once the set of links and subnetworks is selected, the transport resources (i.e., the link connections and subnetwork connections) are reserved to support the connection. This description of connection management is applicable to a single layer network, and the processes described are applied one layer at a time, which correctly models connection management in real networks. As discussed previously, the complexity due to the presence of different types of characteristic information and technologies in current networks is managed using the concept of layering. Real networks are therefore modeled as multiple layer networks, each defined by its characteristic information. The general process for setting up connections is similar for each layer network. As noted before, server layers are relatively more permanent than their clients, since a server connection must exist both before and after a client connection carried by that server. As a consequence, the connection management process must be completed first within the server layer to ensure that a server layer trail exists and is ready to support the client layer connection. The creation of the server layer trail results in new transport resources becoming available within the client layer to support client layer connection requests.
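The ordering constraint just described — server trails before client connections, one layer at a time — can be sketched as follows. The function and names are our own, not from any standard:

```python
# Connection management applied layer by layer: the deepest server layer
# trail must be established first, because a server connection must exist
# both before and after any client connection it carries.
def setup_connection(layers_client_first):
    established = []
    # Servers are relatively more permanent, so work from the deepest
    # server upward toward the client.
    for layer in reversed(layers_client_first):
        established.append(f"{layer} trail")
    return established

order = setup_connection(["DS3 client", "VC-3 path", "Multiplex Section"])
```

Running the sketch on the chapter's example yields the Multiplex Section trail first and the DS3 client connection last, matching the server-before-client rule.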
2.2.2 Functionality
While we have discussed the network topology dimension, and seen how complexity can be reduced by introducing the concepts of layers and partitions, we have not yet said anything about the functionality needed to actually transport a signal across a network. We have shown how network layers represent client/server relationships; we may consider the functionality involved in transporting signals to be the implementation of these client/server relationships. This functionality is provided by the same three transport processing functions in each layer, namely, adaptation, termination, and connection functions. The fundamental components of transport processing functionality are known as "atomic" or "elementary" functions, and are related to a single layer of the signal hierarchy (or layer network). We will later see that "atomic" does not mean that the function could not be further decomposed, but that we choose not to decompose the function at this particular time (e.g., it is not necessary from the particular layer perspective). There are, however, rules for composition (and decomposition) of atomic functions. Transport processing functions have been identified and grouped into classes corresponding to adaptation and termination. As signal transport is directional, these functions have a source, which originates the signal, and a sink, which receives the signal. Source functions apply a transformation to the signal, and sink functions remove that transformation. Source and sink functions thus occur in pairs within a layer, and are bounded by ports, which represent the function inputs and outputs. These ports are actually the same ports that we have described as bounding subnetworks and link connections in our partitioning topology model.
Transport processing functions are described in more detail below:
• Adaptation Function: An atomic function that passes a collection of information between layer networks by changing the way in which the collection of information is represented into a form that is suitable for the server layer. The adaptation source function is responsible for several key processes:
- Client encoding: The adaptation source adapts a data stream to the server characteristics.
- Client labeling: The adaptation source "labels" each client so that the corresponding adaptation sink can correctly identify it. This process enables clients to be multiplexed; however, the means by which this is done is very technology specific.
- Client alignment: Adaptation sources align the client signal with capacity in the server layer, while adaptation sinks remove the effects of alignment. While the actual process is technology dependent, in time division multiplexed (TDM) systems buffering of the signal is commonly required.
• Trail Termination Function: An atomic function within a layer network where information concerning the integrity and supervision of adapted information may be generated and added, or extracted and analyzed. While this function's full title is trail termination function, it is commonly abbreviated to just termination function. The termination source is concerned with transforming signals so that they can be monitored for signal quality. This frequently involves the addition of components to the signal for monitoring purposes, commonly called overhead. The termination sink monitors the signal quality and removes any overhead. It is this overhead removal that gives the function its name, i.e., overhead termination. Overhead can be provided via insertion of additional capacity or, alternatively, via usage of already available, but unused, capacity.
While not a transport processing function, there is a third function in common use, known as the connection function.
• Connection Function: An atomic function within a layer which, if connectivity exists, relays a collection of items of information between groups of atomic functions. It does not modify the members of this collection of items of information, although it may terminate any switching protocol information and act upon it. Any connectivity restrictions between inputs and outputs are defined. We note that the connection function is actually the same topological component as the subnetwork and has the same properties.
These atomic functions are represented using a set of symbols, shown in Figure 2-8, which constitute part of a shorthand diagrammatic notation that will be used for specification purposes.
The intent is to simplify technical descriptions via a common set of symbols and naming conventions.
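The source/sink symmetry of these functions can be illustrated with a toy termination pair: whatever transformation the source applies, the corresponding sink removes. Here a single parity byte stands in for real monitoring overhead such as BIP-8 — this is an illustrative sketch only, not an implementation of any standard.

```python
# Termination source: add supervision overhead to the adapted information.
def termination_source(payload: bytes) -> dict:
    parity = 0
    for b in payload:
        parity ^= b  # toy stand-in for a real error-monitoring code
    return {"overhead": parity, "payload": payload}

# Termination sink: monitor signal quality, then strip ("terminate") the
# overhead, returning the original payload.
def termination_sink(signal: dict) -> bytes:
    parity = 0
    for b in signal["payload"]:
        parity ^= b
    if parity != signal["overhead"]:
        raise ValueError("degraded signal detected")
    return signal["payload"]

frame = termination_source(b"DS3 bits")
recovered = termination_sink(frame)
```

The pairing is the essential point: the sink can only undo what its matching source did, which is why source and sink functions always occur in pairs within a layer.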
Figure 2-8. Graphical representation of "atomic" functions
2.2.3 Connections and Points
In Section 2.2.2 we have seen that network layer functionality, including the client/server relationship, may be described in terms of a set of elementary functions. The client/server relationship itself is most precisely defined as the association between layer networks that is performed by an adaptation function. In fact, these elementary functions can be connected to describe the complete layer behavior, with associated rules describing allowable combinations. Functions are interconnected by considering their ports to be bound together, where a binding between two ports is called a reference point (or just point). This convention makes it possible to illustrate relationships between functions without having to explicitly cite which port is involved. Subnetworks allow flexible bindings between their ports, and the binding of two such ports is called a subnetwork connection. The most commonly used bindings and reference points are described below and are illustrated in Figure 2-9:
Figure 2-9. Illustration of various bindings and reference points
• Access Point: Binding of an adaptation source output port to a termination source input port, or an adaptation sink input port to a termination sink output port, is called an access point (AP). This binding is never flexible and can therefore never be partitioned, so it is of relatively little interest. Access points are frequently omitted in functional model diagrams. (An access group is defined as a group of co-located access points, together with their associated trail termination functions.)
• Termination Connection Point: Any binding involving a termination source output port or termination sink input port is called a termination connection point (TCP). The termination source output port may be bound to an adaptation source input port or a connection function input port. The termination sink input port may be bound to an adaptation sink output port or a connection function output port.
• Connection Point: Any binding of an adaptation source input port to a connection function output port, or of an adaptation sink output port to a connection function input port, is called a connection point (CP).
The preceding discussions imply that layers have no "thickness" and are simply planes representing the location of all the connection points in the particular layer. Adaptation and termination functions are located between layers, with inputs and outputs in different layers. These "vertical" relationships are usually statically configured, while the "horizontal" relationships are usually more dynamic. While this view leads to the least ambiguity in models, layers are conventionally considered to have thickness, and the adaptation and termination functions are assigned to either the client or the server layer (Figure 2-10). This convention has more to do with establishing who is responsible for what than with creating good modeling constructs.
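The legal bindings enumerated above can be expressed as a small lookup table. The port names here are our own shorthand; the pairing rules follow the text.

```python
# Each legal port pairing and the reference point it forms. Pairs are stored
# as frozensets because a binding is unordered.
LEGAL_BINDINGS = {
    frozenset({"adaptation-source output", "termination-source input"}): "AP",
    frozenset({"adaptation-sink input", "termination-sink output"}): "AP",
    frozenset({"termination-source output", "adaptation-source input"}): "TCP",
    frozenset({"termination-source output", "connection input"}): "TCP",
    frozenset({"termination-sink input", "adaptation-sink output"}): "TCP",
    frozenset({"termination-sink input", "connection output"}): "TCP",
    frozenset({"adaptation-source input", "connection output"}): "CP",
    frozenset({"adaptation-sink output", "connection input"}): "CP",
}

def reference_point(port_a, port_b):
    """Return "AP", "TCP", or "CP" for a legal binding, else None."""
    return LEGAL_BINDINGS.get(frozenset({port_a, port_b}))
```

Note that any pairing involving a termination source output or termination sink input yields a TCP, exactly as the text states; pairings not in the table are simply not legal bindings.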
Figure 2-10. Allocation of atomic functions to network layers
2.2.4 Connection Dimension Model
We have already introduced the concept of a connection as representing an entity that transports information transparently without any integrity control. Several kinds of connections may be identified, depending on the layer and partition traversed by the connection. Some of these we have informally introduced, or inferred, earlier in the chapter. They are depicted in Figure 2-11 and more formally defined below:
• Trail: We have seen that access points bound a layer network. These access points are bound to the input and output ports of trail termination functions. This association is called a trail, and it provides an end-to-end connection that offers an automatic means to check the quality of the transport.
• Network Connection: A network connection represents an association between output and input ports of trail termination functions that transfers information across a layer network without ensuring its integrity. From our earlier discussion of partitioning layer networks, a network connection is composed of contiguous subnetwork connections and/or link connections.
• Link: A link represents the capacity between two subnetworks, two access groups, or one subnetwork and one access group. The granularity of this capacity depends on the implementation technology. Links are both providers and consumers of capacity. A link can be decomposed into several links of lower capacity, each serving different subnetworks or capacity consumers.
• Link Connection: A link connection transfers information transparently across a link and is delimited by ports that represent the fixed relation between the ends of the link. These ports are the connection points associated with an adaptation function.
• Subnetwork Connection: A transport entity that transfers information across a subnetwork. It is formed by the flexible association of ports on the boundary of the subnetwork. This definition is more specific than the G.805 definition, which defines a subnetwork connection as an association between reference points. (The fixed bindings characteristic of termination connection points and connection points may also be thought of as subnetwork connections, which are very often called degenerate subnetwork connections.
The subnetwork, or reference point, containing such a connection is very often called a degenerate subnetwork.) In summary, a trail may convey information for several clients of a layer network through the application of multiplexing and transcoding capabilities at the layer network boundary. Existence of a trail in one layer provides any client in that layer with a potential for information transfer between the access points characterizing the extremities of that trail. The client/server relationship, more precisely defined as the association between layer networks that is performed by an adaptation function, allows the link connection in the client layer network to be transported over a trail in the server layer network. The usage of the bandwidth contained in a link is flexible, even if its route may be fixed. Except for the case of definite stable capacity between points that characterize the cable infrastructure, transport services usually involve temporary associations between points. Thus, to allow transport resource reuse, a network needs flexibility (reflected in the subnetwork concept). The subnetworks give the flexibility, and links give the fixed transport capabilities between subnetworks. Again, as noted earlier, when we refer to fixed infrastructure, we do not mean that this infrastructure is inflexible; rather, we mean that such possible flexibility is not exercised during the time of the connection we are considering. Links do not change during the time it takes to set up a network connection; neither do the allocated link connections during the duration of the network connection of which they are a part. In general, the higher the order of the link, the more fixed the link tends to be (and vice versa). The usage of the above terminology, and associated relationships, is illustrated in Figure 2-11 below, which shows all the relationships (no other arrangements are possible) between ports, reference points, and connections. These restrictions effectively specify a description language.
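A minimal sketch of this connection taxonomy follows (the model is our own, not G.805's): a network connection is a contiguous series of subnetwork connections (flexible) and link connections (fixed), and a trail adds end-to-end supervision on top of it.

```python
# A network connection alternates subnetwork connections ("SNC"), which
# provide flexibility, with link connections ("LC"), which bridge distance.
class NetworkConnection:
    def __init__(self, segments):
        assert all(kind in ("SNC", "LC") for kind in segments), \
            "only subnetwork and link connections may compose a network connection"
        self.segments = segments

# A trail wraps a network connection with supervision: the trail termination
# overhead allows the quality of the transport to be checked end to end.
class Trail:
    def __init__(self, network_connection):
        self.nc = network_connection  # transfer without integrity guarantees...
        self.supervised = True        # ...plus end-to-end quality monitoring

nc = NetworkConnection(["SNC", "LC", "SNC", "LC", "SNC"])
trail = Trail(nc)
```

The distinction the sketch encodes is the key one in the text: the network connection merely transfers information, while the trail is the entity that can vouch for its integrity.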
Figure 2-11. Illustration of terminology and relationships
Figure 2-11 uses the above concepts to show how a client layer trail may be transported by a server layer signal. Here, the client trail is first terminated, then transported through a subnetwork via a subnetwork connection, and adapted for transport across a server layer trail involving server layer subnetwork and link connections. This model allows us to characterize network functionality in a technology-independent manner. In Section 2.2.6, we will provide some networking examples illustrating application of these principles to various specific technologies.

2.2.5 Sublayers and Function Decomposition
The functions described so far are considered to be atomic at the current level of interest. As with the topological concepts we are now familiar with, these atomic functions can be decomposed to reveal internal detail when it is necessary. Conversely, more detailed layers can be collapsed to reduce the level of visible detail. The goal, as with the topology models, is to reduce the number of items being dealt with at a given level of interest. A layer may be expanded to show more detail by expanding its adaptation or termination functions (see Figure 2-12). Expansion of the adaptation function allows more detailed specification of the adaptation necessary to create the server layer characteristic information, while expansion of the termination function allows more detailed specification of the termination of the server layer. These techniques have been used to specify greater levels of detail in equipment, new monitoring arrangements in existing layers, and fault recovery arrangements for existing layers, as well as completely new server networks. For completeness, Figure 2-12 also depicts the expansion of the connection point, though this is simply the inclusion of additional resources in the connection.
Figure 2-12. Expansion of layers
The converse of expanding layers is, of course, collapsing layers (see Figure 2-13). Layers are often collapsed when there are no flexibility points between them and it is not necessary to fully understand the details of every layer. This is most often done in equipment, though it is possible to collapse layers simply to reduce the amount of detail in a drawing.
Figure 2-13. Simplification vs. flexibility: collapsing layers
2.2.6 Examples
Let us first consider how we would model the transport of a PDH DS-3 client signal onto an STM-N server signal (Figure 2-14). Here, the logical DS3 client signal is adapted for transport onto a VC-3 trail via the VC-3/DS3 adaptation function, the VC-3 path overhead is provided by the VC-3 trail termination function, and the VC-3 client signal is then adapted for transport on a Multiplex Section trail (frequency or phase alignment and multiplexing) via the Multiplex Section adaptation function. Finally, the STM-N Regenerator Section overhead is provided by the Regenerator Section termination function. We note also that it is possible to stop the recursive descent through client/server associations at any arbitrary point. This makes it possible to separate the concerns of the different layer networks, enabling focus on the layer network(s) of interest for any particular purpose. For example, Figure 2-15 only describes associations from the DS3 client through the VC-3 trail and network connections, whereas Figure 2-14 shows the remainder of the recursion to the section layers in this example scenario. The technology and distribution-independent aspects of the functional modeling approach provide a highly flexible tool to accommodate mixed technologies and various possible functional distributions.
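The functional chain just described can be walked through programmatically. The function names below follow the text; the list-and-loop structure is our own illustrative sketch, not part of any standard.

```python
# The ordered source functions a DS3 client passes through on its way onto
# an STM-N signal (per the description of Figure 2-14).
SDH_CHAIN = [
    ("VC-3/DS3 adaptation", "map the DS3 client into the VC-3 payload"),
    ("VC-3 trail termination", "insert VC-3 path overhead"),
    ("MS/VC-3 adaptation", "align (frequency/phase) and multiplex into STM-N"),
    ("MS trail termination", "insert Multiplex Section overhead"),
    ("RS/MS adaptation", "adapt the Multiplex Section into the section frame"),
    ("RS trail termination", "insert Regenerator Section overhead"),
]

def transmit(client="DS3 client signal"):
    """Apply each source function in order, recording what happened."""
    trace = [client]
    for function, action in SDH_CHAIN:
        trace.append(f"{function}: {action}")
    return trace

trace = transmit()
```

Stopping the loop early corresponds exactly to stopping the recursive descent at an arbitrary point, as Figure 2-15 does by going no deeper than the VC-3 trail.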
Figure 2-14. DS3 client conveyed on an SDH VC-3/STM-N server signal
Figure 2-15. DS3 client conveyed on an SDH VC-3 server signal
We will next examine how to model the carriage of an STM-N client signal on an OTM-n.m server signal (Figure 2-16). The STM-N client signal is treated as a constant bit rate (CBR) signal within a certain bit rate range. Here, the logical CBR client signal is adapted for transport onto an ODUkP trail via the ODUkP/CBRx-a adaptation function. The CBR client signal may be either asynchronously or synchronously mapped into the ODUkP server signal. In this example, an asynchronous mapping is supported. The ODUkP path overhead is provided by the ODUkP trail termination function. The ODUkP client signal is then adapted into an OTUk server trail via the OTUk/ODUk adaptation function (synchronous mapping of the ODUk frame signal into the OTUk frame signal). The OTUk section overhead is provided by the OTUk trail termination function. The OTUk client signal is then adapted into an OCh server trail (forward error correction, scrambling, and clock recovery) via the OCh/OTUk adaptation function. The adapted signal is then conditioned for transport across the optical medium, and the OCh path nonassociated overhead is provided by the OCh trail termination function. The OCh signal is adapted into an OMS server trail (wavelength assignment and wavelength division multiplexing) via the OMS/OCh adaptation function. The OMS nonassociated overhead is provided by the OMS trail termination function. The OMS signal is adapted into an OTS server trail via the OTS/OMS adaptation function. The OTS nonassociated overhead is provided by the OTS trail termination function. The OTS trail termination function also maps the logical OTM Overhead Signal supporting the nonassociated overhead into the Optical Supervisory Channel and combines the OSC with the OTS payload signal to form the OTSn characteristic information.
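The OTN stack of Figure 2-16 can be rendered in the same style as the SDH example. The function names follow the text; the structure is our own illustrative sketch.

```python
# The ordered source functions a CBR (e.g., STM-N) client passes through on
# its way onto an OTM-n.m signal (per the description of Figure 2-16).
OTN_CHAIN = [
    ("ODUkP/CBRx-a adaptation", "asynchronously map the CBR client"),
    ("ODUkP trail termination", "insert ODUkP path overhead"),
    ("OTUk/ODUk adaptation", "synchronously map ODUk into the OTUk frame"),
    ("OTUk trail termination", "insert OTUk section overhead"),
    ("OCh/OTUk adaptation", "FEC, scrambling, and clock recovery"),
    ("OCh trail termination", "provide OCh nonassociated overhead"),
    ("OMS/OCh adaptation", "wavelength assignment and WDM"),
    ("OMS trail termination", "provide OMS nonassociated overhead"),
    ("OTS/OMS adaptation", "adapt the OMS into the OTS"),
    ("OTS trail termination", "provide OTS overhead; merge the OSC into OTSn"),
]

# Adaptation/termination functions strictly alternate down the stack.
kinds = ["adaptation" if "adaptation" in f else "termination"
         for f, _ in OTN_CHAIN]
```

Note the strict alternation of adaptation and termination down the stack — the same pattern as the SDH example, which is precisely the point of the technology-independent functional model.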
Figure 2-16. STM-N client conveyed on an OTM-n.m server signal
2.2.7 Equipment Packaging
We have seen that the topological model of layers and partitions, as well as the interlayer functions, do not specify the packaging of functions into telecommunications equipment. Equipment packaging is the realm where layers, partitions, and functions all come together. We have already seen how partitions can be forced by some physical boundary. Equipment provides such a boundary; therefore, equipment content is either driven by partitioning decisions or certain partitioning decisions are forced by equipment content decisions. Unlike the network model, which can support logical reference points at any layer, equipment is obviously constrained to provide only physical interfaces. Equipment therefore encapsulates some common element of the layer, partition, and functional models. It is clear that larger partitions, which are of interest from a network-level perspective, are not usually wholly contained in equipment. However, as we have discussed earlier, all partitions are bounded by ports and, since adaptation and termination functions are present only in source/sink pairs, it is clear that any network layer can usually have only one end terminating in any particular equipment. Layer functions also present ports to both client and server layers. Thus, the modeling component that is common from both a network- and equipment-level perspective is the port. The intersection of the network partition and network layers inside an individual equipment takes place at these ports; i.e., the equipment encapsulates the ports of a partition and one end of one or more layers. (As a corollary, layers and partitions that are fully contained in an individual equipment are internal matters and are generally not of interest to a network.)
When the equipment allows some flexibility of internal connections, as is generally the case in current equipment, the equipment may be considered to contain an internal flexible subnetwork, which is defined by the ports available for connection (represented as logical resources). Because equipment only provides physical interfaces, all reference points are located inside equipment and are therefore inaccessible. This property allows the functional description of the equipment to be independent of the implementation chosen. Returning to our example of a DS3 client conveyed on an SDH STM-N signal, we see that Figure 2-14 describes the complete set of functional associations between the client DS3 signal and the logical STM-N signal without ever once referring to any physical equipment. Figure 2-17 below shows a possible equipment functional partitioning, i.e., a typical organization of functions into equipment, to support transport of the DS3 client signal across an STM-N transport network. Specifically, what is shown is a DS3 connection supported by a VC-3 trail that is terminated by STM-N multiplexers and traverses an intervening cross-connect system with
STM-N interfaces and an internal VC-3 matrix. Due to the restriction that equipment can only present physical interfaces, we first complete the model by adding DS3 physical interfaces and ensuring that the STM-N section layers are physical layers.
Figure 2-17. Possible equipment functional packaging
2.2.8 Application Examples
SONET/SDH technology has introduced new network topologies within the transport domain, one of the most notable being the ring. A ring is composed of a set of equipment called add/drop multiplexers (ADMs) that are interconnected in a loop configuration. These ADMs allow traffic to enter and to exit the ring. The main advantage of the ring topology is survivability: the ring offers traffic two different ways of passing from its ingress to its egress (see Figure 2-18).
Figure 2-18. Example of ring architecture
Specifically, the ring's provision of diversely routed fibers allows traffic to utilize an alternative route when its usual route is no longer working. The allocation of preassigned capacity between nodes, in conjunction with automatic protection switching facilities, enables protection of traffic upon detection of a failure. SONET/SDH ring protection architectures significantly improve transport service quality by reducing unavailable time. Ring protection schemes can provide both trail and subnetwork connection protection. The simplest scheme is 1+1 (one working and one protection transport entity) subnetwork connection protection (SNCP) on a physical ring. As illustrated in Figure 2-19 [16], the input traffic in this case is broadcast over two routes (one being the normal working route and the other being the protection route). As an example, consider the failure-free state for a path from node B to node A. In this case, node B bridges a SONET/SDH path layer signal destined for node A onto both working and protection routes around the ring (fibers 1 and 2, respectively). At node A, the signals from both of these routes are continuously monitored for path layer defects, and the better-quality signal is selected.
Figure 2-19. 1+1 SNCP in a physical ring — failure-free state
In the event of a failure of the active route, the selector at the ring output switches to the standby route. This process is illustrated in Figure 2-20 [16] for the case of a failure between nodes B and A.
Figure 2-20. 1+1 SNCP in a physical ring — failure state
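The bridge-and-selector behavior described above can be sketched in a few lines. This is an illustrative model, not an implementation from any standard: the head-end bridge permanently broadcasts the path signal onto both routes, and the tail-end selector compares defect counts (standing in for the monitored path-layer quality) and picks the better signal, preferring the working route when the two are equal. The function names and dictionary shapes are our own.

```python
# Hypothetical sketch of 1+1 SNCP: permanent bridge at the head end,
# quality-driven selector at the tail end.

def bridge(signal):
    """Permanent bridge: the same path signal is sent on both routes."""
    return {"working": signal, "protection": signal}

def select(working, protection):
    """Choose the route with fewer path-layer defects.

    Each route is a dict carrying the received 'signal' and a 'defects'
    count derived from path-overhead monitoring (illustrative only).
    The working route is preferred when quality is equal.
    """
    if working["defects"] <= protection["defects"]:
        return working["signal"]
    return protection["signal"]
```

In the failure-free state both inputs to the selector carry the same payload; after a fiber cut the working route's defect count rises and the selector output switches to the protection route, exactly as in Figures 2-19 and 2-20.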
Obviously, the traffic on the ring has a single type of characteristic information, such as VC-12, and uses the other layers defined for SDH to support these VC-12 connections. We will use this simple example to show how such a network can be modeled from several perspectives, specifically:
• Topological architecture from a network-level perspective; and
• Associated transport functions from an equipment-level perspective.
To simplify the example, we will consider the case of VC-12 client traffic being carried by VC-4 server trails. This example could be easily extended by considering the servers of the VC-4 link connections and/or the clients of the VC-12 trails. From the service provider's perspective, the ring may be represented as a VC-12 subnetwork with an associated VC-12 subnetwork connection (illustrated in Figure 2-21).
Figure 2-21. Representation of a ring as a VC-12 subnetwork
A given VC-12 subnetwork connection may exit the ring at any individual equipment; thus, the assignment between the input and the output ports needs to be flexible. This flexibility may be supported by the use of the partitioning concept. The previous VC-12 subnetwork may be partitioned
into three subnetworks (i.e., equipment fabrics, one in each add/drop multiplexer), with consecutive subnetworks connected via VC-12 link connections. This level of partitioning is reflected in Figure 2-22.
Figure 2-22. Further partitioning of VC-12 subnetwork
In fact, all VC-12 link connections are served by a VC-4 trail established between two topologically adjacent equipment fabrics by a process in the server layer, as illustrated in Figure 2-23.
Figure 2-23. Illustration of VC-12 client/VC-4 server relationship
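The partitioning just described can be sketched as a small function. The helper name and the fabric labels are ours, purely for illustration: a connection across a partitioned subnetwork decomposes into an alternating sequence of subnetwork connections (one per equipment fabric) and link connections (one per adjacency).

```python
# Illustrative sketch of the G.805 partitioning concept: a VC-12
# subnetwork connection across the ring decomposes into smaller
# subnetwork connections (SNC), one per ADM fabric, joined by
# link connections (LC). Names are ours, not from the standard.

def partition_route(fabrics):
    """Given an ordered list of equipment fabrics (e.g. the three ADM
    matrices of the ring), return the alternating SNC/LC sequence the
    end-to-end subnetwork connection decomposes into."""
    route = []
    for i, fabric in enumerate(fabrics):
        route.append(("SNC", fabric))
        if i < len(fabrics) - 1:
            route.append(("LC", f"{fabric}-{fabrics[i + 1]}"))
    return route
```

For the three-ADM ring of Figure 2-22 this yields three subnetwork connections and two link connections, and the recursion could be applied again inside each fabric.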
From an equipment perspective, we can now use the network-level model to represent the transport function characteristics that will be implemented in the ADM equipment. If we do so, we see that the equipment characteristics for one ADM may then be expressed as illustrated in Figure 2-24. As mentioned earlier, equipment encapsulates some common element of the layer, partition, and functional models, and is constrained to provide only physical interfaces.
Figure 2-24. ADM equipment characteristics
If we consider the direction from ingress to egress of the ring, illustrated previously in Figure 2-19, we can model the protection scheme in a straightforward manner. The specific application example for a 1+1 SNCP ring is illustrated in Figure 2-25. The selector connection function is flexible and is driven by trail signal fail (TSF) signals derived from the S12m_TT_Sk termination points that are reading the S12 layer characteristic information. This arrangement is known as nonintrusive monitoring because, while the layer overhead is read to provide signal quality information, the layer is not in fact terminated. We note that these connection functions model the "bridge" and "selector" previously depicted in Figure 2-19. In this example we illustrated how the model can be used to represent a SONET/SDH ring architecture as well as SDH ring ADM equipment. This example shows how such a model provides a language that links service description, equipment functionality, and equipment management.
Figure 2-25. Subnetwork connection protection using nonintrusive monitoring
In the next example, we illustrate a service provider offering STM-64 switched connection services via an OTN mesh network, focusing our attention on the transport plane as opposed to the control plane, which supports the signaling necessary to automatically set up the connections within the transport plane. The STM-64 service provides flexible connectivity of STM-64 Regenerator Section (RS64) connections. In this example, the term STM-64 is used to refer to the STM-64 RS64 layer network. We illustrate how such a network can be modeled from several perspectives, specifically:
• Topological architecture from the STM-64 RS64 network-level perspective;
• Topological architecture from the ODU2 network-level perspective; and
• Associated transport functions from an equipment-level perspective.
The customer connects to the service provider's network within the transport plane via an STM-64 physical interface (see Figure 2-26). The service provider provides an STM-64 switched connection service via the combination of a transport plane that supports the flexible connectivity of STM-64 connections and a control plane that provides dynamic routing and signaling capabilities to determine a path for the STM-64 connection and assign resources within the network to support the connection.
Figure 2-26. STM-64 switched connection service
The customer, via UNI signaling within the control plane, requests STM-64 connections between a set of endpoints across the service provider's STM-64 network. The customer is not aware of, nor does it care about, the technologies and architecture used by the service provider to support such a service. From the customer's perspective, the topological architecture of the service provider's network can be modeled as an STM-64 subnetwork and multiple STM-64 access links (see Figure 2-27). The STM-64 subnetwork provides the flexible connectivity within the transport plane, and the STM-64 access links provide the fixed connectivity between the customer and the service provider's network. In using the model to describe the topological architecture, we can clearly see that such signaling provides coordination for connection management between two partitions of the STM-64 layer network.
Figure 2-27. Topological architecture of the STM-64 layer network
The customer requests an STM-64 connection originating and terminating at specific link ends. The connection is subsequently set up across the STM-64 access links and the STM-64 subnetwork (see Figure 2-28). The connection is partitioned into two STM-64 link connections and one STM-64 subnetwork connection.
Figure 2-28. Partitioning of the STM-64 connection
Across the access links, the STM-64 link connections are supported via the Optical Section 64 (OS64) server trail, as described in Section 2.2.1.1, thus supporting an STM-64 physical interface between the customer and service provider. Within the service provider's network, there is a need to monitor the quality of the STM-64 subnetwork connection as it is transported across the network. Therefore, the service provider must transport the STM-64 subnetwork connection via a server layer that can provide the necessary monitoring capabilities. In this example, the service provider supports the STM-64 connection via an OTN. The STM-64 subnetwork connection is supported via an ODU2 server trail. The ODU2 server trail allows the service provider to monitor the STM-64 client as it is transported across its network. The topological architecture from the ODU2 network-level perspective can be modeled as an ODU2 subnetwork associated with various access groups (see Figure 2-29).
Figure 2-29. Topological architecture of ODU2 layer network
The ODU2 subnetwork can be further partitioned into four smaller ODU2 subnetworks corresponding to four OTN cross-connect fabrics, connected via ODU2 links (see Figure 2-30).
Figure 2-30. Partitioning of ODU2 layer topological architecture
An ODU2 network connection is set up that originates and terminates within ODU2 access groups, across the ODU2 links and ODU2 subnetworks. The network connection is partitioned into three ODU2 link connections and two ODU2 subnetwork connections (see Figure 2-31).

Figure 2-31. Partitioning of ODU2 network connection
Above, we modeled the topology of both the STM-64 service layer and the ODU2 layer networks. Notice that when describing the topological architecture from a network-level perspective, we focus on a particular layer network; in other words, we take a horizontal snapshot of the network. In modeling the transport functions from an equipment-level perspective, we take a vertical snapshot of the network, focusing on termination and adaptation of multiple layer networks. To illustrate the modeling of transport functions from an equipment-level perspective, we will focus on the OTN equipment used to support the ODU2 links and subnetworks modeled above. In this example, the service provider uses OTN line systems interconnected via OTN cross-connects to support the network architecture. Figure 2-32 illustrates the physical OTN network architecture, which is the physical equipment implementation of the network architecture models presented above.
Figure 2-32. Physical OTN network architecture
The OTN equipment transport functions, specifically the OTN DWDM line systems and the OTN cross-connects, can be modeled as shown in Figure 2-33. The OTN DWDM line terminals at the edge of the service provider network terminate the OS64 layer network and map the CBR10G layer network characteristic information into an ODU2P server layer. The ODU2P layer network is subsequently mapped into the OTU2 layer network, which is supported via an OCh layer network. Multiple optical channels are multiplexed into the OMSn layer network, which is then supported via the OTSn layer network. The ODU2 subnetwork is implemented via the OTN
cross-connect. The OTN cross-connect demultiplexes/multiplexes the ODU2 connections from/to the OTM-n.m line interface. The OTSn, OMSn, OCh, and OTU2 layers are terminated, and the ODU2 logical signal is cross-connected.
Figure 2-33. Functional model of an OTN network
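The layer stack traversed at the line terminal, and the termination performed at the cross-connect, can be sketched as nested adaptations. The stack list and helper names below are our own simplification (for instance, "ODU2" stands in for the ODU2P trail); each mapping is modeled simply as wrapping the client's characteristic information.

```python
# Minimal sketch (our own naming) of the server-layer chain described
# above: the STM-64 client is carried as CBR10G, mapped into an ODU2
# trail, then OTU2, OCh, OMSn, OTSn.

OTN_LINE_STACK = ["CBR10G", "ODU2", "OTU2", "OCh", "OMSn", "OTSn"]

def map_down(client_ci, stack=OTN_LINE_STACK):
    """Adapt the client into each successive server layer (source side)."""
    signal = client_ci
    for layer in stack:
        signal = {"layer": layer, "payload": signal}
    return signal

def terminate_up(signal, down_to="ODU2"):
    """Terminate server layers (sink side) until the requested layer is
    exposed -- e.g. an OTN cross-connect terminates OTSn, OMSn, OCh and
    OTU2 and switches the exposed ODU2."""
    while signal["layer"] != down_to:
        signal = signal["payload"]
    return signal
```

A line terminal runs `map_down` end to end; the cross-connect in the middle runs `terminate_up(..., "ODU2")`, switches the result, and re-maps it toward the next line interface.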
2.2.9 Equipment Control
The functional model of an individual equipment describes the way in which the equipment accepts, processes, and forwards the information contained in a signal. Transmission and equipment supervision processes are concerned with the management of the transmission resources in the network and involve the equipment functionality. Description of these processes requires a functional representation of equipment that is implementation independent. We assume that the manager of the transport equipment has no knowledge of the internal equipment implementation, so equipment faults are recognized as the unavailability of the affected functions. Most atomic functions monitor the signals they are processing for certain characteristics
and provide performance information or defect information based on those characteristics. Therefore, transmission supervision processing provides information about the external interface signals that are processed by equipment. Equipment supervision processing is concerned with the fault localization and repair of the equipment itself. Its purpose is to answer the classic question "Who to send where to repair what?" with a single replaceable unit at a single location. This desire to report only a single fault stems from the need to avoid overloading management systems and personnel with unnecessary information. It does not require knowledge of the transmission network, other than that the equipment faults may have been categorized to indicate the severity (e.g., prompt, deferred, maintenance event information) of the fault.

Just as we have seen that the network topology can be described recursively (e.g., subnetworks can be partitioned into smaller subnetworks), we can also describe equipment recursively. Specifically, equipment is considered to be built from containers (i.e., racks, which contain shelves, which contain plug-in units), where the limit of equipment recursion is a replaceable unit. As its name suggests, a replaceable unit is the smallest piece of equipment that can be replaced, and is usually a plug-in unit. It also represents the lowest level of granularity needed for equipment fault reporting. Since a manager has no knowledge of the internal details of equipment, it clearly makes no sense to burden the manager with fault reports about these details. This point is important to keep firmly in mind when working with the functional model, because many functions are frequently packaged on the same replaceable unit, and it can be very tempting, but not useful, to consider reporting detailed fault information for each function.
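The containment recursion (racks contain shelves, which contain plug-in units) and the replaceable-unit granularity of fault reporting can be sketched as a tree walk. The dictionary shape and names here are illustrative assumptions, not from the standards: a fault in any packaged function is reported as the path down to the one replaceable unit hosting it.

```python
# Hypothetical sketch of equipment containment recursion: the walk
# stops at the replaceable unit, answering "who to send where to
# repair what" with a single unit at a single location.

def locate_replaceable_unit(container, faulty_function):
    """Walk the containment tree (rack -> shelf -> plug-in unit) and
    return the path down to the replaceable unit hosting the faulty
    function, or None if the function is not packaged here."""
    if faulty_function in container.get("functions", ()):
        return [container["name"]]
    for child in container.get("children", ()):
        path = locate_replaceable_unit(child, faulty_function)
        if path is not None:
            return [container["name"]] + path
    return None
```

Note that the manager only ever sees the unit path, never which of the several functions packaged on that unit actually failed, consistent with the reporting granularity discussed above.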
Figure 2-34 below illustrates atomic function information points, i.e., information exposed by a function that is not part of the payload or its standard overhead. Information points include management, timing, and remote points, which generate management information (MI), timing information (TI), and remote information (RI), respectively. Timing and management points can connect to any atomic function. (An independent functional model should be developed for a timing information network. Though we will not describe TI further in this chapter, ETS 300 417-6-1 [17] and G.806 [15] contain timing atomic functions.)

Further detail on function inputs and outputs is illustrated in Figure 2-35. Here the vertical flows represent the payloads, designated as characteristic information (CI) and adapted information (AI).
Figure 2-34. Atomic function information points
Figure 2-35. Atomic function inputs and outputs
2.2.10 Equipment Supervisory Process

In this section, we introduce the basic concepts, terminology, and constructs necessary for utilization of equipment supervisory process concepts.

2.2.10.1 Basic Concepts
Establishing network and equipment level functional models according to the principles described in Section 2.3.2 provides us the necessary foundation for understanding the concepts associated with equipment supervisory processes. In particular, we will address processing, which takes place within an atomic function to derive basic network management information from the signal, and its preparation for passage to the element-level processing that derives alarms and performance parameters. Processing within an atomic function encompasses such steps as detailed definition of the information to be made available to a management system and detailed definition of the parameters that must be provisionable by an operator or management system. Element-level processing refers to the correlation and analysis of information provided by several atomic functions to provide a reduced amount of higher-level information to the operator. This processing leads to alarms, performance reports, and lighting of indications on the equipment. In summary, this data is the most detailed available on client signals.

The supervision process describes the way in which the actual occurrence of a disturbance or fault is analyzed, with the purpose of providing an appropriate indication of performance and/or detected fault condition to maintenance personnel. In general, performance monitoring involves the continuous collection, analysis, and reporting of performance data associated with a transmission entity. It refers to the set of functions and capabilities necessary for equipment to gather, store, threshold, and report performance data associated with its monitored transmission entities. These performance-related data elements are termed performance parameters. Performance-monitoring parameters can be generated in trail termination, adaptation, and connection functions. The actual signal parameters monitored are technology dependent.
Performance parameters are normally gathered under in-service, nonfailure conditions and are typically accumulated (and stored) over predetermined accumulation periods. Performance history data is useful for verifying customer trouble reports and responding to alerts so as to quickly
assess the recent performance of transport systems (determine the Quality of Service) and to sectionalize the trouble or degradation (for example, to locate sources of intermittent errors). This history can also be used in performance assessment against long-term performance objectives [18-19]. The equipment management function performance monitoring process collects the events associated with trail, link, and protection performance parameters. It counts the events to derive the performance parameters, and stores these parameters for later retrieval.

In general, fault (or alarm/status) monitoring is a process that tracks failure events so as to contribute to an understanding of the overall transmission performance of an entity. The information conveyed via alarm/status monitoring consists of a set of indications that are maintained by the equipment. The equipment sets and clears indications according to well-defined criteria based upon the occurrence and duration of specific events. Some events immediately lead to indications, while others must persist for a specified amount of time prior to the setting of an indication. Alarm and status indications are generally reported under failure events. Alarm/status monitoring and performance monitoring complement one another [20]. Fault-monitoring parameters can be generated in trail termination, adaptation, and connection functions.

Each atomic function and performance monitoring process generates and delivers a fault cause to the Element Management Function (EMF) fault management process, and a performance indication to the EMF performance monitoring process. The EMF fault management process within equipment performs a persistency check on the fault causes before it declares a fault cause a failure. The failure is reported via an output failure report and by means of alarms (audible and visible indicators). These functions of the supervision process are illustrated in Figure 2-36 below.
Figure 2-36. Functions of the supervision process
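The gather/store/threshold/report cycle for performance parameters can be sketched as follows. The 15-minute period and the threshold value are illustrative assumptions only; in practice accumulation periods and thresholds are provisioned per parameter.

```python
# Illustrative performance-monitoring sketch: error events are counted
# into fixed accumulation periods ("registers"), and periods whose
# count exceeds a provisioned threshold are flagged for a
# threshold-crossing report to the management system.

def accumulate(events, period=900, threshold=10):
    """events: iterable of (timestamp_seconds, error_count) pairs.

    Returns the per-period registers {period_start: count} and the
    sorted list of period starts that crossed the threshold."""
    registers = {}
    for ts, count in events:
        start = ts - ts % period
        registers[start] = registers.get(start, 0) + count
    crossings = sorted(s for s, c in registers.items() if c > threshold)
    return registers, crossings
```

The stored registers correspond to the performance history described above: they let an operator verify a customer trouble report after the fact and sectionalize intermittent degradations.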
2.2.10.2 Terminology and Constructs
The following terms are used to describe the supervision process: anomaly, defect, fault, fault cause, failure, consequent action, and alarm. Performance primitives are basic performance-related occurrences detected by monitoring the signal, and these impairment events give rise to various performance parameters. Primitives are grouped into categories of anomalies and defects. Performance parameters are derived from the processing of performance primitives, and the associated terms are defined below [18-19]:
• Anomaly: A performance anomaly is a discrepancy between the actual and desired characteristics of an item. The desired characteristic may be expressed in the form of a specification. An anomaly may or may not affect the ability of an item to perform a required function.
• Defect: A performance defect is a limited interruption in the ability of an item to perform a required function. It may or may not lead to maintenance action, depending on the results of additional analysis. Successive anomalies causing a decrease in the ability of an item to perform a required function are considered to be a defect.
• Fault: A fault is the inability of a function to perform a required action. Faults do not include an inability due to preventive maintenance, lack of external resources, or planned actions.
• Consequent action: The action taken in response to an anomaly, defect, or fault.
• Fault cause: A single disturbance or fault may lead to the detection of multiple defects. A fault cause is the result of a correlation process that is intended to pinpoint the defect that is representative of the disturbance or the fault that is causing the problem.
• Failure: Performance failures refer to the termination of an item's ability to perform a required function. In equipment, both local and remote failures can be observed. Local failures involve near-end signal failures, and remote failures are those that occur and are recognized elsewhere and are reported within the transmission system. (While SDH transports most indications in special channels embedded in the signal, this is not necessary for the working of the model.)
• Alarm: Alarms are specific (human observable) types of notifications concerning detected failures (or abnormal conditions), usually giving an indication of the severity of the failure. Typically, alarms can be divided into unit-level alarms, equipment-level alarms, and central office/station alarms.
Detected anomalies are processed as described below:
• Anomalies are subjected to a check to identify defects;
• Certain defects initiate consequent actions;
• Defects are correlated to identify the probable fault cause; and
• Near-end and far-end defect and performance impairment indicators are counted.
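The chain from anomalies to a defect, from defects to a single fault cause, and from a persistent fault cause to a declared failure can be sketched as three small steps. All thresholds, defect names, and the priority ordering below are illustrative assumptions, not values from the standards.

```python
# Simplified sketch of the supervision chain: integrate, correlate,
# then apply a persistency check before declaring a failure.

def integrate_anomalies(anomaly_seconds, min_consecutive=3):
    """Declare a defect after enough consecutive anomalous seconds
    (the anomaly-to-defect integration step)."""
    run = 0
    for bad in anomaly_seconds:
        run = run + 1 if bad else 0
        if run >= min_consecutive:
            return True
    return False

def correlate(defects, priority=("LOS", "LOF", "AIS", "DEG")):
    """Pick the single most significant defect as the fault cause, so
    one disturbance is not reported many times (correlation step)."""
    for cause in priority:
        if cause in defects:
            return cause
    return None

def persistency_check(cause_duration_s, declare_after_s=2.5):
    """A fault cause becomes a failure only after it persists (the
    EMF persistency check)."""
    return cause_duration_s >= declare_after_s
```

Together these mirror Figure 2-36: atomic functions perform the first two steps and hand a fault cause to the EMF, which performs the persistency check and raises the failure report and alarms.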
2.2.10.3 Utilization of Concepts
The atomic function fault and performance process can be further expanded to show its internal workings. As an example, the fault and performance processes for the trail termination function are depicted in Figure 2-37. While the details shown are technology specific, the basic functions are completely generic.

Figure 2-37. Fault and performance processes for the Trail Termination Function
The EMF fault management process is illustrated in Figure 2-38 below.
Figure 2-38. Element Management Function fault management process
Similarly, for the EMF performance monitoring process (Figure 2-39), the individual performance parameter outputs are fed to functions that determine near- and far-end performance according to appropriate standardized performance parameters (e.g., errored seconds) and are next forwarded to the performance monitoring history process. As a final step, performance management reports may be provided on a selective basis.
Figure 2-39. Element Management Function performance monitoring process
This section has dealt with equipment supervision from the perspective of processes that have been described in equipment standards. In real implementations, the need to reduce the amount of data sent to management systems and personnel leads to internal proprietary mechanisms that are not part of the standard. For example, most switch fabrics use some sort of internal continuity check to generate appropriate fault indications, yet these checks are not part of any standard. In the language of functional modeling, they can be modeled as internal trails in exactly the same way as has been discussed for external trails.

2.2.10.4 Application Example
While the atomic function fault and performance monitoring processes are internal to the atomic function, the results of the processing are generally shown as function indication inputs and outputs. An example using the VC-4 trail termination sink function is described below and illustrated in Figure 2-40. In addition to the signal inputs and outputs described in the preceding sections, the termination sink function is considered to have four sets of management-related interfaces:
• An input interface from the server layer that accepts indications about the health of the server layer.
- These indications can be used to represent a failure or degradation in the server layer network.
• An output interface to the client layer that provides indications about the health of the trail provided by this layer.
- These indications can inform the next downstream adaptation function of a signal fail or signal degrade condition of the associated trail.
• An output interface carrying remote indications from the far end sink termination.
- These indications can convey the defect status of, or the exact or truncated number of error detection code violations within, the characteristic information received by the trail termination sink function back to the network element that contains the trail termination source function originating the characteristic information. It is used for single-ended monitoring of the entire trail (i.e., the two extremities).
• An input/output interface carrying local management indications allowing for the provisioning of threshold data and any local control that may be provided. These indications:
- Enable control (start and stop) of the trail termination fault monitoring reporting activities
- Enable control of the server signal failure (SSF) report, when detected, to the adaptation function
- Provide the expected VC-4 trace identifier (the received path identifier has to match the expected one to prevent misconnections)
- Enable the report of remote defect indicator (RDI) and server signal failure (SSF) to the management interface. This feature allows monitoring of the far-end trail termination function defects from the near-end trail termination function.
- Allow provisioning of the minimum number of errors to be detected by this function over a one-second time period required to declare the second as a bad second.
The naming conventions for these management-related interfaces are provided below [19]:
• Input interface to server layer — _CI_[SSF]
• Output interface to client layer — _AI_[TSF, TSD]
• Output interface carrying remote indications from the far end sink termination — _RI_[RDI, REI]
• Input/output interface carrying local management information — _TT_SK_MI_[TPmode, SSF_reported, ExTI, RDI_reported, DEGTHR] (input or output signals)
where
SSF = Server signal fail
SD = Server signal degrade
TSF = Trail signal fail
TSD = Trail signal degrade
RDI = Remote defect indication
REI = Remote error indication
TT_SK = Trail termination sink
TPmode = Trail termination fault reporting control
ExTI = Expected trace identifier
DEGTHR = Degraded threshold
Figure 2-40. VC-4 Trail Termination Sink processing (inputs: S4_CI_[SSF] and S4_TT_SK_MI_[...]; outputs: S4_AI_[TSF,TSD] and S4_RI_[RDI,REI])
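The local management (MI) inputs listed above amount to a small provisioning record for the trail termination sink. The sketch below is illustrative only; the class and field names are invented here, following the G.783-style abbreviations quoted in the text rather than any concrete API:

```python
from dataclasses import dataclass

@dataclass
class VC4TrailTerminationSinkMI:
    """Illustrative model of the S4_TT_SK management-input (MI) signals."""
    tp_mode: str        # trail termination fault reporting control (e.g., "monitored")
    ssf_reported: bool  # report server signal fail to the management interface?
    rdi_reported: bool  # report remote defect indication to the management interface?
    ex_ti: bytes        # expected trace identifier, compared against the received one
    deg_thr: int        # errors per second needed to declare a "bad second"

# Hypothetical provisioning: enable fault reporting, set a degraded threshold
mi = VC4TrailTerminationSinkMI("monitored", True, True, b"NODE-A/VC4-17", 2400)
print(mi.deg_thr)  # -> 2400
```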
2.2.11 Modeling Connectionless Layer Networks
Just as the G.805 modeling approach allowed us to analyze the transport network and identify generic functionality that is independent of implementation within connection-oriented networks, G.809 does the same for connectionless networks. The same types of architectural components used to describe connection-oriented networks (such as topological components, transport entities, transport processing functions, and reference points) are used to describe connectionless networks. The concepts of layering and partitioning that were fundamental in describing the topological aspects of a connection-oriented network are similarly used in describing connectionless networks. Whereas the concept of a connection is central to the description of connection-oriented networks, such a construct is inappropriate for the description of connectionless networks. Within connectionless networks, the analogue of the connection is the concept of a flow, which is an aggregation of one or more traffic units or datagrams with an element of common routing. The concept of a flow uses the principle of recursion in that a flow can contain another flow, until the limit of recursion is reached when a flow contains exactly one datagram. In substituting the concept of a flow for the connection, equivalent components within the connectionless network may be defined. The topology of connectionless layer networks is described by access groups, flow domains, and flow point pool links, which correspond to access groups, subnetworks, and links within connection-oriented networks. In addition,
several kinds of flows may be identified, depending on the layer and partition traversed by the flow. These are the link flow, flow domain flow, network flow, and a connectionless trail, which correspond to link connections, subnetwork connections, network connections, and trails within connection-oriented networks. Finally, a fundamental difference between connection-oriented networks and connectionless networks is the directionality of transmission that may be supported. Whereas connection-oriented networks may support either bidirectional or unidirectional transmission, connectionless networks always transfer data unidirectionally.
2.2.12 Summary
The functional modeling method utilized within these dimensions is based on a recursive decomposition technique that decomposes the network into layer networks, layer networks into atomic functions, atomic functions into processes, and processes into detailed requirements. The resulting set of processes and atomic functions establishes a kind of library of components that can be used to describe many types of equipment. As such, there are no longer any mandatory requirements for particular equipment in standards. A requirement is mandatory for a piece of equipment only if that equipment includes an atomic function containing a process that is stated to support the requirement. It becomes the manufacturer's responsibility to select the proper subset of atomic functions within each of the network elements offered. (Further reading material related to the usage of functional modeling in networks and equipment may be found in references [21-23].)
2.3. NOTES
1. Portions of the material in this chapter appeared in or are updates of material originally published in Varma, E.L., T. Stephant, et al., Achieving Global Information Networking, Artech House, Norwood, MA, 1999. Replicated material is reprinted here, and updates made, with permission of the publisher.
2.4. REFERENCES
[1] S. Brown, "A Functional Description of SDH Transmission Equipment," BT Technology Journal, Vol. 14, No. 2, April 1996.
[2] ITU-T Recommendation G.805, Generic functional architecture of transport networks, March 2000.
[3] ITU-T Recommendation G.803, Architecture of transport networks based on the synchronous digital hierarchy (SDH), March 2000.
[4] ITU-T Recommendation G.783, Characteristics of SDH equipment functional blocks, February 2004.
[5] ITU-T Recommendation G.841, Types and characteristics of SDH network protection architectures, October 1998; Corrigendum 1, August 2002.
[6] ITU-T Recommendation G.842, Interworking of SDH network protection architectures, April 1997.
[7] ITU-T Recommendation G.872, Architecture of optical transport networks, November 2001; Amendment 1, December 2003.
[8] ITU-T Recommendation G.798, Characteristics of optical transport network equipment functional blocks, January 2002; Amendment 1, June 2002.
[9] ITU-T Recommendation G.873.1, Optical Transport Network (OTN): Linear protection, March 2003.
[10] ITU-T Recommendation G.809, Functional architecture of connectionless layer networks, March 2003.
[11] ITU-T Recommendation G.8010, Architecture of Ethernet layer networks, February 2004.
[12] ANSI T1.105-1995, Synchronous Optical Network (SONET) - Basic Description including Multiplex Structure, Rates, and Formats.
[13] ITU-T Recommendation G.707, Network node interface for the Synchronous Digital Hierarchy (SDH), December 2003.
[14] ITU-T Recommendation G.709, Interfaces for the Optical Transport Network (OTN), February 2001.
[15] ITU-T Recommendation G.806, Characteristics of Transport Equipment - Description Methodology and Generic Functionality, February 2004.
[16] J. Manchester and P. Bonenfant, "Fiber optic network survivability: SONET/optical protection layer interworking," in Proc. of NFOEC'96, Denver, CO, 1996.
[17] ETSI EN 300 417-6-1, Generic requirements of transport functionality of equipment; Part 6-1: Synchronization layer functions, May 1999.
[18] ANSI T1.231-1997, Digital Hierarchy - Layer 1 In-Service Digital Transmission Performance Monitoring.
[19] ITU-T Recommendation G.784, Synchronous digital hierarchy (SDH) management, July 1999.
[20] ETSI EN 300 417-7-1, Generic requirements of transport functionality of equipment; Part 7-1: Equipment management and auxiliary layer functions, October 2000.
[21] Mike Sexton and Andy Reid, Transmission Networking: SONET and the Synchronous Digital Hierarchy, Artech House, 1992.
[22] Mike Sexton and Andy Reid, Broadband Networking: ATM, SDH, and SONET, Artech House, 1997.
[23] Huub van Helvoort, SDH/SONET Explained in Functional Models, John Wiley & Sons, 2005.
Chapter 3 INTERFACES FOR OPTICAL TRANSPORT NETWORKS
Timothy P. Walker* and Khurram Kazi**
*AMCC, **SMSC
3.1.
INTRODUCTION
The advancements in optics, photonics, electronics, software, and human expertise ensure the continual evolution of telecommunication and data networks. Over and over again, we have seen that present-day transport networks become tomorrow's access networks. We have watched Plesiochronous Digital Hierarchy (PDH) networks, once the backbone of the network operators, become the feeding data pipes to Synchronous Optical Network/Synchronous Digital Hierarchy (SONET/SDH) networks as optical fiber was laid ever more widely. The single optical channel transport system based on SONET/SDH standards proved to be very successful for its time. With the advent of the Dense Wavelength Division Multiplexing Optical Add-Drop Multiplexer (DWDM-OADM), optical amplifiers, and Optical Cross-Connects (OXCs), we are witnessing a paradigm shift in which SONET/SDH networks are slowly becoming the feeding pipes to an "all-optical" transport network, better known as the Optical Transport Network (OTN). Since the telecom bubble burst, the rush to deploy OTN has been much slower than originally anticipated. With each paradigm shift, the standards committees are faced with the challenge of defining new standards that become the foundation or the blueprints [1] for the network service providers, equipment manufacturers, and users alike. Every effort is made to ensure that such standards are
general enough not to be limited by present technology, yet provide enough information that interoperability between different service providers or equipment manufacturers is seamless. The International Telecommunication Union Telecommunication Standardization Sector (ITU-T) has been leading the efforts in defining the optical networking standards. As we saw in Chapter 2, at the very outset of the development of the OTN standards, it was realized that a framework, or set of conventions, had to be developed. ITU-T Rec. G.805 [2] established the rules and formulated the fundamental vocabulary used in capturing and analyzing the requirements of transport networks.
3.2.
OTN STANDARDS
Many standards fall under the umbrella of OTN. In this chapter, we focus on Layer 1 standards and will not cover the physical or optical layers. Chapters 17 through 19 cover the relevant standards regarding intra-network element communication, physical electronic links, and optical properties. The Optical Transport Hierarchy (OTH) is a new transport technology for the OTN. It is based on the network architecture defined in ITU-T G.872 [3], Architecture for the Optical Transport Network (OTN), which defines an architecture composed of the Optical Channel (OCh), Optical Multiplex Section (OMS), and Optical Transmission Section (OTS), and which also describes the functionality needed to make the OTN work. It is worth noting a decision made during the development of G.872, recorded in Section 9.1/G.872 [3]:
During the development of ITU-T Rec. G.709 (implementation of the Optical Channel Layer according to ITU-T Rec. G.872 requirements), it was realized that the only techniques presently available that could meet the requirements for associated OCh trace, as well as providing an accurate assessment of the quality of a digital client signal, were digital techniques.... For this reason ITU-T Rec. G.709 chose to implement the Optical Channel by means of a digital framed signal with digital overhead that supports the management requirements for the OCh listed in clause 6. Furthermore this allows the use of Forward Error Correction for enhanced system performance. This results in the introduction of two digital layer networks, the ODU and OTU. The intention is that all client signals would be mapped into the Optical Channel via the ODU and OTU layer networks.
The above statements can be attributed to the fact that the use of optical amplifiers and DWDM components significantly increased the transport capacity of the optical fiber. Hundreds of narrowly spaced wavelengths can
carry individual optical channels over ever-increasing distances. With these advancements came the problems associated with "analog" transmission. Due to the characteristics of the optical fiber and the related optical components, the optical signals eventually have to be regenerated. Moreover, it is possible that no single network operator will own the end-to-end path. Thus, appropriate management and control information for "signal handoff" needs to be passed from one operator's domain to another's without revealing the details of the two respective networks. This requirement resulted in intelligent processing in the digital domain at naturally occurring 3R points (Retime, Reshape, and Reamplify) or at domain boundaries. Another requirement placed on the next-generation optical transport network is that it should be able to transport any client signal transparently. Figure 3-1 summarizes the requirements placed on the OTN [1].
Figure 3-1. 3R points within Optical Transport Networks (an optically transparent subnetwork bounded by feature-enhanced electronics; OLS: Optical Line System; OXC: Optical Cross-Connect; OADM: Optical Add-Drop Mux; 3R: Retime, Reshape, and Reamplify)
Some characteristics that the standards committee members were keeping in mind during the evolution of the OTN were to provide
• An efficient transport networking layer that supports very high rate services at 2.5, 10, and 40 Gb/s
• A strong Forward Error Correction (FEC)
• An increased number of Tandem Connection Monitoring (TCM) layers
• Management of frequency slots (OCh signals of multiple λs) instead of time slots (STS-1, etc.)
• Service transparency for any client signal transport, for example, Ethernet, SONET/SDH, ATM, IP, Fibre Channel, Digital Video Broadcast, or proprietary data streams
• The shortest and most efficient physical layer stack for data services transport
• The gigabit-level bandwidth granularity required to scale and manage multiterabit networks
• A means to avoid very large numbers of fine granular pipes stressing network planning, administration, and management
• A means to select the frequencies such that they can be transmitted over existing transoceanic optical line systems with 10G wavelengths
It should be noted that unlike the SONET/SDH standard, which was designed around a mux structure starting from VT1.5/VC12 that filled the payload with voice-centric 64 Kb/s traffic, one of the major rationales behind the development of OTN standards was to solve the transport problems at higher speeds and not be restricted by any legacy work. The two relevant standards that cover the implementation of OTH are G.709 [4] and G.798 [5].
3.3.
STANDARDIZED INTERFACES
The optical transport network architecture as specified in ITU-T G.872 defines two interface classes:
• Interdomain interface (IrDI)
• Intradomain interface (IaDI)
The OTN IrDI interfaces are defined with 3R processing at each end of the interface, i.e., the interface between operators. It can also be thought of as the interface between different vendors within the same operator, as depicted in Figure 3-2.
Figure 3-2. IaDI vs. IrDI
The IaDI comprises the interfaces within an operator's or a vendor's domain. G.709 applies to information transferred across both the IrDI and the IaDI. G.709 compliance at the IrDI by itself is not sufficient to guarantee a midspan meet, since G.709 does not specify the electrical or optical interfaces. It is important to note that G.709 defines only logical interfaces.
3.4.
FORWARD ERROR CORRECTION
Forward error correction is a major feature of the OTN: the FEC check information is appended to the G.709 frame (making it "out-of-band" FEC), which lends itself to strong FEC. The FEC defined for SONET/SDH networks uses undefined section overhead bytes to transport the FEC check information and hence is referred to as in-band FEC. It allows only a limited amount of FEC check information, which limits the performance of the FEC. For the OTN, a Reed-Solomon 16-byte interleaved FEC scheme is defined that uses 4 × 256 bytes of check information per Optical channel Data Unit (ODU) frame. In addition, enhanced (proprietary) FEC schemes are explicitly allowed and are widely used, as described in ITU-T Recommendation G.975.1 [6]. FEC has been proven effective in Optical Signal-to-Noise Ratio (OSNR) limited systems as well as in dispersion-limited systems. As for nonlinear effects, reduction of the output power leads to OSNR limitations, against which FEC is useful. FEC is less effective against Polarization Mode Dispersion (PMD). However, G.709 defines a strong FEC for the OTN that can result in up to a 6.2 dB improvement in Signal-to-Noise Ratio (SNR). Another way of looking at this: a signal can be transmitted at a certain Bit Error Rate (BER) with 6.2 dB less power than would be needed without such an FEC. The coding gain provided by the FEC can be used to
• Increase the maximum span length and/or the number of spans, resulting in an extended reach (note that this point assumes that other impairments
such as chromatic and polarization mode dispersion are not becoming limiting factors)
• Increase the number of Dense Wavelength Division Multiplexing (DWDM) channels in a DWDM system that is limited by the output power of the amplifiers, by decreasing the power per channel and increasing the number of channels (note that changes in nonlinear effects due to the reduced per-channel power have to be taken into account)
• Relax the component parameters (e.g., launched power, eye mask, extinction ratio, noise figures, and filter isolation) for a given link and lower the component costs
• Most importantly, make the FEC an enabler for transparent optical networks
Transparent optical network elements like Optical Add-Drop Multiplexers (OADMs) and Photonic Cross-Connects (PXCs) introduce significant optical impairments (e.g., attenuation). The number of transparent optical network elements that can be crossed by an optical path before 3R regeneration is needed is therefore strongly limited. With FEC, an optical path can cross more transparent optical network elements. This allows the evolution from point-to-point links to transparent, meshed optical networks with sufficient functionality.
3.4.1
Theoretical Description
G.709 FEC implements a Reed-Solomon RS(255,239) code. A Reed-Solomon code is specified as RS(n,k) with s-bit symbols, where n is the total number of symbols per codeword, k is the number of information symbols, and s is the size of a symbol. A codeword consists of data plus parity, also known as check symbols, appended to the data. The check symbols are extra redundant bytes used to detect and correct errors in a signal so that the original data can be recovered. For G.709:
s = size of the symbol = 8 bits
n = symbols per codeword = 255 bytes
k = information symbols per codeword = 239 bytes
A typical system is shown in Figure 3-3.
Figure 3-3. FEC block diagram (data source → Reed-Solomon encoder → communication channel, subject to interference → Reed-Solomon decoder → data sink)
This setup means that the encoder takes k information symbols of s bits each and adds check symbols to make an n-symbol codeword; there are n − k check symbols of s bits each. A Reed-Solomon decoder can correct up to t symbols that contain errors in a codeword, where 2t = n − k. Figure 3-4 shows a typical Reed-Solomon codeword:
Figure 3-4. Reed-Solomon codeword (n symbols: k data symbols followed by 2t parity symbols)
For the RS(255,239) code recommended by the G.709 standard:

2t = n − k = 255 − 239 = 16

Hence, the decoder can correct errors in any eight symbols of a codeword. Reed-Solomon codes treat errors on a symbol basis; therefore, a symbol that contains all bits in error is as easy to detect and correct as a symbol that contains a single bit error. That is why the Reed-Solomon code is particularly well suited to correcting burst errors (where a series of bits in the codeword are received in error by the decoder). Given a symbol size s, the maximum codeword length n for a Reed-Solomon code is

n = 2^s − 1 = 255

Interleaving data from different codewords improves the efficiency of Reed-Solomon codes because the effect of burst errors is shared among many other codewords. Interleaving spreads the impact of a noise burst over multiple symbols that come from several codewords. As long as each deinterleaved codeword has fewer errors than it can correct, the interleaved
group of codewords will be corrected. If excessive errors are encountered, some codewords may be corrected and some not. Interleaving effectively pools the error-correction power of all of the codewords in the interleaved group, whose size is the depth of the interleaver. This allows a higher code rate and channel efficiency while still protecting against an occasional very long error burst. For example, if 64 codewords that can correct 8 errors each are interleaved, the interleaved group can correct almost any combination of symbol errors that totals fewer than 512, whether all 512 occur in one long burst, appear as 512 one-symbol errors, or fall in any combination in between. Both ITU-T G.709 and ITU-T G.975 specify interleaving as part of the transport frame to improve error-correction efficiency.
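The RS(255,239) parameter arithmetic and the interleaving capacity described above can be checked in a few lines. This is a sketch of the parameter relationships only, not an actual Reed-Solomon codec; the interleave depth of 16 follows the 16-byte interleaving mentioned earlier in this section:

```python
# RS(n, k) over s-bit symbols, per the G.709 choice of RS(255, 239)
s = 8                # bits per symbol (one byte)
n = 2**s - 1         # maximum codeword length: 255 symbols
k = 239              # information symbols per codeword
t = (n - k) // 2     # correctable symbols per codeword, since 2t = n - k

print(n, n - k, t)   # -> 255 16 8

# Interleaving to depth D lets the group absorb up to D*t symbol errors,
# however they are distributed -- e.g., one long burst.
D = 16               # G.709 byte-interleaves 16 codewords per frame row
print(D * t)         # -> 128
```

With the text's example depth of 64 instead of 16, the same arithmetic gives 64 × 8 = 512 correctable symbol errors per interleaved group.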
3.4.2
Coding Gain
The advantage of using FEC is that the probability of an error remaining in the decoded data is lower than the probability of an error if an FEC algorithm, such as Reed-Solomon, is not used. This outcome is, in essence, the coding gain. Coding gain is the difference in input SNR for a given output BER. The input SNR is measured as the "Q factor," as Eb/N0, or as OSNR. The "Net Coding Gain" takes into account the 7% rate expansion due to the FEC: the data rate had to increase by 7% in order to transmit both the data and the FEC check information.

3.4.2.1
Coding Gain measured via Q Factor
The widely used technique of measuring coding gain is the Q-factor (quality factor) measurement. This technique estimates the OSNR at the optical amplifier or receiver by measuring BER vs. voltage threshold at voltage levels where the BER can be accurately determined (see Figures 3-5 and 3-6). In reality, however, the Q-factor is derived from the measurement of the eye-pattern signal. It is defined as the ratio of peak-to-peak signal to total noise (conventionally electrical):

Q = (μ1 − μ0) / (σ0 + σ1)

where μ1 and μ0 are the mean signal levels of logic level 1 and logic level 0, and σ1 and σ0 are the respective standard deviations.
Figure 3-5. Eye diagram (showing the logic-level means μ1, μ0 and noise standard deviations σ1, σ0)

A system that requires an operating BER of 10^-15 has a Q-factor measurement of 18 dB without FEC. If RS(255,239) FEC is employed, the required Q-factor measurement decreases to 11.8 dB, yielding 6.2 dB of coding gain.
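The Q-factor arithmetic above can be sketched as follows. The eye-statistic values are made up for illustration; the 20·log10 convention for expressing Q in dB is an assumption consistent with the 18 dB and 11.8 dB figures just quoted, and the Gaussian estimate BER ≈ 0.5·erfc(Q/√2) is the usual textbook approximation, not something the text itself derives:

```python
import math

def q_factor(mu1, mu0, sigma1, sigma0):
    """Q = (mu1 - mu0) / (sigma0 + sigma1), from the eye-diagram statistics."""
    return (mu1 - mu0) / (sigma0 + sigma1)

def q_to_db(q):
    """Express a linear Q value in dB (20*log10 convention)."""
    return 20 * math.log10(q)

def ber_from_q(q):
    """Gaussian-noise estimate: BER ~= 0.5 * erfc(Q / sqrt(2))."""
    return 0.5 * math.erfc(q / math.sqrt(2))

# Hypothetical eye measurement chosen to land near the no-FEC operating point
q = q_factor(mu1=1.0, mu0=0.0, sigma1=0.07, sigma0=0.056)
print(f"Q = {q:.2f} ({q_to_db(q):.1f} dB), BER ~ {ber_from_q(q):.1e}")

# Coding gain quoted in the text: 18 dB required without FEC, 11.8 dB with
print(round(18.0 - 11.8, 1))  # -> 6.2
```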
3.4.2.2
Coding Gain measured via Eb/N0
Another way to measure coding gain is with a plot of BER vs. Eb/N0. Eb is the bit energy and can be described as the signal power (S) times the bit time Tb. N0 is the noise power spectral density and can be described as the noise power (N) divided by the bandwidth (W). Thus Eb/N0 is equal to SNR × (Bandwidth/Bit Rate). For a more thorough discussion, the reader is referred to [7]. Figure 3-7 shows what the BER out of the FEC decoder would be for a given input SNR (Eb/N0). Thus, to operate a system at 10^-12 BER, one would need over 14 dB SNR without FEC but only 8.5 dB with FEC.
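The Eb/N0 relation above reduces to a one-line conversion. In the sketch below the SNR, bandwidth, and bit-rate values are arbitrary illustrations, not figures from the text:

```python
import math

def ebno_db(snr_linear, bandwidth_hz, bit_rate_bps):
    """Eb/N0 = SNR * (W / Rb), returned in dB."""
    return 10 * math.log10(snr_linear * bandwidth_hz / bit_rate_bps)

# Example: a linear SNR of 25.1 (about 14 dB); when the noise bandwidth
# equals the bit rate, Eb/N0 and SNR coincide.
print(round(ebno_db(25.1, 10.7e9, 10.7e9), 1))  # -> 14.0
```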
Figure 3-6. BER vs. Q factor for the RS(255,239) code (t = 8)
Figure 3-7. BER vs. Eb/N0 for uncorrected transmission and with the G.709 FEC
3.4.2.3
Coding Gain measured via OSNR
Figure 3-8 shows the FEC net coding gain (NCG) of various FEC schemes, including both theoretical results and measurements from running systems. Coding gain is the reduction in the required signal-to-noise ratio, at a reference BER, due to the use of the FEC. The Net Coding Gain (NCG) takes into
account the fact that the bandwidth extension needed for the FEC scheme is associated with increased noise in the receiver. For example, consider a reference BER of 10^-15: the SDH in-band FEC provides an NCG of 4 dB, the standard OTN FEC an NCG of 6.2 dB, and an enhanced FEC an NCG of 9.5 dB.
Figure 3-8. Coding gain measured via OSNR: BER vs. OSNR curves for unencoded transmission, SDH in-band FEC, the OTN standard FEC, and OTN enhanced FEC (theoretical and measured)
3.5.
TANDEM CONNECTION MONITORING
SONET/SDH monitoring is divided into section, line, and path monitoring. A problem arises in the "carrier's carrier" situation, as shown in Figure 3-9, where it is necessary to monitor a segment of the path that passes through another carrier's network. Here Operator B is carrying the signal of Operator A, yet Operator A needs to monitor the signal as it passes through Operator B's network. Such a monitored segment is what defines a "tandem connection": a layer between line monitoring and path monitoring. SONET/SDH was modified to allow a single tandem connection; G.709 allows six. TCM1 is used by the users to monitor the Quality of Service (QoS) that they see. TCM2 is used by Operator A to monitor its end-to-end QoS. TCM3 is used by the various domains for intradomain monitoring. TCM4 is used for protection monitoring by Operator B. There is no standard governing which TCM is used by whom; the operators have to have an agreement so that they don't conflict. As we shall
see later in the chapter, TCMs also support monitoring of ODUk (G.709 without the FEC) connections for one or more of the following network applications (outlined in ITU-T G.805 and ITU-T G.872):
Figure 3-9. Tandem Connection Monitoring
• Optical UNI-to-UNI tandem connection monitoring: monitoring the ODUk connection through the public transport network (from public network ingress network termination to egress network termination)
• Optical NNI-to-NNI tandem connection monitoring: monitoring the ODUk connection through the network of a network operator (from operator network ingress network termination to egress network termination)
• Sublayer monitoring for linear 1+1, 1:1, and 1:n optical channel subnetwork connection protection switching, to determine the signal fail and signal degrade conditions
• Sublayer monitoring for optical channel shared protection ring (SPRing) protection switching, to determine the signal fail and signal degrade conditions
• Monitoring an optical channel tandem connection for the purpose of detecting a signal fail or signal degrade condition in a switched optical channel connection, to initiate automatic restoration of the connection during fault and error conditions in the network
• Monitoring an optical channel tandem connection for, e.g., fault localization or verification of delivered quality of service
The number of monitored connections along an ODUk trail may vary between 0 and 6. Monitored connections can be nested, overlapping, and/or cascaded. Nesting and cascading are shown in Figure 3-10. Monitored connections A1-A2/B1-B2/C1-C2 and A1-A2/B3-B4 are nested, while B1-B2/B3-B4 is cascaded.
Figure 3-10. ODUk monitored connections (nesting and cascading of the TCMi OH fields along the A1-A2 trail)
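The nested/cascaded/overlapping distinction is just interval arithmetic on the connection endpoints along the trail. The helper below is hypothetical, and the coordinates are chosen only to mimic the segment layout of Figures 3-10 and 3-11:

```python
def classify(seg_a, seg_b):
    """Classify two monitored connections, each a (start, end) position pair."""
    (a1, a2), (b1, b2) = sorted([seg_a, seg_b])  # order by start point
    if a1 <= b1 and b2 <= a2:
        return "nested"       # one connection lies entirely within the other
    if b1 >= a2:
        return "cascaded"     # the connections follow one another along the trail
    return "overlapping"

# Illustrative positions along the A1..A2 trail
A = (0, 10); B12 = (1, 5); B34 = (6, 9); C12 = (2, 4)
print(classify(A, B12))          # -> nested     (B1-B2 inside A1-A2)
print(classify(B12, B34))        # -> cascaded   (B1-B2 then B3-B4)
print(classify((1, 5), (3, 7)))  # -> overlapping (like B1-B2 vs C1-C2 in Fig. 3-11)
```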
Overlapping monitored connections as shown in Figure 3-11 (B1-B2 and C1-C2) are also supported.
Figure 3-11. Overlapping of ODUk monitored connections (TCM OH field usage for the overlapping segments B1-B2 and C1-C2 along the A1-A2 trail)
3.6.
OTN HIERARCHY OVERVIEW
The OTN allows multiple optical channels to be transported simultaneously, where each channel is nominally mapped to an individual wavelength. Each optical channel is transported in a typical digital frame structure that is made up of payload and overhead fields. As we will see shortly, the client data is mapped into the payload area and the network-related information is carried by the overhead section. The entire structure is partitioned into layers, as depicted in Figure 3-12 [8].
Figure 3-12. OTN layers and containment relationships (OCh Payload Unit, OCh Data Unit, OCh Transport Unit, Optical Channel Carrier, Optical Multiplex Section, Optical Transmission Section, OTM Overhead Signal, Optical Supervisory Channel, Optical Physical Section)
Figure 3-13 [8] shows the partitioning of the layers from a transport network point of view. The OPUk, ODUk, and OTUk are in the electrical domain. The OPUk encapsulates the client signal (e.g., SONET/SDH) and performs any rate justification that is needed. It is analogous to the Path layer in SONET/SDH in that it is mapped at the source, demapped at the sink, and not modified by the network. The ODUk performs functions similar to those of the Line Overhead in SONET/SDH. The OTUk contains the FEC and performs functions similar to those of the Section Overhead in SONET/SDH. After the FEC is added, the signal is sent to a Serializer/Deserializer (SerDes) to be converted to the optical domain. The
OCh is in the optical domain, which in the DWDM context can be envisioned as the mapping of the OTUk onto one of the wavelengths. The Optical Multiplex Section (OMS) can be mapped to the
Figure 3-13. OTN layer network trails and architectural demarcation points (LT: Line Terminal with optical channel multiplexing; OCADM: Optical Channel Add/Drop Multiplexer; OCXC: Optical Channel Cross-Connect; 3R: O/E/O with Reamplification, Reshaping, and Retiming, plus monitoring; R: Repeater)
optical multiplexing function, and the Optical Transmission Section (OTS) is synonymous with the transport of the aggregated wavelengths, or "single signal," over the fiber. However, due to the custom and diverse sets of implementations, the OMS and OTS are not yet defined within the ITU standards. For all practical purposes, only four layers are defined, as depicted in Figure 3-14.
Figure 3-14. Defined OTN hierarchy: client signals (SONET/SDH, constant bit rate, Fibre Channel, IP, Ethernet, etc.) → OPUk → ODUk → OTUk → OCh
Figure 3-15. OTN G.709 frame format
3.7. OTN G.709 FRAME STRUCTURE
Figure 3-15 shows the OTN frame format as defined in G.709. It consists of OTUk, ODUk, and OPUk overhead; an OPUk payload area; and an FEC field. The OPUk field is where client signals such as SONET/SDH, constant bit rate traffic, Generic Framing Procedure (GFP) frames, and encapsulated data (IP, Ethernet, Fibre Channel, etc.), to name a few, are mapped for transport. GFP will be covered in depth in Chapter 5. Figure 3-16 shows the mapping of the varied client signals onto the G.709 frame. One should keep in mind that the OPUk (client signal) can only be of one type; e.g., it can carry SONET/SDH (which can also be considered CBR traffic) or ATM or CBR or GFP-encapsulated traffic. Traffic types cannot be intermixed at the OPUk level. However, as we will see in Chapter 5, various traffic types can be encapsulated within GFP and can subsequently be transported as the "GFP" traffic type within the OPUk structure. The data rates were constructed so that the SONET/SDH signal could be transferred efficiently.
Figure 3-16. Client signal mapping onto the G.709 frame (4 rows × 4080 columns; clients include SONET/SDH, ATM, and CBR, plus MPLS, IP, GbE, and Fibre Channel encapsulated via GFP)
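Assuming the standard G.709 column layout (overhead in columns 1-16, OPUk payload in columns 17-3824, and FEC in columns 3825-4080 of the 4-row by 4080-column byte frame), the per-frame byte budget works out as follows. This is a bookkeeping sketch of the layout, not frame-parsing code:

```python
ROWS, COLS = 4, 4080           # every OTUk frame, independent of k

oh_cols      = 16              # cols 1-16: frame alignment + OTUk/ODUk/OPUk overhead
payload_cols = 3824 - 16       # cols 17-3824: OPUk payload (3808 columns)
fec_cols     = 4080 - 3824     # cols 3825-4080: Reed-Solomon check bytes (256 columns)

assert oh_cols + payload_cols + fec_cols == COLS

print(ROWS * payload_cols)     # -> 15232 payload bytes per frame
print(ROWS * fec_cols)         # -> 1024 FEC bytes per frame (the 4 x 256 noted earlier)
print(ROWS * COLS)             # -> 16320 bytes per OTUk frame
```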
The bit rates of the OTUk, ODUk, and OPUk are shown in Tables 3-1 to 3-4. In Table 3-4, the period is an approximated value that is rounded off to three decimal places.
80
Chapter 3
Table 3-1. OTU types and capacity (from ITU-T Rec. G.709)

OTU type   OTU nominal bit rate            OTU bit-rate tolerance
OTU1       255/238 × 2 488 320 kbit/s      ±20 ppm
OTU2       255/237 × 9 953 280 kbit/s      ±20 ppm
OTU3       255/236 × 39 813 120 kbit/s     ±20 ppm
Table 3-2. ODU types and capacity (from ITU-T Rec. G.709)

ODU type   ODU nominal bit rate            ODU bit-rate tolerance
ODU1       239/238 × 2 488 320 kbit/s      ±20 ppm
ODU2       239/237 × 9 953 280 kbit/s      ±20 ppm
ODU3       239/236 × 39 813 120 kbit/s     ±20 ppm
Table 3-3. OPU types and capacity (from ITU-T Rec. G.709)

OPU type   OPU payload nominal bit rate    OPU payload bit-rate tolerance
OPU1       2 488 320 kbit/s                ±20 ppm
OPU2       238/237 × 9 953 280 kbit/s      ±20 ppm
OPU3       238/236 × 39 813 120 kbit/s     ±20 ppm
Table 3-4. OTUk/ODUk/OPUk frame periods (from ITU-T Rec. G.709)

OTU/ODU/OPU type            Period (Note)
OTU1/ODU1/OPU1/OPU1-Xv      48.971 μs
OTU2/ODU2/OPU2/OPU2-Xv      12.191 μs
OTU3/ODU3/OPU3/OPU3-Xv      3.035 μs
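The entries in Tables 3-1 to 3-4 are related: each OTUk rate is a rational multiple of the base SONET/SDH rate, and the frame period follows directly from the fixed 4 × 4080-byte frame. A short sketch (Python, function names ours) reproduces the published figures from those two facts:

```python
from fractions import Fraction

# Base SDH/SONET client rates in kbit/s (STM-16, STM-64, STM-256).
BASE_KBPS = {1: 2_488_320, 2: 9_953_280, 3: 39_813_120}

# OTUk rate multipliers from Table 3-1.
OTU_MULT = {1: Fraction(255, 238), 2: Fraction(255, 237), 3: Fraction(255, 236)}

def otu_bit_rate_bps(k: int) -> float:
    """Nominal OTUk bit rate in bit/s."""
    return float(OTU_MULT[k] * BASE_KBPS[k] * 1000)

def frame_period_us(k: int) -> float:
    """OTUk frame period: 4 rows x 4080 columns x 8 bits per frame."""
    frame_bits = 4 * 4080 * 8
    return frame_bits / otu_bit_rate_bps(k) * 1e6

for k in (1, 2, 3):
    print(f"OTU{k}: {otu_bit_rate_bps(k) / 1e9:.6f} Gbit/s, "
          f"period {frame_period_us(k):.3f} us")
```

Running the loop gives the 48.971 μs, 12.191 μs, and 3.035 μs periods of Table 3-4, confirming that the frame length is constant and only the line rate changes with k.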
3.8. G.709 OVERHEAD BYTES: IN-DEPTH ANALYSIS AND PROCESSING
In transport networks, the overhead bytes are generally used for conveying performance, status, configuration, control, and management information regarding the transport network. This information can be relayed as instant snapshot values in single or multiple bytes, or can span multiple frames in the form of messages. These bytes are passed between the network elements spanning their respective sections, domains, or paths. One of the characteristics of the G.709 frame structure is that certain overhead fields use the same value or have the same content format and are transported over different overhead bytes. The information contained in these similarly valued overhead bytes is interpreted within the respective segments of the transport network. For example, the Bit Interleaved Parity code (BIP-8) and the Trail Trace Identifier (TTI) are such fields (and are elaborated in the subsequent sections). To understand the context of the communication between different network nodes within the entire OTN, one needs to keep in mind the directionality of the information flow. Frequently used terms such as "upstream node" and "downstream node" need to be clarified. This concept can be illustrated by the exchange of BIP-8 information between two different nodes, which do not have to be adjacent to each other. The transmitting node (Node A) sends the BIP-8 code to the downstream node (Node B). The receiving node (Node B) recalculates the BIP-8 and compares its calculated value with the value it received. If there are differences between the two values, appropriate information regarding the number of BIP-8 violations is sent to the upstream node (Node A). Figure 3-17 shows the directionality of the upstream and the downstream nodes. It should be kept in mind that the nodes depicted in Figure 3-17 do not have to be adjacent to each other.
As illustrated in Figure 3-13, we can say that in the OTUk context, the source/sink node pair (downstream/upstream pair) will be the 3R point nodes, whereas in the ODUk context, they will be the LTs connected to the DXCs.
Figure 3-17. Directionality of upstream and downstream nodes
3.8.1 OPUk Overhead Bytes and Client Mapping Structure
The OPUk (k = 1, 2, 3) frame structure is shown in Figure 3-18 (a). It is organized in an octet-based block frame structure with four rows and 3810 columns, with column 3824 being the last column of the OPUk field. The two main areas of the OPUk frame are (1) the OPUk overhead area and (2) the OPUk payload area. Columns 15 to 16 of the OPUk are dedicated to the OPUk overhead area. Columns 17 to 3824 of the OPUk are dedicated to the OPUk payload area. It should be noted that the OPUk column numbers are derived from the OPUk columns in the ODUk frame. OPUk OH information is added to the OPUk information payload to create an OPUk. This includes information to support the adaptation of client signals. The OPUk OH is terminated where the OPUk is assembled and disassembled.
Figure 3-18. OPUk frame structure: (a) PJO byte stuffing (payload slightly slow); (b) exact rate; (c) additional byte used for payload (payload slightly fast)
3.8.1.1 OPUk Overhead Description
Payload Structure Identifier (PSI): The 256-byte PSI signal is aligned with the ODUk multiframe (i.e., PSI [0] is present at ODUk multiframe position 0000 0000, PSI[1] at position 0000 0001, PSI [2] at position 0000 0010, etc.). PSI [0] contains a one-byte payload type. PSI [1] to PSI [255] are mapping and concatenation specific. Payload Type (PT): A one-byte payload type signal is defined in the PSI [0] byte of the payload structure identifier to indicate the composition of the OPUk signal. There are a number of payload types defined in Table 3-5. The virtual concatenated signal and the ODU multiplex structure are dealt with later in this document. Mapping SONET/SDH (CBR) into OPUk either synchronously or asynchronously is the most common mapping. Synchronous mapping is a subset of asynchronous mapping; thus we will only discuss asynchronous mapping.
Table 3-5. Payload type code points (from ITU-T Rec. G.709)

MSB 1234   LSB 5678   Hex code (Note 1)   Interpretation
0000       0001       01                  Experimental mapping (Note 3)
0000       0010       02                  Asynchronous CBR mapping
0000       0011       03                  Bit synchronous CBR mapping
0000       0100       04                  ATM mapping
0000       0101       05                  GFP mapping
0000       0110       06                  Virtual concatenated signal (Note 5)
0001       0000       10                  Bit stream with octet timing mapping
0001       0001       11                  Bit stream without octet timing mapping
0010       0000       20                  ODU multiplex structure
0101       0101       55                  Not available (Note 2)
0110       0110       66                  Not available (Note 2)
1000       xxxx       80-8F               Reserved codes for proprietary use (Note 4)
1111       1101       FD                  NULL test signal mapping
1111       1110       FE                  PRBS test signal mapping
1111       1111       FF                  Not available (Note 2)

Note 1 - There are 226 spare codes left for future international standardization.
Note 2 - These values are excluded from the set of available code points. These bit patterns are present in ODUk maintenance signals.
Note 3 - The value "01" is only to be used for experimental activities in cases where a mapping code is not defined in this table.
Note 4 - These 16 code values will not be subject to standardization.
Note 5 - For the payload type of the virtual concatenated signal, a dedicated payload type overhead (vcPT) is used.
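For an implementer, the code points of Table 3-5 amount to a simple lookup on the PSI[0] byte. A minimal sketch (Python, function and dictionary names ours), separating the proprietary range and the excluded maintenance-signal patterns:

```python
# Payload Type (PT) code points from Table 3-5, carried in PSI[0].
PAYLOAD_TYPES = {
    0x01: "Experimental mapping",
    0x02: "Asynchronous CBR mapping",
    0x03: "Bit synchronous CBR mapping",
    0x04: "ATM mapping",
    0x05: "GFP mapping",
    0x06: "Virtual concatenated signal",
    0x10: "Bit stream with octet timing mapping",
    0x11: "Bit stream without octet timing mapping",
    0x20: "ODU multiplex structure",
    0xFD: "NULL test signal mapping",
    0xFE: "PRBS test signal mapping",
}

def classify_payload_type(pt: int) -> str:
    """Interpret the one-byte PT signal from the PSI[0] byte."""
    if pt in (0x55, 0x66, 0xFF):
        # Bit patterns excluded because they occur in ODUk maintenance signals.
        return "Not available"
    if 0x80 <= pt <= 0x8F:
        return "Reserved for proprietary use"
    return PAYLOAD_TYPES.get(pt, "Spare (future international standardization)")
```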
3.8.1.2 Frequency Justification
Asynchronous mapping of a 2.5 Gb/s, 10 Gb/s, or 40 Gb/s constant bit rate (CBR) traffic signal into an OPUk (k = 1, 2, 3) may be performed as depicted in Figures 3-19 (b) and 3-19 (c). The maximum bit-rate tolerance between the OPUk and the client signal clock that this mapping scheme can accommodate is ±65 ppm. With a bit-rate tolerance of ±20 ppm for the OPUk clock, the client signal's bit-rate tolerance can be ±45 ppm. If the client's frequency is out of range, there are not enough justification bytes in the OPUk overhead to make up the difference. The OPUk overhead for these mappings consists of a payload structure identifier (PSI) including the payload type (PT) and 255 bytes reserved for future international standardization (RES), three justification control (JC) bytes, one negative justification opportunity (NJO) byte, and three bytes reserved for future international standardization (RES). The JC bytes consist of two bits for justification control and six bits reserved for future international standardization. The OPUk payload for these mappings consists of 4 × 3808 bytes, including one positive justification opportunity (PJO) byte. The asynchronous and bit synchronous mapping processes generate the JC, NJO, and PJO bits according to Table 3-6 and Table 3-7, respectively. The demapping process interprets JC, NJO, and PJO according to Table 3-8. A majority vote (two out of three) is used to make the justification decision in the demapping process, to protect against an error in one of the three JC signals.
Table 3-6. JC, NJO, and PJO generation by asynchronous mapping process (from ITU-T Rec. G.709)

JC [78]   NJO                  PJO
00        justification byte   data byte
01        data byte            data byte
10        not generated
11        justification byte   justification byte
Table 3-7. JC, NJO, and PJO generation by bit synchronous mapping process (from ITU-T Rec. G.709)

JC [78]      NJO                  PJO
00           justification byte   data byte
01, 10, 11   not generated
Table 3-8. JC, NJO, and PJO interpretation (from ITU-T Rec. G.709)

JC [78]     NJO                  PJO
00          justification byte   data byte
01          data byte            data byte
10 (Note)   justification byte   data byte
11          justification byte   justification byte
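The interpretation rules of Table 3-8, including the two-of-three majority vote over the three JC copies, can be sketched as follows (Python, names ours):

```python
from collections import Counter

# JC/NJO/PJO interpretation per Table 3-8. Key: the 2-bit JC value
# (bits 7-8 of each JC byte); value: (NJO meaning, PJO meaning).
JC_INTERPRETATION = {
    0b00: ("justification", "data"),
    0b01: ("data", "data"),
    0b10: ("justification", "data"),   # never generated by a mapper
    0b11: ("justification", "justification"),
}

def demap_decision(jc1: int, jc2: int, jc3: int):
    """Majority vote (two out of three) over the three JC copies,
    protecting against a bit error in any single JC byte. If all
    three copies differ (a multi-error case not covered by the
    table), the first value is taken arbitrarily."""
    votes = Counter(jc & 0b11 for jc in (jc1, jc2, jc3))
    winner, _ = votes.most_common(1)[0]
    return JC_INTERPRETATION[winner]
```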
One should keep in mind that a mapper circuit does not generate the code "10" in Table 3-8; due to bit errors, however, a demapper circuit might receive it. Moreover, it should be noted that, unlike SONET/SDH, there is no "Start of Payload" indication or pointer.

3.8.1.3 Mapping a CBR2G5 signal (e.g., STS-48/STM-16) into OPU1

Groups of eight successive bits (not necessarily a byte) of the CBR2G5 signal are mapped into a data (D) byte of the OPU1, as depicted in Figure 3-19 (a). Once per OPU1 frame, it is possible to perform either a positive or a negative justification action.

3.8.1.4 Mapping a CBR10G signal (e.g., STS-192/STM-64) into OPU2

Groups of eight successive bits (not necessarily a byte) of the CBR10G signal are mapped into a data (D) byte of the OPU2, as shown in Figure 3-19 (b). A total of 64 fixed stuff (FS) bytes are added in columns 1905 to 1920. Once per OPU2 frame, it is possible to perform either a positive or a negative justification action.

3.8.1.5 Mapping a CBR40G signal (e.g., STS-768/STM-256) into OPU3

Groups of eight successive bits (not necessarily a byte) of the CBR40G signal are mapped into a data (D) byte of the OPU3, as illustrated in Figure 3-19 (c). A total of 128 fixed stuff (FS) bytes are added in columns 1265 to 1280 and in columns 2545 to 2560. Once per OPU3 frame, it is possible to perform either a positive or a negative justification action.
Figure 3-19. Mapping of a CBR signal onto the OPUk: (a) 2.5 Gb/s signal mapping; (b) 10 Gb/s signal mapping; (c) 40 Gb/s signal mapping
3.8.2 Similarly Valued/Formatted Fields within the G.709 Frame

3.8.2.1 BIP-8
One method of monitoring the performance of the network is to monitor the received bit errors. The error detection is performed by using a Bit Interleaved Parity (BIP) check. ITU-T Rec. G.707 [9] defines BIP as follows: Bit Interleaved Parity-X (BIP-X) is a method of error monitoring. With even parity, an X-bit code is generated by the transmitting equipment over a specified portion of the signal in such a manner that the first bit of the code provides even parity over the first bit of all X-bit sequences in the covered portion of the signal, the second bit provides even parity over the second bit of all X-bit sequences within the specified portion, etc. Even parity is generated by setting the BIP-X bits so that there is an even number of 1s in each monitored partition of the signal. A monitored partition comprises all bits which are in the same bit position within the X-bit sequences in the covered portion of the signal. The covered portion includes the BIP-X. The BIP-8 is computed over the bits in the OPUk (columns 15 to 3824) area of frame i, and is inserted in the respective ODUk or OTUk overhead BIP-8 locations of frame i + 2, as shown in Figure 3-20.
Figure 3-20. BIP-8 computation
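Because the parity is even and the monitored partitions are the eight bit positions of each byte, computing a BIP-8 over a byte region reduces to a byte-wise XOR. A minimal sketch (Python, function name ours):

```python
from functools import reduce

def bip8(covered: bytes) -> int:
    """Bit Interleaved Parity-8 with even parity: bit n of the code
    gives even parity over bit n of every byte in the covered area,
    which for OTN is the OPUk region (columns 15 to 3824) of a frame.
    With even parity this is simply the XOR of all covered bytes."""
    return reduce(lambda acc, b: acc ^ b, covered, 0)
```

The resulting byte would then be written into the BIP-8 overhead location two frames later, as described above.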
3.8.2.2 Trail Trace Identifier (TTI)

A Trail Trace Identifier is a single byte that is used to transport a 64-byte message (similar in functionality to the J0 byte in SONET/SDH). The message contains a source and a destination identifier used for routing the OTU signal through the network. There are also bytes allocated for operator-specific use. The 64-byte message is aligned with the OTU multiframe and is transmitted four times per multiframe (since the multiframe extends over 256 frames). The TTI contains globally unique Access Point Identifiers (APIs), which are used to specify the Source Access Point Identifier (SAPI) and Destination Access Point Identifier (DAPI). The APIs contain information regarding the country of origin, network operator, and administrative details. Table 3-9 illustrates the format of the source and the destination access point identifiers. These consist of a three-character international segment (IS) and a twelve-character national segment (NS). The characters are coded according to ITU-T T.50, International Reference Alphabet (7-bit coded character set for information exchange) [10].

Table 3-9. Access Point Identifier structure (from ITU-T Rec. G.709)

Character #   Segment   Content
1-3           IS        CC
4-9           NS        ICC
10-15         NS        UAPC
The Country Code (CC) under the IS column is represented by a three-letter uppercase code as per the ISO 3166 country codes; for example, France is represented by FRA and the United States of America by USA. The national segment consists of two subfields, namely the ITU Carrier Code (ICC) followed by a Unique Access Point Code (UAPC). The ITU-T Telecommunication Standardization Bureau (TSB) assigns and maintains the ICCs for network operators/service providers. The ICC consists of 1-6 left-justified characters, alphabetic or leading alphabetic with trailing numeric. The UAPC is selected by the organization to which the CC and the ICC have been assigned, and must be chosen so that its uniqueness is guaranteed. The UAPC consists of 6-11 characters with a trailing NUL character, making the national segment 12 characters in total.
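The API layout can be sketched in code. The sketch below builds only the 15-character API itself (the surrounding placement in the 64-byte TTI message is not shown), and the operator codes used in the example ("EXMPL", "NODE07") are hypothetical:

```python
def build_api(cc: str, icc: str, uapc: str) -> bytes:
    """Build a 15-character Access Point Identifier: a 3-character
    international segment (ISO 3166 country code) followed by a
    12-character national segment (ICC + UAPC, NUL-padded)."""
    if len(cc) != 3 or not cc.isupper():
        raise ValueError("CC must be a 3-letter uppercase ISO 3166 code")
    if not 1 <= len(icc) <= 6:
        raise ValueError("ICC is 1-6 characters")
    ns = icc + uapc
    if len(ns) > 11:
        raise ValueError("ICC + UAPC must leave room for a trailing NUL")
    # Pad the national segment with NUL bytes up to 12 characters,
    # which guarantees at least one trailing NUL.
    return cc.encode("ascii") + ns.encode("ascii").ljust(12, b"\x00")

api = build_api("USA", "EXMPL", "NODE07")   # hypothetical operator codes
```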
3.8.3 ODUk Overhead and Processing

The ODUk (k = 1, 2, 3) frame structure is shown in Figure 3-21. It is organized in an octet-based block frame structure with four rows and 3824 columns.
Figure 3-21. ODUk frame structure
The three main areas of the ODUk frame are:
• The OTUk area
• The ODUk overhead area
• The OPUk area
Columns 1 to 14 of rows 2-4 are dedicated to the ODUk overhead area, columns 1 to 14 of row 1 are reserved for frame alignment and OTUk-specific overhead, and columns 15 to 3824 of the ODUk are dedicated to the OPUk area. ODUk OH information is added to the ODUk information payload to create an ODUk. This includes information for maintenance and operational functions to support optical channels. The ODUk OH consists of portions dedicated to the end-to-end ODUk path and to six levels of tandem connection monitoring. The ODUk path OH is terminated where the ODUk is assembled and disassembled. The TC OH is added and terminated at the source and sink of the corresponding tandem connections, respectively. The term ODUk may or may not include the bytes in row 1. If one talks about the ODUk rate, then the bytes in row 1 are included; however, if one talks about the ODUk OH, then the bytes in row 1 are not included. In the functional model (G.798), the ODUk is considered to include row 1, but with all the bytes in row 1 equal to zero (Sections 14 and 14.3.1.1 of [5]). The ODUk overhead byte locations are shown in Figure 3-22. Figures 3-23 and 3-24 expand the PM and the TCMi overhead fields.
RES: Reserved for future international standardization
TCM: Tandem Connection Monitoring
ACT: Activation/deactivation control channel
FTFL: Fault Type & Fault Location reporting channel
PM: Path Monitoring
EXP: Experimental
GCC: General Communication Channel
APS: Automatic Protection Switching coordination channel
PCC: Protection Communication Control channel

Figure 3-22. ODUk overhead
SAPI: Source Access Point Identifier
DAPI: Destination Access Point Identifier
TTI: Trail Trace Identifier
BIP-8: Bit Interleaved Parity, level 8
BEI: Backward Error Indication
BDI: Backward Defect Indication
STAT: Status
PSI: Payload Structure Identifier
PT: Payload Type

Figure 3-23. ODUk path monitoring overhead (from ITU-T Rec. G.709)
Figure 3-24. ODUk tandem connection monitoring #i overhead (from ITU-T Rec. G.709)
3.8.3.1 Path Monitoring (PM) Byte Descriptions

Path monitoring overhead bytes, as depicted in Figure 3-23, consist of a Trail Trace Identifier (TTI), a Bit Interleaved Parity (BIP-8), and a third byte that consists of Backward Error Indication and Backward Incoming Alignment Error (BEI/BIAE), Backward Defect Indication (BDI), and Status (STAT) fields. The Trail Trace Identifier is a single byte used within the context of the path within the optical transport network, as depicted in Figure 3-13. The multiframe message format of the TTI is elaborated in Section 3.8.2.2. The Bit Interleaved Parity (BIP-8) is used in the PM region of the G.709 frame, as elaborated earlier in Section 3.8.1 (Figure 3-18), and is one byte used for error detection. The BIP-8 byte is computed over the entire OPUk region and is inserted two frames later in the PM's BIP-8 field, as discussed in Section 3.8.2.1.

Backward Defect Indication (BDI) is defined to convey the "Signal Fail" status, detected at the Path Terminating Sink Function, to the upstream node. This signal is created by the consequent action of aBDI (G.798/14.2.1.2). The actual defect equations are as follows:

RI_BDI = aBDI ← CI_SSF or dAIS or dOCI or dLCK or dTIM

It should be noted that dAIS, dOCI, dLCK, and dTIM are all detected at the PM layer.

CI_SSF = AI_TSF at the TCM layer

AI_TSF = aTSF ← (CI_SSF or dAIS or dLTC or dOCI or dLCK or (dTIM and not TIMActDis)) and TCMCI_Mode == OPERATIONAL

dAIS, dLTC, dOCI, dLCK, and dTIM are TCM defects.

CI_SSF = aSSF ← dAIS or dLOF or dLOM (G.798/12.3.1.2); dAIS here is the SM AIS

RI: Remote Information
BDI: Backward Defect Indication
CI: Characteristic Information
SSF: Server Signal Fail
AIS: Alarm Indication Signal defect
OCI: Open Connection Indication defect
LCK: Locked defect
TIM: TTI mismatch detection
AI: Adapted Information
TSF: Trail Signal Fail
LOF: Loss of Frame
LOM: Loss of Multiframe

3.8.3.2 Backward Error Indication and Backward Incoming Alignment Error (BEI/BIAE)
The BEI/BIAE signal is used to convey in the upstream direction the count of interleaved-bit blocks that have been detected in error by the corresponding ODUk path monitoring sink using the BIP-8 code. This count has nine legal values, namely, 0-8 errors. The remaining seven possible values represented by these four bits can only result from some unrelated condition and are interpreted as zero errors, as shown in Table 3-10.
Table 3-10. ODUk PM BEI interpretation (from ITU-T Rec. G.709)

ODUk PM BEI bits 1234   BIP violations
0000                    0
0001                    1
0010                    2
0011                    3
0100                    4
0101                    5
0110                    6
0111                    7
1000                    8
1001 to 1111            0

3.8.3.3 Path Monitoring Status (STAT)
The three most significant bits of the PM status octet indicate the presence of a maintenance signal, as illustrated in Table 3-11.
Table 3-11. ODUk PM status interpretation (from ITU-T Rec. G.709)

PM byte 3, bits 678   Status
000                   Reserved for future international standardization
001                   Normal path signal
010                   Reserved for future international standardization
011                   Reserved for future international standardization
100                   Reserved for future international standardization
101                   Maintenance signal: ODUk-LCK
110                   Maintenance signal: ODUk-OCI
111                   Maintenance signal: ODUk-AIS
3.8.4 Tandem Connection Monitoring (TCM)
There are six TCM fields. These can be nested or overlapping, as illustrated in Figures 3-10 and 3-11. Trail Trace Identifier: The TTI is a single byte used within the context of the respective tandem connection, as depicted in Figures 3-10 and 3-11. The multiframe message format of the TTI is elaborated in Section 3.8.2.2. BIP-8: The same value that was calculated for the BIP-8 in the PM BIP-8 field is also used in the TCMi BIP-8 field. The BIP-8 is only overwritten at the start of a tandem connection; any existing TCM is not overwritten. The BIP-8 byte is computed over the entire OPUk region and is inserted two frames later in the respective TCMi's BIP-8 field, as discussed in Section 3.8.2.1.

3.8.4.1 Backward Defect Indication (BDI)
Backward Defect Indication (BDI) is defined to convey the "Signal Fail" status, detected at the Path Terminating Sink Function, to the upstream node. This signal is created by the consequent action of aBDI at the TCM level (G.798/14.5.1.1.2). The actual defect equations are as follows:

RI_BDI = aBDI ← (CI_SSF or dAIS or dLTC or dOCI or dLCK or (dTIM and not TIMActDis)) and TCMCI_Mode ≠ TRANSPARENT

dAIS and dTIM are TCM defects.

CI_SSF = aSSF ← dAIS or dLOF or dLOM (G.798/12.3.1.2)

dAIS here is the SM AIS.

RI: Remote Information
BDI: Backward Defect Indication
CI: Characteristic Information
SSF: Server Signal Fail
AIS: Alarm Indication Signal defect
LTC: Loss of Tandem Connection
OCI: Open Connection Indication defect
LCK: Locked defect
TIM: TTI mismatch detection
LOF: Loss of Frame
LOM: Loss of Multiframe
3.8.4.2 Backward Error Indication and Backward Incoming Alignment Error (BEI/BIAE)

The BEI/BIAE signal is used to convey, in the upstream direction, the count of interleaved-bit blocks that have been detected as being in error by the corresponding ODUk tandem connection monitoring sink using the BIP-8 code. It is also used to convey in the upstream direction an incoming alignment error (IAE) condition that is detected in the corresponding ODUk tandem connection monitoring sink in the IAE overhead. As shown in Table 3-12, during an IAE condition the code "1011" is inserted into the BEI/BIAE field and the error count is ignored. Otherwise, the error count (0-8) is inserted into the BEI/BIAE field. The remaining six possible values represented by these four bits can only result from some unrelated condition and are interpreted as zero errors and BIAE not active.

Table 3-12. ODUk TCM BEI interpretation (from ITU-T Rec. G.709)
ODUk TCM BEI bits 1234   BIAE    BIP violations
0000                     false   0
0001                     false   1
0010                     false   2
0011                     false   3
0100                     false   4
0101                     false   5
0110                     false   6
0111                     false   7
1000                     false   8
1001, 1010               false   0
1011                     true    0
1100 to 1111             false   0
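The interpretation of the 4-bit TCM BEI/BIAE field in Table 3-12 can be sketched as a small decoder (Python, names ours):

```python
def decode_tcm_bei(bits: int):
    """Interpret the 4-bit ODUk TCM BEI/BIAE field.
    Returns (biae_active, bip_violation_count)."""
    bits &= 0xF
    if bits <= 8:           # 0000..1000: literal error count, BIAE inactive
        return (False, bits)
    if bits == 0b1011:      # BIAE condition signalled; error count ignored
        return (True, 0)
    # 1001, 1010, 1100..1111: unrelated condition, read as zero errors
    return (False, 0)
```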
3.8.4.3 TCM Monitoring Status (STAT)

For each tandem connection monitoring field, three bits are defined as status bits (STAT). These indicate the presence of a maintenance signal, whether there is an incoming alignment error at the source TCM, or whether no source TCM is active, as illustrated in Table 3-13.
Table 3-13. ODUk TCM status interpretation (from ITU-T Rec. G.709)

TCM byte 3, bits 678   Status
000                    No source TC
001                    In use without IAE
010                    In use with IAE
011                    Reserved for future international standardization
100                    Reserved for future international standardization
101                    Maintenance signal: ODUk-LCK
110                    Maintenance signal: ODUk-OCI
111                    Maintenance signal: ODUk-AIS

3.8.4.4 Tandem Connection Monitoring ACTivation/deactivation (TCM-ACT)
The definition of this field is for further study.

3.8.4.5 General Communication Channels (GCC1, GCC2)
The GCC1 and GCC2 bytes are primarily used to provide a general communication channel between ODUk termination points. Within the context of the data communication network as defined and structured in G.7712/Y.1703 [11], the GCC bytes can be used to provide an Embedded Control Channel. These bytes can carry various operations, management, administration, and provisioning information. For example, a connection table for a digital cross-connect, or software upgrades, can be sent via these bytes from a remote management station.

3.8.4.6 Automatic Protection Switching and Protection Communication Channel (APS/PCC)
Up to eight levels of nested APS/PCC signals may be present in this field. The APS/PCC bytes in a given frame are assigned to a dedicated level depending on the value of MFAS, as shown in Table 3-14:
Table 3-14. Multiframe to allow separate APS/PCC for each monitoring level (from ITU-T Rec. G.709)

MFAS bits 678   APS/PCC channel applies to
000             ODUk Path
001             ODUk TCM1
010             ODUk TCM2
011             ODUk TCM3
100             ODUk TCM4
101             ODUk TCM5
110             ODUk TCM6
111             ODUk SNC/I APS
For linear protection schemes, the bit assignments for these bytes and the bit-oriented protocol are given in Recommendation G.873.1. Bit assignment and byte oriented protocol for ring protection schemes are for further study.
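The per-frame level selection of Table 3-14 is a direct function of the three least significant MFAS bits, as this sketch shows (Python, names ours):

```python
def aps_pcc_level(mfas: int) -> str:
    """Select which monitoring level the APS/PCC bytes of the current
    frame belong to, from MFAS bits 678 (the three least significant
    bits of the multiframe counter), per Table 3-14."""
    levels = ["ODUk Path", "ODUk TCM1", "ODUk TCM2", "ODUk TCM3",
              "ODUk TCM4", "ODUk TCM5", "ODUk TCM6", "ODUk SNC/I APS"]
    return levels[mfas & 0b111]
```

Because the MFAS counter increments every frame, each of the eight levels gets the APS/PCC bytes once every eight frames.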
3.8.4.7 Fault Type and Fault Location Reporting Communication Channel (FTFL)

The FTFL has been defined in G.709 to aid in fault location. The actual implementation is still under discussion in the ITU-T and thus is not discussed here.
3.9. OTUk OVERHEAD AND PROCESSING

The OTUk (k = 1, 2, 3) frame structure is based on the ODUk frame structure and extends it with a forward error correction (FEC) field. A total of 256 columns are added to the ODUk frame for the FEC, and the overhead bytes in row 1, columns 8 to 14, of the ODUk overhead are used for OTUk-specific overhead, resulting in an octet-based block frame structure with four rows and 4080 columns. The bit rates of the OTUk signals are defined in Table 3-1. The OTUk FEC contains the Reed-Solomon RS(255,239) FEC codes. If no FEC is used, fixed stuff bytes (an all-0s pattern) are inserted. The RS(255,239) FEC code is elaborated in detail in Annex A of G.709.
OTUk OH information is part of the OTUk signal structure and is depicted in Figure 3-25. It includes information for operational functions to support the transport via one or more optical channel connections. The OTUk OH is terminated where the OTUk signal is assembled and disassembled.

FA: Frame Alignment
MFAS: Multiframe Alignment Signal
SM: Section Monitoring
GCC: General Communication Channel
RES: Reserved for future international standardization
TTI: Trail Trace Identifier
SAPI: Source Access Point Identifier
DAPI: Destination Access Point Identifier
BIP-8: Bit Interleaved Parity, level 8
BEI: Backward Error Indication
BDI: Backward Defect Indication
IAE: Incoming Alignment Error
BIAE: Backward Incoming Alignment Error

Figure 3-25. OTUk overhead (from ITU-T G.709)
3.9.1 Scrambling

The OTUk signal needs sufficient bit-timing content to allow a clock to be recovered. A suitable bit pattern, which prevents a long sequence of 1s or 0s, is provided by using a scrambler, without adding any additional overhead to the overall OTUk frame. The operation of the scrambler is functionally identical to that of a frame synchronous scrambler of sequence length 65535 operating at the OTUk rate. The generating polynomial is 1 + x + x³ + x¹² + x¹⁶. Figure 3-26 shows a functional diagram of the frame synchronous scrambler.
Figure 3-26. Frame synchronous scrambler
The scrambler is reset to "FFFF" (hex) on the most significant bit of the byte following the last framing byte in the OTUk frame, i.e., the MSB of the MFAS byte. This bit, and all subsequent bits to be scrambled, are added modulo 2 to the output from the x¹⁶ position of the scrambler. The scrambler runs continuously throughout the complete OTUk frame. To ensure frame acquisition at the receiving end, the framing bytes (FAS) of the OTUk overhead are not scrambled. Scrambling is performed after the FEC check bytes computation and insertion into the OTUk signal.
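The scrambler can be sketched as an additive LFSR. G.709 defines it at the bit level, so the byte-stream interface and MSB-first bit ordering below are our assumptions; the input is taken to start at the MFAS byte (the six FAS bytes, which are not scrambled, are excluded):

```python
def otuk_scramble(data: bytes) -> bytes:
    """Frame synchronous additive scrambler, generating polynomial
    1 + x + x^3 + x^12 + x^16, register preset to 0xFFFF at the MSB
    of the MFAS byte. Being additive, the same function descrambles."""
    state = 0xFFFF
    out = bytearray()
    for byte in data:
        scrambled = 0
        for i in range(7, -1, -1):            # MSB of each byte first
            key = (state >> 15) & 1           # x^16 tap output
            scrambled |= (((byte >> i) & 1) ^ key) << i
            # Feedback taps per 1 + x + x^3 + x^12 + x^16.
            fb = key ^ ((state >> 11) & 1) ^ ((state >> 2) & 1) ^ (state & 1)
            state = ((state << 1) | fb) & 0xFFFF
        out.append(scrambled)
    return bytes(out)
```

Since the keystream depends only on the preset register and not on the data, applying the function twice returns the original bytes.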
3.9.2 Frame Alignment Overhead

3.9.2.1 Frame Alignment Signal (FAS)

A six-byte OTUk-FAS signal, as shown in Figure 3-27, is defined in row 1, columns 1 to 6, of the OTUk overhead. OA1 is "1111 0110" (0xF6). OA2 is "0010 1000" (0x28).
Figure 3-27. Frame alignment signal overhead structure (from ITU-T G.709)
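Frame alignment amounts to searching the incoming byte stream for the six-byte OA1 OA1 OA1 OA2 OA2 OA2 pattern. A minimal sketch (Python, names ours); a real aligner would additionally confirm that the pattern repeats at the frame period before declaring the in-frame state:

```python
# The six-byte FAS pattern: three OA1 bytes followed by three OA2 bytes.
FAS = bytes([0xF6, 0xF6, 0xF6, 0x28, 0x28, 0x28])

def find_frame_start(stream: bytes) -> int:
    """Return the byte offset of the first FAS occurrence, or -1.
    Searching works because the rest of the frame is scrambled,
    making an accidental FAS pattern unlikely."""
    return stream.find(FAS)
```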
3.9.2.2 Multiframe Alignment Signal (MFAS)

Some of the OTUk and ODUk overhead signals span multiple OTUk/ODUk frames. A single Multiframe Alignment Signal (MFAS) byte is defined in row 1, column 7, of the OTUk/ODUk overhead, as depicted in Figure 3-25. The value of the MFAS byte is incremented each OTUk/ODUk frame, thus providing a 256-frame multiframe. Individual OTUk/ODUk overhead signals use this central multiframe to lock their 2-frame, 4-frame, 8-frame, 16-frame, 32-frame, etc., multiframes to the principal frame (depicted in Figure 3-28).
Figure 3-28. Multiframe alignment signal overhead (MFAS byte counting from 0 through 255 and wrapping)
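Locking a shorter multiframe to the central MFAS counter is a matter of masking its low-order bits, since every 2ⁿ-frame multiframe length divides 256. A sketch (Python, names ours):

```python
def multiframe_phase(mfas: int, length: int) -> int:
    """Phase of the current frame within a 2-, 4-, 8-, ..., or
    256-frame multiframe locked to the MFAS counter. `length`
    must be a power of two no greater than 256."""
    assert 2 <= length <= 256 and length & (length - 1) == 0
    return mfas & (length - 1)
```

For example, a 64-byte TTI message transmitted four times per 256-frame multiframe advances one byte per frame at phase `multiframe_phase(mfas, 64)`.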
3.9.3 Section Monitoring Byte Descriptions

The Section Monitoring overhead bytes within the OTUk consist of TTI, BIP-8, and a third byte consisting of the subfields BEI/BIAE, BDI, IAE, and RES. The Trail Trace Identifier is a single byte used within the context of section monitoring within the optical transport network, as depicted in Figure 3-13. The multiframe message format of the TTI is elaborated in Section 3.8.2.2 and is illustrated in Figure 3-29. The Bit Interleaved Parity (BIP-8), used in the SM region of the G.709 frame as elaborated earlier in Section 3.8.1 (Figure 3-18), is one byte used for error detection. The BIP-8 byte is computed over the entire OPUk region and is inserted two frames later in the SM's BIP-8 field, as discussed in Section 3.8.2.1.
[figure: the Section Monitoring overhead comprises the TTI byte (carrying, over a 64-byte multiframe, the SAPI, DAPI and Operator Specific fields), the BIP-8 byte, and a third byte with the BEI/BIAE, BDI, IAE and RES subfields]

TTI: Trail Trace Identifier
SAPI: Source Access Point Identifier
DAPI: Destination Access Point Identifier
BIP-8: Bit Interleaved Parity, level 8
BEI: Backward Error Indication
BIAE: Backward Incoming Alignment Error
BDI: Backward Defect Indication
IAE: Incoming Alignment Error
RES: Reserved

Figure 3-29. OTUk section monitoring overhead (from ITU-T Rec. G.709)
Note: The OPUk includes the Justification Bytes, and thus an OTN signal cannot be retimed without demapping back to the client signal.
3.9.3.1 Backward Defect Indication (BDI)
The Backward Defect Indication is defined to convey the "Signal Fail" status, detected at the Section Terminating Sink Function, to the upstream node. This signal is created by the consequent action aBDI at the SM level (Section 13.2.1.2 of G.798). The defect equations are as follows:

RI_BDI = aBDI = CI_SSF or (dTIM and not TIMActDis)
CI_SSF = aSSF = dAIS or dLOF or dLOM (G.798, 12.3.1.2)
dAIS = OTUk-AIS (G.798, 6.2.6.3.4)
dTIM: see G.798, 6.2.2.1

RI: Remote Information
BDI: Backward Defect Indication
CI: Characteristic Information
SSF: Server Signal Fail
AIS: Alarm Indication Signal defect
LCK: Locked defect
TIM: TTI Mismatch detection
LOF: Loss of Frame
LOM: Loss of Multiframe
3.9.3.2 Backward Error Indication and Backward Incoming Alignment Error (BEI/BIAE)
The BEI/BIAE signal is used to convey, in the upstream direction, the count of bit-interleaved blocks that have been detected in error by the corresponding OTUk section monitoring sink using the BIP-8 code. It is also used to convey, in the upstream direction, an incoming alignment error (IAE) condition that is detected in the corresponding OTUk section monitoring sink in the IAE overhead. During an IAE condition the code "1011" is inserted into the BEI/BIAE field and the error count is ignored. Otherwise, the error count (0-8) is inserted into the BEI/BIAE field. The remaining six possible values represented by these four bits can only result from some unrelated condition and are interpreted as zero errors and BIAE not active (Table 3-15).

Table 3-15. OTUk SM BEI/BIAE interpretation (from ITU-T Rec. G.709)
OTUk SM BEI/BIAE bits 1234    BIAE     BIP violations
0000                          false    0
0001                          false    1
0010                          false    2
0011                          false    3
0100                          false    4
0101                          false    5
0110                          false    6
0111                          false    7
1000                          false    8
1001, 1010                    false    0
1011                          true     0
1100 to 1111                  false    0
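Table 3-15 maps directly onto a small encoder/decoder (a sketch; the function names are ours):

```python
def encode_bei_biae(bip_violations: int, iae_detected: bool) -> int:
    """Build the 4-bit SM BEI/BIAE field per Table 3-15."""
    if iae_detected:
        return 0b1011          # BIAE code: the error count is not reported
    assert 0 <= bip_violations <= 8
    return bip_violations      # codes 0000..1000 carry the count directly

def interpret_bei_biae(bits: int):
    """Return (biae_active, bip_violations) for a received 4-bit code."""
    if bits == 0b1011:
        return True, 0
    if bits <= 8:
        return False, bits
    return False, 0            # 1001, 1010, 1100..1111: zero errors, BIAE inactive
```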
3.9.3.3 Incoming Alignment Error (IAE)
A single-bit incoming alignment error (IAE) signal is defined to allow the ingress point to inform its peer egress point that an alignment error in the incoming signal has been detected. IAE is set to "1" to indicate a frame alignment error; otherwise, it is set to "0". The egress point may use this information to suppress the counting of bit errors, which may occur as a result of a frame phase change of the OTUk at the ingress of the section. G.798 shows an incoming alignment error being detected on the source side (Section 13.3.1.1 of G.798); the consequent action (AI_IAE) is then used to set the SM IAE bit (Section 13.2.1.1 of G.798). In practice, it is detected on the sink side and then passed to the source side. However, if the signal passes through an ODUk switch, it would need to be detected on the source side.
3.9.4 General Communication Channel 0 (GCC0)
The GCC0 bytes are primarily used to provide a general communication channel between the OTUk termination points. Within the context of the data communication network as defined and structured in G.7712/Y.1703 [11], the GCC bytes can be used to provide an Embedded Control Channel. These bytes can carry various operations, management, administration, and provisioning information; for example, a connection table for a digital cross-connect, or software upgrades, can be sent via these bytes from a remote management station.
3.10. ODUK MULTIPLEXING
Multiplexing in the OTN domain, as defined in G.709, supports combining several lower-speed channels into higher-speed channels. For example, four ODU1s can be multiplexed into an ODU2, and up to sixteen ODU1s or four ODU2s can be multiplexed into an ODU3. It is also possible to mix ODU1s and ODU2s in an ODU3. For ODU2 to ODU3 multiplexing, two positive stuff opportunities are needed. For ODU1 to ODU3 multiplexing, there is a fixed stuff in column 119. Thus the stuffing for multiplexing is different from the stuffing for mapping. To understand this intricacy, it is necessary to examine the data rates.
3.10.1 Multiplexing Data Rates
3.10.1.1 ODU1 to ODU2 Justification Rate
For the case of multiplexing ODU1 into ODU2: from Table 3-2, the ODU1 rate is

239/238 * OC48 ± 20 ppm = 2,498,775,126 ± 49,976 b/s

We can put data in the "Fixed Stuff" bits of the OPU2 payload shown in Figure 3-20b. Thus the OPU2 payload is 238/239 * the ODU2 rate, or 238/239 * 239/237 * OC192 ± 20 ppm. The OPU2 payload is time sliced for the four ODU1s, so each ODU1 has

238/237 * OC48 ± 20 ppm = 2,498,819,241 ± 49,976 b/s

The worst-case frequency difference is then

(2,498,819,241 + 49,976) - (2,498,775,126 - 49,976) = 144,067 b/s, or 57.65 ppm

Thus we have to account for a data-rate mismatch of 144,067 b/s by stuffing. The stuffing is done on a multiframe basis: each timeslot is stuffed once per four frames. The stuffing rate is

(stuff bits/frame)/(bits/frame) * (data rate) = (8/4)/(3824*4*8) * (238/237 * OC192) = 163,364 b/s = 65 ppm

Therefore, in the worst case, there are enough data bytes coming in to match the outgoing rate.
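The arithmetic above is easy to check mechanically; the following Python snippet re-derives the mismatch and stuffing figures (all rates in b/s, variable names ours):

```python
# Re-deriving the ODU1-to-ODU2 justification arithmetic.
OC48 = 2_488_320_000
OC192 = 9_953_280_000
PPM = 1e-6

odu1_rate = 239 / 238 * OC48   # nominal ODU1 rate, ~2,498,775,126 b/s
slot_rate = 238 / 237 * OC48   # one OPU2 timeslot,  ~2,498,819,241 b/s

# Worst case: fastest timeslot clock (+20 ppm) vs. slowest ODU1 clock (-20 ppm).
mismatch = slot_rate * (1 + 20 * PPM) - odu1_rate * (1 - 20 * PPM)

# One stuff byte (8 bits) per timeslot per 4-frame multiframe.
bits_per_frame = 3824 * 4 * 8
stuff_rate = (8 / 4) / bits_per_frame * (238 / 237 * OC192)

assert stuff_rate > mismatch   # ~65 ppm of capacity covers the ~57.65 ppm mismatch
```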
3.10.1.2 ODU2 to ODU3 Justification Rate
For the case of multiplexing ODU2 into ODU3: from Table 3-2, the ODU2 rate is

239/237 * OC192 ± 20 ppm = 10,037,273,930 ± 200,745 b/s

We can put data in the "Fixed Stuff" bits of the OPU3 payload shown in Figure 3-20c. Thus the OPU3 payload is 238/239 * the ODU3 rate, or 238/239 * 239/236 * OC768 ± 20 ppm. The OPU3 payload is time sliced for the four ODU2s, so each ODU2 has

238/236 * OC192 ± 20 ppm = 10,037,629,830 ± 200,753 b/s

The worst-case frequency differences are then

(10,037,629,830 + 200,753) - (10,037,273,930 - 200,745) = 757,398 b/s, or +75 ppm
(10,037,629,830 - 200,753) - (10,037,273,930 + 200,745) = -45,598 b/s, or -4.5 ppm

Thus we have to account for a data-rate mismatch of 757,398 b/s by stuffing, done on a multiframe basis. This is more than the ±65 ppm that the normal scheme can accommodate. Thus, each timeslot has two positive stuff opportunities and one negative stuff opportunity per four frames. The stuffing rate is

(stuff bits/frame)/(bits/frame) * (data rate) = (16/4)/(3824*4*8) * (238/236 * OC768) = 1,312,452 b/s = 130 ppm

Therefore, in the worst case, there are enough data bytes coming in to match the outgoing rate.
3.10.1.3 ODU1 to ODU3 Justification Rate
For the case of multiplexing ODU1 into ODU3, it can be seen from Table 3-2 that the ODU1 rate is

239/238 * OC48 ± 20 ppm = 2,498,775,126 ± 49,976 b/s

We can put data in the "Fixed Stuff" bits of the OPU3 payload shown in Figure 3-20c. Thus the OPU3 payload is 238/239 * the ODU3 rate, or 238/239 * 239/236 * OC768 ± 20 ppm. The OPU3 payload is time sliced for the 16 ODU1s. Column 119 of the time-sliced ODU3 is fixed stuff, into which an all-0s pattern is inserted. Thus each ODU1 has

237/239 * 239/236 * OC48 ± 20 ppm = 2,498,863,728 ± 49,977 b/s

The worst-case frequency difference is then

(2,498,863,728 + 49,977) - (2,498,775,126 - 49,976) = 188,555 b/s, or 75.46 ppm

Therefore, we have to account for a data-rate mismatch of 188,555 b/s by stuffing, done on a multiframe basis. Once again, each timeslot has two positive stuff opportunities and one negative stuff opportunity, now per 16 frames. The stuffing rate is

(stuff bits/frame)/(bits/frame) * (data rate) = (16/16)/(3824*4*8) * (238/236 * OC768) = 328,113 b/s = 130 ppm

Therefore, in the worst case, there are enough data bytes coming in to match the outgoing rate.
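All three cases follow the same pattern, which the sketch below captures in one routine (our parameterization; it simply re-derives the numbers of Sections 3.10.1.1 through 3.10.1.3):

```python
OC48, OC192, OC768 = 2_488_320_000, 9_953_280_000, 39_813_120_000
PPM, BITS_PER_FRAME = 1e-6, 3824 * 4 * 8

def worst_mismatch(client_rate, slot_rate, tol_ppm=20):
    """Fastest timeslot clock against the slowest client clock, in b/s."""
    return slot_rate * (1 + tol_ppm * PPM) - client_rate * (1 - tol_ppm * PPM)

def stuff_capacity(stuff_bits, mf_frames, payload_rate):
    """Justification capacity: stuff bits per multiframe, scaled to b/s."""
    return (stuff_bits / mf_frames) / BITS_PER_FRAME * payload_rate

cases = {
    #                 client rate       per-slot rate    bits  frames  payload rate
    "ODU1->ODU2": (239/238 * OC48,  238/237 * OC48,   8,    4,  238/237 * OC192),
    "ODU2->ODU3": (239/237 * OC192, 238/236 * OC192, 16,    4,  238/236 * OC768),
    "ODU1->ODU3": (239/238 * OC48,  237/236 * OC48,  16,   16,  238/236 * OC768),
}
# In every case the stuffing capacity exceeds the worst-case mismatch.
for name, (client, slot, bits, frames, payload) in cases.items():
    assert stuff_capacity(bits, frames, payload) > worst_mismatch(client, slot), name
```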
3.10.2 4 x ODU1 to ODU2 Multiplexing
3.10.2.1 4 x ODU1 to ODU2 Multiplexing Structure
The OPU2 is divided into a number of Tributary Slots (TSs), and these Tributary Slots are interleaved within the OPU2, as depicted in Figure 3-30. The bytes of an ODU1 input are mapped into one of the four OPU2 Tributary Slots. The bytes of the Justification Overhead used to adapt the asynchronous ODU1 to the ODU2 clock rate are mapped into the OPU2 OH area. An OPU2 Tributary Slot occupies 25% of the OPU2 Payload area. It is a structure of 952 columns by 4 rows. The four OPU2 TSs are byte interleaved in the OPU2 payload area. It is important to note that the ODU1 frame repeats every four ODU2 frames! One implication of this is that the FAS bytes in the ODU1 frame could cause false locking of the ODU2 frame. This is not supposed to be a problem according to contributions to the ITU. However, the FAS bytes cannot be removed because there is no standard on where the ODU1 frame starts; the ODU1 FAS bytes are needed to frame the recovered ODU1 signal. The OTU1 OH (SM, GCC0, and RES) is set to all 0s, as illustrated in Figure 3-31.
[figure: the OPU2 payload (4 x 3808 bytes per frame) is byte interleaved over tributary slots 1-4, with the serving slot identified by MFAS bits 7-8 (00, 01, 10, 11)]
Figure 3-30. 0PU2 tributary slot allocation (from ITU-T Rec. G.709)
[figure: in the extended ODUj frame, row 1, columns 1-14 carry the FA OH and the all-0s OTU OH; rows 2-4, columns 1-14 carry the ODUj overhead; columns 15-3824 form the OPUj area]
Figure 3-31. Extended ODUj frame structure (from ITU-T Rec. G.709)
3.10.2.2 4 x ODU1 to ODU2 Justification Structure
The Justification Overhead (JOH), consisting of the Justification Control (JC) and Negative Justification Opportunity (NJO) signals of the four OPU2 TSs, is located in the overhead area, column 16 of rows 1 to 4. The JOH is assigned to the related tributary slots on a per-frame basis; the JOH for a given tributary slot is available once every four frames. A four-frame multiframe structure is used for this assignment, locked to bits 7 and 8 of the MFAS byte, as shown in Table 3-16 and Figure 3-32. The PJO1 and PJO2 bytes are in the ODU1 payload; Figure 3-32 shows how the bytes are distributed. It should be noted that there are two PJO bytes and only one NJO byte; this is because the timeslot provides more capacity than is needed.

Table 3-16. OPU2 Justification OH tributary slots (from ITU-T Rec. G.709)
MFAS bits 7 & 8    JOH TS
00                 1
01                 2
10                 3
11                 4
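In code, the slot selection is just a bit mask on the MFAS counter (a sketch; function names are ours). The analogous OPU3 assignment in Section 3.10.3.2 uses MFAS bits 5-8:

```python
def opu2_joh_ts(mfas: int) -> int:
    """Tributary slot (1..4) served by the JOH this frame (MFAS bits 7-8)."""
    return (mfas & 0b11) + 1

def opu3_joh_ts(mfas: int) -> int:
    """Tributary slot (1..16) served by the JOH this frame (MFAS bits 5-8)."""
    return (mfas & 0b1111) + 1
```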
[figure: the JC bytes occupy rows 1-3 and the NJO byte row 4 of column 16; the PJO1 and PJO2 bytes sit in the first payload columns of row 4; the four-frame MFAS cycle assigns each frame's justification bytes to tributary slots 1-4 in turn]
Figure 3-32. OPUk Multiplex Overhead
3.10.2.3 OPU2 Payload Structure Identifier (PSI)
Byte 0 is defined as the Payload Type and is equal to 0x20. Byte 1 is reserved. Bytes 2-17 are the "Multiplex Structure Identifier." Bytes 18-255 are reserved. A total of 239 bytes are reserved in the OPUk PSI for future international standardization; these bytes are located in PSI[1] and PSI[18] to PSI[255] of the OPUk overhead and are set to all zeros.
3.10.2.4 OPU2 Multiplex Structure Identifier (MSI)
The Multiplex Structure Identifier (MSI) overhead, which encodes the ODU multiplex structure in the OPU, is located in the mapping-specific area of the PSI signal (PSI[2]...PSI[17]). The MSI indicates the content of each tributary slot (TS) of an OPU2. The generic coding for each TS is shown in Figure 3-33; one byte is used for each TS.
• Bits 1 and 2 indicate the ODU type transported in the TS.
• Bits 3 to 8 indicate the tributary port of the ODU transported. This is of interest in the case of flexible assignment of ODUs to tributary slots (e.g., ODU2 into OPU3). In the case of fixed assignment, the tributary port number corresponds to the tributary slot number.
[figure: one MSI byte, PSI[i+1], per tributary slot i]
Bits 1-2, ODU Type: 00 = ODU1; 01 = ODU2; 10 = ODU3; 11 = RES
Bits 3-8, Tributary Port Number: 00 0000 = Tributary Port 1; 00 0001 = Tributary Port 2; 00 0010 = Tributary Port 3; 00 0011 = Tributary Port 4; ... 00 1111 = Tributary Port 16
Figure 3-33. Generic MSI coding
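The MSI byte layout of Figure 3-33 packs and unpacks with simple shifts (a sketch; the function names are ours):

```python
ODU_TYPE = {0b00: "ODU1", 0b01: "ODU2", 0b10: "ODU3", 0b11: "RES"}

def encode_msi_byte(odu_type_bits: int, tributary_port: int) -> int:
    """Bits 1-2: ODU type; bits 3-8: tributary port (port 1 -> 00 0000)."""
    assert 0 <= odu_type_bits <= 3 and 1 <= tributary_port <= 64
    return (odu_type_bits << 6) | (tributary_port - 1)

def decode_msi_byte(byte: int):
    """Return (odu_type_name, tributary_port) for one MSI byte."""
    return ODU_TYPE[(byte >> 6) & 0b11], (byte & 0b111111) + 1
```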
For the four OPU2 tributary slots four bytes of the PSI are used, as shown in Figure 3-34.
• The ODU type is fixed: ODU1.
• The tributary port # indicates the port number of the ODU1 that is being transported in this TS; the assignment of ports to tributary slots is fixed, and the port number equals the tributary slot number.
The remaining 12 bytes of the MSI field (PSI[6] to PSI[17]) are unused. They are set to 0 and ignored by the receiver.
PSI byte   ODU Type   Tributary Port   Tributary Slot
PSI[2]     00         00 0000          1
PSI[3]     00         00 0001          2
PSI[4]     00         00 0010          3
PSI[5]     00         00 0011          4

Figure 3-34. OPU2-MSI coding
3.10.2.5 Frequency Justification
The mapping of ODU1 signals (with up to ±20 ppm bit-rate tolerance) into the ODU2 signal is performed as an asynchronous mapping. The OPU2 signal for the multiplexed ODU1 structure is created from a locally generated clock, which is independent of the ODU1 client signals. The ODU1 signal is extended with Frame Alignment Overhead and an all-0s pattern in the OTU1 overhead field. The extended ODU1 signal is adapted to the locally generated ODU2 clock by means of an asynchronous mapping with a -1/0/+1/+2 positive/negative/zero (pnz) justification scheme. The asynchronous mapping process generates JC, NJO, PJO1, and PJO2 according to Table 3-17; the demapping process interprets them according to the same table. Majority vote (two out of three) is used to make the justification decision in the demapping process, to protect against an error in one of the three JC signals. The value contained in NJO, PJO1, and PJO2 when they are used as justification bytes is all-0s; the receiver is required to ignore the value contained in these bytes whenever they are used as justification bytes.
Note: based on the calculations for ODU1 to OPU2 mapping (see Section 3.10.1.1), there should never be a need for a double positive justification.

Table 3-17. JC, NJO, PJO1, and PJO2 generation and interpretation (from ITU-T Rec. G.709)
JC [7,8]   NJO                  PJO1                 PJO2                 Interpretation
00         justification byte   data byte            data byte            no justification (0)
01         data byte            data byte            data byte            negative justification (-1)
10         justification byte   justification byte   justification byte   double positive justification (+2)
11         justification byte   justification byte   data byte            positive justification (+1)
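A sketch of the demapper's decision logic (our naming): the action for each JC code follows Table 3-17, while the fallback to "no justification" when all three copies differ is our assumption, not something the table specifies.

```python
JC_ACTION = {0b00: 0, 0b01: -1, 0b10: +2, 0b11: +1}  # per Table 3-17

def justification_decision(jc1: int, jc2: int, jc3: int) -> int:
    """Demapper decision from the three JC[7,8] copies, majority (2 of 3)."""
    copies = [jc1, jc2, jc3]
    for candidate in copies:
        if copies.count(candidate) >= 2:
            return JC_ACTION[candidate]
    return JC_ACTION[0b00]  # assumed fallback: treat as no justification
```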
3.10.3 ODU1/ODU2 to ODU3 Multiplexing
3.10.3.1 ODU1/ODU2 to ODU3 Multiplexing Structure
The OPU3 is divided into a number of Tributary Slots (TSs), and these Tributary Slots are interleaved within the OPU3, as depicted in Figure 3-35. The bytes of an ODU1 or ODU2 input are mapped into one or four OPU3 Tributary Slots, respectively. The bytes of the Justification Overhead used to adapt the asynchronous ODU1 or ODU2 to the ODU3 clock rate are mapped into the OPU3 OH area. An OPU3 Tributary Slot occupies 6.25% of the OPU3 payload area. It is a structure of 238 columns by 4 rows. The sixteen OPU3 TSs are byte interleaved in the OPU3 Payload area. It is important to note that the ODU1 frame repeats every sixteen ODU3 frames! One implication of this is that the FAS bytes in the ODU1 frame could cause false locking of the ODU3 frame. This is not supposed to be a problem according to contributions to the ITU. However, the FAS bytes cannot be removed because there is no standard on where the ODU1 frame starts; the ODU1 FAS bytes are needed to frame the recovered ODU1 signal. The OTU1 OH (SM, GCC0, and RES) is set to all 0s, as illustrated in Figure 3-36.
[figure: the OPU3 payload (4 x 3808 bytes per frame) is byte interleaved over tributary slots 1-16, with the serving slot identified by MFAS bits 5-8 (0000 through 1111)]
Figure 3-35. OPU3 tributary slot allocation (from ITU-T Rec. G.709)
[figure: in the extended ODUj frame, row 1, columns 1-14 carry the FA OH and the all-0s OTU OH; rows 2-4, columns 1-14 carry the ODUj overhead; columns 15-3824 form the OPUj area (4 x 3810 bytes)]
Figure 3-36. Extended ODUj frame structure (from ITU-T Rec. G.709)
3.10.3.2 ODU1/ODU2 to ODU3 Justification Structure
The Justification Overhead (JOH), consisting of the Justification Control (JC) and Negative Justification Opportunity (NJO) signals of the 16 OPU3 TSs, is located in the overhead area, column 16 of rows 1 to 4. The JOH is assigned to the related tributary slots on a per-frame basis; the JOH for a given tributary slot is available once every 16 frames. A 16-frame multiframe structure is used for this assignment, locked to bits 5-8 of the MFAS byte, as shown in Table 3-18 and Figure 3-32. It should be noted that there are two PJO bytes and only one NJO byte; this is because the timeslot provides more capacity than is needed.

Table 3-18. OPU3 Justification OH tributary slots (from ITU-T Rec. G.709)
MFAS bits 5678   JOH TS      MFAS bits 5678   JOH TS
0000             1           1000             9
0001             2           1001             10
0010             3           1010             11
0011             4           1011             12
0100             5           1100             13
0101             6           1101             14
0110             7           1110             15
0111             8           1111             16

3.10.3.3 OPU3 Payload Structure Identifier (PSI)
Byte 0 is defined as the Payload Type and is equal to 0x20. Byte 1 is reserved. Bytes 2-17 are the "Multiplex Structure Identifier." Bytes 18-255 are reserved. A total of 239 bytes are reserved in the OPUk PSI for future international standardization. These bytes are located in PSI[1] and PSI[18] to PSI[255] of the OPUk overhead. These bytes are set to all zeros.
3.10.3.4 OPU3 Multiplex Structure Identifier (MSI)
The MSI overhead, which encodes the ODU multiplex structure in the OPU, is located in the mapping-specific area of the PSI signal (PSI[2]...PSI[17]). The MSI indicates the content of each tributary slot (TS) of an OPU3. The generic coding for each TS is shown in Figure 3-33; one byte is used for each TS.
• Bits 1 and 2 indicate the ODU type transported in the TS.
• Bits 3 to 8 indicate the tributary port of the ODU transported. This is of interest in the case of flexible assignment of ODUs to tributary slots (e.g., ODU2 into OPU3). In the case of fixed assignment, the tributary port number corresponds to the tributary slot number.
For the 16 OPU3 tributary slots, 16 bytes of the PSI are used, as shown in Figure 3-37.
• The ODU type can be ODU1 or ODU2.
• The tributary port # indicates the port number of the ODU1/ODU2 that is being transported in the respective TS. For ODU2 a flexible assignment of tributary ports to tributary slots is possible; for ODU1 this assignment is fixed, and the port number equals the slot number. ODU2 tributary ports are numbered 1 to 4.
3.10.3.5 Frequency Justification
The mapping of ODU1/ODU2 signals (with up to ±20 ppm bit-rate tolerance) into the ODU3 signal is performed as an asynchronous mapping. The OPU3 signal for the multiplexed ODU1/ODU2 structure is created from a locally generated clock, which is independent of the ODU1/ODU2 client signals. The ODU1/ODU2 signal is extended with Frame Alignment Overhead and an all-0s pattern in the OTUk overhead field. The extended ODU1/ODU2 signal is adapted to the locally generated ODU3 clock by means of an asynchronous mapping with a -1/0/+1/+2 positive/negative/zero (pnz) justification scheme. The asynchronous mapping process generates JC, NJO, PJO1, and PJO2 according to Table 3-19; the demapping process interprets them according to the same table. Majority vote (two out of three) is used to make the justification decision in the demapping process, to protect against an error in one of the three JC signals. The value contained in NJO, PJO1, and PJO2 when they are used as justification bytes is all-0s; the receiver is required to ignore the value contained in these bytes whenever they are used as justification bytes.
[figure: sixteen MSI bytes, PSI[2] through PSI[17], one per tributary slot 1 through 16; each carries the ODU Type in bits 1-2 and the Tributary Port Number in bits 3-8]
Figure 3-37. OPU3-MSI coding
Table 3-19. JC, NJO, PJO1 and PJO2 generation and interpretation (from ITU-T Rec. G.709)

JC [7,8]   NJO                  PJO1                 PJO2                 Interpretation
00         justification byte   data byte            data byte            no justification (0)
01         data byte            data byte            data byte            negative justification (-1)
10         justification byte   justification byte   justification byte   double positive justification (+2)
11         justification byte   justification byte   data byte            positive justification (+1)
3.10.4 Summary
The ITU-T standards under the OTN umbrella provide a transport platform optimized to carry high-capacity data pipes (greater than or equal to 2.5 Gb/s) with minimal usage of the overhead bytes and much more relaxed timing than SONET/SDH. The OTN frame structure defined in G.709 is a nimble transport frame that carries user payload efficiently and provides strong FEC capabilities. Moreover, it provides a comprehensive method of detecting and locating faults, along with relaying the status of such faults to the equipment in the forward direction (downstream) and backward direction (upstream). It was realized at the outset that an optical fiber duct containing numerous fiber strands, each able to carry hundreds of wavelengths, can generate thousands of alarm signals for individual channels if cut or damaged, so the network could easily be flooded with alarm status traffic. Precautions were taken to ensure that only minimal traffic relaying the necessary alarm signals would be generated if such a disaster occurred. Intelligent diagnostic capabilities for pinpointing failures, and possibly healing from them without human intervention in most cases, make OTN a viable transport technology of choice.
3.11. REFERENCES
[1] Alan McGuire and Paul Bonenfant, "Standards: The Blueprints for Optical Networking," IEEE Communications Magazine, February 1998.
[2] ITU-T Recommendation G.805, Generic functional architecture of transport networks, 2001.
[3] ITU-T Recommendation G.872, Architecture for the Optical Transport Network (OTN), 2001.
[4] ITU-T Recommendation G.709, Interfaces for the optical transport network (OTN), 2001.
[5] ITU-T Recommendation G.798, Characteristics of optical transport network hierarchy equipment functional blocks, 2002.
[6] ITU-T Recommendation G.975.1, Forward error correction for submarine systems.
[7] B. Sklar, Digital Communications: Fundamentals and Applications, 2nd edition, Prentice-Hall, 2001.
[8] Maarten Vissers, "Optical Transport Network and Optical Transport Module; Digital Wrapper," Beyond SONET/SDH conference, Paris, France, April 2001.
[9] ITU-T Recommendation G.707, Network node interface for the Synchronous Digital Hierarchy (SDH), 2003.
[10] ITU-T T.50, International Reference Alphabet — 7-bit coded character set for information exchange.
[11] ITU-T Recommendation G.7712/Y.1703, Architecture and specification of data communication network, 2003.
Chapter 4
MULTIPLEX STRUCTURES OF THE OPTICAL TRANSPORT NETWORK
Evolution: voice to data transport and voice over data transport
Huub van Helvoort and Mimi Dannhardt
Independent Consultants
4.1. INTRODUCTION
This chapter will describe the transported payload capabilities of the Optical Transport Network (OTN). Since the introduction of the OTN in 1990, the initial capacity of 155 Mbit/s, based on the transport capabilities of the existing Plesiochronous Digital Hierarchy (PDH) network, has grown to the current 40 Gbit/s. Up until about six years ago, the payload of the OTN was mainly used to transport voice traffic, and the multiplexing structure of the OTN was based on this traffic. At that time a consensus emerged among major market researchers indicating that, while voice traffic would continue to grow at a moderate rate, the transport of data traffic would dominate most networks by the years 2002-2005, requiring more bandwidth than was anticipated. Some of the emerging applications that require transport over Optical Transport Networks include
• Local/Metro/Wide Area Network (LAN/MAN/WAN) services (e.g. Ethernet)
• Storage Area Network (SAN) services (e.g. Fibre Channel, ESCON)
• High-speed Internet
• IP VPN
• Video distribution
As more and more applications emerge in the very near future, it is likely that they will have data-oriented characteristics similar to those mentioned above.
This chapter describes the adaptations made to the existing standards to facilitate this anticipated growth. To avoid confusion: where the text refers to SDH (Synchronous Digital Hierarchy), the terms SONET (Synchronous Optical NETwork) and OTN (Optical Transport Network) also apply.
4.2. THE SITUATION IN THE PREVIOUS CENTURY
The initial SDH transport structures were optimized for traditional TDM voice-type applications. In order to transport the new data-centric applications in an efficient manner, these SDH structures needed to be enhanced. Given the large installed base of SDH equipment, any enhancement needs to operate in a manner transparent to deployed networks. The revolutionary shift of voice to data transport needs an evolutionary change in the SDH standards. All information sent over the SDH network is placed into containers. Each container has its own management area, called overhead, and the area used for the transport of the information, or payload. One or more containers can then be mapped into the payload area of a particular SDH frame to be transported over an optical fiber, or multiplexed into a larger container. The containers are virtual in the sense that they are pointed to from the frame overhead but can be interchanged or moved without difficulty. They are also virtual because they are not permanently assigned but are only used during the time of a connection through the network for a specific client. The containers are referred to as VC-n, virtual container of multiplex order n, where n = 11, 12, 2, 3, 4.
4.2.1. SDH structure details
The structure of an SDH frame transported over the fiber also has an overhead area and a payload area. The overhead area is divided into a management area dedicated to the Multiplex Section overhead (MS-OH), i.e. the line between two multiplexers, and the Regenerator Section overhead (RS-OH), i.e. the section between a multiplexer and a regenerator or between two regenerators. A line can consist of one or more sections. Between the MS-OH and the RS-OH, an area is reserved for the pointers that will be used to locate the transported containers in the payload area. The payload area is also referred to as the Synchronous Payload Envelope (SPE). Figure 4-1 depicts the structure of an SDH frame. It is a matrix structure consisting of 9 rows and 270 columns containing octets. The frame period is 125 μs. The management overhead is located in the first 9 columns, followed by the payload area of 261 columns. The SONET equivalent is the OC3 structure. The shaded area is referred to as the Administrative Unit Group of the first order: AUG-1. An AUG-1 is capable of containing one AU-4 or three AU-3s. The pointer information of the AU-n is placed in the pointer area.
[figure: 9 rows x 270 columns; columns 1-9 carry the RS-OH (rows 1-3), the pointer area (row 4) and the MS-OH (rows 5-9); columns 10-270 form the payload area]
Figure 4-1. STM-1/0C3 structure
Because SONET is based on STS-1, the first-order structure is the STS-1, and its format is depicted in Figure 4-2. This structure, in the ITU-T, is referred to as STM-0. Here the overhead area is located in the first three columns and the payload area in the next 87 columns. An AU-3 will fit in the shaded area.
[figure: 9 rows x 90 columns; columns 1-3 carry the RS-OH (rows 1-3), the pointer (row 4) and the MS-OH (rows 5-9); columns 4-90 form the payload area]
Figure 4-2. STS-l/STM-0 structure
The definitions and allocation of the overhead octets are provided in the next subsections.
Before the SDH/SONET frame is transmitted, it is scrambled to prevent long sequences of 1s or 0s. A frame-synchronous scrambler of sequence length 127 is used, with a generating polynomial of 1 + x^6 + x^7. The RS-OH octets in the first row are not scrambled.
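The scrambler sequence can be generated with a 7-bit linear-feedback shift register (a sketch; the reset-to-all-ones state and the taps follow the polynomial 1 + x^6 + x^7 given above, and the function name is ours):

```python
def sdh_scrambler_bits(nbits):
    """Frame-synchronous scrambler: 7-bit LFSR, polynomial 1 + x^6 + x^7,
    reset to all ones at the start of each frame; sequence length 127."""
    state = 0b1111111
    bits = []
    for _ in range(nbits):
        out = (state >> 6) & 1                   # output tap (x^7)
        bits.append(out)
        fb = ((state >> 6) ^ (state >> 5)) & 1   # feedback: x^7 XOR x^6
        state = ((state << 1) | fb) & 0x7F
    return bits

# Each transmitted bit is XORed with the corresponding scrambler bit;
# the first-row RS-OH octets are left unscrambled as stated above.
```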
4.2.1.1. Overhead bytes
Figures 4-3 and 4-4 illustrate the basic SDH/SONET frame overhead area in detail. The STM-1 overhead is shown in Figure 4-3 and the OC1 overhead in Figure 4-4.

          col 1  col 2  col 3  col 4  col 5  col 6  col 7  col 8  col 9
RS-OH  1  A1     A1     A1     A2     A2     A2     J0     NU/Z0  NU
       2  B1     R      R      E1     R      R      F1     NU     NU
       3  D1     R      R      D2     R      R      R      D3     R
ptrs   4  H1     H1*    H1*    H2     H2*    H2*    H3     H3     H3
MS-OH  5  B2     B2     B2     K1     R      R      K2     R      R
       6  D4     R      R      D5     R      R      D6     R      R
       7  D7     R      R      D8     R      R      D9     R      R
       8  D10    R      R      D11    R      R      D12    R      R
       9  S1     Z1     Z1     Z2     Z2     M1     E2     NU     NU

Figure 4-3. STM-1/OC3 overhead byte allocation
          col 1  col 2  col 3
RS-OH  1  A1     A2     J0
       2  B1     E1     F1
       3  D1     D2     D3
ptrs   4  H1     H2     H3
MS-OH  5  B2     K1     K2
       6  D4     D5     D6
       7  D7     D8     D9
       8  D10    D11    D12
       9  S1     M1     E2

Figure 4-4. OC1/STM-0 overhead byte allocation
The Regenerator Section (RS) overhead bytes are as follows:
• A1, A2 — frame alignment octets, A1 = 11110110, A2 = 00101000
• J0 — RS trace, a 1-, 16-, or 64-byte information field that can be used to uniquely identify the SDH/SONET signal
• B1 — BIP-8, bit interleaved even parity calculated over the previous scrambled frame, used for RS error monitoring
• E1 — engineering order wire for access at the regenerators, 64 kbit/s
• F1 — user communication channel, 64 kbit/s
• D1...D3 — DCCR, the RS data communication channel, 192 kbit/s, used for Operation, Alarming and Maintenance (OA&M)
• NU — octets for national use (the octets located in the first row are not scrambled and their values shall be chosen carefully)
• R — reserved for future use
• Z0 — octet defined only in SONET for future growth (this octet is located in the first overhead row; it is not scrambled, and its value shall be chosen carefully)

The Multiplex Section (MS) overhead bytes are as follows:
• B2 — BIP-(Nx24), bit interleaved even parity calculated over the previous scrambled frame, used for MS error monitoring
• K1, K2 — APS, automatic protection switch channel + MS-RDI
• D4...D12 — DCCM, the MS data communication channel, 576 kbit/s, used for Operation, Alarming and Maintenance (OA&M)
• S1 — synchronization status
• M1 — remote error indication (REI), reflects the B2 error count to the far end for performance monitoring
• E2 — engineering order wire for access at the multiplexer, 64 kbit/s
• NU — octets for national use
• R — reserved for future use
• Z1/Z2 — octets defined only in SONET for future growth

4.2.1.2. Pointers
• H1, H2 — these two octets form a word that contains the AU-n or TU-3 pointer information, i.e. a four-bit New Data Flag (NDF) to indicate that a new pointer value is present, two S bits (ignored in pointer processing), and a ten-bit pointer. The pointer is able to accommodate differences not only in the phases of the container and the SDH frame but also in the frame rates. Figure 4-5 shows the allocation of the pointer bytes.
[Figure 4-5. Allocation of pointer bytes in an STM-1 frame: in STM-1 columns 1-9, the AU-4 pointer occupies H1 Y Y H2 Z Z H3 H3 H3, while three AU-3 pointers occupy H11 H12 H13 H21 H22 H23 H31 H32 H33. Y = "1001xx11" (value x is ignored), Z = "11111111".]
[Figure 4-6. AU-4/AU-3/TU-3 pointer function: frames 1-2 show normal pointer operation (H1H2 = PTR), frames 3-4 show negative justification (H1H2 = PTR-1, H3 carries payload in the negative justification opportunity), and frames 5-6 show positive justification (H1H2 = PTR+1, the byte after H3 is stuffed in the positive justification opportunity).]
The pointer value designates the offset from the first byte after the H3 octet to the first octet of the transported container; see the first two frames in Figure 4-6. H3 — pointer action octet defined for frequency justification purposes. Its use depends on the H1/H2 pointer information. If it indicates a negative justification, the H3 byte is used to transport payload (frames 3 and 4 in Figure 4-6). If it indicates a positive justification, the byte immediately following the H3 octet is used for stuffing the payload (frames 5 and 6 in Figure 4-6).
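The H1/H2 word described above can be unpacked mechanically. The following Python sketch (illustrative only; the function names are ours, not from the Recommendation) splits the 16-bit word into the NDF, the two S bits, and the ten-bit pointer, and recognizes the concatenation indication built from the Y/Z values of Figure 4-5:

```python
def decode_au_pointer(h1: int, h2: int):
    """Split the 16-bit H1/H2 word: 4-bit NDF, 2 S bits, 10-bit pointer."""
    word = (h1 << 8) | h2
    ndf = (word >> 12) & 0xF       # "0110" = normal, "1001" = new pointer value
    s_bits = (word >> 10) & 0x3    # ignored in pointer processing
    pointer = word & 0x3FF         # offset from the byte after H3
    return ndf, s_bits, pointer

def is_concatenation_indication(h1: int, h2: int) -> bool:
    """CI = NDF enabled ("1001") with the ten pointer bits all ones."""
    ndf, _, pointer = decode_au_pointer(h1, h2)
    return ndf == 0b1001 and pointer == 0x3FF
```

This covers only the static decoding; the full pointer interpretation rules, including justification event handling and majority voting, are specified in ITU-T G.707.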
4.2.1.3. VC-n Structure
Figure 4-7 depicts the structure of an SDH VC-n container. It is a matrix structure consisting of 9 rows and m columns containing octets. A VC-4 has 261 columns and a VC-3 has 85 columns. The frame period is 125 µs. The first column contains the path overhead. The next (m-1) columns are designated for the payload. A VC-4 plus its associated pointer is referred to as an AU-4.

[Figure 4-7. VC-4/VC-3 structure: column 1 holds the path overhead bytes J1, B3, C2, G1, F2, H4, F3, K3, N1; columns 2...m form the payload area (m = 261 for VC-4, m = 85 for VC-3).]
Figure 4-8 shows the structure of a SONET STS-1 SPE. It is a matrix structure consisting of 9 rows and 87 columns containing octets. The frame period is 125 µs. The first column contains the path overhead. The next 86 columns contain the payload, including 2 columns of fixed stuff. An STS-1 SPE (or a VC-3 with the additional fixed stuff bytes) plus its associated pointer is referred to as an AU-3.
[Figure 4-8. STS-1 SPE/VC-3 structure: column 1 holds the path overhead bytes J1, B3, C2, G1, F2, H4, Z3, Z4, N1; columns 30 and 59 contain fixed stuff; the remaining columns up to column 87 form the payload area.]
The Virtual Container overhead bytes are as follows:
• J1 — trail trace, a 16- or 64-byte information field that can be used to uniquely identify the SDH/SONET signal
• B3 — BIP-8 bit interleaved parity for path error monitoring
• C2 — signal label, indicating the use of the payload area
• G1 — path status: path RDI, path REI
• F2 — user communication channel, 64 kbit/s
• H4 — position and sequence indicator
• F3 — user communication channel, 64 kbit/s
• K3 — APS (b1...b4) + data channel (b7, b8)
• N1 — network operator octet, used in tandem connection monitoring
• Z3/Z4 — octets defined only in SONET for future growth

4.2.1.4. Substructuring
The payload area of the VC-n structures can again be used to transport containers of smaller structure sizes:
• The VC-4 payload area can contain three Tributary Unit Groups of order 3 (TUG-3). A TUG-3 structure is 9 rows by 86 columns; they are byte interleaved in columns 4...261 of the VC-4; columns 2 and 3 of the VC-4 contain fixed stuff.
• A TUG-3 (STS-1) can contain a TU-3, i.e. a VC-3 (STS-1 SPE) and its associated pointer, or it can contain seven TUG-2s.
• A TUG-2 can contain a TU-2 (VT6), i.e. a VC-2 (VT6 SPE) and its associated pointer, or it can contain two VT3s, three TU-12s, or four TU-11s.
• A TU-12 (VT2) consists of a VC-12 (VT2 SPE) and its associated pointer.
• A TU-11 (VT1.5) consists of a VC-11 (VT1.5 SPE) and its associated pointer.
• A VT3 consists of a VT3 SPE and its associated pointer.
• The VC-3 payload area can contain seven TUG-2s. A TUG-2 structure is 9 rows by 12 columns; they are byte interleaved in columns 2...85 of the VC-3.

4.3. THE EVOLUTION OF THE BANDWIDTH
To be able to satisfy the demand for transport of more information on the same optical link, higher-order multiplexers were defined. These multiplexers increased the bandwidth in steps of 4, i.e. the STM-N (N = 4, 16, 64, 256) were defined, and similarly the OC-n (n = 3N). For super-rate signals requiring the full payload area of these new multiplexers, contiguous concatenated (CCAT) containers are defined that provide a contiguous payload area that also increases in steps of 4. These are referred to as VC-4-Xc, with X = 4, 16, 64, and 256. The STM-N structure consists of 9 rows by (N × 270) columns. The overhead is located in the first (N × 9) columns, followed by the payload area of (N × 261) columns. The shaded area in Figure 4-9 represents an AUG-N and it has a fixed phase relation with the STM-N frame.

[Figure 4-9. STM-N/OC-n structure, N = 1, 4, 16, 64, 256; n = 3N: the first (N × 9) columns hold the RS-OH, pointer, and MS-OH areas; the remaining (N × 261) columns hold the AUG-N.]
• An AUG-(4M) can contain one VC-4-(4M)c or four AUG-Ms (M = 1, 4, 16, 64).
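The frame geometry above fixes the familiar line rates. A small sketch (the helper names are ours) derives the STM-N gross and payload-area rates from the 9-row, N × 270-column structure and the 8000 frame/s (125 µs) repetition:

```python
FRAMES_PER_SECOND = 8000          # one frame every 125 µs

def stm_gross_rate_mbit(n: int) -> float:
    """Gross STM-N rate: 9 rows x (N x 270) columns x 8 bits per frame."""
    return 9 * (n * 270) * 8 * FRAMES_PER_SECOND / 1e6

def stm_payload_area_mbit(n: int) -> float:
    """Payload-area rate: the (N x 261)-column area after the overhead."""
    return 9 * (n * 261) * 8 * FRAMES_PER_SECOND / 1e6
```

For N = 1 these give 155.52 and 150.336 Mbit/s respectively; the 149.760 Mbit/s effective VC-4 payload capacity quoted later is one POH column less.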
The structure of the STM-N overhead has also been extended to provide room for all the extra pointers required to allow all possible combinations of transported containers. The B2 size has been increased as well to match the required accuracy of performance monitoring. The STM-N overhead structure is shown in Figure 4-10.

[Figure 4-10. STM-N/OC-n OH, N = 1, 4, 16, 64, 256; n = 3N: rows 1-3 carry the RS-OH (A1, A2, J0/NU/Z0, B1, E1, F1, D1-D3), row 4 carries the pointer area (H1, H2, H3), and rows 5-9 carry the MS-OH (B2, K1, K2, D4-D12, S1, Z1, Z2, M1, E2, NU).]
Note — for the highest rates (N = 64, 256), this overhead structure is slightly modified, e.g. some of the Reserved octets are used for Forward Error Correction (FEC) or an additional communications channel DDDMX (9216 kbit/s). The special H1/H2 value Y/Z = "1001xx11 11111111" mentioned in Figure 4-5 is in fact the contiguous concatenation indicator and shall not be used as a pointer value. The structure of the VC-4-Xc is shown in Figure 4-11. The first column contains the same path overhead as a VC-4. The next (X-1) columns contain fixed stuff, and the concatenated payload area is X times that of a VC-4.

[Figure 4-11. VC-4-Xc structure: column 1 holds the path overhead (J1, B3, C2, G1, F2, H4, F3, K3, N1), columns 2...X hold fixed stuff, and the (X × 260)-column payload area follows.]
Figure 4-12 provides an overview of the SDH multiplex structures that are currently defined.
[Figure 4-12. SDH multiplexing structures: C-n containers (C-11, C-12, C-2, C-3, C-4, C-4-4c...C-4-256c) are mapped into VC-n's, aligned into TU-n's/AU-n's via pointers, and multiplexed through TUG-2, TUG-3, and AUG-1...AUG-256 into STM-1...STM-256; the AU-4-Xc/VC-4-Xc/C-4-Xc branches carry the contiguous concatenations.]
The initial SONET multiplex structure was based on STS-1 (AU-3), but to provide interoperability with SDH, the AU-4 based structure is used for the higher-order multiplex, as shown in Figure 4-13. Emerging client applications, with their specific payload sizes requiring transport over an SDH/SONET network with its own specific structure, are faced with the problem of a relatively limited choice of bandwidth, i.e. concatenation levels. In addition to limited choice, there is also the problem of transport bandwidth inefficiency because the contiguously concatenated containers provide more bandwidth but not necessarily the "right-size" bandwidth.
[Figure 4-13. SONET multiplexing structures: the SONET counterpart of Figure 4-12, with OC-1...OC-768, STS-1 and STS-3c...STS-768c SPEs, VT1.5/VT2/VT3/VT6 tributaries, and both AU-3 and AU-4 based higher-order multiplexes.]
4.4. NEW CLIENTS

The original tributary bit rates chosen for SDH were intended for voice services. These rates have a coarse granularity, require duplicate network resources for protection, and are not a good match to LAN, MAN, WAN, or SAN bandwidths. Examples of currently supported "traditional" SDH/SONET bit rates are shown in Table 4-1.

SDH        Rate          SONET          Rate
E1         2 Mbit/s      DS1            1.5 Mbit/s
E3         34 Mbit/s     DS3            45 Mbit/s
VC-4       155 Mbit/s    STS-3c SPE     155 Mbit/s
VC-4-4c    622 Mbit/s    STS-12c SPE    622 Mbit/s
VC-4-16c   2.4 Gbit/s    STS-48c SPE    2.4 Gbit/s
VC-4-64c   10 Gbit/s     STS-192c SPE   10 Gbit/s

Table 4-1. Traditional SDH/SONET bit rates
Bit rates for LAN/MAN/WAN services are typically 10 Mbit/s, but 100 Mbit/s and even 1 Gbit/s are becoming more and more popular. Other services, e.g. SAN, may vary from a few Mbit/s to several hundreds of Mbit/s. For the transport of these data services via an SDH transport network, there is no match in the bandwidth granularity. In addition, operators like to be able to sell their bandwidth in small chunks. Standards organizations have taken several successful steps to resolve this issue.
4.5. VIRTUAL CONCATENATION

The enhancement of SDH to support virtual concatenation (VCAT) provides the necessary payload sizes to enable the transport of the emerging services most efficiently. First proposed in November 1999, virtual concatenation is now present in all transport standards, i.e. the ITU-T Recommendations G.707 [1] and G.783 [2] for SDH, Recommendations G.709 [3] and G.798 [4] for OTN, the ETSI standard EN 300 417-9-1, and the ANSI standard T1.105 [6]. Virtual concatenation provides an efficiency of 95-100% by grouping a number (X) of Virtual Containers (VC-n) and using the combined payload provided by the virtually concatenated container (VC-n-Xv) to match the required bandwidth, featuring:
• No requirements on existing SDH nodes that transit VC-n's that are part of a VC-n-Xv Virtual Concatenation Group (VCG). Only the termination points of a connection are required to be compatible with virtual concatenation.
• Compensation for the differential delays caused by the difference in optical path length, because each VC-n in the VCG does not have to follow the same physical path through the network, i.e. no routing constraints for operators. This diverse routing capability provides a better network resource utilization.
• Identification of the individual members of a VCG in order to enable the reconstruction of the original payload at the receiving side. The payload is distributed over the individual members of the VCG at the sending side.
4.5.1. Differential Delay

There are several causes for the appearance of differential delay in a network:
• A geographically large ring with VC-n's from the same VCG routed around the ring in different directions (see Figure 4-14, member p and member q), caused by the availability of the required VC-n's in each part of the ring. The experienced delay is caused by the physical length of the fiber, i.e. the propagation delay of 5 µs per km, and the transfer delay of each individual network element, i.e. 1-30 µs per NE. Depending on the network size, the differential delay, i.e. the difference in time between the fastest and slowest member in a VCG, can be several milliseconds. The standards allow a maximum differential delay of 256 ms.

[Figure 4-14. Example of a ring with diverse routing: member p follows the working path while member q is routed around the other side of the ring.]
• Networks using path-protected VC-n's to meet the required service availability. Many installed networks with path protection do not support locking of the path protection switch across a group of VC-n's. If a fault occurs that impacts only one of several virtually concatenated VC-n's, e.g. an equipment failure, only the failed VC-n will be switched to a protection path, and it will therefore arrive at the receiving end with a different delay (see Figure 4-14, member p). Again, the delay is due mainly to fiber propagation delay.

To be able to detect and to compensate for the differential delay experienced by the members of a VCG, the X individual virtual containers will be sent with an identical label value. A counter provides the values of this label. At the receive end of the path, the received containers are stored in a buffer so as to be able to realign the received bytes from the individual VC-n's using the label value. The size of the buffer will determine the maximum allowable differential delay. The standards require a minimum delay of 125 µs. In the standards, this label is referred to as the Multi-Frame Indicator (MFI). The methodology that can be used to measure and compensate the differential delay experienced is explained in Section 4.8 of this chapter.
4.5.2. Payload Distribution and Reconstruction
In order to be able to reconstruct the original payload that was distributed over the X individual members of a VCG, the order of distribution needs to be known at the receiving side.

[Figure 4-15. Contiguous to virtual mapping: the X × m columns of a C-n-Xc payload are distributed byte by byte over the X VC-n's (VC-n#1...VC-n#X) of a VC-n-Xv, each with a 125 µs frame period.]
In the VCAT standard, each member of a particular VCG is assigned a unique Sequence Number (SQ). These sequence numbers are consecutive, starting at 0 and ending at X-1. The information stream to be transported is divided into bytes, and each consecutive byte is placed in the identical byte position of the payload area of the consecutive VC-n's of the VCG using their sequence number (see Figure 4-15). At the receiving end of the path through the network, the bytes are recovered from the payload area of the individual VC-n's and are output consecutively using the sequence number to reconstruct the original signal. The effective payload transport capacity of the initial SDH/SONET multiplexes is shown in Table 4-2.
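The byte-interleaved distribution and reconstruction just described can be sketched in a few lines of Python. This is a toy model of the mechanism, not of any device; it assumes the client block length is a multiple of X and that the members arrive delay-aligned and in SQ order:

```python
def distribute(client: bytes, x: int) -> list[bytes]:
    """Round-robin the client bytes over X members, SQ = 0 .. X-1."""
    assert len(client) % x == 0, "toy model: whole interleave groups only"
    return [client[sq::x] for sq in range(x)]

def reconstruct(members: list[bytes]) -> bytes:
    """Inverse operation at the sink: re-interleave byte by byte."""
    out = bytearray()
    for column in zip(*members):   # one byte from each member per step
        out.extend(column)
    return bytes(out)
```

In a real implementation the alignment step (Section 4.8) must run first, since the members generally arrive with different delays.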
SDH        SONET         Payload Capacity
VC-11      VT1.5 SPE     1.600 Mbit/s
VC-12      VT2 SPE       2.176 Mbit/s
VC-2       VT6 SPE       6.784 Mbit/s
VC-3       STS-1 SPE     49.536 Mbit/s
VC-4       STS-3c SPE    149.760 Mbit/s
VC-4-4c    STS-12c SPE   599.040 Mbit/s
VC-4-16c   STS-48c SPE   2.396160 Gbit/s

Table 4-2. Effective payload capacities
After the introduction of virtual concatenation, the following additional payload bandwidth sizes are available:

SDH             SONET            From         To             In steps of
VC-11 (1-64)    VT1.5 (1-64)     1.6 Mbit/s   102.4 Mbit/s   1.6 Mbit/s
VC-12 (1-64)    VT2 (1-64)       2.2 Mbit/s   139.3 Mbit/s   2.2 Mbit/s
VC-3 (1-256)    STS-1 (1-256)    49 Mbit/s    12.7 Gbit/s    49 Mbit/s
VC-4 (1-256)    STS-3c (1-256)   150 Mbit/s   38.3 Gbit/s    150 Mbit/s

Table 4-3. Link sizes provided by virtual concatenation
Table 4-4 shows some examples of the efficiencies that are achievable in the SDH/SONET network. From Table 4-4, it is readily seen that at the 10 Mbit/s, 25 Mbit/s, 100 Mbit/s, and 1 Gbit/s rates there is no non-concatenated group that even closely matches the bit rates. Only with virtual concatenation of SDH/SONET bit rates can efficiencies better than 90% be achieved. The 10 Mbit/s Ethernet could be carried in a VC-3, but this yields an efficiency of just 20%. Similarly, the 100 Mbit/s rate could be carried in a VC-4, yielding a miserly 66%. However, now that virtually concatenated VC-11, VC-12, VC-2, VC-3, and VC-4 are available, these services can be carried with almost 100% efficiency.
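The efficiency figures quoted above follow directly from the payload capacities in Table 4-2. A hedged sketch (the helper names are ours) computes the number of members X needed for a given client rate and the resulting mapping efficiency:

```python
import math

# Effective payload capacities from Table 4-2, in kbit/s
CAPACITY_KBIT = {"VC-11": 1600, "VC-12": 2176, "VC-2": 6784,
                 "VC-3": 49536, "VC-4": 149760}

def members_needed(client_kbit: int, container: str) -> int:
    """Smallest X such that X members of `container` carry the client."""
    return math.ceil(client_kbit / CAPACITY_KBIT[container])

def efficiency(client_kbit: int, container: str) -> float:
    """Fraction of the VCG payload actually used by the client."""
    x = members_needed(client_kbit, container)
    return client_kbit / (x * CAPACITY_KBIT[container])
```

For 10 Mbit/s Ethernet this yields VC-12-5v at roughly 92% efficiency, and for 1 Gbit/s Gigabit Ethernet it yields the VC-4-7v used as an example later in this chapter.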
[Table 4-4. SDH/SONET virtual concatenation efficiencies]
It is worth noting that if even a single VC-12 of an STM-1 is used in an access situation, then a 100 Mbit/s data service cannot be carried unless it is transported in a VC-3-2v. Moreover, with the rapid expansion of data carried over SDH, the need for virtual concatenation at all VC-n levels has become increasingly desirable to operators whose transport equipment cannot handle contiguous concatenation. This is especially true for interworking between global SDH networks and American SONET networks.
4.5.3. Additional Benefits
The main objective of virtual concatenation is to flexibly provide multiple right-sized channels over an SDH ring or network. Virtual concatenation uses the SDH VC-n payload area directly and therefore does not have the inefficiency of mapping into an asynchronous payload first. In addition, since VCAT is a byte-level inverse multiplexing technique, it has the characteristics of a right-sized bandwidth, with an improved granularity, a low delay, low jitter, efficient reuse of the protection bandwidth, and a high-efficiency payload mapping.

Virtual concatenation is not restricted to the situation where all the individual VC-n's are transported in a single multiplex section (i.e. within a single SDH signal structure). In fact, the real potential flexibility offered by virtual concatenation occurs when the individual VC-n's forming the logical group are diversely routed over a number of different SDH signals. The diverse routing capability enables the transport of client signals in situations where a single link does not have enough resources to transport the client signal as a contiguous payload.

In addition, virtual concatenation provides the network operator with the ability to implement channels in an SDH network that are more appropriate for the new, increasingly router-based applications. The advantages of these channels are bandwidth granularity, right-sized capacity, efficient mapping into VC-n, traffic scalability, and channelized high-capacity SDH interfaces.

Finally, virtually concatenated payload transport is transparent to intermediate SDH Network Elements (NEs) on the path between the two ends of a channel. Therefore it can be cost-effectively deployed in an existing SDH network without the need to upgrade all the NEs.
4.5.4. Restrictions
Even though there are many benefits to using VCAT, there are also some restrictions. The size of the transported bandwidth of a VCG is fixed and, if one or more of the virtual containers fail, the full payload is discarded. Once the operator has provisioned the size of a VCG, it cannot be changed without interrupting the carried signal. Data transport can have a variable requirement for bandwidth depending on the time of day or the day of the week. Both these issues are addressed by the extension of the virtual concatenation standard, known as the Link Capacity Adjustment Scheme (LCAS).
4.5.5. VCAT Details
A distinction has to be made between VC-n-Xv (n = 3, 4) and VC-m-Xv (m = 11, 12, 2). VC-n-Xv uses the H4 byte for the Path Overhead (POH), and VC-m-Xv uses bit 2 of the K4/Z7 byte for the POH. To reserve enough room for future expansion, both H4 and K4/Z7 comprise a multiframe. For the VCAT POH, an MFI field and an SQ field are allocated in this multiframe. Figure 4-16 shows the higher-order VCAT multiframe using the H4 byte. The higher-order VCAT multiframe uses the MFI-1 in H4 bits [5...8] for alignment. Figure 4-17 shows the lower-order VCAT multiframe utilizing the K4/Z7 byte bit 2. The lower-order VCAT multiframe uses the Multi-Frame Alignment Signal (MFAS) in K4/Z7 byte bit 1 for alignment.
[Figure 4-16. Higher-order VCAT overhead in the H4 byte: bits 5-8 of every H4 byte carry MFI-1, which counts the 16-frame multiframe (0-15). Within that multiframe, bits 1-4 carry the MFI-2 MSB in frame 0 and the MFI-2 LSB in frame 1, the SQ MSB in frame 14 and the SQ LSB in frame 15; frames 2-13 are reserved ("0000").]
[Figure 4-17. Lower-order VCAT overhead in the K4/Z7 byte bit 2: the 32-bit multiframe is aligned via the multiframe alignment signal carried in bit 1 (which also carries the extended signal label); in bit 2, multiframe bits 1-5 carry the MFI, bits 6-11 carry the SQ, and bits 12-32 are reserved.]
4.6. LINK CAPACITY ADJUSTMENT SCHEME (LCAS)
LCAS provides a mechanism for a hitless increase or decrease of the payload area of a VCG that is transported through an SDH network. In addition, the scheme will automatically decrease the payload capacity if a member experiences a failure in the network, and will increase the payload capacity when the network fault is repaired. The scheme is applicable to every member of the virtual concatenation group. The LCAS standard, i.e. ITU-T recommendation G.7042/Y.1305 [5], defines the required states at the source side and the sink side of the VCG path, as well as the control information exchanged between the source side and the sink side of the VCG path, to enable the flexible and hitless resizing of the virtually concatenated signal. The characteristic feature of VCAT links using LCAS is the capability to reduce the transported payload bandwidth in the event of a path failure. This is a good match for data traffic with a mixture of priority levels. The reason for this is that the loss of bandwidth will affect the lower-priority traffic first and should allow the higher-priority traffic to continue passing over the link. The change in payload bandwidth takes place within a few milliseconds, depending on the physical distance between the two ends of the link.
4.6.1. Link Capacity Increase
To increase the available virtually concatenated payload bandwidth, an additional path has to be set up through the network via the TMN. Once this path has been established, the VC-n can be added to the virtually concatenated signal.
4.6.2. Link Capacity Decrease (Planned)
In a manner similar to the case of an increase in virtually concatenated bandwidth, decreasing the available virtually concatenated bandwidth requires the VC-n to be deleted from the virtually concatenated signal. Once the VC-n has been taken off the virtually concatenated signal, the path through the network can be deleted via the TMN. Note that in both cases of increasing or decreasing the virtually concatenated signal, a mechanism is required for the source node and sink node to notify each other about a request for path size changes and the status of the constituent signals. This has been accomplished by using a control field embedded in the overhead (OH) that is allocated for the implementation of virtual concatenation.
In addition to the above signaling mechanism, the requirement of "hitless" path resizing (increase/decrease) created a further need for the development of a synchronization protocol between the source and sink nodes.
4.6.3. Temporary Link Capacity Decrease
A temporary link capacity decrease can also occur if one or more VC-n's belonging to a virtually concatenated signal fail. This failure is reported to the source node, which, upon reception and validation of this failure, will proceed by not using the payload area of the failed VC-n for the transport of user data. Until the signal failure clears, the available bandwidth for the user is decreased by the size of the payload area of the failed signal. When the failure clears, the sink node notifies the source node, and the recovered VC-n will hitlessly be added to the virtually concatenated signal. First proposed in June 2000, the generic definition of LCAS is now in the new ITU-T recommendation G.7042/Y.1305 [5]. The actual information fields used to convey the control information through the transport network are defined in the respective Recommendations, namely G.707 [1] and G.783 [2] for SDH and G.709 [3] and G.798 [4] for OTN. ETSI and ANSI refer to ITU-T.
4.6.4. LCAS Details
Because the LCAS process is an extension of VCAT, it reuses the VCAT path overhead (light shading in Figures 4-18 and 4-19), i.e. the MFI and SQ numbering, and uses reserved bytes and bits for the additional LCAS POH (dark shading in Figures 4-18 and 4-19). The VCAT multiframe is referred to as a control packet because it contains the information needed by the LCAS protocol to control the use of the payload of each member in the VCG. The additional LCAS POH consists of the following:
• A four-bit CTRL field, used to convey the operational state, i.e. IDLE, ADD, NORM/EOS (normal operation with End Of Sequence indicator), or DNU (Do Not Use), of the member from the transmit side to the receive side. The state FIXED indicates that the member does not utilize the LCAS protocol.
• An eight-bit MST field, to report the status of each member at the receive side back to the transmit side.
• The RS-Ack bit, to acknowledge that the receive side has detected a change in the sequence numbering of the VCG.
• The GID bit, which can be utilized to verify the connectivity of the VCG through the network.
• The CRC bits (eight bits in H4, three bits in K4/Z7), calculated over the total control packet, used for immediate validation of the control packet.

Figure 4-18 shows the allocation of the higher-order LCAS POH in the control packet utilizing the H4 byte multiframe.

[Figure 4-18. Higher-order VCAT + LCAS overhead in the H4 byte. Light shading: reuse of the VCAT path overhead (MFI-1 in bits 5-8, MFI-2 in frames 0-1, SQ in frames 14-15). Dark shading: use of reserved bytes and bits for the additional LCAS POH (CTRL, GID, CRC bits C1-C8, MST bits M1-M8, and RS-Ack in the intervening frames).]
Figure 4-19 shows the allocation of the lower-order LCAS POH in the control packet utilizing the K4/Z7 byte bit 2 multiframe.
[Figure 4-19. Lower-order VCAT + LCAS overhead in the K4/Z7 byte bit 2. Light shading: reuse of the VCAT path overhead (MFI in multiframe bits 1-5, SQ in bits 6-11, aligned by the MFAS in bit 1). Dark shading: additional LCAS POH (CTRL CT1-CT4 in bits 12-15, GID in bit 16, RS-Ack in bit 21, MST M1-M8 in bits 22-29, CRC C1-C3 in bits 30-32).]
4.7. ADVANTAGES OF USING VCAT, LCAS, AND GFP
Both GFP mapping modes, GFP-F and GFP-T, may use the virtual concatenation techniques described above. Virtual concatenation allows matching of the transport bandwidth as closely as possible to the bandwidth required by the client signal. The actual bandwidth required by the client signal may be a fraction of the standard LAN bit rate and may not be constant. LCAS provides the flexibility of changing on demand the size of the transported payload. LCAS also provides the capability to adjust the required protection bandwidth to the availability requirements in steps of (1/X)th of the working bandwidth. One of the applications that may use GFP mapping is IP. One of the technologies that uses IP as a means of transport is Voice over IP (VoIP).
4.8. IMPLEMENTERS GUIDE FOR VCAT AND LCAS
The VCAT functionality can be implemented in part by reusing existing devices: at the network or server-layer side, devices that provide the individual VC-n termination points and access to the VC-n payload signal and the H4 or K4/Z7 overhead bytes; at the customer or client-layer side, devices that provide access to the payload signal of the client signal to be transported by the VCG. The VCAT implementation is then required for the distribution and reconstruction of the transported payload and the VCAT-specific overhead. Part of the reconstruction of the transported payload is the compensation of the differential delay. For the actual delay buffer, commercially available memory devices can be used, thereby providing the possibility of matching the memory size with the required or acceptable maximum differential delay. The LCAS protocol is described in ITU-T recommendation G.7042 [5] by using state machines specified in SDL diagrams. SDL is the Specification and Description Language described in ITU-T recommendation Z.100. While the individual state machines for each member are not complex, the interworking may be complicated.
4.8.1. Detection of Differential Delay
The differential delay can be detected by comparing the value of the MFI fields in the VCAT overhead among all the members within the group. For VC-4-Xv and VC-3-Xv, the MFI is the combination of the MFI-2 field and the MFI-1 field into a 12-bit number. The differential delay is calculated (in frames) by subtracting the MFI of one member from another member using 2's complement math. The result
Multiplex Structures of the Optical Transport Network
145
is interpreted as a 2's complement number. When the process of differential delay detection is initiated, there is no a priori knowledge of the delays, and thus the result of this calculation may be either positive or negative. A positive result indicates that member 1 leads (has less delay than) member 2 while a negative result indicates that member 1 lags (has more delay than) member 2. With a 12-bit number and a frame rate of 125 |as, the maximum differential delay could be 512 ms. However, due to aliasing, this value cannot be used. For example, an actual differential delay of 384 ms between two members will be calculated as a difference of -128 ms using the 2's complement math. Thus the leading delay of 384 ms aliases to a lagging delay of 128 ms. To avoid aliasing, the maximum differential delay is defined to be less than half the maximum detection range. For VC-11/12/2, the MFI field is only 5 bits, but with a frame rate of 16 ms, the maximum detectable differential delay is also 512 ms. While the standard allows for a maximum of 256 ms of differential delay to be detected, most implementations will not support the maximum. With a fiber propagation delay of 5 ms per 1000 km, a signal sent around the earth (40,000 km) will experience a delay of 200 ms. Supporting a differential delay less than the maximum will also decrease the probability of aliasing. By using the MFI to determine the differential delay, the accuracy of the delay is related to the frame rate, e.g. 500 |us for VC-12 VCAT. The accuracy of the calculation can be improved when the byte number within the frame is tracked as well. Now the differential delay can be determined down to the byte level if desired, e.g. 3.6 |as for VC-12 VCAT. Another aspect of differential delay detection is to determine whether the differential delay exceeds the available buffer space. 
If the actual differential delay goes beyond the implemented buffer range, this error condition must be detected and reported, since the data will be corrupted if the delay buffers cannot be aligned properly.
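For illustration, the 2's complement MFI subtraction described above can be sketched in Python as follows (the function and parameter names are illustrative, not part of the standard):

```python
def differential_delay_frames(mfi_a: int, mfi_b: int, bits: int = 12) -> int:
    """Differential delay in frames between two members, from their MFIs.

    The subtraction is done modulo 2**bits and the result is interpreted
    as a 2's complement number: positive means member A leads (less delay),
    negative means member A lags (more delay).
    """
    mod = 1 << bits
    diff = (mfi_a - mfi_b) % mod
    # fold into the signed range [-2**(bits-1), 2**(bits-1) - 1]
    return diff - mod if diff >= mod // 2 else diff
```

With the 125 µs frame period of VC-3/VC-4 VCAT, the delay in milliseconds is the result times 0.125. The aliasing example from the text is reproduced: a lead of 384 ms (3072 frames) comes out as -1024 frames, i.e. a lag of 128 ms.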
4.8.2. Compensation of Differential Delay
Realignment of the received VCs can only be accomplished by buffering the incoming data from the time the least-delayed VC arrives until the most-delayed VC arrives. The required buffer size, i.e. the physical amount of memory, depends on the bit rate of the payload data, the maximum number of members in the VCG, the acceptable differential delay, and the buffer management scheme. The implemented maximum acceptable differential delay can be negotiated between the vendor and the operator and may depend on the network topology and/or the cost of the required buffer memory. The acceptable differential delay may be further limited by provisioning based on operator policy. Examples of the amount of memory required to compensate for the experienced differential delay are calculated as follows:

• VC-12: The VC-12 payload container structure consists of 4 rows x 34 columns, or 136 bytes, repeated every 500 µs. The amount of buffer memory required is 272 bytes per ms per VC-12. Example: 10 Mbit/s Ethernet transported in a VC-12-5v: the buffer memory required is 5 x 272 = 1360 bytes per ms of differential delay.

• VC-3: The VC-3 payload container structure consists of 9 rows x 85 columns, or 765 bytes, repeated every 125 µs. The amount of buffer memory required is 6120 bytes per ms per VC-3. Example: 100 Mbit/s Fast Ethernet transported in a VC-3-2v: the buffer memory required is 2 x 6120 = 12,240 bytes per ms of differential delay.

• VC-4: The VC-4 payload container structure consists of 9 rows x 260 columns, or 2340 bytes, repeated every 125 µs. The amount of buffer memory required is 18,720 bytes per ms per VC-4. Example: 1 Gbit/s Gigabit Ethernet transported in a VC-4-7v: the buffer memory required is 7 x 18,720 = 131,040 bytes per ms of differential delay.
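The buffer-sizing arithmetic above can be captured in a small Python sketch (the container table and function names are illustrative):

```python
# Payload bytes per frame and frame period (ms) for each container type,
# from the container structures described above.
CONTAINERS = {
    "VC-12": (4 * 34, 0.5),     # 136 bytes every 500 us
    "VC-3":  (9 * 85, 0.125),   # 765 bytes every 125 us
    "VC-4":  (9 * 260, 0.125),  # 2340 bytes every 125 us
}

def buffer_bytes(container: str, members: int, delay_ms: float) -> int:
    """Delay-buffer memory needed to absorb delay_ms of differential
    delay for a VC-n-Xv group with the given number of members."""
    bytes_per_frame, period_ms = CONTAINERS[container]
    per_ms = bytes_per_frame / period_ms   # bytes per ms per member
    return int(members * per_ms * delay_ms)
```

The three worked examples in the text follow directly: a VC-12-5v needs 1360 bytes, a VC-3-2v needs 12,240 bytes, and a VC-4-7v needs 131,040 bytes per millisecond of differential delay.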
4.8.3. Structure and Management of Differential Delay Buffers
The features that have to be supported and the flexibility of the implementation determine the structure and management of the differential delay buffers. An implementation that supports VCAT without LCAS can be simpler than one that must also support LCAS. Table 4-5 below illustrates some of the delay buffer management differences between non-LCAS and LCAS implementations; in short, LCAS requires more dynamic and flexible buffer management than a non-LCAS implementation.
Feature                                  Non-LCAS              LCAS
Alignment required                       - at startup          - at startup
                                         - after error         - after error recovery
                                           recovery            - at member addition
Differential delay range determination   - at startup          - at startup
                                                               - after error recovery
                                                               - at member addition
                                                               - at member removal
Sequence number allocation               Fixed:                Variable:
                                         - at startup          - at startup
                                                               - at member addition
                                                               - at member removal

Table 4-5. Delay buffer management differences
4.8.4.
Differential Delay Buffer Overview
While the differential delay buffers can be conceptually thought of as FIFOs, they are generally implemented as circular buffers with read pointers and write pointers. As the member data arrives, it is written at the write pointer location. Once the differential delay is determined, the read pointer is set for each member link and the reconstruction of client data can begin. Depending upon the implementation, either the read pointers or the write pointers are synchronized across all members of the group. This synchronization step across the multiple delay buffers in the group favors the circular buffer architecture over a strict FIFO implementation. Once aligned to the MFI, the depth of the buffer, or the difference between the read pointer and write pointer, generally stays constant. There will normally be some jitter caused by pointer adjustments in the transport network and the presence of the SDH/SONET overhead bytes. Typically, the pointer adjustments will not occur at the same time and the overhead bytes will not be aligned. However, since all the member signals of a VCG are generated with the same clock, the long-term clock rate of the members will not diverge.
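The circular-buffer arrangement described above can be sketched as follows; this is a simplified, illustrative model (class and method names are hypothetical), ignoring overhead bytes and pointer-adjustment jitter:

```python
class DelayBuffer:
    """Circular delay buffer with independent read and write pointers,
    one such buffer per VCG member (illustrative sketch only)."""

    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.size = size
        self.wr = 0   # advances as member data arrives
        self.rd = 0   # set once the differential delay is determined

    def write(self, data: bytes) -> None:
        # incoming member data is written at the write pointer
        for b in data:
            self.buf[self.wr] = b
            self.wr = (self.wr + 1) % self.size

    def set_delay(self, delay_bytes: int) -> None:
        # position the read pointer delay_bytes behind the write pointer;
        # after alignment this distance generally stays constant
        self.rd = (self.wr - delay_bytes) % self.size

    def read(self, n: int) -> bytes:
        out = bytes(self.buf[(self.rd + i) % self.size] for i in range(n))
        self.rd = (self.rd + n) % self.size
        return out
```

In a real device one buffer per member would be driven by common logic that synchronizes either all read pointers or all write pointers across the group, which is what favors this structure over a strict FIFO.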
4.8.5. Alignment within a VCG
For VCAT, alignment of the members is required when the group is created. Realignment will be required after an error condition, since the routing through the network of one or more members in the group may have changed. Since the (re-)alignment occurs when data is not valid, the effect of the alignment process on the data transport is not critical. For VCAT with LCAS enabled, the alignment process is also active each time a member is added to the group. The alignment process should then check whether the calculated differential delay of the additional member is within the implemented boundaries (i.e. the available buffer size). The alignment process can start as soon as the control packet with the CTRL word ADD is received. The process should be ready when the control packet with the CTRL word NORM/EOS arrives, and can consequently increase the bandwidth hitlessly. When adding a new member to an existing VCG, there are three possible scenarios: the new link may have similar delay, less delay, or more delay than the existing members in the group. If the new member experiences similar delay through the network, the delay of all the members of the VCG remains the same, and the new member and its associated delay buffer are added to the VCG. In this scenario, neither the differential delay range nor the propagation delay of the group is modified. If the new member experiences less transport delay than the existing members in the VCG and the available delay buffer is large enough to accommodate the required delay compensation, the member is simply added to the group. This outcome increases the differential delay range of the group but not the propagation delay. If the new member experiences more propagation delay than the existing members in the VCG, then, to align the new member, delay has to be added to all the other members in the group (by increasing the size of each member's individual delay buffer).
This addition increases both the differential delay range and the propagation delay of the group. These scenarios bring up many interesting design points concerning propagation delay, or latency. Depending upon the type of client traffic transported, there are trade-offs between latency and flexibility. Some applications are more sensitive to changes in the latency than to the latency itself. In this case, changing the latency of the group to accommodate a new member with more propagation delay than the existing members in the VCG may not be desired, and the new member may be refused by maintaining the fail condition, i.e. MST=FAIL. If fixed latency is a desired feature, the operator has to calculate and provision the worst-case latency to be used from the moment the group is initiated. Other applications are insensitive to changes in the latency; for these, the implementation can even try to minimize the group latency. A change in the latency of a group can be made either slowly, by gradually increasing or decreasing the data playout rate, or instantaneously, by stopping the data playout until the desired delay buffer depth is reached.
4.8.6. Sizing the Delay Buffers
Depending upon the type of memory used, an implementation can trade off simplicity against memory size. If the memory chosen is very expensive, such as ASIC-internal memory or SRAM, the buffer sizes should be minimized. To achieve this, each member in the group is given a different amount of memory dedicated to its delay buffer. While this setup results in an optimal use of memory and a minimal memory size, the management of the delay buffers becomes very complex and may limit the flexibility of LCAS groups. The simplest buffer structures are based on the maximum allowable propagation delay (determined by the vendor or buyer of the equipment). These buffers allow indexing based upon the MFI and the byte location within the member frame. The simplest buffer structures, however, are probably the least memory efficient: memory efficiency is traded for simplicity, and cheaper external DRAM can be used to support these simple structures.
4.8.7. Processing Time
One feature of LCAS is that the source process controls the exact multiplexing order and timing of member additions and removals. This feature requires the sink process to react to the contents of the control packets at the very next multiframe boundary. The next multiframe boundary is located approximately 42 µs after the last (H4) byte of the HO VCAT control packet and approximately 125 µs after the last (K4) bit of the LO VCAT control packet. Although at first glance this may seem to be a lot of time, issues can arise if the implementation must be capable of supporting very large VCAT groups and/or a large number of VCAT groups simultaneously. The major task that has to be performed within this time frame is to determine changes in the multiplexing order and to configure the data path accordingly, so as to be able to switch to the new order at the beginning of the next multiframe. Because the CRC validates the content of a control packet, it is not useful to start interpreting MFI and SQ values before the CRC arrives. The standard
requires that control packets with CRC failures be ignored. The CRC validation gives a faster response than the more common 3- or 5-times exact-match validation. Ignoring a control packet that contains a change because it is erroneous could cause data corruption; however, the frequency of change within an LCAS group is fairly low, and the probability of data loss is even lower. The implementation must resolve any inconsistencies within the VCAT group, such as duplicated or missing SQ numbers and duplicated or missing EOS control words. Since the probability of bit errors within the SDH/SONET network is low, the number of erroneous control packets is also low. The amount of processing time required to handle erroneous control packets is implementation dependent. Since the acceptable member state changes are limited and there are rules governing sequence number reassignments, some erroneous control packets may be reconstructed by analyzing the control information of the remaining members of the group. The question is, as always, whether the results justify the effort. A more common scenario is the total failure of a member trail. In this case, the member trail termination reports a signal fail (TSF) and the member is moved to the DNU state according to the LCAS procedure. The member remains in this DNU state until the trail failure is repaired. Since communication is lost with that member, removal of the member from the group by the source process cannot be detected at the sink side except by correlating the SQs of the remaining members.
4.8.8. Controlling Distribution/Reconstruction Order
In the VCAT process, the client payload is distributed at the source side over all the individual member payload areas, octet by octet, in a round robin fashion. At the sink side, the client signal is reconstructed by interleaving the octets from all the member payload areas. It is essential that the order of distribution is also used for the reconstruction. That is the reason the source assigns sequence numbers to the members. Each member transports its assigned sequence number to the sink for the sink to use in the reconstruction process. A special case is introduced for the VCAT application with LCAS enabled. Here, only active members carry the distributed payload, i.e. members in the NORM/EOS state. A member can also be in the DNU state. While in the DNU state, the member retains its sequence number but will be skipped in the distribution/reconstruction process. When implementing LCAS, the DNU state could be handled by implementing an interleave sequence number. The interleave sequence
number controls the distribution/reconstruction process. The interleave sequence number is assigned to a member based exclusively on the member status NORM/EOS. Table 4-6 contains an example.

Member             a     b     c     d     e     f     g
State              NORM  NORM  DNU   NORM  DNU   EOS   DNU
Assigned SQ nr     0     1     2     3     4     5     6
Interleave SQ nr   0     1     n/a   2     n/a   3     n/a

Table 4-6. Example of the interleave sequence number
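The assignment rule of Table 4-6 is easily expressed in code; the following Python sketch (names are illustrative) walks the members in assigned-SQ order and numbers only those in the NORM/EOS state:

```python
def interleave_sq(states):
    """Assign interleave sequence numbers to a VCG.

    states: member states ("NORM", "EOS", or "DNU") listed in
    assigned-SQ order.  Members in NORM/EOS get consecutive interleave
    numbers; DNU members keep their assigned SQ but are skipped in the
    distribution/reconstruction process (None here).
    """
    result, nxt = [], 0
    for state in states:
        if state in ("NORM", "EOS"):
            result.append(nxt)
            nxt += 1
        else:  # DNU: no interleave sequence number
            result.append(None)
    return result
```

Applied to the members a-g of Table 4-6, this reproduces the interleave numbering 0, 1, n/a, 2, n/a, 3, n/a.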
4.8.9. Member Status
The member status is the parameter that communicates the health and usability of the members from the sink side back to the source side of the VCG. The MST protocol uses one bit for each sequence number to indicate whether the member status is OK=0 or FAIL=1. The MST bits are transferred in an MST multiframe; its size is determined by the maximum number of sequence numbers and is technology-specific (e.g. 256 for HO SDH LCAS). Since not enough bits are present in a single control packet, the member status is serialized over multiple control packets. All members of the VCG transfer the MST multiframe. This ensures correct operation with just a single member in the reverse direction; since all information is carried on every member, implementations that monitor just a single return member are allowed. The renumbering of the members when members are added to or removed from the VCG complicates this simple protocol. Whenever a member is added to the group, the new member receives the next available higher sequence number. When a member is removed from the group, all members with a higher sequence number must be renumbered (decremented by one). Note that a member may have one sequence number while in the ADD state and a different sequence number when it transitions to the NORM/EOS state. When the sequence numbers of the members change due to member addition or removal, there exists a period of uncertainty during which it is not known whether the reported MST refers to the old numbering or the new numbering. This period is caused by transmission propagation delays and processing delays. To control this period of uncertainty, the source will stop interpreting the received MST until the sink acknowledges that it detected a change in sequence numbering by toggling the RS-Ack bit. Once the toggling is detected by the source, it will assume that the received MSTs match the current numbering again.
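The serialization of MST bits over multiple control packets can be sketched as follows. This is a simplified illustration assuming 8 MST bits per control packet and a 256-entry multiframe, as for HO SDH LCAS; the exact field sizes are technology-specific, and the function name is hypothetical:

```python
def mst_multiframe(statuses, bits_per_packet=8, max_members=256):
    """Serialize per-SQ member status (OK=0, FAIL=1) into the words
    carried by successive control packets of one MST multiframe.

    statuses: list of booleans indexed by SQ (True = OK).  SQs with no
    provisioned member report FAIL, as is conventional.
    """
    bits = [1] * max_members          # default: FAIL
    for sq, ok in enumerate(statuses):
        bits[sq] = 0 if ok else 1
    packets = []
    for i in range(0, max_members, bits_per_packet):
        word = 0
        for b in bits[i:i + bits_per_packet]:
            word = (word << 1) | b    # MSB-first within each packet
        packets.append(word)
    return packets                    # e.g. 32 words of 8 bits each
```

Every member of the VCG would carry this same multiframe in the reverse direction, which is why monitoring a single return member suffices.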
In certain situations, the sink side may not detect a resequence operation and consequently may never toggle the RS-Ack bit. One of the cases where the resequence may not be seen by the sink side is the removal of a member in the DNU state that has the highest SQ number in the VCG. This is resolved in the standard by the definition of an RS-Ack timeout timer that is started when the source stops processing the MST due to a resequence. When the timer expires, the source continues to process the MST and assumes the RS-Ack was lost or not sent.
4.9. REFERENCES

[1] ITU-T Recommendation G.707/Y.1322, Network Node Interface for the Synchronous Digital Hierarchy (SDH), 2003.
[2] ITU-T Recommendation G.783, Characteristics of Synchronous Digital Hierarchy (SDH) Equipment Functional Blocks, 2003.
[3] ITU-T Recommendation G.709, Interfaces for the Optical Transport Network (OTN), February 2001.
[4] ITU-T Recommendation G.798, Characteristics of OTN Equipment Functional Blocks, November 2001.
[5] ITU-T Recommendation G.7042/Y.1305, Link Capacity Adjustment Scheme (LCAS) for Virtual Concatenated Signals, 2003.
[6] ANSI American National Standard T1.105, Synchronous Optical Network (SONET) — Basic Description including Multiplex Structure, Rates and Formats.
Chapter 5
GENERIC FRAMING PROCEDURE (GFP)
Enrique J. Hernandez-Valencia
Lucent Technologies
5.1. INTRODUCTION
The Generic Framing Procedure (GFP) is a new protocol recently standardized under ITU-T G.7041/Y.1303 [1] and ANSI T1.105.02 [2] and designed to support variable- and fixed-length packet transport modes over a general-purpose bit- or byte-synchronous high-speed communications channel. GFP extends the HEC-based packet delineation mechanism used by other broadband applications such as ATM [3] to variable-length data transport applications. GFP exploits the ability of modern point-to-point transmission links to deliver the incoming information stream in a sequential and orderly fashion to greatly simplify data link layer synchronization and frame boundary delineation operations. Unlike packet delineation mechanisms based on the HDLC framing procedure [4, 5], GFP requires no special line encoding for the framed protocol data units (PDUs), which substantially reduces processing logic requirements for the data link mappers/demappers. Unlike ATM, GFP delegates high-touch QoS management functions to the client layers, which further reduces operational overhead. The lower implementation complexity makes GFP particularly suitable for high-speed transmission links such as SONET/SDH [6, 7] point-to-point links, wavelength channels in an optical transport network [8], or even dark fiber applications [9]. For high data rate environments, GFP is a very attractive alternative to solutions such as ATM, Frame Relay [10], PPP/HDLC [11], PPP-over-SONET (POS) [12], or X.85/X.86 [13, 14].
[Figure: TDM PHY ports feeding an STS/VT cross-connect matrix, connected through virtual concatenation/LCAS and GFP encapsulation/framing to a packet switch with its own PHY ports.]

Figure 5-1. High-level functional model of a hybrid Ethernet/TDM transport system
From a hybrid Packet/TDM system perspective, as illustrated in Figure 5-1, two aspects are of key relevance when discussing transport mechanisms for packet-oriented traffic over a TDM-based telecommunications infrastructure: (1) the data-link "adaptation" mechanism that transforms the packet-based data flow into a bit/byte stream preserving the native packet structure, and (2) the rate adaptation mechanism that maps the resulting bit/byte stream into the SONET/SDH payload. For Ethernet transport, for instance, all solutions of practical interest (ATM, FR, POS, X.85/X.86, and GFP) perform data-link adaptation by re-encapsulating the original protocol data unit (PDU) and then reframing the resulting PDU into a TDM-friendly cell/packet flow. It is this PDU "framing procedure" that largely differentiates the various solutions. The final rate-adaptation step is fairly similar across technologies. For constant-bit-rate-oriented traffic, the adaptation models over TDM-based telecommunications are based on the
principle of quantization of the incoming flow, with options for line code compression, if feasible. Rate adaptation typically requires fine-grained mapping of the adapted client signal into its new constant-bit-rate container to minimize additional signal jitter and wander. This chapter provides an overview of GFP. The chapter begins with a review of background information, related work associated with packet transport over public networks, and design factors influencing its development. Next follows a brief summary of current formats, procedures, and implementation considerations. The chapter closes with a discussion on performance and a sample look at applications.
5.2. BACKGROUND
The immense popularity of the Internet and Internet Protocol- (IP-) based networks has created an explosion in the number of IP-based end systems, and consequently, in the aggregated IP traffic being carried over the public circuit-switched infrastructure. Most of this traffic originates in corporate LANs, which are today over 90% Ethernet based. While voice and private line traffic still account for a majority of traffic in most public network backbones today, it is widely expected that packet-oriented traffic, originating from IP end systems or native Ethernet transport applications, will dominate the public backbone traffic in the not-so-distant future. (Indeed, this is already the case for IP-centric Internet Service Providers [ISPs]). This increase in IP and native Ethernet traffic demands much higher access link rates than those in use today. It also demands data transport approaches that are compatible with future data-aware value-added services.
5.2.1 Packet Transport on Public Networks
Figure 5-2 illustrates various transport options for packet traffic over the public network infrastructure. A significant portion of IP traffic today is encapsulated in Frame Relay, PPP/HDLC, or POS, or is adapted to ATM for transport across a TDM-based core network. Currently, most Frame Relay and PPP line interfaces operate at DS1/E1, DS3/E3, or OC-3c/STM-1 rates or less. The same is true of most line interfaces for IP edge routers, although OC-48c/STM-16 and OC-192c/STM-64 SONET/SDH interfaces are being deployed at an increasing rate, particularly in the core of metropolitan and wide-area networks. Ethernet and Storage Area Networking (SAN) protocols such as Fibre Channel, ESCON, and FICON have traditionally been transported over the public network infrastructure by means of proprietary (vendor-specific) solutions. Given the widespread availability of
inexpensive 10/100/1000 Mbps Ethernet interfaces for CPE switches/routers, the growing need to improve data center/SAN interconnectivity, and the recent additions of Virtual LAN-based Virtual Private Networking (VPN) and QoS capabilities via IEEE 802.1Q/p, there is a renewed interest in a QoS-friendly, standards-based mechanism to transport Ethernet and SAN traffic directly over TDM networks.

[Figure: a layered stack showing voice, data (IP, IPX, MPLS, etc.), SAN, and video applications carried over Ethernet (MACs and PHYs), private lines (circuits), RPR (MACs and PHYs), PPP, Frame Relay, or ATM services; HDLC or GFP encapsulation; and SONET/SDH, OTN, or fiber/WDM physical channels. Ethernet and RPR may also run directly on fiber.]

Figure 5-2. Transport options for voice, data, storage, and video traffic over a SONET/SDH network and Optical Channels via ATM, FR, HDLC, or GFP
5.2.2 Other Traffic Adaptation Approaches
While flag-based PDU delineation has been common in various framing and payload mapping standards, alternative approaches have been in use as well. For packet/cell delineation, ATM relies on implicit information about the packet length (fixed at 53 bytes) and the header CRC, rather than a flag, for the purpose of packet delineation. The header CRC match is used for initial link synchronization. The fixed packet (cell) size and the header CRC are used to verify link synchronization after initial link synchronization has been established. The fixed-length ATM PDUs significantly simplify processing at the data link receiver. There are also more recent examples of PDU delineation using implicit and explicit PDU length indicators. ATM Adaptation Layer, Type 2 (AAL-2) [15], packs variable-length PDUs into fixed-length ATM cells. A PDU length indicator is used to provide self-delineation once the packet boundary is identified. At the
beginning of each ATM cell, the length pointer helps regain frame delineation quickly. A similar mix of a PDU length indicator and boundary pointer is used for the downstream communication channel in Hybrid Fiber Coax (HFC) links using ADAPt+ [16]. In ATM, AAL-2, and HFC, the basic lower-layer framing exploits the fixed-length format of the PDUs at that protocol layer (ATM cells, downstream frames for HFC). The pointers and length indicators are then used to provide delineation of variable-length PDUs within the payloads of the lower-layer protocol. In the absence of a lower-layer framing with fixed-length frames, a delineation mechanism applicable directly to variable-length PDUs is needed.
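The principle of HEC-based delineation, which GFP adopts for variable-length PDUs, can be illustrated with a minimal Python sketch of the "hunt" phase: scan the byte stream for a position where a candidate header is immediately followed by its matching CRC. The function names and the 2-byte-length/2-byte-CRC layout here are illustrative, not taken from any specific standard:

```python
def crc16(data: bytes, poly: int = 0x1021) -> int:
    """Bitwise CRC-16 (illustrative parameters: CCITT polynomial, zero init)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

def hunt(stream: bytes) -> int:
    """Return the first offset where a 2-byte length field is followed by
    a matching 2-byte header CRC, or -1 if none is found.  A real
    receiver would confirm the candidate over several consecutive frames
    (using the length field to jump ahead) before declaring SYNC."""
    for off in range(len(stream) - 3):
        hdr = stream[off:off + 2]
        hec = int.from_bytes(stream[off + 2:off + 4], "big")
        if crc16(hdr) == hec:
            return off
    return -1
```

The confirmation step matters precisely because of the false-match problem discussed next: with variable-length PDUs, a wrong candidate also yields a wrong length, so a single CRC match is not sufficient evidence of synchronization.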
5.2.3 Other Design Considerations
Although it would seem straightforward to extend the frame delineation and synchronization procedures used for ATM and other fixed-length PDUs to variable-length PDUs, that is not the case. Some of the issues that need to be addressed [17-18] are as follows:

• For fixed-length PDUs, it is important to allow an efficient format for accessing and buffering the adapted PDUs. A header CRC could be used to identify a potential PDU boundary, after which the fixed PDU length is invoked to jump to the next frame boundary and to verify that frame synchronization has been achieved. When the second header CRC check fails because of a false start, the real PDU boundary may have been lost among the many bits/bytes jumped over. In ATM, the small cell size guarantees that at most 53 bytes are lost. Large variable-length PDUs make this test more difficult: a false match of the header CRC and a subsequent invocation of the wrong length indicator could waste a large number of opportunities to identify the true PDU boundary. Thus, a straightforward extension of the procedure used for ATM may result in a much longer resynchronization interval for variable-length PDUs.

• For small, fixed-length PDUs, a failed header CRC need not cause immediate loss of synchronization because the PDU length is specified implicitly. With variable-length PDUs, a failure in the header CRC makes the PDU length indicator itself suspect and causes immediate loss of synchronization. Error correction capability therefore becomes very important for variable-length PDUs.

• Small PDUs also imply that a single user can gain control over the link payload only for a very short period of time. PDU interleaving from multiple sources decreases the probability of successful attacks from malicious users trying to induce low bit transition density on the link. With large, variable-length PDUs, a single user gets access to the link for a much longer time period. Mechanisms to counter attacks aimed at creating very low bit transition density over the data link become critical.

• Variable-length PDUs tend to have loose maximum size bounds (up to 64 Kbytes in IP), which can considerably lengthen the resynchronization phase.

The above protocol design issues are carefully addressed in GFP. The next section presents the design choices made in the specification of the formats and procedures for GFP.
5.3. FORMATS AND PROCEDURES
A high-level functional overview of GFP is presented in Figure 5-3. GFP consists of both client-independent and client-specific aspects. Common aspects of GFP apply to all GFP-adapted traffic and cover issues such as PDU delineation, data link synchronization, payload scrambling, client and control PDU multiplexing, and client-independent performance monitoring. Client-specific aspects of GFP cover issues such as mapping of the client PDU into the GFP payload and client-specific performance monitoring and OA&M.
[Figure: client signals enter the GFP client-specific aspects and then the GFP common aspects, and are carried over a PDH path, SONET/SDH path, or OTN ODUk path. Ethernet, HDLC/PPP, RPR, IP/MPLS, and other client signals are frame-mapped; ESCON, FICON, Fibre Channel, and DVB ASI are transparent-mapped.]

Figure 5-3. High-level functional model
5.3.1 GFP Frame Formats
The GFP frame format is designed to support both the multiplexing of multiprotocol PDUs and the multiplexing of a number of logical virtual links within the data link. Logical virtual links can be used to support different traffic streams with potentially different higher-layer protocols and with different QoS requirements. Two basic GFP frame formats are defined: GFP client frames and GFP control frames, as illustrated in Figure 5-4. GFP also supports a flexible (payload) header extension mechanism to facilitate the adaptation of diverse data clients. The GFP client and control frame formats are shown in Figure 5-5.
[Figure: GFP frames divide into client frames and control frames. Client frames comprise client data frames (client payload transfer) and client management frames (client resource management). Control frames comprise idle frames (idle time fills) and OA&M frames (link OA&M).]

Figure 5-4. GFP frame types
5.3.1.1 GFP Client Data Frames
Client data frames provide the basic payload transport mechanism in GFP. As illustrated in Figure 5-5, client data frames are octet aligned and consist of a GFP Core Header and a GFP Payload Area.
[Figure: GFP client data frame layout, in transmission byte order: a 4-octet Core Header (Payload Length MSB/LSB, Core HEC MSB/LSB, scrambled with the pattern 0xB6, 0xAB, 0x31, 0xB0) followed by the Payload Area. The Payload Area contains a Payload Header (Payload Type MSB/LSB carrying the PTI, PFI, EXI, and UPI subfields; Type HEC MSB/LSB; 0-60 bytes of optional Extension Headers with Extension HEC MSB/LSB; the Linear Extension Header with its CID and Spare fields is shown, though others may apply), the variable-length Payload Information field carrying the client packets, and an optional 4-octet Payload FCS.]

Figure 5-5. GFP frame formats
5.3.1.1.1 GFP Core Header
The Core Header supports the data-link management procedures in GFP. The Core Header length is fixed at four octets and consists of a PDU Length Indicator field and a Core Header Error Check field. The Core Header is always scrambled upon transmission (via an exclusive-OR operation) with the well-known Barker-like pattern 0xB6AB31B0.
PDU Length Indicator (PLI) Field: a two-octet field indicating the number of octets in the GFP Payload Area. It is used to extract the encapsulated PDU and to locate the next GFP frame boundary.
Core HEC (cHEC) Field: a two-octet field containing an ISO CRC-16 to protect the integrity of the Core Header via single-bit error correction and multibit error detection. The cHEC sequence is calculated over the remaining octets of the Core Header.
5.3.1.1.2 GFP Payload Area
The GFP Payload Area consists of all octets in the GFP frame after the GFP Core Header. This variable-length area may include from 4 to 65,535 octets. It is intended to convey client-layer-specific protocol information. The GFP Payload Area consists of two common components: a Payload Header and a Payload Information field. A third, optional component, the Payload FCS field, is also provided to protect the contents of the Payload Information field (Payload Headers are protected separately). The Payload Area is always scrambled upon transmission and descrambled upon reception via an ATM-like self-synchronous scrambler.
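A minimal Python sketch of Core Header construction and checking follows. The CRC parameters (CCITT-style polynomial, zero initial value) and the omission of single-bit error correction are simplifying assumptions of this illustration, not a statement of the exact G.7041 procedure; the function names are hypothetical:

```python
SCRAMBLE = bytes.fromhex("B6AB31B0")  # fixed core-header XOR pattern

def crc16(data: bytes, poly: int = 0x1021) -> int:
    """Bitwise CRC-16 used here for the cHEC (assumed parameters)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

def core_header(payload_len: int) -> bytes:
    """Build the 4-octet Core Header: 2-octet PLI, 2-octet cHEC over the
    PLI, then XOR the whole header with the 0xB6AB31B0 pattern."""
    pli = payload_len.to_bytes(2, "big")
    raw = pli + crc16(pli).to_bytes(2, "big")
    return bytes(b ^ s for b, s in zip(raw, SCRAMBLE))

def parse_core_header(hdr: bytes):
    """Descramble and verify the cHEC; return the PLI, or None on a CRC
    mismatch (error correction is omitted in this sketch)."""
    raw = bytes(b ^ s for b, s in zip(hdr, SCRAMBLE))
    pli, chec = raw[:2], int.from_bytes(raw[2:], "big")
    return int.from_bytes(pli, "big") if crc16(pli) == chec else None
```

A receiver hunting for frame boundaries applies `parse_core_header` at successive offsets; a validated PLI then points at the next candidate header, exactly the delineation scheme motivated in Section 5.2.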
5.3.1.1.2.1 Payload Header The Payload Header is a variable-length area, 4 to 64 octets long, intended to support data link management procedures specific to the higherlayer client signal. The Payload Header contains two mandatory fields, namely, the Type field and the accompanying Type HEC (tHEC) field. The tHEC protects the integrity of the Type field. Optionally, the Payload Header may include an additional variable number of subfields, referred to as a group as the Extension Header. The Type field specifies the presence and format of the Extension Header. GFP Type Field: a mandatory two-octet field of the Payload Header that indicates the content and format of the GFP Payload Information. The Type field distinguishes between services in a multiservice environment. The Type field consists of a Payload Type Identifier (PTI), a Payload PCS Indicator (PFI), an Extension Header Identifier (EXI), and a User Payload Identifier (UPI), as shown in the top right corner of Figure 5-5. For Ethernet transport, for instance, PTI=0 (User Data), no Payload FCS (PFI=0), and the default Null Extension Header (EXI=0) are used. Type HEC (tHEC) Field: a two-octet field that contains an ISO CRC-16 sequence to protect the integrity of the Type field via single-bit error correction and multibit error detection. The Payload Header in GFP allows the support of multiple transport modes that may coexist within the same transport channel. Three adaptation modes are currently defined. The first mode, referred to as Frame-Mapped GFP (GFP-F), is optimized for packet switching environments where resource management functions are delegated to the native data clients. The Client-specific adaptation sublayer is fairly thin, supporting basic Layer 2 PDU encapsulation functions. This is the transport mode used for native IP, PPP, and Ethernet traffic. 
The second mode, referred to as Transparent GFP (GFP-T), is intended for delay-sensitive 8B/10B-coded applications, where the goal is transport efficiency and transparency of the logical line-code data. The client-specific adaptation sublayer performs 8B/10B codeword recoding for data compression. This is the transport mode used for Fibre Channel, ESCON, and FICON traffic. The transport mode is indicated in the UPI field. The third adaptation mode, Asynchronous Transparent mapping, is a variation of GFP-T that supports selective client character removal to facilitate rate adaptation into a transport channel of lower rate than the native signal.
5.3.1.1.2.2 GFP Extension Header
The GFP Extension Header is a 0-to-60-octet set of fields intended to support technology-specific data-link information such as virtual link identifiers, source/destination addresses, port numbers, Class of Service, or
extended header error-control information. The type of Extension Header is indicated by the content of the EXI bits in the Type field. Three Extension Header types are currently defined:
Null Extension Header: the default extension header, used when the entire GFP payload is dedicated to a single service (as indicated by the UPI field).
Linear Extension Header: a two-octet extension header that supports sharing of the GFP payload across multiple clients in a point-to-point configuration.
Ring Extension Header: an 18-octet extension header (currently under study) that supports sharing of the GFP payload across multiple clients in a ring configuration.
Extension HEC (eHEC) Field: a mandatory two-octet field that contains an ISO CRC-16 check sequence to protect the integrity of the contents of the Extension Header via single-bit error correction (optional) and multibit error detection.
5.3.1.1.2.3 Payload Information
The Payload Information field contains the framed PDU. This variable-length field may include from 0 to 65,535 − X octets, where X is the size of the Payload Header, and it may include an optional Payload FCS field. The client user/control PDU is always transferred into the GFP Payload Information field as an octet-aligned packet stream. The payload may be a single Layer 2 MAC frame, via the frame-mapped GFP adaptation mode, or multiple Layer 1 line codes, via the transparent-mapped GFP adaptation mode.
5.3.1.1.2.4 Payload Frame Check Sequence (FCS)
The Payload Frame Check Sequence (FCS) is an optional, four-octet-long frame check sequence. It contains an HDLC-like CRC-32 check sequence that protects the contents of the GFP Payload Information field. A value of 1 in the PFI bit within the Type field indicates the presence of the Payload FCS field. Unless otherwise stated, corrupted GFP frames are passed to a client adaptation process for local handling according to client-specific rules.
5.3.1.2
Client Management Frames
GFP provides a generic mechanism to propagate client-specific source adaptation information, such as performance monitoring and OA&M information, to end-systems. Currently, the only client-specific facility defined is a Client Signal Fail (CSF).
5.3.1.2.1 Client Signal Fail (CSF) CSF is a message that may be sent from the GFP source-adaptation process to the far-end GFP sink-adaptation process upon failure detection in the ingress client signal. Detection rules for client signal failure events are by definition client-specific. Figure 5-6 illustrates the use of CSF messages.
Figure 5-6. Example of Client Signal Fail usage in GFP (a failure detected on the ingress client signal, such as Loss of Signal (LOS), Loss of Client Character Synchronization (LCS), loss of clock/frame, or running disparity violations, triggers CSF indications across the GFP link)
The CSF indication is a special type of GFP Client Frame consisting only of a Payload Header and no Payload Information field. The Payload Header consists of a Type field with its accompanying tHEC, and an Extension Header, if applicable to the encapsulated client signal. In the Type field, the PTI subfield is coded as Client Management, the PFI subfield is set to 0 (no Payload FCS), and the EXI subfield is set to the applicable Extension Header type. The UPI subfield is used to indicate the type of client signal failure. Two generic types of failure defects can be reported:
• Loss of client signal (UPI=0)
• Loss of client character synchronization (UPI=1)
Upon failure detection, the GFP client-specific source adaptation process may send periodic far-end CSF indications. The GFP client-specific sink adaptation process should clear the defect condition either (1) after failing to receive a number of consecutive CSF indications (a value of 3 is suggested), or (2) after receiving a valid GFP User Frame. The handling of incomplete GFP frames at the onset of a CSF event should be consistent with the GFP error-handling procedures.
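As a sketch of how the Type field of a CSF indication might be assembled (Python; the 3-bit PTI code point for Client Management and the exact bit layout are assumptions not spelled out in the text above):

```python
def build_csf_type_field(upi: int, exi: int = 0) -> bytes:
    """Assemble the two-octet Type field for a CSF indication.
    Assumed layout: PTI (3 bits) | PFI (1 bit) | EXI (4 bits) | UPI (8 bits).
    PTI = Client Management, PFI = 0 (no Payload FCS), EXI = 0 (Null header),
    UPI = 0 (loss of client signal) or 1 (loss of character sync)."""
    PTI_CLIENT_MGMT = 0b100  # assumed code point for Client Management
    first = (PTI_CLIENT_MGMT << 5) | (0 << 4) | (exi & 0x0F)
    return bytes([first, upi & 0xFF])
```

The tHEC protecting this field would be computed over these two octets in the same way as the cHEC over the PLI.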
5.3.2
GFP Control Frames
GFP Control frames provide in-line link-control mechanisms for GFP. This information is indicated via the lower values of the PLI field (0-3). Currently, only the GFP Idle Frame function is specified. The remaining PLI values are under consideration for dark fiber extensions. It is expected that such an in-band channel would use very small payload areas to minimize interactions with live GFP Data frames. Note that such GFP Control frames are not expected to have the same format as the GFP User frames, but at the very least they should incorporate a CRC-16 for the control message payload, using the same generation procedure as for the cHEC computation.
5.3.2.1
GFP Idle Frame
The GFP Idle frame is a special four-octet GFP Control frame. It consists of only a GFP Core Header with the PLI and cHEC fields set to 0. The GFP Idle frame does not contain a Payload Area. It is intended as a filler frame for the GFP transmitter to facilitate the adaptation of the GFP octet stream to any given transport medium. The GFP Idle frame format is shown in the bottom right corner of Figure 5-5.
5.3.3
Client-Independent Procedures
5.3.3.1
GFP Frame Delineation
One important function in GFP is to identify the PDU boundary at link initialization and also after a loss of packet delineation. The GFP receiver state machine is shown in Figure 5-7. Under normal conditions, the GFP receiver operates in the Sync state. The receiver examines the PLI field, validates the incoming HEC field, extracts the framed higher-layer PDU, and then rolls over to the next GFP Header. When an uncorrectable error occurs in the GFP Header (that is, the HEC fails and more than a single bit error is detected), the receiver enters the Hunt state. It then looks for the boundary of the next GFP PDU by moving forward one bit/byte at a time. Assuming that the current bit/byte starts a new frame, the receiver checks the first four octets to see if they form a valid GFP Header (that is, a GFP Header whose HEC field checks out against the content of the PLI field). If the check succeeds, the receiver tentatively assumes that it has identified the frame boundary; otherwise, it shifts forward by one bit/byte and checks again. Boundary acquisition, and hence link resynchronization, is declared after the GFP receiver detects N consecutive correct GFP Headers. The GFP receiver can then return
to the Sync state. The GFP frame delineation procedures are based on self-delineation/self-synchronization principles, as illustrated in Figure 5-8.
Figure 5-7. GFP state machine (Hunt, Pre-Sync, and Sync states; a cHEC match moves the receiver from Hunt to Pre-Sync, and a second consecutive cHEC match moves it to Sync; a noncorrectable Core Header error returns it to Hunt, with correctable errors handled according to whether frame-by-frame Core Header correction is enabled or disabled)
Figure 5-8. Link synchronization and frame acquisition (an octet- or bit-synchronous stream is scanned until a candidate PLI/cHEC match is found; in the Pre-Sync state the PLI bytes are used to hop to the next candidate header, and a further cHEC match moves the receiver to the Sync state, while a cHEC failure returns it to the Hunt state)
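A minimal byte-oriented sketch of this Hunt/Pre-Sync/Sync procedure (Python; the frame-construction helpers, the CRC-16 generator, and the scrambling word are assumptions carried over from the Core Header description):

```python
BARKER_XOR = 0xB6AB31E0  # assumed Core Header scrambling word

def crc16(data: bytes, poly: int = 0x1021) -> int:
    """Bitwise CRC-16 (assumed generator x^16 + x^12 + x^5 + 1), init 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & 0x8000 else crc << 1
            crc &= 0xFFFF
    return crc

def header_pli(word: bytes) -> int:
    """Descramble a 4-octet candidate header; return its PLI, or -1 on cHEC failure."""
    raw = (int.from_bytes(word, "big") ^ BARKER_XOR).to_bytes(4, "big")
    ok = crc16(raw[:2]) == int.from_bytes(raw[2:], "big")
    return int.from_bytes(raw[:2], "big") if ok else -1

def delineate(stream: bytes, n: int = 2) -> int:
    """Return the offset of the first boundary confirmed by n consecutive
    valid headers (Hunt -> Pre-Sync -> Sync), or -1 if none is found."""
    for i in range(len(stream) - 3):      # Hunt: slide one byte at a time
        pos, hits = i, 0
        while pos + 4 <= len(stream):
            pli = header_pli(stream[pos:pos + 4])
            if pli < 0:
                break                     # Pre-Sync failed; resume Hunt
            hits += 1
            if hits == n:
                return i                  # Sync: boundary confirmed
            pos += 4 + pli                # hop over the payload to the next header
    return -1

def make_frame(payload: bytes) -> bytes:
    pli = len(payload).to_bytes(2, "big")
    hdr = int.from_bytes(pli + crc16(pli).to_bytes(2, "big"), "big") ^ BARKER_XOR
    return hdr.to_bytes(4, "big") + payload
```

For example, `delineate(junk + make_frame(a) + make_frame(b))` recovers the offset at which the first frame begins. Single-bit cHEC correction is omitted for brevity.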
The HEC-based frame delineation procedure permits sophisticated traffic engineering, flexible QoS-aware routing, better partitioning of SONET/SDH bandwidth, and multiservice integration, either at the GFP layer (via Header Extension options within the GFP Payload Header) or via native Layer 2 (e.g., Ethernet) or Layer 3 (e.g., IP) mechanisms. Commercial component interconnect interfaces such as MII/GMII (IEEE 802.3) for the Ethernet-related layers and SPI-3/SPI-4 (OIF) for the optical layer are readily available to facilitate the integration of system components, promote feature interoperability, and decrease system development costs.
5.3.3.2
Frame Multiplexing
GFP Client and Control frames from multiple ports and multiple client types are multiplexed on a frame-by-frame basis. GFP does not impose any constraints on the choice of scheduling algorithms, since traffic-handling aspects tend to be client specific. In general, when no other GFP frames are available for transmission, GFP Idle frames shall be inserted, thus providing a continuous stream of frames for mapping into an octet-aligned physical layer. Client Management Frames other than CSF are sent opportunistically, to minimize contention with client data frames.
5.3.3.3
Link Scrambler
Link-level scrambling is required to achieve adequate link transparency and bit-transition density when data is mapped directly over the SONET SPE or SDH AUG. Equivalent requirements are also anticipated for WDM-based solutions. A self-synchronous scrambler with generating polynomial 1 + x^43 is specified. This scrambler is applied only to the GFP Payload Area, in a manner similar to the link-scrambler application for ATM over SONET/SDH.
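A minimal sketch of the 1 + x^43 self-synchronous scrambler (Python, operating on lists of bits): each output bit is the input XOR the output bit transmitted 43 positions earlier, and the descrambler inverts this using the received bits alone.

```python
MASK = (1 << 43) - 1

def scramble(bits, state=0):
    out, reg = [], state & MASK      # reg holds the last 43 *transmitted* bits
    for b in bits:
        s = b ^ ((reg >> 42) & 1)    # XOR with the output from 43 bits ago
        reg = ((reg << 1) | s) & MASK
        out.append(s)
    return out

def descramble(bits, state=0):
    out, reg = [], state & MASK      # reg holds the last 43 *received* bits
    for b in bits:
        out.append(b ^ ((reg >> 42) & 1))
        reg = ((reg << 1) | b) & MASK
    return out
```

Because the descrambler rebuilds its state from the received bits themselves, it resynchronizes automatically after 43 bits regardless of the initial state mismatch; this is what makes the scrambler "self-synchronous".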
5.3.4
Client-Dependent Procedures
The first step for a transport integration mechanism is a common signal convergence mechanism to map any native bit stream into the transport channel and provide for signal rate adaptation and minimal OAM&P functions. The native adaptation mechanism provided by GFP is frame based and allows the segmentation of the physical channel into fixed- or variable-size containers, or GFP frames. Three modes of client signal adaptation are provided with GFP: Frame-Mapped, (Synchronous) Transparent-Mapped, and Asynchronous Transparent-Mapped modes. Table 5-1 summarizes the major client signals supported by each mode.
Table 5-1. Summary of client signals supported by GFP
Frame Mapped: Ethernet MAC, HDLC-like/PPP, MAPOS, RPR MAC, MPLS
Transparent Mapped: Gb Ethernet, Fibre Channel, FICON, ESCON, FC-BBW2, DVB ASI
Async Transparent Mapped: FC-BBW3
5.3.4.1
Frame-Mapped Mode (GFP-F)
The Frame-Mapped adaptation mode is a more flexible adaptation mode, suitable for either full-rate/subrate point-to-point or multipoint packet-based applications. Adaptation is accomplished by mapping upper-level PDUs (such as HDLC-like/PPP frames, IP/MPLS packets, or IEEE 802.3 MAC frames) into variable-size GFP frames. The frame structure for mapping an Ethernet/IEEE 802.3 frame onto a GFP frame (assuming a Null Extension Header) is illustrated in Figure 5-9. Linear and Ring Extension Headers that support client-multiplexing functions in point-to-point or ring topologies are also defined.
Figure 5-9. Frame-Mapped adaptation mode (the client frame is carried in the Payload Area following the Core Header and Payload Header, with an optional 4-byte FCS)
For applications where both the transport and bridging capabilities of Ethernet are integrated into the transport NEs, the Frame-Mapped mode is the preferred mode of adaptation, since the physical-layer aspects of both the SONET/SDH and Ethernet interfaces (layer 1) are segregated from the media access control (layer 2) aspects. Since the same mode of adaptation is applied to either point-to-point or multipoint configurations, service providers can deal with these two styles of application with the same provisioning and management procedures. Thus, for instance, if a customer wishes to migrate from a point-to-point transport service to a multipoint transport service, both these services can be delivered from the same service interface without further reconfiguration of the preexisting end-points. 5.3.4.2
Transparent-Mapped Mode (GFP-T)
The Transparent-Mapped adaptation mode (currently defined for 8B/10B-encoded signals only) is particularly suitable for full-rate point-to-point applications requiring very low delay and delay jitter. Full-rate means that the entire capacity of the local physical interface is supported across the end-
to-end path. Client adaptation is accomplished by preprocessing the incoming signal to remove the native link-layer codewords, postprocessing this raw data to the characteristics of the new transport media, and mapping the postprocessed data into fixed-size GFP frames, as illustrated in Figure 5-10. The detailed processing steps for 8B/10B-encoded signals are discussed in Section 5.3.4.4. This mode is intended for applications that seek to emulate a native physical interface with very strict packet delay, loss, and throughput requirements (e.g., Fibre Channel, FICON, and ESCON).
Figure 5-10. Transparent-Mapped adaptation mode (GFP-T). The Payload Area carries superblocks of eight 64B/65B blocks plus 16 CRC bits; the flag bits of the eight blocks are carried in the last octet of each superblock, and each superblock is protected by a CRC-16.
5.3.4.3
Asynchronous Transparent-Mapped Mode
The Asynchronous Transparent adaptation mode (currently defined for 8B/10B-encoded signals only) is a variant of GFP-T that is particularly suitable for full-rate point-to-point applications requiring a trade-off between low-to-medium delay and delay jitter versus bandwidth efficiency. As with GFP-T, signal adaptation is accomplished by mapping link-layer codewords into fixed-size GFP frames. But unlike GFP-T, this process does not require that all signal components be adapted into a GFP frame. Thus, codewords associated with interframe fills may be removed, or entire codeword sequences related to link-control functions may be extracted, processed, and modified prior to their mapping to GFP. As such, GFP-A requires a certain level of client-signal awareness, but not complete processing of the native L2/L3 PDUs. This mode is intended for applications
that seek to emulate a native physical interface with packet delay, loss, and throughput requirements but that can tolerate some additional delay, such as asynchronous Fibre Channel applications.
5.3.4.4
8B/10B Client Processing in GFP-T
GFP provides either full-rate or subrate adaptation for 8B/10B line-coded signals, the prevalent line code in local area networks. These signals consist of 10-bit characters encoding either 8-bit data or control information. In GFP-T, an 8B/10B data codeword is decoded into its original 8-bit value, and an 8B/10B control codeword is decoded into a special GFP-T control character. The 8B/10B control codewords are mapped into one of the 16 possible 4-bit Control Code Indicators for the 8-bit control characters available in transparent GFP.
Figure 5-11. GFP-T adaptation process (group eight 65-bit blocks, 520 bits in all; rearrange the leading bits at the end of the block; generate and append CRC-16 check bits to form a [536,520] superblock; form GFP payloads with N [536,520] superblocks; append the GFP Payload Header and optional Payload FCS; append the Core Header and scramble the Payload Area with the x^43 + 1 self-synchronous scrambler, the Core Header itself not being scrambled by it)
5.3.4.4.1 Generating GFP-T 64B/65B codes
The decoded 8B/10B characters are mapped into a 64-bit/65-bit (64B/65B) block code. The components of the 64B/65B block code are illustrated in Figure 5-12. The leading bit of the 65-bit block, the Flag bit, indicates whether the block contains only 8-bit data characters or whether client control characters are also present (Flag bit = 0 indicates data octets only; Flag bit = 1 indicates at least one control octet in the block). Client control characters, which are mapped into 8-bit
64B/65B control characters, are located at the beginning of the 64-bit block payload when present in that block. The first bit of a 64B/65B control character contains a Last Control Character (LCC) flag bit, which indicates whether this control character is the last one in the block (LCC = 0) or whether there is another control character in the next octet (LCC = 1). The next three bits contain the Control Code Locator, which indicates the original location of the 8B/10B control-code character within the sequence of the eight client characters contained in the block. The last four bits, the Control Code Indicator, give the four-bit representation of the 8B/10B control-code character.
Figure 5-12. GFP-T 64B/65B code components (blocks carrying zero through eight control characters, shown in order of transmission; each control octet carries the LCC flag, the three-bit representation of the control code's original position, and the four-bit representation of the control code)
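The 64B/65B block construction just described can be sketched as follows (Python; the input representation, a list of eight ('D', data-octet) or ('C', 4-bit indicator) pairs, is hypothetical):

```python
def build_65b_block(chars):
    """Return (flag_bit, 8 payload octets) for one 64B/65B block.
    Control octet layout: LCC (1 bit) | locator (3 bits) | indicator (4 bits);
    control octets are placed at the front, data octets follow."""
    assert len(chars) == 8
    ctrls = [(pos, v) for pos, (k, v) in enumerate(chars) if k == 'C']
    datas = [v for k, v in chars if k == 'D']
    octets = []
    for i, (pos, cci) in enumerate(ctrls):
        lcc = 0 if i == len(ctrls) - 1 else 1  # LCC = 0 marks the last control char
        octets.append((lcc << 7) | (pos << 4) | (cci & 0x0F))
    flag = 1 if ctrls else 0                    # leading Flag bit of the 65-bit block
    return flag, bytes(octets + datas)
```

For an all-data block the Flag bit is 0 and the eight octets pass through unchanged; a block with one control character at position 2 yields Flag = 1 and a leading control octet encoding that position.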
5.3.4.4.2 Adapting 64B/65B code blocks into GFP superblocks
To preserve the octet alignment of the GFP-T signal with the transport channel, the first step in the adaptation process is to group eight 64B/65B codes into a superblock (Step 2 in Figure 5-11). The leading (Flag) bits of each of the eight 64B/65B codes are grouped together into the first trailing octet. The sixteen bits of the last two trailing octets are used for a CRC-16 error check over the bits of the superblock (Step 3 in Figure 5-11). N superblocks (and their associated CRCs) are packed into a single GFP frame. Assuming no Payload FCS and a Null Extension Header, the resulting GFP frame is N × ((65 × 8) + 16) + (8 × 8) bits long, where N is the number of superblocks in the GFP frame. The value of N depends on the base rate of the client signal and on the transport channel capacity.
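The frame-length arithmetic above can be checked with a one-line sketch: each [536,520] superblock contributes 65 × 8 + 16 = 536 bits, plus 8 octets of GFP overhead (Core Header and Null-extension Payload Header).

```python
def gfpt_frame_bits(n: int) -> int:
    """Length in bits of a GFP-T frame carrying n superblocks
    (no Payload FCS, Null Extension Header assumed)."""
    return n * ((65 * 8) + 16) + (8 * 8)
```

A single-superblock frame is thus 600 bits (75 octets), and every frame length is a whole number of octets, consistent with the octet alignment requirement.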
5.3.4.4.3 Error control with Transparent GFP
The 16 error-control bits in a superblock contain a CRC-16 error check code over the 536 bits in that superblock. If the GFP-T demapper detects an error, it should output either 10B Error control characters or 10B Unrecognized control characters in place of all the client characters contained in that superblock. The generator polynomial for the CRC-16 is G(x) = x^16 + x^15 + x^12 + x^10 + x^4 + x^3 + x^2 + x + 1 with an initialization value of zero, where x^15 corresponds to the MSB and x^0 to the LSB; the polynomial was specially selected for this application.
5.4.
IMPLEMENTATION CONSIDERATIONS
All the procedures in the GFP state machine may be performed either on a bit-by-bit or byte-by-byte basis, making them equally suitable for bit- and byte-synchronous channels. If 4-byte or 8-byte parallel processing is deemed necessary, all the operations described here, including the CRC computations, can be performed in parallel. The standard leaves open certain design parameters of the virtual framer and options for link scrambling.
5.4.1
Virtual Framer Management
The procedures described so far do not explicitly constrain how framers may be handled when returned to the Hunt state (after failing to produce N−1 consecutive HEC matches), or when all the available framers are in use and a new potential GFP header is detected while in the Hunt state. A few implementation options are available, depending on the amount of link configuration information available to the receiver, the link synchronization performance objective, and the implementation complexity. Below we identify three simple implementation options that do not exploit the information conveyed in the assumed PLI fields.
One design objective may be to maximize the chance of synchronizing the link at the first available opportunity. The implementation (Option 1) could then allocate enough framers to guarantee, with reasonably high probability, that one of the M framers will succeed in capturing the first incoming GFP frame boundary. For implementation simplicity, framers that fail to yield the proper GFP frame boundary are not reused, and any further HEC field matches (beyond M) while in the Hunt state are ignored. If all the framers fail to yield the proper GFP frame boundary in the first pass, then the resynchronization procedure must be restarted from scratch. A drawback of this approach is that the time to frame delineation
could be large if the incoming GFP frame is missed, particularly in scenarios where the BER is high, since each of the failed framers may have been pointing up to 64 Kbytes into the incoming byte stream. When the receiver knows the Maximum Transmission Unit (MTU) for the link, the synchronization time can be further improved (valid PLI fields can only point up to MTU bytes into the incoming byte stream) at the expense of additional implementation complexity.
Reuse of failed framers can be exploited to further improve the link synchronization performance. Given the size of the HEC field and typical Internet traffic profiles, the chance of a large number of random HEC field matches in a PDU is rather small. An alternative design philosophy (Option 2) may therefore treat a large number of active framers as an indication that the GFP receiver is facing such an unlikely event. Thus, the receiver may simply reset the "oldest" of the framers when all available framers are in use and a new HEC field match is detected while the receiver is in the Hunt state. This approach has the advantage of decreasing the probability of "flushing" the GFP frame boundary with a hard reset of all framers. Alternatively, the framer with the farthest time-to-frame may also be a good candidate for reset, but this approach requires sorting framers according to their expected time-to-frame. We evaluate both approaches in Section 5.6.
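A toy sketch of the Option 2 policy (Python; the data structure and function name are hypothetical): keep at most M candidate framers and, when a new HEC match arrives while all are busy, recycle the oldest one rather than restarting from scratch.

```python
from collections import deque

def note_hec_match(framers: deque, offset: int, m: int) -> None:
    """Record a new candidate frame boundary seen in the Hunt state.
    If all m framers are already in use, reset and reuse the oldest."""
    if len(framers) == m:
        framers.popleft()        # reset the "oldest" framer
    framers.append(offset)       # track the new candidate boundary
```

The oldest candidate is the one most likely to be stale, so dropping it preserves the most recent (and most plausible) boundary hypotheses.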
5.4.2
Scrambler Options
The need for payload scrambling was identified as an afterthought in the original POS specification. For expediency, the technical community selected an ATM-like self-synchronous scrambler with generating polynomial 1 + x^43. Below are some considerations concerning the choice of a self-synchronous scrambler.
The 1 + x^43 scrambler is known to exhibit the poorest bit-randomization properties among all degree-43 polynomials [18]. The state of the self-synchronous scrambler is affected by the input data, and it is relatively easy to generate input packets that yield a periodic bit pattern (with a period of 43 bits) on the link, even after applying the SONET/SDH scrambler. Although today's PDUs often contain as many as 1536 bytes, they can be as large as 64 Kbytes. The periodicity increases the probability of a low bit-transition density of 10% to 20%, instead of the desired 50%. While the situation is not as serious as the one without the additional scrambler (where a zero transition density can be created relatively easily, as shown in Table 5-2), some transmission equipment may be sensitive to even this low transition density.
The self-synchronous nature of the scrambler multiplies errors. In particular, the receiver perceives every bit error on the link as two bit errors (separated by 43 bits). Similar multiplication of errors occurs when there are
two or more errors on the link. Error multiplication interferes with the error correction in the header as well as in the payload. In some environments, the link payload may require a much more powerful FEC than would be required in the absence of error multiplication in order to achieve a given level of burst-error protection. Error multiplication also affects the error detection capability of the FCS. In particular, the CRC-16 polynomial is inadequate in the presence of error multiplication, even with the maximum PDU size of 1536 bytes.
Table 5-2. Probability distribution of bit transition density
TRANSITION DENSITY   EVENT PROBABILITY   AVERAGE TIME PER EVENT AT OC-12C
0/43                 2.0 × 10^−13        322 Years
2/43                 2.0 × 10^−10        177 Days
4/43                 2.8 × 10^−8         20 Hours
6/43                 1.0 × 10^−6         33 Minutes
8/43                 4.0 × 10^−5         1.2 Minutes
10/43                4.3 × 10^−4         4.8 Seconds
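The error-multiplication property described above can be seen directly in a small sketch: with the 1 + x^43 self-synchronous descrambler, a single line error reappears 43 bits later (Python; an all-zero payload is assumed for simplicity, for which the scrambled line signal is also all zeros).

```python
def descramble(bits):
    out, reg = [], 0  # reg holds the last 43 received bits
    for b in bits:
        out.append(b ^ ((reg >> 42) & 1))
        reg = ((reg << 1) | b) & ((1 << 43) - 1)
    return out

line = [0] * 100      # error-free scrambled stream for an all-zero payload
line[10] ^= 1         # inject a single bit error on the link
errors = [i for i, b in enumerate(descramble(line)) if b != 0]
# errors == [10, 53]: the original error plus its echo 43 bits later
```

The injected error corrupts the output both when it is received and again when it exits the 43-bit shift register, which is exactly the doubling effect that weakens the FCS, as discussed above.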
Thus, one is driven to consider an alternative scrambler for GFP. Below we discuss a new independent set-reset scrambler, its operation, and the method for synchronizing the descrambler with the scrambler. All bits in every GFP PDU following the GFP header are scrambled using an independent scrambler with a five-term generating polynomial of degree 48 (1 + x + … + x^48). The transmit scrambler and receive descrambler can be implemented using shift registers with 48 stages that are set to all ones when the link is initialized. Each is refilled with all-one bits if the value in the shift register ever becomes all zeros. This scrambler, shown in Figure 5-13, is not reset at the beginning of each frame, as is the SONET/SDH x^7 + x^6 + 1 scrambler, nor is it modified by the transmitted data, as is the ATM self-synchronous scrambler. Instead, the two ends are kept in synchronization using special GFP messages.
Figure 5-13. An independent set-reset scrambler (a 48-stage shift register of D-type flip-flops, D0 through D47, with exclusive-OR feedback taps)
Each XOR is an exclusive-OR gate, which is equivalent to a modulo-2 adder. Each Dn block is a D-type flip-flop clocked at the appropriate data clock rate. The scrambler is clocked once after transmission of each bit of GFP data, whether or not the transmitted bit is scrambled. When scrambling is enabled for a given octet, the OUT bit is exclusive-ORed with the raw data bit to produce the transmitted bit.
The scrambler and descrambler are kept synchronized by means of periodic GFP messages. Since the scrambler-state exchange by GFP messages does not rely on the SONET/SDH structure or overhead bytes, the whole procedure can be used for GFP over WDM or any other core transport layer. To generate a scrambler state message, a snapshot is taken of the contents of D47 through D0 at the point where the first scrambler state bit is sent. The receiver of a scrambler state message must run the CRC-16 payload check and execute the single-bit header error correction algorithm over this message. If the CRC-16 detects multiple bit errors, the message is dropped and not processed further. Additional mechanisms are also provided to reduce the likelihood that a falsely corrected scrambler state message with multiple bit errors corrupts the running scrambler state.
5.5.
PERFORMANCE
Synchronization errors and delays affect the performance of the transmission layer. In this section, we evaluate the extent of these performance impairments on GFP and compare its link efficiency with that of HDLC-framed PDUs.
5.5.1
Probability of GFP Frame Delineation Loss (FDL)
GFP header integrity is key to the proper operation of GFP, and the HEC field ensures a very low error probability on the PLI field. Whenever the GFP header CRC fails, the value in the PLI field can no longer be trusted (even if the receiver is still synchronized). In those instances, the frame boundaries must be reacquired by entering the Hunt state. Since the header field is four octets long, the probability that the header contains an error is about 32p, where p is the link's Bit Error Rate (BER). Thus, the loss of synchronization would be an order-p event in the absence of header error correction: too weak for gigabit-rate operations. With all single-bit errors corrected, the loss of synchronization becomes an order-p² event, a significant improvement in performance against random bit errors. Indeed, it is easy to show that for low p values, the probability of synchronization loss is given by
P(FDL) = 1 − (1 − p + Hp)(1 − p)^(H−1) ≈ (H(H − 1)/2) p²    (5.1)
where H is the size of the GFP core header in bits. This relation applies to both bit-synchronous and byte-synchronous links. Thus, from Eq. (5.1), for a BER of 10^−6 a frame delineation loss occurs roughly once every 2 × 10^9 frames, while for a BER of 10^−9 P(FDL) is as low as 5 × 10^−16. FDL events are even less common on links with lower BERs, as illustrated in Table 5-3.
Table 5-3. FDL probability as a function of link BER
BER       P(FDL)
10^−6     5 × 10^−10
10^−8     5 × 10^−14
10^−9     5 × 10^−16
10^−10    5 × 10^−18
GFP is primarily designed for core networks, where fiber BERs are generally better than 10^−12. Given a PDU size of 48 bytes and a BER of 10^−9, even at the OC-768 (40 Gbps) rate, GFP would lose synchronization only once every 224 days (on average) due to random bit errors. Frame resynchronization events will then be dominated by the frequency of burst errors, which in fiber transmission systems rarely occur as often as a few times each day. In the presence of burst errors, frame boundary resynchronization needs to take place irrespective of whether the packet size is fixed or variable; therefore the overall number of frame resynchronization events would be virtually the same for ATM and GFP. In the case of HDLC framing, a frame boundary loss occurs much more frequently, because a single error in a flag, or an error in a data octet causing it to look like a flag, results in frame boundary loss. In the case of HDLC, frame loss occurs as an order-p event.
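The 224-day figure can be checked with a quick numeric sketch (Python; H = 32 header bits, a BER of 10^−9, and back-to-back 48-byte PDUs at 40 Gbps are the assumptions, using the small-p approximation of Eq. (5.1)):

```python
H = 32                               # GFP core header size in bits

def p_fdl(p: float) -> float:
    """Small-p approximation of Eq. (5.1): P(FDL) ~ H(H-1)/2 * p^2."""
    return H * (H - 1) / 2 * p * p   # = 496 p^2 for H = 32

p = 1e-9                             # assumed link BER
frames_per_sec = 40e9 / (48 * 8)     # OC-768 carrying back-to-back 48-byte PDUs
mean_days_between_losses = 1 / (p_fdl(p) * frames_per_sec) / 86400
# mean_days_between_losses ~ 224, matching the figure quoted above
```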
5.5.2
Probability of False Frame Synchronization (FFS)
False link synchronization events occur when N false HEC fields are matched in sequence in the incoming byte stream. This event requires N random CRC-16 matches before the GFP link is properly synchronized. With the header CRC-16 operating only in detection mode after a loss of link synchronization is declared, the probability of a random set of four octets passing the CRC-16 check is q = 2^−16, or q = MTU/2^32 when the working MTU for the link is known. Then
P(FFS) = q^N    (5.2)
The duration of this false synchronization event is bounded in time by, at most, one maximum-size PDU.
5.5.3
Probability of Frame Unavailability (FUA)
GFP allows multiple candidate HEC field matches to create up to M simultaneous checkers. Each such checker assumes that the frame boundary has been identified; the receiver (virtually) jumps to the next assumed frame boundary and checks for a valid HEC field match. When all M checkers fail to capture the next incoming GFP header, the framer state machine must be reset and link resynchronization reinitiated. During such events, the framer state machine is essentially unavailable. To quantify the link synchronization performance, one must thus quantify the probability of such events.
Since the probability that the receiver fails to detect multiple header errors is quite small, the hunt process in GFP will typically start at the beginning of the current GFP payload. Subsequent resynchronization attempts, triggered by framer unavailability events, would then start at a random point in some GFP PDU. Below we evaluate the probability of frame unavailability for the two receiver implementation options described earlier; that is, either the first M HEC field matches are preserved (Option 1) or only the last M HEC field matches are preserved (Option 2). For the remainder of this discussion, it is assumed that N = 2.
5.5.3.1
Option 1: Resynchronization reinitiated after all M active framers fail
In this scenario, resynchronization events are triggered after the last of the M spawned framer processes fails to capture the next incoming GFP PDU. The hunt process evolves along one of two disjoint paths. In the first path (FUA1), the search starts just at the beginning of the payload of the incoming GFP PDU, since the probability of undetected GFP header errors is very low. This search occurs any time the receiver is in the Sync state and multiple bit errors are detected. This path can only capture the next proper HEC field when fewer than M false HEC field matches are encountered after examining l − 1 bytes/bits, where l represents the size of the current GFP PDU. In the second path (FUA2), after the first path fails to produce a proper GFP PDU, the hunt process restarts at an arbitrary point within some GFP PDU. Any position within the payload of such a GFP PDU is equally likely to
be the starting point of the search. Again, fewer than M false HEC field matches must be encountered within the residual frame length for the receiver to be in a position to capture the proper GFP header. For an arbitrary frame size distribution P_B(l), the probability of next frame unavailability becomes
P_FU1 = Σ_{l>M} P_B(l) · Σ_{k=M}^{l−1} C(l−1, k) · q^k (1 − q)^{l−1−k}

P_FU2 = Σ_{l>M} P_B(l) · (1/l) · Σ_{i=M}^{l} Σ_{k=M}^{i−1} C(i−1, k) · q^k (1 − q)^{i−1−k}        (5.3)

5.5.3.2
Option 2: Reusing the oldest framer after the (M+1)-th HEC field match while in the Hunt state
In this scenario, excessive HEC field matches while in the Hunt state do not trigger the reinitialization of the resynchronization process from scratch. Only the oldest framer is reset and reused to store the new HEC field match. The next GFP PDU becomes unavailable only if there are more than M HEC field matches between two consecutive GFP PDUs. Since there is at least one HEC field match in any GFP PDU of size l, the probability of next frame unavailability becomes
P(FUA) = Σ_{l>M} P_B(l) · Σ_{k=M}^{l−1} C(l−1, k) · q^k (1 − q)^{l−1−k}        (5.4)
The computation of the frame unavailability probability so far ignores the impact of the link's BER. With the header CRC operating only in detection mode while in the Hunt state, the next GFP PDU boundary will also be missed whenever the next GFP header is in error. Assuming independent random bit errors, then

P_FU(BER) = P_hdr + P(FUA) · (1 − P_hdr)        (5.5)

where P_hdr is the uncorrected header error probability, P_hdr = 1 − (1 − p)³² for the 32-bit core header, and P(FUA) is given by Eq. (5.3) in the case of Option 1, or Eq. (5.4) in the case of Option 2.
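These expressions are easy to evaluate numerically. The sketch below computes the inner binomial sum of Eq. (5.4) for a single (degenerate) PDU size, iterating the binomial terms to avoid enormous intermediate combinations; the function names and single-size simplification are ours, not part of the standard:

```python
def p_fua(l, m, q):
    """Eq. (5.4) inner sum for one PDU size l: probability of m or more
    false HEC matches among the l-1 candidate positions."""
    n = l - 1
    term = (1 - q) ** n            # binomial term for k = 0
    total = term if m == 0 else 0.0
    for k in range(n):             # advance the term from k to k+1
        term *= (n - k) / (k + 1) * q / (1 - q)
        if k + 1 >= m:
            total += term
        if term < 1e-300:          # remaining terms are negligible
            break
    return total

def p_fu_ber(l, m, q, p):
    p_hdr = 1 - (1 - p) ** 32      # any error in the 32-bit core header
    return p_hdr + p_fua(l, m, q) * (1 - p_hdr)    # Eq. (5.5)

print(p_fua(1500, 2, 2 ** -16))    # ~2.6e-4 for a 1500-octet PDU, M = 2
```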
Figure 5-14 and Figure 5-15 show P(FUA) as a function of the GFP PDU size and the number of framers over octet- and bit-synchronous channels, respectively. In these figures, M1 refers to P(FUA2) under Option 1, while M2 refers to Option 2. As expected, Option 2 gives slightly higher values for P(FUA), particularly for larger PDU sizes. P(FUA) values for bit-synchronous links are necessarily higher, as a larger number of events must be examined in between consecutive GFP PDUs.
Figure 5-14. Probability of next frame unavailability as a function of the GFP PDU size and number of framers (octet-synchronous channel)
Figure 5-15. Probability of next frame unavailability as a function of the GFP PDU size and number of framers (bit-synchronous channel)
5.5.4
Frame Acquisition Delay
For any given receiver implementation, the frame acquisition delay is a function of the packet size, the link BER, the number of framers, and the required number of consecutive HEC field matches. The most relevant measure of this impairment is the mean time to frame (MTTF), that is, the average time it takes to regain synchronization when a certain number, M, of framers is used in the Pre-Sync state. When M false matches create M checkers, the link resynchronization process is reinitiated. The process will start again after losing at least one GFP PDU for link-synchronization purposes. It is also possible that the true GFP frame boundary will be missed because of an error in the header. A checker with a true frame boundary may also revert to the Hunt state when an error occurs in one of the subsequent N−1 GFP headers. Therefore, the MTTF is a function of the probabilities calculated above and the BER. Assuming that the boundary search is initiated at a random point in the current GFP PDU, there are three main components to the boundary acquisition delay:

1. The time (t1) spent examining the subsequent bytes/bits after the initial HEC field failure, from the beginning of the current GFP payload to all but one byte/bit of the next candidate frame header. This time is uniformly distributed over the length of the GFP PDU.
2. The time (t2) spent examining the last complete GFP PDU prior to declaring link (re)synchronization. This time is uniformly distributed over the length of the GFP PDU.
3. The time (tfhf) spent chasing false HEC field matches and restarting the link resynchronization process. This is essentially a geometrically distributed event with rate P_FU2.
Thus, for our receiver implementations, the MTTF can be expressed as

MTTF = E(t1) + E(t2) + P_FU2 · E(tfhf)/(1 − P_FU2)

For either Option 1 or 2, either E(t1) = E(l)/2 or E(t1) = E(l), depending on whether it relates to an initial synchronization or a resynchronization event, while E(t2) = E(l). E(tfhf) is just the mean value of the (uniformly) randomly matched PLI field, or E(tfhf) = L/2, where L is the largest possible GFP PDU. Figure 5-16 and Figure 5-17 show the MTTF as a function of the GFP PDU size and the number of framers over octet- and bit-synchronous channels, respectively. Option 2 yields lower MTTF values for GFP PDU
sizes up to 4K octets as compared with Option 1, which reflects the impact of the missed frame acquisition opportunities once the framer state machine becomes unavailable. For larger GFP PDU sizes, the lower P_FU values from Option 1 yield better MTTF performance.
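The MTTF expression above can be evaluated directly; the following sketch uses illustrative values (a mean PDU size of 512 octets, a P_FU2 of the order computed earlier, and the 65,535-octet maximum PDU), not figures from the standard:

```python
def mttf(pfu2, mean_l, max_l):
    """MTTF in octets per the expression above, for a resync event."""
    e_t1 = mean_l / 2      # search starts mid-PDU on average
    e_t2 = mean_l          # one further PDU to confirm delineation (N = 2)
    e_tfhf = max_l / 2     # mean chase after a falsely matched PLI field
    return e_t1 + e_t2 + pfu2 * e_tfhf / (1 - pfu2)

mean_l = 512
print(mttf(2.6e-4, mean_l, 65535) / mean_l)   # ~1.5 PDUs
```

With realistic P_FU2 values the geometric false-hunt penalty is tiny, which is why the text quotes an MTTF of about 1.5 GFP PDUs.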
Figure 5-16. MTTF as a function of the GFP PDU size and number of framers (octet-synchronous link)
Figure 5-17. MTTF as a function of the GFP PDU size and number of framers (bit-synchronous link)
Figure 5-18. MTTF as a function of the GFP PDU size and BER for M=2 (octet-synchronous links)
Figure 5-18 shows the impact of the link's BER on the MTTF for an octet-synchronous link. Our calculation shows that for Option 2 the MTTF is sensitive to the BER above 10⁻⁴ and becomes virtually insensitive to the BER below that range. For core networks, the expected BER is much below this threshold. For this scenario, one can assume insensitivity to the BER and use the resulting MTTF in an initial determination of good values for M and N ≥ 2. Option 1 again shows the impact of the lost frame acquisition opportunities when the working frame sizes are relatively small compared with the maximum GFP PDU size. Although not shown, the MTTF is still an increasing function of the GFP PDU size in octets. Similar observations apply to bit-synchronous links. For common packet sizes in today's Internet, two parallel framers, on links with a BER of 10⁻⁶ or better, provide an MTTF of about 1.5 GFP PDUs. Each framer (checker) needs to check at least two consecutive HEC field matches before declaring proper GFP PDU delineation. This is also close to the minimum possible time, since initial synchronization typically starts halfway into one GFP PDU, and the subsequent GFP PDU must also be checked and judged correctly framed. The MTTF analysis suggests that even for very large packets (64 Kbytes), M=2 or M=3 seems adequate. Since such large packets are not common in current data networks, and since the introduction of real-time services in IP networks makes it unlikely that such large packets will become more prevalent in future IP networks, M=2 should suffice for most practical scenarios. Note also that, although not shown, the synchronization performance can be further improved both by taking into account the actual contents of the
assumed PLI fields, particularly when the working MTU is less than the maximum possible GFP PDU, and by reusing previously released framers shown to have pointed to a false GFP PDU.
5.5.5
Scrambler Resynchronization Delay
For the independent scrambler, both the transmitter scrambler and the receiver descrambler need to be synchronized for the receiver to be able to send valid PDUs to higher layers. If the scrambler state is transmitted every k GFP PDUs, then it will take an additional k/2 frames (on average) for the scrambler states to be synchronized. Of course, if the loss of synchronization is communicated to the transmitter, the transmitter can suspend the transmission of all user traffic and continuously send only the scrambler state messages. These messages are short (12 bytes long). In this case, resynchronization will be achieved within the transmission time of 1.5 (short) frames after the transmitter is informed of the loss of synchronization. When random errors occur, the header CRC corrects single bit errors in the GFP header. If the header CRC detects errors in the GFP header that cannot be corrected, it will send the receiver to the Hunt state. Burst errors will almost always cause the GFP receiver to enter the Hunt state. Generally, burst errors in fiber systems appear to last between 20 and 40 ms. Once the error burst is over, an additional interval of either 2 frames (for self-synchronous scramblers) or 2 + k/2 frames (for the independent scrambler) will elapse before link resynchronization is achieved. This interval is insignificant at OC-3c transmission speeds and above.
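As a rough illustration of why this interval is insignificant, the following sketch computes the post-burst resynchronization delay for assumed values of k = 8 and 1500-byte frames on an OC-3c link; the numbers are illustrative, not taken from the standard:

```python
def resync_frames(scrambler, k=0):
    # 2 frames for the self-synchronous scrambler; the independent
    # scrambler also waits ~k/2 frames for a scrambler-state message
    return 2 if scrambler == "self-synchronous" else 2 + k / 2

frame_bits = 1500 * 8          # assumed 1500-byte frames
rate = 150e6                   # roughly an OC-3c rate, bit/s
delay = resync_frames("independent", k=8) * frame_bits / rate
print(f"{delay * 1e6:.0f} us")  # 480 us: negligible vs. a 20-40 ms burst
```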
5.5.6
Link Efficiency
Figure 5-19 compares the datalink transport efficiency of GFP and PPP over HDLC-like framing as defined in RFC 1662 [10]. From the viewpoint of the data link layer, one can readily identify two sources of transmission inefficiency: framing overhead and payload encoding overhead. Framing overhead is typically associated with protocol control information. Payload encoding deals with transparency issues encountered during data transport, such as preventing the receiver from misinterpreting user data as control information, guarding against malicious attacks, or maintaining bit transparency. GFP introduces a fixed amount of framing overhead: either 8 bytes without an FCS field or 12 bytes with an FCS field. In comparison, PPP in HDLC-like framing requires a minimum of 7 to 9 bytes of framing overhead, depending on the size of the FCS field. The HDLC encoding overhead is
variable and loosely bounded. In the best-case scenario, there are no occurrences of the Flag and Control Escape bytes in the data packet, and no encoding overhead is added at all. In the worst case, the packet consists exclusively of Flag and Control Escape bytes, and the encoding overhead is 100%. For purely random data, it is easy to show that the average encoding overhead is about 0.78%. Yet it is not uncommon to find video and compressed voice data traces in which the flag or escape patterns account for 30% to 40% of the bytes. Such large payload size variations can strongly interfere with most QoS management mechanisms, demand looser engineering rules, and hence decrease overall link efficiency for HDLC-encapsulated PDUs.
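The 0.78% figure for random data is simply the probability 2/256 that a byte equals the Flag (0x7E) or Control Escape (0x7D) value, since each occurrence costs one extra escape byte on the line; a quick simulation confirms it:

```python
import random

FLAG, ESC = 0x7E, 0x7D

def stuffing_overhead(payload):
    # fraction of bytes that must be escaped under HDLC-like framing
    return sum(b in (FLAG, ESC) for b in payload) / len(payload)

random.seed(7)
data = bytes(random.randrange(256) for _ in range(1_000_000))
print(f"{stuffing_overhead(data):.3%}")   # ~0.78% = 2/256 for random data
```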
(Figure 5-19 plots efficiency in percent versus GFP PDU size in octets, for GFP, PPP best case, PPP with random data, and PPP worst case.)
Figure 5-19. GFP and PPP/HDLC bandwidth efficiency as a function of the PDU size
Other encoding algorithms may be used in place of HDLC byte stuffing. For instance, the Consistent Overhead Byte Stuffing (COBS) algorithm [19] encodes input data into variable-length code blocks by eliminating a target byte from the data stream. Each code block begins with a code byte indicating the length of the code block, followed by up to 254 bytes of data. For purely random data, COBS overhead is about 0.23%. Yet COBS overhead is also variable, with a best-case overhead of 1 byte for packets without the Flag byte and a worst-case overhead of 1 additional byte for each 254 data bytes.
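A minimal encoder illustrates the scheme; this is an illustrative sketch of the algorithm in [19], using zero as the eliminated target byte, not a production implementation:

```python
def cobs_encode(data: bytes) -> bytes:
    """Each code byte gives the count (+1) of nonzero bytes that follow
    before the next (removed) zero; 0xFF marks a maximal 254-byte block."""
    out, block = bytearray(), bytearray()
    for b in data:
        if b == 0:
            out.append(len(block) + 1)
            out += block
            block.clear()
        else:
            block.append(b)
            if len(block) == 254:     # maximal block, code byte 0xFF
                out.append(0xFF)
                out += block
                block.clear()
    out.append(len(block) + 1)        # final block
    out += block
    return bytes(out)

print(cobs_encode(b"\x11\x22\x00\x33").hex())   # 0311220233
```

The encoded stream contains no zero bytes, so a single reserved byte can delimit frames with at most the stated 1-in-254 worst-case expansion.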
5.6.
APPLICATIONS
Most of today's data traffic originates in corporate LANs, which are over 90% based on Ethernet technology. This situation makes a compelling case for Ethernet as the service interface towards the end users. Public networks, however, are largely SONET/SDH based. This situation makes a compelling case for SONET/SDH as the interface to the public transport network. Below we describe common GFP applications for Ethernet-oriented services. Further applications can be found elsewhere [20, 21].
5.6.1
Ethernet Private Lines
The simplest GFP application is as a flow mapper/demapper into a TDM channel, such as a SONET/SDH path or an Ethernet segment, for an Ethernet Private Line (EPL) service. Service interfaces to the end-users are standard Ethernet PHYs, while the transport over the TDM network is GFP based. There, GFP provides a mechanism to extend the native datalink protocol (such as PPP or the IEEE 802.3 MAC [22]) over an existing transport infrastructure. This scenario is depicted in Figure 5-20. Note that this approach only requires enhancements to the edge of the transport network as opposed to a brand new data transport infrastructure.
Figure 5-20. TDM-based Private Line Services via GFP & Virtual Concatenation
GFP in combination with SONET/SDH Virtual Concatenation and the Link Capacity Adjustment Scheme (LCAS) [23] provides a simple mechanism to adapt the user traffic very tightly to the actual bandwidth demand (e.g., a 1 Gigabit Ethernet client can be mapped into 21 STS-1s or 7 STS-3cs as part of a virtual concatenation group [VCG], as opposed to burning a whole OC-48c, i.e., 48 STS-1s). VCGs provide a simple mechanism for service providers to offer subrate Ethernet transport services in increments of STS-1s. LCAS provides a flexible mechanism for service bandwidth modification and failure management in a hitless manner. Note that the same traffic adaptation model can be used to create a wide variety of similar private line services for SANs and broadcast video applications.
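The 21 × STS-1 figure follows from the payload capacity of a single STS-1 SPE, approximately 48.384 Mbit/s once path overhead and fixed stuff are excluded; a sizing sketch (the payload constant is an approximation):

```python
import math

STS1_PAYLOAD = 48.384e6   # approx. usable payload of one STS-1 SPE, bit/s

def vcg_members(client_bps):
    # smallest STS-1-Xv group that carries the client at full rate
    return math.ceil(client_bps / STS1_PAYLOAD)

print(vcg_members(1e9))    # 21 -> matches the 21 x STS-1 mapping for GbE
print(vcg_members(100e6))  # 3  -> Fast Ethernet fits in 3 STS-1s
```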
5.6.2
Virtual Leased Lines
An application of the GFP Linear Extension Header format allows even finer-grained subrate transport services, as illustrated in Figure 5-21. In that scenario, GFP itself is used to create multiple subchannels within the GFP payload rather than relying on the SONET/SDH layer. Each GFP channel can be allocated an arbitrary fraction of the transport bandwidth. Each channel behaves as a Virtual Ethernet Private Line. This approach is particularly attractive for newer multichannel component interconnect interfaces such as SPI-4/SPI-5. It does require a packet-level scheduler at the GFP layer.
Figure 5-21. Packet-based Virtual Private Line Services via GFP Linear Headers or VLAN tags
5.6.3
Packet Rings
A third application uses the GFP Linear Extension Header in conjunction with Ethernet Bridging functions to create a packet-based logical ring over a TDM infrastructure, as illustrated in Figure 5-22. In this scenario, there is a point-to-point path between the neighboring network elements. On each link, the entire SONET/SDH link is dedicated to the packet ring. The GFP payload can be shared in any arbitrary fashion among the clients using the ring transport services of GFP. A similar capability could be constructed via the proposed Ring Extension Header. Although the ring procedures are currently under study, it is also possible to reuse alternative ring procedures such as the one currently being developed under the IEEE 802.17 [24] working group.
Figure 5-22. Packet-based Ring Services via GFP Linear Headers or VLAN tags
5.7.
FUTURE DIRECTIONS
Most of the work on developing GFP so far has focused on defining client-specific adaptation procedures for a variety of constant bit-rate (Layer 1) or packet-oriented (Layer 2+) client signals. In this regard, it is expected that the direct mappings will be enhanced to include the most commonly used Layer 2+ protocols, including IPv4/IPv6, MPLS, and the OSI protocols, among others. Work is also under way to complete a subrate mode for Fibre Channel signals. There has also been much discussion about the need for a new native packet transport mode specially optimized for SONET/SDH and OTN transport networks and the potential use of GFP for this purpose. GFP already provides the means to propagate label-switching information for statistical multiplexing of any number of client signals (via the Extension Headers), as well as the means to support separate in-band or out-of-band management and control channels for client- and server-layer resource management via the GFP Type field. These tools afford the means of defining a lightweight packet transport protocol as an alternative to more established approaches such as ATM or MPLS.
5.8.
REFERENCES
[1] ITU-T Recommendation G.7041/Y.1303, The Generic Framing Procedure (GFP), 2003.
[2] American National Standard for Telecommunications, Synchronous Optical Network (SONET) Payload Mappings, ANSI T1.105.02, 2002.
[3] ITU-T Recommendation I.432, B-ISDN User-Network Interface — Physical Layer Specification, 1993.
[4] ISO/IEC 3309:1991(E), Information Technology — Telecommunications and Information Exchange Between Systems — High-level Data Link Control (HDLC) Procedures — Frame Structure, 4th Edition, International Organization for Standardization, 1991.
[5] ISO/IEC 4335:1991(E), Information Technology — Telecommunications and Information Exchange Between Systems — High-level Data Link Control (HDLC) Procedures — Elements of Procedures, 4th Edition, International Organization for Standardization, 1991.
[6] American National Standard for Telecommunications, Synchronous Optical Network (SONET): Physical Interfaces Specifications, ANSI T1.105.06, 2000.
[7] ITU-T Recommendation G.707, Network Node Interface for the Synchronous Digital Hierarchy (SDH), 1996.
[8] ITU-T Recommendation G.709, Interfaces for the Optical Transport Network (OTN), 2001.
[9] American National Standard for Telecommunications, Integrated Services Digital Network — Core Aspects of Frame Protocol for Use with Frame Relay Bearer Service, ANSI T1.618-1991, June 1991.
[10] W. Simpson (Ed.), PPP in HDLC-like Framing, RFC 1662, July 1994.
[11] A. Malis and W. Simpson, PPP over SONET/SDH, RFC 2615, June 1999.
[12] ITU-T Recommendation X.85, IP over SDH using LAPS, 2001.
[13] ITU-T Recommendation X.86, Ethernet over LAPS, 2001.
[14] J. Carlson, P. Langner, J. Manchester, and E. Hernandez-Valencia, "The Simple Data Link (SDL) Protocol," RFC 2823, May 2000.
[15] J. Baldwin, B. Bharucha, B. Doshi, S. Dravida, and S. Nanda, "AAL2 — A new ATM adaptation layer for small packet encapsulation and multiplexing," Bell Labs Technical Journal, April–June 1997.
[16] B. Doshi, S. Dravida, P. Magill, C. Siller, and K. Sriram, "A broadband multiple access protocol for STM, ATM, and variable length data services on hybrid fiber-coax networks," Bell Labs Technical Journal, July–September 1996.
[17] D. Fiorini, M. Chiani, V. Tralli, and C. Salati, "Can we trust in HDLC?," ACM Computer Communication Review, pp. 61–80, 1994.
[18] I. Fair, V. Bhargava, and Q. Wang, "On the power spectral density of self-synchronous scrambled sequences," IEEE Transactions on Information Theory, Vol. 44, No. 4, pp. 1687–1692, July 1998.
[19] S. Cheshire and M. Baker, "Consistent overhead byte stuffing," Proceedings of SIGCOMM'97, September 1997.
[20] E. Hernandez-Valencia, "Hybrid Transport Solutions for TDM/Data Networking Services," IEEE Communications Magazine, Vol. 40, No. 5, pp. 104–112, May 2002.
[21] M. Scholten, Z. Zhu, and E. Hernandez-Valencia, "Data Transport Applications Using GFP," IEEE Communications Magazine, Vol. 40, No. 5, pp. 96–103, May 2002.
[22] IEEE 802.1D (ISO/IEC 15802-3:1998), IEEE Standard for Local and Metropolitan Area Networks — Common Specifications — Media Access Control (MAC) Bridges, 2002 Edition.
[23] ITU-T Recommendation G.7042/Y.1304, The Link Capacity Adjustment Scheme (LCAS), 2001.
[24] IEEE P802.17, Resilient Packet Rings (RPR), Draft version 2.2, April 2003.
Chapter 6
SYNCHRONIZATION OF OPTICAL NETWORKS
An overview of network-level synchronization
*Geoffrey M. Garner, **Gert H. Manhoudt
*Consultant, **AimSys BV
6.1.
THE FIELD OF NETWORK SYNCHRONIZATION ENGINEERING
6.1.1
Introduction
The branch of network engineering that studies the distribution and quality of the clock signals used in the public telecommunications network calls itself synchronization network engineering. In today's telecommunications networks, the clocks in transmission and switching equipment are often required to operate at equal or almost equal frequencies in order to transport digital signals between them without introducing single bit errors or bursts of errors. Synchronous operation of equipment that is spread out over a large geographic area requires a distribution network for synchronization information. The effects that cause degradation of this synchronization information can be divided into two categories. First, there is continuous degradation of these signals due to the accumulation of phase noise, caused by imperfect components and designs. This causes jitter and wander on the digital signals that are transported over the network. Excessive levels of jitter and wander can cause bit errors, loss of frame, or controlled slips. Second, there may occasionally be a complete failure of a synchronization link, leaving network elements or entire network parts without synchronization information. The design of the synchronization network tries to minimize the effect of both the continuous phase noise accumulation and the effect of incidental
loss of synchronization. Sections 6.2 through 6.4 of this chapter describe the theory and practice of phase noise, while Section 6.5 concentrates on the protection against loss of synchronization reference due to link failures.

6.1.1.1
Short History
The history of the specification of network synchronization, jitter, and wander is closely coupled to the history of digital transmission. Only when transmission became digital did the timing of the symbols on the line become important. The earliest problems to tackle involved jitter. Initially, digital transmission systems were used for point-to-point transmission to replace the Frequency Division Multiplexed (FDM) systems in use at the time. The advances in integrated electronics made it possible to build the more complex digital Time Division Multiplexed (TDM) systems. These allowed, in principle, the building of transmission paths that had no degradation, irrespective of the length of the path, as long as the "bits" on the line were recovered in each regenerator without error. To allow TDM systems to operate error free, the bits had to be sent on the line at very regular, equidistant points in time. Deviation from this ideal was called jitter, and limits were set on its magnitude so as to control the number of bit errors made in the receiver. The next step in the evolution of digital systems was the concatenation of multiple digital systems, by directly patching through the 64 kbit/s DS0 digital signals themselves on channel banks. This practice required the clocks of these systems to be equal, because the applied (primary) TDM multiplex method required the timing of the tributary signals to be coupled to the aggregate clock. This system of interconnected TDM systems became a nationwide network when digital switching was introduced. This required all switches and all 64 kbit/s signals to be synchronous. To make sure that this synchronicity was guaranteed under all conditions, a specific clock distribution network was deployed. It required the specification of clock accuracies, reference switching, and holdover behavior to limit impairments due to wander and frame slips. This was the situation when SDH and SONET were introduced.
The SDH/SONET multiplexing system required that the clock distribution network no longer be carried over the PDH E1 or DS1 trunk signals, but instead be shifted to the OC-M/STM-N carriers of SONET/SDH. Moreover, the SDH/SONET network itself needed to be synchronized to avoid the accumulation of large amounts of wander in its payload signals.
The last step was the introduction of OTN. This network is basically again an analog (FDM) network, but instead of multiplexing RF frequencies, optical wavelengths are multiplexed. Similar to the PDH network, however, the OTN network itself can operate asynchronously and still transport a synchronous STM-N/OC-M payload.
6.2.
BACKGROUND ON TIMING, SYNCHRONIZATION, AND JITTER
This section provides background on timing, synchronization, and jitter, and their importance in digital networks. A more detailed overview of this subject, with an emphasis on SONET/SDH networks, is given in [1].
6.2.1
Basics of Digital Transmission, Timing Jitter, and Alignment Jitter
At a fundamental level, a network transports client information from an ingress to an egress point over a number of links and nodes. In the case of a digital network, the information is transported as successive bits at a rate that may be constant or variable. Whatever the rate is at the ingress, each bit has associated with it an instant of time at which the bit is transmitted. If the client signal is constant bit rate (CBR), which is the case for Plesiochronous Digital Hierarchy (PDH), Synchronous Optical Network (SONET), Synchronous Digital Hierarchy (SDH), and Optical Transport Network (OTN) signals, and if the client itself has no timing impairments, then the client bit times at the ingress are equally spaced with period T equal to the reciprocal of the bit rate. This is illustrated in Figure 6-1. At a low layer, a bit may be transported over a single link as an analog pulse, with different pulses representing 1 and 0. The specific forms of the pulses are determined by the line coding. For example, a non-return-to-zero (NRZ) line coding represents 1 by a particular voltage or optical power level (depending on whether the link is electrical or optical, respectively) and 0 by a different voltage or optical power level. For simplicity in the examples given here, the levels may be taken as 1 V or 0 dBm for binary 1 and 0 V or −10 dBm for binary 0. Then a stream of NRZ-encoded bits with constant rate and no impairments is represented mathematically as a train of such pulses [2]
x(t) = Σ_n a_n g(t − nT)        (6.1)
where
g(t) = rectangular pulse of height 1, extending from t = 0 to t = T
a_n = n-th bit (a_n ∈ {0, 1})
T = size of unit interval in units of time (seconds) = 1/(bit rate in bits/s)
x(t) = amplitude function for pulse train at transmitter.
An NRZ-encoded bit stream is illustrated in Figure 6-2. In this example, the bit time (sometimes referred to as the significant instant) is associated with the leading edge of the pulse. The train of pulses, Eq. (6.1), may be represented by an actual signal (baseband signal) or may be used to modulate a higher-frequency carrier. The latter is normally the case in an optical transmission system, where the signal modulates an optical carrier of a given wavelength. At the receiver end of the link, the pulses will be distorted due to impairments in the transmission process. There also are, in practice, timing impairments, which shift the pulses in time relative to the equally spaced ideal time instants. Finally, noise is present due to various sources in the link. The resulting amplitude function for the pulse train at the receiver is [2]

y(t) = Σ_n a_n h(t − nT − e[nT] − τ_0) + η(t)
(6.2)
where (square brackets denote a discrete-time function)
h(t) = distorted pulse due to transmission impairments
e[nT] = shift in timing of pulse due to timing impairments
T = size of unit interval in units of time (seconds) = 1/(bit rate in bits/s)
y(t) = amplitude function for pulse train at receiver
η(t) = noise due to all sources
τ_0 = average propagation delay between transmitter and receiver.
An NRZ-encoded bit stream with distortion and timing impairments, and with the average propagation delay τ_0 removed, is illustrated in Figures 6-3 and 6-4. Figure 6-3 shows the signal with only timing impairments, while Figure 6-4 shows it with both timing impairments and distortion. The function e[nT] is the phase-error function (also referred to as the phase).
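Eqs. (6.1) and (6.2) are easy to prototype. The sketch below evaluates an ideal NRZ pulse train and, optionally, one with per-bit edge shifts e[nT]; the bit pattern is illustrative, distortion and noise are omitted:

```python
T = 1.0                       # unit interval
bits = [1, 1, 0, 1, 0, 1]     # example a_n sequence

def nrz(t, bits, jitter=None):
    """Value at time t of x(t) = sum_n a_n g(t - nT) (Eq. 6.1); `jitter`
    supplies the per-bit timing shifts e[nT] of Eq. (6.2)."""
    for n, a in enumerate(bits):
        e = 0.0 if jitter is None else jitter[n]
        # g is a height-1 rectangle spanning [nT + e, (n+1)T + e)
        if a and n * T + e <= t < (n + 1) * T + e:
            return 1.0
    return 0.0

print(nrz(0.5, bits))   # 1.0 -> inside the first '1' pulse
print(nrz(2.5, bits))   # 0.0 -> the third bit is 0
```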
Figure 6-1. Bit times for CBR signal with no timing impairments
Figure 6-2. Example NRZ-encoded bit stream, with no distortion (amplitude) or timing impairments, and no noise
Figure 6-3. Example NRZ-encoded bit stream, with timing impairments but no distortion
Figure 6-4. Example NRZ-encoded bit stream, with timing impairments and distortion
Figure 6-5. Schematic of sampling and decision process for data recovery
Figure 6-6. Functional block for clock recovery low-pass filter
To determine whether a received pulse is a 1 or a 0, it must be sampled as close to the shifted pulse time nT + e[nT] as possible. A decision circuit compares the sampled value with a threshold. The sampling and decision process is illustrated in Figure 6-5. The mathematical theory of this process is described in detail in a number of references (see, for example, [2]-[5]) and will not be repeated here. However, we note that, in general, a larger magnitude of any impairment or noise source results in a higher probability of a bit error. In particular, a larger offset between the ideal sampling time nT + e[nT] and the recovered sampling time nT + e_r[nT] results in a higher
probability of a bit error. The offset between the ideal sampling time and the recovered sampling time, e_a[nT] = e[nT] − e_r[nT], is referred to as the alignment jitter. The determination of the recovered sampling instant is done with a clock recovery circuit. These circuits are often implemented using phase-locked loops (PLLs). Clock recovery circuits are described in more detail in [2], [6], and [7]. Functionally, the clock recovery circuit acts as a low-pass filter on the actual phase error e[nT] to produce a recovered phase error e_r[nT] that is as close to e[nT] as possible. This process is illustrated in Figure 6-6, where the actual and recovered phase error are related by

e_r(s) = H(s)e(s)        (6.3)

with H(s) the transfer function of a suitable low-pass filter. Then, the alignment jitter is related to the actual phase error by

e_a(s) = [1 − H(s)]e(s) ≡ H_e(s)e(s)        (6.4)
Since H(s) is a low-pass filter, H_e(s) = 1 − H(s) is a high-pass filter. Therefore, the alignment jitter is equal to the phase error filtered by a suitable high-pass filter; alternatively, it is equal to the short-term variations in the phase error. Eq. (6.4) also illustrates that, since H_e(s) is a high-pass filter, sufficiently slow variations in the phase error e(s) will result in very small alignment jitter, i.e., the variations will be tracked. However, fast variations in e(s) above the bandwidth, or corner frequency, of the clock recovery circuit will not be tracked, and will result in larger alignment jitter. Higher-frequency variation in the phase error is referred to as timing jitter. If Figure 6-5 represents the clock and data recovery process at a receiver, the data may be retransmitted on the next link (unless this is the final link), with possibly some buffering at the receiver node. There are two possibilities for the timing of the retransmitted data. First, the data may be transmitted with a clock that is independent of the timing of the received data (and therefore independent of the clock timing the data at the upstream node). In this case, the data buffer in the receiver node will eventually overflow or underflow, resulting in a slip. This mechanism is used in Public Switched Telephone Network (PSTN) switches; however, it is not used in the SONET, SDH, and OTN physical layers. Second, the transmit clock may be derived from (or be the same as) the recovered timing from the received data. This mechanism is used in OTN 3R regenerators and in SDH/SONET and PDH regenerators. Because OTN 3R and SDH regenerators also must process some overhead, the data must be buffered. In this case, it is advantageous (and, in the case of OTN, required) to filter the
recovered clock in Figure 6-5 with a second, narrower-bandwidth phase-locked loop to further reduce any jitter. This second PLL can help control jitter accumulation over multiple 3R regenerators. However, the data buffer must be sufficiently large to accommodate momentary timing differences between the input data (clock) to the receiver and the filtered clock output of this second PLL. In general, the buffer must be made larger as the bandwidth of the second PLL is made smaller. The second, narrower-bandwidth PLL is sometimes referred to as a dejitterizer.
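The tracked/untracked split between the jitter transfer of Eq. (6.3) and the alignment-jitter transfer of Eq. (6.4) can be sketched numerically. This is a minimal illustration assuming a first-order PLL; the 1 MHz corner frequency is a hypothetical value, not taken from any standard.

```python
import math

def clock_recovery_gains(f, f1):
    """Magnitude responses of a first-order clock-recovery PLL.

    H is the low-pass jitter transfer of Eq. (6.3), e_r(s) = H(s)e(s);
    He = 1 - H is the high-pass alignment-jitter transfer of Eq. (6.4).
    f1 is the PLL corner frequency in Hz.
    """
    h = f1 / math.hypot(f, f1)   # |H(j2*pi*f)|  = f1 / sqrt(f^2 + f1^2)
    he = f / math.hypot(f, f1)   # |He(j2*pi*f)| = f  / sqrt(f^2 + f1^2)
    return h, he

f1 = 1e6  # hypothetical 1 MHz PLL corner frequency
# Slow phase variation (f << f1) is tracked: little alignment jitter.
h_lo, he_lo = clock_recovery_gains(1e3, f1)
# Fast variation (f >> f1) is not tracked: it becomes alignment jitter.
h_hi, he_hi = clock_recovery_gains(100e6, f1)
```

For this first-order model |H|² + |H_e|² = 1 at every frequency, so whatever portion of the phase error the PLL does not track appears as alignment jitter.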
6.2.2 Jitter Tolerance, Transfer, Generation, and Network Limit
The ability of equipment to accommodate input jitter is referred to as jitter tolerance. As mentioned above, higher-frequency input phase error, or timing jitter, results in alignment jitter, or a difference between the ideal and actual sampling instants; larger alignment jitter means larger bit error probability or bit error ratio (BER). More precisely, a larger alignment jitter means that the actual sampling point is further from the center of the pulse; since real pulses are not exactly square, the actual sampling point is more likely to be on a portion of the pulse that is rising or falling and therefore closer to the decision threshold. There is then a greater probability of noise causing the sample to fall on the wrong side of the decision threshold, resulting in a bit error. The effect of jitter on BER can be mitigated by increasing the average signal power relative to the noise level. This approach is used in defining and specifying jitter tolerance. Jitter tolerance for an optical receiver is defined as the minimum level (i.e., minimum peak-to-peak amplitude) of sinusoidal jitter of a given frequency that must be accommodated, which results in a 1 dB power penalty. Specifically, the average signal power is increased by 1 dB relative to the noise, which results in a drop in BER. Jitter is then added, and its level is increased until the BER is equal to its level prior to increasing the power. This is the level of jitter that the equipment can tolerate, and it must not be smaller than the specified jitter tolerance. Sinusoidal jitter, i.e., sinusoidally varying input phase error, is used in specifying jitter tolerance. This is conservative, because sinusoidal jitter is closer to its positive and negative peak values for a greater fraction of time compared with more realistic jitter distributions (e.g., Gaussian).
Since input jitter above the clock recovery PLL bandwidth is not tracked, the jitter tolerance is approximately the same for frequencies above this corner frequency. For lower frequencies, the fact that the alignment jitter is equal to the input jitter filtered by a low-pass filter means that, for a given level of alignment jitter, the input jitter that results in this outcome as a
function of frequency has a −20 dB/decade slope (assuming the PLL has a 20 dB/decade roll-off). To show this, assume the clock recovery PLL has a first-order closed-loop response with transfer function

H(s) = ω₁/(s + ω₁)    (6.5)

where ω₁ = 2πf₁ is the PLL corner frequency. Then

H_e(s) = 1 − H(s) = s/(s + ω₁)    (6.6)

|H_e(j2πf)| is illustrated schematically in Figure 6-7 (this is an asymptotic approximation; the actual log-magnitude transfer function is 3 dB lower at the corner frequency f₁). Let the input signal have sinusoidal phase error with amplitude A_o(f) (the amplitude can depend on frequency). Since the magnitude of the phase error transfer function relates the amplitude of the input phase error to the amplitude of the output phase error (specifically, it is the ratio of the latter to the former), the amplitude of the alignment jitter, A_a(f), may be written

A_a(f) = |H_e(j2πf)| A_o(f) = f A_o(f) / √(f² + f₁²)    (6.7)

Then

A_o(f) = A_a(f) √(f² + f₁²) / f    (6.8)
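Eq. (6.8) can be evaluated directly to exhibit the two asymptotes of the tolerance curve. This is an illustrative sketch; the corner frequency and alignment-jitter amplitude below are arbitrary values, not figures from the standards.

```python
import math

def jitter_tolerance(f, f1, a_a):
    """Sinusoidal input jitter amplitude producing a fixed alignment-jitter
    amplitude a_a, per Eq. (6.8): A_o(f) = A_a * sqrt(f^2 + f1^2) / f."""
    return a_a * math.hypot(f, f1) / f

f1, a_a = 1e6, 0.15   # hypothetical corner frequency (Hz) and amplitude (UI)
tol_two_down = jitter_tolerance(f1 / 100, f1, a_a)    # 2 decades below f1
tol_three_down = jitter_tolerance(f1 / 1000, f1, a_a) # 3 decades below f1
tol_high = jitter_tolerance(100 * f1, f1, a_a)        # well above f1
```

Going one decade lower in frequency multiplies the tolerable amplitude by ten (the −20 dB/decade slope), while far above f₁ the tolerance flattens at A_a.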
[Axes: log magnitude vs. frequency (log scale); f₁ = PLL corner frequency]
Figure 6-7. Schematic of log-magnitude transfer function of clock recovery PLL
Now set A_a(f) equal to the alignment jitter amplitude that corresponds to a 1 dB power penalty, i.e., the value of alignment jitter amplitude that would result in unchanged BER if the power were increased 1 dB. This value of A_a is independent of frequency, and therefore A_o(f) does depend on frequency. The jitter tolerance is then

A_o(f) = A_a √(f² + f₁²) / f    (6.9)

and is illustrated schematically in Figure 6-8. A_o(f) has a slope of −20 dB/decade for frequencies that are small compared with the PLL corner frequency, f₁, and is equal to A_a for frequencies that are large compared with f₁. In practice, jitter is often specified in terms of peak-to-peak values rather than amplitude (i.e., zero-to-peak values); this corresponds to multiplying both sides of Eq. (6.9) by 2. The full jitter tolerance mask is illustrated in Figure 6-9 (adapted from Figure 1.2 of [8] and Figure 1.2 of [9]), where the frequencies f₁ and f₂ represent the bandwidths of the dejitterizer PLL and clock recovery circuit, respectively. Similar arguments apply to the dejitterizer PLL, except here the jitter tolerance is determined by the data buffer size. A more detailed discussion of this is given in Appendix I of [8] and Appendix I of [9]. A jitter tolerance mask of the form in Figure 6-9 represents the minimum level of jitter that equipment must tolerate. Therefore, it may be used to define the network limit, i.e., the maximum level of jitter allowed in the network. Since the actual jitter in a network is generally not sinusoidal, the network limit is defined using high-pass filters. The wide-band jitter is defined as the peak-to-peak value of the phase error signal filtered by a high-pass filter with corner frequency f₁. The high-band jitter is defined as the peak-to-peak value of the phase error signal filtered by a high-pass filter with corner frequency f₂. The wide-band and high-band jitter must not exceed A₁ and A₂ (given in Figure 6-9), respectively.

The PDH, SDH/SONET, and OTN specifications also define low-pass filters to be
applied when measuring the jitter network limits; these have bandwidths that are more than an order of magnitude greater than the high-band jitter high-pass measurement filter, and are intended to represent an upper practical limit to the spectral content of the jitter that can arise in the network.
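The measurement idea behind the network limit — high-pass filter the phase error, then take its peak-to-peak value — can be sketched with a simple first-order filter. The sample rate, corner frequency, and signal components below are illustrative choices, not the filters or limits specified in the standards.

```python
import math

def highpass(x, fs, fc):
    """First-order high-pass filter; fs = sample rate, fc = corner (Hz)."""
    rc = 1.0 / (2 * math.pi * fc)
    dt = 1.0 / fs
    alpha = rc / (rc + dt)
    y = [0.0]
    for n in range(1, len(x)):
        y.append(alpha * (y[-1] + x[n] - x[n - 1]))
    return y

def peak_to_peak(x):
    return max(x) - min(x)

# Hypothetical phase error = slow wander + fast jitter, in UI.
fs = 1e6
t = [n / fs for n in range(20000)]
slow = [0.5 * math.sin(2 * math.pi * 10 * u) for u in t]      # 10 Hz wander
fast = [0.05 * math.sin(2 * math.pi * 50e3 * u) for u in t]   # 50 kHz jitter
phase = [a + b for a, b in zip(slow, fast)]
# A 5 kHz high-pass rejects the wander; the jitter term passes through.
wideband = peak_to_peak(highpass(phase, fs, fc=5e3))
```

The filtered peak-to-peak value reflects only the fast 50 kHz component (about 0.1 UI here), while the raw phase error is dominated by the slow wander.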
[Axes: amplitude (log scale) vs. frequency (log scale); −20 dB/decade slope below the corner frequency]
Figure 6-8. Schematic of jitter tolerance of clock recovery PLL
[Axes: peak-to-peak amplitude (log scale) vs. frequency (log scale); −20 dB/decade slope. A₁ = dejitterizer PLL jitter tolerance; A₂ = clock recovery PLL jitter tolerance; f₁ = dejitterizer PLL bandwidth; f₂ = clock recovery PLL bandwidth]
Figure 6-9. Relation between jitter tolerance mask and dejitterizer and clock recovery PLL bandwidths (adapted from Figure 1.2 of Reference [8] and Figure 1.2 of Reference [9])
Clock recovery circuits and dejitterizer filters, in practice, produce some amount of jitter, and this must be limited to prevent excessive jitter accumulation. The jitter generation of a piece of equipment is defined as the peak-to-peak jitter of the output when the input is jitter-free. Jitter
measurement filters must be specified. For OTN [8] and Option 1 SDH [10], high-band jitter generation and wide-band jitter generation are defined, with measurement filters consistent with the jitter tolerance masks and network limits. For Option 2 SDH (SONET) [10], [11], the jitter generation measurement filters for OC-48/STM-16 and lower rates are not consistent with the network limits and jitter tolerance masks; however, the jitter generation and measurement filters are consistent with the network limits and jitter tolerance for OC-192/STM-64. The lack of consistency for SONET OC-48 and lower rates is mainly historical. Finally, timing jitter can accumulate over a chain of regenerators. Early studies of jitter accumulation, based on first-order filter regenerator models, are given in [12]. Later studies using second-order filter models that accounted for gain peaking are given in [13]. Both types of results are described in [2]. The results show that limiting the regenerator bandwidth limits the jitter accumulation. In addition, the second-order filter results show that gain peaking can cause jitter accumulation to increase sharply after some number of regenerators. More recently, jitter accumulation studies were performed for OTN 3R regenerators using a model that explicitly accounts for noise in the PLL phase detector (PD), voltage-controlled oscillator (VCO), and input. The model studies are documented in Appendix IV of [8], and the results were used to derive the 3R regenerator jitter requirements in [8]. Some of these results are used in examples in Section 6.4.2.
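The gain-peaking effect can be sketched with a second-order regenerator model: a chain of N identical regenerators has jitter transfer |H(f)|^N, so any peaking above unity gain is amplified geometrically with N. The natural frequency, damping factor, and chain lengths below are hypothetical, not parameters from the cited studies.

```python
import math

def second_order_gain(f, fn, zeta):
    """|H(j2*pi*f)| of a second-order PLL-type low-pass with natural
    frequency fn and damping zeta; lightly damped responses exhibit
    gain peaking near fn."""
    r = f / fn
    num = math.hypot(1.0, 2 * zeta * r)          # |1 + j*2*zeta*r|
    den = math.hypot(1.0 - r * r, 2 * zeta * r)  # |1 - r^2 + j*2*zeta*r|
    return num / den

def peak_cascade_gain(n_regens, fn=1e6, zeta=0.3, points=2000):
    """Max over frequency of |H(f)|**n for a chain of n regenerators."""
    freqs = [fn * 10 ** (-2 + 3 * k / points) for k in range(points)]
    return max(second_order_gain(f, fn, zeta) ** n_regens for f in freqs)
```

With zeta = 0.3 a single stage peaks near a factor of 2, so a chain of five stages already peaks near 2⁵; narrowing the bandwidth or increasing the damping suppresses this growth.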
6.2.3 Mapping and Multiplexing
Often, a client signal is mapped into a server layer where the timing of the server is independent of the timing of the client. This is necessarily true when a number of client signals, each with timing independent of the others, are multiplexed into a higher-rate server layer; in this case, the server timing cannot be simultaneously traceable to all the client timings because the latter are all different. However, there are cases where a single client signal is mapped into a server layer (of slightly higher rate to account for server layer overhead), with the client and server timing independent of each other. Note that even if the multiple client signal clocks are specified to have the same nominal rate, the actual signal rates will not be the same if they are provided by different clocks because the frequency of a clock is never specified exactly; rather, a nominal rate plus a tolerance is specified (frequency tolerance and frequency offset are described in Section 6.2.5). A mapping of a client signal into a server layer with the client and server timings independent of each other is referred to as an asynchronous mapping. A mapping of a client signal into a server layer with the server timing traceable
to the client timing (and no other relation between client and server byte or frame boundaries) is referred to as a bit-synchronous mapping. In PDH, asynchronous multiplexing of lower-rate signals into higher-rate signals is defined. In SONET and SDH, asynchronous, bit-synchronous, and byte-synchronous mappings of PDH clients are defined; however, in the vast majority of cases only the asynchronous mappings are used. In OTN, asynchronous and bit-synchronous mappings of CBR clients are defined. In bit-synchronous mapping, the server timing can be obtained from the client timing using a PLL that multiplies the client clock by the ratio of server to client nominal rate. The effect on jitter is similar to that of a regenerator. However, if the client signal is lost, the server timing will also be lost unless there is an alternative source of timing. One way of providing this alternative is for the PLL to be part of a clock that can enter the free-run condition when its input is lost; another way is to have a separate clock whose timing is used when the client input is lost. In either case, requirements would be necessary to limit the phase and frequency transient when the client is lost and the server timing switches to another source. In asynchronous mapping, the server timing is independent of the client timing. The client rate must be adapted to the server rate. The most common schemes for rate adaptation use a form of stuffing at the mapper coupled with destuffing and filtering at the demapper (the mapper is sometimes referred to as a synchronizer; the filter plus destuffer are collectively referred to as a desynchronizer). The client signal is buffered at the mapper; the client bits enter the buffer at the client rate and leave the buffer at the average rate of the server payload area (this rate is less than the server rate due to the presence of server layer overhead).
In general, these two rates differ; therefore, if this process continues indefinitely, the buffer will eventually overflow or underflow. To prevent this, the buffer fill is monitored and, based on an algorithm whose input is the buffer fill, either extra or less payload information may be transmitted. The extra payload information is transmitted in designated server overhead known as negative stuff or negative justification. If less payload information is transmitted, designated bits or bytes in the payload area of the server frame are filled with dummy information rather than client information; this is known as positive stuff or positive justification. Server layer overhead known as stuff control or justification control indicates in each server layer frame whether a stuff has been done and, if so, whether it is positive or negative; this information is needed by the demapper. The delivery of client bits to the demapper is not uniform. Most of the time the bits are delivered at the actual server rate. However, there are regular gaps due to server layer overhead. In addition, there are either gaps due to negative stuff or extra bits due to positive stuff. The timing
information embedded in this irregular bit stream is referred to as a gapped clock. This gapped clock is filtered by a desynchronizer (usually a PLL) to produce a clock with uniform rate (a regular clock). However, the resulting regular clock does contain some jitter and phase error due to the stuffs and overhead. Typically, requirements are placed on the desynchronizer to ensure that its jitter and wander (i.e., low-frequency phase error) generation (output jitter/wander in the absence of any jitter/wander on the client input at the mapper, and no additional sources of jitter or phase error in the server layer) are acceptable. In most cases, the regular gaps due to fixed overhead are easy to filter because they are of sufficiently high frequency; alternatively, if it is possible to buffer some data, a regular clock that runs at the rate with the fixed overhead removed can be derived from a clock that runs at the rate with the fixed overhead present using a PLL. However, the phase error waveform for the gapped clock that contains stuffs will tend to have a low-frequency envelope that is more difficult to filter. The jitter due to this envelope is referred to as waiting time jitter, and may be more difficult to filter if its frequency is sufficiently low relative to the desynchronizer bandwidth. In almost all cases of interest, the stuff unit is either 1 bit, 1 byte (8 bits), or an integral number of bytes. The number of stuff units per server layer frame is related to the nominal server and client rates and range of client and server frequency offset that must be accommodated. For example, if the nominal server payload area average rate is higher than the nominal client rate (this is the case for multiplexing PDH signals into higher-rate PDH signals and for mapping PDH signals into SDH Virtual Containers (VCs)), there will be a constant nominal positive stuff rate if the client and server frequencies are exactly nominal. 
The maximum server and minimum client rates (based on their frequency tolerances) determine the required maximum stuff opportunity rate. The minimum server and maximum client rates determine the required minimum stuff opportunity rate (which may be negative, indicating negative stuffing). The relation among the rate of stuffing, nominal client and server rates, and client and server frequency offsets is quantified for OTN in Appendix I of ITU-T Recommendation G.709 [14]. This is done for both mapping of CBR (e.g., SDH) clients into ODUk and multiplexing of ODUk into ODUm (m > k). The OTN CBR client mappings allow positive and negative byte justification (this is referred to as positive/negative/zero or +1/0/−1 byte justification). In addition, the multiplexing of ODUk into ODUm (m > k) allows positive justification of 2 bytes or 1 byte, and negative justification of 1 byte (this is referred to as +2/+1/0/−1 justification). The mappings used for PDH multiplexing allow positive bit justification (the rates and tolerances are such that negative bit justification would never be needed).
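The dependence of the stuff ratio on the nominal rates and clock tolerances can be sketched with simple arithmetic. The rates, tolerances, and frame rate below are illustrative, PDH-like numbers chosen for the sketch, not values from G.709 or any other standard.

```python
def nominal_stuff_ratio(client_rate, payload_rate, frame_rate, stuff_bits=1):
    """Fraction of stuff opportunities used when both clocks are nominal.

    payload_rate: average payload-area bit rate with no stuffing (bit/s)
    frame_rate:   server frames (stuff opportunities) per second
    A positive result means positive stuffing (payload faster than client).
    """
    surplus = payload_rate - client_rate          # bit/s to absorb as stuff
    return surplus / (frame_rate * stuff_bits)

def stuff_ratio_range(client_rate, client_ppm, payload_rate, payload_ppm,
                      frame_rate, stuff_bits=1):
    """Worst-case stuff ratios over the clock tolerance ranges."""
    lo = nominal_stuff_ratio(client_rate * (1 + client_ppm * 1e-6),
                             payload_rate * (1 - payload_ppm * 1e-6),
                             frame_rate, stuff_bits)
    hi = nominal_stuff_ratio(client_rate * (1 - client_ppm * 1e-6),
                             payload_rate * (1 + payload_ppm * 1e-6),
                             frame_rate, stuff_bits)
    return lo, hi

# Illustrative numbers: 1.544 Mbit/s client, payload area averaging
# 1.5448 Mbit/s, 8000 frames/s, 1 stuff bit per frame, ±50 ppm clocks.
lo, hi = stuff_ratio_range(1.544e6, 50, 1.5448e6, 50, 8000)
```

With these numbers the nominal stuff ratio is 0.1; the ±50 ppm tolerances move it between roughly 0.08 and 0.12, and the design must keep the ratio strictly inside (0, 1) over the whole tolerance range for purely positive stuffing to work.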
Synchronization of Optical Networks
The most straightforward algorithm for determining when to do a stuff is to monitor the mapper buffer and do a positive stuff if the buffer empties below a lower threshold and a negative stuff if the buffer fills above an upper threshold. This scheme is used for PDH multiplexing, with the simplification that only positive (bit) stuffing is needed because the minimum server payload area rate is greater than the maximum client rate; therefore, only one threshold must be considered. When the stuff ratio (the fraction of stuff opportunities for which a stuff is done) is close to a ratio of small integers, the phase waveform tends to have the low-frequency envelope referred to above, and the jitter tends to be larger. The straightforward algorithm is used for the OTN CBR client mappings defined in [14]. The CBR client demapper (desynchronizer) requirements of [8] provide for acceptable jitter and wander performance for the CBR clients. The OTN CBR client desynchronizer bandwidth is sufficiently narrow compared to the jitter measurement filters for the STM-N clients. It is possible to consider other, more advanced justification algorithms where, for example, client phase information is explicitly carried in the server overhead or where threshold modulation is employed. Such schemes can result in reduced jitter or wander. However, they are not discussed here because the conventional scheme is found to produce acceptable jitter and wander accumulation for OTN. In the case of SDH, the asynchronous mappings of some PDH clients, e.g., DS3 and E4, use purely positive bit stuffing with the conventional algorithm. However, some mappings, e.g., DS1, E1, and E3, use +1/0/−1 bit stuffing; i.e., there are two stuff bits per frame and, if the VC-11, VC-12, or VC-3 and DS1, E1, or E3 frequencies (respectively) were exactly nominal, one stuff bit would contain data and the other would contain stuff.
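The threshold algorithm with purely positive stuffing can be sketched as a toy simulation. The rates are the same illustrative PDH-like numbers as above, and the single threshold and per-frame draining model are simplifications for the sketch, not the actual PDH frame structure.

```python
def simulate_positive_stuffing(client_rate, payload_rate, frame_rate,
                               n_frames, threshold=4):
    """Toy mapper using only positive stuffing, as in PDH multiplexing.

    Client bits accumulate continuously; each frame the payload area
    drains its per-frame capacity, and one payload bit is replaced by
    stuff whenever the buffer fill would drop below `threshold` bits.
    Returns the observed stuff ratio.
    """
    per_frame_capacity = payload_rate / frame_rate
    per_frame_arrival = client_rate / frame_rate
    fill = float(threshold)   # start at the threshold
    stuffs = 0
    for _ in range(n_frames):
        fill += per_frame_arrival
        drained = per_frame_capacity
        if fill - drained < threshold:   # would underflow: stuff one bit
            drained -= 1
            stuffs += 1
        fill -= drained
    return stuffs / n_frames

ratio = simulate_positive_stuffing(1.544e6, 1.5448e6, 8000, 100000)
```

The observed ratio settles at the nominal value of 0.1, i.e., one stuff every ten frames — a ratio of small integers, which is exactly the regime where the low-frequency waiting-time-jitter envelope is most pronounced.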
6.2.4 Pointer Adjustments
In addition to mapping a client signal into a server layer, it is also often desired to cross-connect client signals at an intermediate switching node. Specifically, the client signals multiplexed within incoming server layer signals may not all have the same destination; it may be necessary to switch them to different outgoing server layer signals. In addition, it may be desired to drop one of the multiplexed client signals at a node and replace it with another client signal of the same type that originates at this node (i.e., perform add/drop). In principle, the cross-connect and add/drop functions could be accomplished by completely demultiplexing and demapping the client signals from each incoming server layer signal at a node, switching the client signals to the appropriate output ports, and remultiplexing them into new server layer signals. In PDH networks and OTN, this is the procedure
that is used. However, in SONET and SDH networks, the cross-connect and add/drop functions are performed with the aid of the pointer mechanism. The SONET and SDH frame formats and multiplexing hierarchies are specified in [15] and [16]. Schematics of the SONET and SDH multiplexing structures are given in Chapter 4 of the present book. SONET and SDH clients are mapped into containers, shown at the bottom of Figures 4-1 and 4-2 of Chapter 4. For example, DS1, E1, E3, DS3, and E4 clients can be mapped into C-11, C-12, C-3, C-3, and C-4 containers, respectively. Overhead (the nature of which is unimportant for the discussion here) is added to create a VT1.5 Synchronous Payload Envelope (SPE) (SONET) or VC-11 (SDH), VT-2 SPE (SONET) or VC-12 (SDH), STS-1 SPE (SONET) or VC-3 (SDH), or STS-3c SPE (SONET) or VC-4 (SDH), respectively. The SONET VT and STS SPEs and SDH VCs float in their respective frames. For example, the SONET STS-1 SPE may start anywhere in columns 4 through 90 of the OC-1 frame, and the starting point is indicated by a pointer (specifically, by the H1 and H2 bytes in row 4, columns 1 and 2, respectively). The timing for the outgoing SONET or SDH signals (OC-N or STM-N) from a Network Element (NE) where cross-connect or add/drop functions are performed (i.e., where the SONET Line or SDH Multiplex Section (MS) is terminated) is, in general, independent of the timing of the incoming signals. The bytes of each incoming STS-N SPE/VC-N are buffered and then placed in the respective outgoing OC-N/STM-N, with the starting position indicated by the pointer. Since the outgoing timing is independent of the incoming client timing, the buffer fill for each STS-N SPE/VC-N will change over time. In addition, since the long-term average frequencies of the incoming and outgoing signals may be different, the buffer will eventually overflow or underflow and result in data loss if nothing is done.
To prevent this, the buffer fill is constantly compared with upper and lower thresholds. If the upper threshold is exceeded, a negative pointer adjustment is performed. Specifically, in the next frame, extra data bytes (e.g., 1 byte for STS-1 and 3 bytes for VC-4) are written to the H3 overhead byte(s). In addition, the starting point of the STS-N SPE/VC-N shifts towards the beginning of the OC-N/STM-N by the number of extra data bytes written. This process is referred to as a negative pointer adjustment. Conversely, if the fill falls below the lower threshold, a positive pointer adjustment is performed. Specifically, in the next frame, data bytes adjacent to the H3 byte(s) (e.g., 1 byte for STS-1 and 3 bytes for VC-4) are filled with stuff. In addition, the starting point of the STS-N SPE/VC-N shifts towards the end of the OC-N/STM-N by the number of stuff bytes written. This process is referred to as a positive pointer adjustment.
From the standpoint of timing, a pointer adjustment is equivalent to a justification of the same magnitude and sign. A positive or negative pointer adjustment of M bytes results in M fewer or extra client layer bytes being transmitted; this is equivalent to +M/0/-M byte justification (note that here, i.e., in SONET or SDH, the client layer is the VT/STS SPE or VC). The result of a pointer adjustment is a phase step of +M or -M bytes. One main difference between the pointer and justification mechanisms is that the pointer indicates where in the OC-N/STM-N the client (VT/STS SPE or VC) begins. In a justification mechanism, the client framing must be determined separately. However, this has no impact on timing, jitter, and wander. If a client signal traverses a number of SONET/SDH NEs, pointer processor buffer fill variations and pointer adjustments may occur at each NE. The phase difference over a time interval between the client signal at the network egress (i.e., where the client is demapped) and network ingress (i.e., where the client is mapped) is equal to the net change in total fill of all the buffers over that time interval. During normal operation, SONET and SDH networks are synchronized, i.e., the timing for the NEs is specified to be within a certain frequency tolerance of nominal, and phase variation (jitter and wander) is constrained (timing signal imperfections and the characterization of timing performance are described in Sections 6.2.5 and 6.2.6). Under these conditions, the buffer fill variations tend to be relatively slow, and the phase variation is wander (i.e., its frequency content is below 10 Hz; see Section 6.2.5). In addition, a pointer adjustment at the egress node, where the client signal is demapped, results in a phase step equal to the size of the pointer adjustment. This phase step results in both jitter and wander. 
The jitter can be controlled by filtering the pointer adjustment with an appropriate desynchronizer (in the same manner that jitter due to justifications is controlled, as described in the previous subsection). The long-term wander for an isolated pointer adjustment (isolated means that the time between successive pointer adjustments is long compared with the desynchronizer time constant) is equal to the magnitude of the pointer adjustment; this cannot be reduced as long as the pointer adjustment is isolated. The short-term wander (e.g., the phase variation over various time intervals) can be controlled with an appropriate desynchronizer. Note that pointer adjustments at intermediate nodes impact jitter and wander only to the extent that they result in buffer fill variations (this impacts wander) and pointer adjustments at the final node where the signal is demapped (this impacts jitter and wander).
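The buffer-threshold behavior described above can be sketched as a toy pointer processor. The thresholds, the per-frame fill model, and the fill deltas are illustrative assumptions; only the wraparound over the 783 valid STS-1 pointer offsets (0-782) reflects the actual frame format.

```python
def pointer_processor(fill_deltas, lower=-2.0, upper=2.0, pointer=0, size=783):
    """Toy SONET-style pointer processor.

    fill_deltas: per-frame change in elastic-store fill (bytes).
    Crossing the upper threshold triggers a negative pointer adjustment
    (one extra byte sent in H3, pointer decremented); crossing the lower
    threshold triggers a positive adjustment (one stuff byte sent,
    pointer incremented).  Thresholds are illustrative, not standardized.
    """
    fill, events = 0.0, []
    for d in fill_deltas:
        fill += d
        if fill > upper:                     # too full: negative adjustment
            fill -= 1
            pointer = (pointer - 1) % size
            events.append(-1)
        elif fill < lower:                   # too empty: positive adjustment
            fill += 1
            pointer = (pointer + 1) % size
            events.append(+1)
        else:
            events.append(0)
    return pointer, events

# Incoming client slightly fast relative to outgoing frame: fill creeps up,
# producing a steady trickle of negative pointer adjustments.
ptr, ev = pointer_processor([0.3] * 20)
```

A constant fill slope yields periodic adjustments of one sign only, matching the observation that a sustained frequency offset between the incoming client and the outgoing frame produces a regular stream of same-sign pointer adjustments.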
6.2.5 Timing Signal Imperfections
All timing signals are imperfect, i.e., the phase or frequency as a function of time differs from the desired phase or frequency. First, the frequency of a timing signal may differ from the desired, or nominal, frequency by a fixed amount. This is referred to as frequency offset. The fractional frequency offset, y, is defined as

y = (ν − ν₀)/ν₀    (6.10)

where ν = actual frequency and ν₀ = desired, or nominal, frequency. In Eq. (6.10), y is a pure fraction; it is also often expressed in parts per million (ppm). The frequency tolerance of a clock or timing signal is the specified maximum allowable absolute value of y. Second, the frequency of a timing signal may change with time. The rate of change is the frequency drift rate, which is often expressed in ppm/s. Often, the maximum frequency drift rate is specified for a transient event. Third, a timing signal may contain random phase noise, i.e., the phase error as a function of time is a random process. This can be characterized by power spectral density (PSD, described below in Section 6.2.5.1); additional, more convenient measures are given in Section 6.2.6. Fourth, it was indicated in previous sections that jitter, or the result of passing the phase error process through a high-pass filter, is useful in specifying the performance of clock recovery circuits and 3R regenerators. The specified high-pass filter is typically first order; its corner frequency depends on the signal in question (i.e., it is part of the specification), but is always greater than 10 Hz. Finally, wander is phase variation whose frequency components are less than 10 Hz. It is of interest when considering slip performance and synchronization distribution. The total instantaneous phase error of a timing signal at time t may be written [17]

Φ(t) = Φ₀ + ν₀yt + (ν₀D/2)t² + φ(t)    (6.11)
where Φ(t) = instantaneous phase error at time t (UI), D = frequency drift rate (s⁻¹), Φ₀ = initial phase error (UI), and φ(t) = instantaneous phase noise random process (UI). The instantaneous phase error may be expressed in radians by multiplying Eq. (6.11) by 2π, and in units of time (e.g., s) by dividing Eq. (6.11) by ν₀. When the phase is expressed in units of time, often the symbol x is used rather than Φ, i.e.,

x(t) = x₀ + yt + (D/2)t² + φ(t)/ν₀    (6.12)
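The deterministic part of Eq. (6.12) can be sampled numerically. This is an illustrative sketch: the offset, drift, and noise values are arbitrary, and the phase noise term is modeled simply as white Gaussian noise already expressed in seconds.

```python
import random

def phase_error_samples(n, dt, x0=0.0, y=1e-6, d=1e-9, noise_rms=1e-11):
    """Sample x(t) = x0 + y*t + (D/2)*t**2 + phi(t), per Eq. (6.12).

    y is the fractional frequency offset, d the frequency drift rate
    (1/s), and phi(t) is modeled here as white phase noise in seconds.
    All parameter values are illustrative.
    """
    rng = random.Random(0)   # fixed seed for reproducibility
    out = []
    for k in range(n):
        t = k * dt
        out.append(x0 + y * t + 0.5 * d * t * t + rng.gauss(0, noise_rms))
    return out

# Pure 1 ppm frequency offset: phase error is a linear ramp of 1 us/s.
x = phase_error_samples(1000, dt=1e-3, y=1e-6, d=0.0, noise_rms=0.0)
```

With only a frequency offset present, the phase error grows linearly at y seconds per second; adding drift bends the ramp into a parabola.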
Note that x(t) is sometimes used to represent just the random component of the phase error expressed in units of time (i.e., the final term in Eq. (6.12)) [17], whereas in Eq. (6.12) it represents the total phase error. Both conventions are used in this chapter, with the definition clearly indicated for each use.

6.2.5.1 Phase Noise
The phase noise random process, φ(t) in Eq. (6.12), may be characterized by its Power Spectral Density. In the remainder of this subsection, the phase noise will be expressed in units of time and represented by x(t). The two-sided PSD, S_x(ω), is defined by

S_x(ω) = (1/2π) ∫₋∞^∞ R_x(τ) e^(−jωτ) dτ    (6.13)

where R_x(τ) is the autocorrelation function. R_x(τ) is defined by

R_x(τ) = E[x(t)x(t + τ)]    (6.14)

where E[·] denotes expectation (ensemble average). In characterizing clock noise, usually the one-sided PSD is used; this is related to the autocorrelation function by
S_x(f) = 4 ∫₀^∞ R_x(τ) cos(2πfτ) dτ    (6.15)

R_x(τ) = ∫₀^∞ S_x(f) cos(2πfτ) df
Eqs. (6.13)-(6.15) assume that the random phase process x(t) is wide-sense stationary, i.e., that the autocorrelation function depends only on the difference between the times of the two samples (or that Eq. (6.14) is independent of t). However, it will be seen shortly that many of the stochastic models used to characterize phase noise have power law PSDs, and some of these models are nonstationary. This problem is addressed by realizing that, in real systems, high- and low-frequency cutoffs exist. For example, a theoretical high-frequency limit is provided by the fact that it makes no sense to sample faster than the bit rate of the signal; a tighter limit may be imposed by the measurement system. A low-frequency cutoff is provided by the fact that measurements are made over finite time intervals. These issues are described in detail in Appendix I of [18]. As is stated in [18], results of measurements and calculations are most meaningful if they are independent of the low-frequency cutoff as this frequency approaches zero. In any case, the following points hold when using nonstationary models:

1. When making measurements, one should use statistics that converge. This was one consideration that led to the TVAR and TDEV parameters described in the next subsection, as well as to the related parameters Allan Variance and Modified Allan Variance (see [19], [20], [18], and various references given in those documents for more details on these parameters).

2. When performing analytical calculations, one must ensure that mathematical operations (e.g., summations, integrals) converge.

Measurement of phase noise in clocks and oscillators has shown that the form of the power spectral density is generally a linear combination of power law terms [19]:
S_x(f) = Σ_{β=0}^{4} h_β / f^β = h₀ + h₁/f + h₂/f² + h₃/f³ + h₄/f⁴    (6.16)
Each term is considered to represent a different noise type. The first term in Eq. (6.16) represents familiar white noise (white phase modulation, or
WPM) that has a flat power spectral density. The third term in Eq. (6.16) represents white frequency modulation (WFM). Since phase is the integral of frequency, this white frequency noise results in a random walk in phase. The second and fourth terms in Eq. (6.16) represent flicker phase modulation (FPM) and flicker frequency modulation (FFM), respectively. The fifth term represents random walk frequency modulation (RWFM). Mathematical models have been developed in which the flicker noise terms may be considered as half (i.e., fractional order) integrals of white noise terms [21], [22], [23]. The fundamental physical processes that give rise to flicker noise are not nearly as well understood as those that give rise to white noise (e.g., the latter can arise from thermal motion of atoms and molecules). However, this particular lack of knowledge has not prevented the characterization of noise in clocks and oscillators. Eq. (6.16) can be generalized, if desired, to include power law dependences involving fractional exponents. In this case, β would take on any value (real number) between 0 and 4. In addition, there could, in principle, be an arbitrary number of terms. It can be shown that Gaussian noise with power spectral density of the form h_β/f^β is wide-sense stationary (i.e., the autocorrelation function, as given by Eq. (6.14), depends only on τ and not on t) if 0 < β < 1 and nonstationary if β > 1 (see [24] for details).
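The power-law model of Eq. (6.16) is straightforward to evaluate numerically. The coefficient values below are illustrative placeholders, not characterizations of any real oscillator.

```python
def power_law_psd(f, h):
    """Evaluate S_x(f) = sum over beta = 0..4 of h[beta] / f**beta,
    per Eq. (6.16).  h[0]..h[4] weight white PM, flicker PM, white FM
    (random-walk phase), flicker FM, and random-walk FM, respectively."""
    return sum(h_b / f ** beta for beta, h_b in enumerate(h))

# Illustrative coefficients: a pure white-FM clock (only h2 nonzero),
# whose phase PSD falls at 1/f^2, i.e., -20 dB/decade.
h_wfm = [0.0, 0.0, 1e-24, 0.0, 0.0]
s1 = power_law_psd(1.0, h_wfm)
s10 = power_law_psd(10.0, h_wfm)
```

For the white-FM term alone, going up one decade in frequency drops the PSD by a factor of 100, the 1/f² signature of a random walk in phase.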
6.2.6 Characterization of Timing Performance
This section describes several useful measures of jitter and wander performance. These measures include peak-to-peak jitter, root-mean-square (RMS) jitter, maximum time interval error (MTIE), time variance (TVAR), and square-root of time variance, or time deviation (TDEV). These measures, along with other terminology for network timing, jitter, and synchronization, are defined in [17] and described in more detail there.

6.2.6.1 Peak-to-Peak and RMS Jitter
Let e_J[nT] be the jitter process, i.e., the result of passing the phase error process through an appropriate jitter measurement filter (T is the bit period). The RMS jitter, J_RMS, is defined as

J_{\mathrm{RMS}} = \sqrt{E\left[e_J^2(nT)\right]} \qquad (6.17)

where E[·] denotes expected value. Eq. (6.17) assumes the jitter process is stationary; this assumption is valid because of the high-pass filtering. In practice, the jitter process is ergodic and RMS jitter may be estimated by replacing the ensemble average with a time average.

Peak-to-peak jitter over a specified time interval is the difference between the maximum jitter value over that time interval and the minimum jitter value over that time interval (in defining the minimum value here, the sign of the jitter must be accounted for; the minimum value is algebraically the smallest value). In a rigorous definition, the maximum and minimum values should be defined as specified quantiles of the jitter distribution. However, in practice, the maximum and minimum values of a single sample of the jitter process are used. The time interval must be long compared with the time constant of the jitter measurement filter. For SDH/SONET and OTN line jitter, a 60-second interval is used.
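As a minimal sketch (function names are illustrative, not from the standards), the time-average estimates of RMS and peak-to-peak jitter for a record of jitter samples are:

```python
import math

def rms_jitter(e):
    """Estimate RMS jitter (Eq. 6.17) by a time average over the samples."""
    return math.sqrt(sum(x * x for x in e) / len(e))

def pp_jitter(e):
    """Peak-to-peak jitter: algebraic maximum minus algebraic minimum
    (the sign of the jitter matters for the minimum)."""
    return max(e) - min(e)

# Example: a small jitter record in unit intervals (UI)
e = [0.02, -0.05, 0.04, -0.01, 0.03]
print(rms_jitter(e))  # time-average estimate of J_RMS
print(pp_jitter(e))   # 0.04 - (-0.05) = 0.09 UI
```

In practice the record would first be passed through the appropriate high-pass jitter measurement filter, and would span an interval long compared with the filter time constant.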
6.2.6.2 Maximum Time Interval Error (MTIE)
MTIE is the peak-to-peak phase error of a timing signal for a specified observation interval, expressed as a function of observation interval. For a given timing signal of duration T_max and observation interval S < T_max, the peak-to-peak phase error is computed for all subintervals of the total duration that are of length S. MTIE for observation interval S is the largest of these peak-to-peak values. MTIE is defined in this manner for 0 < S < T_max. As with peak-to-peak jitter, MTIE should be defined as a specified quantile of a random variable defined by the above operation. However, in practice only a single phase history sample is generally used, and MTIE is estimated as the value of that sample (for each observation interval). MTIE may be defined rigorously as follows [17]. Define the random variable X (which is a function of the observation interval S, and may therefore also be thought of as a random process indexed by the parameter S):

X(S) = \max_{0 \le t \le T_{\max}-S} \left\{ \max_{t \le t' \le t+S} x(t') - \min_{t \le t' \le t+S} x(t') \right\} \qquad (6.18)

Then MTIE(S) is defined as a specified percentile, p, of X(S). For a sampled data set obtained from a single measurement, MTIE can be estimated as

\mathrm{MTIE}(nT_0) \approx \max_{1 \le k \le N-n} \left( \max_{k \le i \le k+n} x(i) - \min_{k \le i \le k+n} x(i) \right), \quad n = 1, 2, \ldots, N-1 \qquad (6.19)
where T_0 is the sampling interval and N is the number of samples in the measurement interval. Eq. (6.19) gives a point estimate with no information on what quantile it actually represents. To obtain an estimate of a specified quantile of X(S), multiple measurements must be made, and the following methodology may be used [17]. Let X_1, X_2, ..., X_M be a set of independent measurement samples of MTIE, for an interval of length S, for M measurement periods each of length T_max. Assume that the samples have been put in ascending order, i.e., X_1 ≤ X_2 ≤ ... ≤ X_M. Let x_β be the β quantile of the random variable X. Then a confidence interval for x_β, expressed as the probability that x_β falls between the samples X_r and X_s (with r < s), is given by

P\{X_r < x_\beta < X_s\} = \sum_{k=r}^{s-1} \binom{M}{k} \beta^k (1-\beta)^{M-k} \qquad (6.20)

where P{·} denotes probability. Eq. (6.20) holds under very general conditions; the only assumption needed is that the successive samples (e.g., successive measurements) are independent and identically distributed. This assumption is valid if the external environment has not changed appreciably over the timescale of the measurement.
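Both the point estimator of Eq. (6.19) and the confidence interval of Eq. (6.20) are straightforward to evaluate numerically. A sketch (function names are illustrative, not from [17]):

```python
from math import comb

def mtie(x, n):
    """Point estimate of MTIE(n*T0) per Eq. (6.19): the largest
    peak-to-peak excursion over all windows of n+1 phase samples."""
    best = 0.0
    for k in range(len(x) - n):          # window start index
        w = x[k:k + n + 1]               # n+1 samples span an interval n*T0
        best = max(best, max(w) - min(w))
    return best

def quantile_confidence(M, r, s, beta):
    """P{X_r < x_beta < X_s} per Eq. (6.20), for order statistics
    X_1 <= ... <= X_M of M independent MTIE measurements."""
    return sum(comb(M, k) * beta ** k * (1 - beta) ** (M - k)
               for k in range(r, s))

# A linear phase ramp x(i) = i (a constant frequency offset): every
# (n+1)-sample window spans exactly n units of phase, so MTIE(n*T0) = n.
print(mtie(list(range(100)), 10))        # -> 10

# With M = 20 measurements, the confidence that the median (beta = 0.5)
# lies between the 6th and 15th ordered samples:
print(quantile_confidence(20, 6, 15, 0.5))
```

The direct MTIE estimator above costs O(N·n) per observation interval; faster algorithms exist but this form maps one-to-one onto Eq. (6.19).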
6.2.6.3 Time Variance (TVAR) and Time Deviation (TDEV)
TVAR is a second order statistic of the phase noise random process. As indicated in Section 6.2.5.1, the phase noise processes in clocks and oscillators have power spectral densities that vary as 1/f^α, where α is an exponent between 0 and 4. For α > 1, such a random process is nonstationary and has first and second moments that depend on time. In general, the traditional estimator for the classical variance will not converge for such a process. However, the estimators for the TVAR and TDEV statistics do converge. In addition, the slope of TDEV (or TVAR) on a log-log scale is easily related to α; this means that one may determine α (i.e., the noise type) from a TDEV plot. TVAR is defined as the expected value of the square of the second difference of the average phase error, where the average is over an interval τ referred to as the integration time [17]:

\mathrm{TVAR}(\tau) = \frac{1}{6} E\left[\left(\Delta^2 \bar{x}\right)^2\right] \qquad (6.21)
Here, x(t) is the phase error process and the bar denotes time average. TDEV is the square root of TVAR. TVAR and TDEV are most useful when x(t) represents only the random component of phase error (in particular, they are useful when the random process has a power law power spectral density as described above). Note that TDEV and TVAR for a constant phase offset and for a constant frequency offset are both zero due to the second difference operation (this may be clearly seen from Eq. (6.22) below). TVAR is estimated using (where τ = nτ_0, Nτ_0 is the total length of the data sample, and n = 1, 2, ..., integer part (N/3)) [17]

\mathrm{TVAR}(n\tau_0) \approx \frac{1}{6n^2(N-3n+1)} \sum_{j=1}^{N-3n+1} \left[ \sum_{i=j}^{n+j-1} \left( x_{i+2n} - 2x_{i+n} + x_i \right) \right]^2 \qquad (6.22)
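A direct transcription of the estimator in Eq. (6.22) (0-based indexing, illustrative names); note that a constant phase or frequency offset yields zero, as observed above:

```python
def tvar(x, n):
    """TVAR estimate at integration time n*tau0 (Eq. 6.22)."""
    N = len(x)
    assert 1 <= n <= N // 3
    total = 0.0
    for j in range(N - 3 * n + 1):             # outer sum, j = 1 .. N-3n+1 (0-based here)
        inner = sum(x[i + 2 * n] - 2 * x[i + n] + x[i]
                    for i in range(j, j + n))  # second difference of n-sample averages
        total += inner * inner
    return total / (6.0 * n * n * (N - 3 * n + 1))

def tdev(x, n):
    """TDEV is the square root of TVAR."""
    return tvar(x, n) ** 0.5

# A constant frequency offset (linear phase ramp) gives TDEV = 0,
# because the second difference of a linear sequence vanishes:
ramp = [0.1 * i for i in range(30)]
print(tdev(ramp, 3))  # -> 0.0 (up to floating-point rounding)
```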
6.2.7 Wander Network Limits and Wander Performance
As indicated in Section 6.2.5, wander is defined as phase variation whose frequency components are less than 10 Hz. The control of wander is important because there are many cases in networks where a signal passes from one timing domain to another via a slip buffer. In this situation, the timing of the signal entering the slip buffer is independent of the timing of the signal leaving the slip buffer. If the phase difference between the two signals becomes large enough, the buffer will overflow or underflow. This outcome is referred to as a slip. If the occurrence of a slip is dealt with by throwing away the current contents of the buffer, recentering, and reacquiring framing for the signal, the slip is referred to as uncontrolled. However, it is possible to delete an entire frame on buffer overflow or repeat an entire frame on buffer underflow (provided that at least one frame of data is always saved). This process is referred to as a controlled slip, and is more desirable because it avoids the need to reacquire framing (and results in less overall data loss).

A prominent example of the use of the controlled slip mechanism is the buffering of data at 64 kbit/s switches in the PSTN. While the switches are ordinarily synchronized, the synchronization is not perfect. Each incoming DS1 or E1 has a slip buffer that can accommodate at least a single frame plus an additional 18 μs worth of data (hysteresis); i.e., the buffer may be larger than this, but it cannot be smaller. This means that if the phase difference between the data entering the buffer and leaving the buffer varies sufficiently, the buffer will overflow or underflow. If this happens, an entire DS1/E1 frame is deleted or repeated. The magnitude of phase difference needed to cause a single slip depends on the initial conditions and the detailed phase variation.

Since slip buffers can accommodate only a maximum phase variation without a slip occurring, a long-term frequency offset between the incoming
and outgoing rates will eventually result in a slip. After a slip has occurred in this situation, the buffer fill will be one frame's worth of data from one edge. Continued operation will cause the buffer fill to move towards that edge. Therefore, the time between slips is equal to the time needed for the accumulated phase difference between the incoming and outgoing signals to equal one frame, or 125 μs for DS1 or E1. ITU-T Recommendation G.811 specifies a long-term frequency accuracy for a Primary Reference Clock (PRC) of 10⁻¹¹ [25]. This means that, when PSTN switches are synchronized, their long-term frequencies differ from nominal by no more than 1 part in 10¹¹. In the worst case, where the incoming and outgoing slip buffer signals are off by 10⁻¹¹ in opposite directions, a slip will occur every (125 × 10⁻⁶ s)/(2 × 10⁻¹¹ s/s) = 6.25 × 10⁶ s ≈ 72 days. If the clock timing a switch enters holdover, the slip rate will be larger, and depends on the quality of the holdover. Controlled slip rate objectives for an international digital connection, and allocations of these objectives to international, national transit, and local portions of the connection, are given in ITU-T Recommendation G.822 [26].

The discussion above indicates that slip rate is controlled by limiting the long-term frequency differences between network clocks. However, phase noise also results in phase difference that can cause slips. Therefore, the long-term wander for traffic signals is also limited. The limit is specified in ITU-T Recommendations G.823 and G.824 (for the 2048 and 1544 kbit/s PDH hierarchies, respectively) as 18 μs over 24 hours. As indicated above, this limit is the minimum required slip buffer hysteresis. This is a long-term wander limit; it is MTIE for an observation interval of 24 hours (see Section 6.2.6.2).
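The worst-case slip interval above is simply the frame period divided by the total fractional frequency offset. A quick check of the arithmetic:

```python
frame_period = 125e-6    # s, one DS1/E1 frame
freq_offset = 2 * 1e-11  # worst case: both clocks off by 1e-11, in opposite directions

seconds_between_slips = frame_period / freq_offset
days = seconds_between_slips / 86400
print(seconds_between_slips)  # 6.25e6 s
print(round(days, 1))         # about 72.3 days
```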
In addition to controlling long-term MTIE (i.e., over 24 hours), short-term wander is specified in ITU-T Recommendations G.823 and G.824 for traffic and synchronization signals (i.e., interfaces) via MTIE and TDEV masks (the former for both traffic and synchronization interfaces, and the latter only for synchronization interfaces). The synchronization wander limits are referenced in ITU-T Recommendation G.825 for SDH signals; the reason the synchronization wander limits are referenced is that all SDH signals are required to be suitable for carrying synchronization. Wander limits are not specified for OTN (i.e., OTUk or ODUk) signals, as these signals are not required to carry synchronization. However, it is required that synchronization be carried by SDH clients of OTN, and the OTN specifications ensure that SDH clients will meet the G.825 jitter and wander network limits (and, by reference, G.823 and G.824 synchronization interface wander limits).
6.3. ROADMAP OF CURRENT ITU-T RECOMMENDATIONS ON TIMING AND JITTER FOR OTN, SDH, AND PDH
Table 6-1 shows the applicable ITU-T recommendations pertaining to timing, synchronization, and jitter for OTN, PDH, and SDH networks. The applicable recommendations pertaining to rates, formats, and interfaces and to other (nontiming/jitter) equipment specifications are also shown, as well as recommendations pertaining to network performance and network architecture. Table 6-1 is not a complete list of all ITU-T recommendations for the respective networks; e.g., it does not show recommendations pertaining to network management. The jitter and wander equipment specifications include the jitter and wander generation, transfer, and tolerance requirements for the various network elements. Physically, these specifications end up applying to various phase-locked loops. In the case of SDH, the requirement that the network is normally synchronized gave rise to more stringent specifications for the clock in network elements where the SDH multiplex section (MS) is terminated, compared with SDH regenerators where only the regenerator section (RS) is terminated. This more stringent clock is referred to as the SDH Equipment Clock (SEC) (for Option 2, this is equivalent to the SONET Minimum Clock (SMC)), and its requirements are in a separate ITU-T recommendation, G.813. Other SDH equipment requirements for timing and jitter, e.g., SDH regenerator and desynchronizer requirements, are in G.783, which also contains SDH equipment requirements not pertaining to timing or jitter. For OTN, the equivalent of the equipment clock is the clock that creates the timing for the ODUk. This clock is referred to as the ODUk Clock (ODC). Two such ODC's are specified: ODCa for asynchronous client mappings and ODUk multiplexing, and ODCb for bit-synchronous client mappings. 
Since the ODCa and ODCb requirements are not nearly as stringent as the SDH Equipment Clock (SEC) requirements, it was decided to include these in the same recommendation as the other equipment requirements pertaining to OTN timing and jitter. Two other ODC's are also specified: ODCr for 3R regenerators and ODCp for the CBR client desynchronizer. The OTN interface and equipment timing/jitter specifications are all in Recommendation G.8251 [8]. The OTN equipment specifications not pertaining to timing and jitter are in G.798 (G.798 references G.8251 for timing and jitter specifications).
Table 6-1. Summary of ITU-T Recommendations for PDH, SDH, and OTN covering timing, synchronization, and jitter (Recommendations for rates and formats, equipment specifications, architecture, and performance are shown for completeness)

                                          PDH (2048         PDH (1544
                                          kbit/s hierarchy) kbit/s hierarchy) SDH             OTN
Jitter/wander network limits              G.823             G.824             G.825           G.8251
Jitter/wander equipment requirements      G.921             (Note 1)          G.783           G.8251
Equipment clock                           Not Applicable    Not Applicable    G.813           G.8251
Node clocks                               G.812             G.812             G.812           Not Applicable
Primary Reference Clock                   G.811             G.811             G.811           Not Applicable
Synchronization Distribution              None              None              G.781           Not Applicable
Definitions and terminology for
timing/synchronization                    G.810             G.810             G.810           G.810
Rates, formats, and interfaces            G.703, G.704      G.703, G.704      G.707           G.709/Y.1331
Equipment specifications                  G.921             (Note 1)          G.783           G.798
Network architecture                      None              None              G.803           G.872
Network (service) performance             G.826             G.826             G.828, G.826    G.8201

Note 1: No applicable Recommendation.
The specifications for node clocks (a node clock is referred to as a Synchronization Supply Unit (SSU) in Option 1 networks and a Building Integrated Timing Supply (BITS) in Option 2 networks) are in Recommendation G.812. The specifications for the Primary Reference
Clock (PRC) are in Recommendation G.811. These recommendations specify the clocks, and therefore are not strictly limited to any network type. However, these clocks are not required for OTN, and are therefore indicated as not applicable in Table 6-1, because the OTN is not synchronized. Recommendation G.810 defines the applicable terminology for timing, synchronization, and jitter, and is applicable to all network types.
6.4. TIMING AND JITTER REQUIREMENTS FOR SONET/SDH AND OTN
This section uses the concepts introduced in Section 6.2 to describe the timing and jitter requirements for SONET/SDH networks and the Optical Transport Network (OTN). While most of the SONET and SDH standardization occurred during roughly the 1985-1995 timeframe and the technology is now mature, OTN standardization has been relatively complete for only several years, and OTN technology is not widely deployed.

When work on OTN standardization began, OTN was intended as the next generation of transport network after SONET and SDH. It is a new transport networking layer [27]. It uses Dense Wavelength Division Multiplexing (DWDM), which results in Tbit/s capacity per fiber, and allows for wavelength level switching. It provides for enhanced OAM and networking functionality (compared with SDH) for all services. It provides for service transparency for clients. The characteristics and features of the OTN are described in more detail in [27]. The OTN transport includes multiple optical channels, nominally one per wavelength. Each optical channel transports a digital frame structure, which has payload and overhead areas. The paths transported by the optical channels can have rates of 2.5, 10, or 40 Gbit/s; i.e., the granularity of the paths is relatively large. The clients (payloads) are mapped into this frame either asynchronously or bit-synchronously.

When SDH and SONET were developed, several considerations gave rise to timing/synchronization and jitter requirements. First, it was necessary to ensure that PDH clients transported over SDH or SONET networks would meet applicable timing and jitter requirements at network interfaces. This was needed to ensure interoperability with existing PDH networks. This also resulted in careful consideration being given to defining the client mappings. Second, the PDH networks and certain other services needed to be synchronized.
Prior to the introduction of SDH and SONET, synchronization was transported on the PDH facilities (DS1 in North America and E1 in Europe). With the introduction of SDH and SONET, it was not possible to transport synchronization via PDH signals that were
SONET or SDH clients, because the VT1.5 (VC-11) or VC-12 pointer adjustments would result in the wander requirements being exceeded. Instead, it was necessary to transport synchronization directly via the SDH or SONET layer network. Third, jitter requirements for the SDH/SONET layer network were needed to ensure acceptable bit error performance for the SDH/SONET layers.

The above gave rise to jitter, wander, and synchronization requirements for SDH and SONET. The requirements contain two key ingredients. First, the SDH and SONET networks are required to be synchronized. This means that the timing signal provided to each SONET or SDH network element is normally traceable to a Primary Reference Source (PRS) or Primary Reference Clock (PRC), respectively (see Section 6.5). Such a signal has a long-term frequency accuracy of 1 part in 10¹¹, relative to Coordinated Universal Time (UTC) frequency. The signal also normally meets several other requirements, e.g., on short-term stability. Second, the PDH network digital hierarchies, network synchronization architectures, and clock hierarchies in use in North America and Europe are different. This gave rise to a number of the jitter, wander, and synchronization requirements (both network interface requirements and equipment requirements) for SDH being different for networks deployed in North America and in Europe. The ITU-T recommendations covering SDH jitter and clock requirements refer to Option 1 and Option 2 networks. The former are SDH networks optimized for transporting PDH clients of the 2048 kbit/s hierarchy; the latter are SDH networks optimized for transporting PDH clients of the 1544 kbit/s hierarchy. The Option 2 SDH networks are, in effect, SONET networks. The specification of two options for SDH results in a certain lack of transparency.
There is no guarantee that a PDH signal of the 1544 kbit/s hierarchy will meet its respective jitter and wander requirements if it is transported over an SDH network that meets Option 1 jitter and synchronization requirements. Likewise, there is no guarantee that a PDH signal of the 2048 kbit/s hierarchy will meet its respective jitter and wander requirements if it is transported over an SDH network that meets Option 2 jitter and synchronization requirements. In addition, the Option 1 and Option 2 synchronization requirements are different. In developing the OTN, one goal was to avoid the above two ingredients. First, it was desired that the OTN be transparent for the respective services that are transported, e.g., SONET/SDH, Asynchronous Transfer Mode (ATM), Internet Protocol (IP), 1 and 10 Gbit/s Ethernet (GbE), etc. This means there should be a single option for OTN, with a single set of requirements. There should not be separate OTN options for Option 1 and 2 SDH clients. Second, it was desired to avoid the need to synchronize the OTN layer. This would avoid the need for the OTN layer to concern itself
with transport and OAM for network synchronization. One result is that there is no need to specify synchronization status messages (SSMs) and their behavior at the OTN layer. It was recognized, however, that provision must still be made for the transport of synchronization, and it was decided that synchronization should be transported via the SDH or SONET clients. Therefore, it is required that a SONET or SDH client transported over multiple OTNs meet the respective interface jitter and wander requirements in ITU-T Recommendation G.825 on egress from the final OTN. Such signals would then be suitable for carrying both traffic and synchronization. The above gives rise to timing and jitter requirements for the OTN that are less complicated than the corresponding requirements for SDH. It also allows the specification of a single, global set of timing and jitter standards for OTN, rather than two options as was the case for SDH.

The following subsections provide a brief overview of the SONET/SDH and OTN timing and jitter requirements. The subsections are roughly organized such that requirements that must be chosen consistently are discussed together (e.g., the requirements pertaining to server layer jitter, client jitter and wander, and clock requirements are discussed in separate subsections). The specific ITU-T Recommendations that contain the SDH and OTN timing and jitter requirements are listed in Table 6-1 above.
6.4.1 SEC and ODC Frequency Accuracy, Clock Modes, Pull-in and Pull-out/Hold-in Ranges
The free-run accuracy, pull-in range, and pull-out range for the Option 1 SEC are specified as ± 4.6 ppm. The free-run accuracy, pull-in range, and hold-in range for the Option 2 SEC are specified as ± 20 ppm. Note that the Option 1 SEC specifies pull-out range to constrain the pull-out process, while the Option 2 SEC uses hold-in range. Both terms (as well as pull-in range) are defined in ITU-T Recommendation G.810 [17], and also are discussed in [28]. The pull-out range is the largest offset between the SEC reference and a specified nominal frequency, within which the SEC stays locked regardless of the rate of change of reference frequency. Essentially, the SEC will remain locked if the reference undergoes a frequency step provided that the magnitude of the step does not exceed the pull-out range. The pull-out range is a dynamic limit. The hold-in range is the largest offset between the SEC reference and a specified nominal frequency, within which the SEC stays locked if the reference frequency changes infinitely slowly. The hold-in range is a quasi-static limit (the term quasi-static means that the limit is applied to the case where the change in frequency occurs infinitely slowly). The Option 1 and Option 2 SECs support free-run, locked, and holdover modes. The free-run accuracy for ODCa, ODCb, ODCr, and
ODCp (see Section 6.3 for a brief description of the ODUk clock (ODC)) is specified as ± 20 ppm. All four clocks support free-run mode. ODCb, ODCr, and ODCp support locked mode and have pull-in and pull-out ranges of ± 20 ppm. ODCa is always free-running and, therefore, does not support locked mode or have pull-in or pull-out ranges defined. None of the ODCs support holdover mode because the OTN layer is not synchronized.
6.4.2 STM-N and OTUk Jitter Network Limit and Tolerance, STM-N Regenerator and ODCr Jitter Generation and Transfer, and STM-N and OTUk Jitter Accumulation
As described in Section 6.2.2, jitter network limits must be specified such that the jitter tolerance of equipment (e.g., clock recovery and dejitterizer PLLs) will not be exceeded. Jitter generation and jitter transfer of equipment must be specified such that jitter accumulation over a reference chain of equipment (e.g., some number of 3R regenerators) will not exceed the network limits. Historically, the jitter specifications for SDH were developed well before those for OTN, and the specifications for STM-1, -4, and -16 were developed before those for STM-64. For STM-1, -4, and -16, both high-band and wide-band network limits and tolerance were specified, as 0.15 Unit Intervals peak-to-peak (UIpp) and 1.5 UIpp, respectively. The corner frequencies for the high-pass measurement filters, which are the same as the jitter tolerance corner frequencies, are different for the three rates. The specifications for SDH Options 1 and 2 are the same, and are contained in ITU-T Recommendation G.825 [9].

The jitter generation and transfer specifications for STM-1, -4, and -16 were developed around the same time as the jitter tolerance and network limit specifications. The jitter transfer specifications are the same for Options 1 and 2; however, the jitter generation specifications are not the same. For Option 1, both high-band and wide-band jitter generation limits are specified, of 0.1 UIpp and 0.3 UIpp, respectively. The measurement filters correspond to the measurement filters for the network limits. For Option 2, a jitter generation limit with a single high-pass measurement filter of 12 kHz is specified. The jitter generation limit is 0.1 UIpp and 0.01 UIrms, i.e., both limits must be met simultaneously, but with the single measurement filter defined. The measurement filter corner frequency is between the network limit high-band and wide-band corner frequencies for STM-1, -4, and -16 (see ITU-T Recommendation G.783 [10]).
Jitter accumulation studies over a chain of SONET/SDH regenerators are not documented in the SONET or SDH specifications for STM-1, -4, and -16 (studies are documented for STM-64; these are discussed shortly). However, jitter accumulation studies based on the Option 1 SDH regenerator specifications were performed during the development of the OTN specifications, and are documented in Appendix IV of ITU-T Recommendation G.8251 [8]. These studies use a frequency domain model for cascaded second-order, linear PLLs, each of which contains a phase detector, proportional plus integral loop filter, and voltage-controlled oscillator (VCO) (see [6], [7], and [28] for more details on PLL models). Models were developed for both systematic and random jitter accumulation, based on previous work in [2], [12], and [13]. However, only the random jitter accumulation results are used, because it is assumed that a narrow-band dejitterizer PLL is always present. Systematic jitter accumulation occurs due to pattern-dependent jitter in the clock recovery PLLs; the presence of a dejitterizer PLL between two successive clock recovery PLLs decorrelates the pattern-dependent jitter. In addition, the jitter accumulation models do not model the clock recovery and dejitterizer PLLs individually; rather, a regenerator is assumed to consist of a single PLL that meets the jitter generation and transfer requirements. The PLL parameters are chosen to achieve the desired bandwidth and gain peaking; using STM-16 as an example, these are 2 MHz and 0.1 dB, respectively.

[Figure 6-10 shows a block diagram of the PLL model described in the following paragraph: input plus noise, phase-detector summing junction, loop filter with injected noise, and VCO with injected noise, closed by the feedback path Y.]

Figure 6-10. Schematic of linear PLL model, used to model regenerator (adapted from [8]). The noise sources represented by N1 and N2 are WPM; the noise source represented by N3 is a combination of WPM and WFM
A schematic of the linear PLL model used to model a regenerator is shown in Figure 6-10 (adapted from Figure IV.2-1 of [8]). This is a standard PLL linear model (see [6], [7], and [28]). The phase detector is modeled by the difference between the input plus noise signal U + N1 and the feedback signal Y (i.e., the second summation from the left), followed by the gain Ka. The loop filter is assumed to be proportional plus integral (i.e., an active filter), and is modeled by the 1 + b/s block. The VCO frequency is proportional to the filtered phase error; it is modeled by the Ko/s block, with the 1/s integrating the frequency to produce phase. Additive noise generation is modeled at the PLL input (the input N1 just before the feedback path), at the loop filter (the input N2 between the forward gain and loop filter), and in the VCO (the input N3 just after the VCO, before the PLL output). The first two noise sources enter into the model in the same way and can be combined; their combined effect is modeled as WPM. The effect of the PLL is to low-pass filter this noise. The VCO noise is assumed to have a PSD that varies as 1/f² at low frequencies (WFM) and is constant at high frequencies (WPM). The Quality Factor, or Q, of the VCO determines the relative magnitude of each component (or, equivalently, the breakpoint where the two regions join). Specifically, the VCO noise has a WFM characteristic for frequencies less than f_L = f_0/(2Q), and a WPM characteristic for frequencies greater than f_L, where f_0 is the line rate (i.e., the VCO center frequency). The details of this noise model are given in [29]. The effect of the PLL is to high-pass filter the WPM and WFM generated in the VCO. Full details of the noise accumulation model are given in Appendix IV of [8].

The simulation results are shown in Figures 6-11 and 6-12, in the form of relative jitter accumulation (jitter after N regenerators divided by jitter for one regenerator). Figure 6-11 shows results for low-pass filtered WPM (i.e., WPM in noise sources N1 and N2) and high-pass filtered WPM (i.e., WPM, but no WFM, in noise source N3; this corresponds to a VCO with infinite loaded Q). The frequency domain model actually produces RMS jitter; it is assumed that the ratio of RMS jitter after N regenerators to RMS jitter for one regenerator is the same as the corresponding ratio for peak-to-peak jitter; this means that peak-to-peak jitter is a constant factor times RMS jitter. The results in these figures, expressed in this dimensionless form, also apply if the rates and bandwidths are all multiplied by the same scale factor.
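The random accumulation mechanism can be sketched numerically. The following is an illustrative toy model, not the G.8251 computation: it cascades N identical second-order closed-loop responses, injects equal white phase noise at each regenerator input (the low-pass filtered WPM case), and integrates the accumulated PSD to obtain the RMS jitter ratio. The 2 MHz bandwidth follows the STM-16 example above; the damping value and integration grid are arbitrary assumptions.

```python
import numpy as np

def closed_loop_mag2(f, fn=2e6, zeta=4.0):
    """|H(f)|^2 for a second-order PLL with proportional plus integral
    loop filter: H(s) = (2*zeta*wn*s + wn^2) / (s^2 + 2*zeta*wn*s + wn^2).
    A large damping factor gives a low-pass response with small gain peaking."""
    wn = 2 * np.pi * fn
    s = 1j * 2 * np.pi * f
    H = (2 * zeta * wn * s + wn ** 2) / (s ** 2 + 2 * zeta * wn * s + wn ** 2)
    return np.abs(H) ** 2

def rms_ratio(N):
    """RMS jitter after N cascaded regenerators, relative to one regenerator,
    for equal white phase noise injected at each regenerator input."""
    f = np.linspace(1e3, 1e9, 200_000)   # uniform frequency grid, Hz
    H2 = closed_loop_mag2(f)
    # Noise injected m stages from the output is filtered by |H|^(2m);
    # the accumulated output PSD is proportional to the sum over m = 1..N.
    acc = sum(H2 ** m for m in range(1, N + 1))
    # Uniform grid, so the grid spacing cancels in the ratio of integrals.
    return float(np.sqrt(acc.sum() / H2.sum()))

for N in (1, 10, 100):
    print(N, rms_ratio(N))
```

As in the G.8251 studies, the ratio grows with N because noise contributions near the band edge (where the closed-loop gain exceeds unity due to peaking) are amplified at each stage, while a narrower loop bandwidth slows the growth.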
The values of loaded Q shown in Figure 6-12 cover a range expected in practice. The ratio of high-band jitter network limit to jitter generation is 0.15/0.1 = 1.5. Figure 6-11 shows that if all the jitter generation is in the form of low-pass filtered WPM that just meets the high-band jitter generation specification, the network limit is exceeded after approximately 10 regenerators. Figure 6-12 shows that if all the jitter generation is in the form of high-pass filtered VCO noise that just meets the high-band jitter generation specification, the network limit is exceeded after approximately 10 and 15 regenerators for Q = 30 and 100, respectively (the jitter accumulation is much slower for Q = 535, and hardly accumulates at all for high-pass filtered WPM (Figure 6-11)). The ratio of wide-band jitter network limit to jitter generation is 1.5/0.3 = 5. This level of jitter accumulation appears to be reached after approximately 100 regenerators in the worst cases in Figures 6-11 and 6-12.
Actually, the situation is much better than this, as this conclusion assumes that a regenerator just meets the wide-band jitter generation specification. However, the simulations also showed that a regenerator that just meets the high-band jitter generation specification is within the wide-band jitter generation specification by between a factor of 2 and 3, i.e., the high-band jitter generation specification is the more stringent of the two. The results indicate that, under the assumptions of the model, the STM-16 jitter network limit is exceeded if more than approximately 10 regenerators are cascaded. However, actual networks likely perform much better than this, mainly because SDH regenerators tend to have bandwidths that are much narrower than 2 MHz. As will be seen shortly in the results for OTN, a modest reduction in the bandwidth results in a large reduction in the jitter accumulation.
[Figure 6-11 plots relative jitter accumulation versus number of 3R regenerators (0 to 200), with curves for wide-band and high-band jitter under low-pass WPM and under high-pass WPM.]

Figure 6-11. Simulation results for jitter accumulation over STM-16 regenerators. Assumptions: regenerator bandwidths meet SDH Option 1 (G.783) requirements, random jitter accumulation, no WFM in high-pass (VCO) noise cases
Synchronization of Optical Networks
[Figure: wide-band and high-band jitter accumulation versus number of 3R regenerators (0 to 200), with curves for VCO noise with Q = 30, 100, and 535.]
Figure 6-12. Simulation results for jitter accumulation over STM-16 regenerators. Assumptions: regenerator bandwidths meet SDH Option 1 (G.783) requirements, random jitter accumulation, high-pass (VCO) noise with WFM and WPM and indicated Q-factor
[Figure: wide-band and high-band jitter accumulation versus number of 3R regenerators (0 to 200), with curves for low-pass WPM and high-pass WPM.]
Figure 6-13. Simulation results for jitter accumulation over OTU1 and OTU2 3R regenerators. Assumptions: 3R regenerator bandwidths meet OTN (G.8251) requirements, random jitter accumulation, no WFM in high-pass (VCO) noise cases
[Figure: wide-band and high-band jitter accumulation versus number of 3R regenerators (0 to 200), with curves for VCO noise with Q = 30, 100, and 535.]
Figure 6-14. Simulation results for jitter accumulation over OTU1 and OTU2 3R regenerators. Assumptions: 3R regenerator bandwidths meet OTN (G.8251) requirements, random jitter accumulation, high-pass (VCO) noise with WFM and WPM and indicated Q-factor
In developing the OTN specifications, the network limits and jitter tolerance for OTU1 and OTU2 were chosen to be the same as for STM-16 and STM-64, respectively. In addition, the Option 1 SDH jitter transfer and generation specifications were used as a starting point. Since the corresponding OTN and SDH rates are very similar, the jitter analyses for the two will be approximately the same if all the bandwidths are the same. The specifications for STM-64 were not fully developed when OTN was being developed. However, the above analysis would hold for STM-64 if all
the bandwidths were scaled by the same factor as the rate (i.e., by a factor of 4), and would imply similar results for OTU2 for the same scaled bandwidths. It was decided that OTN jitter accumulation should be acceptable over at least 50 3R regenerators. This Hypothetical Reference Model (HRM) is documented in Appendix II of G.8251. Figure 6-11 shows that, for low-pass filtered WPM, the high-band jitter grows by a factor of approximately 1.8 after 50 3R regenerators. Figure 6-12 shows that, for loaded Q of 30 and 100, the high-band jitter grows by a factor of approximately 1.6 after 50 3R regenerators. Since the ratio of the high-band jitter network limit to high-band jitter generation is 0.15/0.1 = 1.5, the network limit will not be met. The discussion above for Option 1 SDH indicated that the growth factor of 1.5 would be reached after approximately 10 to 15 regenerators (with the exact number depending on the specific noise component and the loaded Q). The results in Figures 6-11 and 6-12 show that the OTN high-band jitter network limit cannot be met with the above assumptions. The assumptions can be made consistent by (1) relaxing the network limit, (2) tightening the jitter generation, (3) using an HRM with fewer 3R regenerators, or (4) tightening the transfer bandwidth. Approaches (1) and (2) are more difficult to achieve technically than (4), while approach (3) would be too limiting in the number of 3Rs allowed. It was decided, therefore, to tighten the transfer bandwidth requirement, and it was found that a tightening by a factor of 8, to 250 kHz and 1 MHz for OTU1 and OTU2, respectively, would provide for acceptable high-band jitter accumulation over 50 3R regenerators. Simulation results are given in Figures 6-13 and 6-14. The results show that the high-band jitter accumulation factor for 50 regenerators is very close to 1.0 for low-pass filtered WPM and for VCO noise with Q as small as 30.
For OTU3, the requirements cannot be taken directly from STM-256 because the latter are not yet specified in G.825. The most straightforward approach would be to scale the measurement filters and transfer bandwidths by a factor of 4 and leave the jitter limits the same as for OTU1 and OTU2. This was done for the high-band jitter network limit and generation. However, for the wide-band jitter, it was decided that, from an implementation point of view, it would be preferable to keep the high-pass measurement filter the same as for OTU2 (i.e., 20 kHz) and instead increase the network limit to 6.0 UIpp and the generation to 1.2 UIpp. The OTU3 noise transfer bandwidth is 4 MHz; additional simulations were done to verify that the wide-band jitter accumulation would be acceptable. The jitter transfer bandwidths of 250 kHz, 1 MHz, and 4 MHz for OTU1, 2, and 3, respectively, are narrower than the high-band jitter measurement filter corner frequencies (1 MHz, 4 MHz, and 16 MHz, respectively). The relation between jitter network limit and jitter tolerance generally means that the clock recovery circuit must track jitter of frequency less than the high-
band jitter measurement filter corner frequency, and therefore cannot have a bandwidth less than this. This implies that OTU1, 2, and 3 regenerators will, in general, use two PLLs, i.e., a wide-band PLL in the clock recovery followed by a narrower-band dejitterizer PLL. Jitter specifications for STM-64/OC-192 were developed subsequent to the OTUk jitter specifications. It was decided to copy the OTU2 specifications for network limit, jitter tolerance, jitter transfer, and jitter generation. This decision also means that, for STM-64, the jitter generation specifications for Option 1 and Option 2 are the same. The OC-192 jitter specifications for SONET in T1.105.03-2003 [11] are consistent with those for STM-64 in G.783 [10]. Note that jitter specifications for STM-256 are not yet specified in G.783, and jitter specifications for OC-768 are not yet specified in T1.105.03. Jitter tolerance requirements for OTUk were chosen to be consistent with the network limits. In contrast with SDH, jitter tolerance for OTU1, 2, and 3 is not specified below 500 Hz, 2 kHz, and 8 kHz, respectively. These frequencies correspond to a jitter level of 15 UIpp. Higher levels of low-frequency jitter are not expected to occur in the network; extending the masks to higher levels at lower frequencies would increase testing complexity while providing little value.
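Using only the numbers above, a short check makes the two-PLL implication concrete: for each OTUk the jitter transfer bandwidth sits well below the high-band measurement corner that the clock recovery must track. A sketch (the table merely restates the figures from the text):

```python
# Jitter transfer bandwidth vs. high-band measurement filter corner, in Hz,
# for OTU1/2/3 as given in the text.
OTUK = {
    "OTU1": {"transfer_bw": 250e3, "highband_corner": 1e6},
    "OTU2": {"transfer_bw": 1e6,   "highband_corner": 4e6},
    "OTU3": {"transfer_bw": 4e6,   "highband_corner": 16e6},
}

for name, spec in OTUK.items():
    # Clock recovery must track jitter up to the high-band corner, but the
    # transfer (dejitterizer) bandwidth is narrower: hence two PLLs.
    assert spec["transfer_bw"] < spec["highband_corner"]
    ratio = spec["highband_corner"] / spec["transfer_bw"]
    print(name, "corner/transfer ratio =", ratio)   # 4.0 at every rate
```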
6.4.3 Jitter and Wander Accumulation for PDH Clients of SDH Networks and SDH Clients of OTN
As indicated at the beginning of Section 6.4, a server layer must ensure that transported clients meet applicable jitter and wander requirements. Server layer requirements are determined by starting with client layer requirements and using a top-down approach common to many areas of engineering. The server layer requirements may be determined by first considering an appropriate HRM for each client application. The HRM is a description of the types of networks and equipment that the client signal traverses end-to-end. The HRM will likely include existing network types, i.e., types of transport in use prior to the introduction of the server layer technology; this will ensure that the introduction of the server layer will allow interworking with existing networks. There may be more than one HRM for a single client, e.g., if there are multiple applications involving that client. After deciding on one or more HRMs, the client network interface jitter and wander requirements are budgeted to the respective network components or jitter/wander producing effects within the HRM. Concurrent with the budgeting, potential requirements are set for the server layer equipment. Analyses are then performed using the equipment requirements to determine
if the jitter and wander budgets can be met. Usually the analyses are performed using simulation; however, they may be analytical when closed-form solutions are available. The process is usually iterative; initial analyses often show that some jitter or wander components exceed their budget allocations. This issue is addressed by changing the budget allocations and/or changing the equipment requirements. HRMs are typically not part of the network requirements, i.e., there is no requirement that a network operator design a network in accordance with an HRM. Rather, the HRM is a documentation of the assumptions used in relating the client layer network-interface and server layer equipment requirements. HRMs are usually documented in informative annexes or informative appendices of Recommendations or standards. In addition to HRMs for client layer jitter/wander accumulation, one may consider HRMs for server layer jitter/wander accumulation and for accumulation of phase error, jitter, or wander in synchronization reference chains (i.e., clock chains). For example, it was indicated in Section 6.4.2 that the OTN 3R regenerator jitter generation and transfer specifications were set such that jitter accumulation would be acceptable over an HRM consisting of a chain of 50 regenerators. As a second example, an HRM for SDH network synchronization is given in ITU-T Recommendation G.803 [30]. That HRM allows 1 PRC and, for Option 1 networks, up to 10 SSUs and 60 SECs, with no more than 20 SECs between any pair of SSUs.

6.4.3.1 PDH Clients of SONET/SDH Networks
HRMs and associated jitter and wander budgets for the transport of DS1 and DS3 signals over SONET (Option 2 SDH) networks are given in T1.105.03 [11]. Four HRMs are given (jitter and wander, for each of DS1 and DS3). The two wander HRMs are also given in ITU-T Recommendation G.824 [31]. An HRM for the transport of an E1 signal over an Option 1 SDH network is given in Recommendation G.823 [32]. As an example, the SONET HRMs were developed based on considerations for North America at the time SONET was developed (i.e., in the early 1990s). Specifically, they assume that a DS1 or DS3 path can traverse up to two Inter-Exchange Carrier (IEC) networks, two Local Exchange Carrier (LEC) networks, and two customer (i.e., private) networks. Each IEC network is assumed to consist of eight SONET islands, each LEC network is assumed to consist of six SONET islands, and each customer network is assumed to consist of two SONET islands. Each SONET island is assumed to consist of a mapper node (where the DS1 or DS3 client is mapped into the VT1.5 or STS-1, respectively), 10 pointer processor nodes, and a demapping desynchronizer at the 10th node. The pointer processors are
STS-1 or VT1.5 for DS3 and DS1 clients, respectively. The SONET islands are referred to as islands because the client transport is over SONET within an island but not between islands. There is no regeneration of the client between the islands; a demapped client is mapped directly into the next island with whatever jitter and wander has accumulated up to that point. The DS1 and DS3 jitter accumulation HRMs, and the DS3 wander accumulation HRM, consist of all 32 SONET islands, i.e., a DS1 or DS3 is assumed to traverse two LEC networks, two IEC networks, and two customer networks between client terminal equipment. The DS1 wander accumulation HRM consists of 8 islands; this is because it is assumed that the applications where DS1 wander is relevant are those where the DS1 transport is between two PSTN (Public Switched Telephone Network) switches, and that such switches will always be within one carrier network. Therefore, the worst-case DS1 HRM is the IEC HRM, which consists of eight islands. The DS1 and DS3 wide-band jitter network limits are both 5 UIpp, with a 10 Hz high-pass jitter measurement filter [33]. The DS1 and DS3 high-band jitter network limits are both 0.1 UIpp, with 8 kHz and 30 kHz high-pass jitter measurement filters, respectively [33]. Since the wide-band limit exceeds the high-band limit by a factor of 50 but the high-band measurement filter bandwidth exceeds the wide-band bandwidth by factors of 800 and 3000, respectively, the wide-band limit is the more limiting of the two. The DS1 and DS3 jitter budgets include components for mapping into SONET islands and PDH mapping/multiplexing, positive pointer adjustments for synchronized islands, negative pointer adjustments for synchronized islands, and periodic pointer adjustments (both continuous and gapped periodic sequences are allowed; see [10] and [11]) due to loss of synchronization in one island.
The loss of synchronization component is included because it was decided that jitter accumulation should be acceptable if one island is in holdover for up to 24 hours. The mapping/multiplexing components are controlled by limiting the desynchronizer bandwidth. The component due to periodic pointer adjustments is controlled by limiting the desynchronizer bandwidth and range of SONET island frequency offsets, due to loss of synchronization, for which client payload integrity is guaranteed. This range is ±4.6 ppm, which is the maximum frequency offset of a SONET Minimum Clock (SMC) that has been in holdover for 24 hours [34]. The components due to positive and negative pointer adjustments for synchronized islands are controlled by limiting the desynchronizer bandwidth and specifying the synchronization noise performance at the SONET NE synchronization interface. This latter limit is the synchronization network limit, and is specified by MTIE and TDEV masks (see below).
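The island counts in the HRM described above can be tallied directly. A small sketch restating the text's assumptions:

```python
# SONET HRM assumptions (per the text): islands per network type, and the
# number of networks of each type a DS1/DS3 path may traverse.
ISLANDS_PER_NETWORK = {"IEC": 8, "LEC": 6, "customer": 2}
NETWORKS_TRAVERSED = {"IEC": 2, "LEC": 2, "customer": 2}

total_islands = sum(ISLANDS_PER_NETWORK[n] * NETWORKS_TRAVERSED[n]
                    for n in ISLANDS_PER_NETWORK)
print(total_islands)               # 32 islands for the jitter HRMs

# The DS1 wander HRM is confined to a single carrier network, so the
# worst case is one IEC network:
print(ISLANDS_PER_NETWORK["IEC"])  # 8 islands
```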
The DS1 and DS3 wander network limits are both 18 µs peak-to-peak over 24 hours (see [35] and [36]). The DS3 wander budget includes components for (PSTN) switch synchronization, mapping into SONET islands and PDH mapping/multiplexing, temperature effects on the fiber, periodic pointer adjustments due to loss of synchronization in one island, clock transients in the synchronization distribution network, and pointer processor buffer fill variations and pointer adjustments due to random synchronization phase noise (i.e., when synchronized). The DS1 wander budget does not include the periodic pointer adjustment and clock transient components; the other components are analogous to those in the DS3 budget, though the values are different. The reason the DS1 budget does not include the periodic pointer adjustment and clock transient components is that these are sufficiently large that the network limit could not be met if they were included (for example, a gapped periodic VT1.5 pointer adjustment sequence can have wander exceeding 10 µs). Instead, these are not considered part of normal operating conditions. Unlike the case of jitter, the desynchronizer bandwidth has negligible impact on the 24-hour wander. This is because the main source of wander is NE synchronization noise and the resulting pointer adjustments and pointer processor buffer fill variations. The buffer fill variations are typically slow compared with the desynchronizer time constant, and the mean time between pointer adjustments is typically much greater than the desynchronizer time constant. The primary means of controlling wander in SONET/SDH networks was to specify a sufficiently good network synchronization interface limit (via MTIE and TDEV masks) that there would be sufficiently small pointer processor buffer fill variation and sufficiently few pointer adjustments.
The SONET network synchronization interface MTIE and TDEV limits (see [37], [34], [38] and [31]) were developed after extensive DS1 wander accumulation simulations performed over several years. MTIE masks for DS1 and OC-N reference interfaces (e.g., interfaces to Building Integrated Timing Supply, or BITS, clocks and to SONET NEs, respectively) were developed during the development of the SONET DS3 jitter specifications. In addition, a DS1 Reference TDEV mask was developed at this time. The MTIE and TDEV masks are specified in ANSI T1.101 [37]. Along with these masks, SONET DS3 desynchronizer requirements were developed that enabled the DS3 jitter network limit to be met. Subsequently, it was found that the level of phase noise allowed by the DS1 reference TDEV mask was too high to allow the DS1 and DS3 wander network limits (18 µs peak-to-peak over 24 hours) to be met. As a result, a new SONET synchronization interface TDEV mask was developed. This mask is one order of magnitude below (i.e., better than) the DS1 reference TDEV mask, and is specified in ANSI T1.105.09 [34] (and also documented
in ITU-T Recommendation G.813 [38]). This mask is not sufficient to allow the 18 µs peak-to-peak wander over 24 hours to be met for the DS1 HRM. However, additional analyses showed that this mask does allow the controlled slip requirements of ITU-T Recommendation G.822 [26] to be met. It was decided that this outcome would provide for acceptable performance because (1) the underlying purpose of the DS1 wander requirement is to ensure that the G.822 slip requirements are met, and (2) not all networks are expected to contain a large number of VT1.5 (VC-11) islands. The slip performance analyses are described in Appendix II/G.813 [38]. As explained above, requirements are placed on the SONET DS1 and DS3 desynchronizers to ensure that the respective jitter and short-term wander network limits are met. These requirements are given in the form of pointer test sequences and mapper/demapper tests. The tests are specified in ANSI T1.105.03 [11] and ITU-T Recommendation G.783 [10]. The pointer test sequences represent the various network conditions that can arise (and that are represented in the jitter and wander budgets), e.g., single (random) pointer adjustments, randomly occurring pointer bursts, pointer bursts due to phase transients, and continuous and gapped periodic pointer sequences (with and without an added and a canceled pointer adjustment due to a single pointer adjustment caused by random phase noise). The various tests constrain peak-to-peak jitter and MTIE (the latter in the form of MTIE masks) when the respective test sequences are applied. Tests are also specified to constrain mapping peak-to-peak jitter and MTIE.

6.4.3.2 SDH Clients of OTN
The OTN physical layer is not used to transport synchronization; however, synchronization is transported via the SDH clients. Therefore, the effect of the OTN on SDH client jitter and wander accumulation for an appropriate synchronization reference chain must be considered. Appendix II of G.8251 describes the SDH client synchronization reference chain. The HRM was developed by first considering the pure SDH synchronization reference chain in ITU-T Recommendation G.803. As indicated at the beginning of Section 6.4.3, that HRM allows 1 PRC and, for Option 1 networks, up to 10 SSUs and 60 SECs, with no more than 20 SECs between any pair of SSUs. For OTN, it was considered that the SECs between successive SSUs would be replaced by OTN "islands", i.e., the STM-N would be mapped into an OPUk at one SSU location and demapped at the other. There are still assumed to be 10 SSUs, and 20 SECs at the end of the chain are still allowed.
The OTN "island" between successive SSUs has an STM-N client mapping/demapping. It is also allowed to have up to nine other mappings/demappings, so as to account for ODUk multiplexing. This means that the total number of mappings/demappings in the reference chain is 100. The model was then generalized such that the mappings would not have to be evenly divided among the nine SSU pairs, i.e., some of the islands could be large and some small. Finally, for the case where there is no ODUk multiplexing, it was decided to still consider the effect of up to 100 mappings/demappings; for this case, these would all be STM-N to OPUk. This synchronization reference chain was then used to determine (1) the STM-N to OPUk asynchronous mapper buffer hysteresis (the mapping is +1/0/-1), and (2) the ODCp transfer bandwidth, gain peaking, and roll-off. The first requirement is chosen such that, in the worst case, the long-term (24-hour) wander is acceptable for SDH. The second requirement is chosen such that the peak-to-peak jitter and short-term wander are acceptable for SDH for the case of asynchronous mapping. The STM-N to OPUk asynchronous mapper buffer hysteresis is limited to 2 bytes for STM-16, 8 bytes for STM-64, and 32 bytes for STM-256. This is approximately equal to 6.4 ns per mapper. Since the mapping is +1/0/-1, the maximum long-term wander that can possibly occur is equal to this hysteresis. Therefore, for 100 mappings/demappings, the maximum possible long-term wander is 640 ns peak-to-peak. This is sufficiently below the long-term wander limits at the end of a synchronization reference chain of 5 µs peak-to-peak for Option 1 [32] and 1.86 µs peak-to-peak for Option 2. Note that the maximum possible long-term wander of 640 ns will not necessarily be reached; whether it is reached depends on the actual client and ODCa frequency offsets from nominal and the actual wander generation in these clocks.
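The 6.4 ns per-mapper figure and the 640 ns worst case follow directly from the hysteresis limits and the nominal STM-N line rates. A quick check:

```python
# Nominal STM-N bit rates (bit/s) and mapper buffer hysteresis limits
# (bytes), per the values quoted in the text.
RATES = {"STM-16": 2_488_320_000, "STM-64": 9_953_280_000,
         "STM-256": 39_813_120_000}
HYSTERESIS_BYTES = {"STM-16": 2, "STM-64": 8, "STM-256": 32}

for name, rate in RATES.items():
    ns = HYSTERESIS_BYTES[name] * 8 / rate * 1e9   # hysteresis in ns
    print(name, round(ns, 2))                      # ≈ 6.43 ns at each rate

# Worst-case long-term wander over 100 mappings/demappings:
worst_ns = 100 * HYSTERESIS_BYTES["STM-16"] * 8 / RATES["STM-16"] * 1e9
print(round(worst_ns))   # ≈ 643 ns, below the 1.86 µs Option 2 limit
```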
However, by taking the above approach of limiting the buffer hysteresis, the task of network engineering becomes trivial. The long-term wander requirement is met as long as the number of mapper (multiplex)/demapper (demultiplex) pairs does not exceed 100. In addition, with this approach it was possible to avoid the need to do long-term wander accumulation simulations (which would be extremely time consuming). The ODCp maximum bandwidth and gain peaking were chosen to be 300 Hz and 0.1 dB, respectively, and the roll-off was chosen to be 20 dB/decade. With these assumptions, simulations of wide-band and high-band jitter accumulation and of short-term wander accumulation were performed for two cases. In the first case, the STM-N client and ODCa clocks were assumed to be free-running, with actual frequency chosen randomly (and separately for each clock) within the frequency tolerance requirements of ± 20 ppm. In the second case, the STM-N and ODCa clocks were assumed to be very close to but not exactly equal to nominal frequency, but sufficiently
close that the resulting rate of stuffs would be less than the desynchronizer bandwidth (specifically, the offsets were within ± 0.05 ppm, which corresponds to a maximum stuff rate of 31 Hz for STM-16 mapped into OPU1; this is approximately 0.1 times the desynchronizer bandwidth of 300 Hz). This latter case was considered to be the worst case, because here the stuffs are filtered individually by the desynchronizer. In fact, for wide-band jitter a rough estimate of the worst case, which occurs for STM-16 mapped into OPU1, is easily obtained by considering an 8 UI step filtered first by a 300 Hz low-pass filter, followed by a 5 kHz high-pass filter (the desynchronizer and wide-band jitter measurement bandwidths, respectively). The result is approximately 0.4 UI peak; the peak-to-peak is therefore twice this, or 0.8 UI. This is well within the 1.5 UIpp jitter limit. This conclusion was confirmed via time domain simulations of multiple OTN islands. Similarly, the high-band jitter is well within the 0.15 UIpp limit, because the jitter limit is 0.1 times as much while the high-band measurement filter corner frequency is 200 times as much (1000 kHz/5 kHz). The short-term wander requirements at a synchronization interface are stated in terms of TDEV masks (see Section 6.2.5 for a brief discussion of TDEV). The masks are given in Recommendation G.813 [38]; TDEV for Option 2 is limited to 10 ns for integration times between 0.05 s and 10 s, and to 12 ns for integration times between 0.1 s and 7 s (both TDEV masks then increase for longer integration times). Simulations were performed (see Appendix VIII of [8]) for STM-16 transported over 100 OTN islands, with the same two sets of assumptions on the clock offsets as for the jitter analyses. As with the jitter analyses, the case with small but non-zero clock offsets was the worse of the two. TDEV for this case was found to reach a maximum of approximately 2.5 ns for an integration time of approximately 0.5 s, and then decrease.
Therefore, the network interface TDEV requirement is met. Similar analyses were performed for reference models that include ODUk multiplexing. These led to the same ODCp maximum bandwidth and gain peaking requirements, namely, 300 Hz and 0.1 dB, respectively. In addition, the HRMs led to maximum mapper buffer hysteresis of 2 bytes for ODU1 multiplexed into ODU2 or ODU3, and 8 bytes for ODU2 multiplexed into ODU3 [39].
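The 0.4 UI rough estimate above (an 8 UI step through the 300 Hz desynchronizer low-pass and the 5 kHz wide-band measurement high-pass) can be reproduced by evaluating the step response of the two cascaded filters. A sketch, assuming ideal single-pole responses for both filters:

```python
import numpy as np

UI_STEP = 8.0   # one justification event seen at the demapper: 8 UI step
F_LP = 300.0    # ODCp desynchronizer low-pass bandwidth, Hz
F_HP = 5000.0   # wide-band jitter measurement high-pass corner, Hz

a, b = 2 * np.pi * F_LP, 2 * np.pi * F_HP
t = np.linspace(0.0, 0.01, 200_000)
# Step into LPF then HPF: Y(s) = UI_STEP * a / ((s + a)(s + b)), so
# y(t) = UI_STEP * a/(b - a) * (exp(-a t) - exp(-b t))
y = UI_STEP * a / (b - a) * (np.exp(-a * t) - np.exp(-b * t))
peak = y.max()
print(round(peak, 2), "UI peak,", round(2 * peak, 2), "UIpp")   # 0.4 UI peak, 0.8 UIpp
```

The peak-to-peak value of 0.8 UI sits comfortably inside the 1.5 UIpp wide-band limit, consistent with the time domain simulations cited in the text.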
6.5. RELIABLE DISTRIBUTION OF SYNCHRONIZATION
The previous sections were devoted to the description of slow and fast phase variations, i.e., phase noise, of the synchronization signals, and their
effect on the OTN, SDH/SONET, and PDH transport planes. In this section, the focus is shifted towards the subject of reliable synchronization distribution. Even if synchronization signals have perfect quality, and therefore zero phase noise, the connections over which they are transported may fail, and the synchronization distribution will then be disrupted. Several measures are available to the synchronization network engineer to design reliable synchronization distribution over transport connections that are inherently not entirely reliable.
6.5.1
The Need for Synchronization
For network elements of the PSTN switching network, such as digital switches, remote switching modules, PDH cross-connect systems, primary multiplexers, and ISDN equipment, it is the objective to operate synchronously as much as possible. Any asynchronism in this layer of the telecommunications network results in a phenomenon called (controlled) octet slip. Octet slip means that at the level of 64 kbit/s bearer channels, octets are deleted or repeated in order to align an input signal with the local clock of the network element in question. Octet slips are considered an impairment to the service that runs over the bearer channel, and their occurrence needs to be minimized. ITU-T Recommendation G.822 [26] provides information on the number of octet slips that are considered acceptable. The effect of an octet slip on the service it carries depends on the nature of the service (see Table 6-2).

Table 6-2. The effect of an octet deletion or repetition at the 64 kbit/s level

Service: Effect of octet slips
Voice (64 kbit/s PCM): Sometimes an audible click
HDLC Data: One or two lost frames per slip; throughput loss depends on error detection and recovery scheme of application
ADPCM Voice (32 kbit/s): More than twice as bad as 64 kbit/s PCM voice
Compressed Video (DS1): Each slip can wipe out one or more lines
Facsimile: Depending on system used, each slip can wipe out up to eight lines
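Slip frequency follows directly from frequency offset: a controlled slip occurs roughly each time the accumulated misalignment between the incoming signal and the local clock reaches one 125 µs frame. A sketch (the two-PRC plesiochronous example is illustrative, not taken from the text):

```python
FRAME_S = 125e-6   # one PCM frame: one octet per 64 kbit/s channel

def seconds_between_slips(freq_offset):
    """Time for the accumulated phase difference between the incoming
    signal and the local clock to reach one frame, forcing one controlled
    octet slip (a one-frame slip buffer is assumed)."""
    return FRAME_S / abs(freq_offset)

# Two plesiochronous networks, each timed by a PRC within 1e-11 of
# nominal (G.811), so a worst-case relative offset of 2e-11:
days = seconds_between_slips(2e-11) / 86400
print(round(days))   # ≈ 72 days between slips in the worst case
```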
Synchronous operation in the PSTN switching layer relies on the transport network to carry synchronization information between the network elements of the PSTN. In the days when primarily PDH technology was used for the transport of digital signals, the E1 and DS1 carrier frequencies (2048 kHz and 1544 kHz) were used in the 30-channel and 24-channel markets, respectively, to carry this synchronization information. With the introduction of SDH and SONET, this role was taken over by the STM-N carriers. This change means that in order to synchronize the PSTN network elements, the
E1, DS1, and STM-N signals need to be synchronized to some common clock. An important difference between PDH and SDH as transport technologies is that PDH network elements themselves do not need to be operated synchronously in order to transport a synchronous E1 or DS1 signal without introducing levels of wander that are unacceptable for synchronization purposes, thanks to the specific PDH bit-stuffing mechanism. In contrast, SDH networks cannot transport E1 or DS1 payloads with sufficient phase transparency for synchronization purposes, due to the nature of the SDH pointer adjustment mechanism. This is why, in the case of SDH, the STM-N carrier signals have been introduced as carriers of synchronization information and, as a consequence, the SDH network elements need to be synchronized as well. Fortunately, the synchronization of the SDH network elements also has a beneficial effect on the wander accumulation of payload signals, as explained in the previous sections, since it greatly reduces the probability of pointer adjustments. For the distribution of synchronization information, the presence of OTN networks hardly matters. The synchronization information is still transmitted via the STM-N carrier phase and frequency. Mapping an STM-N signal as a client in an ODUk signal and transporting and cross-connecting it through an OTN network does not affect the timing properties of the recovered STM-N signal at the OTN egress, even if multiple OTN networks are traversed. The mapping of STM-N signals into the ODUk is defined such that the timing properties of the STM-N signals are maintained, even if all nodes in the OTN network have a free-running local clock. This property of the OTN network is comparable to PDH networks, which are also capable of maintaining the frequency and phase properties of synchronous E1 and DS1 signals, while the higher-order PDH signals are derived from free-running network element clocks.
6.5.2
Synchronization Areas
In the public network there is a clear need for a high-quality "master clock" that serves as the synchronization source for all network elements in a certain area, called a Synchronization Area. In such an area, all other network elements operate synchronously with this master clock.

6.5.2.1 Synchronization Reference Chains
In a synchronization area, synchronization information is distributed by interconnection of the network elements via STM-N links or, in case a network element has no STM-N interface, by a synchronous E1 or DS1
interface. Using phase-locked loop techniques, the timing reference information is recovered from the incoming synchronization link and used to control the frequency of the internal oscillator(s) in the equipment, which in turn synchronizes the outgoing signals. In this way, many network elements can be cascaded, and all are eventually timed by the master clock of the area. The length of the resulting chain of clocks has to be restricted. The longer such a chain becomes, the more phase noise will be accumulated, which degrades the quality of the synchronization information. It is part of the network synchronization plan to design the synchronization paths in such a way that the number of clocks that form a chain, starting at the master clock of the area, remains limited. The amount of degradation or phase noise accumulation on a synchronization path depends on the number and type of clocks that are in the synchronization chain. Each clock will generate some phase noise due to component imperfections, but will also filter phase noise generated by clocks upstream in the chain, depending on the frequency of the noise. In addition, interruptions in the path of the clock chain cause additional phase excursions. For practical networks, the composition of synchronization chains varies considerably. To be able to get a handle on this, a "synchronization reference chain" has been composed for standardization purposes as a model for such chains. In ITU-T Recommendation G.803 [30], a model chain is defined as consisting of one PRC clock as a master and up to 10 SSU and up to 60 SEC clocks slaved to it, with the restriction that the number of SEC clocks in tandem is never more than 20. This model is used throughout Option 1 networks. 
For Option 2 networks, another model is assumed (in Telcordia GR-436-CORE [40]), where each chain consists of one PRS (Stratum-1 clock) at the start of the chain, followed by a number of Stratum-2 clocks with at most one SMC between two successive Stratum-2 clocks (collocated with the upstream Stratum-1 or Stratum-2 clock) and possibly up to 16 SMCs at the end. The various clock types are summarized in Tables 6-3 and 6-4.
Synchronization of Optical Networks
Table 6-3. Overview of standardized clock types in Option 1 networks (SDH markets)

Clock type | Applicable standard | Quality indication | Purpose
PRC (Primary Reference Clock) | ITU-T Rec. G.811 | 1×10⁻¹¹ lifetime accuracy | Master clock, stand-alone equipment
SSU-T (Synchronization Supply Unit - Transit node) | ITU-T Rec. G.812, type I (modern equipment) or type V (older equipment) | 3×10⁻⁸ holdover stability (period of months) | Slave clock, stand-alone equipment or built into larger network elements (switch or cross-connect)
SSU-L (Synchronization Supply Unit - Local node) | ITU-T Rec. G.812, type VI (older equipment only) | 1×10⁻⁸ holdover stability (period of days) | Older type slave clock, built into larger network elements (switch or cross-connect)
SEC (SDH Equipment Clock) | ITU-T Rec. G.813, Option 1 | 5×10⁻⁸ holdover stability (period of minutes) | Slave clock, built into SDH network elements
Table 6-4. Overview of standardized clock types in Option 2 networks (SONET markets)

Clock type | Applicable standard | Quality indication | Purpose
PRS (Primary Reference Source) | ITU-T Rec. G.811 / ANSI T1.101 | 1×10⁻¹¹ lifetime accuracy | Master clock, stand-alone equipment
Stratum-2 | ITU-T Rec. G.812, type II | 1.6×10⁻⁸ lifetime accuracy; 1×10⁻¹⁰ holdover stability (period of 24 hours) | Slave clock, stand-alone equipment or built into larger network elements (switch or cross-connect)
Stratum-3E | ITU-T Rec. G.812, type III | 4.6×10⁻⁶ lifetime accuracy; 1.2×10⁻⁸ holdover stability (period of days) | Slave clock, built into larger network elements (switch or cross-connect)
Stratum-3 | ITU-T Rec. G.812, type IV (older equipment only) | 4.6×10⁻⁶ lifetime accuracy; 3.9×10⁻⁷ holdover stability (period of 24 hours) | Older type slave clock, built into larger network elements (switch or cross-connect)
SMC (SONET Minimum Clock) | ITU-T Rec. G.813, Option 2 | 5×10⁻⁸ holdover stability (period of minutes) | Slave clock, built into (small) SONET network elements
The standards have been constructed so that even if a network element experiences phase noise as though it were located at the end of a synchronization reference chain, the overall jitter and wander requirements are met for signals transported through such networks. Given the composition of the reference chains, real network clock chains will generally be much shorter, so the network-level performance will be better, even if occasionally some chains are longer than the reference chain.

6.5.2.2 Synchronous and Plesiochronous Operation
A network is said to operate synchronously when all network elements in that network have their internal oscillators (i.e., their "local clock") locked to a single reference clock, the master clock of that network. In this situation, the long-term average frequencies of all local clocks are equal. When considered over a shorter period of time, there may be deviations due to jitter and wander effects, but these are essentially random phase effects that are bounded in the long run. For a network to operate synchronously, it is necessary to distribute the reference signal from the master clock to all network elements of that network. The geographical area in which this situation is the normal state of affairs, i.e., in the absence of failures, is called a Synchronization Area. In practice, most operators design their network to be a single synchronization area, unless the area they serve is very large or consists of multiple disjoint areas, in which case there can be more than one Synchronization Area in a single operator's domain.

Since digital signals may need to traverse different Synchronization Areas to reach their destination, each transition between such areas may cause an octet slip at the level of the 64 kbit/s bearer channel. ITU-T Recommendation G.822 prescribes that, under normal conditions, at most one single octet slip is allowed in a 70-day period between adjacent Synchronization Areas. One octet corresponds to 125 µs in the time domain, and the build-up of a time difference of 125 µs in 70 days corresponds to a relative frequency difference of 2×10⁻¹¹. So as long as the master clocks of Synchronization Areas have a frequency that deviates less than 1×10⁻¹¹ from the absolute reference (UTC), the Recommendation is fulfilled. The situation in which the long-term average frequencies of local clocks of adjacent network elements deviate less than 1×10⁻¹¹ from the absolute reference is called Plesiochronous Operation.
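The slip-rate arithmetic above can be verified directly. The snippet below is a back-of-the-envelope sketch (variable and function names are ours); it reproduces the 2×10⁻¹¹ total offset and the per-master-clock 1×10⁻¹¹ budget.

```python
OCTET_S = 125e-6           # one octet of a 64 kbit/s channel = 125 microseconds
PERIOD_S = 70 * 86400      # 70 days, in seconds

# A constant relative frequency offset y accumulates phase error y * t,
# so one octet slip in 70 days corresponds to a total relative offset of:
max_total_offset = OCTET_S / PERIOD_S      # about 2.07e-11

# The budget is shared by the master clocks on both sides of the area
# boundary, which is where the 1e-11 per-clock figure comes from:
per_clock_offset = max_total_offset / 2

def days_to_one_slip(rel_offset):
    """Days for a given relative frequency offset between two areas
    to accumulate one octet (125 microseconds) of phase error."""
    return OCTET_S / rel_offset / 86400
```

With both masters at their 1×10⁻¹¹ limit (a combined 2×10⁻¹¹), one slip takes just over 72 days, comfortably meeting the 70-day objective.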
In case frequency differences are substantially larger than allowed by G.822, but each digital signal still meets its own frequency specification, the network operation is called asynchronous. Table 6-5 gives an overview of
the frequency limits that apply for the various digital signals in asynchronous operation.

Table 6-5. Maximum allowed relative frequency deviations for digital signals

Signal type | Maximum relative frequency offset
STM-N (OC-N) | 4.6×10⁻⁶ (20×10⁻⁶ for AIS)
DS1 | 32×10⁻⁶
E1 | 50×10⁻⁶
DS2 | 30×10⁻⁶
E3 | 20×10⁻⁶
DS3 | 20×10⁻⁶
E4 | 15×10⁻⁶
OTUk | 20×10⁻⁶
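Table 6-5 translates naturally into a small lookup. The sketch below is illustrative only (the dictionary and function are our own constructs); it checks a measured line rate against the tabulated tolerance.

```python
# Maximum relative frequency offsets from Table 6-5 (dimensionless parts)
MAX_OFFSET = {
    "STM-N": 4.6e-6,   # 20e-6 when the signal carries AIS
    "DS1": 32e-6, "E1": 50e-6, "DS2": 30e-6,
    "E3": 20e-6, "DS3": 20e-6, "E4": 15e-6, "OTUk": 20e-6,
}

def within_spec(signal, nominal_hz, measured_hz):
    """True if a measured line rate is inside the Table 6-5 tolerance."""
    return abs(measured_hz - nominal_hz) / nominal_hz <= MAX_OFFSET[signal]
```

For example, an E1 (nominal 2.048 MHz) measured 90 Hz high is within its 50 ppm tolerance, while one 110 Hz high is not.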
Note that of the signals listed in Table 6-5, only STM-N/OC-N, E1, and DS1 signals normally operate plesiochronously or synchronously. For OTUk and higher-order PDH signals, asynchronous operation is the normal mode of operation, since such signals are not used to transport synchronization within the network. The PDH multiplexing scheme is designed such that synchronous E1 or DS1 signals maintain their phase and frequency properties when they are carried over higher-order PDH links, even when these carriers themselves run off-frequency but still within the bounds given in Table 6-5. The OTN is designed such that synchronization can be transported via SDH clients.

6.5.2.3 Traceability and Quality
From the description of Synchronous and Plesiochronous operation, it is clear that synchronization information has to be distributed from a very accurate master clock, with less than 1×10⁻¹¹ fractional frequency error, to all other network elements in the Synchronization Area. To achieve a long-term accuracy of this level, it is necessary to apply Cesium technology. This technology is too bulky and too expensive to apply in each network element, hence the need for synchronization distribution. Synchronization information can be distributed by using a chain of interconnected clocks. The first clock of the chain acts as the "Master Clock" of the chain, and the frequency and phase information of this clock is used as the "Reference Signal" by each subsequent element (slave clock) in the chain to control its output signal, using a phase-locked loop technique. All clocks are thus slaved to the master clock of the chain. The concept of Traceability is based on the practice of chaining clocks. All clocks in the chain are said to be traceable to the first clock or master clock of the chain: their long-term average frequencies are equal.
Since the master clock is the eventual source of synchronization for all slave clocks in the connected synchronization chain(s), it is important that its quality is adequate. The "Quality" of a clock is determined by multiple factors, but for the discussion of reference chains, the long-term accuracy of its frequency is the most important. Tables 6-3 and 6-4 list the most important quality characteristics of standardized clocks.

6.5.2.4 PRC/PRS Autonomy
Another concept that is useful in the discussion of reliable distribution of synchronization signals is the notion of PRC/PRS autonomy. ITU-T Recommendation G.811 provides the specification of PRC (Option 1 networks) or PRS (Option 2 networks) equipment, of which the most important requirement is the long-term frequency accuracy of 1×10⁻¹¹. Such a clock is suitable as the master clock of a Synchronization Area. However, the link between two clocks may fail, and an important question in that case is how long the slave clock, without its accurate reference present, can maintain its output phase and frequency within specification. This period of time is called the PRC Autonomy or PRS Autonomy and plays a role in determining the reference protection strategy. It is illustrated in Figure 6-15.
Figure 6-15. PRC/PRS Autonomy (allowed versus actual phase drift as a function of time; the autonomy is the period during which the actual drift stays within the allowed limit)
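Under the simplifying assumption of a constant holdover frequency offset, the autonomy period of Figure 6-15 can be estimated as the allowed phase budget divided by that offset. The helper below is our own first-order sketch, not a standardized computation, and the 10 µs phase budget in the example is hypothetical; real clocks also exhibit aging and temperature-driven drift, so actual autonomy is shorter.

```python
def autonomy_seconds(holdover_offset, phase_budget_s):
    """First-order PRC/PRS-autonomy estimate: with a constant relative
    frequency offset y in holdover, phase error grows as y * t, so the
    allowed phase budget is exhausted after budget / y seconds."""
    return phase_budget_s / holdover_offset

# Example: a clock holding a 1e-10 offset against a hypothetical
# 10-microsecond phase budget survives on the order of a day.
```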
The PRC/PRS Autonomy and the long-term frequency accuracy are the parameters that determine the quality of a clock. The master clock has the highest quality in a synchronization area. Slave clocks can have very different specifications with regard to PRC/PRS autonomy and frequency accuracy. In ITU-T Recommendations G.812 and G.813, a number of clock types are defined, with frequency accuracies between 3.2×10⁻⁸ and 1×10⁻⁶ and PRC/PRS Autonomies ranging from seconds to days. The former require quick automatic reference-switching mechanisms; the latter may simply wait until the failed link is repaired.
6.5.3 Reference Duplication and Reference Selection
The time-tested method to enhance reliability is duplication. Simply put: a network element can accept multiple input signals as potential sources of synchronization information, but only one of them is active at any given time. In case this reference fails or a better reference becomes available, the network element may perform a reference switch. The need for reference duplication increases for equipment with clocks that have shorter PRC Autonomy. On the other hand, for Stratum-2 clocks with their excellent holdover stability, reference duplication is not strictly necessary, since there is sufficient time to repair a reference connection in case of failure before the frequency of the clock drifts too far away. On the lower end of the clock quality spectrum, and especially for SEC and SMC clocks, operation with duplicated (or triplicated, etc.) references is almost mandatory.

With the introduction of multiple synchronization references for a single clock, the problem of reference selection is unavoidably introduced. A mechanism needs to be in place to perform automatic reference switching. Two such mechanisms are in use, namely selection based on priority and selection based on quality.

6.5.3.1 Priority-based switching
The priority-based reference selection mechanism is very simple. Each potential timing reference input is assigned a certain priority by the network operator. This priority is typically expressed as a number, starting at 1 for the highest priority and incrementing for each alternative reference. The system will then automatically select the acceptable reference with the highest priority as the active reference.
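The priority rule can be captured in a few lines. This is an illustrative sketch (the data layout is ours, not from any standard): each reference is provisioned with a priority, and the acceptable reference with the lowest number wins.

```python
def select_by_priority(references):
    """Pick the acceptable reference with the highest priority, where
    priority 1 is highest, as described above.
    `references` maps name -> (priority, acceptable)."""
    candidates = [(prio, name)
                  for name, (prio, ok) in references.items() if ok]
    if not candidates:
        return None            # fall back to the internal clock
    return min(candidates)[1]  # lowest number = highest priority
```

For example, if the priority-1 input has failed, the priority-2 input is selected automatically.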
6.5.3.2 Quality-based switching
A second reference selection criterion is the quality of the reference signals. Obviously, if there are multiple references available, it is desirable to select a reference that is traceable to a PRC rather than a reference that is only traceable to, for instance, an SSU-T. Unfortunately, it is normally not possible for clocks to distinguish between these traceabilities based on the frequency and phase properties of the signal alone. Establishing this would require a highly sophisticated decision algorithm in combination with a very accurate internal clock, a setup that is only feasible for high-quality stand-alone clock equipment. For other network elements, the traceability of a reference can only be established through some form of messaging. In PDH networks, such messaging does not exist, but in SDH networks the traceability of a reference signal is conveyed in a 4-bit code in the S1 byte of the STM-N/OC-N (multiplex) section overhead. Of course, this process can only function properly if the proper message is inserted at the beginning of the chain and propagated along the chain. The requirements regarding Synchronization Status Messages, as this traceability information is normally called (ITU-T Rec. G.781), aim to ensure exactly this behavior.

6.5.3.3 Reference Acceptance
In the context of reference selection and switching, a reference can be acceptable or unacceptable. The reference selection process operates on the set of acceptable references. Acceptable references are those that are derived from alarm-free input signals and that do not show phase or frequency excursions beyond the acceptance levels for the equipment in question. Detecting phase and frequency excursions in the potential references of a network element is not always possible. In principle, only phase and frequency differences can be observed. By comparing, pairwise, the phase and frequency differences between multiple input references and the information from the oscillator(s) in the system itself, it is possible to disqualify references (or internal oscillators) by a system of "majority voting." Depending on the complexity and importance of the equipment in the synchronization network, such mechanisms may be implemented to a certain extent.
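A minimal sketch of such majority voting, under the simplifying assumption that each source yields a scalar frequency-offset estimate (the function and data layout are our own, not from any standard): the median of all estimates stands in for the majority, and outliers are disqualified.

```python
import statistics

def disqualify_by_vote(freq_estimates, tolerance):
    """Flag sources whose frequency estimate disagrees with the majority.
    The median of all estimates (input references plus internal
    oscillators) plays the role of the "majority"; any source further
    away than `tolerance` is disqualified.
    `freq_estimates` maps source name -> relative frequency estimate."""
    majority = statistics.median(freq_estimates.values())
    return {name for name, f in freq_estimates.items()
            if abs(f - majority) > tolerance}
```

For example, a reference that has drifted to a few ppm while the other sources agree at the 1e-11 level is flagged, while the agreeing sources are kept.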
6.5.4 Synchronization Status Messages
The presence of Synchronization Status Messages (SSMs) containing reference traceability information in the STM/OC-N overhead gives the synchronization network engineer more options to design adequate reliability, which is especially welcome when sufficient independent references are not available for a strict priority-based reference selection scheme.

6.5.4.1 Message Set
Given the differences in the standardization of clocks in Option 1 and Option 2 networks, the SSM code points are also different (see Table 6-6). In principle, the rule is that the lower the number, the better the traceability of a reference, but there are a few exceptions to that rule, as included in Table 6-6.

Table 6-6. Available code points in the S1 byte for SSM messages (after ITU-T Rec. G.781)

S1 code | Option 1 interpretation | Option 2 interpretation
0000 | invalid | STU - Traceability Unknown
0001 | invalid | PRS - Primary Reference Source (G.811)
0010 | PRC - Primary Reference Clock (G.811) | invalid
0100 | SSU-A - Synchronization Supply Unit A (G.812, type I or V) | TNC - Transit Node Clock (G.812, type V)
0111 | invalid | ST2 - Stratum 2 (G.812, type II)
1000 | SSU-B - Synchronization Supply Unit B (G.812, type VI) | invalid
1010 | invalid | ST3 - Stratum 3 (G.812, type IV)
1011 | SEC - SDH Equipment Clock (G.813, option 1) | invalid
1100 | invalid | SMC - SONET Minimum Clock (G.813, option 2)
1101 | invalid | ST3E - Stratum 3E (G.812, type III)
1110 | invalid | PROV - Provisionable by network operator
1111 | DUS - Don't Use for Synchronization | DUS - Don't Use for Synchronization
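For illustration, the Option 1 column of Table 6-6 can be encoded as a lookup keyed on the low nibble of the S1 byte, where the SSM is carried. The dictionary and the numeric quality ranks below are our own illustrative encoding (smaller rank = better traceability), not a normative artifact.

```python
# Option 1 code points of Table 6-6, keyed on the 4-bit SSM value;
# the second tuple element is an illustrative quality rank.
OPTION1_SSM = {
    0b0010: ("PRC", 1),
    0b0100: ("SSU-A", 2),
    0b1000: ("SSU-B", 3),
    0b1011: ("SEC", 4),
    0b1111: ("DUS", None),    # never usable as a reference
}

def decode_s1(s1_byte):
    """Decode the SSM carried in the low nibble of the S1 byte."""
    return OPTION1_SSM.get(s1_byte & 0x0F, ("invalid", None))
```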
Synchronization can also be carried over signals other than STM/OC-Ns, most importantly E1 and DS1 signals. Such signals normally don't carry SSMs, but an exception is made for dedicated E1 and DS1 signals that interface between stand-alone clock equipment and other network equipment
in an office. Such E1 and DS1 signals do not carry payload, but can code SSM information in their overhead (in the TS0 frame overhead in the case of E1 and in the frame's Data Link channel in the case of DS1).

6.5.4.2 SSM Protocol Rules
In conjunction with the SSM message set, a set of protocol rules is defined for SDH and SONET network elements that governs the processing of SSM messages between network elements. SSM message processing is defined for network elements that have a relatively low-level clock, such as an SEC or SMC in Option 1 and Option 2 networks, respectively. The protocol rules introduced in this section are illustrated with an example in the next section.

6.5.4.2.1 SSM Selection

The selection of a reference based on SSM messages is very straightforward. The reference that has the best traceability is selected, provided that the traceability is better than or equal to the quality level of the internal clock of the network element. If the quality of the internal clock is better than the traceability of any reference, the internal clock runs without a reference, in holdover mode. It becomes the "master clock" of the subnetwork that is "downstream" in a synchronization sense. This situation normally occurs only during network failures in which the connection with the PRC or PRS of the network is lost. A final rule for reference selection is that references that carry DUS (Don't Use for Synchronization) can never be selected.

In case multiple references have the same traceability, the selection is made according to a provisioned set of priorities assigned to each reference input. This means that the SSM-based selection scheme does not replace the priority-based scheme; it is merely added to it.

6.5.4.2.2 SSM Forwarding

When the SSM of the selected reference is known, this value is forwarded on all outgoing STM-N links. In case the internal clock is in holdover mode, the outgoing SSM is determined by the quality of the internal clock.
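The selection and forwarding rules just described can be sketched as follows. The numeric quality levels and data structures are our own simplification (smaller number = better traceability); the DUS handling on the return port anticipates the looping rule covered in Section 6.5.4.2.3.

```python
PRC, SSU_A, SSU_B, SEC, DUS = 1, 2, 3, 4, 9   # smaller = better quality

def select_reference(refs, internal_quality):
    """SSM-based selection as described above: take the reference with
    the best (lowest) quality level, break ties by provisioned priority,
    never select DUS, and fall back to holdover when the internal clock
    is better than every reference.
    `refs` maps port name -> (quality_level, priority)."""
    usable = [(q, prio, name) for name, (q, prio) in refs.items()
              if q != DUS and q <= internal_quality]
    if not usable:
        return None, internal_quality       # holdover on the internal clock
    q, _, name = min(usable)
    return name, q

def outgoing_ssm(selected_port, selected_quality, port):
    """Forwarding rule: send the selected quality on every outgoing port,
    except DUS back toward the active reference (loop prevention)."""
    return DUS if port == selected_port else selected_quality
```

For instance, with a PRC-traceable "west" input and a SEC-traceable "east" input on an SEC-clocked node, "west" is selected, PRC is forwarded east, and DUS is returned west.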
6.5.4.2.3 SSM Looping

To avoid a timing loop between two neighboring network elements each selecting the other as its reference, an additional rule for the processing of SSMs is that when a reference is selected by a network element, the signal
in the return path is given the special DUS (Don't Use for Synchronization) message.

6.5.4.2.4 Manual SSM Assignment

Two additional options are available to the synchronization engineer to facilitate the use of SSMs; these are useful on the boundaries between a network that uses SSMs and one that does not. Each input reference can be assigned a certain traceability (overriding the SSM that may be present in the signal). On each output, the DUS message can be inserted (overriding the normally transmitted SSM as established by the protocol).

6.5.4.3 SSM-based restoration applied in a ring network
To illustrate the operation of the SSMs and their protocol rules, the example of a simple ring network is worked out. Figure 6-16 shows a simple SDH ring network of six network elements, labeled A through F, connected by STM-N links. Each network element has an SEC-type internal clock. NE A receives (duplicated) synchronization signals from outside the ring with PRC traceability (the traceability is shown in the ellipses) and acts as the timing master for the network elements in the ring. The traceability of the reference of node A, PRC, is forwarded on both outgoing STM-N links on the ring. The ring nodes B through F take their reference from either the "east" or the "west" line input. The selection is based on the received SSM and, if those are equal, on a provisioned priority. In the normal situation, the ring is provisioned such that the active synchronization runs clockwise. This result is achieved by provisioning the input from the clockwise direction with a higher priority ("1") than the counter-clockwise input ("2"). The priorities are shown in the small circles. The outgoing STM-N signal gets the same SSM as the selected reference, i.e., "PRC," except on the output in the direction of the active reference, which is assigned "DUS" to avoid a direct timing loop between adjacent nodes. The active synchronization topology is denoted by the solid lines, and this network indeed forms a "tree."

When the link between nodes B and C fails, the active reference of node C becomes unavailable, and this node will attempt a reference switch. However, the only alternative reference carries DUS and is not usable. Therefore, node C switches to "SEC," representing the internal clock in holdover. Node D detects the change of the traceability from PRC to SEC and, for lack of an alternative reference, accepts the degradation of the
traceability, which is still not worse than its internal clock. Node D forwards the new SSM to E, which reacts exactly the same way.
Figure 6-16. SSMs applied in an SDH ring network: normal situation
Node F also sees the degradation of the traceability from PRC to SEC on its active reference, but it has a better alternative, i.e., the input from node A, which is still traceable to a PRC. Node F will make a reference switch to the other input and update the SSMs on its outputs accordingly: DUS back to node A and PRC to node E. Figure 6-17 shows exactly this situation. The restoration process of the ring synchronization completes when nodes E, D, and C, in that order, switch reference to the other side because that input carries a better traceability (PRC) than the currently active one (SEC). In the final situation, the synchronization reference chain is split into two parts (Figure 6-18): one clockwise from node A to node B and the other counter-clockwise from node A to node C. The process to reach this situation completes within a few seconds. This ensures that the phase transient caused by the short period in holdover experienced by nodes C, D, and E remains well within 1 µs. Once the failed link is repaired, the synchronization situation will return to that of Figure 6-16, thanks to the assigned priorities, which cause the synchronization switching to show revertive behavior.
Figure 6-17. SSMs applied in SDH ring network: restoration in progress
Figure 6-18. SSMs applied in SDH ring network: restoration completed
6.5.5 Satellite Timing
6.5.5.1 GPS
The Global Positioning System (GPS; a general introduction is given in [42]) can very suitably be used as an alternative to synchronization distribution via STM-N signals (see, e.g., [43] and [44]). The satellites in the GPS system continuously transmit signals for positioning purposes, from which a very accurate frequency can be derived. The satellites themselves contain a combination of duplicated Cesium and Rubidium clocks, which are kept synchronous by ground stations that operate exactly on UTC. A GPS receiver can construct an accurate reference from the signals of the satellites in view. Since the GPS signals have very good long-term stability but relatively poor short-term stability, the GPS receiver needs to be integrated in a system that contains an SSU-T or Stratum-2 level clock to improve the short-term stability. The combination of a GPS receiver and a high-quality slave clock can easily meet the G.811 requirements for PRC/PRS network master clocks (and the T1.101 requirements for the latter), avoiding the costly Cesium technology. This method of implementing the master clock function is quickly gaining popularity. GPS is the best-known positioning system, but alternative satellite systems can also be used for the purpose of deriving accurate timing.
Effect of Satellite Timing on Network Synchronization
The introduction of satellite signals as timing references for telecommunication networks can ease the distribution problem enormously. However, a number of technical issues need to be taken into account.
• A clock that uses GPS signals for reference purposes needs an internal oscillator with very good short-term stability, so that the reference signals can be averaged over a sufficiently long time before the average is applied as a reference. The averaging process diminishes the effect of the short-term instability (phase wander in the mHz range) of the GPS signal. Rubidium and high-quality quartz oscillators can provide adequate stability.
• The reception of GPS signals requires antennas and GPS receivers. This equipment needs to be duplicated for high availability. Long connections to a suitable roof location (with a sufficiently open view of the sky) are sometimes a problem.
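The averaging idea in the first bullet can be illustrated with a simple exponential moving average. A real GPS-disciplined oscillator uses a carefully designed control loop rather than this toy filter; the function below is our own sketch, using relative frequency offsets as inputs.

```python
def smooth_gps_frequency(samples, alpha=0.01):
    """Illustrative averaging of noisy GPS-derived frequency estimates
    (relative offsets). A small alpha averages over many samples, which
    suppresses the short-term wander of the GPS signal while preserving
    its excellent long-term accuracy; a stable local oscillator
    (Rubidium or good quartz) bridges the averaging interval."""
    estimate = samples[0]
    for s in samples[1:]:
        estimate += alpha * (s - estimate)   # exponential moving average
    return estimate
```

Fed alternating noise of ±1e-9 around a true offset of zero, the filter settles close to zero; fed a constant offset, it reproduces it.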
Network operators base their synchronization design on different levels of GPS-derived synchronization. A rather common approach is to use the GPS signals as a backup for stand-alone synchronization equipment (ST2 or SSU-T clocks). In this case, only one reference through the network is required, which eases the design and diminishes the risk of timing loops. Stand-alone synchronization equipment is present in many larger telecommunication offices, especially in North America. The presence of GPS allows an operator to create many small synchronization areas, since he or she may have many G.811-equivalent clocks. Traffic between such synchronization areas may be considered synchronous because all GPS-locked clocks use the same reference, i.e., the GPS signal, although clocks of different vendors may use slightly different algorithms to synthesize a reference signal out of the GPS information. The phase differences between GPS-locked clocks easily remain within one 64 kbit/s octet (125 µs), so from a G.822 perspective, no controlled octet slip occurs and the operation is synchronous.
6.5.6 Synchronization Network Engineering
6.5.6.1 Requirements for Synchronization Protection
In a Synchronization Area, the set of references actively used by all network elements in the area at a given point in time forms the current synchronization network. This network has a so-called spanning-tree topology, i.e., it forms an interconnected but loop-free graph [45]. When a link in this network fails, the network element at the receiving end selects an alternative path, which changes the graph. However, the selection rules for reference switching should be such that, in case a synchronization reference link fails or multiple references fail at the same time, the outcome is as follows:
1. Timing loops are not formed due to reference protection switching, i.e., the current synchronization network keeps a "tree" shape. Timing loops are cases in which a set of network elements uses active reference signals from one another in a circular fashion. Such loops are detrimental because the frequency of these elements may collectively drift away uncontrollably.
2. The clock hierarchy is not violated due to reference protection switching. The clock hierarchy is dictated by the quality of a clock and the traceability of its reference signals. The objective is that a clock of a certain quality should never select a reference that has a traceability that is worse than
its own quality. In such a case, it would be better to switch to holdover operation.
3. The Synchronization Area is not split into disjoint islands (as far as possible with multiple failures), i.e., the current synchronization network remains a "spanning" tree for as long as possible.

6.5.6.2 Synchronization Plan
The design of the synchronization network in a Synchronization Area is often captured in a synchronization plan, which provides maps of the network elements and transmission paths in the area and the provisioning data that pertain to the synchronization properties of all these network elements. This determines the topology of the synchronization network, both in the absence of failures and in the presence of one or more failures. Apart from the synchronization plan of the area itself, the synchronization relationship with neighboring Synchronization Areas also needs to be taken into account, as well as the synchronization plan within each telecom office in the area.
6.6. CONCLUSIONS AND CLOSING REMARKS
6.6.1 Conclusions
This chapter has provided an overview of timing, jitter, and synchronization concepts; timing and jitter requirements for SONET, SDH, and OTN; and network synchronization engineering. The chapter has emphasized the key differences between OTN and SDH/SONET timing, namely:
• The OTN is not required to be synchronized, and its physical layer is not required to transport synchronization. Instead, synchronization is transported via the SDH clients.
• There is a single set of OTN timing and jitter requirements that is applicable globally (unlike SDH, which has two Options).
• SDH networks need to be synchronized to be able to use the STM-N and OC-N carrier signals as carriers of synchronization information as well, and to minimize the wander accumulation on SDH payload signals.
• With the aid of high-quality clock equipment located at strategic places in the network, it is possible to reliably distribute synchronization through the OTN, SDH, and PDH networks to the PSTN equipment.
• The synchronization network engineering of the SDH network is for these reasons more complex than for OTN or PDH networks, but it allows the use of relatively inexpensive equipment clocks in SDH equipment.
These differences result in OTN timing being simpler from the standpoints of both hardware design and network management. For the former, the ODC requirements are less stringent than the SEC requirements in SDH, mainly in that none of the ODCs must meet the holdover and more stringent transient requirements of SDH. For the latter, there is no need to manage a synchronization network at any of the OTN layers.
6.6.2 Closing Remarks
The "building" of specifications listed in Table 6-1 of this chapter is, in essence, a long and detailed justification for the conclusions summarized in the bullet points above for the various transmission technologies: PDH, SDH/SONET, and OTN. Most of this work was carried out in the years 1988 through 2001, the years in which the SDH/SONET and OTN technologies were conceived and introduced in transport networks worldwide. The body of specifications is quite complete, and most standardization professionals consider the subject of network synchronization more or less finished, so participation in these particular standards activities is dropping. Two trends that can potentially revive this interest are discussed briefly.

Mobile networks have not been specifically addressed in this chapter, but in general the synchronization strategy of such networks is not very different from that of the "wireline" optical networks discussed extensively above. The mobile part of mobile networks is actually only the few kilometers between the base station and the handset, the so-called air interface. From a synchronization perspective, the main problem is to ensure that the carrier frequencies of the base stations are sufficiently stable. Bringing a reference of SDH/SONET quality to the base station is sufficient for that purpose. Since the base station interfaces are often PDH interfaces (E1, DS1) dropped from an SDH/SONET network, a technique called retiming is applied to force the frequency accuracy of the SONET/SDH network onto the outgoing E1 or DS1 drop signal. This retiming technique is widely applied, but not standardized in detail, so this is a relatively small potential area of standardization work.

A more challenging problem for network synchronization specialists is likely to be the rise of Constant Bit Rate (CBR) services over packet networks, i.e., ATM, IP, or MPLS.
Examples of such services are voice (VoIP) or video transport over IP networks and, lately, the work on the "Pseudo-Wire" specification in IETF. There is some experience with
modeling Telephony over ATM, an area in which the mathematical techniques explained in this chapter have been applied to the Cell Delay Variation (CDV) of ATM cells. This quantity is comparable to the phase noise found in TDM networks, although the numbers (and so the related problems) are one to several orders of magnitude larger. When shifting from ATM to IP or MPLS, the analysis problem becomes harder, since the packet length in IP and MPLS networks is not constant as it is in ATM networks, and the packet delay variation is larger still than in ATM. Moreover, the Quality of Service (QoS) aspects of the transport are much less regulated and standardized than in ATM. There is still much work to do to relate QoS parameters to the real-time requirements for the arrival times of packets and the responsiveness of the network as perceived by end-users. So far, some commercial implementations of VoIP exist, and success has been claimed. In practice, this success has always been achieved on relatively small networks where bandwidth and routing capacity were not limiting factors. In large-scale deployments, this may no longer be true, which calls for extensive modeling and testing, eventually leading to a next generation of standards documents.
6.7. NOTES
1. In practice, other considerations come into play. For example, electrical levels may be chosen to be +V and -V, so that the average level is DC-free for 50% ones density. Optical power levels may be chosen consistent with requirements on average power; in addition, extinction ratio (ratio of optical power level corresponding to 1 to optical power level corresponding to 0) is finite.
2. In general, the bit time may be associated with any convenient property of the pulse. The property chosen depends on the modulation scheme.
3. Since, in the notation here, this time corresponds to the pulse leading edge, an additional fixed offset time should be added so that the pulse is sampled near its center.
4. The slip will be controlled or uncontrolled, depending on the mechanism implemented. With a controlled slip, an entire frame (based on a defined frame structure) is repeated or dropped. With an uncontrolled slip, the buffer is reinitialized and data is deleted or repeated as necessary. With a controlled slip, reframing is not necessary because frame boundaries are not lost. Note that it is assumed here that the server layer is the physical layer, i.e., the client signal is not being mapped or multiplexed into a higher-rate server layer that is itself a digital layer. In this latter case, justifications and/or pointer adjustments may be used to prevent data loss when the transmit (server) and receive (client) clocks are different. These mechanisms are described in Sections 6.2.3 and 6.2.4.
5. The term 3R denotes reamplification, reshaping, and retiming. The term is used in OTN because there a link can contain optical amplifiers that, in general, only perform reamplification (an optical amplifier not used in a regime where it distorts the pulses is a 1R process). The term is not used in SDH/SONET because there all regenerators are 3R. Some PDH networks have used 2R (reamplification and reshaping) and 3R regenerators together; however, these were not standardized.
6. Using the terminology of eye diagrams, there is more eye closure; see references [3] and [5].
7. This definition of jitter tolerance applies only to optical receivers. For electrical receivers, the transition between an error-free signal and loss of frame is abrupt, and a stable error rate cannot be easily created by attenuating the signal.
8. In reality, the PLL will be at least second-order (with a first-order loop filter); however, the argument remains the same as long as the roll-off is still -20 dB/decade and the gain peaking is small (both are true in practice).
9. Actually, timing impairments can occur at frequencies up to the Nyquist rate, i.e., one-half the bit rate. These high-frequency impairments are not measured using typical SONET/SDH jitter test sets. Such impairments are due to specific effects, and are instead considered separately in the power budget in SONET/SDH and OTN. A different specification and measurement approach is taken in defining jitter for 10 Gbit/s Ethernet [41].
10. The discussion in this subsection is taken from the discussion in Sections 5.3.1 and 5.3.2 of [20], with some condensation. Section 5 of [20] contains a summary of concepts used in describing time, frequency, and network synchronization.
11. There is also an assumption that the mean of x(t) is zero; however, any nonzero mean can be included in the deterministic part of the phase error.
12. The term SONET is used in the North American (T1) standards; the ITU-T recommendations use the term Option 2 SDH.
13. The historical situation is actually somewhat more complicated. The specifications in ANSI, for SONET, were not originally fully consistent with G.825. However, the jitter tolerance and network limits for OC-3, -12, and -48 were made consistent with G.825 in the 2003 revision of T1.105.03 [11] (however, the Option 1 and Option 2 jitter generation and transfer specifications are not the same for these rates). In addition, both SONET and SDH have specified reduced jitter tolerance regenerators, referred to as Type B regenerators. In this chapter, only the normal jitter tolerance regenerators are discussed; these are referred to as Type A regenerators. In OTN, there is only a single set of specifications, and all these complications were avoided.
14. The studies were contained in contributions to the respective standards bodies.
15. For sufficiently close optical channel spacings, additional jitter may result from cross-phase modulation, a nonlinear effect. This effect is not considered in G.8251 [8]. Further work is needed to specify the jitter due to this effect, or any other nonlinear effects.
16. For Option 2 networks, the corresponding numbers are under study but, in practice, are typically smaller.
17. A rough justification for this conclusion may be seen from the fact that the zero-to-peak response of passing a unit step through a low-pass filter followed by a high-pass filter, where the low-pass filter bandwidth is much less than the high-pass filter bandwidth, is approximately the ratio of the low-pass bandwidth to the high-pass bandwidth. Therefore, if the high-pass measurement filter bandwidth is increased by a factor, zero-to-peak jitter is reduced by approximately the same factor.
18. For Option 2 networks, the corresponding numbers are under study but, in practice, are typically smaller [30].
19. Whether or not a network of clocks, locked to some master, has mutual phase differences that remain bounded over time is the subject of some controversy. Two mechanisms can be mentioned that cause temporary loss of lock, which can over very long times accumulate unbounded. First, the phase detector in a slave clock has essentially a periodic design, so the internal noise of an oscillator in the PLL circuit or the phase noise
on the reference signal may cause the clock to "skip" a cycle, which is never recovered. Second, temporary loss of reference is normally accommodated by a "fly-wheel" mechanism called "hold-over". If the loss of reference extends over a longer period, there is a probability that some cycles are lost after regaining lock. A more precise statement is that in a synchronous network the phase differences are bounded on the very short timescales associated with a single cycle, as long as the clocks maintain lock.
20. PRS clocks have to meet the short-term stability requirements of ANSI T1.101 [37], which are slightly more stringent than those in ITU-T Rec. G.811. In this sense, a PRC and a PRS are not exactly equal. However, this difference has no consequence for the discussion.
21. An exception must be made for SDH-type signals operating at PDH (E3, DS3, and E4) bit rates as defined in ITU-T Recommendation G.832. These signals contain a one-bit "Timing Marker" that can carry extremely basic traceability information. However, G.832 signals find little or no application in real networks.
22. The value of 1 μs has been the objective for Option 1 networks that has been used in constructing the rules for the SSM-controlled synchronization process for SECs, as defined in ITU-T Rec. G.781. This number has been taken from older G.812 clock specifications that allowed maximum output phase transients of this magnitude. To meet the 1 μs phase transient limit under worst-case conditions for a 20-node ring, the ring has to complete its restoration process within 15 s. A phase transient of 1 μs can cause a burst of 7 pointer adjustments at the AU-4 and STS-1 levels, a well-known test sequence for SONET equipment (ANSI T1.105.03). For Option 2 networks, the reconfiguration process can take longer due to the narrower bandwidth of the Option 2 SEC (0.1 Hz or less) compared to the Option 1 SEC (1-10 Hz). As a result, the phase transient magnitude will be larger in Option 2 networks.
6.8. REFERENCES
[1] John C. Bellamy, "Digital Network Synchronization," IEEE Communications Magazine, Vol. 33, No. 4, pp. 70-83, April 1995.
[2] Patrick R. Trischitta and Eve L. Varma, Jitter in Digital Transmission Systems, Artech House, Norwood, MA, 1989.
[3] Sergio Benedetto, Ezio Biglieri, and Valentino Castellani, Digital Transmission Theory, Prentice-Hall, Englewood Cliffs, NJ, 1987.
[4] Heinrich Meyr, Marc Moeneclaey, and Stefan A. Fechtel, Digital Communication Receivers: Synchronization, Channel Estimation, and Signal Processing, Wiley, New York, 1998.
[5] Govind P. Agrawal, Fiber-Optic Communication Systems, Wiley, New York, 1997.
[6] Heinrich Meyr and Gerd Ascheid, Synchronization in Digital Communications, Volume 1: Phase-, Frequency-Locked Loops, and Amplitude Control, Wiley, New York, 1990.
[7] Dan H. Wolaver, Phase-Locked Loop Circuit Design, Prentice-Hall, Englewood Cliffs, NJ, 1991.
[8] ITU-T Recommendation G.8251, The Control of Jitter and Wander Within the Optical Transport Network (OTN), ITU-T, Geneva, November 2001; Amendment 1, June 2002; Corrigendum 1, June 2002.
[9] ITU-T Recommendation G.825, The Control of Jitter and Wander within Digital Networks which are Based on the Synchronous Digital Hierarchy (SDH), ITU-T, Geneva, March 2000; Erratum 1, August 2001.
[10] ITU-T Recommendation G.783, Characteristics of Synchronous Digital Hierarchy (SDH) Equipment Functional Blocks, ITU-T, Geneva, February 2004; Corrigendum 1, June 2004.
[11] ANSI Standard T1.105.03-2003, Synchronous Optical Network (SONET) — Jitter at Network Interfaces, ANSI, 2003.
[12] C. J. Byrne, B. J. Karafin, and D. B. Robinson, Jr., "Systematic Jitter in a Chain of Digital Regenerators," Bell System Technical Journal, pp. 2679-2714, November 1963.
[13] E. L. Varma and J. Wu, "Analysis of Jitter Accumulation in a Chain of Digital Regenerators," Proceedings of IEEE Globecom, Vol. 2, pp. 653-657, 1982.
[14] ITU-T Recommendation G.709/Y.1331, Interfaces for the Optical Transport Network (OTN), ITU-T, Geneva, March 2003; Amendment 1, December 2003.
[15] ANSI American National Standard T1.105-2001, Synchronous Optical Network (SONET) — Basic Description Including Multiplex Structure, Rates, and Formats, 2001 (supplement T1.105a published in 2002).
[16] ITU-T Recommendation G.707/Y.1322, Network Node Interface for the Synchronous Digital Hierarchy (SDH), ITU-T, Geneva, December 2003; Amendment 1, August 2004; Corrigendum 1, June 2004.
[17] ITU-T Recommendation G.810, Definitions and Terminology for Synchronization Networks, ITU-T, Geneva, August 1996; Corrigendum 1, November 2001.
[18] J. A. Barnes, A. R. Chi, L. S. Cutler, D. J. Healey, D. B. Leeson, T. E. McGunigal, J. A. Mullen, Jr., W. L. Smith, R. L. Sydnor, R. F. C. Vessot, and G. M. R. Winkler, "Characterization of Frequency Stability," IEEE Transactions on Instrumentation and Measurement, Vol. IM-20, No. 2, pp. 105-120, May 1971 (reprinted in NIST Technical Note 1337, March 1990).
[19] D. B. Sullivan, D. W. Allan, D. A. Howe, and F. L. Walls, Characterization of Clocks and Oscillators, NIST Technical Note 1337, U.S. Dept. of Commerce, National Institute of Standards and Technology, March 1990 (see, for example, Section IX of paper B.1).
[20] Geoffrey Garner, Application of Time and Frequency Stability Parameters to the Characterization of ATM Cell Delay Variation and Traffic, Contribution to T1 Standards Project, T1X1.3/96-086, T1A1.3/96-068, October 1996 (available via http://www.atis.org).
[21] J. A. Barnes and D. W. Allan, "A Statistical Model of Flicker Noise," Proceedings of the IEEE, Vol. 54, No. 2, pp. 176-178, February 1966.
[22] Benoit B. Mandelbrot and John W. Van Ness, "Fractional Brownian Motions, Fractional Noises and Applications," SIAM Review, Vol. 10, No. 4, pp. 422-437, October 1968.
[23] Jens Feder, Fractals, Plenum Press, New York, 1988 (Chapter 9).
[24] Gennady Samorodnitsky and Murad S. Taqqu, Stable Non-Gaussian Random Processes, Chapman & Hall, New York, 1994.
[25] ITU-T Recommendation G.811, Timing Characteristics of Primary Reference Clocks, ITU-T, Geneva, September 1997.
[26] ITU-T Recommendation G.822, Controlled Slip Rate Objectives on an International Digital Connection, Extract from the Blue Book, ITU-T, Geneva, November 1988.
[27] Maarten Vissers, "Optical Transport Network & Optical Transport Module," Beyond SONET/SDH Conference, Paris, France, April 2001.
[28] Roland E. Best, Phase-Locked Loops: Theory, Design, and Applications, 2nd edition, McGraw-Hill, New York, 1993.
[29] D. B. Leeson, "A Simple Model of Feedback Oscillator Noise Spectrum," Proceedings of the IEEE, pp. 329-330, February 1966.
[30] ITU-T Recommendation G.803, Architecture of Transport Networks Based on the Synchronous Digital Hierarchy (SDH), ITU-T, Geneva, March 2000.
[31] ITU-T Recommendation G.824, The Control of Jitter and Wander within Digital Networks which are Based on the 1544 kbit/s Hierarchy, ITU-T, Geneva, March 2000.
[32] ITU-T Recommendation G.823, The Control of Jitter and Wander within Digital Networks which are Based on the 2048 kbit/s Hierarchy, ITU-T, Geneva, March 2000.
[33] ANSI Standard T1.102-1993 (R1999), Digital Hierarchy — Electrical Interfaces, ANSI, 1999.
[34] ANSI Standard T1.105.09-1996 (R2002), Synchronous Optical Network (SONET) — Network Element Timing and Synchronization, ANSI, 1999.
[35] ANSI Standard T1.403-1999, Network and Customer Installation Interfaces — DS1 Electrical Interface, ANSI, 1999 (Supplements T1.403a-2001, T1.403b-2002), ANSI, 1999, 2001, 2002.
[36] ANSI Standard T1.404-2002, Network and Customer Installation Interfaces — DS3 and Metallic Interface Specification (Revision and Consolidation of T1.404-1994 and T1.404a-1996), ANSI, 2002.
[37] ANSI Standard T1.101-1999, Synchronization Interface Standard, ANSI, 1999.
[38] ITU-T Recommendation G.813, Timing Characteristics of SDH Equipment Slave Clocks (SEC), ITU-T, Geneva, March 2003.
[39] ITU-T Recommendation G.798, Characteristics of Optical Transport Network Hierarchy Equipment Functional Blocks, ITU-T, Geneva, October 2001 (to be published).
[40] Telcordia Technologies Generic Requirements document GR-436-CORE, Digital Network Synchronization Plan — Issue 1 with Revision 1, Telcordia Technologies, June 1996.
[41] IEEE 802.3ae-2002, IEEE Standard for Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access Method and Physical Layer Specifications — Media Access Control (MAC) Parameters, Physical Layer and Management Parameters for 10 Gb/s Operation, 2002.
[42] Elliott D. Kaplan (Ed.), Understanding GPS — Principles and Applications, Artech House, 1996.
[43] Wayne Hanson, "Time and Frequency Transfers by Satellite," Chapter 12 of Proceedings of the NIST Time and Frequency Seminar, Boulder, CO, June 8-10, 1993.
[44] W. J. Klepczynski, "GPS and PTTI," Chapter 13 of Proceedings of the NIST Time and Frequency Seminar, Boulder, CO, June 8-10, 1993.
[45] Kenneth H. Rosen, Discrete Mathematics and Its Applications, 4th ed., McGraw-Hill, New York, 1999 (Chapter 8).
Chapter 7 SYNCHRONIZATION ARCHITECTURES FOR SONET/SDH SYSTEMS AND NETWORKS
P. Stephan Bedrosian Agere Systems
7.1. SYNCHRONIZATION CONCEPTS
This chapter provides an overview of issues that must be addressed when designing synchronization functionality for SONET/SDH systems and networks. In order to understand the role that timing plays in a SONET/SDH network or system, the following questions must be addressed first.
• Why is synchronization needed? Synchronization is a means of defining or decoding data. As such, timing imperfections can cause data errors. Therefore, an accurate clock is needed for accurate data recovery.
• How is synchronization administered? Synchronization flows through a hierarchical structure. In North America, this hierarchy is divided into discrete categories of accuracy or quality called stratum levels. Clocks of a particular stratum level may provide timing signals to other clocks of an equal or lesser stratum. In this way, the quality of the timing distribution is controlled.
• Why is source traceability important? The term traceability is commonly used in timing terminology to describe the synchronous nature of a timing signal. However, traceability alone is not sufficient to establish its suitability for a particular timing application. Rather, the origin or source traceability of the timing information is of great importance. Source traceability provides a means for determining the flow of a timing signal. Such knowledge
is important for preventing timing loops, which essentially destroy the integrity of timing signals and lead to the impairment of data transport.

Figure 7-1. Symbolic representation of timing recovery (an input bit stream feeds a high-bandwidth clock recovery and bit recovery stage that tracks both signal and noise; a FIFO buffer and a low-bandwidth filtered clock then perform bit retiming, with the filtered clock taken from an external BITS/SSU, loop, or internal timing source)
A high-level symbolic representation of synchronization and its role in accurate data recovery is presented in Figure 7-1. A bit stream containing data with an embedded clock (or service clock) is typically the source signal. This signal may contain noise added by artifacts of various transmission media. As a result, the first step is to accurately recover each bit of information; the second step is to perform a bit retiming. Bit recovery is done by first recovering the service clock along with the noise from the transmission medium. Since the majority of this noise is typically above 10 Hz (possibly in the MHz range), a high-bandwidth recovery function is used. There are numerous schemes for this recovery process; however, most involve detecting transitions on the incoming bit stream and processing the time between transitions to derive a series of alternating ones and zeros. The ability of any timing recovery algorithm to successfully generate a recovered clock will depend on the noise and data density (ones density) of the incoming bit stream. The end result of the timing recovery process is a clock that tracks the transitions on the incoming bit stream. Once a recovered clock is obtained, it is used to directly decode the data from the incoming bit stream. To provide the largest margin, the recovered clock should be shifted in time so that its edges occur in the middle of the bit or symbol interval of the bit stream. If the relationship between the recovered clock and the bit stream tends to move past a bit boundary, then it is possible to generate a bit error. It is for this reason that the clock recovery process tends to use a high-bandwidth tracking mechanism to follow any
phase/frequency modulation (sometimes referred to as jitter) that can occur on the incoming bit stream. In order to control the frequency or phase variations on the recovered data, the data can be reclocked into a new clock domain. This process typically involves writing the data into a buffer with the recovered clock. The data is then read out of the buffer with a clock from the new clock domain. The source of timing for this new clock domain can be obtained from a number of sources, including loop/line, through, external, or internal sources. Figure 7-2 illustrates that there are numerous sources for the clock reference of the filtered clock process. The figure lists the various methods used to derive these sources from the external network. It should be noted that not all of these methods are suitable for every application. That is not to say that some methods are better than others; rather, various transmission applications will favor some of these methods over others. For example, if a repeater application is being considered, then the through timing mode will be most appropriate. If a large public SONET/SDH network requires timing, then either the external or line mode, or a combination of the two, may be used depending on the network architecture.
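The write-with-recovered-clock, read-with-filtered-clock process described above can be sketched as a toy event-driven simulation. All names here are hypothetical illustrations (a real implementation is hardware, not software); the point is only that data written on one clock and read on another crosses the domain boundary intact while the buffer absorbs the timing difference:

```python
from collections import deque

def retime(bits, write_period, read_period):
    """Write bits into a FIFO on the recovered (service) clock and read
    them out on the filtered local clock. If the two clocks differ, the
    buffer fill drifts; overflow/underflow would appear as slips."""
    fifo = deque()
    out, peak = [], 0
    t_write = t_read = 0.0
    i = 0
    while i < len(bits) or fifo:
        # Next event: a write if the write clock is due (or there is
        # nothing to read), otherwise a read on the new clock domain.
        if i < len(bits) and (not fifo or t_write <= t_read):
            fifo.append(bits[i]); i += 1
            t_write += write_period
            peak = max(peak, len(fifo))
        else:
            out.append(fifo.popleft())
            t_read += read_period
    return out, peak

bits = [1, 0, 1, 1, 0, 0, 1, 0]
out, peak = retime(bits, write_period=1.0, read_period=1.0)
assert out == bits  # data crosses the clock-domain boundary intact
```

Running the same sketch with a faster write clock than read clock shows the buffer fill (`peak`) growing, which is exactly the condition that, in a bounded hardware buffer, would end in an overflow slip.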
Loop Timing — A timing reference is selected from a single SONET/SDH interface
Line Timing — A timing reference is selected from multiple SONET/SDH interfaces
Through Timing — Timing is independently recovered in the direction of transmission
External Timing — A timing reference is selected from a DS1/E1 reference
Line/External Timing — The NE provides a derived DS1 or E1 to a co-located BITS/SSU that uses this reference to distribute timing back to the NE and other co-located NEs

Figure 7-2. SONET/SDH Network Element (NE) timing modes
Loop/line timing mode: This mode derives a clock from the received facility. The derivation usually involves low-pass filtering of the recovered clock to remove unwanted jitter and short-term phase variations. Longer-term phase or frequency variations, however, will be tracked. Loop timing mode involves deriving timing from a single transmission facility, while line timing involves multiple transmission facilities. In order to prevent a timing loop, loop/line timing should not be used on both ends of a connection. Rather, an independent timing reference must be used to establish a clock domain. Therefore, an external or internal timing source must be the source of timing in the clock chain.

Through timing mode: This mode derives timing associated with each direction of the transmission facility. So, for each direction of transmission, a separate clock is recovered and used for the outgoing transmit clock. This timing mode is typically used in a 3R (reamplify, reshape, and retime) repeater function, where transmit timing must equal receive timing.

External timing mode: This timing mode relies on using a clock that is typically derived and distributed by a building integrated timing supply (BITS) or synchronization supply unit (SSU). The input to the BITS/SSU is typically a stratum 1 traceable input that can be sourced from a GPS receiver or from a SONET/SDH facility. By using such high-quality timing inputs, a stratum 3E or better BITS is able to filter out timing impairments at a stratum 3E level or better. This process provides a much cleaner clock than could be derived from a line-timed NE alone. When sourced from a GPS receiver, both ends of the transmission facility can use external timing inputs without limitations. It should be noted, however, that the phase difference between the service clock and external clock will be controlled but unbounded.
Since the BITS/SSU at each end is establishing a physical clock source based on a GPS input, the frequency and phase of two such clocks will not be identical. However, it is known that the frequency of any stratum 1 reference will be off by no more than 1×10⁻¹¹. Therefore, the maximum frequency difference between two GPS-timed BITS (or, for that matter, any two stratum 1 timed BITS) will be 2×10⁻¹¹.

Line/External timing mode: This is a timing mode that allows the BITS/SSU to use a timing reference carried by the SONET/SDH facility. Since BITS/SSUs do not have any interfaces that can directly receive timing from a SONET/SDH facility, a co-located SONET/SDH LTE (line terminating element) provides the handoff of this timing signal. The co-located LTE extracts timing from the SONET/SDH frame and then uses this timing to create a special output timing signal called a derived DS1 or E1. The derived DS1 has a nominal bit rate of 1.544 Mbit/s and is formatted as a framed DS1 with all-ones data content. The framing can be optioned as either a superframe (SF) or extended superframe (ESF) format. In addition to providing a derivation of the SONET/SDH frequency, the ESF-formatted derived DS1 is also able to provide some or all of the synchronization status messaging (SSM) contained in the S1 byte of the line overhead. Details on how the derived DS1 signals the BITS/SSU are presented in Section 7.7.5.3 of this chapter. The derived E1 format can be either a framed data signal at 2.048 Mbit/s or a frequency at the E1 data rate of 2.048 MHz [1]. If the 2.048 MHz format is used, only frequency information can be carried between the SDH LTE and SSU. If the framed 2.048 Mbit/s signal is used, it may contain a channel associated signaling (CAS) or common channel signaling (CCS) framing format with or without CRC-4. In these framed formats, the E1 can carry SSMs contained in the S1 byte of the line overhead [2].

Internal timing mode: This timing mode creates a timing source from an independent frequency source (oscillator) embedded in the actual receiving equipment or NE. The initial state of this timing source is important when determining the quality of its timing. If the internal timing source was "trained" to an external reference (e.g., a PRS timing source) within the last 24 hours, then it is said to be in holdover mode. If not, the internal timing source is said to be in free-run mode. It should be noted that the accuracy of this mode is also dependent on environmental conditions (e.g., aging, temperature, vibration, voltage variation, etc.). More information on internal timing mode is presented in Section 7.7.6.
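As an illustration of how SSMs steer reference selection, the sketch below ranks candidate inputs by the SDH S1-byte quality-level codes defined in ITU-T G.707/G.781. The selection function itself is a simplified sketch of ours, not the normative G.781 process (which also weighs configured priority, hold-off, and wait-to-restore):

```python
# SDH S1-byte SSM quality levels (ITU-T G.707 / G.781); lower rank = better.
QL_RANK = {
    0b0010: 1,  # QL-PRC  : primary reference clock traceable
    0b0100: 2,  # QL-SSU-A: transit-node SSU
    0b1000: 3,  # QL-SSU-B: local-node SSU
    0b1011: 4,  # QL-SEC  : SDH equipment clock
    0b1111: 5,  # QL-DNU  : "do not use" for synchronization
}

def select_reference(inputs):
    """inputs: {name: ssm_code}. Return the usable input with the best
    quality level, or None if every input signals DNU or is unknown."""
    best = None
    for name, code in inputs.items():
        rank = QL_RANK.get(code)
        if rank is None or code == 0b1111:
            continue  # unrecognized or DNU: never select as a reference
        if best is None or rank < best[1]:
            best = (name, rank)
    return best[0] if best else None

refs = {"west": 0b1011, "east": 0b0010, "external": 0b1111}
assert select_reference(refs) == "east"  # PRC-traceable beats SEC; DNU ignored
```

Note how QL-DNU implements the loop-prevention role described in Section 7.2: an NE marks the return direction "do not use" so the neighbor never times itself from its own timing.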
7.2. TIMING TRACEABILITY
The term traceability deals with the synchronous nature between timing signals. Unfortunately, the term by itself does not adequately address the origin or timing properties of these signals. Is the phase difference between two traceable clocks bounded? The answer depends on the architectural arrangement of the two clocks. Basically, this boils down to the concept of timing flow from one or more physical clock sources. Once that flow is understood, the phase characteristics between two such "traceable" clocks can be determined.
Therefore, two classes of traceability can be defined to address these concerns: source traceability and frequency traceability.
7.2.1 Source Traceability
Figure 7-3. Source-traceable frequency relationship (a single source clock feeding a chain of source-traceable clocks; MTIE generation is bounded)
Source traceability: As shown in Figure 7-3, source traceability is a relationship where the frequency of every clock in a system is referenced back to a single physical clock. Under normal operating conditions, all clocks in a source-traceable system will have the same average frequency. Thus, the phase error or Maximum Time Interval Error (MTIE) between all clocks in such a system is bounded. MTIE is a measurement metric that specifies the maximum excursion of a timing signal's phase (commonly called time delay) compared with a stable reference (see Chapter 6, Section 6.2.6.2, for a precise definition of MTIE). A series of nested measurements are taken over time intervals of increasing length. Each of these measurement intervals or "bins" is then analyzed for the maximum value of time delay. As the measurement interval (commonly called observation time) increases, the maximum value of time delay will either stay the same or increase. The MTIE sampling process is usually limited to variations in the wander region (10 Hz and lower). MTIE is typically presented in graphical form and plotted against observation time. Due to the statistical nature of the MTIE measurement, specialized measurement equipment is typically used for this analysis. Another measurement metric useful in analyzing time delay is Time Deviation (TDEV). TDEV is a statistical calculation of the rms energy of a timing signal's phase noise (see Chapter 6, Section 6.2.6.3, for a precise definition of TDEV). In short, TDEV provides a measure of the instability of a timing signal in the wander region. Measured values of TIE are taken through a bandpass filter that is adjusted for increasing values of integration
time. Due to the statistical nature of TDEV, a measurement period of at least 12 times the maximum integration time is needed. Therefore, a maximum integration interval of 1000 seconds requires at least 12,000 seconds of data. As with the MTIE measurement, specialized equipment is used to perform the TDEV analysis. Examples of source-traceable timing are typically seen at an equipment or NE level. In SONET/SDH LTEs, a single clock frequency is typically selected on a system basis and used throughout the system. A dedicated timing distribution system provides source-traceable clocks at the circuit pack, shelf, and bay level. If redundancy is needed, duplicated clock paths and reference-switching components are also used.
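The MTIE and TDEV estimators described above can be written down directly from their definitions. The fragment below is an illustrative sketch (function names are ours; real measurements use specialized test equipment, as noted) that assumes evenly spaced TIE samples x taken at interval τ0, following the standard estimators of ITU-T G.810:

```python
def mtie(x, n):
    """MTIE at observation interval n*tau0, from TIE samples x (seconds):
    the worst peak-to-peak phase excursion over any window of n+1 samples."""
    return max(max(x[j:j + n + 1]) - min(x[j:j + n + 1])
               for j in range(len(x) - n))

def tdev(x, n):
    """TDEV at tau = n*tau0 (standard estimator, cf. ITU-T G.810):
    rms of averaged second differences of the phase samples."""
    N = len(x)
    terms = []
    for j in range(N - 3 * n + 1):
        s = sum(x[i + 2 * n] - 2 * x[i + n] + x[i] for i in range(j, j + n))
        terms.append(s * s)
    return (sum(terms) / (6.0 * n * n * len(terms))) ** 0.5

# A constant frequency offset gives a linear phase ramp: MTIE grows with
# the observation window, while TDEV is zero (a ramp has no second
# difference), matching the behavior described in the text.
ramp = [1e-9 * k for k in range(100)]  # 1 ns of drift per sample
assert abs(mtie(ramp, 10) - 10e-9) < 1e-15
assert tdev(ramp, 5) < 1e-18
```

This also illustrates why MTIE and TDEV are complementary: MTIE exposes a steady frequency offset, whereas TDEV suppresses it and responds instead to noise-like wander.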
Figure 7-4. Frequency-traceable clock relationship (two independent source clocks, A and B; MTIE generation is unbounded but controlled)
Frequency traceability: As shown in Figure 7-4, frequency traceability is a relationship where the frequencies of all clocks in a system are referenced back to clocks of known frequency tolerance. Under normal operating conditions, all clocks will have frequencies bounded by a predetermined range. Thus, the phase error between all clocks in such a system is unbounded but controlled. The average rate of phase drift between frequency-traceable clocks, i.e., the frequency offset, can be computed by knowing the average frequency difference between any two clocks. For example, if a fixed frequency offset between two clocks is 1 part per million (PPM), they will drift at a linear rate of 1 μs per second. This relationship is shown in the following formula for time delay [3] related to frequency offset:

X PPM offset = X μs per second drift    (7.1)
Though Eq. (7.1) assumes a constant frequency offset, it is useful in computing time interval error (TIE) and MTIE. One of the important points this relationship shows is that the phase drift, or time delay, will increase linearly over time for a constant frequency offset. Therefore, for a known frequency offset, the MTIE generation is controlled but unbounded. Examples of frequency traceability are typically seen in GPS timing distribution networks. In this case, the source clocks are the GPS receivers and co-located BITS. Therefore, each BITS/SSU will be synchronized to a worst-case accuracy of 1×10⁻¹¹. Locally at each BITS/SSU, timing is distributed via source-traceable means to ensure that all externally timed equipment receives a timing source that is frequency and phase synchronous.
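The arithmetic of Eq. (7.1) and the resulting linear growth of time delay can be checked in a few lines (the helper names are ours, purely for illustration):

```python
def drift_us_per_s(offset_ppm):
    # Eq. (7.1): a constant offset of X ppm accumulates X microseconds of
    # time delay per second, since 1 ppm = 1e-6 seconds of error per second.
    return offset_ppm

def time_delay(fractional_offset, seconds):
    """Accumulated phase drift (in seconds) for a constant fractional
    frequency offset; grows linearly with elapsed time."""
    return fractional_offset * seconds

assert drift_us_per_s(1.0) == 1.0  # 1 ppm -> 1 us of drift per second

# Two stratum 1 clocks, each within 1e-11 of nominal, can differ by at
# most 2e-11: roughly 1.7 us of mutual phase drift per day.
day = 86400
assert abs(time_delay(2e-11, day) - 1.728e-6) < 1e-12
```

The last assertion makes the "controlled but unbounded" point concrete: the worst-case mutual drift between two stratum 1 timed BITS is tiny per day, but it never stops accumulating.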
Figure 7-5. Source/frequency-traceable relationship of a timing loop (the clock's reference input is fed from a source-traceable feedback path; MTIE generation is unbounded and uncontrolled)
If the traceability characteristics in a synchronization network are not managed properly, then it is possible to create timing loops. Timing loops are caused when the source-traceable nature of the individual clocks is such that their input source is referenced to their output. This feedback arrangement, shown in Figure 7-5, has a destabilizing effect on the output frequency and will cause the phase error to be both unbounded and uncontrolled. Systems experiencing timing loops will typically have output frequencies that oscillate between the extreme frequency limits of the internal oscillators. The conditions leading to excessive phase movement can compromise data recovery and ultimately lead to data loss. It is for this reason that the integrity of data transport requires that source traceability be managed. Source traceability can be managed in several ways depending on the transport facilities that carry timing. If GPS receivers and co-located BITS/SSU are used, then, by definition, the source of the BITS/SSU will not
be referenced to their outputs, and no timing loops will occur. If, however, any line-timing (or derivative) mechanism is used, then it is possible to create timing loops under some conditions. For example, if both NEs on either side of a transmission facility are using loop-timing mode, a timing loop will be created. Therefore, certain precautions must be taken when designing line-timed networks. In SONET/SDH networks, synchronization status messages can be used to manage the flow of timing in either a linear or ring network. Typically, this messaging is used to allow timing to travel in only one direction in the network. The return path is often blocked by messaging and can be effectively used to control timing flows and avoid creating timing loops. Typically, timing loops form when NEs undergo some sort of dynamic event that causes an NE's timing source to change. Such events could be the result of protection switching due to a network or equipment fault, manual switching caused by a maintenance activity, or a combination of the two. Unfortunately, in these cases, either multiple faults or the removal of various protection mechanisms leads to the creation of timing loops. Even worse, timing loops can occur without alarm or warning that would indicate that the timing flow is at fault. Timing loops can cause transmission impairments that result in bursty transmission errors. Bursty transmission errors are typically created as receive buffers overflow or underflow due to timing variations between the equipment clock and service clock. Thus, the transmission equipment is first suspected as the source of the fault. Sometimes, the BITS/SSU may indicate an end-of-range of its oscillator during a timing loop event. Thus, the timing card or oscillator of the BITS/SSU may be initially and incorrectly identified as the source of the timing problem. This will obviously cause delay and expense in rectifying the timing loop situation.
Timing loops are typically localized at a single NE but can affect several pieces of equipment located in multiple remote locations in a network. Therefore, good records are needed to verify timing flows and to safeguard against creating a timing loop under fault conditions.
Chapter 7

7.3. SYNCHRONIZATION DISTRIBUTION
Figure 7-6. Source-traceable clock distribution
The concept of synchronization distribution for communications systems has its origins in the Bell System. The Bell System used a frequency-traceable timing distribution system based on a single, physical stratum 1 clock located in Hillsboro, Missouri, USA. The frequency from this clock was distributed over point-to-point links to nodes in the network with stratum 2 clocks. These stratum 2 clocks, in turn, distributed timing to other stratum 2 or 3 nodes. Finally, the stratum 3 clocks distributed timing to other stratum 3 or 4 nodes. By distributing timing to nodes of an equal or lesser stratum level, the source-traceability relationship of the clocks between nodes was preserved. This source-traceable distribution scheme is shown in Figure 7-6.
Figure 7-7. SONET/SDH frequency-traceable clock distribution
In today's SONET/SDH networks, it is not always practical to distribute timing in a source-traceable manner. A source-traceable arrangement would require a complex network of point-to-point links to overlay the existing SONET network. Instead, a frequency-traceable means of timing distribution is employed, as illustrated in Figure 7-7. In this case, frequency-traceable timing is distributed by means of the Global Positioning System (GPS) [3] satellite network. Maintained by the U.S. Department of Defense (DOD), the GPS radionavigation system is capable of providing synchronization information to better than 1 part in 10^11 (i.e., an accuracy of 10^-11). The GPS satellites are placed in six circular orbits 20,200 km (10,900 nautical miles (NM)) above the earth at an inclination angle of 55 degrees with a 12-hour period. The satellites are spaced in orbit so that at any time a minimum of six satellites will be in view to users anywhere in the world. This constellation of GPS satellites transmits its atomic-clock-based timing to earth-based GPS receivers, which lock onto these signals and recover the timing. Due to a number of factors related to the recovery and distribution of these GPS clock signals, no two GPS receivers will have exactly the same frequency and phase. In fact, the only guaranteed timing relationship is that the frequency accuracy will be within 1 part in 10^11 of UTC (Coordinated Universal Time). Therefore, over time, it is expected that the phase error between any two GPS receivers will be unbounded. However, knowing that the frequency error between any two GPS receivers will be no worse than 2 parts in 10^11, the maximum drift rate is controlled. This timing relationship is referred to as plesiochronous [4] timing.
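The bound on mutual drift follows directly from the accuracy figures above. A small worked example, assuming each receiver is within 1 part in 10^11 of UTC so that the worst-case mutual offset is 2 parts in 10^11:

```python
# Worst-case phase drift between two plesiochronous (GPS-disciplined) clocks.
# Each clock is within 1e-11 of UTC, so their mutual offset is at most 2e-11.
ACCURACY = 1e-11
max_relative_offset = 2 * ACCURACY                     # 2 parts in 10^11
seconds_per_day = 86_400
drift_per_day = max_relative_offset * seconds_per_day  # seconds of phase/day
print(f"max drift: {drift_per_day * 1e6:.3f} microseconds/day")
```

That is, two plesiochronous clocks can accumulate at most about 1.7 microseconds of relative phase per day, which is why the phase error is unbounded over time even though the drift rate is tightly controlled.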
Once GPS timing is recovered, it is usually sent to a BITS or SSU, which then distributes this timing to other local equipment. Typically located in a central office, the BITS/SSU provides many levels of duplication and redundancy, including holdover, to preserve the continuity, accuracy, and availability of this timing signal. When timing is sent to a SONET/SDH LTE, it is used as a basis for timing the outgoing synchronous payload envelope (SPE). The timing from the SPE can be recovered by downstream SONET LTEs for use as a line-timing source. SONET/SDH networks can use linear or ring topologies, or a mixture of the two. In all cases, SONET/SDH line timing can be used while maintaining both source and frequency traceability. One mechanism used to enforce timing flow is synchronization status messaging, which allows timing flows to be directed within a SONET/SDH network.
7.4. NETWORK ELEMENT (NE) ARCHITECTURE
Figure 7-8. Example SONET/SDH Network Element (NE) architecture
A basic SONET/SDH NE may be illustrated as a series of functional blocks that recover, process, and distribute intrasystem timing signals (Figure 7-8). In the design of a timing architecture, there are a number of design factors that will influence the type and placement of timing
functionality. For the purposes of Figure 7-8, a redundant multicircuit card system has been chosen to illustrate timing flows. The two basic functional blocks of this system are timing engines (TEs) and timing distributors (TDs). The TEs perform the bulk of the timing processing on a system level, including clock recovery, filtering, and backup operating mode. The TDs provide a fan-out and synthesis of this timing to the end devices. The placement of these functional blocks is greatly dependent on the system architecture. They can be implemented as stand-alone circuit packs or modules located in a system or can be part of other common equipment (e.g., system controller card). Typically, the TEs are placed in the category of common equipment. In small to medium-sized equipment (single to multishelf systems), there is only a pair of TEs. The TEs are usually configured in a redundant relationship that provides source-traceable timing for the system. The TDs are typically placed on the cards or modules that require the distributed clock to function. In this case, the TDs can be used to distribute, fan-out, and synthesize the system clock to match the timing or functional needs of its circuit pack or module.
7.4.1 Timing Engine (TE) Functions
The timing engine (TE) operation can be broken down into four distinct functions: reference selection, clock recovery, synchronization retiming, and synchronization distribution. These functions are used to select a system timing reference and perform an intermediate distribution of this reference to the system line cards or modules. It should be noted that the actual operation of these functions depends on system architecture and implementation. For example, small or single-card systems will not have the redundancy and cross-coupling needed by larger systems, while the qualification of input references and the performance of the clock will be more sophisticated for higher stratum-level clocks. Therefore, these functions should be viewed as guidelines that need to be adapted for various architectures.

Reference Selection: When a timing signal is input to an NE, it can come from a line recovery process or from external means. In either case, the NE system timing must be extracted from this timing signal. In addition, fault detection is typically done at this point to verify the suitability of the input signal as a timing reference. It may also be desirable to support redundancy by providing multiple inputs to a single clock engine. Therefore, the TE must also perform a source selection among all available timing inputs.
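As a concrete illustration of the reference-selection step, the sketch below qualifies each provisioned input (signal present and near nominal frequency) and falls through to the next priority when one fails. All names are hypothetical, and the 4.6 ppm figure is merely an illustrative pull-in range, not a quoted requirement:

```python
# Hypothetical sketch of a TE reference-selection step: qualify each timing
# input, then pick the highest-priority input that passes qualification.
NOMINAL_HZ = 1_544_000          # e.g., a DS1-rate external reference
MAX_OFFSET = 4.6e-6             # illustrative pull-in range (ppm scale)

def qualify(ref):
    """An input is usable if a signal is present and near nominal frequency."""
    if not ref["signal_present"]:
        return False
    offset = abs(ref["measured_hz"] - NOMINAL_HZ) / NOMINAL_HZ
    return offset <= MAX_OFFSET

def select_reference(refs):
    """refs: list ordered by provisioned priority; first valid wins, else None."""
    for ref in refs:
        if qualify(ref):
            return ref["name"]
    return None  # no valid reference -> fall back to holdover/free-run

refs = [
    {"name": "external-BITS", "signal_present": False, "measured_hz": 0},
    {"name": "line-east", "signal_present": True, "measured_hz": 1_544_000.01},
]
```

Returning `None` when every input fails qualification corresponds to the holdover or free-run fallback discussed under synchronization retiming below.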
Clock Recovery: Next, a distinct clock signal must be produced or extracted from the input reference. At this point, it may be desirable to perform a frequency synthesis of this clock signal to create a common frequency for the synchronization retiming function. If a mate clock engine is being used, a cross-coupling of this clock signal can also be made at this point. Such an arrangement will ensure that both TEs are using exactly the same clock signal.

Synchronization Retiming: This process is often associated with a low-bandwidth phase-locked loop (PLL). In essence, the selected clock signal is frequency filtered, phase aligned, and possibly synthesized to create a more stable copy of the input clock. A synchronizing or framing pulse may also be generated here for system frame alignment. Clock mode selection is also done here: holdover or free-run mode may be enabled in the event of a clock source failure or the need to operate on internal timing. The choice of holdover stability depends on the intended service application and will determine the minimum frequency stability during external synchronization reference failures. Cross-coupling may also be done here to align the clock and possibly the sync pulse outputs.

Synchronization Distribution: This process provides the correct voltage level and format to drive the timing signals across backplanes to other elements in the system. The key here is to produce exact copies of the clock signal without adding detrimental jitter or skew. In addition, since a duplicated system may require that all output clocks have tight phase deviation characteristics (zero delay), a cross-coupling of these output signals will ensure that this requirement is met.
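The filtering idea behind synchronization retiming can be illustrated with a first-order low-pass stage, a discrete-time stand-in for a narrow PLL loop filter. This is a toy model, not a description of any particular TE implementation:

```python
# Toy model of low-bandwidth retiming: a first-order IIR low-pass smooths
# short-term jitter on the recovered clock's frequency offset.
# Values are in ppm; alpha sets the effective loop bandwidth.
def smooth(offsets_ppm, alpha=0.05):
    filtered, state = [], offsets_ppm[0]
    for x in offsets_ppm:
        state += alpha * (x - state)   # first-order low-pass update
        filtered.append(state)
    return filtered

# A jittery input alternating +/-1 ppm around zero...
raw = [1.0 if i % 2 else -1.0 for i in range(200)]
# ...comes out heavily attenuated: a "more stable copy of the input clock".
out = smooth(raw)
```

A narrower bandwidth (smaller `alpha`) rejects more input jitter but tracks genuine frequency changes more slowly, which is the same trade-off a TE designer faces when choosing PLL bandwidth.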
7.4.2 Timing Distributor (TD) Functions
The TD operation can be broken down into two distinct functions: (1) synchronization selection and (2) synthesis and fan-out. These functions are used to transform the system timing reference to meet the actual synchronization requirements of a line card or module. Again, it should be noted that the actual operation of these functions depends on system architecture and implementation. For example, if all line cards in a system need a common and synchronous frame synchronization signal, the TD needs to be able to generate it in a deterministic way. That is to say, the operation of all TDs in the system must be identical to produce frame-synchronizing signals that are aligned in frequency and phase. The timing needs of multiservice platforms may require that the TEs output common synchronizing signals to all line cards. Therefore, it is the
TDs in the line cards that will need to transform these system timing signals to the specific or multiple frequencies needed to support these services at the line-card level.

Synchronization Selection: The synchronization selector terminates and recovers the timing signal of each input fed from the TEs. Since redundancy is preferred in multicircuit-card systems, a selection between the clocks from each TE is performed. In the case where clocks and frame sync signals are sent as a pair from each TE, the sync selector must preserve the phase relationship between them, even under protection-switching cases.

Synthesis and Fan-out: The synthesis function, typically located on each line card, is essential for creating the various frequencies needed by that card. This also simplifies the timing distribution from the TE to the TD by requiring only a single set of timing signals to be input to the TD. Finally, these output clocks are fanned out to the specific devices that require input clock signals. Again, synthesized and fanned-out clocks may require tight phase variation characteristics (zero delay), so proper planning of the clock trees on individual modules or circuit packs will be needed.

Timing Distributor (TD) Application Example

A key element of any system timing application is the TD function. Typically found on most system circuit packs or modules, the TD supports the local timing needs of these subsystem components. With more functionality being placed in ASIC or ASSP devices, it really becomes an issue of supporting the timing needs of these devices. Multiservice platforms are currently the trend in communications equipment. This means that a single circuit pack or module should be able to support a variety of services or functions. For example, a SONET/SDH mapping module may also need to support Ethernet mapping via GFP [5] (generic framing procedure) structured mode.
Such a requirement necessitates not only SONET/SDH clock frequencies but also those of Ethernet. Figure 7-9 shows an example of how a multiservice timing distributor might be structured.
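One reason an 8 kHz reference works well as the common distributed signal is that the standard SONET/SDH and PDH service clocks are exact integer multiples of 8 kHz, so a TD can derive them all by integer multiplication. A quick check:

```python
# The 8 kHz system reference divides evenly into the SONET/SDH and PDH
# service clocks, which is what lets a TD synthesize all of them from one
# distributed signal.
BASE_HZ = 8_000
targets_hz = {
    "155.52 MHz (STS-3/STM-1)": 155_520_000,
    "77.76 MHz": 77_760_000,
    "44.736 MHz (DS3)": 44_736_000,
    "2.048 MHz (E1)": 2_048_000,
    "1.544 MHz (DS1)": 1_544_000,
}
assert all(hz % BASE_HZ == 0 for hz in targets_hz.values())
multipliers = {name: hz // BASE_HZ for name, hz in targets_hz.items()}
```

For example, a DS1 clock is exactly 193 times the 8 kHz reference (193 bits per 125 µs frame), and an STS-3/STM-1 clock is 19,440 times it.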
Figure 7-9. System block architecture of the TSWC01622
Figure 7-9 shows a high-level block diagram of the TSWC01622, a multiservice timing distributor currently available from Agere Systems. This timing distributor is an integrated device capable of operating in a redundant system architecture while providing a wide number and variety of timing signals for use in SONET/SDH, ATM, and PDH applications. The device contains all functions necessary for hitless reference switching, frequency synthesis, clock and sync signal generation and fan-out, and communications. The input reference switching function of the TSWC01622 is capable of supporting several input clock frequencies, including 51.84, 38.88, 19.44, 8.192, 6.48, 2.048, and 1.544 MHz, and 8 kHz. The choice of input frequency needs to be made on a system level. Typically, if the system timing requires that both a clock and a common frame-synchronizing signal be distributed, then an 8 kHz signal will be distributed by the TE to all the TDs. When the 8 kHz reference clock frequency is used, the TSWC01622 will generate its output 8 kHz sync signal aligned to this input. In this way, all the frame-synchronizing pulses generated by the TDs in a system will be synchronous. If, however, a frame-synchronizing pulse is not required to be synchronous throughout the system, then the choice of input frequency becomes the one that most effectively complements the service clocks. It should also be noted that, in order to meet network MTIE and TDEV requirements, the input clock reference to the TSWC01622 must be compliant with all requirements for a SONET Minimum Clock (SMC) as defined in ANSI T1.105.09 or for an ITU SDH Equipment Clock (SEC) as defined in ITU-T Recommendation G.813. Operating in a redundant system configuration, the TSWC01622 can accept two timing inputs of the same frequency. These clock inputs will typically come from the two system TEs. Integral to the input clock reference section
is the clock protection switch and control circuit. This section validates input references based on the presence of clock edge transitions. The TSWC01622 can also be optioned to switch away from a failed input autonomously and to switch back to a restored input via a revertive switching capability. Should both timing references fail, the TSWC01622 can use a third input reference to provide a free-run-type capability on a line-card or module basis in order to maintain a minimum level of service. Should input clock switches occur, the TSWC01622 can perform them hitlessly. A hitless switch is one where various timing relationships are preserved before, during, and after the switch, ensuring that the output clock and sync signals will not impair the service quality or functionality of their associated line card or module. One such relationship is the number of clock cycles between frame-synchronizing pulses. Another is the preservation of clock phase at the output ports. Since timing signals from the TEs can follow different backplane routes, they will commonly have slightly different phases. The TSWC01622 recognizes the clock skew between these input timing references and is able to perform a phase build-out whereby the skew is equalized. Therefore, the resulting output clocks are capable of meeting the MTIE and TDEV switching requirements previously mentioned. The TSWC01622 is able to output a wide number of standard output frequencies and types to support the needs of multiservice platforms, including SONET/SDH/ATM, Packet over SONET, and PDH services. These frequencies are 622.08, 155.52, 77.76, 51.84, 44.736, 38.88, 34.368, 32.768, 24.704, 19.44, 16.384, 8.192, 4.096, 2.43, 2.048, and 1.544 MHz. Since these output frequencies are programmable, the device configuration can be customized for a variety of applications.
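The phase build-out described above amounts to measuring the static skew between the working and protect copies of the reference and absorbing it at switch time, so the output phase does not jump. A minimal sketch with illustrative numbers (this is the concept, not the device's actual algorithm):

```python
# Conceptual phase build-out: compute the offset to apply when switching to
# the protect timing input so the output phase stays continuous.
def phase_buildout(working_phase_ns, protect_phase_ns):
    """Correction (ns) that makes the protect input line up with working."""
    return working_phase_ns - protect_phase_ns

# Protect copy arrives 12.5 ns later over a longer backplane route:
offset_ns = phase_buildout(100.0, 112.5)   # negative: absorb the extra delay
switched_output = 112.5 + offset_ns        # output phase after the switch
```

Because the correction exactly cancels the measured skew, the output phase after the switch equals the phase before it, which is what makes the switch "hitless" from the line card's perspective.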
The programmable outputs use fractional synthesis to provide almost any frequency within the output buffer's range (100 kHz to 66 MHz). The device also provides dedicated output ports that support CMOS, LVPECL, or LVDS output levels. An example of using the TSWC01622 in a multiservice platform can be seen in Figure 7-10. Here, the TSWC01622 is the timing distributor for a line card or module that also contains an Agere Systems Ultramapper™ (TMXF8422) device. The Ultramapper is a multiservice device that supports the following services and features:
• SONET/SDH line interface with integrated clock/data recovery
  — terminates 622 Mbit/s STS-12/STM-4 or 155 Mbit/s STS-3/STM-1
  — built-in CDR (clock and data recovery)
Figure 7-10. Timing connections between Agere Systems TSWC01622 and Ultramapper™
(* Full ITU G.783 combined jitter compliance can be achieved when the clock rates for the DS1XCLK and E1XCLK are set to 49.408 and 65.536 MHz, respectively.)
• Overhead processor (TMUX)
  — for all transport and path overhead bytes
  — insertion and extraction of overhead bytes
  — software-controlled linear 1+1 protection
• STS-12/STM-4 pointer processor and STS-1 XC
  — STS/AU pointer processing to align receive frame to system frame
  — full intermediate path monitoring for all nonterminated STS-1/AU3/TUG3
  — muxes STS-1 into STS-3 or STS-12 via pointer processing
• SPE/AU-3 mapper
  — maps up to six DS3 or E3 clear-channel (framed or unframed) signals into STS-1 or TUG3
  — supports all valid T1/E1/J1 multiplexing structures into STS-1 and STS-3/STM-1
• M13 multiplexer/demultiplexer
  — three sets of x28 DS1 or x21 E1 or x7 DS2 to/from DS3 or E3
• VT/VC mapper
  — monitors/terminates VT path overhead for 84 VT1.5/TU11 or 63 VT2/TU12
  — asynchronous and byte-synchronous VT mapping
• Digital jitter attenuator (DJA)
  — PLL-free operation using built-in DJA in VT/VC or M13 mode
• T1/E1, DS2, DS3 cross-connect (XC)
• x84/x63 T1/E1/J1 framer bank
  — supports T1, E1, and J1 framing modes
  — supports T1 and E1 unframed and transparent transmission formats
  — facility data link supports SLC-96, ESF, and CEPT
  — HDLC: 192 channels with a maximum rate of 64 kbit/s; 128-byte FIFO per channel
In order for the Ultramapper to support the tributary mapping of all these various rates into a SONET/SDH payload, numerous clocks at precise frequencies must be generated from a common system source. This is where the flexibility and versatility of a TD designed to support such a multiservice device prove advantageous. Using a single TSWC01622, the eight clock frequencies and one frame sync can be generated from a single redundant pair of timing inputs. Even more useful is the ability of the TSWC01622 to provide hitless protection switching using autonomous or manual switching modes.
7.4.3 Network Element System Architecture
Figure 7-11 shows how a typical SONET/SDH NE could be structured. Multicard equipment is typically composed of various classes of functionality, requiring specific clock types. Such functional classes include common equipment cards that can provide power, synchronization, and control functions for the system and modules or line cards that provide access to transport signals. The choice of what is placed on each card is determined by the amount of circuitry required by each function, how much power it requires, or how much heat it dissipates. Timing engine functions typically use physically large oscillators and will require some accommodation for this size. Modules or line cards typically comprise the largest functional type on a system level. The synchronization needs of each line card are dependent on the type of interface and functions that must be performed. It is for this reason that a system needs to be designed to support the end-service timing requirements. This not only includes providing the correct frequency but also includes availability and quality of the timing signal, even in the presence of switching, fault, or maintenance events.
Figure 7-11. Example SONET/SDH NE clock routing
Another important aspect of the NE synchronization system is that it needs to be scalable. Many platforms tend to evolve or be repackaged to address various markets. Scalability covers the need for many changes or derivatives of a system, including the need for higher operating capacity, higher I/O capacity, and simplex or duplex operation. It will be shown that by using a few basic synchronization building blocks, a scalable architecture can be easily realized. Such an architecture will not only scale to provide appropriate service performance but also address various cost targets that are equally important.
7.4.4 Small Network Element Architecture
Figure 7-12 is an example of a scalable small NE architecture. The physical requirements of this system comprise a single board or chassis. Here, equipment redundancy is not important, so the simplex architecture has no need for redundant timing circuitry. This contributes to the low cost and low capacity that make this architecture desirable. Essentially, the timing needs of this architecture rely on the support of just a single TE. The input source to the timing engine can come either from an external source or from the line. In the case of a clock source failure, either a holdover or a free-run mode would be sufficient for operation. If desired, a redundant timing input can be supported, giving the small NE a measure of fault tolerance without hardware redundancy.
However, due to the low-cost/low-performance nature of this design, fault tolerance is not expected.
Figure 7-12. Scalability example of a small NE
7.4.5 Medium Network Element Architecture
Figure 7-13 is an example of a scalable medium NE architecture. The physical requirements of this system comprise a single shelf with multiple circuit cards or modules. In this case, due to the multiple cards, redundancy and fault tolerance are part of the architectural requirements. Therefore, a duplex architecture comprising a pair of timing engines and a timing distributor on each line card would be provided. The cross-coupled TEs are capable of external or line-timing options. Line-timing inputs typically come from any of the active line cards. In this case, the SONET/SDH line-terminating processor provides the SSM associated with each line-timing source as part of the selection process. Every line card has a TD that supports the efficient transport of timing information from the TEs. The TD will typically accept a single clock or frame sync signal from each TE. The TD then performs hitless source selection between the TEs as well as controlling any phase shifts by using phase build-out where needed.
Figure 7-13. Scalability example of a medium NE
7.4.6 Large Network Element Architecture
Figure 7-14 is an example of a scalable large NE architecture. The physical requirements of this system comprise a multibay and/or multishelf system with multiple circuit cards. Since multiple timing faults could potentially impact a large number of users, redundancy and fault management are integrated into this architecture. The NE system timing is controlled by a pair of cross-coupled TEs located in the common equipment shelves. Here, the TEs will select between external, line-timed, or internal timing options. Cross-coupling will ensure that signals between the pair of TEs are coordinated. As before, the line-timed inputs and associated SSMs will come from active line cards. Due to the extensive nature of the architecture, it may be advisable to have an intermediate stage of line-source selection on a per-shelf or per-bay basis. This intermediate TD function will allow a limited number of line-timing sources to be switched to each TE. In this way, all possible line-timed sources can be made available to each TE with a minimum amount of cabling. Likewise, the timing signals (clock and frame sync signals) from the TEs are then sent to an intermediate pair of TDs located on each shelf. These
TDs provide a fan-out to all the line cards on the shelf — a process that reduces the number of output ports that a TE would need. Finally, the intermediate TDs output the set of clock and frame sync signals to the TDs located on each line card. As before, each TD will be able to perform hitless source selection between each set of TE inputs as well as controlling any phase shifts by using phase build-out where needed. When using external timing mode, there are a number of strategies that can be used to provide this clock to the SONET NE. Each of these methods provides a different set of advantages that favor various local synchronization distribution designs.
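The cabling economy of the intermediate TD stage is simple arithmetic. With made-up example counts (eight shelves of sixteen line cards each; these numbers are not from the text), each TE needs one output per shelf instead of one per line card:

```python
# Illustrative port-count arithmetic for the two-stage fan-out described
# above (shelf and card counts are hypothetical example numbers).
shelves = 8
cards_per_shelf = 16

# Direct distribution: each TE needs one output per line card.
direct_te_ports = shelves * cards_per_shelf   # 128 outputs per TE

# Two-stage: each TE drives one intermediate TD per shelf; the shelf TD
# then fans out locally to its own line cards.
staged_te_ports = shelves                     # 8 outputs per TE
shelf_td_ports = cards_per_shelf              # 16 outputs per shelf TD
```

The same reduction applies in the reverse direction for line-timing candidates: the intermediate selection stage lets each TE see all possible line-timed sources through a small, fixed number of inputs rather than a cable from every card.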
Figure 7-14. Scalability example of a large NE
7.5. EXTERNAL TIMING CONFIGURATIONS
External timing configurations involve providing SONET/SDH equipment with external timing references from a BITS or SSU. There are a number of techniques that can be used, each with a set of advantages or
applications that makes it suitable in some but not all circumstances. The suitability of any one method is dependent on the SONET/SDH equipment architecture, office cabling distribution, and placement of equipment in an office.
7.5.1 Direct-Source Timing Method
The direct-source timing method involves cabling a single BITS/SSU clock output directly to a single timing input port (Figure 7-15). This one-to-one mapping requires that, for every NE clock input, a unique BITS/SSU output be used. For NEs that support multiple BITS/SSU terminations, this method affords the highest redundancy, since a fault on any one output will not affect any of the other outputs. The direct-source arrangement of BITS/SSU clock taps not only contributes to a higher BITS/SSU output density but also requires the highest number of cable runs between the BITS and the NE. If DS1 signals are used, there is typically a 655-foot cabling limitation from the BITS to the NE. This limitation is based on the maximum amplitude of the DS1 at the DSX-1 and a nominal DS1 termination impedance (at the NE) of 100 ohms. If DS0 signals are used, they have a 1,500-foot [6] cabling limitation from the BITS to the NE. This distance limitation is based on the phase variation from NE to NE when supporting native DS0 services using composite clock (CC) external synchronization.
Figure 7-15. Direct-source timing method
7.5.2 Bridged-Source Timing Method
The bridged-source timing method allows a single BITS output to be shared between two input ports of an NE. Typically used to feed redundant NE timing inputs, bridging provides an exact frequency and phase copy of the BITS clock at an amplitude 20 dB [7] down from the DSX-1 level.
Figure 7-16. The bridged-source timing method
An example of bridging is illustrated in Figure 7-16. Here, a DS1 source is terminated at the NE with a 100 ohm resistive load. This characteristic impedance ensures that the DS1 waveshape will be compliant at the end of the cable run. From here, high-impedance TE inputs may be placed across this single load. In the case of a 1:2 split, resistive bridging can be used. Figure 7-16 shows how a bridging-resistor arrangement would be used to split the input DS1 timing signal between two external inputs of a single NE. A pair of 432 ohm resistors is placed in series with the tip and ring sides of the 100 ohm resistor and connected to each external input. The input impedance of each external input is assumed to be 100 ohms resistive. Approximately 20 dB of amplification is used to boost the input DS1 signal to the level of an unbridged and terminated DS1. A resistive bridging circuit consisting of a pair of 432 ohm resistors and the transformer termination (assumed to be 100 ohms resistive) presents a 964-ohm load to the parallel 100 ohm termination. This yields a total load of
91 ohms at the end of the cable from the BITS/SSU. Two such bridging inputs with a common parallel 100 ohm termination (shown in Figure 7-16) will yield a reduction in the termination impedance to 83 ohms. Though this dual-bridged termination will result in decreased amplitude at each NE input, the bridging amplifiers can be designed to account for this and boost the bridged DSl signal to the proper level. The advantage of using a resistive bridging circuit is that it is backward compatible with the direct-source timing mode inputs. In other words, a system designed for direct-source timing mode can be modified/upgraded to use bridged-source timing mode. This is done by adding external bridging resistors and modifying the TE circuit to include bridging amplifiers. Another advantage of using the bridging circuit on both DSl inputs is that it allows one of the timing packs to be removed without disturbing the input timing signal to the mate pack.
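The termination arithmetic above is easy to verify. The short sketch below reproduces the 964-ohm, 91-ohm, and 83-ohm figures from the text:

```python
# Reproducing the bridged-termination arithmetic: each bridging input
# (two 432-ohm series resistors plus a 100-ohm transformer termination)
# presents a 964-ohm load across the 100-ohm cable termination.
def parallel(*loads_ohms):
    """Equivalent resistance of loads connected in parallel."""
    return 1.0 / sum(1.0 / r for r in loads_ohms)

bridge_load = 432 + 432 + 100                          # 964 ohms per input
one_bridge = parallel(100, bridge_load)                # ~91 ohms at cable end
two_bridges = parallel(100, bridge_load, bridge_load)  # ~83 ohms
```

The modest drop from 100 ohms to 83 ohms is what the bridging amplifiers are designed to compensate for when boosting the bridged DS1 back to the proper level.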
7.5.3 Line/External Timing Method
Line/external timing essentially allows one or more NEs to provide synchronization information that is carried over a SONET/SDH facility to a co-located BITS. Since BITS/SSUs are unable to directly extract timing information from a SONET/SDH facility, a line-terminating SONET/SDH NE is needed [8]. In this way, timing may be recovered directly from the SONET/SDH frame and used to generate a derived DS1 signal that is used to externally time the BITS/SSU. The BITS/SSU then uses this timing reference to generate outputs that are fed back to the sourcing NE and other co-located NEs. Lastly, the SONET/SDH LTEs use this timing to process and transmit outgoing SONET/SDH frames. The derived DS1 signal is created by the SONET/SDH LTE expressly for the purpose of transporting synchronization information. This signal is formatted as all ones with either a superframe (SF) or extended superframe (ESF) format. DS1 framing is also used to allow signaling between the SONET/SDH NE and the BITS/SSU. In the case where SONET/SDH synchronization status messaging is used, it is necessary to provide this information to the BITS/SSU to influence its ability to either switch between external input references or enter a backup mode of operation (holdover). In essence, this allows the S1 byte message to be transported to the BITS/SSU without the need for a direct SONET/SDH connection. Figure 7-17 illustrates the configuration used for the line/external timing method.
Synchronization Architectures for SONET/SDH Systems and Networks
Figure 7-17. Line/external timing method
When the derived DS1 uses the SF format, the SONET/SDH NE communicates with the BITS/SSU by changing the format of the derived DS1 to a blue signal format (all ones without framing). When the BITS/SSU receives a blue signal on an external timing reference, it stops using that reference (if currently active) and switches to a valid reference. This method of signaling is commonly called threshold AIS generation [9]. A SONET/SDH NE will generate a blue signal on an SF-formatted derived DS1 under the following conditions:
• Input SSM "Do not use for synchronization (DUS)" — whenever an incoming SONET/SDH S1 byte is set to DUS, it is not used to synchronize the processing or transmission functions of the NE. Failure to comply will commonly result in a timing loop that will lead to the impairment of data transmission. Therefore, under no circumstances should a timing source bearing the DUS message be used for NE or network timing.
• The input SSM is equal to or worse than a preset threshold value as set on the SONET/SDH NE — In this case, the SONET/SDH NE may have the ability to compare the incoming SSM with a preprovisioned value or threshold. When the incoming SSM is equal to or better than the threshold value, the SONET/SDH NE will generate an SF-formatted derived DS1. When the incoming SSM is worse than the threshold value, the SONET/SDH NE will generate a blue signal. In this way, the SONET/SDH NE ensures that the co-located BITS only receives a timing source of equal or better stratum level.

Table 7-1. Translation between S1 and ESF SSMs

First-Generation SSMs
| Quality Level | Description | DS1 ESF Data Link Code Word | S1 Bits (b5-b8) |
| 1 | Stratum 1 Traceable | 00000100 11111111 | 0001 |
| 2 | Synchronized - Traceability Unknown | 00001000 11111111 | 0000 |
| 3 | Stratum 2 Traceable | 00001100 11111111 | 0111 |
| 4 | Stratum 3 Traceable | 00010000 11111111 | 1010 |
| 5 | SONET Minimum Clock Traceable | 00100010 11111111 | 1100 |
| 6 | Stratum 4 Traceable | 00101000 11111111 | 1110 |
| 7 | Don't Use for Synchronization | 00110000 11111111 | 1111 |
| User Assignable | Provisionable | 01000000 11111111 | N/A |

Second-Generation SSMs
| Quality Level | Description | DS1 ESF Data Link Code Word | S1 Bits (b5-b8) |
| 1 | Stratum 1 Traceable | 00000100 11111111 | 0001 |
| 2 | Synchronized - Traceability Unknown | 00001000 11111111 | 0000 |
| 3 | Stratum 2 Traceable | 00001100 11111111 | 0111 |
| 4 | TNC Traceable | 01111100 11111111 | 0100 |
| 5 | Stratum 3E Traceable | 01111000 11111111 | 1101 |
| 6 | Stratum 3 Traceable | 00010000 11111111 | 1010 |
| 7 | SONET Minimum Clock Traceable | 00100010 11111111 | 1100 |
| 8 | Stratum 4 Traceable | 00101000 11111111 | 1110 |
| 9 | Don't Use for Synchronization | 00110000 11111111 | 1111 |
| User Assignable | Provisionable | 01000000 11111111 | N/A |
When the derived DS1 uses an ESF format, the SONET/SDH NE communicates with the BITS/SSU by using a set of selected code words in the ESF data link. These code words carry the S1 byte information from the SONET/SDH line overhead, in essence providing a translation of the SSMs between the SONET/SDH and PDH formats. Table 7-1 shows the translation between the S1 and ESF SSMs for each generation.
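The SF-format signaling just described amounts to a simple rule: keep sending SF framing while the incoming SSM is at or better than the provisioned threshold, and squelch to AIS (a blue signal) otherwise, with DUS always squelched. Below is a minimal sketch of that decision using the second-generation quality levels, where a lower level means a better source; the table and function names are illustrative, not an API from GR-253.

```python
# Sketch of the threshold-AIS decision for an SF-formatted derived DS1.
# S1 bits (b5-b8) -> (quality level, description); lower level = better.
S1_QUALITY = {
    "0001": (1, "Stratum 1 Traceable"),
    "0000": (2, "Synchronized - Traceability Unknown"),
    "0111": (3, "Stratum 2 Traceable"),
    "0100": (4, "TNC Traceable"),
    "1101": (5, "Stratum 3E Traceable"),
    "1010": (6, "Stratum 3 Traceable"),
    "1100": (7, "SONET Minimum Clock Traceable"),
    "1111": (9, "Don't Use for Synchronization"),
}

def derived_ds1_signal(s1_bits: str, threshold_level: int) -> str:
    """Decide what the NE sends on the SF-formatted derived DS1.

    Returns "SF" (framed all-ones) when the incoming SSM is equal to or
    better than the provisioned threshold, else "AIS" (unframed all-ones,
    i.e. a blue signal).  DUS always forces AIS.
    """
    level, _ = S1_QUALITY[s1_bits]
    if s1_bits == "1111":          # DUS: never use for synchronization
        return "AIS"
    if level <= threshold_level:   # equal to or better than threshold
        return "SF"
    return "AIS"                   # worse than threshold
```

For example, with the threshold provisioned at Stratum 3 (level 6), an incoming Stratum 2 SSM yields SF framing, while SONET Minimum Clock yields AIS.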
7.5.4
Mult Timing Method
This method allows multiple NEs to be externally timed from a pair of BITS/SSU timing references. This inter-NE timing is performed by daisy-chaining the external BITS/SSU source between the NEs. As shown in Figure 7-18, each DS1/E1 timing signal is daisy-chained independently between the TEs. An advantage of the mult timing method is that it reduces the number of timing outputs needed at the BITS/SSU. It also greatly reduces the amount of cabling from the BITS, thereby contributing to an overall cost savings.
Figure 7-18. Mult timing method
As with any daisy-chained configuration, there are susceptibilities that must be addressed. If the daisy-chain is broken, it may affect other NEs in the same chain. It is for this reason that the maximum distance between daisy-chained equipment must be kept relatively short, typically spanning a single bay or adjacent bays. The other concern is the ability to maintain signal continuity through an intermediate NE that is undergoing maintenance or is being taken out of service. It is for this reason that a passive means of daisy-chaining is preferred over an active implementation. For example, will signal continuity be preserved if both TE packs in an intermediate NE are physically removed? If the answer is no, then all subsequent NEs in the chain will lose their external input and enter a backup mode of timing. There are several ways to maintain signal continuity in the event of a pack pull. Two common methods that provide continuity of the daisy-chain to downstream equipment use either backplane shorting connectors or backplane-mounted relays. In addition to providing signal continuity, there is also the transient effect of inserting or pulling a pack, so additional procedures may be needed. For example, if it is necessary to remove an active TE in a daisy-chained NE, all downstream NEs need to select timing from the standby TE. This selection may be done by forcing all NEs to use the same BITS/SSU reference as the standby TE. Since this represents a manual reference switch, the transient performance of each NE's system clock should be deterministic and controlled. Once all the standby TEs are stable, all NEs will be forced to use the standby reference (now active). Then the intended TE may be removed with no impact on the NE or other NEs in the daisy-chain.
7.6.
CLOCK BACKUP MODES AND IMPLICATIONS
Clock backup modes of operation are typically used to maintain a system timing function in the presence of an external timing distribution fault. Such faults can cause source traceability to an external reference to be interrupted for a few seconds, several hours, or days. Since timing distribution is all about maintaining source traceability to an external reference, its loss can compromise the integrity of the underlying service. Clock backup modes of operation fall into two basic categories: holdover and free-run.

Holdover Mode: This mode is typically used to maintain accurate timing for external timing distribution faults of relatively short duration, up to 24 hours. The limitation of 24 hours is not based on the functional aspects of the holdover circuit; rather, the technical standards that define holdover only specify its performance for the first 24 hours after a fault. Depending on implementation and environmental conditions, it is possible for a holdover circuit to maintain proper holdover performance for much longer than 24 hours.
In holdover mode, the TE essentially stores the frequency and phase value of the last good external clock reference prior to the external timing fault. These values are then used as a basis to steer the timing reference located in the TE for the duration of the holdover event. In theory this sounds easy, but in practice there are a number of rather significant issues associated with correctly computing and storing the frequency and phase values. The significance of this will be evident when we look at the four phases of holdover mode:

1. Holdover mode initiation: Initiation involves a system command to the TE that holdover mode is required. The initiation command can be as simple as a manual command from a management port, or part of a manual timing reference switch. More severe causes of holdover initiation may be detected system faults. Some of these faults can be as simple as a loss of signal (LOS) or loss of frame (LOF) of the incoming timing reference. If SONET/SDH line timing is used, a change in the SSM can trigger a holdover event if the threshold is crossed or a DUS message is received. Fault events can also be caused by an upstream reference walking off frequency. An off-frequency input reference condition is more complicated to detect due to the need for a precision frequency standard. Yet there are implementation-specific methods that can be used to accomplish this comparison without the need for such a standard. If cross-coupling of input references is used, then multiple standby references (or an ensemble of references) can be monitored and compared with the active reference. A majority voting scheme can be used to determine if a significant frequency difference exists between the active and the ensemble references, which can then trigger the appropriate fault detection and corrective action.

2. Entry into holdover mode: Once a fault condition of the timing reference has been detected, it must be categorized and the appropriate action must be taken. Though holdover entry is used in some systems to guard against a multifault condition, some systems employ short-term holdover as part of the regular switching scheme. The reasoning behind a short-term holdover event is to allow sufficient time to qualify the new input reference while not relying on the previous external reference. In either case, entry into holdover requires the successful execution of a number of simultaneous processes.
a. Calculation of holdover value: The calculation of the holdover value is based on an average of the previous input frequency/phase information relative to the external reference. The averaging time can range from a few seconds to several hours. The averaging time and its benefit to overall system performance are left as an implementation detail.

b. Phase offset computation: The phase offset value is the difference between the actual TE clock and the holdover value. Since the holdover value represents an average, it will most likely differ from the actual frequency and phase of the TE clock. If an instantaneous switch between the external reference and the TE clock occurs, a phase discontinuity at the TE clock may result. Such discontinuities may violate the system timing specifications (MTIE, TDEV, frequency stability). Therefore, the computation of a static phase offset may be necessary to allow a stepless phase change during the actual holdover switch. The alternative is to perform a gradual phase change from the current TE clock phase to that of the computed holdover value.

c. Holdover mode switch: This switch involves transferring frequency and phase control from an external frequency process to internal means. Again, due to the need to avoid phase discontinuities per industry standards, an instantaneous switch cannot be performed. Rather, the switch must allow for either a static phase build-out or a gradual change in frequency and phase to the desired holdover values. Should a phase transient occur during entry into a stratum 3/3E holdover, it must occur within the first 64 seconds of holdover [10]. The magnitude of this transient at a DS1 interface depends on observation time, as shown in Table 7-2. For stratum 2 holdover entry, such a transient must occur within the first 5000 seconds of holdover [10]. The magnitude of this transient at a DS1 interface depends on observation time, as shown in Table 7-3. Again, the performance of this switch is implementation specific.
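Two of the steps just described lend themselves to a short illustration: the majority-vote off-frequency check from phase 1, and the holdover-value averaging and static phase-offset computation from steps 2a and 2b. The sketch below is a simplified illustration under assumed units (ppm for frequency, nanoseconds for phase); the function names and the disagreement limit are invented for the example, not taken from any standard.

```python
def off_frequency_fault(active_ppm, ensemble_ppm, limit_ppm=4.6):
    """Phase 1 sketch: flag the active reference as off-frequency when a
    majority of the cross-coupled ensemble disagrees with it by more than
    limit_ppm (the limit value here is illustrative)."""
    disagree = sum(1 for f in ensemble_ppm if abs(f - active_ppm) > limit_ppm)
    return disagree > len(ensemble_ppm) / 2

def holdover_value_ppm(freq_samples_ppm):
    """Step 2a sketch: holdover frequency as an average of the input
    frequency measured over the recent averaging window."""
    return sum(freq_samples_ppm) / len(freq_samples_ppm)

def static_phase_offset_ns(te_phase_ns, holdover_phase_ns):
    """Step 2b sketch: offset between the actual TE clock phase and the
    phase implied by the averaged holdover value; applied (or slewed out
    gradually) so the holdover switch is stepless."""
    return te_phase_ns - holdover_phase_ns
```

Note how the vote blames the active reference only when most ensemble members disagree with it; a single wandering standby reference does not trigger a fault.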
Synchronization Architectures for SONET/SDH Systems and Networks
289
Table 7-2. Stratum 3/3E holdover transient vs. S at DS1 interfaces

| Observation Time, S (seconds) | MTIE (nanoseconds) |
| S < 0.001326 | No specification |
| 0.001326 < S ≤ 0.0164 | 61000 × S |
| 0.0164 < S ≤ 64 | 1000 + 50 × S |
| 64 < S | No specification |

Table 7-3. Stratum 2 holdover transient vs. S at DS1 interfaces

| Observation Time, S (seconds) | MTIE (nanoseconds) |
| S < 0.014 | No specification |
| 0.014 < S ≤ 0.16 | 7.6 + 885 × S |
| 0.16 < S ≤ 5000 | 150 + 0.1 × S |
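The two MTIE masks above are simple piecewise-linear limits, and the segments meet almost exactly at the breakpoints (61000 × 0.0164 ≈ 1000 ≈ 1000 + 50 × 0.0164, and 7.6 + 885 × 0.16 ≈ 150 ≈ 150 + 0.1 × 0.16). A small evaluator, written as an illustration (the function name and string arguments are not from the standard):

```python
def mtie_limit_ns(S, stratum):
    """MTIE limit (ns) for a holdover-entry transient at a DS1 interface,
    per Tables 7-2 and 7-3; returns None where no limit is specified."""
    if stratum in ("3", "3E"):
        if 0.001326 < S <= 0.0164:
            return 61000 * S
        if 0.0164 < S <= 64:
            return 1000 + 50 * S
        return None                 # S < 0.001326 or S > 64: unspecified
    if stratum == "2":
        if 0.014 < S <= 0.16:
            return 7.6 + 885 * S
        if 0.16 < S <= 5000:
            return 150 + 0.1 * S
        return None                 # outside the specified range
    raise ValueError("unknown stratum")
```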
3. Holdover mode maintenance: Once in holdover mode, all frequency and phase correction must be done by the TE. Depending on the stability of the holdover clocks (e.g., stratum level requirement per ANSI or ITU-T), various means may be used to regulate frequency drift and MTIE generation. Short-term frequency stability of oscillators is generally influenced by a combination of factors, including voltage regulation, oscillator temperature stability, shock, and vibration. Table 7-4 [10] summarizes the performance of stratum 3/3E holdover, including the initial frequency offset, frequency drift rate, and fractional frequency offset due to temperature variations.

Table 7-4. Summary of stratum 3/3E holdover performance

| | Stratum 3 | Stratum 3E |
| Initial Frequency Offset | 0.05 ppm | 0.001 ppm |
| Frequency Drift Rate | 4.63 × 10^-6 ppm/sec | 1.16 × 10^-7 ppm/sec |
| Fractional Frequency Offset Due to Temp. Variations | 0.3 ppm | 0.01 ppm |
As can be seen in Table 7-4, oscillator temperature variation is typically the most important parameter to control. High-stability oscillators are able to compensate for the effects of ambient temperature variation by various methods.

a. Ovenization: The basic premise here is to put the temperature-sensitive components in a miniature, insulated chamber that is heated to a constant temperature. In this way, ambient temperature influence on the oscillator is greatly minimized. Oscillators using ovens typically have an "O" as a prefix designation (OCXO — oven-controlled crystal oscillator; OCVCXO — oven-controlled voltage-controlled crystal oscillator). There are various implementations using either single or dual oven chambers. Ovenized oscillators are typically able to achieve stratum 3E or better performance. However, it should be noted that the power requirements and physical size of an ovenized oscillator will be greater than those of a nonovenized version.

b. Temperature Compensation: The basic operation of temperature compensation schemes is to characterize the operation of the oscillator over a temperature range, and then employ a scheme that compensates the oscillator's performance when subjected to that range. Oscillators using this scheme typically have a "T" as a prefix designation (e.g., TCXO — temperature-compensated crystal oscillator; TCVCXO — temperature-compensated voltage-controlled crystal oscillator). This method typically involves characterizing the temperature/frequency curve of each oscillator and storing this data as a lookup table in each oscillator. The actual temperature of the oscillator is then measured and the output frequency of the oscillator adjusted in real time to nullify the effects of temperature or temperature change. The performance of this scheme depends on the accuracy of every intermediate function: lookup table, temperature measurement, and frequency compensation. As such, temperature compensation is typically found on stratum 3 and lower stability oscillators.
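Table 7-4 can be turned into a rough worst-case budget: summing the initial offset, a full day of accumulated drift, and the temperature contribution gives the frequency error after 24 hours of holdover. This linear sum is an illustration of how the three parameters combine, not a formula from the standard.

```python
def holdover_offset_ppm(initial_ppm, drift_ppm_per_s, temp_ppm, seconds):
    """Rough worst-case fractional frequency offset: initial offset plus
    accumulated drift plus the temperature-induced offset."""
    return initial_ppm + drift_ppm_per_s * seconds + temp_ppm

DAY = 86400  # seconds

# Using the Table 7-4 values over 24 hours of holdover:
st3 = holdover_offset_ppm(0.05, 4.63e-6, 0.3, DAY)    # roughly 0.75 ppm
st3e = holdover_offset_ppm(0.001, 1.16e-7, 0.01, DAY)  # roughly 0.021 ppm
```

The drift term alone contributes about 0.4 ppm per day for stratum 3 but only about 0.01 ppm per day for stratum 3E, which is why the more stable (typically ovenized) oscillator is needed for 3E holdover.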
4. Exit from holdover mode: Exit involves a system command to the TE to end holdover mode and use an external timing reference. The initiation of this command can be as simple as a manual command from a management port, or part of a manual timing reference switch. Typically, exiting holdover mode should only be done if it is certain that the cause of holdover has been resolved. Since holdover mode requires a known good reference to establish frequency stability, it is imperative that the external timing source be good before switching from this mode. There are several steps that are typically taken before leaving holdover mode. These steps are implementation specific and are presented as a guide to what might logically be needed.

a. Qualification of input signal: As previously stated, the state and stability of the external reference must be established before exiting from holdover mode. Requalification of a failed reference can be done for two cases: loss of timing input, and off-frequency events. For the case of a lost timing input, requalification times of the input are on the order of 10 to 30 seconds [11]. However, for off-frequency events, significantly more time may be required. Qualification times on the order of 100 seconds for stratum 3E to 4, and up to 600 seconds for stratum 2, are allowed. The reason for the extended qualification times is to allow sufficient duration for frequency measurement techniques to verify the stability and/or accuracy of the external reference.

b. Calculation of phase offset: Due to the strict adherence to MTIE requirements, all switching events need to be kept to less than 150 ns for stratum 2 and less than 1000 ns for stratum 3E to 4. To accomplish this, phase build-out capability may be needed. More than likely, the newly qualified input reference will not be in phase with the TE holdover clock. Therefore, the phase difference needs to be measured and the offset calculated. Generally, stratum 3 to 4 clocks are not required to perform a phase build-out during reference switching [12].
c.
Input reference switch: The switching involves transferring the frequency and phase control from the internal control of holdover to an external process. As before, due to the need to avoid phase discontinuities per industry standards, an instantaneous switch cannot be performed. Rather, the switch must either allow for a static phase build-out (using the previously calculated value) or rely on a gradual change in frequency and phase to reflect the external timing reference.
d.
Lock-on time stabilization: This is the time it takes the TE synchronization retiming process to fully reflect the external reference. If a PLL is used by the TE as a retiming mechanism, a transient overshoot of the output frequency may occur as the loop acquires lock. The duration and amplitude of this overshoot are influenced by a number of system control parameters. However, a fixed set of control-loop parameters may not provide optimal performance both for quickly acquiring lock and for providing a sufficient level of filtering in locked mode. Therefore, a possible implementation strategy is to have one set of control-loop parameters to optimize lock-on time and another to optimize the normal "locked" mode. Clock fast mode [13] is the term used to describe the condition where the lock-on time parameters are in effect.
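The two-parameter strategy behind clock fast mode can be illustrated with a toy first-order tracking loop: a wide-band gain while acquiring, then a narrow-band gain once locked. The gains and step counts below are invented for the illustration; a real TE retiming PLL would be a hardware or mixed-signal loop with standard-driven bandwidths.

```python
def track(reference, initial, fast_gain=0.5, locked_gain=0.02,
          fast_steps=20, total_steps=200):
    """Toy first-order frequency-tracking loop with two gain settings:
    the "clock fast" gain pulls the output in quickly; the locked gain
    then gives heavier filtering of reference noise."""
    freq = initial
    history = []
    for step in range(total_steps):
        gain = fast_gain if step < fast_steps else locked_gain
        freq += gain * (reference - freq)   # move a fraction of the error
        history.append(freq)
    return history
```

With only the locked gain, pull-in would take roughly 25 times longer; with only the fast gain, reference noise would pass through largely unfiltered. Switching gain sets after acquisition captures both benefits.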
7.7.
SYNCHRONIZATION GUIDELINES
Many of the concepts presented in this chapter are intended to provide guidance when using synchronization in SONET/SDH equipment or managing synchronization in networks. The point to stress is that there is no one best way to accomplish this; rather, there are a series of choices that both the designer and end-user must make in order to achieve a certain level of performance or a minimum level of availability versus the cost of the implementation or management. Thus it becomes a matter of balancing these variables to arrive at a suitable solution.

Source/frequency traceability: This is fundamentally the most important synchronization concept. At a minimum, the origin of synchronization signals needs to be either known or managed such that timing loops cannot occur. Typical strategies involve locating a GPS/BITS or SSU with the network equipment and then using external timing to synchronize the local equipment. The use of synchronization status messaging can provide limited network management of a preferred direction of timing. In this way, each SONET/SDH LTE knows the quality level of the timing source being used by its neighbor. This knowledge is sufficient to allow a quick determination of the suitability of an incoming line to be used as a timing reference. It should also be noted that the majority of issues involving source traceability happen as a result of either a faulted condition or the actions taken to correct faulted conditions. Therefore, the administration and planning of synchronization networks must include all possible failure scenarios. In some cases, if the network topology is not always known, it may be better to enter a backup mode of operation than to try to switch to another available line reference.

Synchronization redundancy and availability: Decisions here are usually made on the basis of service type and implementation cost, and the functionality that supports these concepts is usually tied to architecture. Typically, redundancy and availability are of great importance for large systems or systems that have a large traffic throughput. In these cases, the failure or interruption of network timing will have a significant network impact and ultimately lead to service disruption. Therefore, the use of redundant elements in network equipment will lessen the probability that a single point of failure will cause a traffic-impacting event. In addition, the use of a cross-coupled architecture will enhance the ability to seamlessly switch between redundant synchronization functions in a network element.
Such seamless switching may be useful not only during an equipment or network fault but also in verifying that the standby or redundant equipment sides are functional and will be available when needed.
7.8.
NOTES
1. In addition, two clocks that have exactly the same frequency can have unbounded phase difference if a random walk phase noise (White Frequency Modulation (WFM)) process is added to the phase of one of the clocks.
2. Note that a stratum 1 clock need not be implemented using GPS; e.g., an implementation can be based on a Cesium clock without using GPS. However, GPS-based solutions are practical and cost effective.
7.9.
REFERENCES
[1] ITU-T Recommendation G.812, Timing requirements of slave clocks suitable for use as node clocks in synchronization networks, Section 12, June 1998.
[2] ITU-T Recommendation G.704, Synchronous frame structures used at 1544, 6312, 2048, 8448 and 44 736 kbit/s hierarchical levels, Section 2.3.4, October 1998.
[3] ANSI T1.101-1999, Synchronization Interface Standard, Section 4 — Definitions, 1999.
[4] M. A. Lombardi, L. Nelson, A. N. Novick, V. Zhang, "Time and Frequency Measurements Using the Global Positioning System (GPS)," Measurement Science Conference, Anaheim, CA, January 2001.
[5] ITU-T Recommendation G.7041/Y.1303, Generic framing procedure (GFP), December 2001.
[6] Telcordia GR-378-CORE, Generic Requirements for Timing Signal Generators, Section 2.1 — Synchronization, Issue 2, February 1999.
[7] Telcordia GR-378-CORE, Generic Requirements for Timing Signal Generators, Section 4 — Input Criteria, Issue 2, February 1999.
[8] Telcordia GR-253-CORE, Synchronous Optical Network (SONET) Transport Systems: Common Generic Criteria, Section 5.4.5.1 — Timing Distribution on Derived DS1 Signals, Issue 3, September 2000.
[9] Telcordia GR-253-CORE, Synchronous Optical Network (SONET) Transport Systems: Common Generic Criteria, Section 5.4.5.2.2 — Message Translation, Issue 3, September 2000.
[10] ANSI T1.101-1999, Synchronization Interface Standard, Section 8.2.2 — Stratum 3/3E Clock, 1999.
[11] Telcordia GR-1244-CORE, Clocks for the Synchronized Network: Common Generic Criteria, Section 3.7 — Reference Validation Times, Issue 2, December 2000.
[12] Telcordia GR-1244-CORE, Clocks for the Synchronized Network: Common Generic Criteria, Section 5.7 — Phase Build-out, Issue 2, December 2000.
[13] ANSI T1.101-1999, Synchronization Interface Standard, Section 4 — Definitions, 1999.
Chapter 8 NETWORK SURVIVABILITY
To the memory of my wife, Lynda

Ghani Abbas
Marconi Communications Ltd.
8.1.
INTRODUCTION
Network Survivability is a term that refers to the ability of the network to maintain an acceptable level of service during a network or equipment failure or traffic signal degradation. There are several survivability mechanisms covering a wide range of network architectures, technologies, and allocation of network resources. In the last ten years, the ITU-T has developed a number of recommendations for generic and technology-specific protection schemes. This chapter will review these recommendations and their applications in the network using the ITU-T terminology.
8.2.
NETWORK SURVIVABILITY TECHNIQUES
Generally, there are two types of network survivability techniques:

a. Network Protection: This is the replacement of a failed or degraded working resource with a preassigned standby resource. Protection mechanisms tend to be deterministic in nature: when a failure occurs, what will happen is fairly easy to predict and cater for. Generally, the protection action is completed in tens of milliseconds. Protection mechanisms are autonomous and operate independently from the control or management planes. There are several protection schemes of various architectures and for various technologies. Most of these schemes are defined in the following ITU-T recommendations:
• Rec. G.808.1 [1], Generic protection switching — Linear trail and subnetwork protection,
• Rec. G.873.1 [2], Optical Transport Network (OTN) — Linear protection,
• Rec. G.841 [3], Types and characteristics of SDH network protection architectures,
• Rec. I.630 [4], ATM protection switching, and
• Rec. Y.1720 [5], Protection switching for MPLS networks.
b. Network Restoration: This is the replacement of a failed or degraded working resource by rerouting using dynamically allocated spare resources. Restoration is nondeterministic, and the network behavior in failure conditions is less predictable. The restoration action may take seconds to complete, depending on the amount of traffic being restored. Restoration requires control plane actions. There are a number of ITU recommendations defining control plane architectures, signaling, and routing. These are as follows:
• Rec. G.807 [6], Requirements for automatic switched transport networks (ASTN),
• Rec. G.8080 [7], Automatic switched optical networks (ASON),
• Rec. G.7713 [8] series, Distributed call and connection management,
• Rec. G.7714 [9] series, Automatic discovery techniques, and
• Rec. G.7715 [10] series, Routing in ASON.
Rec. G.807 [6], Requirements for automatic switched transport networks (ASTN) Rec. G.8080 [7], Automatic switched optical networks (ASON) Rec. G.7713 [8] series. Distributed call and connection management Rec. G.7714 [9] series. Automatic discovery techniques Rec. G.7715 [10] series. Routing in ASON
SURVIVABILITY OFFERED BY PROTECTION
ITU Recommendation G.808.1 provides an overview of generic aspects of linear protection switching and covers protection schemes that are applicable to the Synchronous Digital Hierarchy (SDH), Optical Transport Networks (OTN), and ATM. In this section, various aspects of protection switching will be addressed.
8.3.1
Network Objectives
Generally, the following network objectives are considered important and should be met by any protection switching scheme:

a. Switching time: This is the time taken to initiate and complete the protection switching event. It excludes fault detection time and any hold-off time. A value of 50 ms has been standardized. However, this value may be difficult to meet under all network failure scenarios; for example, it may not be met for certain applications, such as ring protection where the ring circumference is large.

b. Transmission delay: The physical length of the fiber and the processing time of the protection protocol along the path determine the transmission delay. Generally, the transmission delay over 1000 km is approximately 5 ms, and processing delay is in the range of hundreds of microseconds. However, in order to meet the protection switching time, it is necessary to impose a certain limit on the transmission delay.

c. Hold-off time: This is a useful parameter for limiting the protection switching actions in nested protection in multilayer, multitechnology transport networks. In nested protection switching schemes, it is necessary to impose a hold-off time on the outer protection so that the inner protection switching can restore the traffic before the outer protection is activated. A standardized hold-off timer, provisionable from 0 to 10 seconds in steps of 100 ms for SDH and OTN and 500 ms for ATM, is mostly used.

d. Switch initiation: Transmission and connectivity defects are used as criteria for initiating protection switching. These defects may be high error rate, loss of signal, or path trace mismatch. Generally, a threshold is set for transmission defects. For example, a threshold error rate, say 1E-8, may be set on the working path, and once this level is crossed, a protection switch is triggered for selection of the protection path. Similarly, for a connectivity defect (a path trace mismatch defect, for example), after a persistency check, the destination port triggers the protection switch for the selection of the protection path.
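The hold-off and switch-initiation behavior above reduces to two small predicates, sketched here as an illustration; the function names are invented, and the thresholds are the provisionable quantities mentioned in the text.

```python
def switch_initiated(error_rate, threshold=1e-8):
    """Switch initiation sketch: a transmission defect triggers a
    protection switch once the working path's error rate crosses the
    provisioned threshold (1E-8 used as in the text's example)."""
    return error_rate > threshold

def outer_layer_acts(defect_duration_s, hold_off_s):
    """Hold-off sketch for nested protection: the outer layer switches
    only if the defect outlives the provisioned hold-off time, giving
    the inner layer's protection a chance to restore traffic first."""
    return defect_duration_s > hold_off_s
```

For instance, with a 500 ms hold-off, an inner-layer switch that clears the defect within 250 ms leaves the outer layer idle, while a defect persisting 1.2 s escalates to the outer layer.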
8.3.2
Protection Switching Architectures
Basically, there are five types of architectures: 1+1, 1:n, m:n, (1:1)^n, and rings. The first four types are for linear protection. Each type has its own advantages and disadvantages, as detailed below.
8.3.2.1
1+1 Protection Architecture
The 1+1 architecture is the least complex type of protection, where the protection channel is dedicated as a backup facility to the working channel. The normal traffic signal is permanently bridged onto the protection channel at the source endpoint of the protection domain, as shown in Figure 8-1. The normal traffic signal is transmitted simultaneously on both the working and protection channels to the sink endpoint of the protection domain. A selection between the working and protection signals at the sink end is made on the basis of a predetermined criterion, such as loss of signal or signal degrade.

Figure 8-1. 1+1 protection architecture
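Because the bridge is permanent, the whole scheme reduces to a single-ended selector rule at the sink. A minimal sketch (function and return values are illustrative):

```python
def sink_select(working_ok: bool, protection_ok: bool) -> str:
    """1+1 sketch: the source bridges the signal permanently onto both
    channels; the sink prefers the working copy and falls back to the
    protection copy on loss of signal or signal degrade."""
    if working_ok:
        return "working"
    if protection_ok:
        return "protection"
    return "none"  # both copies failed
```

No APS communication back to the source is needed, which is what makes unidirectional 1+1 the simplest scheme to implement.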
The main disadvantage of this type of protection is the 100% extra capacity needed in the network.

8.3.2.2
1:n Protection Architecture
The 1:n architecture is a shared type of protection where one dedicated protection channel is shared as a backup facility by n working channels, as shown in Figure 8-2. When the normal traffic signal on a working channel is impaired due to transmission or connectivity defects, it is transferred to the protection channel at both the source and sink endpoints of the protected domain. When more than one working channel is impaired, only one normal traffic signal can be protected at any one time.
The main advantage of this architecture is that the protection channel can be used to transport extra traffic when the protection channel is not used. However, this architecture tends to be complex, and it does not support dual node interconnect for protected subnetworks.
Figure 8-2. 1:n protection architecture (AIS: Alarm Indication Signal; FDI: Forward Defect Indication)
8.3.2.3
m:n Protection Architecture
The m:n architecture is another type of shared protection, where m dedicated protection channels are shared as backup facilities by n working channels, as shown in Figure 8-3. The value of m is typically < n. The bandwidth of each of the protection channels should be adequate to protect any of the n working channels. Generally, when the normal traffic signal on a working channel is impaired, the normal traffic signal is first assigned to an available protection channel and then transferred from the working to the protection channel at both the source and sink endpoints of the protected domain. Obviously, only m working channels can be protected at any one time. The main advantage of this architecture is that the protection channels can be used to transport extra traffic when they are not in use. However, this architecture tends to be complex, and it does not support dual node interconnect for protected subnetworks.

8.3.2.4
(1:1)^n Protection Architecture
The (1:1)^n protection architecture consists of n parallel 1:1 protection arrangements and behaves in a manner similar to 1:n. There are n protection channels sharing the same bandwidth that serve as the backup facilities for n working channels, as shown in Figure 8-4. The protection channel bandwidth is capable of protecting any of the n working channels. When the normal traffic signal over a working channel is impaired, the normal traffic is first assigned to the associated available protection channel. This is followed by transferring the normal traffic signal on the working channel to the assigned protection channel at both the source and sink endpoints of the protected domain. This architecture allows only one working channel to be protected at any one time. The main advantage of this architecture is that it allows the working traffic to be routed over different equipment and resources, thus avoiding a common point of failure. This protection architecture is applicable to cell and packet networks such as ATM and MPLS.
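A minimal sketch of the (1:1)^n assignment rule described above: every working channel has a paired protection channel, but the protection channels share one unit of bandwidth, so only one impaired channel (here, the first reported) is actually protected. Channel numbering and names are illustrative.

```python
def assign_protection(impaired, n):
    """(1:1)^n sketch: return, per working channel 1..n, whether its
    paired protection channel carries the traffic.  The shared protection
    bandwidth admits only one protected channel at a time; `impaired`
    lists the impaired working channels in order of detection."""
    protected = impaired[0] if impaired else None
    return {ch: ch == protected for ch in range(1, n + 1)}
```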
Network Survivability

Figure 8-3. m:n protection architecture
Chapter 8
Figure 8-4. (1:1)^n protection architecture
8.3.2.5
Ring Protection Architecture
Currently, the ITU is developing a new recommendation, G.808.2, which will provide an overview of generic aspects of ring protection. One simple application of ring protection is the self-healing ring, where a failure of one of the links between the ring nodes results in the ring reconfiguring itself and looping the traffic at the nodes nearest to the failure away from it. Another
form of ring protection is shown in Figure 8-5, which is described in ITU Recommendation G.841 and is referred to as SDH Multiplex Section Shared Protection Rings (MS-SPRING). G.841 provides SDH equipment-level specifications to implement this type of protection.
8.3.3
Protection Switching Parameters
8.3.3.1
Switching Types
There are two types of protection switching, namely, unidirectional and bidirectional.

a. Unidirectional switching: Switching, in this case, is completed when the normal traffic signal is selected from the protection channel at the end detecting the fault. For the 1+1 architecture, only the sink-end selector is activated, and no communication with the source end is required. For other protection switching architectures, the sink-end selector and the source-end bridge are activated. This requires a communication channel between the two ends of the protected domain, referred to as the Automatic Protection Switching (APS) channel. The APS channel is terminated at the connection functions at both ends of the protected domain. Unidirectional protection switching is a simple scheme to implement, and under multiple failure conditions there is a greater chance of restoring traffic faster than with bidirectional switching.
Figure 8-5. SDH MS-SPRING
b. Bidirectional switching: In this type of switching, the normal traffic signal is switched from the working channel to the protection channel at both ends of the protection span. For the 1+1 architecture, the selectors at the sink and source ends are activated. For other protection switching architectures, selectors and bridges at both sink and source ends are operated. The two ends of the protection span communicate via the APS channel to initiate the transfer of the normal traffic signal. Priority, request, and switching commands are contained within the APS protocol. Bidirectional protection switching results in a minimum number of Severely Errored Seconds (SES) during repair and maintenance of the working path. This is due to fewer switching events and thus fewer disruptions to normal traffic.
8.3.3.2
Operation Types
There are two types of protection switching operations, namely, revertive and nonrevertive.

a. Revertive operation: When the defect or the external commands are cleared on the working channel, the normal traffic signal returns to the working channel that was in use prior to these events. Revertive operation is likely to be favored for the following reasons:
• In some applications, protection capacity is needed for frequent network capacity rearrangement.
• The protection channel may offer much lower performance, such as a large delay.
• Network management needs to be simplified.
To prevent frequent operation of the protection switch due to intermittent defects such as bit error rate fluctuation, a fixed period of time is allowed to elapse before a normal traffic signal can use a recovered working channel on which the Signal Fail (SF) and Signal Degrade (SD) conditions have been cleared. This period is called the wait-to-restore period. It is generally configurable, and a settable value from 5 to 12 minutes is standardized.

b. Nonrevertive operation: The normal traffic signal stays on the protection channel and does not return to the working channel after the defect or external commands are cleared. The advantage of nonrevertive operation is that, in general, it has less impact on traffic performance.
8.3.3.3
Protocol Types
All protection switching types require the APS protocol to coordinate the bridge and selector actions at both ends of the protection domain. The only exception is the 1+1 unidirectional protection switching scheme. Different
APS protocols are required depending on the type of protection, selector, and bridge used. The near-end node (A) and the far-end node (B) of a protection domain communicate with each other via the APS channel. The number of times the APS protocol traverses the protection span between nodes A and B determines the protocol type. To reduce the protection switching time, it is necessary to minimize the number of communication cycles between nodes A and B. The APS protocols associated with these communication cycles are referred to as 1-phase (B to A), 2-phase (B to A and A to B), or 3-phase (B to A, A to B, and B to A). The 1-phase protocol is employed in the (1:1)^n architecture, the 2-phase protocol in the 1+1 bidirectional architecture, and the 3-phase protocol in all architectures. The 3-phase protocol is most commonly used, since it is applicable to all architectures and prevents misconnections under all conditions. However, it does suffer from increased switching times due to the triple exchange of APS messages between the two ends of the protected domain.
8.3.3.4
Protection Switching Temporal Model
Figure 8-6 illustrates the protection switching model, whose parameters are defined below.
• Detection time T1: the time interval from the instant of the network impairment occurrence to its detection.
• Hold-off time T2: the time taken to confirm that the defect conditions require protection switching.
• Switching operation time T3: the time taken to complete the processing and transmission of the APS messages to invoke protection switching.
• Switching transfer time T4: the extra time needed to complete the protection switching operation.
• Recovery time T5: the time needed after completion of the protection switching operation to fully restore the protected traffic.
The protection switching time (T3 + T4) is standardized to be less than 50 ms. Furthermore, the restoration time is the total time taken to restore protected traffic and is the sum of all the time intervals above (i.e., T1 + T2 + T3 + T4 + T5).
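The temporal model reduces to simple arithmetic. The following sketch uses illustrative interval values (the numbers are assumptions, not standardized figures; only the T3 + T4 < 50 ms bound comes from the text):

```python
# Time intervals of the protection switching temporal model (illustrative values, ms)
t1 = 2   # T1: detection time
t2 = 0   # T2: hold-off time (no hold-off configured)
t3 = 20  # T3: switching operation time (APS message processing/transmission)
t4 = 15  # T4: switching transfer time
t5 = 5   # T5: recovery time

protection_switching_time = t3 + t4       # the quantity bounded by the standard
restoration_time = t1 + t2 + t3 + t4 + t5 # total time to restore protected traffic

assert protection_switching_time < 50     # standardized bound: T3 + T4 < 50 ms
assert restoration_time == 42
```

Note that only T3 + T4 is subject to the 50 ms requirement; detection, hold-off, and recovery add to the total restoration time without counting against that bound.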
8.3.4
Protection Switching Classes
8.3.4.1
Trail Protection
Trail protection is a dedicated end-to-end protection across an entire network or multiple networks. It protects against faults in the server layer as well as connectivity faults and performance degradation defects in the client layer. It can be deployed in various network architectures such as mesh or rings, and there is no limitation on the number of network elements (NEs) within the protected trail. Trail protection is quite a versatile protection class, and it can be deployed in all combinations of protection switching architectures, types, and operations. Figure 8-7 illustrates the generic concept of trail protection. It is worth noting that trail protection at the path level requires an additional termination function on the cross-connect equipment, which results in additional cost relative to other classes. Trail protection can be applied to individual trails or groups of trails.
Figure 8-6. Protection switching temporal model
Figure 8-7. Trail protection
8.3.4.2
Subnetwork Connection Protection (SNC-P)
SNC-P is used to protect a segment of a trail within a network or across multiple networks. The subnetwork can be between two Connection Points (CPs), as shown in Figure 8-8, between a CP and a Termination Connection Point (TCP), or between two TCPs as the entire end-to-end network connection. SNC-P can operate in various protection architectures and switching and operation types. It can be used in various network structures, such as mesh, ring, or mixed, and can be applied in any layer of a layered network. SNC-P is further characterized by the way defects are monitored within its domain.

a. SNC/I — Inherent: In this variant, SNC protection operates on the server layer defect conditions only. The server layer termination and adaptation functions determine the Signal Fail and Signal Degrade (SF/SD) conditions.

b. SNC/N — Nonintrusive: This approach requires the use of nonintrusive monitoring functions to determine the SF/SD conditions. It can be applied using end-to-end detection, where the server layer defects, continuity/connectivity defects, error degradation, and end-to-end OAM overhead are monitored. Alternatively, it can be applied using sublayer detection, which is similar to end-to-end detection except that the sublayer OAM overhead is monitored. This is typically applied in a segment sublayer (referred to as the tandem connection (TC) layer) and is then referred to as SNC/S. TC monitoring functions are also deployed to
determine the SF/SD conditions. Protection can be applied on individual SNCs or a group of SNCs.
Figure 8-8. SNC/I protection
Figure 8-9. 1+1 SNC/N protection
Figure 8-10. SNC/S protection
8.3.4.3
Unidirectional Path Switched Ring (UPSR)
SONET [11] defines this protection scheme, in which the working channels for the two directions of bidirectional traffic may follow different physical paths in a protected ring. In general, it can be used for both SNC/I and SNC/N but not for SNC/S or trail protection. This scheme may result in a longer transfer delay, which differs between the two directions of traffic, as compared with schemes where the working channels for both directions follow the same physical path.
8.3.5
Hold-off Timer
In a nested protection mechanism, a single failure may trigger multiple recovery mechanisms, which may interact with each other, resulting in undesirable and unknown network states. To prevent this, a hold-off timer is used. This allows the innermost protection switching to operate first and restore traffic before the outermost protection switching is activated. Hold-off timers provide a means to limit unnecessary protection switching events. In some applications a hold-off timer is used in each protection selector. For example, in 1+1 SNC/N and SNC/I protection switching, hold-off timers are used to prevent early switching due to the differential delay between the long and short routes. A hold-off timer is activated when one or more defect conditions such as SD or SF are detected in the protected group, and it runs for a provisionable period from 0 to 10 seconds. Recommendation G.841 defines provisionable steps of 100 ms for SDH, and a similar value is defined for OTN.
When the hold-off timer expires, the SF/SD status of all the traffic signals is passed to the protection switching process to act upon. The SF/SD conditions are not required to be present during the entire duration of the hold-off period, since only the states at the end of the hold-off period are considered.
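The hold-off behavior described above can be sketched as a replay of timestamped defect events: conditions that are raised and cleared during the hold-off period are ignored, and only the state at expiry is handed to the switching process. This is an illustrative model; the event representation is an assumption.

```python
def state_at(events, t):
    """Replay timestamped defect events up to time t and return the active
    SF/SD state of each signal. events: list of (time_ms, signal, condition),
    where condition is "SF", "SD", or None (cleared)."""
    state = {}
    for when, signal, condition in sorted(events, key=lambda e: e[0]):
        if when <= t:
            state[signal] = condition
    return {s: c for s, c in state.items() if c is not None}

# A defect at t=0 starts a 100 ms hold-off timer; an intermittent SF on
# signal "b" clears before expiry, so only "a" reaches the switching process.
events = [(0, "a", "SF"), (10, "b", "SF"), (60, "b", None)]
holdoff_expiry_ms = 100
assert state_at(events, holdoff_expiry_ms) == {"a": "SF"}
```

The intermittent fault on signal "b" never triggers a switch, which is exactly the filtering effect the hold-off timer is meant to provide.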
8.3.6
Protection Switching Trigger Criteria
Recommendation G.808.1 defines the Signal Fail (SF) and Signal Degrade (SD) conditions and the combination rules associated with the layer network. It also presents tables that illustrate the defects that contribute to SF and SD conditions in various network transport technologies. For example, among the SDH conditions that contribute to SF are Loss of Signal (LOS), Alarm Indication Signal (AIS), Trace Identifier Mismatch (TIM), and Loss of Frame (LOF). The SD condition is caused by digital degradation due to an increased error rate.
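A simple combination rule in the spirit of the G.808.1 tables can be sketched as a mapping from active defects to a trigger condition. The defect sets below are illustrative, not a reproduction of the Recommendation's tables:

```python
# Illustrative mapping of SDH defects to protection switching trigger conditions
SF_DEFECTS = {"LOS", "LOF", "AIS", "TIM"}  # defects contributing to Signal Fail
SD_DEFECTS = {"DEG"}                       # degraded error rate -> Signal Degrade

def trigger_condition(active_defects):
    """Combine active defects into a single SF/SD trigger; SF outranks SD."""
    if active_defects & SF_DEFECTS:
        return "SF"
    if active_defects & SD_DEFECTS:
        return "SD"
    return None

assert trigger_condition({"LOF"}) == "SF"
assert trigger_condition({"DEG"}) == "SD"
assert trigger_condition({"AIS", "DEG"}) == "SF"  # SF takes priority over SD
assert trigger_condition(set()) is None
```

The priority ordering (SF over SD) matters because the APS protocol resolves competing requests by priority, as discussed in Section 8.3.8.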
8.3.7
Null Signal
The null signal is inserted by the connection function and is ignored at the sink end of the protection. The null signal is transported by the protection channel when the channel is not transporting extra traffic or a normal traffic signal. Typical examples of null signals are an Unequipped signal in SDH, Optical Data Unit — Open Connection Indication (ODUk-OCI) in OTN, no signal in ATM/MPLS, and test signals.
8.3.8
Automatic Protection Switching (APS) Protocol
8.3.8.1
The APS signal
An APS signal is required to coordinate the protection switching actions at both ends (A and B) of the protected domain. It is transported via the APS channel and is generally allocated to one or more protection channels. The APS signal contains the following messages:
• Request/state type: to identify the highest-priority fault conditions, external commands, or the state of the protection process
• Requested and bridged signals: to identify the null, normal traffic, and extra traffic signals
• Protection configuration: to identify the use of an APS channel, the protection architecture, switching, and the operation type
8.3.8.2
External Commands
External commands are issued externally to initiate the protection process. Generally, only one external command is issued per protection group. External commands can be preempted or denied by higher-priority failure conditions, requests, or states. These commands are useful to do the following:
• to change equipment configuration for maintenance purposes
• to disable access to protection
• to test the protection process
• to freeze the current state of the protection process to prevent further action
• to clear previous external switch commands
8.3.8.3
Protection Switching Process States
The three states of the protection switching process are identified below.
• No Request (NR): This means that in the event of a failure, the normal traffic signal is selected from the working channel. The protection channel may be carrying the null signal, extra traffic, or, in the 1+1 protection architecture, bridged traffic.
• Do Not Revert (DNR): This is used in nonrevertive operation to maintain the normal traffic signal selected from the protection channel.
• Wait-to-Restore (WtR): This is used in revertive operation to prevent frequent selector operation due to intermittent failures. After the SF/SD conditions on the working channel are cleared, the normal traffic signal remains selected from the protection channel until the wait-to-restore time has expired. At the end of the WtR period, and in the absence of any other event or command, the state changes to NR.
8.3.8.4
Priority
In the APS protocol, various priorities are assigned to fault conditions, external commands, or protection states. These are generally applied locally at each endpoint or between endpoints in the protection domain.
8.3.9
Examples
The ITU has developed a number of recommendations that cover technology-specific protection switching schemes. The modeling methodology used in developing these recommendations is based on ITU Recommendation G.805 [9].

a. Recommendation G.841: This Recommendation defines the implementation of different protection schemes and architectures for SDH networks. It covers protected entities such as SDH linear multiplex section protection, a segment of an SDH end-to-end path (i.e., SNC protection), or an entire end-to-end path (i.e., higher-order or lower-order linear VC trail protection). It also specifies the requirements for the Multiplex Section Shared Protection Ring (MS-SPRING).
b. Recommendation G.873.1: This Recommendation defines various linear protection schemes for Optical Transport Networks (OTNs) at the Optical Channel Data Unit (ODU) level. It also defines the objectives and applications for these schemes. The protection schemes covered by this Recommendation are ODU SNC/I, ODU SNC/N, and ODU SNC/S. OTN ring protection is currently being developed in the ITU draft Recommendation G.873.2.
c. Recommendation Y.1720: This Recommendation defines the requirements and mechanisms for 1+1, 1:1, shared mesh, and packet 1+1 protection switching functionality for MPLS networks. The defined mechanisms are specified to support end-to-end point-to-point Label Switched Paths (LSPs).
d. Recommendation I.630: This Recommendation defines architectures and mechanisms for ATM VP/VC protection switching and ATM VP Group protection switching. The Recommendation also describes individual and group protection, the arrangement of protected domains, and resource allocation policies. The protection mechanism defined
in this Recommendation includes protection switching triggers, hold-off mechanisms, and the protection switching control protocol.
8.3.10
Optical Transport Networks (OTN) Survivability
Recent advances in Dense Wavelength Division Multiplexing (DWDM) technology have resulted in its deployment in Optical Transport Networks (OTNs) worldwide. DWDM allows the transport of various bit rates on a multitude of closely spaced optical wavelengths on a single optical fiber. The ITU has developed a number of recommendations defining the frame format and OTN interfaces (Recommendation G.709) and the physical parameters (Recommendation G.959.1). In terms of survivability, OTN protection schemes are similar to those defined for SDH and SONET networks and are based on the protection schemes defined in ITU Recommendation G.808.1. ITU Recommendation G.873.1 defines linear protection schemes for OTN. Figure 8-11 shows an example of 1+1 Optical Subnetwork Connection (OSNC) protection. The digital client signal is applied to two optical transmitters and transported over two optical wavelengths, one as the working channel and the other as the protection channel. At the receiver, a defect on the optical path results in selecting the digital client signal from the protection optical wavelength. For shared ring protection, the work in the ITU is at an early stage, but it is based on SDH shared ring protection. Under such a scheme, each protected connection is provided with a preassigned 1:1 protection route and capacity. The protection connection itself does not carry a copy of the working connection under nonfailure conditions; therefore, the capacity is not occupied and can be used for lower-priority extra traffic. The extra traffic itself is not protected. This protection capacity can be shared by other protection connections on a link-by-link basis. To restore the OTN ring network after a failure, the affected working connections are switched to counterdirectional routes on an end-to-end basis with preassigned wavelengths. Obviously, an APS protocol is required for such a protection scheme.
Figure 8-11. An example of 1+1 Optical Subnetwork Connection (OSNC) protection
8.4.
SURVIVABILITY OFFERED BY RESTORATION
In the last few years, there has been considerable interest in the use of control plane technology, such as Automatic Switched Optical Networks (ASON) and Generalized Multiprotocol Label Switching (GMPLS), in transport networks for restoration. Restoration differs from protection in that it does not use dedicated spare capacity for the protection path. Instead, alternative routes are found from the available spare capacity in the network. Therefore, there is a degree of sharing of the available spare capacity among a number of working routes. The restoration mechanism offers the following advantages:
• fast recovery of unprotected circuits and thus an increase in their availability
• improved time-to-repair for unprotected services
• under a major outage, a reduction in the number of circuits that need to be manually rerouted
Additionally, restoration offers capacity savings compared with network protection schemes. However, restoration should not be considered a complete replacement for protection. For restoration to work efficiently, a high degree of meshing in the network is needed. Generally, this approach is better suited to the core network, which is highly meshed and operates at the VC-4 or optical wavelength level. The outer core network, with its low meshing and the need for dual-parenting protection, will dictate the use of protected architectures such as ring protection. In the last four years, the ITU-T has approved a number of recommendations defining control plane architecture, routing requirements, signaling, and discovery [5-9]. ITU Recommendation G.8080 specifies the architecture and requirements for the automatic switched transport network as applicable to SDH transport networks, as defined in G.803, and Optical Transport Networks, as defined in Recommendation G.872. It describes the set of control plane components
that are used to manipulate transport network resources in order to provide the functionality of setting up, maintaining, and releasing connections.
8.4.1
Network Restoration Techniques
Basically, restoration may be implemented in one of the following ways:
a. Preplanned routing with centralized route calculations: In this type of restoration, the route calculations are not made in real time, and thus optimal routes can be achieved prior to the occurrence of the network failure. These preplanned routes can be enhanced or amended and downloaded to the network elements.
b. On-the-fly restoration with centralized route calculations: In this type, routes are calculated using the available spare capacity in the network after the failure has occurred. The limitation of this scheme is that restoration can take longer. The reason for this delay is the post-failure route calculations, which are only efficient once topological convergence of the routing protocol has occurred and the management system has obtained all the relevant alarm information.
c. Distributed techniques: This type uses distributed rather than centralized route calculations and can use either on-the-fly or preplanned route calculations.
All of the above schemes are sometimes augmented with crankback should the route calculations fail to converge.
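The on-the-fly scheme with crankback can be sketched as a route computation over links with spare capacity, followed by hop-by-hop reservation; if a reservation fails (for example, because another node claimed the capacity first), the partial reservation is released and the route is recomputed. This is a minimal illustration, not any standardized algorithm; the function names and unit-cost links are assumptions.

```python
import heapq

def shortest_route(links, src, dst):
    """Dijkstra (unit link cost) over links that still have spare capacity.
    links maps (a, b) -> spare capacity units."""
    adj = {}
    for (a, b), spare in links.items():
        if spare > 0:
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)
    queue, seen = [(0, src, [src])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in adj.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + 1, nxt, path + [nxt]))
    return None

def restore(links, src, dst, attempts=3):
    """On-the-fly restoration with crankback: recompute if reservation fails."""
    for _ in range(attempts):
        path = shortest_route(links, src, dst)
        if path is None:
            return None
        reserved = []
        for a, b in zip(path, path[1:]):
            key = (a, b) if (a, b) in links else (b, a)
            if links[key] <= 0:
                break  # capacity gone since route computation
            links[key] -= 1
            reserved.append(key)
        else:
            return path  # every hop reserved successfully
        for key in reserved:  # crankback: roll back partial reservation, retry
            links[key] += 1
    return None

# Direct link A-B has no spare capacity, so restoration routes around it.
links = {("A", "B"): 0, ("A", "C"): 1, ("C", "B"): 1}
assert restore(links, "A", "B") == ["A", "C", "B"]
```

Note how the spare capacity on A-C and C-B is consumed by the restored connection, illustrating why restoration times grow as large failures exhaust shared spare capacity.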
8.4.2
Restoration time
Generally, restoration is nondeterministic; thus, the way the network performs under failure conditions is much less predictable and depends on many factors, such as the failure location, the type of failed circuit, the topological distribution of the spare capacity, and the traffic distribution. Restoration is likely to be slower than protection schemes, since on failure a new path has to be computed and set up across the network. It is inevitable that restoration will take longer than dedicated protection schemes. While the time to restore a single circuit may look impressive, in reality a significant failure such as a fiber cut will require the restoration of many circuits over many individual links, depending on the size of the network and the failure type. Restoration is a serial process, and therefore restoration times tend to be cumulative. The larger the scale of the failure, the slower
the restoration time. In some applications, priority is assigned to traffic so that important services are restored first. The restoration process involves the computation of new routes using an existing prefailure network topology map. This process may result in an attempt to use bandwidth that is no longer available; in addition, the topology map may have been modified, or other nodes may have used the available paths for restoration. These issues obviously prolong restoration time. During restoration, it is likely that many nodes will try to restore traffic at the same time. Since all nodes use the same network topology map, it is possible that two or more nodes will attempt to use the same capacity as part of their paths. Since only one node can do so (first come, first served), the others will have to roll back and try again. This process also contributes to prolonged restoration time. Restoration time is also influenced by the traffic granularity used in the restoration process. Core networks with a high degree of meshing, operating at the VC-4 level and above, such as optical wavelengths, are better suited for implementing restoration. The reason is the lower number of circuits, which is more manageable in terms of topology and route calculations than a large number of lower-bit-rate circuits, such as VC-12. Overall restoration time is therefore affected by many factors, all of which contribute to its total value.
8.4.3
Interoperability
Interoperability is one of the key requirements of restoration-based networks as they become widely deployed. Most current restoration deployments utilize proprietary implementations. Interworking between protection domains can be achieved at two basic interfaces, namely, a User-Network Interface (UNI) or a Network-Network Interface (NNI). The UNI approach is based on a client-server relationship in which the client requests bandwidth from the server over the interface without any need for the client and server to know each other's network topology or signaling/routing protocols. The NNI approach tends to be more complex, since it requires topology interchange and hence compatibility of the signaling/routing protocols between the different protection domains. Therefore, one domain knows the topology of the other and can thus determine a complete route through the network. Figure 8-12 illustrates these features.
Figure 8-12. An example of topology on UNI and E-NNI control and data planes
In terms of standards, the ITU and the Optical Internetworking Forum (OIF) have achieved considerable progress over the last three years. Interworking among a number of operators and vendors was recently demonstrated by the OIF at SuperComm 2004.
8.5.
LINK CAPACITY ADJUSTMENT SCHEME (LCAS)
ITU-T Recommendation G.7042 describes a technique, referred to as the Link Capacity Adjustment Scheme (LCAS) and covered in more detail in Chapter 4, for autonomously increasing or decreasing the bandwidth of a path in SDH or OTN transport networks. LCAS can also provide another dedicated end-to-end scheme for network survivability. It can offer survivability across single or multiple networks and can operate in various network structures such as mesh and rings. Generally, LCAS protects against faults in the server layer as well as connectivity faults and signal degradation in the client layer. It can operate in all combinations of protection architectures, switching, and operation types. Figure 8-13 shows the generic concept of LCAS-based network survivability. The scheme operates by removing the fractional payload transported on any member of the Virtual Concatenation Group (VCG) that experiences fault conditions. The result is a reduced payload size.
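The payload-reduction effect of LCAS member removal is simple to model: the surviving VCG bandwidth is just the number of healthy members times the per-member rate. A minimal sketch, assuming VC-4 members with a 149.76 Mbit/s payload each (the nominal C-4 container rate; treat the figure as illustrative):

```python
def vcg_bandwidth(member_status, member_rate_mbps):
    """Available VCG payload after LCAS removes failed members.
    member_status: list of booleans, True if the member is healthy."""
    healthy = sum(1 for ok in member_status if ok)
    return healthy * member_rate_mbps

# A VCG of 5 VC-4 members; one member fails, so LCAS reduces the payload
# to 4 members' worth instead of failing the entire path.
assert vcg_bandwidth([True, True, False, True, True], 149.76) == 4 * 149.76
```

This graceful degradation, rather than an all-or-nothing switch, is what distinguishes LCAS-based survivability from the protection classes described earlier in the chapter.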
Figure 8-13. Survivability using LCAS
8.6.
MULTILAYER SURVIVABILITY
Current networks support a variety of technologies such as SDH/SONET, OTN, ATM, Ethernet, IP, etc. This situation has resulted in multilayer transport networks with a variety of layer nestings depending on technology deployment and network evolution. Each technology layer may support a variety of protection and restoration schemes. Therefore, multilayer
survivability uses two or more nested protection mechanisms, while single-layer survivability employs a single end-to-end or cascaded protection scheme. A good survivability strategy in a multilayer network is to select the survivability schemes associated with these technologies that offer optimal performance and deliver the desired QoS more cost-effectively than can be achieved in a single-layer network. Multilayer survivability can effectively combine the merits of the protection mechanisms of each of its constituent layers, such as the transport and service layer protection schemes. However, it is necessary to ensure that a single failure cannot trigger multiple protection mechanisms, which may interact with each other, resulting in undesirable and unknown network states. A good approach to the design of a multilayer network survivability strategy is to identify where nesting of protection mechanisms is useful and to identify cases where interworking should be avoided. The ANSI report T1-TR-68-2001 [10] describes various network scenarios for multilayer survivability and recommends a strategy for its implementation.
8.7.
REFERENCES
[1] ITU Recommendation G.808.1, Generic Protection Switching — Linear trail and subnetwork protection
[2] ITU Recommendation G.873.1, Optical Transport Networks (OTN) — Linear protection
[3] ITU Recommendation G.841, Types and characteristics of SDH network protection architectures
[4] ITU Recommendation I.630, ATM protection switching
[5] ITU Recommendation Y.1720, Protection switching for MPLS networks
[5] ITU Recommendation G.807, Requirements for automatic switched transport networks (ASTN)
[6] ITU Recommendation G.8080, Automatic switched optical networks (ASON)
[7] ITU Recommendation G.7713 series, Distributed call and connection management
[7] ITU Recommendation G.7714 series, Automatic discovery techniques
[8] ITU Recommendation G.7715 series, Architecture and requirements for routing in automatic switched optical networks
[9] ITU Recommendation G.805, Generic functional architecture of transport networks
[10] ANSI T1 report T1-TR-68-2001, Enhanced Network Survivability Performance, 2001
[11] ANSI T1.105.01-2000, Synchronous Optical Network (SONET) — Automatic Protection Switching, 2000
PART 2 Services Offered Over Transport Networks
Chapter 9 METRO ETHERNET OVERVIEW AND ARCHITECTURE
Nan Chen, Strix Systems; Metro Ethernet Forum President
9.1.
METRO ETHERNET DEMAND AND REQUIREMENTS
Ethernet is rapidly evolving into an end-to-end solution extending from the LAN to the metro core, the metro edge, and even the "last mile." The key drivers behind this evolution are the compelling cost-effectiveness of Ethernet, the ubiquity of Ethernet, and broadband services. In order for Ethernet to truly address the requirements of MANs, it must scale performance levels, deliver carrier-class protection mechanisms, guarantee services, and support TDM traffic. The future of Ethernet as the foundation for metro carrier networks lies in its ability to support these carrier-class features, which in turn will enable service providers to offer a broad set of services.
9.1.1 Network Resiliency
To deliver the network resiliency/protection and rapid fail-over rates required by carriers and service providers, developers of metro Ethernet gear must overcome the limitations of Ethernet's incumbent spanning tree protocol (STP), an Ethernet technology that delivers 45-second recovery from link failures. The rapid spanning tree protocol (RSTP), a more recent IEEE standard, has reduced the recovery time to a few seconds.
While this time frame represents a significant improvement over the existing STP standard, certain carrier applications such as TDM-based voice and financial data transactions need stronger protection mechanisms that provide recovery in under 50 milliseconds. Increasingly, metro Ethernet solutions are emerging with 50 ms protection for ring- and mesh-based topologies, while preserving the cost advantages of Ethernet and its MAC. The key advantages of using these solutions for protection switching are as follows:
• Facilitates alternative switching paths
• Enables 50-ms protection by using labeled switch-path tunnels
• Eliminates the need for protection interworking among different transport protocols, since MPLS is the only protocol used
• Provides visibility and enables manipulation at the packet-flow level
• Allows providers to optimize their bandwidth for protected traffic and to allocate less bandwidth for unprotected traffic
• Enables carriers to use alternative routes for switched traffic according to bandwidth availability
• Eliminates restriction to particular topologies (ring or mesh)
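The switchover behavior these solutions rely on can be sketched in a few lines of Python. This is a toy model under assumed names (not any vendor's implementation): real equipment detects the failure and re-points a forwarding entry in hardware, which is what makes the sub-50 ms target achievable, in contrast to a network-wide spanning tree reconvergence.

```python
class ProtectedPath:
    """Toy model of 1:1 path protection with a preprovisioned backup
    label-switched path. All names here are illustrative assumptions."""

    def __init__(self, working_lsp, backup_lsp):
        self.working_lsp = working_lsp   # primary labeled switch-path tunnel
        self.backup_lsp = backup_lsp     # preprovisioned alternative path
        self.active = working_lsp

    def on_link_failure(self, failed_lsp):
        # Switching to the backup is a single local forwarding update,
        # not a topology-wide recomputation as with spanning tree.
        if failed_lsp == self.active and self.active is self.working_lsp:
            self.active = self.backup_lsp
        return self.active

path = ProtectedPath(working_lsp="lsp-primary", backup_lsp="lsp-backup")
assert path.active == "lsp-primary"
path.on_link_failure("lsp-primary")
assert path.active == "lsp-backup"
```

The same pattern generalizes to a mesh, where several backup tunnels may be preprovisioned and the one with available bandwidth is chosen at failure time.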
9.1.2 Traffic and Performance Management
As metro Ethernet becomes an alternative transport infrastructure, best-effort Ethernet will not be adequate, since the new universal transport will need to provide guaranteed services with regard to bandwidth, delay, jitter, etc. The fundamental building block for guaranteed services is a connection, upon which Service Level Specifications (SLS) are assigned and enforced. In addition, a connection is always uniquely dedicated to a customer and/or a service, thereby ensuring data security. With connection-based flows, carriers can enforce service-level specifications on a per-connection-flow basis, across the network. For example, committed information rates and excess information rates can be configured and assigned as attributes of a connection flow. MPLS offers the ability to take advantage of packet information, enabling a flexible service-provisioning architecture that allows carriers to assign SLAs based on customer, IP address, and application type.
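As a rough sketch of how committed and excess information rates can be enforced per connection flow, the following two-bucket meter marks frames in the spirit of a two-rate three-color marker. The class name, units (bytes per second, bytes), and parameter values are illustrative assumptions, not a normative bandwidth profile.

```python
class TwoRateMeter:
    """Illustrative per-flow meter: a committed (CIR) and an excess (EIR)
    token bucket. Frames within CIR are green, within EIR yellow, else red."""

    def __init__(self, cir, cbs, eir, ebs):
        self.cir, self.cbs = cir, cbs    # committed rate (B/s), burst (B)
        self.eir, self.ebs = eir, ebs    # excess rate (B/s), burst (B)
        self.c_tokens = cbs              # committed bucket starts full
        self.e_tokens = ebs              # excess bucket starts full
        self.last = 0.0

    def _refill(self, now):
        dt = now - self.last
        self.last = now
        self.c_tokens = min(self.cbs, self.c_tokens + self.cir * dt)
        self.e_tokens = min(self.ebs, self.e_tokens + self.eir * dt)

    def color(self, frame_len, now):
        """Return 'green', 'yellow', or 'red' for a frame of frame_len bytes."""
        self._refill(now)
        if frame_len <= self.c_tokens:
            self.c_tokens -= frame_len
            return "green"
        if frame_len <= self.e_tokens:
            self.e_tokens -= frame_len
            return "yellow"
        return "red"

meter = TwoRateMeter(cir=1000, cbs=1500, eir=2000, ebs=1500)
colors = [meter.color(1000, t) for t in (0.0, 0.1, 0.2)]  # sustained burst
```

In this run the first frame fits the committed bucket, the second only the excess bucket, and the third neither, so the flow is marked green, yellow, then red as it exceeds its configured rates.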
9.1.3 Circuit Emulation Services
Another key carrier-class feature that will propel the use of metro Ethernet is support of TDM services via circuit emulation services. At the customer premises, metro Ethernet edge switches can packetize TDM traffic arriving on interfaces such as T1/E1, T3/E3, and OC-3/STM-1. This packetized
traffic is assigned the highest available classification tag for transport across the network. Clocking is supported for accurate and reliable TDM voice transmission. The connection tunnel itself provides guaranteed, secure service levels across the metro network, with interfacing to the public switched telephone network occurring at a central point. The allure of such carrier-class capabilities is clear: carriers can deliver high-bandwidth data services and at the same time continue to support their traditional voice services on a single network. Ethernet offers a cost-effective, scalable transport solution across the metro core with support for existing SONET/SDH-based services at the metro edge. The ability to map T1/E1, T3/E3, and OC-3/STM-1 interfaces over an Ethernet network makes this option compelling for carriers who wish to migrate their legacy services in a phased approach and to integrate new Ethernet transport with existing infrastructure. SONET/SDH investments are preserved, while a new, more profitable solution is realized for metro transport.
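A minimal sketch of the packetization step follows, assuming hypothetical field names and a 192-octet payload (roughly 1 ms of T1 channel octets: 8 frames of 24 channel bytes). Real circuit-emulation formats also carry clock-recovery and signaling information that is not modeled here.

```python
def packetize_tdm(octets, payload_size=192, prio=7):
    """Hypothetical circuit-emulation packetizer: slice a continuous TDM
    octet stream into fixed-size payloads, prepend a sequence number so the
    far end can detect loss/reorder, and mark every packet with the highest
    priority tag, as described in the text. Names are illustrative."""
    packets = []
    for seq, start in enumerate(range(0, len(octets), payload_size)):
        packets.append({
            "seq": seq,                              # loss/reorder detection
            "prio": prio,                            # highest classification tag
            "payload": octets[start:start + payload_size],
        })
    return packets

# 10 ms worth of T1 channel octets yields 10 fixed-size packets
pkts = packetize_tdm(bytes(1920))
```

The receiving edge switch would play these payloads back out at a recovered clock rate, which is why the highest classification tag matters: delay variation across the MEN must stay small enough for the jitter buffer.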
9.2. METRO ETHERNET FORUM CHARTER
The Metro Ethernet Forum (MEF) [1] was founded in June 2001 with a mission to accelerate the adoption of metro Ethernet as the technology of choice in metro networks worldwide. The primary priorities of the MEF are to define: • Ethernet Services for metro transport networks. Such services will be delivered over native Ethernet-based metro networks and could also be supported by other transport technologies. • Carrier-class Ethernet-based metro transport technologies by specifying architecture, protocols, and management for Ethernet-based metro transport networks The secondary priorities of the MEF are (when deemed necessary) to define: • Work to be done by other organizations on other transport technologies • Non-Ethernet interfaces, if not defined by other organizations Figure 9-1 shows the technical areas in which MEF is developing recommendations.
Figure 9-1. Categorical topical coverage within MEF (technical areas: Architecture, Services, Transport, Management, and Test, covering the reference model, services model, traffic management, the User Network Interface (UNI), protection requirements and framework, the EMS-NMS information model, EMS requirements, the protection implementation agreement, and testing methods)
9.3. METRO ETHERNET NETWORK (MEN) ARCHITECTURE
9.3.1 MEN Reference Model
The basic network reference model of a MEN [2] is depicted in Figure 9-2. Two major functional components are involved:
• the subscriber/customer edge equipment, and
• the public MEN transport infrastructure
Figure 9-2. Basic Network Reference Model (subscriber sites attach to the Metro Ethernet Network (MEN) at UNI reference points; an Ethernet Virtual Connection across the MEN supports the end-to-end Ethernet flow between subscribers)
Reference point T, also referred to as the UNI (User Network Interface) reference point, demarcates the boundaries between the public MEN and a private customer network. Reference point S is the conceptual point that demarcates the boundaries between the private customer network equipment, when present, and the end-user terminal equipment generating an Ethernet frame flow. If no other private network infrastructure exists between the subscriber terminal equipment and the public MEN equipment, the S and T
reference points coincide. Unless otherwise stated, the term subscriber/customer equipment is used to refer to all private customer equipment outside the MEN. An Ethernet flow represents a particular (potentially noncontiguous, i.e., consecutive Ethernet frames may belong to different flows) unidirectional stream of Ethernet frames that share a common treatment for the purpose of transfer steering across the MEN. In particular, an end-to-end Ethernet flow refers to the flow of Ethernet frames between the communicating terminal equipment (TE) that creates and terminates the Ethernet frames. The Ethernet Virtual Connection (EVC) is the architectural construct that supports the association of UNI reference points for the purpose of delivering an Ethernet flow between subscriber sites across the MEN. One or more subscriber flows may be mapped to a particular EVC (e.g., there may be more subscriber flows identified by the flow classification rules at the ingress point to the network than EVCs). The mapping of Ethernet flows to EVCs is service specific and is specified in the MEF Ethernet Service Model specification and in Chapter 10, "Ethernet Services over Metro Ethernet Networks."
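The flow-to-EVC mapping described above can be sketched as a simple ingress classifier. The rule fields (port, VLAN) and the EVC names below are hypothetical: actual classification rules are service specific and defined in the MEF Ethernet Service Model.

```python
def classify(frame, rules):
    """Illustrative ingress classification: map a frame to an EVC by the
    first matching rule. Several flows may map to the same EVC, mirroring
    the many-to-one relationship described in the text."""
    for match, evc in rules:
        if all(frame.get(k) == v for k, v in match.items()):
            return evc
    return None  # no matching EVC: frame is not carried across the MEN

rules = [
    ({"port": 1, "vlan": 100}, "EVC-A"),   # two distinct subscriber flows ...
    ({"port": 1, "vlan": 200}, "EVC-A"),   # ... mapped to the same EVC
    ({"port": 2}, "EVC-B"),                # all traffic on port 2
]

assert classify({"port": 1, "vlan": 100}, rules) == "EVC-A"
assert classify({"port": 2, "vlan": 300}, rules) == "EVC-B"
assert classify({"port": 3}, rules) is None
```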
9.3.2 MEN Layer Network Model
The MEN layer network model specified in this architecture framework defines the MEN in terms of three layer network components: the Ethernet Services layer supporting basic Layer 2 (L2) Ethernet data communication services; a set of one or more supporting Transport Services layer(s); and an optional Application Services layer supporting applications carried on the basic L2 Ethernet services. In addition, each of these layer networks may be further decomposed into its data, control, and management plane components. This layer network view of a MEN is illustrated in Figure 9-3.
Figure 9-3. MEN layer network model (an optional Application Services Layer, e.g., IP, MPLS, PDH; the Ethernet Services Layer carrying the Ethernet service PDU; and the Transport Services Layer, e.g., IEEE 802.1, SONET/SDH, MPLS; each layer comprises data, control, and management planes)
9.3.2.1 Ethernet Services Layer (ETH Layer)
The Ethernet Services layer, also referred to as the ETH layer, is responsible for the instantiation of Ethernet MAC-oriented connectivity services and the delivery of Ethernet service frames presented across well-defined internal and external interfaces and associated reference points. The ETH layer is also responsible for all service-aware aspects associated with Ethernet MAC flows, including the operations, administration, management, and provisioning capabilities required to support such Ethernet connectivity services. The service frames presented at the ETH layer external interfaces are Ethernet unicast, multicast, and broadcast frames conforming to the IEEE 802.3-2002 frame format. The Ethernet Services layer is a single layer network. The detailed architecture framework for the Ethernet Services Layer is provided in a separate Ethernet Layer Framework Architecture document of the MEF.
9.3.2.2 Transport Services Layer (TRAN Layer)
The Transport layer, also referred to as the TRAN layer, supports connectivity among ETH layer functional elements in a service-independent
manner. Various layer network technologies and interconnect approaches may be used to support the transport requirements of the Ethernet Services layer. Sample transport layer networks include IEEE 802.3 PHY, IEEE 802.1 bridged networks, SDH VC-n/VC-n-Xc, SONET STS-n/STS-n-Xc, ATM VC, OTN ODUk, PDH DS1/E1, MPLS LSP, etc. These transport layers are supported by their respective server layers, e.g., SDH STM-N Multiplex Section, ATM VP, OTN OTUk, PDH DS3/E3, MPLS LSP, IP, fiber, etc. This model may be applied recursively downwards into the transport layer network stack until the physical transmission medium (fiber, copper, coax, wireless) is reached.
9.3.2.3 Application Services Layer (APP Layer)
The Application Services layer, also referred to as the APP layer, supports applications carried on the basic Ethernet services across the MEN. Various application services may be supported over the basic Ethernet services supported by the Ethernet services layer. Sample services include the use of ETH layer as a TRAN layer for other layer networks such as IP, MPLS, PDH DSl/El, etc. The APP layer may also include add-on functions to complement ETH layer services. Each APP Layer may support other APP Layers. This model may be applied recursively upwards into the application layer network stack.
9.3.3 MEN Reference Points
A MEN reference point identifies a set of layer network reference points used for demarcating administrative boundaries where a link traverses open interfaces specified by the MEF. Figure 9-4 shows the relationship among the architectural components external to a MEN, their associated interfaces, and their reference points. Components external to a MEN include (1) the subscribers to the MEN services, (2) other MEN networks, and (3) other (non-Ethernet) transport and service networks. Subscribers connect to a MEN at a User-Network Interface reference point. Internal network elements (NEs) are interconnected via Internal Network-to-Network Interfaces, or I-NNIs (not shown). Two autonomous MENs may interconnect at an External NNI (E-NNI) reference point. A MEN may interconnect with other transport and service networks at a Network Interworking NNI (NI-NNI) or a Service Interworking NNI (SI-NNI) reference point. An Ethernet Wide Area Network (E-WAN) refers to any MEF-defined ETH services-aware network that provides connectivity between two or more MENs via E-NNIs. A MEN may use non-Ethernet-based transport and service elements as internal architectural components. In some cases, these architectural
components may be integrated in a single network element (NE). In such cases, an integrated and hybrid MEN may be deployed such that the NI-NNI and/or SI-NNI are logical reference points within a particular piece of network equipment in the MEN. Note that any of the transport/services networks identified in Figure 9-4 may indeed belong to the same service operator (e.g., they may represent different business units within the same company).
Figure 9-4. MEN external interfaces and associated reference points (subscribers attach via UNIs; the MENs of two service providers interconnect via an External NNI, possibly across an Ethernet Wide Area Network (E-WAN); other L2/L2+ services networks, e.g., ATM, FR, IP, attach via a Service Interworking NNI; other transport networks attach via Network Interworking NNIs)
9.3.3.1 User-Network Interface (UNI)
The User-Network Interface (UNI) is the interface used to interconnect a MEN subscriber to its MEN service provider(s). The UNI also provides a reference point for demarcation between the MEN operator's equipment that enables access to the MEN services and the subscriber access equipment. Therefore, the demarcation point indicates the location where the responsibility of the service provider ends and the responsibility of the subscriber begins. The specific location of the UNI reference point (T) is specified in the MEF UNI document.
Figure 9-5. The UNI and the MEN Reference Model (UNI client functions at subscriber sites A and B connect across UNI reference points (T) to UNI network functions in the Metro Ethernet Network; an Ethernet Virtual Connection supports the end-to-end Ethernet flow)
Functionally, the UNI is an asymmetric, compound functional element that consists of a client side, referred to as the UNI-C, and a network side, referred to as the UNI-N, as illustrated in Figure 9-5. Thus, the term UNI is used to refer to these two functional elements and, generically, to the data, management, and control plane functions associated with them.
9.3.3.1.1 UNI Client (UNI-C)
The UNI-C (UNI Client) is a compound architectural component of a MEN that represents all the functions required to connect a subscriber to a MEN. Individual functions in a UNI-C are entirely in the subscriber domain and may or may not be managed by the service provider/network operator. From the perspective of the MEN, the UNI-C supports the set of functions required to exchange data, control, and management plane information with the MEN subscriber. As such, the UNI-C includes functions associated with the Ethernet services infrastructure, the transport network infrastructure, and (if present) application-specific components.
9.3.3.1.2 UNI Network (UNI-N)
The UNI-N (UNI Network) is a compound architectural component of a MEN that represents all the functions required to connect a MEN to a MEN subscriber. The individual functions in a UNI-N are entirely in the service provider/network operator domain. From the perspective of the subscriber, the UNI-N supports the set of functions required to exchange data, control, and management plane information with the MEN. As such, the UNI-N includes functions associated with the Ethernet services infrastructure, the transport network infrastructure, and (if present) application-specific components.
9.3.3.2 External Network-to-Network Interface (E-NNI)
The External Network-to-Network Interface (E-NNI) is an open interface used to interconnect two MEN service providers. The E-NNI provides a reference point for network equipment and Ethernet service demarcation between the two directly attached MENs. The E-NNI also provides a reference point for NEs and Ethernet service demarcation between a MEN and an Ethernet service-aware Wide Area Network (E-WAN). Transport interfaces and Network Interworking capabilities other than those associated with native Ethernet physical interfaces (e.g., see Section 9.3.3.4) may be supported across this interface. In addition, the term E-NNI is used to refer generically to the protocol exchange that exists at the E-NNI reference point between the architectural elements in each of the MENs that support the E-NNI delineation functions. The specific location of the E-NNI reference point is specified in the MEF E-NNI framework document.
9.3.3.3 Internal Network-to-Network Interface (I-NNI)
The Internal Network-to-Network Interface (I-NNI) is an open interface used to interconnect NEs from a given MEN service provider. The I-NNI provides a reference point for Ethernet service demarcation between the two directly attached NEs. Transport interfaces and Network Interworking capabilities other than those associated with native Ethernet physical interfaces may be supported across this interface. In addition, the term I-NNI is used to refer generically to the protocol exchange that exists at the I-NNI reference point between the architectural elements in each of the MENs that support the I-NNI delineation functions.
9.3.3.4 Network Interworking Network-to-Network Interface (NI-NNI)
The Network Interworking NNI (NI-NNI) is an open interface that supports the extension of transport facilities used to support Ethernet services, and associated EVCs, over external transport networks not directly involved in the end-to-end Ethernet service. The NI-NNI is intended to preserve the characteristic information of a subscriber's flow. The NI-NNI also provides a reference point for demarcation between the two MEN service provider interfaces attached via public transport networks. Examples of such public transport networks include OTN, SDH/SONET, ATM, Frame Relay, RPR, etc. The term NI-NNI is also used in this chapter to refer to the protocol exchange that exists at the NI-NNI reference point and the
architectural element in each of the MENs responsible for the support of the NI-NNI delineation functions (NI-NNI IWF).
9.3.3.5 Service Interworking Network-to-Network Interface (SI-NNI)
The Service Interworking NNI (SI-NNI) is an interface that supports the interworking of an MEF service with services provided via other service-enabling technologies (e.g., Frame Relay, ATM, IP, etc.). The SI-NNI provides a reference point for demarcation between a MEN and another public service network. Examples of other public service networks include ATM, Frame Relay, and IP. The term SI-NNI is also used in this chapter to refer to the protocol exchange that exists at the SI-NNI reference point and the architectural element in each of the MENs responsible for the support of the SI-NNI delineation functions (SI-NNI IWF).
9.3.3.6 Other MEN Access Arrangements
The basic MEN reference model illustrated in Figure 9-2 presumes a one-to-one relationship between the MEN subscriber port and the MEN service provider port. Indirect means of access into the service provider equipment supporting the UNI functions, e.g., via a so-called feeder or Access Network, may be required in deployment scenarios where Ethernet services are introduced over preexisting access technologies (e.g., PDH, SONET/SDH, or Hybrid Fiber/Coaxial networks) or over alternative Ethernet transport facilities.
9.3.3.6.1 Service Node Interface (SNI)
The Service Node Interface (SNI) is an interface that supports the extension of the MEF UNI capabilities across an intermediate access network not directly involved in the end-to-end Ethernet service. As illustrated in Figure 9-6, the SNI provides a reference point for demarcation between the network location where Ethernet Service attributes are enforced (MEF ESM) and a packet-aware Access Network that aggregates subscriber flows at a packet level (layer 2) into a common transport channel. In this scenario, the UNI and SNI reference points are equivalent to T (TB) and V (VB) in the ISDN (B-ISDN) terminology. The Access Network strictly provides a packet-based transport function for the access portion of the connection between the subscriber and the MEN. For this reason, the SNI reference point is also informally referred to as the Virtual UNI reference point. As such, the SNI is intended to preserve in a transparent manner the characteristic information of a subscriber's flow. The specific location of the SNI reference point is specified in the MEF TMF specification.
Figure 9-6. Reference points for other access arrangements into a MEN (subscribers reach the Metro Ethernet Network across an intermediate access network via the SNI reference point)
9.3.4 MEN Architectural Components
A functional modeling approach is used in this framework to represent the various architectural components of all the layer networks of a MEN. The functional model is based on the architectural constructs created to describe connection-oriented and connectionless transport networks in ITU-T Recommendations G.805 [3] and G.809 [4]. This chapter describes the main architectural concepts defined by these two Recommendations as related to the MEF architecture framework. Three types of architectural components are identified: (1) Topological Components, (2) Transport Components, and (3) Processing Components. Topological and transport components are used to represent abstract connectivity constructs. Processing components are used to represent abstract system components that affect information transfer. These concepts are further discussed in this section. Formal definitions of topological components, transport entities, and reference points can be found in ITU-T Recommendations G.805 and G.809.
9.3.4.1 Topological Components
A topological component provides the highest level of abstraction for the description of an architectural component of a transport network. It defines inclusion/exclusion relationships between sets of like functions and associated reference points. There are four topological components of interest:
• Layer Network: A complete set of logical or physical ports (see also access group) of the same type that may be associated for the purpose of transferring information. The transferred information is in terms of a well-defined traffic unit of the particular layer network and is termed its Characteristic Information (CI).
• Subnetwork: A partition of a layer network used to affect the steering of specific user data within a portion of a layer network. In ITU-T terminology, the term subnetwork is reserved for connection-oriented networks; the term Flow Domain is used in the context of connectionless layer networks, such as Ethernet.
• Link: A (fixed) connectivity relationship between a subnetwork or access group and another subnetwork or access group. The terms Flow Point Link and Flow Point Pool Link are used in the context of connectionless layer networks, such as Ethernet.
• Access Group: A group of co-located logical or physical ports, with associated processing functions, that are connected to the same subnetwork or link. Basically, an access group represents the logical access ports into a given subnetwork or flow domain.
CI is defined in ITU-T Recommendation G.805. It is used to specify units of information, including a specific unit format, which are transferred on connections within the given layer network. The CI format is always defined in a technology-specific manner for each network layer in its associated architecture framework document. The Ethernet MAC layer, the IP network layer, the SONET/SDH High Order/Low Order layers, and even a fiber infrastructure are examples of layer networks.
9.3.4.2 Transport Components
A transport component, or transport entity, provides the means to affect the transfer of information between reference points. Two types of transport entities, and their associated reference points, are defined:
• Connection: A transport entity that represents an aggregation of one or more connection-oriented traffic units with an element of common routing. Referred to as a Flow in the context of a connectionless layer network.
• Connection Point: A reference point that represents a location of transfer of connection-oriented traffic units between topological components. Referred to as a Flow Point/Flow Point Pool in the context of a connectionless layer network.
• Trail: A transport entity that represents the transfer of monitored and adapted characteristic information of a client layer network between two access points. Typically used to represent the association between source and destination(s) on a per-traffic-unit basis. Referred to as a Connectionless Trail in the context of a connectionless layer network.
• Trail Termination Point: A reference point that represents a location of insertion/extraction of monitored and adapted information characteristic to a given layer network (as opposed to the information presented by the client of the layer network). Referred to as a Flow Termination Point in the context of a connectionless layer network.
• Access Point: A reference point where the output (input) of a trail termination is bound to the input (output) of an adaptation function.
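To make the G.805/G.809 vocabulary above concrete, the following Python sketch models the connectionless variants of these constructs as plain data classes. The class names mirror the text; the fields and example values are illustrative assumptions, not MEF-defined structures.

```python
from dataclasses import dataclass, field

@dataclass
class FlowPoint:
    """Connectionless analogue of a Connection Point."""
    name: str

@dataclass
class Flow:
    """Connectionless analogue of a Connection: a unidirectional
    association between two flow points."""
    src: FlowPoint
    dst: FlowPoint

@dataclass
class FlowDomain:
    """Connectionless analogue of a Subnetwork: steers flows within a
    portion of a layer network."""
    name: str
    flow_points: list = field(default_factory=list)

@dataclass
class LayerNetwork:
    """A layer network, identified by its Characteristic Information."""
    name: str
    characteristic_info: str                 # e.g. "Ethernet MAC frame"
    flow_domains: list = field(default_factory=list)

# Assemble a tiny ETH layer network: one flow domain, two flow points,
# and a flow between them (names are hypothetical).
eth = LayerNetwork("ETH", "Ethernet MAC frame")
fd = FlowDomain("metro-core")
fd.flow_points += [FlowPoint("uni-1"), FlowPoint("uni-2")]
eth.flow_domains.append(fd)
flow = Flow(fd.flow_points[0], fd.flow_points[1])
```

The same shapes would apply recursively at each layer network in the stack, with the server layer's trails providing the links of the client layer.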
9.3.4.3 Processing Components (the MEN Functional Elements)
A processing component provides the actual means to affect the transfer of information at a given reference point. The concept of a Functional Element (FE) is used in this architecture framework to represent the specific set of processes, or functions, within the MEN services or transport network that act on a particular collection of input data to produce a specific collection of output data. Functional elements may also be used to represent compound functions, i.e., a collection of other predefined functional elements. It is the intent of the MEF architecture framework to adopt existing functional elements, and associated functional models, from accredited standards organizations and industry forums. In particular, functional elements for Ethernet LANs are derived from the IEEE 802.3-2002 and IEEE 802.1Q-1998 specifications. Functional models for connection-oriented functional elements are derived from ITU-T Recommendation G.805; functional models for connectionless functional elements are derived from ITU-T Recommendation G.809. Detailed functional models for functional elements in the MEN architecture framework are outside the scope of this chapter.
9.3.4.3.1 Generic MEN Functional Elements
Two generic functional elements are defined in ITU-T Recommendations G.805 and G.809 to distinguish the processes required to (1) adapt a client signal for transport across a server layer network and (2) generate a traceable flow/connection across the layer network. Both source and sink versions of the processes are defined:
• Adaptation Function: A transport processing function that converts the server layer network trail information into the characteristic information of the client layer network (and vice versa).
• Termination Function: A transport processing function that accepts adapted characteristic information from a client layer network at its input, adds information to allow the associated trail to be monitored (if supported), and presents the characteristic information of the layer network at its output(s) (and vice versa).
Note that the Adaptation Function is an interlayer function that contains processing aspects of both the client and the server layer networks. In addition, ITU-T Recommendation G.806 [5] defines a generic functional element to steer flows within a network:
• Connection Function: A transport processing function that transfers information (potentially transparently) from a given input to one or more outputs. Note that a Connection Function is the smallest subnetwork/flow domain (also referred to as a Flow Domain Function).
9.3.5 MEN Layer Relationship to the Architecture Model Components
This section discusses the relationships between the ETH, TRAN, and APP layers, the operational planes, and the generic topological components.
9.3.5.1 Operational Planes and the MEN Layer Networks
Three operational planes are distinguished:
• Data Plane
• Control Plane
• Management Plane
The Data Plane, also referred to as the user/transport/forwarding plane, provides the functional elements required to steer the subscriber flow, and supports the transport of subscriber traffic units among MEN NEs. The Control Plane provides the functional elements that support distributed flow management functions among NEs participating in the MEN data plane. The control plane also provides the signaling mechanisms necessary to support distributed setup, supervision, and connection release operations, among other flow-control functions. The Management Plane provides the functional elements that support Fault, Configuration (including flow and/or connection configuration), Accounting, Performance, and Security (FCAPS) functions, as well as any related Operations, Administration, and Maintenance (OAM) tools. Subscribers and external networks connected to a MEN are likely to include similar layers and planes. The information exchange between the subscriber and MEN Management and Control Planes across defined reference points is restricted (and may be absent) according to the implementation agreements for the UNI, NNI, and other external IWFs.
9.3.5.2 MEN Network Reference Model and the Topological Components
A MEN itself consists of physical components (e.g., network elements, ports, etc.) and logical components (e.g., meters, policers, shapers, virtual switches, links, etc.). A MEN architecture is further described by defining the associations between points in the network and the interconnected topological and/or functional components. The partition of a MEN into layer networks places a bound on the scope of the various MEN topological and functional components. Access groups connect customers from the client layer network to the services supported by the server layer network in a client/server relationship. Formally, a reference point describes any binding between the input and output of processing functions or transport entities (see ITU-T Recommendations G.805/G.809). The relationship between the MEN reference model, its layer networks, and its topological components is illustrated in Figure 9-7. Note that within a particular service provider network, multiple Ethernet layer network domains (e.g., ETH Subnetwork A and ETH Subnetwork B) and different kinds of transport technologies (e.g., TRAN Subnetwork A, TRAN Subnetwork B, and TRAN Subnetwork C) may be used to instantiate a particular MEN.
Figure 9-7. Sample relationship between the MEN Reference Model and the Architectural Components (an optional Application Services Layer with APP subnetworks A and B; the Ethernet Services Layer; and a Transport Services Layer in which TRAN functional elements interconnect TRAN subnetworks A, B, and C via links)
Note also that a given layer network (e.g., IP, MPLS, SONET/SDH, PDH) may play a dual role with respect to the Ethernet Services layer:
• as a transport layer providing transport services to the Ethernet service layer, and
• as an application service layer using the services provided by the Ethernet service layer.
Furthermore, the Transport and Application Services Layers may be further partitioned into additional layer networks and associated protocols. This generic layer network modeling principle is illustrated in Figure 9-8. For a compound functional element XYZ that spans multiple layer networks, the terms APP XYZ, ETH XYZ, and TRAN XYZ are used to refer to the specific set of functional elements within the compound functional element that form part of the APP, ETH, and TRAN layers, respectively (if present).
[Figure: example protocol stacks over the Ethernet Services Layer (Ethernet P802.1ad), e.g., VoIP over TCP/UDP over Ethernet 802.3; server layers include MPLS VC and Tunnel LSPs, ATM VC/VP, PDH, LOVC/HOVC, ODUk/OTUk, OCh, STM-N, OTM-n, the 802.3 PHY, and others, over a medium (fiber, copper, coaxial, wireless, etc.)]
Figure 9-8. Sample decomposition of a MEN into layer networks and protocol stacks
9.3.5.3 MEN Reference Link Model
The term link is used in this architecture framework to refer to a topological component that describes a fixed connectivity relationship, and available transport capacity, between a pair of subnetworks (flow domains), between a subnetwork (flow domain) and an access group, or between a pair of access groups. From the point of view of the MEN architecture framework, links are further classified according to the MEN layer network they support and their relationship to the internal/external reference points. Links may be used in a variety of different arrangements to produce more complex link models. For instance, a link at layer network N may be instantiated by one or more links at layer network N-1 to create a nested link. Multiple layer network N links may be aggregated into a single layer network N link to create a compound link. In this arrangement, each aggregated link is also referred to as a component link. With respect to the MEN reference points (Figure 9-4), this framework document classifies links into two classes:
• Access Link: A link that provides connectivity across a UNI reference point
• Trunk Link: A link that provides connectivity across an NNI reference point
With respect to the MEN layer network(s), this framework document classifies links into three classes:
• APP Link: A link in (one of) the APP layer(s)
• ETH Link: A link in the ETH layer
• TRAN Link: A link in (one of) the TRAN layer(s)
Multiple links may exist between any given subnetwork (flow domain) and access group, or between pairs of subnetworks (flow domains) or access groups. While links are established and maintained at the time scale of the server layer network (e.g., EMS/NMS-provisioned links vs. control plane-provisioned links), they are not limited to being provided by a server layer network and can also be provided by client layer network connections. Figure 9-9 illustrates the high-level relationship between link types and the MEN UNI and E-NNI reference points.
Separate specifications will provide detailed relationships between processing functions and reference points for any MEF-specified interface.
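The link taxonomy above lends itself to a small data model. The following Python sketch is a rough illustration only: the class and field names are ours, not the MEF's. It classifies links by layer network and reference point and shows a compound link carrying its component links:

```python
from dataclasses import dataclass
from enum import Enum

class Layer(Enum):
    APP = "APP"
    ETH = "ETH"
    TRAN = "TRAN"

class ReferencePoint(Enum):
    UNI = "UNI"   # access links cross a UNI reference point
    NNI = "NNI"   # trunk links cross an NNI reference point

@dataclass(frozen=True)
class Link:
    layer: Layer
    reference_point: ReferencePoint
    component_links: tuple = ()   # non-empty for a compound link

    @property
    def kind(self) -> str:
        role = "Access" if self.reference_point is ReferencePoint.UNI else "Trunk"
        return f"{self.layer.value} {role} Link"

# A compound ETH access link aggregating two ETH component links
eth_access = Link(
    Layer.ETH, ReferencePoint.UNI,
    component_links=(Link(Layer.ETH, ReferencePoint.UNI),
                     Link(Layer.ETH, ReferencePoint.UNI)),
)
print(eth_access.kind)  # ETH Access Link
```

A nested link would instead be modeled by holding layer N-1 links beneath a layer N link; the classification logic is the same.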
[Figure: logical and physical connectivity views showing an APP Link, an ETH Access Link, an ETH Trunk Link, a TRAN Trunk Link, and TRAN Access Links relative to the MEN UNI and E-NNI reference points (MEN 2 shown)]
Figure 9-9. MEN Reference Link Model (example)
ACKNOWLEDGMENTS
Metro Ethernet Forum, Enrique Hernandez-Valencia
9.4. REFERENCES
[1] Metro Ethernet Forum, www.metroethernetforum.org.
[2] Technical Specification MEF 4, Metro Ethernet Network Architecture: Framework, Part 1: Generic Framework.
[3] ITU-T Recommendation G.805, Generic functional architecture of transport networks, March 2000.
[4] ITU-T Recommendation G.809, Functional architecture of connectionless layer networks, March 2003.
[5] ITU-T Recommendation G.806, Characteristics of transport equipment - Description methodology and generic functionality, February 2004.
Chapter 10 ETHERNET SERVICES OVER METRO ETHERNET NETWORKS
Bob Klessig, Cisco Systems; Metro Ethernet Forum Member of the Board and Co-Chair of the Technical Committee
10.1. INTRODUCTION
One of the highest priorities for the Metro Ethernet Forum has been the establishment of standards for Ethernet services. These standards will allow subscribers to plan and integrate Ethernet services into their overall networks, including integration with services from more than one service provider. They will also allow equipment vendors to implement capabilities in both service provider equipment and subscriber equipment such that Ethernet services can be efficiently provided by the service providers and accessed by the subscribers. To accomplish these goals, Ethernet services must be described in precise technical detail. The descriptions of the fundamental constructs for Ethernet services are presented in Section 10.2. Service features are described in Section 10.3. The material in this chapter is based on [1].
10.2. SERVICES MODEL
The services model for Ethernet services is portrayed in Figure 10-1 and is further described in the following subsections.
[Figure: the Customer Edge (CE, e.g., a router) attached to the Metro Ethernet Network at the User Network Interface (UNI), where the service attributes are defined]
Figure 10-1. Ethernet services model
10.2.1 Customer Edge View
The MEF Ethernet services are described from the point of view of the subscriber equipment, referred to as the Customer Edge (CE). Thus the services are defined only in terms that are observable to the CE. The types of technology and the architecture inside the Metro Ethernet Network (MEN) are invisible. By defining services in this manner, the MEN can evolve independently of the evolution of the subscriber's network without disruption of the service that the subscriber is given.
10.2.2 User Network Interface
The User Network Interface (UNI) is the physical demarcation point between the responsibility of the service provider and the responsibility of a single subscriber. A UNI must be dedicated to a single subscriber. A UNI is frequently an RJ-45 socket on a service provider-owned Ethernet switch that is placed on the subscriber's premises. Another typical example is an RJ-45 socket on a service provider-owned patch panel. As implied in Figure 10-1, another way of viewing the UNI is that it is located where the CE begins and the MEN ends when looking outward from the MEN. It follows that the MEF services are described as the UNI-to-UNI behavior. For example, frame loss performance is defined in terms of frames received by the MEN at a UNI and frames delivered by the MEN to one or more other UNIs. A key goal in the development of the MEF Ethernet services is that existing, Standard Ethernet devices should be able to attach to an MEN (at a UNI) and successfully work with the service. This makes MEF Ethernet
services the first wide-area services with over 100 million existing devices capable of using the services. It follows that the protocols operating across the UNI between the CE and the MEN must be Standard Ethernet [3]. (See Appendix A for background material regarding Standard Ethernet.)
10.2.3 Service Frame
There are several physical layers defined in [3], and it is expected that different UNIs will have different physical layers. For example, an enterprise might have a Gigabit Ethernet physical layer UNI at headquarters and Fast Ethernet physical layer UNIs at branch offices. However, all UNIs have the same Layer 2 protocol, as defined in [3]. Service frames are Ethernet frames that are exchanged between the MEN and the CE across the UNI. A service frame sent from the MEN to the CE at a UNI is called an egress service frame. A service frame sent from the CE to the MEN at a UNI is called an ingress service frame.

10.2.3.1 Format
The service frame is just a regular Ethernet frame, beginning with the first bit of the Destination Address and running through the last bit of the Frame Check Sequence (FCS). As per [3], a service frame that contains an IEEE 802.1Q tag can be up to 1522 bytes in length, and a service frame that does not contain an IEEE 802.1Q tag can be up to 1518 bytes.

10.2.3.2 Delivery Transparency
When an ingress service frame is delivered by the MEN to one or more UNIs, the fields of the egress service frame are identical to those of the ingress service frame except as follows:
• The egress service frame may have an IEEE 802.1Q tag when the corresponding ingress service frame does not.
• The egress service frame may not have an IEEE 802.1Q tag when the corresponding ingress service frame does have such a tag.
• The egress service frame may have an IEEE 802.1Q tag that has a different value from the IEEE 802.1Q tag in the corresponding ingress service frame.
In all three cases, the FCS will be recalculated and thus will probably be changed from ingress to egress.
10.2.3.3 Error Detection
If an ingress service frame has an invalid FCS, it will be discarded by the MEN. This is a consequence of the fact that with MEF Ethernet services, the disposition of an ingress service frame is based on the content of the frame header. If a frame header contains bit errors, then the service frame could be delivered to the wrong destination. Discarding service frames with errors avoids this undesirable result.
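The discard rule can be illustrated with the FCS computation itself. The Ethernet FCS is a CRC-32 computed over the frame from the Destination Address through the payload; the sketch below uses Python's zlib.crc32, which implements the same polynomial, and glosses over the bit-ordering details of a real PHY. The frame contents and helper names are illustrative:

```python
import zlib

def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Append the 4-byte FCS (CRC-32, transmitted least significant byte first)."""
    fcs = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + fcs.to_bytes(4, "little")

def men_accepts(frame: bytes) -> bool:
    """The MEN discards any ingress service frame whose FCS does not match."""
    body, fcs = frame[:-4], frame[-4:]
    return zlib.crc32(body) & 0xFFFFFFFF == int.from_bytes(fcs, "little")

frame = append_fcs(b"\x01\x02\x03\x04\x05\x06" + b"\xaa" * 60)  # toy frame body
print(men_accepts(frame))                         # True
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # single bit error in the header
print(men_accepts(corrupted))                     # False
```

A single flipped header bit invalidates the FCS, so the MEN drops the frame rather than risk delivering it to the wrong destination.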
10.2.4 Ethernet Virtual Connection
In theory, an MEN could behave just like an Ethernet shared-medium segment, where each ingress service frame is replicated and delivered to all other UNIs. This LAN-like behavior can be acceptable within a single organization but is clearly not acceptable for a public network. In an MEN there must be some way to limit communication between UNIs. In MEF Ethernet services, this mechanism is called an Ethernet Virtual Connection (EVC). Each instance of an MEF Ethernet service is based on an EVC.

10.2.4.1 Definition of EVC
An EVC is defined as an association of two or more UNIs. Such UNIs are said to be in the EVC. An ingress service frame sent into an EVC can be delivered to one or more of the other UNIs in the EVC. It cannot be delivered to a UNI that is not in the EVC or back to the originating UNI. Service frames cannot leak into or out of an EVC. An EVC can be thought of as a form of Layer 2 VPN. The UNIs in the EVC seem to be in their own Layer 2 network. However, it is possible for a given UNI to be in multiple EVCs, which means that an EVC is more than a simple Layer 2 VPN. The basic EVC concept is illustrated in Figure 10-2. Two service instances are shown in the figure. The first service instance, based on EVC 1, allows headquarters to communicate with the backup data center. The second service instance, based on EVC 2, allows headquarters and the branches to communicate among themselves. The UNI at headquarters is in both of the EVCs. The router at headquarters would typically be configured with two sub-interfaces on the port attached to the UNI. Communications between the branches and the backup data center must go via the router at headquarters that would route between the two sub-interfaces.
[Figure: Headquarters, Branch 1, Branch 2, and a Backup Data Center connected by EVC 1 and EVC 2]
Figure 10-2. Example of two service instances
There are two types of EVCs, as detailed in the following two subsections.

10.2.4.2 Point-to-Point EVC
A point-to-point EVC associates exactly two UNIs. EVC 1 in Figure 10-2 is an example of a point-to-point EVC. For a point-to-point EVC, all ingress service frames at one UNI, with the possible exception of Layer 2 control protocol messages (see Section 10.3.8.1), will typically be delivered to the other UNI.

10.2.4.3 Multipoint-to-Multipoint EVC
A multipoint-to-multipoint EVC associates two or more UNIs. EVC 2 in Figure 10-2 is an example of a multipoint-to-multipoint EVC. At first glance, it might appear that a multipoint-to-multipoint EVC with two UNIs is the same as a point-to-point EVC. However, they are different because additional UNIs can be added to the multipoint-to-multipoint EVC. There are more options for delivery of ingress service frames (that are not Layer 2 control protocol messages) for a multipoint-to-multipoint EVC than for a point-to-point EVC. Broadcast service frames (all-ones Destination MAC address) and multicast service frames (multicast Destination MAC address) are replicated and delivered to all other UNIs in the EVC. Unicast service frames (unicast Destination MAC address) can be handled in one of two ways:
• They can be replicated and delivered to all other UNIs in the EVC. This makes the EVC behave like a shared-media Ethernet.
• The MEN can learn which MAC addresses are "behind" which UNIs by observing the Source MAC addresses in service frames, and deliver a service frame to only the appropriate UNI when the Destination MAC address has been learned. When the Destination MAC address has not been learned, the service frame is replicated and delivered to all other UNIs in the EVC. This makes the MEN behave like a learning bridge.
It is important to know which technique is being used on a multipoint-to-multipoint EVC, since the behavior can impact the bandwidth consumed at each UNI and the compatibility with higher layer protocols.
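The learning-bridge option can be sketched as follows. This is an illustrative toy model with shortened MAC strings; a real MEN would also age out learned entries:

```python
class LearningEVC:
    """Toy multipoint-to-multipoint EVC that learns which MACs sit behind which UNIs."""

    BROADCAST = "ff:ff:ff:ff:ff:ff"

    def __init__(self, unis):
        self.unis = set(unis)
        self.mac_table = {}  # learned Source MAC address -> UNI it was seen behind

    def deliver(self, ingress_uni, src_mac, dst_mac):
        """Return the set of UNIs that receive this ingress service frame."""
        self.mac_table[src_mac] = ingress_uni       # learn the source address
        others = self.unis - {ingress_uni}          # never back to the originating UNI
        if dst_mac == self.BROADCAST or dst_mac not in self.mac_table:
            return others                           # broadcast/unknown unicast: flood
        return {self.mac_table[dst_mac]} & others   # known unicast: one UNI only

evc = LearningEVC({"HQ", "Branch1", "Branch2"})
print(sorted(evc.deliver("HQ", "aa:aa", "bb:bb")))       # ['Branch1', 'Branch2'] (flooded)
print(sorted(evc.deliver("Branch1", "bb:bb", "aa:aa")))  # ['HQ'] (aa:aa was learned)
```

The first frame floods because its destination is unknown; once "aa:aa" has been seen behind HQ, replies to it consume bandwidth at only that one UNI.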
10.2.5 Identifying an EVC at a UNI
In general, there can be more than one service instance, i.e., EVC, at a given UNI, as shown at the UNI at Headquarters in Figure 10-2. This means that for each service frame at the UNI, there must be a way to determine the EVC with which it is associated. This function is accomplished with the Customer Edge VLAN Identifier (CE-VLAN ID) and the CE-VLAN ID/EVC map, as described in the following two subsections.

10.2.5.1 CE-VLAN ID
There are 4095 CE-VLAN IDs, numbered 1, 2, ..., 4095. The CE-VLAN ID is derived from the content of the service frame as follows:
• For a service frame that has an IEEE 802.1Q tag, and for which the 12-bit VLAN ID in the tag is not zero, the CE-VLAN ID is equal to the VLAN ID in the tag.
• Untagged and priority-tagged service frames have the same CE-VLAN ID, and the CE-VLAN ID value is configurable to any value in the range 1, ..., 4094 at each UNI.
The special treatment of untagged and priority-tagged service frames is consistent with IEEE 802.1Q [4]. An IEEE 802.1Q-compliant Ethernet switch will treat untagged and priority-tagged frames as belonging to a default VLAN, and the default VLAN is configurable on each port of the switch.

10.2.5.2 CE-VLAN ID/EVC Map
The CE-VLAN ID/EVC map associates each EVC at the UNI with one or more CE-VLAN IDs.
[Figure: service frame formats (untagged, priority tagged, tagged with VID 1 through 4095) on the left, mapped via CE-VLAN IDs to EVCs on the right]
Figure 10-3. Example of a CE-VLAN ID/EVC map
Figure 10-3 shows an example of a CE-VLAN ID/EVC map. In this case, CE-VLAN ID 1 is mapped to the blue EVC, CE-VLAN ID 2 is mapped to the red EVC, and CE-VLAN ID 4094 is mapped to the green EVC. A tagged ingress service frame with VLAN ID 2 will be delivered by the MEN according to the properties of the red EVC. An untagged ingress service frame will also be delivered by the MEN according to the properties of the red EVC. A tagged ingress service frame with VLAN ID 1 will be delivered by the MEN according to the properties of the blue EVC. In this example, an ingress service frame with VLAN ID not equal to 1, 2, or 4094 will be discarded because the associated CE-VLAN ID is not mapped to an EVC. An egress service frame from the blue EVC will be tagged with VLAN ID 1. An egress service frame from the red EVC could apparently have three formats, namely, tagged with VLAN ID 2, untagged, or priority tagged. If CE-VLAN ID preservation (see Section 10.3.1) is not in force, the egress service frame is untagged. Section 10.3.1 describes the format when CE-VLAN ID preservation is in force.
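The CE-VLAN ID derivation and the map lookup of Figure 10-3 can be sketched together. The function names and the untagged-default constant are illustrative; the map contents mirror the figure:

```python
UNTAGGED_DEFAULT_CE_VLAN_ID = 2   # configurable per UNI, range 1..4094

def ce_vlan_id(tag_vid=None):
    """Derive the CE-VLAN ID from a service frame.

    tag_vid is None for an untagged frame, 0 for a priority-tagged frame,
    otherwise the 12-bit VLAN ID carried in the IEEE 802.1Q tag.
    """
    if tag_vid is None or tag_vid == 0:       # untagged / priority tagged
        return UNTAGGED_DEFAULT_CE_VLAN_ID
    return tag_vid

# CE-VLAN ID/EVC map from Figure 10-3
ce_vlan_id_evc_map = {1: "blue EVC", 2: "red EVC", 4094: "green EVC"}

def evc_for_frame(tag_vid=None):
    """Return the EVC for a frame, or None (the frame is discarded)."""
    return ce_vlan_id_evc_map.get(ce_vlan_id(tag_vid))

print(evc_for_frame(tag_vid=2))   # red EVC
print(evc_for_frame())            # red EVC (untagged maps to the default CE-VLAN ID)
print(evc_for_frame(tag_vid=1))   # blue EVC
print(evc_for_frame(tag_vid=3))   # None: CE-VLAN ID 3 is not mapped, frame is discarded
```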
10.3. SERVICE FEATURES
The basic constructs of the service model described in Section 10.2 provide a flexible foundation for the definition of a number of services. The details of these various services are determined by the service features that are described in this section.
10.3.1 CE-VLAN ID Preservation
If a subscriber attaches IEEE 802.1Q bridges as CEs to the UNIs in an EVC, it is possible that the subscriber will want to maintain his or her VLAN structure among the sites at the UNIs. In this case, the CE will generate tagged service frames with VLAN IDs that correspond to the subscriber's VLANs. It would be very inconvenient if the MEN changed those VLAN IDs, since the subscriber would be forced to use different VLAN ID values for the same VLAN and coordinate with the service provider in setting those values. To address these issues, the service provider can offer CE-VLAN ID preservation. When an EVC has this feature, the relationships between ingress and egress service frames shown in Table 10-1 are maintained by the MEN.

Table 10-1. CE-VLAN ID preservation
Ingress Service Frame | Egress Service Frame
Untagged | Untagged
Tagged | Tagged with VLAN ID equal to that of the ingress service frame
Clearly, CE-VLAN ID preservation is useful in situations like that above. However, as we shall see in Section 10.3.3, there are other scenarios where it is valuable to not have this feature.
10.3.2 All-to-One Bundling Map
For many subscribers, all that is required is a single service per UNI needing minimal configuration of the CE. This has been the typical configuration for the Transparent LAN services that have been offered by service providers since the early 1990s. The All-to-One Bundling Map is the way to achieve this scenario. With the All-to-One Bundling Map, the CE-VLAN ID/EVC map is configured such that all CE-VLAN IDs map to a single EVC at the UNI. In addition, the EVC must have the CE-VLAN ID preservation feature enabled. Since all CE-VLAN IDs are mapped to a single EVC, it follows that there can be only one EVC at the UNI. Figure 10-4 shows an example of an All-to-One Bundling Map. As can be seen in the figure, any ingress service frame, no matter what its format, is mapped to the red EVC. In addition, CE-VLAN ID preservation means that the format and VLAN ID, if tagged, will not be changed by the MEN. Note that there is no need for the subscriber and service provider to coordinate the use of the VLAN ID values in tagged service frames.
[Figure: all service frame formats (untagged, priority tagged, and tagged with any VID from 1 to 4095) map to a single EVC]
Figure 10-4. Example of All-to-One Bundling Map
The high degree of transparency with the All-to-One Bundling Map makes the resulting service much like a dedicated private Ethernet. When the EVC is point-to-point, the result is easily substituted for an existing private line such as a DS3. When the EVC is multipoint-to-multipoint, the result is much like the classical Transparent LAN service. Figure 10-5 illustrates an example of how these two configurations could be used by an enterprise to connect headquarters to the branches and to connect headquarters to a disaster recovery service provider. The multipoint-to-multipoint EVC service can be thought of as a backbone LAN connecting the sites of an enterprise and is frequently called LAN Extension.
[Figure: headquarters connected to a disaster recovery service provider over a point-to-point EVC (private line replacement) and to branches over a multipoint-to-multipoint EVC (LAN Extension), with a bridge or router as the CE]
Figure 10-5. Examples of the use of All-to-One Bundling
10.3.3 Service Multiplexing
In the example of Figure 10-5, a separate port on the CE is required for each EVC at headquarters. In many cases, this setup is acceptable and even desirable. But when many EVCs need to be accessed at a given location, consuming a port on the CE for each can both be costly in equipment and yield complex physical arrangements. The way to avoid these disadvantages is to enable multiple EVCs on a UNI. Service multiplexing is the term used to describe the case where multiple EVCs can be made available at a UNI. The two approaches to service multiplexing are described in the following subsections.

10.3.3.1 One-to-One Map
When the CE-VLAN ID/EVC map is such that at most one CE-VLAN ID is mapped to an EVC, it is called a One-to-One Map.
[Figure: each mapped CE-VLAN ID (e.g., 3 and 4094) is associated with exactly one EVC; all other frame formats are unmapped]
Figure 10-6. Example of a One-to-One Map
Figure 10-6 shows an example of a One-to-One Map. At this UNI, only ingress tagged service frames with VLAN ID 3 and 4094 will be delivered to the destination UNI(s). Egress service frames at this UNI will be tagged with only VLAN ID values 3 and 4094. When the EVCs are point-to-point, the resulting service is analogous to Frame Relay. Figure 10-7 shows an example of the use of point-to-point EVCs that illustrates this analogy. The ISP router is able to use a single port, e.g., Gigabit Ethernet, to reach several ISP customers with a point-to-point EVC to each. The CE-VLAN ID/EVC map for each UNI is shown in the figure, and it can be seen that the CE-VLAN ID is different at each UNI in an EVC. In particular, each ISP customer uses CE-VLAN ID 2000 to identify the EVC to the ISP, while the CE-VLAN IDs used at the ISP UNI are 178, 179, and 180. In this case, not using CE-VLAN ID preservation is
valuable. It makes it simple for the ISP customers to configure their routers by using a well-known VLAN ID value (2000) and still makes it possible for the ISP router to identify the EVC for each ISP customer. This setup is analogous to Frame Relay, where the DLCI that identifies the Permanent Virtual Connection (PVC) can be different at each UNI.
[Figure: three point-to-point EVCs from an ISP router to ISP Customers 1, 2, and 3; the ISP UNI uses CE-VLAN IDs 178, 179, and 180, while each customer UNI uses CE-VLAN ID 2000 (Frame Relay PVC replacement)]
Figure 10-7. Example of the use of a One-to-One Map
[Figure: ISP Customers 1 and 2 each reach two Internet Service Provider routers through service-multiplexed UNIs, providing redundant service access]
Figure 10-8. Example of redundant service access
The One-to-One Map can be used with any mix of point-to-point and multipoint-to-multipoint EVCs. Figure 10-8 shows an example of redundant access to an ISP. Service multiplexing is shown at each UNI to an ISP router. Routing protocols running among the ISP and ISP customer routers will control which ISP router is used for each customer and will reroute the packets to the redundant router should the first ISP router fail or otherwise become unreachable.

10.3.3.2 Bundling Map
It is also possible to have service multiplexing and multiple CE-VLAN IDs map to an EVC. This configuration is called a Bundling Map. Figure 10-9 shows an example. In this example, CE-VLAN IDs 1 and 3 are mapped to the red EVC.
[Figure: CE-VLAN IDs 1 and 3 both map to the red EVC; other CE-VLAN IDs map to other EVCs or are unmapped]
Figure 10-9. Example of a Bundling Map
The use that is envisioned for the Bundling Map is an enhancement of LAN Extension. Consider the LAN Extension shown in Figure 10-5 and suppose that IEEE 802.1Q bridges are used as CEs. The service in Figure 10-5 is not aware of the VLANs being used by the enterprise. So, for example, a broadcast frame sent on a given VLAN will be delivered to all the enterprise sites, only to be discarded by the CE if that VLAN is not present at a site. With a Bundling Map approach, traffic on a given enterprise VLAN can be directed to an EVC going to only the enterprise sites where the VLAN is present. This yields more efficient use of the WAN bandwidth at the cost of the subscriber and service provider having to coordinate the content of the CE-VLAN ID/EVC map. When there is a Bundling Map, an EVC that has more than one CE-VLAN ID mapped to it must have the CE-VLAN ID preservation feature.
This requirement is motivated by the envisioned use of a Bundling Map described in the previous paragraph.
10.3.4 Feature Constraints
The CE-VLAN ID/EVC map provides a high degree of flexibility and is part of the power of Ethernet services. The price for this flexibility is complexity, in the sense that not all configurations of the map can coexist. These constraints are described in the following subsections.

10.3.4.1 Maps at UNIs in an EVC
If a UNI in a given EVC has an All-to-One Bundling Map, then all UNIs in the EVC must have an All-to-One Bundling Map. To see why this is necessary, consider a point-to-point EVC called X, with UNIs A and B and the CE-VLAN ID/EVC maps shown in Figure 10-10. If the CE at UNI A sends a service frame with CE-VLAN ID = 4002, the CE-VLAN ID preservation feature mandates that the resulting egress service frame at UNI B have CE-VLAN ID = 4002. But the CE at UNI B would not be able to determine whether this service frame came from EVC X or EVC Y. And any attempt to respond with a service frame with CE-VLAN ID = 4002 would result in the response being carried on EVC Y, not EVC X. The only way to avoid this mishandling of service frames is to have all UNIs in EVC X have an All-to-One Bundling Map.
[Figure: UNI A maps CE-VLAN IDs 1-4095 to EVC X (All-to-One); UNI B maps CE-VLAN IDs 1-4000 to EVC X and 4001-4095 to EVC Y]
Figure 10-10. Example of broken CE-VLAN/EVC maps
10.3.4.2 Maps at a UNI
Some CE-VLAN ID/EVC map properties are not compatible with each other. Each column in Table 10-2 shows map properties that are compatible with each other. Note that an All-to-One Bundling Map means that only one EVC can be present at the UNI.
Table 10-2. Compatible map properties
[Table: columns Combination 1 through Combination 5; rows One-to-One Map, All-to-One Map, Bundling Map, and Multiple EVCs; the check marks showing which properties coexist in each combination are not legible in this reproduction]
10.3.5 E-Line and E-LAN Service
The MEF in [2] has defined two general service types, namely, Ethernet Line service (E-Line) and Ethernet LAN service (E-LAN). An E-Line service is any service based on a point-to-point EVC, while an E-LAN service is any service based on a multipoint-to-multipoint EVC. Table 10-3 shows how E-Line and E-LAN relate to the service features described so far and how the various combinations can be used.

Table 10-3. E-Line and E-LAN service types
CE-VLAN ID/EVC Map | E-Line (Point-to-Point EVC) | E-LAN (Multipoint-to-Multipoint EVC)
All-to-One | Private Line Replacement | LAN Extension
One-to-One | Frame Relay Replacement | Redundant access to services
Bundling | VLAN Extension | VLAN Extension
10.3.6 Class of Service
In the last few years, data networks based on IP have become a platform for both classical data applications, such as email, and real-time applications, such as Voice over IP. The integration of real-time applications has generated a requirement that data networks be able to differentiate packets from different applications and provide differentiated performance according to the needs of each application. In MEF Ethernet services, this differentiation is referred to as Class of Service. The following subsections describe the method of identifying different instances of Class of Service and the different performance for each class.

10.3.6.1 Identifying Class of Service
A service provider is likely to offer several Classes of Service. For example, three classes might be offered:
• One intended for non-time-critical data such as electronic mail,
• One intended for time-critical data such as nuclear reactor sensor data, and
• One intended for real-time applications such as Voice over IP.
However, the number of instances of Classes of Service for a given subscriber could be much larger. For example, pricing considerations might cause a service provider to charge differently for a given Class of Service between different end points. New York to Boston may well be priced less than New York to London. To accommodate this flexibility, MEF services allow identification of a Class of Service instance for a service frame by either the EVC or the combination of the EVC and the user_priority field in tagged service frames. In the latter case, a set of user_priority field values can identify the Class of Service instance. The EVC, or combination of EVC and set of user_priority field values, is called a Class of Service Identifier. As an example, consider the CE-VLAN ID/EVC map shown in Figure 10-6 and suppose that the red EVC has just the Standard Class of Service while the blue EVC has both the Standard and the Premium Classes of Service. Then the Class of Service for tagged service frames could be as shown in Table 10-4.

Table 10-4. Example of identifying Class of Service
CE-VLAN ID | user_priority | Class of Service
3 | 0, 1, ..., 7 | Standard
4094 | 0, 1, 2, 3 | Standard
4094 | 4, 5, 6, 7 | Premium
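The Class of Service Identifier of Table 10-4 can be sketched as a lookup keyed by EVC and a set of user_priority values. This is an illustrative structure only, assuming CE-VLAN ID 3 maps to the red EVC and CE-VLAN ID 4094 to the blue EVC, as in the example above:

```python
# Class of Service Identifiers from Table 10-4: an EVC, or an EVC plus a set
# of user_priority values, selects the class for a tagged service frame.
cos_identifiers = {
    ("red EVC",  frozenset(range(8))):     "Standard",  # CE-VLAN ID 3
    ("blue EVC", frozenset({0, 1, 2, 3})): "Standard",  # CE-VLAN ID 4094
    ("blue EVC", frozenset({4, 5, 6, 7})): "Premium",   # CE-VLAN ID 4094
}

def class_of_service(evc, user_priority):
    """Return the Class of Service for a frame, or None if no identifier matches."""
    for (cand_evc, priorities), cos in cos_identifiers.items():
        if evc == cand_evc and user_priority in priorities:
            return cos
    return None

print(class_of_service("red EVC", 5))   # Standard
print(class_of_service("blue EVC", 2))  # Standard
print(class_of_service("blue EVC", 6))  # Premium
```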
10.3.6.2 Performance Parameters
The point of having different Classes of Service is to provide different performance to service frames according to each frame's Class of Service. The MEF [1] specifies three types of performance parameters:
• Frame Delay: The length of time it takes a service frame to go from the ingress UNI to the egress UNI.
• Frame Delay Variation: The variation in frame delay for service frames with the same Class of Service instance.
• Frame Loss: The success or failure of delivery of a service frame to the egress UNI.
The precise definitions of these parameters for a point-to-point EVC are contained in [1]. Performance for a multipoint-to-multipoint EVC will be addressed by the MEF in the future. The MEF is not planning on specifying values for these parameters that service providers should meet. Rather, by standardizing the definitions of the parameters, they can be used as a basis for service-level specifications that will allow a subscriber to compare service offerings and negotiate a meaningful service level agreement with a service provider.
10.3.6.2.1 Frame Delay Performance Objective for a Point-to-Point EVC
The frame delay for a service frame is defined as the time elapsed from the reception at the ingress UNI of the first bit of the ingress service frame until the transmission of the last bit of the service frame at the egress UNI. Figure 10-11 illustrates this definition. Of course, this definition is only meaningful for a service frame that is delivered.
[Figure: timeline from the first bit in at the ingress UNI to the last bit out at the egress UNI, in the service frame direction]
Figure 10-11. Frame delay for a service frame
The frame delay performance objective is defined by the three parameters in Table 10-5. A point-to-point EVC meets the frame delay performance objective for the interval T if at least P percent of the service frames that arrive at an ingress UNI during the interval, that have a green bandwidth profile compliance level (see Section 10.3.7), and that are delivered, have delay less than or equal to d.

Table 10-5. Parameters used to define frame delay performance objective
Parameter | Description
T | Time interval during which service frames arrive at an ingress UNI (time units)
P | Percentage
d | Delay objective (time units)
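Checking whether a sample of measurements satisfies the objective is simple percentile arithmetic. A sketch (illustrative; delays_ms stands for the measured delays of green-compliant, delivered service frames during the interval T, and the sample values are invented):

```python
def meets_frame_delay_objective(delays_ms, P, d_ms):
    """True if at least P percent of the sampled frame delays are <= d_ms.

    delays_ms: delays of green-compliant service frames delivered during the
    interval T (the interval itself is implicit in the sample).
    """
    if not delays_ms:
        return True  # assumption: with no frames the objective is trivially met
    within = sum(1 for delay in delays_ms if delay <= d_ms)
    return 100.0 * within / len(delays_ms) >= P

delays = [4.1, 4.3, 5.0, 4.8, 12.7, 4.2, 4.6, 4.4, 4.9, 4.5]
print(meets_frame_delay_objective(delays, P=90, d_ms=5.0))  # True: 9 of 10 within 5 ms
print(meets_frame_delay_objective(delays, P=95, d_ms=5.0))  # False
```

The delay variation and loss objectives of the following subsections can be verified with the same pattern over pairs of delays and over delivery outcomes, respectively.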
10.3.6.2.2 Frame Delay Variation Performance Objective for a Point-to-Point EVC
Frame delay variation is defined for pairs of delivered service frames as the delay of the first service frame minus the delay of the second service frame. The frame delay variation performance objective is defined by the four parameters in Table 10-6. A point-to-point EVC meets the frame delay variation performance objective for the interval T if at least P percent of the pairs of service frames that arrive at an ingress UNI during the interval, that have a green bandwidth profile compliance level (see Section 10.3.7), that arrive t time units apart, and that are delivered, have a frame delay variation of less than or equal to v.

Table 10-6. Parameters used to define frame delay variation performance objective
Parameter | Description
T | Time interval during which service frames arrive at an ingress UNI (time units)
P | Percentage
t | Time between arrival of two service frames at an ingress UNI (time units)
v | Delay variation objective (time units)
10.3.6.2.3 Frame Loss Performance Objective for a Point-to-Point EVC

A service frame is defined as lost if it should have been delivered but was not.^ The frame loss performance objective is defined by the two parameters in Table 10-7. A point-to-point EVC meets the frame loss performance objective for the interval T if no more than L percent of the service frames that arrive at an ingress UNI during the interval and that have a green bandwidth profile compliance level (see Section 10.3.7) are lost.

Table 10-7. Parameters used to define frame loss performance objective
Parameter   Description
T           Time interval during which service frames arrive at an ingress UNI (time units)
L           Loss objective (percentage)
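The three objectives share the same pattern: restrict attention to the green, delivered service frames observed during T, then compare a percentage (or loss ratio) against the objective. A minimal sketch in Python (function names and data layout are illustrative, not taken from MEF 10):

```python
def meets_delay_objective(frames, p, d):
    """frames: list of (delay, is_green, delivered) observations during T.
    The objective is met if at least p percent of the green, delivered
    frames have delay <= d."""
    eligible = [delay for delay, green, delivered in frames if green and delivered]
    if not eligible:
        return True  # no qualifying frames: the objective is trivially met
    within = sum(1 for delay in eligible if delay <= d)
    return 100.0 * within / len(eligible) >= p

def meets_loss_objective(arrived_green, lost_green, l_max):
    """The objective is met if no more than l_max percent of the green
    service frames that arrived during T were lost."""
    if arrived_green == 0:
        return True
    return 100.0 * lost_green / arrived_green <= l_max
```

For example, if two of three green, delivered frames in the interval are under the delay bound, an objective of P = 66 is met while P = 99 is not.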
10.3.7 Bandwidth Profiles Most UNIs will have a line rate of at least 100 Mbps. Yet many subscribers will not need and will not want to pay for that much bandwidth. The vehicle for letting subscribers pay for the bandwidth they need is the bandwidth profile.
Chapter 10
360
The bandwidth profile is a characterization of the lengths and arrival times of ingress service frames at a UNI. When a bandwidth profile is applied to a sequence of service frames, each service frame is classified according to its level of compliance with the bandwidth profile. The bandwidth profile also specifies how the MEN should treat a service frame depending on its level of compliance.

10.3.7.1 Parameters and Algorithm

The bandwidth profile is the Ethernet version of the Frame Relay committed information rate or the ATM sustained cell rate. It defines long-term average bandwidths as well as limits on the amount of data in back-to-back service frames. Reference [1] defines the service frame size and arrival time characterization with six parameters:
1. Committed Information Rate (CIR): a nonnegative number expressed as bits per second.
2. Committed Burst Size (CBS): a nonnegative number expressed as bytes. When CIR > 0, CBS must be at least as large as the maximum length of a service frame.
3. Excess Information Rate (EIR): a nonnegative number expressed as bits per second.
4. Excess Burst Size (EBS): a nonnegative number expressed as bytes. When EIR > 0, EBS must be at least as large as the maximum length of a service frame.
5. Coupling Flag (CF): a binary variable with value either 0 or 1.
6. Color Mode (CM): a binary variable with value either "color blind" or "color aware."

The level of compliance is determined via two token bucket algorithms, as shown in Figure 10-12: a C-bucket of size CBS filled with Committed Information Rate tokens and an E-bucket of size EBS filled with Excess Information Rate tokens, with tokens overflowing when a bucket is full.

Figure 10-12. Graphical depiction of bandwidth profile
Each bucket holds up to CBS and EBS tokens, respectively, with one token representing one byte. Tokens are added to the buckets at the rates of CIR/8 and EIR/8, respectively. When a bucket becomes full, additional tokens overflow and are lost. Since each token is a sort of permission for the CE to send one byte of data, this loss of tokens is a "use it or lose it" mechanism for bandwidth use.

When an ingress service frame is classified, its length is compared with the tokens in the C-bucket. If the number of tokens is at least equal to the length of the service frame, the frame is declared "green," and tokens equal to the frame length are removed from the bucket. If there are not sufficient tokens in the C-bucket, the service frame length is compared with the tokens in the E-bucket and the process is repeated. If there are sufficient tokens in the E-bucket, the frame is declared "yellow"; otherwise, it is declared "red."

The precise algorithm is shown in Figure 10-13. Here the service frames have lengths {lj} and arrival times {tj} for j = 0, 1, ..., the number of tokens in the two buckets is denoted by Bc(tj) and Be(tj), respectively, and Bc(t0) = CBS and Be(t0) = EBS. When a service frame of length lj arrives at time tj, the buckets are first updated:

    Bc(tj) = min{CBS, Bc(tj-1) + CIR x (tj - tj-1)/8}
    O(tj)  = max{0, Bc(tj-1) + CIR x (tj - tj-1)/8 - CBS}
    Be(tj) = min{EBS, Be(tj-1) + EIR x (tj - tj-1)/8 + CF x O(tj)}

Then the frame is classified: if lj <= Bc(tj), the frame is declared green and Bc(tj) is reduced by lj; otherwise, if lj <= Be(tj), the frame is declared yellow and Be(tj) is reduced by lj; otherwise, the frame is declared red.

When CF = 1, it is possible for unused tokens from the C-bucket to be put into the E-bucket. This could happen if EIR is smaller than CIR and a long burst of service frames is followed by a period of no traffic. At the end of the burst, both token buckets would be essentially empty. During the quiet period, the C-bucket could completely fill before the E-bucket, resulting in the "overflow" tokens (represented by O(tj) in Figure 10-13) being placed in the E-bucket. When CF = 0, the two token buckets operate independently.

Figure 10-13. Bandwidth profile algorithm
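The algorithm of Figure 10-13 translates directly into code. The following Python sketch (class and method names are my own, not from MEF 10) implements the color-blind two-bucket classifier, including the CF coupling of C-bucket overflow into the E-bucket:

```python
class BandwidthProfile:
    """Color-blind two-token-bucket classifier per Figure 10-13."""

    def __init__(self, cir, cbs, eir, ebs, cf=0):
        self.cir, self.cbs = cir, cbs      # committed rate (bit/s), burst (bytes)
        self.eir, self.ebs = eir, ebs      # excess rate (bit/s), burst (bytes)
        self.cf = cf                       # coupling flag, 0 or 1
        self.bc, self.be = cbs, ebs        # Bc(t0) = CBS, Be(t0) = EBS
        self.last_t = 0.0

    def classify(self, length, t):
        """Classify a service frame of `length` bytes arriving at time t (s)."""
        dt = t - self.last_t
        self.last_t = t
        # Replenish buckets; C-bucket overflow may couple into the E-bucket.
        fill_c = self.bc + self.cir * dt / 8
        overflow = max(0.0, fill_c - self.cbs)
        self.bc = min(self.cbs, fill_c)
        self.be = min(self.ebs, self.be + self.eir * dt / 8 + self.cf * overflow)
        # Classify the frame and debit the appropriate bucket.
        if length <= self.bc:
            self.bc -= length
            return "green"
        if length <= self.be:
            self.be -= length
            return "yellow"
        return "red"
```

With the parameters of Table 10-8 (CIR = 20 Mbit/s, CBS = 15,000 bytes, EIR = 20 Mbit/s, EBS = 6,000 bytes), a back-to-back burst of 1500-byte frames yields ten green frames, then yellow frames until the E-bucket empties, then red.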
The algorithm defined in Figure 10-13 is referred to as color blind because it does not take into account a color that might be associated with each service frame as it crosses the UNI. Reference [1] also specifies a color aware version of the algorithm, and this is where the Color Mode parameter is used. This is done in anticipation of the specification of a standard way to indicate the color in a service frame at the UNI; such a standard does not yet exist.

10.3.7.2 Disposition of Service Frames
The classification of each service frame governs how the MEN will handle the frame. Reference [1] specifies the following MEN actions:
1. Green: Deliver the service frame according to the service attributes of the service instance and the service level specification performance objectives (see Section 10.3.6.2).
2. Yellow: Deliver the service frame according to the service attributes of the service instance, but the service level specification performance objectives do not apply.
3. Red: Discard the service frame.

An example will help illustrate the implications of these rules. Consider an enterprise connecting two locations with a point-to-point EVC using an All-to-One Map at both UNIs, and further suppose that both UNIs are 100 Mbps. The enterprise has determined that the average bandwidth needed is 20 Mbps and that less than 0.1% service frame loss is needed for adequate performance. In this case, the parameter values shown in Table 10-8 could be used along with an SLA that guarantees less than 0.1% service frame loss. CIR is set equal to 20 Mbps, and CBS is large enough to hold more than seven maximum-sized service frames. Given the bandwidth needs of the enterprise, most of the service frames will be declared green and should see less than 0.1% service frame loss.

Table 10-8. Example bandwidth profile parameter values
Parameter   Value
CIR         20 Mbps
CBS         15,000 bytes
EIR         20 Mbps
EBS         6,000 bytes
CF          0
10.3.7.3 Application of Bandwidth Profiles
There are three ways that a bandwidth profile can be applied to ingress service frames. These are described in the following subsections.

10.3.7.3.1 Per Ingress UNI

In this case, a single bandwidth profile applies to all ingress service frames at the UNI. Figure 10-14 illustrates this approach: a single bandwidth profile is applied to all service frames irrespective of the EVC or Class of Service identifier.
Figure 10-14. Example of bandwidth profile per ingress UNI
10.3.7.3.2 Per EVC In this case, a single bandwidth profile applies to all ingress service frames on a given EVC. Figure 10-15 shows two bandwidth profiles, one for each of the two EVCs.
Figure 10-15. Example of bandwidth profile per EVC
10.3.7.3.3 Per Class of Service Identifier

The most granular application of bandwidth profiles is per Class of Service Identifier. This application is illustrated in Figure 10-16.
Figure 10-16. Example of bandwidth profile per Class of Service ID
10.3.7.3.4 Constraints on Bandwidth Profiles

As we will see in Section 10.3.7.4, when a bandwidth profile is used, it is important to configure the CE to smooth the traffic in a way that complements the bandwidth profile. Most routers can do this for a single bandwidth profile, but few can smooth appropriately when multiple bandwidth profiles are applied. Therefore, to be consistent with existing CEs and to simplify implementations, the application of bandwidth profiles is constrained: at most one bandwidth profile can be applied to any ingress service frame.
This constraint on the application of bandwidth profiles leads to the following rules at a UNI:
1. If there is a per-UNI bandwidth profile, then there cannot be any other bandwidth profiles.
2. If there is a per-EVC bandwidth profile on an EVC, then there cannot be any per-CoS bandwidth profiles for instances of Class of Service on that EVC.

Figure 10-14, Figure 10-15, and Figure 10-16 show examples of bandwidth profiles that are applied according to these rules.

10.3.7.4 Configuring the CE
It has been well known since the rise in popularity of Frame Relay that the use of a bandwidth profile type of bandwidth constraint can lead to excessive frame loss and poor performance for an application using the service. For example, consider a host in an enterprise network well removed from the UNI that is carrying out a file transfer. The source host can send a TCP window of packets back-to-back. When the corresponding frames reach the UNI, if they cause the token bucket to exhaust, service frames will be discarded. The resulting lost packets will be detected by TCP in the host and retransmitted. However, TCP also slows down when loss is detected and the net result is throughput degradation. There are two ways to suppress this behavior. The first is to make CBS large. The second is to implement smoothing in the CE at the UNI. Smoothing buffers a service frame that would be declared yellow or red until tokens have accumulated to allow the service frame to be declared green. Such smoothing has been common in routers since the early days of Frame Relay. It is important to properly configure the smoothing in the CE to align with the bandwidth profile parameters so as to allow the most effective use of an Ethernet service.
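Smoothing in the CE amounts to delaying each frame until enough committed tokens have accumulated for it to leave as green. A minimal sketch of such a shaper follows (illustrative only; real routers implement this in their queueing machinery):

```python
class GreenShaper:
    """Delays each frame until the committed token bucket can cover it,
    so every departing frame would be declared green by the profile."""

    def __init__(self, cir, cbs):
        self.cir = cir            # committed information rate, bits per second
        self.cbs = cbs            # committed burst size, bytes
        self.tokens = cbs         # the C-bucket starts full
        self.last_t = 0.0

    def send_time(self, length, arrival_t):
        """Return the earliest time a frame of `length` bytes may be sent."""
        # Replenish tokens up to the arrival time (CIR/8 bytes per second).
        self.tokens = min(self.cbs,
                          self.tokens + self.cir * (arrival_t - self.last_t) / 8)
        self.last_t = arrival_t
        if self.tokens >= length:
            depart = arrival_t
        else:
            # Hold the frame until the token deficit is refilled.
            wait = (length - self.tokens) * 8 / self.cir
            depart = arrival_t + wait
            self.tokens = min(self.cbs, self.tokens + self.cir * wait / 8)
            self.last_t = depart
        self.tokens -= length
        return depart
```

For example, with a 1 MB/s committed rate and a 3,000-byte bucket, two back-to-back 1500-byte frames leave immediately, while a third is held for 1.5 ms, which is exactly the behavior that keeps a burst green instead of letting frames turn yellow or red.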
10.3.8 Layer 2 Control Protocols A significant number of protocols have been standardized for use by bridges within an enterprise network. These are referred to as Layer 2 control protocols. When a CE is a bridge, it is likely to generate some of these protocols. It is important to know how the Ethernet service handles a service frame that is carrying a Layer 2 control protocol. As an example, consider the configuration of Figure 10-17.
Figure 10-17. Example of the need for Spanning Tree Protocol tunneling
In Figure 10-17, a loop is formed by the three EVCs, and there is a danger that a broadcast frame sent into an EVC on one of the UNIs would circulate forever, with dire consequences for the subscriber. To prevent this behavior, the subscriber would typically configure the bridges to run the Spanning Tree Protocol, which blocks some of the ports on the bridges to break the loop. But what happens in the case of Figure 10-17? If the service frames carrying the Spanning Tree Protocol are discarded by the MEN, each bridge will conclude that the ports attached to the MEN are not attached to other bridges; all ports would then be in the forwarding state, creating a loop. On the other hand, if the service frames carrying the Spanning Tree Protocol are delivered from one UNI to the other, the bridges will break the loop by blocking the appropriate ports.

10.3.8.1 Handling Layer 2 Control Protocols
There are three ways that an MEN can deal with a Layer 2 control protocol:
1. Discard service frames carrying the protocol,
2. Participate as a peer of the CE, or
3. Deliver the frames unchanged (called tunneling) to the appropriate UNIs.

When there is more than one EVC at a UNI, tunneling is complicated by the question of which EVC should carry the tunneled service frames. A recommendation on how to handle various Layer 2 control protocols can be found in [2].
10.3.8.2 Bridges versus Routers as CE
When bridges are used as the CEs, it is important to tunnel most of the Layer 2 control protocols. In view of the complexity of tunneling when multiple EVCs are present at a UNI, it is the author's recommendation that All-to-One Bundling be used at all UNIs when bridges are used as CEs. In this case, protocols such as Spanning Tree should be tunneled. When it is desirable to have multiple EVCs at a UNI, it is the author's recommendation that routers be used as CEs and that Layer 2 control protocols be discarded by the MEN. There are many configurations that will work well and are not consistent with the above recommendations. However, for most uses of Ethernet services, these recommendations provide easy-to-understand guidelines and will enable the subscriber to realize the full value of Ethernet services.
10.4. CONCLUSION AND FUTURE WORK
The services specified in [1] establish a solid foundation upon which valuable Ethernet services can be deployed by service providers and used by subscribers. Such services are currently being offered and planned in all parts of the world. However, the MEF realizes that this is just the starting point (which explains the use of "Phase 1" in the title of [1]). Consequently, the MEF has embarked on "Phase 2" services. As of this writing, the scope of Phase 2 has not been finalized, but it is likely to include:
• Definition of performance parameters for multipoint-to-multipoint EVCs,
• Specification of a point-to-multipoint EVC, which can be useful for supporting Internet access for a large number of small customers,
• Additional ways to identify CoS, such as by a set of CE-VLAN IDs or the IP DSCP field,
• Egress bandwidth profiles, and
• Alignment with the emerging IEEE 802.1ad Provider Bridges standard.
10.5. APPENDIX A: ETHERNET BASICS

This Appendix provides a quick tutorial on the basics of Ethernet. Ethernet emerged as a way for multiple computers to connect to a single cable segment (called a shared medium) and to exchange information amongst themselves at high speeds, e.g., 10 Mbps. The protocol that each computer had to execute is referred to as CSMA/CD and can be thought of as
being modeled on a meeting of polite, extroverted, fast talkers. Any computer can send but refrains from doing so if energy is detected on the cable (Carrier Sense Multiple Access). If two computers happen to start sending at the same time, each detects the collision and stops sending for a random period of time (Collision Detection). Each computer receives all frames transmitted and discards them if the destination MAC address in the frame does not match the computer's MAC address and is not a broadcast/multicast address. Of course, Ethernet networking has evolved greatly since these early days. Today, the vast majority of Ethernet networks are composed of computers and routers attached directly to Ethernet bridges (also known as switches). The links are usually full duplex so both the devices on the link can transmit and receive simultaneously. The bridges forward the Ethernet frames to the proper destination(s). From the point of view of Ethernet services, then, the CE connects to the UNI via a point-to-point cable and exchanges Ethernet frames with the service provider network. In the following, we dig a bit deeper into Ethernet as seen by the CE.
10.5.1 Ethernet Physical Layers

There are a large number of Ethernet physical layers defined in [3]. Several have fallen into disuse in the market. Others are unlikely to be supported at the UNI by service providers. Table 10-9 summarizes the physical layers that are most likely to be supported at the UNI.^

Table 10-9. Physical layers likely to be used at the UNI
Name          Media Type               Speed       Maximum Length
10BASE-T      2 pair Category 3 UTP    10 Mbps     100 meters
10BASE-FL     62.5 μm MMF              10 Mbps     2 km
100BASE-TX    2 pair Category 5 UTP    100 Mbps    100 meters
1000BASE-T    4 pair Category 5 UTP    1000 Mbps   100 meters
1000BASE-SX   50 and 62.5 μm MMF       1000 Mbps   220 - 550 meters
1000BASE-LX   50 and 62.5 μm MMF       1000 Mbps   550 meters
1000BASE-LX   10 μm SMF                1000 Mbps   5 km

UTP — Unshielded Twisted Pair; MMF — Multimode Fiber; SMF — Single-Mode Fiber
10.5.2 Ethernet Media Access Control Layer The frame format for Ethernet is part of the media access control layer because it contains the information that is needed for the operation of CSMA/CD. Figure 10-18 shows the format.
Preamble                  7 octets
Start Frame Delimiter     1 octet
Destination MAC Address   6 octets
Source MAC Address        6 octets
Length/Type               2 octets
Data and Pad              46-1500 octets
Frame Check Sequence      4 octets

Figure 10-18. Ethernet media access control frame format
The bits of each octet are transmitted left to right (least significant to most significant) and the octets are transmitted top to bottom. Each field is briefly described below.

10.5.2.1 Preamble Field
This field allows a receiver to synchronize its receive clock. This synchronization is necessary in a shared media network where there can be periods with no transmissions and multiple transmit clocks, one for each other device on the LAN.

10.5.2.2 Start Frame Delimiter Field
This special bit pattern allows the receiver to find the start of the frame.

10.5.2.3 Destination Address Field
This field specifies the destination device or devices. The first bit indicates whether the address is an individual address or a group (multicast) address. The second bit indicates whether the address is globally or locally administered. Globally administered addresses are allocated to manufacturers and are usually "burned into" devices during manufacturing. The all-ones address indicates a broadcast to all devices on the LAN.

10.5.2.4 Source Address Field
This field identifies the source of the frame.

10.5.2.5 Length/Type Field
Depending on the value, this field can either indicate the number of data octets or the type of protocol running above the MAC layer. Today, most implementations use this field for protocol identification.

10.5.2.6 Data and Pad Field

This field contains the data being carried. If there are fewer than 46 octets of data, pad bits are added to extend the field to 46 octets.

10.5.2.7 Frame Check Sequence Field
This field is a CRC calculation that covers from the beginning of the Destination Address field through the end of the Data and PAD field. It is used to detect transmission errors.
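The FCS is the IEEE 802.3 CRC-32, which uses the same polynomial and reflection conventions as the CRC-32 in Python's zlib module, so a check can be sketched as follows (the little-endian byte ordering of the appended FCS matches what packet-capture tools typically record, and should be treated as an assumption here):

```python
import zlib

def ethernet_fcs(body):
    """Compute the 32-bit FCS over the Destination Address through the
    Data and Pad fields, using the CRC-32 shared by 802.3 and zlib."""
    return zlib.crc32(body) & 0xFFFFFFFF

def frame_is_valid(frame_with_fcs):
    """Recompute the CRC over everything except the trailing 4-octet FCS
    and compare it with the received FCS value."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return ethernet_fcs(body) == int.from_bytes(fcs, "little")
```

Flipping any single bit in the body or in the FCS itself makes the comparison fail, which is exactly the transmission error detection the field provides.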
10.5.3 Ethernet VLANs

IEEE 802.1Q [4] added two new concepts to Ethernet, namely the virtual LAN (VLAN) and priority. The VLAN concept allows an Ethernet network to be logically divided into multiple LANs. For example, a large LAN or complex of LANs bridged together can be divided up into multiple (smaller) VLANs, with a separate IP subnet assigned to each. In this case, from the point of view of a router, each VLAN operates just like a separate physical LAN. A broadcast frame sent on a given VLAN will only be propagated to and received by the devices assigned to that VLAN. In addition, [4] defines up to eight levels of priority. Bridges compliant with IEEE 802.1Q can buffer and forward frames according to the different levels of priority. Figure 10-19 shows the Ethernet MAC frame format with the IEEE 802.1Q Tag Control Information. Four octets are inserted between the Source Address field and the Length/Type field. The 802.1Q Tag Type field indicates that the Tag Control Information field follows, as opposed to the Length/Type field.
Preamble                  7 octets
Start Frame Delimiter     1 octet
Destination MAC Address   6 octets
Source MAC Address        6 octets
802.1Q Tag Type           2 octets
Tag Control Information   2 octets
Length/Type               2 octets
Data and Pad              46-1500 octets
Frame Check Sequence      4 octets

Figure 10-19. Ethernet media access control frame with tag control information
Figure 10-20 shows the details of the 802.1Q Tag Type field and the Tag Control Information field. The 802.1Q Tag Type field always has the value 8100 Hex. The VLAN Identifier has 12 bits. The value 0 has a special meaning: it does not identify a VLAN but is used as a way to assign a priority to the frame. When the value is 0, the frame is called a priority tagged frame. The value FFF Hex (all ones) is reserved. Thus 4094 VLANs can be identified. The user_priority field contains the priority of the frame and has eight possible values. The Canonical Format Indicator (CFI) field is intended for use with Token Ring networks.
802.1Q Tag Type (2 octets):          1000 0001 0000 0000 (8100 Hex)
Tag Control Information (2 octets):  user_priority (3 bits) | CFI (1 bit) | VLAN Identifier (12 bits)

Figure 10-20. Tag type and tag control information
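The tag fields of Figure 10-20 can be pulled apart with a few shifts and masks. A small illustrative parser (the function name is my own, and the frame is assumed to begin at the Destination Address, as in typical packet captures):

```python
import struct

def parse_vlan_tag(frame):
    """Extract the 802.1Q tag from a tagged Ethernet frame.
    Returns (user_priority, cfi, vlan_id), or None if the frame is untagged."""
    tag_type, = struct.unpack_from("!H", frame, 12)  # after 6+6 address octets
    if tag_type != 0x8100:                           # 802.1Q Tag Type value
        return None
    tci, = struct.unpack_from("!H", frame, 14)       # Tag Control Information
    user_priority = tci >> 13                        # top 3 bits
    cfi = (tci >> 12) & 0x1                          # next bit
    vlan_id = tci & 0x0FFF                           # low 12 bits
    return user_priority, cfi, vlan_id
```

A TCI of A00A Hex, for example, carries user_priority 5, CFI 0, and VLAN ID 10; a VLAN ID of 0 would mark a priority tagged frame.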
10.6. NOTES
1. The reader may have heard about "double tagged" Ethernet frames in Metro Ethernet Networks. Such double tagging is frequently used internally in MENs. But double tagging is not standard and thus is not to be done at a UNI.
2. An Ethernet frame with an IEEE 802.1Q tag that has zero as the VLAN ID is called priority tagged.
3. [4] specifies that the VLAN ID value 4095 is reserved, and thus an IEEE 802.1Q-compliant CE should not generate a tagged service frame with this VLAN ID value. However, there seems to be no advantage in the service provider enforcing this constraint, and thus 4095 is allowed. Note, however, that the only mandate for the service provider to deliver such service frames is when CE-VLAN ID preservation is in force.
5. The One-to-One terminology is not used in [1], but it is a typical and important case for the CE-VLAN ID/EVC map.
6. Not all ingress service frames should be delivered. For example, ingress Pause frames should probably not be delivered. See Section 10.3.8.
7. 10 Gbps Ethernet standards exist, but we don't expect them to be supported at the UNI in the near term.
10.7. REFERENCES
[1] Metro Ethernet Forum, Technical Specification MEF 10, Ethernet Services Attributes, Phase 1, 10 November 2003, http://www.metroethernetforum.org.
[2] Metro Ethernet Forum, Technical Specification MEF 6, Ethernet Services Definitions, Phase 1, June 2004, http://www.metroethernetforum.org.
[3] IEEE Std 802.3 — 2002, Information technology — Telecommunications and information exchange between systems — Local and metropolitan area networks — Specific requirements — Part 3: Carrier sense multiple access with collision detection (CSMA/CD) access method and physical layer specifications, 8 March 2002.
[4] IEEE Std 802.1Q — 1998, IEEE Standards for Local and Metropolitan Area Networks: Virtual Bridged Local Area Networks, 8 December 1998.
Chapter 11 ETHERNET SERVICES OVER PUBLIC WAN
Steven S. Gorshe, Ph.D.* and Nevin R. Jones, Ph.D.**
*PMC-Sierra, Inc.  **Agere Systems, Inc.
11.1. INTRODUCTION
11.1.1 Why Ethernet over the public WAN?

A convergence of technology and applications has created an increased desire for Ethernet WAN connectivity. On the enterprise side, Ethernet is the dominant technology. Enterprise network administrators' familiarity and comfort with Ethernet creates a natural desire that their WAN technology also be based on Ethernet. A number of applications, including voice and video, are already being encapsulated into Ethernet and run over their LANs. Another trend is for companies to have a number of remote offices that need to exchange data with the headquarters office or with each other. There can be significant advantages to placing data on common servers that can be accessed by all sites. The increased importance of the Internet for business applications has created an increasing desire for incrementally higher WAN interface rates. It is also common to want a single Internet firewall at one site that is accessed over the enterprise network by all sites. Most enterprise WAN data access today uses Frame Relay connections at rates of fractional DS1, full-rate DS1, fractional DS3, and full-rate DS3. In some cases, the data services use ATM instead of Frame Relay, either for the customer connection or as a method of transporting the data through the core network. The Frame Relay-based services suffer from a lack of scalability in
terms of both high-speed connections and ubiquitous multipoint networks. ATM suffers from being somewhat provisioning-intensive in large networks and from bandwidth inefficiency. At the other end of the spectrum, some customers use WDM or dark fiber for their WAN or MAN connections with native Gigabit Ethernet, 10 Gbit Ethernet, or Fibre Channel. However, WDM equipment is expensive and not ubiquitously deployed, and dark fiber is not universally available, which limits the applications for these approaches. Another drawback of WDM and dark fiber applications is that unless they use the new G.709 [1] Optical Transport Network standard, there is no embedded overhead for the carrier to monitor the quality of the connection or to provide protection switching in order to guarantee their service level agreements (SLAs). Some larger customers use high-speed SONET OC-N / SDH STM-N connections, typically terminating the Ethernet frames, re-encapsulating the data into PPP, and using Packet over SONET/SDH (PoS) for transmission through the SONET/SDH pipe.

On the carrier (network operator) side, the situation is somewhat complicated. In the wake of the bursting of the telecom "bubble" in the year 2000, the carriers face two clear drivers. First, they must deploy any new services on their existing SONET/SDH networks. Throughout the 1990s, carriers invested heavily in building SONET/SDH-based fiber optic networks, including developing the OAM&P systems to run them. The enormous capital investment required to build an overlay network for data transport, be it native Ethernet or DWDM based, makes overlay networks unrealistic. Fortunately, SONET/SDH has proven flexible enough to handle the challenge.
As discussed in earlier chapters, recent developments that extend the useful life of SONET/SDH include virtual concatenation (VCAT) [2, 3] to create more flexible data channel sizes, the Link Capacity Adjustment Scheme (LCAS) [4] to increase the flexibility and functionality of VCAT, the Generic Framing Procedure (GFP) [5, 6] to encapsulate data frames for robust and bandwidth-deterministic transport, and the Resilient Packet Ring (RPR), which provides a very flexible technology for data access onto SONET/SDH access and metro rings.

The second driver for the carriers is the desire to generate new, revenue-bearing services. With data traffic continuing to grow faster than voice, offering WAN connectivity with higher bandwidth and enhanced service capabilities appears to be a natural direction for new services. Equipment and device manufacturers, who are also interested in building new products, have likewise seen WAN connectivity as a natural direction. Ethernet is in many ways an obvious technology choice for WAN connectivity. The job of enterprise network administration is simplified if all sites can be treated as part of the same Ethernet network. Also, Ethernet interfaces are relatively inexpensive and are ubiquitous on
enterprise equipment. The development of VCAT and GFP allows efficient transport of Ethernet frames through SONET/SDH networks. Also, Ethernet bridge and router technology requires less provisioning and administration than technologies such as ATM. As some carriers begin a migration to packet switching technology such as MPLS in their core networks rather than traditional circuit switching, Ethernet again looks like an excellent fit as an access technology. The relatively low cost of Ethernet enterprise equipment, however, has created a customer expectation that Ethernet WAN interfaces should be less expensive than DS1/DS3 Frame Relay interfaces. For the carriers, even if the Ethernet connectivity is provided through their SONET/SDH backbone network, they still have to develop new OAM&P procedures and tools to provide the service. In fact, although the gap is being actively addressed by multiple standards bodies, Ethernet currently lacks some of the OAM capabilities that will be required to monitor and guarantee the SLAs. So, even though customers and carriers agree that Ethernet (over SONET/SDH for the carriers) is the most attractive technology for high-speed WAN connectivity, carriers will still have to invest in equipment and management system upgrades while the customers demand and expect to pay less for the Ethernet-based services.
11.1.2 Organization of the chapter

The information in this chapter primarily focuses on the output of the Ethernet transport work in ITU-T Study Group 15 (SG15) and its series of new standards. SG15 is the ITU-T body responsible, among other things, for standards related to public transport networks. Ethernet services in metro networks are the topic of Chapter 10. The status of standards that are currently approved or in progress is summarized in Section 11.1.3. The chapter begins with a discussion and description of the Ethernet services that are being considered over public WANs. This description includes an examination of the different service characteristics and attributes that must be defined in order to provide the services. The next section explores different transport network models that can be used to support Ethernet services. Next comes a description of the Ethernet-based user network interfaces (UNIs), followed by a discussion of the network-to-network interface (NNI) that will be required for transport network equipment carrying Ethernet services. OAM is the topic of the following section. As already noted, some amount of OAM capability comes inherently from deploying Ethernet services over SONET/SDH (server) networks. Additional OAM capabilities will be required at the Ethernet client layer. The final section summarizes some of the protection and
restoration technologies that can be used to guarantee carrier-grade service reliability for Ethernet WAN services.

Table 11-1. Summary of related standards activities
Organization: IEEE
  802.3ae: 10 Gbit Ethernet, which included a WAN PHY interface to simplify interfacing to a SONET/SDH or G.709 OTN network. (Approved)
  802.17: Resilient Packet Rings; working on a ring-based network for access and metro applications. (Approved)
  802.3ah (EFM): Ethernet in the First Mile, where work includes OAM aspects for Ethernet links, especially access links. (Approved)
  802.1ad: Provider Bridge specification; this is the Q-in-Q standard. (In progress)
  802.1ag: Connectivity Fault Management, or Ethernet Service OAM. (In progress)
  802.1ae: MAC Security (MACsec), including authentication, authorization, and encryption. (In progress)

Organization: ITU-T SG15
  G.8011.1 (Q12): Ethernet Private Line service (with input from ANSI T1X1). (Approved)
  G.8012 (Q11): Ethernet UNI and Ethernet Transport NNI. (Approved)
  G.8010 (Q12): Ethernet Layer Network Architecture, which is largely to translate the IEEE 802 network material into ITU-T transport network terminology and models. (Approved)
  G.8011 (Q12): Ethernet over Transport — Ethernet Service Characteristics. (Approved)
  G.esm (Q12): Ethernet over Transport — Ethernet Service Multiplexing, which will cover the multiplexing protocol(s) required to implement EVPL and EVPLAN. (In progress)
  G.8021 [7] (Q9): Characteristics of Ethernet transport network equipment functional blocks. (Approved 2004, focus on EPL portion)
  Q2: Studying Ethernet OAM aspects relating to access. (In progress)

Organization: ITU-T SG13
  Y.1730: Requirements for OAM functions in Ethernet-based networks and Ethernet services. (Approved)
  Y.ethoam (Q3): End-to-end and edge-to-edge aspects of Ethernet OAM, including PM. (In progress)
  Y.ethps (Q3): Ethernet protection switching. (In progress)

Organization: Metro Ethernet Forum (MEF)
  The MEF is studying various aspects of Ethernet MANs, including Ethernet architecture, service model, service definition, traffic management, UNI and NNI definition, and OAM. MEF work covers all possible OAM flows, such as end-to-end, edge-to-edge, access, interprovider, and intraprovider.
  MEF1: Ethernet Services Model, Phase 1. (Approved)
  MEF2: Requirements and Framework for Ethernet Service Protection in Metro Ethernet Networks. (Approved)
  MEF3: Circuit Emulation Service Definitions, Framework and Requirements in Metro Ethernet Networks. (Approved)
  MEF4: Metro Ethernet Network Architecture Framework — Part 1: Generic Framework. (Approved)
  MEF5: Traffic Management Specification: Phase I. (Approved)
  UNI Type 1: Specification of UNI, data-plane aspects. (In progress)
  UNI Type 2: Specification of UNI, control-plane aspects (ELMI). (In progress)
  EMS-NMS: MIBs for Ethernet and network management. (In progress)
  MEF Architecture, part 2: Specifies functional elements of the Ethernet trail, such as adaptation, conditioning, etc. (In progress)
  CES PDH: Implementation agreement for PDH CES over Ethernet; includes both the AAL1 and raw methods. (In progress)

Organization: Internet Engineering Task Force (IETF)
  PWE3 WG: Working on defining Ethernet transport over IP/MPLS using the Martini drafts; this is mainly an EVPL service using UDP, L2TP, or MPLS as the multiplexing layer. (In progress)
  PPVPN WG: Requirements for Virtual Private LAN Services (VPLS). (In progress)
  L2VPN WG: Working on the framework and service requirements of Ethernet-based VPNs, and defining an EVPLAN service using IP/MPLS. (In progress)
11.1.3 Related standards activity

The current amount of standards activity is a good indication of how many companies and organizations see Ethernet WAN as the next key step both for Ethernet and for the public transport network providers (i.e., carriers). The major standards activities are summarized in Table 11-1. Each standards organization has its own areas of expertise. The majority of the standards that will be required for the public transport network are being developed in the Q12 and Q11 groups of ITU-T SG15. This work was partitioned not only logically by topic but also in a manner that allowed for
the earliest possible approval of useful standards/recommendations. The initial set of standards was approved in mid-2004 (see Table 11-1). Those recommendations that will require more study and debate prior to consensus are targeted for completion near the time of the publication of this volume. ITU-T SG15 has established liaison contact with the other standards organizations and forums where their input is required or desired. For example, the G.ethsrv work is expected to use a considerable amount of input from the MEF regarding the definition of services.

Multiple organizations are working on operation, administration, and maintenance (OAM) aspects of Ethernet MANs/WANs. OAM is critical once Ethernet is extended beyond the customer premises, especially when multiple transport service providers carry the traffic. In a multiple-carrier environment, for example, OAM is crucial for determining the locations of problems and degradations when they occur. From a transport network provider standpoint, this OAM requirement is an area where SONET/SDH really shines. The OAM capabilities inherent in the SONET/SDH backbone allow full monitoring and protection of the transmission facilities and transport path through the SONET/SDH network.
11.1.4 Definition of some technical terms in this chapter

The sources of these definitions are the appropriate ITU-T Recommendations, as indicated at the end of each definition.

1. Access group: A group of co-located flow termination functions that are attached to the same flow domain or flow point pool link. (G.809) [8]
2. Characteristic Information (CI): A signal with a specific format, which is transferred on flows. The specific formats are defined in technology-specific standards. (G.809)
3. Flow: An aggregation of one or more traffic units with an element of common routing. (G.809)
4. Flow domain: A topological component used to effect forwarding of specific characteristic information. (G.809)
5. Flow point: A reference point that represents a point of transfer for traffic units between topological components. (G.809)
6. Flow point pool: A group of co-located flow points that have a common routing. (G.809)
7. Flow point pool link: A topological component that describes a fixed relationship between a flow domain or access group and another flow domain or access group. (G.809)
8. Flow termination: A transport processing function. There are two types of flow termination, namely, a flow termination sink and a flow termination source. (G.809)
9. Link flow: A transport entity that transfers information between ports across a flow point pool link. (G.809)
10. Topological component: An architectural component, used to describe the transport network in terms of the topological relationships between sets of points within the same layer network. (G.809)
11.2. SERVICE TYPES AND CHARACTERISTICS
The description of Ethernet transport services varies depending on one's vantage point. This chapter approaches Ethernet transport from the perspective of the transport network or service provider, while the customer view of Ethernet transport services is presented in Chapter 9. This chapter follows the approach of ITU-T Rec. G.8011 [9] in its discussion of the Ethernet service types and characteristics from the transport network / service provider viewpoint. The G.8011.X series covers specific services within the framework of G.8011.

One can think of the goal of the network provider as making the Ethernet service look like an extension of the customer's Ethernet LAN. In order to provide this transparency, the network provider view must take into account a number of items that are not necessarily directly visible to the customer. These items are presented in this section in terms of Ethernet connection attributes and their associated parameters.

As described in Chapter 10, an Ethernet Virtual Connection (EVC) provides the connection between two or more customer UNIs such that Ethernet frames (service frames) associated with that EVC can be transferred between these UNIs and not to any UNIs that are not associated with that EVC. Consistent with ITU-T G.8011, this chapter uses the more generic term EC (Ethernet Connection) rather than EVC.

Figure 11-1 illustrates an Ethernet connection with its different reference points from the standpoint of both the Ethernet MAC layer network (ETH) and the Ethernet physical layer network (ETY). Figure 11-1 also illustrates the different Ethernet service areas in a multicarrier Ethernet connection. These three service areas are the access (UNI-C to UNI-N), end-to-end/customer-to-customer (UNI-C to UNI-C), and edge-to-edge/intracarrier (UNI-N to UNI-N).
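The EC delivery rule just described (service frames pass only between UNIs that are members of the same EC) can be sketched as a simple membership check. This is an illustrative model only; the class and names below are hypothetical, not part of G.8011.

```python
# Illustrative sketch: an Ethernet Connection (EC) delivers frames only
# among its member UNIs. Names here are hypothetical, not from G.8011.

class EthernetConnection:
    def __init__(self, ec_id, uni_list):
        self.ec_id = ec_id
        self.unis = set(uni_list)  # the EC's UNI list attribute

    def deliverable(self, ingress_uni, egress_uni):
        """A frame may pass from ingress to egress only if both UNIs
        belong to this EC (and they are distinct endpoints)."""
        return (ingress_uni in self.unis
                and egress_uni in self.unis
                and ingress_uni != egress_uni)

ec = EthernetConnection("EC-1", ["UNI-A", "UNI-B", "UNI-C"])
assert ec.deliverable("UNI-A", "UNI-B")      # both UNIs belong to the EC
assert not ec.deliverable("UNI-A", "UNI-X")  # UNI-X is not in the EC
```

A multipoint EC is simply one whose UNI list has more than two members; a point-to-point EC is the two-member case.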
Figure 11-1. Illustration of Ethernet service areas (from ITU-T Rec. G.8011). ETH = Ethernet MAC layer network; ETY = Ethernet physical layer network; FD = flow domain.
From a customer's perspective, the EC connectivity can be one of two types:

• Line connectivity (point-to-point)
• LAN connectivity (point-to-multipoint or multipoint-to-multipoint)

From a transport network viewpoint, these line and LAN connections can be provided either through dedicated transport channels (including router bandwidth) or through a shared medium. The former is referred to as private service, and the latter as virtual private service. The difference between a private line connection and a virtual private line connection is illustrated in Figure 11-2. The service type variations are summarized in Table 11-2.
Figure 11-2. Illustration of private and virtual private connections: a) EPL for two customers, each with his or her own TDM channel; b) EVPL for two customers where they share a TDM channel for increased efficiency
Table 11-2. Summary of the types of Ethernet services

Connectivity     Resource sharing   Service type
Point-to-point   Dedicated          EPL (Ethernet Private Line)
Point-to-point   Shared             EVPL (Ethernet Virtual Private Line)
Multipoint       Dedicated          EPLAN (Ethernet Private LAN)
Multipoint       Shared             EVPLAN (Ethernet Virtual Private LAN)

Note: The MEF (see Chapter 9) refers to EPL and EVPL as E-Line services and to EPLAN and EVPLAN as E-LAN services.
11.2.1 Ethernet connection (EC) attributes The Ethernet connection service attributes that must be taken into account in a transport or service provider network are summarized in Table 11-3 and described in more detail in the following text.
Table 11-3. Ethernet connection service attributes (derived from ITU-T Rec. G.8011)

EC service attribute       Service attribute parameters and values
Network connectivity       Point-to-point, point-to-multipoint, multipoint-to-multipoint
Transfer characteristics   Address (deliver conditionally or unconditionally);
                           Drop Precedence (drop randomly, drop conditionally, or not applicable);
                           Class of Service
Separation                 Customer and service instance: spatial or logical
Link type                  Dedicated or shared
Connectivity monitoring    Sublayer monitoring: on demand, proactive, or none;
                           Inherent monitoring: proactive
Bandwidth profile          Specified
UNI list                   An arbitrary text string to uniquely identify the UNIs associated with the EC
Preservation               VLAN ID (yes or no); Class of Service (yes or no)
Survivability              None, or server-specific
11.2.1.1 Network connectivity

As noted, one way in which Ethernet services can be characterized is by the type of desired customer connectivity. The types of connectivity are

• Point-to-point
• Point-to-multipoint
• Multipoint-to-multipoint
Figure 11-3 shows examples of these different connectivity types. Figure 11-3(a) shows a basic point-to-point connection between customers through a transport network. Figure 11-3(b) shows a popular point-to-multipoint topology known as hub-and-spoke. Figures 11-3(c)-(e) show examples of various multipoint-to-multipoint topologies.

Care must be taken to avoid confusing the customer connectivity (logical topology) with the physical topology of the underlying network providing that connectivity. For example, Figures 11-3(b) and 11-3(d) use the same physical topology. The difference between a hub-and-spoke and a star network is that a star network provides arbitrary multipoint-to-multipoint connectivity among all the customer nodes, while a hub-and-spoke network connects the hub customer node to each of the spoke customer nodes (point-to-multipoint). Any connectivity between spoke nodes would have to be provided by a router at the customer's hub node. A logical hub-and-spoke
network could be provided over the physical topology of any of the networks in Figures 11-3(b)-(e). Figures 11-3(c)-(e) illustrate common transport network topologies. In reality, a transport network will often consist of a combination of star, ring, and more arbitrary mesh subnetworks.
Figure 11-3. Network connectivity examples: a) point-to-point; b) hub and spoke; c) ring; d) star; e) arbitrary (CE = customer edge)
Section 11.3 discusses transport network models.
Figure 11-4. Network portion of the multipoint-to-multipoint topology (from ITU-T Rec. G.8011). ABC, PQR, and XYZ are server layer networks (they can all be the same or different); each may be CO-CS, CO-PS, or CL-PS.
When discussing multipoint-to-multipoint connectivity, it is common to refer to the network topological component that provides the desired forwarding of information between source and destination UNIs as a flow domain. In the example of Figure 11-4, if customers M and N are exchanging data, each is connected to a flow domain (FD), with the flow domains being connected through an Ethernet link flow (LF) over the ABC network entity.

A point-to-point connection can be characterized as either not having a flow domain or as having a flow domain with only two flow points (i.e., the two endpoints of the network connection). The point-to-point connection is typically described as not having a flow domain, since a flow domain implies a layer network with inherent switching/routing and other layer network capabilities. A flow domain with only two points is really the degenerate case of a multipoint network and leaves open the potential of adding additional flow points (UNIs here) to the network.
11.2.1.2 Transfer characteristics

The transfer characteristics of a network relate to which frames are to be delivered to the destination unconditionally, which are to be delivered conditionally, and which may be dropped. In the case of Ethernet, the three parameters that determine the disposition of a frame are address, Drop Precedence (DP), and Class of Service (CoS). For the address, a frame can either be delivered unconditionally, regardless of its destination address, or be delivered for only some destination addresses. The DP indicates the relative priority of the frame if it encounters a congestion situation in which frame dropping must occur. If dropping is based on the DP, frames are said to be dropped conditionally. Another option is dropping randomly (i.e., dropping the overflow frames from a full queue). For some services, frames cannot be dropped, and hence DP is not applicable. The CoS parameter, which is based on the DP and indicates the frame's class queuing, is not fully defined at this time.

11.2.1.3 Separation (customer and service instance)

Separation refers to how the traffic of each customer or service instance is kept separate from the traffic of others. In the case of customer separation, it is the traffic from different customers that is separated. In the case of service instance separation, it is the different service instances that are separated, even for the same customer. Spatial separation implies a circuit-switched network (e.g., a TDM network in which each customer has its own TDM channel or facility, or a virtual circuit such as in an ATM network). Logical separation implies that customer or service instance traffic is separated at the packet level (i.e., based on per-packet information such as address or tag values).

11.2.1.4 Link type

A link can either be dedicated to a single customer service instance or shared among multiple service instances.
For a dedicated link, a single customer service instance has a one-to-one mapping to a set of one or more Ethernet links and the associated server layer trail (i.e., a spatial separation from other service instances). As such, the service instance does not compete for bandwidth/resources (transport and switch fabric bandwidth) with other service instances, and multiplexing is not allowed on the access link. (See Figure 11-2(a).) On the other hand, a shared link allows more than one service instance to share that link (i.e., logical separation), which
means that the service instances can compete for the link resources. (See Figure 11-2(b).)

11.2.1.5 Connectivity monitoring

Connectivity monitoring is the mechanism by which network nodes determine their ongoing connectivity to their neighbor nodes in that layer network. See Section 11.6 for a discussion of this and other OAM topics.

11.2.1.6 Bandwidth profile

A bandwidth profile specifies the parameters that a traffic flow must meet at a UNI or NNI. Policing of the bandwidth profile is typically done at the edge of the transport network. See Chapter 10 for a more detailed discussion of bandwidth profiles.

11.2.1.7 UNI list

For the purposes of management and control, a service provider assigns an arbitrary string to uniquely identify each UNI.

11.2.1.8 Preservation

Preservation refers to whether a customer's Ethernet frame VLAN ID and/or CoS are preserved through the transport network. If the value is preserved, it will have the same value at the egress UNI that it had at the ingress UNI of the transport network. In some circumstances, however, it may be necessary or desirable to change these values within the transport network. For example, the service provider may perform a translation between the VLAN ID values that a customer uses on the customer side of the network and a different set of VLAN IDs that are used within the service provider network. As another example, if a frame fails to meet the specified bandwidth profile, the ingress node may choose to let it pass into the transport network but will set its DP to a higher value so that it is prioritized for dropping if it encounters congestion.

11.2.1.9 Survivability

Survivability pertains to the ability of the network to continue to provide service when one or more faults exist in the network. See Section 11.7 for more details.
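The bandwidth-profile policing and DP-remarking behaviors described in Sections 11.2.1.6 and 11.2.1.8 can be illustrated with a single token bucket. This is a simplified sketch under assumed semantics; real bandwidth profiles (e.g., the MEF's) add excess rate/burst parameters and color modes, and all names here are illustrative.

```python
# Simplified single-token-bucket policer for a CIR/CBS bandwidth profile.
# In-profile frames pass with DP unchanged; excess frames are admitted
# but marked with a higher Drop Precedence, as described in 11.2.1.8.
# Parameter names and return values are illustrative, not normative.

class TokenBucketPolicer:
    def __init__(self, cir_bytes_per_s, cbs_bytes):
        self.cir = cir_bytes_per_s     # committed information rate
        self.cbs = cbs_bytes           # committed burst size (bucket depth)
        self.tokens = cbs_bytes        # bucket starts full
        self.last_time = 0.0

    def police(self, frame_len, now):
        # Refill tokens at CIR, capped at the CBS bucket depth.
        self.tokens = min(self.cbs,
                          self.tokens + self.cir * (now - self.last_time))
        self.last_time = now
        if frame_len <= self.tokens:
            self.tokens -= frame_len
            return "conforming"        # forward with DP unchanged
        return "excess"                # forward, but raise the DP value

p = TokenBucketPolicer(cir_bytes_per_s=1_000_000, cbs_bytes=3000)
assert p.police(1500, now=0.0) == "conforming"
assert p.police(1500, now=0.0) == "conforming"   # burst up to CBS
assert p.police(1500, now=0.0) == "excess"       # bucket exhausted
assert p.police(1500, now=1.0) == "conforming"   # refilled after 1 s
```

The bucket depth (CBS) bounds how large a back-to-back burst is accepted at line rate; the refill rate (CIR) bounds the long-term average.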
11.2.2 Ethernet Private Line (EPL) service EPL, as illustrated in Figures 11-5 and 11-2, consists of point-to-point Ethernet connections using reserved, dedicated bandwidth. With EPL, the transport network effectively looks like a "piece of wire" from the Ethernet client perspective. From the transport network provider standpoint, however, the transport network (server layer) provides the performance monitoring and protection capabilities required for guaranteeing the service level agreement (SLA) with the customer.
Figure 11-5. EPL service illustration
The primary advantages of EPL are the simplicity of setting up a dedicated circuit and the security for the traffic that is inherent when it is isolated in its own TDM channel. Sharing bandwidth can lead to greater bandwidth efficiency in the transport network due to statistical multiplexing gains; however, shared bandwidth is more difficult to administer, since it requires additional effort (e.g., traffic engineering and monitoring) in order to guarantee the customer SLA.

EPL service is described in ITU-T G.8011.1 [10]. The EPL connection characteristics are summarized in Table 11-4. ITU-T G.8011.1 defines two types of EPL services. For Type 1 EPL, the CI transferred between the UNIs is the Ethernet MAC frames. As described in Section 11.3, the Ethernet preamble, start-of-frame delimiters, and interframe characters are discarded, and the Ethernet MAC frames are then encapsulated (e.g., into GFP-F) and mapped into the transport channel. Type 2 EPL, which is only defined for 1 Gbit/s Ethernet, treats the 8B/10B line code information as the CI to be transferred between the UNIs. As discussed in Chapter 5, the data and control code information from the Ethernet signal's 8B/10B line code characters are translated into a more bandwidth-efficient 64B/65B block code, and multiple 64B/65B codes are then mapped into a GFP frame (GFP-T). The primary advantages of Type 2 EPL are the preservation of control codes (primitive sequences of special line code
characters) and lower mapping latency. While it is possible to also define Type 2 EPL for 4B/5B-encoded 100 Mbit/s Ethernet, there has been no formal request for this service so far.

Table 11-4. EPL connection characteristics (derived from ITU-T Rec. G.8011.1)

EC service attribute       Service attribute parameters and values
Network connectivity       Point-to-point
Transfer characteristics   Address: deliver unconditionally;
                           Drop Precedence: not applicable;
                           Class of Service
Separation                 Customer and service instance: spatial or logical (always connection oriented)
Link type                  Dedicated
Connectivity monitoring    None, on-demand, or proactive
Bandwidth profile          Committed information rate (CIR) and committed burst size (CBS)
UNI list                   An arbitrary text string to uniquely identify the UNIs associated with the EC
Preservation               VLAN ID is preserved; Class of Service is preserved
Survivability              None, or server-specific
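The Type 1 EPL mapping just described (discard the preamble/SFD and interframe characters, then encapsulate the MAC frame in GFP-F) can be sketched as follows. The GFP details are heavily simplified: the payload header and scrambling are omitted, and the CRC parameters below are illustrative, not the normative cHEC computation of ITU-T G.7041.

```python
# Conceptual sketch of Type 1 EPL mapping: the Ethernet preamble/SFD are
# discarded and the MAC frame is wrapped in a GFP-F frame whose core
# header carries a Payload Length Indicator (PLI) protected by a CRC-16
# (the cHEC). Payload header, scrambling, and exact CRC parameters of
# ITU-T G.7041 are omitted; this helper is illustrative only.

def crc16(data, poly=0x1021, crc=0x0000):
    """Bitwise CRC-16 (illustrative parameters, not the normative cHEC)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

def gfp_f_encapsulate(mac_frame):
    pli = len(mac_frame).to_bytes(2, "big")   # payload length indicator
    chec = crc16(pli).to_bytes(2, "big")      # protects the core header
    return pli + chec + mac_frame             # core header + payload

# An Ethernet line signal: 7-byte preamble + SFD + MAC frame.
preamble_sfd = bytes([0x55] * 7 + [0xD5])
mac_frame = bytes(64)                         # minimum-size MAC frame
line_signal = preamble_sfd + mac_frame

gfp = gfp_f_encapsulate(line_signal[8:])      # preamble/SFD discarded
assert len(gfp) == 4 + 64                     # 4-byte core header + payload
assert gfp[:2] == (64).to_bytes(2, "big")
```

The corrupted-header case is why the cHEC matters: a receiver that cannot trust the PLI loses frame delineation, so the length field is CRC-protected.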
11.2.3 Ethernet virtual private line service (EVPL)

EVPL is also a line service; however, the line can be derived from a flow domain, which allows network resources to be shared among multiple customers or service instances in order to achieve more efficient use of those resources. EVPL is illustrated in Figure 11-2(b). In addition to allowing more efficient use of transport network resources, another potential advantage of EVPL is a reduction in the number of UNIs required at the customer edge. This is illustrated in Figure 11-16 in Section 11.4.1: for the customer edge node on the left to connect to four other nodes would otherwise require four different UNIs and their associated ports. Service multiplexing is the packet multiplexing of multiple ECs onto a single UNI.

While EVPL is still under study in ITU-T SG15, its expected connection characteristics are summarized in Table 11-5. For virtual connections, the separation is logical (i.e., at the packet level). Due to the sharing of network resources, it is possible that frames may be dropped due to congestion. Also, as discussed above, a service provider may wish to perform VLAN ID translation at the boundaries of the transport network.
Table 11-5. Expected EVPL connection characteristics

EC service attribute       Service attribute parameters and values
Network connectivity       Point-to-point
Transfer characteristics   Address (deliver conditionally or unconditionally);
                           Drop Precedence (drop randomly, drop conditionally, or not applicable);
                           Class of Service
Separation                 Customer and service instance: logical
Link type                  Shared
Connectivity monitoring    None, on-demand, or proactive
Bandwidth profile          Specified
UNI list                   An arbitrary text string to uniquely identify the UNIs associated with the EC
Preservation               VLAN ID (yes or no); Class of Service (yes or no)
Survivability              None, or server-specific
11.2.4 Ethernet private LAN (EPLAN) service

An EPLAN provides LAN-type connectivity between multiple customer sites through dedicated channels. Figure 11-6 illustrates some of the different basic transport network topologies that can support this service. From the customer viewpoint, these topologies are equivalent (i.e., the carrier network architecture is transparent to the customer). In Options 1 and 3, the carrier does the switching at the edge of the network; Option 3 does the switching at one end of the network rather than at each end. In Option 2, the traffic is brought to a centralized switch (or a number of centralized switch points) in a star connection. Since the switching is performed at Layer 2 in these examples, an MSPP can be used to implement Options 1 and 3.

Open issues to be resolved for an EPLAN standard include the following:

• How do the customer and carrier specify the bandwidth requirements? For example, if the traffic were evenly distributed among the different customer nodes, the bandwidth between nodes could be specified on the basis of CIR. The more realistic scenario, however, is that multiple customer nodes will want to communicate simultaneously with a single node (e.g., remote sites communicating with a headquarters office). A safe policy would be to reserve enough bandwidth for each node to simultaneously receive data at full rate from each other node; however, this would be too inefficient to be practical.
• Closely related to the above issue, how much buffering must the carrier provide to handle congestion, and what will the discard policy be?
• Is protection handled at Layer 1 (e.g., SONET APS) or Layer 2?
Figure 11-6. EPLAN illustrations: a) mesh-type connectivity; b) traffic hauled to a centralized switch point(s); c) edge node serves as a bridge or router
While EPLAN is still under study, its expected connection characteristics are summarized in Table 11-6.

Table 11-6. EPLAN expected connection characteristics

EC service attribute       Service attribute parameters and values
Network connectivity       Multipoint-to-multipoint (and probably point-to-multipoint)
Transfer characteristics   Address: deliver unconditionally;
                           Drop Precedence: for further study;
                           Class of Service
Separation                 Customer and service instance: spatial or logical (always connection oriented)
Link type                  Dedicated
Connectivity monitoring    None, on-demand, or proactive
Bandwidth profile          Committed information rate (CIR) and committed burst size (CBS)
UNI list                   An arbitrary text string to uniquely identify the UNIs associated with the EC
Preservation               VLAN ID is preserved; Class of Service is preserved
Survivability              None, or server-specific
11.2.5 Ethernet virtual private LAN (EVPLAN) service

EVPLAN is a combination of EVPL and EPLAN. The transport channel bandwidth is shared among different customers, as are the routers in the carrier network. Ultimately, the sharing of bandwidth in the transmission channels and switch fabrics gives EVPLAN the potential for very cost-effective carrier network resource utilization. Clearly, however, EVPLAN is the most complicated network architecture to administer.

The open issues regarding EVPLAN transport architectures include all of those already discussed for EVPL and EPLAN; however, the magnitude of some of these issues is greatly increased for EVPLAN, which in turn restricts some of the potential solution space. For example, the tagging mechanism used to differentiate the data from different customers, and the different data flows within each customer data stream, must have an adequately large address space. (E.g., the 4K address space of VLAN tags makes them impractical for large EVPLANs. Also, their applicability to only Ethernet frames further lessens their appeal for a generic data network.) While EVPLAN is still under study, its expected connection characteristics are summarized in Table 11-7.
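The "4K address space" quoted above is the 12-bit VLAN ID field of the IEEE 802.1Q tag. A small sketch of the tag control information (TCI) layout makes the limit concrete; the helper name is invented for illustration, but the field widths are those of 802.1Q.

```python
# The IEEE 802.1Q tag control information (TCI) packs a 3-bit priority
# code point (PCP), a 1-bit drop-eligible indicator (DEI), and a 12-bit
# VLAN ID into 16 bits, so only 2**12 = 4096 distinct VLAN IDs exist --
# the "4K address space" that limits VLAN tags for large EVPLANs.

def parse_tci(tci):
    pcp = (tci >> 13) & 0x7    # priority code point (CoS)
    dei = (tci >> 12) & 0x1    # drop-eligible indicator
    vid = tci & 0xFFF          # VLAN identifier
    return pcp, dei, vid

assert parse_tci(0xA064) == (5, 0, 0x064)   # priority 5, VLAN 100
assert (1 << 12) == 4096                    # the entire VLAN ID space
```

A carrier serving many thousands of EVPLAN instances therefore needs a larger tag space than a single 802.1Q VID can provide.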
Table 11-7. EVPLAN expected connection characteristics

EC service attribute       Service attribute parameters and values
Network connectivity       Multipoint-to-multipoint (and probably point-to-multipoint)
Transfer characteristics   Address (deliver conditionally or unconditionally);
                           Drop Precedence (drop randomly, drop conditionally, or not applicable);
                           Class of Service
Separation                 Customer and service instance: logical
Link type                  Shared
Connectivity monitoring    None, on-demand, or proactive
Bandwidth profile          Specified
UNI list                   An arbitrary text string to uniquely identify the UNIs associated with the EC
Preservation               VLAN ID (yes or no); Class of Service (yes or no)
Survivability              None, or server-specific

11.3. TRANSPORT NETWORK MODELS IN SUPPORT OF ETHERNET CONNECTIVITY SERVICES
The preceding sections discussed the service types and characteristics associated with Ethernet WAN connectivity services. This section discusses transport models and architectures, together with a functional description of their underlying client signal adaptation and termination processes.

As the demand for managed Ethernet connectivity services increases, transport and service providers are finding that, over the near- to medium-term time frames, economic optimality is best secured and customer demand best served by implementing a connection-oriented packet-switched (CO-PS) transport infrastructure that leverages the existing connection-oriented circuit-switched (CO-CS) TDM networks. Within this framework, and of vital importance to this emerging CO-PS transport model, there is a need for a multiplexing scheme that enables efficient, payload-transparent transport of multiservice packets and Ethernet MAC frames over optical networks.

To better understand the variety of applications being pursued and addressed by this emerging transport model, we provide a set of representative service scenarios in Figures 11-7 through 11-10. Figure 11-7 depicts the aggregation of a single customer's multiple flows over the access portion of the network, where those multiple flows may be differentiated based on class of service (high-priority guaranteed
bandwidth such as voice, video, etc. vs. best-effort services such as e-mail, Internet access, etc.), quality of service (error performance, latency, delay variation), or type of traffic (voice, video, Ethernet data access, Fibre Channel storage area networking, etc.).
Figure 11-7. Single-customer flow aggregation
Figure 11-8 provides an example of the delivery of data services to multiple customers co-located in one or more MxUs over the metro area. MxUs can be apartments or multidwelling units (MDUs), office buildings or multitenant units (MTUs), and hotels or multihospitality units (MHUs).
Figure 11-8. Multiple-customer flow aggregation
Figure 11-9. DSLAM uplink aggregation
Depicted in Figure 11-9 is an application example of aggregating traffic from multiple Ethernet-based DSLAM uplinks. Figure 11-10 provides an application example where EoS/PoS/DoS (Ethernet/Packet/Data over SONET/SDH) traffic from multiple, partially populated SONET/SDH pipes is groomed at the packet level.

In the remainder of this section, we describe some of the feature requirements being discussed in the industry to support the generalized service multiplexing mechanism that is vital for the deployment of EPL and EVPL connectivity services over the infrastructure of the emerging CO-PS network.

In many service multiplexing applications, large numbers of packet flows need to be multiplexed. For example, an access interface to a service provider edge router may serve 10-20 remote aggregators, each aggregating 10-50 flows. This results in 100-1000 flows per access interface. Maximum efficiency of the transport network, as well as of the router access interfaces, is achieved if all the flows are multiplexed in the packet domain, rather than the TDM domain, before feeding the router. Minimum cost (capital and operational expense) is achieved if we can keep these aggregators "dumb," i.e., transport layer aggregators rather than higher layer aggregators (e.g., routers). Hence the transport network should enable multiplexing of thousands of multiprotocol, bursty data flows into a single SONET/SDH interface. In the direction from the edge router or Ethernet
switch to the access network, individual packets must be labeled such that the access network can identify the physical destination port of each packet.
Figure 11-10. EoS/PoS/DoS aggregation
Video-on-demand distribution to residential customers is another example where thousands of flows should be multiplexed asynchronously over the access ("feeder") network.

Any generalized multiplexing procedure or mechanism should be capable of tagging a service packet with a unique identifier and carrying the tagged packet over the optical transport link. The tagging of flows on ingress with unique (per-flow) tags enables the multiplexed flows to be uniquely separated and identified as to ownership and destination at the far end of the multiplexed link. This process would, in a very fundamental sense, effectively enable a "PVC" (Permanent Virtual Circuit) mode multiplexing model, whereby tag-based multiplexed flows could be connected to the appropriate endpoints or destinations across the SDH network.

In order to maximize the utilization of the transport network, transport service providers may offer oversubscribed services. In that case, packet multiplexing can be performed in an arbitrary manner, or in a more controlled fashion with support for SLAs. Accordingly, client-agnostic priority information and drop precedence information must be carried with the packets inside the transport network.
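The tag-on-ingress, separate-by-tag-on-egress behavior just described can be sketched as a multiplexer/demultiplexer pair. The tag values and data structures below are invented for illustration and do not represent any standard frame format.

```python
# Illustrative sketch of PVC-mode flow multiplexing: each flow is tagged
# with a unique per-link identifier on ingress, interleaved onto a single
# transport link, and separated by tag at the far end. Tag values and
# structure here are invented for illustration, not a standard format.

def mux(flows):
    """flows: {tag: [packets]} -> one interleaved stream of (tag, pkt)."""
    link = []
    for tag, packets in flows.items():
        link.extend((tag, pkt) for pkt in packets)
    return link

def demux(link):
    """Recover the per-flow streams from the tagged link stream."""
    flows = {}
    for tag, pkt in link:
        flows.setdefault(tag, []).append(pkt)
    return flows

ingress = {17: [b"voice-1", b"voice-2"], 42: [b"email-1"]}
assert demux(mux(ingress)) == ingress   # tags keep the flows separable
```

The essential property is that demultiplexing needs nothing but the tag: the transport nodes never have to inspect the client payload, which is what makes the scheme payload-agnostic.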
The multiplexing capability should be payload (service) agnostic and should permit the transport network to evolve independently of the services networks, while supporting the addition of new client services as they evolve. The tagging mechanism should allow for SLA function support.

In general, if the emerging CO-PS-based transport network is to be capable of addressing the preceding service requirements, it must be outfitted with a mechanism that not only provides for service multiplexing support but, more importantly, enables the suite of connection-oriented attributes that are vitally necessary for the realization of a CO-PS network. In this regard, the following functional characteristics are of first-order importance to the emerging CO-PS network:

• Tag the packets with an appropriate flow/service ID that guarantees uniqueness over the service domain
• Provide for unconstrained allocation of multiplexing identifier tags
• Support an extensible set of packet service types
• Support a service-type identifier so that traffic can be directed to the appropriate adaptation function
• Support packet multiplexing using traffic management capabilities that are suitable for controlled aggregation of best-effort traffic, but that can be extended to support contracted SLAs
• Provide client-agnostic drop precedence (SLA-nonconforming) indication to aid in meeting QoS characteristics
• Provide client-agnostic priority/CoS indication to support CoS characteristics
• Provide protection for the header/tag, with the following protection characteristics:
  — Insertion or removal of tags must not compromise end-to-end performance monitoring
  — Tag stacking in support of multiple domains or domain extension must not compromise end-to-end or domain-specific performance monitoring
  — The tag must be protected against transmission channel errors in order to prevent misrouting of data packets
  — A simple means of administering tag IDs must be provided that is compatible with existing carrier practice

As the CO-PS network emerges with the above-described functional attributes, at a very fundamental transport level it should still behave as a traditional SONET/SDH network (depicted in Figure 11-11). The key difference would be the existence of a more sophisticated connection-oriented and QoS/CoS paradigm.
Ethernet Services Over Public WAN
Figure 11-11. SONET/SDH physical layer transport for EoS
The Figure 11-11 physical layer would be expected to provide the traditional transport infrastructure functions such as the following:
• Uses SDH/optics (use of fiber/copper Ethernet at the edge)
• Provides physical connectivity between nodes (STM-N rings, etc.)
• Handles protection at the physical level and transmission performance management
When the CO-PS aspects are taken into account, the Figure 11-11 network layer becomes modified as in Figure 11-12. There we observe support for a so-called packet switching sub-layer that:
• Implements packet labeling
• Provides logical connectivity between ports
• Performs statistical multiplexing of customer traffic
• Ensures traffic management & CoS
• Handles protection at the packet level
Figure 11-12 can be modified to provide a layered network view that takes into account the architectural attributes necessary for implementing CO-PS-based EPL services. This setup is depicted in Figure 11-13.
Figure 11-12. Packet switching sublayer overlay for EoS
The diagrammatic representations and discussion in this section have addressed examples of the layered architectures for CO-PS-based support for EoS. Earlier, we also outlined the functional attributes that would be necessary for the effective transformation of the current SONET/SDH transport infrastructure into one that provides a CO-PS basis for EoS. With this background in hand, we shall now begin a discussion of the details of the packet mechanism that would be necessary for the implementation of this new network paradigm. A CO-PS network can realistically be realized by leveraging two existing technologies (MPLS and GFP). To effectively support the CO-PS infrastructure, both technologies would require (varying levels of) enhancement and modification. In the remaining paragraphs of this section, we shall summarize the basic frame formats and identify, where possible, the necessary modifications to properly support CO-PS.
Figure 11-13. Layered architecture for EPL services
We begin with a brief examination of MPLS as a CO-PS candidate. MPLS is a complex series of protocols encompassing traffic engineering (TE), packet forwarding (i.e., the data plane), and label switched path (LSP) setup. For our purposes at hand, the aspect of MPLS that is most relevant for CO-PS is the data plane (MDP). Figure 11-14 provides the essence of the MDP. One unfortunate aspect of the MDP is that it does not appear to unambiguously provide for CoS and DP (drop precedence) in the shim header. In fact, there are two different approaches to support CoS and DP. One way is via the L-LSP (Label-inferred LSP), and the other is by way of the E-LSP (EXP-inferred LSP). The L-LSP maps the CoS information into the label value, while the drop precedence is mapped into the EXP bits in the MPLS shim header (much like the CLP bit in ATM). The E-LSP maps both the CoS and the DP information into the EXP bits of the MPLS shim header. An alternative approach has been suggested in which the CoS and DP information is encoded in an extended GFP header (eGFP). There is a substantial similarity between eGFP and the L-LSP approach taken by RFC 3270. The approach proposed for eGFP, however, has the advantage of being able to leverage the eHEC mechanism that natively
protects the extension header of the GFP frame. Figure 11-15 gives a visual depiction of the eGFP frame format.
[Figure 11-14 content: the 32-bit MPLS shim header comprises Label (20 bits), EXP (3 bits), S (1 bit), and TTL (8 bits). With an L-LSP, forwarding and CoS are inferred from the label and DP from the EXP bits; with an E-LSP, forwarding is inferred from the label and both CoS and DP from the EXP bits.]
Figure 11-14. MPLS Data Plane (MDP)
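The shim-header layout in Figure 11-14 can be expressed as a small pack/parse sketch. Python is used purely for illustration; the 2-bit-CoS/1-bit-DP split shown for E-LSP encoding is our own assumption for the example, since the text notes that no unambiguous mapping is mandated.

```python
import struct

def pack_mpls_shim(label: int, exp: int, s: bool, ttl: int) -> bytes:
    """Pack one 32-bit MPLS shim header entry:
    Label (20 bits) | EXP (3 bits) | S (1 bit) | TTL (8 bits)."""
    assert 0 <= label < (1 << 20) and 0 <= exp < 8 and 0 <= ttl < 256
    word = (label << 12) | (exp << 9) | (int(s) << 8) | ttl
    return struct.pack(">I", word)

def parse_mpls_shim(data: bytes) -> dict:
    """Recover the shim-header fields from the first 4 bytes."""
    (word,) = struct.unpack(">I", data[:4])
    return {"label": word >> 12, "exp": (word >> 9) & 0x7,
            "s": bool((word >> 8) & 0x1), "ttl": word & 0xFF}

def e_lsp_exp(cos: int, dp: int) -> int:
    """E-LSP: both CoS and DP ride in the 3 EXP bits.
    The split (2 bits CoS, 1 bit DP) is illustrative, not standardized."""
    return ((cos & 0x3) << 1) | (dp & 0x1)
```

With an L-LSP, by contrast, `pack_mpls_shim` would be called with the CoS already folded into the label assignment, and `exp` would carry only the drop precedence.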
Both the MDP and eGFP would require some modification in order to effectively support (in the short to medium term) the emerging CO-PS network. The modifications required in both cases are depicted in Table 11-8 and Table 11-9. In summary, the emerging CO-PS network can be made realizable in the short to medium term by leveraging the existing SONET/SDH network infrastructure and enhancing it with a generalized packet multiplexing capability. This approach would allow the service and transport provider to efficiently deploy EPL and EVPL services (with the necessary customer traffic isolation and protection) and benefit from optimizing their network resources through statistical gain with guaranteed performance (delay, jitter, loss). An architectural approach based upon either the MDP or eGFP (or a combination of these technologies) could be the basis for a scalable, efficient, and simple CO-PS path layer network. Moreover, given its hybrid TDM/packet nature, it would be capable of cost-effectively supporting legacy transport services relative to full packet switching infrastructures.
[Figure 11-15 content: the proposed eGFP extension header comprises a TBD field, Label (16 bits), CoS (3 bits), DP (3 bits), S (1 bit), and Reserved (9 bits).]
Figure 11-15. Extended GFP (eGFP) frame format
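Assuming the field ordering shown in Figure 11-15 and packing the Label, CoS, DP, S, and Reserved fields into a single 32-bit word (the exact bit positions are our assumption; the eGFP proposal itself leaves them open), the header could be handled as follows:

```python
import struct

def pack_egfp_ext(label: int, cos: int, dp: int, s: bool, reserved: int = 0) -> bytes:
    """Pack the proposed eGFP extension-header fields into 32 bits:
    Label (16b) | CoS (3b) | DP (3b) | S (1b) | Reserved (9b).
    Bit positions are illustrative assumptions, not a published layout."""
    assert 0 <= label < (1 << 16) and 0 <= cos < 8 and 0 <= dp < 8
    assert 0 <= reserved < (1 << 9)
    word = (label << 16) | (cos << 13) | (dp << 10) | (int(s) << 9) | reserved
    return struct.pack(">I", word)

def parse_egfp_ext(data: bytes) -> dict:
    """Recover the proposed eGFP extension-header fields."""
    (word,) = struct.unpack(">I", data[:4])
    return {"label": word >> 16, "cos": (word >> 13) & 0x7,
            "dp": (word >> 10) & 0x7, "s": bool((word >> 9) & 0x1),
            "reserved": word & 0x1FF}
```

Note that, unlike the MPLS shim header, CoS and DP here occupy dedicated fields, which is precisely the ambiguity the eGFP proposal aims to remove.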
Table 11-8. MDP frame format: required modifications for CO-PS
Tributary Slot ID: 20-bit label
CoS/DP: 3-bit EXP
Hierarchy: Infinite tunneling capability
Tunnel Connection Monitoring: MPLS OAM
Nested Connection Monitoring: MPLS OAM
Pro-active FM/PM: MPLS OAM (CV, FFD, FDI, BDI)
On-demand fault location: MPLS OAM (ping)
SNCP: MPLS protection switching; further enhancements/extensions are required
SPRing: To be developed
Mapping: IP mapping is available; ATM, FR, ETH mappings are under development in IETF PWE3
Layer Network Interworking: MPLS/ATM, MPLS/FR are under development in Q.5/13
Table 11-9. Extended GFP (eGFP) frame format: required modifications for CO-PS network
Tributary Slot: Proposed 16/20-bit Flow ID
CoS/DP: Proposed 3-bit Priority and DP fields
Hierarchy: Proposed accommodation for stacking in the extension header
Tunnel Connection Monitoring: To be developed
Nested Connection Monitoring: To be developed
Pro-active FM/PM: To be developed
On-demand fault location: To be developed
SNCP: To be developed
SPRing: To be developed
Mapping: To be developed
Layer Network IW: To be developed
Switching: Development of connection-oriented GFP switching
11.4. ETHERNET CLIENT INTERFACES
The client interface (i.e., the UNI) for Ethernet services will typically be an Ethernet physical interface. In the past, the UNI for WAN data
connections was typically based on telecom signals (e.g., DS1, DS3, fractional-DS1), which required either special customer equipment (e.g., a T1 multiplexer) or special WAN port cards on customer routers. Being able to use an Ethernet interface is a significant advantage for enterprise customers. Ethernet interfaces are typically less expensive than telecom WAN interfaces, especially at higher bit rates, and are supported by relatively inexpensive Ethernet LAN routers. Using an Ethernet interface also allows the enterprise network administrators to keep their OAM in the Ethernet domain and allows the carriers to handle the telecom signal domain. The Ethernet UNI service attributes that are common to all services are shown in Table 11-10, and the attributes that are service dependent are shown in Table 11-11.

Table 11-10. UNI service attributes common to all services
Layer ETH:
  MAC service: IEEE 802.3 frame format
  UNI ID: Arbitrary text string to identify each UNI instance
  UNI EC ID: Arbitrary text string to identify each EC instance
Layer ETY:
  PHY speed: 10 Mbit/s, 100 Mbit/s, 1 Gbit/s, or 10 Gbit/s
  PHY medium: IEEE 802.3 physical interface

Table 11-11. Service-dependent UNI service attributes
Layer ETH:
  Multiplexed access: Yes, no
  VLAN mapping: Specify
  Bundling: Yes, no, all-to-one
  Bandwidth profile: For further study for most services
  Layer 2 control protocol processing: Block, process, or pass per protocol on ingress; generate, or none per protocol on egress
Layer ETY:
  PHY mode: Full duplex, half duplex, or auto-negotiation
The UNI service attributes common to all services are reasonably self-explanatory. The service-dependent attributes are explained as follows.
11.4.1 Multiplexed access

The multiplexed access aspect of the UNI relates to whether a customer edge node has an individual UNI associated with each of the far-end UNIs to which it is connected, or whether a UNI is shared for the connections to multiple other customer UNIs. These cases are illustrated in Figure 11-16.
Figure 11-16. Service multiplexing example: (a) multiple point-to-point EPL (one EVC per UNI); (b) service-multiplexed UNI (e.g., for multiple EVPL)
11.4.2 VLAN mapping

Ethernet (per IEEE 802.1Q) allows the insertion of tags into Ethernet MAC frames in order to create virtual LANs (VLANs). When a customer desires to preserve this VLAN segregation of traffic through the WAN, the carrier may simply preserve the entire MAC frame, including the VLAN tag. It is possible that both the customer and service provider desire to use VLAN technology. (More will be said about this topic in Section 11.5.) If the service provider also wishes to use VLAN technology, then it must either insert a second ("stacked") VLAN tag into the MAC frame or perform a translation of the customer VLAN tag at ingress in order to conform to the service provider's VLAN tag assignments, and then restore the customer VLAN value at egress. See Chapter 10 for a more detailed discussion of this topic.
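The two provider options just described (tag stacking and tag translation) can be sketched as byte-level operations on a MAC frame. The IEEE 802.1ad S-tag EtherType 0x88A8 is used here for the stacked tag; the helper names are our own, and the sketch ignores FCS recomputation:

```python
import struct

TPID_CTAG = 0x8100  # IEEE 802.1Q customer tag EtherType
TPID_STAG = 0x88A8  # IEEE 802.1ad service ("stacked") tag EtherType

def push_svlan(frame: bytes, s_vid: int, pcp: int = 0) -> bytes:
    """Insert a service VLAN tag after the 12 bytes of destination/source
    MAC addresses, leaving the customer's 802.1Q tag (if any) intact."""
    tci = (pcp << 13) | (s_vid & 0x0FFF)
    tag = struct.pack(">HH", TPID_STAG, tci)
    return frame[:12] + tag + frame[12:]

def translate_cvlan(frame: bytes, vid_map: dict) -> bytes:
    """Rewrite the customer VLAN ID at ingress per the provider's
    assignments; applying the inverse map at egress restores it."""
    tpid, tci = struct.unpack(">HH", frame[12:16])
    if tpid != TPID_CTAG:
        return frame  # untagged frame: nothing to translate
    new_tci = (tci & 0xF000) | vid_map[tci & 0x0FFF]  # keep PCP/DEI bits
    return frame[:12] + struct.pack(">HH", tpid, new_tci) + frame[16:]
```

Stacking preserves the customer tag end to end at the cost of four extra bytes per frame, while translation keeps the frame length unchanged but requires the provider to maintain per-UNI translation state.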
11.4.3 Bundling

Bundling refers to whether multiple customer VLAN IDs can be mapped into a single EC at a UNI. The case where all VLAN IDs are mapped to a single EC is called all-to-one bundling. See Chapter 9 for an extended discussion of bundling. It should be noted that all-to-one bundling and multiplexed access are mutually exclusive: multiplexed access means multiplexing multiple ECs into a UNI, whereas mapping all VLAN IDs into a single EC means that there is only a single EC at that UNI.
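The mutual exclusion between all-to-one bundling and multiplexed access follows directly from the mapping rules, which can be sketched as follows (the function and parameter names are hypothetical, for illustration only):

```python
def build_vlan_to_ec_map(uni_vlan_ids, bundling, multiplexed_access, ec_ids):
    """Sketch of the UNI mapping rules: 'all-to-one' bundling maps every
    customer VLAN ID to a single EC, which rules out multiplexed access
    (multiplexed access requires multiple ECs at the UNI)."""
    if bundling == "all-to-one":
        if multiplexed_access:
            raise ValueError("all-to-one bundling and multiplexed access "
                             "are mutually exclusive: only one EC exists")
        (ec,) = ec_ids  # exactly one EC at this UNI
        return {vid: ec for vid in uni_vlan_ids}
    # per-VLAN bundling/mapping: distribute VLAN IDs over the available ECs
    return {vid: ec_ids[i % len(ec_ids)] for i, vid in enumerate(uni_vlan_ids)}
```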
11.4.4 Bandwidth profile

The bandwidth profile refers to the characteristics of the traffic that a customer presents to the network at the UNI. The parameters include such things as the guaranteed committed information rate (CIR), peak information rate (PIR), burst sizes, and the treatment of excess traffic. See Chapter 10 for an extended discussion of bandwidth profiles and their policing algorithms.
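The single-rate portion of such a profile can be sketched as a token-bucket policer. This is an illustrative simplification (one rate, two colors) of the standardized policing algorithms discussed in Chapter 10:

```python
class TokenBucketPolicer:
    """Single-rate policer sketch: frames conforming to CIR/CBS are
    accepted ("green"); excess traffic is dropped or marked ("red")."""

    def __init__(self, cir_bps: float, cbs_bytes: int):
        self.rate = cir_bps / 8.0      # token fill rate, bytes per second
        self.depth = cbs_bytes         # bucket depth = committed burst size
        self.tokens = float(cbs_bytes) # bucket starts full
        self.last = 0.0                # timestamp of the previous arrival

    def offer(self, frame_len: int, now: float) -> str:
        # Replenish tokens for the elapsed interval, capped at bucket depth.
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if frame_len <= self.tokens:
            self.tokens -= frame_len
            return "green"             # conforming: delivered per the SLA
        return "red"                   # non-conforming: drop (or mark)
```

A two-rate profile would add a second bucket filled at the PIR, classifying traffic as green, yellow, or red.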
11.4.5 Layer 2 Control Protocol processing

Layer 2 Control Protocols are used by enterprise network bridges for a variety of functions. Depending on the protocol application and the WAN service, these protocols can either be passed transparently through the WAN, processed (with the network provider equipment acting as a peer to the
customer equipment), or discarded at the UNI. See Chapter 10 for a detailed discussion of carrying these protocols through metro networks, which also applies to transport networks.
11.4.6 Summary of UNI Service Attributes for Different Services

The UNI service attributes are currently only defined for EPL service (ITU-T G.8011.1). These attributes are summarized in Table 11-12. EVPL and EVPLAN (especially EVPLAN) will typically make use of bundling and multiplexed access and have more complex bandwidth profiles.

Table 11-12. Summary of UNI service attributes for different services (EPL)
Layer ETH:
  Multiplexed access: No
  VLAN mapping: No
  Bundling: All-to-one
  Bandwidth profile: CIR, Committed Burst Size (CBS)
  Layer 2 control protocol processing: All are passed except PAUSE frames, which are discarded. (The network equipment providing the UNI-N may generate PAUSE frames.)
Layer ETY:
  PHY mode: Full duplex

11.5. ETHERNET TRANSPORT NETWORK-TO-NETWORK INTERFACE (NNI)
We see from the depiction in Figure 11-24 in Section 11.6 that there are typically different client-server functions involved in a communications network relationship. Graphically depicted in that figure are the separate functions of customer, service provider, and transport provider. These different functions imply not only a difference in the client-server interface relationship models but also a difference in the information entities that are actually passed between each client-server pair. In a typical communications network based upon a client-server architecture, customers would usually connect to the service provider network for service over a user network interface (UNI). In a similar manner, the service provider would connect for transport service to the network of a transport provider over a network-to-network interface (NNI). This interface is deployable in a couple of different configurations. One way is to interconnect the separate administrative network domains of the same
service or transport provider. This is often described technically as an Intra-Domain Interface (IaDI). Another method is to interconnect two or more network domains of different service or transport providers. This is technically described as an Inter-Domain Interface (IrDI). We will dedicate the remainder of this section to discussing the specifics of the different NNI approaches that are likely to be adopted for the support of Ethernet managed connectivity services. In Figure 11-17, we provide a graphical depiction of an IaDI for a generic Ethernet over Transport (EoT) service. In this example, we see that there are two separate administrative domains for the same service provider. Note also that the same NNI is used both internally to the administrative domain and externally for the same provider network.
Figure 11-17. IaDI NNI for Ethernet over Transport service (adapted from ITU-T Draft Rec. G.8012 [11])
Turning to the IrDI, we note that (as depicted in Figure 11-18) this case is architecturally indistinguishable from the IaDI case except for the fact that the network infrastructure is independently owned. The NNI inside the individual administrative domain is exactly the same in terms of features and functionality as the NNI connecting the two networks.
Figure 11-18. IrDI NNI for Ethernet over transport service (adapted from ITU-T Draft Rec. G.8012)
As was mentioned above, there are several transport technologies (represented by different layer network models) that are capable of supporting Ethernet connectivity services. Figure 11-19 provides a graphic depiction of the different layer networks, and their NNI relationships, that are available to the service and transport operators for supporting Ethernet connectivity services. In Figure 11-19, we note that there is an NNI defined at the MAC level for connecting two IaDI or IrDI network elements (NEs). Note also that no specific identity has been given to the server transport technologies. What is of importance in this case is the fact that the client Ethernet physical interface (ETY) could be just as easily transported over server transport technology A, B, or C as long as the appropriate IaDI or IrDI NNIs are in place in the different NEs.
Figure 11-19. EoT server layer networks (from ITU-T Draft Rec. G.8012)
The discussion above has established that NNIs are based on either IaDI or IrDI relationships. These interfaces are typically defined in NEs situated at the boundary points of the network. To proceed with developing our understanding of the role of the NNI in supporting Ethernet connectivity services, we will now explore some of the details of the functional elements of the NNI. In this regard, Figure 11-20 will be useful. From Figure 11-20, we see that the typical NNI functionality provides support for the structured transfer of information associated with the following functions:
1. Management plane → e.g., plane management and resource management functions
2. Control plane → call control, connection control, and signaling functions
3. User/Data plane → user information flow transfer and the associated in-band flow and error management controls
Figure 11-20. Three planes of Ethernet UNI and NNI (from ITU-T Draft Rec. G.8012)
In a manner similar to the structure in Figure 11-20, the NNI function is further decomposed into plane-specific functions corresponding to the management, control, and user/data planes (i.e., NNIM, NNIC, and NNID, respectively). In the remainder of this section, we will provide greater details
on the NNID for EoT. Figure 11-21 provides an illustration of the basic signal structure for the NNID. In Figure 11-21, several transport server technologies for the transport of the Ethernet client signal are identified. For the CO-PS-based case that we have been discussing, SONET/SDH and PDH would be the most directly relevant transport technologies. Although not entirely visible in Figure 11-21 (but nevertheless there), the NNID function associated with the aforementioned technologies would provide for such functions as client signal adaptation and mapping, error management, link state signaling, flow control, and connection monitoring. From Figure 11-21, we can see that the methodology for mapping and adaptation of the Ethernet MAC client signal depends upon the underlying transport server technology. Since GFP is the preferred method for such mappings for the PDH, SDH/SONET, and OTH transport server technologies, we will limit our discussion to GFP. To summarize some of the information from Chapter 5, the GFP ingress source mapping process accepts the Ethernet MAC client signal and performs an octet-aligned insertion into the payload area of the GFP-F frame. It then attaches a core header, which includes the payload length indicator (PLI) and core header error check (cHEC) fields. Also added is a payload header with support for discriminators for payload type (PTI), presence of GFP payload field FCS (PFI), and extension header (EXI), as well as an indicator of user payload (UPI) and an error check over the type field (tHEC). The process described above is depicted graphically in Figure 11-22. The intent of Figure 11-22 is to provide the reader with a graphic understanding of the encapsulation process and how the Ethernet client signal format is mapped into the GFP-F frame format for subsequent transport across the WAN. Figure 11-23 continues this analysis from a more functional architectural perspective. Our purpose here is to provide a more comprehensive architectural modeling view of the manner in which the Ethernet WAN transport service via the NNID is actually implemented.
Figure 11-21. ETH interface signal structure (from ITU-T Draft Rec. G.8012)
[Figure 11-22 content: the Ethernet MAC frame (link header, M_SDU, and link trailer) is carried in the GFP payload information field. The GFP-F frame prepends a core header (PLI, cHEC) and a payload header; for the Ethernet mapping, the payload header carries PTI=000, PFI=0, EXI=0000, UPI=0x01, protected by the tHEC.]
Figure 11-22. GFP-F encapsulation of the Ethernet MAC frame (adapted from ITU-T Draft Rec. G.8012)
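The encapsulation steps just described can be sketched in code. This is a simplified illustration: the CRC-16 generator x^16 + x^12 + x^5 + 1 with a zero initial value is assumed for the cHEC/tHEC, and G.7041 details such as core-header scrambling, the optional payload FCS, and idle frames are deliberately omitted:

```python
import struct

def crc16(data: bytes, poly: int = 0x1021, init: int = 0x0000) -> int:
    """Bitwise CRC-16 with generator x^16 + x^12 + x^5 + 1, as used for
    the GFP cHEC/tHEC (zero initial value assumed here)."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def gfp_f_encapsulate(mac_frame: bytes) -> bytes:
    """Map an Ethernet MAC frame into a GFP-F frame (sketch only)."""
    # Payload header type field: PTI=000 (client data), PFI=0, EXI=0000
    # packed into the first byte; UPI=0x01 (Ethernet) in the second.
    ptype = struct.pack(">BB", 0x00, 0x01)
    payload_hdr = ptype + struct.pack(">H", crc16(ptype))   # type + tHEC
    payload_area = payload_hdr + mac_frame                  # octet-aligned insertion
    pli = struct.pack(">H", len(payload_area))              # PLI counts payload area
    core_hdr = pli + struct.pack(">H", crc16(pli))          # PLI + cHEC
    return core_hdr + payload_area
```

Because the cHEC covers only the two PLI octets, a receiver can hunt for frame boundaries by sliding a 4-byte window over the stream until the HEC check passes, which is what makes GFP self-delineating.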
In Figure 11-23, the Ethernet connectivity service is defined between a source Ethernet process at CPE_A and a sink Ethernet process at CPE_B.
We are also able to observe a fragment of the client signal source adaptation process at the CLE in CPE_A, which presents the source client signal (formatted per UNI requirements) to the ETY sink termination process in the source NE of the service provider. From there the client signal physical interface is adapted and provided to the source GFP-F adaptation process for mapping onto the SDH/SONET transport layer. This mapping by the GFP process, as well as the signal conditioning and structuring by the Sn-X function, is done strictly according to the requirements of the NNID.
Figure 11-23. Functional model of Ethernet over SDH/SONET (EoS)
11.6. OAM
The material introduced thus far in this chapter has taken the reader through the motivation for Ethernet WAN services, the various service types and characteristics, service models, and transport architecture models. In this section, we shall discuss the operations, administration, and maintenance aspects of the service, the so-called OAM dimension. From a general perspective, the issue of OAM is almost exclusively a concern of WAN, as opposed to LAN, network operators. LAN-based systems typically reside within the domain of an enterprise and, for the
most part, commit only to providing best-effort service delivery. On the other hand, most operators of public transport communication networks pay very special attention to the issue of OAM as the mechanism designed to assure the operational quality and efficiency of the network assets. More particularly, it is commonly recognized that OAM functionality is vitally important in public transport networks to guarantee operational ease, resource efficiency, and performance predictability. This all leads to a streamlining of the operational cost of running the network and ensures that its behavior conforms with the agreed-upon metrics of SLAs established with customers (ITU-T Recommendation Y.1730, Clause 7). If there is to be a broad-based deployment of managed Ethernet connectivity services, the networks offering such services must be equipped with OAM capabilities in a manner that is comparable with existing transport technologies like SONET/SDH, PDH, and ATM. This requirement is made even more necessary when one takes into account the fact that service and transport operator networks are typically hierarchically layered, with various levels of OAM requirements and responsibilities at every layer. Complicating the preceding issue is the problem of delivering the Ethernet connectivity service within the context of a multioperator framework, with the ensuing difficulty of detecting and diagnosing connectivity and other network- and service-related issues. At a more specific level, for the public network operator engaged in providing an Ethernet managed connectivity service, there are a number of very significant reasons why it is critically necessary for OAM to be supported. In what follows, we discuss some of the primary ones.
An important consideration for OAM support relates to the need for the network operator to proactively anticipate any basic availability and performance characteristics of the network that would negatively affect the quality of the transported service contracted for by its customers. This ability allows the network operator to take timely remedial action to minimize the extent and duration of any service-affecting malady. Furthermore, the information gleaned from such surveillance enables the operator to make the appropriate billing adjustments to the customer's account in the event of any degradation or service outage episodes. Streamlined OAM capabilities within the operator's network allow for the timely and structured detection, diagnosis, and correlation of defects, errors, and faults during the provision of the transport service. An immediate positive outcome would be a reduction in the support staff required (both in the office and in the field) to troubleshoot and solve customer network connectivity service issues. This
reduction would contribute importantly toward the perennial objective of minimizing operational expenditure (OPEX). Public network operators, as a rule, assume that their customers' data is confidential and very important to them. Consequently, such service attributes as guaranteed confidentiality, security, and maintained integrity of the customer's traffic are only supportable on the basis of a deployed and operational OAM system. This internal network capability would generate appropriate remedial actions to minimize any negative effects of defects that result in the customer's traffic being misrouted to an unauthorized end point. A final reason (and benefit) for OAM support in public Ethernet transport networks relates to the fact that such a mechanism removes the customer from being a critical part of the defect detection and reporting system. A streamlined OAM system would tend to maximize the number of defects that are automatically detected, obviating the need for the customer to report a detected problem to the transport provider. It also leads to the ability to identify and sectionalize defects to the appropriate errant source (i.e., to the appropriate layer, whether within the client Ethernet layered network structure or within the network operator's transport domain). This ability not only allows the transport operator to provide a more flexible, maintainable, and reliable transport service but, more importantly, leads to an increase in customer satisfaction and trust in the reliability and integrity of the operator's transport service. At this point, we shall provide a high-level introduction to the idea of the OAM domain. This discussion is intended to provide a conceptual starting point for the reader prior to introducing the more complex layered view.
Figure 11-24 introduces the three domain areas of customer, service provider, and transport operator and indicates not only the extent of the "service" but also the extent of the respective responsibilities and areas of administrative control of the three domains. The preceding discussion focused essentially on motivating OAM and highlighting some of the more important benefits it affords both the network operator and its customers engaged in an SLA-managed Ethernet connectivity service. At this point, we will approach this issue from a more formal and detailed perspective. To aid the discussion, we will employ the reference architecture models depicted in Figures 11-25 and 11-26 to illustrate the details of OAM for the case of point-to-point topologies. These reference architecture examples are most relevant for the EPL and EVPL service types.
Figure 11-24. OAM domain service vs. network
In Figure 11-25, we depict an example of a layered reference network architecture for a point-to-point Ethernet transport flow. The perspective is based upon the ITU-T functional modelling methodology defined in Recommendations G.805 [12] and G.809. Several points are worthy of mention. First, from a demarcation perspective, the equipment or so-called network elements (NEs) associated with functional reference points A and D are (typically) customer-located equipment (CLE). Second, these NEs indicate several functional relationships, including an association with the termination flow points (TFPs) for the Ethernet service as well as the ingress and egress of the transport network traffic flow. In addition, the NEs associated with reference points B and C are connected with the network of the operator. As such, they are typically situated at the edges of the transport network domain and form the ingress and egress to the operator's network. In this particular example, the client traffic flow (i.e., the ETH link flow) is described as being transported over a generic single server trail (S), which could be a CO-CS or a CO-PS network.
Figure 11-25. Example of point-to-point Ethernet flow reference model (1) (from ITU-T Rec. Y.1730)
If we were now to allow for the case in which the Ethernet client traffic to be transported is sourced from a bridge device, as opposed to a single termination flow point, we would then have Figure 11-26 (where the connection into the adaptation function is now a flow point (FP) as opposed to a TFP). Note that one important implication of this architecture is that the UNI-to-UNI relationship is no longer the same. In the nonbridging TFP case, the UNI-UNI flow is coterminous with the network flow. In contrast, for the bridging case, the UNI-UNI flow is not the same as the network flow. The importance of the above functional representations lies in the fact that the details of the functional processes and relationships are made visible to the implementer. Consequently, it is very easy to visualize and understand the processing flow of the client signal as it traverses the various functional process blocks (which may or may not reside in specific network or client located equipment). Visible too are the areas of administrative demarcation (for data and OAM transport) from both the customer and the network operator perspectives. With this framework, it is now quite easy to understand the administrative span of the OAM/ME pertaining to the access
link versus edge-to-edge versus UNI_C-to-UNI_C versus UNI_N-to-UNI_N. We will now turn to a more detailed discussion.
Figure 11-26. Example of point-to-point Ethernet flow reference model (2) (from ITU-T Draft Rec. Y.1730)
The reference network architectures discussed above represent only a few of the possible architectural models for the point-to-point case. For instance, these reference architectures could be modified to include multiple network operator domains or even multiple server layer technologies involved in the transport process. Note also that such architectural relationships could be multipoint-to-multipoint (i.e., LAN-based relationships). Although these represent a more complex set of architectural relationships, the fundamental OAM interactions still pertain. Since our mission is to develop a fundamental understanding of the OAM function in a network environment, the basic architectures depicted above for the point-to-point case are adequate for our purposes. At this point, let us introduce the notion of a maintenance entity (ME). An ME represents the OAM flows across the FP and TFP reference points and is useful for the demarcation of the maintenance relationships between
the various layers and sub-layers in the client and network operator administrative domains. If we look at either Figure 11-25 or Figure 11-26, it is very easy to see where the OAM ME relationships would be situated. Thus, for example, there would be the ETH layer network relationship of the customer UNI to customer UNI. There would also be a similar relationship at the ETH layer between the UNIs for NEs B and C. In addition, there are the segment OAM ME relationships, which would include the following areas:
1. Between flow points on the boundary of a provider's network
2. Between flow points on the boundaries of two adjacent provider networks
3. Between any flow points as required
4. ETY link connection OAM (as defined in Clause 57 of IEEE 802.3ah)
The direct implication of points 1 and 2 in the preceding list is that, depending on the particular OAM flow, a typical network operator would seek to restrict its operational range to within its administrative boundary. Thus, for example, it is to be expected that segment OAM flows between flow points on the perimeter of a provider network may be restricted from interacting with the networks belonging to a given customer or other network provider. In a similar vein, segment OAM flows between flow points on the boundaries of two adjacent transport provider networks may not be allowed to interact with the networks belonging to a given customer network or another transport network provider. At this juncture, it is useful to summarize the relationship between the maintenance entities described above (and defined in Section 9 of ITU Recommendations Y.1730 and G.8010 [13]) and the OAM flows. In this regard, Table 11-13 provides a summary mapping of these relationships. Several key relationships immediately emerge from this tabular presentation.
First, we observe that the ME-level UNI-UNI (customer) relationship is really a UNI_C-to-UNI_C ME interaction, as viewed more practically from the Recommendation G.8010 perspective. Likewise, the ME-level UNI-UNI (transport provider) relationship is more importantly a UNI_N-to-UNI_N ME relationship, as viewed from the Recommendation G.8010 vantage point. Other important ME interactions that come into sharper relief include the intradomain versus interdomain OAM flows, as well as the link OAM flows restricted either to the customer-to-transport provider access link or to the transport provider-to-transport provider NNI.
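The boundary restrictions on segment OAM flows described above can be sketched in a few lines of Python. This is purely illustrative (the Recommendations define the policy, not any particular API, and the function and domain names are hypothetical):

```python
# Hypothetical sketch: a flow point forwards a segment OAM frame only if
# the frame's administrative domain matches its own, so intra-provider
# OAM never leaks into a customer or peer-provider network.

def may_forward_segment_oam(frame_domain: str, flow_point_domain: str) -> bool:
    return frame_domain == flow_point_domain

# Intra-provider segment OAM stays inside provider "A"; it is filtered
# before crossing into a customer or adjacent-provider boundary.
```

Data frames would pass unconditionally; only OAM frames are subject to this domain check.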
Table 11-13. Mapping maintenance entities to OAM flows (adapted from ITU-T draft Rec. Y.ethoam)
Y.1730 ME | OAM flows | G.8010 ME
UNI-UNI (customer) | UNI-UNI Flow | UNI_C to UNI_C ME
UNI-UNI (provider) | Transit Flow | UNI_N to UNI_N ME
Segment (PE-PE) intra-provider | Transit Flow | Intra Domain ME
Segment (PE-PE) inter-provider | Transit Flow, Transit Link Flow | Inter Domain ME
Segment (any to any) | Transit Flow, Transit Link Flow | (not specified)
ETY Link OAM - UNI | UNI Link Flow | Access Link ME
ETY Link OAM - NNI | Transit Link Flow | Inter Domain ME
With OAM ME relationships and reference point locations now understood, we will consider the specific functions that are necessary for the practical operation of an OAM ME capability. In this regard, the following functions have been identified as being vital for such a capability:
• Continuous connectivity check (CC)
• Alarm suppression function
• Intrusive loopback and nonintrusive loopback
• Path trace
• Discovery
• Performance monitoring
• Survivability function (e.g., protection switching, restoration, etc.)
The above are broadly representative of the OAM functions that have been identified as important by standards-making organizations and forums like the ITU-T, IEEE, and the Metro Ethernet Forum (MEF). Although these groups are at various stages in their independent definition of the mechanism to convey the OAM information across the different transport networks, they are uniform in their decision to adopt a frame-based approach. From its perspective, the ITU-T has proposed a single generic format (Figure 11-27) for all Ethernet OAM messages. It is differentiated from the standard Ethernet data frame by its Ethertype (which has not yet been allocated by the IEEE and which will likely have different values for
customers versus transport provider networks) and by its direct support for a field differentiating the OAM type, which identifies such events as loopbacks, performance monitoring, and defect and anomaly signaling. Also included in the frame format is its optional support for VLAN tags, which would indicate the service instance corresponding to the OAM message (e.g., only bridge nodes participating in that service instance will process/forward the OAM message).
[Figure: generic OAM message fields, in order: OAM MAC DA, OAM MAC SA, Ethertype (VLAN), VLAN Tag, Ethertype (OAM), VER/Type, OAM Type, Length, Service-ID, OAM Data]
Figure 11-27. Generic OAM message format
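As a rough illustration of the layout in Figure 11-27, the following sketch assembles such a frame. The OAM Ethertype value and the individual field widths here are assumptions for illustration only; the Ethertype had not yet been allocated by the IEEE, and the Recommendation, not this sketch, fixes the field sizes:

```python
import struct

VLAN_TPID = 0x8100       # IEEE 802.1Q tag protocol identifier
OAM_ETHERTYPE = 0xFFFF   # placeholder: no value had been allocated

def build_oam_frame(da: bytes, sa: bytes, vlan_id: int,
                    oam_type: int, service_id: int, oam_data: bytes) -> bytes:
    """Assemble a generic OAM message: DA, SA, optional VLAN tag,
    OAM Ethertype, VER, OAM Type, Length, Service-ID, OAM Data."""
    frame = da + sa
    frame += struct.pack("!HH", VLAN_TPID, vlan_id & 0x0FFF)  # optional VLAN tag
    frame += struct.pack("!H", OAM_ETHERTYPE)
    # Assumed widths: VER (1 byte), OAM Type (1), Length (2), Service-ID (4)
    frame += struct.pack("!BBHI", 1, oam_type, len(oam_data), service_id)
    return frame + oam_data

frame = build_oam_frame(b"\x01\x80\xc2\x00\x00\x30", b"\x00\x11\x22\x33\x44\x55",
                        vlan_id=100, oam_type=0x01, service_id=7,
                        oam_data=b"\x00" * 4)
```

Because the VLAN tag carries the service instance, only bridge nodes participating in that instance would process or forward the message, as noted above.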
11.7.
PROTECTION AND RESTORATION
For WAN transport, the service protection will typically be provided by the underlying transport network. Service restoration can also be provided, however, at the Ethernet layer using Ethernet protocols. Both approaches are summarized briefly here.
11.7.1 Service Protection or Restoration Provided by the Transport Network
Transport networks have traditionally been engineered for ultra-high service availability in order to ensure reliable communications of emergency voice traffic. The objective is for a fault to be detected and the service restored through a protection switching action within 60 ms (10 ms for the detection and 50 ms to complete the switch). A protection switch is the routing of the traffic from the failed facility or equipment to backup facilities or equipment that have been reserved in advance. In most cases, the protection channels are permanently reserved, with their bandwidth either being unused during fault-free conditions or carrying low-priority traffic that can be dropped if the channel is required for protection. In the case of TDM transport systems such as SONET/SDH, the protection switch is between TDM channels or fibers [14]. In the case of ATM, the protection switch is between virtual circuits [15]. The Link Capacity Adjustment Scheme (LCAS) discussed in Chapter 4 provides an alternative approach for restoring data services. For constant bit rate services such as voice, it is necessary to have protection bandwidth that is at least equal to the service bandwidth. In the case of packetized data, however, it is possible in many cases to maintain the service connection at a lower data rate in the event of a network problem. LCAS provides this capability through its control of virtually concatenated SONET/SDH, asynchronous/PDH, or OTH channels. As discussed in Chapter 4, virtual concatenation is the construction of a larger channel through the combining of multiple smaller member channels. The data is then interleaved onto the constituent member channels on a byte-by-byte round-robin basis.
The individual members can take different routes through the transport network with the NE that terminates the virtually concatenated channel providing the buffering to compensate for the differences in delay that result from the different routes. In the event that a subset of the members fail, LCAS provides the signaling mechanism between the virtually concatenated channel's source and sink to allow the source to place the traffic onto only those members that have not failed. This allows for a very powerful new paradigm for network operators, since, for services that can tolerate the reduced data rate during faulted conditions, the operator can now provide very fast service restoration without having to reserve dedicated protection channels.
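The round-robin interleaving and the removal of failed members can be sketched as follows. This is a conceptual model only (the actual member-control signaling is defined in ITU-T G.7042, and the member names are arbitrary):

```python
def distribute_over_members(payload: bytes, members: list,
                            failed: set) -> dict:
    """Interleave payload bytes round-robin over the members of a
    virtually concatenated channel, skipping members that LCAS has
    removed after a failure."""
    active = [m for m in members if m not in failed]
    lanes = {m: bytearray() for m in active}
    for i, byte in enumerate(payload):
        lanes[active[i % len(active)]].append(byte)
    return {m: bytes(b) for m, b in lanes.items()}

# With member "vc-2" failed, traffic continues at a reduced rate on the
# remaining members rather than being lost entirely.
lanes = distribute_over_members(b"abcdef", ["vc-1", "vc-2", "vc-3"], {"vc-2"})
```

This is the behavior that lets an operator offer fast restoration at reduced rate without dedicating protection channels.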
11.7.2 Service Restoration at Layer 2
When constructing a complex Ethernet network, it is important that only a single switching path exist between any pair of nodes. Otherwise, loops would exist that cause instability in forwarding tables and would cause broadcast data to be unnecessarily multiplied (i.e., broadcast storms). The Ethernet solution to this issue is the spanning tree protocol (STP). The STP sends Layer 2 Control Protocol messages between the Ethernet nodes in order to establish an overlaid logical tree structure on the network. Messages are periodically sent between adjacent nodes to confirm that no changes have occurred to the network connectivity. When a fault occurs, the node detecting the failure (i.e., the change in network connectivity) will initiate a new spanning tree construction. The network is then unavailable for the 30-60 seconds that it can take for the STP to resolve. A newer rapid STP (RSTP) protocol has been developed, but it still takes seconds to resolve. While this is a very powerful and flexible service restoration tool, there are several issues when it is applied to Ethernet WAN services. Of course, the larger the network in terms of number of nodes and geographical reach, the slower the STP resolution. Another potential issue occurs if the network operator uses Ethernet routing technology within its network. Here, conflicts between customer and network operator STPs must be avoided. Another issue is the interaction of the 50 ms protection switching (e.g., SONET/SDH) with the STP. If the Ethernet node and the transport network node both detect the fault, initiating the STP for a fault that will be quickly protected by the SONET protection switch would cause unnecessary extended network unavailability. It is desirable, then, for the Ethernet node to wait at least 50 ms to see whether the problem clears before it initiates the STP.
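The suggested hold-off behavior can be sketched as a simple timer check. This is illustrative only; hold-off timers in real equipment are configurable, and the function name is hypothetical:

```python
from typing import Optional

PROTECTION_WINDOW_MS = 50.0  # time the transport layer gets to protect first

def should_trigger_stp(fault_cleared_after_ms: Optional[float]) -> bool:
    """Start spanning tree reconvergence only if the fault outlives the
    transport network's protection-switch window; None means the fault
    never cleared."""
    return (fault_cleared_after_ms is None
            or fault_cleared_after_ms > PROTECTION_WINDOW_MS)

# A fault cleared in 45 ms by a SONET protection switch: no STP churn.
```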
11.8.
CONCLUSION
Due to a convergence of business requirements for higher data connectivity rates and more flexible services, along with the availability of new data transport technology, wide-area Ethernet data transport is becoming increasingly important. This chapter has provided an introduction to some of the new technology and standards that have been developed (and are being developed) to enable carriers to provide Ethernet transport services. A key point is that this is an evolutionary approach that builds on the customers' capital investment in Ethernet technology and the transport providers' SONET/SDH backbone networks and OAM procedures. All the pieces are in place for offering EPL services. The tools and standards to
provide the more complicated ELAN, EVPL, and EVPLAN services are currently being developed in the various standards bodies. While some proprietary solutions exist, carriers require standards-based solutions for any significant deployment of new services. The use of standards helps guarantee that the solution will be available from multiple vendors, will be stable and supported for many years, and ideally will also be less expensive due to economies of scale. While each carrier will probably want to offer variations on the basic service types discussed here in order to differentiate themselves from their competitors, these services provide the framework from which they will build.
11.9.
NOTES
1. A topological component describes a portion of a layer network in terms of the topological relationships between the (flow) points. See Chapter 2 for a discussion of layer networks.
2. It is also possible to provide EPL and EPLAN using constant-rate ATM or MPLS virtual circuits instead of TDM circuits.
3. Much appreciation is extended to Dr. Gilad Goren of Native Networks (Israel) for his valuable contribution to the development of the ideas and material discussed in this section of the chapter.
4. See IETF RFC 3032.
5. See IETF RFC 3270.
6. Cisco, Agere Systems, PMC-Sierra, and others submitted a technical contribution to the May 2003 ITU-T SG15/WP3 meeting proposing that CoS and DP information be encoded in a GFP (Recommendation G.7041) extended header. See WD34, "A GFP based approach for the proposed Generalized Multiplexing Procedure." This proposal is currently documented in the G.7041 Living List for future consideration.
7. Alcatel SEL presented a technical contribution to an ITU-T SG15/WP3 interim meeting (January 2004) discussing the modifications outlined above. See "Future CO-PS layer network."
8. For more details on the UNI, see the discussion in Section 11.4.
9. Note that EoT (i.e., SONET/SDH, PDH, OTN, and MPLS) is not the only technology approach to interconnecting NNIs. Ethernet physical interfaces themselves (ETY) could be used as the NNI. See Clause 6, ITU-T Recommendation G.8012.
10. Administrative domains within the same provider network typically demarcate a technology, a geographical region, or some other distinguishing boundary. For different providers, the separation boundary is usually based upon infrastructure ownership.
11. Standardization efforts are most advanced for the case of the NNIp and continue to progress for the NNIM and NNIc. To simplify the discussion, we will focus only on the NNID.
12. See ITU-T Rec. G.8012, Clause 6.1 ff., for further details on LAPS and the other mapping methodologies. Note that LAPS has several known deficiencies versus GFP. These are discussed in detail in the Lucent Technologies white paper authored by Hernandez-Valencia, E., "Generic Framing Procedure: A Next Generation Adaptation Protocol for Data Transport over SONET/SDH and Optical Transport Networks" (2001). As an additional note, the ITU-T has designated that GFP be used at international network boundaries or at the boundaries of different network operators unless an alternative is mutually agreed to.
13. See ITU-T Rec. G.7041 and ITU-T Rec. G.806 for a more detailed description of the GFP client signal mapping process.
14. Note that sink and source Ethernet processes depend on the direction of the link connection and information transfer.
15. See the discussion by Varma and Daloia in Chapter 2 of this book.
16. See Section 11.3 for a more detailed discussion of the CO-CS and CO-PS networks.
17. The specific OAM frame formats adopted by the IEEE 802.3ah and the MEF are discussed respectively in IEEE Draft P802.3ah (Clause 57) and in the MEF draft, Service OAM Requirements and Framework.
18. In mesh networks that are constructed with digital crossconnect systems (DCSs), it is possible to have the DCSs restore the traffic by communicating with each other to determine the protection path through the network in response to a fault. While this approach typically allows the network operator to reserve much less overall network bandwidth for protection, it can be difficult to meet the 50 ms switch time.
11.10. REFERENCES
[1] ITU-T Recommendation G.709, Interfaces for the optical transport network (OTN), 2001.
[2] ITU-T Recommendation G.707, Network node interface for the Synchronous Digital Hierarchy (SDH), 2003.
[3] ITU-T Recommendation G.7043/Y.1343, Virtual concatenation of Plesiochronous Digital Hierarchy (PDH) signals, 2004.
[4] ITU-T Recommendation G.7042, Link Capacity Adjustment Scheme (LCAS) for Virtual Concatenated signals, 2004.
[5] ITU-T Recommendation G.7041/Y.1303, The Generic Framing Procedure (GFP), 2004.
[6] ITU-T Recommendation G.8040, GFP frame mapping into Plesiochronous Digital Hierarchy (PDH), 2004.
[7] ITU-T Recommendation G.8021/Y.1341, Characteristics of Ethernet transport network equipment functional blocks, 2004.
[8] ITU-T Recommendation G.809, Functional architecture of connectionless layer networks, 2003.
[9] ITU-T Recommendation G.8011/Y.1307, Ethernet Services Framework, 2004.
424
Chapter 11
[10] ITU-T Recommendation G.8011.1/Y.1307.1, Ethernet Private Line Service, 2004.
[11] ITU-T Recommendation G.8012/Y.1308, Ethernet UNI and Ethernet NNI, 2004.
[12] ITU-T Recommendation G.805, Generic functional architecture of transport networks, 2001.
[13] ITU-T Recommendation G.8010/Y.1306, Ethernet Layer Network Architecture, 2004.
[14] ITU-T Recommendation G.808.1, Generic Protection Switching - Linear Trail and Subnetwork Protection, 2003.
[15] ITU-T Recommendation I.630, ATM Protection Switching, 2000.
Chapter 12 ETHERNET SERVICES OVER MPLS NETWORKS
Iftekhar Hussain Cisco Systems, Inc.
12.1.
VIRTUAL PRIVATE NETWORKS
For corporations and enterprises with geographically distributed sites, network connectivity between different sites is essential to meet increasing demands for voice, video, and data communication. Initially, corporate networks were interconnected using dedicated transport facilities such as DS1/E1 and DS3/E3. Service Providers (SPs) leased transport facilities as a service to their customers. A network in which sites are interconnected using dedicated transport facilities is called a private network. Using this type of network connectivity, the cost of offering private network services was very high for the SPs and their customers. Additionally, the provisioning of new services was a slow and laborious task.
12.1.1 Traditional Layer 2 Virtual Private Networks
A network in which sites are interconnected using circuits over a shared network infrastructure is called a Virtual Private Network (VPN). When all sites in a VPN belong to the same organization, the VPN can be viewed as providing intranet connectivity. On the other hand, when the various sites in a VPN belong to different organizations, the VPN can be thought of as providing extranet connectivity. The fact that multiple VPN customers share the network infrastructure is what sets a VPN apart from a private network. The shared network infrastructure is known as the VPN backbone.
The sharing of the VPN backbone allows SPs to offer VPN services to their customers at lower costs. A VPN that interconnects a set of customer sites over a shared network infrastructure and allows them to communicate based on Layer 2 frames is known as a Layer 2 VPN (L2VPN). In contrast, a VPN that interconnects a set of customer sites over a shared network infrastructure and allows them to communicate based on Layer 3 addresses (e.g., IP addresses) is known as a Layer 3 VPN (L3VPN). Thus the distinguishing characteristic of an L2VPN, in comparison to an L3VPN, is that in L2VPNs packet forwarding is carried out at L2, using technologies such as ATM, FR, and Ethernet. Figure 12-1 shows an example of an L2VPN using ATM Virtual Connections (VCs).
Customer Edge (CE) Device
Provider Edge (PE) Device
Shared VPN Backbone
Figure 12-1. Traditional Layer 2 Virtual Private Networks
12.1.2 Classification of VPNs
Generally, VPN services may be provisioned and managed by SPs or customers. A VPN for which the SP participates in management and provisioning of the VPN service is termed a Provider Provisioned VPN (PPVPN). There are many ways in which an SP can participate in the management and provisioning of a VPN service. Correspondingly, there is a wide spectrum of VPN types. The following attributes are useful for classifying VPNs:
• Service Layer (e.g., L2 versus L3): the layer at which VPN service is offered by the SP
• VPN Edge Device (e.g., CE-based versus PE-based): the device where VPN-specific functions are performed
• Service Connectivity (e.g., point-to-point versus point-to-multipoint): the type of connectivity the VPN service offers
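These three classification attributes can be captured in a small data model. The class and instance names below are illustrative, not from any standard:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VPNService:
    """Hypothetical record of the three classification attributes."""
    service_layer: str   # "L2" or "L3"
    edge_device: str     # "CE-based" or "PE-based"
    connectivity: str    # "point-to-point" or "point-to-multipoint"

# Example: a PE-based Layer 2 point-to-point offering, the kind of
# service a leased-line customer might migrate to.
leased_line_replacement = VPNService("L2", "PE-based", "point-to-point")
```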
One such taxonomy of PPVPN technologies is depicted in Figure 12-2.
[Figure: taxonomy tree rooted at PPVPN, branching into Point-to-Point and Point-to-Multipoint, with nodes including CE-based, RFC 2547 style, Virtual Router style, Multicast, and Unicast]
Figure 12-2. Taxonomy of PPVPN technologies [1]
12.1.3 Multiservice Converged Packet Switched Backbone
Although L2VPNs based on ATM VCs and FR Data Link Connection Identifiers (DLCIs) were easier to provision and had a lower cost than dedicated leased lines, they still had some drawbacks. For example, this type of L2VPN approach restricted the SP's backbone to a single transport technology such as ATM/FR links, which made it burdensome to share the same physical transport facilities for Internet and VPN traffic. Even when Internet and VPN infrastructures could share the backbone transport facilities, they needed separate administration and maintenance. Although provisioning of ATM VCs and FR DLCIs was easier and relatively less cumbersome than dedicated lines, it was still tedious. For example, adding a new customer site to an existing VPN required provisioning an ATM VC to every other site in the VPN. Traditional L2VPNs work well from the customer's point of view; however, the costs of maintaining separate network infrastructures and the administrative burden of provisioning these VPNs have led SPs to migrate their legacy L2 and emerging L3 services onto a common multiservice IP/MPLS packet switched network (PSN). The following discussion assumes some basic familiarity with MPLS (refer to Appendix A for a quick overview of MPLS technology).
12.2.
L2VPNS OVER MPLS BACKBONE
There are two main types of L2VPN services that an SP can offer to its customers, namely, Virtual Private Wire Service (VPWS) and Virtual Private LAN Service (VPLS) [2]. A VPWS is a point-to-point L2VPN service. In contrast, a VPLS is a point-to-multipoint L2VPN. A VPLS emulates Local Area Network (LAN) service over the Wide Area Network (WAN), which allows interconnecting LAN segments on geographically dispersed customer sites as if they were connected to the same LAN. In both VPWS and VPLS, L2 frames are transported across the IP/MPLS backbone. In both types of L2VPN, a CE (e.g., CE1) transmits a frame to a PE (e.g., PE1); PE1 encapsulates the frame in one or more additional headers and transports the frame to another PE (e.g., PE2). PE2 in turn removes the encapsulation header and sends the frame to another CE (e.g., CE2). With the exception of some service-specific aspects (such as point-to-point versus point-to-multipoint connectivity), both VPWS and VPLS employ a number of common functional components such as header encapsulations. Therefore, to avoid repetition of common functions, it is more efficient to decompose L2VPN functional components into service-independent (common) and service-specific components.
12.2.1 Generic Components of the L2VPN Architecture
This section describes generic components that are common to all L2VPNs. The service-specific components and architectures are described later in the pertinent sections.
12.2.1.1 Attachment Circuit (AC)
In all types of L2VPNs, a CE device (a router or a switch) attaches to a PE device via a physical connection (e.g., an Ethernet port) or a logical connection (e.g., a VLAN port) termed an Attachment Circuit (AC). An AC may be an Ethernet port, a VLAN port, a FR DLCI, an ATM VPI/VCI, and so forth. An AC carries L2 frames between a pair of CE and PE devices.
12.2.1.2 Pseudowire (PW)
In all types of L2VPNs, an L2 frame between two PE devices is carried over another logical connection termed a Pseudowire (PW). Thus any given L2 frame first travels on an AC from a CE to a PE, then on a PW from a PE to another PE, and finally on another AC from a PE to a CE.
A PW is a mechanism that emulates the essential attributes of a telecommunications service such as FR, ATM, Ethernet, Time Division Multiplexing (TDM), and SONET/SDH over an IP and/or MPLS PSN [3]. A protocol data unit (PDU) that contains all the data and control information necessary to emulate the desired service is known as a PW-PDU. Figure 12-3 shows the logical protocol layering of a generic PW for different types of services. It is worth mentioning that each type of service is emulated using a separate PW. For a given emulated service, to a CE device, the PW appears as an unshared dedicated circuit. The PW protocol layers are discussed in the following sections.
[Figure: PW protocol stack. Service payloads (e.g., Ethernet, FR, ATM cells/AAL5, DS1/DS3/E1/E3, SONET/SDH payloads; packets, cells, bitstreams, and structured bitstreams) are carried over the Payload Convergence, Timing, Sequencing, PW Demultiplexer, and PSN Convergence sublayers, which ride over the PSN, Data Link, and Physical layers of the Packet Switched Network (PSN); payload timing may be recovered using the Real Time Protocol (RTP)]
Figure 12-3. Protocol Stack Model for PW Emulation Edge to Edge (PWE3)
12.2.1.2.1 Encapsulation Layer
The PW encapsulation layer contains three sublayers, namely, Payload Convergence, Sequencing, and Timing.
12.2.1.2.1.1 Payload Convergence
The primary function of the Payload Convergence Layer is to encapsulate the incoming (CE to PE direction) payload in PW-PDUs. In the outgoing (PE to CE) direction, the Convergence Layer replays the native data units on the physical interface attached to the destination CE.
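A highly simplified sketch of this encapsulation/replay round trip follows. The real PW-PDU formats are defined in the PWE3 specifications; the 2-byte sequence field here (consumed by the Sequencing functions described below) and the class name are assumptions for illustration:

```python
import struct
from itertools import count

class PayloadConvergence:
    """Illustrative sketch: wrap native frames into PW-PDUs on ingress
    (CE to PE) and unwrap them on egress (PE to CE)."""

    def __init__(self) -> None:
        self._seq = count(0)  # per-PW sequence numbering

    def encapsulate(self, native_frame: bytes) -> bytes:
        seq = next(self._seq) % 0x10000
        return struct.pack("!H", seq) + native_frame

    @staticmethod
    def decapsulate(pw_pdu: bytes):
        (seq,) = struct.unpack("!H", pw_pdu[:2])
        return seq, pw_pdu[2:]
```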
12.2.1.2.1.2 Timing
Delivery of native services such as structured and unstructured bitstreams requires the availability of an accurate timing recovery mechanism, depending upon the characteristics of the transport network. For example, the clocks used to synchronize SONET equipment are stratum 3 (or better accuracy) clocks that are normally traceable to a primary reference source (a clock source that provides a timing signal with long-term accuracy of 1 x 10^-11 or better, with verification to Universal Time Coordinated (UTC)). Therefore, the emulated service must also duplicate the timing characteristics as closely as possible to those expected of a native service. The Timing Sublayer provides two synchronization-related functions, namely, clock recovery (the extraction of output transmission bit timing information from the delivered packet stream) and timed payload delivery (the playing out of noncontiguous PW-PDUs to the PW output interface with a constant phase relative to the input interface). Generally, a timing signal can be distributed through external mechanisms such as Building Integrated Timing Supply (BITS), Stand Alone Synchronization Equipment (SASE), and the Global Positioning System (GPS), or extracted from the bitstream using an adaptive clock recovery mechanism. To facilitate extraction of timing information from the packet stream at the receiver, the timing information from the sender can be carried using the Timestamp field of the Real Time Protocol (RTP) [4].
12.2.1.2.1.3 Sequencing Functions
In an IP/MPLS network, the packets carrying PW-PDUs may arrive out of order, may arrive duplicated, or may never arrive at the destination PE. The Sequencing Layer provides services that enable in-order and unduplicated frame delivery to the CE and additionally enable detection of frame loss.
12.2.1.2.2 Demultiplexer Layer
A PSN tunnel is a logical link that provides a data path across the backbone. The main function of the PW Demultiplexer Layer is to provide a demultiplexer field to allow multiple PWs to be carried in a PSN tunnel. The demultiplexer field allows the receiving PE device to distinguish one PW from others. In general, depending on the tunnel protocols, the demultiplexer field may have a different format. For example, when PWs are being carried in an MPLS tunnel, the PW demultiplexer field contains an MPLS label. To summarize, the PW Demultiplexer Layer provides the
ability to carry multiple PWs within a PSN tunnel transparently across the backbone. Other than the PE devices, which must perform functions related to PW-PDUs (such as encapsulation, decapsulation, and sequencing), PWs are invisible to the other backbone devices (e.g., P routers).
12.2.1.2.2.1 Fragmentation and Reassembly
After accounting for the PW and packet switched network headers (such as IP or MPLS headers), if the combined size of the payload and the associated network headers exceeds the path Maximum Transmission Unit (MTU) of the network, fragmentation and reassembly at the PE devices are required in order for the packet to be delivered across the PSN. In theory, CEs should be able to adapt the packet size according to the path MTU to avoid fragmentation in the network. In practice, there may be situations when a CE lacks this capability. If the CE cannot adhere to an acceptable MTU size, the PE should be able to perform PW fragmentation and reassembly.
12.2.1.2.3 Service-Specific PW Preprocessing
In general, at the PE some form of service-specific preprocessing (such as Ethernet bridging, ATM VPI/VCI header translation, or SONET Virtual Tributary (VT) cross-connection) on the native data units received from the CE is needed before PW-PDUs can be transmitted on the PW. The PW preprocessing can be divided into two components, namely, Native Service Processor and Forwarder.
12.2.1.2.3.1 Native Service Processing (NSP)
The purpose of the NSP is to perform service-specific operations based on the semantics of the payload. In the case of Ethernet service, for example, the NSP function includes frame processing and may include additional functions such as stripping, overwriting, or adding VLAN tags, physical port multiplexing and demultiplexing, PW-PW bridging, L2 encapsulation, and so forth.
12.2.1.2.3.2 Forwarders
A forwarder is a logical module in the PE that selects the PW to use to transmit a payload received on an AC. The selection of a particular PW may be based on the incoming AC, the contents of the payload (e.g., packet
header), or some statically/dynamically configured forwarding information. Based on the type of service (e.g., point-to-point or point-to-multipoint), the forwarder may selectively forward the payload from one AC to exactly one PW, or from one or more ACs to multiple PWs (Figure 12-4 and Figure 12-5 depict a point-to-point and a point-to-multipoint forwarder, respectively).
Figure 12-4. Forwarder for a point-to-point service.
Figure 12-5. Forwarder for a point-to-multipoint service.
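The forwarder's selection logic can be sketched as a lookup table keyed by the incoming AC. This is a conceptual model only (real forwarders may also consult payload contents and configured forwarding state, and the identifiers are arbitrary):

```python
class Forwarder:
    """Map each attachment circuit to the pseudowire(s) its traffic
    should be sent on: one PW for point-to-point service, several for
    point-to-multipoint."""

    def __init__(self) -> None:
        self._pws_by_ac = {}

    def bind(self, ac: str, pws: list) -> None:
        self._pws_by_ac[ac] = list(pws)

    def forward(self, ac: str, payload: bytes) -> list:
        return [(pw, payload) for pw in self._pws_by_ac.get(ac, [])]

fwd = Forwarder()
fwd.bind("ac-1", ["pw-7"])                    # point-to-point (Figure 12-4)
fwd.bind("ac-2", ["pw-8", "pw-9", "pw-10"])   # point-to-multipoint (Figure 12-5)
```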
12.2.1.3 VPN Tunnels
In all types of L2VPNs, data traffic between PE devices is transported over VPN tunnels. A VPN tunnel is a logical link between two PE (or CE) entities that is used to carry VPN traffic across the backbone. In general, a tunnel is implemented by encapsulating packets within another header and transmitting them between those two entities. For example, in PE-based VPNs, a PE-PE tunnel provides connectivity between two PE devices.
12.2.1.3.1 Motivations for Tunnels
One of the main motivations for the use of tunneling in VPN applications is to be able to transport customer packets with nonunique addressing information between the VPN edge devices. For example, customer networks often use private or nonunique IP addresses. However, in many VPN applications (as, for example, in L3VPNs [5]), a single VPN edge device such as a PE router can provide VPN service to multiple customers even if those customer networks have overlapping addresses. The fact that the customer addresses are not globally unique means that IP packets from a customer cannot be transmitted to the correct destinations over the shared VPN backbone in their native form. In other words, some form of additional header encapsulation (tunneling) must be utilized to forward packets to their correct destinations. Thus, a tunneling protocol attaches an extra encapsulating header (which, in the case of MPLS, corresponds to one or more labels) to a VPN packet, and this additional header information is then used for forwarding the packet between the VPN edge devices. There are other important reasons for using tunnels in VPN applications, such as the need to isolate traffic from different customers and to provide different quality of service (QoS) and security characteristics. For example, the QoS and security requirements of different VPN customers may differ and can be satisfied by using different tunnels with the appropriate characteristics.
12.2.1.3.2 Hierarchical Tunnels
If VPN tunnels were formed across the backbone for each VPN instance, devices in the backbone such as P routers would need to be VPN-aware and to maintain state for each VPN tunnel, which from a network scalability point of view is highly undesirable. A better solution is to establish one tunnel between each pair of VPN edge devices and then multiplex multiple VPN-specific tunnels through the single outer tunnel. With this approach, the amount of state depends only on the number of VPN
434
Chapter 12
edge devices, not on the number of VPNs. A tunnel that encapsulates one tunnel within another is known as a hierarchical tunnel.

12.2.1.3.3 Tunneling Protocols
Several protocols can be used to establish and maintain VPN tunnels, including Generic Routing Encapsulation (GRE) [6,7], IP-in-IP [8,9], IPsec [10-13], and MPLS [14,15]. Each tunneling protocol can be characterized in terms of a common set of properties: the format of the encapsulation header and the overhead it introduces, how the VPN-related information is inferred from the packet's encapsulation header, whether an explicit signaling protocol is required to set up the tunnel, whether the protocol allows hierarchical tunneling, and whether it supports mechanisms to detect tunnel failures and respond appropriately to restore service. This chapter focuses mainly on the use of MPLS-based tunnels.

12.2.1.3.4 MPLS Tunnels
This section briefly describes the distinguishing characteristics of MPLS tunneling for VPN applications. A detailed survey of other tunneling techniques can be found in [16]. In MPLS networks, routers known as label switching routers (LSRs) forward packets based on labels. A label is a short, fixed-length, physically contiguous identifier that is used to forward a packet instead of an IP address and usually has local significance. The sequence of LSRs that a labeled packet traverses, starting at the ingress LSR and terminating at the egress LSR, is called a label switched path (LSP) (see Appendix A for an overview of MPLS). In PPVPN terminology, an LSP corresponds to a VPN tunnel. In general, MPLS tunnels have the following characteristics:
• Tunnel encapsulation is based on a label stack, and the label value is used as the multiplexing field.
• Tunnels can be multiplexed within other tunnels. For example, in the MPLS backbone, P routers only need to maintain state for the topmost label in the label stack.
This means that the VPN-specific state of the nested tunnels is not visible to the P routers.
• Tunnels are set up and maintained using signaling protocols such as the Label Distribution Protocol (LDP) or the Resource Reservation Protocol (RSVP).
Ethernet Services Over MPLS Networks
435
12.2.1.3.5 Carrying PWs over MPLS Tunnels
The protocol layering used to carry PW-emulated services (e.g., Ethernet) over MPLS networks is depicted in Figure 12-6. Each packet header carries two labels: a bottom label and a top label. The bottom label identifies a particular PW within the tunnel and provides the demultiplexing function that distinguishes one PW from another. The top label represents a route (an LSP) across the backbone to the egress PE. The bottom label is invisible to the devices in the backbone until the packet arrives at the PE; in other words, the bottom label is meaningful only to the PE devices. In the remainder of this chapter, the term tunnel label refers to the top label and the term PW label refers to the bottom label.
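The two-label scheme can be sketched as follows. The label values, table, and packet representation are hypothetical; the point is the order of operations: the ingress PE pushes the PW label (bottom) and then the tunnel label (top), the backbone switches only on the top label, and the egress PE uses the bottom label to demultiplex the PW.

```python
def ingress_pe_encapsulate(frame, pw_label, tunnel_label):
    # The label stack is LIFO: index 0 is the top (outermost) label.
    return {"labels": [tunnel_label, pw_label], "payload": frame}

def egress_pe_demux(packet, pw_table):
    packet["labels"].pop(0)              # remove the tunnel (top) label
    pw_label = packet["labels"].pop(0)   # the PW (bottom) label selects the PW
    return pw_table[pw_label], packet["payload"]

# Hypothetical label values: tunnel label 100, PW label 21.
pkt = ingress_pe_encapsulate("ethernet-frame", pw_label=21, tunnel_label=100)
assert pkt["labels"][0] == 100           # top of stack is the tunnel label

ac, frame = egress_pe_demux(pkt, {21: "AC-to-CE2"})
assert (ac, frame) == ("AC-to-CE2", "ethernet-frame")
```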
[Figure: the ingress PE strips off the preamble and frame check sum (FCS), attaches the control word, PW label, and tunnel label, then attaches the data link (PPP) and physical layer (SONET) headers; the packet is label-switched in the MPLS network.]
Figure 12-6. Protocol layering for carrying Ethernet PWs over MPLS tunnels
12.3. METRO ETHERNET SERVICES
This section briefly describes Ethernet services with the goal of motivating ensuing discussions relating to the delivery of Ethernet services over MPLS using VPWS and VPLS L2VPNs. For complete details on various Ethernet service features and attributes, refer to Chapter 10, "Metro Ethernet Services".
12.3.1 Ethernet Virtual Connection (EVC)
To visualize and characterize Ethernet connections like other L2 circuits, such as FR DLCIs and ATM VCs, the notion of an Ethernet Virtual Connection (EVC) is very useful. An EVC is an association between two or more User Network Interfaces (UNIs) for exchanging Ethernet frames among the associated entities. An EVC can provide point-to-point or multipoint-to-multipoint connectivity. A point-to-point EVC is associated with exactly two UNIs. A multipoint-to-multipoint EVC is associated with two or more UNIs. A point-to-multipoint EVC is a special case of the multipoint-to-multipoint EVC in which one UNI is designated as the root and the remaining UNIs as leaves, such that a frame received at the root UNI is transmitted to all leaf UNIs. Based on the notion of the EVC, two types of Ethernet services can be defined, namely, Ethernet Line (E-Line) Service and Ethernet LAN (E-LAN) Service.
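The delivery semantics of the three EVC connectivity types can be captured in a small sketch. The function and UNI names are invented for illustration; the leaf-to-root rule for the point-to-multipoint case is an assumption (the text above only specifies root-to-leaf delivery).

```python
def evc_delivery(evc_type, unis, ingress, root=None):
    """Return the set of UNIs a frame arriving at `ingress` may be delivered to."""
    others = [u for u in unis if u != ingress]
    if evc_type == "point-to-point":
        assert len(unis) == 2, "a point-to-point EVC has exactly two UNIs"
        return others
    if evc_type == "multipoint-to-multipoint":
        return others                         # any UNI may reach any other UNI
    if evc_type == "point-to-multipoint":
        # Frames from the root go to all leaves; frames from a leaf go only
        # to the root (assumed behavior for this sketch).
        return others if ingress == root else [root]
    raise ValueError(evc_type)

assert evc_delivery("point-to-point", ["UNI-A", "UNI-B"], "UNI-A") == ["UNI-B"]
assert evc_delivery("point-to-multipoint", ["R", "L1", "L2"], "R", root="R") == ["L1", "L2"]
```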
12.3.2 E-Line Service
E-Line Service uses a point-to-point EVC and thus can be used to offer a point-to-point Ethernet virtual leased line between two CE devices (refer to Figure 12-7). From the previous discussion, recall that E-Line Service is analogous to traditional L2VPN services based on ATM or Frame Relay virtual circuits. Later on, we will discuss how E-Line Service can be offered over an MPLS backbone using VPWS.
12.3.3 E-LAN Service
E-LAN Service is based on a multipoint-to-multipoint EVC and thus can be used to connect geographically dispersed customer sites in a metro area and make them appear as if they were on the same LAN (refer to Figure 12-8). In the following section, we will discuss how E-LAN Service can be offered over an MPLS backbone (WAN) using VPLS.
Figure 12-7. An example of Ethernet Line (E-Line) service
Figure 12-8. An example of Ethernet LAN (E-LAN) service
12.4. METRO ETHERNET SERVICES OVER MPLS
This section describes the VPWS and VPLS architectural framework, which can be used to provide E-Line and E-LAN services over an MPLS backbone.
12.4.1 Emulation of E-Line Services using VPWS
As discussed previously, a VPWS is an L2VPN that provides L2 point-to-point services. In VPWS, PE devices behave as MPLS-capable L2 switches and provide a logical interconnection of L2 circuits such that a pair of CE devices appears to be connected by an L2 virtual circuit. When a CE device transmits a frame on such an L2 virtual circuit, the CE device at the other endpoint of the virtual circuit receives the frame. In the VPWS case, the forwarding of frames from one CE device to another is completely determined by the virtual circuit on which the frame is transmitted. In other words, the forwarding of frames is not affected by the contents of the frame header, and the PE acts as a virtual circuit switch.
12.4.1.1 VPWS Reference Model
An Ethernet PW allows Ethernet/802.3 PDUs to be transported across an IP or an MPLS network. A point-to-point Ethernet PW emulates a single Ethernet link (an EVC) between exactly two endpoints. Figure 12-9 shows the VPWS reference model for the emulation of point-to-point Ethernet services over MPLS. The protocol stack for VPWS over SONET-based MPLS transport is also depicted in Figure 12-9. Most of the entities in this reference model, such as the AC, PW, CE, and PE, have already been described in the previous section. In general, VPWS can be used to offer a variety of L2 point-to-point services such as Ethernet, ATM, FR, TDM, and so forth (for example, refer to Figure 12-3). This is accomplished by using service-specific encapsulation methods to carry L2 frames across IP or MPLS networks. For this purpose, different encapsulation methods have been specified [17-21]. This section specifically describes the use of the encapsulation method defined in [17] to transport Ethernet frames over MPLS networks (for the protocol layering for Ethernet over a SONET-based MPLS network, refer to Figure 12-6) and the use of the LDP extension defined in [22] for setting up Ethernet PWs.
[Figure: CE devices attach to the PEs via attachment circuits; the PE-PE pseudowire is carried inside a PE-PE tunnel across the MPLS network, and the MPLS tunnel is set up using LDP or RSVP-TE.]
Figure 12-9. VPWS Reference Model for emulation of E-Line service over MPLS network
12.4.1.2 Ethernet Modes (Raw versus Tag)
There are two modes in which an Ethernet PW can operate, namely, raw mode and tag mode. When operating in tag mode, each frame carries an 802.1Q VLAN tag, and the Native Service Processing (NSP) component at the two PW endpoints processes the tag. By contrast, in raw mode, the NSPs do not process the tag (i.e., the VLAN tag passes transparently through the NSP). Depending on whether the tag is service delimiting or non-service delimiting, the NSP may need to handle it differently in tag mode and raw mode.

12.4.1.3 VLAN Tag Processing
The VLAN tag is said to be service delimiting if the tag is placed in the frame by the service provider equipment and the tag is meaningful to that equipment only. For example, consider a deployment scenario in which the service provider has employed a LAN switch to aggregate traffic from multiple customers. In this case, the LAN switch may apply VLAN tags to distinguish streams of customer traffic from one another and then forward the frames to the PE. The tag is said to be non-service delimiting if the CE placed it in the frame and the tag is meaningless to the PE. Whether a tag is service delimiting or non-service delimiting has an important implication for the processing of frames at the Ethernet PW endpoints. For example, in raw mode, a service-delimiting tag is not transmitted over the PW. In this case, if the service-delimiting tag is present in the frame
received from the CE, the PE strips it off before transmitting the frame on the PW. In tag mode, the service-delimiting tag must be transmitted. In this case, if the service-delimiting tag is not present in the frame received from the CE, the PE must prepend the frame with a dummy VLAN tag before sending the frame on the PW.

12.4.1.4 Establishing Ethernet PWs via LDP
As discussed previously, to transport L2 PDUs from the ingress PE1 to the egress PE2 across an MPLS network over a tunnel, PE1 pushes two labels (a tunnel label as the top label and a PW label as the bottom label). The distribution of tunnel labels is accomplished using LDP [23] or RSVP-TE [24]. This section describes how LDP is used to signal PW labels. A bidirectional PW consists of two LSPs between PE1 and PE2 (refer to Figure 12-9). The first step toward setting up a PW involves the establishment of an LDP session between PE1 and PE2. To establish an LDP session, the PEs must know each other's addresses. A PE may learn a remote PE address through configuration or auto-discovery procedures (for example, using BGP [25]). PE1 and PE2 then establish an LDP session in downstream-unsolicited mode. Before a PE can begin signaling PW labels to a remote PE, the PE must know the Target Attachment Identifier (TAI) of the remote PE. In VPWS, a PW can be thought of as connecting exactly two forwarders. To identify a particular forwarder, through configuration or based on some algorithm, each forwarder is associated with an Attachment Identifier (AI) that is unique in the context of the PE router in which the forwarder resides. The combination of the PE router's address and the AI can identify a globally unique forwarder. To identify a set of forwarders that is part of the same group, an AI is defined to contain an Attachment Group Identifier (AGI) plus an Attachment Individual Identifier (AII). We may think of an AGI as a VPN-id or a VLAN identifier.
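The raw-mode and tag-mode tag-handling rules described in Section 12.4.1.3 can be sketched as a small decision function. The frame representation and the dummy tag value are invented for this example; only the stripping/prepending logic follows the text.

```python
DUMMY_VLAN = 0  # hypothetical placeholder value for the dummy tag

def pe_to_pw(frame, mode, tag_is_service_delimiting):
    """Adjust a frame's VLAN tag before transmission on the PW."""
    frame = dict(frame)  # work on a copy; do not mutate the caller's frame
    if mode == "raw":
        # A service-delimiting tag is not transmitted over the PW.
        if tag_is_service_delimiting and frame.get("vlan") is not None:
            frame["vlan"] = None
    elif mode == "tag":
        # A tag must be transmitted; prepend a dummy tag if none is present.
        if frame.get("vlan") is None:
            frame["vlan"] = DUMMY_VLAN
    return frame

# Raw mode: the service-delimiting tag is stripped.
assert pe_to_pw({"vlan": 100, "data": "x"}, "raw", True)["vlan"] is None
# Tag mode: a missing tag is replaced by a dummy tag.
assert pe_to_pw({"vlan": None, "data": "x"}, "tag", False)["vlan"] == DUMMY_VLAN
```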
For example, if a bidirectional PW connects PE1 with PE2, each of its two LSPs can be identified by the addresses and attachment identifiers of its source and target endpoints. In addition to the endpoint identifiers, PW signaling carries interface parameters (such
as the CE-facing interface MTU size). For PW setup, LDP has defined the following two new FEC elements:
• PWid FEC (FEC type 128)
• Generalized ID FEC (FEC type 129, as shown in Figure 12-10)
[Figure: the Generalized ID FEC element carries the FEC type (129), a control (C) bit, the PW type, the PW info length, and the AGI, SAII, and TAII fields, each encoded with a length and a value.]
Figure 12-10. Generalized ID FEC format [22]
The PWid FEC element is intended for the case in which both PW endpoints have been provisioned with the same AI. On the other hand, the Generalized ID FEC element is used when the PW endpoints can be uniquely identified and when it is desired to auto-discover the identity of the remote endpoints instead of configuring them statically.
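The two FEC elements can be represented schematically as follows. This is not a wire-accurate TLV encoding — the field names follow Figure 12-10, but the dictionary representation and all parameter values are invented for illustration; the actual bit-level layout is defined in the PW signaling specification [22].

```python
def pwid_fec(pw_type, pw_id):
    # FEC type 128: both PW endpoints are provisioned with the same identifier.
    return {"fec_type": 128, "pw_type": pw_type, "pw_id": pw_id}

def generalized_id_fec(pw_type, agi, saii, taii, control_word=False):
    # FEC type 129: endpoints identified by an AGI plus source/target AIIs,
    # enabling auto-discovery of remote endpoints instead of static config.
    return {"fec_type": 129, "c": control_word, "pw_type": pw_type,
            "agi": agi, "saii": saii, "taii": taii}

# Hypothetical identifiers for a PW between two forwarders in group "vpn-42".
fec = generalized_id_fec("ethernet", agi="vpn-42", saii="PE1-ac1", taii="PE2-ac7")
assert fec["fec_type"] == 129 and fec["agi"] == "vpn-42"
```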
12.4.2 E-Line Service Emulation Walk-Through Example
This section provides a walk-through example of E-Line emulation over MPLS using the network diagram shown in Figure 12-11. The following discussion assumes that a bidirectional MPLS tunnel between PE1 and PE2 has already been established using LDP or RSVP-TE; hence, the label exchange for the tunnel setup is not shown. The MPLS network is assumed to be based on packet-over-SONET transport. For the sake of simplicity, the data link (i.e., PPP) and physical layer (i.e., SONET) headers are not shown. The sequence of steps described here may not necessarily be followed in this order by a specific implementation.
1. PE1 and PE2 learn each other's addresses and establish an LDP session between them. Before PE1 and PE2 can begin signaling PW labels, they
need to know the TAI. PE1 and PE2 can learn addresses and TAI information via configuration or some auto-discovery procedure (such as BGP auto-discovery [25]).
2. To set up the PW in the PE1-to-PE2 direction, PE2 allocates a label L2 and advertises this label to PE1 using an LDP Label Mapping message with FEC type 129 containing the SAII, AGI, and TAII.
3. On receipt of the LDP Label Mapping message, PE1 checks whether the message contains a valid TAI for its forwarders. If PE1 cannot use the received TAI to identify one of its forwarders, then PE1 sends a Label Release message to PE2, with a Status Code meaning invalid TAI.
4. Assume that the received TAI is valid; PE1 thus accepts this label binding information. Before PE1 can program its MPLS forwarding plane to attach this label as the PW label (bottom label) for PDUs in the PE1-to-PE2 direction, PE1 needs to ensure that the PW is set up in the opposite (PE2-to-PE1) direction.
5. To set up the PW in the PE2-to-PE1 direction, using procedures similar to those described for PE2 earlier, PE1 allocates a label L1 and advertises this label to PE2 using an LDP Label Mapping message with FEC type 129 containing the SAII, AGI, and TAII.
6. At this stage, the bidirectional PW setup is complete (PE1 uses label L2 as the PW label for sending PDUs to PE2, and PE2 uses label L1 for sending PDUs to PE1).
7. Now let us consider the CE1-to-CE2 data flow. In the incoming direction (CE1 to PE1), PE1 strips off the preamble and frame check sum (FCS) from the received Ethernet frame, prepends the optional control word to the frame, selects the appropriate PW, prepends the PW label L2 to the resulting packet, prepends the appropriate tunnel label, and then transmits the labeled packet along the tunnel (refer to Figure 12-6 for the protocol stack layering). When carrying Ethernet over an IP or MPLS network, packets containing Ethernet PW PDUs may arrive out of order at the destination PE. The control word (a four-byte header) provides a sequence field to support in-order frame delivery. The labeled packet is forwarded based on the tunnel label (top label) across the MPLS network.
8. When the packet arrives at PE2, PE2 performs the reverse set of operations in the outgoing (PE2-to-CE2) direction. For example, PE2 strips off the tunnel and PW encapsulation after processing the frame as required by the control word, strips off the control word, and hands the resulting frame to the Forwarder/NSP, which regenerates the FCS and replays the frame to the attached CE2 interface. A similar set of operations is performed in the other direction of traffic (CE2 to CE1). For example, PE2 strips off the preamble and FCS fields of the Ethernet frame received from CE2, prepends the optional control
word to the frame, selects the appropriate PW, prepends the PW label L1 to the resulting packet, prepends the appropriate tunnel label to the resulting packet, and then transmits the labeled packet along the tunnel (refer to Figure 12-6 for the protocol stack layering). Finally, when the packet arrives at PE1, PE1 strips off the tunnel and PW encapsulation headers and hands the resulting frame to the Forwarder/NSP, which regenerates the FCS and replays the frame to the attached CE1 interface.
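The ingress and egress data-path operations of the walk-through can be summarized in a simplified end-to-end sketch. Labels, field names, and the frame representation are invented for the example; no bit-level encapsulation is performed.

```python
def ingress_pe(eth_frame, pw_label, tunnel_label, use_control_word=True):
    """Strip preamble/FCS (assumed already done here), add the optional
    control word, then push the PW label and the tunnel label."""
    payload = {"frame": eth_frame["payload"]}
    if use_control_word:
        payload["control_word"] = {"sequence": eth_frame.get("seq", 0)}
    return {"labels": [tunnel_label, pw_label], "payload": payload}

def egress_pe(packet):
    """Reverse operations: pop both labels, process and strip the control
    word, and hand the frame to the Forwarder/NSP, which regenerates the FCS."""
    packet["labels"].pop(0)                 # tunnel (top) label
    packet["labels"].pop(0)                 # PW (bottom) label selects the AC
    payload = packet["payload"]
    payload.pop("control_word", None)       # process, then strip
    return {"payload": payload["frame"], "fcs": "regenerated"}

# Hypothetical labels: PW label 2, tunnel label 50.
pkt = ingress_pe({"payload": "user-data", "seq": 7}, pw_label=2, tunnel_label=50)
out = egress_pe(pkt)
assert out["payload"] == "user-data" and out["fcs"] == "regenerated"
```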
Figure 12-11. E-Line emulation over MPLS using VPWS service
12.4.3 Emulation of E-LAN Services using VPLS
A VPLS is a Layer 2 service that emulates a LAN over a packet switched network and makes it possible to interconnect LAN segments at geographically dispersed customer sites as if they were connected to the same LAN. In a VPLS, the attachment circuit between the CE and the PE can carry only Ethernet frames.

12.4.3.1 VPLS Reference Model
Figure 12-12 shows the VPLS reference model and the internal logical structure of a VPLS PE. As in VPWS, in VPLS the CE and PE devices are connected through ACs. The functionality of a VPLS PE device is quite
Figure 12-12. VPLS Reference Model for emulation of E-LAN service over MPLS network
similar to that of a VPWS PE. For example, both VPWS and VPLS PE devices can be thought of as mapping ACs to PWs. The VPLS PE and VPWS PE devices differ mainly in the forwarder functionality. A VPWS PE forwarder performs a one-to-one mapping between ACs and PWs, whereas a VPLS PE forwarder performs a many-to-many mapping between ACs and PWs. In a VPWS PE, the forwarding decision for a frame is not affected by the content of the frame header but is completely determined by the virtual circuit on which the frame is to be transmitted; that is, a VPWS PE device behaves as a virtual circuit switch. In contrast, in a VPLS PE device, the forwarding decision for a frame is based on L2 header information (e.g., the MAC address); that is, a VPLS PE device behaves as a bridge. A VPLS PE device may be thought of as containing a forwarder module and one or more bridge modules. The bridge module performs functions such as learning and aging out (removing) Ethernet MAC addresses. The forwarder maps frames from ACs to the appropriate set of PWs. Figure 12-13 shows that CE devices connect to the bridge module of a VPLS PE through ACs. The bridge module attaches to the forwarder through the emulated LAN interface. An emulated LAN consists of a set of VPLS forwarder modules connected by PWs, where the PWs are carried over PE-PE tunnels across the backbone. A set of forwarders (no more than one in each VPLS PE) connected by PWs is known as a VPLS instance. Thus, the bridge module of each PE in the VPLS instance attaches (via an emulated LAN interface) to the emulated LAN. Hence, to a CE device, the VPLS backbone appears as a set of PE bridges attached to an emulated LAN.
Figure 12-13. Logical structure of an emulated LAN using VPLS
A physical LAN has built-in broadcast and multicast capabilities. For example, when a device sends a frame to a particular device (unicast), all other devices on the LAN receive the frame. So how does a VPLS emulate the broadcast and multicast capabilities of a physical LAN? When a forwarder receives a unicast frame over its emulated LAN interface, unlike a real LAN, the forwarder does not send the frame to all other forwarders but only to the one forwarder that can deliver the frame to the correct destination address. How does the PE learn about the reachability of a MAC address? Conceptually, a PE maintains separate L2 forwarding tables for the bridge module and the forwarder module. What is the difference between the functionality of the two L2 forwarding tables? The L2 forwarding table of a bridge module maps a MAC destination address (DA) to an AC and/or an emulated LAN interface. In contrast, the L2 forwarding table of a forwarder maps a MAC DA to a PW only. As a result, a bridge module performs MAC address learning on ACs as well as emulated LAN interfaces, whereas a forwarder performs MAC address learning on PWs only. This means that when a VPLS forwarder module receives a frame from a PW whose MAC source address (SA) is previously unknown, the forwarder associates this MAC address with the corresponding PW (refer to [27,28] for further information on MAC address learning procedures). Similarly, a bridge module learns MAC addresses on its ACs and emulated LAN interface. In short, the PE bridge function treats the emulated LAN like a normal LAN. Specifically, the PE bridge makes its forwarding decisions as it would on any normal
LAN, based on MAC SA learning. Although conceptually the bridge module's forwarding table and the forwarder's forwarding table are treated as distinct entities, a particular implementation may support these functions within a single forwarding table.

12.4.3.2 Avoiding VPLS Forwarding Loops
Connecting a set of forwarders via PWs forms a VPLS instance, and the set of PWs thus forms an overlay topology. Depending on the form of interconnectivity between forwarders, several types of overlay topologies are possible. For example, if every forwarder in a given VPLS is connected by exactly one point-to-point PW to every other forwarder in the same VPLS, a full mesh overlay topology is formed. In a tree-structured overlay topology, every forwarder in a particular VPLS is assigned a particular position (level) in the tree such that a given forwarder is connected via at most one PW to a higher-level forwarder (the root of the tree is at the highest level). By allowing more than one forwarder at the highest level (provided that the forwarders at the highest level are fully meshed), one can form a variant of the tree-structured overlay topology. Each VPLS forwarder performs MAC learning to populate its L2 forwarding table. To do this, each forwarder treats each PW as a bridge port and employs standard MAC learning procedures over the point-to-point PWs. To achieve loop-free forwarding over the overlay topology, the forwarding decisions of VPLS forwarders require some coordination. In general, loop-free L2 forwarding requires the use of the Spanning Tree Protocol (STP) between bridges. While STP works well in the enterprise LAN environment, its poor resiliency characteristics may not be suitable for use in a high-availability service provider backbone. VPLS obviates the use of STP in the backbone through a combination of full mesh topology and split-horizon forwarding techniques [26].
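The forwarder's MAC learning over PWs can be sketched with a minimal table keyed by source address. The class and identifier names are invented; only the learn-on-receive behavior follows the text.

```python
class VplsForwarder:
    """Minimal sketch of a VPLS forwarder's L2 table: MAC DA -> PW only
    (the bridge module's AC/emulated-interface table is kept separately)."""

    def __init__(self):
        self.mac_to_pw = {}

    def learn(self, mac_sa, pw):
        # Associate a previously unknown source MAC with the PW it arrived on.
        self.mac_to_pw.setdefault(mac_sa, pw)

    def lookup(self, mac_da):
        # None means the DA is unknown and the frame must be flooded.
        return self.mac_to_pw.get(mac_da)

fwd = VplsForwarder()
fwd.learn("00:aa:bb:cc:dd:01", "pw-to-PE2")
assert fwd.lookup("00:aa:bb:cc:dd:01") == "pw-to-PE2"
assert fwd.lookup("00:aa:bb:cc:dd:02") is None  # unknown DA triggers flooding
```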
For example, in a full mesh overlay topology, to ensure loop-free forwarding, when a forwarder receives a frame over a PW, it does not forward that frame over any other PW. Thus, when a forwarder receives a unicast frame with a known DA over its emulated LAN interface, the forwarder sends that frame on exactly one point-to-point PW. In contrast, when a forwarder receives a multicast frame or a unicast frame with an unknown DA, it sends a copy of the frame over each point-to-point PW in the same emulated LAN. Similarly, in a tree-structured overlay topology, to ensure loop-free forwarding, a frame received over a PW from a higher level is not transmitted over a PW that leads to a higher level.
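The full-mesh split-horizon rules above can be condensed into one forwarding function. The port and MAC names are invented; delivery toward local ACs (handled by the bridge module) is deliberately omitted so that only the PW-facing rules are shown.

```python
def forward(frame_da, arrived_on, all_pws, mac_table):
    """Decide which PWs (if any) a frame should be sent on in a full-mesh VPLS."""
    if arrived_on in all_pws:
        # Split horizon: a frame received over a PW is never forwarded to
        # another PW (delivery to local ACs is not modeled here).
        return []
    if frame_da in mac_table:
        return [mac_table[frame_da]]   # known unicast: exactly one PW
    return list(all_pws)               # unknown DA or multicast: flood to all PWs

pws = ["pw1", "pw2", "pw3"]
table = {"mac-C2": "pw2"}

assert forward("mac-C2", "emulated-lan", pws, table) == ["pw2"]
assert forward("mac-unknown", "emulated-lan", pws, table) == pws
assert forward("mac-C2", "pw1", pws, table) == []
```

Because every forwarder is directly connected to every other forwarder, refusing to relay PW-to-PW traffic never isolates a destination, which is why the full mesh plus split horizon can replace STP in the backbone.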
12.4.3.3 VPLS PW Encapsulation
As in VPWS, in VPLS the payload on the ACs consists of Ethernet frames with or without VLAN headers, and thus VPLS uses the same Ethernet encapsulation format as described in the previous section. For VPLS, the handling of a service-delimiting tag can be summarized as follows. If the packet, as it arrives at the PE, has an encapsulation (e.g., a VLAN tag) that is non-service delimiting, then that encapsulation is preserved as the packet is sent into the VPLS. If the packet does not have the required VLAN tag, an appropriate tag is attached [17]. As in VPWS, in VPLS the PW encapsulations are in turn encapsulated within an MPLS tunnel encapsulation.

12.4.3.4 VPLS PW Setup
Each VPLS requires a globally unique identifier (a VPN-id). Before a PE can set up PWs to other PEs in the VPLS, the ACs attaching the CEs to the PEs must be provisioned on both the PEs and the CEs in that VPLS, a forwarder for that VPLS must be provisioned on each PE, the local ACs of that VPLS must be associated with the forwarder, and the forwarder must be provisioned with the identifier (the VPN-id) of the VPLS to which it belongs. The mapping of a VPLS identifier into the set of remote PEs that belong to the same VPLS can be provisioned or accomplished through an auto-discovery procedure such as BGP auto-discovery. Once the set of remote PEs is determined, the VPLS PEs can use LDP to set up a full mesh of PWs among the forwarders in a particular VPLS. As described earlier, LDP has defined two new FEC elements for establishing PWs, namely, the PWid FEC (type 128) and the Generalized ID FEC (type 129). For VPLS PW setup, the Generalized ID FEC is used with the following assignment of fields:
• The control bit (C) is set if the control word is needed on the PW.
• The PW type value indicates the method of encapsulation, namely, Ethernet (i.e., raw mode) or Ethernet VLAN (i.e., tag mode).
• The AGI, Length, and Value fields contain the unique identifier of the VPLS (the VPLS-id).
• The TAII and SAII fields are null, since VPLS PWs terminate on MAC learning tables (many-to-many forwarders) rather than on individual ACs.
• The Interface Parameters include parameters such as the MTU (which must be the same for all PWs in a given VPLS) and the Requested VLAN ID (for a PW of type Ethernet VLAN, this parameter may be used to signal the insertion of the appropriate VLAN ID by the ingress PE before sending the packet on a PW).
12.4.4 E-LAN Service Emulation Walk-Through Example
This section provides a walk-through example of E-LAN emulation over MPLS using the network diagram shown in Figure 12-14. The goal is to set up a VPLS by establishing a full mesh of Ethernet PWs between PE1, PE2,
Figure 12-14. Emulation of E-LAN service over MPLS
and PE3. The following discussion assumes that bidirectional MPLS tunnels between PE1, PE2, and PE3 have already been established using LDP or RSVP-TE. The label exchange for tunnel setup is not shown. The MPLS network is assumed to be based on packet-over-SONET transport. For the sake of simplicity, the data link (i.e., PPP) and physical layer (i.e., SONET) headers are not shown. The sequence of steps described here may not necessarily be followed in this order by a specific implementation.
1. PE1, PE2, and PE3 learn each other's addresses and establish LDP sessions between them. Before the PEs can begin signaling PW labels, they need to know the VPLS-id. Each PE is provisioned with the globally unique VPLS-id. The PEs can learn addresses and the Remote Forwarder Selector (i.e., the VPLS-id) through an auto-discovery procedure.
2. To set up a full mesh of bidirectional PWs using LDP Label Mapping messages with FEC type 129, PE1 advertises a label 12 to PE2 and a label
13 to PE3. Similarly, PE2 advertises a label 21 to PE1 and a label 23 to PE3, and PE3 advertises a label 31 to PE1 and a label 32 to PE2.
3. Assume that in the LDP signaling messages the control (C) bit was set (which means the ingress PE must attach the control word) and the PW type was set to Ethernet VLAN (tag mode).
4. Now let us consider the flow of a packet from CE1 to CE2. Suppose that when the packet leaves CE1, it has a MAC SA of C1 and a MAC DA of C2. Assume that PE1 has learned MAC DA C2 from PE2 and associated it with the PW label 21 advertised by PE2. If PE1 did not know the MAC DA C2, it would multicast this packet to both PE2 and PE3.
5. In the incoming direction, PE1 strips off the preamble and frame check sum (FCS) from the received Ethernet frame, prepends the optional control word to the frame, selects the appropriate PW, prepends the PW label 21 to the resulting packet, prepends the appropriate tunnel label, and then transmits the labeled packet along the tunnel. The labeled packet is forwarded based on the tunnel label (top label) across the MPLS network.
6. When the packet arrives at PE2, PE2 performs the reverse set of operations in the outgoing (PE2-to-CE2) direction. For example, PE2 strips off the tunnel and PW encapsulation after processing the frame as required by the control word, strips off the control word, and hands the resulting frame to the Forwarder/NSP, which regenerates the FCS and replays the frame to the attached CE2 interface. A similar set of operations is performed in the other direction of traffic (CE2 to CE1).
12.5. IMPORTANCE OF VPLS FOR METRO ETHERNET SERVICES
To provide multipoint-to-multipoint Ethernet services between multiple enterprise sites in geographically dispersed locations, service providers must be able to extend native Ethernet service from one metro area to another in a reliable and scalable fashion. The limited number of unique VLAN IDs and the poor convergence characteristics of STP limit the scalability and availability of Ethernet networks, respectively. Because VPLS PWs use MPLS labels, VPLS can support a large number of unique identifiers (e.g., over a million labels). Because VPLS PWs are carried within MPLS tunnels, which can be quickly (in tens of milliseconds) rerouted around link/node failures using mechanisms such as MPLS Fast ReRoute [29], VPLS can support high-availability services. Thus, by enabling two key service attributes, namely,
scalability and high availability, VPLS is expected to play a pivotal role in the large-scale deployment of Metro Ethernet services (see Figure 12-15).
[Figure: Metro Ethernet Networks (MENs) in metro areas A, B, and C interconnected over an MPLS backbone.]
Figure 12-15. Inter-Metro Ethernet extension via VPLS
12.6. SUMMARY
This chapter described the architectures of Layer 2 Virtual Private Network (L2VPN) services, namely, the Virtual Private Wire Service (VPWS) and the Virtual Private LAN Service (VPLS), that are being standardized in the Internet Engineering Task Force (IETF). VPWS and VPLS can be used to support the E-Line and E-LAN services being standardized by the Metro Ethernet Forum (MEF) across a Wide Area Network (WAN). MPLS-based L2VPN services are the main focus of this chapter. The material in this chapter should allow network engineers, administrators, and planners to understand how they could offer E-Line and E-LAN services over a converged packet switched network infrastructure. The content of this chapter is based on IETF standards and does not necessarily reflect any particular networking vendor's L2VPN implementation.
12.7. APPENDIX A: MPLS BASICS
This appendix provides a brief tutorial on the basics of Multiprotocol Label Switching (MPLS) technology.
12.7.1 Forwarding Equivalence Class
In conventional IP forwarding, when a router needs to forward an IP packet, it performs a longest prefix match on the packet's destination address in the forwarding information base (FIB) to obtain the next hop and the outgoing link information. This IP forwarding procedure is then carried out in a sequence of routers as the packet traverses the network toward its destination. Since each router has a fixed (finite) number of outgoing links, packets with different destination IP addresses may be forwarded along the same outgoing link. A group of packets that is forwarded over the same path is said to form a Forwarding Equivalence Class (FEC). As a concrete example, a router typically considers two packets to be in the same FEC if there is some address prefix P in that router's forwarding table (i.e., FIB) such that P is the longest match for each packet's destination address [14].
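The longest-prefix-match rule can be sketched directly with the standard library. The FIB contents here are invented for the example; packets whose destination addresses share the same longest matching prefix fall into the same FEC.

```python
import ipaddress

def longest_prefix_match(dst, prefixes):
    """Return the longest prefix in `prefixes` that matches address `dst`."""
    addr = ipaddress.ip_address(dst)
    matches = [p for p in prefixes if addr in ipaddress.ip_network(p)]
    return max(matches,
               key=lambda p: ipaddress.ip_network(p).prefixlen,
               default=None)

# A hypothetical FIB with three prefixes (including a default route).
fib = ["10.0.0.0/8", "10.1.0.0/16", "0.0.0.0/0"]

# Both packets match 10.1.0.0/16 as the longest prefix => same FEC.
assert longest_prefix_match("10.1.2.3", fib) == "10.1.0.0/16"
assert longest_prefix_match("10.1.9.9", fib) == "10.1.0.0/16"
# A different longest match => a different FEC.
assert longest_prefix_match("10.2.0.1", fib) == "10.0.0.0/8"
```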
12.7.2 Labels

Unlike IP, which bases its forwarding on a packet's destination IP address, MPLS forwards packets based on other information in the header, referred to as a label. A label is a short, fixed-length value that identifies a particular FEC. Figure 12-16 shows the format of an MPLS label. A common type of FEC is known as the Address Prefix FEC, which is based on a packet's IP destination address. In general, however, an FEC can be based on information other than an IP destination address, such as a PW identifier (as discussed in the main body of this chapter). The association between an FEC and a label is known as a FEC-to-label binding. The FEC-to-label binding and its distribution are accomplished through a set of procedures and messages commonly referred to as MPLS signaling protocols (these are discussed further later in this appendix). The label of an incoming packet is known as the incoming label, and the label of an outgoing packet is referred to as the outgoing label. In general, a packet may contain one or more labels that are collectively known as a label stack. The labels in the stack are organized as a last-in-first-out (LIFO) stack (Figure 12-17 shows a label stack). An unlabeled packet can be viewed
as having a label stack of depth 0. As described in the main body of this chapter, transport of Ethernet frames over an MPLS network involves using a label stack where the bottom label is the pseudowire (PW) label and the top label is the tunnel label.

Figure 12-16. MPLS label format: Label (20 bits), Experimental (3 bits), S (1 bit; when set, this field indicates the bottom of the label stack), Time-To-Live (8 bits)
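The 32-bit label stack entry of Figure 12-16 can be packed and parsed as follows, a minimal sketch following the field layout of RFC 3032 [15]:

```python
import struct

def encode_label_entry(label: int, exp: int, s: bool, ttl: int) -> bytes:
    """Pack one 32-bit MPLS label stack entry:
    Label (20 bits) | EXP (3 bits) | S (1 bit) | TTL (8 bits)."""
    assert 0 <= label < 2**20 and 0 <= exp < 8 and 0 <= ttl < 256
    word = (label << 12) | (exp << 9) | (int(s) << 8) | ttl
    return struct.pack("!I", word)  # network byte order

def decode_label_entry(data: bytes) -> dict:
    """Unpack a 32-bit label stack entry back into its four fields."""
    (word,) = struct.unpack("!I", data)
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "s": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }

# Round-trip an entry for label 16, bottom of stack, TTL 64:
entry = encode_label_entry(label=16, exp=0, s=True, ttl=64)
assert decode_label_entry(entry) == {"label": 16, "exp": 0, "s": True, "ttl": 64}
```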
12.7.3 Label Encoding

The term Multiprotocol in MPLS refers to the fact that MPLS is capable of switching packets or frames from multiple Layer 3 and Layer 2 protocols. MPLS can operate over multiple data link layer technologies such as Ethernet, Point-to-Point Protocol (PPP), and ATM. An example of MPLS label encoding for the case of Ethernet and PPP is depicted in Figure 12-18.

Figure 12-17. MPLS label stack: each label stack entry is 32 bits; entries run from the top of the stack (level d, where d is the stack depth) down to the bottom of the stack (level 1), whose entry has the S bit set to one
Figure 12-18. MPLS label stack encoding: MPLS over Ethernet over fiber/copper carries [Ethernet Header | Label Stack | Layer 3 Header | Payload], while MPLS over PPP over SONET/SDH carries [PPP Header | Label Stack | Layer 3 Header | Payload]; in this sense MPLS is often said to belong to "layer 2.5" of the OSI model
12.7.4 Label Switched Router (LSR)

A router that is capable of supporting MPLS is referred to as a label switched router (LSR). An interconnection of LSRs forms an MPLS network. An LSR at the edge of the MPLS network is known as an edge LSR. In general, any LSR with a neighbor that does not support MPLS can be considered an edge LSR. In contrast, all of a transit LSR's neighbors are expected to support MPLS. Depending on its location in the MPLS network, an LSR can perform a different set of label operations.
12.7.5 Label Stack Operations: Imposition, Disposition, Swapping

In the incoming direction, an edge LSR attaches one or more labels (i.e., a label stack) to unlabeled packets. The action of adding one or more labels is commonly known as label imposition. In the outgoing direction, an edge LSR removes labels and thus turns labeled packets into unlabeled packets. The action of removing the label stack from a labeled packet is commonly referred to as label disposition. In contrast with an edge LSR, which either attaches or removes the label stack, a transit LSR only replaces the incoming label with an outgoing label. The action of replacing an incoming label with an outgoing label is commonly referred to as label swapping.
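A toy model of these three operations, with the label stack held as a Python list (top of stack last); the label values are illustrative only:

```python
# Packets are modeled as (label_stack, payload); top of stack is the last
# element of the list. Label values below are made up for illustration.

def impose(packet, labels):
    """Edge LSR, ingress: push one or more labels (label imposition)."""
    stack, payload = packet
    return (stack + list(labels), payload)

def swap(packet, out_label):
    """Transit LSR: replace only the top (incoming) label with the
    outgoing label; deeper stack entries are untouched."""
    stack, payload = packet
    return (stack[:-1] + [out_label], payload)

def dispose(packet):
    """Edge LSR, egress: remove the entire label stack (label disposition)."""
    _stack, payload = packet
    return ([], payload)

pkt = ([], "IP datagram")        # unlabeled: stack depth 0
pkt = impose(pkt, [1001, 42])    # PW label 1001 at bottom, tunnel label 42 on top
pkt = swap(pkt, 57)              # a transit LSR swaps only the tunnel label
assert pkt[0] == [1001, 57]
assert dispose(pkt) == ([], "IP datagram")
```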
12.7.6 MPLS Control Plane

The phrase MPLS control plane refers to the set of tasks performed by MPLS signaling protocols. The main function of the MPLS signaling protocols is to establish, maintain, and release label switched paths. The MPLS control plane consists of signaling protocols such as the Label Distribution Protocol (LDP) [23] and the Resource ReSerVation Protocol (RSVP) [24].
12.7.7 MPLS Forwarding Plane

The label information provided by the MPLS control plane protocols is used to populate state in the forwarding tables. The forwarding state is in turn used for switching packets from input to output ports based on label information. The MPLS forwarding tables make up the forwarding plane of an LSR.
12.7.8 Label Switched Path (LSP)

The sequence of LSRs that a labeled packet traverses, starting at the ingress LSR and terminating at the egress LSR, is known as a label switched path (LSP). Figure 12-19 shows an example of an LSP. As MPLS LSPs are unidirectional, a separate LSP in each direction is required for bidirectional traffic flows. The path of an LSP can be selected either hop by hop or specified explicitly by the source node. When the path of an LSP is selected hop by hop (for example, along the path computed by an interior gateway protocol such as Open Shortest Path First (OSPF)), such an LSP is termed a hop-by-hop routed LSP. In contrast, an LSP whose path is explicitly specified at the source node is called an explicitly routed (or traffic engineered) LSP. For instance, LDP establishes hop-by-hop routed LSPs, whereas RSVP-TE establishes explicitly routed LSPs.
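Tying the pieces together, the sketch below walks a packet along a unidirectional LSP, with each transit LSR consulting a hypothetical incoming-label to (outgoing-label, outgoing-link) cross-connect table; router names and label values are invented for illustration:

```python
# Toy unidirectional LSP: ingress imposes a label, each transit LSR swaps,
# and the egress disposes. All names and numbers are illustrative.
LSRS = {
    "ingress": {"push": (100, "to_B")},   # edge LSR: label imposition
    "B":       {100: (200, "to_C")},      # transit: swap 100 -> 200
    "C":       {200: (300, "to_egress")}, # transit: swap 200 -> 300
    "egress":  {"pop": 300},              # edge LSR: label disposition
}

def traverse(path):
    """Follow a packet hop by hop, returning (LSR, label, link) per hop."""
    label = None
    hops = []
    for lsr in path:
        table = LSRS[lsr]
        if "push" in table:                 # ingress edge LSR
            label, link = table["push"]
        elif "pop" in table:                # egress edge LSR
            assert label == table["pop"]
            label, link = None, "ip"        # forwarded as plain IP
        else:                               # transit LSR: label swap
            label, link = table[label]
        hops.append((lsr, label, link))
    return hops

hops = traverse(["ingress", "B", "C", "egress"])
assert [h[1] for h in hops] == [100, 200, 300, None]
```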
Figure 12-19. MPLS label switched paths: unlabeled IP packets enter at one edge LSR, traverse the LSPs, and exit as unlabeled IP packets at the other edge LSR
12.7.9 Benefits of MPLS Technology

The key benefits of MPLS above and beyond those of IP are the applications it enables, such as Layer 3 VPNs (L3VPNs), Layer 2 VPNs (L2VPNs), traffic engineering (TE), and fast rerouting (FRR) of traffic around link/node failures. These applications are possible because MPLS decouples forwarding from the control plane, which means that the same forwarding component can be used without any special considerations for a particular application.
12.8. REFERENCES

[1] L. Andersson and T. Madsen, "PPVPN Terminology," IETF Work in Progress, July 2004.
[2] L. Andersson and E. Rosen, "Framework for Layer 2 Virtual Private Networks (L2VPNs)," IETF Work in Progress, June 2004.
[3] S. Bryant and P. Pate, "PWE3 Architecture," IETF Work in Progress, March 2004.
[4] H. Schulzrinne et al., "RTP: A Transport Protocol for Real-Time Applications," RFC 1889, January 1996.
[5] E. Rosen and Y. Rekhter, "BGP/MPLS IP VPNs," IETF Work in Progress, March 2004.
[6] D. Farinacci et al., "Generic Routing Encapsulation (GRE)," RFC 2784, March 2000.
[7] G. Dommety, "Key and Sequence Number Extensions to GRE," RFC 2890, September 2000.
[8] C. Perkins, "IP Encapsulation within IP," RFC 2003, October 1996.
[9] A. Conta et al., "Generic Packet Tunneling in IPv6 Specification," RFC 2473, December 1998.
[10] S. Kent et al., "Security Architecture for the Internet Protocol," RFC 2401, November 1998.
[11] S. Kent et al., "IP Authentication Header," RFC 2402, November 1998.
[12] S. Kent et al., "IP Encapsulating Security Payload (ESP)," RFC 2406, November 1998.
[13] D. Harkins et al., "The Internet Key Exchange (IKE)," RFC 2409, November 1998.
[14] E. Rosen et al., "Multiprotocol Label Switching Architecture," RFC 3031, January 2001.
[15] E. Rosen et al., "MPLS Label Stack Encoding," RFC 3032, January 2001.
[16] R. Callon and M. Suzuki, "A Framework for Layer 3 Provider Provisioned Virtual Private Networks," IETF Work in Progress, October 2003.
[17] L. Martini et al., "Encapsulation Methods for Transport of Ethernet Frames Over IP/MPLS Networks," IETF Work in Progress, July 2004.
[18] L. Martini et al., "Encapsulation Methods for Transport of ATM Over IP/MPLS Networks," IETF Work in Progress, July 2004.
[19] A. Malis et al., "SONET/SDH Circuit Emulation over Packet (CEP)," IETF Work in Progress, June 2004.
[20] C. Kawa et al., "Frame Relay over Pseudo-Wires," IETF Work in Progress, February 2004.
[21] L. Martini et al., "Encapsulation Methods for Transport of PPP/HDLC Over IP and MPLS Networks," IETF Work in Progress, April 2004.
[22] L. Martini et al., "Pseudowire Setup and Maintenance using LDP," IETF Work in Progress, July 2004.
[23] L. Andersson et al., "LDP Specification," RFC 3036, January 2001.
[24] D. Awduche et al., "RSVP-TE: Extensions to RSVP for LSP Tunnels," RFC 3209, December 2001.
[25] Ould-Brahim et al., "Using BGP as an Auto-Discovery Mechanism for Layer-3 and Layer-2 VPNs," IETF Work in Progress, May 2004.
[26] M. Lasserre et al., "Virtual Private LAN Services over MPLS," IETF Work in Progress, August 2004.
[27] IEEE 802.1D-2003, IEEE Standard for Local and Metropolitan Area Networks: Media Access Control (MAC) Bridges, 2003.
[28] IEEE 802.1Q-1998, IEEE Standards for Local and Metropolitan Area Networks: Virtual Bridged Local Area Networks, 1998.
[29] P. Pan et al., "Fast Reroute Extensions to RSVP-TE for LSP Tunnels," IETF Work in Progress, November 2004.
Chapter 13 METRO ETHERNET CIRCUIT EMULATION SERVICES
Nan Chen, Strix Systems; Metro Ethernet Forum President
13.1. METRO ETHERNET CIRCUIT EMULATION SERVICES
13.1.1 Circuit Emulation Service Definition

Ethernet CES provides emulation of TDM services, such as N x 64 kbit/s, T1, E1, T3, E3, OC-3, and OC-12, across a Metropolitan Ethernet Network (MEN). The objective is to allow MEN service providers to offer TDM services to customers, and hence to extend their reach and addressable customer base. For example, the use of CES enables metro Ethernet transport networks to connect to PBXs on customer premises and to deliver TDM voice traffic alongside data traffic on metro Ethernet. CES is based on a point-to-point connection between two Interworking Functions (IWFs). Essentially, CES uses the MEN as an intermediate network (or virtual wire) between two TDM networks. This setup is handled as an application of the Ethernet service, using the interworking function to interface the application layer onto the Ethernet services layer.
13.1.1.1 TDM Line Service (T-Line)
The TDM Line (T-Line) service provides TDM interfaces to customers (N x 64 kbit/s, T1, E1, T3, E3, OC-3, OC-12, etc.), but transfers the data across the MEN instead of a traditional circuit-switched TDM network. From the customer's perspective, this TDM service is the same as any other TDM service, and the service definition is given by the relevant ITU-T and ANSI standards pertaining to that service. From the provider's perspective, two CES interworking functions are provided to interface the TDM service to the Ethernet network. The CES interworking functions are connected via the Metro Ethernet Network (MEN) using point-to-point Ethernet Virtual Connections (EVCs), as illustrated in Figure 13-1. The TDM Service Processor (TSP) block shown in Figure 13-1 consists of any TDM grooming function that may be required to convert the TDM service offered to the customer into a form that the CES IWF can accept. For example, the TSP may be a framer device, converting a fractional DS1 service offered to the customer into an N x 64 kbit/s service for transport over the MEN. The operation of the TSP is outside the scope of this chapter. The TSP and the CES IWF may physically reside in the Provider Edge (PE) unit at the provider's nearest point of presence, or in a service-provider-owned box at a customer location (e.g., a multitenant unit). From the architecture perspective, there is no difference between these alternatives.
Figure 13-1: TDM line service over Metro Ethernet Networks
13.1.1.1.1 Operational Modes of a T-Line Service

The basic T-Line service is a point-to-point, constant-bit-rate service, similar to the traditional leased-line type of TDM service. However, service multiplexing may occur ahead of the CES interworking functions (e.g., aggregation of multiple emulated T1 lines into a single T3 or OC-3 link), creating a multipoint-to-point or even a multipoint-to-multipoint configuration (see Figure 13-2). This service multiplexing is carried out using standard TDM multiplexing techniques, and is considered part of the TSP block rather than the CES interworking function. The TDM interface at the input of the CES interworking function is the same as that output from the CES IWF at the opposite end of the emulated link. It is the TSP that may be used to multiplex (or demultiplex) that TDM service into the actual TDM service provided to the customer. This setup allows a TDM service to a customer to be provided as a collection of emulated services at lower rates.
Figure 13-2: Possible TDM virtual private line configurations
Hence there are three possible modes of operation of a T-Line service:
1. Unstructured Emulation Mode (also known as structure-agnostic emulation)
2. Structured Emulation Mode (also known as structure-aware emulation)
3. Multiplexing Mode
Modes (1) and (2) are point-to-point connections. Mode (3) permits multipoint-to-point and multipoint-to-multipoint configurations, although all these modes are operated over simple point-to-point EVCs in the MEN.
13.1.1.1.1.1 Unstructured Emulation Mode

In Unstructured Emulation Mode, a service is provided between two service endpoints that use the same interface type. Traffic entering the MEN on one endpoint leaves the network at the other endpoint and vice versa. The MEN must maintain the bit integrity, timing, and other client-payload-specific characteristics of the transported traffic without causing any degradation that would exceed the requirements for the given service as defined later in this chapter. All the management, monitoring, and other functions related to that specific flow must be performed without changing or altering the service payload information or capacity. Examples where unstructured emulation mode could be implemented are leased-line services or other transfer-delay-sensitive (real-time) applications. The specific transport rates and interface characteristics of this service are defined later in this chapter.

13.1.1.1.1.2 Structured Emulation Mode

In Structured Emulation Mode, a service is provided between two service endpoints that use the same interface type. Traffic entering the MEN on one endpoint is handled as overhead and payload. The overhead is terminated at the near endpoint, while the payload traffic is transported transparently to the other end. At the far endpoint, the payload is mapped into a new overhead of the same type as that at the near endpoint. The MEN must maintain the bit integrity, timing information, and other client-payload-specific characteristics of the transported traffic without causing any degradation that would exceed the requirements for the given service as defined later in this chapter. All the management, monitoring, and other functions related to that specific flow must be performed without changing or altering the service payload information or capacity.
An example of such a service is the transport of OC-3 where the SOH is terminated at both ends and the STS-1 payloads are transported transparently over the MEN. A second example is a fractional DS1 service, where the framing bit and unused channels are stripped and the used channels transported across the MEN as an N x 64 kbit/s service. The specific transport rates and interface characteristics are defined later in this chapter.

13.1.1.1.1.3 Multiplexing Mode

In Multiplexing Mode, multiple lower-rate transparent services are multiplexed at a specific service endpoint of the MEN into a higher digital hierarchy. Similarly, a higher-rate service may be decomposed into several lower-rate services. For example, a customer may have several sites: a head office with a full DS1 connection, and several satellites with fractional DS1 connections, as shown in Figure 13-3. The same architecture can be
used for the multiplexing of other rate services, e.g., several full DS1 services onto a single DS3, or multiplexing of VT-1.5s into an STS-1. The specific transport rates and interface characteristics are defined later in this chapter. Service multiplexing is typically performed in the TDM domain as part of a TSP, not in the Ethernet domain. A fractional TDM link going into the MEN comes out as a fractional TDM link, or at least as a payload containing solely that fractional link. Multiplexing and demultiplexing are performed outside the CES IWF as part of any native TDM signal processing, as shown in Figure 13-3. Therefore the customer service is multiplexed, but the emulated service (i.e., the service handled by the IWF) is structured.

Figure 13-3: Example multipoint-to-point T-Line multiplexed service (fractional DS1 links from satellite sites are carried over point-to-point EVCs and multiplexed by a TDM multiplexing function (TSP) into a full DS1 link at the head office)
13.1.1.1.2 Bandwidth Provisioning for a T-Line Service

The Metro Ethernet service provider will need to allocate sufficient bandwidth within the network to carry the T-Line service. This may require very fine granularity of bandwidth provision to allow efficient allocation. For example, an N x 64 kbit/s service may have very low bandwidth requirements when N is small. This could result in very inefficient provisioning if the smallest unit increment of bandwidth provision is 1 Mbit/s. The following sections outline three possible schemes to provision the bandwidth efficiently for low-data-rate CES.
13.1.1.1.2.1 Bandwidth Allocation at 100 kbit/s Granularity

The bandwidth required to offer emulation of N x 64 kbit/s services increases with N in steps of 64 kbit/s, plus an overhead for encapsulation headers. Therefore, in order to allocate MEN bandwidth efficiently for such services, the bandwidth needs to be provisioned in similar step sizes. It is recommended that, to achieve reasonable efficiency, the granularity of service provisioning should be 100 kbit/s or smaller. However, most existing equipment only allows bandwidth provision at multiples of 1 Mbit/s (or 1.5 Mbit/s for much SONET-based Ethernet equipment), so this fine level of granularity is not guaranteed to be available. Therefore, alternative means of allocating bandwidth may need to be used to provide a reasonable level of efficiency.

13.1.1.1.2.2 TDM Multiplexing

Multiple TDM services between the same two IWFs may be multiplexed together in the TDM domain before being carried across the MEN. Demultiplexing at the far end is also accomplished in the TDM domain. For example, several fractional DS1 services could be multiplexed onto a single full DS1 link before transmission across the MEN. Bandwidth can then be provisioned for the full link, rather than individually for each fractional customer service. Similarly, the same architecture can be used for multiplexing of other rate services, e.g., several full DS1 services onto a single DS3 or SONET link.

13.1.1.1.2.3 Ethernet Multiplexing

Another alternative is to multiplex several circuits across a single Ethernet Virtual Connection, as shown in Figure 13-4. All TDM circuit emulation between any two given IWFs is handled across a single point-to-point EVC. Individual circuit-emulated links are carried across that EVC using Ethernet multiplexing. Customer separation is maintained using this multiplexing. The total bandwidth of circuit emulation traffic between two points is known in the case of a constant-bit-rate, always-on service.
Therefore, the amount of traffic across that EVC is known and can be provisioned accordingly. This allows efficient bandwidth provisioning. For example, five fractional TDM services at 384 kbit/s could share a single 2 Mbit/s EVC, rather than being allocated 1 Mbit/s each.
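The efficiency argument in the preceding subsections is simple arithmetic; the sketch below contrasts per-service provisioning at coarse granularity with aggregating several CES flows onto one EVC. The 10% encapsulation overhead figure is an illustrative assumption, not a MEF-specified value:

```python
import math

def provisioned_kbps(n_channels: int, granularity_kbps: int,
                     overhead: float = 0.10) -> int:
    """Bandwidth to provision for one N x 64 kbit/s CES flow, rounded up
    to the provisioning granularity. The 10% encapsulation overhead is an
    illustrative assumption."""
    raw = n_channels * 64 * (1 + overhead)
    return math.ceil(raw / granularity_kbps) * granularity_kbps

# A 2 x 64 kbit/s service (~141 kbit/s with overhead):
assert provisioned_kbps(2, granularity_kbps=100) == 200    # fine granularity
assert provisioned_kbps(2, granularity_kbps=1000) == 1000  # 1 Mbit/s steps: ~7x waste

# Ethernet multiplexing: five always-on 384 kbit/s (6 x 64 kbit/s)
# fractional services can share one 2 Mbit/s EVC (5 * 384 = 1920 kbit/s
# of payload), instead of being allocated 1 Mbit/s each.
flows_kbps = [384] * 5
assert sum(flows_kbps) <= 2000
# Per-flow provisioning at 1 Mbit/s granularity would cost 5 Mbit/s:
assert sum(provisioned_kbps(6, 1000) for _ in flows_kbps) == 5000
```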
Figure 13-4: Multiplexing across a single Ethernet Virtual Connection (circuit-emulated links from customer sites, which may or may not belong to the same customer, carried over point-to-point EVCs; the TDM links may be full or fractional)
13.1.1.2 TDM Access Line Service (TALS)
The second type of TDM service that can be offered over Metropolitan Ethernet Networks is one where the MEN service provider hands off one (or both) ends of the network to a second network (e.g., the PSTN). For example, Figure 13-5 shows the case where the customer interface is TDM, and the handoff to an external network (in this case the PSTN) is via some form of TDM or SONET trunk line (e.g., OC-3). The prime use of this type of service is where the MEN is used as an access network onto the second network. With the TALS service, the customer-facing IWF can be located on the customer's site but is still under the control of the MEN operator. Multiple customer services may be multiplexed onto a single trunk to be handed off to the second network (Figure 13-5). As with the T-Line service, this multiplexing is implemented using conventional TDM techniques in a native signal processing block. In some instances, both ends of the emulated TDM service may be within the service provider network, e.g., backhaul of ATM or SONET services. As with the T-Line service, the customer-facing TSP and CES IWF may physically reside in the Provider Edge (PE) unit at the provider's nearest point of presence, or in a service-provider-owned box at a customer location (e.g., a multitenant unit). From the architecture perspective, there is no difference between these alternatives.
Figure 13-5: Handoff of a multiplexed trunk to an external network
13.1.1.2.1 Operational Modes of a TALS Service

The TALS service is essentially very similar to the multiplexed, multipoint-to-point T-Line service. The two services use the MEN in the same manner (see Figure 13-6). The only difference is that the final multiplexed service (e.g., OC-3) is handed off to another network rather than to an end customer. As such, it may have some performance requirements deriving from the requirements of the second network that are not present in the T-Line service. The MEN must maintain the bit integrity, timing, and other client-payload-specific characteristics of the transported traffic without causing any degradation that would exceed the requirements for the given service as defined later in this chapter. All the management, monitoring, and other functions related to that specific flow must be performed without changing or altering the service payload information or capacity.
Figure 13-6: Possible TDM handoff configurations
13.1.1.3 Customer-Operated CES
It is also possible for customers to operate the Circuit Emulation Service themselves across an E-Line service (see Figure 13-7). However, in order to operate CES at a reasonable level of quality, the service provider will need to offer a premium SLA, with tighter definitions of parameters such as packet delay, variation in packet delay, and packet loss. The requirements section of this document covers the parameters that need to be controlled, as well as the appropriate range of values over which CES can be operated with acceptable quality. Some customers, for cost reasons, may choose to operate CES across a standard E-Line service with no special SLA. In this case, the level of quality experienced is entirely the customer's own responsibility. Since from the service provider's perspective this is purely an Ethernet service, the definition of this service is considered to be outside the scope of this chapter, other than to document possible parameters of a service-level specification for a CES-capable E-Line service.
Figure 13-7: Customer-operated CES over a Metropolitan Ethernet Service (e.g., E-Line)
13.1.1.4 Mixed-Mode CES Operation
A further scenario is a mixed-mode service, where the customer interface is Ethernet on one side and TDM on the other. In this case the customer is providing its own interworking function at one end of the service, and this function must interoperate with the service provider's interworking function at the far end (Figure 13-8).
This service may be operated using the same methods as the other services described in this chapter. However, it may create significant issues for troubleshooting links between the service provider and the customer. For example, it may be difficult to determine whether a fault is in the customer's own equipment or in the service provider's equipment. Service providers offering mixed-mode services should be aware of the potential issues involved. Resolution of these types of issues, as well as any operations and management issues involved with running between IWFs in different administrative domains, is for further study. The MEF does not plan any further work on the definition of such a service at present.
Figure 13-8: Mixed-mode service
13.1.2 Circuit Emulation Service Framework

This section is intended to provide a framework within which the actual requirements for the Circuit Emulation Service can be understood.

13.1.2.1 General Principles
In most service provider networks, SONET/SDH has been used as the technology for transporting customers' PDH or SONET/SDH traffic. The purpose of CES is to allow the MEN to serve not only as a transport of Ethernet and data services but also as a transport of customers' TDM traffic. The CES solution in the MEN should therefore make the MEN behave as a standard SONET/SDH and/or PDH network, as seen from the customer's perspective. The intention is that the CES customer should be able to use the
same (legacy) equipment, regardless of whether the traffic is carried by a standard SONET/SDH or PDH network, or by a MEN using CES.

13.1.2.2 Service Interface Types
There are two basic service interfaces in the TDM domain. These are shown in Figure 13-9, and are defined as follows:
1. TDM Service Interface: the TDM service that is handed off to the customer or TDM network operator. TDM services fall into two categories, namely unstructured and structured.
2. Circuit Emulation Service Interface (CES TDM Interface): the actual circuit service that is emulated between interworking functions through the MEN.
For an unstructured TDM service, the CES interface carries all information provided by the TDM service interface transparently. The service is emulated in its entirety by the IWF, including any framing structure or overhead present. For a structured TDM service, the TDM service interface is operated on by the TSP (TDM Service Processor) to produce the service that is to be emulated across the MEN. A single structured TDM service may be decomposed into one or many CES flows, or two or more structured TDM services may be combined to create a single CES flow.

Figure 13-9: Functional elements and interface types (the customer TDM service enters at the TDM service interface, passes through an optional TSP, e.g., for framing or mux/demux, to the CES TDM interface; the CES IWF packetizes the CES payload and adds the DA, SA, and CRC fields at the Ethernet interface)
13.1.2.2.1 Examples of TDM Service Interfaces

Table 13-1 shows examples of possible TDM service interfaces. These service interfaces are defined as standardized PDH, SONET, or SDH interfaces. Unstructured TDM services are either emulated as-is by the CES IWF or mapped by the TSP onto the CES interface without modifying their content. These services preserve the framing structure as well as the SONET/SDH overhead as input to the TDM service interface. These services provide point-to-point connectivity across a MEN.
Structured services may be either demultiplexed into any of their defined service granularities or multiplexed with other structured TDM services. For example, an OC-3 structured service can be demultiplexed into three separate STS-1 signals. Each of these signals can then be circuit-emulated by a different IWF across the MEN to one or more service endpoints. At these IWFs, these STS-1s can be reassembled with other STS-1s (from different originating services) to form an entirely new OC-3 signal (provided each originating site uses a PRS-level clock). Structured service can be used to provide point-to-point, point-to-multipoint, and multipoint-to-multipoint TDM services.

Table 13-1. TDM service interfaces

TDM Service Interface | Unstructured TDM Service | Structured TDM Service | Structured TDM Service Granularity
DS1    | Yes | Yes | N x 64 kbit/s
DS3    | Yes | Yes | DS1, N x 64 kbit/s
E1     | Yes | Yes | N x 64 kbit/s
E3     | Yes | Yes | E1, N x 64 kbit/s, DS0
OC-1   | Yes | Yes | STS-1, VT-1.5, VT-2
OC-3   | Yes | Yes | STS-1, VT-1.5, VT-2
OC-3c  | Yes | Yes | STS-3c
STM-1  | Yes | Yes | VC-11 (DS1), VC-12 (E1), VC-3 (DS3, E3, other)
STM-1c | Yes | Yes | VC-4, VC-3, VC-11, VC-12
OC-12  | Yes | Yes | VT-1.5 (DS1), VT-2 (E1), STS-1 (DS3, E3, other), STS-3c
OC-12c | Yes | Yes | STS-12c
STM-4  | Yes | Yes | VC-11 (DS1), VC-12 (E1), VC-3 (DS3, E3, other), VC-4
STM-4c | Yes | Yes | VC-4-4c
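The OC-3 decomposition described above can be modeled as a trivial demultiplex/remultiplex of three tributaries; the flow names below are purely illustrative:

```python
# Toy model of structured-service decomposition: an OC-3 carries three
# STS-1 tributaries, each of which may be circuit-emulated to a different
# destination IWF. Names are illustrative only.

def demultiplex_oc3(oc3_tributaries):
    """Split a structured OC-3 into its three STS-1 CES flows."""
    assert len(oc3_tributaries) == 3, "an OC-3 carries three STS-1s"
    return {f"STS-1 #{i + 1}": trib for i, trib in enumerate(oc3_tributaries)}

def remultiplex_oc3(sts1_flows):
    """Far-end IWFs: reassemble three STS-1s (possibly from different
    originating services) into a new OC-3."""
    assert len(sts1_flows) == 3
    return list(sts1_flows.values())

flows = demultiplex_oc3(["voice trunk", "video feed", "data circuit"])
assert flows["STS-1 #2"] == "video feed"
assert remultiplex_oc3(flows) == ["voice trunk", "video feed", "data circuit"]
```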
13.1.2.2.2 TDM Service Processor (TSP)

The TDM Service Processor is defined as a function, operating in the TDM domain, that:
1. modifies the bit rate or contents of the service to be emulated (e.g., overhead termination, removal of unused channels in a fractional service);
2. multiplexes two or more TDM services into a single service to be emulated, and demultiplexes the composite service into its component
services at the remote end as required (e.g., multiplexing of several N x 64 kbit/s services onto a larger N x 64 kbit/s service, or multiplexing of several DS1 services onto a DS3);
3. transparently maps the TDM service onto a different service to be emulated (e.g., asynchronous mapping of a DS1 onto a VT-1.5).
By this definition, a T1/E1 framer would be considered a TSP (it terminates the framing bits and breaks the service down into an N x 64 kbit/s service). Another example of a TSP would be a DS1 to VT-1.5 mapper.

13.1.2.2.3 Circuit Emulation Interworking Function (CES IWF)

Circuit Emulation Service is defined as an application service in terms of the layered network model defined in the MEN Architecture framework document. It uses the MEN as an intermediate network (or virtual wire) between two TDM networks. The Circuit Emulation Interworking Function is therefore defined as the adaptation function that interfaces the CES application to the Ethernet layer. The CES IWF is responsible for all the functions required for an emulated service to operate, e.g., data packetization and depacketization (including any padding required to meet the minimum Ethernet frame size), sequencing, synchronization, TDM signaling, alarms, and performance monitoring. Listed in Table 13-2 are the defined interface types required to support circuit emulation over the MEN. These services are divided by payload type, CES interface type, CES rate, and IWF type.

13.1.2.2.3.1 PDH Circuit Emulation Service

PDH emulation services provide transport of PDH services through the MEN. Managed by the IWF, as defined in Table 13-2, these services can be used to support structured or unstructured TDM service interfaces. As structured TDM services, the input TDM service can be demultiplexed into its service granularity as defined in Table 13-1. This service granularity includes DS0, N x 64 kbit/s, DS1, E1, DS3, or E3.
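The IWF responsibilities listed above (packetization, sequencing, padding to the minimum Ethernet frame size) can be sketched generically as follows. This is only an illustrative model: the actual MEF/IETF pseudowire encapsulation (control word, RTP header, and so on) is more involved, and the 64-byte chunk size is an arbitrary choice:

```python
from dataclasses import dataclass

MIN_ETHERNET_PAYLOAD = 46  # minimum Ethernet frame payload (bytes)

@dataclass
class CesFrame:
    seq: int        # sequence number for loss/reorder detection
    payload: bytes  # fixed-size slice of the TDM byte stream (padded if short)

def packetize(tdm_stream: bytes, chunk: int = 64):
    """CES IWF, TDM-to-Ethernet direction: slice the constant-rate TDM
    byte stream into fixed-size chunks, number them, and pad short frames
    to the minimum Ethernet payload."""
    frames = []
    for seq, off in enumerate(range(0, len(tdm_stream), chunk)):
        payload = tdm_stream[off:off + chunk]
        payload = payload.ljust(MIN_ETHERNET_PAYLOAD, b"\x00")  # pad if short
        frames.append(CesFrame(seq=seq, payload=payload))
    return frames

def depacketize(frames, chunk: int = 64):
    """Far-end IWF: check sequencing, strip padding, rebuild the stream."""
    out = bytearray()
    for expected, f in enumerate(frames):
        assert f.seq == expected, "lost or reordered frame"
        out += f.payload[:chunk]
    return bytes(out)

stream = bytes(range(200))
frames = packetize(stream, chunk=64)
assert depacketize(frames, chunk=64)[:len(stream)] == stream
```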
Likewise, multiple structured TDM services can be multiplexed into a single service and circuit-emulated across the MEN as this new service. For example, multiple N x 64 kbit/s services can be multiplexed into a DS1 that is circuit-emulated across the MEN and output as a DS1 structured service. These interface rates and IWFs are presented in Table 13-2.

13.1.2.2.3.2 SONET/SDH Circuit Emulation Service

SONET/SDH emulation services provide transport of SONET/SDH services through the MEN. Managed by the IWF, as defined in Table 13-2, these services can be used to support structured or unstructured TDM service interfaces. As structured TDM services, the input TDM service can be
Chapter 13
demultiplexed into its service granularity as defined in Table 13-1. This service granularity includes STS-1, STM-1, or the appropriate virtual container. Likewise, multiple structured TDM services can be multiplexed into a single service and circuit emulated across the MEN as this new service. These interface rates and IWFs are presented in Table 13-2.

Table 13-2. CES TDM interface definition

Payload Type         CES Interface Type  CES Rate         IWF Type
PDH                  NDS0                N x 64 kbit/s    NDS0-CE
                     DS1                 1.544 Mbit/s     DS1-CE
                     E1                  2048 kbit/s      E1-CE
                     E3                  34368 kbit/s     E3-CE
                     DS3                 44.736 Mbit/s    DS3-CE
SONET/SDH            STS-1               51.84 Mbit/s     STS-1-CE
                     STS-3               155 Mbit/s       STS-3-CE
                     STS-3c              155 Mbit/s       STS-3c-CE
                     STM-1               155 Mbit/s       STM-1-CE
                     STS-12              622.08 Mbit/s    STS-12-CE
                     STS-12c             622.08 Mbit/s    STS-12c-CE
                     STM-4               622.08 Mbit/s    STM-4-CE
SONET/SDH Tributary  VT-1.5              1.728 Mbit/s     VT-1.5-CE
                     VT-2                2.176 Mbit/s     VT-2-CE
                     VC-11               1.728 Mbit/s     VC-11-CE
                     VC-12               2176 kbit/s      VC-12-CE
                     VC-3                48384 kbit/s     VC-3-CE
                     STS-1P              50.112 Mbit/s    STS-1P-CE
                     STS-3P              150.336 Mbit/s   STS-3P-CE
                     VC-4                150.336 Mbit/s   VC-4-CE
                     STS-3cP             150.336 Mbit/s   STS-3cP-CE
                     STS-12P             601.344 Mbit/s   STS-12P-CE
                     STS-12cP            601.344 Mbit/s   STS-12cP-CE
                     VC-4-4P             601.344 Mbit/s   VC-4-4P-CE
                     VC-4-4cP            601.344 Mbit/s   VC-4-4cP-CE
SONET/SDH emulation service provides emulation of the following services:
1. Higher-order Virtual Container (HOVC): VC-3/STS-1, VC-4/STS-3c, VC-4-4c/STS-12c
2. Lower-order Virtual Container (LOVC): VC-11/VT-1.5, VC-12/VT-2, none/VT-3, VC-2/VT-6, VC-3/none
SONET terminology refers to an HOVC as a Synchronous Payload Envelope (SPE) and an LOVC as a Virtual Tributary (VT). Figure 13-10 shows an example of the functional blocks used in the SONET/SDH structured emulation service. For the SONET/SDH interface,
Metro Ethernet Circuit Emulation Services
a SONET/SDH framer TSP is used to terminate and handle the section and line overheads. The SONET/SDH virtual containers (SPEs) are extracted and passed to the SONET/SDH IWF(s). Each virtual container (SPE) can be sent to a different destination IWF. The ECDX and EFT are used to add the required multiplexing and Ethernet headers and trailers. For PDH interfaces, a SONET/SDH mapper TSP is used to map the PDH signal (T1/E1/T3/E3) into SONET/SDH virtual containers (SPE/VT). Each virtual container can be sent to a different destination IWF and be either groomed to a SONET/SDH interface or mapped back to a PDH interface.
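The MEN-bound chain just described (framer or mapper TSP, then IWF, then ECDX, then EFT) can be sketched as a composition of stages. This is an illustrative sketch only: the function names, the 256-byte payload size, and the dictionary-based frame layout are assumptions for the example, not part of the MEF specification.

```python
# Illustrative MEN-bound processing chain for one emulated circuit.
# Stage names follow the text (TSP, IWF, ECDX, EFT); the interfaces are
# hypothetical -- the MEF documents define functions, not APIs.

def tsp_extract(sonet_frame):
    """Framer TSP: terminate section/line overhead, return the SPEs/VTs."""
    return sonet_frame["spes"]  # assumed input structure

def iwf_packetize(spe_bytes, chunk=256):
    """CES IWF: segment the TDM payload into fixed-size CES payloads."""
    return [spe_bytes[i:i + chunk] for i in range(0, len(spe_bytes), chunk)]

def ecdx_tag(payloads, ecid):
    """ECDX: prepend the ECID so the far end can demultiplex circuits."""
    return [(ecid, p) for p in payloads]

def eft_frame(tagged, dst, src):
    """EFT: add MAC addresses (FCS generation omitted for brevity)."""
    return [{"dst": dst, "src": src, "ecid": ecid, "payload": p}
            for ecid, p in tagged]

# A 512-byte SPE becomes two tagged, addressed Ethernet frames:
frame = {"spes": [bytes(512)]}
frames = eft_frame(ecdx_tag(iwf_packetize(tsp_extract(frame)[0]), ecid=7),
                   dst="AA:BB:CC:00:00:01", src="AA:BB:CC:00:00:02")
```

Each SPE can take this path independently, which is how one physical SONET/SDH port can feed emulated circuits bound for different destination IWFs.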
Figure 13-10: Functional elements of SONET/SDH emulation service
13.1.2.2.4 Emulated Circuit Demultiplexing Function (ECDX)
The Emulated Circuit Demultiplexer (ECDX) is a function, operating in the packet domain, that
1. selects one (or more) IWF as the final destination of each Ethernet frame received from the MEN, based on an Emulated Circuit Identifier (ECID) attribute within the frame header
2. prepends an ECID attribute to each Ethernet frame sent to the MEN to allow circuit demultiplexing at the MEN egress
3. assigns the length/type field to each Ethernet frame sent to the MEN
By this definition, equipment supporting eight DS1 TDM service interfaces would be able to multiplex the eight DS1 interfaces onto a single EVC. Each emulated circuit would be identified by a unique ECID attribute. The ECDX function would examine the ECID attribute of received Ethernet frames and compare it to its local ECID-to-IWF association. According to the look-up result, the ECDX would pass the circuit emulation payload to the relevant IWF for processing.
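The ECID-to-IWF look-up described above can be sketched as a small dispatch table. The class, method names, and header layout below are hypothetical, and the Length/Type value 0x88D8 is used here only as a placeholder; the normative encoding is defined in the MEF Implementation Agreements.

```python
CES_LENGTH_TYPE = 0x88D8  # placeholder value for illustration only

class Ecdx:
    """Sketch of the ECDX behavior described above (hypothetical API)."""

    def __init__(self):
        self.ecid_to_iwf = {}   # local ECID-to-IWF association
        self.delivered = {}     # payloads handed to each IWF (for the demo)

    def bind(self, ecid, iwf_name):
        self.ecid_to_iwf[ecid] = iwf_name

    def men_bound(self, ecid, payload):
        # Prepend the Length/Type and ECID so the egress can demultiplex.
        return (CES_LENGTH_TYPE.to_bytes(2, "big")
                + ecid.to_bytes(4, "big") + payload)

    def ce_bound(self, frame):
        # Examine the ECID of a received frame and dispatch to the bound IWF.
        ecid = int.from_bytes(frame[2:6], "big")
        iwf_name = self.ecid_to_iwf.get(ecid)
        if iwf_name is not None:  # unknown circuits are discarded
            self.delivered.setdefault(iwf_name, []).append(frame[6:])
        return iwf_name

# Eight DS1 circuits multiplexed onto one EVC, one ECID per circuit:
ecdx = Ecdx()
for n in range(8):
    ecdx.bind(ecid=n, iwf_name=f"DS1-IWF-{n}")
frame = ecdx.men_bound(ecid=3, payload=b"tdm bytes")
assert ecdx.ce_bound(frame) == "DS1-IWF-3"
```

The key point the sketch illustrates is that the EVC carries no per-circuit state of its own; the ECID in each frame header is the only thing that keeps the eight emulated DS1s separate.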
13.1.2.2.5 Ethernet Flow Termination Function (EFT)
A termination function is defined as follows: a transport processing function which accepts adapted characteristic information from a client layer network at its input, adds information to allow the associated trail to be monitored (if supported), and presents the characteristic information of the layer network at its output(s) (and vice versa).
In the context used here, an Ethernet Flow Termination function takes an adapted payload from the ECDX (the MAC client information field), along with a Length/Type attribute describing it as a CES payload. It adds the MAC destination and source addresses, and finally the frame check sequence. In the CE-bound direction, the EFT takes in an Ethernet frame from the MEN. It determines from the Length/Type field whether the frame contains a CES payload, and forwards it to its associated ECDX function for passing to the appropriate CES IWF.

13.1.2.2.6 Direction terminology
For each direction of an emulated circuit, there is a pair of CES interworking functions. The MEN-bound IWF handles the packetization of the TDM data, encapsulation into Ethernet frames, and forwarding into the Ethernet network. The corresponding CE-bound IWF extracts the TDM data from the Ethernet frames and recreates the TDM service. The reverse direction of the service is handled separately by a second pair of IWFs. Hence, for a given MEN-bound IWF, the corresponding CE-bound IWF is at the other side of the MEN.

13.1.2.3 Synchronization
Figure 13-11 shows the synchronization domains needed to support CES.
Figure 13-11: Timing distribution architecture showing clock domains

The notations used in Figure 13-11 are described in Table 13-3.

Table 13-3. Timing distribution definitions

CES IWF: Circuit emulation service interworking function.
CEn: Customer edge devices terminating/originating TDM circuits.
TSP: The TDM Service Processor performs all TDM functions necessary to support structured or unstructured TDM emulation service. These functions are performed in the TDM domain using standardized techniques.
MEn: Metro Ethernet devices providing Circuit Emulation Service transport in the MEN.
PHY: Physical interface terminating Ethernet traffic.
-> : The TDM-end service circuits.
=> : The CES IWF providing edge-to-edge emulation for the TDM circuit.
SD_EDn: The synchronization domain used by Ethernet network elements (NEs) in the MEN cloud. This timing information is used to transport packets between the Ethernet NEs providing the CES. Since each of these devices operates independently, the synchronization mode of each device must be provisioned separately. Typical timing modes include line, external, and free-run.
SD_IWFn: The synchronization domain used by the CES IWF. This timing is used to transport the circuit-emulated service across the MEN. This synchronization domain is specific to the CES interface types shown in Table 13-2 (CES Interface Definition). Details of this synchronization domain are contained in the IA referenced in Table 13-2.
SD_TDMn: The synchronization domain used by the TSP. This timing is used to transport TDM signals between the TSP and the CE. This synchronization domain supports through-timing mode.
SD_CEn: The synchronization domain used by the CE devices to establish the TDM service clock. This timing is used to transport TDM signals between the CE and the TSP. This synchronization domain supports external, line, or free-run timing modes.
One of the objectives of edge-to-edge emulation of a TDM circuit is to preserve the service clock at a performance level as specified in the
relevant Implementation Agreements. For example, the performance objective could be to meet the G.823 Traffic Interface specification. The service clock can be either generated at a CE via external timing mode or recovered from the TDM bit stream via line/loop timing mode. It should be noted that loop timing mode is a recovery process where timing is extracted from only one available input bit stream, while line timing mode is a recovery process where timing is extracted from one of several input bit streams. Since the number of available timing inputs is based on network design and system architecture, this chapter will use the term line timing when describing timing modes where either line or loop timing is used. Typical timing modes for a point-to-point CES connection are shown in Table 13-4.

Table 13-4. CE synchronization modes and expected timing performance

Option 1. CE1: external to PRS-traceable source; CE2: external to PRS-traceable source. CE1 and CE2 will operate in a plesiochronous mode. If connected to other TDM circuits (in other synchronization domains), network slip-rate objectives will be met.
Option 2. CE1: external to PRS-traceable source; CE2: line timing mode. CE2 will have the same average frequency as CE1. If connected to other TDM circuits (in other synchronization domains), network slip-rate objectives will be met.
Option 3. CE1: line timing mode; CE2: external to PRS-traceable source. CE1 will have the same average frequency as CE2. If connected to other TDM circuits (in other synchronization domains), network slip-rate objectives will be met.
Option 4. CE1: free-run mode; CE2: line timing mode. CE2 will have the same average frequency as CE1. If connected to other TDM circuits (in other synchronization domains), undesirable slip performance may result.
Option 5. CE1: line timing mode; CE2: free-run mode. CE1 will have the same average frequency as CE2. If connected to other TDM circuits (in other synchronization domains), undesirable slip performance may result.
Option 6. CE1: free-run mode; CE2: free-run mode. The CE1 and CE2 synchronization domains will operate independently. TDM data will experience slips between CE1 and CE2 and TDM circuits in other synchronization domains.
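The slip behavior in Table 13-4 follows from simple frequency arithmetic: a controlled slip repeats or deletes one 125 microsecond TDM frame, so the worst-case interval between slips is the frame time divided by the fractional frequency offset between the two clocks. A short sketch (the +/-4.6 ppm free-run figure is an assumed Stratum 3-style accuracy used for illustration):

```python
def seconds_between_slips(offset_ppm):
    """Worst-case interval between controlled slips for an 8 kHz-framed
    TDM service, where one slip repeats or deletes a 125 us frame."""
    frame_time = 125e-6                         # seconds per TDM frame
    fractional_offset = abs(offset_ppm) * 1e-6  # ppm -> dimensionless
    if fractional_offset == 0.0:
        return float("inf")                     # truly synchronous: no slips
    return frame_time / fractional_offset

# Option 1: two independent PRS clocks, each accurate to 1e-11
# (combined offset up to 2e-11, i.e., 2e-5 ppm): about one slip in 72 days.
prs_vs_prs = seconds_between_slips(2e-5)
# Option 6: two free-running oscillators at +/-4.6 ppm each: a slip
# roughly every 14 seconds, hence the "undesirable slip performance."
free_run = seconds_between_slips(9.2)
```

This is why options 1-3 meet network slip-rate objectives while option 6 does not: the interval between slips scales inversely with the frequency offset.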
When the TDM circuit is transported via CES, this continuous signal is broken into packets at the MEN-bound IWF of the CES connection and reassembled into a continuous signal at the CE-bound IWF of the CES connection. In essence, the continuous frequency of the TDM service clock is disrupted when the signal is mapped into packets.
In order to recover the service clock frequency at the egress of the CES connection, the interworking function must employ a process that is specific to the CES interface type. The description and requirements of the IWF service clock recovery are contained in the Implementation Agreement for that service.

13.1.2.3.1 CES Interworking Function: Synchronization Description
The synchronization block diagram for the CES interworking function is shown in Figure 13-12.
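One widely used recovery process, named here only for illustration since the normative requirements live in the per-service IAs, is adaptive clock recovery: the CE-bound IWF steers its playout clock from the fill level of its jitter buffer. A persistently filling buffer means the recovered clock is slower than the ingress service clock; a draining buffer means it is faster. A minimal proportional-control sketch, with assumed gain and units:

```python
def adaptive_clock_step(nominal_hz, buffer_fill, target_fill, gain=1e-4):
    """One control step of a simple adaptive clock recovery loop.

    The CE-bound IWF cannot observe the ingress service clock directly,
    so it infers it from the jitter buffer: fill above target means the
    playout clock is too slow (speed up); below target means too fast
    (slow down). The long-term average playout frequency then tracks
    the ingress service clock. Gain and units are illustrative.
    """
    error = buffer_fill - target_fill          # packets above/below target
    return nominal_hz * (1.0 + gain * error)   # steered playout frequency

# A DS1 playout clock nudged up while the buffer runs two packets deep:
assert adaptive_clock_step(1_544_000, buffer_fill=12, target_fill=10) > 1_544_000
assert adaptive_clock_step(1_544_000, buffer_fill=8, target_fill=10) < 1_544_000
```

Real implementations filter the fill measurement heavily so that MEN packet delay variation does not leak into the recovered frequency; this sketch shows only the steering principle.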
Figure 13-12: CES IWF Synchronization Reference Model

The notations used in Figure 13-12 are described in Table 13-5.

Table 13-5. CES IWF synchronization definitions

IWF Processor: Performs all data, addressing, and timing functions necessary to support circuit emulation over the MEN.
TDM Line: A line timing option may be used to provide the IWF with a timing reference. TDM line timing will extract timing from the TDM service (SD_TDM).
Ethernet Line: Ethernet line timing will extract timing either from the MEN (SD_ED) or from the arriving Ethernet frames.
EXT: An external timing option may be used to provide the IWF with a timing reference.
FR: An internal timing reference may be used to provide the IWF with a timing reference. This internal timing reference may be either a free-running oscillator or an oscillator in holdover mode.
SD_IWF: The synchronization domain for the IWF. The definition and requirements for the SD_IWF may be found in the IA for the specific IWF per Table 13-2 (CES Interface Definition).
ETF: Physical interface terminating Ethernet traffic.
The main goal of the CES IWF is to preserve the service clock of the TDM service through the MEN. The CES IWF must preserve the accuracy of the service clock despite MEN jitter. The operation and definition of the interworking functions are specific to the CES interface type, as shown in Table 13-2 (CES Interface Definition). Specific information about the operation and requirements of particular interworking functions is contained in the Implementation Agreements indicated in Table 13-2. The IWF can use a variety of timing inputs as a reference for service clock recovery. The five available timing inputs used by the CES IWF are listed below:
1. Line timing (from the CE): This mode is used to recover timing from a CE. In order for this timing mode to be PRS traceable, the CE must be provisioned to recover and transmit PRS-traceable timing. Line timing should be used if synchronization messaging is available (e.g., SONET or SDH line timing sources).
2. Line timing (from the MEN): This mode is used to recover timing from the MEN. In order for this timing mode to be PRS traceable, the adjacent Ethernet NE (in the MEN) must be provisioned to recover and transmit PRS-traceable timing. Line timing should be used if synchronization messaging is available (e.g., SONET or SDH line timing sources). Line timing can also include deriving a clock from incoming Ethernet packets (e.g., use of adaptive or differential clock recovery methods).
3. External timing: This mode is used to recover PRS-traceable timing from a co-located building integrated timing supply (BITS). Timing is typically sent to the TSP/IWF as an all-ones DS1 or E1 with framing. Synchronization status messages (SSMs) may also be transmitted over the DS1 link using a "blue signal" (all ones without framing) or via SSMs on the ESF data link as defined by ANSI T1.101.
4. Free-run: This mode is considered a stand-alone mode.
It should only be used when a suitable line or external timing reference is not available. The frequency and stability of this timing mode are determined by the TSP/IWF's internal oscillator.
5. Holdover: This mode is usually considered a backup or protection timing mode. Holdover mode may be initiated when the external or line timing references have been lost due to a failure. This failure is indicated to the CES IWF via a loss of frame (LOF), loss of signal (LOS), or SSM indication. Unlike free-run, this timing mode relies on the TSP/IWF's internal oscillator having been trained by an external timing reference (e.g., a PRS-traceable timing source). See ANSI T1.101 for performance specifications of this and other timing modes. Holdover may also be used when there is a cessation of activity on the
MEN-bound TDM interface (e.g., due to the normal operation of a variable-bit-rate IWF).

13.1.2.3.2 Synchronous IWF and Associated Tributaries
Synchronous IWFs require that the IWF synchronization domain (SD_IWF) at both the MEN-bound IWF and the CE-bound IWF use common clock synchronization (e.g., timing that is traceable to the same physical clock). Unless specific mechanisms have been put in place to ensure that a common clock is distributed between both IWFs, it should be assumed that the IWFs are not synchronous; the IWFs cannot be assumed to have suitable clocking information unless it has been specifically provided for. Tributaries to the IWFs consist of those TDM services that originate at the CE and are terminated at the IWF. The service clock of each tributary may be either synchronous (PRS traceable) or asynchronous (not PRS traceable).

13.1.2.3.2.1 Synchronous IWF and Synchronous Tributaries
Synchronous IWF: In this scenario, the synchronization domains at the MEN-bound and CE-bound IWFs (SD_IWF) use common clock synchronization (e.g., timing that is traceable to the same physical clock).
Synchronous Tributary: The synchronization domains of the CE devices (SD_CEn) require that at least one of the CE devices use external timing that is PRS traceable. The other CE in the CES connection may be either line-timed to the originating CE or externally timed to a PRS-traceable source.

13.1.2.3.2.2 Synchronous IWF and Asynchronous Tributaries
Synchronous IWF: In this scenario, the synchronization domains at the MEN-bound and CE-bound IWFs (SD_IWF) use common clock synchronization (e.g., timing that is traceable to the same physical clock).
Asynchronous Tributary: The synchronization domains of the CE devices (SD_CEn) require that at least one of the CE devices be externally timed to a non-PRS-traceable source or use its internal free-running oscillator. The other CE can be line-timed, externally timed, or operated on its internal free-running oscillator.
Of these options, only the line-timed option will provide slip-free performance under non-fault conditions (see Table 13-4).

13.1.2.3.3 Asynchronous IWF and Associated Tributaries
In the asynchronous network, the synchronization domains of the IWFs (SD_IWF) do not all use a common source of timing. For example, the IWFs may receive plesiochronous timing. Tributaries to the IWF consist of TDM
services that originate at the CE and are terminated at the IWF. At the TDM Service Processor, the service clock of each TDM tributary (SD_CEn) is terminated. This service clock will be used to support any functions necessary to create structured or unstructured TDM emulation services. After these services are created, the service clock and the TDM data are then input to the CES IWF.

13.1.2.3.3.1 Asynchronous IWF, Asynchronous Tributaries
Asynchronous IWF: In this scenario, not all the synchronization domains of the IWF devices (SD_IWF) use common clock synchronization (e.g., timing from the same physical clock). Timing distribution for the IWFs may be configured with each either line-timed to the MEN, line-timed to a dedicated CE, or externally timed to a PRS-traceable source.
Asynchronous Tributary: The synchronization domains of the CE devices (SD_CEn) require that at least one of the CE devices be externally timed to a non-PRS-traceable source or use its internal free-running oscillator. The other CE can be line-timed, externally timed, or operated on its internal free-running oscillator. Of these options, only the line-timed option will provide slip-free performance under non-fault conditions (see Table 13-4).

13.1.2.3.3.2 Asynchronous IWF, Synchronous Tributaries
Asynchronous IWF: In this scenario, not all the synchronization domains of the IWFs (SD_IWF) use common clock synchronization (e.g., timing from the same physical clock). Timing distribution for the IWFs may be configured with each either line-timed to the MEN, line-timed to a dedicated CE, or externally timed to a PRS-traceable source.
Synchronous Tributary: The synchronization domains of the CE devices (SD_CEn) require that at least one of the CE devices use external timing that is PRS traceable. The other CE in the CES connection may be either line-timed to the originating CE or externally timed to a PRS-traceable source.
13.1.2.3.4 Synchronization Administration
Proper synchronization administration is crucial to support the transport of TDM emulation and circuit emulation services. Each synchronization domain in the CES needs to be considered and provisioned independently. Table 13-6 provides a summary of the typically available options per synchronization domain.
Table 13-6. Timing options per synchronization domain

CE (SD_CE): External, Line, Free-Run
IWF (SD_IWF): Per IA
TSP (SD_TDM): External, Line-CE, Line-MEN, Free-Run
MEN (SD_ED): PRS Traceable, Non-PRS Traceable
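Because each domain must be provisioned independently, the per-domain menus of Table 13-6 amount to a small validity check on a provisioning plan. A sketch follows; the domain and option names mirror the table, while the function itself is a hypothetical illustration:

```python
# Allowed timing options per synchronization domain, per Table 13-6.
TIMING_OPTIONS = {
    "SD_CE":  {"External", "Line", "Free-Run"},
    "SD_IWF": {"Per IA"},
    "SD_TDM": {"External", "Line-CE", "Line-MEN", "Free-Run"},
    "SD_ED":  {"PRS Traceable", "Non-PRS Traceable"},
}

def invalid_provisioning(plan):
    """Return the (domain, option) pairs not permitted by Table 13-6."""
    return [(d, o) for d, o in plan.items()
            if o not in TIMING_OPTIONS.get(d, set())]

plan = {"SD_CE": "External", "SD_TDM": "Line-MEN", "SD_ED": "PRS Traceable"}
assert invalid_provisioning(plan) == []
assert invalid_provisioning({"SD_TDM": "Line-PSTN"}) == [("SD_TDM", "Line-PSTN")]
```

Note that the menus differ between deployment models: Line-PSTN, rejected here, is a legitimate TSP option in the multi-provider case (Table 13-9).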
Proper synchronization administration begins with the following concepts:
• Separate and Diverse: Synchronization equipment and paths should be separate and diverse. This reduces the chance that a single point of failure will cause service disruption. This philosophy extends to equipment placement, cable routing, powering sources, and network element configuration.
• Synchronization Trail: All synchronization flows have a defined source and end. The context of this flow is based on a logical flow rather than a topology (which could be linear, ring, mesh, or a combination of these). The flow can even span multiple transport technologies, including Ethernet, SONET, and SDH. The source of a synchronization trail should be either an independent reference that is PRS traceable or a free-running oscillator. The synchronization trail ends on a specific device that does not further propagate the source timing information. Such a terminating device is necessary to prevent timing loops, which will cause transport errors in the physical layer. Intermediate equipment in the synchronization trail can use timing information derived from the source in a daisy-chained configuration. It should be noted that upstream timing events (protection switching, holdover events, clock failures, etc.) will be propagated to downstream equipment in this configuration. For these cases, downstream equipment will need predetermined fault modes to deal with these circumstances when they arise. Such fault modes may include protection switching to a different line-timed reference or entry into holdover.
• Service Clock Preservation: The service clock is defined as the timing source used to generate the client signal. In a CES transport network, the service clock frequency must be preserved in order to have error-free transport of the physical layer. That is to say, the average bit rate of the transmitting and receiving Customer Edge devices must be the same.
Otherwise, the transport capability of the CES will be compromised.
• Synchronization Traceability: In synchronization network planning, it is important to know where a timing source originates, how it flows in the network, and its quality. The concept of traceability provides the
answers to these questions. Synchronization flows can be described in terms of source traceability. That is to say, timing originates at a physical location and is used by equipment in a synchronization trail via either external or line-timing options. Common clock distribution is an example of a source-traceable clock scheme. In the case of frequency traceability, the quality of a timing signal can be specified. For example, the frequency of synchronization sources may be specified as being PRS traceable or accurate to within 1e-11 (0.00001 parts per million). Plesiochronous clock distribution is an example of a frequency-traceable clock scheme.
Three synchronization administration models are presented in the following sections. These models have been chosen to illustrate how they may be used to accommodate the CES definitions. The models are as follows:
• Single Service Provider-Owned Network: A single service provider controls the entire transport synchronization trail. The service synchronization trail is separate.
• Multi Service Provider-Owned Network: Multiple service providers control the transport synchronization trail. The service synchronization trail is separate.
• Private (Customer-Owned) Network: A customer controls the entire transport and service synchronization trails.
Table 13-7 summarizes these synchronization administration models and shows how they map to the CES service definitions.

Table 13-7. CES services and supporting synchronization administration models

TDM Virtual Private Line service: Single Service Provider model. Can also be used with the Private Network model.
TDM Access Line service: Single Service Provider model. Handoff to the PSTN does not retime the payload (DS1 or DS3); therefore, there is no new sync trail.
Customer-operated CES over Metro Ethernet: Multi Service Provider model. The IWF at the CES egress may be owned by a different service provider.
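The timing-loop hazard noted under the Synchronization Trail concept can be checked mechanically: model each network element's provisioned line-timing reference as an edge, and verify that every trail terminates at a source (external or free-run) with no cycle. A sketch with hypothetical element names:

```python
def find_timing_loop(references):
    """Detect a timing loop in a provisioned synchronization plan.

    `references` maps each network element to the element it line-times
    from (or None for an external/free-run source that ends a chain).
    Returns the elements forming a loop, or None if every trail terminates.
    """
    for start in references:
        seen = []
        node = start
        while node is not None:
            if node in seen:
                return seen[seen.index(node):]   # the cycle itself
            seen.append(node)
            node = references.get(node)
    return None

# A valid daisy chain: NE3 -> NE2 -> NE1 -> BITS (external source).
assert find_timing_loop({"NE1": None, "NE2": "NE1", "NE3": "NE2"}) is None
# A timing loop: NE1 and NE2 line-time from each other.
assert find_timing_loop({"NE1": "NE2", "NE2": "NE1"}) is not None
```

In a real network this check would be run over the provisioned references of every element in a trail, including backup references, since a loop can also form after a protection switch.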
13.1.2.3.4.1 Single Service Provider-Owned Network
The synchronization administration for a single service provider-owned network is illustrated in Figure 13-13.
Figure 13-13: Synchronization Administration for a Single Service Provider Network
A summary of the available timing options for the Single Service Provider-Owned Network is presented in Table 13-8.

Table 13-8. Timing options for Single Service Provider-Owned Network

CE (SD_CE):
  External: Customer option. Requires a collocated BITS or SSU to supply a timing input to the CE. May be used as a source for the synchronization trail.
  Line: Customer option. May be used at the end of a synchronization trail.
  Free-Run: Customer option. May be used as the source for the synchronization trail.
IWF (SD_IWF):
  See IA: Service provider option. Timing recovery for the CES IWF must match the IWF type. Details can be found in the IA as specified in Table 13-2 (CES Interface Definition).
TSP (SD_TDM):
  External: Service provider option. Requires a collocated BITS or SSU to supply a timing input to the PE.
  Line-CE: Not used, since the CE and TSP are on different synchronization trails.
  Line-MEN: Service provider option. Used if the IWF can provide timing consistent with the service needs of the TSP. Options and capabilities for a specific IWF are specified per the appropriate IA.
  Free-Run: Service provider option. Used only if there is no suitable line or external timing source.
MEN (SD_ED):
  PRS Traceable: Service provider option. Requires that all elements use PRS-traceable timing.
  Non-PRS Traceable: Service provider option. Requires that one or more elements not use PRS-traceable timing.
Note that in the Single Service Provider configuration, the synchronization trails for service timing and transport timing are separate. Separation of these synchronization trails is typically done for administrative and liability reasons. An end customer will not be able to manage a service
provider's timing equipment. Likewise, a service provider would typically not use timing from a CE due to the reliability and liability issues associated with a single point of failure. Separate synchronization trails also mean that the physical timing source used to derive the service clock is not the same as that used in the transport network (TSP/IWF and MEN). This is not to say that the sources for the service and transport cannot both be PRS traceable, simply that one is not the source for the other.

13.1.2.3.4.1.1 Service Timing: Single Service Provider Network
The Customer Edge synchronization domain (SD_CE) is owned and maintained by the customer. Timing for the CE (SD_CE) requires that at least one CE be the source of timing. Referring to Table 13-4, any of the six options listed can be used. The choice of option will depend on economic and service needs. A PRS-traceable source can be highly reliable and accurate but requires physical placement and substantial engineering support. It is for this reason that option 1 may be more expensive to administer than options 2 and 3. Options 4 and 5 rely on the CE's internal oscillator to provide a frequency source for the service clock. In this case, due to the relaxed need for additional timing equipment (BITS or SSUs), these options may be less expensive to administer than option 1 and require less engineering support. Option 6 may yield undesirable slip performance due to the frequency difference between the free-running clocks. This option is generally the least expensive but also has the lowest overall performance.

13.1.2.3.4.1.2 Transport Timing: Single Service Provider Network
Timing for the single service provider network assumes that the service provider owns the TSP/IWF and all the Ethernet devices in the MEN. This equipment encompasses the synchronization domains SD_IWF, SD_TDM, and SD_ED.
For this case, it is up to the service provider to perform the synchronization administration of all transport and synchronization equipment. Regardless of the synchronization options that the service provider may use, as listed in Table 13-5, it is the ability of the service provider's network to accurately transport the service clock that is most important. Therefore, the operation of the CES interworking function is key to transport timing. The CES interworking function (SD_IWF) must use timing that is consistent with the IWF type. The TSP synchronization domain (SD_TDM) provides a foundation for the CE IWF. PRS-traceable timing may be available via external timing of the TSP/IWF or via line timing (from the MEN). It should be noted that the use of
synchronization status messaging (SSM) between the MEN and the TSP/IWF is required to ensure that the TSP/IWF is always receiving PRS-traceable timing. SSMs facilitate fault or protection switching after upstream failures. If SSMs are not available, then external timing is the preferred timing mode for PRS-traceable timing.

13.1.2.3.4.2 Multi Service Provider-Owned Network
Synchronization administration for a Multi Service Provider-Owned Network is illustrated in Figure 13-14.
Figure 13-14: Synchronization Administration for a Multi Service Provider Network
A summary of the available timing options for a Multi Service Provider-Owned Network is presented in Table 13-9. Note that in the Multi Service Provider configuration, the synchronization trails for service timing and transport timing are separate. Also notice that the transport timing can be TDM based (PSTN) or Ethernet based (MEN). Separation of these synchronization trails is typically done for administrative and liability reasons. An end customer will not be able to manage a service provider's timing equipment. Likewise, a service provider would typically not use timing from a CE due to the reliability and liability issues associated with a single point of failure. Separate synchronization trails also mean that the physical timing source used to derive the service clock is not the same as that used in the transport network (TSP/IWF and MEN). This is not to say that the sources for the service and transport cannot both be PRS traceable, simply that one is not the source for the other.

13.1.2.3.4.2.1 Service Timing: Multi Service Provider Network
The Customer Edge synchronization domain (SD_CE) is owned and maintained by the customer. Timing for the CE (SD_CE) requires that at least one CE be the source of timing. Referring to Table 13-4, any of the six
Table 13-9. Timing options for Multi Service Provider-Owned Network

CE (SD_CE):
  External: Customer option. Requires a collocated BITS or SSU to supply a timing input to the CE. May be used as a source for the synchronization trail.
  Line: Customer option. May be used at the end of a synchronization trail.
  Free-Run: Customer option. May be used as the source for the synchronization trail.
PSTN (SD_TDM):
  External: Service provider option. Requires a collocated BITS or SSU to supply a timing input to the PE.
  Line-CE: Customer option. May be used to recover timing from the received client signal. This mode should only be used if synchronization messaging is available (e.g., SONET or SDH line-timing sources).
  Line-TSP: Service provider option. Used only if the connecting TSP equipment is PRS traceable. This mode should only be used if synchronization messaging is available (e.g., SONET or SDH line-timing sources).
  Free-Run: Service provider option. Used only if there is no suitable line or external timing source.
IWF (SD_IWF):
  See IA: Service provider option. Timing recovery for the CES IWF must match the IWF type. Details can be found in the IA as specified in Table 13-2 (CES Interface Definition).
TSP (SD_TDM):
  External: Service provider option. Requires a collocated BITS or SSU to supply a timing input to the PE.
  Line-CE: Customer option. May be used to recover timing from the received client signal. This mode should only be used if synchronization messaging is available (e.g., SONET or SDH line-timing sources).
  Line-PSTN: Service provider option. Used only if the connecting PSTN equipment is PRS traceable. This mode should only be used if synchronization messaging is available (e.g., SONET or SDH line-timing sources).
  Line-MEN: Service provider option. Used if the IWF can provide timing consistent with the service needs of the TSP. Options and capabilities for a specific IWF are specified per the appropriate IA.
  Free-Run: Service provider option. Used only if there is no suitable line or external timing source.
MEN (SD_ED):
  PRS Traceable: Service provider option. Requires that all elements use PRS-traceable timing.
  Non-PRS Traceable: Service provider option. Requires that one or more elements not use PRS-traceable timing.
options listed can be used. The choice of options to use will depend on economic and service needs. A PRS-traceable source can be highly reliable and accurate but requires physical placement and substantial engineering
Metro Ethernet Circuit Emulation Services
support. It is for this reason that option 1 may be more expensive to administer than options 2 and 3. Options 4 and 5 rely on the CE's internal oscillator to provide a frequency source for the service clock. In this case, due to the relaxed need for additional timing equipment (BITS or SSUs), this option may be less expensive to administer than option 1 and require less engineering support. Option 6 may yield undesirable slip performance due to the frequency difference between the free-running clocks. This option is generally the least expensive but also has the lowest overall performance.
13.1.2.3.4.2.2 Transport Timing — Multi Service Provider Network
Timing for the Multi Service Provider Network assumes that no one service provider owns the PSTN, TSP/IWF, and all the Ethernet devices in the MEN. The equipment will encompass synchronization domains SDIWF, SDTDM, and SDED. For this case, it is up to each service provider to perform the synchronization administration of all transport and synchronization equipment in its domain. Regardless of the synchronization options that the service provider may use, as listed in Table 13-10, it is the ability of the service provider's network to accurately transport the service clock that is most important. Therefore, the operation of the CES interworking function is key to transport timing. The PSTN synchronization domain (SDTDM) preserves the service clock through the PSTN. PRS-traceable timing may be provided to all the PSTN network elements (NEs) externally by BITS or via line timing from adjacent NEs. It should be noted that the use of synchronization messaging (SSM) between NEs may be required to ensure that each NE is receiving PRS-traceable timing. SSMs facilitate fault or protection switching from upstream failures. If SSMs are not available, then external timing is the preferred timing mode for PRS-traceable timing. The TSP synchronization domain (SDTDM) provides a foundation for the CE IWF.
PRS-traceable timing may be available via external timing of the TSP/IWF or line timing (from the MEN). It should be noted that the use of synchronization messaging (SSM) between the MEN and TSP/IWF is required to ensure at all times that the TSP/IWF is receiving PRS-traceable timing. SSMs facilitate fault or protection switching from upstream failures. If SSMs are not available, then external timing is the preferred timing mode for PRS-traceable timing.
Private (Customer-Owned) Network
Synchronization administration for a private network is illustrated in Figure 13-15.
Figure 13-15: Synchronization administration for a private network

A summary of available timing options for a private network is presented in Table 13-10.

Table 13-10. Timing options for a private network

CE (domain SDCE):
- External (Customer Option): Requires a collocated BITS or SSU to supply a timing input to the CE. The most accurate means of timing distribution.
- Line (Customer Option): May be used at the end of a synchronization trail.
- Free-Run (Customer Option): May be used as the source for the synchronization trail.

IWF (domain SDIWF):
- See IA (Service Provider Option): Timing recovery for the CES IWF must match the IWF type. Details can be found in the IA as specified in Table 13-2 (CES Interface Definition).

TSP (domain SDTDM):
- External (Service Provider Option): Requires a collocated BITS or SSU to supply a timing input to the TSP. The most accurate means of timing distribution.
- Line-CE (Customer Option): May be used to recover timing from the received client signal. This mode should only be used if synchronization messaging is available (e.g., SONET or SDH line-timing sources).
- Line-MEN (Service Provider Option): Used if the IWF can provide timing. Consistent with the service needs of the TSP. Options and capabilities for a specific IWF are specified per the appropriate IA.
- Free-Run (Service Provider Option): Used if the IWF can provide timing. Options and capabilities for a specific IWF are specified per the appropriate IA.

MEN (domain SDED):
- PRS Traceable (Service Provider Option): Requires that all elements use PRS-traceable timing.
- Non-PRS Traceable (Service Provider Option): Requires that one or more elements not use PRS-traceable timing.
Note that in this configuration, the synchronization trails for service timing and transport timing overlap. This is allowed because the customer owns and operates the equipment that supports all the synchronization domains. In this case, the customer and service provider are the same entity,
so combining synchronization trails should not create any problems. The rules for timing, as with the single and multi service provider cases, should still be followed.
13.1.2.3.4.2.3 Service Timing — Private Network
The Customer Edge synchronization domain (SDCE) is owned and maintained by the customer, just as in the single service provider case. In fact, all timing aspects related to private networks are identical to those of the single service provider case.
13.1.2.3.4.2.4 Transport Timing — Private Network
Timing for the private network assumes that the TSP/IWF and Ethernet devices in the MEN are all owned by the same customer. This equipment encompasses synchronization domains SDIWF, SDTDM, and SDED. The options for configuring this equipment are shown in Table 13-10. The timing options for the private network are the same as those of the single service provider case, with the exception of the line-CE mode for the TSP/IWF. In this mode, the TSP/IWF is allowed to use synchronization from the client signal as sent by the CE. In order to ensure the best-quality timing, it is recommended that the CE sourcing this client signal use PRS-traceable timing. It should be noted that the use of synchronization messaging (SSM) between the CE and TSP/IWF is required to ensure at all times that they are receiving PRS-traceable timing. SSMs facilitate fault or protection switching from upstream failures. If SSMs are not available, then external timing is the preferred timing mode for PRS-traceable timing.
13.1.2.4 Performance Monitoring and Alarms
The CES should allow the operation of the normal mechanisms for monitoring the performance of the TDM service, at the level of the TDM interface type. Performance monitoring is one-way and occurs between two endpoints.
13.1.2.4.1 Facility Data Link
The Circuit Emulation Service will carry any signal that meets the bit rate requirement. In the particular case of DS1 circuits using ESF framing, a Facility Data Link (FDL) may be present in the signal. The DS1 ESF Facility Data Link is used to carry once-per-second Performance Report Messages. These messages carry information on the numbers of CRC errors, framing errors, line code violations, and other impairments detected over the last second.
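The once-per-second reporting described above can be sketched as a simple accumulator. The sketch below models only the per-second counting of impairment events; the class and method names are hypothetical illustrations, and the actual PRM bit format (defined in ANSI T1.403) is not reproduced here.

```python
# Illustrative per-second impairment accumulator, in the spirit of the
# once-per-second Performance Report Messages carried on the DS1 ESF FDL.
# This models only the counting, not the PRM message format; names are
# assumptions for illustration.
from collections import Counter

class OneSecondCounters:
    def __init__(self):
        self.current = Counter()

    def record(self, impairment):
        """Count an impairment event (e.g., 'crc', 'framing', 'lcv')."""
        self.current[impairment] += 1

    def end_of_second(self):
        """Close the one-second window and return its counts."""
        report, self.current = dict(self.current), Counter()
        return report

c = OneSecondCounters()
c.record("crc")
c.record("crc")
c.record("lcv")
print(c.end_of_second())   # {'crc': 2, 'lcv': 1}
print(c.end_of_second())   # {} (a fresh window each second)
```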
The CES IWF is allowed to monitor the FDL, and to modify the relative position of the FDL with respect to the TDM payload, but must not change messages carried by the FDL or insert new FDL messages. For example, the Interworking Function may be required to monitor Performance Report Messages.
13.1.2.4.2 Alarms
13.1.2.4.2.1 Unstructured Service
For unstructured services, all alarms received at the input of the Service Interface are carried through to the output Service Interface without modification, since they are embedded in the data on the wire. In addition, the IWF can detect a loss of signal (LOS) at the IWF Service Interface. Upon detection of LOS, the IWF is required to notify the IWF at the opposite end of the CES service.
13.1.2.4.2.2 Structured Service
For structured N × 64 service, the alarm status is not necessarily propagated within the data. Several kinds of alarms can be detected at the point where the Service Interface is received by the IWF. Some alarm situations require that an alarm condition detected at the point where the TDM Service Interface is received by the MEN-bound IWF be propagated downstream to the CE-bound IWF responsible for reproducing the bit stream.
13.1.2.4.2.3 Buffer Underflow and Overflow
The IWF at the egress of the MEN will require a buffer in which the reassembled data stream is stored before it is transmitted out the Service Interface. The size of this buffer is implementation dependent, but it must be large enough to accommodate the expected MEN frame jitter, while small enough not to introduce excessive delay in the emulated circuit. This buffer will be subject to overflow or underflow if slight clocking differences exist between the upstream and downstream IWFs, or in the presence of unexpectedly large network jitter. In the case of an underflow, or data starvation condition, data will have to be inserted into the TDM stream until a new Ethernet frame has arrived.
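The buffer behavior described above can be sketched as a simple playout model. This is an illustrative sketch only (the names `JitterBuffer`, `push`, and `pull` are hypothetical, not part of the MEF specification), and the filler octet is likewise an assumed choice; a real IWF operates against hardware timers and TDM framers.

```python
# Illustrative jitter-buffer model for the egress IWF (hypothetical names,
# not from the MEF specification). Frames arrive from the MEN via push();
# the TDM side drains one payload per service-clock tick via pull().
from collections import deque

FILLER = b"\xff"  # implementation-dependent data inserted on underflow

class JitterBuffer:
    def __init__(self, capacity_frames):
        self.capacity = capacity_frames   # sized from expected frame jitter
        self.frames = deque()
        self.overflows = 0                # frames discarded: buffer full
        self.underflows = 0               # ticks with no data (starvation)

    def push(self, payload):
        """Frame reassembled from the MEN; drop it if the buffer is full."""
        if len(self.frames) >= self.capacity:
            self.overflows += 1           # arrived too early to be accommodated
            return False
        self.frames.append(payload)
        return True

    def pull(self, payload_size):
        """One TDM playout tick; insert filler data on underflow."""
        if self.frames:
            return self.frames.popleft()
        self.underflows += 1              # data starvation: keep the TDM clock fed
        return FILLER * payload_size

jb = JitterBuffer(capacity_frames=4)
jb.push(b"\x01" * 8)
jb.push(b"\x02" * 8)
jb.pull(8); jb.pull(8); jb.pull(8)       # third tick underflows
print("underflows:", jb.underflows)
```

A persistent run of underflows in this model corresponds to the extended loss-of-frames condition discussed below, which should eventually be treated like an LOS on the TDM side.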
The data to be inserted is implementation dependent. Under some circumstances, such as a failure in the MEN carrying the emulated service, the flow of Ethernet frames to the reassembly unit will stop for an extended period. This situation is effectively the same as an LOS condition on the TDM network. If this condition persists for a long period, it should be signaled to the downstream TDM equipment using a Trunk Conditioning procedure. For most applications, implementers are
advised to use a 2.5 ± 0.5 s integration period, in a manner analogous to that used to integrate Loss of Signal to declare a red alarm state. Although not required as part of this specification, implementers may wish to consult Bellcore GR-1113-CORE and ETSI ETS 300 353 Annex D for advice on the handling of various fault conditions.
13.1.2.4.3 End-to-End Delay
End-to-end delay requirements are application specific. However, it should be noted that excess delay could have adverse effects on some traffic types, such as voice.
13.1.2.5 Service Impairment
This section addresses impairments to the emulated TDM service caused by errors within the MEN. Principally, it addresses the relationship between the performance parameters of the underlying Metro Ethernet Network (MEN) transport and the defined service impairment metrics for the TDM service being emulated: errored seconds (ESs) and severely errored seconds (SESs).
13.1.2.5.1 Errors Within the MEN Causing TDM Service Impairment
There are three performance parameters defined for an EVC that have an impact on the TDM service impairment metrics: Frame Loss, Frame Delay, and Frame Jitter. In addition, any bit errors induced in the data flow across the MEN will also have an impact on the TDM service, although there is no performance metric defined to measure this within the MEN.
13.1.2.5.1.1 Frame Loss
Frame loss is defined as the percentage of in-profile frames ("green" frames that are within CIR/CBS) not reliably delivered over a measurement interval T:

Frame Loss_T = (1 - frames delivered to destination / total frames sent to destination) × 100

Frame loss will cause a burst of bit errors in the reconstructed TDM stream.
13.1.2.5.1.2 Frame Delay and Frame Jitter
Frame Delay is defined as the maximum delay measured for a percentile (P) of successfully delivered frames over a measurement interval T.
Frame Jitter can then be derived from the Frame Delay measurement, using the maximum and minimum of the Frame Delay samples over the same measurement interval T and percentile (P). Frame Jitter can be calculated as follows:
Frame Jitter_T,P = Max. measured Frame Delay_T,P - Min. measured Frame Delay_T,P

Typically the Frame Jitter is used to size the reassembly buffer in the IWF at the egress of the MEN. This buffer is sometimes referred to as the jitter buffer. However, the Frame Jitter figure is calculated on a percentile of frame delay values, where the percentile is defined as part of the Service Level Specification. Therefore, a small percentage of frames will arrive outside the specified jitter level. Depending on the size of the jitter buffer, these frames may arrive either too late to be played out or too early to be accommodated within the buffer. Such frames will then be discarded, and must be considered lost for the purposes of reconstructing the TDM service.
13.1.2.5.1.3 Bit Errors
Bit errors induced in the MEN will normally be detected as a CRC error in the frame check sequence, and will therefore cause the whole frame to be discarded. On some occasions a frame containing bit errors may still yield a correct CRC value, but this outcome is expected to be extremely rare. Therefore, as far as the IWF is concerned, any bit errors in the data stream will appear as lost frames.
13.1.2.5.1.4 Frame Error Ratio and IWF Behavior
The collective sum of all the above errors (frame loss, excess frame jitter, and bit errors) can be aggregated into a single measure termed the Frame Error Ratio (FER), defined as the total of all effects leading to the loss or discarding of a frame. For the purposes of TDM emulation, an Ethernet frame is deemed to be errored if it:
1. fails to arrive at the egress IWF;
2. arrives too late to be played out;
3. arrives too early to be accommodated in the jitter buffer; or
4. arrives with bit errors, causing the frame to be discarded.
In order to maintain timing integrity, the IWF must then insert an equivalent number of octets of data into the reconstructed stream. The data to be inserted is application and/or implementation dependent.
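As a worked illustration of the metrics above, the sketch below computes Frame Loss, Frame Delay/Frame Jitter from a set of delay samples, and classifies a frame against the four errored-frame conditions. The function names and the nearest-rank percentile method are assumptions for illustration, not definitions from the MEF specification (the SLS defines the actual percentile method).

```python
# Illustrative computation of the service-impairment metrics above.
# Names and the nearest-rank percentile method are assumptions for
# illustration, not definitions from the MEF specification.

def frame_loss(delivered, sent):
    """Frame Loss_T = (1 - delivered/sent) x 100, over an interval T."""
    return (1 - delivered / sent) * 100.0

def frame_delay_and_jitter(delay_samples, percentile=99.0):
    """Frame Delay_T,P and Frame Jitter_T,P from the same samples."""
    ranked = sorted(delay_samples)
    # Keep only the P-percentile of successfully delivered frames
    # (nearest-rank method, an assumption for this sketch).
    k = max(1, round(len(ranked) * percentile / 100.0))
    kept = ranked[:k]
    delay = max(kept)                 # maximum delay within the percentile
    jitter = max(kept) - min(kept)    # max - min of the retained samples
    return delay, jitter

def frame_errored(arrived, delay_ms, crc_ok, early_ms, late_ms):
    """True if the frame meets any of the four errored-frame conditions."""
    if not arrived:                   # 1. fails to arrive at the egress IWF
        return True
    if delay_ms > late_ms:            # 2. arrives too late to be played out
        return True
    if delay_ms < early_ms:           # 3. too early for the jitter buffer
        return True
    return not crc_ok                 # 4. bit errors -> frame discarded

samples = [10.0, 11.0, 10.5, 12.0, 30.0]     # delays in ms; one late outlier
delay, jitter = frame_delay_and_jitter(samples, percentile=80.0)
print(frame_loss(delivered=990, sent=1000))  # about 1.0 (percent)
print(delay, jitter)                         # outlier excluded by percentile
```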
If the frame should subsequently arrive, it should be discarded. 13.1.2.6
TDM Signaling
It is not normally required for the Circuit Emulation Service to intercept or process TDM signaling, e.g., Channel Associated Signaling (CAS) or Common Channel Signaling (CCS). Signaling is embedded in the TDM data stream, and hence it is carried end-to-end across the emulated circuit. The
signaling can be extracted by the end equipment from the data that has been transported across the MEN. The exception is the N × 64 kbit/s service where CAS signaling is supported. N × 64 kbit/s service with CAS requires direct recognition and manipulation of the signaling bits by the CES IWF. This mode is necessary to support multiplexed N × 64 kbit/s applications requiring DS1 Robbed Bit Signaling or E1 CAS support, where the association between the signaling bits and the channel may otherwise be lost.
13.1.2.7
Loopbacks
Loopbacks should be supported at the level of the TDM interface type. Loopback of internal multiplexed levels (e.g., a DS1 level inside an OC-3 line) is outside the scope of this document.
13.1.2.7.1 Provider-Controlled Loopbacks
It is suggested that provider-controlled loopbacks be supported in the manner illustrated in Figure 13-16, where the direction of loopbacks is from PE1 (near end) towards PE2 (remote end). Loopback from PE2 to PE1 is the reverse of Figure 13-16. The aim of the loopback is to isolate the fault with respect to PE1. Therefore, four loopbacks at the system level are suggested:
1. a PE1-CE1 facility loopback, which should be located at a point within the provider's network as close as practicable to the Network Demarcation, for example, at a final span line repeater or SmartJack;
2. a PE1 terminal loopback of the TDM interface;
3. a PE1 terminal loopback of the MEN interface; and
4. a PE1-PE2 MEN loopback.
The loopback at the Network Demarcation must comply with the appropriate specification for the TDM interface type (e.g., ANSI for a DS1 interface).
Figure 13-16: Provider-controlled loopback points
13.1.2.7.2 Customer-Controlled Loopbacks
It is also possible for customers to use loopbacks to verify the operation either of their own equipment or of the service being provided. The aim of the loopback is to isolate the fault with respect to a particular item of customer equipment (e.g., CE1), while treating the service provider network as a "black box." Therefore, two loopbacks are suggested, as illustrated in Figure 13-17:
1. a local loopback at CE1; and
2. a remote loopback at CE2.
The required loopbacks for CE2 are the reverse of those shown in Figure 13-17.
Figure 13-17: Customer-controlled loopback points
Customer-controlled loopbacks are often initiated under manual control, although signaling may be provided to operate the remote loopback. The nature of such signaling is implementation specific. Customers are not normally able to control the provider loopbacks shown in Figure 13-16, except via a service call to the provider concerned. 13.1.2.8
Protection
TDM networks usually include protection mechanisms for service recovery in less than 50 ms. The CES solution should fit into such a protection scheme. This means the following:
• A CES service running over a MEN should have capabilities for sub-50 ms protection within the MEN part.
• A failure of the TDM interface connecting a TDM legacy network to the MEN should cause a switchover to an alternative TDM interface. This switch should interact both with the MEN and with the legacy network so that both sides of the interface start working with the alternative interface.
The following scenarios may be relevant for CES protection.
13.1.2.8.1 Scenario 1 — Dual Unprotected Services
In dual unprotected service topologies, a failure of one of the interfaces connecting the TDM and MEN, or a failure inside the MEN, will cause the TDM network to work with the alternative CES connection. This setup may use 1+1 protection, 1:1 protection, or 1:N protection, but the choice is outside the scope of the CES solution.
Figure 13-18: Dual unprotected services
The CES service is a nonprotected service. Protection is established through the mechanisms of the TDM network connected on both sides of the MEN. For example, a SONET network connected on both sides of the CES connection could perform the protection switching based on either APS (Automatic Protection Switching) or BLSR (Bidirectional Line Switched
Ring). Since the protection-related information is carried in the line overhead (LOH), there are two possibilities:
• The CES service is not terminating the line layer. The protection information is carried across the MEN to the other side.
• The CES service is terminating the line layer. Upon failure of the path in the MEN, the PE updates the relevant protection bytes in the LOH within the requirements of SONET/SDH, so that the SONET/SDH equipment can make the switch in time. This may be done by asserting the AIS/RDI bits for the line.
13.1.2.8.2 Scenario 2 — Dual Protected Services
The dual protected service scenario is similar to the previous one; however, each CES connection is protected inside the MEN.
Figure 13-19: Dual protected services
In the case of a failure in the TDM interface, the behavior is similar to the previous scenario. In the case of a failure inside the MEN, the MEN internal mechanisms may trigger a bypass, and the TDM network may also initiate a switch to the alternate TDM interfaces. This setup requires careful design in order to avoid races between the TDM and the MEN protection mechanisms.
13.1.2.8.3 Scenario 3 — Single Protected Service
In the single protected service scenario, a single TDM interface is used on each side of the MEN. In this case, the CES connection in the MEN needs to be protected using MEN protection mechanisms. A failure inside the MEN would be bypassed in less than 50 ms so that service is maintained.
Figure 13-20: Single protected service
This scenario does not cover the case of TDM interface failure. The end-to-end protection mechanisms require detecting the liveliness of an end-to-end path. This is done generically through an OAM mechanism. Note that in the case of CES, traffic is constantly flowing through the active path (unlike Ethernet connections, where there may be silent periods on the connection). Receiving traffic at the end of the path can therefore be used for detecting path liveliness, and can trigger a switchover in case of failure.
13.1.2.8.4 Scenario 4 — Single-to-Dual Interface Service
Figure 13-21: Single-to-dual interface connection
In the single-to-dual interface service topology, there is a single TDM interface on one side of the MEN and two on the other side. This scenario is typical of a network where the right side is a server or hub; for example, multiple users connected with DS1 lines to a central office with OC-3. The subscribers may get a single DS1 interface, while on the other side there are two OC-3 uplinks, so that a failure of one OC-3 uplink (or even the TDM equipment attached to it) will still enable use of the other OC-3 link. In this case, traffic coming in from CE1, CE2, and CE3 is sent on both internal connections to CE A and CE B. It is multiplexed into a single higher-rate TDM link going towards CE A and CE B. In the other direction, traffic coming from CE A and CE B is demultiplexed, and each lower rate is sent towards its destination CE (1/2/3). The links between the end-subscribers in CE 1/2/3 and the MEN are not protected. However, each such link affects only one subscriber. On the central office side, each link affects multiple subscribers, and is therefore protected.
Each CES on the left side can select the source (red or blue) for the lower rate it sends towards CE 1/2/3. In the other direction, each CES connected to CE 1/2/3 sends two copies of the traffic, one towards CE A and one towards CE B. Both CE A and CE B receive traffic on their TDM links and can select between them.
13.1.2.9 Service Quality
In general, CES should aim to emulate the TDM services perfectly, thus allowing service providers a migration path to MENs without sacrificing service quality. Noncompliance of an emulated service with the TDM service requirements as specified in the relevant ANSI and ITU documents shall be explicitly stated in the implementation agreement, along with the conditions under which this noncompliance occurs.
13.1.2.10
Efficiency
The CES solution should try to minimize both network delay and the network bandwidth needed for transporting the TDM traffic. During setup of an emulated circuit, the CES solution should enable the MEN provider to favor efficiency over network delay and vice versa, by being able to define the size of the TDM payload transmitted in each Ethernet frame. For example, a small payload results in low latency, since the CES IWF does not have to wait long for the Ethernet frame to fill. However, the ratio of header to payload is large, making small Ethernet frames inefficient in terms of bandwidth consumption in the MEN. Therefore, the payload should be made as large as possible, while still keeping within the latency budget for the emulated circuit. It is not required to be able to dynamically adjust the size of the payload while an emulated circuit is in operation.
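The efficiency-versus-delay tradeoff described above can be made concrete with a small calculation. The sketch below takes an emulated DS1 (1.544 Mbit/s) and an assumed per-frame overhead of 46 octets (Ethernet MAC header, FCS, preamble/IPG, plus an assumed CES encapsulation header; actual figures depend on the chosen encapsulation), and computes packetization delay and bandwidth efficiency for several payload sizes. This is purely illustrative and not from the MEF specification.

```python
# Illustrative packetization-delay vs. bandwidth-efficiency tradeoff for an
# emulated DS1. The 46-octet overhead is an assumption (Ethernet MAC header,
# FCS, preamble/IPG, plus an assumed CES encapsulation header).

DS1_RATE_BPS = 1_544_000   # DS1 service bit rate
OVERHEAD_OCTETS = 46       # assumed per-frame overhead

def packetization_delay_ms(payload_octets):
    """Time to fill one Ethernet payload from the constant-rate TDM stream."""
    return payload_octets * 8 / DS1_RATE_BPS * 1000.0

def bandwidth_efficiency(payload_octets):
    """Fraction of MEN bandwidth carrying TDM payload rather than overhead."""
    return payload_octets / (payload_octets + OVERHEAD_OCTETS)

for payload in (64, 256, 1024):
    print(payload, "octets:",
          round(packetization_delay_ms(payload), 3), "ms fill time,",
          round(100 * bandwidth_efficiency(payload), 1), "% efficient")
```

The numbers show the tradeoff directly: small payloads fill quickly (low latency) but waste a large fraction of the MEN bandwidth on headers, while large payloads approach full efficiency at the cost of packetization delay.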
ACKNOWLEDGMENTS
Metro Ethernet Forum: Tim Frost
13.2.
REFERENCES
Metro Ethernet Forum, Technical Specification MEF 3, Circuit Emulation Service Definitions, Framework and Requirements in Metro Ethernet Networks, April 2004.
Chapter 14
METRO ETHERNET NETWORK RESILIENCY AND TRAFFIC MANAGEMENT
Nan Chen
Sirix Systems; Metro Ethernet Forum President
14.1.
METRO ETHERNET NETWORK RESILIENCY
14.1.1 Introduction
Protection in Metro Ethernet Networks (MEN) [1] can encompass many ideas. Protection is a self-healing property of the network that allows it to continue to function, with minimal or no impact on the network users, upon disruption, outages, or degradation of facilities or equipment in the MEN. Naturally, there is a limit to how much the network can be disrupted while maintaining services, but the emphasis is not on this limit but rather on the ability to protect against moderate failures. Network protection can be viewed in two ways:
• From the viewpoint of the user of the MEN services, the actual methods and mechanisms are of minor concern; it is the availability and quality of the services that are of interest. These can be described in a Service Level Specification (SLS), a technical description of the service provided, which is part of the Service Level Agreement (SLA) between customer and provider.
• The other viewpoint is that of the network provider. The provider is tasked with translating the SLSs of all the customers (and future customers) into requirements for the network design and function. We do not study this translation here; it is an area of differentiation and specialization for the provider and depends on the policies that the
provider will use for protection. What we do study are the mechanisms that can be used to provide protection. Any protection scheme has three clear components:
• Detection refers to the ability to determine network impairments.
• Policy defines what should be done when an impairment is detected.
• Restoration is the component that acts to "fix" the impairment; it may not be a total restoration of all services, and depends on the nature of the impairment and the policy.
We focus on the detection and restoration mechanisms and leave the choice of policy to the providers. However, the policy itself cannot be ignored, and is based on the services supported. Detection and restoration can be done in many different ways in the MEN. The techniques available depend on the nature of the equipment in the network. The requirements have their basis in the interpretation of Service Level Specifications for Ethernet services (such as availability, mean time to restore, mean time between failures, etc.) in terms of network protection requirements (such as connectivity restoration time, SLS restoration time, protection resource allocation, etc.). In other words, the protection offered by the network is directly related to the services supplied to the user, and the requirements derive from the need to protect the services provided to the user. In most cases, an Ethernet Virtual Connection (EVC) implementing an Ethernet service traverses different transport networks (also known as transports), and therefore the end-to-end protection may involve different mechanisms. For example, many transport technologies may be involved: Ethernet, Ethernet over DSL, Ethernet over SONET/SDH, MPLS, and data link layer switching such as Ethernet. In the case of Ethernet protection, technologies such as Rapid Spanning Tree Protocol (RSTP) or Link Aggregation may be used to provide protection at the ETH layer.
An Ethernet Line service EVC is built of a single ETH-trail, while an Ethernet LAN service EVC is built of a number of ETH-trails. The protection requirements have two distinct goals: one set is specified in Service Level Specifications (such as protection switching time) and consists of measurable parameters; the other set is specified for providers of the service (such as Protection Control Requirements specifying protection configuration) and is not directly reflected in a Service Level Specification but is required from the provider. Examples of such requirements are those that relate to control, manageability, and scalability of a protection scheme. The following topics are examples of those discussed in the requirements section:
• Protection switching times;
• Failure detection requirements;
• Protection resource allocation requirements;
• Topology requirements;
• Failure notification requirements;
• Restoration and revertiveness requirements;
• Transparency for the end-user; and
• Security requirements: e.g., separation between LAN and MAN protection mechanisms.
Observe that if all EVCs passing through a specific connected part of the network are known to have similar protection requirements, it is sufficient for this part of the network to comply with the specific requirements that are needed by the EVCs of services passing through it. An example is the last mile: protection requirements are directly related to the customer's needs. The framework section deals with models and mechanisms specific to the Metro Ethernet. We can make use of any existing mechanisms for protection of transport, and upper-layer protection mechanisms can sit on top of lower-layer protection mechanisms to provide a unified protection approach. The model allows protection mechanisms to be enabled as part of each layer (ETH layer or TRAN layer) in the network.
14.1.2 Protection Terminology
This section defines the precise MEF protection terminology.
14.1.2.1
Protection Types
A network can offer protection by providing alternative resources to be used when the working resource fails. There is specific terminology for the number and arrangement of such resources.
14.1.2.1.1 1+1
The 1+1 Protection Type uses the protection resources at all times for sending a replica of the traffic. The protection merge point, where both copies are expected to arrive, decides which of the two copies to select for forwarding. The decision can be to switch from one resource to the other due to an event such as a resource going up or down, or can be made on a per-frame/cell basis. The selection decision is performed according to the parameters defined below (e.g., revertive, nonrevertive, manual, etc.).
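The 1+1 merge point described above can be sketched as a selector that receives both copies and forwards one. In this minimal sketch (all names are hypothetical; the text does not prescribe an implementation), selection is driven by simple resource up/down events rather than per-frame comparison.

```python
# Minimal 1+1 selector sketch (hypothetical names; the specification does
# not prescribe an implementation). Both copies of the traffic always
# arrive; the merge point forwards from the currently selected resource.

class OnePlusOneSelector:
    def __init__(self):
        self.up = {"working": True, "protection": True}
        self.selected = "working"

    def resource_event(self, resource, is_up):
        """React to a resource up/down event."""
        self.up[resource] = is_up
        if not self.up[self.selected]:
            # Switch to the other copy if it is available.
            other = "protection" if self.selected == "working" else "working"
            if self.up[other]:
                self.selected = other

    def forward(self, working_copy, protection_copy):
        """Forward the copy received on the selected resource."""
        return working_copy if self.selected == "working" else protection_copy

sel = OnePlusOneSelector()
print(sel.forward("W1", "P1"))        # forwards the working copy
sel.resource_event("working", False)  # working resource fails
print(sel.forward("W2", "P2"))        # now forwards the protection copy
```

As written the selector is nonrevertive: a repaired working resource is not reselected until the protection resource itself fails. Revertive behavior, defined below, would add a switch back after repair.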
14.1.2.1.2 m:n
The m:n Protection Type provides protection for n working resources using m protection resources. The protection resources are used only at the time of a failure. The protection resources are not dedicated to the protection of the working resources, meaning that when a protection resource is not being used for forwarding traffic in place of a failed working resource, it may be used for forwarding other traffic. The following subsections define the important special cases of m:n protection. There are two variants of the m:n protection type: in one, a protection resource can be used concurrently for forwarding the traffic of a number of working resources, in case several of them fail at the same time; in the other, the protection resource is able to forward the traffic of only a single working resource at a time.
14.1.2.1.2.1 1:1
The 1:1 Protection Type provides a protection resource for a single working resource.
14.1.2.1.2.2 n:1
The n:1 Protection Type provides protection for one working resource using n protection resources.
14.1.2.1.2.3 1:n
The 1:n Protection Type provides protection for n working resources using one protection resource. In this protection type, the protection resource is shared for protection purposes by the n working resources.
14.1.2.2
Failure Types
Failures may occur in network nodes or on the links between nodes.
14.1.2.2.1 Fail Condition (Hard Link Failure)
A fail condition is the status of a resource in which it is unable to transfer traffic (e.g., Loss of Signal).
14.1.2.2.2 Degrade Condition (Soft Link Failure)
A degrade condition is the status of a resource in which traffic transfer might be continuing, but certain measured errors (e.g., Bit Error Rate) have reached a predetermined threshold.
14.1.2.2.3 Node Failure
A Node Failure is an event that occurs when a node is unable to transfer traffic between the links that terminate at it.
14.1.2.3
Resource Selection
14.1.2.3.1 Revertive Mode
The protection is in revertive mode if, after a resource failure and its subsequent repair, the network automatically reverts to using the initial resource; otherwise, the protection is in nonrevertive mode. Automatic reversion may include a reversion timer (the Wait To Restore timer), which delays reversion after the repair.
14.1.2.3.2 Manual Switch
A Manual Switch occurs when the network operator switches the network to use the protection resources instead of the working resources, or vice versa. A manual switch may occur at any time according to the intention of the network operator, but, by definition, a Manual Switch will not proceed to a target resource that is in a failure condition.
14.1.2.3.3 Forced Switch
A Forced Switch occurs when the network operator forces the network to use the protection resources instead of the working resources, or vice versa, regardless of the state of the resources.
14.1.2.3.4 Lockout
A lockout command on a resource makes the resource unavailable for the protection of other resources.
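As an illustration of revertive mode, the following sketch models the selection decision with a Wait To Restore timer (the function, its parameters, and the 5-second WTR value are hypothetical, not drawn from the standard):

```python
# Sketch (assumed semantics, not normative): revertive resource selection
# with a Wait-To-Restore timer. After the working resource is repaired,
# reversion is delayed by WTR seconds to avoid flapping on an
# intermittent fault.

WTR = 5.0  # seconds; hypothetical configured value

def selected_resource(working_ok, repair_time, now):
    """Return 'working' or 'protection' under revertive mode.

    working_ok  -- current status of the working resource
    repair_time -- time the working resource was last repaired (None if still failed)
    now         -- current time
    """
    if not working_ok:
        return "protection"
    if repair_time is not None and now - repair_time < WTR:
        return "protection"        # repaired, but WTR has not yet expired
    return "working"               # revert to the initial resource

assert selected_resource(False, None, 10.0) == "protection"
assert selected_resource(True, 10.0, 12.0) == "protection"  # inside the WTR window
assert selected_resource(True, 10.0, 16.0) == "working"     # WTR expired: revert
```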
14.1.2.4 Event Timing
The terminology distinguishes events, which occur at particular instants (points in time), from times, which are the durations between events.
14.1.2.4.1 Impairment Instant
The Impairment Instant is the point in time at which the failure event occurs.
14.1.2.4.2 Fault Detection Instant
The Fault Detection Instant is the point in time at which the failure is detected and declared. The Fault Detection Instant may differ between network elements.
14.1.2.4.3 Hold-off Instant
It may be desirable to delay taking any action after the Fault Detection Instant. The Hold-off Instant is the instant at the end of this delay period, if there is one; otherwise, it is the same as the Fault Detection Instant. The hold-off instant is useful when two or more layers of the same network provide protection: a hold-off in an upper layer gives the protection in a lower layer an opportunity to act before the upper-layer protection does.
14.1.2.4.4 Connectivity Restoration Instant
The Connectivity Restoration Instant is the first point in time after impairment at which user traffic can begin to be transferred end-to-end. At the connectivity restoration instant, services affected by the failure already deliver user traffic end-to-end but may not yet do so with the performance required by the SLS. If the impairment caused only a degradation in performance to the point of losing SLS compliance, the connectivity restoration instant is defined to be identical to the impairment instant.
14.1.2.4.5 SLS Restoration Instant
The SLS Restoration Instant is the first point in time after impairment at which user traffic can begin to be transferred end-to-end with the original performance guarantees.
14.1.2.4.6 Reversion Instant
In revertive mode, the Reversion Instant is the point in time at which the original resources are again used. This point in time may be the same as the SLS Restoration Instant.
14.1.2.4.7 Detection Time
The Detection Time is the difference between the Fault Detection Instant and the Impairment Instant.
14.1.2.4.8 Hold-off Time
The Hold-off Time is the difference between the Hold-off Instant and the Fault Detection Instant. The Hold-off Time may be zero.
14.1.2.4.9 Connectivity Restoration Time
The Connectivity Restoration Time is the difference between the Connectivity Restoration Instant and the Impairment Instant.
14.1.2.4.10 SLS Restoration Time
The SLS Restoration Time is the difference between the SLS Restoration Instant and the Impairment Instant.
14.1.2.4.11 Reversion (Wait To Restore (WTR)) Time
In revertive mode, the Reversion Time is the difference between the repair instant of the original resource and the Reversion Instant.
14.1.2.4.12 Timing Relationships
Figure 14-1, below, shows event instants and times, as defined above, on a timeline. Some times shown may actually be identical. Other times besides those defined above may be of interest.
Figure 14-1. Illustration of event timing
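The relationships among the instants and times above reduce to simple subtractions, which the following illustrative sketch makes explicit (the dataclass and its field names are ours, not the specification's):

```python
# The derived times of Section 14.1.2.4 are differences between instants.
# A small illustrative model:

from dataclasses import dataclass

@dataclass
class ProtectionEvent:
    impairment: float            # Impairment Instant
    fault_detection: float       # Fault Detection Instant
    hold_off: float              # Hold-off Instant (== fault_detection if no hold-off)
    connectivity_restored: float # Connectivity Restoration Instant
    sls_restored: float          # SLS Restoration Instant

    def detection_time(self):
        return self.fault_detection - self.impairment

    def hold_off_time(self):
        return self.hold_off - self.fault_detection   # may be zero

    def connectivity_restoration_time(self):
        return self.connectivity_restored - self.impairment

    def sls_restoration_time(self):
        return self.sls_restored - self.impairment

e = ProtectionEvent(impairment=0.0, fault_detection=0.010,
                    hold_off=0.010, connectivity_restored=0.045,
                    sls_restored=0.045)
assert e.detection_time() == 0.010
assert e.hold_off_time() == 0.0
assert e.connectivity_restoration_time() == 0.045   # a "sub-50 ms" restoration
```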
14.1.2.5 Other Terms
14.1.2.5.1 Shared-Risk Link Group (SRLG) A Shared Risk Link Group (SRLG) is a group of links that share a resource whose failure affects all the links in the group. For example, a bundle of fibers connecting two sites is an SRLG.
14.1.3 Discussion of Terminology
14.1.3.1 Timing Issues
Different applications and different users require different restoration times. In some provider network designs, a faster restoration time may require more network resources. For example, one technique for providing fast restoration is to create one or more protection paths per working path that needs to be protected, including provisioning bandwidth for each protection path. For this reason, it is beneficial to define a number of different restoration times that can be provided for different resources.
14.1.3.2 SLS Commitments
Protection paths can either maintain or reduce bandwidth, and can alter other SLS characteristics. For example, traffic that is assigned a certain CIR on the working path may be carried with only a PIR commitment on the protection path; in this way, the protection path requires fewer provisioned resources. In [2] and [3], two different types of protection times are mentioned, namely, recovery time and full restoration time. These correspond to the Connectivity Restoration Time and the SLS Restoration Time defined above. The differentiation in the terms is based on the SLS provided on the protection path, as well as the time it takes to restore the SLS provided by the original working path. In terms of providing various SLS levels on the protection path, it can be beneficial to provide a two-stage protection mechanism: the first-stage protection switch occurs (rapidly) onto a limited protection path, that is, a protection path with reduced SLS commitment. An equivalent protection path, one with full SLS commitment, can then be installed and the traffic switched over from the first-stage protection path. Note that the full restoration time (SLS restoration time) may or may not differ from the recovery time (connectivity restoration time), depending on whether a limited or an equivalent protection path is used for the first-stage protection switch. The 1+1 protection type uses the resources of the protection path at all times to send a replica of the traffic; the protection merge point decides which of the two copies to forward. On the other hand, 1:1, 1:n, n:1, and m:n protection use only one path at a time, and therefore have the advantage that the protection-provisioned bandwidth can be used for other purposes when there is no failure.
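The bandwidth consequence of this difference can be made concrete with a toy calculation (the helper function, its parameters, and the figures are hypothetical illustrations, not values from the text):

```python
# Back-of-the-envelope illustration of the bandwidth trade-off: 1+1
# permanently doubles the provisioned bandwidth per protected path, while
# 1:n shares a single protection path's worth of bandwidth across n
# working paths.

def protection_bandwidth(scheme, n_working, path_bw):
    """Provisioned protection bandwidth (same units as path_bw)."""
    if scheme == "1+1":
        return n_working * path_bw   # a full, always-active replica per working path
    if scheme == "1:n":
        return path_bw               # one shared protection path
    raise ValueError(scheme)

# Ten working paths of 100 Mb/s each:
assert protection_bandwidth("1+1", 10, 100) == 1000
assert protection_bandwidth("1:n", 10, 100) == 100
```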
14.1.4 Protection Reference Model
To deliver protection to Ethernet services implemented over Metro Ethernet Networks (MEN), a reference model has been created to allow the description of a unified protection structure. The Protection Reference Model (PRM) allows a consistent description of the protection capabilities applied to these services across various transmission technologies, network topologies, and policies, thereby enabling the description of protection of services in the ETH layer (Ethernet Line service, Ethernet LAN service, etc.). The following sections highlight the main functional elements of protection in a MEN, described through a "bottom-up" approach. The model shows a single layer in the architecture, which can be a transport layer or the ETH layer. The TRANS layer could itself consist of multiple layers, so there can be layering of transport networks, with protection provided at each of these layers (for example, Ethernet over MPLS over SONET). This is shown in Figure 14-2, where each layer may contain protection capabilities and may run above a lower layer that may contain protection capabilities as well. The entire protection scheme is controlled according to the application protection constraint policy.
Figure 14-2. The PRM model (two layers are shown, from a stack of two or more)
14.1.4.1 Transport
The purpose of the Transport layer is to provide transfer of data between MEN elements. Many transports provide error-checking mechanisms and data flow controls. The Transport layer leverages any native mechanisms that the transport technology provides to gain protection from faults. The type of protection available may be local (e.g., a given node or link) or end-to-end (e.g., a set of nodes and links traversed by a "virtual" link), depending on the technology used. The scenarios of the MEF protection architecture can be divided into two categories:
1. The service is carried natively over the transport layer and protection is achieved at the transport layer. An example is carrying Ethernet traffic in Ethernet over SONET (EoS), where the protection is done at the SONET layer.
2. The protection is done above a transport layer. Here there are two subtypes:
• A transport layer is not capable of providing protection, or its protection capability is ignored by the protection mechanism of the upper layer. An example is Ethernet transport with the protection performed at the ETH layer or in an MPLS layer above it.
• A transport layer and the protection mechanism of the upper layer work in conjunction to deliver the required protection SLS. An example is an ETH layer containing a protection mechanism implemented over an interconnected succession of SONET transport networks with 1+1 capability. The SONET 1+1 capability repairs local failures on certain links, while ETH-layer protection is used where SONET protection is not applicable, or as an additional end-to-end protection method.
The ability of protection mechanisms to be independent of the transport technologies allows metro networks to be deployed using various transmission technologies, interconnected to create a heterogeneous network infrastructure.
Protection mechanisms can span various transmission technologies (transports) regardless of whether each of these transports can deliver native protection capabilities. As each individual subnetwork of transport is utilized in a MEN, protection mechanisms could be requested from these transports to match an end-to-end protection SLS. If a transport does not have the ability to offer such services, then protection capabilities are performed at a higher or a lower layer, to ensure end-to-end protection SLS.
14.1.4.2 Topology
Protection requires the topology to be such that it does not hinder an end-to-end protection SLS. Depending on the specific technology, topology discovery may also be important to ensure that nodes (or a management utility) understand how to support the required protection. There can be many ways of delivering topology discovery. The topology is different at each layer of the MEN, since the internal topology of the lower layer is not visible to the upper layer and vice versa; the topology may therefore look different depending on the layer at which the network is viewed. At the ETH layer, the network is built of Ethernet Connection Functions (ECFs), interconnected by ETH-links. Example topologies at the ETH layer are:
• ECFs on the edges only
• ECFs at the edges and in the core (e.g., grooming of Ethernet service frames for efficiency improvement, where EVCs are supported using multiple transport layers)
Each TRAN-layer subnetwork over which the ETH layer is implemented has its own topology, built of Transport Connection Functions (TCFs) interconnected by TRAN-links. Protection can be provided in a specific layer if the topology at that layer contains enough redundancy. A service can be protected even if the topology at a specific layer does not provide enough redundancy, as long as the protection at other layers creates end-to-end protection for the service at the ETH layer. We discuss how the mechanisms described may apply to several layers and technologies. For this reason, we use the terms links, nodes, and Network Elements, where:
• Network Element (NE, node) refers to a device containing an ECF or a TCF, depending on the layer.
• Link refers to an ETH-link or a TRAN-link, depending on the layer.
14.1.4.3 MEF Protection Mechanism
The following styles of network protection mechanisms are currently under consideration:
1. Aggregated Line and Node Protection (ALNP) service
2. End-to-End Path Protection (EEPP) service
3. MP2MP protection service
4. Link protection based on Link Aggregation
The protection services can be layered one on top of the other in any combination. For example, ALNP can protect the network facilities while EEPP provides additional protection at the path level. EEPP supports 1+1, 1:1, and 1:n protection mechanisms, and ALNP supports 1:1 as well as 1:n facility protection.
14.1.4.3.1 Aggregated Line and Node Protection (ALNP)
ALNP provides protection against local link and node failures by using local path detour mechanisms. In this case, local "backup" or "detour" paths are created along the primary path; they bypass the immediate downstream network element (NE) or logical link and immediately merge back onto the primary path. The detour path may provide 1:n or 1:1 protection of the primary paths in the network. The backup paths are either explicitly provisioned, as described as an option in [1], or are implicit in the technology, as in SONET/SDH UPSR/BLSR.
Figure 14-3. ALNP
Protection with short restoration times is possible in many cases with ALNP because many failure events can be instantaneously detected without long feedback loops end-to-end. The restoration time actually depends on the
local failure detection time, which may differ in different scenarios. As each failure is detected (at each link or node), ALNP protects many end-to-end paths (with similar end-to-end protection SLSs) in a single restoration invocation. If a lower-layer transport subnetwork can deliver services similar to those that ALNP provides at an upper layer, then the native protection mechanism of the transport subnetwork can be used and ALNP can be bypassed. If a transport subnetwork in a layer below the one at which ALNP operates does not support native protection capabilities sufficient for a specified SLS, then it is the responsibility of the ALNP mechanism to deliver the protection required by that SLS. The Bidirectional Line Switched Ring (BLSR) capabilities of SONET and SDH, and MPLS local repair, are examples of ALNP derivatives in specific transports. ALNP may deliver a 1:n protection capability with a sub-50-ms restoration time and other default parameters. (Other restoration times could also be supported and invoked, depending on the protection SLS specified and on failure detection capabilities.) The protection ALNP should deliver depends on the protection desired for the service or services it protects. ALNP provides the ability to aggregate many end-to-end paths in a hop-by-hop and node-by-node manner. At any time, both ALNP and other protection mechanisms in transport layers below the layer at which ALNP executes could offer similar protection capabilities. Interoperability is achieved in this case by configuring the hold-off time of the ALNP mechanism such that the lower-layer protection mechanism converges before the ALNP mechanism at the upper layer decides whether to take action.
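The hold-off interworking just described can be sketched as a simple decision rule (the function name and the example timings are assumptions for illustration, not values from the text):

```python
# Sketch: the upper layer (e.g., ALNP at the ETH layer) arms a hold-off
# timer on fault detection and acts only if the lower layer (e.g., SONET)
# has not already repaired the fault by the time the timer expires.

def upper_layer_acts(lower_layer_repair_time, hold_off_time):
    """Return True if the upper-layer protection must switch, False if the
    lower layer repaired the fault within the hold-off window (times are
    measured from the fault detection instant)."""
    if lower_layer_repair_time is None:          # lower layer cannot repair this fault
        return True
    return lower_layer_repair_time > hold_off_time

# SONET 1+1 repairs in ~20 ms; ALNP hold-off configured to 50 ms:
assert upper_layer_acts(lower_layer_repair_time=0.020, hold_off_time=0.050) is False
# The lower layer has no protection for this fault, so ALNP acts:
assert upper_layer_acts(lower_layer_repair_time=None, hold_off_time=0.050) is True
```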
To protect each link and node using ALNP, the generation of ALNP protection paths is preferably automated or implicit in the transport technology. A possible mechanism for automatic creation of protection paths is to allow the specification of the desired protection parameters as part of trail creation. Upon detection of the first request (for a given protection SLS for a specific trail), or earlier (e.g., at network setup), protection paths with certain protection parameters are created for each given transport subnetwork.
14.1.4.3.2 End-to-End Path Protection (EEPP)
End-to-end path protection (EEPP) is the ability to provide a redundant end-to-end path for the primary path. This mechanism can be used to augment ALNP. A variation of this method can be used to protect partial segments of the end-to-end path within the same layer, if such capability is supported by the protection mechanism at the specific layer.
Figure 14-4. EEPP
Figure 14-4 illustrates the use of a secondary path for EEPP as well as detour paths for ALNP. In an EEPP scenario, a path is created from a source node to a destination node. Alternative or secondary paths are then created with different routing segments that protect the primary path. The number of redundant paths needed is defined by policy and has implementation limits (each redundant path may consume network resources such as bandwidth, CPU cycles, and memory). The computation of redundant paths (which do not share resources with the primary path) can be done with an online constraint-based mechanism (e.g., CSPF) or with offline traffic engineering tools. Each of these redundant paths is end-to-end in nature and therefore provides redundancy in a different manner than ALNP. EEPP handles protection on segments of the global path, in some cases provisioned end-to-end, and can provide redundancy when a transport segment along the path cannot provide protection of any kind (including ALNP or native transport protection). EEPP can also be used when ALNP protection is available at each transport subnetwork but further redundancy is desired for path diversification. The restoration time of EEPP can be much longer
than for ALNP and depends on the protection type used. There are a few types of protection that can be used:
• 1+1 — This configuration requires that the redundant paths be fully established and active. All data sent on the primary path is also copied to the redundant path(s). The receiver (the node at which the paths merge) decides which of the available paths (primary or secondary) is used at each point in time. This decision can be performed on a per-path basis (according to OA&M information, for example) or on a per-packet basis, where each packet is taken from whichever path delivers it; in the per-packet case, a sequence-number field can be added to the packets so that the receiver can correlate the two packet streams. This type of redundancy can achieve very fast restoration times (milliseconds), since the receiver decides whether the primary path has failed based on alarm indications or performance information derived from the primary path. However, it consumes double the bandwidth and hardware resources (CPU, fabric, memory, etc.), since both paths are always active and passing data.
• 1:1 Cold Standby — This configuration requires that the routing information for the redundant paths be calculated ahead of time, but the redundant paths are not established until the primary path fails; the source node establishes the redundant path only after the failure has occurred, resulting in long restoration times.
• 1:1 Hot Standby — This configuration requires that the routing information for the redundant paths be calculated ahead of time and the paths established during the service activation of the primary path; the redundant path(s) are kept active, waiting for the primary path to fail. The chief determinant of the time to repair a failure is the detection time, since the switchover to a redundant path can occur very quickly.
The drawback of this type of redundancy is that the redundant paths consume network resources even though they are not passing data. Based on protection policy, however, the redundant paths may be set up with fewer resources in order to restore part of the traffic immediately; cold standby can then be invoked later on to restore the full traffic bandwidth.
• Shared Redundancy — Since a single failure in the network may affect only a subset of the primary paths, there is an opportunity to share the same protection resource among multiple primary paths. Many schemes achieve sharing of protection resources by exploiting this fact: 1:n, ring, and shared mesh protection are some of the well-known sharing mechanisms.
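The per-packet 1+1 selection described above can be sketched as follows (the framing of packets as (sequence number, payload) pairs and the function name are our assumptions; the text only requires that duplicates be resolvable by sequence number):

```python
# Sketch of per-packet 1+1 selection at the merge point: each packet
# carries a sequence number, and the receiver forwards the first copy of
# each sequence number to arrive on either path, discarding duplicates.

def merge_1plus1(primary, secondary):
    """Deliver each sequence number exactly once from two (seq, payload)
    feeds. Arrival order is modeled here simply by iterating the primary
    feed, then the secondary feed."""
    delivered = set()
    out = []
    for seq, payload in primary + secondary:
        if seq not in delivered:
            delivered.add(seq)
            out.append((seq, payload))
    return out

primary = [(1, "a"), (2, "b")]             # primary path fails after seq 2
secondary = [(1, "a"), (2, "b"), (3, "c")]
assert merge_1plus1(primary, secondary) == [(1, "a"), (2, "b"), (3, "c")]
```

No retransmission is needed: the surviving copy from the secondary path fills in seamlessly, which is why this scheme restores in milliseconds.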
14.1.4.3.3 MP2MP Protection
The E-LAN service is a multipoint-to-multipoint service that requires connectivity between all of its UNIs. Depending on the implementation of the E-LAN service, the protection schemes above may not be sufficient to protect it. The reason is that the implementation of an E-LAN may involve one or more ECFs, interconnected by a number of ETH-trails; a failure of such an ECF is not covered by EEPP or ALNP as described above. The implementation of an E-LAN service may also include implementation of multipoint-to-multipoint connectivity at the TRANS layer. Three methods are typically used for multipoint-to-multipoint protection of Ethernet service or transport:
• Split-horizon bridging with full mesh connectivity
• Spanning Tree or Rapid Spanning Tree
• Link redundancy
With split-horizon bridging, a full mesh topology is created between the TTF (Trail Termination Function) entities (each an ECF or a TCF, depending on the layer under discussion), creating the protected domain. Each trail in the full mesh is a point-to-point trail and may contain nodes (ECFs or TCFs) in the same layer or in a lower layer.
Figure 14-5. Split-horizon bridging with full mesh connectivity
Split-horizon bridging is performed as follows. Each TTF maintains a bridging database controlling its bridging function. Each frame received by a TTF from an access link is forwarded, according to the bridging database of that TTF, to one, some, or all of the other TTF entities; each copy is transmitted to one of the remote TTF entities through the direct trail leading to that remote TTF. Frames received by a TTF from one of the trails of the full mesh are forwarded by the TTF only to access links. With split-horizon bridging, the protection techniques discussed above are sufficient to protect the MP2MP service, as long as each of the trails connecting the TTF entities is protected. A split-horizon-bridging subnet can serve as a subset of a larger bridged network by connecting it to other bridging components. In this case, its bridging elements may not be TTF entities, but ordinary ECF/TCF entities with split-horizon bridging capabilities. The Spanning Tree Protocol is defined in IEEE 802.1D; the Rapid STP is defined in IEEE 802.1w. These protocols provide protection in a network in which the TTF entities are connected in a partial mesh, and each TTF performs 802.1D-compliant bridging between the links and trails connected to it (access links as well as trails of its own layer). ECF/TCF entities through which the trails between the TTF entities pass may also perform 802.1D bridging. Observe that 802.1D bridging requires all links between the bridging entities to be bidirectional; this scheme therefore requires all trails between bridging ECF/TCF entities to be bidirectional. IEEE 802.1D requires bridging to be performed over a subset of the network that forms a spanning tree, and this is where STP and RSTP help: they create a spanning tree of trails participating in the bridged network, spanning all TTF and ECF/TCF entities implementing the service.
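The split-horizon forwarding rule described earlier in this section can be sketched as follows (the port naming and the unknown-destination flooding model are illustrative assumptions):

```python
# Sketch of the split-horizon rule: frames arriving from an access link
# may be flooded to other TTFs over the full mesh, but frames arriving
# from a mesh trail are forwarded to access links only -- never back into
# the mesh. With a full mesh, this prevents loops without running STP.

def forward(ingress, access_links, mesh_trails):
    """Return the set of ports an unknown-destination frame is flooded to,
    given the port it arrived on."""
    if ingress in mesh_trails:
        return set(access_links)                      # mesh trail -> access links only
    # access link -> all other access links plus every mesh trail
    return (set(access_links) - {ingress}) | set(mesh_trails)

access = ["uni1", "uni2"]
mesh = ["trail-to-B", "trail-to-C"]
assert forward("uni1", access, mesh) == {"uni2", "trail-to-B", "trail-to-C"}
assert forward("trail-to-B", access, mesh) == {"uni1", "uni2"}
```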
STP requires fast aging or reset of the bridging databases in case of a change in the topology of the created spanning tree. As described in the MEF services, STP BPDUs may be:
• Processed at the UNI, in which case the subscriber network becomes part of the network for which a single STP is calculated
• Tunneled by the service, in which case the service is perceived by the subscriber network as a single segment; a subscriber STP can then be created between its sites
• Dropped at the UNI, in which case the subscriber should manually ensure that his or her network does not contain loops going through the service
Note that tunneling and discarding also mean that an internal (MEN) STP can be created that is separate from the subscriber STP. When tunneling is performed, the subscriber STP is transparently tunneled through the MEN.
Figure 14-6. Link redundancy
With the link-redundancy scheme, a single TTF attaches to a bridged network of ECF or TCF entities (depending on the layer) using two point-to-point trails, which do not necessarily end at the same ECF/TCF on their other side. The TTF chooses at any time a single operational trail to work with, using one of the mechanisms available in the technology of the specific layer (e.g., OA&M) to monitor the operational status of the two trails. The TTF forwards frames received from its access link to the chosen trail, and processes or forwards frames received from this trail to its access links; frames received from the other trail are dropped. When the TTF decides to change the trail used for forwarding, it should inform the bridged network to which it is attached that a topology change has happened, so that the bridging entities in it can initiate fast aging or flush their databases.
14.1.4.4 Link Protection Based on Link Aggregation
For an Ethernet transport, Link Aggregation allows one or more Ethernet links connecting the same two nodes to be aggregated into a Link Aggregation Group (LAG). A LAG logically behaves as a single link. The
frames that each of the two nodes transmits through the LAG are distributed between the parallel links according to the decision of that node. The LAG distribution function should be such that it maintains the order of frames within each session using it. A LAG may be made of N parallel instances of full duplex point-to-point links operating at the same data rate. The Link Aggregation Control Protocol (LACP) is defined in IEEE 802.3, and is used by neighboring devices to agree on adding links to a Link Aggregation Group. Figure 14-7 is an example of a LAG topology:
Figure 14-7. Link aggregation
One of the features of a LAG setup is protection against failures of the individual links composing the LAG. When one of these links fails, the nodes at the two sides of the LAG simply change their LAG distribution functions to spread the traffic over the remaining links; when a link recovers, the LAG distribution function changes again to include it. An optional LA marker protocol ensures that frame order is preserved during these changes. The LAG scheme can also be applied at the ETH layer, by creating a number of point-to-point ETH-trails between two ECFs. LACP can serve in this case as the OA&M procedure detecting the failure and recovery of the ETH-trails, so that the LAG distribution function can be adapted accordingly.
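A LAG distribution function of the kind described above can be sketched with a flow hash (the hash choice and flow-identifier format are our assumptions; the standard does not mandate a particular distribution algorithm, only that frame order within a session be preserved):

```python
# Sketch of a LAG distribution function: hashing a flow identifier keeps
# all frames of one session on one member link (preserving frame order),
# and the modulus shrinks automatically when a member link fails.

import zlib

def lag_member(flow_id, links):
    """Pick the member link for a flow; stable while `links` is unchanged."""
    return links[zlib.crc32(flow_id.encode()) % len(links)]

links = ["link0", "link1", "link2"]
before = lag_member("src=A,dst=B", links)
assert lag_member("src=A,dst=B", links) == before   # same flow -> same link: order kept

links.remove(before)                                # that member link fails
after = lag_member("src=A,dst=B", links)
assert after in links                               # traffic spreads over the survivors
```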
14.1.4.5 Application Protection Constraint Policy (APCP)
Functionally, the APCP mediates between the subscriber and the MEN. It facilitates all the functions necessary to interpret service requests from MEN subscribers and to trigger service-provisioning actions in the MEF transport network. Thus, a service request through either a UNI or a management interface first comes to the APCP, where the type of service and its parameters are interpreted. Then, on behalf of the subscriber request, the APCP (if the request is validated) asks the MEN to provide the requested transport with the desired capabilities and characteristics.
14.1.5 Requirements for Ethernet Services Protection Mechanisms
The following are the requirements for the Metro Ethernet protection schemes. Every protection mechanism in the MEF protection framework is to be evaluated based on these requirements. The requirements are generic in the sense that they apply to all services supported by Metro Ethernet Networks (E-Line, E-LAN, CES).
14.1.5.1 Service-Related Requirements
14.1.5.1.1 Service Configuration Requirements
R-1. It MUST be possible for a subscriber to request different protection parameters for Ethernet services. The requested parameters SHOULD include the connectivity restoration time and the SLS restoration time.
R-2. An EVC of an Ethernet service with an SLS that requires protection MUST be protected along all ETH-trails of which it is composed.
R-3. Protection parameters MUST be definable at the level of a single service or a group of services. End-to-end service protection MAY be implemented utilizing multiple mechanisms along the flow path.
14.1.5.1.2 Restoration Time Categories
Various applications require different connectivity and/or SLS restoration times. The restoration time (connectivity and/or SLS) required for a specific service depends on the needs of the application that the user plans to run over that service.
R-4. It SHOULD be possible to request a connectivity and/or SLS restoration time of the network for each service. The following SHOULD be the restoration-time categories that the services can choose from for
identifying the required connectivity restoration time and SLS restoration time:
• Sub-50 ms restoration time
• Sub-200 ms restoration time
• Sub-2 seconds restoration time
• Sub-5 seconds restoration time
Typical examples may be as follows:
1. CES applications require sub-50 ms SLS restoration time.
2. An SLS restoration time of sub-200 ms is sufficient for certain CES applications.
3. Some real-time or semi-real-time applications may also require sub-200 ms SLS restoration time.
4. A sub-2 seconds connectivity restoration time ensures that a LAG implementation over the service using LACP in fast mode does not reconfigure due to the failure.
5. TCP-based applications usually settle for sub-5 seconds connectivity restoration time.
6. A connectivity restoration time of sub-5 seconds ensures that STP and RSTP do not start reconfiguring the network.
These examples do not serve to categorize applications according to restoration time; they serve only as motivating examples.
14.1.5.2 Network-Related Requirements
14.1.5.2.1 Protected Failures
R-5. The protection mechanisms SHOULD be able to protect against any one of the failure types, for example:
1. Fail condition (hard link failure, e.g., LOS, LOF)
2. Degrade condition (soft link failure, e.g., BER or CRC errors greater than a threshold)
3. Node failure
R-6. The protection mechanisms MAY protect against misconfigurations. Protection switching MUST NOT cause misconnections as a side effect of its own operation.
14.1.5.2.2 Degrade Condition Threshold
A degrade condition is a status of a resource in which traffic transfer might be continuing, but certain measured errors (e.g., packet loss, bit error rate) have reached a predetermined threshold.
R-7. With a protection scheme in which the SLS is preserved during failures, the predetermined threshold MUST be less than the maximum amount of packet loss allowed according to the SLS of the services flowing
through the resource. The provider MAY set a less restrictive threshold on the resources if the SLS of all services flowing through the resource allows that.
One way for the network provider to meet the above requirement is to ensure that the packet-loss commitment in the SLS of the services is limited according to the expected amount of packet loss allowed in the resources of the network. It should be noted that, depending on the protection scheme and layer, measurements other than packet loss may be used, and a relation to packet loss might not be possible. After the protection trigger (threshold crossing), it will take time (the protection switching time) before the traffic is restored. During this time, additional packet loss will occur.

14.1.5.2.3 Transport Layer Protection Mechanisms Interaction
The MEN layering model allows layering within the TRANS layer. Therefore, when one layer executes above another layer, both may belong to TRANS, or the upper one may be the ETH layer and the lower one a TRANS layer.
R-8. An upper layer protection mechanism SHOULD be designed to work in conjunction with lower layer transport protection mechanisms, such as SONET 1+1/1:1, RPR, etc., as available. Each protection mechanism that is allowed to execute in a network including these lower layers MUST support configuration of the hold-off time such that the lower layer protection mechanism converges before the protection mechanism at the upper layer decides whether to take action.
Note that the protection at the lowest layer does not need to support a hold-off timer. In some cases, it MAY be required that the various layers act independently. This depends on the protection policy for the network.

14.1.5.2.4 Protection Control Requirements
The following is a list of parameters and controls relevant for protection schemes:
1. Hold-off time
2. Revertive/nonrevertive mode
3. Reversion (Wait To Restore) time
4. Manual switch
5. Forced switch
6. Lockout
Items (1), (2), and (3) above are configurations of protection parameters, while (4), (5), and (6) are control mechanisms that control the current protection operation mode. The requirement for support of a hold-off timer has already been covered in the subsection above.
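The hold-off coordination required by R-8 can be sketched as follows. This is a minimal illustrative sketch, not a normative implementation; the class and callback names (ProtectionMechanism, defect_still_present, do_switch) are hypothetical.

```python
import time

class ProtectionMechanism:
    """Hypothetical upper-layer protection mechanism with a hold-off timer.

    On a failure trigger, the mechanism waits hold_off_ms before checking
    whether the lower layer has already restored the traffic; only if the
    defect persists does it perform its own protection switch (per R-8)."""

    def __init__(self, hold_off_ms, defect_still_present, do_switch):
        self.hold_off_ms = hold_off_ms            # configurable hold-off time
        self.defect_still_present = defect_still_present
        self.do_switch = do_switch

    def on_failure_trigger(self):
        # Wait for the lower layer protection to converge.
        time.sleep(self.hold_off_ms / 1000.0)
        if self.defect_still_present():
            self.do_switch()                      # this layer takes action
            return True
        return False                              # lower layer already recovered
```

A mechanism at the lowest layer would simply use a hold-off time of zero, consistent with the note that it need not support the timer.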
R-9. The protection at each (ETH/TRANS) layer MAY enable configuration of the revertive/nonrevertive mode and the reversion (Wait To Restore) time. These parameters and controls MAY be applied according to the protection policy. Protection MAY be applied in different layers of the MEN, and the revertive/nonrevertive mode and reversion (Wait To Restore) time SHOULD be applied to each mechanism separately.
R-10. The protection at each (ETH/TRANS) layer SHOULD enable configuration of at least one of the following controls:
• Manual switch
• Forced switch
• Lockout
Protection MAY be applied in different layers of the MEN, and the above three controls SHOULD be applied at each layer separately.

14.1.5.2.5 Bidirectional Switching
There are two variants of the m:n protection type. In the first, a protection resource can be used concurrently for protecting a number of working resources, in case a few of them fail at the same time. In the second, a protection resource is able to pass the traffic of only a single working resource at a time.
R-11. In the case of an m:n protection scheme in which the protection resource is able to pass the traffic of a single working resource at a time, and in which the working and protection resources pass traffic in two directions, the bidirectional switching mechanism MUST be used for controlling the use of the protection resource.
The motivation is that the bidirectional switching mechanism ensures that both directions of the same working resource switch concurrently to the protection resource. This ensures that the network does not reach a situation in which one direction of one working resource and the opposite direction of another working resource are protected by the protection resource, so that no single working resource is completely protected. An example of a mechanism that achieves this result is the APS mechanism of SONET/SDH.

14.1.5.2.6 Robustness
R-12. Each protection mechanism MUST monitor protection standby resources for failures.
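The bidirectional switching rule of R-11 above can be sketched as a controller that grants the protection resource to both directions of one working resource at a time. This is an illustrative sketch; the names (ProtectionResource, request_switch) are hypothetical, not from the MEF specification.

```python
class ProtectionResource:
    """Hypothetical controller for an m:n protection resource that can carry
    the traffic of only one working resource at a time.

    Bidirectional switching: a switch request for a working resource claims
    the protection resource for *both* directions at once, so the resource
    is never split between directions of two different working resources."""

    def __init__(self):
        self.protected = None   # id of the working resource on protection

    def request_switch(self, working_id):
        if self.protected is None:
            self.protected = working_id        # both directions switch together
            return True
        # Busy: only the working resource already being protected may use it.
        return self.protected == working_id

    def release(self):
        """Working resource repaired (and reverted, if revertive mode)."""
        self.protected = None
```

With per-direction (unidirectional) switching, two half-protected working resources could each hold one direction; the bidirectional grant above rules that out.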
14.1.5.2.7 Backward Compatibility Requirements
R-13. When upgrading nodes in the network to support new protection mechanisms, these nodes MUST interoperate with nodes on the same network that are not yet upgraded to include these capabilities or schemes.
14.1.5.2.8 Network Topology
R-14. A Metro Ethernet Network may consist of different subnetwork topologies. The protection framework SHOULD support these different topologies, although a specific scheme may be limited to a few topologies.
R-15. The protection scheme SHOULD provide resource diversity such that the working and protection paths do not share a common resource in the network. The protection scheme SHOULD allow the required level of diversity to be set according to the operator policy. The policy MAY require link diversity, node diversity, station diversity, fiber diversity, cable diversity, duct diversity, and geographical diversity. The operator SHOULD be able to control the required diversity either through the network topology or through the protection scheme and its provisioning parameters.
R-16. To provide protection for a specific EVC, a protection scheme MUST protect each of the network resources within the network topology through which traffic of that EVC flows.

14.1.5.2.9 QoS
In many cases, a trade-off exists between efficient use of network resources and the extent of QoS preservation. The following requirement helps operators control this trade-off.
R-17. Each protection method SHOULD enable the operator to define to what extent the original QoS is kept, up to fully equivalent behavior.

14.1.5.2.10 Effect on User Traffic
R-18. The protection mechanism SHOULD maintain the SLS requirements for loss of traffic and reordering during switching to protection and during restoration events.

14.1.5.2.11 Management Requirements for Protection Schemes
R-19.
A facility implementing a protection scheme SHOULD support a management interface that exposes the following parameters to management applications:
• All control parameters defined in the previous sections
• Status of the working and protection paths
• Signaling of events that represent changes of status of the working and protection resources
• A failure event that causes a protection switch SHALL cause an alarm to be sent by the involved Network Element(s) to the Element Management System.
14.1.6 Framework for Protection in the Metro Ethernet

14.1.6.1 Introduction
The key to the framework is the choice of protection mechanisms. Other aspects are important but are handled elsewhere:
• The protection framework supports arbitrary transports.
• Protection solutions at each layer SHOULD be independent of the internal topology of the underlying layer.
• The APCP allows a subscriber to specify the parameters of the desired protection. This can be done through management-based configuration or through the UNI.
Thus, our framework discussion focuses on the protection mechanisms and the failure detection mechanisms described above. The model that has been developed has the property that it is open to new approaches and innovation within the layers. The description here reflects current discussion and can be expanded in the future.

14.1.6.2 MEF Protection Schemes
In this section, we outline some implementation approaches to providing MEF protection. Many protection mechanisms may be operating in the network at the same time. One reason is the existence of a variety of services requiring different protection parameters. Some of these protection mechanisms may offer protection at different layers of the network, whereas others may offer different protection parameters, such as restoration time and failure coverage, at the same layer. However, operating many protection mechanisms at the same time in the network requires a coherent interworking strategy so that the triggering of multiple protection mechanisms for the same traffic can be treated appropriately. Note that the use of multiple protection mechanisms for the same traffic may be desired because, for example, different protection mechanisms in the network may provide different failure coverage and restoration times.
To ensure coherent operation, the protection requirement section states that each protection mechanism that is allowed to execute in a network including these lower layers MUST support configuration of the hold-off time such that the lower layer protection mechanism converges before the protection mechanism at the upper layer decides whether to take action.
14.1.6.2.1 OA&M-Based End-to-End Path Protection (EEPP)
This end-to-end path protection mechanism requires at least two paths to be created from the source node to the destination node for providing the same service. One of these paths is regarded as the primary path, and the others are redundant paths. The redundant paths are provisioned such that they are disjoint in nodes, links, and shared-risk links. In this way, a failure of a single link or node will disconnect at most one of the redundant paths, so another can still be used and the service is maintained. This method can also be used to protect segments of the end-to-end path.
Possible modes of EEPP protection are the 1:1 mode, in which two redundant paths are provisioned but only one is used at a time; the n:1 mode, in which n+1 paths are provisioned but only one is used at a time; the 1:n mode, in which a single protection path is used for protecting n disjoint working paths; and the 1+1 protection mode, in which two paths are used concurrently: each data packet is sent along both paths, and the sink node decides which to use.
An end-to-end OA&M protocol is used for sensing the availability of a path. With 1+1 protection, it is sufficient to have a one-way OA&M protocol. With the 1:1, 1:n, or m:n protection types, the source node is notified of the failure of the source-to-destination path. This means that the OA&M protocol SHOULD be a two-way protocol. An example of an OA&M protocol for MPLS can be found in ITU-T SG13. Specific transports, such as SONET/SDH and ATM, have their own OA&M mechanisms that can be utilized for EEPP at the specific layer. Other OA&M mechanisms may also be used for the same purpose. Another variation of this scheme is one in which the end-to-end connectivity information is provided by out-of-band connections to a management station.
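The two-way OA&M sensing that drives source-side EEPP switching can be sketched as follows. This is a simplified sketch under assumed semantics (a path is declared unavailable after a number of consecutive missed OA&M replies); the names (EEPPSource, on_oam_interval) are hypothetical.

```python
class EEPPSource:
    """Hypothetical source-side EEPP path selector driven by two-way OA&M.

    The source declares a path unavailable after loss_threshold consecutive
    missed OA&M replies and moves traffic to the next provisioned redundant
    path (1:1 / 1:n style selection; 1+1 would dual-feed instead)."""

    def __init__(self, paths, loss_threshold=3):
        self.paths = list(paths)                # primary first, then redundant
        self.loss_threshold = loss_threshold
        self.misses = {p: 0 for p in self.paths}

    def on_oam_interval(self, replies):
        """replies: set of paths whose OA&M reply arrived this interval."""
        for p in self.paths:
            self.misses[p] = 0 if p in replies else self.misses[p] + 1

    def active_path(self):
        # Prefer the primary; fall back to redundant paths in order.
        for p in self.paths:
            if self.misses[p] < self.loss_threshold:
                return p
        return None                             # all provisioned paths failed
```

Because the redundant paths are provisioned disjoint, a single link or node failure can push at most one path past the loss threshold, so active_path() always finds a survivor.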
14.1.6.2.2 Aggregated Line and Node Protection (ALNP)
The ALNP scheme uses a set of local-protection tunnels for protecting each link and each node in the network. In this way, an end-to-end SLS of fast (e.g., sub-50 ms) protection is accomplished. The ALNP scheme can be applied to networks of any topology, provided that the topology provides a redundant path for each of the network resources. ALNP is built to interoperate with any kind of link between the nodes implementing it. This link can be a point-to-point physical link, a protected transport (in which case protection tunnels are required only for border nodes), a fast protected subnet (again, only border nodes require ALNP protection), or a virtual link (in which case an OA&M procedure is required for indicating the state of the virtual link and of its end nodes).
14.1.6.2.3 Packet 1+1 End-to-End Path Protection
Packet 1+1 provides high-reliability, hitless end-to-end path protection. Packets from an application flow receiving packet 1+1 service are dual-fed at the source side onto two disjoint paths. In the simplest case, these paths are node and link disjoint, but in general disjointness may involve more complicated notions such as shared-risk link groups. On the destination side, one of the two possible received copies of each packet is selected. Note that the incoming packet is selected from either of the two disjoint paths, irrespective of the path from which the last packet was selected. Thus, packet 1+1 treats both paths as working. This is different from the traditional transport 1+1 scheme, where each path is designated as working or protection, and packets are selected from the working path until detection of a failure on it causes a switch to the protection path. Thus, compared with traditional transport 1+1, packet 1+1 does not require explicit failure detection and protection switching. It also does not require any signaling.
This setup allows the packet 1+1 service to recover instantaneously and transparently from any failure that affects only one of the two disjoint paths providing the service. Note that these failures are protected irrespective of the layer, including the physical, link, TRANS, and ETH layers, at which the failure occurred, as long as the failure is in the layer at which the mechanism is provided or in a layer below it. In other words, when the mechanism is provided at a specific layer, failures at that layer and in the layers below it are covered.
Preserving sequentiality requires defining a new protocol that includes sequence numbers. Packet 1+1 requires that the source side assign the same identification to both copies of a dual-fed packet, distinct from that of any other packet. This can easily be achieved by, e.g., assigning sequence numbers to packets.
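The destination-side duplicate elimination just described can be sketched as follows. The class and parameter names (Packet11Sink, window) are illustrative, not taken from the MEF framework, and the state-bounding strategy is a simplifying assumption.

```python
class Packet11Sink:
    """Hypothetical destination-side selector for packet 1+1.

    Both paths are treated as working: the first copy of each sequence
    number to arrive, from either path, is delivered, and the later
    duplicate is discarded. No failure detection, switching, or signaling
    is involved."""

    def __init__(self, window=1024):
        self.window = window       # bound on remembered sequence numbers
        self.seen = set()          # sequence numbers already delivered
        self.delivered = []

    def receive(self, seq, payload):
        if seq in self.seen:
            return False           # duplicate from the other path; drop it
        self.seen.add(seq)
        if len(self.seen) > self.window:
            self.seen.discard(min(self.seen))   # crude state bound (sketch)
        self.delivered.append(payload)
        return True
```

Note that this sketch delivers packets in arrival order; as the text explains, preserving sequentiality requires the sequence-number protocol itself, and without resequencing some unerrored packets may have to be discarded.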
Each pair of duplicated packets will get the same sequence number, distinct from that of the other pairs of duplicated packets. Based on the carried sequence numbers, the destination node is able to identify duplicate packets and select one of them. Note also that if resequencing is not also provided, unerrored packets may be discarded to preserve sequentiality.

14.1.6.2.4 Shared Mesh Protection
The end-to-end shared protection scheme is targeted at providing guaranteed restoration while using a minimal amount of protection bandwidth in a general mesh topology. Other end-to-end protection schemes either require a dedicated protection path for each primary path, such as 1+1 and 1:1, or provide only limited sharing of protection bandwidth, such as 1:N. Compared with them, the shared protection scheme provides very flexible sharing. Instead of dedicating protection resources to every primary path in the network, a pool of protection resources can be set aside that can then be used for
restoring the primary paths affected by a failure. Protection resources are allocated to allow restoration of all the protected traffic from any single possible failure in the network. For each protected primary path, the protection resources are allocated at the time of activation of the primary path.
Arrival of a request to establish shared mesh restoration service between two nodes prompts computation of a pair of disjoint paths between them, subject to two necessary constraints. First, sufficient bandwidth is allocated along the route of the primary path to accommodate the requesting traffic. Second, either the already reserved protection bandwidth along the protection path is sufficient to guarantee restoration from any single failure along the primary route, or the available bandwidth along the protection path is allocated to accommodate the additional bandwidth needed for protecting the new primary path. Note that sharing is achieved by always first trying to accommodate a new request with already allocated protection resources. This can be achieved by keeping track, for each link in the network, of the amount of resources that will be required to protect all the primary paths that will be switched to it after any single failure in the network. This information can either be maintained in a centralized fashion at a server or be distributed to the nodes in the network. In the case where the information is distributed among the nodes, each node only needs to keep track of the amount of resources required on each of its incident links.
Note that since it is well accepted and verified that the probability of multiple concurrent failures in most networks is small, the scheme has been described as protecting from any single failure in the network. Protection from multiple failures can be achieved through a straightforward extension.
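The per-link bookkeeping described above, in which each link tracks the protection bandwidth needed under every single-failure scenario, can be sketched as follows. Names (SharedMeshAccounting, add_protected_path) and the flat bandwidth model are illustrative assumptions.

```python
from collections import defaultdict

class SharedMeshAccounting:
    """Hypothetical per-link protection-bandwidth bookkeeping for shared
    mesh protection.

    For each protection link, track the bandwidth that would be needed on
    it under every possible single-failure scenario. The reservation is the
    maximum over scenarios, so primary paths that cannot fail together
    share the same protection bandwidth."""

    def __init__(self):
        # link -> failure scenario -> bandwidth needed if that failure occurs
        self.needed = defaultdict(lambda: defaultdict(int))

    def add_protected_path(self, primary_links, protection_links, bandwidth):
        # Each primary link is a distinct single-failure scenario.
        for failure in primary_links:
            for link in protection_links:
                self.needed[link][failure] += bandwidth

    def reservation(self, link):
        """Protection bandwidth to reserve on this link."""
        scenarios = self.needed[link]
        return max(scenarios.values()) if scenarios else 0
```

For example, two primaries routed over disjoint links A and B that both protect over link P share one unit of reservation on P, because A and B cannot fail together under the single-failure assumption.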
14.2. METRO ETHERNET TRAFFIC AND PERFORMANCE MANAGEMENT
14.2.1 Ethernet Traffic Management Overview
Ethernet traffic management offers the essential components with which a service provider can offer a customer a range of services that are differentiated based on performance metrics, e.g., frame loss and frame delay. Customers request a specific performance level as stated in the SLS. Service providers achieve levels of performance assurance by employing the necessary traffic management mechanisms both at the edge of and inside the metro Ethernet network (MEN).
Figure 14-8 shows the interaction between the customer network and the provider network across the User-Network Interface (UNI) between the CE
(Customer Edge) and the PE (Provider Edge). Inside the customer network, a customer may implement IEEE 802.1Q VLANs and traffic management mechanisms. Those mechanisms are beyond the scope of this specification. Across the UNI, the CE-VLAN ID (Customer Edge VLAN ID) is mapped to a single EVC. The EVC, or a combination of an EVC and the CE-VLAN CoS, is used by the PE to determine the Class of Service (CoS) instance using the CoS Identifier.
Figure 14-8. Ethernet traffic management overview
The CoS Identifier is used to identify the CoS instance applicable to a flow of Service Frames. A maximum of 32K (4095*8) CoS Identifiers MAY be supported at a UNI. A much smaller number of Classes of Services MAY be supported. The CoS Identifier can be viewed as an index that points to two fundamental building blocks that together define the service level received by a Service Frame. Those building blocks are the Bandwidth Profile and the forwarding class. The Bandwidth Profile, if present, is a set of traffic parameters that governs the expected arrival pattern of customer traffic per CoS instance. It provides a deterministic upper bound to the expected volume of traffic per CoS instance. The Bandwidth Profile parameters are inputs to a metering
526
Chapter 14
algorithm that verifies the conformance of incoming Service Frames per CoS instance. This allows network providers to engineer network resources to satisfy service performance requirements. Multiple Bandwidth Profiles may exist at the same UNI. The Bandwidth Profile, if present, SHOULD be enforced by the provider's network, since it is part of the SLS and is agreed upon between the customer and the service provider.
QoS policy rules, by way of rate enforcement algorithms, are typically used to identify the specific actions taken by a service provider when Service Frames violate a Bandwidth Profile. The actions may be to drop the Service Frame or to recolor (or reprioritize) it to indicate high discard precedence within the MEN. The outcome of any such rate enforcement within the service provider network is a set of Service Frames that are labeled Green, Yellow, or Red based on their level of conformance to Bandwidth Profiles.
The forwarding class defines the treatment inside the provider network received by a Service Frame belonging to a particular CoS instance. For example, Service Frames may be forwarded at the highest priority available at the nodal queues to assure some level of delay requirement. Forwarding class is outside the scope of this specification. For the details of traffic and performance management, please refer to Chapter 10, "Ethernet Services over Metro Ethernet Networks".
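The Green/Yellow/Red coloring produced by such a metering algorithm can be sketched with a simplified two-token-bucket meter. This is an illustrative sketch in the spirit of the MEF bandwidth profile parameters (committed and excess rates and bursts), not the normative MEF algorithm; the class and parameter names are assumptions.

```python
class BandwidthProfileMeter:
    """Simplified two-token-bucket meter coloring Service Frames Green,
    Yellow, or Red by conformance to a (hypothetical) bandwidth profile."""

    def __init__(self, cir, cbs, eir, ebs):
        self.cir, self.cbs = cir, cbs    # committed rate (B/s), burst (B)
        self.eir, self.ebs = eir, ebs    # excess rate (B/s), burst (B)
        self.c_tokens, self.e_tokens = cbs, ebs
        self.last = 0.0

    def color(self, now, frame_len):
        # Replenish both buckets for elapsed time, capped at burst sizes.
        dt = now - self.last
        self.last = now
        self.c_tokens = min(self.cbs, self.c_tokens + self.cir * dt)
        self.e_tokens = min(self.ebs, self.e_tokens + self.eir * dt)
        if frame_len <= self.c_tokens:
            self.c_tokens -= frame_len
            return "green"               # conformant: SLS applies
        if frame_len <= self.e_tokens:
            self.e_tokens -= frame_len
            return "yellow"              # excess: high discard precedence
        return "red"                     # nonconformant: typically dropped
```

A burst within the committed bucket is marked Green; once the committed tokens are exhausted, frames drain the excess bucket and are marked Yellow; beyond both, frames are Red.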
ACKNOWLEDGMENTS
Metro Ethernet Forum
Lior Shabtay
14.3. REFERENCES
[1] Metro Ethernet Forum, Technical Specification MEF 2, Requirements and Framework for Ethernet Service Protection in Metro Ethernet Networks, February 2004.
[2] Sharma, V. and Hellstrand, F., Framework for MPLS-based Recovery, RFC 3469.
[3] ITU-T Rec. G.841, Types and characteristics of SDH network protection architectures, October 1998.
Chapter 15 SONET SERVICES FOR STORAGE AREA NETWORKS
Richard Orgias Nortel
15.1. DATA GROWTH
There are few trends as dramatic or universal as the growth of data. The rate of creation of digital information is unprecedented, with estimates of its annual growth pegged at double-digit rates. Every year brings the addition of more digital information to the already considerable store that constitutes the sum total of human knowledge. One study from the University of California at Berkeley [1] estimates that by 2005, worldwide production of digital data will have grown to almost 100 exabytes (an exabyte is 10^18 bytes). As more and more organizations adapt to the reality of 24/7 operation, the creation of data accelerates, and the need to manage it and make it useful grows in importance.
Making sense of the vast store of information requires that it be organized. Organization requires that it be cataloged and stored. For data to have any meaningful value, the ability to retrieve it as required is also important. In the context of an organization, a big part of data management is simply to provide mechanisms for storing, duplicating, and providing ready access to information. Of course, the type of information varies with the organization. In the case of a financial institution, it could be records associated with financial transactions. For a healthcare organization, it could be an x-ray or some other type of patient record. For an entertainment company, it might be a film clip or a music file. The common challenge faced by these disparate organizations is the requirement to support storage needs estimated to be
doubling every 18 to 24 months according to research conducted by Gartner, an industry research and consulting firm. The data storage needs of an enterprise can be met in a number of ways. Among the media types used for storage of digital information are magnetic tape, optical disks, hard drives, and more recently solid-state disk drives. Once the data has been written to a medium, it can be duplicated and distributed so that a copy is available for use by multiple parties or used to replace data that is rendered inaccessible or unusable. Making copies of data is one means of ensuring that data loss does not expose an organization to financial loss or preclude it from meeting a requirement for data availability. The emergence of storage arrays, a group of disks that work as a unit, has facilitated the mass storage, protection, and distribution of data by making it possible to store and create copies of digital information on disk that can be accessed and modified in near real time.
15.2. STORAGE NETWORKING
The drive to maximize computing resource utilization has resulted in a model in which processing power is separated from storage. This setup is typified by the server-based architecture present in most enterprise environments, where applications are run from servers with data storage handled on separate, dedicated storage devices. This trend continues to be reflected in the move away from direct attached storage (DAS) and toward a networked storage paradigm where storage resources are shared among multiple users. Networking can minimize the suboptimal use of storage resources by making it possible to share storage among multiple servers or end-user devices. Networking also enhances performance and eliminates bottlenecks by
• Simplifying management, thereby eliminating the need for backup and maintenance of individual volumes
• Enhancing availability by eliminating the need for an attached CPU device to support tasks such as backup and defragmentation
• Simplifying recovery procedures by eliminating the need to restore both applications and data in the event of a failure
A simple example of the benefits of a network can be illustrated by considering the case where the members of a small workgroup utilize the same application. In the scenario without a network (i.e., direct attached storage, or DAS), the application and all associated files would be maintained on the individual computers of each member of the workgroup. Sharing of files as well as backup and maintenance would have to be handled at an
SONET Services for Storage A rea Networks
529
individual level. The result is the potential for variation in the version of the application being used, as well as variability in the frequency of backups. The consistency of backups will be determined by how diligent each individual is at maintaining his or her workstation.
If a network is put in place that allows sharing of the application and associated files, with access to both via a server, the ability to share information is enhanced, and management of the application and associated files can be centralized. With a networked architecture, a single individual may act as the administrator of the network, with responsibility for keeping everything up to date. The administrator can ensure that applications are updated, backups are performed, and unwanted or outdated files are purged, all by managing the common server and associated storage. This approach allows for more efficient management of data files and ensures consistency in the information being used by all members of the workgroup.
Networking can also address the loss of processing efficiency that results from having to use CPU cycles to support backup and defragmentation on multiple devices. If data is stored on a shared medium, with access enabled by a network infrastructure, additional storage can be added to the central pool more efficiently, and maintenance activities can be performed on a single device instead of on multiple devices. In short, networked storage delivers quantifiable operational and financial benefits to an organization by reducing the complexity of managing a pool of storage assets. The difference between direct attached storage and network attached storage is illustrated in Figure 15-1.
There are two distinct storage networking approaches. Network Attached Storage, or NAS, usually consists of a hard disk or multidisk RAID (redundant array of independent disks) system connected to users and servers via a Local Area Network (LAN) infrastructure.
A NAS allows sharing of applications and files by multiple users who are able to access a storage device via an IP address on a LAN. By contrast, the other storage networking option, a Storage Area Network, or SAN, is a special-purpose network that supports the interconnection of data storage devices and servers and uses specialized protocols to optimize block-level data transmission. Some of the protocols used by storage networks include ESCON, Fibre Channel, FICON, SCSI, iSCSI, and Gigabit Ethernet.
• ESCON [2], or Enterprise Systems Connection, is a protocol developed by IBM in the early 1990s to interconnect its mainframe computers to each other and to attached storage. ESCON operates over fiber optic cable.
Figure 15-1. Direct Attached and Network Attached Storage
• Fibre Channel [3] is an ANSI-specified protocol that allows the connection of servers and storage devices as well as the interconnection of storage controllers and drives. Fibre Channel operates over fiber optic and coaxial cable as well as ordinary twisted-pair cables.
• FICON [4], or Fiber Connectivity, is an IBM-developed high-speed input/output (I/O) interface for mainframe computer connections to storage devices. FICON offers higher physical link rates than ESCON, the previous standard that it replaces.
• SCSI [5], or Small Computer Systems Interface, is a set of ANSI-specified interfaces, originally developed by Apple Computer, that allow personal computers to communicate with peripheral hardware such as disk drives, tape drives, CD-ROM drives, printers, and scanners.
• iSCSI [6], or Internet Small Computer System Interface, is an IETF-developed protocol that allows SCSI commands to be carried over IP networks and can be used to link data storage facilities and to transmit data over local area and wide area networks (LANs and WANs).
• Gigabit Ethernet [6] is an IEEE standard that supports transport of Ethernet frames over an optical medium at data rates of one billion bits per second. Gigabit Ethernet is typically used to build out LAN backbones and is compatible with lower-rate Ethernet transport.
Table 15-1 highlights some of the key applications, data rates, and the possible reach of the protocols described above.
Table 15-1. Storage Protocols and Applications

Protocol           | Data Rate                        | Typical Application                                                               | Reach
ESCON              | 200 Mbps                         | IBM mainframe to peripheral connectivity                                          | 60 km
Fibre Channel (FC) | 1 Gbps, 2 Gbps, 4 Gbps, 10 Gbps  | Server-to-storage, server-to-server connectivity                                  | 50 km
FICON              | 1 Gbps                           | IBM mainframe to peripheral connectivity; evolution of ESCON; interworks with FC  | 50 km
SCSI               | Up to 160 Mbytes/s               | Computer device interconnection, RAID interfacing                                 | Data center
iSCSI              | Varies                           | IP-based SCSI variant for server-to-server connectivity, RAID interfacing         | Data center
Gigabit Ethernet   | 1 Gbps, 10 Gbps                  | Native packet transport                                                           | 50 km
In most instances, the nature of the applications will determine whether a SAN or a NAS solution is implemented by an enterprise. Many enterprises have both solutions.
15.3. STORAGE AREA NETWORKS
SANs currently dominate as the networking choice for mission-critical storage applications, since they are switched-fabric networks characterized by high connectivity coupled with fast and reliable access. SANs tend to be deployed in support of database-intensive applications such as transaction processing, ERP (Enterprise Resource Planning), and CRM (Customer Relationship Management) applications. ERP and CRM applications are critical elements of the IT infrastructure of many enterprises, with ERP facilitating the internal management of a business (maintaining inventories, tracking orders, purchasing, etc.) while CRM supports management of customer relationships. Both types of applications are usually built around databases that utilize block data as opposed to files, employ shared access, and must maintain current data for maximum effectiveness. NAS, which is typically cheaper to implement than
a SAN and fits into the standard Ethernet environment, is better suited to file-based applications such as web hosting, content delivery, and messaging.
A SAN consists of two basic elements: storage media, namely tape drives and arrays, and networking infrastructure, which includes switches, hubs, host bus adapters (HBAs), and the cabling that interconnects the storage devices. A simplified SAN is shown in Figure 15-2.
Figure 15-2. Storage Area Network (SAN)
15.3.1 Factors Driving SAN Extension
Due to the tendency to deploy infrastructure to meet local needs, many multisite organizations have developed storage architectures characterized by SAN islands, pockets of noninterconnected SAN
infrastructure. As is the case with direct attached storage architectures, SAN islands may suffer from suboptimal resource utilization, since storage may be available but not at the site of greatest need. In addition, cost-effective scaling to accommodate growth may be a challenge, since storage capacity would have to be added at every site. A technology solution that enables interconnection or networking between SAN islands can increase asset utilization and lower total cost of ownership (TCO). Asset utilization would be increased by sharing the storage pool among a larger number of users, and TCO would be lowered by amortizing the cost of the storage infrastructure over the larger group.
Interconnecting SANs may also improve access to information by enabling data consolidation. This consolidation may be physical or virtual. In the case of physical consolidation, high-speed network links would be used to provide multiple sites with access to a physically consolidated storage pool. With virtual consolidation, software would be used to manage physically separated but optically networked storage so that it appears monolithic. This enables users at one site to access available storage at a remote site.
The interconnection of SANs is also being driven by a changing regulatory environment that is placing more stringent requirements on organizations regarding accessibility, security, and retention of information. By connecting SANs, an organization can increase the level of control it has over SAN traffic. Some of the specific regulations governing records management are the Health Insurance Portability and Accountability Act (HIPAA), which governs the healthcare industry, and the Sarbanes-Oxley Act, which applies to all publicly traded corporations. In addition, the U.S.
Food and Drug Administration regulates record keeping in the drug, medical device, and biotech industries, and the U.S. Securities and Exchange Commission (SEC) has also imposed guidelines on financial institutions regarding minimum standards for data protection. Similar regulations exist or are being implemented in other parts of the world.

One of the most common methods for complying with these regulatory regimes and for ensuring business continuity is data replication, a process in which data from a primary volume is copied or replicated on a secondary volume. In the past, this process has occurred with primary and secondary volumes located in close proximity to one another, but recent unexpected events, both natural and manmade, have demonstrated the need for increasing the geographic separation between primary and secondary sites. This drives the need for technologies that support extension of the distances between SANs from meters or hundreds of meters to tens or hundreds of kilometers.
Each of the examples cited above — elimination of SAN islands, data consolidation, regulatory pressure, and business continuity — is forcing organizations to consider SAN extension as a means of increasing the utility and value of their SAN infrastructure. SAN extension allows the benefits of SANs to be enabled across a multisite network.
15.3.2 Fibre Channel: The Storage Protocol of Choice

No discussion of SAN extension can exclude a reference to Fibre Channel, the serial data transfer architecture and protocol running on the majority of today's SANs. Fibre Channel is currently, and for the foreseeable future, the protocol of choice for organizations that need reliable, cost-effective information storage and delivery at Gigabit per second (Gbps) speeds and beyond. Developed in 1988 and ratified as an ANSI standard in 1994, Fibre Channel is the mature, safe solution for storage device communications. It was specifically designed to address the stringent requirements of network servers and their storage applications.

Fibre Channel provides serial link connection between multiple nodes and is able to support multiple protocols, including SCSI. It facilitates the high-speed transfer of large amounts of information at varying line rates over both electrical and optical media. Fibre Channel supports point-to-point or switched topologies, with each node port connected to another node port or a fabric port. The configuration options are varied and can be selected based on the performance required. The topology choices are illustrated in Figure 15-3.
Figure 15-3. Fibre Channel topologies (point-to-point, loop)
The Fibre Channel specification [3] defines a hierarchy consisting of five levels: FC-0, FC-1, FC-2, FC-3, and FC-4. Levels FC-0, FC-1, and FC-2 define the physical, signaling, and transmission protocols, respectively. FC-3 defines common services for advanced features, and FC-4
defines application interfaces. The structure of the Fibre Channel hierarchy is shown in Figure 15-4.
FC-4: Protocol mappings (e.g., SCSI, IP, ATM, IPI, HIPPI)
FC-3: Common services
FC-2: Framing/flow control
FC-1: Encoding/decoding
FC-0: Physical interface (133 Mbps to 4 Gbps)

Figure 15-4. Fibre Channel protocol stack
Fibre Channel is able to meet the storage challenges of organizations large and small with the following advantages:
• Solutions leadership: Fibre Channel provides versatile connectivity with scalable performance.
• Reliability: Fibre Channel, a most reliable form of communications, sustains an enterprise with assured information delivery.
• Gigabit bandwidth: 1 and 2 Gbps solutions are available today, with 4 and 10 Gbps solutions under development.
• Multiple topologies: Dedicated point-to-point, shared loops, and scaled switched topologies are available to meet storage application requirements.
• Scalability: From single point-to-point gigabit links to integrated enterprises with hundreds of servers, Fibre Channel delivers unmatched performance.
• Congestion-free operation: Fibre Channel's credit-based flow control delivers data as fast as the destination buffer is able to receive it.
• High efficiency: Fibre Channel has very little transmission overhead, especially for moving large blocks of storage data.

Fibre Channel is a well-established and proven protocol that has found application in SANs, computer clusters, and other data-intensive computing configurations. Any consideration of a distance extension technology for SANs needs to account for the requirements of Fibre Channel operation in order to be viable in the marketplace.
15.4. DISTANCE EXTENSION REQUIREMENTS
Storage applications often have very demanding performance requirements. This is not totally unexpected, given the business value that storage applications bring to an organization. Business continuity, one of the key drivers of storage infrastructure deployment, is about protecting mission-critical data. Since the stakes are so high, the performance of a storage network must be predictable and reliable. A distance extension technology introduced into the transport portion of a Fibre Channel SAN must meet some basic requirements, which include:
• In-order delivery of block data
• Low latency
• Lossless data transport
• High throughput
• Network flexibility and resilience
• Carrier grade
• Scalability and performance

Let's consider each of these requirements in turn.
15.4.1.1 In-order Delivery
The Fibre Channel protocol transmits data in blocks. To preserve the integrity of the overlying application (usually a database-centered application such as ERP or CRM), these blocks must be received by the target device in the order in which they were sent, since no reordering of blocks is accommodated by the protocol.
15.4.1.2 Low Latency
Latency is the term used to describe the amount of delay in a network. Data sent between two points in a network will incur latency from a number of sources, including adaptation (moving from a native FC format to an alternate scheme for transport), equipment (a certain amount of delay is incurred as data transits a router, an add-drop multiplexer, or a gateway), and the distance that must be covered. Data that must travel a long distance will incur more latency than data that travels a short distance. Latency impacts the performance of a SAN because overlying applications usually have some type of expected response time built into their operation. Increase latency too much and an application may not function as expected.
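As a rough illustration, the distance contribution can be estimated directly. This sketch assumes light propagates through optical fiber at about 5 microseconds per kilometer (roughly two-thirds of the speed of light in vacuum), a commonly used approximation rather than a figure from this chapter:

```python
# Estimate propagation latency over a fiber link.
# Assumes ~5 us/km one-way propagation in optical fiber (refractive index ~1.5).

def fiber_latency_ms(distance_km: float, per_km_us: float = 5.0) -> float:
    """One-way propagation delay in milliseconds."""
    return distance_km * per_km_us / 1000.0

for km in (10, 100, 1000):
    one_way = fiber_latency_ms(km)
    print(f"{km:>5} km: one-way {one_way:.2f} ms, round trip {2 * one_way:.2f} ms")
```

At 1000 km the round trip alone costs about 10 ms, before any equipment or adaptation delay is added.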
15.4.1.3 Lossless Delivery
Fibre Channel networks do not tolerate discarding of data. Every piece of data sent must be received at the target in order to complete the transaction. If data is lost in transit, the entire block must be resent. This requirement increases the amount of time required to complete the action and introduces additional latency, which in turn can degrade the performance of the overall application.
15.4.1.4 Throughput
Throughput is an important consideration when connecting SANs, since the replication or backup of data must usually occur within some predetermined window. The amount of data to be transferred and the amount of time available for the backup or transfer will determine how much bandwidth is required to meet an organization's storage networking need. Sustained and deterministic levels of throughput allow IT managers to have confidence that their storage applications will function in a predictable fashion.
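The sizing arithmetic is straightforward. A hedged sketch (the data volume and window below are made-up inputs, not figures from this chapter):

```python
# Required sustained bandwidth to move a data set within a backup window.

def required_gbps(data_tb: float, window_hours: float) -> float:
    """Sustained throughput (Gbps) needed to transfer data_tb within window_hours."""
    bits = data_tb * 1e12 * 8          # terabytes -> bits (decimal TB)
    seconds = window_hours * 3600.0
    return bits / seconds / 1e9

# Example: replicate 10 TB nightly within a 6-hour window.
print(f"{required_gbps(10, 6):.2f} Gbps")  # ~3.70 Gbps sustained
```

Note that the result is the *sustained* rate; a link whose deliverable throughput varies unpredictably would need to be provisioned well above this figure.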
15.4.1.5 Network Flexibility and Resilience
In addition to meeting the specific requirements of storage applications, such as latency and throughput, there are also overall design considerations that must be accommodated to guarantee an infrastructure capable of supporting storage distance extension. In order to be truly competitive, a storage infrastructure must support standard data center protocols such as Fibre Channel, FICON, ESCON, and Gigabit Ethernet. These protocols are well established and broadly deployed. As such, the acceptance of a storage networking technology will be enhanced by its support for legacy protocols, particularly if it provides a means of supporting multiple protocol streams simultaneously.
15.4.1.6 Carrier Grade
The infrastructure choice for distance extension also needs to be carrier grade, providing end-to-end system availability of 99.999 percent (about 5 minutes of downtime annually) as well as support for peering with multiple carriers with no degradation in service. The ability to provide the sub-50 ms rerouting that has become standard in today's telecommunications networks also enhances the storage application. The combination of carrier grade capability and fault tolerance will guarantee a robust network suitable
for deployment by a carrier and well suited to supporting a mission-critical storage application like business continuity.
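The five-nines figure quoted above maps to annual downtime as follows (a quick check, using a 365-day year):

```python
# Convert an availability percentage into allowable annual downtime.

def annual_downtime_minutes(availability_pct: float) -> float:
    minutes_per_year = 365 * 24 * 60
    return (1 - availability_pct / 100.0) * minutes_per_year

print(f"{annual_downtime_minutes(99.999):.2f} minutes/year")  # ~5.26 minutes
```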
15.4.1.7 Scalability and Performance
Scalability and performance are two additional elements of a storage network that can be impacted by the technology selected for intersite network connectivity. To support the mirroring applications essential to business continuity and disaster recovery, data recovery needs to be measured in minutes, not hours, and the amount of data lost as a result of an interruption should approach zero. This type of performance needs to be supported on a scalable storage infrastructure. With growth being a common characteristic of all storage networks, the infrastructure supporting storage connectivity needs to have the capacity to scale as the demand for bandwidth increases. Scalability includes flexible bandwidth choices and easy provisioning.
15.5. DISTANCE EXTENSION ALTERNATIVES
There are a number of options available for storage distance extension. The right option for any situation will depend on the ability of the networking technology to cost-effectively deliver the performance demanded by the underlying applications. Two important factors that may determine the suitability of a distance extension technology in a storage application are the recovery-point objective (RPO) and the recovery-time objective (RTO) of the organization that owns the data.

RPO is a measure of the period of time between data backups. A short RPO would imply backups occurring with greater frequency than a long RPO. Simply put, the RPO places a measure on the amount of information an organization can afford to lose. RTO is a measure of the amount of time that it will take an organization to recover its data after a failure. A short RTO implies a shorter recovery period than a long RTO.

Together, RPO and RTO help define the type of distance extension solution that an enterprise can implement while still meeting its business objectives for storage. If an organization has a need for supporting mirroring of large amounts of mission-critical data, for example, it would probably need a distance extension solution able to sustain very short RPOs and RTOs. Such solutions would tend to support significant bandwidth and minimal latency. At the other extreme, support for an application that is
non-mission critical and that is characterized by a long RPO or RTO could be handled by distance extension options that offer lower bandwidth and throughput.

Four of the current options available to organizations seeking distance extension solutions are described below. Among these options, SONET-based distance extension offers significant promise as a solution with a very favorable trade-off between performance and cost and will be discussed in greater detail later in this chapter. The current options include:
1. Legacy access solutions, which provide a low-cost option for connecting storage systems with low bandwidth requirements.
2. WDM storage connectivity (lambda) solutions, which are significantly greater in cost to deploy but provide the highest bandwidth and reliability for the most demanding storage applications.
3. Storage/Fibre Channel (FC) over IP, which is becoming much more common with the advent of the Internet and with core IP backbones becoming available. These are low-cost solutions that take advantage of the prevalence of IP. The drawback is that IP QoS is still emerging at this time, which reduces the overall reliability and predictability of IP storage traffic.
4. Storage/FC over SONET/SDH, which is becoming a popular choice for sub-gigabit-rate storage requirements. For service providers this approach leverages the SONET infrastructure in place today, enabling them to provide an economical storage service to their customers.
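One way to reason about the relationship between RPO and link capacity is to treat the RPO as the replication interval and size the link to ship each interval's worth of changed data in time. The change rate and RPO below are illustrative assumptions, not values from the chapter:

```python
# Given how much data changes per hour and the RPO (maximum tolerable data
# loss, treated here as the replication interval), estimate the sustained
# bandwidth the extension link must carry.

def link_gbps_for_rpo(changed_gb_per_hour: float, rpo_minutes: float) -> float:
    """Bandwidth (Gbps) to ship one RPO interval's worth of changes per interval."""
    changed_bits = changed_gb_per_hour * (rpo_minutes / 60.0) * 1e9 * 8
    return changed_bits / (rpo_minutes * 60.0) / 1e9

# Example: 450 GB of changes per hour, replicated every 15 minutes.
print(f"{link_gbps_for_rpo(450, 15):.2f} Gbps")  # ~1.00 Gbps average
```

The average rate is set by the change rate alone; what a shorter RPO changes is burstiness, and synchronous mirroring (RPO near zero) forces every write onto the link in real time, which is why very short RPOs demand the high-bandwidth, low-latency options.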
15.5.1 Legacy Private Line

The so-called legacy category of distance extension solutions utilizes traditional low-speed private line services like DS-1s and DS-3s for distance extension. This option carries storage traffic across the installed telecommunications infrastructure using ATM over SONET transport. Legacy solutions suffer from a lack of scalability, which is particularly disadvantageous in the face of a rapidly growing data pool.
15.5.2 WDM

Wavelength division multiplexing (WDM), a technology that uses multiple wavelengths (or frequencies) of light to transport concurrent streams of digital information over fiber-optic cable, is an ideal technology for storage extension because of its ability to meet the most rigorous performance requirements of storage networking.
The value that WDM brings to storage is that it is protocol agnostic. This means it can be used to transport any type of traffic, including storage traffic in its native format. This versatility allows a WDM network to transport native Ethernet (at any speed), native Fibre Channel, ESCON, FICON, and any other storage protocol the user may choose. Furthermore, the ability of a WDM system to carry multiple types of traffic over different wavelengths on the same fiber only adds to its utility in data center applications.

A WDM network can be used to connect multiple geographically separated sites and can scale to support data transfer rates measured in the hundreds of Gigabits per second. Among the advantages it offers is support for business-continuity applications that require high-availability systems not susceptible to WAN outages. WDM systems offer massive scalability but usually require dark fiber and may not be cost-effective for all organizations.
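The "hundreds of Gigabits per second" scaling claim is simple multiplication of wavelength count by per-wavelength rate. The channel counts and rates below are typical-of-the-era assumptions, not figures from the text:

```python
# Aggregate capacity of a DWDM system: wavelength count times per-wavelength rate.

def dwdm_capacity_gbps(channels: int, gbps_per_channel: float) -> float:
    return channels * gbps_per_channel

for channels, rate in ((32, 2.5), (40, 10), (80, 10)):
    total = dwdm_capacity_gbps(channels, rate)
    print(f"{channels} wavelengths x {rate} Gbps = {total:.0f} Gbps per fiber pair")
```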
15.5.3 Storage over IP

15.5.3.1 iSCSI
iSCSI is one of the three technologies being touted as an option for taking advantage of IP networking for storage applications. As Figure 15-5 shows, iSCSI operates on top of TCP (transmission control protocol), moving block data cells (or iSCSI packets) over an IP Ethernet network. This transport protocol consolidates SANs into a single IP network for both data and storage traffic, using Ethernet for the LAN and IP-over-optics for the WAN.
Figure 15-5. iSCSI hierarchy for storage
Because of the routing capabilities of an IP network, iSCSI is multipoint and provides any-to-any connectivity. iSCSI requires equipping both server and storage system with iSCSI components. This means incremental capital costs and higher server overhead associated with TCP processing. On the other hand, iSCSI scales well to 1 Gb and 10 Gb Ethernet and can result in administrative cost reductions in a pure TCP/IP network. Applications appropriate for iSCSI are fairly limited today, namely, local storage access where performance is not critical.
15.5.3.2 FCIP
FCIP [8] is another technology being proposed as an option for storage networking over an IP network. Figure 15-6 shows the hierarchy in an FCIP network.
Figure 15-6. FCIP hierarchy for storage
This transport protocol introduces a new network device between the Fibre Channel SAN and the IP network, sometimes called an edge device because it resides at the edge of the Fibre Channel SAN. FCIP uses tunneling technology, encapsulating Fibre Channel frames, whether FCP or SCSI, into IP packets, and maps Fibre Channel fabric domains to IP addresses. FCIP supports existing Fibre Channel SAN hardware and software investments while allowing all SAN-connected data to be accessed over the IP network backbone without altering the Fibre Channel fabric, servers, storage devices, or software in any way. For this reason, many industry analysts agree that FCIP is the transport protocol that most seamlessly integrates IP into existing Fibre Channel storage area networks. Applications appropriate for FCIP include storage-to-storage applications such as disk mirroring, tape backup/restore, and FC SAN interconnection.
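A toy sketch of the tunneling idea follows. This is an illustrative encapsulation only: the real FCIP frame format is defined in RFC 3821, and the length-prefix header below is hypothetical, used just to show how an opaque FC frame survives a trip through a byte-stream tunnel unchanged:

```python
import struct

# Hypothetical, simplified tunnel encapsulation: prepend a length header to an
# opaque Fibre Channel frame so it can be carried over a TCP byte stream.
# (Not the RFC 3821 wire format.)

def encapsulate(fc_frame: bytes) -> bytes:
    """Wrap an FC frame with a 4-byte big-endian length prefix (illustrative)."""
    return struct.pack("!I", len(fc_frame)) + fc_frame

def decapsulate(stream: bytes) -> bytes:
    """Recover the FC frame from the tunnel payload (illustrative)."""
    (length,) = struct.unpack("!I", stream[:4])
    return stream[4 : 4 + length]

frame = b"\x22" * 36  # stand-in for a minimal FC frame
assert decapsulate(encapsulate(frame)) == frame
```

The key property, which the real protocol shares, is that the Fibre Channel frame is carried opaquely: the fabric, servers, and storage devices never see the tunnel.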
15.5.3.3 iFCP
iFCP is a gateway-to-gateway protocol that provides Fibre Channel fabric services over a TCP/IP network to FCP-based Fibre Channel devices. This protocol uses TCP to provide congestion control, error detection, and error recovery. This approach currently has very limited vendor support.
15.5.4 SONET/SDH

Storage over SONET and its ETSI equivalent, storage over SDH, use the widely deployed SONET [9] and SDH [10] infrastructure built to support voice transport to carry native storage traffic over considerable distances. With a few modifications to optimize the transport of block storage over a network built around the transport of 64 kbit/s voice, storage over SONET/SDH holds promise as a vehicle for providing affordable SAN extension capability to a very wide market segment including small-,
medium-, and large-sized businesses. Table 15-2 provides a comparison of the alternative distance extension options.

Table 15-2. Distance extension technology comparison (legacy access (ATM, FR), DWDM, FCIP, iSCSI, and SONET/SDH solutions rated against the distance extension requirements on a scale from + (good) through ++ (better) to +++ (best))

15.6. SONET—AN IDEAL DISTANCE EXTENSION PROTOCOL
SONET is the undisputed king of the voice communication infrastructure. Every major telecommunications provider has networks built on SONET or, in the case of the ETSI world, on SDH. The reason is simple. SONET provides a standardized, high-reliability infrastructure for voice communications. With guaranteed interoperability between disparate networks and the ability to hand off traffic seamlessly at peering points, SONET and SDH have been linchpins in the development of a global telecommunications infrastructure. Given the significant investment in SONET/SDH networks around the world, it makes good economic sense for service providers and users alike to
get the maximum benefit out of the network by using it wherever possible to carry different types of traffic. Storage traffic is emerging as a dominant class of traffic, and this has provided the impetus for the development of techniques to facilitate its transport over the SONET/SDH network. SONET-based connectivity offers an attractive option for networking storage area networks because it supports distance extension, which is increasingly important as enterprises implement business continuity solutions.

The networking of storage with SONET connectivity presents both a challenge and an opportunity for service providers and enterprises. The opportunity is to create a new revenue stream based on the transport of native protocol storage between any two points where the provider has existing network connectivity. The challenge is to meet the availability, reliability, and performance requirements needed for the secure and effective support of storage applications such as mirroring or providing server cluster connectivity. Figure 15-7 shows how SONET fits in a SAN extension application.
Figure 15-7. Inter-Data-Center connectivity based on SONET/SDH (clients, servers, and storage at each data center connected across the SONET network)
15.6.1 Making SONET Fit — The Role of Standards

While SONET in its current form offers an attractive connectivity option for networking storage, enhancements are under way that are aimed at improving its efficiency as a storage networking protocol, particularly in distance applications. Three areas of interest are the following:
• Generic Framing Procedure (GFP) [11]
• Virtual Concatenation (VCAT) [10]
• Flow Control
15.6.1.1 Generic Framing Procedure
Generic Framing Procedure (GFP) is an ITU standard that describes a flexible mapping technique for transparent transport of multiple protocols in a SONET or SDH network. GFP provides a low-overhead procedure for transporting both packet services and storage services, including FICON, Fibre Channel, ESCON, Ethernet, and OC-n signals, over SONET. GFP offers unambiguous implementation across multivendor networks, error correction schemes that enable the extremely low bit error rates critical for storage connectivity, and very efficient mapping into SONET without the addition of protocol baggage.

Protocol baggage is the protocol overhead that accrues when data is mapped to other protocols from its native protocol. For example, prior to GFP, the transport of a native Fibre Channel frame required it to be mapped onto TCP/IP, which was then encapsulated in an Ethernet frame and transported across the local area network. Subsequently, for transport over the wide area network, a further protocol conversion to Packet over SONET (POS) needed to be performed, where the Ethernet headers were stripped off and PPP/HDLC headers were appended to the IP packet, which was subsequently mapped onto SONET. This protocol conversion adds significant processing steps and protocol overhead (baggage) at each layer, as can be seen in Chapter 5. With GFP, the Fibre Channel frame is directly mapped into SONET with little protocol overhead.
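To make the baggage point concrete, here is a rough per-frame overhead comparison. The legacy figures are nominal minimum TCP, IP, and PPP/HDLC header sizes; the 8 bytes for GFP (a 4-byte core header plus a 4-byte payload header in the simplest case) and the 2,048-byte payload are assumptions of this sketch, not figures from the chapter:

```python
# Compare per-frame encapsulation overhead: legacy TCP/IP/PPP path vs GFP.
# Header sizes are nominal minimums; payload size is an arbitrary example.

PAYLOAD = 2048  # bytes of storage data per frame (assumed)

legacy = {"TCP": 20, "IP": 20, "PPP/HDLC": 4}   # former POS-style path
gfp = {"GFP core": 4, "GFP payload hdr": 4}     # direct GFP mapping

for name, stack in (("legacy", legacy), ("GFP", gfp)):
    overhead = sum(stack.values())
    eff = PAYLOAD / (PAYLOAD + overhead) * 100
    print(f"{name}: {overhead} bytes overhead, {eff:.1f}% payload efficiency")
```

The byte counts understate the legacy path's true cost, which also includes the per-layer processing steps the text describes; the headers alone, however, already show the direction of the saving.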
15.6.1.2 Virtual Concatenation
Virtual concatenation, as discussed in Chapter 4, is an ITU standard that enhances link utilization and transport capacity efficiency by enabling complete flexibility in the allocation of bandwidth to client signals. SONET is able to transport storage traffic at different rates on standard interfaces; however, in many instances, the allocation of bandwidth to a specific storage signal has traditionally been inefficient. Many times a full OC-3, OC-12, or
OC-48 has had to be assigned to transport this storage data, even though only 100 Mbps may have been required.
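The efficiency gain can be sketched numerically. The STS-1 payload rate used here is an approximate figure assumed for illustration, and the 100 Mbps client is the example from the text:

```python
import math

# Approximate payload capacity of an STS-1 SPE (~48.4 Mbps after overhead);
# figures are rough, for illustration only.
STS1_PAYLOAD_MBPS = 48.4
CLIENT_MBPS = 100.0  # e.g., a Fast Ethernet-sized storage stream

members = math.ceil(CLIENT_MBPS / STS1_PAYLOAD_MBPS)   # STS-1-Nv group size
vcat_capacity = members * STS1_PAYLOAD_MBPS
oc48_capacity = 48 * STS1_PAYLOAD_MBPS                  # whole OC-48

print(f"VCAT STS-1-{members}v: {CLIENT_MBPS / vcat_capacity:.0%} utilization")
print(f"Contiguous OC-48:   {CLIENT_MBPS / oc48_capacity:.0%} utilization")
```

Carving out a right-sized group of virtually concatenated members leaves the rest of the OC-48 free for other clients instead of stranding it.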
15.6.1.3 Flow Control
Flow control is a scheme for adjusting the flow of data sent from one device to another in a way that ensures that the receiving device can handle all the incoming data. This important mechanism helps to guarantee delivery between devices in a storage network. Since storage networks cannot tolerate discard, every bit of information sent must be received, and receipt must be acknowledged.

Flow control is typically implemented via a credit system that ensures that the rate of data sent does not exceed the ability of the receiving device to receive, and hence no information is lost. The system uses credits to allow the sending device to keep track of how much capacity the receiving device has in its input buffer. Once the receiver's input buffer is full, the sending device will have no more credits and cannot resume sending information until it receives an acknowledgment in the form of buffer credits back.

Buffer credits are critical to the transmission of storage traffic over distance because they maintain performance by allowing continuous use of the link and ensure zero data loss. The implementation of buffer credit-based flow control schemes on SONET access devices is a key element for ensuring the efficiency of a storage link that extends across the WAN. A WAN flow control system helps to eliminate a phenomenon known as droop, in which the performance of the extended SAN degrades because the distance between the initiator and target devices precludes a fast enough receipt of acknowledgment messages by the initiator to maintain its store of credits. If the initiator does not receive acknowledgment of delivery by the target and runs out of credits, it will stop putting new data on the link. This situation can arise on long links when the latency introduced by travel time on the link is sufficiently large to impact receipt of the acknowledgment message by the initiator.
The end result is a situation in which, even though there is available bandwidth on the link, no data is being placed on the link by the initiator device.

Flow control standards are currently being considered under an initiative within ANSI Technical Committee T11 [12], which has a mandate to standardize additional configurations and protocols that support the extension of Fibre Channel networks over distance. This standards effort, carried out under the title Fibre Channel Backbone-3 (FC-BB-3) [13], will address standards for flow control as well as performance, timers, and management functions for configurations that support distance extension.
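The droop effect follows directly from credit accounting: the sender can have at most its credit count of frames in flight per round trip. A sketch (the credit count is illustrative, the 2,112-byte figure is the maximum Fibre Channel frame size, and the 5 µs/km propagation figure is an assumption):

```python
# Maximum sustained throughput of a credit-limited Fibre Channel link:
# at most `credits` frames can be unacknowledged at once, so one round
# trip bounds how fast credits are replenished.

def max_throughput_gbps(credits: int, frame_bytes: int, distance_km: float) -> float:
    rtt_s = 2 * distance_km * 5e-6          # ~5 us/km one-way in fiber
    bits_in_flight = credits * frame_bytes * 8
    return bits_in_flight / rtt_s / 1e9

# 16 buffer credits, 2,112-byte frames: no constraint in the data center,
# but droop sets in quickly as distance grows.
for km in (1, 50, 200):
    print(f"{km:>4} km: {max_throughput_gbps(16, 2112, km):6.2f} Gbps ceiling")
```

At 200 km the same 16 credits cap the link near 0.14 Gbps regardless of line rate, which is exactly why WAN-side credit spoofing or extended-credit schemes matter for SONET access devices.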
15.6.1.4 Additional Benefits of SONET for Distance Extension
The addition of GFP, VCAT, and flow control capabilities to SONET is specifically aimed at optimizing its ability to transport block-encoded protocols, including Fibre Channel. However, SONET offers a host of other attractive capabilities that make it ideally suited to distance extension. First and foremost is its ubiquity. SONET/SDH services are available in all geographies globally. This level of availability means that any organization with standard telecommunications connectivity has the potential to use SONET/SDH in a SAN distance extension application. Furthermore, the cost of a SONET/SDH service is within the reach of the average organization. The continued erosion of bandwidth prices will only enhance the attractiveness of SONET in distance extension applications.

Another aspect of SONET that adds to its appeal as a distance extension technology is its reliability. SONET/SDH networks are robust. The industry term carrier grade is often used to describe the elements on which a SONET/SDH infrastructure is built. In practical terms, this is reflected in features such as equipment redundancy as well as fast failover in the event of equipment or link failures. This level of reliability is perfectly suited to storage applications, which are by definition designed to address the problem of equipment failure.

SONET networks also offer a level of security and deterministic behavior not found in IP networks, for example. SONET circuits are usually point-to-point. As such, they are less susceptible to snooping, and because their behavior can be accurately characterized, adjustments for factors such as latency can be accommodated when implementing applications. As previously noted, given the impact of latency on storage applications, these features are very useful when an organization is considering a distance extension technology.
The ability to characterize the behavior of a SONET link makes it possible for a service provider to offer a customer a service-level agreement (SLA) with defined quality of service (QoS) guarantees. An SLA for a storage service would have to take into account the essential elements of service required to support, for example, a data replication service. The SLA would define metrics for the service, including availability, throughput, network transit delay, mean time to repair (MTTR), mean time to respond, data delivery ratio, and reporting requirements. A sample SLA is shown in Table 15-3. SLAs are an essential element in the mass-marketing of telecommunication services and can be expected to facilitate SONET/SDH adoption in distance extension applications.
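Reporting against such an SLA reduces to simple per-metric checks. The metric names below mirror the Table 15-3 guarantees, while the measured values are invented for the example:

```python
# Check measured service metrics against SLA guarantees.

sla = {"availability_pct": 100.0, "throughput_gbps": 1.0,
       "transit_delay_ms_rt": 125.0, "data_delivery_ratio_pct": 99.99}

measured = {"availability_pct": 99.999, "throughput_gbps": 1.0,
            "transit_delay_ms_rt": 118.2, "data_delivery_ratio_pct": 99.995}

def violations(sla: dict, measured: dict) -> list[str]:
    """Delay must stay under its bound; the other metrics must meet or exceed it."""
    bad = []
    for key, bound in sla.items():
        value = measured[key]
        ok = value <= bound if "delay" in key else value >= bound
        if not ok:
            bad.append(f"{key}: measured {value} vs guaranteed {bound}")
    return bad

print(violations(sla, measured))  # only availability misses its guarantee here
```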
Table 15-3. Sample SLA for storage network extension

Service Level Guarantee      Domestic Service
Availability                 100%
Throughput                   1 Gbps
Network transit delay        125 ms round trip
Mean time to respond         2 hours
Mean time to repair          4 hours
Data delivery ratio          99.99%
Notification of problems     30 minutes
Finally, the interoperability of SONET networks creates the potential for borderless distance extension. Since the storage traffic being carried across a SONET/SDH network is carried transparently, it can transit add-drop multiplexers, cross-connects, switches, and other elements of the SONET network with no need for special treatment by any of these devices. It can also be handed off from one carrier to another with no need for special consideration. The entire network, metro and long-haul, is available to a customer seeking to move data from one geographic location to another. This capability allows the owner of data to support storage applications ranging from backups to replication to mirroring. Of course, as previously described, the desired performance will drive the required bandwidth and the distance between sites, but the flexibility exists to meet and address a wide range of scenarios.
15.7. SUMMARY
Data growth is one of the most pressing issues faced by all organizations. In the current environment, data is an important commodity that must be managed to meet business needs and protected to ensure regulatory compliance. Storage networking has emerged as a very important vehicle for simultaneously addressing the business needs and fiduciary responsibilities of an organization. Storage Area Networking has proven itself to be the best storage networking option for management of mission-critical block-oriented data.

As recent unexpected events have shown, geographic separation is one means of ensuring business continuity in an uncertain world. With the ongoing reductions in the cost of bandwidth and developments in technology, distance extension is proving to be a viable option for organizations looking to increase the separation between their SANs. Of the solutions available for distance extension, SONET/SDH connectivity offers significant promise for distance extension based on its
attractive combination of cost and performance. With enhancements in its ability to carry native storage traffic using GFP and to carry traffic more efficiently using VCAT, SONET/SDH offers a practical means of transporting storage traffic across multicarrier networks. SONET/SDH comes standard with carrier grade reliability, predictability, scalability, and end-to-end protection of bandwidth. The combination of GFP, VCAT, and flow control, which brings intra-data-center performance to inter-data-center connections, allows SONET/SDH networks to meet the stringent demands of extending storage area networks.
15.8. REFERENCES

[1] "How Much Information? 2003", http://www.sims.berkeley.edu/research/projects/how-much-info-2003/.
[2] ESCON, http://www.redbooks.ibm.com/abstracts/sg244662.html.
[3] Fibre Channel Framing and Signaling (FC-FS), ANSI INCITS 373:2003, October 27, 2003. Note: Published T11 standards are available from the INCITS online store at http://www.incits.org or the ANSI online store at http://www.ansi.org.
[4] FICON, http://www.redbooks.ibm.com/abstracts/sg246266.html?Open.
[5] SCSI, ANSI X3T10.
[6] Internet Small Computer Systems Interface (iSCSI), Internet Engineering Task Force, RFC 3720, April 2004.
[7] http://www.ieee802.org/3/.
[8] Fibre Channel over TCP/IP (FCIP), Internet Engineering Task Force, RFC 3821, July 2004.
[9] ANSI American National Standard T1.105, Synchronous Optical Network (SONET) — Basic Description including Multiplex Structure, Rates and Formats.
[10] ITU-T Recommendation G.707/Y.1322, Network Node Interface for the Synchronous Digital Hierarchy (SDH), 2003.
[11] ITU-T Recommendation G.7041/Y.1303, The Generic Framing Procedure (GFP), 2003.
[12] http://www.t11.org.
[13] Fibre Channel Backbone-3 (FC-BB-3), ANSI INCITS, work in progress, http://www.t11.org.
PART 3
Control and Management of Transport Networks
Chapter 16
ARCHITECTING THE AUTOMATICALLY SWITCHED TRANSPORT NETWORK
ITU-T Control Plane Recommendation Framework
Alan McGuire (British Telecommunications plc), George Newsome (Consulting Engineer), Lyndon Ong (Ciena), Jonathan Sadler (Tellabs), Stephen Shew (Nortel), Eve Varma (Lucent Technologies)
16.1. INTRODUCTION
Transport networks have traditionally been associated with manual provisioning of circuits for long-duration services, based upon a centralized management system for configuration and provisioning. Originally, transport networks were entirely manually operated, with circuit orders on paper and staff located in equipment stations both to execute the circuit orders (make connections) and to locate and repair equipment faults. Each equipment generation has added more automation to network operation. PDH-generation networks introduced remote operations but provided little in the way of integrated management. SDH-generation networks provided standards for maintenance features and equipment control, and some network operators even developed automated operational support systems capable of creating hundreds of circuits a day, with connection setup taking minutes per connection [1]. This development has been adequate for automating provisioning within carrier-specific operations systems, but it did not allow for easy operation between carriers. In fact, after ten years of wide-scale SDH deployment, there is still no platform for provisioning connections across multiple operators.

Historically, switched services have been considered as connections that are set up and torn down using signaling protocols, while the setup and
teardown of leased line services was performed via network management protocols. This distinction has been an artifact of the traditional demarcation between transmission and switching. The distinction between switched and leased line services has begun to blur, partly due to the shortening length of contracts for leased lines, and many network operators and suppliers are developing control plane technology (see Section 16.2.1 for a discussion of planes) for application in transport networks. The goal has been to allow faster service provisioning, particularly between network operators, as well as the creation of new network services [1].

The advent of control plane technology, and the associated orientation towards switched connection services, enables fine-grained control of a few specific services rather than control of equipment in general. This service orientation facilitates technology independence, and the fine-grained aspect makes interworking between carriers more likely, because it is no longer necessary to interconnect all aspects of each carrier's operations system, but only a single connection service. Utilization of control plane technology, however, does not remove the need for fault localization, performance management, or trouble ticketing. The control plane additionally offers opportunities for increased automation, which has traditionally led to reduced operating costs.

The application of control plane technology to transport networks has rapidly gained industry momentum, with various standards bodies and industry fora engaged in tackling various facets of this problem space. The ultimate vision is multivendor and multicarrier interoperable networking that supports end-to-end switched connection services on a global scale. To reach this goal, open standards for a distributed control plane must be established.
Standardization activities have been under way within the International Telecommunication Union-Telecommunications Standardization Sector (ITU-T), the Internet Engineering Task Force (IETF), and the industry fora Optical Internetworking Forum (OIF) and ATM Forum. The ITU-T started "top down" with the development of the networking requirements for the generic automatic switched transport network (ASTN), working down into detailed protocol requirements. The IETF started "bottom up" in developing the Generalized Multi-Protocol Label Switching (GMPLS) umbrella of specifications based upon modifications and extensions of existing IP-based signaling and routing protocols [2]. The OIF has focused upon developing control plane implementation agreements based upon, wherever possible, available global standards and provides associated interoperability demonstrations with the intent of offering an early testing vehicle for the industry. The ATM Forum has primarily provided feedback and input regarding proposed extensions of protocols within their scope of expertise (i.e., PNNI). Iterative
communications among common participants, and liaisons among them, have been leading towards convergence of requirements and protocols so as to enable industry usage of a common/generic set of base protocols, with protocol extensions for transport domain application.

Within this chapter, we focus upon ITU-T control plane standardization efforts, including requirements, architecture, and network models, and the relationship among requirements, architecture, and protocol solutions. The ITU-T Recommendations we will be covering deal with Requirements (G.807 [3]), Architecture (G.8080 [4]), Signaling (G.7713 [5], G.7713.1-3 [6-8]), Discovery (G.7714 [9], G.7714.1 [10]), Routing (G.7715 [11] and G.7715.1 [12]), Data Communications Network/Signaling Communications Network (G.7712 [13]), and Control Plane Management (G.7718 [14]). The draft ITU-T Recommendation on control plane management information modeling, G.7718.1, is touched upon but not detailed, as it was under development at the time of the writing of this chapter. The relationships among these Recommendations are illustrated in Figure 16-1.
[Figure: diagram of the Automatically Switched Optical Network (ASON) Recommendation hierarchy, showing the protocol-specific signaling Recommendations G.7713.1 (PNNI-based), G.7713.2 (GMPLS RSVP-TE-based), and G.7713.3 (GMPLS CR-LDP-based), and G.7714.1 (discovery messages), beneath the framework Recommendations.]

Figure 16-1. ITU-T control plane Recommendation structure
The depth of treatment will enable a reader to understand the reasoning used in the design of the various parts, though a detailed treatment of individual protocol specifications is beyond the scope of this chapter.
16.2. NETWORK REQUIREMENTS (G.807)
Recommendation G.807, completed in July 2001, provides network-level requirements for the Automatic Switched Transport Network (ASTN) control plane, whose primary functions are related to the setup and release of
connections across a transport network. It specifies the fundamental connection control functions, in single- and dual-homed applications, and other functions providing for diversity in routing connections to support high-availability services. We note that the dual-homed requirements apply equally whether the customer has two connections to the same provider (protecting against physical plant failure) or connections to two different providers (protecting against failures within a single provider network). It is important to recognize this limited scope, as it was intended that management platforms continue to support other aspects of network operation. In particular, there is no mention of equipment provisioning or fault management functions being subsumed by the control plane.

Because the time frame of the ITU-T SG 15 specification of the Optical Transport Network (OTN) was roughly concurrent with that for specification of the control plane, subtending SG 15 Recommendations (and many other documents) refer to the Automatically Switched Optical Network (ASON). However, there is nothing in the ASON Recommendations limiting their applicability to other transport technologies, and the terms ASON and ASTN have essentially become synonyms.

Recommendation G.807 requirements are client and technology independent, and provide the foundation for the architectural specifications of switched transport networks and the technical specifications required to implement these networks for particular transport technologies. In the years following the 2001 issue of G.807, additional requirements have been added to ASON Recommendations as increased understanding has clarified some of the earlier requirements. It is expected that G.807 will be refined and enhanced in its next revision to reflect these insights.
16.2.1 Architectural Context

While G.807 and G.8080 attempt to be explicit in defining the terminology used, a few terms fall through the cracks (quite often because these terms are so commonly used by the engineers who draft the Recommendations that those engineers no longer realize the terms may be unfamiliar to others). A particular case is the sudden appearance of the term planes.

The transport network, originally, was manually managed, and all the equipment was simply transport. Later, automatic management via remote operations systems was introduced, and the terms transport plane and management plane were introduced. The transport plane refers to the components and resources being managed, while the management plane refers to the components and systems doing the managing. In terms of architectural models, the transport plane refers to
everything described by the G.805 [15] functional architecture (Chapter 2), and the management plane refers to the managed objects and the systems that operate on them. Since the management systems were remote from the managed equipment, a communications network was provided for the management applications to use. As this was a general-purpose communications network, the message communications protocol was designed to be independent of the transport equipment infrastructure being managed (even though many of the communications channels were provided by reserved capacity on these facilities).

With the introduction of ASON, a new aspect of control was introduced, and the notion of a control plane is a natural extension. The ASON control plane is the set of all components that provide the autonomous (from the point of view of a management system) operation of ASON services. It was thus implicitly understood that ASON would need a communications network, and it was implicitly assumed that such a network would have properties very similar to the management communications network that most transport network engineers are familiar with. This long-standing practice of separating planes, and of having a communications network independent of the facilities, is not generally found in IP networks. As we will discuss later, this has important implications when signaling and routing protocols from the IP space are applied to the ASON architecture.
16.2.2 Call and Connection Control

Connection control, essential to the operation of a transport network, may be provisioned, signaled, or some combination of the two (hybrid). Connections that are provisioned, by means of a management system or manual methods, involve configuration of every network element along the path with the information required to establish an end-to-end connection (i.e., a G.805 network connection). Such connections are called permanent connections (PCs) [3]. A switched connection (SC), illustrated in Figure 16-2, is dynamically set up via automated routing control (with or without a distributed route computation approach) and signaling functions on an end-to-end basis.
Figure 16-2. Example of switched connection service
A hybrid connection, known as a soft permanent connection (SPC) and illustrated in Figure 16-3, is set up by provisioning permanent connections at the ingress and egress ports of the network, with the control plane setting up the switched connection in between. An SPC has the properties of a leased line (private circuit) but is set up using a signaling protocol. All three of these approaches may occur within carrier networks, and may form the foundation for different service models and applications [2].
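The three connection types differ only in which plane establishes which piece of the end-to-end path. A minimal Python sketch of that division of labor (our own modeling, with illustrative names; the G.8080 terms are only the PC/SC/SPC labels themselves):

```python
# Toy model of the three ASON connection types: which plane establishes
# each segment of the end-to-end connection. Names are illustrative.
from enum import Enum

class Establisher(Enum):
    MANAGEMENT_PLANE = "management plane (provisioned)"
    CONTROL_PLANE = "control plane (signaled)"

def connection_segments(kind: str) -> list:
    """Return (segment, establishing plane) pairs for a connection type."""
    if kind == "PC":    # permanent connection: provisioned end to end
        return [("end-to-end", Establisher.MANAGEMENT_PLANE)]
    if kind == "SC":    # switched connection: signaled end to end
        return [("end-to-end", Establisher.CONTROL_PLANE)]
    if kind == "SPC":   # soft permanent connection: hybrid of the two
        return [("ingress port", Establisher.MANAGEMENT_PLANE),
                ("network core", Establisher.CONTROL_PLANE),
                ("egress port", Establisher.MANAGEMENT_PLANE)]
    raise ValueError(f"unknown connection type: {kind}")

for kind in ("PC", "SC", "SPC"):
    print(kind, [(seg, who.value) for seg, who in connection_segments(kind)])
```

The SPC case makes the hybrid nature explicit: provisioned edges bracketing a signaled core, which is exactly the arrangement the text describes.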
[Figure: management plane, control plane, and transport plane, with permanent connections provisioned at the network edges and switched connections established by the control plane between them.]

Figure 16-3. Example of soft permanent connection service
Regardless of the approach, a critical requirement is that failures in the control or management plane should not affect the connection (we note that the same applies to software upgrades). In other words, in the event that signaling connectivity is lost, the transport network must maintain existing connections. Furthermore, when the control plane recovers, it should recover connection state information without affecting the live connections. This requirement is typically referred to as the separation of control and transport
planes, and it reflects the criticality of the transport plane maintaining an extremely high level of reliability. We note that the persistence of a connection is actually a function of the type of service-level specifications supported within a transport network. The majority of connections will have a persistency requirement (e.g., protected connections, or connections with some guaranteed level of availability). However, operators might decide to offer a "best effort" service-level specification for connections that are automatically released when a defect occurs on the connection.

A call can be considered as a service provided to user endpoints, where multiple calls may exist between any two endpoints and each call may have multiple connections. The call concept provides an abstract relationship between users, describing (or verifying) the extent to which the users are willing to offer (or accept) service to (from) each other. A call does not provide the actual connectivity for transmitting user traffic; it only builds a relationship by which future connections may be made. Call control is therefore defined as a signaling association between one or more user applications and the network to control the setup, release, modification, and maintenance of sets of connections.

The concept of call control grew out of the vision of the intelligent network (IN), which was enabled by the concept that service control could be logically separated from connection control. This meant that new services could be offered, unconstrained by assumptions about the resources of the underlying network. This allowed the introduction of new value-added services in addition to the basic PSTN (Public Switched Telephone Network). Thus, a service request could still be made from a phone terminal, but the request would be delivered to a service manager control function rather than directly to the connection control function.
The service manager is now a client of the original connection service and can employ that connection service as needed to carry out the service request. The service interface can now address resources that are completely unknown to the original connection service, enabling a wide range of new services to be created without affecting the underlying transport network. We can also consider call control itself to be a service that can be logically separated from connection control, allowing call control to be processed (and if necessary located) separately from connection control in the underlying transport network. The notion of considering calls and call control to be a kind of service is a reflection of the idea that the ability to set up connections in the network may be offered to others [16]. While call control functions were not explicitly discussed within G.807, there was an implicit assumption of their existence (e.g., discussion of call arrival rates and the need for associated congestion control mechanisms).
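The layering described above, call control as a client of connection control, can be sketched in a few lines of Python. This is our own toy model, not an ASON protocol implementation; all class and method names are assumptions, and the "policy" applied at call setup is reduced to a simple authorization check.

```python
# Sketch of call control layered over connection control: the call
# controller applies service policy (here, a bare authorization check)
# before any transport resource is touched. All names are illustrative.

class ConnectionController:
    """Knows only how to set up and release connections."""
    def __init__(self):
        self._next_id = 0
        self.connections = {}   # conn_id -> (a_end, z_end)

    def setup(self, a_end: str, z_end: str) -> int:
        self._next_id += 1
        self.connections[self._next_id] = (a_end, z_end)
        return self._next_id

    def release(self, conn_id: int) -> None:
        del self.connections[conn_id]

class CallController:
    """Owns the user-to-user service relationship; a client of
    ConnectionController, as the text describes."""
    def __init__(self, cc: ConnectionController, authorized: set):
        self.cc = cc
        self.authorized = authorized
        self.calls = {}         # call_id -> list of conn_ids
        self._next_call = 0

    def setup_call(self, a_end: str, z_end: str) -> int:
        if not {a_end, z_end} <= self.authorized:
            raise PermissionError("call rejected: unauthorized endpoint")
        self._next_call += 1
        self.calls[self._next_call] = []   # a call may start with no
        return self._next_call             # connections at all

    def add_connection(self, call_id: int, a_end: str, z_end: str) -> int:
        conn = self.cc.setup(a_end, z_end)
        self.calls[call_id].append(conn)
        return conn
```

Note that a call can exist with zero connections and later acquire several, which is the key property the chapter goes on to exploit for diverse routing and recovery.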
In the PSTN world, examples of call control capabilities include features such as ringback, call divert, etc. In the transport world, a fundamental call property is its ability to be supported by multiple connections, where each connection may be of a different type and may exist independently of the other connections within the call. The concept of the call allows for better flexibility in how users set up connections and how the network offers services to users. In essence, a call allows:
• Verification and authentication of a call prior to connection, which may result in less wasted resources
• Support for virtual concatenation, where each connection can travel on different diverse paths (e.g., taking an ESCON payload of 160 Mbit/s, adapting it using GFP [17], and mapping it into a virtual concatenation group composed of 4 VC-3s, each at 48.384 Mbit/s)
• General treatment of multiple connections that may be associated for the purpose of recovery; for example, a pair of primary and backup connections may belong to the same call
• Use of public and private addressing spaces (hosts using a public space, with the network using only an internal private addressing space)
• A better upgrading strategy for service provider control plane operation, where call control (service provisioning) may be separated from switches and connections (where connection control may reside)

An example of a single call with multiple connections for recovery purposes is illustrated in Figure 16-4. In this example, a single call exists between two users, and the network provides a highly available service. It accomplishes this by instantiating two connections between the users' points of attachment to the network, with user traffic replicated on both connections. Should one of the connections fail in the network, the network can bridge traffic (at the egress) from the remaining connection such that the users do not perceive any change.
[Figure: a single call between User 1 and User 2, supported by two diverse connections across the network.]

Figure 16-4. Example of call with two connections for availability purposes
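The availability mechanism of Figure 16-4 can be reduced to a toy model: ingress replication onto two diverse connections, with the egress selecting whichever copy survives. This is our own minimal sketch, with illustrative names, and it deliberately ignores timing, reordering, and real selection mechanisms.

```python
# Toy model of the Figure 16-4 service: one call, two diverse
# connections carrying replicated traffic; the egress selects the first
# surviving copy, so the users see no change when one connection fails.
from typing import Optional

class DiverseCall:
    def __init__(self):
        # One call, two member connections, both initially working.
        self.working = {"primary": True, "secondary": True}

    def fail(self, name: str) -> None:
        """Simulate a transport failure on one member connection."""
        self.working[name] = False

    def deliver(self, payload: str) -> Optional[str]:
        # Ingress replicates onto both connections; egress selects the
        # first surviving copy. None means both members are down.
        for name in ("primary", "secondary"):
            if self.working[name]:
                return payload
        return None

call = DiverseCall()
assert call.deliver("frame-1") == "frame-1"
call.fail("primary")
assert call.deliver("frame-2") == "frame-2"   # the call survives
```

The call object outlives the failed connection, which is exactly the point: connection state changes inside the network, call state at the endpoints does not.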
Architecting the Automatically Switched Transport Network
559
Connections associated with a call may also vary over time. A call may exist with one connection, and then, at a later time, another connection may be added. Another example is a call with a single connection that experiences loss of data transfer on that connection due to a failure. The call can remain alive while a restoration action is initiated to create a second connection that, once active, allows data transfer to continue.

Another concept originating from the PSTN world is that of supplementary services in addition to mandatory services. Supplementary services provide more information regarding an existing service that may not be needed at all times but may be signaled when desired. The service access point for supplementary services (i.e., the service interface on the component that is offering the service) does not necessarily coincide with that for other mandatory services. This implies that different protocols may be used to support the service. Possible supplementary services that might be considered for control plane application include:
• Customer management of Virtual Private Networks (Closed User Groups)
• Route query
• Route fragment caching operation (route recording)
• Directory services/client reachability

Implicit in the discussion of requirements related to call and connection control are business and operational aspects, described in Section 16.2.3 below.
16.2.3 Business and Operational Aspects

Control plane deployment will ultimately be set within the context of an operator's business model, and must support the business aspects of commercial operation. Deployment will also take place in the context of the heterogeneity of transport networks, which must also be accommodated in any control plane solution. In this section, we examine control plane requirements and implications arising from support for existing business models and transport network infrastructure options. Currently existing business models, one or more of which might be used by different organizations of the same network operator [18], include:
1. An Internet Service Provider (ISP) that owns all of its own infrastructure (i.e., including fiber and duct to the customer premises) and delivers only IP-based services
2. An ISP that leases some of its fiber or transport capability from a third party, and delivers only IP-based services on that infrastructure
3. A service provider owning, for example, an SDH infrastructure, who offers SDH, ATM, and IP services, and who sells services to customers who may themselves resell to others
4. A bandwidth broker (or carrier's carrier) providing optical networking services (a subtle difference between this case and the previous one is that the bandwidth broker may not own any of the transport infrastructure supporting those services, and the connection is actually carried over third-party networks)

The above business models introduce requirements involving operational aspects that must be supportable by the control plane. Two of the primary areas relate to trust/security boundaries and billing considerations. We note that billing also means that there must be a commercial agreement before bandwidth is purchased. This is true for a wholesaler even when selling within the same organization. Hence, a client platform requires financial authority to buy lower-layer capacity, and service management needs to be aware of it.

It is clear that policy and security needs vary among service providers, and that internal network service provider choices must be respected. For business model 1, since the infrastructure is fully owned by the ISP (an unusual scenario), there are no trust issues, as the infrastructure and service providers are one and the same. This is not the case for business model 2, where some of the infrastructure is leased and there is a trust boundary between the infrastructure provider and the ISP. A similar situation exists for business model 3, where there is a trust boundary between the service provider business and the client businesses (e.g., regarding the degree of visibility of internal service provider network routable addresses). For business model 4, there are clearly trust boundaries among all of the involved networks [18].

In addition to trust aspects, the different billing models employed among these business models must be accommodated.
For example, ISP billing tends to be flat rate for a given access bandwidth. ISP billing mechanisms are not fine-grained, are not sensitive to distance, and do not track the number of different destinations (called parties). However, for business models 3 and 4, service providers would typically be interested in assuring not only that they can bill for any value-added services they provide but also that they have the ability to consider various factors/parameters in their billing policies.

Additional service provider requirements that impact control plane requirements include aspects that appear fairly obvious from a transport network infrastructure perspective, though not necessarily so from a data-oriented perspective. Examples include assuring that a carrier has the ability to control usage of its own network resources, i.e., that a carrier controls what network resources are available to individual services
or users. This ability allows the carrier to ensure that critical and priority services get capacity in the event of capacity shortages [19]. It also allows carriers to deploy their resources to support Service Level Agreements (SLAs) as they best see fit, supporting carrier-differentiated SLA realization. Another example is that the control plane solution should avoid inherent assumptions regarding optimizations for, or dependencies on, particular supported client services. This is evident from consideration of the aforementioned business models, where the first two require support only of IP services and the latter two require support for a range of client services.

Aside from commercial business aspects, carrier choices related to the underlying transport network infrastructure also have implications for control plane requirements. The transport infrastructure has steadily evolved to include a wide range of bearer technologies, infrastructure granularity options, flexible capacity adjustment schemes, and survivability mechanisms. Network operators and service providers have deployed a range of bearer technologies, have chosen differing infrastructure evolution strategies, and have to cope with various considerations and constraints related to their operational support system (OSS) environments. It should similarly be expected that heterogeneity will occur in the deployment of the optical control plane, involving differing control plane signaling and routing protocol options and versions, as well as management/control plane evolution scenarios. On top of this, it will be necessary to handle multivendor and multicarrier scenarios. Thus, from a pragmatic perspective, the requirements for control plane solutions must be developed with heterogeneous environments in mind so that they are able to coexist with the existing network. In other words, there should be no assumption that such solutions will be deployed in a greenfield and/or homogeneous environment.
In summary, examination of business and operational aspects results in an understanding that:
• Commercial "businesses" require a strong abstraction barrier to protect their operating practices and the resources of the business from external scrutiny or control
• Provided value-added services must be "verifiable" and "billable" in a value-preserving way
• The transport network is segmented into portions belonging to different businesses
• Transport networks are inherently heterogeneous (including the means by which they are controlled and managed)
• Even within a specific business, further segmentation of the network can take place due to policy considerations (e.g., choice of survivability mechanism)
There are a number of implications arising from these statements. For example, a control plane solution that satisfies commercial business requirements would allow a carrier to provide services without exposing the internal details of its network to its customers (as in the traditional PSTN). This leads to the concept of service demarcation points. The fundamental understanding that transport networks are inherently segmented into portions (or domains, as will be more fully articulated in the next section) drives the realization that the scope of connection control would not generally be an end-to-end network connection.

In the context of business and operational requirements, the goal of the control plane may be stated as supporting services through the automatic provisioning of network connections across one or more managerial/administrative domains. This involves both a service and a connection perspective:
• The service (call) perspective is to support the provisioning of end-to-end services while preserving the independent nature of the various businesses involved
• The connection perspective is to automatically provision network connections (in support of a service) that span one or more managerial/administrative domains

Indeed, the desire to provide and obtain services across an interface probably colors the ASON requirements more strongly than any other aspect of switched service, and leads directly to the recognition that a network of any size automatically has several business interests taking part.
16.2.4 Reference Points and Domains

Recommendation G.807 identified three substantially different interfaces between:
• Service requester and service provider control plane entities (UNI),
• Control plane entities belonging to different domains (E-NNI), and
• Control plane entities belonging to one or more domains having a trusted relationship (I-NNI)

During the development of subsequent Recommendations, it became apparent that the term interface was a misnomer, as interfaces tend to suggest physical connections. In particular, while working on G.8080, it was recognized that the UNI, E-NNI, and I-NNI interfaces from G.807 are in fact reference points, in that they are logical and are described by the information flows across them. The flows themselves were characterized in terms of the network details that were exposed to or hidden from an information user. Each information flow supports a service (transport, signaling, routing, discovery),
and each service may have different endpoints. (The components responsible for routing are different from those involved in signaling, and are not necessarily co-located.) However, the reference point comprises all of these services. Thus, it is essential to distinguish logical reference points from the physical interfaces supported by signaling and routing protocols carried over a communications network.

The concept of a domain, which was used in a conversational sense within G.807, has evolved over time as the ASON requirements and architecture specifications have matured. Initially, it was considered useful from a requirements standpoint that all the equipment operated by a single carrier had its own group name, and the term domain was typically used. As work progressed on ASON architecture and routing requirements, the concept of a control domain was introduced and more precisely described as an architectural construct that provides for encapsulation and information hiding. The characteristics of the control domain are the same as those of its constituent set of distributed architectural components. It was understood that control domains are generally derived from architectural component types that serve a particular purpose, e.g., signaling control domains, routing control domains, etc. The nature of the information exchanged between control domains across the E-NNI reference point, for example, captures the common semantics of the information exchanged among its constituent components, while allowing for different representations inside each control domain. Continued discussion of call and connection control led to further insights regarding the relationship between operator policy and the establishment of domains.
In summary, the domain notion embodied in the G.805 definition of administrative and management domains, and in Internet administrative regions (e.g., Autonomous Systems), has been generalized in the control plane architecture to express differing administrative and/or managerial responsibilities, trust relationships, addressing schemes, infrastructure capabilities, survivability techniques, distributions of control functionality, etc. Thus, a domain represents, and is characterized by, a collection of entities that are grouped for a particular purpose; hence, there are different types of domains. Domains are established by operator policies and have a range of membership criteria [4].

As domains are established via operator policies, it was further recognized that interdomain reference points (i.e., UNI and E-NNI) are actually service demarcation points, i.e., points where call control is provided. With this understanding, we can speak of reference points between a user and a provider domain (UNI), between domains (E-NNI), and within a domain (I-NNI), where the
• UNI is a user-provider service demarcation point,
• E-NNI is a service demarcation point supporting multidomain connection establishment, and
• I-NNI is a connection point supporting intradomain connection establishment.

The fact that domains are created by policy, and have a range of membership criteria, should not be surprising. For example, when we introduced the concept of a subnetwork in Chapter 2, we indicated that subnetworks might be delimited according to a wide range of criteria, including such factors as administrative and/or management responsibility. Just as subnetworks are characterized by the points at their edge, with little regard to the equipment inside, so domains are characterized by the policies applied to the physical interfaces at their boundaries. The rationale for creating domains and subnetworks has everything to do with considerations that make sense to network operators, without the need for standardized rules for constructing them. It cannot be overly stressed, however, that reference points and domain boundaries are essential when the supported services are instantiated on some physical interface.
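The classification of reference points follows mechanically from who the two parties are and which domains they sit in. A toy Python classifier makes that rule explicit; the modeling is ours (G.8080 defines these as logical reference points, not as anything computable from two fields), and all names are illustrative.

```python
# Toy classifier for ASON reference points: UNI between a user and a
# provider domain, E-NNI between provider domains, I-NNI within one
# domain. The two-field model is a deliberate simplification.
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    domain: str
    is_user: bool = False   # True for a service requester (user side)

def reference_point(a: Entity, b: Entity) -> str:
    if a.is_user != b.is_user:
        return "UNI"      # user-provider service demarcation point
    if a.domain == b.domain:
        return "I-NNI"    # intradomain connection point
    return "E-NNI"        # interdomain service demarcation point

client = Entity("customer-site", is_user=True)
sw1 = Entity("carrier-A")
sw2 = Entity("carrier-A")
peer = Entity("carrier-B")

assert reference_point(client, sw1) == "UNI"
assert reference_point(sw1, sw2) == "I-NNI"
assert reference_point(sw1, peer) == "E-NNI"
```

Note what the sketch does not capture: a real reference point bundles several information flows (signaling, routing, discovery) with possibly different endpoints, which is exactly why the text insists these are logical points rather than physical interfaces.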
16.2.5 Architecture Principles

Summarizing the discussions of the preceding sections, and referring to Figure 16-5 below, the ASON architecture framework reflects the policy boundaries that exist in transport networks [20].
• Calls are end-to-end service associations. While call state is maintained at network access points, it may also be required at key network transit points where it is necessary or desirable to apply policy.
• When a call spans multiple domains, and hence E-NNIs, it is composed of call segments. Each call segment runs between a pair of call-state-aware points, and the concatenation of call segments creates the call.
• One or more connections are established in support of individual call segments. In general, the scope of connection control is limited to a single call segment, i.e., it does not typically span multiple call segments. The collection and concatenation of subnetwork connections and link connections provides end-to-end connectivity (i.e., the network connection).
Some examples of multidomain scenarios that require the scope of connection control be limited to a single call segment include the following [20]:
• The service is realized in different ways within each domain (e.g., technology, QoS).
• Separate address spaces are used within each domain, especially when separately administered.
• There is independence of survivability (protection/restoration) for each domain.
• There is a trust boundary.
[Figure: User 1 and User 2 connected across Domains 1 through 3; the end-to-end call is decomposed into call segments, each supported by connections built from link connections (LCs) and subnetwork connections (SNCs).]
Figure 16-5. Example of call with multiple call segments and connections
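The decomposition illustrated in Figure 16-5 — a call as a concatenation of call segments, each segment supported by one or more connections — can be sketched as a simple data model. This is an illustrative Python sketch only, not part of any Recommendation; all class and field names are our own invention.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Connection:
    """Connectivity within one call segment: a concatenation of
    link connections (LC) and subnetwork connections (SNC)."""
    endpoints: Tuple[str, str]
    pieces: List[str] = field(default_factory=list)  # e.g. ["LC", "SNC", "LC"]


@dataclass
class CallSegment:
    """Runs between a pair of call-state-aware points (e.g., UNI or E-NNI)."""
    ingress: str
    egress: str
    connections: List[Connection] = field(default_factory=list)


@dataclass
class Call:
    """An end-to-end service association: the concatenation of call segments."""
    caller: str
    callee: str
    segments: List[CallSegment] = field(default_factory=list)

    def is_end_to_end(self) -> bool:
        # Adjacent call segments must meet at the same call-state-aware point,
        # and the chain must run from caller to callee.
        if not self.segments:
            return False
        ok = (self.segments[0].ingress == self.caller
              and self.segments[-1].egress == self.callee)
        for a, b in zip(self.segments, self.segments[1:]):
            ok = ok and (a.egress == b.ingress)
        return ok


# Three call segments meeting at two E-NNI reference points, as in Figure 16-5.
call = Call("User1", "User2", [
    CallSegment("User1", "ENNI-1/2"),
    CallSegment("ENNI-1/2", "ENNI-2/3"),
    CallSegment("ENNI-2/3", "User2"),
])
assert call.is_end_to_end()
```

Note that connection scope is captured per call segment, mirroring the architectural point that connection control does not typically span multiple call segments.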
It is clear that a generic connection setup service exists that can be implemented with either management or signaling protocols. Further, the underlying protocols need not come from the telecommunications environment; all that is needed is verification that they can support established telecommunications needs [16]. When we consider solutions from the Internet environment, it should be recognized that these bring along underlying principles and architectural aspects. Thus, it is important to understand Internet architectural principles and their implications. The influence of the Internet "end-to-end" principle, and how it compares with the call/connection model found in ASON and other connection-oriented services [20], is particularly interesting. The Internet "end-to-end" principle was described in a 1984 paper by J. H. Saltzer, D. P. Reed, and D. D. Clark [21], who argued that certain required end-to-end functions can be performed correctly only by the end systems themselves. In RFC 3724 the principle "was articulated as a question of where best to put functions in a communication system" [22]. Again referring to RFC 3724 [22], "As the Internet developed, the end-to-end principle gradually widened to concerns about where best to put the state associated with applications in the Internet: in the network or at end nodes." As described in RFC 1958 [23] and in its update, RFC 3439 [24], "An end-to-end protocol design should not rely on the maintenance of state (i.e., information about the state of the end-to-end communication) inside the network. Such state should be maintained only in the endpoint, in such a way that the state can only be destroyed when the endpoint itself breaks." In
particular, "Hard state, state upon which the proper functioning of the application depends, is only maintained in the end nodes" [22]. Thus, the preferred architectural principle of the Internet has been that connection control should be end-to-end and that no service state should be maintained at transit points within the network. It should be observed that the desire to avoid holding some call state information at transit nodes differs from the fundamental ASON requirement to facilitate service handoff between domains. The desire to make the scope of connection control end-to-end also differs from the ASON architecture framework, where we have seen that in most cases it is necessary that connection control be scoped to a single call segment. It is important to understand these classical Internet architectural principles when considering requirements and candidate protocol solutions. However, it is equally important to understand that the Internet architecture is itself evolving. The increasing commercialization of the Internet has stimulated some rethinking of its underlying architecture [25-28]. Quoting from [25], "An architecture of tomorrow must take into account the needs and concerns of commercial providers if it is to be accepted and thus to be able to influence overall direction". Cited examples include the need for a framework for policy controls on interprovider routing, as well as support for a variety of payment models for network usage. In moving from purely technical considerations, which are of paramount importance for a community of users with shared goals and mutual trust, to a world in which the various stakeholders that are part of the Internet may have differing and sometimes conflicting objectives, a "tussle" [27] emerges.
This "tussle" requires accommodation in the evolution of the Internet architecture and results in additional design principles organized around such concepts as separation of concerns, enabling choice, supporting the ability to bill for value-added services, and trust issues. For example, quoting from [22], "...prior to designing the protocol, the trust relationships between the network elements involved in the protocol must be defined, and boundaries must be drawn between those network elements that share a trust relationship." In particular, some key principles that have been introduced include the following:
• "Modularize the design along tussle boundaries, even if there is no compelling technical reason to do so" [27].
• "Global communication with local trust"; "transparency modulated by trust" [28].
• "The [New Internet] architecture should include a general notion of regions, to express differing interconnection policies, trust relationships, multiplexing mechanisms, etc. Regions may also support distinct addressing regimes, performing any necessary address mapping as data crosses region boundaries" [28].
• "A future Internet should be designed without the requirement of a global address space" [28].
• "The Internet design should incorporate mechanisms that make it easy for a host to change addresses and to have and use multiple addresses. Addresses should reflect connectivity, not identity, to modularize tussle" [27].
The overall conclusions about the Internet end-to-end principle have been that "the end-to-end arguments are still valid and powerful, but need a more complex articulation in today's world". An illustrative example is where "the partitioning of functioning is done so that services provided in the network operate with the explicit knowledge and involvement of endpoints, when such knowledge and involvement is necessary for the proper functioning of the service. The result becomes a distributed application, in which the end-to-end principle applies to each connection involved in implementing the application" [22]. Restating, this is "a distributed approach in which the end-to-end principle applies to interactions between the individual pieces of the application, while the unbundled consequences, protection of innovation, reliability, and robustness, apply to the entire application" [22]. Examination of the principles articulated in next-generation architecture studies, and in the ASON architecture, shows a striking level of consistency.
16.2.6 Supporting Functions and Requirements

In this section we provide a summary of functions and requirements described in G.807.

16.2.6.1 Connection Management

In general, connections in a transport network are expected to be bidirectional and symmetric. This differs from packet networking, where the path in one direction is often independent of the path in the other and must be set up separately. However, the capability to handle unidirectional or asymmetric connections should be supportable, if desired. Thus, a fundamental control plane requirement is to support the following connection capability types (for either SC or SPC):
• Unidirectional point-to-point connection;
• Bidirectional point-to-point connection; and
• Unidirectional point-to-multipoint connection.
It is also required that the control plane provide support for multihoming, which involves support for more than one link between the end users and the network. This can be subdivided into multihoming to a single network operator (e.g., for the purpose of resilience or load balancing) and multihoming to multiple network operators. Control plane signaling and routing capabilities must then also permit a user to request diversely routed connections from a carrier that supports this functionality.

16.2.6.2 Routing and Signaling

To support connection management, G.807 identifies a routing function, which enables paths to be selected for the establishment of a connection through one or more operator networks. The routing function operates by ensuring that each element needing to select a path through the network has sufficient knowledge of network topology and resource utilization. In general, this is done by disseminating routing information throughout the network, creating several important trade-offs, which are dealt with in later sections of this chapter. Routing brings requirements related to structuring the network for scalability, usually met by hierarchical schemes that reduce the volume of data to be transferred or stored by removing detail as one moves up the routing hierarchy. This aspect is closely coupled to addressing schemes, which should allow for address summarization and which frequently achieve this by using the routing hierarchy to create an addressing scheme. Recommendation G.807 also introduced the primary connection management processes for signaling, including basic features for the UNI and NNI reference points at source and destination, as well as additional procedures that may also be supported. These processes, features, and procedures provided an outline for the specific behaviors and abstract messages in G.7713.
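The hierarchical summarization described above — removing detail as one moves up the routing hierarchy — can be made concrete with a toy sketch. This is illustrative Python only; the dotted address format is invented purely for the example and is not an ASON addressing scheme.

```python
def summarize(addresses):
    """One level of summarization: strip the lowest-level name and
    advertise only the shared prefixes, as a higher routing level would.

    Addresses are dotted hierarchical names, e.g. "ra1.sn2.n7";
    a trailing ".*" marks an already-summarized entry.
    """
    # Remove any existing summary marker before climbing one level.
    bare = {a[:-2] if a.endswith(".*") else a for a in addresses}
    prefixes = {b.rsplit(".", 1)[0] for b in bare}
    return sorted(p + ".*" for p in prefixes)


# Four node-level addresses in two subnetworks of one routing area.
detail = ["ra1.sn1.n1", "ra1.sn1.n2", "ra1.sn2.n1", "ra1.sn2.n2"]
print(summarize(detail))             # ['ra1.sn1.*', 'ra1.sn2.*']
print(summarize(summarize(detail)))  # ['ra1.*']
```

The volume of advertised routing data shrinks at each level, which is exactly the scalability property that couples the routing hierarchy to the addressing scheme.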
16.2.6.3 Naming and Addressing

Coupled with the previous discussion on addressing, and considering our earlier observation of the need for a strong abstraction barrier between user and provider, G.807 also states that service provider addresses should not be communicated to users. It must be observed that, in general, an address is a concatenation of names, and a name only needs to be unique within a prescribed context. In many cases these distinctions are unimportant, and the term identifier is used. An implicit ASON requirement is that identifiers be unique within a single plane, so transport, management, and control planes are not constrained to use different identifier values. Indeed, it is also an implicit requirement that a single plane may have several disjoint identifier spaces. This comes from the G.805 modeling techniques, which describe each network layer in isolation, leading to the usefulness of separate identifiers for each network layer. Further descriptions of identifier spaces are contained within G.8080 and G.7715.1 and are provided in Section 16.3.7. Some examples of drivers for separate user and provider identifier spaces include:
• Enabling customers to take their identifier with them should they relocate (number portability) and/or avoiding a need to change their identifier if an operator altered their internal network structure
• Avoiding client caching of provider addresses (security/privacy considerations aside), which could hinder the network's ability to evolve to larger/different address spaces or to reallocate internal addresses
• Ensuring a true multiclient server network (e.g., IP/MPLS, Ethernet, ATM, TDM). This does not imply that the same type of addressing could not be used in both user control planes and provider transport control planes, as long as the semantics of the two identifier spaces are separate and distinct. (This would also be in conformance with RFC 1958 [23], which mandates exactly the same principle by requiring that "the internet level protocol must be independent of the hardware medium and hardware addressing" and that "this approach allows the internet to decouple its addressing mechanisms from the hardware".)
• Assuring that identifier schemes are flexible enough to allow operators to use their existing routable addresses, when they so desire

16.2.6.4 Transport Resource Management

Before an ASON control plane can be used to set up connections, it is clear that transport resources must be available for it to do so, which is a planning and provisioning exercise. This is especially true when ASON is being added to an existing network, as those resources are already under the control of management systems.
An important provisioning requirement is to allow for variable resource partitioning between control and management, and to assure that operations that move resources between management and control responsibility do not affect the state of the resource. This requirement allows changes to be made to the network composition or responsibilities while the network is operating.

16.2.6.5 Admission Control

Both calls and connections require support functions to decide whether the network should admit a particular call/connection. Call admission control is performed before any connections are requested and may include checking the customer's privilege to make the call as well as checking the called party's willingness to receive the call. The latter can be done at various times depending on later design decisions. Connection admission control, in general, has to do with the availability of network resources to provide the connection. As mentioned earlier, G.807 does not explicitly discuss call control functions.
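The two-stage decision above can be sketched as follows. This is an illustrative Python sketch; the policy structure and the capacity model are assumptions made for the example, not anything defined by G.807.

```python
def admit_call(caller, callee, policy):
    """Call admission: check the caller's privilege to make the call
    and the called party's willingness to receive it."""
    return caller in policy["privileged"] and callee not in policy["refusing"]


def admit_connection(requested_bw, free_capacity):
    """Connection admission: are transport resources available?"""
    return requested_bw <= free_capacity


policy = {"privileged": {"userA"}, "refusing": {"userC"}}

# Call admission is evaluated first, before any connection is requested...
assert admit_call("userA", "userB", policy)
assert not admit_call("userA", "userC", policy)  # callee unwilling

# ...and only then is connection admission checked against resources.
assert admit_connection(requested_bw=10, free_capacity=40)
```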
16.2.7 Signaling Communications Network Requirements

Any distributed control system requires communications among the elements implementing the system, and the ASON control plane is no exception. The Signaling Communications Network (SCN) is an integral, yet independent, part of the overall control plane. To maintain the integrity of control message delivery, the SCN must meet several important requirements, including the need for control message reliability to be guaranteed in almost all situations, even during what might be considered catastrophic failure scenarios. There are significant differences between packet-switched and circuit-switched networks impacting the SCN. For example, in MPLS, the control plane messages and the data plane packets share the same transmission medium and hence the same reliability (i.e., the topologies are congruent). A failure affecting data packet forwarding also affects control packet forwarding and vice versa [29]. In contrast, within transport networks it cannot be assumed that the topology of the controlled network is identical to the topology of the network supporting control plane communications. Perhaps the most obvious example is an ASON control plane controlling a WDM transport network in which control plane signaling is carried on an Optical Supervisory Channel (OSC). As laser and receiver failures on different wavelengths are generally independent, if the OSC fails, it cannot be assumed that the traffic has failed. Furthermore, in an OSC failure scenario, it is necessary to find another route for the signaling traffic in order to reach the next node via some disjoint path. This separation allows the SCN to be independently designed and allows for optimum use of resources to achieve the desired control plane performance. The communications network designer is able, but no longer forced, to use the embedded channels provided by some, but not all, transport technologies. In particular, if adequately secured, a LAN or WAN could also be used. This flexibility is of particular interest when designing communications to support Switched Connections (SCs).
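The OSC-failure scenario above can be sketched as a search over the SCN topology with the failed link excluded. This is illustrative Python only; the four-node topology is invented for the example. The point is that the SCN need not be congruent with the transport topology, so signaling to the next node may have to take a disjoint path even though the traffic itself is unaffected.

```python
from collections import deque


def scn_route(graph, src, dst, failed=frozenset()):
    """Breadth-first search for a signaling path that avoids failed SCN links.

    graph: adjacency lists of the SCN (not the transport network).
    failed: set of failed links, each a frozenset of its two endpoints.
    """
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph[path[-1]]:
            link = frozenset((path[-1], nxt))
            if nxt not in seen and link not in failed:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # SCN partitioned: no signaling path remains


# A small SCN; direct link A-B plays the role of the OSC span.
scn = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}

print(scn_route(scn, "A", "B"))                                   # ['A', 'B']
print(scn_route(scn, "A", "B", failed={frozenset(("A", "B"))}))   # ['A', 'C', 'D', 'B']
```

When the A-B signaling span fails, control messages still reach the adjacent node B over a disjoint SCN path, while the data-plane wavelengths between A and B may be carrying traffic untouched.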
16.2.8 Support for Transport Network Survivability

Recommendation G.807 specifies that ASON must be able to support transport network survivability. Transport network survivability may be handled by classical protection approaches (i.e., via autonomous actions within the transport plane) or by ASON control plane actions. ASON can offer new mechanisms for survivability by using signaling mechanisms to reroute a failed connection. The discussion in G.807 distinguishes between protection and restoration based upon the usage of dedicated or shared resources. In later ASON work, the distinguishing characteristic for survivability is whether or not the control plane is involved. That is, protection is described as a mechanism for enhancing availability of a connection through the use of additional, assigned capacity, whereas restoration is described as involving the replacement of a failed connection by rerouting using spare capacity [4]. (We note that an implicit requirement is that the control plane itself be survivable.) Perhaps most important, G.807 states that user requests for explicit survivability mechanisms in a carrier network are not supported, because users should not have visibility into the internal details of the carrier network. However, the user is permitted to request diverse connections—that is, a group of connections with limited common routing.
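The diversity property can be verified by the provider without exposing any internal routing detail to the user: the user requests diversity, and the carrier checks the limited-common-routing property internally. A sketch (illustrative Python; the route representation as node lists is an assumption made for the example):

```python
def shared_links(route_a, route_b):
    """Links (as unordered node pairs) common to two routes."""
    def links(route):
        return {frozenset(pair) for pair in zip(route, route[1:])}
    return links(route_a) & links(route_b)


def is_diverse(routes, max_shared=0):
    """A group of connections has limited common routing if no pair of
    routes shares more than max_shared links."""
    return all(len(shared_links(a, b)) <= max_shared
               for i, a in enumerate(routes)
               for b in routes[i + 1:])


working = ["U1", "N1", "N2", "U2"]   # internal nodes never shown to the user
protect = ["U1", "N3", "N4", "U2"]
print(is_diverse([working, protect]))  # True: fully link-disjoint
```

The user sees only the boolean outcome of the request; the node names remain internal to the carrier.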
16.3. ARCHITECTURE (G.8080)
In this section we consider the architecture of the control plane as described in G.8080. Before doing so, however, it is useful to take a short diversion to understand the transport networking and management landscape before the development of G.8080 and how this factors into new developments in control plane technology. In essence, it is the need to interwork with legacy environments while providing a framework for future developments that dictated the nature of the tools and methods employed in the G.8080 architecture. The centralized management systems developed in the last ten years or so normally contain a database of the network (for the technology of concern) that can be used for a variety of purposes, including route calculation, circuit visualization, "plan and build" processes, inventory management, and capacity management. This database can also be related to other systems that provide features such as service management, fault management (within and between technologies), trouble ticket management, fiber and cable records, and so forth. It should be noted that network management capabilities vary considerably not only between operators but also within an operator's network among technologies and platforms. To some extent, this is a reflection of the size of the network, decisions regarding what to automate, and the maturity of the technology. In such management systems, the centralized route calculation drives the connection setup process by means of management protocols that communicate with management agents (in the form of element managers) in network elements. The communication with the network elements is generally by means of a data communications network (DCN), which is a router network dedicated to network management functions. This architecture can also be made hierarchical. A criticism often leveled at such an architecture is that there is a single point of failure. In reality, large network management centers have duplicated systems at a fallback center and more than one connection to a DCN. The DCN also provides resiliency, so that there is generally more than one way to reach a network element for communication via network management protocols. For many applications, such systems are perfectly capable of managing large networks. In many ways, these systems already provide all the functionality that is often associated with a control plane, including route calculation and signaling. Their major limitations come in two distinct forms:
• Connection control is too slow for switched circuits (as opposed to soft permanent connections). Management protocols, which are multipurpose vehicles, are simply too slow for this purpose; the task is much better suited to a specialized signaling protocol.
• Centralized control does not necessarily provide optimal response times for the purposes of restoration survivability mechanisms.
At the other extreme, sometimes depicted (mistakenly) as the control plane, are control systems located on every network element and communicating with one another via a communications network.
The centralized network management system plays no part in real-time activities such as route calculation processes, though network element static configuration parameters remain under network management control. The control systems carry out a routing information exchange process operating asynchronously in the background, and a connection setup process acts in real time between cooperating nodes. Here, each network element contains a routing table providing a set of alternative routes from itself to every other network element that shares routing information. These routing tables have the same information as the route search in the centralized architecture. However, the tables are computed using information obtained from a routing protocol whereby each network element exchanges routing tables with its neighbors. As such, a network element can update its own routing table to provide more current information regarding network topology. This aids the connection setup process and speeds up restoration that does not use pre-calculated routes. In the development of a control plane architecture, it could be assumed that the fully distributed model is all that is required. However, this model does not reflect the way in which control plane technology will be used in many networks, or the fact that control plane functionality can also be provided by network management protocols. This situation can be easily understood by considering the following scenario. A network operator with a large installed base of outer core network elements that are controlled using network management protocols wants to introduce control plane technology on a new generation of network elements deployed into the inner core. In such a case, the network operator is not going to "rip out" all the existing outer core network elements and replace them, and it may not be possible (or cost-effective) to upgrade them to provide control plane functionality. In such circumstances, the problem is how to provide end-to-end configuration of circuits in such an environment. One way of solving this problem is illustrated in Figure 16-6. The end-to-end configuration and route calculation are directed by the network management system. For those parts of the connection that can be set up using management protocols, the management system calculates the route and sets up the connection. For the control plane controlled portions of the network connection, the management system delegates functionality to the control plane. In effect, the control plane network elements are collectively seen as a virtual network element by the management system. The management system provides the input and output points for the connection(s) that traverse the control plane-enabled network elements and then leaves the actual route calculation and setup to the control plane.
In this scenario, it is clear that we have both central route calculation and distributed route calculation working in partnership, and two different sets of connection control protocols (one network management based, one signaling based) that interact with network elements in different ways. During the initial development of the control plane architecture in the ITU-T, it became evident that what was required was an architecture that allowed the control plane functionality to be distributed in any allowable fashion, e.g., to every network element, shared by a group of network elements, or centralized. Furthermore, depending on the set of functions that are required, some functions may be centralized and others distributed in a single instance of the architecture. As an example, signaling can be combined with centralized or distributed routing. This situation led to an architecture that has as its main tool the concept of a component, borrowed, and slightly modified, from object-oriented analysis and programming. The use of components also allows all the power of the Unified Modeling Language (UML) [30], and the software tools associated with it, to be applied.
[Figure: an OSS directs end-to-end configuration. In one part of the network, subnetwork connections are created by a management protocol with routing under the control of the OSS; in the other part, the OSS defines only the end points, and the control plane creates the subnetwork connections and performs the routing between those points.]
Figure 16-6. End-to-end configuration using a combination of network management and control plane functionality
Recommendation G.8080 is intended to provide a comprehensive model that takes into account various commercial relationships, organizational structures, and operational practices. The goal of the G.8080 architecture is to identify the external interfaces that must have protocols defined, while maintaining the ability to verify the architecture against operations scenarios and functional distributions that are required by the various requirements documents. Historically, the Telecommunication Management Network (TMN) Recommendations approached the problem from the point of view of objects viewed through an interface. This led to an equipment-centric, rather coarse interface, which did not allow easy distribution of the necessary functionality to the most appropriate network element. An attempt was made to improve upon this approach by using the techniques of the Reference Model for Open Distributed Processing (RM-ODP), which constitutes a framework of abstractions for the specification of open distributed systems. By enabling a separation of concerns, it provided a means for separating the logical specification of required behaviors from the specifications of physical architectures implemented to realize them [16].
The application of RM-ODP to telecommunications has been specified in ITU-T Recommendations G.851-01 [31], G.852-01 [32], G.853-01 [33], and G.854-01 [34]. These techniques take the approach of viewing the desired system from the point of view of the Enterprise, Information, Computation, and Engineering decisions that have to be made. The end result is a more fine-grained collection of interfaces, each providing a simple service that can be assigned to network elements to provide a wide range of solutions, each meeting different needs. However, this end result did not lend itself very well to constructing scenarios to verify that the interfaces specified were necessary and sufficient, since a tremendous amount of system behavior actually occurs within a network element, and internal implementations are not subject to standardization. In order to avoid the verification problems of the earlier work, G.8080 created a component architecture to facilitate the construction of reasonable scenarios. In UML, a component is defined as "a physical and replaceable part of a system that conforms to and provides the realisation of a set of interfaces" [30]. A component in this sense typically represents the physical packaging of otherwise logical elements, including classes and interfaces. In the context of G.8080, a component is defined as "an element that is a replaceable part of a system that conforms to and provides the realization of a set of interfaces". The subtle difference is that a component in G.8080 represents an abstract entity rather than a piece of implementation code. Thus, in this context, components represent logical functions rather than physical implementations. With this in mind, UML can be used in describing the G.8080 architecture. The means of deciding upon interfaces used the same analysis and design techniques from the RM-ODP application to telecommunications. 
Components were created by considering the lifetime of the objects in the system and the span of control of the resulting component. The result is a small set of components that support a wide range of implementation choices and allow scenarios to be constructed to validate the architecture against requirements. It is important to realize that the G.8080 architecture specifies components and interfaces on a per G.805 layer network basis. In what follows, we first consider how the control plane views network resources and then consider the components that make up the control plane. We note that the architecture described in G.8080 not only applies to connection-oriented networks but could also be employed, with some modifications, in connectionless networks. This outcome could be achieved by describing the transport network using G.809, which is a flow-based version of G.805. To give a hint as to how this might be accomplished, consider a flow in the limit which consists of a single packet. This packet can be considered as a self-describing, short-lived connection that simply uses and releases transport resources as it moves through the network. Alternatively, consider the connection setup and release process operating at a faster and faster rate. Aside from some minor changes to accommodate terminology differences between G.805 and G.809, the only major change to the G.8080 architecture would be the removal of the concept of a call, which simply becomes a null function in the architecture. With this in mind, G.8080, with appropriate modifications, can be used to describe existing and future connection-oriented and connectionless control planes. We can therefore conclude that G.8080 can be used as the basis of a control plane architecture for any transport technology.
16.3.1 The Control Plane View of the Transport Network

The description of G.805 transport network functions makes no reference to the control and management of these functions. Depending upon the desired control or management view (e.g., connection, fault, or performance management), not all aspects of transport network functionality are of relevance. Thus, it is necessary to abstract the particular aspects of transport network functions that contain information relevant to the specific view. From the perspective of control, the relevant view is concerned with the management of connections. Some key abstractions that are relevant to the control plane are illustrated in Figure 16-7 and enumerated below:
• The subnetwork points (SNPs) that need to be associated to form a connection (these are simply an abstraction of the connection points (CPs) in G.805)
• The subnetwork connection (SNC), which represents the dynamic relationship between SNPs on a subnetwork
• The link connection (LC), which represents a static relationship between SNPs in different subnetworks
• A set of SNPs that can be grouped for the purpose of routing, thereby forming a subnetwork point pool (SNPP)
• An SNPP link, which is a link associated with SNPPs in different subnetworks; the link contains LCs formed by the association of SNPs
[Figure: subnetworks containing SNPs, related internally by SNCs and between subnetworks by link connections.]
Figure 16-7. The relationship between entities in the transport plane, the management plane, and the control plane [4]
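The abstractions enumerated above (SNP, SNC, LC, SNPP) can be sketched as a data model. This is an illustrative Python sketch, not taken from G.8080; all class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SNP:
    """Subnetwork point: the control plane abstraction of a G.805 CP."""
    subnetwork: str
    name: str


@dataclass
class SNC:
    """Subnetwork connection: a dynamic relationship between SNPs
    on the same subnetwork."""
    a: SNP
    z: SNP


@dataclass
class LinkConnection:
    """A static relationship between SNPs in different subnetworks."""
    a: SNP
    z: SNP


@dataclass
class SNPP:
    """Subnetwork point pool: SNPs grouped for the purpose of routing."""
    name: str
    snps: List[SNP] = field(default_factory=list)


sn1_in = SNP("SN1", "p1")
sn1_out = SNP("SN1", "p3")
sn2_in = SNP("SN2", "p1")

snc = SNC(sn1_in, sn1_out)            # dynamic, across one subnetwork
lc = LinkConnection(sn1_out, sn2_in)  # static, between subnetworks
pool = SNPP("SN1-east", [sn1_out])    # grouping for routing
```

Note how the model enforces the distinction in the text: an SNC relates SNPs within one subnetwork, while an LC relates SNPs in different subnetworks.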
Another key abstraction that is required for the purposes of routing is the routing area (RA). An RA is defined as being composed of a set of subnetworks, their interconnecting SNPP links, and the SNPPs that represent the ends of SNPP links exiting the RA (illustrated in Figure 16-8). This setup allows links to be addressable within the RA, hence allowing for step-by-step routing. In contrast, for a subnetwork, only the ends of the link connections are visible from within the subnetwork. We note that the critical distinction of link end visibility is only important to an observer inside the routing area. From the outside, subnetworks and RAs are identical, and this causes the terms subnetwork and RA to be used almost synonymously. The distinction between the two is usually obvious from the context.

In the context of routing discussions in G.7715 and G.7715.1, the term node was adopted to denote either a subnetwork or an RA. This decision was based upon the earlier definition of RA within G.8080, where the lowest limit of recursion of an RA was two subnetworks interconnected by a link. In fact, with the updated definition provided in G.8080 Amendment 2 (i.e., the lowest limit of recursion of an RA is a subnetwork), a node and an RA are considered synonymous. As a result, RAs also have the property of recursive containment similar to subnetworks. This property enables support for hierarchical routing schemes. Recommendation G.7715 defines how successive sets of contained RAs form a routing hierarchy. Routing areas are thus the key concept that matches the organization of the control plane to the organization of the transport plane.

We note that the scope of the management abstractions for the CTP and TTP objects is different from those in the control plane, reflecting their
different roles and areas of responsibility. This distinction is also immediately apparent by describing the resources that the control plane manipulates in terms of G.805 architecture constructs.
Figure 16-8. Routing Areas, subnetworks, SNPs and SNPPs [4]
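As a rough illustration of how these abstractions relate, the following Python sketch models SNPs, SNPPs, link and subnetwork connections, and the recursive containment of routing areas. The class and attribute names are ours, chosen for illustration; G.8080 defines these entities abstractly, not as data structures.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class SNP:
    """Subnetwork point: control-plane abstraction of a G.805 connection point."""
    index: str

@dataclass
class SNPP:
    """Subnetwork point pool: a set of SNPs grouped for the purpose of routing."""
    name: str
    snps: List[SNP] = field(default_factory=list)

@dataclass
class SubnetworkConnection:
    """Dynamic relationship between SNPs on the same subnetwork."""
    ingress: SNP
    egress: SNP

@dataclass
class LinkConnection:
    """Static relationship between SNPs in different subnetworks."""
    a_end: SNP
    z_end: SNP

@dataclass
class RoutingArea:
    """An RA contains subnetworks/child RAs and their interconnecting SNPP
    links; per G.8080 Amendment 2, its lowest recursion is a subnetwork."""
    name: str
    children: List["RoutingArea"] = field(default_factory=list)

    def depth(self) -> int:
        # Recursive containment is what supports hierarchical routing schemes.
        return 1 + max((c.depth() for c in self.children), default=0)
```

For example, a routing hierarchy of RAs nested two levels deep reports a depth of three, mirroring the successive sets of contained RAs described in G.7715.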
Now that we have introduced the abstractions that the control plane needs in order to manipulate resources in the transport plane, we can turn our attention to the manner in which this outcome is achieved.
16.3.2 Identifying Components

The RM-ODP-based methods used to construct the G.8080 architecture focus on a single aspect (called a viewpoint) of the problem and solution space at a time. The ODP viewpoints are as follows:
1. Enterprise viewpoint, which is concerned with the purpose, scope, and policies that govern the activities of the specified system
2. Information viewpoint, which is concerned with the types of information handled by the system, together with applicable constraints
3. Computational viewpoint, which is concerned with the functional decomposition of the system solution into a set of objects that interact at interfaces, thus enabling system distribution
4. Engineering and Technology viewpoints, which are concerned with the infrastructure and choice of technology used to support the system
As the intent of standards is to allow interworking, rather than to prescribe implementations, G.8080 makes the most use of the Information
and Computational viewpoints and limits itself to component interfaces that are essential to distribution.

The Computational viewpoint is concerned with objects (in this context, G.8080 components) and interfaces. One may wonder how the objects/components in G.8080 were identified, as every Object-Oriented design method has its own description of how to find objects. Recommendation G.8080 built on existing work from the space of distributed transport management applications, which provided a basic set of objects that support network topological aspects, and then some more general principles were applied to identify additional components. This work is further described below.

Work prior to G.8080, involving centralized control via management interfaces, assigned a subnetwork performer computational object [16] to manage the subnetwork and the link. The subnetwork performer embodies all the information there is to know about the subnetwork, with the most important information being the internal structure of the subnetwork (its internal subnetworks and links), which is essential to be able to route a connection across the subnetwork. Similarly, one can consider a link performer computational object that embodies all that can be known about the link—in particular, its composition in terms of sublinks and individual link connections. This performer would also be responsible for preventing the same link connection from being allocated to more than one connection request.

Recommendation G.8080 takes many of the ideas associated with these performers, rather than the performers themselves, as axiomatic. In a distributed system, there is no single platform that can support these performers as single objects. Further, it was recognized that not all aspects of a performer need to be available at the same location or at the same time.
This leads to a different distribution of routing information from signaling control (and hence a different collection of components and interfaces). Recommendation G.8080 components (i.e., RC and CC, and LRM, respectively, as described in Section 3.4) reflect the distribution of performer operations, factoring in the above considerations. The services these performers offer are realised by the collaborative interactions among their associated components. A final critical consideration involves component lifetime and degree of coupling with other components (and coupling with the transport infrastructure). This consideration has led to only one component being directly involved in any aspect of the underlying transport hardware (i.e., TAP, as described in Section 16.3.4).
16.3.3 General Component Properties and Special Components

The component and its interfaces are illustrated in Figure 16-9. The interfaces are defined based on the operations that the component is intended to carry out and are, therefore, component specific. Recommendation G.8080 also defines some component properties, expressed as special interfaces that every component can be assumed to have, though these are not mandatory. These special interfaces allow monitoring of the component operation and dynamically set internal policies and behavior. In addition, a special class of component (Port Controller) is provided to deal with external policies such as security. For example, one role of the Port Controller component is to validate that an incoming user connection is sending traffic according to the parameters that have been agreed upon in the service-level agreement.
Figure 16-9. The component and its interfaces
The general component model describes protocol-neutral interfaces, which exchange primitives between components. There is one exception to this in the form of the Protocol Controller component class. This class combines several primitive streams into external protocols, which enable various distributions of the components among physical platforms.
16.3.4 Component Overview The control plane architecture can be described by means of a library of components that are illustrated in Figure 16-10.
Figure 16-10. The control plane component library
The components are summarized below.

1. Routing Controller (RC)
The Routing Controller component is derived by distributing some functionality of a more abstract object called the Subnetwork Performer, which has complete information about all the contained nodes and links within its RA. Routing Controllers belonging to the same RA cooperate to ensure that each RC has a complete view of the internal RA topology. This cooperation takes place via a Routing Protocol and the results are made available to other components via the routing table.

2. Connection Controller (CC)
The Connection Controller component is also derived from the Subnetwork Performer. Connection Controller components cooperate to set up connections. This is done by consulting the path computation function in the RC, which then returns the set of nodes and links to be traversed in order to reach the specified endpoint.

3. Link Resource Manager (LRM)
The LRM is derived from the Link Performer, which knows all there is to know about a link, by distributing functions to both link ends (LRMA and LRMZ). LRMs are responsible for managing the resources available to the link and allocating a specific link connection when requested. LRMs also cooperate to avoid two connections being allocated to the
same link connection when the connections are being set up from each end of the link.

4. Calling/Called Party Call and Network Call Controllers (CCC, NCC)
Call Controller components cooperate to control the setup, release, and modification of calls. They are relevant to service demarcation points (i.e., UNI, E-NNI). As discussed earlier, these service demarcation points are established via inter- and intraoperator policies. Such policies can be applied in several different ways; i.e., either centrally or on each switch or at signaling aggregation points. (The first and the last imply a different distribution of function from either Connection Controllers or Link Resource Managers; this differing distribution dictates that call controllers are different components from either CCs or LRMs.)
Network Call Controllers (NCCs) are relevant at the E-NNI and UNI (on the network side) service demarcation points. It is the NCC that makes the choice of technology to support the service by translating service requests into technology choices. This setup meets the domain boundary opacity requirements and allows the network to be most flexible. The NCC also handles other aspects of calls, such as restoration. The restoration architecture, to be described in Section 16.3.9, supports restoration between domain edges. The need for restoration, or lack thereof, is a call property, and the activation of restoration is within the lifetime of a call. Thus, there is no need for an additional component to support restoration.
The case of the end user of the network is special, and because of this two additional call controllers have been defined. The Calling and Called Party Call Controllers (CCC) are relevant to the user-provider service demarcation points and are the components that access the network on behalf of the end user of the service.

5.
Protocol Controller (PC)
ASON components are defined on a per-layer network (G.805) basis and, where appropriate, at a single level in the routing hierarchy. They communicate over abstract interfaces using primitives, so called to distinguish logical communications between component interfaces from communication via an implemented protocol over physical interfaces. Protocol Controllers shield the components from any protocol details that are irrelevant to the component. (An example would be reliable message transport; the component assumes it, and a protocol controller provides it. Another example is a layered security architecture: security and authentication are provided by Protocol Controllers, not by the components themselves.) Protocol Controllers also allow primitives from several components to be merged into a single
message stream, thereby allowing implementations that handle as many layers and levels as is useful.

6. Discovery Agent (DA)
The Discovery Agent deals with network resources that have not yet been assigned to a layer network. (An example could be cross-connects that can switch a wide range of signal types.) The DA is derived by distribution of an abstract object that knows about all the uncommitted resources and learns, or verifies, how they are interconnected. After this learning/verification, the resources can be assigned to the desired layer network or link.

7. Termination and Adaptation Performer (TAP)
All networks are ultimately supported by physical equipment, which needs to be controlled at some point in its lifetime. However, it is not necessary for ASON components to know anything about the hardware supporting the network, as ASON operations are hardware independent. The TAP is the only ASON component that understands hardware and must therefore be collocated with that hardware. The role of the TAP is to hide the details of physical equipment operation from ASON. Examples of hidden operations include the adjustment of the adaptation function when hardware capable of supporting several signal types is used, and the suppression of alarms when a link connection is left intentionally unused.

We note that there is also a Traffic Policing component (TP), which is a subclass of the Port Controller component described in Section 16.3.3. When an incoming user connection sends traffic that violates the agreed parameters, the TP may instigate measures to correct the situation. Traffic policing is important when dealing with packet switched networks, and is included in ASON for completeness. However, it has no function in conventional circuit switched networks.
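The LRM's responsibility for preventing double allocation of a link connection can be sketched as follows. This is a hypothetical single-process model in which a lock stands in for the cooperation protocol between the A-end and Z-end LRMs; it is an illustration of the component's role, not an implementation prescribed by G.8080.

```python
import threading

class LinkResourceManager:
    """Sketch of the LRM role: manage the pool of link connections on one
    link and ensure no link connection is ever allocated to two different
    connections, even when requests arrive from both ends concurrently."""

    def __init__(self, link_connections):
        self._free = set(link_connections)
        self._allocated = {}           # link connection -> connection id
        self._lock = threading.Lock()  # stand-in for A/Z-end cooperation

    def allocate(self, connection_id):
        """Return a free link connection, or None if the request is rejected."""
        with self._lock:
            if not self._free:
                return None
            lc = self._free.pop()
            self._allocated[lc] = connection_id
            return lc

    def release(self, lc):
        """Return a link connection to the free pool."""
        with self._lock:
            self._allocated.pop(lc, None)
            self._free.add(lc)
```

A connection request that arrives when all link connections are in use is rejected rather than silently sharing a resource, which is exactly the behavior the CC relies on in step 3 of the connection-setup sequence described later.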
16.3.5 Interlayer Modeling
As discussed earlier, the G.8080 architecture specifies components and interfaces on a per-G.805 layer network basis. For a transport network that supports multiple adaptations, an ASON instantiation could logically contain multiple ASON control planes, one for each layer network. A client with a UNI interface could request different layer services from the same UNI implementation. In such a scenario, there is no dependence between calls requesting connection services at different layers. On the other hand, transport services exist where the client layer has no resources in the network except at the edges. An important example of this scenario is when
Ethernet traffic is carried across a SONET/SDH network, in which there are no Ethernet switches or Ethernet links, on behalf of a client that has requested Ethernet service. In addition to the Ethernet/SDH client/server example, the interlayer model also applies to the relationship between a layer network that supports virtual concatenation and its server layers. Thus, it is important to be able to model the associated interlayer interactions that must be supported in such scenarios.

The above scenarios have been addressed in recent G.8080 developments related to the extension to the Network Call Controller (NCC) component to include an interlayer interface that enables it to have a relationship with that server layer call. This relationship is recursive so that a set of related adaptations is formed. In other words, the NCCs display a recursive G.805 client/server layering relationship. This characteristic is analogous to the stack of adaptations represented by the TMF 814 PTP (Physical Topological Link Termination Point) construct [35]. This setup may be viewed as creating a "stack" of NCCs at different G.805 layers. Wherever an adaptation occurs in a stack of adaptations, an NCC at that layer is created. The decision to use an interlayer NCC interface is driven by policy, as there may be choice regarding which server layers to use.

Figure 16-11 illustrates a two-layer example. In this example, Layer A does not have a connection between its NCCs because that transport layer is not present between them. Instead, an adaptation to a server layer (layer B) exists. Associations (labeled 1 and 2) between NCCs at the two layers exist to correlate the service at layer B being used by the client layer A. The model can be generalized to support multiple client NCCs with a single server NCC.
Figure 16-11. Layered Network Call Controllers
The interlayer interface to the NCC enables a client NCC to initiate the relationship with a server NCC or vice versa. When a server NCC initiates
the relationship, it presents a pair of SNPs (ends of a network connection) that can be used by the client layer for transferring client CI. The connection presented is able to transfer client CI, and no call action at the server layer is initiated. This process is used for an operation where a server layer has already established a call and this connection is presented to the client layer at a later point in time. The client layer may accept or reject the use of the offered SNP pair (connection).

This model accommodates the business scenario where the adaptations occur in a single administrative domain as well as in multidomain scenarios (e.g., a scenario in which each layer network is operated by a different carrier). In the latter case, the NCCs may be on different platforms, and the interface between them may need to be exposed. In both cases, the instantiation of the NCCs is still independent on a per-layer basis. For example, a server layer may have a centralized NCC, whereas the client layer may have distributed NCCs.

Other components do not require interlayer interactions because once an NCC determines that resources from its layer are to be used to support the call, subsequent actions are taken only within that layer (especially connection control and routing control). This setup confines interlayer knowledge and actions to the NCC. Note that this process differs from sending information about multiple layers in a protocol controller that serves multiple layers (e.g., routing) because the interlayer call model maintains a client/server layer relationship, whereas sending multiple layer information over an interface does not imply that the information between layers is correlated in any way.
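The recursive "stack" of NCCs and the accept/reject handling of an offered SNP pair might be sketched as follows. The class and method names (e.g., offer_connection) are invented for illustration; G.8080 specifies this behavior abstractly, not as an API.

```python
class NetworkCallController:
    """Sketch of the interlayer NCC relationship: a server-layer NCC offers
    an already-established connection (a pair of SNPs) to a client-layer
    NCC, which may accept or reject it according to local policy."""

    def __init__(self, layer, accepts=lambda snp_pair: True):
        self.layer = layer
        self.server = None       # client/server association, at most one here
        self._accepts = accepts  # policy hook deciding which offers to accept

    def offer_connection(self, client, snp_pair):
        """Server side: present an SNP pair (ends of a network connection)
        to a client-layer NCC; no call action at this layer is initiated."""
        if client._accepts(snp_pair):
            client.server = self  # record the interlayer association
            return True
        return False

    def server_stack(self):
        """Walk the recursive G.805 client/server layering from this NCC down."""
        ncc, stack = self, []
        while ncc is not None:
            stack.append(ncc.layer)
            ncc = ncc.server
        return stack
```

For instance, an SDH-layer NCC offering a connection to an Ethernet-layer NCC yields the two-element NCC stack of the Figure 16-11 example, while a client whose policy rejects the offer leaves no association in place.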
16.3.6 Distribution Models

The G.8080 component architecture identifies components in such a way that the most commonly used distributions of functionality are supported. However, before discussing actual distributions, it is useful to discuss architectural principles that result in some components being fixed. These components are called anchored components (or anchored objects in other contexts). As G.8080 provides the foundation for specification of external interfaces needing standardization, an important principle is that of reducing the number of different interfaces, as well as simplifying those interfaces. For example, we see that the CC provides the same interfaces regardless of the size of subnetwork being controlled. Thus, rather than creating a completely new interface to a switching element, we simply note that the lowest level CC is anchored to the switch, i.e., it is fixed in the switching equipment. All other connection controllers can be freely distributed anywhere in the network. Similar arguments apply to the DA and TAP
components, which are similarly anchored to the equipment they are responsible for. These decisions reduce interface variation and complexity. The architecture is designed to support the independent distribution of routing, switch operation, link control, and call control. A wide range of system designs, ranging from almost all functions being centralized to almost all functions being fully distributed, is possible using the same architectural components and standard interfaces.
16.3.7 An Example of Components in Action

So far we have limited our discussion to the structural modeling of the control plane by identifying the types of component that are of interest and their interfaces. Describing the associations between components can identify further structure. This modeling of the static aspects of a system allows us to describe and specify the things that make up the control plane. UML provides tools for achieving this by means of class diagrams and object diagrams. However, what is really of interest in the G.8080 architecture is the interaction of control plane components with one another. This interaction occurs as a result of messages being exchanged between a group of components to achieve some defined purpose. An interaction and messages can be formally defined as follows [30]: "An interaction is a behavior that comprises a set of messages exchanged among a set of objects within a context to accomplish a purpose. A message is a specification of a communication between objects that conveys information with the expectation that activity will ensue". UML allows the dynamic aspects of the system to be described using interaction diagrams. This form of diagram allows several forms of action to be modelled as indicated in Table 16-1.

Table 16-1. Messages and actions

Message Type   Action
Call           Invoking an operation on a component. A component may send a message in the form of a call to itself.
Return         Returns a value to the requestor
Send           Sends a signal to a component. A signal is a named object that is sent asynchronously, e.g., an exception
Create         Creates a component
Destroy        Removes a component
(Note that the meaning of call in UML is in the context of UML as a modeling language and not as described in the network context.)
Architecting the Automatically Switched Transport Network
587
Interactions may simply be between two components. Alternatively, a message transmitted from a component to a second component may result in the second component generating a message that interacts with a third component and so forth. In this case, it is often useful to include information regarding the sequence of the messages. Such interaction diagrams in UML are referred to as sequence diagrams. An interaction may be between components of the same type or a set of components that contains different types of components. An example of the former in the control plane is the exchange of routing information between Routing Controllers. This is illustrated in Figure 16-12. Messages are exchanged via the Network Topology interface on each RC, and this information is used to configure routing tables with network topology information and network topology update information. An example of different components interacting is in the setting up of a connection. A simplified example is illustrated in Figure 16-13. The reason why we simplified the discussion is that connection setup behavior actually depends on the means of routing, e.g., hierarchical routing, step-by-step or source-based routing. G.8080 describes these interactions in detail. In our simplified example, the role of the RC is different from that of the previous example, and the interaction with the component is by means of a different interface.
Figure 16-12. Exchange of routing information
The sequence of events is as follows:
1. A connection request arrives at connection controller a (CCa).
2. The CCa component queries the routing component (RC) by means of a Route Query request. The RC returns the outgoing link to be used.
3. The CCa component then interacts with the Link Resource Manager to allocate SNP link connections. The LRM responds with acceptance or rejection of the request.
4. Once the CCa component receives confirmation from an LRM that the connection request has been accepted, a subnetwork connection can be established across the subnetwork controlled by the connection controller.
The remaining parts of the sequence show the flow of confirmations that the connection has been set up. This continues until the confirmation is returned to the original user.

The above is a much simplified version of an interaction, provided for illustrative purposes. Recommendation G.8080 provides more detailed interaction diagrams for connection setup and call control. However, G.8080 does not describe all possible interactions between components. This is intentional, as more detailed interactions are protocol specific and can be described in recommendations derived from G.8080.
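The same simplified sequence can be sketched in Python. The component names follow the text (CC, RC, LRM), but the method signatures and the single-capacity LRM model are hypothetical simplifications, not anything defined by G.8080.

```python
class ConnectionController:
    """Sketch of the simplified connection-setup interaction: the CC queries
    the RC for a route, asks its LRM for an SNP link connection, then either
    forwards the request to the next CC or confirms back to the requester."""

    def __init__(self, name, rc, lrm, next_cc=None):
        self.name, self.rc, self.lrm, self.next_cc = name, rc, lrm, next_cc

    def connection_request(self, dest):
        link = self.rc.route_query(dest)      # step 2: route query to the RC
        lc = self.lrm.allocate(link)          # step 3: SNP link connection request
        if lc is None:
            return ("rejected", self.name)    # LRM rejected the request
        if self.next_cc is not None:          # step 4: next subnetwork's CC
            status, _ = self.next_cc.connection_request(dest)
            if status != "confirmed":
                return ("rejected", self.name)
        return ("confirmed", self.name)       # confirmation flows back (step N)

class SimpleRC:
    def route_query(self, dest):
        return f"link-to-{dest}"              # the outgoing link to be used

class SimpleLRM:
    def __init__(self, capacity):
        self.capacity = capacity
    def allocate(self, link):
        if self.capacity == 0:
            return None
        self.capacity -= 1
        return (link, "lc")
```

Chaining two CCs with one free link connection each confirms the first request end to end; a second request is rejected once the LRM's resources are exhausted.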
Figure 16-13. Setting up a connection
16.3.8 Identifier Spaces When working with a distributed control plane, a consistent set of identifiers needs to be developed for use in the configuration and operation
of the signaling and routing protocols. Recommendation G.8080 has recognized four categories of identifier spaces that are used in the ASON architecture, specifically:
• Identifiers for transport resources used by the control plane
• Identifiers for control plane components
• Identifiers locating control plane components in the DCN
• Identifiers for transport resources used by the management plane

Before discussing these identifiers and their use, it is necessary to define the two types of identifiers that exist, specifically names and addresses. Names are identifiers used to reference an instance. It should be understood that more than one name may exist for an instance, and a particular name may only be valid within a specific scope. Moreover, names for an instance are allowed to change. However, since names do not imply any specific structure of a set of instances, they are not required to be summarizable. Addresses are identifiers that locate an instance in a topology. While addresses may be composed of names, whenever a name used as a component in an address changes, the location in the topology does not. Since addresses are defined in terms of locations in a topology, they are inherently summarizable. This means that there is a common prefix for all instances located within the common part of the topology.

Two examples of a name are the number for a mobile phone and freephone (or "1-800") numbers. In each case, a mapping is required from the name to an address. This can be accomplished with a directory function. Addresses and names have a scope associated with them, and the larger the scope, the more elements are needed in the identifier itself. At the highest scope (i.e., global), the identifier is complete in the sense that no further information is needed. Within a smaller scope, the identifier is relative to that scope and does not have the same meaning outside of that scope.
For example, global telephony numbers include country codes whereas within a city, shorter numbers may be used but are relative only to that city. We observe that the terms name and address were not cleanly distinguished in standards documents through 2004, and consequently there is inconsistent usage of these terms. In the text below, we will use the term identifier unless the distinction is critical to the understanding of the discussion. It should also be emphasized that the syntax of an identifier format does not imply that the identifier is a name or address. For example, usage of an IPv4 syntax for a particular identifier does not imply that it represents an IP address. For example, when such syntax is used to identify transport resources, these identifiers are clearly not IP addresses. The various identifier spaces, and their relationships, are illustrated in Figure 16-14. Here, the transport plane resources are multiply identified by
name spaces in other planes. The OAM identifiers in the management plane include equipment names and CTP names. Within the control plane, both UNI/E-NNI Transport Resource and SNPP identifiers are also applied to transport resources.
Figure 16-14. Identifier spaces and relationships [14]
16.3.8.1 Transport Resource Identifiers

Transport resource identifiers are used by ASON control components to refer to G.805 transport plane resources. Two such identifiers are used [4, 36, 37]: UNI/E-NNI Transport Resource and SNPP. UNI Transport Resource identifiers are used to identify transport resources at a UNI reference point, if they exist. Similarly, E-NNI Transport Resource identifiers are applied to transport resources at an E-NNI reference point. They represent the resources between the client and network (or between networks), not the transport network endpoints. These identifiers were referred to as UNI Transport Resource addresses in the 2001 version of G.8080, and as Transport Network Assigned (TNA) addresses in OIF UNI 1.0 [36]. However, the context of their usage indicates that they are, in fact, names that the calling party call controller and network call controller use to specify destinations in making a call (see Section 16.8). This was recognized and reflected within OIF E-NNI 1.0 [37] and G.8080 Amendment 2, i.e., the term UNI Transport Resource name is used in this context.
SNPP identifiers provide a link context for SNPs and are used by the control plane to identify transport plane resources. It is important to note that control plane component identifiers cannot be used as SNPP identifiers because they are from the wrong space. The G.8080 architecture allows for multiple SNPP identifier spaces to exist for the same transport resources. An SNPP address must be unique within the RA terminating the SNPP link. In general, an SNPP address is the concatenation of the names of any enclosing RAs, the lowest subnetwork, and any additional link context. This scheme allows SNPs to be located at any routing level. The SNP address is derived from the SNPP address concatenated with a locally significant SNP index. Depending on the scope of an SNPP, not all elements of the address are needed. For example, an SNPP address within the scope of a matrix (i.e., the smallest subnetwork) may have just the matrix identifier and a link identifier. An SNPP address at the top of the routing hierarchy may have just an RA identifier, and an SNPP address in the middle of the routing hierarchy may have a sequence of enclosing RAs plus a link identifier.

UNI/E-NNI Transport Resource names are distinct from SNPP addresses because of the G.807 constraint that users should not be given internal network addresses. UNI Transport Resource names must be bound to SNPP addresses in order to enable routing across "the network" between transport resource names at the A-end and Z-end of a call. This binding may be changed without changing service aspects. In order for connection management to establish a connection to a destination UNI Transport Resource name, an address resolution function is needed to map it to a corresponding far-end SNPP address (or addresses).
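The SNPP/SNP address construction and the name-to-address resolution function described above can be sketched as follows. The "/" separator, the example RA and matrix names, and the directory contents are illustrative assumptions only; G.8080 does not mandate a textual address syntax.

```python
def snpp_address(ra_names, subnetwork, link_context=None):
    """Build an SNPP address as the concatenation of any enclosing RA names,
    the lowest subnetwork, and any additional link context."""
    parts = list(ra_names) + [subnetwork]
    if link_context:
        parts.append(link_context)
    return "/".join(parts)

def snp_address(snpp, snp_index):
    """An SNP address is the SNPP address plus a locally significant index."""
    return f"{snpp}#{snp_index}"

# Hypothetical directory function mapping UNI Transport Resource names to
# far-end SNPP addresses. Because the map is multi-valued in both directions,
# it accommodates the 1:n, n:1, and m:n relationships.
directory = {
    "customer-A": ["RA1/RA1.2/matrix7/link3"],
    "customer-B": ["RA1/RA1.2/matrix7/link3",   # n:1 — two names, one SNPP
                   "RA1/RA1.9/matrix2/link1"],  # 1:n — one name, two SNPPs
}

def resolve(uni_tr_name):
    """Address resolution: UNI Transport Resource name -> SNPP address(es)."""
    return directory.get(uni_tr_name, [])
```

Note that because addresses are built from enclosing-RA prefixes, all SNPPs within RA1 share the "RA1/" prefix: the summarizability property that names, by themselves, do not have.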
The relationship between UNI/E-NNI Transport Resource names and SNPP addresses may be any of the following:
• 1:n — One UNI/E-NNI Transport Resource name may map to multiple SNPP addresses at the same reference point.
• n:1 — Many UNI/E-NNI Transport Resource names may refer to one SNPP address at the same reference point.
• m:n — Many UNI/E-NNI Transport Resource names may refer to multiple SNPP addresses at the same reference point.
The first two cases are illustrated in Figure 16-15.
Chapter 16
[Figure: (a) multiple SNPPs to one UNI Transport Resource; (b) multiple UNI Transport Resources to one SNPP.]
Figure 16-15. SNPP and UNI Transport Resource relationships
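The name-to-address binding described above can be pictured as a simple resolution table; the entries below are hypothetical and only illustrate the 1:n and n:1 cases.

```python
# Hypothetical address-resolution table mapping destination UNI Transport
# Resource names to candidate far-end SNPP addresses. All names invented.
resolution = {
    "user-A": ["RA1/matrix1/snpp1", "RA1/matrix1/snpp2"],  # 1:n case
    "user-B": ["RA1/matrix1/snpp1"],                        # n:1 case:
    "user-C": ["RA1/matrix1/snpp1"],                        # B and C share one SNPP
}

def resolve(uni_name):
    """Return the far-end SNPP address(es) for a UNI Transport Resource
    name; connection management then routes toward one of them."""
    return resolution.get(uni_name, [])
```

Because the binding is a separate table, it can be changed (e.g., the service re-homed to a different SNPP) without changing the service-visible name, as the text notes.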
16.3.8.2 Control Plane Component Identifiers

Control plane components also require separate identifier spaces, since they may be instantiated differently from each other for a given ASON network. For example, one can have a centralized NCC with distributed CCs. Thus, separate identifiers are needed for RCs, NCCs, and CCs. Additionally, the PCs that are used for protocol-specific communication also require a separate identifier space. For example, the identifiers for the Signaling PCs must be unique in order to unambiguously specify a particular signaling channel [37].

16.3.8.3 Data Communications Network Identifiers

To enable control plane components to communicate with each other, a Data Communications Network (DCN) is used (described in Section 16.4). DCN addresses identify the points of attachment for the signaling and routing PCs that instantiate control plane communication functions (generating and processing messages in protocol-specific formats). We note that several PCs may share a DCN point of attachment, and any given network element may have multiple points of attachment. For example, the signaling PC DCN address refers to the point where the signaling PC attaches to the DCN. Thus, the signaling PC DCN address is based on the topology of the DCN carrying signaling messages, rather than the topology of the transport plane or control plane components.
Architecting the Automatically Switched Transport Network
16.3.8.4 Management Plane Identifiers

These identifiers are used by management entities that are located in Element Management Systems (EMSs) and Network Management Systems (NMSs). Identifiers used for OAM purposes include those defined for the TTP (Trail Termination Point) and CTP (Connection Termination Point) [38], illustrated in Figure 16-7. TTPs represent the signal state as a signal leaves a layer network and are associated with the G.805 termination function, while CTPs represent the signal state as it enters a layer network and are associated with the G.805 adaptation function. Existing operations, administration, and maintenance (OAM) address spaces generally describe a physical locality that supports maintenance and fault correlation activities.
16.3.9 Restoration Architecture

There are several techniques available to enhance connection availability, which refers to the ability of the connection to provide service even though there is a fault in the network. Recommendation G.805 describes transport network availability enhancement techniques. The terms protection (replacement of a failed resource with a preassigned standby) and restoration (replacement of a failed resource by rerouting using dynamically allocated spare capacity) are used to classify these techniques. In general, protection actions complete in the tens of milliseconds range, while restoration actions normally complete in times ranging from hundreds of milliseconds to up to a few seconds. For G.8080, however, the mechanisms supporting the technique by which the connection is restored are far less interesting than whether the control plane is engaged in restoring it. Recommendation G.8080 therefore extends classical protection and restoration definitions to classify protection as any mechanism that is autonomous and requires no control plane intervention (no rerouting). Similarly, restoration is classified as any mechanism that is operated by control plane actions, since these operations always involve rerouting the connection.

In principle, rerouting can occur over any portion of the network, and it intuitively feels best to only replace the failed component (link or node). However, it is not always easy to determine which component has failed in a timely manner; neither is it easy to determine which points should switch in response to the failure. It is also advantageous to subdivide a large network into a number of independent recovery domains. Different mechanisms, which are appropriate to the topology and capabilities of the equipment deployed, may then be used in each domain. In this manner, a clean separation is provided between the networks of different operators or
between work forces within a single network, and the availability is improved when the size of a restoration domain is limited. G.8080 has adopted the ATM Forum approach [39] towards domain-based rerouting. The G.8080 rerouting domain model is illustrated in Figure 16-16.
[Figure: two rerouting domains, each aligned with a routing area, connected via an E-NNI; source and destination nodes at the domain edges.]
Figure 16-16. G.8080 Rerouting Domain Model
The restoration architecture is static and defined by the points at which restoration action occurs. These points are the ingress and egress edges of the rerouting domain. The egress edge detects the failure and coordinates rerouting with the ingress edge. A rerouting domain supports a single recovery mechanism, and each recovery domain is responsible for maintaining the integrity of the portion of the call that transits that domain. Recovery domains may be nested; thus if the innermost recovery domain cannot maintain the integrity of the call segment, an enclosing recovery domain (with larger scope) may attempt to recover the call. A policy application is used to map the Class of Service for the call onto the types of recovery domains that may be used when routing connections. A general network offering restoration is thus a concatenation and/or nesting of rerouting domains, and the innermost domain is responsible for clearing the fault. While the rerouting domain has a specific purpose, it is clear that it must be possible to route across a rerouting domain. In practice, this means that a rerouting domain boundary must be coincident with, or contained within, an RA, and rerouting takes place entirely within the domain. This model allows for a very flexible network design, in which the scope of the restoration problem is determined by the design of the network. This architecture supports two forms of restoration. The hard rerouting service offers a service survivability mechanism for calls and is always in
response to a failure event. The soft rerouting service provides a means to reroute a connection for administrative purposes (e.g., path optimization, network maintenance, or planned engineering works). The latter operation is generally triggered via a request from the management plane and sent to the egress rerouting components. The rerouting components establish a rerouting connection to the ingress components. Once the rerouting connection is created, it is taken into service and the initial connection is deleted. This provides make-before-break switchover and ensures that the service interruption is limited to the switchover time of the tail-end switch. Of particular interest is the failure of the link between two rerouting domains. This type of failure may be handled by providing the interdomain link with an autonomous protection mechanism, by enclosing the interdomain link and its end switches in its own rerouting domain, or by enclosing both domains and the link in an enclosing scope rerouting domain. Which solution is chosen depends on many issues that are outside the scope of this chapter.
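The soft-rerouting sequence above can be sketched as a make-before-break operation; the Call class and connection labels here are invented for illustration and stand in for the real rerouting components.

```python
# Illustrative sketch (not the G.8080 component model): administrative
# (soft) rerouting creates the new connection before deleting the old one,
# so the service interruption is limited to the switchover itself.

class Call:
    def __init__(self, initial_connection):
        self.active = initial_connection

    def soft_reroute(self, establish_connection):
        """Make-before-break: build the rerouting connection first, switch
        traffic onto it, and only then release the initial connection."""
        new_conn = establish_connection()   # "make": rerouting connection set up
        old_conn = self.active
        self.active = new_conn              # switchover (tail-end switch)
        return old_conn                     # "break": initial connection deleted

call = Call("conn-1")
released = call.soft_reroute(lambda: "conn-2")
```

Hard rerouting differs in trigger (a failure event rather than a management request), not in the basic discipline of replacing one connection with another.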
16.4. SIGNALING COMMUNICATIONS NETWORK ARCHITECTURE (G.7712)
Transport network elements already support data communication functionality for enabling transport of management messages between network elements and their managers (e.g., EMSs, NMSs). One consequence of distributed connection management is the need to support distributed signaling communications, i.e., the need for a mechanism to transport the signaling messages between communicating ASON nodes. Since new control plane capable network elements will need data communication functionality for both management and signaling applications, we consider each application as running over separate logical networks, i.e., the logical Management Communications Network (MCN) and SCN, respectively. The term DCN will be used to represent the physical communication network supporting the logical MCN and SCN. The logical MCN and SCN may be supported via physically separate DCN networks, or a single DCN may support both the logical MCN and SCN as two applications sharing the same communications network. Recommendation G.7712 provides requirements and architecture specifications for a Data Communications Network that supports the Internet Protocol (IP), including support for ASON signaling applications. Version 1 of G.7712 provides requirements for a DCN providing connectionless services. In developing G.7712, it was understood that the current Embedded-DCN is OSI based, and therefore interworking requirements
between IP and OSI must also be specified in the Recommendation. Version 2 of G.7712 provides requirements for connection-oriented services as needed for certain applications (e.g., ASON signaling).
16.4.1 Signaling Methods

There are two primary approaches to the transport of signaling messages, referred to as in-band and out-of-band. Each approach can be used to the exclusion of the other; alternatively, a network design may include both approaches, e.g., in-band methods in the access network and out-of-band in the core network. The choice of approach is a major determinant in the design of an SCN.

16.4.1.1 In-Band Approaches

In-band signaling is defined as meaning that signaling messages are carried on the same channel as the user information. Examples of in-band signaling are TCP/IP and signaling on analogue access links in telephone networks. In-band signaling implies some degree of "intertwining" between user and signaling information. This allows the signaling channel to be used as a proxy for the health of the user traffic. With in-band signaling, the signaling messages follow the user traffic. A disadvantage of this approach is that it is not inherently secure.

16.4.1.2 Out-of-Band Approaches

In contrast to in-band signaling, out-of-band signaling is defined as meaning that signaling messages are carried in a separate signaling channel. An example of out-of-band signaling is SS7 in telephony networks. Out-of-band signaling can be further subdivided into two categories: channel-associated signaling and common channel signaling. In the former, there is a direct association between the signaling channel and the user information channel, while in the latter the signaling channel is shared between user information channels on a demand basis. Once signaling is taken out-of-band it can no longer act as a proxy for the health of the user information. The reason is that the signaling and information channels are now subject to independent failure mechanisms: a fault in one channel does not imply a fault in the other.
A consequence of out-of-band control is that operations, administration, and maintenance (OAM) has to be built into the traffic units that transfer user information in order to validate the integrity of information transfer.
Many signaling systems utilize out-of-band common channel signaling because of the following features:
• Efficient use of resources, as signaling capacity is utilized on a demand basis by the user information channels.
• The signaling protocol can be developed independently of the user channel.
• When setting up a call, signaling messages may be transmitted at the same time as the establishment of the user information channel, allowing for shorter setup times. This is not possible with channel-associated signaling.
• The signaling network is a separate network. This allows the signaling network to use the same topology as the transport network, or a separate network topology can be used. Both forms of topology can be combined.
Signaling and management communications can be logically out-of-band although sharing the same facilities as the data. Examples include:
- Embedded communications channels (ECCs) in SDH. Although they are in the same frame as customer traffic, they are part of section overhead rather than payload. An ECC can be misconfigured or can fail separately from the customer traffic. It is not possible for the customer traffic to modify control traffic in the ECC. Traffic that uses the ECC may also use a separate DCN where appropriate.
- Optical supervisory channels (OSCs), where signaling traffic is carried on a separate wavelength from data traffic. There is no reason for the protocol structure of this wavelength to mirror that of the other wavelengths.
16.4.2 Delivery of Control Plane Messages

The Data Communications Network (DCN) may be composed of various facilities to support the exchange of control plane messages, as illustrated in Figure 16-17. These include Embedded Communications Channels (ECCs) as well as separate dedicated facilities.
• The physical transport facility may have an Embedded Communications Channel (ECC), such as the SONET/SDH network's DCC, to carry these messages, removing the need for additional facilities to be deployed besides the physical transport facility.
• A separate facility may be deployed between the service provider and the customer, such as a leased facility (e.g., T1 or Ethernet), to carry the control plane messages.
The DCN interconnects the various facilities via data communications equipment, allowing network elements connected via a transport link to
communicate over a multihop path that can be different from the transport link.
[Figure: three example arrangements between network elements: (a) a dedicated control plane message facility alongside the transport facility (e.g., SONET/SDH); (b) an ECC (e.g., the SONET/SDH DCC) within the transport facility; (c) an SCN access facility separate from the transport facility.]
Figure 16-17. Example of control plane message facilities
Whether the DCN for the control plane uses an ECC or a dedicated facility, or some combination thereof, depends on:
• The type of physical transport network elements
• The locations of these network elements
• The level of separation that is desired between the physical transport network and the DCN
For example, some transport network elements, such as photonic cross-connects, may not have access to ECCs and therefore must utilize either LAN or WAN interfaces (in the form of an optical supervisory channel, a wavelength dedicated to carrying control and management information) to carry ASON signaling messages. Other transport network elements, because of their location in the network, may not have direct access to a LAN or WAN network and therefore will utilize ECCs to carry ASON signaling messages. Additionally, if it were desired that a DCN be physically separate from the physical transport network (possibly to prevent a single failure impacting both the physical transport network and the ASON signaling network), then only LAN or WAN interfaces would be utilized to carry ASON signaling messages.
In order to use a DCN, ASON-capable network elements must support data communication functionality to enable transport of the signaling messages. Such communications functionality consists of:
• Learning the ASON DCN topology so that an element can calculate the shortest path between itself and the ASON signaling destinations
• Creating a forwarding table based on the shortest path calculations
• Forwarding packets between ASON DCN interfaces based on the forwarding table entries
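The steps above amount to running a shortest-path computation over the learned DCN topology and deriving a next-hop forwarding table from it. The sketch below uses Dijkstra's algorithm on a hypothetical three-node DCN; node names and link costs are invented, and real implementations would use a standard routing protocol rather than this toy.

```python
import heapq

def forwarding_table(topology, source):
    """Return {destination: next_hop} for `source` over an undirected
    weighted graph given as {node: {neighbor: cost}} (the learned topology)."""
    dist = {source: 0}
    next_hop = {}
    heap = [(0, source, None)]  # (cost, node, first hop taken from source)
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale entry superseded by a cheaper path
        if hop is not None:
            next_hop[node] = hop
        for nbr, w in topology[node].items():
            ncost = cost + w
            if ncost < dist.get(nbr, float("inf")):
                dist[nbr] = ncost
                # when leaving the source, the first hop is the neighbor itself
                heapq.heappush(heap, (ncost, nbr, nbr if hop is None else hop))
    return next_hop

dcn = {
    "NE1": {"NE2": 1, "NE3": 4},
    "NE2": {"NE1": 1, "NE3": 1},
    "NE3": {"NE1": 4, "NE2": 1},
}
table = forwarding_table(dcn, "NE1")  # NE1 reaches NE3 via NE2 (cost 2 < 4)
```

Packets between ASON DCN interfaces are then forwarded by a simple lookup of the destination in this table.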
16.4.3 DCN Topologies

When designing a DCN, a number of different topologies may be considered. This section reviews various DCN topology approaches, including pros and cons.

16.4.3.1 Full Mesh

A full mesh, as shown in Figure 16-18, is the simplest topology. However, as the network increases in size, the number of links required to create the mesh grows quadratically (n(n-1)/2 links for n nodes). Each link adds capacity to the DCN that will likely be underutilized, since signaling messages are communicated between the Signaling PCs at DCN link ends. Furthermore, since routing topology updates are flooded among RCs in an RA, a mesh topology actually amplifies the number of routing messages needed.
Figure 16-18. Full mesh topology
Consequently, a full mesh topology is not recommended for a DCN.
16.4.3.2 Congruent Topology
A congruent topology, as shown in Figure 16-19, maintains a DCN link in parallel to each transport link; i.e., the overall DCN topology is congruent with that of the transport network. In this topology, messages sent from a signaling PC at one DCN link end may flow to the signaling PC at the other link end using any available path, including signaling channels on other DCN links in the network.
Figure 16-19. Congruent topology [11]
In addition to making it hard to predict the path that will be used for signaling and routing messages, this topology also suffers from underutilized links, since each facility adds a new signaling channel.

16.4.3.3 Focused Topologies (i.e., Hub/Spoke)

A focused topology, such as the hub-and-spoke topology shown in Figure 16-20, maintains one DCN link from each node to a centralized DCN message switch. Since all signaling messages sent by a node are focused onto a single link to the centralized message switch, it is easy to predict the amount of messaging that a DCN link will see. Consequently, this network topology allows the best prediction of what sort of loads a network can sustain without becoming congested, even in the face of DCN link failure.
Figure 16-20. Hub/spoke topology [11]
This topology is generally not used, however, since the failure of a DCN link will cause a node to become completely isolated from the DCN. Instead, a modified form with two hubs, as shown in Figure 16-21, is typically used. This provides a secondary path to each node in case the primary fails.
Figure 16-21. Dual hub/spoke topology
In order to guarantee that nodes will be able to send messages to each other even when one of the links to the central message switch is down, a DCN link is included between the central message switches.

16.4.3.4 Hybrid (Multitier) Topologies

In large geographically dispersed networks, it may not be reasonable to maintain only one pair of central message switches. In this case, regions may have their own pair of switches, with a backbone of connections between the message switches. Figure 16-22 shows such a network
topology. Within each region, a dual hub/spoke topology is maintained. However, between regions, a mesh network is maintained.
Figure 16-22. Hybrid (two-tier) hub/spoke—full mesh network
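A quick back-of-envelope comparison shows why the mesh scales poorly relative to the focused topologies: for n nodes, a full mesh needs n(n-1)/2 DCN links, a single hub needs n spokes, and a dual hub needs two spokes per node plus the inter-hub link. A minimal sketch:

```python
# Link counts for the DCN topologies discussed above, for n transport nodes.

def full_mesh_links(n):
    return n * (n - 1) // 2   # grows quadratically with network size

def hub_spoke_links(n):
    return n                  # one spoke per node to the central switch

def dual_hub_links(n):
    return 2 * n + 1          # two spokes per node + the inter-hub link

# For 20 nodes: 190 mesh links versus 41 for the dual-hub design.
mesh, dual = full_mesh_links(20), dual_hub_links(20)
```

The dual-hub count also makes the trade-off explicit: roughly double the links of a single hub buys survival of any single spoke failure.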
16.4.4 DCN Reliability Considerations

Whether the DCN supporting the SCN provides connectionless or connection-oriented services has an effect on the level of reliability that can be supported. Failures affecting the DCN will impact:
• New connection requests, since the signaling network may not be available to carry the messages related to the new connection request
• The ability to tear down existing calls, since the signaling network may not be available to carry the messages related to tearing down an existing connection
• The ability to restore existing connections, since the signaling network may not be available to carry messages related to restoration. (We note that this only applies when a failure exists on the signaling network as well as on the physical transport network)
To allow ASON signaling messages to be delivered to their destination even in the event of DCN failure conditions, certain design requirements are imposed on the DCN. At a minimum, the DCN should be designed to provide diverse paths between any two communicating control plane network elements. Assuming such a design, even a DCN providing connectionless services will be able to deliver ASON signaling messages to their destination in the event of a failure, once the routing tables have been updated to allow packets to route around the failure. Such a design may be sufficient for a DCN carrying messages related to new connection requests and tearing down existing connections. However, whether this type of
design is sufficient to handle messages related to restoration of existing connections depends on the overall restoration time requirements. It is possible that the time required for the DCN to update its routing tables so as to route packets around the failure may negatively impact the ability to meet certain restoration time requirements. If such is the case, it may be necessary for the DCN to provide connection-oriented services that allow the DCN to more quickly route packets around the failure.
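The diverse-path requirement can be checked, at least naively, by finding one path and then searching again with that path's interior nodes removed. This greedy check is only a heuristic (a fully general check would use a max-flow formulation); the graphs and node names below are invented examples.

```python
# Naive sketch of verifying diverse DCN paths between two communicating
# control plane elements. Heuristic only: finding one path greedily can,
# in unlucky graphs, block a disjoint pair that a max-flow check would find.

def find_path(graph, src, dst, banned=frozenset()):
    """Depth-first search for any path src -> dst avoiding `banned` nodes."""
    stack, seen = [[src]], {src}
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == dst:
            return path
        for nbr in graph.get(node, ()):
            if nbr not in seen and nbr not in banned:
                seen.add(nbr)
                stack.append(path + [nbr])
    return None

def has_diverse_paths(graph, src, dst):
    """True if a second path survives after deleting the first path's
    interior nodes, i.e., the two paths are node-disjoint."""
    first = find_path(graph, src, dst)
    if first is None:
        return False
    interior = frozenset(first[1:-1])
    return find_path(graph, src, dst, banned=interior) is not None

# A four-node ring offers two disjoint paths between opposite nodes.
ring = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
```

A DCN failing this check has a single point of failure on the path between the two elements, which is exactly the condition the text warns against.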
16.4.5 DCN Security Considerations

There are certain security measures that must be taken when providing a control plane message facility to a network element that is outside of a service provider's administrative domain. The service provider must make sure that the facility does not provide access to private/secure data carried within the service provider's DCN (e.g., management data). If an ECC is used to provide transport of control plane messages, the transport network element terminating the ECC must guarantee that only the control plane messages are allowed across the ECC. We note that the transport network element may also be providing DCN communications in support of management applications and, therefore, the transport network element must be able to separate the management communications from the control plane. If a DCN is used, a firewall is necessary at the edge of the service provider's domain. The firewall is provisioned to only let control plane messages in and out of the service provider domain.
16.5. SERVICE ACTIVATION PROCESS ELEMENTS
This section outlines the elements involved in customer call activation in a multidomain network, providing a framework for the sections that follow. These involve customer contracting, network planning and installation, and service validation and activation [2]. Note that while an SC example is utilized, these elements have applicability to an SPC as well.

Customer contracting with a service provider for a set of transport services may encompass such aspects as contract duration, billing methods, service capabilities, routing and authentication policies, customer premises equipment requirements, registration and connectivity needs, and security considerations [2]. These aspects are not addressed by ASON Recommendations.

Network planning and installation encompasses allocating sufficient transport resources to satisfy the terms of the contract, including the installation of new equipment as needed, and provisioning and configuration
of the equipment (including identifier assignments). We note that automated discovery (see Section 16.6) is considered part of the network planning and installation phase and is performed after the processes related to partitioning of network resources for control plane actions have taken place [2].

Service validation assures, for example, that a UNI-initiated call request is authorized based on the contract and confirmation of the successful completion of the request. Request validation activities may include authentication of the user request, message-integrity verification, SLA verification, etc. Once this occurs, the user is authorized to request activation and release (call requests and call disconnects) of network services [2].

Service activation for completion of the user service request includes the detailed processes required to activate service between the customer locations via an automatic end-to-end connection setup. Routing information is exchanged over I-NNI interfaces, and reachability (or summarized routes, if available) is typically exchanged over E-NNI interfaces. Service activation also generally involves ensuring that adequate resources are available to support the requested service, and that the constraints placed by the request, based on SLA parameters (e.g., a certain level of availability), can be met when determining a route to reach the destination end-user across the various domains. The information exchanged between the call and connection controllers ultimately results in approval of the request and activation of the service [2].

Upon successful establishment of service, the call controller also starts the billing and associated business processes. When the user wants to terminate the service request, a disconnect/hang-up request is placed at the UNI, and the call controller stops the billing and transfers control to the connection controllers for tearing down the connection and freeing up the allocated network resources [2].
16.6. DISCOVERY (G.7714)
The essence of automated discovery, as applied to transport networks, is for each pair of connected network elements to find out the identity of their neighbor element, to determine how their respective ports are mapped to each other, and to negotiate the services (transport entity capabilities) that will be supported across the transport entities interconnecting them. We note that for bidirectional links (which implies that the transmit and receive endpoints of the paired unidirectional links are identified by a single address) the discovery procedure is performed separately for each direction of the link. This separation is necessary because it is possible that within the fiber
pair that should make up this bidirectional link, one fiber could be connected properly while the other is not. The connectivity information derived from discovery is crucial for accurately building the network topology database used for computing the path for a connection. Additionally, the discovery procedure is an essential first step in establishing logical connectivity between control entities for exchanging signaling and routing messages. Discovery may also seed the process of establishing the control associations between routing and signaling functions managing the neighboring network elements. Without these associations, the ASON control plane will not be able to use the discovered links. The output of the automatic discovery process may be used by both traditional management systems and by the control plane. While it is possible to manually provision topology information, use of a manual approach is labor intensive, time consuming, and notoriously prone to human error. Additionally, manual provisioning makes it difficult to synchronize changes to network resources with management system databases. Without an automated mechanism such as discovery, these systems can easily lose synchronization with each other, resulting in the inefficient use of network resources or, worse, the systems not being able to compute paths for a connection. The G.8080 foundation for discovery architecture was not mature at the time of the first release of G.7714, which focused upon describing categories of discovery and methods, but has been incorporated into the 2005 version.
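The per-direction nature of discovery can be illustrated with a toy verification step: a link is confirmed bidirectional only when the announcement received on each fiber names the expected neighbor. The function and message format below are hypothetical and do not represent the actual G.7714 message set.

```python
# Illustrative only: each end of a fiber pair announces its own identity
# in-band; the far end reports what it actually received. A fiber that is
# dark or miscabled yields None or an unexpected neighbor identity.

def verify_bidirectional(rx_at_b, rx_at_a, ne_a, ne_b):
    """rx_at_b: the (ne_id, port) announcement received at B on the A->B
    fiber; rx_at_a: the announcement received at A on the B->A fiber.
    Each direction is checked separately, and both must succeed."""
    a_to_b_ok = rx_at_b is not None and rx_at_b[0] == ne_a
    b_to_a_ok = rx_at_a is not None and rx_at_a[0] == ne_b
    return a_to_b_ok and b_to_a_ok

# One fiber connected properly, the other dark: the link is NOT confirmed.
partial = verify_bidirectional(("NE1", "p3"), None, "NE1", "NE2")
```

This is precisely why a single bidirectional address is not enough: each unidirectional fiber can fail or be miscabled independently.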
16.6.1 Discovery and Connectivity Verification

Although they have different connotations, the terms discovery and connectivity verification are often used synonymously in the industry. Connectivity verification is generally associated with the carrier's operation systems and is, in fact, a separate process. Connectivity verification includes the important aspect of verifying the carrier's connectivity plan. Discovery, on the other hand, only determines the actual connectivity (and not what it should have been), and so does not directly verify the carrier's connectivity plan. There is thus a need for external intelligence (e.g., a management function) to take the results of the discovery process and check them against a connectivity plan. The latter is provided by connectivity verification. It should be noted that the basic discovery messages can only be carried in-band. This is unlike all the other control plane messages (e.g., signaling and routing), which can be carried either in-band (i.e., as a part of the data traffic) or out-of-band. In contrast, connectivity verification, when automated, can be carried either in-band or out-of-band.
16.6.2 Discovery Architecture

The G.8080-based discovery architecture involves the TAP, DA, LRM, and the protocol controller, as illustrated in Figure 16-23. Discovery is about finding the CP-CP connections across an entire network. In order to reduce the search space, hints may be provided based on previously discovered or configured information. For example, a trail may provide a hint about all its LCs, reducing the search to just the ends of the trail.
[Figure: the DA, TAP, LRM, and PC components, with link and CTP/LC hints flowing between them.]
Figure 16-23. Components of discovery architecture
The involved components and their roles are as follows:

Discovery Agent (DA). The DA provides the necessary coordination for the discovery process. This includes collection of hints from the necessary components and coordinating with the DA on equipment matrices that may be controlled during the discovery process. Note that fully transparent switches cannot be bound to a particular layer network until discovery is complete. (This is because the characteristic information is determined by the trail supporting the discovered link connection.) Because DAs may interact with other DAs in the network, it is important to recognize that DAs need identifiers having global scope.

Termination/Adaptation Performer (TAP). The TAP provides a view of the status of the physical resources, e.g., link connections and trails (CTP and TTP). Since the TAP is associated with the G.805 termination/adaptation compound function (described in Chapter 2), it can be used to provide hints from the test signals (the test set, CTP/TTP information) and from nonintrusive monitors (e.g., nonintrusive SNC monitors). We note that discovery operates on transport resources before control plane aliases (SNPs) have been allocated, so it uses CP and TCP names to discover CTP-CTP LCs.

Link Resource Manager (LRM). The LRM provides the status of the link in terms of the allocated and available number of LCs. After the discovery
phase, SNPs are assigned to the discovered CPs and the LRM is configured with the SNP connections. This assignment is considered to be a management operation that may be delegated to the network element. This allows for "plug and play" operation as a specific management policy. The LRM may also provide hints related to link discovery (e.g., link name).

Protocol Controller (PC). The PC provides for the protocol encapsulation of the primitives that constitute the DA-DA interaction. The protocol controller attaches to the signaling network point of attachment, which can be thought of as the DA signaling address.
16.6.3 Types of Discovery

Recommendation G.7714, as first published in 2001, defines three general types of discovery functions that can be performed: (1) Layer Adjacency Discovery, (2) Control Entity Logical Adjacency Establishment, and (3) Service Capability Exchange discovery. In the years that followed, further architectural foundation was laid for auto-discovery within G.8080, which had implications on G.7714 concepts. At the time of going to press, G.7714 had just undergone revision and restructuring. Revised G.7714 (2005) provides clearer requirements for the discovery process, refines the terminology used for the transport entities being discovered, extends the use of discovery to the management plane, and provides more detail on the behavior of the Capability Exchange process (renamed Transport Entity Capability Exchange). The Capability Exchange processes in support of the control plane will be provided in a new Recommendation under development, G.7716, which addresses the initialization and restoration of the control plane.

16.6.3.1 Layer Adjacency Discovery

Layer Adjacency Discovery (LAD) describes the process of discovering transport network connection and link connection endpoint relationships and verifying their connectivity. In the most basic terms, determining "who a network element's neighbor is at a given layer" is what layer adjacency discovery is about. Not all equipment will necessarily terminate/interpret/alter the characteristic information at all the layers. For example, in Figure 16-24 we show a simplified functional block view of an SDH/optical network comprising eleven network elements (NEs). Two of these network elements are optical amplifiers (NE #3 and #9), and two are WDM multiplexers/demultiplexers, all of which only understand and act upon the physical layer. Two of the elements are SDH HOVC O-E-O switches (NE #4
Chapter 16
and NE #8), which act upon the SDH physical, RS, MS, and HOVC layers. The end-user equipment, shown as routers in this example, is also assumed to act upon all the layers, i.e., physical, RS, MS, and HOVC.
Figure 16-24. Example SDH network scenario
Looking at Figure 16-24, for example, we can see that NE #1 has a physical (PHY) layer adjacency with NE #3, as illustrated in Figure 16-25.
Figure 16-25. Illustration of physical adjacency
Examining Figure 16-26, we can see NE #1's and NE #2's RS layer adjacencies with NE #4 (and NE #8's with NE #10 and NE #11).
Figure 16-26. Illustration of RS layer adjacency
Architecting the Automatically Switched Transport Network
Similarly, Figure 16-27 illustrates NE #1's and NE #2's MS layer adjacencies with NE #4, and NE #8's MS layer adjacencies with NE #10 and NE #11.
Figure 16-27. Illustration of MS layer adjacency
Finally, Figure 16-28 illustrates similar HOVC layer adjacency relationships. Different discovery message sets would have to be used to discover adjacencies at the various layers (e.g., RS, MS, or HOVC). The exact LAD methods and protocols are described in greater detail in Section 16.11.1.
Figure 16-28. Illustration of HOVC layer adjacency
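The adjacency relationships shown in Figures 16-25 through 16-28 follow mechanically from which layers each NE terminates: an NE's neighbor at a given layer is simply the next element along the fiber path that also terminates that layer. The sketch below illustrates this rule on a fragment of the Figure 16-24 topology; the data model and names are illustrative only, not part of G.7714.

```python
# Each NE is modeled as (name, set of layers it terminates).
# An amplifier terminates only the physical layer; a router or an
# SDH HOVC switch terminates PHY, RS, MS, and HOVC.
CHAIN = [
    ("NE1", {"PHY", "RS", "MS", "HOVC"}),   # router
    ("NE3", {"PHY"}),                        # optical amplifier
    ("NE4", {"PHY", "RS", "MS", "HOVC"}),   # SDH HOVC O-E-O switch
]

def layer_neighbor(chain, start, layer):
    """Return the next NE along the chain that terminates `layer`."""
    names = [name for name, _ in chain]
    for name, layers in chain[names.index(start) + 1:]:
        if layer in layers:
            return name
    return None

# NE1's PHY-layer neighbor is the amplifier, but its RS-layer neighbor
# is the first downstream element that terminates the RS layer.
assert layer_neighbor(CHAIN, "NE1", "PHY") == "NE3"
assert layer_neighbor(CHAIN, "NE1", "RS") == "NE4"
```

In a real LAD exchange, this relationship is verified by carrying identifiers in the layer's own overhead rather than computed from a database; the sketch only captures the resulting adjacency structure.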
We note that G.7714 defines another level of adjacency, termed physical media adjacency (PMAD), which is conceptually no different from any other layer adjacency since the layer that is represented is the physical media layer, e.g., fiber layer. The mechanisms for providing PMAD for optical networks have not been standardized. This is in large part because such
mechanisms would require the use of optical processing techniques at the wavelength and fiber level, areas that are not yet sufficiently mature.

16.6.3.2 Control Entity Logical Adjacency Establishment

The 2001 version of G.7714 included discussion of Control Entity Logical Adjacency, or CELA. This was previously defined as the association that existed between two discovery processes to facilitate communication between a pair of control entities across the SCN. The term CELA was utilized prior to the development of the G.8080 discovery architecture, and prior to consideration that the management plane could benefit from the automatic discovery process. As mentioned earlier, the revised (2005) version of G.7714 allows the discovery process to be used by the management plane as well as the control plane, making the term CELA inappropriate. Since the appropriate G.8080 architectural construct is the Discovery Agent (DA), it was considered to replace CELA with the term Discovery Agent adjacency. However, a DA adjacency does not need to be pre-established and may be created dynamically while other discovery subprocesses (e.g., LAD) are being executed. Furthermore, the communications that occurred across the adjacency were not in any way scoped by the adjacency, removing any functional distinction from the messaging services provided by the DCN. Thus, discussion of the Discovery Agent adjacency was not included in the revised Recommendation.

16.6.3.3 Service Capability Exchange

The term Service Capability Exchange (SCE) was used in the first version of G.7714 to define a process for capability information exchange. This process was used to allow information regarding the ends of the discovered facility to be exchanged, "bootstrapping" the control plane. However, since the term service is often used to describe end-user communication services, the term SCE introduced ambiguity.
To avoid this ambiguity, in the 2005 version of G.7714 the term SCE was changed to Transport Capability Exchange (TCE), which more accurately expresses the intent of the process. Again, with the extension of the discovery process to the Management Plane, the scope of TCE has been limited to the exchange of Transport Plane Capability information, and the exchange of control plane or management plane specific information will be moved to other Recommendations. It has been recognized that the exchange of control plane or management plane information has the same requirements as the exchange of transport
plane capability information. As a result, the 2005 version of G.7714 includes the specification of a generic mechanism to perform capability exchange. It is expected that this mechanism will be reused by Recommendations that address the specific encodings for the exchange of control plane and management plane information.
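A generic capability-exchange mechanism of this kind is essentially a type-length-value (TLV) carrier: each capability is tagged with a type code so that control plane and management plane encodings can later be defined without changing the carrier itself. The following is a minimal sketch of such a TLV carrier; the type codes and layout are invented for illustration, and G.7714 defines its own encodings.

```python
import struct

def encode_tlvs(caps):
    """Pack (type, value-bytes) pairs as 16-bit type, 16-bit length, value."""
    return b"".join(struct.pack("!HH", t, len(v)) + v for t, v in caps)

def decode_tlvs(buf):
    """Inverse of encode_tlvs; unknown types are preserved, not rejected."""
    out, i = [], 0
    while i < len(buf):
        t, length = struct.unpack_from("!HH", buf, i)
        out.append((t, buf[i + 4:i + 4 + length]))
        i += 4 + length
    return out

caps = [(1, b"SDH-HOVC"), (2, b"adapt:VC-4")]   # illustrative type codes
assert decode_tlvs(encode_tlvs(caps)) == caps
```

Passing unknown types through unmodified is what lets future Recommendations reuse the same carrier for new capability sets.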
16.6.4 Discovery Considerations across Administrative Boundaries

Auto-discovery across user-provider interfaces is unique, largely driven by the fact that service provider proprietary information is not exchanged with the user. In that sense, user-provider discovery is a "single ended" discovery process. From the point of view of the provider, it is a mapping of user endpoint names to network addresses that are used for routing, and from the user's perspective, it is an acknowledgment of the availability of the user-to-network connections. Capability exchange is quite limited given that service level information is exchanged and agreed upon during the contract negotiation phase. Discovery over intercarrier interfaces, which are service demarcation boundaries, may also involve trust boundaries (e.g., between different carrier networks). To this end, the discovery processes offer a provider a significant amount of control over the information that can be exchanged. Each capability that could be exchanged may be, by provider policy, excluded from the exchange process. Furthermore, the exchange process allows providers the ability to restrict the behavior of the entity on the other end of the link.
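The per-capability policy control described above can be thought of as a filter applied to the advertised set before it crosses the administrative boundary. A small sketch, with hypothetical attribute names:

```python
def filter_capabilities(caps, policy):
    """Drop every capability the provider's policy does not allow to
    cross the administrative boundary."""
    return {k: v for k, v in caps.items() if k in policy}

caps = {
    "signal_type": "VC-4",
    "link_capacity": 16,
    "internal_srg": "fiber-duct-12",   # provider-internal detail
}
# Policy: only signal type and capacity may be exchanged with the peer.
EXPORT_POLICY = {"signal_type", "link_capacity"}

exported = filter_capabilities(caps, EXPORT_POLICY)
assert "internal_srg" not in exported
```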
16.7. ROUTING (G.7715 AND G.7715.1)
Recommendation G.7715 contains the architecture and requirements for routing in ASON, expanding on the routing function concept in G.807 and G.8080. Recommendation G.7715.1 contains more detailed requirements aimed specifically at a link state instantiation of ASON routing. The basic principles are presented in this section.
16.7.1 Requirements

The routing requirements described in G.7715 encompass architectural, protocol, and path computation aspects. Some key examples of architecture and protocol requirements are described within this section.
16.7.1.1 Architectural
The routing architecture requirements specified in G.7715 reinforce and build upon G.807 and G.8080 requirements. One key requirement with fundamental implications is that the routing adjacency topology and transport network topology shall not be assumed to be congruent [11]. This separation between the transport topology and the routing control topology, and also between the latter and the DCN, means that their topologies may all be different from one another in any given ASON network. Figure 16-29 illustrates a routing area (RA) where the routing control topology forms a tree including all the nodes in the transport plane. (We stress that this tree, used by the RCs to forward routing messages, does not need to be congruent with the transport plane topology.) Separation allows, for example, a single RC to support multiple network elements, and to be addressed separately from the network elements.
(Legend: solid line = transport link; dashed line = routing adjacency; filled symbol = transport node; open circle = routing controller.)
Figure 16-29. Example of directed topology message flow [37] [11]
Route computation is achieved using information advertised by the routing protocol and is often subject to a set of optimization constraints. After the route is determined, connection management processes (i.e., signaling) are used to allocate network resources (i.e., subnet connections and link connections) before user traffic is carried on those paths. After signaling has been used to establish a connection, the routing functions are no longer needed. Hence routing protocols in transport networks are not involved in "data-plane forwarding" and therefore have no impact on established services, which is not the case in, for example, IP networks where the forwarding function is continually dependent on the availability and integrity of the routing function. This has important implications for the performance required from the routing control plane; for example, transport plane connections remain active and data can continue to be transported even when the routing control plane is unavailable. This setup is illustrated in Figure 16-30.
Figure 16-30. Routing and forwarding examples. (Panel A: transport routing and forwarding, in which routing controllers support signaling that installs cross-connects; Panel B: IP routing and forwarding, in which OSPF peers populate the IP forwarding table directly.)
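The key property of transport routing, namely that established connections do not depend on the continued availability of the routing function, can be illustrated directly: once a cross-connect has been installed via signaling, discarding every piece of routing state leaves the data path intact. The class and names below are invented for illustration:

```python
class CrossConnect:
    """Toy transport NE: forwarding state is an in-port -> out-port map."""
    def __init__(self):
        self.matrix = {}

    def install(self, in_port, out_port):
        self.matrix[in_port] = out_port

    def forward(self, in_port):
        return self.matrix[in_port]

routing_db = {"NE-A": ["link-1", "link-2"]}   # topology learnt via routing

ne = CrossConnect()
ne.install("port-1", "port-7")   # set up by signaling along a computed path

routing_db.clear()               # routing control plane becomes unavailable
assert ne.forward("port-1") == "port-7"   # the connection keeps working
```

In the IP case of panel B, by contrast, clearing the routing state would eventually invalidate the forwarding table itself.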
Other G.7715 requirements include the following [11]:
• The routing information exchanged between routing control domains is independent of intradomain protocol choices.
• The routing information exchanged between routing control domains is independent of intradomain control distribution choices, e.g., centralized or fully distributed routing functions.
• The routing information shall support an abstracted view of individual domains, i.e., the topology advertised externally in the routing protocol may be an abstracted view of the actual internal domain topology. The level of abstraction is subject to operator policy.
Recommendation G.7715 also provides requirements addressing the need to provide for unique identification of RAs within a carrier network, as well as avoiding protocol dependencies between hierarchical routing levels.

16.7.1.2 Protocol
Along with requirements related to protocol robustness, scalability, and security aspects, key requirements for the routing protocol itself are defined below [11]:
• The routing protocol shall be capable of supporting multiple hierarchical levels.
• The routing protocol shall support hierarchical routing information dissemination, including summarized routing information.
• The routing protocol shall include support for multiple links between nodes and shall allow for link and node diversity.
• The routing protocol shall be capable of supporting architectural evolution in terms of the number of levels of hierarchies, and aggregation and segmentation of RAs.
Note that the term level is used specifically to refer to the use of hierarchy to support subdivision of transport network resources into RAs, which is analogous to partitioning of a transport layer network into subnetworks as described in Chapter 2. This should not be surprising, as the only distinction between a subnetwork and an RA is the visibility of the link ends. A simple example of hierarchical routing levels is illustrated in Figure 16-31.
Figure 16-31. Simple example of hierarchical routing levels
Using the drawing convention for partitioning from Chapter 2, Figure 16-31 can be redrawn as Figure 16-32. We again stress the difference between transport network layering and partitioning, as described in Chapter 2.
Figure 16-32. Hierarchical routing levels illustrated using partitioning notation
16.7.2 Architecture

The ASON routing architecture of G.7715 supports the various routing paradigms listed in G.8080, i.e., hierarchical, step-by-step, and source based. The routing architecture applies after the network has been subdivided into Routing Areas and the necessary network resources have been accordingly assigned [11]. In this section, we build upon the discussion of Routing Areas (RAs) from Section 16.3.2.1. Associated with each RA is a Routing Performer (RP), an abstract entity that provides path computation services for the RA. Signaling to create connections in the transport plane uses path computation. Whatever path computation style an RA supports, the RP will have the necessary topology information to support it. An RP is realized in the form of RCs, as described in Section 16.3.3, which are distributed entities with partial or complete (via duplication and synchronization) routing information for that RA. This architectural arrangement is shown in Figure 16-33. Routing Controllers distribute topology information with each other; when two RCs communicate, they are said to form a routing adjacency. The set of routing adjacencies forms the routing control topology. Routing adjacencies are communicated/instantiated over the SCN. The RCs communicate via Protocol Controllers (PCs) that support a particular routing protocol. Separation of the RC and PC components allows great flexibility; for example, a single PC (and the routing protocol it implements) may support RCs from different transport layer networks and multiple
hierarchical routing levels. This capability is important for protocol efficiency since it enables carriage of information pertaining to multiple layer networks (and hierarchical routing levels) in one PDU, instead of having to run separate instances for each layer network.
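The efficiency argument above, one PC carrying advertisements for several layer networks (and hierarchical levels) in a single PDU, amounts to tagging each record with its layer and demultiplexing on receipt. A sketch with an invented record layout:

```python
def dispatch(pdu, rc_handlers):
    """Hand each layer-tagged record in a PDU to the RC for that layer."""
    for record in pdu["records"]:
        rc_handlers[record["layer"]](record)

# Per-layer "RCs" modeled as simple record collectors.
received = {"MS": [], "HOVC": []}
handlers = {layer: received[layer].append for layer in received}

# One PDU carrying advertisements for two different layer networks.
pdu = {"records": [
    {"layer": "MS", "link": ("NE1", "NE4")},
    {"layer": "HOVC", "link": ("NE1", "NE4")},
    {"layer": "MS", "link": ("NE4", "NE8")},
]}
dispatch(pdu, handlers)
assert [r["link"] for r in received["MS"]] == [("NE1", "NE4"), ("NE4", "NE8")]
```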
Figure 16-33. Relationship between RA, RP, RC. (The RP is the computational view of the service offered by the RA; the RCs are the engineering view that realizes the RP.)
Creation of RAs is related to the scope of routing information flooding (scalability), which impacts both transport resource assignment to Level 0 RAs and routing hierarchy decisions.
Architecting the Automatically Switched Transport Network
617
Figure 16-34. Possible assignment of resources to RAs — example 1
Some illustrative examples are provided in Figures 16-34 (above) and 16-35 (below).
Figure 16-35. Possible assignment of resources to RAs — example 2
As discussed earlier, the RA is the key concept that matches the organization of the control plane to the organization of the transport plane. We stress that the existence of E-NNIs (bounding control domains) should not be inferred
to create RAs. Several examples of potential RA and control domain scenarios are shown below that illustrate this point. Figure 16-36 illustrates several alternative configurations of routing control domains, ranging from multiple RAs in a given routing control domain, as in scenario (a), to complete congruency, as in scenario (b). One example of the former scenario might correspond to the situation in which, for example, there is a single vendor homogeneous solution, but scalability considerations warrant more than one RA.
Figure 16-36. Possible configurations of routing control domains
Correspondingly, there are also alternative configurations of routing control domains ranging from complete congruency to multiple control domains within an RA, the latter of which is illustrated in Figure 16-37.
Figure 16-37. Possible configuration of routing control domains
The configuration in Figure 16-37 might correspond, for example, to a scenario in which two vendors with heterogeneous control plane routing protocol implementations were deployed within the same RA. It should be noted that we would not expect routing control domains to be configured in a manner such that they would intersect RAs. Specifically,
while the raison d'être for creation of these constructs derives from different factors, it would generally be expected that a network planner would consider decisions relating to these aspects in a coherent manner.
16.7.3 Hierarchy in Routing

The routing architecture of G.8080 and G.7715 allows for different implementations of the routing functions. That is, the various routing functions can be instantiated in a variety of ways, including using distributed, co-located, and centralized mechanisms. A G.7715.1-compliant link state routing protocol can be instantiated as a set of RCs, each one performing topology distribution for the RA it is associated with. Each RC has a replicated topology database representing the transport plane topology, where database consistency is maintained by exchanging database synchronization messages between every pair of RCs. This information exchange occurs via the protocol controllers for each RC. Each protocol controller runs an instance of the link state protocol. Using the topology information advertised via the link state routing protocol, a source can calculate routes through the network along which connections may be established. The choice of source routing for path computation has some advantages for supporting connection management in transport networks. It is similar to the manner in which many transport network management systems select paths today. It is also powerful when diverse path computation is needed or for implementing fast restoration, among other things. Figure 16-38 illustrates an example of a network with two levels of routing hierarchy, where the lowest level routing area is tied to the transport network's physical topology. Four routing areas (RA1, RA2, RA3, and RA10) are defined, with the first three at Level 0 and the parent (RA10) at Level 1. Internally, each of these RAs manages its own network topology and available resources; i.e., there exists some method for obtaining a path across the RA that can be established via signaling.
Figure 16-38. Simple network with two levels of routing hierarchy
As described earlier, each RA has one or more associated RCs. It should not be assumed that there is a one-to-one relationship between transport resources and RCs, as a single RC may support multiple network elements. An example of a possible RC distribution including this scenario is illustrated in Figure 16-39.
Figure 16-39. Possible distribution of RCs
As we have discussed earlier, an architectural requirement of Section 16.7.1.1 is that "routing information exchanged between routing control domains must be independent of intradomain protocol and distribution choices". From an RC perspective, this means the RC distribution within a routing control domain is not externally visible. Thus, an RC can "act on behalf of" a routing control domain; we note that one or more RCs can assume this role. Each RC is described by an RC Identifier (RC ID) and is uniquely defined within its containing RA, which, in turn, is identified using an RA Identifier (RA ID). Again, as noted earlier, a single RC supporting multiple network elements may have its own RC ID. The grouping of RCs in RAs at different hierarchical levels is defined in a flexible manner. For example, there does not exist any association or set of rules between the RA ID and the RC ID or the identifier of any RC
belonging to that RA. Additionally, the term level has only a relative sense, since it is not fixed in value; thus, a new RA may be added as needed as the first hierarchical level, at the top of the current hierarchy, or even between two existing hierarchical levels. This flexibility yields great convenience in managing carriers' networks as they develop and grow. The routing architecture described in G.7715 is applicable to multilevel routing hierarchies and is very powerful for scaling networks while providing sufficient routing information to efficiently compute routes across multiple separately managed physical networks. Recommendation G.7715 does not specify the detailed routing protocol to be used but leaves it as an implementation decision. Recommendation G.7715.1 further defines requirements for link state routing protocols in a protocol-independent manner.
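A two-level hierarchy like that of Figure 16-38 can be exercised with a toy source-route computation: a path is first computed over the parent RA's abstract topology, and each abstract hop is then expanded inside the corresponding Level 0 RA. This is only a sketch; the topologies, border-link names, and expansion rule are invented, and real implementations follow G.7715.1.

```python
from collections import deque

def bfs_path(adj, src, dst):
    """Shortest-hop path in an undirected graph {node: set(neighbors)}."""
    prev, queue = {src: None}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None

# Parent-level topology: the child RAs appear as abstract nodes.
PARENT = {"RA1": {"RA2"}, "RA2": {"RA1"}}
# Internal (Level 0) topologies of the child RAs.
TOPO = {
    "RA1": {"A1": {"A2"}, "A2": {"A1", "A3"}, "A3": {"A2"}},
    "RA2": {"B1": {"B2"}, "B2": {"B1"}},
}
# Inter-RA link: exit node in the first RA, entry node in the second.
BORDER = {("RA1", "RA2"): ("A3", "B1")}

def hierarchical_route(src_ra, src, dst_ra, dst):
    """Compute an abstract RA sequence, then expand each RA segment."""
    ra_seq = bfs_path(PARENT, src_ra, dst_ra)
    route, entry = [], src
    for here, nxt in zip(ra_seq, ra_seq[1:] + [None]):
        if nxt is None:                       # last RA: run to the endpoint
            route += bfs_path(TOPO[here], entry, dst)
        else:                                 # expand up to the exit point
            exit_node, far_entry = BORDER[(here, nxt)]
            route += bfs_path(TOPO[here], entry, exit_node)
            entry = far_entry
    return route

assert hierarchical_route("RA1", "A1", "RA2", "B2") == ["A1", "A2", "A3", "B1", "B2"]
```

The parent-level computation sees only abstract nodes, mirroring the requirement that the internal topology of an RA need not be revealed upward.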
16.7.4 Routing Information Exchange

This section discusses the routing information that is available at the various routing levels. From the point of view of architecture, links are wholly contained within a routing area, so a link only exists in the lowest level RA that contains both link ends. Connection Controllers have identical scope. In this model, there is no notion of exchanging routing information up and down the levels of the routing hierarchy via RCs. However, implementations invariably handle all layers and levels within a single component, and in this case it is convenient to consider links to be attached to the physical switch, and to discuss the visibility of links in terms of information flow up and down the levels. While the rest of this section is written in terms of information flows, it is important not to lose sight of the architectural separation that is still maintained. Routing information may be exchanged across different levels of the routing hierarchy (between an RC, its parent, and its child RCs), and the information flows between levels are not specific to any particular paradigm (e.g., centralized or link state). The transport plane topology is advertised as a set of nodes and links. Nodes may correspond to abstract or physical entities within the child RA, and no distinction is made between them in the advertisement. For example, referring to Figure 16-38, there is no distinction in the advertisement between the (abstract) nodes in the Level 1 RA and the physical nodes in the Level 0 RAs. Recommendation G.7715.1 indicates that the types of information flowing upwards (i.e., level N to level N+1) and downwards (i.e., level N+1 to level N) both involve the exchange of reachability information and may include summarized topology information—in other words, the transformation of
one RA topology as a virtual RA topology (in terms of nodes and links) for the purposes of summarizing routing information for advertisement in its parent RA. The transformation mechanism is not intended for standardization. Recent insight clarifies that this transformation of RA topology is concerned with calculating the cost of crossing an RA, and not with presenting a different view of the RA internal topology. The cost has been expressed in terms of a set of nodes and links because that is what today's routing protocols handle. The reader should be aware that the discussion is about cost; it is not about revealing RA internal details. It should be noted that G.8080 requires the topology exchanged to be specific to a layer of the transport hierarchy. Consequently, links will only be reported in a specific layer's topology if the link supports the signal type of that layer. As noted earlier, multiple layer topologies can be carried by one routing protocol as long as the link and node information exchanged is identified as being specific to a layer. Finally, to route a connection to a given customer, we must know through which RA the customer can be reached. The routing protocol must thus advertise client reachability information in the form of UNI-N SNPP addresses. A UNI-N SNPP address associated with a given RA is advertised by the RC(s) that represent that RA, so that other RCs will learn of the reachability of that UNI Transport Resource and pass this information on to their associated RAs. Note that the network must contain a directory that maps UNI-C Transport Resource names onto the local UNI-N SNPP addresses. Also note that the advertised SNPP address may be different from the internal SNPP address, with address resolution occurring at the RA boundary. There are two options for propagation of routing information at a given hierarchical level.
The first is uploading the information to a centralized route server, such as a management station, where the information can be integrated and used for computing routes. This option is sometimes referred to as the path computation server approach. The second option involves propagating the information throughout the entire routing hierarchy, which requires that the information be disseminated among all the hierarchical routing levels. The routing information of an RA at hierarchical level N (RA(N)) can be disseminated to the RA at level N+1 (RA(N+1)). The information communicated among the cooperating RCs usually includes the set of reachable UNI Transport Resource identifiers, inter-RA links (from the perspective of RA(N+1)), and nodes. Nodes with internal detail (abstract nodes) may have a cost associated with them, which may be expressed in terms of a summarized/transformed topology of nodes and links. After the information is communicated to level N+1, the RCs in RA(N+1) will cooperate to advertise
the information so that the routing information associated with a small set of RAs will be learnt by the others. Recommendation G.7715.1 describes two approaches by which the routing information of RA(N+1) can be provided to RA(N). In the first approach, the RP in the containing RA at level N+1 provides the level N RP with the reachability and topology information visible at level N+1. The information visible at level N+1 includes the information visible at consecutive upper levels. This information may then be used by the level N RP to compute a path that leaves the RA. In the second approach, recursive requests are made from the level N RP to the level N+1 RP, upward towards the root of the routing hierarchy. The result of each request is analyzed by the requesting RP to determine the exit point utilized by the level N+1 RP. The RP will then update the path, including the path computed through the level N RA, and return it to the requester. This approach is loop free, as the routing hierarchy defined in G.8080 has strict containment (preventing a contained RA from containing a containing RA).

16.7.4.1 General Attributes
Recommendation G.7715.1 focuses in more detail on the attributes associated with usage of an ASON-compliant link state routing protocol to advertise transport plane topology between RAs. We divide the information disseminated via a routing protocol into node attributes (using the definition of node as described earlier) and link attributes, since these are the basic topological elements. Link attributes may be further classified as those that are layer independent, such as identification, and those that are layer specific, such as adaptation support. The way that these attributes are used by the routing protocol is normally dependent on the operator's policy (e.g., the operator may decide to advertise a restricted set of reachability information). We consider the different attributes below.

16.7.4.2 Node Attributes

All nodes in a graph representation of a network belong to an RA; hence the RA ID is an attribute of all nodes. As discussed earlier, no distinction is made between abstract nodes and those that cannot be decomposed any further; the same attributes are used for their advertisement [12]. The following node attributes are defined [12]:
• Node Identification (ID): The Node ID is the subnetwork ID that exists in an SNPP name. All node IDs advertised within an RA are allocated from a common name space for that RA.
• Reachability Information: Reachability information describes the set of endpoints that are reachable by the associated node. It may be advertised either as a set of UNI Transport Resource identifiers or a set of associated SNPP identifiers, the selection of which must be consistent within the applicable scope.

For implementation purposes, it is important to identify when attributes are required to be supported in a protocol realization and must be present in advertisements, and when attributes are required to be supported but may not be present in an advertisement based on operator policy. Table 16-2, from G.7715.1 [12], provides this information, where capability refers to the level of support required in the realization of a link state routing protocol, whereas usage refers to the degree of operational and implementation flexibility, i.e., the ability of the operator to define the use or non-use of the attribute by policy. Mandatory usage attributes are those that are needed as a minimum to support path computation.
Table 16-2. Node attributes [12]

Attribute       Capability    Usage
Node ID         Mandatory     Mandatory
Reachability    Mandatory     Optional
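To see how the Reachability attribute is consumed, consider routing a connection towards a customer: the UNI-C name is first resolved through the directory to a UNI-N SNPP address, which is then matched against the reachability sets advertised per node. The sketch below is illustrative only; the address formats and names are invented.

```python
# Reachability advertisements: node ID -> set of UNI-N SNPP addresses.
REACHABILITY = {
    "node-ra1-7": {"snpp:ra1/7/1", "snpp:ra1/7/2"},
    "node-ra2-3": {"snpp:ra2/3/1"},
}
# Directory mapping UNI-C Transport Resource names to UNI-N SNPP addresses.
DIRECTORY = {"customer-a": "snpp:ra1/7/2"}

def destination_node(customer):
    """Resolve a customer name to the node advertising its UNI-N SNPP."""
    snpp = DIRECTORY[customer]
    for node, reachable in REACHABILITY.items():
        if snpp in reachable:
            return node
    return None

assert destination_node("customer-a") == "node-ra1-7"
```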
16.7.4.3 Link Attributes

Recommendation G.7715.1 defines the following set of link attributes to be supported in link state routing protocols for ASON [12]:
• Local SNPP Name: identifies the transport plane resource at the local SNPP link end
• Remote SNPP Name: identifies the transport plane resource at the remote SNPP link end
Table 16-3 provides implementation requirements for general link attributes.
Table 16-3. Link attributes [12]

Link Attribute                   Capability                     Usage
Local SNPP Name                  Mandatory                      Mandatory
Remote SNPP Name                 Mandatory                      Mandatory
Layer-specific characteristics   (refer to Section 16.7.4.4)    (refer to Section 16.7.4.4)

Note: When the remote end of a link is located outside of the RA, usage of the Remote SNPP Name is optional.
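Tables such as 16-3 translate naturally into advertisement validation: usage-mandatory attributes must actually be present in an advertisement, with the note's exception for remote ends outside the RA. A sketch with invented field names:

```python
def missing_link_fields(advert, remote_outside_ra=False):
    """Return the usage-mandatory Table 16-3 fields absent from an advert."""
    required = {"local_snpp"}
    if not remote_outside_ra:     # per the note, remote name is then optional
        required.add("remote_snpp")
    return sorted(required - advert.keys())

good = {"local_snpp": "snpp:ra1/1/1", "remote_snpp": "snpp:ra1/2/1"}
assert missing_link_fields(good) == []

inter_ra = {"local_snpp": "snpp:ra1/9/1"}   # remote end in another RA
assert missing_link_fields(inter_ra, remote_outside_ra=True) == []
assert missing_link_fields(inter_ra) == ["remote_snpp"]
```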
16.7.4.4 Layer-Specific Characteristics Recommendation G.7715.1 defines the following set of layer-specific characteristics as attributes of a link [12]: •
Signal Type: This attribute identifies the characteristic information of the layer network. Since advertisements are layer specific, this information identifies the layer network being advertised. If advertisements for multiple layer networks are combined in a single protocol instance, this attribute allows advertised information to be forwarded to the RC for that layer network. • Link Weight: This attribute represents a vector of one or more metrics, each of which indicates the relative desirability of a particular link over another during path selection. • Resource Class: This attribute corresponds to a set of administrative groups assigned by the operator to this link. A link may belong to zero, one, or more administrative groups. • Local Connection Type: This attribute identifies whether the local SNP represents a TCP, or a CP or can be flexibly configured as either a TCP or a CP. Some links may, for example, support termination of connections but not transit of connections as a result, and should only be used if the connection terminates at the remote node. • Link Capacity: This attribute provides the sum of the available and potential link connections for a particular network transport layer. Other types of capacity information have not been precluded and are for further study in G.7715.1. Providing such information on a layer-specific basis allows more accurate connection routing, since it takes into account the potential for connections at one layer impacting the availability of a link for connections at another layer due to factors such as placement within the frame, which are not obvious from a simple measurement of capacity in total available bits per second. • Link Availability: This attribute represents a vector of one or more availability factors for the link or link end. Availability may be represented in different ways between domains and within domains. Within domains, it may be used to represent a survivability capability of the link or link end. 
In addition, the availability factor may be used to represent a node survivability characteristic. Link availability may be a constraint used in routing of paths supporting connections with higher class of service. • Diversity Support: This attribute represents diversity information with respect to links, nodes, and Shared Risk Groups (SRGs) that may be used
during path computation. Such information can then be used in the computation of paths for protection/restoration purposes.
• Local Client Adaptations Supported: This attribute represents the set of client layer adaptations supported by the TCP associated with the local SNPP. It is only applicable when the local SNP represents a TCP or can be flexibly configured as either a TCP or a CP. This type of information may be used when calculating paths that require a specific adaptation whose support may differ on a link-by-link basis.

Table 16-4 specifies the implementation requirements for the layer-specific link attributes.

Table 16-4. Layer-specific characteristics [12]

  Layer-Specific Characteristic        Capability   Usage
  Signal Type                          Mandatory    Optional
  Link Weight                          Mandatory    Optional
  Resource Class                       Mandatory    Optional
  Local Connection Type                Mandatory    Optional
  Link Capacity                        Mandatory    Optional
  Link Availability                    Optional     Optional
  Diversity Support                    Optional     Optional
  Local Client Adaptations Supported   Optional     Optional
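The attribute set above can be pictured as a simple record held by a routing controller. The sketch below is purely illustrative: G.7715.1 defines these attributes abstractly, and all field names and values here are hypothetical, not a standardized encoding.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical representation of a layer-specific link advertisement.
# Field names are illustrative; G.7715.1 does not define a concrete encoding.
@dataclass
class LinkAdvertisement:
    signal_type: str                  # characteristic information of the layer network (mandatory)
    link_weight: List[float]          # vector of routing metrics (mandatory)
    resource_class: int               # administrative-group membership as a bitmap (mandatory)
    local_connection_type: str        # "TCP", "CP", or "flexible" (mandatory)
    link_capacity: int                # available + potential link connections (mandatory)
    link_availability: Optional[List[float]] = None   # optional availability factors
    diversity_support: Optional[List[str]] = None     # optional SRG identifiers
    client_adaptations: List[str] = field(default_factory=list)  # optional adaptations

adv = LinkAdvertisement(
    signal_type="VC-4",
    link_weight=[10.0],
    resource_class=0b0010,
    local_connection_type="flexible",
    link_capacity=16,
    diversity_support=["SRG-7"],
)
print(adv.local_connection_type in ("TCP", "CP", "flexible"))
```

A path computation element could then filter candidate links on the mandatory fields and treat the optional ones as additional constraints when present.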
16.8. SIGNALING (G.7713)
Recommendations G.7713 and G.7713 Amendment 1 provide protocol-neutral specifications for distributed call and connection management in ASON, which are thus applicable to multiple signaling protocols. In addition to the processes related to signaling communication, several other important issues are addressed, including:
• Rainy-day scenarios that need to be covered to support the unlikely event of defects impacting the control plane. These may include defects of the signaling channel, or defects of the control plane itself.
• The operation of, and communication between, the call and connection control components in setting up and tearing down connections. This includes specification of the messages and their information content, as well as the behaviors of the signaling mechanism.
• Issues that need to be resolved to handle alarm suppression in the transport plane when connections are set up and removed.
We note that ASON signaling components (especially the NCC) may be centrally instantiated. While G.7713 does not preclude this, it focuses upon protocol requirements for the case in which signaling components are
distributed. The messages and the information passed by the distributed signaling components are defined in G.7713 in an abstract manner. If different signaling protocols are used in a common ASON network, they may interwork across various reference points by transferring equivalent messages and information elements between their protocol-specific encodings.
16.8.1 Call and Connection Management Operations

This section describes call and connection management operations after the contract between the user and the provider has been established. Figure 16-40 (based upon Figure 6-Am1-1/G.7713/Y.1704 [5]) provides a simple high-level illustration of the interactions between the end users (calling and called parties), call controllers (CCCs), and network call controllers (NCCs) in a two-domain network. The Calling Party Call Controller (CCC-a) interacts with the Called Party Call Controller (CCC-z) by means of one or more intermediate network call controllers (NCCs) at service demarcation points (i.e., UNI, E-NNI). The call controllers perform the following actions:
• The NCC correlates the SNCs to the call.
• NCC-1a and NCC-2z work with CCC-a and CCC-z, respectively, to correlate Link Connection(s), LC(s), to the call access segments.
• NCC-1 works with its peer NCC-2 at domain boundaries to correlate LC(s) to the interdomain call segment.
• The NCCs correlate the LCs and Subnetwork Connections (SNCs) that are associated with each call segment within their respective domains.
Figure 16-40. Interaction among Call Controllers in a two-domain example [5]
Connection controllers (CCs) establish the connections that are associated with each call segment.
16.8.2 Basic Call and Connection Control Sequences

Call and connection setup in transport networks uses a three-message sequence consisting of the SetupRequest message, which initiates the call and specifies the desired traffic characteristics for the connection; the SetupIndication message, which acknowledges establishment of the connection across the network; and the optional SetupConfirm message, which confirms end-to-end to the destination node that the connection has been made.

16.8.2.1 Call and Connection Setup
Referring to the example of call setup request processing in Figure 16-41 (based upon Figure 6-Am1-1/G.7713/Y.1704 [5]), the Calling Party Call Controller, CCC-a, requests call setup, and the ingress NCC-1a initiates processes to check the call request. These may include checking for authentication and integrity of the request, as well as constraints placed by policy decisions (described in Section 16.5). The request is also sent to NCC-2z and NCC-2 (at the service demarcation points). Processes invoked at the egress NCC-2z may include verifying that the call request is accepted end-to-end [5]. Upon successful checking, CCC-a continues the call setup request by initiating a connection setup request to its associated Connection Controller, CC-a. The connection setup request process performs the coordination among the respective CCs to set up and release connections. When a connection is required for the call, the call is not considered complete until the connection is in place [5].
Figure 16-41. Example of call setup request processing [5]. (CC-a: A-end Connection Controller; CC-z: Z-end Connection Controller; ACC-n: A-end Connection Controller at Domain n; ZCC-n: Z-end Connection Controller at Domain n; TCC-n: Transit Connection Controller in Domain n)
Upon successful indication by the connection setup request process (across all call segments), the call setup request is successfully completed, and transfer of user characteristic information may begin. If the connection setup request process was unsuccessful, a call-denied notification is sent to the user [5].

16.8.2.2 Call/Connection Release

Call and connection release in transport networks, in its basic form, uses a two-message sequence consisting of the ReleaseRequest message, which initiates release of the connection and triggers release at the next hop, and the ReleaseIndication message, which acknowledges completion of the release of the local channel. Optionally, a Notify message may be sent to the remote end prior to initiating connection release, in order to prevent alarms from in-band performance monitoring that might otherwise be triggered by the connection release in the data plane.

16.8.2.3 Query

Finally, the ability to query the status of a connection or call is provided through the QueryRequest message and its response, the QueryIndication message, so that it is possible to audit the state of the connection at a neighboring node and initiate a connection release or state resynchronization in case of a conflict.
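The three message exchanges above can be sketched as a toy state machine. The message and state names follow the text; the class and everything else are illustrative, not part of G.7713.

```python
# Toy simulation of the G.7713 message sequences described above.
# Message names (SetupRequest, ReleaseRequest, QueryRequest, ...) follow the
# text; the ConnectionEndpoint class itself is a hypothetical sketch.

class ConnectionEndpoint:
    def __init__(self):
        self.state = "idle"

    def handle(self, message: str) -> str:
        # Setup: three-message sequence (the SetupConfirm is optional).
        if message == "SetupRequest":
            self.state = "established"
            return "SetupIndication"
        # Release: two-message sequence; an optional Notify may precede it
        # to suppress alarms from in-band performance monitoring.
        if message == "ReleaseRequest":
            self.state = "idle"
            return "ReleaseIndication"
        # Query: audit the connection state at a neighboring node.
        if message == "QueryRequest":
            return f"QueryIndication(state={self.state})"
        raise ValueError(f"unexpected message: {message}")

node = ConnectionEndpoint()
print(node.handle("SetupRequest"))    # SetupIndication
print(node.handle("QueryRequest"))    # QueryIndication(state=established)
print(node.handle("ReleaseRequest"))  # ReleaseIndication
```

A real implementation would, of course, forward these messages hop by hop between connection controllers rather than handle them at a single endpoint.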
16.8.3 Signaling Attributes

Distributed call and connection management attributes may be separated into attributes associated with the call and those associated with connections. In both cases, the scope of an attribute may be local, constrained to one reference point, or global, carried across the network. Recommendation G.7713 Amendment 1 provides these attributes for UNI, E-NNI, and I-NNI signaling processing [5].
• UNI signaling processing includes call attributes as well as connection attributes for setting up LC(s) on user-to-network domain access links. Examples of call attributes include the Calling and Called UNI Transport Resource Names, Call name, and policy attributes. It should be noted that call identity attributes have end-to-end scope. For example, the value of the UNI Transport Resource Name must be globally unique, and is assigned by the service provider. Examples of connection attributes include the Initiating/Terminating Connection Controller Names and Connection Name.
• E-NNI signaling processing includes call attributes as well as connection attributes for setting up LC(s) on interdomain access links. The call attributes are the same as for the UNI, though the Calling/Called UNI Transport Resource name may be carried transparently. Connection attributes include SNP and SNPP IDs, as well as the Called/Calling Access Group Container (AGC) SNPP ID. An AGC is a single layer entity that can terminate multiple SNPP links and contains access groups, LRMs, and TAPs.
• I-NNI signaling processing includes connection attributes. If call communications traverse I-NNIs, call parameters must be carried transparently.
Abstract attributes at the E-NNI are provided in Table 16-5 (Table 7-3/G.7713/Y.1704 [5]):
Table 16-5. E-NNI Call and Connection attributes [5]

                        Call vs. Connection   Attribute                    Scope
  Identity attributes   Call                  Calling UNI Transport        End-to-end
                                              Resource name
                        Call                  Called UNI Transport         End-to-end
                                              Resource name
                        Connection            Initiating CC/CallC name     Local
                        Connection            Terminating CC/CallC name    Local
                        Connection            Connection name              Local
                        Call                  Call name                    End-to-end
  Service attributes    Connection            SNP ID                       Local
                        Connection            SNPP ID                      Local
                        Connection            Called AGC SNP ID            End-to-end
                        Connection            Called AGC SNPP ID           End-to-end
                        Call/connection       Directionality               Local
                        Call                  CoS                          End-to-end
                        Call                  GoS                          End-to-end
  Policy attributes     Call/connection       Security                     Local
                        Connection            Explicit resource list       Local
                        Connection            Recovery                     Local
16.8.4 Signaling Application Example

An interesting signaling application example is that of dual homing. Dual homing [40] refers to the scenario in which an individual user interacts with a provider network via more than one UNI. Another form of dual homing is where a user interacts with two different provider networks via different UNIs. This configuration is commonly used to increase the reliability of a user's access to the network. If the transport link(s) to the network associated with one UNI fail, the other UNI can be used. This concept could also be applied over an E-NNI for multidomain connection reliability. Aside from increased access reliability, additional services can be supported for dual homed users:
• Simple path diversity: In this scenario, illustrated in Figure 16-42, dual homed user 1 places two calls to a dual homed destination user 2, one via each of its two UNIs over a provider network domain. The connections associated with each of the two calls could be established such that they
do not share transport resources in the provider network. This is a feature being considered in the OIF for the UNI 2.0 Implementation Agreement.
Figure 16-42. Dual homing with simple path diversity
• Multidomain path diversity: In this scenario, illustrated in Figure 16-43, one user is dual homed on domain 1 and the other is dual homed on domain 2. There are multiple E-NNIs between domains 1 and 2. Again, we consider the case where dual homed user 1 places two calls to destination user 2, one via each of its two UNIs. The connections associated with the two calls could be established such that they neither share transport resources in domain 1 or domain 2 nor traverse the same interdomain transport links.
Figure 16-43. Multidomain path diversity
• Restoration from User Access Failure: In this scenario, again referring to Figure 16-42, consider a failure in one of the transport links associated with a particular UNI. It would be desirable to trigger a restoration action that would enable a connection still to be established between users 1 and 2 via the transport link associated with the alternate UNI.
• Optimal Path: A dual homed user may prefer to utilize the transport link associated with one particular UNI versus another for a particular destination. A capability could be established to enable the user to express this preference.
In order to support the above applications, we must consider the interactions between a call over one UNI and a call over the other UNI. This also introduces dependencies, or awareness, between calls within network domains. In other words, the Network Call Controllers for a dual homed user need to discover and then communicate with each other.
Further, for diverse path applications, a critical factor is whether the user knows that path diversity is required before the first path is established. This factor introduces the concept of dependent and independent calls. If the user knows in advance that diversity is required, we consider these to be dependent calls. In this case, separate sequential call requests are made for each path; the service to the user is the ability to make a call request for a connection that should exclude elements of an existing path. In the case of independent calls, it is not known prior to any call establishment that diverse paths will be needed: the requirement for a second, diverse path was not known at the time the first connection was established. This situation may arise, for example, when an existing transport user upgrades a service contract with a provider due to increased bandwidth needs. As with dependent calls, there are no requirements on when the two paths have to be set up relative to each other.

In both cases, what is required in the network to support these dual homing applications is the ability for Network Call Controllers that share a common user to discover each other and then communicate regarding call context. NCCs can discover other NCCs that share a common user by using the common CCC to inform the NCCs about each other. Further, call contexts from one NCC can be transferred transparently to another dual homed NCC via the common CCC [40].
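The discovery step just described — the shared CCC introducing the dual-homed NCCs to each other and then relaying call context between them — can be sketched as follows. All class and method names here are hypothetical illustrations, not defined by G.7713 or [40].

```python
# Hypothetical sketch of dual-homing NCC discovery via the common CCC:
# the user's call controller informs each network call controller about the
# others serving the same user, after which call context can be relayed
# transparently through the CCC. Names are illustrative only.

class NCC:
    def __init__(self, name: str):
        self.name = name
        self.peers = set()        # names of other NCCs serving the same user
        self.call_context = {}    # context relayed from peer NCCs

    def learn_peer(self, peer_name: str):
        self.peers.add(peer_name)

class CCC:
    """Call controller shared by both UNIs of a dual-homed user."""
    def __init__(self, nccs):
        self.nccs = nccs

    def introduce_nccs(self):
        # Inform every NCC about the other NCCs that share this user.
        for ncc in self.nccs:
            for other in self.nccs:
                if other is not ncc:
                    ncc.learn_peer(other.name)

    def relay_context(self, source: NCC, context: dict):
        # Transfer call context transparently between dual-homed NCCs.
        for ncc in self.nccs:
            if ncc is not source:
                ncc.call_context.update(context)

ncc1, ncc2 = NCC("NCC-1"), NCC("NCC-2")
ccc = CCC([ncc1, ncc2])
ccc.introduce_nccs()
ccc.relay_context(ncc1, {"call": "A-Z", "diversity": "required"})
print(ncc2.peers, ncc2.call_context)
```

With the peer relationship in place, NCC-2 can honor a dependent-call request that excludes the resources already used by the path NCC-1 established.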
16.9. CONTROL PLANE MANAGEMENT

While the goal of the ASON control plane is to provide a means of relieving management systems from the burden of connection management, it adds its own set of elements requiring management. The management of ASON is currently a work in progress within ITU-T SG 15 and the TMF, and this section provides a brief description of the direction the work is taking in the standards bodies involved. Specifications for network management are primarily being developed within the TMF and the ITU-T. The TMF is concentrating on the network aspects of the management problem, while the ITU-T is concentrating predominantly on the configuration of network elements. These efforts are complementary, and both are necessary for a complete management solution.

The ITU-T initiated its work with G.7718, "Framework for ASON Management", which discusses the management requirements for the control plane. These requirements encompass the use of ASON constructs (e.g., RAs, SNPP links), management of calls and connections, and configuration,
fault, performance, accounting, and security management for ASON. G.7718 also gathers the many management-related details from other ASON-related Recommendations into one place. Work is proceeding on G.7718.1, "Protocol-neutral management information model for the control plane view." By protocol neutral, we mean a design that can be implemented using several available management protocols; thus, the design is not restricted by the strengths (and weaknesses) of any particular protocol.

While a full description of network management is beyond the scope of this chapter, it is useful to present some context regarding the principles of network management as practiced in the transport network. For many years, ITU-T network management standardization efforts have focused on managing the service instance and its relation to the network. Traditionally, the connection service instance is realized using transport resources across potentially nonhomogeneous network segments. This means that connections are first-class objects that are linked to the underlying hardware supporting the connection. Thus, the focus has been on service rather than equipment, and it has been said that "service is managed while equipment is repaired". The alternative approach is for the existence of a connection to be inferable only from properties of the hardware, which makes it difficult, if not impossible, to observe (and hence manage) any particular connection.

The focus on service has given rise to a uniform method for modeling the transport resources and presenting management information that is independent of the underlying technology. The principle of managing the service instance has also been followed in ASON control plane management standardization, where the service instance is now the call, supported by connections that are managed as before. The definition of a limited set of objects that are used as required on every network element also follows that model.
This method is contrasted with that of only managing protocol instances, with little regard for uniform models or for the underlying network structure. The TMF has produced four documents: TMF 513 Business Agreement [41], which details the business requirements; TMF 608 Multi-Technology Interface Agreement [42], which details the interface design in UML; and TMF 814A Implementation Statement [43] and TMF 814 IDL Implementation [35], which define the interface in CORBA IDL. These documents provide a network view of the transport resources assigned to the control plane, as well as the calls and connections that are supported by that network. The assumption is that any necessary provisioning has already been done; while the network manager can inspect the results of that provisioning, it does not itself do the provisioning.

Network management interfaces are described in terms of managed objects, and these objects are derived from the real-world entities that
describe the system. Transport-managed objects are derived from the atomic functions of the G.805 (and G.809) architectural models, as these models provide an implementation-independent view of the transport functions. The equivalent entities in ASON are the components, but ASON components are merely used to describe interfaces that are externally observable. ASON components themselves, therefore, do not make good managed objects, since the components are merely a descriptive technique and not a functional specification.

Our examination of the ASON architecture has shown that there are distinct classes of components that serve very different purposes. One set describes the topology of the network that ASON operates on, leading to objects for links and routing areas. These objects are sufficient to support the network view. The current view of the other ASON-managed objects is that the component interfaces and the Protocol Controllers that support them are a good basis for management. These interfaces exist on real equipment and have properties that need to be configured. The interfaces support real control plane services (signaling, routing, discovery), and a particular service may be enabled or disabled, or a particular protocol may need its timers to be set, depending on which physical interface is used. Service- and protocol-related objects allow for policies to be set, and the users that are affected are determined entirely by how the managed objects relate to one another. Using the RC components, Figure 16-44 provides an illustration of how these concepts might apply to the Routing service. Other services are similarly handled.

For completeness, the current view of ASON-managed objects for G.7718.1 is illustrated (in UML notation) in Figure 16-45; we note that these object classes may differ once the Recommendation is published. The essential element to understand with regard to these objects is the rationale behind their derivation; comprehension of the actual objects is straightforward.
Figure 16-44. Routing service example
Figure 16-45. Current view of ASON-managed objects for G.7718.1, in UML notation (object classes include ControlPlaneService, CallControllerService, ConnectionControllerService, RoutingService, SNPLinkConnection, Connection, Trail, CallDetailRecord, and ConnectionDetailRecord)
16.10. PROTOCOL ANALYSIS

As was discussed in the introduction to this chapter, various standards bodies and industry fora have been involved in the development of control plane specifications, which are moving towards a convergence of requirements and protocols. This convergence has involved an iterative multiyear process of analysis, solution assessment, interoperability testing, and feedback, resulting in the continuing evolution of supporting foundation protocols and ASON-specific protocol extensions. From a protocol perspective, this process spans areas ranging from "graceful restart" mechanisms to additional error codes/values, for example, for connection rejection. To recap, some key ASON requirements include support for:
• Noncongruent control plane, DCN, and transport plane topologies, with accompanying independent identifier spaces;
• Separation of call (service) and connection, including zero or more connections associated with each call segment;
• Call and connection operations arising only from explicit requests;
• Control plane failures not causing connection failures;
• Separation of location from identity (both of which are represented by the IP address in today's Internet [28]);
• Modularized design around open interfaces at domain boundaries;
• Layer-specific, in addition to layer-independent, information (e.g., per G.803 [44], G.872 [45], I.326 [46], G.8010 [47], G.8110 [48]);
• More than two levels of routing hierarchy; and
• Rearranging/modifying address plans and Routing Areas.
Considerable analysis of signaling protocols vis-à-vis requirements has been performed. Within this section, we cite several examples of requirements that were reflected in the foundation signaling protocols. Further examples, and elaboration thereof, are provided in more detailed references (e.g., [48]). Requirements that were reflected in ASON protocol extensions, as described in G.7713.1, G.7713.2, and G.7713.3, are described in Section 16.11.
Similar analysis is currently taking place with regard to routing protocols; at the time of writing, this work was not complete.
16.10.1 Analysis Approach

As discussed in the introduction, the ITU-T started "top down" and the IETF started "bottom up" in developing standards to support the optical control plane. Within the IETF, protocol extensions for supporting dynamically provisionable optical cross-connects, initially termed MPλS
(MPLambdaS), started in earnest during 2000. This work predated G.807, as described in Section 16.2, which provided a foundation of fundamental ASON requirements and architecture. As might be expected, iteration of the first Generalized MPLS protocol proposal, introduced in November 2000 [49], was required to start addressing some of these requirements. Once G.8080 and G.7713 were approved, they were used to provide a further basis for determining the capabilities of the ASON network. During 2002, an intensive effort took place to methodically analyze the emergent GMPLS RSVP-TE signaling protocol vis-à-vis ASON requirements. This analysis was performed utilizing a traceability matrix showing, for each ASON requirement, what (if any) GMPLS feature fulfilled the requirement and where any gaps existed. For illustrative purposes, an excerpt from this traceability matrix is provided in Table 16-6.

Table 16-6. Excerpt from Template of ASON Requirements Traceability

  Comp.  REF#  Requirement                                     GMPLS RSVP-TE
  CC     1146  Connection setup — success                      Yes: Resv (or ResvConf) message
  CC     1147  Connection setup — failed: message error        Yes: ERROR_SPEC
  CC     1148  Connection setup — failed: called party busy    Yes: ERROR_SPEC 24/5
  CC     1149  Connection setup — failed: calling party busy   No (requires extensions to error codes/values)
  CC     1150  Connection setup — failed: timeout              Yes: ERROR_SPEC 24/5
  CC     1151  Connection setup — failed: identity error:      No (requires extensions to error codes/values)
               invalid A-end user name
  CC     1152  Connection setup — failed: identity error:      No (requires extensions to error codes/values)
               invalid Z-end user name
  CC     1153  Connection setup — failed: identity error:      No (requires extensions to error codes/values)
               invalid connection name
  CC     1154  Connection setup — failed: service error:       Yes: ERROR_SPEC 24/6
               invalid SNP ID
  CC     1155  Connection setup — failed: service error:       Yes: ERROR_SPEC 24/6
               unavailable SNP ID
  CC     1156  Connection setup — failed: service error:       No (requires extensions to error codes/values)
               invalid SNPP ID
  CC     1157  Connection setup — failed: service error:       No (requires extensions to error codes/values)
               unavailable SNPP ID
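The gap analysis performed with such a matrix is mechanical: for each requirement, record the satisfying protocol feature (or its absence) and collect the unsatisfied entries. The sketch below illustrates this with a subset of the Table 16-6 entries; the data structure is our own, not part of any standard.

```python
# Sketch of the traceability-matrix analysis described above: map each ASON
# requirement reference number to the GMPLS RSVP-TE feature satisfying it
# (None = gap requiring protocol extension). Entries mirror Table 16-6.
matrix = {
    1146: "Resv (or ResvConf) message",
    1147: "ERROR_SPEC",
    1148: "ERROR_SPEC 24/5",
    1149: None,  # calling party busy: needs error code/value extensions
    1150: "ERROR_SPEC 24/5",
    1151: None,  # invalid A-end user name
    1156: None,  # invalid SNPP ID
}

# Identify the requirements for which extensions must be defined.
gaps = sorted(ref for ref, feature in matrix.items() if feature is None)
print(gaps)  # [1149, 1151, 1156]
```

In the actual effort, the gap list drove the definition of the ASON-specific extensions discussed in the following sections.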
The effort resulted in information on the degree to which the protocol satisfied ASON requirements, and stimulated work on refinements and extensions for identified gaps. In the evolution towards convergence of requirements and protocols to meet business needs, the industry has come a long way from the earliest
MPλS/GMPLS protocol behaviors, where data and signaling were assumed congruent, the focus was on IP services, and the behavior was a mirror of that for data (e.g., connection tear-down without request).
16.10.2 Requirements Implications on Protocol Solutions

This section provides some examples of protocol extensions resulting from requirements considerations. The intent is not to provide an exhaustive enumeration, but merely to give a flavor for the nature of such extensions.

For example, the requirement that congruence not be assumed among control plane, transport plane, and DCN topologies had some fundamental signaling protocol implications. "Classical" RSVP and RSVP-TE assume that the control signaling messages are sent in-band and that a failure of a link or node results in both transport plane failure and control plane failure. In other words, it is assumed that the failure to receive a Path or Resv refresh indicates that the local state should be released. Similarly, in RSVP-TE the failure to receive Hello messages can be an indication that the link or adjacent node has failed [50]. However, this is not the case for out-of-band signaling, where the control and transport planes can fail independently (and their failure detection should be performed independently).

To ensure high reliability and robustness, and to prevent calls and connections from being released without an explicit request, all call and connection operations within the ASON network have to be explicitly requested. This means that signaling protocols should operate as a "hard-state" mechanism. PNNI and LDP are both hard-state mechanisms in that any operation must be explicitly requested. However, classical RSVP-TE is, by definition, a soft-state mechanism. To address this issue, extensions were made to GMPLS RSVP-TE to allow it to behave in a "pseudo" hard-state manner. These included extending timers to support an "infinite" timer for soft state, that is, adding a Restart Capabilities object (RESTART_CAP). This extension makes it possible to distinguish between restarting the control plane and operating the transport plane without any control plane [50].
(It should be noted that other failure types may still occur that will result in a soft-state response from RSVP-TE; for example, a failure to receive consecutive refresh messages due to congestion, rather than node failure, may result in the failure of particular connections.)

Another example is related to the requirement that call and connection operations should only arise from specific requests, and thus the act of establishing or tearing down a connection should not, of itself, cause alarms to be raised [1]. However, it was observed that a false alarm could occur during connection deletion if a source stopped sending a valid signal, or if a network node deleted its transport plane state, before all downstream nodes
have stopped fault detection for the connection. To address this issue, an Administrative Status object was created to allow the ingress or egress of the connection to change the administrative status of the connection [50].
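The soft-state versus "pseudo" hard-state distinction discussed above amounts to the choice of refresh-expiry policy: classical soft state tears connections down when refreshes stop arriving, while the RESTART_CAP-style extension behaves like an effectively infinite refresh timeout, so control-plane silence does not disturb the transport plane. The timer values and function below are illustrative only.

```python
# Illustration of soft-state vs. "pseudo" hard-state refresh behaviour.
# Classical RSVP soft state expires when refreshes stop; the hard-state
# extension is modelled here as an infinite refresh timeout. All numbers
# are hypothetical.
import math

def state_alive(seconds_since_refresh: float, refresh_timeout: float) -> bool:
    # State survives only while the last refresh is newer than the timeout.
    return seconds_since_refresh < refresh_timeout

SOFT_STATE_TIMEOUT = 90.0      # e.g. a few missed 30-second refreshes
HARD_STATE_TIMEOUT = math.inf  # "infinite" timer: never expire on silence

outage = 600.0  # control plane silent for 10 minutes; transport plane healthy
print(state_alive(outage, SOFT_STATE_TIMEOUT))  # False: connection torn down
print(state_alive(outage, HARD_STATE_TIMEOUT))  # True: connection preserved
```

This captures why the extension satisfies the ASON requirement that control plane failures must not cause connection failures.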
16.11. METHODS AND PROTOCOLS — DISCOVERY

16.11.1 Layer Adjacency Discovery Methods

Consistent with the ASON control plane architecture, Layer Adjacency Discovery is performed on a per-transport-layer basis. However, the actual discovery protocol used can be the same at each layer; only the mechanism used to transport the discovery protocol messages may differ among layers. There are two methods that may be utilized for transporting discovery messages, described within G.7714 and G.7714.1: one using in-band overhead in the server layer, and the other using a test set in the client layer.
• Type 1: In-band overhead in the server layer. In this approach, the server layer trail overhead carries the discovery message, which is used to discover the peer TCPs (e.g., TCP3S to TCP3R in Figure 16-46 [10]). The CP-to-CP relationships are inferred from the TCP-to-TCP relationships using local knowledge of the configuration of the adaptation function and its relationship with the trail termination function. An important point to note about this method is that this overhead information persists through control entity failures.
• Type 2: Test set in the client layer. In this approach, a test signal carried in the client payload is used to discover the peer TCPs (e.g., TCP1S to TCP1R in Figure 16-46 [10]). The CP-to-CP relationship is inferred from local knowledge of the matrix connection that was previously set up to connect the test signal to the desired CP.
Figure 16-46. LAD discovery processes [10]
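Both methods end with the same result: a TCP-to-TCP adjacency is learned, and CP-to-CP adjacencies are then inferred from local knowledge of the adaptation configuration. The sketch below illustrates that inference step; the endpoint names and data structures are hypothetical, loosely echoing Figure 16-46 rather than reproducing it.

```python
# Sketch of the adjacency inference common to both discovery methods:
# a TCP-to-TCP adjacency is discovered (via server-layer trail overhead in
# Type 1, or a client-layer test signal in Type 2), and CP-to-CP adjacencies
# follow from local knowledge of the adaptation function. Illustrative only.
tcp_adjacency = {"TCP-3S": "TCP-3R"}  # learned from the discovery message

# Local knowledge: which client CPs each TCP's adaptation function serves,
# in corresponding order at the two ends.
local_adaptation = {"TCP-3S": ["CP-a", "CP-b"]}
remote_adaptation = {"TCP-3R": ["CP-x", "CP-y"]}

def infer_cp_adjacencies(local_tcp: str) -> list:
    """Pair client CPs across the discovered TCP-to-TCP adjacency."""
    remote_tcp = tcp_adjacency[local_tcp]
    return list(zip(local_adaptation[local_tcp], remote_adaptation[remote_tcp]))

print(infer_cp_adjacencies("TCP-3S"))  # [('CP-a', 'CP-x'), ('CP-b', 'CP-y')]
```

The persistence noted for Type 1 matters here: because the overhead keeps flowing regardless of control-entity state, `tcp_adjacency` can be reverified after a control plane restart.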
16.11.1.1 Type 1 — Overhead in Server Layer

As discussed earlier, the mechanisms defined to support the layer adjacency discovery process apply on a per-layer basis. Within each of the layer networks that support the discovery process, different mechanisms are available. These may reuse the available Embedded Communications Channels (ECCs) for the particular layer. The following mechanisms are applicable to SDH layer networks:
• RS layer: the J0 section trace and the Section DCC may be used to support discovery of the RS TCP-to-TCP adjacency.
• MS layer: the Multiplex Section DCC may be used to support discovery of the MS TCP-to-TCP adjacency.
• HOVC layer: the higher-order Path layer J1 trace may be used to support discovery of the HOVC TCP-to-TCP adjacency.
• LOVC layer: the lower-order Path layer J2 trace may be used to support discovery of the LOVC TCP-to-TCP adjacency.
The following mechanisms are applicable to OTN layer networks:
• OTUk layer: the SM section monitoring bytes and GCC0 may be used to support discovery of the OTUk adjacency. Specifically, the SAPI subfield within the SM is used to carry the discovery message.
• ODUk layer: the PM path monitoring bytes and the GCC1 and GCC2 bytes may be used to support discovery of the ODUk adjacency. Specifically, the SAPI subfield within the PM is used to carry the discovery message.
Chapter 16
16.11.1.2 Type 2 — Test-Signal Method

In this method, test signal generators and receivers are used for directly discovering layer adjacencies. The test signals can encapsulate the necessary identification information. This method is utilized when the associations at the fabric port level cannot be accomplished using the server trail overhead method because the trail is not available to do so. This approach is often useful in scenarios where two switches are connected by intermediate pre-configured switches (or nailed-up connections), e.g., two SONET/SDH switches interconnected by a time-slot interchange device. In such cases, physical port connectivity does not directly provide information on the connectivity at the fabric port level. The test-signal method therefore discovers the TCP-to-TCP associations by means of test signal generators and receivers. Specific test signals are used for creating in-band events that effect associations between two TCPs directly, i.e., without discovering any server layer trails. For example, in the case of SONET/SDH, the test signals themselves can include the identity information in the form of J0, J1, or J2 strings or in the form of Section and Line DCC. It is important to remember that this method can only be used prior to establishing a link connection.

16.11.1.3 LAD Protocol (G.7714.1)

The format for carrying discovery messages is defined in G.7714.1. The following considerations influenced the definition of this format:
• In order to foster interoperability, it was decided to use a single common message format for the discovery protocol, independent of the mechanism used (e.g., J0/J1/J2 trace bytes or Section/Line DCC) at any specific layer to carry this message. Of the various mechanisms available to carry the discovery messages, the J0/J1/J2 trace bytes had the greatest number of limitations in terms of available message size and acceptable formats. Consequently, the requirements of the trace byte format were used to set limits on the format of the discovery messages.
• It was decided to limit the discovery message format to only printable characters so as to allow reuse of existing text-oriented man-machine language (e.g., TL1) equipment commonly used for carrying other non-discovery trace signals (e.g., Access Point Identifier, API). Use of arbitrary bit patterns for the discovery messages could also trigger unwarranted alarms from equipment expecting printable characters.
• An important consideration in designing the discovery protocol was to provide a means in new equipment for distinguishing the discovery messages from the traditional trail trace messages. As a result, G.7714.1
includes a distinguishing character ("+") as the first non-CRC byte, which will never appear as the first character of a TTI (trail trace identifier).
• It was also decided that the discovery message should allow for the use of any address format (e.g., NSAP, IPv4, IPv6). Recommendation G.7714.1 makes use of an external name-server that allows translation of any address format into the specific format used by the discovery message.

In summary, G.7714.1 limits the discovery messages to printable characters defined by T.50 [51], uses a special distinguishing character to set itself apart from other trace messages, and requires Base64 encoding for the TCP-ID and DA-ID. External name-servers may be used to resolve the G.7714.1 TCP name, allowing the TCP to have an IP, NSAP, or any other address format. The ECC mechanism further supports messages using the raw LAPD data link protocol or, alternatively, the Point-to-Point Protocol (PPP).
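The constraints just summarized — printable T.50 characters only, a leading "+" distinguishing byte, and Base64 encoding of the TCP-ID and DA-ID — can be illustrated with a short sketch. The field order and separator here are assumptions for illustration, not the normative G.7714.1 encoding:

```python
import base64

DISTINGUISHING_CHAR = "+"  # first non-CRC byte; never starts a trail trace (TTI)

def encode_discovery_message(tcp_id: bytes, da_id: bytes) -> str:
    """Build an illustrative discovery message: '+' marker followed by
    Base64-encoded TCP-ID and DA-ID (the '/' separator is an assumption)."""
    fields = [base64.b64encode(x).decode("ascii") for x in (tcp_id, da_id)]
    msg = DISTINGUISHING_CHAR + "/".join(fields)
    # Restricting the message to printable characters lets legacy
    # TL1-style equipment display it without raising spurious alarms.
    assert all(0x20 <= ord(c) < 0x7F for c in msg)
    return msg

def is_discovery_message(trace: str) -> bool:
    """Distinguish a discovery message from a traditional trail trace."""
    return trace.startswith(DISTINGUISHING_CHAR)
```

Because "+" never begins a legitimate TTI, a single leading-character test is enough for new equipment to separate discovery traffic from ordinary trace strings.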
16.12. METHODS AND PROTOCOLS — SIGNALING

The following sections describe signaling protocol extensions developed to support the requirements of G.807, G.8080, and G.7713 for PNNI, GMPLS RSVP-TE, and GMPLS CR-LDP.
16.12.1 G.7713.1 PNNI Signaling

Recommendation G.7713.1 [6] defines distributed call/connection management signaling based on the ATM PNNI signaling protocol [52]. PNNI [36] has been in use since the mid-1990s for control of ATM Switched Virtual Connections. It was designed for connection-oriented packet switched networks (CO-PSN) and matches fairly closely the functionality required for ASON signaling. In particular, PNNI already supports a form of call and connection separation through the use of the Network Call Correlation Information Element in the PNNI signaling protocol. PNNI is also intrinsically designed for bidirectional connections, as are used in ATM networks, and already has extensions defined for explicit routing, soft permanent connections, connection setup and release, crankback, and connection query. Recommendation G.7713.1 adds new functionality to PNNI to support the following:
• Signaling for SONET/SDH connections
• Connection persistence
• Transport over an IP control network infrastructure
The main extensions involve the addition of a new Connection Identifier Information Element (IE) format for use with transport networks, which allows the specification of new connection types and incorporates the concepts of SNP and SNPP, and the addition of a new Traffic Descriptors IE format for use with transport networks, which allows the specification of traffic descriptor attributes suitable for transport networks. No new protocol elements are required to support the connection persistence desired for ASON circuits, but some procedure changes are required to handle connection persistence under conditions of control plane failure. The ATM form of PNNI is designed to release a virtual connection in case of failure of the underlying signaling link layer protocol so as to allow the resources to be freed for future use. The opposite procedure is specified for ASON PNNI: the resources must be kept active for use by the transport connection, even in the event that the signaling link is terminated. Finally, transport over an IP control network infrastructure is specified in an Annex to G.7713.1. Use is made of an existing extension of the ATM Signaling Adaptation Layer protocol defined in Q.2111 [53] (SSCOP-MCE). This extension was originally defined for transport of messages over a link with connectionless characteristics, including potential multiplexing of traffic over multiple links.
16.12.2 G.7713.2 GMPLS RSVP-TE Signaling

Recommendation G.7713.2 [7] defines GMPLS RSVP-TE-based distributed connection management for ASON. It is based on the requirements defined in G.7713 and specifies a number of extensions to the GMPLS RSVP-TE signaling defined in RFC 3473 [54]. The specifications in G.7713.2 were designed to use the same basic mechanisms as, and to be compatible with, RFC 3473 when using RFC 3473 as an Internal Network-Network Interface (I-NNI) protocol. Recommendation G.7713.2 added new extensions onto RFC 3473 to support the following:
• Call and connection separation
• Call segments
• SPC services
• Explicit support for separation of control, transport, and DCN identifier spaces
• Address space isolation across interdomain reference points (UNI, E-NNI transport resource names)
16.12.2.1 Support for Call and Connection Separation
Within this section, we describe the approach taken towards supporting separation of calls and connections in G.7713.2. It was recognized that the GMPLS RSVP-TE signaling protocol could be extended to handle the case where a call has one or more connections, as well as the case where a call has zero connections. Recommendation G.7713.2 takes the approach where call setup and teardown are signaled simultaneously along with connection setup and teardown (in other words, the call control protocol is "piggybacked" on the connection control protocol). The rationale for the approach that was taken is summarized below. Complete separation of call and connection is utilized within PNNI, which supports a call and connection separation based on Q.2982 [55]. Recommendation Q.2982 essentially reuses the "connection" messages as "call" messages, and allows for complete dissociation of the call operation (which is a separate flow) from the connection operation. These messages contain only call-specific information elements (call identifier, bearer identifier, call capability). It was recognized that ASON GMPLS RSVP-TE could take a similar approach to supporting the call and connection separation, i.e., to reuse the Path and Resv message to set up a call session (with an associated call identifier), followed by using the same Path and Resv message to set up connections within the call (using the previously set up call identifier to link the call and connection together). Alternatively, it was also recognized that the call functions could initially be supported via utilization of a logical call and connection separation. Using the above approach, all GMPLS RSVP-TE messages associated with connection operations would remain valid for call operations. 
Given the limited use of Q.2982-based signaling protocol, which provides the call and connection separation for the Q.2931 [56]-based signaling network, it was decided to first provide support within G.7713.2 for the simple case of logical call and connection separation. The implications of logical call and connection separation are described below: • A call with zero connections is considered a transient behavior (e.g., during restoration activities). A steady-state call will always have one or more connections associated with it. In such cases, since the call operation will always be part of a connection operation, only a unique CALL_ID is required to identify which of the operations are associated with a particular call. • A new call request may be initiated with a new call identifier, i.e., assuming call modification constraints are satisfied, any request with an existing call identifier signifies a call modification (in other words, the addition, deletion, or modification of a connection within the call).
• Subsequent requests for call modifications may be supported by using an existing call identifier within the set of RSVP-TE messages (e.g., Path and Resv messages).
• Since the RSVP Session was scoped to connections within the same call segment, different connections established within the same RSVP Session share the same CALL_ID.
• Call release is achieved by releasing all the connections that have been set up within a particular call, i.e., that share the same CALL_ID.
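Under logical separation, the CALL_ID carried in each request is therefore enough to decide between a new call and a call modification, and call release falls out of releasing the last connection. A minimal sketch of that decision logic (the data structure and method names are hypothetical; real RSVP-TE state and call modification constraints are not modeled):

```python
class CallTable:
    """Illustrative tracking of calls and their connections, keyed by CALL_ID."""

    def __init__(self) -> None:
        self.calls: dict[str, set[str]] = {}  # CALL_ID -> connection IDs

    def handle_setup(self, call_id: str, conn_id: str) -> str:
        if call_id not in self.calls:
            # A request with a new call identifier initiates a new call.
            self.calls[call_id] = {conn_id}
            return "new-call"
        # A request with an existing call identifier signifies a
        # call modification (adding a connection within the call).
        self.calls[call_id].add(conn_id)
        return "call-modification"

    def release_connection(self, call_id: str, conn_id: str) -> bool:
        """Remove one connection; the call is released when the last
        connection sharing its CALL_ID goes (a call with zero
        connections is only a transient state)."""
        conns = self.calls[call_id]
        conns.discard(conn_id)
        if not conns:
            del self.calls[call_id]
            return True  # call released
        return False
```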
16.12.2.2 Support for Signaling Control Domains

In the modeling terminology of MPLS, the G.7713.2 extensions to RFC 3473 amount to using LSP stitching across signaling control domains, in that a separate RSVP session is created at the boundary between any two domains, and a separate session is used within a domain (assuming that it uses RSVP internally). As in LSP stitching, the sessions are stitched together in the control plane so that a single (end-to-end) G.805 network connection is achieved. As described in Section 16.2.5, in the transport plane it may be possible for a service to be transported in different ways across individual domains, using technologies such as virtual concatenation. Consistent with the stitching approach, G.7713.2 GMPLS RSVP-TE signaling has a different source and destination address within each signaling control domain and at the boundaries. At the call level, a single source and destination name pair is maintained end-to-end in the protocol.

16.12.2.3 Summary of G.7713.2 Extensions

A limited number of extensions were made to the base RSVP-TE signaling defined in RFC 3473. This section summarizes the impact of these extensions.
• Messages: No new messages were defined in G.7713.2. Use of some messages was considered not necessary for ASON, in particular, the ResvTear and ResvErr messages (discussed further below).
• Procedures: No changes to procedures were specified. (While usage of ResvErr and ResvTear was not deemed necessary, if used, the procedures remain the same.)
• Objects: Three new call objects are provided in G.7713.2, namely, the Generalized_UNI object defined in the OIF UNI Implementation Agreement [36], a Call_ID object, and Call_OPS (within an informative Appendix of G.7713.2). Having distinguished between call and connection objects, TLVs were designed so that they do not mix call and connection objects. This setup makes it easier for entities that deal only with connections.
• C-Types: New Session Object C-Types were added for UNI and E-NNI. The new object and C-Type code points are allocated in RFCs 3474 [57] and 3476 [58]. It should be noted that G.7713 (and subtending protocol-specific Recommendations) distinguishes between attributes that are call related versus those that are connection related, since connection-handling entities in the network generally do not have to understand or deal with call objects.

16.12.2.3.1 Messages and Procedures (ResvTear/ResvErr)
No case needing use of the ResvTear and ResvErr messages was identified in the call/connection control requirements for ASON in G.7713. Additionally, the procedures supported by ResvTear and ResvErr were felt to be inconsistent with the desired behavior of connection establishment in transport networks, since they result in a requirement for exchange of multiple messages in order to tear down the connection in case of failure during connection setup. If a need is identified in the future, the associated procedures can be added as an extension to G.7713.2. It should be noted that no protocol violation occurs if a ResvTear or ResvErr generates a PathTear in the opposite direction, as the protocol allows for this case. In particular, ResvTear and ResvErr can be used within a domain as long as the interworking point at the edge of the domain supports the appropriate interworking behavior.

16.12.2.3.2 New Call Objects
As discussed above, the new call objects defined for G.7713.2 are the Generalized_UNI, Call_ID, and Call_OPS, none of which require processing anywhere other than at call segment boundaries, based on the code point range assigned for these objects. For Soft Permanent Connections, a new SPC_Label subobject is defined in G.7713.2 to convey the identification of the label to be used by the destination switch at its egress (which is a non-signaled interface).

16.12.2.3.3 New C-Types
G.7713.2 defines specific C-Types for UNI and E-NNI in order to allow the receiving entity to unambiguously distinguish when signaling received is for a G.7713.2 UNI or E-NNI interface. Such signaling received at an RFC 3473 interface will be rejected due to the unrecognized Session C-Type.
16.12.2.3.4 Client Addressing
The Destination TNA subobject, carried within the Generalized_UNI object, is only processed at call segment boundaries (i.e., at interdomain interfaces: UNI, E-NNI), and the choice of address format supported in each domain is an option of the network operator. A similar situation occurs in IP with the understanding of exterior addresses, where only boundary network elements are required to support routing based on such addresses. It has been noted that IPv6 includes a mechanism for support of NSAP format addresses. This mechanism could potentially be used to eliminate one sub-type of the TNA; this would not affect the actual processing of addresses required, only the manner of carrying them.
16.12.2.3.5 Session Paradigm

It should be noted that G.7713.2 treats the Session object as a connection-related identifier. Using the separation of call and connection control, an end-to-end call consists of potentially multiple concatenated or "stitched" call segments. In the same way, each call segment may have associated connections; hence, the Session object would contain the destination endpoint for that connection, rather than the call endpoint. Thus, RSVP Session endpoints would coincide with corresponding call segment endpoints.
16.12.3 G.7713.3 GMPLS CR-LDP

Recommendation G.7713.3 [8] defines GMPLS CR-LDP-based distributed connection management for ASON. Based on the requirements defined in G.7713, it contains a number of extensions to the GMPLS CR-LDP signaling defined in RFC 3472 [59]. The specifications in G.7713.3 were designed to use the same basic mechanisms as, and to be compatible with, RFC 3472 when using RFC 3472 as an Internal Network-Network Interface (I-NNI) protocol. It also incorporates extensions to CR-LDP that are defined in the OIF UNI 1.0 [36] and that use protocol assignments in RFC 3476. CR-LDP [60] is similar to PNNI in that signaling state for connections is hard state. Protocol actions are used to create and remove signaling state, and signaling adjacency communications are assumed to be reliable since they are based on TCP. G.7713.3 adds a number of features to GMPLS CR-LDP in order to meet ASON requirements. These include
• Call control and call/connection separation
• Support for UNI and E-NNI reference points
• Support for separation of control, transport, and DCN functions and identifiers

Call control and call/connection separation are key aspects of the ASON architecture that enable the instantiation of communication (connections) to be independent of the service interface between the user and network or between networks (calls). G.7713.3 contains additions for supporting both these aspects. For calls, two objects are added to the call objects of OIF UNI:
- Call ID TLV: this is used to identify calls and is compatible with the CALL_ID object in G.7713.2 and G.7713.1.
- Call Capability TLV: this is used to signal call parameters between the user and network or network to network.

To enable call/connection separation, G.7713.3 supports both logical and full separation. Logical separation uses the existing connection protocol to enable call control and enables simultaneous call and connection setup. Full separation is provided by two new messages, namely, the Call_Setup and Call_Release messages. They both apply at the UNI and E-NNI reference points and are assumed to be passed from a UNI-N to an E-NNI or other UNI-N in the course of call setup. Full separation enables the capability of setting up a call without a connection and then subsequently adding connections. When the Call_Release message is used, all connections associated with the call are also released. Completion of the call release action, including connection release, is signaled by a Notification message back to the initiator of the call release.
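The full-separation semantics — a call that can exist with zero connections, connections added afterwards, and a Call_Release that also tears down any remaining connections before a Notification is returned — can be sketched as follows. The class and method names are illustrative stand-ins, not the G.7713.3 message encodings:

```python
class FullSeparationCall:
    """Sketch of full call/connection separation semantics (illustrative)."""

    def __init__(self, call_id: str) -> None:
        self.call_id = call_id
        self.active = False
        self.connections: set[str] = set()

    def call_setup(self) -> None:
        # Call_Setup: under full separation, a call may be established
        # with no connections, to be populated later.
        self.active = True

    def add_connection(self, conn_id: str) -> None:
        if not self.active:
            raise RuntimeError("connection setup requires an established call")
        self.connections.add(conn_id)

    def call_release(self) -> str:
        # Call_Release also releases all connections associated with the
        # call; completion is signaled back with a Notification message.
        self.connections.clear()
        self.active = False
        return f"Notification: call {self.call_id} released"
```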
16.12.4 Interoperability and Interworking

With the potential for multiple signaling protocol options in the near future, how will service providers be able to achieve interoperability in a multivendor environment? ITU-T has defined a domain model that allows multiple vendors or technologies to coexist within the service provider's transport network, using UNI and E-NNI interfaces to interoperate. The problem is simplified because each E-NNI signaling protocol is based on a common model and a common set of requirements defined in G.7713. Since G.7713 defines an abstract set of primitives and attributes, it can serve as a mapping from one signaling protocol to another, as might occur at a gateway transport switch terminating one protocol and originating another. Further, almost all connection management signaling is hop-by-hop, simplifying the mapping issue.
In addition, interworking between E-NNI and I-NNI signaling protocols is also a topic that has received a great deal of attention. A typical scenario might involve a network element terminating a G.7713.2 GMPLS RSVP-TE-based access signaling interface and originating an intradomain signaling protocol based on an internal version of RSVP-TE, CR-LDP, or PNNI. Using the messages and attributes defined in G.7713, it is possible to map the information carried in one protocol into the corresponding message/object in the other and thereby support the desired service across the network boundary. As a specific example, the general principles for interworking between G.7713.2 and RFC 3473 GMPLS RSVP-TE are as follows:
• Non-ASON GMPLS connections should be supported unchanged; that is, RFC 3473 can continue to be used intradomain, as G.7713.2 is only used at the UNI or E-NNI boundaries for a domain relying on RFC 3473 as its I-NNI.
• Transit control plane-enabled network elements do not need to be upgraded to support new functionality; transit equipment is only required to pass the Generalized_UNI and Call_ID objects transparently, which will naturally occur based on the code point range.
• A message received at a non-G.7713.2-capable network element will be rejected. We note that for the UNI and E-NNI interfaces, this rejection necessarily occurs due to the use of distinct C-Types for UNI and E-NNI Session objects. Since these C-Types are not recognized by a network element that does not support a G.7713.2-based interface, the network element will reject the message, since it has no way to process the unrecognized Session object type.

It should be noted that specific G.7713.2/RFC 3473 interworking design guidelines have been proposed in external standards and industry fora, and it is expected that work will continue towards progressing these to maturity. At this time, multiple parties have implemented these procedures, although further interoperability testing is expected.
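The transparent pass-through "based on the code point range" relies on the standard RSVP rule (RFC 2205): a node receiving an object whose Class-Num it does not recognize keys its behavior on the two high-order bits of that Class-Num, and values in the 11bbbbbb range are ignored and forwarded unmodified. Assigning the G.7713.2 call objects in that range is what lets legacy transit nodes carry them without upgrades. A sketch of the rule:

```python
def transit_handling(class_num: int) -> str:
    """RFC 2205 handling of an object with an unrecognized Class-Num,
    keyed on the Class-Num's two high-order bits."""
    top_bits = class_num >> 6
    if top_bits == 0b11:
        return "ignore-and-forward"    # passed through unexamined
    if top_bits == 0b10:
        return "ignore-silently"       # ignored, not forwarded
    return "reject-unknown-object-class"  # 0bbbbbbb: message rejected
```

For instance, any Class-Num of 192 or above (two high-order bits 11) is forwarded untouched, which is the behavior the G.7713.2 call objects depend on at transit nodes.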
There have been a number of interoperability demonstrations over the past few years, intended to demonstrate progress on technical specifications while identifying areas where refinement is needed. A 2004 OIF World Interoperability Demonstration validated ASON equipment from fifteen vendors, tested in seven carrier lab facilities across the globe [61]. The UNI/E-NNI interoperability demonstration included a standards-based control plane for the setup and tear-down of a transport path across multidomain networks, including Switched Connections and Soft Permanent Connections. Interoperability testing utilized the UNI 1.0 Signaling Specification, Release 2 [36], and E-NNI 1.0 Signaling Specification [37] Implementation Agreements that were based upon, and consistent with, ITU-T ASON Recommendations. While both UNI and E-NNI interfaces were based on ITU-T G.7713.2, no protocol was mandated for the I-NNI. A 2005 OIF World Interoperability Demonstration further extended this work to include dynamic setup of Ethernet-over-SONET/SDH connections using UNI-controlled Gigabit Ethernet access links and E-NNI-controlled SONET/SDH server layer connections, supporting GFP and VCAT.
16.13. METHODS AND PROTOCOLS — ROUTING

As discussed in Section 16.7, the ASON routing architecture and requirements have been developed in G.7715, including recognition of the utility of a link state routing approach for providing path computation with constraints and the detailed requirements for its instantiation. Recommendation G.7715.1 goes on to define the detailed requirements for link state routing protocols, and work is progressing on routing protocol extensions for OSPF, IS-IS, and PNNI. Scaling of the protocol follows the concept of G.7715 hierarchy, which allows flexibility of implementation at different levels and recognizes that transport networks are organized into domains. At the time of the writing of this chapter, no standards are yet available addressing E-NNI routing protocols, though considerable work has been devoted towards understanding the nature of the extensions that might be required for existing protocols to meet ASON requirements.

Closest to an instantiation of a routing protocol addressing ASON routing requirements is the work in OIF on prototyping of extensions to OSPF. The OIF prototype work involves the addition of OSPF-TE Link State Advertisement TLVs to carry separate identification of the routing controller and the transport node being advertised, to carry more detailed information on link availability to support connections at different layer networks, to carry client reachability information, and finally to support multilevel hierarchy in the routing architecture (the existing OSPF protocol supports only a single higher-level backbone routing area). This prototype work has been input to the OIF interoperability testing and was tested on a limited scale in the 2003-2005 OIF interoperability demonstrations (e.g., [61]), using multiple vendors organized into multiple domains.
Cooperative work has been ongoing at IETF, ITU-T, and OIF to pursue extension of routing protocol standards with the requirements identified in G.7715 and G.7715.1. Joint Routing Design Teams (with participants representing IETF, ITU-T, and OIF perspectives) have been formed under IETF auspices in support of this initiative; the first output is an Informational RFC [62] (in the RFC Editor's Queue) describing the G.7715 and G.7715.1 routing requirements on the GMPLS suite of protocols to support ASON
capabilities. Currently, the Joint Routing Solutions Design Team is in the process of finalizing an evaluation document that compares GMPLS routing protocols against the requirements. The areas to be addressed are much the same as were considered in the OIF prototyping effort, that is, the separation of routing and transport entities, the ability to indicate link availability for ASON, the advertisement of client reachability, and the ability to support multilevel hierarchy in the routing architecture. Liaisons have also been exchanged between ITU-T and the ATM Forum to discuss potential extensions to ATM PNNI Routing Protocol to support ASON requirements.
16.14. SIGNALING COMMUNICATIONS NETWORK — MECHANISMS (G.7712)

Recommendation G.7712 specifies the lower three layers for data communication and interworking between protocols within the lower three layers. Specifically, G.7712 supports IP on the SDH/SONET DCC as well as allowing OSI and IP network elements to coexist. As illustrated in Figure 16-47, the data link layer specifications in G.7712 encompass
• IP-only nodes using IP/PPP [63, 64]
• OSI-only nodes using OSI/LAPD [65, 66]
• Dual "intermediary" nodes using (OSI+IP)/PPP or (OSI+IP)/LAPD
We note that the higher-layer protocols are shown as examples, since not all protocols are applicable in any particular application.
[Figure 16-47 depicts the protocol stacks: OSI-only NEs run CLNP with IS-IS over LAPD on the DCC; IP-only NEs run IP with OSPF or Integrated IS-IS over PPP/HDLC on the DCC; dual NEs run CLNP and IP with Integrated IS-IS over LAPD or PPP/HDLC; management applications (e.g., TL1, SNMP, ASN.1 BER) ride over TP4/CLNP or TCP/UDP/IP, with LAN access via LLC 802.2 over 802.3 Ethernet.]
Figure 16-47. IP, OSI, and Dual protocol stacks for G.7712
Recommendation G.7712 provides a standardized approach for supporting "message forwarding" interworking between OSI NEs and IP-based NEs, while still being able to interwork with OSI-only NEs, through Dual OSI/IP NEs (illustrated in Figure 16-48). To forward IP traffic over the embedded OSI-only network, the IP traffic would be "encapsulated" into packets at the "Dual IP/OSI" NE and forwarded through OSI-only NEs to another "Dual OSI/IP" NE, where it would be "unencapsulated" and forwarded to the appropriate IP-only NE.
[Figure 16-48 shows dual-stack NEs at the edges of OSI-only and IP-only regions, tunnelling IP over the OSI network in one direction and OSI over the IP network in the other.]
Figure 16-48. IP and OSI tunnelling using "dual stack" NEs for encapsulation.
Similarly, management traffic received from an OSI-only NE could be encapsulated and forwarded through IP-only NEs to another Dual OSI/IP NE, where it could be unencapsulated and forwarded to an OSI-only NE. For OSI nodes, G.7712 requires usage of IS-IS, and for IP and dual (OSI+IP) nodes, it specifies usage of Integrated IS-IS (a superset of IS-IS that can handle IP and OSI). These requirements do not preclude usage of OSPF, or other protocols, for IP nodes. The rationale for selecting IS-IS as the default protocol for interoperability was that it supports routing for multiple network layer protocols and does not require any other network layer protocol to operate. Two options for the routing of these encapsulated packets exist: static tunneling and "autotunneling". Static tunneling involves manually provisioning static routes for IP packets through the OSI domain. "Autotunneling" hides tunneling from the end user and automatically routes in mixed networks. It is thus more desirable, since it requires less operator intervention and maintenance [67]. This scheme is described in detail in G.7712 and involves an enhancement to Integrated IS-IS. Further details on these specifications are beyond the scope of this chapter.
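The dual-stack forwarding behavior described above reduces to a small decision at each dual NE: forward natively when the next hop speaks the payload's network layer protocol, otherwise encapsulate and tunnel across the foreign region to the next dual NE. A sketch under assumed names (the real decision in G.7712 is made by static routes or the Integrated IS-IS autotunneling enhancement, not by this simplified function):

```python
def forwarding_action(payload_proto: str, next_hop_stack: str) -> str:
    """Decide how a dual OSI/IP NE handles a packet (illustrative).

    payload_proto:  'ip' or 'osi' — the packet's network layer protocol
    next_hop_stack: 'ip', 'osi', or 'dual' — what the next NE supports
    """
    if next_hop_stack in (payload_proto, "dual"):
        return "forward-native"
    # Mismatch: encapsulate and tunnel through the foreign region to
    # the next dual NE, which decapsulates and forwards natively.
    return f"encapsulate-{payload_proto}-over-{next_hop_stack}"
```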
16.15. FUTURES

SDH transport networks employing control plane technology are already carrying live traffic. Initial deployment has focused upon soft permanent connections. At the time of this writing, these networks are single-vendor, single-network-operator solutions. One of the justifications for introducing the control plane has been the reduction in required network capacity brought about by moving from dedicated protection to restoration. The introduction of the E-NNI is opening up the possibility of network operators integrating vendors within a single network. The interworking of multiple network operators to provide national or international service has yet to occur. Among the challenges that might direct future work in this area are
• Development of an agreed-upon addressing plan
• End-to-end resiliency schemes that cross multiple domains

A next step beyond simple private circuits created by the control plane is the manipulation of a set of private lines that collectively form a layer 1 virtual private network. These networks may be created as a static network that is managed as a single service for the customer, or may be dynamic, whereby the customer has the ability to rearrange connectivity within a dedicated pool of bandwidth provided by the operator. Initial requirements for such services can be found in Y.1312 [68].

What about switched networks, where the user has the ability to set up and tear down connections — just like the telephone network? It is fair to say that the jury is still out with regard to this application. The majority of issues are focused on identifying a valid business case for such a proposition. Firstly, how do you price a switched product versus a private line product, and to what extent will there be product substitution? Secondly, will this approach increase or decrease revenue? Thirdly, how do you design a cost-effective transport network for such a service? After all, the telephone network is based on a busy hour coupled with an expectation of the call holding time.
When the network is congested and users find that their calls are blocked, they are normally successful on their next attempt if their wait to redial is on the order of the call holding time. Understanding how these concepts translate to high-bandwidth connections in the transport network is nontrivial.

Control plane technology can also be applied to all-optical networking. Although this approach has produced several research papers, it is questionable whether it requires much in the way of additional standardization. Many items of discussion, such as treatment of analogue impairments, distribution of optical parameters in routing protocols, and routing of wavelengths, are highly proprietary in nature. Consequently, there is currently no standard means of interconnecting all-optical add-drop multiplexers and cross-connects. Thus, there is no value to be had by standardizing specialized optical extensions to signaling and routing protocols. In part, this is because parameters of interest in one system design are of little or no interest in another, where they have been
Architecting the Automatically Switched Transport Network
engineered out. As an example, all-optical networks that can provide optical wavelength interchange have different blocking characteristics than those that do not. This situation is in many ways no different from the problem of timeslot assignment and timeslot interchange when considering path setup in digital systems. It is debatable what optical parameters should be distributed (some are effectively static, while others are dynamic), how often they should be updated, and whether this constitutes a sensible use of a routing protocol. More detailed parameters and constraints increase route calculation time and increase the load on the routing protocols. These combined factors suggest that centralized route calculation and precalculated backup paths may be the most appropriate means of controlling all-optical networks. Finally, we note that even if inter-vendor issues are solved, international links and intercarrier interconnects will still require digital interworking. As control plane technology spreads in the network, the focus will move away from development of protocols in the control plane itself toward improving performance and integrating control plane technology with network operators' operational support systems. The latter requires the development of tools that interact with and complement existing systems. A simple example is the use of offline capacity management tools that understand current network utilization, identify whether sufficient network capacity is available to handle network failures, and model the impact of planned engineering works (e.g. loss of separation) and future capacity demands. Such tools may have to interact with other systems containing fiber and duct records, inventory systems, and planning systems.
16.16. ACKNOWLEDGEMENTS The authors would like to acknowledge review comments and input provided by Jim Jones, Jim Lavranchuk, Zhi-Wei Lin, Ben Mack-Crane, Rajender Razdan, and Walter Rothkegel.
16.17. REFERENCES
[1] Alan McGuire, Shehzad Mirza, and Darren Freeland, "Application of Control Plane Technology to Dynamic Configuration Management", IEEE Communications Magazine, Vol. 39, No. 9, pp. 94-99, September 2001.
[2] Curtis Brownmiller, Yong Xue, Carmine Daloia, Zhi-Wei Lin, Sivakumar Sankaranarayanan, and Eve Varma, "Switched Connection Services — The ITU-T ASTN Standards Framework", Optical Networks Magazine, Vol. 4, Issue 1, January/February 2003.
[3] ITU-T Rec. G.807/Y.1301, Requirements for Automatically Switched Transport Networks, July 2001.
[4] ITU-T Rec. G.8080/Y.1304, Architecture of the Automatically Switched Optical Network, November 2001; Amendment 1, March 2003; Amendment 2, February 2005.
[5] ITU-T Rec. G.7713/Y.1704, Distributed Call and Connection Management (DCM), November 2001; Amendment 1, June 2004.
[6] ITU-T Rec. G.7713.1/Y.1704.1, DCM Signaling Protocol based upon PNNI, March 2003.
[7] ITU-T Rec. G.7713.2/Y.1704.2, DCM Signaling Protocol based upon GMPLS RSVP-TE, March 2003.
[8] ITU-T Rec. G.7713.3/Y.1704.3, DCM Signaling Protocol based upon GMPLS CR-LDP, March 2003.
[9] ITU-T Rec. G.7714/Y.1705, Generalized Automatic Discovery Techniques, August 2005.
[10] ITU-T Rec. G.7714.1/Y.1705.1, Protocol for Automatic Discovery in SDH and OTN Networks, April 2003.
[11] ITU-T Rec. G.7715/Y.1706, Architecture and Requirements for Routing in the ASON, June 2002.
[12] ITU-T Rec. G.7715.1/Y.1706.1, ASON Routing Requirements for Link State Protocols, April 2003.
[13] ITU-T Rec. G.7712/Y.1703, Architecture and Specification of Data Communication Network, March 2003.
[14] ITU-T Rec. G.7718, Framework for ASON Management, February 2005.
[15] ITU-T Rec. G.805, Generic functional architecture of transport networks, March 2003.
[16] Eve L. Varma, Thierry Stephant, et al., Achieving Global Information Networking, Artech House, 1999.
[17] ITU-T Rec. G.7041/Y.1303, Generic framing procedure (GFP), December 2003.
[18] Eve L. Varma, Sivakumar Sankaranarayanan, George Newsome, Zhi-Wei Lin, and Harvey Epstein, "Architecting the Services Optical Network", IEEE Communications Magazine, Vol. 39, No. 9, pp. 80-87, September 2001.
[19] Monica Lazer and Yong Xue, Carrier Optical Services Framework and Associated Requirements for UNI, oif2000.155.2.
[20] ITU-T SG 15 Q14/15, Rapporteur Meeting Liaison to IETF CCAMP chairs, February 2004.
[21] J. H. Saltzer, D. P. Reed, and D. D. Clark, "End-To-End Arguments in System Design", ACM TOCS, Vol. 2, No. 4, pp. 277-288, November 1984.
[22] IETF RFC 3724, The Rise of the Middle and the Future of End-to-End: Reflections on the Evolution of the Internet Architecture, March 2004.
[23] IETF RFC 1958, Architectural Principles of the Internet, June 1996.
[24] IETF RFC 3439, Some Internet Architectural Guidelines and Philosophy, December 2002.
[25] R. Braden, D. Clark, S. Shenker, and J. Wroclawski, "Developing a Next-Generation Internet Architecture", July 15, 2000, http://www.isi.edu/newarch/DOCUMENTS/WhitePaper.pdf.
[26] D. D. Clark and M. S. Blumenthal, "Rethinking the design of the Internet: The end-to-end arguments vs. the brave new world", version for TPRC submission, August 2000, http://cyberlaw.stanford.edu/e2e/papers/TPRC=Clark-Blumentahl.pdf.
[27] D. D. Clark, K. R. Sollins, J. Wroclawski, and R. Braden, "Tussle in Cyberspace: Defining Tomorrow's Internet", Proceedings of the 2002 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, http://portal.acm.org/citation.cfm?id=633059&dl=ACM&coll=portal.
[28] D. Clark, K. Sollins, J. Wroclawski, D. Katabi, J. Kulik, X. Yang, R. Braden, T. Faber, A. Falk, V. Pingali, M. Handley, and N. Chiappa, Final Technical Report, "New Arch: Future Generation Internet Architecture", sponsored by the Defense Advanced Research Projects Agency (DoD), 6/30/00-12/31/03, http://www.isi.edu/newarch/iDOCS/fmaLfinalreport.pdf.
[29] Guangzhi Li, Jennifer Yates, Dongmei Wang, and Charles Kalmanek, "Control Plane Design for Reliable Optical Networks", IEEE Communications Magazine, Vol. 40, Issue 2, pp. 90-96, February 2002.
[30] G. Booch, J. Rumbaugh, and I. Jacobson, The Unified Modeling Language User Guide, Addison-Wesley, 1999.
[31] ITU-T Rec. G.851-01, Management of the Transport Network, Application of the RM-ODP Framework, 1996.
[32] ITU-T Rec. G.852-01, Management of the Transport Network, Enterprise Viewpoint for Simple Subnetwork Connection Management, 1996.
[33] ITU-T Rec. G.853-01, Common Elements of the Information Viewpoint for the Management of a Transport Network, 1996.
[34] ITU-T Rec. G.854-01, Management of the Transport Network, Computational Interfaces for Basic Transport Network Model, 1996.
[35] TMF 814, Multi-Technology Network Management Solution Set, NML-EML Interface Version 3.0, April 2004.
[36] OIF Implementation Agreement OIF-UNI-01.0-R2, User Network Interface (UNI) 1.0 Signaling Specification, Release 2: Common Part, February 2004.
[37] OIF Implementation Agreement OIF-E-NNI-01.0, External Network-Network Interface (E-NNI) 1.0 Signaling Specification, February 2004.
[38] ITU-T Rec. M.3100, Generic Network Information Model, July 1995; Amendments 1-8, March 1999 - August 2004.
[39] ATM Forum af-cs-0173.000, Domain-based rerouting for active point-to-point calls v1.0, August 2001.
[40] S. Shew and E. Ye, "Dual-Homing Applications in Transport Networks", 4th International Workshop on the Design of Reliable Communication Networks (DRCN 2003), Banff, AB, October 2003.
[41] TMF 513, Multi-Technology Network Management Business Agreement, NML-EML Interface Version 3.0, April 2004.
[42] TMF 608, Multi-Technology Network Management Information Agreement, NML-EML Interface Version 3.0, April 2004.
[43] TMF 814A, Multi-Technology Network Management Implementation Statements and Guidelines, NML-EML Interface Agreement Version 3.0, April 2004.
[44] ITU-T Rec. G.803, Architecture of transport networks based on the synchronous digital hierarchy (SDH), March 2003.
[45] ITU-T Rec. G.872, Architecture of optical transport networks, November 2001.
[46] ITU-T Rec. I.326, Functional architecture of transport networks based on ATM, March 2003.
[47] ITU-T Rec. G.8010, Architecture of Ethernet Layer Networks, February 2004.
[48] ITU-T Rec. G.8110, MPLS layer network architecture, January 2005.
[49] draft-ietf-mpls-generalized-signaling-00.txt, "Generalized MPLS - Signaling Functional Description", November 2000, http://community.roxen.com/developers/idocs/drafts/draft-ietf-mpls-generalized-signaling-00.html.
[50] Adrian Farrel, The Internet and Its Protocols: A Comparative Approach, Morgan Kaufmann, 2004.
[51] ITU-T Rec. T.50, International Reference Alphabet (IRA) (formerly International Alphabet No. 5 or IA5) - Information technology - 7-bit coded character set for information interchange, September 1992.
[52] ATM Forum Implementation Agreement PNNI 1.1, Private Network-Network Interface Specification Version 1.1, April 2000.
[53] ITU-T Rec. Q.2111, Service specific connection oriented protocol in a multi-link and connectionless environment (SSCOPMCE), December 1999.
[54] IETF RFC 3473, Generalized MPLS Signaling - RSVP-TE Extensions, January 2003.
[55] ITU-T Rec. Q.2982, Broadband integrated services digital network (B-ISDN) - Digital Subscriber Signaling System No. 2 (DSS2) - Q.2931-based separated call control protocol, December 1999.
[56] ITU-T Rec. Q.2931, Digital Subscriber Signaling System No. 2 - User-Network Interface (UNI) layer 3 specification for basic call/connection control, 1995; Amendment 1, 1997; Amendments 2-4, 1999.
[57] IETF RFC 3474, Documentation of IANA assignments for Generalized Multiprotocol Label Switching (GMPLS) Resource Reservation Protocol - Traffic Engineering (RSVP-TE) Usage and Extensions for Automatically Switched Optical Network (ASON), March 2003.
[58] IETF RFC 3476, Documentation of IANA Assignments for Label Distribution Protocol (LDP), Resource ReSerVation Protocol (RSVP), and Resource ReSerVation Protocol-Traffic Engineering (RSVP-TE) Extensions for Optical UNI Signaling, March 2003.
[59] IETF RFC 3472, Generalized Multi-Protocol Label Switching (GMPLS) Signaling - Constraint-based Routed Label Distribution Protocol (CR-LDP) Extensions, January 2003.
[60] IETF RFC 3212, Constraint-Based LSP Setup using LDP, January 2002.
[61] OIF World Interoperability Demo, 2004, http://www.oiforum.com/public/downloads/Demo-WPaper-FINAL.pdf.
[62] Design Team Report, Requirements for Generalized MPLS (GMPLS) Routing for Automatically Switched Optical Network (ASON), http://www.watersprings.org/pub/id/draft-ietf-ccamp-gmpls-ason-routing-reqts-05.txt.
[63] IETF RFC 1661, The Point-to-Point Protocol (PPP), July 1994.
[64] IETF RFC 1662, PPP in HDLC-like Framing, July 1994.
[65] ITU-T Rec. Q.811, Lower layer protocol profiles for the Q and X interfaces, 2004.
[66] ITU-T Rec. Q.921, ISDN user-network interface - Data link layer specification, 1997; Amendment 1, 2000.
[67] Chris Murton, Murton Consultancy, IP based DCN for Optical Products using ITU-T G.7712.
[68] ITU-T Rec. Y.1312, Layer 1 Virtual Private Network generic requirements and architecture elements, September 2003.
PART 4 Intra-Network Elements and Component-Centric Standards
Chapter 17 INTRA-NETWORK ELEMENTS COMMUNICATION Khurram Kazi SMSC
17.1. INTRODUCTION Over the past few years the focus of the networking industry has shifted towards providing various services that seamlessly connect networks across different geographical locations around the globe. These services go beyond the features and capabilities of the traditional packet (e.g. Frame Relay services) or Time Division Multiplexed (e.g. T1/E1, T3/E3 or STS-n/STM-n connections) technologies. Most of the leading networking service providers worldwide are racing to build integrated multiservice platforms that allow them to offer bundled services to their customers that can be provisioned almost instantly. From a service provider's point of view, these multiservice platforms allow improved flexibility in adding, migrating, or removing customers. For example, as we have seen in the previous chapters, services that offer point-to-point, point-to-multipoint or multipoint-to-multipoint Ethernet connectivity over the metro or the wide area public networks have gained tremendous momentum and continue to grow. Likewise, multiple Storage Area Networks (SANs) residing in different locations are being connected via the metro or the wide area transport networks in various configurations. These new services that the transport networks are gearing up to provide are fuelling the evolution of increasingly intelligent network elements. This chapter covers chip-to-chip communications standards that serve as the key architectural building blocks of the network elements. Their usage is illustrated with generic hardware architectural examples. The discussion is
restricted to the network elements (with line rates of 2.5 Gb/s and above) that are used within carriers' multiservice networks or at the enterprise edge to provide metro or wide area network connectivity.
17.2. REQUIREMENTS PLACED ON THE NETWORK ELEMENTS BY THE NETWORK Today's multiservice heterogeneous landline networks can be fuzzily categorized into different network demarcation segments, as depicted in Figure 17-1. The segments are:
1. Premises or enterprise
2. Metro edge
3. Metro core/backbone
4. Long haul core/backbone
Figure 17-1. Fuzzy Networking Demarcation Points
Heuristically, we have seen that as the diverse data-centric traffic, along with traditional TDM traffic, traverses from the enterprise networks towards the core network, the traffic keeps getting aggregated or bundled into bigger pipes (illustrated in the lower portion of Figure 17-1). Depending on the application, the traffic aggregation takes place at the transport layer as the
traffic is mapped onto one of the standardized transport protocols like SONET/SDH, G.709, 1/10 gigabit Ethernet, or Fibre Channel (1, 2, 4, or 10 Gb/s data rates). This traffic aggregation ensures efficient utilization of the network bandwidth while enabling efficient traffic control and management. Therefore, by the time the traffic reaches the network core, it has been well shaped and smoothed. This relieves the core network from managing fine granular traffic pipes, so that it can concentrate on the management of higher bandwidth pipes and λs traversing across the country or continents. The network elements used in building ever-evolving diverse networks have to process data with effective throughput typically dictated by the transport layer protocol (SONET/SDH, Ethernet, etc.) line rates. Presently, the throughput requirements placed on the individual line cards within the network elements depend on where they are used. For example, within the access network the line rates are typically less than or equal to 2.5 Gb/s; at the metro edge they can be up to 10 Gb/s; at the metro core the rates can be 10-40 Gb/s; and in the long haul core the rates are heading towards 40+ Gb/s. The above-mentioned line rates place minimum effective throughput requirements on chip-to-chip and card-to-card (via the backplane) communication within the network element. During these interesting and dynamic times for the networking industry, major operators worldwide are moving beyond simple next-generation SONET/SDH, offering more ambitious services. These services are offered based on network elements that provide full layer-2 aggregation and switching (carrier-grade Ethernet switching), support MPLS pseudowires, and an end-to-end control plane to create truly packet-aware transport gear [1]. These functions within the network element mandate the processing of data at different hierarchical layers of the OSI model.
Such processing is typically performed by several VLSI devices that are optimized for handling specific networking functions. Some of the requirements placed on these devices are to handle hierarchical protocol traversal, processing, and conversion, and flow control between devices, in addition to switching/routing of data. The data processing requirements subsequently translate into payload processing at the line rate, along with processing of management and control plane data, for chip-to-chip communications within the network element. Numerous new services carried over the transport networks require that network elements handle diverse data-centric protocols that are carried over legacy as well as newly developed physical interfaces. These requirements fundamentally require the network elements (especially customer edge or provider edge network elements) to terminate the mix of physical layer interfaces and protocols and map them onto a unified transport protocol like SONET/SDH, G.709, Ethernet, etc. These types of network elements are typically known as Multiservice Provisioning Platforms (MSPPs). For
example, an MSPP may have a mix of 1/10 GbE and Fibre Channel interfaces on the customer side and a SONET/SDH interface on the service provider side.
17.3.
NETWORK ELEMENT DESIGN AND INTERFACE ARCHITECTURE
The majority of the high-speed data traffic traversing fiber (over public networks or private/leased fiber) is transported on SONET/SDH, G.709, 1/10 gigabit Ethernet, or Fibre Channel (1, 2, 4, or 10 Gb/s data rates) based transport protocols. At present, native Ethernet and Fibre Channel transport is primarily restricted to the same metropolitan area, whereas SONET/SDH transport spans from the customer premises to the metropolitan and wide area networks. G.709 is slowly gaining momentum where strong Forward Error Correction (FEC) is used to extend the link span (e.g., intercity or undersea links). Modern network elements (primarily switches, routers, digital cross-connects, etc.), although they may have different physical layer interfaces, share very similar architectural features. The design of a network element logically separates two major processing paths: a fast data/packet/frame processing path and a slow packet/frame processing path. Figure 17-2 illustrates some of the processing functions of the fast and the slow paths. The fast path traditionally refers to the data path, where the incoming packet/frame is operated upon/processed at the line
Operation, Administration, Management and Provisioning (OAM&P): less time-critical applications, performed at much slower rates than the line rate (policy administration/applications, network management, signaling/provisioning, configuration/statistics gathering, topology management)
Data encryption/decryption; traffic management (segmentation/reassembly, queuing, policing, etc.)
Data plane processing: time-sensitive data, processed at line rate (protocol translations; classification: filtering, forwarding, lookup, etc.; data header parsing, e.g. for address/protocol information; MAC/WAN framer, e.g. Ethernet or SONET/SDH frame processing)
Physical layer
Figure 17-2. Conceptual data processing partitioning within a Network Element
rate. The slow path typically performs processing of Operations, Administration, Management and Provisioning (OAM&P) data. For example, such data can consist of information pertaining to routing tables that need updating, modifications to the existing configuration, the gathering of statistical information, etc.
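The fast/slow path partitioning described above can be sketched as a simple dispatcher. This is an illustrative model only: the frame-type tags and the slow-path queue are assumptions for the sketch, not part of any standard.

```python
from collections import deque

# Hypothetical frame-type tags; real hardware classifies on header fields.
FAST_TYPES = {"ethernet", "sonet_payload"}                # forwarded at line rate
SLOW_TYPES = {"oam", "routing_update", "provisioning"}    # OAM&P traffic

slow_path_queue = deque()  # drained later by the host/control processor

def dispatch(frame_type, payload, forward):
    """Model of the fast/slow path split inside a network element."""
    if frame_type in FAST_TYPES:
        forward(payload)                                   # fast path
    elif frame_type in SLOW_TYPES:
        slow_path_queue.append((frame_type, payload))      # deferred OAM&P work
    # unknown traffic is silently dropped in this sketch

out = []
dispatch("ethernet", b"\x01\x02", out.append)
dispatch("oam", b"\xff", out.append)
print(len(out), len(slow_path_queue))  # 1 1
```

The key design point the sketch captures is that line-rate traffic never waits on the control processor; only less time-critical OAM&P work is queued.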
17.3.1 Packet Based Network Elements
Figure 17-3. Line card using packet/cell switch fabric
Figure 17-3 illustrates the architectural blocks of a packet data line card that is connected to a switch fabric module via a proprietary or standardized backplane [2,3] (Chapter 19 discusses standardization efforts on backplanes). The network element routes/switches variable- or fixed-length packet/cell based data. (The terms switching and routing are used interchangeably, since the functionality they represent within the network element context is the same.) Within this architecture, the switching/routing can take place at the Ethernet, MPLS, ATM, IP, Fibre Channel, or some other packet-based protocol layer. There are seven major fast-path processing functional blocks that make up the generic architecture of a network element. With present state-of-the-art VLSI technology, these architectural blocks also represent discrete VLSI components: the Serializer/Deserializer (SERDES), FEC device, multiservice framer (SONET or G.709), Network Processor (NP), Traffic Manager (TM), switch fabric interface device, and switch fabric/crossbar module. Optional components like security processors for line-rate data encryption/decryption can also be added within the data path. Elaborate
features and architectural concepts with vendor-specific implementations of NP and TM functions can be found in [4], and switch fabrics in [5]. The highly complex and integrated networking VLSI components used within the network elements are supplied by several different vendors. Therefore, for these devices to interoperate, standardized chip-to-chip implementation agreements were developed by the Optical Internetworking Forum (OIF); they are discussed extensively in Sections 17.4-17.7.
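As a rough illustration of how these discrete components line up in the fast path, the ingress flow can be modeled as a function pipeline. The stage behaviors and dictionary fields below are purely hypothetical placeholders for what each device does.

```python
# Hypothetical composition of the fast-path devices named in the text
# (SERDES, multiservice framer, network processor, traffic manager,
# switch fabric interface), modeled as a simple function pipeline.

def pipeline(*stages):
    """Chain processing stages; each stage takes and returns a frame record."""
    def run(frame):
        for stage in stages:
            frame = stage(frame)
        return frame
    return run

ingress = pipeline(
    lambda f: {**f, "deserialized": True},     # SERDES: serial -> parallel
    lambda f: {**f, "delineated": True},       # framer: frame delineation
    lambda f: {**f, "classified": "mpls"},     # NP: lookup/classification
    lambda f: {**f, "queued": True},           # TM: queuing/policing
    lambda f: {**f, "fabric_tagged": True},    # fabric interface: cell tagging
)
print(ingress({"raw": b"..."}))
```

Each stage here stands in for a separate VLSI device; the standardized interfaces discussed in Sections 17.4-17.7 sit between these stages in real hardware.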
17.3.2 TDM Based Network Elements In the late 1990s and early 2000s, efforts were made to emulate TDM switching within packet-based switch fabrics. However, the proposed solutions did not gain traction due to several limitations of packet-based switching. One of the major disadvantages of the packet-based solution was the need to buffer TDM data over possibly multiple SONET/SDH frames to form fixed-length packets before they could be switched. This introduced unnecessary latency, jitter, and overhead processing. The need for a standardized TDM-based backplane was identified within the OIF community and subsequently specified. The implementation agreement allows TDM switch fabrics (digital cross-connects, add-drop muxes, grooming switches, time slot interchange ICs, etc.) to interface with a multiservice framer in a standardized format. Figure 17-4 illustrates the architectural design of a typical TDM based network element.
Figure 17-4. Line card using TDM switch fabric
17.3.3 Hybrid (TDM + Cell/Packet based) Network Element Architecture With a centralized control and management platform based on ASON, which allows rapid deployment of network resources, it is easy to architect a hybrid network element that utilizes the strengths of both TDM and packet-based networking. For example, TDM service allows users to reserve guaranteed fixed bandwidth for end-to-end connections. These preprovisioned connections have fixed latency and minimum jitter. In the high-speed applications (greater than 1 Gb/s line rate) marketplace, SONET/SDH is the transport technology of choice. As we have seen in Chapters 4 and 5, SONET/SDH has been optimized to support both TDM and packet traffic. From the previous section we have seen that emulating TDM switching within a packet-based environment adds unnecessary processing resources, latency, and to some extent unpredictability (under adverse conditions). There are applications for which network paths are established such that fixed bandwidths are allocated with bounded latency and jitter; for such applications, TDM-centric services are optimally suitable. At other times, packet-based designs that utilize network resources efficiently and provide ease of manageability are needed. To have the best of both worlds, a hybrid solution provides an optimized platform that allows true TDM and packet/cell based switching in an integrated environment. Figure 17-5 illustrates the architecture of a hybrid network element.
Figure 17-5. Integrated TDM/Packet based line card with different switch fabrics
In the architecture depicted in Figure 17-5, the multiservice framer separates the TDM traffic from the packet traffic and presents them on two separate buses: the TDM traffic is carried on the TFI bus, and the packet/cell traffic is carried on the SPI-x bus (the TFI and SPI-x buses are discussed in Sections 17.4-17.7).
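A minimal sketch of this traffic separation follows; the flow descriptors are assumed stand-ins for real STS timeslots and Ethernet frames, not an actual framer API.

```python
# Sketch: a hybrid multiservice framer presents TDM traffic on the TFI bus
# and packet/cell traffic on the SPI bus.

def demux_to_buses(flows):
    """Split incoming flows onto the two backplane buses."""
    tfi, spi = [], []
    for kind, payload in flows:
        if kind == "tdm":      # e.g. STS-n timeslots: fixed bandwidth, low jitter
            tfi.append(payload)
        else:                  # e.g. Ethernet/MPLS frames: statistically muxed
            spi.append(payload)
    return tfi, spi

flows = [("tdm", "STS-3c #1"), ("packet", "GbE frame"), ("tdm", "STS-12c")]
tfi_bus, spi_bus = demux_to_buses(flows)
print(tfi_bus, spi_bus)
```

The point of the hybrid design is exactly this split: TDM flows reach a TDM fabric untouched, while packet flows go through the NP/TM path and a packet fabric.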
17.4. 2.5 GBITS/S SYSTEMS In the 1990s numerous chip vendors started offering products that catered to the burgeoning transport of IP-centric data over SONET/SDH transport networks. Promoting multi-vendor chip-to-chip and module-to-module interoperability became the charter of the Physical and Link Layer (PLL) group within the Optical Internetworking Forum (OIF). In June 2000 the System Packet Interface Level 3 (SPI-3) implementation agreement [6] was released. It defines the interface between a SONET/SDH framer and the rest of the system. Originally SPI-3 was intended for Packet over SONET (POS) applications; however, over time it has been used for different applications. SPI-3 provides a versatile bus interface for exchanging packets between various VLSI devices within network elements supporting line rates of OC-48 (approximately 2.5 Gb/s) or lower. Specifically, SPI-3 acts as the demarcation point between the physical layer and the link layer device. SPI-3 provides isolation between the synchronous physical layer and the asynchronous packet-based, higher-layer processing units (e.g. between a SONET/SDH framer and a network processor). The SPI-3 implementation agreement defines:
• The SPI-3 bus
• The signaling protocol used to communicate data between devices
• The data structure used to store data in First-In, First-Out (FIFO) buffers
SPI-3 compliant devices have independent transmit and receive data paths, which can be either 8 or 32 bits wide. The maximum standardized clock rate at which these data transfers occur is 104 MHz, allowing a maximum data throughput of 3.328 Gb/s across the bus. The bus transmit and receive clocks are independent of the line clocks and operate at different rates. To support the rate mismatch between the line clock and the internal system operating clock, decoupling FIFOs are used. To ensure the integrity of data transmission, a parity check is performed on both the transmit and receive data buses.
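The quoted bus capacity can be checked with simple arithmetic (bus width times maximum clock), using the figures given above:

```python
# Back-of-the-envelope check of the SPI-3 bus capacity
# (32-bit-wide bus, 104 MHz maximum standardized clock).
BUS_WIDTH_BITS = 32
MAX_CLOCK_HZ = 104e6

bus_throughput_gbps = BUS_WIDTH_BITS * MAX_CLOCK_HZ / 1e9
oc48_line_rate_gbps = 2.48832  # STS-48/STM-16 line rate

print(f"bus capacity: {bus_throughput_gbps:.3f} Gb/s")  # 3.328 Gb/s
print(bus_throughput_gbps > oc48_line_rate_gbps)        # headroom above OC-48
```

The margin above the OC-48 line rate is what absorbs the in-band port addresses and per-transfer control overhead that share the bus with payload data.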
In order to support multiple PHY devices, an in-band PHY port address is inserted with the packet data that is transferred on the data bus. In SPI-3, up to 256 ports are supported. Discrete control/status signals are used to indicate start of packet, end of packet, start of transfer,
error indications, etc. Figure 17-6 illustrates typical usage of the SPI-3 interface, where the concepts developed in [6,14] are merged.
Figure 17-6. Typical usage of SPI-3 interface
17.4.1 SPI-3 Signal Descriptions
A brief overview of the signals used in the SPI-3 interface is given below. For the complete definitions of the signals please refer to [6]. Transmit Direction (from link layer device to the PHY) Clock and Data signals TFCLK Transmit FIFO write clock is used to synchronize data transfer between the link layer device and the PHY. TDAT[31:0] Transmit Packet Data Bus; a 32 bit wide bus used to transport the packet/cell octets to be written to the selected transmit FIFO, and the in-band port address used in selecting the desired transmit FIFO. Discrete control/Status signals TENB Transmit Write Enable (TENB) signal controls the flow of data to the transmit FIFOs. The PHY device processes signals like TDAT, TMOD, TSOP, TEOP
670
TPRTY
TERR
TSOP
TEOP TMOD[1:0]
TSX
TADRfJ
DTPAfJ
STPA
PTPA
Chapter 17 and TERR when TENB is low. The TSX signal is processed when TENB is high. Transmit bus parity (TPRTY) signal, when asserted, indicates that the calculated transmit parity is being transported over the TDAT. Transmit Error Indicator (TERR) signal flags that there is an error in the current packet/cell. The error could be caused by conditions like FIFO overflow. Frame Check Sequence error, or any other user defined error condition. Transmit Start of Packet (TSOP) indicator is used to delineate the packet boundaries on the TDAT bus. TSOP being high indicates the start of the presence of a packet/cell on the TDAT bus. Transmit End of Packet signal flags the termination of the packet/cell being transmitted over the TDAT bus. Transmit Word Modulo (TMOD[1:0]) is primarily used during the transmission of the last word of the packet/cell. Since the number of octets within a packet/cell do no have to be a multiple of 32 bits, TMOD[l :0] indicates the number of valid octets in the last word being carried by the TDAT[31:0] bus. Transmit Start of Transfer (TSX) signal indicates the presence of the in-band port address on the TDAT bus. When TSX along with TENB is high, the value of TDAT[7:0] represents the address of the selected transmit FIFO. Transmit PHY Address (TADR[]) bus is used in conjunction with the PTPA signal to poll availability of the respective transmit FIFO. Direct Transmit Packet Available (DTPA[]) bus indicates the status of the FIFO (whether it is available to accept data or not), corresponding to the respective ports in the PHY device. Selected-PHY Transmit Packet Available (STPA) signal indicates whether the addressed transmit FIFO (that is addressed by the content on the TDAT bus) is full or not. This signal is primarily used when in ByteLevel mode Polled-PHY Transmit Packet Available (PTPA) signal is used when in Packet-level mode. It indicates whether the polled transmit FIFO is full or not. The
Intra-Network Elements Communication
671
selected polled PHY is addressed by the contents of the TADR address bus. Receive Direction (from the PHY to the link layer device) Clock and Data signals RFCLK Receive FIFO Write Clock (RFCLK). RDAT[31:0] Receive Packet Data Bus (RDAT[31:0]) a 32-bit wide bus used to transport the packet/cell octets to be written to the selected receive FIFO, and the in-band port address used in selecting the desired receive FIFO. Discrete control/Status signals RVAL Receive Data Valid (RVAL) signal, when high, indicates the validity of the receive data signals; RDAT[31:0], RMOD[1:0], RSOP, REOP, and RERR. RENB Receive Read Enable (RENB) signal is used for controlling the flow of data from the receive FIFOs. During data transfer, RVAL must be monitored as it indicates the validity of the RDAT[31:0], RPRTY, RMOD[1:0], RSOP, REOP, RERR, and RSX signals. RPRTY Receive Parity (RPRTY) signal, when asserted, indicates that the calculated receive parity is being transported over the RDAT bus. RMOD[1:0] Receive Word Modulo (RMOD) signal is primarily used during the transmission of the last word of the packet/cell. Since the number of octets within a packet/cell do no have to be a multiple of 32 bits, RMOD[1:0] indicates the number of valid octets in the last word being carried by RDAT[31:0]. RSOP Receive Start of Packet (RSOP) flag is used to delineate the packet boundaries on the RDAT bus. RSOP being high indicates the start of the presence of a packet/cell on the RDAT bus. REOP Receive End of Packet (REOP) signal flags the termination of the packet/cell being transmitted over the RDAT bus. RERR Receive error indicator (RERR) signal flags that there is an error in the current received packet/cell. The error could be caused by conditions like FIFO overflow. Frame Check Sequence error, abort sequence or any other user defined error condition.
RSX: Receive Start of Transfer (RSX) signal indicates the presence of the in-band port address on the RDAT bus. When RSX is high, the value of RDAT[7:0] represents the address of the selected receive FIFO from which the subsequent data on the RDAT bus will be transferred.
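The packet-delineation and word-modulo signalling described above can be sketched behaviourally in Python. This is an illustration only: the helper name is ours, and the convention that a modulo value of 0 marks a fully valid last word is an assumption to check against the SPI-3 implementation agreement [6]:

```python
def spi3_words(packet: bytes):
    """Split a packet into 32-bit TDAT words, tagging each transfer with
    TSOP/TEOP flags and the TMOD value for the final word.

    TMOD encodes the number of valid octets in the last word; values 1-3
    mean that many octets are valid, and 0 is assumed here to mean all
    four octets are valid (a modulo-4 octet count).
    """
    transfers = []
    for offset in range(0, len(packet), 4):
        word = packet[offset:offset + 4]
        last = offset + 4 >= len(packet)
        transfers.append({
            "TDAT": word.ljust(4, b"\x00"),   # pad unused lanes of the last word
            "TSOP": offset == 0,              # first transfer of the packet
            "TEOP": last,                     # last transfer of the packet
            "TMOD": (len(word) % 4) if last else 0,
        })
    return transfers

xfers = spi3_words(b"ABCDEFG")   # a 7-octet packet needs two 32-bit words
```

The same shape applies on the receive side with RSOP/REOP/RMOD.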
17.5. 10 GBITS/S SYSTEMS
As a natural progression from systems operating at the 2.5 Gb/s line rate, OIF generated 10 Gb/s systems implementation agreements in two phases. In the first phase, the System Framer Interface-4 Phase 1 (SFI-4 Phase 1) [7] and System Packet Interface-4 Phase 1 (SPI-4 Phase 1) [8] were released in September 2000. In phase 2, reduced-signal-count 10 Gb/s recommendations for both SFI-4 and SPI-4 were introduced: SFI-4 phase 2 [9] in September 2002 and SPI-4 phase 2 [10] in October 2003. Subsequent sections give an overview of the implementation agreements. For exact details, the reader is strongly encouraged to refer to the implementation agreement documents released by OIF.
17.5.1 System Framer Interface-4 Phase 1 (SFI-4 Phase 1)

SFI-4 phase 1 is a relatively simple interface that primarily defines the clocking scheme and the data signals between the STS-192/STM-64 SERDES and the SONET/SDH framer. SFI-4 can also be extended to OTN applications. It consists of two independent sixteen-bit data buses, one in the receive direction and the other in the transmit direction. In the receive direction, the SERDES recovers the clock from the received line data and provides the receive clock to the framer. In the transmit direction, the SERDES uses the reference clock and provides a source clock to the framer. The framer subsequently uses the transmit clock source to generate the transmit clock and the associated data signals to the SERDES. The SERDES, in the receive direction, takes the serial line data and converts it into raw sixteen-bit wide parallel data. It is the framer that extracts the framing from the incoming data stream and establishes the byte boundaries for further processing. In the transmit direction the SERDES takes in sixteen-bit wide data and converts it into a serial stream for transmission over the physical link. Figure 17-7 depicts the typical usage of the SFI-4 phase 1 interface. In STS-192/STM-64 applications an aggregate throughput of 9.95328 Gb/s is transported in each direction. This throughput is achieved by sixteen 622.08 Mb/s differential data lines in both
transmit and receive directions. The SFI-4 phase 1 interface is defined to support speeds of up to 10.66 Gb/s.
Figure 17-7. Typical usage of SFI-4 Phase 1 interface
A brief overview of the signals used in the SFI-4 phase 1 interface is given below. For the complete definitions of the signals please refer to [7].

Transmit direction (framer to SERDES)
TXData[15:0]: A 16-bit transmit data bus that is used to transport the data from the framer to the SERDES. Each bit lane transports data at a rate of 622.08 Mb/s.
TXCLK_PN: 622.08/311 MHz transmit clock used by the TXData bus.
TXCLK_SRC_PN: 622.08 MHz transmit reference clock that is provided by the SERDES to the framer.

Receive direction (SERDES to framer)
RXData[15:0]: A 16-bit receive data bus that is used to transport the data from the SERDES to the framer. Each bit lane transports data at a rate of 622.08 Mb/s.
RXCLK_PN: 622.08/311 MHz receive clock used by the RXData bus.
Miscellaneous Signals
REFCLK_PN: 622.08 MHz board reference clock used by the SERDES.
SYNC_ERR: This signal indicates that RXCLK and RXData are not derived from the received optical signal.
PHASE_INIT: This signal is used to reset the SERDES clocking interface.
PHASE_ERR: This signal, when asserted, flags that the phase of TXCLK with respect to the SERDES internal clock is out of specification.
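The SFI-4 phase 1 arithmetic and the division of labour between SERDES and framer can be illustrated with a short Python sketch (the helper names are invented for illustration; the alignment behaviour follows the description above):

```python
LANE_RATE_MBPS = 622.08   # per-lane rate for STS-192/STM-64
BUS_WIDTH = 16            # independent 16-bit buses in each direction

def aggregate_gbps(lanes=BUS_WIDTH, lane_rate_mbps=LANE_RATE_MBPS):
    """Aggregate throughput per direction: 16 lanes x 622.08 Mb/s."""
    return lanes * lane_rate_mbps / 1000.0

def serdes_to_parallel(bitstream, width=BUS_WIDTH):
    """Chop the recovered serial bit stream into raw 16-bit words.
    The SERDES is alignment-agnostic: it is the framer that later
    locates the SONET framing pattern and the byte boundaries."""
    return [bitstream[i:i + width]
            for i in range(0, len(bitstream) - width + 1, width)]

words = serdes_to_parallel("01" * 24)   # 48 recovered bits -> three raw words
```
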
17.5.2
SPI-4 Phase 1 (OC-192 System Packet Interface)
The SPI-4 phase 1 interface supports the transfer of packets or cells at STS-192/STM-64 rates with a maximum throughput capacity of 12.8 Gb/s. It is used to transfer information between a physical layer and a link layer device (framer to network processor), or between peer devices (network processor and traffic manager, traffic manager and switch fabric, network processor and security processor, etc.). Figure 17-8 illustrates the typical usage of the SPI-4 phase 1 interface [8]. During the development of SPI-4 phase 1, the specification developers, to mitigate risk and shorten time-to-market, took a conservative approach and recommended a wide and relatively slower-rate interface. Some of the key features of the interface are [8,14]:
1. Supports transfer of variable-length packets and fixed-length cells
2. Independent 64-bit wide receive and transmit buses operating with a clock rate of 200 MHz, thereby allowing a throughput of 12.8 Gb/s
3. Parity checking to ensure data integrity
4. Discrete address bus to support out-of-band addressing and multiple PHY devices
5. Discrete control signals that indicate start of packet, end of packet, error indications, etc.
6. Synchronous continual transmission of FIFO (receive and transmit) information for flow control purposes.

A brief overview of the signals used in the SPI-4 phase 1 interface is given below. For the complete definitions of the signals please refer to [8].

Transmit Direction (from the system to the PHY)

Clock and Data signals
TxData[63:0]: A 64-bit transmit data bus that is used to transport the data from the system side to the PHY. The data on this bus is valid when TxValid is asserted.
TxClk: Transmit clock has a nominal frequency of 200 MHz and is used by the PHY device to sample the Tx signals.

Discrete control/Status signals
TxValid: Transmit data valid, when asserted, qualifies the TxData[63:0], TxAddr[n-1:0], TxSOCP, TxEOP, and TxSize signals at the respective times.
TxSOCP: Transmit start of cell or packet flags the beginning of a packet or cell available on the TxData bus.
TxEOP: Transmit end of packet indicates the end of a packet or cell on the TxData bus.
Figure 17-8. Typical usage of SPI-4 phase 1 interface
TxAddr[n-1:0]: Transmit PHY port or channel address; n bits of address support up to 2^n ports or channels. The TxAddr signals are only sampled when TxValid is asserted, and are ignored when TxValid is deasserted.
These signals determine the port or channel associated with the TxData, TxSOCP, TxEOP, TxError, TxSize, and TxValid signals.
TxPrty[3:0]: Transmit data parity bus represents the parity bits calculated over the TxData bus. TxPrty[0] provides parity over the TxData[15:0] portion of the TxData bus. Likewise TxPrty[1], TxPrty[2], and TxPrty[3] provide parity over TxData[31:16], TxData[47:32], and TxData[63:48] respectively.
TxError: Transmit data error flags that there is an error in the current transmit packet/cell. The error could be caused by conditions such as FIFO overflow, an abort sequence, or any other user-defined error condition. It is processed only when the TxValid and TxEOP signals are asserted.
TxSize[2:0]: Transmit octet count signal is primarily used during the transmission of the last word of the packet/cell. It indicates the number of valid octets during the transmission of the last word. Values from 1-7 represent the respective number of octets present, while a value of 0 indicates 8 valid octets present in the last word.
TxStart: Transmit flow control frame start is sourced by the PHY device to flow control the link layer device.
TxFull[3:0]: Transmit flow control full indication is sourced by the PHY layer to inform the link layer about its buffers being full. The complete status of the channels is time multiplexed onto these four signals.

Receive Direction (from the PHY to the system)

Clock and Data signals
RxData[63:0]: A 64-bit receive data bus that is used to transport the data from the PHY to the system side. The data on this bus is valid when RxValid is asserted.
RxClk: Receive clock has a nominal frequency of 200 MHz and is used by the link layer device to sample the Rx signals.

Discrete control/Status signals
RxValid: Receive data valid, when asserted, qualifies the RxData[63:0], RxAddr[n-1:0], RxSOCP, RxEOP, and RxSize signals at the respective times.
RxSOCP: Receive start of cell or packet flags the beginning of a packet or cell available on the RxData bus.
RxEOP: Receive end of packet indicates the end of the packet or cell on the RxData bus.
RxAddr[n-1:0]: Receive PHY port or channel address; n bits of address support up to 2^n ports or channels. The Rx address signals are only sampled when RxValid is asserted and are ignored when RxValid is deasserted. These signals determine the port or channel associated with the RxData, RxSOCP, RxEOP, RxError, RxSize, and RxValid signals.
RxPrty[3:0]: Receive data parity bus represents the parity bits calculated over the RxData bus. RxPrty[0] provides parity over the RxData[15:0] portion of the RxData bus. Likewise RxPrty[1], RxPrty[2], and RxPrty[3] provide parity over RxData[31:16], RxData[47:32], and RxData[63:48] respectively.
RxError: Receive data error flags that there is an error in the current received packet/cell. The error could be caused by conditions such as FIFO overflow, an abort sequence, or any other user-defined error condition. It is processed only when the RxValid and RxEOP signals are asserted.
RxSize[2:0]: Receive octet count signal is primarily used during the transport of the last word of the packet/cell on the RxData bus. It indicates the number of valid received octets during the last word. Values from 1-7 represent the respective number of octets present, while a 0 indicates 8 valid octets present in the last word.
RxStart: Receive flow control frame start is sourced by the link layer device to flow control the PHY.
RxFull[3:0]: Receive flow control full indication is sourced by the link layer to inform the PHY about its buffers being full. The complete status of the channels is time multiplexed onto these four signals.
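The parity and octet-count conventions used on both directions of the interface can be sketched in a few lines of Python. The parity sense (odd) is an illustrative assumption here; the implementation agreement [8] is normative:

```python
def txprty(word64: int, odd: bool = True):
    """Compute the four TxPrty bits over a 64-bit TxData word.
    TxPrty[k] covers the 16-bit slice TxData[16k+15:16k].
    Odd parity is assumed for illustration only."""
    bits = []
    for k in range(4):
        slice16 = (word64 >> (16 * k)) & 0xFFFF
        ones = bin(slice16).count("1")
        bits.append((ones + 1) % 2 if odd else ones % 2)
    return bits  # [TxPrty[0], TxPrty[1], TxPrty[2], TxPrty[3]]

def txsize(valid_octets: int):
    """Encode the last-word octet count: 1-7 literal, 0 means all 8 valid."""
    assert 1 <= valid_octets <= 8
    return valid_octets % 8
```

The RxPrty/RxSize signals follow the same scheme in the receive direction.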
17.5.3 System Framer Interface-4 Phase 2 (SFI-4 Phase 2)
With the desire to fit more and more components on the same Printed Circuit Board (PCB), the developers of the SFI-4 phase 2 project capitalized on the ability to specify high-speed signal paths. They developed narrower, independent 4-bit transmit and receive data paths. The introduction of the high-speed signals posed the challenge of ensuring very low bit-error rates. It is well known that with the high-speed transmission of signals, the
probability of transmission errors increases. To reduce the transmission error probability, SFI-4 phase 2 [9] incorporated an FEC mechanism between the SERDES and the framer. Moreover, in SFI-4 phase 2, the byte and lane alignment processing, along with clock encoding within the data stream, is performed using a 64b/66b encoding scheme. Figure 17-9 illustrates the reference model used in defining the SFI-4 phase 2 recommendation. The reference points A, B, C and D are used in defining the parameters associated with the interface. As we have seen in the earlier chapters, there is no one particular transport technology that is ideally suitable for a diverse set of traffic types. Therefore, from the outset, it was realized that this interface should be protocol agnostic and support popular transport technologies such as 10 Gigabit Ethernet, STS-192/STM-64, G.709, 10 Gigabit Fibre Channel, and proprietary data streams. Some of the key features of SFI-4 phase 2 are [9,14]:
1. Independent 4-bit wide data buses in the transmit and receive directions
2. Embedded clock within the data stream
3. Each bus lane operating at a minimum of 2.488 Gb/s with an aggregate throughput of 9.95328 Gb/s. Under certain circumstances the specification allows the interface to operate at 12.5 Gb/s.
Figure 17-9. Reference model of SFI-4 phase 2
A brief overview of the signals used in the SFI-4 phase 2 interface is given below. For the complete definitions of the signals please refer to [9].
Transmit direction (framer to SERDES)
TXData[3:0]: A 4-bit transmit data bus that is used to transport the data from the framer to the SERDES. Each bit lane transports 64b/66b encoded data at a rate of 2.566 Gb/s to 3.125 Gb/s.
TXCKSRC: Transmit clock has a nominal frequency of 622.08 MHz with a 50% duty cycle. It is used by the TXData bus.

Receive direction (SERDES to framer)
RXData[3:0]: A 4-bit receive data bus that is used to transport the data from the SERDES to the framer. Each bit lane transports 64b/66b encoded data at a rate of 2.566 Gb/s to 3.125 Gb/s.

Reference Clock
REFCK: The Reference Clock is used as a reference for transmit data path timing. It has a nominal frequency of 622.08 MHz.
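The per-lane rates quoted above follow directly from the 64b/66b overhead: 9.95328 Gb/s of payload times 66/64, striped over four lanes, gives the 2.566 Gb/s minimum. A one-line Python check (the function name is ours):

```python
def sfi4p2_lane_rate_gbps(payload_gbps: float, lanes: int = 4):
    """Per-lane line rate: every 64 payload bits gain a 2-bit 64b/66b
    sync header, and the encoded stream is striped across 4 lanes."""
    return payload_gbps * 66 / 64 / lanes

sts192_lane = sfi4p2_lane_rate_gbps(9.95328)   # STS-192/STM-64 payload
```
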
17.6.
SPI-4 PHASE 2 (OC-192 SYSTEM PACKET INTERFACE)
SPI-4 Phase 2 [10] is a nimble interface with a significantly lower signal count than phase 1. Phase 2 took advantage of advances in high-speed electronics by defining a faster interface that is narrower than SPI-4 Phase 1, reducing the number of traces required on a PCB and the number of pins on the VLSI devices. It provides isolation between the transmit and receive directions by making them completely separate and independent of each other. Similar to its predecessors, the SPI-4 phase 2 interface is also protocol agnostic. It can be used to transport Packet over SONET, any packet-centric Generic Framing Procedure (GFP) mapped data (which could include the encapsulation of Ethernet or Fibre Channel, ATM, or constant bit-rate traffic over GFP), or any proprietary data scheme. Figure 17-10 [14] illustrates a typical application of the SPI-4 phase 2 interface. Since the physical layer device, e.g., a framer, and the link layer device operate at different clock frequencies, FIFOs are used in both transmit and receive directions to accommodate the clock mismatches. Sending the FIFO status over out-of-band control channels (via the respective FIFO status signals) provides isolation between the transmit and receive paths. The variable-length control and payload data is transferred between the devices in bursts as illustrated in Figure 17-11. The packets can be of variable sizes
with upper and lower limits. The transferred packets have to be multiples of sixteen bytes except when terminated with an asserted End of Packet signal.
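The sixteen-byte rule can be sketched as a simple segmentation routine. The 64-octet per-burst cap used below is an arbitrary illustrative choice, not a value from the agreement:

```python
BURST_GRANULARITY = 16   # SPI-4 phase 2 transfers are multiples of 16 octets

def segment_packet(packet: bytes, max_burst: int = 64):
    """Split a packet into bursts that are multiples of 16 octets,
    except for the final burst, which is closed by an EOP control word."""
    assert max_burst % BURST_GRANULARITY == 0
    bursts, offset = [], 0
    while offset < len(packet):
        chunk = packet[offset:offset + max_burst]
        offset += len(chunk)
        eop = offset >= len(packet)
        # every burst except possibly the EOP burst is 16-octet aligned
        assert eop or len(chunk) % BURST_GRANULARITY == 0
        bursts.append((chunk, eop))
    return bursts

bursts = segment_packet(bytes(150))   # a 150-octet packet in 64-octet bursts
```
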
Figure 17-10. SPI-4 phase 2 system reference diagram
[Figure content: an interleaved sequence on the bus — Payload Control, Payload Data (ATM cell), Payload Control, Payload Data (packet), repeated.]
Figure 17-11. Transferred data stream consisting of interleaved control and payload data.
A brief overview of the signals used in SPI-4 phase 2 is given below. For complete definitions of the signals please refer to [10].

Transmit Direction (from the system to the PHY)

Clock and Data signals
TDCLK: Transmit Clock has a nominal frequency of 311 MHz. It is used as the timing source by the transmit data and control signals.
TDAT[15:0]: Transmit data bus used for transporting the payload data and in-band control words from the link layer device to the PHY device. The minimum data rate on each line is 622 Mb/s.
TSCLK: Transmit status clock is used in sampling the TSTAT signals.
Discrete control/Status signals
TCTL: Transmit Control, when asserted, indicates the presence of control words on the TDAT bus.
TSTAT[1:0]: Transmit FIFO Status is used to carry FIFO status information in a round-robin scheme. It also carries associated detected errors or framing information.

Receive Direction (from the PHY to the system)

Clock and Data signals
RDCLK: Receive Clock has a nominal frequency of 311 MHz. It is used as the timing source by the receive data and control signals.
RDAT[15:0]: Receive data bus used for transporting the payload data and in-band control words from the PHY device to the link layer device. The minimum data rate on each line is 622 Mb/s.
RSCLK: Receive status clock is used in sampling the RSTAT signals.

Discrete control/Status signals
RCTL: Receive Control, when asserted, indicates the presence of control words on the RDAT bus.
RSTAT[1:0]: Receive FIFO Status is used to carry FIFO status information in a round-robin scheme. It also carries associated detected errors or framing information.
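The round-robin ("calendar") status reporting can be sketched as follows. The 2-bit status codes and the framing word are assumptions drawn from common descriptions of SPI-4 phase 2; the implementation agreement [10] is normative:

```python
# Illustrative 2-bit status codes; see OIF-SPI4-02.1 [10] for the
# normative encodings and the framing pattern.
STARVING, HUNGRY, SATISFIED, FRAMING = 0b00, 0b01, 0b10, 0b11

def status_stream(calendar, fifo_status, repetitions=2):
    """Serialize per-port FIFO status onto the 2-bit TSTAT/RSTAT channel.
    The calendar lists port numbers in the order their status is sent;
    a framing word delimits each pass through the calendar."""
    words = []
    for _ in range(repetitions):
        words.append(FRAMING)
        words.extend(fifo_status[port] for port in calendar)
    return words

seq = status_stream(calendar=[0, 1, 0, 2],
                    fifo_status={0: SATISFIED, 1: HUNGRY, 2: STARVING},
                    repetitions=1)
```

Note that a calendar may list a heavily used port more than once per pass, as port 0 is here.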
17.7.
40 GBITS/S SYSTEMS
As we can see from Figures 17-3 and 17-5, data-centric line cards consist of optical transceivers, serializer/deserializer, forward error correction processor, framer, network processor, traffic manager, and switch fabric interface devices. OIF, true to its form, once again took the lead in defining implementation agreements for several chip-to-chip communication interfaces that allow networking gear manufacturers to use VLSI devices with standardized interfaces. This allowed them to use VLSI devices from different IC vendors operating at 40 Gb/s throughputs. At 40 Gb/s rates, three types of interfaces have been defined by OIF's Physical and Link Layer (PLL) working group:
• SFI-5: SERDES Framer Interface-5
• SPI-5: System Packet Interface-5
• TFI-5: TDM Fabric to Framer Interface-5
The electrical characteristics of SFI-5 and SPI-5 are defined in [13].
17.7.1 SERDES Framer Interface-5 (SFI-5)
SFI-5 [11] defines the communication between the SERDES, the FEC processor, and a framer (typically a SONET/SDH or G.709 framer) device. Figure 17-12 depicts a system model illustrating the interconnecting electrical signals between the devices. In the receive direction, the serial data operating at approximately 40 Gb/s from the optics is converted into parallel data by the SERDES. The parallel data signals are relatively lower-speed signals (each channel of the data bus operating at up to 3.125 Gb/s). The SERDES is typically connected to either an FEC processor or a framer. At these higher operating speeds, one of the interesting challenges is to accommodate the data skew between the channels of the respective data bus. This skew is primarily caused by trace length mismatches, which can be quite significant. SFI-5 incorporates a separate deskew channel in both the transmit and receive directions that continuously provides data samples to the deskewing algorithm. The deskewing takes place at the sink of the respective signals.
Figure 17-12. SFI-5 System reference model
17.7.1.1 Signal descriptions

A brief overview of the signals used in the SFI-5 interface is given below. For complete definitions of the signals please refer to [11].

Receive Direction: from Optics to the System
RXDATA[15:0]: Serial data from the optics is converted by the SERDES into a 16-bit wide parallel bus (also referred to as 16 lanes). The serial data is demultiplexed onto the parallel bus in a round-robin manner. The first data bit received by the SERDES is placed on the RXDATA[15] lane and the subsequent bits are placed on the lower-numbered lanes, implying that the 16th bit received is placed on the RXDATA[0] lane. When the RXDATA is generated by the SERDES, the 16-bit data chunks are arbitrarily placed on the data bus. However, when the RXDATA is generated by the FEC processor, framing is performed by the processor so that it can perform error detection and correction. The octet boundaries are extracted from the data stream: the RXDATA[15] lane carries the 1st bit of the 1st octet and RXDATA[0] transports the last bit of the 2nd octet. Each RXDATA signal operates as a 2.488 Gb/s to 3.125 Gb/s stream and carries every 16th bit of the data stream. RXDATA is frequency locked to RXDCK.
RXDSC: The Receive Deskew Channel (RXDSC), a signal with a data rate of 2.488 Gb/s to 3.125 Gb/s, is the reference signal that is used in measuring the skew of the receive data bus.
RXDCK: The receive data path signals use the Receive Data Clock (RXDCK) as the timing reference. The clock has a 50% duty cycle with a frequency 1/4 that of the RXDATA[x] or RXDSC bit rate. For example, if RXDATA[x] is operating at 2.488 Gb/s then the clock frequency of RXDCK is 622 MHz. The sink device needs a 4x clock multiplier to generate an internal clock that is used to sample the RXDATA. The implementation agreement does not specify any phase offset between RXDCK and the RXDATA bits or the RXDSC.
RXREFCK: In the receive direction the Receive Reference Clock (RXREFCK) is used as the reference timing signal. It typically has a 50% duty cycle with a frequency 1/4 that of RXDATA[x] or RXDSC. The SERDES and the FEC processor are required to have the RXREFCK, whereas in the framer device it is optional.
684 RXS
Chapter 17 If there is a problem in deriving the RXDCK or RXDATA signals from the optical receive signal, the Receive Status (RXS) is used within the network element to carry this alarm condition. Depending on the usage of respective devices, the RXS carries the status signal from SERDES to the FEC or the SERDES to the Framer or from the FEC to the Framer.
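The lane-ordering rule above (first bit on lane 15, every 16th bit per lane) can be modelled directly; this round-trip sketch is purely illustrative:

```python
LANES = 16

def rx_demux(serial_bits):
    """Distribute received serial bits across the 16 RXDATA lanes in
    round-robin order: the first bit of each 16-bit group goes to
    RXDATA[15], the 16th to RXDATA[0], so each lane carries every
    16th bit of the stream."""
    assert len(serial_bits) % LANES == 0
    lanes = {n: [] for n in range(LANES)}
    for i, bit in enumerate(serial_bits):
        lanes[LANES - 1 - (i % LANES)].append(bit)
    return lanes

def tx_mux(lanes):
    """Inverse operation on the transmit side: reinterleave the lanes
    into a serial stream, TXDATA[15] first."""
    depth = len(lanes[0])
    return [lanes[LANES - 1 - (i % LANES)][i // LANES]
            for i in range(LANES * depth)]
```
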
Transmit Direction: From System to Optics
TXDATA[15:0]: From the electronics towards the optics, the data is transmitted via the Transmit Data (TXDATA) bus. In the SERDES the parallel data is converted into serial data and placed onto the optical stream in a round-robin manner. For example, in the octet-aligned TXDATA[15:0], TXDATA[15] is the first bit that is transmitted out onto the optical link and TXDATA[0] is the last bit out of the second transmitted octet. Each TXDATA signal operates as a 2.488 Gb/s to 3.125 Gb/s stream and carries every 16th bit of the data stream. TXDATA is frequency locked to TXDCK.
TXDSC: The Transmit Deskew Channel (TXDSC), a signal with a data rate of 2.488 Gb/s to 3.125 Gb/s, is used as the reference signal in measuring the skew of the transmit data bus.
TXDCK: The transmit data path signals use the Transmit Data Clock (TXDCK) as the timing reference. The clock has a 50% duty cycle with a frequency 1/4 that of the TXDATA[x] or TXDSC bit rate. For example, if TXDATA[x] is operating at 2.488 Gb/s then the clock frequency of TXDCK is 622 MHz. The sink device needs a 4x clock multiplier to generate an internal clock that is used to sample the TXDATA. The implementation agreement does not specify any phase offset between TXDCK and the TXDATA bits or the TXDSC. It is mandatory for the source device in the transmit path to generate TXDCK.
TXCKSRC: The transmit data path signals use the Transmit Clock Source (TXCKSRC), a signal generated in the optics-to-system direction, as the timing
reference. The clock has a 50% duty cycle with a frequency 1/4 that of the TXDATA[x] or TXDSC bit rate. This clock is derived from the TXREFCK in the sink device. For example, in Figure 17-12, in the transmit direction the SERDES is the sink; therefore, the TXCKSRC is derived from its TXREFCK.
TXREFCK: In the transmit direction the Transmit Reference Clock (TXREFCK) is used as the reference timing signal for the transmit data path devices. It typically has a 50% duty cycle with a frequency 1/4 that of TXDATA[x] or TXDSC. One of the devices in the transmit chain is required to have an external clock source connected to TXREFCK.
17.7.2 SPI-5 (OC-768 System Packet Interface)
Functionally SPI-5 (as illustrated in Figure 17-13) is very close to SPI-4 phase 2, with the major difference being that each data channel operates at four times the SPI-4 phase 2 rate, with a minimum of 2.488 Gb/s and a maximum of 3.125 Gb/s. It uses in-band PHY port addressing and supports 256 addressable ports. For highly channelized applications, an extended addressing scheme can be used to address a larger number of ports.
Figure 17-13. Reference application of SPI-5 interface
A brief overview of the signals used in the SPI-5 is given below. For the complete definitions of the signals please refer to [12]. Transmit Direction (from the system to the PHY)
TDCLK: Transmit Data Clock is used by the transmit data path signals (TDAT[15:0] and TCTL) as the timing reference. The clock has a 50% duty cycle with a frequency 1/4 that of the TDAT[x] or TCTL bit rate. For example, if TDAT[x] is operating at 2.488 Gb/s then the clock frequency of TDCLK is 622 MHz. The sink device needs a 4x clock multiplier to generate an internal clock that is used to sample the TDAT.
TDAT[15:0]: The Transmit Data bus carries the payload data and in-band control words from the link layer device to the PHY. The minimum data rate of each bit lane is 2.488 Gb/s.
TCTL: Transmit Control, when asserted, acts as the flag to indicate the presence of control words on the TDAT bus.
TSTAT: Transmit Status Channel signal carries the transmit Pool status information and other associated information such as error detection and framing. It reports the status of all the calendar entries while running at a minimum rate of 2.488 Gb/s.

Receive Direction (from the PHY to the system)
RDCLK: Receive Data Clock is used by the receive data path signals (RDAT[15:0] and RCTL) as the timing reference. The clock has a 50% duty cycle with a frequency 1/4 that of the RDAT[x] or RCTL bit rate. For example, if RDAT[x] is operating at 2.488 Gb/s then the clock frequency of RDCLK is 622 MHz. The sink device needs a 4x clock multiplier to generate an internal clock that is used to sample the RDAT.
RDAT[15:0]: The Receive Data bus carries the payload data and in-band control words from the PHY to the link layer device. The minimum data rate of each bit lane is 2.488 Gb/s.
RCTL: Receive Control, when asserted, acts as the flag to indicate the presence of control words on the RDAT bus.
RSTAT: Receive Pool Status signal carries the receive Pool status information and other associated information such as error detection and framing. It reports the status
of all the calendar entries while running at a minimum rate of 2.488 Gb/s.
17.7.3 TFI-5 (TDM Fabric to Framer Interface)

TFI-5 provides an interoperable point-to-point link between framer and TDM switch fabric devices that may be offered by different vendors. The TFI-5 interface is primarily used within a SONET/SDH based network element with line rates of 2.5 through 40 Gb/s. However, its scope can be extended beyond SONET/SDH to accommodate 10 Gigabit Ethernet LAN PHY, 10 Gigabit Ethernet WAN PHY, or G.709 interfaces. The major functions that TFI-5 specifies are:
• Link integrity monitoring
• Connection management
• Mechanisms for mapping both SONET/SDH and non-SONET/SDH client signals.
TFI-5 has three layers: the link layer, the connection layer, and the mapping layer. The link layer maintains the link integrity between the devices. It generates and terminates the necessary information for the source and the sink devices on each of the TFI-5 links. The connection layer handles monitoring of individual STS-1 timeslots from the ingress framer to the egress framer. Finally, the mapping layer performs the association and mapping of client signals onto one or more STS-1 timeslots. Figure 17-14 gives a system architectural reference diagram illustrating possible applications of TFI-5.
A brief overview of the signals used in TFI-5 is given below. For the complete definitions of the signals please refer to [15].
TFIDATA: TFI-5 Data (TFIDATA) carries data between the framer and the TDM switch fabric devices. The same signal definition and characteristics apply to the data transfer whether from the framer to the switch fabric or from the switch fabric to the framer. Each TFI-5 link operates at 2.488 Gb/s and has a SONET/SDH style framing format. TFI-5 can optionally support 3.1104 Gb/s data rates.
TFIREFCK: TFI-5 Reference Clock (TFIREFCK) is used to provide the timing reference to all the TFIDATA signals within the system. It is typically a 155.52 MHz clock with a 50% duty cycle.
TFI8KREF: TFI-5 8 kHz Frame Reference (TFI8KREF) is used to provide a reference for frame boundaries to all the devices within the TFI-5 system. It is typically an 8 kHz clock signal that has a 50% duty cycle and is locked to the TFIREFCK.
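A toy sketch of the mapping layer's job of associating clients with STS-1 timeslots. The figure of 48 STS-1 timeslots per 2.488 Gb/s link (an STS-48-like frame) and all helper names are our illustrative assumptions, not text from the implementation agreement [15]:

```python
STS1_PER_LINK = 48   # assumed: a 2.488 Gb/s TFI-5 link carries 48 STS-1 slots

def allocate_timeslots(links, clients):
    """Hypothetical mapping-layer allocator: hand out STS-1 timeslots
    on TFI-5 links to clients that each need a given number of slots.
    Each slot is identified as a (link, timeslot) pair."""
    free = [(link, slot) for link in range(links)
            for slot in range(STS1_PER_LINK)]
    mapping = {}
    for name, needed in clients.items():
        if needed > len(free):
            raise ValueError(f"not enough STS-1 timeslots for {name}")
        mapping[name], free = free[:needed], free[needed:]
    return mapping

m = allocate_timeslots(links=2, clients={"client-A": 48, "client-B": 24})
```
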
[Figure content: a TDM switch connected over TFI-5 links (TFIDATA_I/TFIDATA_O) to SONET/SDH, OTN (G.709), 10 GbE LAN PHY, and 10 GbE WAN PHY framer + TFI-5 mapper devices, each paired with a transponder (optics + SERDES); a system clock/sync source distributes TFIREFCK and TFI8KREF to all devices.]
Figure 17-14. TFI-5 System reference model
17.8.
ACKNOWLEDGEMENTS
The author would like to acknowledge review comments and input provided by Alan Reynolds, Andrew Reynolds and Osman Ahmad.
17.9. REFERENCES
[1] Lightreading Webinar, "Packet Aware Transport", www.lightreading.com, February 10, 2005.
[2] http://www.asi-sig.org
[3] Gary Lee, "Advanced Switching in Communication Systems", http://www.asi-sig.org/education/whitepapers/AS_in_Communication_Systems_-_final.pdf.
[4] Panos C. Lekkas, "Network Processors: Architecture, Protocols and Platforms", McGraw-Hill, 2003.
[5] H. Jonathan Chao, Cheuk H. Lam and Eiji Oki, "Broadband Packet Switching Technologies: A Practical Guide to ATM Switches and IP Routers", Wiley, 2001.
[6] OIF Implementation Agreement OIF-SPI3-01.0, "System Packet Level 3 (SPI-3): OC-48 System Interface for Physical and Link Layer Devices", June 2000.
[7] OIF Implementation Agreement OIF-SFI4-01.0, "SFI-4 (OC-192 Serdes-Framer Interface) OIF-PLL-02.0 - Proposal for a common electrical interface between SONET framer and serializer/deserializer parts for OC-192 interfaces", September 2000.
[8] OIF Implementation Agreement OIF-SPI4-01.0, "System Physical Interface Level 4 (SPI-4) Phase 1: A System Interface for Interconnection Between Physical and Link Layer, or Peer-to-Peer Entities Operating at an OC-192 Rate (10 Gb/s)", September 2000.
[9] OIF Implementation Agreement OIF-SFI4-02.0, "SERDES Framer Interface Level 4 (SFI-4) Phase 2: Implementation Agreement for 10 Gb/s Interface for Physical Layer Devices", September 2002.
[10] OIF Implementation Agreement OIF-SPI4-02.1, "System Packet Interface Level 4 (SPI-4) Phase 2 Revision 1: OC-192 System Interface for Physical and Link Layer Devices", October 2003.
[11] OIF Implementation Agreement OIF-SFI5-01.0, "Serdes Framer Interface Level 5 (SFI-5): Implementation Agreement for 40Gb/s Interface for Physical Layer Devices", January 2002.
[12] OIF Implementation Agreement OIF-SPI5-01.1, "System Packet Interface Level 5 (SPI-5): OC-768 System Interface for Physical and Link Layer Devices", September 2002.
[13] Optical Internetworking Forum OIF2001.149, "SxI-5: Electrical Characteristics for 2.488 - 3.125 Gbps Parallel Interfaces", October 2002.
[14] Tom Palkert, "OIF OC-48, OC-192 & OC-768 Electrical Interfaces", http://www.oiforum.com/public/documents/Electrical_InterfacesWP.pdf
[15] Optical Internetworking Forum OIF-TFI-5-01.0, "TFI-5: TDM Fabric to Framer Interface Implementation Agreement", September 16, 2003.
Chapter 18
ITU OPTICAL INTERFACE STANDARDS
Evolution and its Impact on Implementations
Peter J.J. Stassar
Networking Consultant
18.1. INTRODUCTION
Over the past 20 years, optical transmission systems have evolved from fairly simple, single-span, point-to-point configurations operated at a single wavelength to rather complex multiwavelength, multispan, point-to-multipoint architectures. Within the context of this evolution, the International Telecommunication Union (ITU) has developed a wide range of optical interface recommendations, beginning with PDH applications and later addressing SDH/SONET, DWDM, and OTN applications. A historical perspective on the various ITU recommendations is provided in this chapter, covering not only the maturation of the industry but also the intent to use standards to move the application space from low-volume, high-cost conditions to cost-efficient, high-volume conditions. Towards that end, the migration from proprietary optical solutions built with custom and discrete components towards standardized, integrated solutions supported by Multi-Source Agreements (MSAs) is described. The intent of this chapter is to provide the reader with a basic understanding of ITU's objectives, terminology, and typical content found within the various optical interface recommendations. Moreover, the impact that the recommendations have on practical applications and designs is also
addressed. In this chapter, detailed specifications are not discussed, as a complete treatment of optical parameters, mechanisms, and designs is beyond the scope of this text. References to various standards documents are cited as appropriate. Finally, some information is provided on the use and implementation of optical alarms and degradation monitors. Despite the fact that the latter topic is not directly related to optical interface implementations, it is a thoroughly studied item within ITU, and as such its treatment in this chapter will prove very useful to gain a better understanding of the challenges and sensitivities surrounding the optical standards process.
18.2. ITU OPTICAL INTERFACE STANDARDS
18.2.1 Historical perspective
18.2.1.1 PDH
Between 1980 and 1988, the ITU developed two recommendations for specifying Plesiochronous Digital Hierarchy (PDH) optical line systems. Recommendation G.955 [1] contained specifications for PDH line systems for the 1544 kbit/s hierarchy (24-channel market, mainly deployed in the USA and Canada), and Recommendation G.956 addressed PDH line systems for the 2048 kbit/s hierarchy (32-channel market). In the 1990s, both recommendations were "collapsed" into a single recommendation, namely G.955, and G.956 was subsequently withdrawn. These recommendations are no longer relevant for today's market; however, they laid the foundation for how optical interfaces are specified within the ITU, a process that has been followed in subsequent ITU recommendations. In this approach to specifying optical interfaces, only the characteristics of the fiber optic plant, principally the attenuation and the dispersion, are specified. This approach to specifying optical interfaces, where the performance properties of the transmitter and receiver are not addressed, is called longitudinal compatibility. Longitudinal compatibility implies that on a certain link, with standardized characteristics, the equipment on both sides of the link must be from the same vendor. In this case, the transmitter and receiver performance properties and characteristics were proprietary. In the case of PDH applications, it was also not uncommon to use proprietary balanced coding techniques, e.g., 5B/6B, to provide stable link performance. In this way, 140 Mbit/s or 4*140 Mbit/s equipment,
employing 5B/6B coding, actually operated with line rates of 168 and 672 Mbit/s, respectively. In the 1980s, optical components were a significant contributor to the cost of an optical transmission system. As such, Recommendation G.955 focused only on long-haul systems, where the cost of the optical devices could be balanced by maximizing the transmission distances. The principle of longitudinal compatibility is discussed further in Section 18.2.2.
18.2.1.2 SDH/SONET
In 1988, the revolutionary concept of developing optical links with interoperable equipment from different manufacturers was introduced to ITU by Bellcore (currently called Telcordia). For a truly interoperable link, more parameters beyond just the fiber plant had to be specified. This new set of specifications was titled SDH/SONET for the global/North American markets, respectively. Initially, this interworking concept was called mid-span-meet, implying that at a certain point on the fiber link between two pieces of optical equipment, the interoperability had to be guaranteed through a specified set of parameters and associated values. Because this point was not at a fixed location, but rather at an unknown "floating" location, it was not considered appropriate for a specification principle. A floating approach lends itself to specifying formulas instead of values and is, as such, cumbersome to implement. Instead, the complete optical configuration was split into three parts: transmitter equipment, the actual outside plant, and receiver equipment. These three parts were separated by two fixed reference points, namely Point S, located between the transmitter and the outside plant, and Point R, located between the outside plant and the receiver. At these reference points, a complete and detailed set of transmitter and receiver parameters and associated values were specified.
This principle of specification was called transverse compatibility, where the intent was to achieve interoperability between different manufacturers' equipment so long as the specifications were met at Points S and R. Note that the previous term, mid-span-meet, is commonly used as an alternative expression to transverse compatibility, although that usage of the term is not strictly correct. Within transverse compatibility, it is important to note that the optical path or outside fiber plant is not specified by distance but rather by parameters like attenuation range, maximum (and sometimes minimum) chromatic dispersion, Differential Group Delay (DGD), etc. Distance is not a specification and is only used for the purpose of classification. A specific transmission distance can never be guaranteed, since it fully depends on local conditions associated with the fiber link, e.g., number of splices, loss/splice, presence of patching panel connectors, etc. Transverse
compatibility has become the standardization method of choice for all modern ITU recommendations for optical interfaces. The details of transverse compatibility are further discussed in Section 18.2.2, where a reference diagram is shown. The first relevant ITU recommendation for SDH optical interfaces was G.957 [2], in which optical parameter values for STM-1 (OC-3), STM-4 (OC-12), and STM-16 (OC-48) applications were specified for distances up to 80 km. Note that the transmission distance of 80 km is a point of classification, not a guarantee of transmission reach. Recommendation G.957 has become a template for all subsequent ITU optical interface recommendations. During the development of G.957, the extensive experience with PDH implementations for 140 Mbit/s and 4*140 Mbit/s was reused to define the STM-1 and STM-4 parameter values. Because of the maturity of the technology for these applications, the associated parameter values have hardly been modified since the first agreements in 1990. This is, however, not true for the STM-16 parameter values. The first set of values, agreed upon in 1990, was based on the limited availability of initial test results from prototype systems. Nevertheless, the availability of an early version of an STM-16 optical interface specification appeared to be a major market driver for next-generation optical applications. The wide-scale deployment of STM-16 optical interfaces resulted in a strong enough knowledge base that ITU was able to readdress the initial STM-16 specification with an update based on a "mature" set of parameter values several years later. Around 1990, a market need for a higher transmission capacity of 10 Gbit/s in SDH/SONET transmission systems was foreseen, and the ITU started to work on a new set of optical interface recommendations to accommodate that need.
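As a quick reference for the rates named above: the SDH/SONET hierarchy scales linearly from the 155.52 Mbit/s STM-1 base rate, with OC-3N the SONET name for the STM-N rate. A minimal sketch (the dictionary layout and names are mine):

```python
# SDH/SONET transport levels discussed in this chapter. The STM-N line
# rate is N x 155.52 Mbit/s; OC-3N is the SONET name for the same rate.
STM1_RATE_MBITS = 155.52  # STM-1 / OC-3 base rate

LEVELS = {
    "STM-1 / OC-3": 1,
    "STM-4 / OC-12": 4,
    "STM-16 / OC-48": 16,
    "STM-64 / OC-192": 64,
}

for name, n in LEVELS.items():
    print(f"{name}: {n * STM1_RATE_MBITS:.2f} Mbit/s")
```

The STM-16 entry, for instance, works out to 2488.32 Mbit/s, the nominal 2.5 Gbit/s rate referred to throughout this chapter.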
At the same time the new Erbium Doped Fiber Amplifier (EDFA) technology became available, enabling longer transmission distances and multiwavelength operation on a single fiber. Optical amplifiers were being used to increase transmitter output powers (booster configuration) and to improve receiver sensitivities (optical preamplifier configuration). They were also deployed as line amplifiers by positioning them at intermediate positions on very long fiber links (multispan configuration). In this way, the physical distance between transmitting and receiving equipment could be substantially increased to multiples of the original distances specified in G.957. Because optical amplifiers were also operating over a wide optical bandwidth, well over 30 nm, they were capable of simultaneously amplifying the power levels of multiple, narrowly spaced signals (also called channels) over a single transmission fiber. The latter application, also called Dense Wavelength
Division Multiplexing (DWDM), will be further discussed in Section 18.2.1.3. Because of the availability of this new optical amplifier technology, ITU decided to specify the new SDH/SONET rate of STM-64/OC-192 (~10 Gbit/s) along with multiwavelength (multichannel) configurations operating at 2.5 Gbit/s per channel, each with its own characteristic wavelength, as an alternative to the single-wavelength (single-channel) application at the STM-64/OC-192 rate. The multichannel application is further discussed in Section 18.2.1.3. In order to maintain the stability of Recommendation G.957, the ITU decided to put the new sets of parameter values for STM-4 and STM-16 applications with extended distances (longer than 80 km) via OA (Optical Amplifier) technology, and the new STM-64 (OC-192) applications, into a new recommendation called G.691. Initially, it was the intent to define transversely compatible parameter value sets for both single and multiple spans. Due to a variety of reasons, the ITU discontinued the attempt to include specifications for multiple spans. It appeared to be too significant a challenge to define an unambiguous set of optical parameter values for multispan configurations incorporating line optical amplifiers. Furthermore, the ITU was unable to agree on the specification for a standard Optical Supervisory Channel (OSC), necessary for maintenance of the inline optical amplifiers. The relevant optical parameter values for SONET applications OC-3 to OC-192 can be found in Telcordia's GR-253-CORE [3], which in most cases is consistent with the ITU specifications. Originally, the OC-192 specifications were put into a separate Telcordia specification, GR-1377, but these were incorporated into GR-253-CORE at a later stage.
18.2.1.3 DWDM
As discussed in Section 18.2.1.2, the early 1990s presented a market need for SDH/SONET transmission systems that could operate at per-channel data rates greater than 2.5 Gbit/s. Because of initial concerns about the cost of 10 Gbit/s optics (including the necessity to use optical amplifiers) and possible limitations due to Polarization Mode Dispersion (PMD) on the installed fiber base, it was decided to prepare a second recommendation, namely G.692 [4], for multichannel SDH STM-4 and STM-16 applications. In general, these applications with closely spaced channels are referred to as Dense Wavelength Division Multiplexing (DWDM) applications. Initially, it was the ITU's objective to define a transversely compatible specification for these multichannel DWDM applications, but ultimately it was unable to agree on a channel plan (unique set of wavelengths) for these
applications. Naturally, to guarantee interoperability, it is essential that different manufacturers' equipment transmit on exactly the same set of wavelengths and follow the same sequence for "lighting up" new wavelengths. However, some network operators felt that the channel plan would restrict their ability to optimize the usage of their installed outside plant. Additionally, the market pressure on equipment vendors to increase performance by adding channels to the fiber (requiring that the channel-to-channel spacing become narrower) and increasing transmission distances necessitated that optical technology advance at a rampant pace. This pace gave the impression that there was little stability in the solution and that any attempt to standardize this solution within G.692 would be promptly outdated. The market conditions mandated that equipment vendors constantly design their equipment at the edge of available technology. Furthermore, because of the usage of optical amplifiers, the fibers operated in a nonlinear regime, making the matter even more complex from a standardization perspective. Therefore, the published version of G.692 contains the statement that it was aimed towards a future realization of transversely compatible multichannel systems. It contains listings of parameters that would be required in the case that a transversely compatible specification was ever developed for a multichannel system. Specific values were not included in this specification. The resulting Recommendation G.692 is actually a longitudinally compatible specification. One of the main achievements of Recommendation G.692 was agreement on a frequency grid for DWDM applications, a grid that is followed even today. Because G.692 was an SDH-related recommendation, the ITU ultimately decided to put the grid definition and specification in a separate Recommendation G.694.1 [5] for generic DWDM applications. In G.694.1, the actual channel frequencies are not specified.
Instead, G.694.1 provides a ruler or formula, anchored to 193.1 THz, with which to calculate the channel frequencies for a variety of channel spacings, ranging from 12.5 GHz to 100 GHz. Within the context of grid definitions, it is important to note that the ITU decided to maintain a wavelength-based signal spectral specification for widely spaced signals and a frequency specification for narrowly spaced signals. At first glance, this appears to be a confusing way to specify wavelengths. The reason for this choice is the fact that a frequency specification is exact and unambiguous, whereas a wavelength specification depends on the medium in which the wavelength is measured (air, vacuum and glass all have different values for the speed of light, which is used to calculate wavelength from frequency). As an indication of the demarcation point between the two specification methods, it can be noted that in G.692, the frequency-based specification is used up to a channel spacing of 1000
GHz (equivalent to approximately 8 nm), above which a wavelength-based specification is used. Very recently the ITU established two new recommendations for DWDM applications intended for use within the Optical Transport Network, described in Section 18.2.1.4. In Recommendation G.696.1 [26], physical layer specifications are given for point-to-point multispan DWDM applications within a single administrative domain with bit rates up to 40 Gbit/s. Because of the technical complexity of these systems, the specifications given in G.696.1 are longitudinally compatible, i.e., with equipment from a single vendor. As in the case of Recommendation G.692, this means that specifications are provided only for the outside plant. G.696.1 further contains some extensive information on theoretical limits and design considerations for DWDM systems. In Recommendation G.698.1 [27], optical interface specifications are given for transversely compatible point-to-point DWDM systems in a metro environment. By using the black-link specification method, as described for CWDM applications in Section 18.2.1.5, interworking is enabled at the single-channel points, i.e., at the inputs of the multiplexer and at the outputs of the demultiplexer. The realization of this transversely compatible DWDM recommendation has been a ground-breaking achievement, enabling operators to mix and match equipment from different vendors at the single-channel level. G.698.1 contains specifications for applications with a channel spacing of 100 GHz and bit rates up to 10 Gbit/s, covering transmission distances in the range of 30 to 80 km. For further details, see Recommendation G.698.1. As with CWDM applications, G.698.1 is relevant to the new hot-pluggable SFP and XFP packaging technologies described in Sections 18.3.2.2 and 18.3.3.2.
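The G.694.1 grid "ruler" described in Section 18.2.1.3, together with the frequency-to-wavelength conversion that motivates the frequency-based specification, can be sketched numerically. The 193.1 THz anchor and the channel spacings are taken from the recommendation; the function names and the printed channel selection are my own:

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s

def grid_frequency_thz(n: int, spacing_ghz: float = 100.0) -> float:
    """Channel frequency per the G.694.1 ruler: 193.1 THz + n * spacing."""
    return 193.1 + n * spacing_ghz / 1000.0

def vacuum_wavelength_nm(freq_thz: float) -> float:
    """Wavelength in vacuum for a given frequency. The value differs in
    air or glass, which is why the ITU specifies frequency, not
    wavelength, for narrowly spaced DWDM channels."""
    return C_VACUUM / (freq_thz * 1e12) * 1e9

# Three adjacent 100 GHz-spaced channels around the 193.1 THz anchor:
for n in (-1, 0, 1):
    f = grid_frequency_thz(n)
    print(f"n = {n:+d}: {f:.1f} THz ({vacuum_wavelength_nm(f):.2f} nm)")
```

For n = 0 this yields the 193.1 THz anchor itself, about 1552.52 nm in vacuum; narrower spacings down to 12.5 GHz are obtained simply by changing the spacing argument.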
Future versions of G.698.1 are intended to address the inclusion of optical amplifiers in order to achieve transmission distances longer than 80 km, further widening the application space for network operators to deploy multi-vendor DWDM systems.
18.2.1.4 OTN
In 1996, the potential use of DWDM technologies was recognized as having the opportunity to extend beyond relatively straightforward point-to-point applications. ITU embarked on a new generation of recommendations intended to support the Optical Transport Network (OTN), which would include formats beyond SDH/SONET. As an example, the new OTN would address Forward Error Correction (FEC) codes (which were once proprietary to the equipment vendor) and enhanced optical network architectures, which
included new optical network elements like Optical Add Drop Multiplexers (OADMs) and Optical Cross Connects (OXCs). Details on OTN architecture and the associated rates and formats specifications are contained in a variety of ITU recommendations, including G.872 [6] and G.709 [7]. The ITU decided to make a distinction in optical interfaces between Intra-Domain Interfaces (IaDI) and Inter-Domain Interfaces (IrDI). As is clarified in Recommendation G.872, the IaDI refers to a physical interface that lies within an administrative domain, and the IrDI to a physical interface representing the boundary between two administrative domains. In general, an IrDI will be bounded by 3R regeneration on both sides of the interface. 3R regeneration requires that the signal be re-amplified, reshaped and retimed. Currently, transversely compatible optical interface specifications are only required for the IrDI. The IrDI configurations are relatively simple in that they are limited to a single span and are either single-channel (single-wavelength) or 16-channel configurations. The required technology for these IrDI applications was considered sufficiently mature to create a basis for agreement on complete sets of transversely compatible parameter values. Because the IaDI involved more complex optical architectures, like very long distance multi-span configurations with high numbers of closely spaced channels, possibly including OADM configurations, it was decided to specify the IaDI optical interfaces in a longitudinally compatible form. This kind of specification provides the highest level of freedom and flexibility for designing systems for the IaDI. To illustrate, it is quite common to use proprietary Forward Error Correction (FEC) techniques within the IaDI to further optimize the optical performance. Refer to Section 18.2.2 for a more detailed description of transverse versus longitudinal compatibility.
The ITU established two new optical interface recommendations, namely, G.959.1 [8] and G.693 [9], originally intended to address the new OTN applications having rates as specified in Recommendation G.709. Before being completed, these two recommendations were transformed into a more generic form, permitting usage across a variety of applications, including the originally intended OTN applications in G.709, along with SDH/SONET and even Gigabit Ethernet. The introduction of optical tributary signal classes, which addressed a range of data rates rather than a specific data rate, made this possible. As a consequence of this choice for generic specifications over a range of data rates, the previous optical interface Recommendation G.691 [10], which previously addressed only SDH/SONET applications, was updated with appropriate references to either G.693 or G.959.1. Recommendation G.959.1 was intended to be a general specification for single-span, unidirectional, point-to-point optical links, which addressed
single and multichannel line systems. This recommendation was generated in conformance with the approach in the earlier G.957 and G.691 recommendations, but was generalized to apply over ranges of data rates rather than specific data rates. G.693, however, introduced new optical links beyond what was previously addressed in Recommendation G.691. Recommendation G.693 targeted Very Short Reach (VSR) applications, with link distances up to 2 km and potentially higher-than-"normal" losses, at nominal 10 Gbit/s and 40 Gbit/s aggregate bit rates. G.693 specifically included the possibility of inserting optical cross-connects in the optical link (which at the time had very high insertion losses). As a result, a wide range of link-loss categories is included, ranging from 4 dB to 16 dB. Even higher link losses are foreseen as next-generation network elements are added to the optical link.
18.2.1.5 CWDM
Because of a market need for relatively low-cost point-to-point multichannel systems, the ITU decided to work on a new set of recommendations supporting Coarse WDM (CWDM) applications with a channel spacing of 20 nm. Note that a wavelength specification is used instead of a frequency specification. The requirement for coarsely spaced channels permitted the use of low-cost uncooled lasers and low-cost WDM filter technologies. In Recommendation G.694.2 [11], the 20 nm grid has been specified with 1551 nm as one of the grid wavelengths. Recommendation G.695 [12] provides optical interface parameter values for CWDM applications with up to 16 channels and up to 2.5 Gbit/s. Recommendation G.695 contains two general specification methods: the black-box method and the black-link method. In the case of the black-box method, sets of parameter values for the aggregate multichannel reference points (after the optical multiplexer and prior to the optical demultiplexer) are given, implying that the parameters for the individual channels, which lie within the black box, are not specified. Alternatively, in the black-link approach, optical interface parameter values are only specified at the individual tributary single-channel interfaces. In this approach, interworking is enabled only at the single-channel points and not at the aggregate multichannel points. In this case, the combination of optical multiplexer and demultiplexer is treated as a single set of devices. This means that in the specified per-channel losses, the partitioning between the actual fiber losses on the one hand and the losses of the optical multiplexer and demultiplexer on the other is not specified. The same is valid for other per-channel link parameters like chromatic dispersion and polarization
mode dispersion. In particular, this black-link approach is relevant to the new hot-pluggable SFP packaging technologies described in Section 18.3.2.2. For further details, see Recommendation G.695. In the most recent version of G.695, OADM network elements have been included in the CWDM architectures treated.
18.2.1.6 All Optical Networks (AON)
Recently the ITU began studying the physical characteristics of Optical Network Elements (ONE) and optical interfaces for All Optical Networks (AON). One of the major challenges the ITU faces is whether it is possible to generate transversely compatible interface specifications for these networks or whether a new method of specification is required. A key consideration is not to drive the optical performance specifications to their physical or state-of-the-art limits, so that mutually compatible network elements can be specified and interoperable AONs can be built and configured more easily. A similar challenge was addressed when ITU evolved from specifying proprietary PDH equipment to specifying interoperable SDH/SONET equipment.
18.2.2 Transverse versus longitudinal compatibility
18.2.2.1 Introduction
In the previous section, a brief explanation of the differences between longitudinally and transversely compatible specifications was given; this section provides a more detailed explanation. An excellent overview of the two principles is provided in ITU G.Sup39 [13], "Optical System Design and Engineering Considerations", a very important reference document for many of the design considerations used in defining the various ITU optical interface recommendations.
18.2.2.2 Physical layer longitudinal compatibility
The first definition for longitudinally compatible optical interfaces was developed for PDH applications and can be found in Recommendation G.955. In this case, only the optical path (optical fiber or outside plant) characteristics are specified. Other optical interface parameters, like transmitter output power, source spectral characteristics, receiver sensitivity and overload, are not specified. Furthermore, the actual line rate and data transmission format are also not specified. This principle simply implies that
operators can use "standardized" outside-plant characteristics for tendering purposes, permitting almost total implementation freedom to equipment manufacturers. For obvious reasons, the transmitting and receiving equipment must be from the same manufacturer. Initially, only the outside-plant maximum attenuation and chromatic dispersion were specified. Later, when bit rates of 2.5 Gbit/s and higher were introduced, the maximum Differential Group Delay (DGD) and optical line reflection characteristics were added. It should be noted that the maximum path characteristics are based upon End of Life (EOL) specifications, indicating that they should include appropriate margins for temperature and aging variations and for repair splicing. For further details on longitudinal compatibility, see G.Sup39.
18.2.2.3 Physical layer transverse compatibility
To achieve complete interworking between equipment from different manufacturers on a single optical section, additional requirements beyond just the optical physical layer must be specified. In other words, the pieces of equipment at both sides of the optical link should be able to "talk to" and "understand" each other. Essential to a transversely compatible specification is the definition of appropriate reference points in the fiber path at which optical parameters can be both specified and measured. Two reference points were chosen to reflect where the transmit signal enters the outside plant (S-type reference point) and where the signal leaves the outside plant and is received by an optical receiver (R-type reference point). At these reference points, a full set of optical parameters and associated (verifiable) values is necessary to enable interoperability at the optical level. The first definition of this partitioning and associated parameter definition and specification is found in Recommendation G.957. Examples of parameters at Point S are transmitter output power (min and max) and the spectral characteristics of the transmitter, and at Point R the minimum receiver sensitivity, receiver overload, and maximum optical path penalty. The optical path penalty describes the apparent reduction of receiver sensitivity due to distortion of the signal waveform (generated by the transmitter) during its transmission over the optical path. A complete overview of the relevant parameters can be found in the various ITU optical interface recommendations. Some further details on optical power budget design considerations and limitations are given in Section 18.2.6. G.Sup39 gives a complete overview of levels of transverse compatibility, with the single-span configuration being the most straightforward.
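The parameter sets specified at Points S and R translate into a simple worst-case power budget for a single span. The sketch below illustrates only the arithmetic; the dBm/dB values are invented for illustration and are not taken from G.957 or any other recommendation:

```python
# Worst-case single-span budget check at reference points S and R.
# All powers in dBm, losses and penalties in dB. Values are illustrative.
tx_power_min = -5.0     # minimum mean launched power at Point S
tx_power_max = 0.0      # maximum mean launched power at Point S
rx_sensitivity = -28.0  # minimum receiver sensitivity at Point R
rx_overload = -8.0      # receiver overload at Point R
path_penalty = 1.0      # maximum optical path penalty (waveform distortion)

# Highest path attenuation the link may present while still meeting the
# receiver sensitivity, after reserving the path penalty:
max_attenuation = tx_power_min - (rx_sensitivity + path_penalty)

# Lowest attenuation allowed so that a strong transmitter cannot
# overload the receiver:
min_attenuation = tx_power_max - rx_overload

print(f"allowed attenuation range: {min_attenuation:.1f} to "
      f"{max_attenuation:.1f} dB")
```

With these illustrative numbers the outside plant must present between 8 and 22 dB of loss, which is exactly the kind of attenuation range (rather than distance) that a transversely compatible application code specifies.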
See Figure 18-1 for a generic single-span transversely compatible configuration.
Figure 18-1. Generic single-span transversely compatible configuration (Supplier A transmitter, Point S, outside fiber plant, Point R, Supplier B receiver)
When adding line optical amplifiers into the middle of an optical link configuration, a transversely compatible specification reaches an increased level of complexity, because some parameters apply to a single section (e.g., attenuation), whereas others apply to the complete end-to-end section (e.g., chromatic dispersion and differential group delay). In such a multispan configuration, additional reference points are required to differentiate between end-to-end and per-section parameter specifications. One example configuration is shown in Figure 18-2, where a single line Optical Amplifier (OA) has been inserted. In this case, slightly different reference points are used, MPI-S and MPI-R, to indicate Main Path Interfaces, relevant for end-to-end performance. Also, OA reference points R' and S' are used to indicate the parameters at the input and output of the Optical Amplifier. In general, different reference point nomenclatures are used across the various optical interface recommendations.
Figure 18-2. Example of multispan configuration (Supplier A transmitter at MPI-S, Supplier B line amplifier between R' and S', Supplier C receiver at MPI-R)
It should be obvious that when adding Optical Network Elements (ONEs) like optical add-drop multiplexers or optical cross-connects, the complexity
of defining transversely compatible specifications reaches an even higher level.
18.2.3 Overview of optical fiber types and associated recommendations
In parallel with the development of a range of optical interface recommendations, the ITU has also developed a variety of recommendations for optical fiber cables. The evolution of the various fiber types had a significant impact on the development of optical interface recommendations, described later in Section 18.2.4. In the early 1980s, most optical fiber systems were deployed on multimode fiber. Many fiber types were developed, all with different characteristics and dimensions. In Europe the so-called 50/125 μm graded-index type (with a 50 μm fiber core diameter and a 125 μm cladding diameter) was mostly used, whereas in the USA 62.5/125 μm fibers were mostly deployed. ITU has only specified the properties of 50/125 μm graded-index multimode fiber types, found in Recommendation G.651 [14]. In general, multimode fibers are used in the 850 nm and 1310 nm wavelength windows. Because of a variety of technical issues with multimode fibers, e.g., limitations on transmission distance in relation to system speed, single-mode fibers became available and were generally deployed. Single-mode fiber was and still is used today for laser-based telecom applications. In the late 1990s, there was an interest in transmitting 10 Gbit/s Ethernet-based data rates over multimode fibers. A significant amount of work was done to improve the properties of the multimode fiber, resulting in the ability to transmit laser-based 10 Gbit/s signals over a distance of 300 m. Because applications using multimode fibers are not specified in a majority of the ITU optical interface recommendations, they are not further discussed in this chapter. The first single-mode fiber type is commonly referred to as Standard Single-Mode Fiber (SMF or SSMF). The ITU has developed Recommendation G.652 [15] in order to specify the properties of cables that utilize this fiber type.
Initially, this fiber was only used in the 1310 nm wavelength window, where the chromatic dispersion is near-zero, theoretically providing unlimited transmission bandwidth. Around 1990, the SSMF fiber was additionally used in the 1550 nm window, due to the combination of a market drive for longer distances (at 1550 nm, the fiber loss is about half the value in the 1310 nm window) and the availability of 1550 nm laser sources. Because of the subsequent need to transmit at higher data rates (10 Gbit/s) and the need to utilize more of the wavelength range
704
Chapter 18
between 1360 nm and 1530 nm, additional specification details were necessary for Recommendation G.652. For this purpose, three subcategories have been defined within G.652:

• Subcategory A, for the base G.652;
• Subcategory B, having additional requirements on Polarization Mode Dispersion (PMD) for 10 Gbit/s applications; and
• Subcategory C, having additional requirements for operation in the wavelength range between 1360 nm and 1530 nm, relevant to wideband CWDM applications.

PMD occurs when the cross-section of the fiber core is not perfectly circular but instead tends toward an elliptical shape. This results in the fiber effectively transmitting in a dual-mode instead of a single-mode condition. Because the two modes have different group delays, additional pulse broadening occurs beyond the traditional broadening caused by chromatic dispersion. PMD is typically significant for transmission speeds of 10 Gbit/s or higher; in certain cases, however, it is known to affect the performance of 2.5 Gbit/s systems. Further information on PMD can be found in G.Sup39 or Recommendation G.650 [16].

Despite the huge advantage of a low fiber loss in the 1550 nm region, G.652 fiber exhibits a fairly high chromatic dispersion of around 17 ps/(nm·km) at 1550 nm, severely limiting the transmission distance of systems operating at 2.5 Gbit/s or higher. Therefore a new fiber, called Dispersion-Shifted Fiber (DSF), was developed, which is specified by ITU Recommendation G.653 [17]. This fiber was specifically designed to give low losses around 1550 nm and to provide near-zero chromatic dispersion in the 1550 nm wavelength window. At the same time, the ITU developed Recommendation G.654 [18] to address fiber types for submarine applications operating in the 1550 nm window. Because this fiber type is not used in terrestrial applications, further details are not provided here. An overview of ITU's single-mode fiber recommendations is provided in Table 18-1.
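To get a feel for the magnitudes involved, the two broadening mechanisms above can be estimated with first-order formulas: chromatic dispersion broadening is roughly D·L·Δλ, and the mean PMD-induced differential group delay grows with the square root of fiber length. The sketch below is illustrative only; the 17 ps/(nm·km) dispersion figure comes from the text, while the 0.1 nm source spectral width and 0.5 ps/√km PMD coefficient are assumed example values, not taken from any ITU recommendation.

```python
import math

def chromatic_broadening_ps(D_ps_nm_km: float, length_km: float, dlambda_nm: float) -> float:
    """First-order chromatic dispersion pulse broadening: |D| * L * delta-lambda."""
    return abs(D_ps_nm_km) * length_km * dlambda_nm

def pmd_delay_ps(pmd_coeff_ps_sqrt_km: float, length_km: float) -> float:
    """Mean differential group delay due to PMD; grows as sqrt(L)."""
    return pmd_coeff_ps_sqrt_km * math.sqrt(length_km)

# Example: 80 km of G.652 fiber at 1550 nm (D ~ 17 ps/(nm km) per the text),
# with an assumed 0.1 nm source width and 0.5 ps/sqrt(km) PMD coefficient.
cd_ps = chromatic_broadening_ps(17.0, 80.0, 0.1)   # 136.0 ps of dispersion broadening
pmd_ps = pmd_delay_ps(0.5, 80.0)                   # about 4.5 ps of PMD delay
```

At 10 Gbit/s (a 100 ps bit period) the 136 ps dispersion figure dwarfs the PMD contribution, which is consistent with the text's observation that dispersion, not PMD, is the first limit on G.652 reach at 1550 nm.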
When DWDM systems with closely spaced optical signals were being introduced, it became apparent that the near-zero chromatic dispersion in G.653 fibers was the cause of the nonlinear Four-Wave Mixing (FWM) effect. FWM, also called four-photon mixing, occurs when additional optical signals are generated by the interaction of two or three adjacent optical channels operating at closely spaced wavelengths at equivalent speeds. These additional signals, which appear at wavelengths other than their parent wavelengths, are called mixing products.
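The number of mixing products grows rapidly with channel count: for N co-propagating channels, products appear at frequencies f_i + f_j − f_k, and the standard combinatorial count is N²(N−1)/2. This closed form is a textbook result, not taken from the chapter; the sketch below derives it by direct enumeration.

```python
def fwm_product_count(n_channels: int) -> int:
    """Count FWM mixing products f_i + f_j - f_k by direct enumeration.

    i and j may coincide (degenerate FWM), but k must differ from both;
    the pair (i, j) is unordered.
    """
    count = 0
    for i in range(n_channels):
        for j in range(i, n_channels):          # unordered pair i <= j
            for k in range(n_channels):
                if k != i and k != j:
                    count += 1
    return count

# The enumeration reproduces the closed form N^2 * (N - 1) / 2:
for n in range(2, 10):
    assert fwm_product_count(n) == n * n * (n - 1) // 2
```

For example, 8 channels already produce 224 mixing products, which illustrates why near-zero dispersion (which phase-matches these products) was so damaging for densely packed DWDM on G.653 fiber.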
ITU Optical Interface Standards
705
Table 18-1. Overview of ITU single-mode fiber recommendations

ITU Recommendation   Recommendation title
G.652                Characteristics of a (standard) single-mode optical fibre cable
G.653                Characteristics of a dispersion-shifted single-mode optical fibre cable
G.654                Characteristics of a cut-off shifted single-mode optical fibre and cable
G.655                Characteristics of a nonzero dispersion-shifted single-mode optical fibre cable
G.656                Characteristics of a fibre and cable with nonzero dispersion for wideband optical transport
Because of FWM, a new fiber generation was developed that had nonzero dispersion and would generally prevent FWM from occurring in the optical line. These new fibers are specified in Recommendation G.655 [19]. A whole variety of these fibers has since become available, all with very different characteristics for the actual zero-dispersion wavelength and chromatic dispersion slope. No further details are given here, because that would entail a detailed theoretical treatment of DWDM systems and is beyond the scope of this chapter. More details on nonlinear effects can be found in Recommendation G.663 [20].

The first DWDM systems were operated with wavelengths around 1550 nm, coinciding with the transmission bandwidth of the first generation of optical fiber amplifiers. Subsequent deployments of DWDM equipment utilized wavelengths above and below the 1550 nm region. To simplify discussion of these wavelengths, the ITU defined a range of wavelength bands in G.Sup39, reproduced in Table 18-2. Most recently, the usage of DWDM and CWDM systems has extended across a large range of bands, including the S-, C-, and L-bands. ITU has created a new fiber Recommendation G.656 [21], which addresses the optimal performance of the fiber in wideband applications. Full details of key parameters and associated values can be found in the relevant ITU fiber recommendations.

Table 18-2. Overview of wavelength bands

Band     Description            Range (nm)
O-band   Original               1260 to 1360
E-band   Extended               1360 to 1460
S-band   Short wavelength       1460 to 1530
C-band   Conventional           1530 to 1565
L-band   Long wavelength        1565 to 1625
U-band   Ultralong wavelength   1625 to 1675
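The band boundaries in Table 18-2 lend themselves to a simple lookup. The sketch below encodes the table directly; treating each lower bound as inclusive and each upper bound as exclusive at the shared band edges is an assumption of this sketch, since the table itself does not state a convention for boundary wavelengths.

```python
# ITU wavelength bands from Table 18-2 (lower bound inclusive, upper bound
# exclusive; the convention at shared band edges is an assumption here).
BANDS = [
    ("O-band", 1260, 1360),
    ("E-band", 1360, 1460),
    ("S-band", 1460, 1530),
    ("C-band", 1530, 1565),
    ("L-band", 1565, 1625),
    ("U-band", 1625, 1675),
]

def band_of(wavelength_nm: float) -> str:
    """Return the ITU band name for a wavelength, per Table 18-2."""
    for name, low, high in BANDS:
        if low <= wavelength_nm < high:
            return name
    raise ValueError(f"{wavelength_nm} nm is outside the bands of Table 18-2")
```

For example, band_of(1310) returns "O-band" and band_of(1550) returns "C-band", matching the two classic transmission windows discussed throughout this chapter.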
18.2.4 Overview of optical interface recommendations

It has been clarified in the previous sections that over time the ITU has developed a whole range of recommendations related to optical interface specifications. In retrospect, it might not seem logical how the ITU recommendations are organized and where to look for a certain specification. In this chapter, an attempt is made to guide the user/designer of optical interface technologies. Table 18-3 provides an overview of ITU's various optical interface recommendations existing to date.

As stated previously, the ITU decided to use its latest Recommendations, G.693 and G.959.1, to serve as generic documents for most optical interface specifications. The basis for this generic specification was the definition of signal classes in Recommendation G.959.1, covering a range of data rates for which sets of optical parameter values are specified. Presently, the following signal classes are addressed: NRZ 1.25G, NRZ 2.5G, NRZ 10G, NRZ 40G, and RZ 40G (see Table 18-4 for details). The parameter values for the older STM-1 and STM-4 interfaces remain within G.957, along with most of the STM-16 interfaces. For the NRZ 10G data rate class, ITU put almost all the sets of parameter values into either G.693 or G.959.1, except some of the applications that use optical amplifiers for 80 km or 120 km distances, which remain in G.691.

One of the blocking issues that prevented the generic recommendations (G.959.1 and G.693) from covering the STM-1, STM-4, and STM-16 data rates was the fact that the reference BER applicable to G.957 differed from that of G.959.1 or G.693. At the time that G.957 was originally drafted, the parameter values were specified relative to an optical section design objective of a Bit Error Ratio (BER) not worse than 1 x 10^-10, whereas for all of the more recent Recommendations (G.691, G.692, G.693, and G.959.1), the BER design objective is 1 x 10^-12.
The more stringent BER objective was derived from error performance requirements specified in Recommendation G.826 [22]. At the time of drafting and agreement of the first version of G.957, G.826 did not yet exist, and as such, a BER design objective of 10^-10 was considered appropriate. Because of the wide deployment of SDH interfaces based upon this early version of G.957, the ITU decided to maintain the parameter value sets within G.957. Had the ITU decided to adopt the more stringent BER requirement for STM-1 to STM-16 interfaces, substantial changes to some of the parameter values, like maximum optical path attenuation and/or minimum receiver sensitivity, would have been necessary. This could have resulted in interoperability issues between new and installed systems. Further discussion is provided in Section 18.2.6.
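The gap between the 10^-10 and 10^-12 objectives can be made concrete with the standard Gaussian-noise approximation BER ≈ ½·erfc(Q/√2), which relates BER to the receiver's Q factor. This relation is a common engineering approximation, not something stated in the chapter; the sketch below inverts it numerically to compare the Q required for each objective.

```python
import math

def ber_from_q(q: float) -> float:
    """Gaussian-noise approximation: BER = 0.5 * erfc(Q / sqrt(2))."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))

def q_from_ber(ber: float) -> float:
    """Invert ber_from_q by bisection (BER decreases monotonically with Q)."""
    lo, hi = 0.0, 20.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if ber_from_q(mid) > ber:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Under this approximation, the 10^-10 objective corresponds to Q ~ 6.4,
# while the 10^-12 objective requires Q ~ 7.0 (roughly 0.8 dB more).
q_old = q_from_ber(1e-10)
q_new = q_from_ber(1e-12)
```

The roughly 0.8 dB difference in required Q is consistent with the text's point that adopting the stricter objective for existing STM-1 to STM-16 interfaces would have forced changes to attenuation and sensitivity values.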
Table 18-3. Overview of relevant ITU optical interface recommendations

Recommendation   Application area   Description
G.691            SDH (SONET)        Transversely compatible specification for single-channel SDH STM-64 in general and STM-4 and STM-16 interfaces with optical amplifiers for distances longer than 80 km; design objective BER < 10^-12
G.692            SDH-DWDM           "Limited" transversely compatible specification for multichannel SDH systems; actual parameter values "for further study"; design objective BER < 10^-12
G.693            Generic VSR        Transversely compatible specification for single-channel signal classes NRZ 10G and NRZ 40G for link distances up to 2 km with potentially high losses up to 16 dB; design objective BER < 10^-12
G.694.1          Grid for DWDM      DWDM grid specification with spacing ranging from 12.5 GHz to 100 GHz
G.694.2          Grid for CWDM      CWDM grid specification with 20 nm grid spacing
G.695            CWDM               Transversely compatible specification for up to 16 channels, up to 2.5 Gbit/s per channel; design objective BER < 10^-12
G.696.1          OTN-DWDM           Longitudinally compatible specification for multispan, multichannel OTN systems; design objective BER < 10^-12
G.698.1          OTN-DWDM           Transversely compatible specification for 100 GHz spaced channels, up to 10 Gbit/s per channel; design objective BER < 10^-12
G.955            PDH                Longitudinally compatible specification, single-channel, single-span, up to 4 x 140 Mbit/s; design objective BER < 10^-10
G.957            SDH (SONET)        Transversely compatible specification for STM-1, STM-4, and STM-16 interfaces for link distances up to 80 km; design objective BER < 10^-10
G.959.1          Generic IrDI       Transversely compatible specifications for single-channel signal classes NRZ 1.25G, NRZ 2.5G, NRZ 10G, NRZ 40G, and RZ 40G, and 16-channel signal classes NRZ 2.5G and NRZ 10G; design objective BER < 10^-12

Table 18-4. Overview of signal classes

Signal class   Range of nominal bit rate [Gbit/s]
NRZ 1.25G      0.622 to 1.25
NRZ 2.5G       0.622 to 2.67
NRZ 10G        2.4 to 10.71
NRZ 40G        9.9 to 43.02
RZ 40G         9.9 to 43.02
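Note that the signal-class ranges in Table 18-4 deliberately overlap, so a given nominal bit rate can fall into more than one class. The sketch below encodes the table and returns every class covering a given rate; the inclusive-bounds convention is an assumption of this sketch.

```python
# Signal classes and nominal bit-rate ranges from Table 18-4 (Gbit/s),
# bounds treated as inclusive (an assumption of this sketch).
SIGNAL_CLASSES = {
    "NRZ 1.25G": (0.622, 1.25),
    "NRZ 2.5G":  (0.622, 2.67),
    "NRZ 10G":   (2.4, 10.71),
    "NRZ 40G":   (9.9, 43.02),
    "RZ 40G":    (9.9, 43.02),
}

def classes_for(bitrate_gbps):
    """Return every signal class whose nominal bit-rate range covers the rate.

    The ranges overlap, so a single rate may belong to several classes.
    """
    return [name for name, (low, high) in SIGNAL_CLASSES.items()
            if low <= bitrate_gbps <= high]
```

For example, classes_for(0.622) returns both "NRZ 1.25G" and "NRZ 2.5G", and classes_for(2.5) returns both "NRZ 2.5G" and "NRZ 10G", illustrating why a single interface may be covered by parameter sets in more than one recommendation.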
Table 18-5. Overview of optical interfaces and the related optical interface recommendations

Bit rate                                       Distance              BER Objective   Relevant Recommendation
STM-1 (OC-3)                                   < 80 km               10^-10          G.957
STM-4 (OC-12)                                  < 80 km               10^-10          G.957
STM-4 (OC-12) + optical amplifiers             < 60 km @ 1310 nm     10^-12          G.691
                                               < 160 km @ 1550 nm
STM-16 (OC-48)                                 < 80 km               10^-10          G.957
STM-16 (OC-48) + optical amplifiers            < 160 km              10^-12          G.691
Signal class NRZ 2.5G                          < 80 km               10^-12          G.959.1
STM-64 (OC-192) + optical amplifiers           40 - 120 km           10^-12          G.691
Signal class NRZ 10G (incl. STM-64/OC-192)     < 2 km                10^-12          G.693
Signal class NRZ 10G (incl. STM-64/OC-192)     2 - 80 km             10^-12          G.959.1
Signal class NRZ 40G (incl. STM-256/OC-768)    < 2 km                10^-12          G.693
Signal class NRZ 40G (incl. STM-256/OC-768)    2 - 80 km             10^-12          G.959.1
Signal class RZ 40G (incl. STM-256/OC-768)     40 - 80 km            10^-12          G.959.1
NRZ OTU3 with FEC                              40 - 80 km            10^-12          G.959.1
Table 18-5 maps the various optical interface types to their associated ITU recommendations. As a consequence of the standards evolving with the available technology and market needs, it is possible to find multiple standards covering the same nominal data rate (e.g., 2.5 Gbit/s). For this example, one can find STM-16 parameter value sets in G.957 or G.691 and NRZ 2.5G parameter value sets in G.959.1. A detailed comparison between G.957, G.691, and G.959.1 demonstrates that some parameters have different values. These differences are actually a consequence of having different target BERs, but the specifications are nevertheless intended to be mutually consistent. As an example, the STM-16 40 km long-haul specification in G.957 contains a worst-case receiver sensitivity of -27 dBm at a reference BER of 10^-10, whereas the equivalent application in G.959.1 shows a worst-case receiver sensitivity of -25 dBm at a reference BER of 10^-12. Finally, it should be noted that the maximum data rate for the NRZ signal classes in Recommendations G.959.1 and G.693 is slightly higher than the equivalent SDH rate in Recommendations G.957 and G.691. Because in both cases the same optical path penalty applies for equivalent application
codes, the requirements for receiver and transmitter parameters are potentially more stringent in the case of the higher data rate.
18.2.5 Application code terminology related to distance

Historically, ITU and Telcordia (Bellcore) have used different naming conventions to indicate the various application codes and associated transmission distances. Within the ITU as well, some confusion might arise from the naming conventions, which exist primarily for historical reasons. In this section, an attempt is made to clarify the naming conventions and correlate them with Telcordia's.

At the time the first version of G.957 was drafted, 1310 nm technology was the preferred technology for all distances up to 40 km and bit rates up to 2.5 Gbit/s. While 1550 nm parameter values are also included in G.957 for distances up to 40 km, 1550 nm technology was typically used in practice for longer distances, up to 80 km. Therefore the name long-haul or long-reach was used to indicate possible transmission distances of 40 km when operated in the 1310 nm window and of 80 km when operated in the 1550 nm window. Later, when 10 Gbit/s technology was introduced, 1550 nm technology became the preferred technology for 40 km distances due to the lower attenuation of the outside plant in the 1550 nm window. This 40 km application, operated in the 1550 nm window, is referred to as a short-haul or intermediate-reach application in order to distinguish it from the 80 km 1550 nm long-haul applications. As a consequence, a 40 km distance is referred to as long-haul/long-reach when operated at 1310 nm, and as short-haul/intermediate-reach when operated at 1550 nm.

In Table 18-6, an overview is given of the various target distances, including a correlation between ITU and Telcordia (Bellcore) nomenclature. In some cases, different target distances apply. As already mentioned in Section 18.2.1.2, these target distances are only used for classification and not for guaranteed specifications.
In Recommendation G.693, a totally different nomenclature has been introduced and is defined by the following attenuation categories: R(estricted), L(ow), M(edium), H(igh), and V(ery high). In G.693, three target distances are defined: 0.6 km, 1 km and 2 km. The reader can refer to Recommendation G.693 for further details. It should be noted that G.695 and G.959.1 also contain some application codes for transversely compatible multichannel/multiwavelength interfaces, which are not described here.
Table 18-6. Overview of application code naming conventions in relation to target distances

ITU naming                                        Equivalent Telcordia naming     Target distance
I-codes: Intra-office                             SR-codes: Short-reach           2 km in G.957; 2 km @ 1310 nm in G.691, G.959.1; 25 km @ 1550 nm in G.691, G.959.1
I-r-codes: Intra-office-r (reduced                VSR codes                       0.6 km @ 1310 nm in G.691, G.959.1; 2 km @ 1550 nm in G.691, G.959.1; 0.6, 1, and 2 km in G.693
  target distance)
S-codes: Short-haul                               IR-codes: Intermediate-reach    15 km in G.957 and for NRZ 2.5G in G.959.1; 20 km @ 1310 nm in G.691, G.959.1; 40 km @ 1550 nm in G.691, G.959.1
L-codes: Long-haul                                LR-codes: Long-reach            40 km @ 1310 nm; 80 km @ 1550 nm
V-codes: Very-long-haul                           VR-codes: Very-long-reach       60 km @ 1310 nm; 120 km @ 1550 nm
U-codes: Ultra-long-haul                          (none)                          160 km @ 1550 nm
18.2.6 Power budget design considerations and limitations

18.2.6.1 Worst-case design approach
The optical parameter values contained in the various optical interface recommendations have been chosen according to a worst-case design approach. Details of this principle are provided in G.Sup39. This worst-case design approach means that in an optical link where all of the specified optical parameters are simultaneously at their worst-case End-Of-Life (EOL) values, the system BER shall not be worse than the value specified for the application, e.g., 10^-12. Within this context, End-Of-Life refers to a condition where the characteristics of a piece of equipment (e.g., the optical transmitter or receiver) or the outside plant have degraded such that they are outside the intended specifications and consequently imply a failed condition. Thus, when the transmitter optical output power level is at its EOL minimum value, the receiver sensitivity is at its EOL maximum value, the outside plant loss is at its EOL maximum value, and the source spectral characteristics are such that the optical path penalty is at its maximum value, the resulting system BER should not be worse than the
value specified for the application, e.g., 10^-12. This does not imply that it is generally accepted for an optical link to operate at this reference BER. Instead, system users commonly expect an optical link to operate virtually error-free under nominal operating conditions, leveraging the fact that the probability that all optical parameters are simultaneously at their worst-case limits is very low.

For the manufacturers of optical interface technologies (transmitters and receivers), this method of specification implies that their designs should have sufficient margins between Beginning-Of-Life (BOL) and EOL parameter values. Within this context, BOL refers to the performance of the equipment in its initial condition at the time of installation. The actual choice of appropriate margins (for temperature and aging effects and design variations) is fully dependent on the actual implementation and has therefore always been considered outside the scope of the various optical interface recommendations. In many cases, component manufacturers have used these ITU specifications directly as reference specifications for their solutions, whereas they are specifically intended to be systems specifications. This approach is acceptable as long as the ITU EOL specifications are not used as component BOL specifications. As an example, a receiver device with a BOL receiver sensitivity of -14.1 dBm is very likely to fail in a system application that has an EOL receiver sensitivity requirement of -14 dBm. It has, however, become more and more common practice that manufacturers of optical interface devices, like transmitters, receivers, and transceivers, use a one-to-one copy of the optical interface parameter values from the various ITU recommendations for their device interface specifications.
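The worst-case bookkeeping described above amounts to a simple inequality: the EOL minimum transmit power, less the EOL maximum path attenuation and the maximum path penalty, must still meet the EOL (worst-case) receiver sensitivity. The sketch below illustrates this with hypothetical example numbers; none of the values are taken from a specific ITU application code.

```python
def worst_case_margin_db(tx_min_dbm: float,
                         path_loss_max_db: float,
                         path_penalty_max_db: float,
                         rx_sensitivity_dbm: float) -> float:
    """EOL worst-case link margin in dB; the link closes if this is >= 0.

    rx_sensitivity_dbm is the EOL maximum (i.e., worst) sensitivity:
    the highest power the receiver may need to reach the target BER.
    """
    received_power = tx_min_dbm - path_loss_max_db - path_penalty_max_db
    return received_power - rx_sensitivity_dbm

# Hypothetical example: -2 dBm EOL minimum Tx power, 22 dB maximum span loss,
# 2 dB maximum path penalty, -27 dBm EOL receiver sensitivity.
margin = worst_case_margin_db(-2.0, 22.0, 2.0, -27.0)
# margin = (-2 - 22 - 2) - (-27) = 1.0 dB, so this link closes at EOL.
```

In practice a designer would also subtract implementation margins (aging, temperature, connector repairs) before declaring the link viable, which is exactly the BOL-to-EOL margin the text says is left to the implementer.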
In many cases, margins to the EOL values are not indicated, making it very challenging for the user to judge whether appropriate design margins have been applied.

18.2.6.2 Optical path penalty modeling and verification
When the first version of G.957 was drafted, the intent was to define a unique set of optical spectral characteristics for each transmitter type, which would be unambiguously related to a specified value of optical path penalty through a mathematical model. In this way, a system BER test could be avoided, since such a test would require a reference receiver. At the time of this writing, ITU has only succeeded in defining a sufficiently reliable model for predicting the optical path penalty for Multi-Longitudinal-Mode (MLM) or Fabry-Perot laser devices. The details of this relationship, called the epsilon model, can be found in G.957 as well as in G.Sup39. G.Sup39 states that the epsilon model can, in principle, be extended to other types of laser sources, e.g., Single-Longitudinal-Mode (SLM) or Distributed Feedback
(DFB) lasers. However, for these laser types, chirp plays a dominant role in determining the path penalty, and as such, the epsilon model is most likely not reliable enough to predict optical system performance. Furthermore, G.957 states that a reliable method for estimating the dispersion penalties arising from laser chirp and finite side-mode suppression ratio for SLM lasers has not been agreed upon, indicating that only a true BER test can provide sufficient verification of proper transmitter functioning with respect to path dispersion performance. Since the time that G.957 was written, this issue has yet to be resolved, and therefore in most optical interface specifications a complete set of transmitter spectral characteristics is missing, denoted by "ffs" to indicate "for further study". As an example, ITU once considered defining a very narrow SLM laser linewidth for the 1550 nm 80 km 2.5 Gbit/s application in Recommendation G.957, with the intent to reliably predict a limitation of the dispersion penalty to 2 dB. After an extensive investigation, it was demonstrated that a number of laser sources that failed a BER test could pass the linewidth test, while other sources that failed the linewidth test would pass the BER test. Clearly, a situation where "good" devices are rejected and "bad" devices are accepted is unacceptable in practice and should be avoided, as it would have a huge impact on manufacturing yield and device cost.
18.3. OPTICAL INTERFACE IMPLEMENTATIONS
18.3.1 General

Over the past 20 years, the technology and approach for designing and manufacturing optical interfaces have migrated from proprietary and/or company-specific designs with discrete components towards fully integrated transceiver devices that follow Multi-Source Agreements (MSAs) for standardized packages, conforming to ITU and Telcordia specifications for standardized optical parameters. For the initial discrete designs, standards were not yet available to serve as a reference for specifying optical performance. The current SDH/SONET/OTN applications offer a full range of optical interface specifications, enabling interoperability between devices and systems from different manufacturers. The early optics designs, manufactured in low volumes, were relatively high in cost. Today's optics, in particular for the lower rates between 155 Mbit/s and 2.5 Gbit/s, which at present are identified as commodity products, are typically manufactured at
high volumes, are priced competitively as is typical for commodity products, and are available from a large range of vendors. The interoperability provided by the optical interface standards paved the way for optical devices to evolve from proprietary devices to commodity products. In this section, an overview is given of the various designs used for transmitting data in various SDH/SONET/OTN applications, along with their associated migration paths.
18.3.2 140 Mbit/s - 2.5 Gbit/s technology

18.3.2.1 Discrete solutions
18.3.2.1.1 140 Mbit/s - 622 Mbit/s

Approximately 20 years ago, optical transmitters and receivers were designed and built using "discrete" devices. These initial laser devices and pin-FET receiver modules were typically packaged in their own housings. The complete optical transmitter or receiver function, requiring more than just the laser or receiver, was realized by using a Printed Circuit Board (PCB), upon which the laser driver, receiver amplifier, and decision circuitry were placed using non-integrated parts. The laser package typically contained a laser chip, an integrated backface monitor photodiode to control the laser output power, in some cases (depending on the package style) a Thermo-Electric Cooler (TEC), and a fiber pigtail with a coupling lens. Initially, two package styles were available: bulky coaxial packages and the 14-pin Dual-In-Line (DIL) style. The coaxial packages were uncooled, and there were many (nonstandardized) versions, each requiring a vendor-specific bracket and electrical interface, and therefore vendor-specific PCB mountings and wiring. In general, these early packages were difficult to handle in a manufacturing environment. The DIL package became a de facto packaging standard for optics, as it was easier to implement in a general manufacturing environment. Both cooled and uncooled DIL versions existed. In early implementations, TECs were used to control the laser temperature (mostly at room temperature), thus improving the reliability of the laser. Unfortunately, the TEC was also the most unreliable component in a laser package. Initially, lasers were used in 850 nm and 1310 nm applications for bit rates up to around 1 Gbit/s. The first laser chips were Multi-Longitudinal-Mode (MLM) or Fabry-Perot types. Later, 1550 nm Distributed Feedback (DFB) laser chips became available, exhibiting narrow spectral width and being useful for distances up to 80 km. The 1550 nm lasers were initially packaged
in DIL housings in order to obtain improved laser stability and high-power operation through the use of the TEC for laser chip cooling. Early lasers for optical transmission operated at fiber-coupled optical output powers of approximately 1 mW or above. Newer low-power lasers, operating in the 1310 nm window around 0.1 mW, were later introduced, allowing the coupling between the laser chip and the fiber pigtail to be less critical. In this case, one could work with mechanical tolerances much less stringent than those for laser devices with high fiber-coupled powers. It was recognized that, when lasers are manufactured at high volume, the majority of the cost of a laser is in the optical coupling between the laser and the fiber, as well as the package, and not in the laser chip itself. As such, anything that could be done to increase the yield of optical coupling or simplify the manufacturing resulted in significant cost savings. The trend towards using lower-power lasers to save cost is reflected in the definition of intra-office/short-reach and short-haul/intermediate-reach interface specifications in ITU Recommendation G.957. At the beginning of the 1990s, a de facto standard for smaller-sized coaxial pigtailed packages arose. This new package, together with the improved performance of uncooled laser devices, allowed a migration towards using uncooled transmitters for almost all applications with distances up to 80 km and bit rates up to 622 Mbit/s. For discrete receiver modules, the packaging evolution proceeded at a slower pace. In most cases, the receiver devices required additional semiconductor parts, like GaAs FETs or preamplifier ICs, in order to obtain the standardized receiver sensitivity performance. For receivers, both DIL and coaxial housings have been used. Initially, only pin receivers were available on the market.
However, due to the need to achieve improved receiver sensitivities, APD (Avalanche Photo-Diode) receiver devices were introduced to the market and the standards. The APD receivers were initially used for 622 Mbit/s 120 km applications. The lower-cost pin receiver technology has been appropriate for all applications at bit rates up to 622 Mbit/s for distances up to 80 km.

18.3.2.1.2 2.5 Gbit/s

Approximately 15 years ago, the first 2.5 Gbit/s optical interfaces were designed, and with that, several new challenges related to High-Frequency (HF) performance and signal integrity through the package were introduced. At this speed, optical isolators were necessary inside the laser packages to minimize the influence of back-reflections from the outside plant on the laser chip, since these could cause transmission errors. The traditionally used DIL packages were not suitable at 2.5 Gbit/s because of their limited HF performance. As such, "butterfly" packages were
introduced, incorporating impedance-matching circuitry for optimized HF performance. For all 2.5 Gbit/s applications, SLM or DFB laser chip technology was required to meet the dispersion requirements of the various applications. Moreover, TECs were initially also used to obtain stable laser performance. At the end of the 1990s, DFB laser performance had improved such that uncooled operation was becoming possible, and a miniature version of the butterfly package was introduced, since the space for a TEC was no longer needed. The mini-butterfly packages initially served 1310 nm applications and were later used also for 1550 nm applications. When the first 2.5 Gbit/s DWDM applications were being introduced at the beginning of the 1990s, direct laser modulation was no longer suitable for transmission over distances beyond 80 km. Instead, external modulation was required to minimize the effect of laser chirp, permitting transmission distances well above 80 km. The laser and external modulator were ultimately integrated into a single package, known as an Externally Modulated Laser (EML), and later the device was integrated onto a single chip. The single-chip EML device consists of a laser section, operating in Continuous Wave (CW) mode, and an electro-absorption modulator section, in which the light emitted by the laser section is modulated. A new range of EML driver ICs had to be developed, because EMLs work in "reverse" mode: no light is emitted when current is applied, in contrast to conventional laser chips, where light is emitted when current is applied to the laser. For the first 2.5 Gbit/s DWDM applications, with channel spacings of 100 GHz (0.8 nm) or more, the drift of the EML frequency was not significant enough to result in optical link problems. However, for narrower channel spacings, wavelength lockers were necessary to achieve the required frequency stability and minimize the drift of the laser's wavelength.
2.5 Gbit/s interfaces were initially deployed in 40 - 80 km long-haul/long-reach spans. The link budget of these spans required the use of a very sensitive receiver; hence APD receivers were initially deployed. In order to maintain high-frequency signal integrity, the electrical leads coming out of the receiver package needed to have a minimum number of discontinuities. Because of this, the electrical leads were typically oriented parallel to the axis of the cylindrically shaped package. At 2.5 Gbit/s, integration of (pre)amplifier electronics within the receiver package was required in order to achieve the required performance. Later, once 2.5 Gbit/s systems were deployed in short-haul/intermediate-reach and intra-office/short-reach applications, the lower-cost pin-type receiver devices were introduced, employing the same packaging concepts as the APD-type receivers.
18.3.2.2 Integrated solutions
18.3.2.2.1 Initial versions

The availability of ITU's SDH/SONET optical interface Recommendation G.957 served as a reference for manufacturers of optical interface technology to design full-functionality transmitter and receiver devices with standardized performance characteristics. In most cases, these devices were housed in DIL packages. The supporting electronics for the transmit and receive functions, e.g., laser driver circuitry and receiver amplification stages, were the first elements to be integrated into the packages. For certain applications, clock extraction circuitry was needed, and that also was integrated into the package. Initially, and to some extent even today, an industry-agreed packaging standard (de facto or real) didn't exist, and many vendors offered proprietary, vendor-specific solutions for transmitters and receivers for bit rates up to 2.5 Gbit/s. These initial versions were equipped with fiber pigtails.

18.3.2.2.2 1 x 9 device
Towards the end of the 1990s, the emerging market of Gigabit Ethernet applications served as a driver to introduce the first integrated transceiver devices without fiber pigtails. These initial integrated transceivers (so called because they contained both a transmitter and a receiver in one module) were called the 1 x 9 transceivers. These modules were approximately 25 mm (1 inch) wide, 10 mm high, and 40 mm long, with two integrated SC-style optical "receptacle-type" connectors. The electrical interface consisted of nine pins, positioned in one line at the side of the package opposite to where the optical connectors were located. This interface was of a "serial" type, implying that the electrical data rate given/received to/from the device is exactly the same as the data rate that is transmitted/received by the optics. These devices were mounted directly on the edge of a system's Printed Circuit Board (PCB), flush with the faceplate of the circuit pack. Previous implementations allowed the laser/receiver devices to be mounted anywhere on a PCB, and the adjoining fiber pigtail was used to bring the light up to the PCB faceplate. A de facto reference design was available for designers to properly manage the mechanical and electrical interfaces of the 1 x 9 transceiver with their PCBs. The 1 x 9 transceiver ultimately served 1310 nm short-haul/intermediate-reach distances for Gigabit Ethernet, Fibre Channel, SDH/SONET STM-1/OC-3, and STM-4/OC-12 applications.
ITU Optical Interface Standards
18.3.2.2.3 Small Form Factor (SFF) devices
The market requirement to grow the capacity of network equipment also required that the density of the network equipment improve. This requirement manifested itself in a need to improve the density of optical interfaces at a faceplate. It was first addressed by putting multiple channels (e.g., multiple 1 x 9 transceivers) on a single PCB, but ultimately required that the package be redesigned. Leveraging the newly available LC fiber optic connector, which was much smaller than previous connectors, a new transceiver package was introduced by multiple vendors that was only half the size of a 1 x 9 transceiver. This new type of transceiver is most commonly referred to as a Small Form Factor (SFF) device. Its width is only 0.5 inch (12.7 mm), and the optical interface consists of two LC optical connectors. The electrical interface consists of two parallel rows of pins, in either a 2 x 5 or 2 x 10 configuration, depending on the features included in the transceiver. This package was developed through a voluntary industry cooperative effort, a practice which is now commonly referred to as a Multi-Source Agreement or MSA. The official specifications for the SFF MSA can be found at www.sffcommittee.org. An example of an SFF device is shown in Figure 18-3.
Figure 18-3. Example of SFF device (courtesy of Finisar Corporation)
Chapter 18
SFF devices are not only available for the same application space as the 1 x 9 device but now also address 2 Gbit/s Fibre Channel and 2.5 Gbit/s SDH/SONET applications. Initially, only devices for 1310 nm short-haul/intermediate-reach were available. However, due to subsequently improved performance with uncooled 1310 nm and 1550 nm SLM or DFB lasers and APD receivers, devices for 2.5 Gbit/s 40 km and 80 km long-haul/long-reach applications have become available. This package achieved very competitive market pricing because the module-level package was utilized in so many applications, which allowed the volumes to be converged on the manufacturing line.

18.3.2.2.4 SFP devices
The availability of SFF devices enabled very high port densities in optical transmission systems. Examples are circuit packs with 16 SFF devices next to each other for STM-4/OC-12 and 1 GbE applications. While solving the density problem, the SFF package had to be soldered to the board, meaning that the end customer was required to buy all 16 channels even if only a couple of channels were initially needed. The physical layout of so many devices on the faceplate of the PCBs proved to be an incredible challenge. Additionally, it was generally not feasible to support multiple applications (e.g., different transmission reaches) on the same circuit pack, despite the fact that the package lent itself to this opportunity. Finally, there is an increased reliability risk when so many optical devices are mounted on a single circuit pack: in the case of a single transceiver failure, either the complete circuit pack must be replaced and repaired or the relevant optical port must be permanently taken out of service. In order to resolve the aforementioned drawbacks, a completely new generation of electrically pluggable transceivers was developed. The most prominent such transceiver today is the Small Form-factor Pluggable (SFP) module.
The SFP modules are similar in size to the SFF devices; however, they are hot-pluggable, implying that they can be inserted into and removed from a live circuit pack without disrupting service or impacting the performance of any of the other ports already in service. This new concept offers full flexibility, permitting a mix of various transceiver types on a single circuit pack under a "pay as you grow" scenario. Also, in the case of a transceiver failure, only the impacted device will require replacement. An example of an SFP device is shown in Figure 18-4. In order to facilitate simple pluggable insertion into and removal from the board, and to minimize EMC interference, a protective cage is mounted at the front of the host PCB. The width of the cage is slightly larger than that of an SFF device, implying a slightly lower port density than for SFF devices, but the gain in flexibility more than compensates for this
reduced density. The electrical interface is realized through a 20-pin electrical connector, providing all necessary input and output signals. In order to manage the device, a two-wire serial bus is incorporated into the electrical pin field. Access to the device's performance and alarm monitors is provided through this bus.
Figure 18-4. Example of SFP device (courtesy of Finisar Corporation)
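The monitor access described above can be illustrated with a short sketch. The register offsets and scale factors below follow an SFF-8472-style diagnostics layout and are assumptions for illustration only: temperature as a signed 16-bit value in units of 1/256 °C, and transmit power as an unsigned 16-bit value in units of 0.1 µW.

```python
import struct

# Decode a few monitor values from a diagnostics page read over the
# two-wire (I2C) serial bus. Offsets and scale factors follow an
# SFF-8472-style layout (an assumption for illustration): temperature is
# a signed 16-bit value in 1/256 degC units at offset 96, and TX power
# is an unsigned 16-bit value in 0.1 uW units at offset 102.
def decode_monitors(page: bytes) -> dict:
    temp_raw, = struct.unpack_from(">h", page, 96)    # signed, big-endian
    txpwr_raw, = struct.unpack_from(">H", page, 102)  # unsigned, big-endian
    return {
        "temperature_C": temp_raw / 256.0,
        "tx_power_mW": txpwr_raw * 0.0001,            # 0.1 uW -> mW
    }

# Fabricated 256-byte page encoding 40.5 degC and 0.5 mW for the demo.
page = bytearray(256)
struct.pack_into(">h", page, 96, int(40.5 * 256))
struct.pack_into(">H", page, 102, 5000)
print(decode_monitors(bytes(page)))
```

In a real system the page would be read from the module over the two-wire bus; here a fabricated buffer stands in for that read.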
The SFP was developed through a voluntary cooperative industry agreement. Additional information on the MSA specification, detailing the mechanics, pinning, and electrical interface, can be found at www.schelto.com/SFP. The application space for the SFP devices covers all speeds up to 2.5 Gbit/s and, most recently, 4 Gbit/s for quadruple-rate Fibre Channel applications. The SFP is used in single-channel applications as well as in CWDM applications as specified under ITU Recommendation G.695. The operation of DFB lasers in uncooled conditions matches very well with the 20 nm channel spacing requirements for CWDM systems in G.695. Recent advances in packaging and micro-cooling of substrates have also enabled the introduction of SFP devices for 2.5 Gbit/s DWDM applications. Prior to the development of the SFP hot-pluggable device, a package called the GBIC (GigaBit Interface Converter) module had been developed, primarily covering 1-2 Gbit/s Fibre Channel and Gigabit Ethernet applications. Its dimensions, including alignment guides and the electrical
connector, are approximately 13.7 x 38 x 76 mm (h x w x d), and it contains either one or two optical ports. The GBIC used SC-style optical connectors. Because this device has not been applied to the broad range of SDH/SONET interfaces, no further attention is given to it here, with one exception: a couple of companies developed 2.5 Gbit/s and GbE DWDM applications and packaged them in a GBIC. The long-term opportunity for this part is unclear, given that the same capability is available in SFPs. An example of a GBIC device is shown in Figure 18-5. Additional information about the GBIC can be found at www.schelto.com.
Figure 18-5: Example of GBIC device (courtesy of Finisar Corporation)
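The 20 nm CWDM channel spacing mentioned above corresponds to the nominal wavelength grid of ITU Recommendation G.694.2, which a short snippet can enumerate:

```python
# The CWDM wavelength grid of ITU-T G.694.2: nominal central wavelengths
# from 1271 nm to 1611 nm in 20 nm steps, giving 18 channels. The wide
# 20 nm channels are what allow uncooled DFB lasers, whose wavelength
# drifts with temperature, to be used in CWDM transceivers.
cwdm_grid_nm = list(range(1271, 1612, 20))
print(len(cwdm_grid_nm), cwdm_grid_nm[0], cwdm_grid_nm[-1])  # 18 1271 1611
```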
18.3.2.2.5 2.5 Gbit/s transponders
Before the aforementioned SFF and SFP transceivers were available for 2.5 Gbit/s applications, transponder devices were introduced to aid in the implementation of 2.5 Gbit/s optical interfaces on circuit boards. In contrast to the transceiver devices, which have serial electrical interfaces exhibiting the same speed on the electrical interface as on the optical interface, a transponder contains electrical (de)multiplexing functions in order to equip the device with electrical signals that can be fairly easily handled at the circuit pack level. The transponders contained an electrical interface with sixteen channels operating at 155 Mbit/s per channel. An industry-wide MSA has not been developed for the 16-channel 2.5 Gbit/s transponder. Instead, several smaller-scale MSAs were agreed upon by some manufacturers of optical interface technology. Initially, transponders were developed for single-channel SDH/SONET interfaces and later for DWDM optical interfaces as well. In most cases, the latter were proprietary. Today, the 2.5 Gbit/s transponders are mostly used for the latter DWDM applications, while the (serial) transceivers are used for the single-channel applications.
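The 16-channel interface arithmetic is straightforward. Using the exact SDH STM-1 rate of 155.52 Mbit/s (the "155 Mbit/s" above is the rounded figure), the sixteen channels add up to the STM-16/OC-48 line rate:

```python
# A 2.5 Gbit/s transponder demultiplexes the line signal into 16
# electrical channels. At the exact STM-1 rate of 155.52 Mbit/s per
# channel, the aggregate equals the STM-16/OC-48 line rate.
per_channel_mbps = 155.52
aggregate_mbps = 16 * per_channel_mbps
print(aggregate_mbps)  # 2488.32, i.e. 2.48832 Gbit/s
```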
18.3.3 10 Gbit/s technology

18.3.3.1 Discrete solutions
When the first 10 Gbit/s optical interfaces were being designed, the available technical choices were fairly limited. APD receivers were not yet available and transmitter output powers were limited. The first 10 Gbit/s transmitters were based upon a combination of 1550 nm CW lasers and Mach-Zehnder (MZ) modulators. Only when using optical amplifiers, either in a booster or in an optical preamplifier configuration, could distances of 80 km and beyond be achieved. Furthermore, dispersion compensation was necessary to achieve transmission distances above 50-60 km. For the first draft of ITU Recommendation G.691, it appeared almost impossible to define an appropriate set of parameter values for 40 km distances because of the performance limitations of the first generation of 10 Gbit/s optical technology. For a long period, the design of 10 Gbit/s optical interfaces was quite challenging, not only due to the limited availability of appropriate optical devices, but also due to signal integrity issues related to the transmission of High Frequency (HF) data between lasers, detectors, and board-level interconnections. As such, proprietary 10 Gbit/s optical interfaces dominated the designs of equipment manufacturers. This remains true even today, in particular for DWDM applications, where stringent requirements surrounding high channel counts and multispan configurations must be met to guarantee system performance. For channel spacings of 100 GHz or less, wavelength-locking devices were added to the optical train to meet stringent transmitter frequency stability requirements. Due to advances in laser technology and the emerging availability of Electro-absorption Modulated Lasers (EMLs), it became possible to develop smaller solutions than was possible with the existing MZ-based technology. Moreover, these new integrated devices were significantly easier to manufacture and were much more cost-effective than MZ devices.
As such, it was possible to define a new set of parameters for 40 km
applications in Recommendation G.691 without the need for an optical amplifier. However, despite these advances, the first SDH/SONET 80 km application codes in G.691 were specified with the assumption that an optical amplifier would be required. Subsequent advances in the use and packaging of 10 Gbit/s APDs made it possible to extend the reach beyond 40 km, without the need for an optical amplifier.
18.3.3.2 Integrated solutions
18.3.3.2.1 300-pin transponder
At the end of the 1990s and towards the turn of the century, 10 Gbit/s optical component technology had matured to the point where proprietary optical interfaces could be standardized with an integrated package that could be supported by multiple vendors. One example of this was developed through a multivendor consortium, resulting in the 300-pin transponder MSA. Like the 2.5 Gbit/s transponders described above, the 10 Gbit/s transponder contains electrical (de)multiplexing functions to aid designers in managing the signal integrity at the circuit pack level. In the case of the 300-pin transponder, a 16-channel multiplexing function was used, so the lower-speed electrical interface was 622 Mbit/s. The Optical Internetworking Forum (OIF) developed a standard around the electrical interface and signal management, called OIF-SFI4-01.0: "Common electrical interface between framers and serializer/deserializer parts for OC-192/STM-64 interfaces" [23]. This standard defines the interoperability between the transponder electrical signals and the framer, which resides on the host printed circuit board. The OIF specification is commonly referred to as the SFI4 specification and, with the 300-pin MSA specification, provided the industry a complete specification around the mechanical housing, the electrical interface, and the optical interface. The 300-pin MSA contains a full description of the mechanical aspects of the module package (module outline, dimensions, mounting holes, location of electrical interface connector, etc.), as well as optical and electrical characteristics. With respect to the optical interface, reference is made to the relevant ITU and Telcordia SDH/SONET requirements. The transponder contains SERDES (serializer/deserializer) ICs; for this transponder, the SERDES provides a 16:1 multiplexer and 1:16 demultiplexer functionality.
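The SFI4 channel arithmetic mirrors the 2.5 Gbit/s case. Dividing the exact STM-64/OC-192 line rate by the 16 parallel channels gives the per-channel electrical rate quoted above ("622 Mbit/s" is the rounded figure):

```python
# The SFI4 interface between the framer and the 300-pin transponder's
# 16:1/1:16 SERDES carries 16 parallel electrical channels. Dividing the
# exact STM-64/OC-192 line rate by 16 yields the per-channel rate.
stm64_mbps = 9953.28
per_channel_mbps = stm64_mbps / 16
print(per_channel_mbps)  # 622.08
```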
In addition to the laser and laser driver, the transmitter section contains circuitry intended to minimize jitter. In the receiver section, clock and data recovery circuitry is included. The electrical interface is implemented through a 300-pin (10 x 30) Pin Grid Array connector, which allows for straightforward integration of the optical module with the host PCB. Because this connector is separable, the 300-pin transponder is also referred to as a cold-pluggable device. Through the electrical connector, the usual data and clock signal interconnects are provided, along with a variety of digital alarm and analogue signals to report on the "health" of the various parts within the transponder. Initially, a maximum footprint of 5 x 5 inches was defined, but in practice, the footprint was typically 3.5 x 4.5 inches. Recently, a small-form-factor version was defined with a maximum size of 2.2 x 3 inches, reflecting the need for higher optical channel densities on the host PCB and enabled by technologies focusing on improved component integration within the transponders. An example of a 300-pin transponder is shown in Figure 18-6. A separate MSA for an I2C digital diagnostic interface, provided through a two-wire serial bus, has been defined to provide an improved means for remote transponder monitoring and provisioning. This was an improvement over the polling technique typically used to monitor the performance of the transponders through the analogue pins in the 300-pin connector as described above. Full details of both the 300-pin transponder MSA and the I2C digital diagnostic MSA can be found at www.300pinMSA.org.
Figure 18-6. Example of 300-pin transponder (courtesy of JDS Uniphase)
The 300-pin transponders were initially available only for the 1550 nm short-haul/intermediate-reach 40 km and the 1310 nm intra-office/short-reach applications. Based on market demand, a complete variety of transponders became available to accommodate longer ranges, e.g., 80 km, SDH/SONET, and DWDM applications. In the latter case, MZ modulators and/or tunable lasers have been incorporated within the transponders. No reference optical interface specifications exist for the DWDM transponders, and therefore the corresponding designs and associated specifications are mostly proprietary. During the transponder design phase, it was discovered that it was extremely challenging to meet Telcordia's transmitter jitter generation requirements, in particular the RMS-jitter requirements. As a result, a whole range of successive SERDES devices has been developed by various ASIC vendors to improve jitter performance. Because of advances in EML output powers and the availability of 10 Gbit/s APD receivers with improved receiver sensitivity, it became possible to bridge 80 km distances without the use of optical amplifiers. Therefore, new application codes have been defined in the generic ITU Recommendation G.959.1 as an alternative to the SDH/SONET Recommendation G.691 80 km application codes, which previously required optical amplification. In order to manage the total module cost, Very Short Reach (VSR) transponders became available, principally to leverage the cost reductions that had occurred with 10 Gbit/s optical component technology and the shorter-reach applications (e.g., 600 m) that had been standardized in ITU Recommendation G.693. With these transponders, it was possible to build cost-efficient interconnects over VSR distances between SDH/SONET and DWDM systems.
In addition to SDH/SONET applications, the 300-pin transponders can in principle also be used for G.709-based OTN as well as 10 GbE applications, because of provisions within the 300-pin MSA to operate at bit rates between 9.9 and 10.7 Gbit/s. Whereas in some transponder designs the relevant transmitter jitter filter frequency has to be set through the 300-pin connector, others are able to operate without the need to independently set the jitter filter frequency by employing a broadband jitter filter approach. A judgement on which version works best is considered outside the scope of this book.

18.3.3.2.2 200-pin transponder
Soon after the introduction of the 300-pin MSA, a competing transponder MSA arose. This transponder was based on a 200-pin connector and initially promised a smaller package and footprint, which proved to be
especially interesting for high-density intra-office/short-reach applications. Ultimately, however, the market favored the 300-pin MSA as the transponder of choice for 10 Gbit/s SDH/SONET applications; consequently, no further details are provided here.

18.3.3.2.3 XENPAK
Because of continued breakthroughs in component-level integration, it became possible to package a "transponder" in increasingly smaller packages. At the same time, a market need arose for systems with more than one 10 Gbit/s optical port on a single optical interface card, along with an interest in leveraging the same hot-pluggability feature that was available with the SFP modules covering lower data rates. This density and pluggability need applied in particular to 10 Gigabit Ethernet (10 GbE) applications, but also to SDH/SONET applications. A new MSA, called the XENPAK MSA, became available to address the above-mentioned market needs.
Figure 18-7. Example of XENPAK device (courtesy of JDS Uniphase)
The XENPAK housing is equipped with two SC optical connectors, and its board attachment scheme requires a cut-out in the PCB with alignment to a mating PCB connector. Unlike the SFP pluggable device, the XENPAK package was intended to be fully EMI compliant; hence a cage or guidance system is not required. Its size is approximately 2 x 5 inches. An example of a XENPAK device is shown in Figure 18-7. An industry-standard 70-pin electrical connector provides the electrical interface. The input and output data signals are transmitted according to a new electrical interface specification, called XAUI, which was defined in IEEE 802.3ae. XAUI
details can be found at www.10gea.org. In short, the XAUI interface specification is based on four bidirectional lanes carrying 3.125 Gbit/s per channel. This setup simplified electrical trace management on the host PCB relative to the 300-pin transponder, which required 16 parallel electrical channels per 10 Gbit/s optical signal. However, it also required that each trace carry a higher data rate. In order to simplify design and layout, additional overhead processing was added to the signal to correct for signal integrity impairments in the host PCB. The four 3.125 Gbit/s XAUI lanes give an aggregate bandwidth of 12.5 Gbit/s in order to transmit a 10.3125 Gbit/s optical signal. Full details on the XENPAK module can be found at www.xenpak.org. While use in SDH/SONET applications was foreseen in principle, most versions are specifically aimed at 10 GbE applications, where large volumes were expected.

18.3.3.2.4 X2, XGP, and XPAK
Many transceiver users faced difficulties applying XENPAK devices, mainly due to the module's size, the required board cut-out, and the associated thermal issues. The market requested that an alternative, smaller-sized package be defined that would not require a board cut-out. Three solutions were proposed: XGP, X2, and XPAK. The application area is the same as for XENPAK, namely 10 Gigabit Ethernet and Fibre Channel, but SDH/SONET applications are also foreseen. The XGP concept has since been abandoned due to lack of agreement on a specification. The two remaining and competing MSAs are the X2 and XPAK. These were developed around the middle of 2002, both using the same 70-pin electrical interface connector as the XENPAK. However, unlike the XENPAK, both modules require a guiding/cage system. Both packages are smaller than the XENPAK. The first parts available to the market were intended for 10 GbE applications, with input and output data signals according to the XAUI specification (as for XENPAK).
Versions accommodating SDH/SONET STM-64/OC-192 applications are also foreseen, where the electrical interface would be addressed with four 2.5 Gbit/s data signals based upon the OIF SFI4 Phase 2 based electrical interface specification [24]. An example of an X2 device is shown in Figure 18-8. For further details, please visit www.x2msa.org or www.xpak.org.
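The XAUI lane arithmetic used in these modules can be worked through explicitly. Each 3.125 Gbit/s lane is 8b/10b coded, so the four lanes carry a 10 Gbit/s payload, and the 64b/66b line coding defined in IEEE 802.3ae turns that payload into the 10.3125 Gbit/s optical line rate mentioned above:

```python
# XAUI lane arithmetic: four lanes at 3.125 Gbit/s with 8b/10b coding
# yield a usable payload of 4 x 3.125 x 8/10 = 10 Gbit/s. On the optical
# side, 64b/66b coding expands that payload to the 10.3125 Gbit/s
# 10GBASE-R line rate.
lanes, lane_rate_gbps = 4, 3.125
payload_gbps = lanes * lane_rate_gbps * 8 / 10   # after 8b/10b decoding
line_rate_gbps = payload_gbps * 66 / 64          # after 64b/66b encoding
print(payload_gbps, line_rate_gbps)  # 10.0 10.3125
```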
Figure 18-8. Example of X2 device (courtesy of JDS Uniphase)
18.3.3.2.5 XFP
Because of a continued market drive for further size reduction in optical modules and the need for flexibility with respect to application space, together with the foreseen availability of new IC generations that could operate directly at 10 Gbit/s interface speeds, a new hot-pluggable, small-form-factor transceiver, called the XFP, was conceived. The XFP was defined specifically to cover all of the possible 10 Gbit/s applications, including SDH/SONET STM-64/OC-192, OTN G.709 rates, 10 Gigabit Ethernet, and Fibre Channel. An example of an XFP device is shown in Figure 18-9. The XFP transceiver has a serial interface, implying that both the electrical and the optical interface operate at 10 Gbit/s. This setup is similar to the concept employed with the SFP modules. The approach is forward-looking in that it is designed around the future availability of framer ICs with 10 Gbit/s interface signals, but the module can still be operated by interfaces with a serial-to-parallel converter on the host board. By taking the (de)multiplexer electronics out of the module, the device power dissipation could be substantially lowered in comparison with earlier devices, and the device is also less dependent on the exact application bit rate.
Figure 18-9. Example of XFP device (courtesy of Finisar Corporation)

An MSA has been defined, including specifications for the XFP module itself, cage hardware to fit to PCBs, digital diagnostics, and electrical interface specifications. For full details, please visit www.xfpmsa.org. The package size is approximately 18 x 71 x 8.5 mm (w x d x h). The electrical interface is realized through a 30-pin connector, and the optical connectors are of the LC receptacle type, similar to those used in SFP packages for bit rates up to 2.5 Gbit/s. The XFP MSA also contains guidance on mechanical provisions to provide optimum thermal management on the host PCB. Given the reduced number of pins available relative to the 300-pin transponder (a 10x reduction), it was necessary to adopt a digital diagnostic interface to provide access to the alarms and performance of the 10G optical module. The digital diagnostics are accessible through a two-wire serial bus, the I2C. The high-speed 10 Gbit/s electrical signals and their specifications are commonly referred to as XFI, pronounced "Ziffie". Initially, only versions for STM-64/OC-192 intra-office/short-reach (2 km) and 10 GbE long-reach (10 km) applications were available, based upon uncooled 1310 nm DFB laser chips. Recently, the availability of semi-cooled 1550 nm EMLs has opened a road to XFP devices for longer-reach single-channel and DWDM applications with transmission distances up to 80 km.
18.3.4 40 Gbit/s technology

18.3.4.1 Discrete solutions
Currently, most implementations for 40 Gbit/s optical interfaces, both for single-channel and DWDM multichannel applications, are still completely proprietary. The cost is still extremely high and the market demand is limited; as such, it is somewhat premature to build an extensive set of industry agreed-upon MSAs.

18.3.4.2 Integrated solutions
Around 2002, it was anticipated that there would be a market need for commercially available integrated, multisourced 40 Gbit/s devices. Therefore, an MSA was defined for a 300-pin transponder in a way similar to the 300-pin transponder for 10 Gbit/s applications. Initially, only Very Short Reach (VSR) applications as defined in G.693 were foreseen. For full details, please visit www.300pinMSA.org.
18.4. CONSIDERATIONS ON OPTICAL FAULT AND DEGRADATION DETECTION

18.4.1 General
Since the initial deployments of optical fiber systems, the philosophy on how to handle optics-related faults and degradations in optical transmission systems has substantially changed. In this section, a perspective on this evolution is provided.
18.4.2 Faults in conventional transmitters and receivers
At the time that the first optical transmission systems were being deployed in the 1980 - 1985 time frame, optical technology was not yet mature with respect to reliability. Optical transmitters were mostly based on 850 nm laser technology, and there were still many reliability issues with respect to laser chip technology and laser-to-fiber coupling stability. In particular, the 850 nm laser threshold/bias currents were in many cases continuously degrading
(increasing) versus time, and a laser End-Of-Life condition was relatively quickly reached. Because of these reliability issues, several monitors were built into the transmitter in order to track the laser degradation to provide the user with a means to predict equipment failure. Examples of monitoring parameters are laser bias current, laser backface monitor diode current (to give an indication of the laser output power), and laser chip temperature (in the case that a Thermo-Electric Cooler was used). Furthermore, two alarm levels were defined: a degraded alarm, indicating a close-to-failure condition to provide the end-user some time to replace the degraded equipment, and a failure alarm, requiring immediate replacement of the failing parts. At the receive side of the optical link, monitors were built into the optical receivers in order to measure the optical input power into the receiver so as to check the condition of the line and the actual receiver input signal. Originally, the aforementioned monitor points were connected to test points at the front of the equipment. Because these points were very sensitive to ESD/EMC effects, the industry moved to fully integrated monitoring methods based upon analogue-to-digital conversion techniques. The generated analogue information was collected by the equipment and used to raise alarms at preset levels or for network and/or equipment health monitoring. In particular, when the first SDH/SONET recommendations were being developed, several optical degraded and fail alarms were defined with the intent to indicate when the relevant optical interface parameters had degraded to a value outside the range specified. Since that time, many of these monitors have been implemented in most of the various transmitter, receiver, transceiver, and transponder designs. The way they are being used, however, has substantially changed. 
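The two-level alarm scheme described above can be sketched with a few lines of code, using laser bias current as the monitored parameter. The threshold values here are hypothetical; real equipment derives them from the laser vendor's end-of-life specification.

```python
# A minimal sketch of the degraded/failure alarm scheme applied to a
# monitored laser bias current. The thresholds (60 mA degraded, 80 mA
# failure) are hypothetical illustration values, not from any standard.
def classify_bias(bias_mA, degraded_mA=60.0, fail_mA=80.0):
    if bias_mA >= fail_mA:
        return "failure alarm"   # immediate replacement required
    if bias_mA >= degraded_mA:
        return "degraded alarm"  # close to failure; schedule replacement
    return "normal"

for bias in (35.0, 65.0, 90.0):
    print(bias, "->", classify_bias(bias))
```

The same pattern applies to the other monitored parameters mentioned above, such as backface monitor diode current or receiver input power.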
While some equipment users still require the actual analogue values for their own assessment of network and/or equipment health, a completely new philosophy on the usage of optics-related alarms has been developed within the ITU. It has been recognized that putting alarms only on lasers is not very useful, in particular when a laser cannot be replaced separately. A failing solder joint on a low-cost resistor can be as dramatic to the performance of an optical system as the failure of a relatively expensive laser. Within the relevant ITU equipment recommendations, one no longer speaks of alarms, but of defects. Defects are defined unambiguously and are coupled one-to-one to "consequent actions"; i.e., if the failure of a piece of equipment is reported, it should be immediately clear what the follow-up action is. Therefore, measuring analogue values only makes sense if sensible thresholds can be defined. Furthermore, it is critical for the user to make a distinction between equipment defects and transmission defects. Only then can appropriate
consequent actions be defined. The equipment defects should be coupled to replaceable (e.g., hot-pluggable) units and not to individual parts integrated on these units. As a result, the definition of optical interface-specific defects (alarms) at the component level has been abandoned by the ITU. Furthermore, optical technology has matured substantially over the past twenty years to a level where the reliability of the involved parts, in particular for the 1310 nm and 1550 nm components, has significantly improved. Today's failure mechanisms are very different as compared with the initial degradation patterns that were associated with 850 nm laser technology. With today's technology, if a laser component fails, the actual failure remains very difficult to predict by using measured laser performance parameters. This makes the reporting of analogue values even less useful as a tool to perform preventive maintenance than one would expect. Therefore the usage of the "degraded alarms" as mentioned above has been abandoned by the ITU. In practice, the laser's failure could be either within a minute, thus faster than the actual repair time, or within a year after the moment that the degraded alarm was raised. A further complexity is the fact that while equipment vendors initially had extensive knowledge of the optical interface design details, this expertise has dispersed into the whole of the optics supply chain. As such, local experts are rarely available in a single institution to interpret analogue optical monitoring values. Today, instead of using sub-component level alarms, the use of highly-integrated modules, like transceivers or transponders can provide relevant alarm information to the board upon which they are placed. These module-level alarms can then be used to generate notification of equipment defects. The manufacturers of these technologies have the best design knowledge for defining how the appropriate defect levels predict the failure of a module.
18.4.3 Faults in optically amplified systems
When the first optical amplifiers were being designed and used in single- and multichannel transmission systems, a similar discussion on degradation and failure alarms took place within the ITU. As for the conventional optical transmitter and receiver technologies, the performance degradation alarms were regarded as implementation specific and therefore not suitable for incorporation into the applicable standards. Furthermore, the design-specific failure alarms were expected to be incorporated in the general equipment defects, in a way similar to the description in Section 18.4.2. As before, analogue monitoring was considered useful only when sensible thresholds and related consequent actions could be defined. The specific technologies used in optical amplifiers, e.g., pump lasers, generally did not
exhibit the same linear degradation patterns that were observed for transmission-based lasers. Finally, there was a period when the very dangerous and unpredictable phenomenon of "sudden death" was known to occur and to dominate failures with optical amplifiers, and it could not be prevented by laser parameter monitoring. Recently, a lengthy discussion took place within the ITU on the topic of optical monitoring in optical networks. In very long-haul DWDM networks, the distances between access points, where there is (electronic) access to the relevant network and/or equipment performance data, are so great that it is not easy to locate faulty equipment or fiber breaks in the outside fiber plant. In order to have the capability to trace these faults, some users wanted to standardize the mandatory measurement of several analogue optical parameters, like total power, individual channel power, channel wavelength, Optical Signal-to-Noise Ratio (OSNR), and Q-factor. Unfortunately, it appeared virtually impossible to unambiguously relate these optical parameters directly to network error performance, and as such, sensible thresholds and consequent actions could not be defined within the ITU. The fact that most of the parameters and their values are very equipment design specific was one of the complexities of this topic. It was recognized, however, that optical monitoring is a useful tool for network and equipment fault location, but that this tool should stay proprietary and should not be considered mandatory by the standards bodies. Naturally, not all DWDM networks would require this degree of optical monitoring, and as such, the ITU wanted to avoid the impression that it should be performed in all multi-wavelength systems. In particular, use of these monitoring methods in low-cost CWDM systems would not provide any added value and would only add unnecessary elements and cost to the system.
Instead, it was decided to provide some general information on optical monitoring in DWDM systems in a new Recommendation G.697 [22], which specifically avoids the implication that optical monitoring is necessary in all multichannel systems.
18.5. NOTES
1. In this chapter, fiber optic plant and outside plant will be used interchangeably.
2. The general relationship between signal frequency f and wavelength λ (measured in vacuum) is given by λ = c/f, with c being the speed of light in vacuum, for which a value of 2.99792458 × 10^8 m/s should be used when carrying out this conversion.
3. Within this context, the maximum value of the receiver sensitivity refers to its worst-case value, reflecting the notion that a (mathematically) higher sensitivity value implies a less sensitive receiver.
4. Laser chirp is the phenomenon by which the laser center wavelength varies during a binary pulse. This phenomenon is evident in particular during the rising and falling edges of a pulse.
5. This control implies that TECs are used to cool the laser under hot ambient conditions and heat the laser under cold ambient conditions.
18.6. ACKNOWLEDGMENTS The author of this chapter would like to sincerely thank two of his ex-colleagues from Lucent Technologies for their extremely valuable suggestions for improving the contents of this chapter: George Newsome, author of another chapter in this book and currently engaged with consulting, and Michael J. Schabel, Ph.D., Member of the Technical Staff at Bell Laboratories, Lucent Technologies, Murray Hill, NJ.
18.7. REFERENCES
[1] ITU Recommendation G.955, Digital line systems based on the 1544 kbit/s and the 2048 kbit/s hierarchy on optical fibre cables, November 1996.
[2] ITU Recommendation G.957, Optical interfaces for equipments and systems relating to the synchronous digital hierarchy, July 1999, plus Amendment 1, December 2003, plus Amendment 2, January 2005.
[3] Telcordia GR-253-CORE, Issue 3, Synchronous Optical Network (SONET) Transport Systems: Common Generic Criteria, September 2000.
[4] ITU Recommendation G.692, Optical interfaces for multichannel systems with optical amplifiers, October 1998, plus Corrigendum 1, January 2000, plus Corrigendum 2, June 2002, plus Amendment 1, January 2005.
[5] ITU Recommendation G.694.1, Spectral grids for WDM applications: DWDM frequency grid, June 2002.
[6] ITU Recommendation G.872, Architecture of optical transport networks, November 2001, plus Amendment 1, December 2003, plus Corrigendum 1, January 2005.
[7] ITU Recommendation G.709, Interfaces for the Optical Transport Network (OTN), March 2003, plus Amendment 1, December 2003.
[8] ITU Recommendation G.959.1, Optical transport network physical layer interfaces, December 2003, plus Erratum 1, April 2004.
[9] ITU Recommendation G.693, Optical interfaces for intra-office systems, January 2005.
[10] ITU Recommendation G.691, Optical interfaces for single channel STM-64 and other SDH systems with optical amplifiers, December 2003, plus Amendment 1, January 2005.
[11] ITU Recommendation G.694.2, Spectral grids for WDM applications: CWDM wavelength grid, December 2003.
[12] ITU Recommendation G.695, Optical interfaces for coarse wavelength division multiplexing applications, January 2005, plus Erratum 1, June 2005.
[13] ITU Supplement G.Sup39, Optical system design and engineering considerations, October 2003.
[14] ITU Recommendation G.651, Characteristics of a 50/125 μm multimode graded index optical fibre cable, February 1998.
[15] ITU Recommendation G.652, Characteristics of a single-mode optical fibre and cable.
[16a] ITU Recommendation G.650.1, Definitions and test methods for linear, deterministic attributes of single-mode fibre and cable, June 2004.
[16b] ITU Recommendation G.650.2, Definitions and test methods for statistical and non-linear related attributes of single-mode fibre and cable, January 2005.
[17] ITU Recommendation G.653, Characteristics of a dispersion-shifted single-mode optical fibre and cable, December 2003.
[18] ITU Recommendation G.654, Characteristics of a cut-off shifted single-mode optical fibre and cable, June 2004.
[19] ITU Recommendation G.655, Characteristics of a non-zero dispersion-shifted single-mode optical fibre and cable, March 2003.
[20] ITU Recommendation G.663, Application related aspects of optical amplifier devices and subsystems, April 2000, plus Amendment 1, January 2003.
[21] ITU Recommendation G.656, Characteristics of a fibre and cable with non-zero dispersion for wideband transport, June 2004.
[22] ITU Recommendation G.697, Optical monitoring for DWDM systems, June 2004.
[23] OIF Implementation Agreement OIF-SFI4-01.0, Proposal for a common electrical interface between SONET framer and serializer/deserializer parts for OC-192 interfaces, September 2000.
[24] OIF Implementation Agreement OIF-SFI4-02.0, SERDES Framer Interface Level 4 (SFI-4) Phase 2: Implementation Agreement for 10 Gb/s Interface for Physical Layer Devices, September 2002.
[25] ITU Recommendation G.826, End-to-end error performance parameters and objectives for international, constant bit-rate digital paths and connections, December 2002.
[26] ITU Recommendation G.696.1, Longitudinally compatible intra-domain DWDM applications, July 2005 (pre-published).
[27] ITU Recommendation G.698.1, Multichannel DWDM applications with single channel optical interfaces, June 2005.
Chapter 19
HIGH-SPEED SERIAL INTERCONNECT
Intrasystem Chip-Chip and Backplane Interfaces
Adam Healey
Agere Systems
19.1. INTRODUCTION With the increase in port speeds and line card port density comes a proportional increase in line card capacity. This increase in capacity puts pressure on intracard and card-card communications channels. In many cases, it makes sense to transition these interfaces to arrays of high-speed serializer/deserializer devices, commonly referred to as serdes. For a given bandwidth, use of high-speed serdes allows a reduction in package and connector pin density as well as board trace density. By reducing the number of board traces, the layer count and routing complexity of the board are reduced, leading to a cost savings. Such cost savings are offset by the additional signal integrity concerns that accompany high-speed interfaces. This chapter provides an introduction to high-speed serial interconnect, focusing on solutions that operate in the vicinity of 2.5 Gb/s per serdes channel. Section 19.2 will discuss intrasystem communications architectures based on high-speed serdes interconnect and will highlight some of the major signal integrity challenges. Section 19.3 will discuss some of the key issues related to conformance test of high-speed serdes transceivers. Section 19.4 will investigate how equalization techniques can be applied to extend the solution beyond on-card communications and into the backplane environment. Section 19.5 will provide an overview of selected industry standards, and Section 19.6 looks to the future of high-speed serial interconnect (speeds beyond 2.5 Gb/s).
19.1.1 Chip-Chip Interconnect This chapter will consider both chip-chip interconnects for communication between devices on line cards and card-card communication over backplanes. The difference between these two classifications is typically the supported length and number of connectors allowed in the channel. Communications paths on a line card are typically characterized as requiring channel lengths up to 20 cm, including a single connector (for a mezzanine card, for example). Interfaces to pluggable optical modules also fit this mold, since they consist of some length of printed circuit board (PCB) interconnect plus the module connector. The compliance methodologies described in Section 19.3 apply to both backplane and chip-chip interconnects. One particular standards-based chip-chip communications interface, SxI-5, is discussed in detail in Section 19.5.2.
19.1.2 Backplane Interconnect The backplane fabric is the heart of any modular computing or networking platform. Fabric redundancy and fail-over mechanisms improve robustness, while congestion management algorithms and differentiated classes of service allow the system to provide guarantees for latency, latency jitter, and data loss. Also, a scalable fabric allows the system to grow with the demand for bandwidth. This chapter will discuss the electrical signaling aspects of backplane communications. Backplane links are characterized as requiring channel lengths up to (and sometimes in excess of) 1 m and include at least two connectors (where the cards mate to the backplane). In some circumstances, the line card is designed so that the termination of the high-speed link resides on a mezzanine card, which adds an additional connector to the channel. Backplane topologies and design challenges are discussed in detail in Section 19.2. Section 19.4 details how equalization enables the support of longer interconnect distances, and in Section 19.5, select examples from industry standards are discussed.
19.2. HIGH-SPEED INTERCONNECT SYSTEM ARCHITECTURE
This section will discuss the architectural considerations and signal integrity design challenges of high-speed interconnect systems. These considerations are core to the development of the problem statement that must be addressed by a particular serdes design or an industry standard. While not an exhaustive treatment of the topic, this section seeks to set the stage for the discussion of equalizer design (Section 19.4) and the overview of industry standards (Section 19.5).
19.2.1 Topologies The selection of a backplane topology is a function of many factors such as scalability and resiliency. The chosen topology influences the complexity of the backplane, which in turn influences the signal integrity design challenges and the cost. This section enumerates several common backplane topologies. The simplest backplane topology is the star topology. In the star topology, there is a central switch fabric that typically occupies a single slot in the system. Each line card in the chassis has a communication path to the switch fabric over the backplane. Since this is the simplest topology, it is utilized in low-cost systems where resiliency is not a principal concern. A common enhancement to the star topology is the dual-star topology, which adds a redundant protection fabric to the system (presumably occupying a second slot). In this case, each line card has a second, backup communications path to the redundant switch fabric, which doubles the total number of connections on the backplane. There is also a communications path between the working and protection fabrics so that they may remain synchronized. The resiliency of the system is improved through the elimination of the single point of failure (the switch fabric) but at the cost of additional backplane complexity. The full mesh topology does not have a centralized switch fabric; instead, each line card has a point-to-point link with every other line card in the chassis. The capacity of the system increases with the number of slots that are populated (as opposed to the star or dual-star, where the capacity is dictated by the capacity of the central fabric). However, the number of connections increases quadratically with the number of slots. Backplanes designed to support the full mesh topology are quite complex, requiring several thousand differential pairs to complete the connection matrix.
There are several variants of the star and mesh topologies that will not be explained in detail here. It is important to note that each of these topologies
is simply an arrangement of point-to-point links that go from line card to fabric or line card to line card. Therefore, the discussion of simple point-to-point links in the following sections applies generally to any conceivable N-channel chip-chip communications path on the line card or N-channel communication path across the backplane.
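The link counts implied by these topology descriptions can be made concrete. The sketch below is a hypothetical helper (not taken from any standard); the dual-star count assumes one extra link for the working-to-protection fabric synchronization path described above, and the full mesh grows as N(N−1)/2, i.e., quadratically with the slot count:

```python
def backplane_links(n: int, topology: str) -> int:
    """Point-to-point backplane link count for an n-line-card system,
    following the topology descriptions above (illustrative helper)."""
    if topology == "star":
        return n                  # one path per line card to the central fabric
    if topology == "dual-star":
        # working + protection paths per card, plus the link that keeps the
        # working and protection fabrics synchronized
        return 2 * n + 1
    if topology == "full-mesh":
        return n * (n - 1) // 2   # every card pairs with every other card
    raise ValueError(f"unknown topology: {topology!r}")

# Full-mesh connection count grows quadratically with the slot count:
for n in (4, 8, 16):
    print(n, backplane_links(n, "full-mesh"))  # 6, 28, 120
```

For a 16-slot chassis, the star needs 16 links, the dual-star 33, and the full mesh 120, which illustrates why full-mesh backplanes require so many differential pairs.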
19.2.2 Printed Circuit Board (PCB) Interconnects This section discusses some of the key signal integrity issues associated with transmission of high-speed signals over printed circuit board interconnects. Similar to what is experienced in systems that utilize electrical cabling, the transmission line (typically a differential stripline) has frequency-selective loss and may fall victim to unwanted coupling from adjacent transmission lines and the ambient environment. The frequency-selective loss is attributable to the skin effect in the conductors and to the loss of the dielectric material. At higher frequency, dielectric losses typically dominate. The loss may be highly selective (i.e., resonance structures) when there are significant impedance discontinuities in the channel. Such discontinuities can be exacerbated by imperfect driver and receiver terminations in the serdes devices themselves. While the transmission performance affects the strength of the signal (S) that arrives at the receiver, the noise coupling dictates the noise power (N) and defines the signal-to-noise ratio (SNR) of the channel. Noise coupling is typically separated into near-end crosstalk (NEXT) and far-end crosstalk (FEXT). NEXT is attributed to the unwanted coupling of adjacent transmitters into the local receiver. In plesiochronous systems, the reference clock for the local transceiver is different from the reference clock for the remote receiver (a 200 ppm difference is typically allowed), so this noise is typically not synchronous with the signal of interest. FEXT is the unwanted coupling of remote transmitters (adjacent to the transmitter of interest) into the local receiver. FEXT is distinct from NEXT in terms of spectral content and, in some cases, the timing relationship with the signal of interest. Since FEXT needs to traverse the length of the backplane link to reach the victim receiver, it tends to show a high frequency roll-off.
This is in contrast to NEXT, where the coupling generally increases with increasing frequency, and the frequency content of the transmitter heavily influences the frequency content of the coupled noise. In addition, since the transmitters adjacent to the transmitter of interest typically share the same reference clock, the FEXT noise may be synchronous with the signal of interest. Such distinctions may be useful in understanding the nature of the noise and perhaps in the development of noise cancellation algorithms. In general,
High-Speed Serial Interconnect
739
the noise environment is best controlled by intelligent selection of connector pin-out and by managing certain parameters of the aggressor transmitters (for example, the peak-peak amplitude and rise time of the aggressor have a strong influence on the amplitude of the crosstalk noise). Typically, a serdes solution tries to improve the SNR by improving the signal strength (S) through the application of equalization. This approach will be discussed in detail later. The forward transmission performance of the backplane may also be improved by design, and several key considerations are discussed in detail below.

19.2.2.1 Material Loss
The loss of a PCB interconnect is dominated by its length, the dielectric material selected, and the geometry of the trace. Different dielectric materials offer improved dielectric constants and dissipation factors, at some added cost. Even within a given class of dielectric materials, subtle variations in manufacturing can result in the deviation of key performance parameters from their nominal values. While this deviation may correspond to better than expected transmission line losses, it may also correspond to higher losses. Therefore, this so-called manufacturing variation must be taken into account in the link design. If not carefully considered, the additional losses that may be experienced in a given manufacturing run may cause unacceptable bit error rate (BER) degradation in the system.

19.2.2.2 Layer Connection

To achieve the high routing densities typically required of a backplane, signal traces are distributed among multiple layers. When pin-through-hole connectors are used, vias are drilled through the board, and a given pin is connected at a given signal layer. In the case where the signal layer is at the top of the board, the remainder of the via (from the signal layer to the bottom of the board) becomes a transmission line stub. Since backplanes tend to be quite thick, the stub length is potentially very long. These via stubs create impedance discontinuities in the transmission line and result in resonance structures in the channel's transmission response. The depth of the null is a function of the strength of the reflection, while the position of the null is a function of the spacing of the reflection relative to the main path. As the occupied bandwidth of the signal increases (as a result of increasing speed), a resonance could occur in-band and present a severe transmission impairment.
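To first order, a via stub behaves as a quarter-wave resonator, producing a transmission null near f0 = c/(4·L·√ε_eff). The sketch below uses this approximation with an assumed stub length and dielectric constant (neither value is taken from the measured backplane discussed in the text):

```python
import math

C_VACUUM = 2.99792458e8  # speed of light in vacuum, m/s

def stub_null_hz(stub_length_m: float, eps_eff: float) -> float:
    """First-order estimate of the transmission null caused by a via stub,
    treated as a quarter-wave resonator: f0 = c / (4 * L * sqrt(eps_eff)).
    Real via parasitics shift the null; this is only a rough guide."""
    return C_VACUUM / (4.0 * stub_length_m * math.sqrt(eps_eff))

# A thick backplane can leave a stub of several millimeters; a ~5 mm stub
# with an assumed effective dielectric constant of 3.6 (typical FR-4)
# places the null near 7.9 GHz, i.e., inside a 10 GHz measurement band:
print(round(stub_null_hz(5e-3, 3.6) / 1e9, 1))  # 7.9
```

The estimate shows why back-drilling matters: halving the stub length pushes the null to roughly twice the frequency, moving it out of band for a given signaling rate.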
Figure 19-1 illustrates the impact of layer connection on the backplane channel response. In this figure, the magnitude response of two channels from the same backplane is shown. The channels are of equivalent length, but one makes connections at the near-top layer (longest stub) and the other makes connections at the near-bottom layer (shortest stub). The resonance structures due to the stub effect are clear. The data is plotted to 10 GHz to illustrate that the stub effect gets worse with increasing frequency.
Figure 19-1. Impact of layer connection on backplane channel response
The most straightforward way to mitigate stub effect is to reduce the length of the stub. One practice, known as counter-boring or back-drilling, removes the stub by drilling out the via up to the level where the layer connection is made. While this approach adds a small amount of cost to the backplane, the benefits in terms of signal integrity are potentially great.

19.2.2.3 Environmental Effects
As mentioned earlier, the loss of the PCB interconnect is dictated by geometry and choice of dielectric material. The key properties of a dielectric that affect the loss are the dielectric constant and the dissipation factor. Both of these parameters, and hence the loss, are impacted by the temperature and relative humidity of the operating environment.
While the systems typically operate in equipment rooms with controlled climates, the interior of a system is still subject to environmental variation. One obvious example is the transition from system shutdown to full operation. In this case, the interior of the box is at room temperature prior to power on, and once the system is operational, the power dissipated by the components begins to increase the internal temperature. Cooling facilities in the system manage the internal temperature, but the steady-state temperature can be expected to be well above room temperature. Other contributors to environmental variation are the quality of the climate control at the given site, failure of the climate control system, and uncontrolled environments in shipping (for example, water absorption due to storage in humid environments). All of these factors can result in changes between the backplane environment measured in the lab and the one eventually deployed in the field. Results from backplane measurements in a controlled environment are shown in Figure 19-2. The dashed line is the magnitude response as measured at nominal conditions (20°C and 20% relative humidity). The next line is the magnitude response of the same channel with the temperature increased to 60°C. The bottom line is the response of the channel with the temperature at 60°C and the relative humidity increased to 85%.
Figure 19-2. Environmental influence on the backplane channel response
It can be seen from Figure 19-2 that the impact of increased temperature and humidity is relatively small at low frequencies, resulting in approximately 1-2 dB additional loss at 1.25 GHz (the Nyquist frequency of a 2.5 Gb/s non-return-to-zero, i.e., NRZ, system). At higher frequencies, this issue becomes more severe, with approximately 5 dB of degradation at 5 GHz, which represents a serious impairment to a 10 Gb/s NRZ system. Therefore, environmental variation must be considered in the design of high-speed backplane interconnects. Measuring the nominal performance of the channel (and the system) is not sufficient, and a margin must be built in to account for environmental variance. In cases where the additional loss results in an undesirable BER performance penalty, equalization circuitry may need to be dynamic and adaptive so that it can track the channel variations.
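The Nyquist-frequency and margin arithmetic above can be sketched as follows; the 20 dB equalizable budget and 15 dB nominal channel loss are hypothetical numbers chosen purely for illustration:

```python
def nyquist_hz(bit_rate_bps: float) -> float:
    """The Nyquist frequency of an NRZ signal is half the bit rate."""
    return bit_rate_bps / 2.0

def remaining_margin_db(budget_db: float, nominal_loss_db: float,
                        environmental_db: float) -> float:
    """Margin left after nominal channel loss and an environmental loss
    allowance are subtracted from the link budget (hypothetical values)."""
    return budget_db - nominal_loss_db - environmental_db

print(nyquist_hz(2.5e9))  # 1250000000.0, i.e., 1.25 GHz for 2.5 Gb/s NRZ
# With an assumed 20 dB budget, 15 dB nominal loss, and the ~2 dB
# environmental degradation cited above, only 3 dB of margin remains:
print(remaining_margin_db(20.0, 15.0, 2.0))  # 3.0
```

The point of the exercise: a design that budgets only for nominal loss can silently lose its entire margin to environmental drift, which is why the text recommends adaptive equalization.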
19.3. COMPLIANCE TEST METHODOLOGY
An effective standard or implementation agreement must provide a means by which compliance or noncompliance may be established. Compliance testing builds confidence in the ability of standards-based solutions to interoperate in the targeted application space. Several tests common to high-speed serial interfaces, such as transmitter output jitter and jitter tolerance tests, require detailed understanding of the underlying mechanisms in the serdes and the limitations of the test equipment. This section looks specifically at mask testing and jitter decomposition techniques to foster a better understanding of how transmitters are characterized and to describe how the test signals that are used to verify receiver robustness might be calibrated.
19.3.1 Eye Mask The eye mask is a common tool used to establish, at a glance, whether a given implementation is compliant to the specification or not. The eye diagram is an accumulation of the various trajectories of the signal as they are influenced by the current and neighboring symbols in the sequence. The completeness of the eye diagram is a function of how many such trajectories are captured. Eye diagrams are typically captured using an under-sampling approach. In this approach, the scope samples the signal at a rate much slower than the baud rate. However, a precision timebase allows the sampling clock to be offset with very fine granularity. By capturing the signal over a sequence of
time offsets, a much higher effective sampling rate can be achieved. However, it may take many captures to completely represent a particular signal trajectory. Furthermore, at higher speeds, the signal trajectory may be influenced by many past and future symbols (i.e., the channel impulse response spans many symbols), and there may be many signal trajectories that compose the complete eye diagram. So, while the eye diagram is a fast and simple method for establishing compliance to specifications, it may provide misleading results. The test result is a function of the data pattern used to capture the eye diagram and the duration that the eye diagram is allowed to accumulate. In Figure 19-3, the eye diagram is represented in terms of contours of constant bit error rate (BER). A given contour is the locus of the slicer thresholds and timing strobe offsets that would yield a given BER. The innermost contour represents a BER of 1E-15, and each successive contour represents an increase in the BER of three orders of magnitude. Each contour is also related to the inner boundary of the eye diagram observed for a given measurement interval or test pattern. The outermost contours would represent what would be seen if a short pattern (such as PN-7, for example) was used or if a more transition-rich pattern (such as PN-31) was observed over a short observation interval. The innermost contour would represent what could be observed if a PN-31 pattern was allowed to accumulate for a very long time.
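The under-sampling (equivalent-time) capture described above can be sketched numerically. The test tone, slow sampling interval, and precision-timebase step below are illustrative assumptions; the slow interval is deliberately chosen as an integer number of signal periods so that only the timebase step walks the sampling phase:

```python
import math

def equivalent_time_capture(signal, period_s, slow_interval_s, n_captures, step_s):
    """Sketch of equivalent-time sampling: acquisition k samples the
    repetitive signal at t = k*(slow_interval + step), i.e., far below the
    symbol rate, while a precision timebase advances the sampling phase by
    `step_s` per capture.  Folding t modulo the signal period reconstructs
    the waveform at an effective resolution of `step_s`."""
    points = []
    for k in range(n_captures):
        t = k * (slow_interval_s + step_s)
        points.append((t % period_s, signal(t)))
    return sorted(points)

# Reconstruct one period of a 1.25 GHz tone (the Nyquist frequency of a
# 2.5 Gb/s NRZ signal) at 10 ps effective resolution, sampling only once
# per microsecond (1 us = exactly 1250 periods of the tone):
f0 = 1.25e9
pts = equivalent_time_capture(lambda t: math.sin(2 * math.pi * f0 * t),
                              1 / f0, 1e-6, 80, 1e-11)
print(len(pts))  # 80 reconstructed points spanning the 800 ps period
```

Even though the scope samples six orders of magnitude below the signal frequency, the folded samples trace out the full waveform, which is exactly how a sampling scope accumulates an eye diagram.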
Figure 19-3. Eye mask
Also represented in the figure is a diamond-shaped eye mask that represents a typical far-end compliance mask (i.e., at the receiver input). Note that depending on the pattern used or on the duration of the measurement, the eye diagram may pass or fail the mask test. Therefore, it is important to note the test conditions used for the mask test and to be cognizant of these issues when performing mask tests of your own. It is also important to note the source of the sampling clock or trigger when performing a mask test. Long-term variations in the frequency of the symbol clock (wander) of the device under test tend to close the eye while having no impact on the BER performance of the system. This outcome is due to the fact that the receiver in the actual system has a phase-locked loop that recovers the symbol clock from the data signal and is capable of tracking this low-frequency wander. Because of this, most eye mask test setups require the use of a Golden PLL to generate the sample clock or trigger for the test system. The Golden PLL is a clock recovery unit similar to what might be found in actual receivers, but with well-defined performance characteristics. While these characteristics vary depending on the system to be tested, they are typically defined to have low intrinsic jitter, minimal gain peaking, and a specific bandwidth (usually related to the operating rate of the system). Use of the Golden PLL eliminates the impact of low-frequency wander, improving accuracy and allowing longer-term measurements to address the issues of eye diagram completeness that were discussed earlier.
19.3.2 Jitter modeling conventions for high-speed interfaces Jitter is the deviation in the time of occurrence of a significant event (i.e., a data transition) from its expected time (i.e., as dictated by a reference clock). This section is devoted to the study of high-frequency jitter and its impact on the BER performance of the system. This is in contrast to low-frequency jitter, or wander, which is typically tracked by the clock recovery unit in the receiver subsystem but is of great concern in synchronous networks, such as SONET/SDH. Jitter may be characterized in the time domain in terms of a histogram of phase offsets accumulated over time, or in the frequency domain in terms of a power spectral density. While many methods of modeling and measuring jitter exist, this section will focus on the time-domain approach and specifically on the BER bathtub model. It is important to note that there are two general classes of jitter that are commonly accepted. Deterministic jitter (DJ) represents high-probability timing offsets that accumulate in the system via a linear algebraic sum. For
example, if a transmitter output has deterministic jitter X, and the channel introduces deterministic jitter Y, then the total deterministic jitter at the receiver is X + Y. Random jitter (RJ) represents low-probability timing offsets that accumulate in the system via root-mean-square addition. From the previous example, if X and Y were random jitter components, then the total random jitter at the receiver would be √(X² + Y²). Random jitter tends to assume a Gaussian distribution and is therefore sometimes referred to as Gaussian jitter (GJ). Jitter is further broken down into subclasses that are more closely associated with the jitter sources. Deterministic jitter is typically broken down into Data Dependent Jitter (DDJ) and Periodic Jitter (PJ). DDJ is directly attributed to intersymbol interference (ISI) intrinsic to the serdes or introduced by the channel. Because of this, the magnitude of the time offset is highly correlated to the data pattern, and DDJ is therefore sometimes referred to as correlated, bounded, high-probability jitter (CBHPJ). DDJ can be mitigated through the use of equalization (i.e., reduction in the ISI). One typical source of PJ is power supply leakage into the clock synthesis circuit. It has no correlation to the data and is therefore sometimes referred to as uncorrelated, bounded, high-probability jitter (UBHPJ). PJ is often deliberately injected into the system during testing for receiver jitter tolerance, as it is easy to generate and characterize. One of the principal sources of random jitter is the intrinsic noise of the reference clock. While the spectral content of this noise may vary, it always yields a Gaussian distribution in the time domain. It is important to note that the Gaussian distribution is unbounded and, for all practical purposes, so is the reference clock noise. In other words, the peak-peak value of the noise is a function of the observation interval, and longer observation intervals yield larger peak-peak values.
For this reason, this type of jitter is sometimes referred to as uncorrelated, unbounded Gaussian jitter (UUGJ). The other variant of RJ is actually a special case of DDJ. As mentioned earlier, the magnitude of DDJ at a particular instant is a function of the symbols that precede and follow the symbol of interest. If the impulse response of the system spans many symbols, there are many permutations of symbol strings that yield unique time offsets. For example, if the impulse response of the system spanned 31 symbols, there could be 2^31 unique DDJ offsets. Since the contributions of the nearby symbols are typically the strongest, they appear frequently (high probability) and are categorized as DDJ. The contributions from the symbols at the tails of the impulse response are weakest and tend to add "fuzz" to the eye. Since this fuzz is actually the sum of the contributions from many outlying symbols, it tends to assume a Gaussian distribution (per the Central Limit Theorem). However,
746
Chapter 19
this distribution is bounded, making it distinct from reference clock noise. Also, it is clear that the jitter is correlated to the signal. Therefore, this jitter is often referred to as correlated, bounded Gaussian jitter (CBGJ). In summary, the jitter in a given system is the sum of deterministic and random contributions from the transmitter, the channel, and the receiver. The transmitter typically contributes UUGJ from the reference clock, PJ via power-supply feed-through, and perhaps DDJ from band-limiting or mismatch in the driver. The channel introduces additional DDJ and CBGJ based on the transfer function of the channel and the data pattern of interest. The receiver may contribute additional DDJ, PJ, and UUGJ from its input buffer and clock recovery system. Transmitter- and receiver-based equalization techniques may also be used to reduce DDJ and CBGJ.
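The accumulation rules above (DJ adds linearly, RJ adds root-mean-square) can be sketched with a hypothetical per-stage budget; the numeric values are invented for illustration and are not from any specification:

```python
import math

def total_dj(*components_ui: float) -> float:
    """Deterministic jitter accumulates as a linear algebraic sum."""
    return sum(components_ui)

def total_rj(*components_ui: float) -> float:
    """Random jitter accumulates via root-mean-square addition."""
    return math.sqrt(sum(x * x for x in components_ui))

# Hypothetical per-stage budget in unit intervals (UI):
#             transmitter, channel, receiver
dj = total_dj(0.10, 0.15, 0.05)
rj = total_rj(0.010, 0.005, 0.010)
print(round(dj, 3), round(rj, 4))  # 0.3 0.015
```

Note the practical consequence of RSS addition: the smallest RJ contributor (0.005 UI here) barely moves the total, whereas every DJ contributor adds in full.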
19.3.3 Bathtub curve analysis of jitter Despite the detailed classification of jitter explained above, it is typically necessary to measure only the deterministic and random components. These components may be derived from the BER bathtub curve. The left panel of Figure 19-4 shows an example of the BER bathtub curve. The curve is typically generated using a bit error rate test set and is constructed by simply measuring the BER, at a slicer threshold of zero, at various sample time offsets. Note that this is basically a slice of Figure 19-3 taken at zero amplitude. The left panel of Figure 19-4 also shows the histogram of the timing jitter (measured at zero amplitude) as a dashed line. Note that the jitter histogram and the BER bathtub curve are in close agreement for low probabilities (lower than 1E-6), and this is an indication that the BER bathtub is an effective representation of the jitter histogram. To assist in the extraction of parameters, a model for the DJ distribution must be assumed, and the typical assumption is that the distribution is dual-Dirac. The dual-Dirac distribution is, as the name implies, two impulse functions and is used to model duty cycle distortion or even-odd distortion. Duty cycle distortion is the difference between the width of the pulse for a "one" symbol and the width of the pulse for a "zero" symbol. Even-odd distortion is the difference in the width of pulses in odd positions compared with the width of pulses in even positions. In both cases, the magnitude of the offset always assumes one of two values, namely, −DJ/2 or +DJ/2.
High-Speed Serial Interconnect
747
Figure 19-4. Bathtub curve jitter decomposition (DJ = 0.28, RJ = 0.021)
The convolution of the dual-Dirac distribution and the random jitter distribution yields the following equation for the jitter distribution model:
P(t) = [1 / (2·√(2π)·σ_RJ)] · { exp[ -(t - DJ/2)² / (2·σ_RJ²) ] + exp[ -(t + DJ/2)² / (2·σ_RJ²) ] }
In the above equation, DJ is the peak-peak deterministic jitter and σ_RJ is the standard deviation of the random jitter. To yield the right panel of Figure 19-4, the following transformation is applied:
Q = √2 · erf⁻¹(1 - 2·BER)

This transformation serves to linearize the function in the region of low probability. It can be shown that, in this region, the jitter probability distribution function can be represented in two parts:
Q_left(t) = (t - DJ_left) / σ_left        Q_right(t) = -(t - DJ_right) / σ_right
The "left" and "right" notation is used to account for the fact that the distribution may not necessarily be symmetric. The benefit of the
transformation is apparent, as it is now possible to extract the DJ and RJ using a simple line fit. However, care must still be taken when computing the best fit line. The fit must be constrained to the region where the slope of the line is dominated by the UUGJ. As mentioned earlier, DDJ contributes to this slope at higher probabilities (i.e., the CBGJ), but since it is bounded, it is inappropriate to extrapolate that line to the lower probabilities (it tends to yield pessimistic results). A rule of thumb is that the fit should be constrained to probabilities less than 1/(pattern length) to avoid corruption by the CBGJ. In the example shown in Figure 19-4, a PN-31 pattern was simulated, which corresponds to a Q of approximately 6. In Figure 19-4, the circled data points represent the ones used for the fit. Note that they are for a Q greater than 6. The extracted parameters may then be resolved to singular DJ and RJ values for comparison with the appropriate specifications:
DJ = DJ_right - DJ_left

RJ = (σ_left + σ_right) / 2
It is important to note that, due to the dual-Dirac assumption for the DJ distribution, the resulting DJ is best characterized as "effective" DJ. Other analysis techniques may be employed to better understand the exact DJ distribution, but the version derived from decomposition of the BER bathtub curve is typically all that is required for comparison to industry standards.
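The decomposition procedure can be sketched numerically. The following Python snippet is a minimal illustration, not a production jitter tool: it synthesizes one edge of a bathtub curve from the dual-Dirac model (using the DJ = 0.28 UI and σ = 0.021 UI values of Figure 19-4), converts BER to the Q scale via the inverse normal CDF, fits a line to the low-probability tail, and recovers effective DJ and RJ. The sample offsets and fit window are illustrative assumptions.

```python
import math
from statistics import NormalDist

ND = NormalDist()

def bathtub_left(t, dj=0.28, sigma=0.021):
    """BER vs. sampling offset t (in UI) for the left eye edge under the
    dual-Dirac model: half the edge population sits at -DJ/2 and half at
    +DJ/2, each smeared by Gaussian RJ with standard deviation sigma."""
    return 0.5 * (1.0 - ND.cdf((t - dj / 2) / sigma)) \
         + 0.5 * (1.0 - ND.cdf((t + dj / 2) / sigma))

def q_scale(ber):
    """Q = sqrt(2) * erfinv(1 - 2*BER), computed via the inverse normal CDF."""
    return ND.inv_cdf(1.0 - ber)

# Sample the tail of the bathtub at low probabilities (Q > 6), where the
# slope is dominated by the random (Gaussian) jitter.
ts = [0.270 + 0.005 * i for i in range(6)]          # 0.270 .. 0.295 UI
qs = [q_scale(bathtub_left(t)) for t in ts]

# Simple least-squares line fit: Q(t) ~ (t - DJ_left) / sigma
n = len(ts)
tbar, qbar = sum(ts) / n, sum(qs) / n
slope = sum((t - tbar) * (q - qbar) for t, q in zip(ts, qs)) \
      / sum((t - tbar) ** 2 for t in ts)
sigma_hat = 1.0 / slope                  # extracted RJ (sigma)
dj_left = tbar - qbar / slope            # extrapolated Q = 0 crossing
dj_hat = 2.0 * dj_left                   # symmetric eye: DJ = 2 * DJ_left

print(f"RJ sigma ~ {sigma_hat:.4f} UI, effective DJ ~ {dj_hat:.3f} UI")
```

The fit recovers values close to, but not exactly equal to, the model inputs, which is the "effective DJ" caveat noted above in action.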
19.4. INTERCONNECT EXTENSION USING DE-EMPHASIS AND EQUALIZATION
In the 2.5 Gb/s regime, robust and interoperable chip-to-chip interconnects can be realized through the appropriate choice of transmitter output levels, receiver sensitivity, and a reasonable jitter budget. However, when dealing with backplane interconnect, it is often necessary to incorporate equalization to deal with higher-frequency selective losses that accompany the longer interconnect. Such techniques may be applied at the transmitter, the receiver, or both depending on the length of the interconnect and other transceiver and system design trade-offs.
19.4.1 De-emphasis at the Transmitter

One common form of equalization is the transmitter-based technique of de-emphasis. This technique emphasizes the high-frequency content of the signal to compensate for the high-frequency loss in the channel. With properly tuned de-emphasis, the combination of the emphasized transmitter signal and the channel results in a flat frequency response, in the frequency range of interest, at the receiver. In effect, the de-emphasis cancels the distortion introduced by the channel and makes it look like a much shorter interconnect. Note that this technique is sometimes referred to as pre-emphasis. The difference between de-emphasis and pre-emphasis is simply a scale factor. De-emphasis emphasizes the high-frequency content by attenuating the low-frequency content, which results in a smaller albeit cleaner eye at the receiver. Pre-emphasis emphasizes higher frequencies through the application of gain. This requires a stronger output buffer to accommodate larger peak-peak signal swings and may also enhance crosstalk, offsetting the benefit of a larger eye opening at the receiver. The remainder of this section will focus on de-emphasis, but the same basic principles apply to both techniques. A basic model for a de-emphasis transmitter is shown in Figure 19-5. The model represents a two-tap digital implementation (analog implementations are also possible) where t0 represents the main tap and t1 represents the emphasis tap, which typically assumes a negative value. When the current bit and previous bit have the same sign, the output is the sum of the two taps (VSS). When the current bit and previous bit have opposite sign, as is the case with a data transition, the output is the difference of t0 and t1 (VPK). The degree of emphasis represented by t0 and t1 may be defined in several ways. The most common definitions are stated below for reference. The simplest way to represent emphasis is in terms of a percentage, as shown below. This is the definition that will be used throughout the remainder of this section:

Emphasis(%) = 100% × (VPK / VSS - 1)
Figure 19-5. De-emphasis definition
Yet another method expresses the degree of emphasis through the use of the parameter α. This parameter is defined as

α = 100% × (VPK - VSS) / (VPK + VSS)
Since the α-parameter is also commonly expressed as a percentage, care must be taken not to confuse it with other definitions. Figure 19-6 illustrates the benefits and potential pitfalls of de-emphasis. In this figure, the output of various backplane interconnects (columns) is shown for a 2.5 Gb/s non-return-to-zero (NRZ) transmitter with various de-emphasis settings (rows). Four backplanes were studied with total interconnect length, including two backplane connectors, of 13, 50, 86, and 127 cm, respectively. For each backplane, the transmitter de-emphasis was set to 0, 25, 50, and 100%. Under each eye diagram, the eye height at zero time (H) and the eye width at zero amplitude (W) are reported. Both results are for a BER of 1E-12. Note that the resulting eye diagrams do not show the impact of transmitter timing jitter or crosstalk, and are simply intended to illustrate de-emphasis trade-offs. In the first row of Figure 19-6, it can be seen that the eye is wide open for a 13 cm interconnect with no transmitter de-emphasis. This distance is well within the range of what would be considered a chip-chip interconnect and
illustrates the earlier point that robust, interoperable intra-card communications in the 2.5 Gb/s regime do not require equalization. However, going from left to right along that row, the vertical eye closure and jitter increase significantly. At 127 cm, the eye is completely closed. When considering the impact of jitter and crosstalk, one finds that even the 50 and 86 cm interconnects are difficult to support without the benefit of equalization. If we focus on the 127 cm interconnect, which is the rightmost column of Figure 19-6, the benefits of de-emphasis are readily seen. As the de-emphasis is increased, the jitter decreases dramatically and the vertical eye opening is improved. Recall that with de-emphasis, low-frequency components are attenuated, which results in a smaller eye opening, but the emphasis of the higher-frequency content reduces the intersymbol interference. So long as the increase in eye height achieved through ISI reduction exceeds what is lost to de-emphasis, a net improvement in the vertical eye opening is realized.
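The two-tap transmitter model and the emphasis definitions above can be illustrated with a short sketch. The tap values t0 = 0.75 and t1 = -0.25 below are hypothetical, chosen because they yield 100% emphasis:

```python
def deemphasis(bits, t0=0.75, t1=-0.25):
    """Two-tap de-emphasis FIR: output = t0*x[n] + t1*x[n-1],
    with bits mapped to symbols x in {-1, +1}."""
    x = [2 * b - 1 for b in bits]
    prev = x[0]  # assume the line held the first symbol beforehand
    out = []
    for s in x:
        out.append(t0 * s + t1 * prev)
        prev = s
    return out

t0, t1 = 0.75, -0.25
v_ss = t0 + t1                                   # steady-state level: 0.5
v_pk = t0 - t1                                   # transition (peak) level: 1.0
emphasis_pct = 100 * (v_pk / v_ss - 1)           # 100%
alpha_pct = 100 * (v_pk - v_ss) / (v_pk + v_ss)  # ~33.3%

wave = deemphasis([1, 1, 1, 0, 0, 1])
print(wave)   # [0.5, 0.5, 0.5, -1.0, -0.5, 1.0]
```

Note how each transition gets the full ±1.0 swing while repeated bits settle to the attenuated ±0.5 level, which is exactly the high-frequency emphasis described above.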
Figure 19-6. Various levels of de-emphasis applied to various lengths of backplane interconnect
In cases where a high degree of equalization is required to compensate for channel loss, the net benefit in eye opening may be small, and therefore it is preferable not to rely too heavily on de-emphasis. It is important to note the steady improvement in received jitter with added de-emphasis. While
loss of amplitude can be corrected through the application of additional transmitter power (pre-emphasis rather than de-emphasis), the only way to effectively mitigate channel-induced jitter is through the application of an appropriate amount of equalization. When the equalization benefit does not exceed the de-emphasis loss, a net penalty in the eye opening may be observed. If we refer to the 13 cm channel, which is the leftmost column of Figure 19-6, the application of increasing levels of de-emphasis quickly results in over-equalization of the channel. Over-equalization is manifested in the form of large overshoots due to gain peaking in the frequency response from the convolution of the transmitter and channel. These overshoots increase the jitter and, in the case of de-emphasis, reduce the usable eye height. In the case of pre-emphasis, the inner eye height would not be reduced, but the additional jitter would still be observed. Clearly, for a given channel, there is an optimum setting of de-emphasis that maximizes the vertical eye opening and minimizes the jitter. Such optimum settings may be found empirically, as shown in Figure 19-7. A typical implementation offers a limited number of de-emphasis settings, so the best setting may be found by trial and error. This process may be aided by guidance based on the length of the interconnect. Techniques could be applied that make the system fully adaptive, and these are being applied at speeds above 2.5 Gb/s. However, at 2.5 Gb/s and below, provisionable, static settings have been found to be effective and are commonly used. The left panel of Figure 19-7 shows the vertical eye opening at 1E-12 for various backplane interconnects as a function of de-emphasis setting. The right panel shows the horizontal eye opening, again at 1E-12. In contrast to the previous analysis, these results include the effects of transmitter timing jitter and crosstalk.
The transmitter timing jitter was assumed to be 0.17 UIp-p periodic jitter (PJ, 100 MHz sinusoid) and 0.18 UIp-p random jitter (RJ, as measured to 1E-12). These values are typical worst-case specifications for devices operating in the 2.5 Gb/s regime. From Figure 19-7, it is clear that there is a singular value that provides the best vertical or horizontal eye opening. It is also important to note that the same de-emphasis setting that maximizes the vertical eye opening does not necessarily maximize the horizontal eye opening, and vice versa. Figure 19-8 is provided to better illustrate the underlying mechanisms of transmitter de-emphasis. In the left panel, the inverse of the magnitude response of a sample backplane interconnect is plotted. This happens to be the 50 cm interconnect utilized in earlier studies. Also plotted in the left panel are frequency response curves corresponding to 25, 50, 75, and 100% de-emphasis, respectively. All these curves are normalized to the low-frequency gain of the system. Note that the curve representing 100% de-emphasis exhibits 6 dB of gain at the Nyquist frequency (1.25 GHz for 2.5 Gb/s NRZ signaling), which is consistent with the set of definitions provided earlier.
Figure 19-7. Vertical and horizontal eye opening as a function of de-emphasis and interconnect length.
A simple way to look at de-emphasis optimization is to match the de-emphasis gain to the inverse of the channel magnitude function. Therefore, the cascade of the two functions would be flat, which is the desired frequency response at the receiver. As can be seen in Figure 19-8, the curve for 75% is a good fit. The right panel of Figure 19-8 shows the cascade of the channel magnitude function and the de-emphasis gain. This plot is not normalized, so the impact of de-emphasis is clear (for 100% emphasis, the low-frequency content is 6 dB down from the curve for 0% emphasis). Here it can be seen that the 75% emphasis yields a flat frequency response, while 100% shows some gain peaking. The eye diagrams in Figure 19-8 serve to tie it all together, showing the eye closure when no emphasis is applied and the overshoot that corresponds to the gain peaking when the system is over-equalized. While the previous example illustrated an effective heuristic for selecting the optimal de-emphasis for a given channel, it is important to note that it is only using half of the available channel information. Distortion in the channel phase response also contributes to channel-induced jitter and eye closure. However, since simple amplitude compensation has been proven adequate for the majority of backplane interconnects, the study of phase compensation is left as an exercise for the reader.
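The frequency-domain view can be checked directly: a two-tap transmitter of the form shown in Figure 19-5 has the response H(f) = t0 + t1·exp(-j2πfT), so its boost at the Nyquist frequency relative to DC is 20·log10(VPK/VSS). A short sketch (tap values are illustrative; 100% de-emphasis should show about 6 dB of relative boost, consistent with Figure 19-8):

```python
import cmath
import math

def tx_gain_db(f, bit_rate, t0, t1):
    """Magnitude response (dB) of a two-tap de-emphasis transmitter:
    H(f) = t0 + t1 * exp(-j*2*pi*f*T), with T = 1/bit_rate."""
    h = t0 + t1 * cmath.exp(-2j * math.pi * f / bit_rate)
    return 20 * math.log10(abs(h))

bit_rate = 2.5e9          # 2.5 Gb/s NRZ
nyquist = bit_rate / 2    # 1.25 GHz
t0, t1 = 0.75, -0.25      # 100% de-emphasis (VPK/VSS = 2)

boost = tx_gain_db(nyquist, bit_rate, t0, t1) - tx_gain_db(0.0, bit_rate, t0, t1)
print(f"Boost at Nyquist relative to DC: {boost:.2f} dB")   # ~6.02 dB
```

Note that the absolute gain at DC is itself 6 dB down (20·log10(0.5)), which is the low-frequency attenuation visible in the unnormalized right panel of Figure 19-8.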
Figure 19-8. Frequency-domain interpretation of de-emphasis.
19.4.2 Equalization at the Receiver

Many of the concepts discussed in the context of transmitter de-emphasis may also be applied at the receiver. A simple but common receiver equalizer consists of some gain peaking added at the receive buffer. Figure 19-9 shows example receiver transfer functions synthesized from a single continuous-time zero and two poles. The placement of the zero affects the amount of gain peaking realized at the Nyquist frequency. In some cases, the zero placement can be made provisionable (or adaptive), allowing the receiver boost to be tuned to the interconnect length. Similar to transmitter de-emphasis, the objective for the receive equalizer is to match the inverse of the channel magnitude response so that the cascade of the transmitter, channel, and receiver front-end yields a flat frequency response. As discussed earlier, while the magnitude response of the channel may be used to effectively estimate the required level of compensation, to be complete the phase response of the system must also be considered. Again, the study of phase compensation is left to the reader. It is important to note that while the receive equalizer boosts the input signal, it also boosts the input noise. Such noise enhancement is illustrated in Figure 19-10. In this figure, the amplitude distribution of a crosstalk aggressor, as observed at the victim receiver input, is shown as the dashed line. The remaining lines are the amplitude distribution as observed after the application of high-frequency boost. For the case of 6 dB boost, a factor-of-2 enhancement of the peak-peak crosstalk noise is observed.
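A peaking response of the kind shown in Figure 19-9 can be sketched from one continuous-time zero and two poles. The corner frequencies below are illustrative assumptions only, not values from any standard:

```python
import cmath
import math

def rx_eq_gain_db(f, f_zero=300e6, f_pole1=2.5e9, f_pole2=2.5e9):
    """|H(f)| in dB for H(s) = (1 + s/wz) / ((1 + s/wp1)(1 + s/wp2)).
    Placing the zero below the poles produces high-frequency peaking."""
    s = 2j * math.pi * f
    wz, wp1, wp2 = (2 * math.pi * fc for fc in (f_zero, f_pole1, f_pole2))
    h = (1 + s / wz) / ((1 + s / wp1) * (1 + s / wp2))
    return 20 * math.log10(abs(h))

nyquist = 1.25e9   # for 2.5 Gb/s NRZ
boost = rx_eq_gain_db(nyquist) - rx_eq_gain_db(0.0)
print(f"Peaking at Nyquist relative to DC: {boost:.1f} dB")
```

Making f_zero provisionable corresponds to the tunable-boost behavior described above: moving the zero down in frequency increases the peaking at Nyquist, while the poles roll the response off at still higher frequencies.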
Figure 19-9. Sample transfer functions for receiver-based equalizers
Figure 19-10. Noise enhancement
The increase in eye opening due to the equalization must exceed the loss in eye opening due to noise enhancement in order for a net benefit to be realized. In other words, it is possible for the receive equalizer to realize a signal-to-noise ratio penalty. This is illustrated further in Figure 19-11. The left panel of Figure 19-11 shows the vertical eye opening at 1E-15 for the 127 cm channel with various levels of transmitter de-emphasis and receiver high-frequency boost. The right panel shows the horizontal eye opening at 1E-15. The results include the effect of transmitter timing jitter and crosstalk (and hence, equalizer noise enhancement). The results show that without transmitter de-emphasis, increasing levels of receiver boost provide a wider eye opening in both vertical and horizontal dimensions. Note that the peak eye opening is obtained when the combined transmitter de-emphasis and receiver gain peaking is approximately 7 dB. This implies that when the total equalization gain significantly exceeds this value, the eye opening gained through ISI reduction no longer offsets the eye opening lost to de-emphasis and noise enhancement.
Figure 19-11. Vertical and horizontal eye opening as a function of transmitter de-emphasis and receiver boost for a 127 cm interconnect
19.4.3 Usage Models

It has been shown that 2.5 Gb/s interconnects can be extended to lengths in excess of 1 m through the application of equalization techniques. The simplest and most common of these techniques is transmitter de-emphasis, but this is often complemented by receiver-based equalization. While the effectiveness of these techniques is readily shown, care must be taken to avoid over-equalizing the channel of interest, as this can result in suboptimal, and possibly unacceptable, performance.
The problem of determining the optimum level of equalization for a given backplane channel in a given system may be addressed in several ways. Since backplane performance is typically well characterized, a simple approach would be to statically provision the equalization settings based on an index derived from the chassis slot and backplane channel being accessed. The system could also hard-code a "middle of the road" setting that provides an adequate eye opening for any given channel in the system. Static equalizer settings based on a priori knowledge of the channel provide a straightforward and effective method for improving transmission performance over the backplane. However, such schemes are vulnerable to the differences between the backplane characterized in the lab and the backplane that is actually deployed in the field. As discussed earlier, such differences exist in static and dynamic forms. The static variation is a product of manufacturing control, while the dynamic variation is a product of environmental influences on the channel. In the 2.5 Gb/s regime, it is conceivable that settings could be chosen that provide margin and therefore protection from such variations. It is also possible for the system management software to monitor the health of the backplane links and make adjustments to the equalization settings accordingly. While only the simplest equalization schemes were addressed in the previous section, other techniques may be employed to further improve the reach and robustness of backplane interconnect. Among them are multitap de-emphasis schemes and decision feedback equalization. Multitap de-emphasis introduces multiple degrees of freedom to the transmitter equalization circuit, which, under the guidance of a convergence algorithm such as least mean squares (LMS) adaptation, can find the optimal frequency response that compensates for both the magnitude and phase distortion of the channel.
Decision feedback equalization (DFE) is typically a receiver-based equalization technique that acts as a "memory eraser." Energy in the postcursors of the channel pulse response (i.e., the intersymbol interference, or channel memory) is negated by the DFE, which may also operate under the guidance of the LMS adaptation algorithm. These techniques, while sometimes applied in the 2.5 Gb/s regime, will be much more common as the per-channel speeds increase to 6 and 10 Gb/s, where the penalties for misequalization are greater. Such adaptive techniques not only automatically and accurately identify the optimum for a given channel but also allow the transceivers to track any time variation in the channel. Since these algorithms are readily implemented in hardware, adaptation can be supported with minimal intervention from the system management software.
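The "memory eraser" behavior of a DFE can be sketched in a few lines. This toy model uses a hypothetical pulse response with no noise and fixed (rather than LMS-adapted) taps; the point is that each decision subtracts the postcursor ISI it will cast on the symbols that follow:

```python
import random

def dfe_receive(rx, postcursors):
    """Slice each received sample after subtracting the ISI predicted
    from previous decisions (decision feedback equalization)."""
    decisions = []
    for n, sample in enumerate(rx):
        isi = sum(c * decisions[n - 1 - k]
                  for k, c in enumerate(postcursors) if n - 1 - k >= 0)
        decisions.append(1 if sample - isi >= 0 else -1)
    return decisions

random.seed(2)
tx = [random.choice((-1, 1)) for _ in range(200)]

# Illustrative channel: main cursor plus two postcursors whose combined
# amplitude (0.7 + 0.4) exceeds the main cursor, closing the eye.
pulse = [1.0, 0.7, 0.4]
rx = [sum(pulse[k] * tx[n - k] for k in range(len(pulse)) if n - k >= 0)
      for n in range(len(tx))]

raw = [1 if s >= 0 else -1 for s in rx]       # simple slicer, no feedback
dfe = dfe_receive(rx, postcursors=pulse[1:])

print(sum(r != t for r, t in zip(raw, tx)), "errors without DFE")
print(sum(d != t for d, t in zip(dfe, tx)), "errors with DFE")
```

With taps matched to the postcursors, the DFE restores the full decision margin; a real implementation would converge those taps with LMS as described above, and note that a DFE cannot touch precursor ISI, since it feeds back only past decisions.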
19.5. STANDARDS-BASED HIGH-SPEED INTERCONNECT
Standardization promotes multivendor interoperability, broadening the selection of traffic managers, switches, etc. and allowing optimization of system design by leveraging the strengths of different suppliers. While this has been a boon for systems integrators, the value has recently been enhanced with the advent of standards-based modular chassis, such as those defined by the PCI Industrial Computer Manufacturers Group (PICMG). It opens the possibility that a user of a modular platform can purchase a server blade from vendor A, a storage blade from vendor B, and a networking blade from vendor C, and plug them all into a standard chassis with full confidence in the interoperability of these components. Standards also tend to focus the market on a specific solution for a given problem, broadening the market for that solution but also promoting competition. This process generally improves the technology and drives costs down. The following sections describe several commonly used serial chip-chip and backplane interconnect standards in the 2.5 Gb/s regime. While the list is not exhaustive, it covers standards typically used in the telecom switching and routing platforms.
19.5.1 OIF SxI-5

In October 2002, the Optical Internetworking Forum (OIF) published a set of common electrical characteristics for serdes interfaces in the 2.488 to 3.125 Gb/s regime. This specification, known as SxI-5, serves as the foundation for the 40 Gb/s Serdes-Framer Interface (SFI-5) and the 40 Gb/s System Packet Interface (SPI-5). More recently, this specification was used as the electrical basis for the second-generation 10 Gb/s Serdes-Framer Interface (SFI-4.2). The SxI-5 implementation agreement defines a chip-chip and chip-optics interconnect. It provides transmitter and receiver electrical specifications supporting both DC- and AC-coupled operation over at least 20 cm of interconnect with one or two connectors. The 40 Gb/s solution uses 16 data-bearing channels (SFI-5 uses an additional channel for out-of-band signaling; SPI-5 uses two) in the transmit and receive directions, while the 10 Gb/s solution requires only four in each direction. The SxI-5 interface is agnostic to the payload (although some guarantees regarding DC balance and run-length are required), and the data coding and formatting is a function of the higher-level interface definition.
19.5.2 OIF TFI-5

The OIF has also defined the TFI-5 interface, which allows a SONET/SDH framer to be connected to a TDM fabric across a backplane. The TFI-5 serdes definition, although based on SxI-5, was augmented to support a reach of 75 cm with two connectors. The implementation agreement states that the use of transmitter de-emphasis is required to support the longer reaches. Like SxI-5, both DC- and AC-coupled operation is defined. The interface is N serdes channels (N may be 1, 4, or 16, for example) operating at the STS-48 (or optionally STS-60) rate and leverages the SONET frame structure.
19.5.3 IEEE® 802.3ae™ Clause 47, XAUI

IEEE 802.3 has defined the extended attachment unit interface (XAUI) as part of the 10 Gb/s Ethernet specification. This interface features four channels of 8B10B-encoded data at 3.125 Gb/s per channel for 10 Gb/s total throughput. This interface is designed to support a 50 cm interconnect with two connectors and mandates the use of AC-coupling (coupling capacitors are assumed to reside near the receiver). The XAUI specification was originally intended as a lower pin-count, longer reach version of the 10 Gigabit Media-Independent Interface (XGMII), which required 36 bits of parallel data plus a clock in both the transmit and receive directions. The role of XGMII and XAUI is to serve as the partition between the MAC logic (ASIC) and the physical-layer silicon or optical pluggable module. Today, it serves in this capacity as part of the XENPAK, X2, and XPAK multisource agreements for optical pluggable modules. The XAUI specification makes the use of pre-emphasis optional through the specification of a near-end compliance mask and a far-end compliance mask. A device is required to meet at least one of those masks. The near-end mask applies to transmitters without pre-emphasis, while the far-end mask applies to transmitters with or without pre-emphasis. The implication is that any device that meets the near-end mask will meet the far-end mask when used in conjunction with a compliant 50 cm channel. While it is possible to support 50 cm without pre-emphasis, most devices built to the XAUI specification (and the TFI-5 specification for that matter) offer supplemental capabilities in the form of transmitter pre-emphasis and receiver-based equalization. These features allow the parts to operate well
beyond the 50 cm design objective when matched with other enhanced solutions, and this has made XAUI the basis for several backplane architectures.
19.5.4 Backplane Ethernet

Several examples of Backplane Ethernet exist today. PICMG has defined a form factor, data plane interconnect, and other core features for a modular networking and computing chassis. Furthermore, it has defined adaptations of Ethernet physical layer standards (PHYs) for use over the data plane. PICMG 2.16 defines operation of 100/1000 Mbps Ethernet PHYs, originally designed for use with unshielded twisted pair media, over a CompactPCI backplane. PICMG 3.1 defines a serdes-based Ethernet backplane at 1 and 10 Gb/s (XAUI) for operation over the Advanced Telecom Computing Architecture (ATCA) backplane. Recently, the IEEE 802.3 working group has initiated a project to standardize Backplane Ethernet. The objective is to take the successful Ethernet LAN technology and put it in the box. By doing so, existing and field-proven Ethernet intellectual property, including the media access controller and switch fabric, may be leveraged. In addition, it has the potential to "flatten" the network architecture, meaning that no protocol conversion or encapsulation is necessary when taking packets from the LAN and communicating them across the backplane. The definition of a backplane fabric is a large and multi-dimensional problem that is best solved incrementally. With that in mind, this new standards project is focused on the physical layer definitions for Backplane Ethernet and will define serial solutions at 1 and 10 Gb/s over 1 m of backplane interconnect.
19.5.5 Summary of Standards-Based High-Speed Interconnect

Table 19-1 summarizes the 2.5 Gb/s interface standards discussed in the earlier sections. It defines the channel, bit rate, BER objective, and equalization assumptions for each interface. Note that, while the stated reach was used as a guideline during development of these specifications, it is possible to design backplanes with much longer channel lengths that meet or exceed the underlying performance assumptions. It is also possible to define a backplane channel of the nominal length with performance much worse than assumed. Therefore, it is not appropriate to interpret the nominal reach as a hard cut-off.
Table 19-1. Summary of selected 2.5 Gb/s interface standards

Interface   Speed (Gb/s)   Nominal Reach (cm)   No. of Connectors   Equalization (TX)   Equalization (RX)   Target BER
SxI-5       2.488-3.125    20                   1                   Not required        Not required        1.00E-12
TFI-5       2.488-3.110    75                   2                   Required            Optional            1.00E-12
XAUI        3.125          50                   2                   Optional            Optional            1.00E-12
Table 19-2 summarizes the transmitter parametric specifications for the interfaces considered. It is important to note the similarities among the interface types, and that the differential output amplitude is scaled appropriately to the length of the channel to be supported. Since the differential output amplitude is a parameter that may be readily programmed, a single serdes implementation that supports multiple interface types is feasible and quite common.

Table 19-2. Transmitter specifications for selected 2.5 Gb/s interface standards
Interface   Differential Output   Common-Mode Output   Differential        Differential Return   Total Jitter (TJ)   Deterministic Jitter (DJ)
            Voltage (mVpp)        Voltage (V)          Resistance (Ω)      Loss (dB, min)        (UIpp, max)         (UIpp, max)
            min       max         min       max        min       max
SxI-5       500       1000        0.72      1.23       75        125       7.5 [Note 3]          0.35 [Note 5]       0.17 [Note 5]
TFI-5       700       1400        0.62      Vtt [1]    75        125       7.5 [Note 3]          0.35                0.17
XAUI        800       1600        N/A [2]   N/A [2]    N/A [2]   N/A [2]   [Note 4]              0.35                0.17

Note 1: Vtt is the receiver termination voltage, and is permitted to be in the range of 1.1 to 1.3 V.
Note 2: XAUI mandates AC-coupling at the receiver, and therefore DC specifications do not apply.
Note 3: Return loss specifications apply from 0.004 to 0.75 times the symbol rate.
Note 4: XAUI transmitter return loss is defined to be better than 10 dB from 100 MHz to 625 MHz, and better than 10 - 10·log10(f/625 MHz) dB for f in the range 625 MHz to 3.125 GHz.
Note 5: These values assume a clean reference clock. When the clock recovered from the serial input signal is used as the reference, the TJ/DJ specification is 0.45/0.20.
Finally, Table 19-3 summarizes the receiver parametric specifications for the interface types considered. Again, the specifications are comparable, allowing multipurpose implementations. Note that the loss budgets for TFI-5 and XAUI are both 12 dB, despite the difference in supported distance. This is partially due to the fact that TFI-5 assumes that transmitter de-emphasis is used, implying that a longer distance may be supported within the 12 dB loss budget.
Table 19-3. Receiver specifications for selected 2.5 Gb/s interface standards

Interface   Differential Input    Common-Mode Input    Differential        Differential Return   Total Jitter (TJ)   Deterministic Jitter (DJ)
            Voltage (mVpp)        Voltage (V)          Resistance (Ω)      Loss (dB, min)        (UIpp, max)         (UIpp, max)
            min       max         min       max        min       max
SxI-5       175       1000        0.70      Vtt [1]    75        125       10 [Note 3]           0.55 [Note 4]       0.32 [Note 4]
TFI-5       175       1400        0.60      Vtt [1]    75        125       10 [Note 3]           0.65                0.37
XAUI        200       1600        N/A [2]   N/A [2]    N/A [2]   N/A [2]   10 [Note 3]           0.55 [Note 5]       0.37 [Note 5]

Note 1: Vtt is the receiver termination voltage, and is permitted to be in the range of 1.1 to 1.3 V.
Note 2: XAUI mandates AC-coupling at the receiver, and therefore DC specifications do not apply.
Note 3: For SxI-5 and TFI-5, return loss specifications apply from 0.004 to 0.75 times the symbol rate. For XAUI, return loss specifications apply from 100 MHz to 2.5 GHz.
Note 4: These values assume a clean reference clock. When the clock recovered from the serial input signal is used as the reference, the TJ/DJ specification is 0.65/0.35.
Note 5: Receiver test conditions require the injection of 0.10 UIpp additional sinusoidal jitter. In SxI-5 and TFI-5, this jitter is part of the 0.65/0.37 TJ/DJ specification. For XAUI, the PJ supplements the 0.55/0.37 jitter implied by the far-end compliance template.
19.6. HIGHER AND HIGHER SPEEDS
New high-speed backplane interface standards will accompany the next leap in line card capacity. Anticipating a doubling or quadrupling in line card capacity, new solutions are being developed in the 6 Gb/s and 10 Gb/s regimes. In some instances, the solution leverages existing backplane protocols by multiplexing multiple low-speed links onto a high-speed backplane channel. In other instances, existing protocols are being updated to operate at the new speeds. The definition of these new interfaces is heavily influenced by the environment in which they are targeted to operate. The signal integrity challenge presented by the backplane interconnect dictates the modulation and equalization strategies that are employed by the serdes. In the case of IEEE 802.3 Backplane Ethernet, the target environment was strongly influenced by the Advanced Telecom Computing Architecture (ATCA, PICMG 3.1) [6], and backplane interconnects based on ATCA dual-star and full-mesh topologies were studied during specification development. The signal integrity issues discussed in Section 19.2.2 are amplified at higher speeds and present a significant design challenge. To mitigate some of these effects, advanced lower-loss materials and some form of stub-reduction technique were assumed for 10 Gb/s enabled designs. While these measures improved the transmission performance of backplane interconnect, care must also be taken with connector selection and pin-out to keep crosstalk at manageable levels. Such enhancements to current design practice were considered carefully to balance the increased cost of the backplane and line cards with the cost, complexity, and power of a 10 Gb/s serdes solution. Even with considerable improvements in the backplane interconnect performance, robust operation at 10 Gb/s proves to be a nontrivial undertaking. Alternate modulation schemes, such as 4-PAM, duobinary, and
class IV partial-response signaling, were explored for potential performance gains. These alternate schemes attempt to constrain the signal bandwidth to a region in which the backplane interconnect remains well behaved, but they usually incur a penalty in terms of noise susceptibility (be it amplitude or phase noise). The analysis indicated that simple NRZ signaling, like that employed at 2.5 Gb/s and lower speeds, provided the best performance. The simple transmitter de-emphasis and linear receive equalization techniques described in Sections 19.4.1 and 19.4.2 prove to be insufficient at these speeds. Multitap transmit emphasis and decision feedback equalization, as described in Section 19.4.3, are required to provide acceptable levels of performance. In the case of Backplane Ethernet, three-tap transmit equalization was deemed necessary. The additional tap is added to cancel precursor intersymbol interference, which is contributed by the symbol following the symbol of interest. While often neglected at lower speeds, this form of ISI can be a prominent impairment at 10 Gb/s and cannot be canceled by simple de-emphasis or decision feedback equalization. As speed increases, so does sensitivity to the precision of the transmit equalizer. While a coarse setting of transmit emphasis could yield acceptable performance in the 2.5 Gb/s regime, the transmit equalizer must be more finely tuned for acceptable 10 Gb/s operation. Such tuning cannot be accomplished manually, and therefore protocols are being introduced into next-generation serdes that allow the local receiver to tune the remote transmitter for optimal performance. Such protocols require a feedback path from, for example, the line card's serdes to the switch fabric serdes. This feedback path can be provided in-band through reserved bits in the frame structure, or via out-of-band means such as the system control plane.
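The three-tap transmit equalizer described above can be modeled as a simple FIR filter in which each transmitted level is a weighted sum of the previous, current, and next symbols; the tap weights below are purely illustrative and are not values from any standard:

```python
def three_tap_tx_equalize(symbols, c_pre=-0.1, c_main=0.8, c_post=-0.1):
    """Model a 3-tap FIR transmit equalizer applied to NRZ symbols (+1/-1).

    c_pre weights the *next* symbol (precursor cancellation, the extra tap
    discussed in the text), c_post weights the *previous* symbol (classic
    postcursor de-emphasis). Tap values are illustrative only."""
    out = []
    n = len(symbols)
    for k in range(n):
        prev_s = symbols[k - 1] if k > 0 else 0
        next_s = symbols[k + 1] if k < n - 1 else 0
        out.append(c_pre * next_s + c_main * symbols[k] + c_post * prev_s)
    return out

# A transition is transmitted at boosted amplitude, while a long run of
# identical symbols settles to the de-emphasized level c_pre + c_main + c_post.
tx = three_tap_tx_equalize([-1, -1, 1, 1, 1])
```

In a real serdes these weights are what the remote receiver would tune via the feedback protocol mentioned above.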
While signal integrity concerns are paramount in next-generation interface design, solutions to address forward and backward compatibility have also been introduced. The concept is that a card, when inserted into the chassis, can initiate a handshake with the card at the other end of the backplane link, learn the capabilities of that card, and identify the highest-common-denominator mode of operation. Each card then configures itself accordingly. Such a protocol, referred to as auto-negotiation, has been defined in IEEE 802.3 Backplane Ethernet. This plug-and-play methodology reduces the amount of management and human interaction required for deployment of new cards. It also makes it possible to incrementally upgrade systems in the field. For example, a switch fabric with high-speed capability can be installed in a chassis and operate at the legacy rate with the legacy line cards in the system. In the future, when high-speed-capable line cards are installed in the system, they will sense the high-speed capabilities of the
fabric and automatically configure themselves accordingly. Even further in the future, when the next speed generation becomes available, the cycle may be repeated. The next leap in line card capacity brings with it signal integrity, serdes design, and deployment challenges. Organizations such as IEEE 802.3 and OIF are actively identifying these challenges and putting specifications in place to address them in the future.
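The highest-common-denominator resolution at the heart of auto-negotiation can be sketched as follows. The capability labels and priority ordering here are hypothetical illustrations, not the encodings defined by the 802.3 specification:

```python
def negotiate_rate(local_caps, remote_caps):
    """Pick the fastest operating mode supported by both ends of a link.

    Both ends exchange capability sets during the handshake, then each
    independently resolves the same highest-common-denominator mode.
    Mode names and their priority order are illustrative only."""
    priority = ["10G", "2.5G", "1G"]  # fastest first (hypothetical labels)
    for mode in priority:
        if mode in local_caps and mode in remote_caps:
            return mode
    return None  # no common mode; link stays down

# A new 10G-capable fabric paired with a legacy line card falls back to the
# legacy rate, matching the field-upgrade scenario described above.
mode = negotiate_rate({"10G", "2.5G"}, {"2.5G", "1G"})
```

Because both ends run the same deterministic resolution over the exchanged capability sets, they arrive at the same mode without any operator intervention.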
19.7.
SUMMARY
High-speed serial interconnect will continue to be an integral part of system design at both the line card and chassis level. As line card capacity increases, pin and routing density constraints will force migration to the next speed regime. However, with each speed generation comes a new set of signal integrity concerns. While serdes technology is continually evolving to meet these challenges, it is essential that system designers understand these signal integrity issues and be prepared to adopt new design practices to extend the lifetimes of their platforms. Combined with a working knowledge of serdes performance metrics and equalization techniques, complete systems can be engineered that will satisfy current and future capacity requirements.
19.8.
NOTES
1. IEEE is a registered trademark of the Institute of Electrical and Electronics Engineers, Inc.
19.9.
REFERENCES
[1] John D'Ambrosia and Greg Sheets, "The Impact of Environmental Conditions on Channel Performance," DesignCon 2004.
[2] System Interface Level 5 (SxI-5), OIF-SxI5-01.0, Optical Internetworking Forum (OIF), October 2002, http://oiforum.com/public/impagreements.html.
[3] TDM Fabric to Framer Interface Level 5 (TFI-5), OIF-TFI5-01.0, Optical Internetworking Forum (OIF), September 2003, http://oiforum.com/public/impagreements.html.
[4] IEEE Std. 802.3ae-2002, http://standards.ieee.org/getieee802/download/802.3ae-2002.pdf.
[5] IEEE P802.3ap Task Force Web Repository, http://ieee802.org/3/ap.
[6] PCI Industrial Computer Manufacturers Group, http://www.picmg.org/newinitiative.stm.
PART 5 Standards Development Process
Chapter 20
STANDARDS DEVELOPMENT PROCESS
Up Close and Personal
Tracy Dupree* and Dr. Stephen J. Trowbridge**
*DDK Communications, LLC, **Lucent Technologies - Bell Laboratories
20.1.
INTRODUCTION
Throughout the evolution of technology, there has been a need for industry standards. On May 24, 1844, Samuel Morse sent his first message over a telegraph line between Washington and Baltimore. Within a decade, telegraphy service was available to the general public. But in this initial deployment, telegraph lines did not cross national borders, and each country used a different system. Messages therefore had to be transcribed, translated, and handed over at frontiers, where they would then be retransmitted over the telegraph network of the neighboring country. After initial attempts to interconnect systems through numerous bilateral and regional agreements, 20 founding Member States agreed to form the International Telegraph Union (ITU) on May 17, 1865, in Paris. This organization is a direct ancestor of today's International Telecommunication Union (ITU), which remains central to the creation of many communication standards. Since that time, many different types of organizations have evolved to develop industry standards, including:
• Large, intergovernmental or treaty-based organizations, such as the ITU
• Large, nongovernmental international organizations based on national membership, including the International Organization for Standardization (ISO) and the International Electrotechnical
Commission (IEC). Members of these organizations are the designated National Standards Organizations for each country
• Regional Standards Organizations, such as the Alliance for Telecommunications Industry Solutions (ATIS) and the European Telecommunications Standards Institute (ETSI)
• Large international organizations based on individual participation rather than corporate or government membership, such as the Institute of Electrical and Electronics Engineers (IEEE) and the Internet Engineering Task Force (IETF)
• Technology-specific industry forums, such as the Metro Ethernet Forum (MEF) and the Optical Internetworking Forum (OIF)
Each of these organizations has its own organizational structure, rules and procedures, methods for developing and approving standards documents, and culture. While it isn't practical to describe each of the hundreds of standards organizations worldwide, we will go into more depth on the ITU, whose standards are found throughout the other chapters of this book. We will also describe the typical operation of a technology-specific industry forum.
20.2.
THE INTERNATIONAL TELECOMMUNICATION UNION (ITU)
From its inception as the International Telegraph Union in 1865, the ITU has gone through numerous evolutionary steps in its structure and procedures. After World War II, the ITU was incorporated as a United Nations Specialized Agency on October 15, 1947. As such, the high-level structure and working methods of the ITU are established by the Member States—essentially the governments of the United Nations member countries.
20.2.1 Hierarchy The current high-level structure of the ITU (see Figure 20-1) has remained constant since the 1992 Plenipotentiary Conference in Geneva.
Figure 20-1. ITU Organization chart (Governance: Plenipotentiary Conferences, ITU Council, and World Conferences on International Telecommunications; Membership: Study Groups; Support: General Secretariat, headed by the Secretary General and Deputy Secretary General, with the Sector Bureaus such as the Radiocommunication Bureau)
The top level, indicated as Governance in Figure 20-1, is the responsibility of the Member States, currently comprising 189 member countries. Industry participants may sometimes be involved, either as members of their official national delegations or, in limited cases, as nonvoting observers in this process. The quadrennial Plenipotentiary Conference is the top-level policymaking body of the ITU. This group sets the Union's general policies, adopts a five-year strategic plan, appoints the members of the ITU Council (which considers broad policy issues between Plenipotentiary Conferences), appoints members of the Radio Regulations Board (RRB), and appoints the senior leadership for the General Secretariat (Secretary General, Deputy Secretary General) and the Bureau Directors for each of the three Sectors. The General Secretariat (shown at the bottom level of Figure 20-1) provides support for all the operations of the ITU, including publications, translation, meeting support, etc. Members of the General Secretariat are employees who are subject to the United Nations personnel rules. The senior leadership, including the Secretary General, Deputy Secretary General, and the Bureau Directors for each of the Sectors (see below), are elected at the Plenipotentiary Conferences.
The activities of the general membership are indicated in the center. The work of the ITU is organized into three Sectors: • The ITU-R, or Radiocommunication Sector, is responsible for ensuring the rational, equitable, efficient, and economical use of the radio-frequency spectrum by all radiocommunication services, including those using the geostationary-satellite or other satellite orbits. • The ITU-T, or Telecommunication Standardization Sector, is responsible for the standardization of telecommunications on a worldwide basis. This Sector contains most of the activities of concern in this book. • The ITU-D, or Telecommunication Development Sector, is responsible for sponsoring activities to move toward universal telecommunications access in developing countries. These activities are intended to help bridge the "Digital Divide" between developed and developing countries. Each Sector holds a conference or assembly, generally quadrennially, to establish the structure, work program, and work methods for the next interval of studies. For the respective Sectors, these are the Radiocommunication Assembly (RA) for ITU-R, the World Telecommunication Standardization Assembly (WTSA) for ITU-T, and the World Telecommunication Development Conference (WTDC) for ITU-D. These conferences or assemblies are responsible for appointing the senior leadership for the advisory groups and Study Groups within each Sector. The ITU-R also holds a World Radiocommunication Conference (WRC) every 3 or 4 years to address spectrum and regulatory matters. Like the Plenipotentiary Conference, the WRC is a treaty conference. ITU-R has a unique body called the Radio Regulations Board, appointed by the ITU Plenipotentiary Conference described above. This body interprets the regulations and adjudicates disputes involving spectrum matters so that interference between different radiocommunication services can be avoided.
Each Sector has a Bureau within the General Secretariat that supports its activities. These Bureaus include the Radiocommunication Bureau (BR) for ITU-R, the Telecommunication Standardization Bureau (TSB) for ITU-T, and the Telecommunication Development Bureau (BDT) for ITU-D. [Author's Note: BR and BDT are the correct acronyms, as they come from the French translation, which is the official UN language.] Each Sector has an advisory group that is responsible for making adjustments to the structure, work program, and work methods and procedures between the quadrennial conferences or assemblies. These advisory groups are the Radiocommunication Advisory Group (RAG) for ITU-R, the Telecommunication Standardization Advisory Group (TSAG) for
ITU-T, and the Telecommunication Development Advisory Group (TDAG) for ITU-D. Each Sector has several Study Groups that are responsible for conducting specific studies in specific topic areas within the Sector. For standardization, the ITU-T currently consists of 13 Study Groups. A typical Study Group structure is shown in Figure 20-2.
Figure 20-2. Typical Study Group structure

Each Study Group has a Chairman and one or more Vice-Chairmen, and is responsible for a set of study areas called Questions. Study Groups are normally organized by collecting groups of related Questions into a set of Working Parties. Each Working Party has a chairman, who may be selected (but not always) from among the Vice-Chairmen of the Study Group. The work on each Question is led by a Rapporteur. Each Question is responsible for developing one or more standards, called Recommendations. Each Recommendation may have a responsible Editor. The chairman and vice-chairmen of a Study Group are appointed by the conference or assembly of the relevant Sector.
Working Party chairmen are proposed by the management team, consisting of the chairmen and vice-chairmen, and are approved by the Study Group, generally by acclamation. Rapporteurs are proposed by the Working Party or Study Group management and are approved by the Working Party or Study Group. Editors are generally selected by the Rapporteurs of the Question in which the Recommendation they are editing is being developed. Editors are generally approved by the relevant Working Party or Study Group.
20.2.2 Membership The work of the ITU is carried out by its membership. Each Sector (ITU-T, ITU-R, ITU-D) has its own independent membership. There are two general classes of membership in the ITU-T: • Member States are the 189 member countries of the ITU. Member States have ultimate decision-making authority within the ITU. Member States are, by right, members of all three Sectors of the ITU. • There are several classifications of Sector Members, although all have the same rights of participation. At present, more than 600 organizations are members of one or more Sectors of the ITU. The most common categories of Sector Members are Recognized Operating Agencies (ROAs), which are organizations that operate a network to provide services, and Scientific and Industrial Organizations (SIOs), which encompass most other organizations, including equipment and component manufacturers. Many other organizations, such as the IEC, ISO, IEEE, and ISOC (the Internet Society, which is the governing body of the IETF), are members of one or more Sectors of the ITU in the Sector Member category of Regional and Other International Organizations. Associates are organizations that have elected to participate in the work of only a single Study Group for a reduced fee. They can make contributions to the work, attend meetings, and participate in discussions of that single Study Group, but Associates are not members of the Sector.
20.2.3 Standards Development
20.2.3.1 Meetings Each Study Group has several full meetings across the four-year Study Period. For most Study Groups within the ITU-T, the interval between full Study Group meetings is typically 8 or 9 months. Most Study Group meetings are held in Geneva, Switzerland, although occasionally some Study
Group meetings are hosted elsewhere. The decision-making (plenary) portions of the meeting are generally conducted with simultaneous translation into as many as six official languages of the Union (English, French, Spanish, Chinese, Russian, and Arabic). Each full Study Group meeting generally begins with a short Study Group opening session, followed by parallel opening sessions for each of the Working Parties, if warranted. Throughout most of the meeting time, parallel meetings of the Questions occur, which is where most of the consideration of contributions and the drafting of text takes place. At the end of the meeting, there are Working Party (for Study Groups with Working Parties) and Study Group closing sessions in which decisions are taken. These decisions include putting a completed Recommendation text (standard) forward into the approval process, sending liaison statements (communications) to other organizations, and authorizing interim meetings. Several other types of meetings can be held. A stand-alone Working Party meeting can be held between full meetings of the Study Group. Stand-alone Working Party meetings have some decision-making authority. Much more common are interim meetings of one or more Rapporteur Groups. Most of these meetings are held outside Geneva, hosted by a Sector Member. Interim meetings of this sort have no decision-making authority, but they are generally an opportunity to work very productively with a smaller group of people to develop text to be submitted to the next Study Group or Working Party meeting for consideration.
20.2.3.2 Contributions and Other Input Documents Contributions into a Study Group or Working Party meeting are quite formal. There are two classes of contributions. Normal or "White" contributions (the name originates from the fact that these were normally printed on white paper) must be received by the ITU two months before the meeting at which they will be considered.
Normal contributions may be sent via postal mail to the chairman, vice-chairmen, Member States, and Sector Members prior to the meeting. Some normal contributions are translated into multiple languages, generally in the case of draft Recommendation text that has some regulatory or policy aspect and is expected to be put into the approval process at the next meeting. Many members currently elect not to receive paper copies in the postal mail but rather to access the documents electronically as they are posted on the web. Delayed contributions must be received by the ITU at least seven working days prior to the meeting at which they are to be considered. Paper copies of Delayed contributions are distributed to delegates requesting them at the meeting via their "pigeon holes" (numbered mailboxes that are
provided for each meeting participant). Delayed contributions are also available electronically on the web more than one week prior to the meeting. Another type of meeting document is a Temporary Document. This name is somewhat dated, as these days a permanent archive is maintained of all such documents. Temporary documents include incoming liaison statements from other groups, reports of interim meetings, draft Recommendation text that may have resulted from an interim meeting or from the Editor having applied the agreed changes from the previous meeting, agendas from the leadership, etc. Documents produced during a meeting are generally issued as temporary documents. Temporary documents are available electronically on the web, or for paper distribution to the pigeon holes, according to the preference of each delegate. Interim meetings of one or more Rapporteur Groups do not have as formal a process for contributions. Generally, all documents considered or produced at these interim meetings are simply referred to as Working Documents and are posted on an FTP site that members can access. Each group may establish its own rules for contributions. Many groups have a custom that contributions into interim meetings should be available one week prior to the meeting, but this is not a universal ITU policy.
20.2.3.3 Approval of Recommendations Unlike most other standards groups, the ITU-T operates by trying to achieve the broadest possible consensus for the Recommendations it produces. Unopposed agreement is always the goal, although in extreme cases the rules permit approving a Recommendation with a small amount of dissent. Because the bar for approving a document is set high, the Recommendations produced by the ITU command the utmost respect. There are two mechanisms used within the ITU-T to approve Recommendations. The Traditional Approval Process (TAP) was used for all Recommendations until the WTSA in 2000.
This process is still used as the method to approve Recommendations with regulatory or policy implications, which comprise only about 5% of the Recommendations produced today. To approve a Recommendation under TAP, when the participants determine that the text is sufficiently mature and ready to begin the approval process, a Study Group or Working Party meeting Determines the text. The text is then translated into the six official languages and distributed in advance of the next full meeting of the Study Group, where the text can be Approved, becoming a Recommendation In Force. At the approval meeting, there is the opportunity to discuss and correct any defects that might have been identified by contributions to the meeting. The objective is to approve a Recommendation without opposition. If there is a
Sector Member who objects, it is still possible to approve a Recommendation so long as no Member State present in the meeting objects. The process that is now used for about 95% of Recommendations (those that do not have policy or regulatory implications) is called the Alternative Approval Process (AAP). When participants at a Study Group or Working Party meeting decide that text is ready to be put forward for approval under this process, they make a decision that the text is ready for Consent. Text that reaches this stage is posted for a four-week Last Call period during which Member States, Sector Members, or Associates (of the particular Study Group) may submit comments. If, at the conclusion of the four-week interval, there are no comments (or only comments indicating typographical corrections), the text is approved. If substantive comments are received, the text enters a Comment Resolution phase. Generally this is conducted under the direction of the Rapporteur via email correspondence. If a satisfactory resolution of the comments can be found, the text can then be posted for a three-week Additional Review period. If, at the conclusion of the three-week interval, there are no additional comments (or only comments indicating typographical corrections), the text is approved. If substantive comments are received, the text is sent to the next full Study Group meeting, where there is another opportunity to consider contributions and conduct additional discussions to try to find an agreeable way to fix the problems that have been identified. Assuming that consensus can be achieved, the Recommendation can be approved at this Study Group meeting. The fact that the ITU looks for unopposed agreement to approve Recommendations results in a unique working culture within the organization.
Participants understand that in order to reach unopposed agreement on an idea they contribute, they will have to work to understand the views of participants across industry and government around the globe. It is unusual to bring something to the ITU and have it standardized exactly as proposed. Therefore, contributors come in with ideas very early (at the paper design stage), when they are still ready and able to accommodate other views on the substance of the idea they submitted in order to produce a standard. This process also results in an unusually high degree of civility and polite behavior among the delegates. Even in a room of those who may be fierce competitors in the marketplace, ultimately the spirit of cooperation has made the ITU exceptionally successful as a producer of global standards.
20.3.
TECHNOLOGY-SPECIFIC INDUSTRY FORUMS
20.3.1 Message An organization is recognized as a success when its efforts to incorporate a diverse mix of members bring about respect across the various groups working toward a common goal. This outcome is fairly easy to see within a single-tiered organization, such as among the marketing, engineering, and quality departments of a product manufacturer. But bring this model of success to a consortium, a group chartered with educating an industry on the benefits of an emerging technology, and the inner workings of such an organization take on a new shape beyond the familiar. Though not entirely different from what takes place within a company, the consortium environment takes those familiar inner workings to new and expanded levels. The ever-popular question in the late twentieth and early twenty-first centuries has been how to take a seemingly smart business solution or technology beyond the hype and make it a global market success. To make this happen, consortia have evolved and formed an entirely new dynamic, an extension of the traditional corporate technology world. As in any hierarchical structure, you find positives and you find struggles. However, the ultimate goal of the representatives from the member companies within a consortium is to further the development of standards toward a common definition that will ultimately advance the industry. The goal of this section is to provide a practical explanation of how a consortium operates to implement standards and to provide insights into some common and not-so-common expectations for success. If we look at the process of standards development, it's easy to see that it isn't complete without considering the manpower that goes into these endeavors. After all, technology can exist in an unrefined format without limitations; it is standards that shape the use of a technology.
Successfully implementing standards that benefit an industry and, ultimately, the way we communicate requires individuals with dedication and passion, willing to work with competitors or even naysayers to further the universal cause. In general, these people are volunteers, taking time from their 50-plus-hour-per-week jobs to help the industry, their company, and, it goes without saying, their own careers. Tracing the human aspect of such technology-focused groups, one can see the effects of standards activities in the camaraderie and conflict, teamwork and politics that evolve from the ongoing efforts to move motions forward, agree on processes, and ultimately pass implementation agreements.
20.3.2 What Is Involved? Election/Hierarchy As previously mentioned, standards development groups have organizational management structures similar to a typical business, including a board of directors, a president, and other leading officers who help the organization accomplish its tasks. A typical management structure of a forum or standards body is depicted in Figure 20-3. Industry forums have created democracies in which individuals are elected to positions where their job is to ensure that members have a voice in the process. Maintaining the constituencies' happiness is just one of the challenges elected representatives take on in their role.
Figure 20-3. Organizational chart (Board of Directors; President/Chairman of the Board; Committee Chairperson(s) supported by Officers and an administrative team; Subcommittee Chairpersons supported by Editors, active committee members, and an administrative team)
The following is an example of how a forum might structure its leadership, but this example is not exclusive to any one organization. In a typical voting process, directors are nominated by members and then voted upon by the member companies, with each company receiving one vote. Elections are held once a year, and the directors hold their positions for a 12-month term. The exception is when an elected official at some time during his or her term is no longer with his or her sponsoring company. In some cases in the industry, companies have shut their doors due to a decline in the economy or for other reasons, whereby they are no longer a member of the forum. When this happens, an elected forum director/official who had been working for one of these defunct companies is removed from the post, and
an interim director/official is nominated to fill that seat until proper elections can take place. Finally, each director can be renominated at the end of the term to run again for the position. This hierarchy sets the framework for the process of standards work and implementation. Technical standards and agreements are suggested and brought forth for nomination. Upon acceptance and submission of a specification, the organization as a whole votes on its acceptance, makes suggestions for further draft development, or, when near final agreement, votes to take the document to the ballot phases. As letter ballots are circulated, the process is typically interactive, seeking comments on draft proposals from members within the committees. The next step after gathering comments is to address and resolve each and every comment. This process may result in additional comments, and so on, until the implementation agreement addresses all the concerns. This description may help those outside the consortium understand why the standards process is a time-intensive procedure. At the center of managing this resolution process and moving motions forward is the editor of the initiative. The editor's role in creating the defined implementation agreement is to act as impartially as possible in resolving the discussions that arise during the review process. Throughout this process, the editor, on the basis of the charter of the working group and the scope of the implementation agreement, needs to keep the group focused on reaching the consensus needed to move the agreement forward. In cases where an item is deadlocked within the review process, the committee and those involved in the review must work to find a compromise. It is at times like these that the true neutral and cooperative nature of a consortium comes to light as the most important trait for achieving these goals. Only through collective agreement and cooperation can a deadlock be resolved.
One common misconception within the industry, including among media and analysts, is that technology development can come to common ground almost overnight. The fact is that, as mentioned, the men and women behind specification development are volunteers, and as with most consortia, the technical groups often meet on a quarterly cycle, with details going into the specifications during these cycles. At quarterly meetings, the technical group (the group responsible for specification development, made up of subgroups focused on specific capabilities within the technology, such as service-level agreements (SLAs) and performance monitoring) discusses, debates, and determines the movement of specifications over 3- to 4-day meeting sessions. It is within the scope of these quarterly meetings that the review process of agreements and standards movement is voted upon. Through these time frames, members collaborate, disagree, and come to focus on the common goal of furthering the technology at hand. Moreover,
countless hours are spent between meetings by each working group member in conference calls and through email exploders, discussing new ideas, debating proposals, and subsequently coming to some resolution on issues. The full process of creating these standards can in certain scenarios take 12-18 months, or even longer. The subelements within these standards can be further developed as an ongoing process. When a technology has matured to where the industry accepts the standards and incorporates them into products, networks, and documentation, a consortium realizes that its goal has been achieved and disbands. But history shows that the evolution of technology perpetuates further developments based on new, more advanced technology, which seems to bring the same players back to the group to advance the industry and the technology once again.
20.3.3 The History Behind Standards Groups: Why join?
The telecommunications industry has evolved to the point where global markets exist only through support of global standards. Innovations in technology have drawn companies to a common interest, and it is standards that provide the basis for cooperation among industry vendors; that is, standards create the means to ensure interoperability among products. The technology boom of the late 1980s and early 1990s led industry vendors to form consortia as environments for rapid standards creation. The idea was to have an alternative to the existing standards development organizations, whose processes were thought to be too slow and cumbersome to facilitate timely delivery of technology standards (or products to market). In part for this reason, industry forum organizations were created to further the cooperative use of technology across the communications industry. Early standards consortia were very simple, functioning with almost no formal processes in place. Today, standards consortia are evolving into sophisticated nonprofit corporations with well-defined legal structures and the formal policies and procedures necessary to ensure responsible operation as well as success. It can be said that the underlying reason telecommunications product vendors join a forum is to ensure that their company's goals are part of the standards development process. Another main reason for vendor companies to become involved in standards development organizations is to ensure that the products they are creating interoperate with others throughout the industry. Customers and carriers today require that products interoperate for the sake of advancing technology, and equipment manufacturers across the industry cooperate in this effort. The employees of member companies who become representatives to the forum do their best to see that
Chapter 20
the direction of standards development closely follows the capabilities and specifications built into their company's existing products. This is not the case for every company representative; many instead bring their technological expertise and contribute in a neutral manner. In either case, the forum can bring a member company further validation and credibility for its efforts to advance the technology of interest. In the evolving telecommunications industry, many industry forums have seen the opportunity to work together and complement one another. Typically, the representative members of a consortium benefit from the many partnerships and networking opportunities offered when multiple companies come together. Newly uncovered possibilities, strategic alliances, and simple knowledge sharing are almost always a plus when a company joins such a group.
20.3.4 Membership
Membership is a key element in the accomplishments of a consortium. As a consortium grows and becomes well established, companies can be lured to join in order to reap the benefits of the consortium's influence. There are also companies that may reconsider membership renewal after participating for a term or two, basing this reconsideration on the direction of the group: Is the group moving forward? What is the cost of dues, and is it affordable? How much time is the employee putting into this that might otherwise be put into his or her day job? What effects of success (or delays) is the consortium creating for the industry? In short, members who choose to be involved by putting forth time and effort can become influencers and only gain from these efforts. These are the realities of a consortium. As with any organization where there is a constituency and more than two heads running the show, things can go very right and very wrong. The reality of a consortium is that the volunteers who have come together for a common goal can sometimes arrive with their own agendas, which may inadvertently rise above the goal of the group. Again, nothing in life is perfect, but when companies join together in a seemingly neutral fashion, there are often leaders who emerge, whether due to the strength of their company or their personal technological knowledge, and divisions that evolve, exposing one-sided agendas. It is the smart member who realizes this and knows when to speak up and when to take a detour in a matter. A tip to the naive is to ask oneself: Will this issue matter in six months? Let that be the gauge of its importance. Another reality that member companies should realize in working with a consortium is that a well-operated group will have a strong marketing outreach plan. And any member company wanting to validate its
involvement, even if that involvement is largely from a technology perspective, should try to leverage that outreach. The branding opportunities that go along with involvement in an industry standards group can validate a member company's place in the market. With that in mind, the efforts of a consortium to evangelize its story to the industry span an array of programs on both the technical and the marketing side. Regarding the latter, press and analyst outreach, speaking engagements at industry events, white papers, and more make up a well-rounded marketing plan. These plans are generally drawn up by an internal marketing committee, which is often supported by an outside agency. In choosing the events in which the organization participates, audience is a key consideration, as are the location and size of the event. The next consideration is cost; ideally, exposure without cost is the best form of evangelism. That said, there are a variety of avenues to consider: industry conferences, trade shows (oftentimes cohosted with conferences), webinars (often hosted in conjunction with a leading media outlet), and educational seminars or tutorials. Months of planning go into the organization of a full-scale event such as a full-day tutorial or an interoperability product demonstration. The planning typically begins with interest from the marketing team in being involved with an industry event or in tying the event to a campaign initiative of the organization. The next step is to collaborate with the technical team; it is their support, from a demonstration and content perspective, that makes the event a success. All in all, planning for a large-scale event truly brings out the team effort of a consortium. Committee members should be prepared to compromise and to work beyond their typical roles to make such events a success. The payoff is well recognized.
20.3.5 Reality of human nature
Marketing and engineering are two groups within any organization between which camaraderie is hard to guarantee. This is true in industry consortia as well. History has shown that the two groups can work civilly together but also face challenges along the way. The aggressive nature of a marketing group, which wants to communicate success as soon as possible (and sometimes even sooner), and the naturally conservative nature of engineers, who want to assure full functionality even weeks after testing, can cause friction between decision makers within a consortium. Reaching a compromise proves to be the key skill in situations like this, and usually it is the marketing group that concedes a bit more than the engineering team, usually with good reason.
However, it is typical in these situations for individual agendas to surface. Power plays associated with passing a particular standards agreement, or with having specifications passed in a timely manner, can hurt the development of the standard, and resistance can come from adversaries who hold higher positions within the consortium. It is not unthinkable that human nature might lead one representative to delay a specification suggested by a competitor. In such a scenario, the members may split between those who agree that the delay is legitimate and those who think the person suggesting it is creating a roadblock without much recourse. Human nature is a piece of the organizational structure, but hope exists for finding positive experiences that can defeat these issues. And that one word of hope is teamwork.
20.3.6 Teamwork
Without a doubt, teamwork is the most positive experience that comes from the relationships created within an industry forum. The relationships built between colleagues, and the achievements a group can make in advancing communications, are significant. It is teamwork that is the greatest conduit to advancing the telecommunications industry. When colleagues from competitor companies can put aside their rivalries to put forward a specification agreement together, or when a colleague loses a job and those within the industry forum do all they can to get that colleague interviews or opportunities to stay involved, therein lie the signs of the positive human nature that can come from an industry forum. All this teamwork is further facilitated by the people who are most involved with the support and success of the consortium: the administrative team. In most cases today, industry forums are supported by an outsourced administration group. The delicate work of operating a consortium as a nonprofit entity and keeping committees, leaders, and officers focused and on course for success is often driven by the guidance of the administrative team. Behind the scenes, the administrative team brings together all the details for quarterly meetings, for the conference calls that take place in between, and for the logistics of it all, while assuring the committee members of its neutral standing and promoting a clear understanding of the goals ahead. Some things rise above the control of the administrative team, such as four-hour implementation agreement meetings that run well into the late hours of the night. But the administrative team is on hand to document the direction and take the minutes of meetings so that members have full disclosure of the proceedings. The members look to the administrative team as a neutral body, which it is. And often, the
administrative team is praised for its efforts, without which, it seems at times, the organization would not function.
20.4. CONCLUSION
Standards development organizations and industry forums are no more immune than a typical vendor company to the usual challenges of personality, views, achievements, and goals. Network operators, system vendors, device manufacturers, and governments may all bring their own unique perspectives to the table. Member organizations may be marketplace competitors or may be in customer/supplier relationships with other member organizations. Member organizations are motivated to come together to reach common objectives in order to create and develop new markets whose evolution depends on the interoperability of the systems produced by different vendors and the interconnection of the networks of different operators. It is this common goal that drives technology forward and creates demand for products and services in the market. High-quality standards reduce the cost of deploying new technologies by preventing interoperability problems that might otherwise occur. Managing the diversity of the representatives in this type of organization is an ever-present dynamic. The officers, the administrative team, and other key members with leadership positions or responsibilities must exercise neutrality and promote consensus. Any perception of bias among the leadership, or of unfairness in the process, can damage the confidence and support of the members that is so important for a successful standard.
INDEX
+1/0/-1 byte justification, 202 1:1 cold standby protection, 511 1:1 hot standby protection, 511 (1:1)n protection, 300, 302f 1+1 Optical Subnetwork Connection (OSNC) protection, 313, 314f 1+1 protection, 298, 298f, 499 in MEF, 511 1:1 protection, 500 1:n protection, 298-299, 299f, 500 1 x 9, 716 1 x 9 transceivers, 716 +2/+1/0/-1 justification, 202 2.5 Gbit/s interface, 714-715 2.5 Gbit/s interface standards, 760-761 receiver specifications in, 760, 762t summary of, 760, 761t transmitter specifications in, 760, 761t 2.5 Gbit/s technology and systems, 668-672 SPI-3 signal descriptions in, 669-672 receive direction, clock and data signals, 671 receive direction, discrete control/status signals, 671-672 transmit direction, clock and data signals, 669 transmit direction, discrete control/status signals, 669-671 typical usage of, 669, 669f 3R points, 65, 65f 4 x ODU1 to ODU2 justification structure, 109, 109f, 109t 4 x ODU1 to ODU2 multiplexing, 107-112. See also ODUk multiplexing 4 x ODU1 to ODU2 justification structure in, 109, 109f, 109t frequency justification in, 111-112, 112t OPU2 multiplex structure identifier (MSI) in, 110-111, 110f, 111f OPU2 payload structure identifier (PSI) in, 110 structure in, 107, 108f
8B/10B client processing, in GFP-T, 169 10 Gbit/s technology and systems, 672-677, 721-728 discrete solutions in, 721-722 integrated solutions in, 722-728 200-pin transponder in, 724-725 300-pin transponder in, 722-724, 723f X2, XGP, and XPAK in, 726, 727f XENPAK in, 725-726, 725f XFP in, 727-728, 728f Media-Independent Interface (XGMII) in, 759 SPI-4 Phase 1 (OC-192 System Packet Interface) in, 674-677, 675f System Framer Interface-4 Phase 1 (SFI-4 Phase 1) in, 672-674, 673f 40 Gbit/s technology and systems, 681-688, 728 background on, 681 SERDES Framer Interface-5 (SFI-5) in, 682-685 signals, receive direction, 682-685 signals, transmit direction, 682-685 SPI-5 (OC-768 System Packet Interface) in, 685-687, 685f TFI-5 (TDM Fabric to Framer Interface) in, 687-688, 688f 64B/65B code blocks, adaptation into GFP superblocks of, 169f, 170 140 Mbit/s to 2.5 Gbit/s technology, 713-721 2.5 Gbit/s, 714-715 140 Mbit/s to 622 Mbit/s, 713-714 integrated solutions for, 716-721 1 x 9 device (transceivers) in, 716 2.5 Gbit/s transponders in, 720-721 GigaBit Interface Converter (GBIC) in, 719-720, 721f initial versions of, 716 Small Form-Factor Pluggable (SFP) devices in, 718-719, 719f
Small Form Factor (SFF) devices in, 717-718, 717f 140 Mbit/s to 622 Mbit/s optical interfaces, 713-714 200-pin transponder, 724-725 300-pin transponder, 722-724, 723f A Access group, 31, 378 in MEN, 335 Access link, in MEN, 340, 341f Access point (AP), 31 in MEN, 336 Adaptation function, 29-30, 30f, 32, 32f in Ethernet connectivity services, 396 expansion of, 35, 35f in MEN, 336 Adapted information (AI), 51, 52f Add Drop Multiplexing (ADM), 3, 40, 41f transport function implemented on, 41f, 43-44, 44f, 45f Additional Review, 775 Address and addressing, 589 in ASTN, 568-569 client, in GMPLS RSVP-TE (G.7713.2) signaling, 648 Address Prefix FEC, 451 ADM. See Add Drop Multiplexing (ADM) Administrative Unit (AU), 121 Administrative Unit Group (AUG), in SDH frame, 121, 121f Admission control, in ASTN, 569-570 Advanced Telecom Computing Architecture (ATCA) backplane, 760 Agere Systems, TSWC01622 and, 272-273, 274f Aggregated line and node protection (ALNP) in MEF protection, 508-509, 508f in MEN, 522 Aggregation DSLAM uplink in, 394, 394f in Ethernet connectivity services, 392, 394 flow definition and, 378 multiple customer flow, 393, 393f single-customer flow, 392, 393f Alarm, 55 Alarm, in MEN CES, 487-489 for buffer underflow and overflow, 488-489 for structured service, 488 for unstructured service, 488 Alarm/status monitoring, 54 Alignment jitter, 195, 196-198, 198f
All Optical Networks (AON), standards for, 700 All-to-one bundling map example of use of, 350, 351f feature constraints in, 355 in layer 2 control protocols, 350-351 private line and, 351 Alternative Approval Process (AAP), 775 Anchored components, 585 Anchored objects, 585 Anomaly, 55 check to identify defects in, 55-56 AON (All Optical Networks), standards for, 700 Application code terminology, related to distance, 709, 710t Application protection constraint policy (APCP), 516 Application services layer (APP layer), 329 in MEN, 329 Application Specific Integrated Circuits (ASICs), 3 APP link, in MEN, 340, 341f Approved, 774 Architecture of automated discovery (G.7714), 606-607, 606f of Automatically Switched Transport Network (ASTN), 551-655 (See also Automatically Switched Transport Network (ASTN) architecture) component, in G.8080, 575 control plane (G.8080), 571-595 (See also Control plane architecture (G.8080)) for EPL services, layered, 397, 399f of Ethernet services, 7-10 of high-speed serial interconnects, 737-742 Printed Circuit Board (PCB) interconnects in, 738-742 (See also Printed Circuit Board (PCB) interconnects) topologies in, 737-738 Network Element (NE), 268-279, 664-668 (See also Network Element (NE) architecture) hybrid (TDM + cell/packet based), 667-668 of Optical Transport Network (OTN), 23 of protection switching, 297-303 (See also Protection switching, architectures of)
restoration, in G.8080, 593-595, 594f ring protection, 302-303, 303f Signaling Communications Network (SCN) (G.7712), 595-603 (See also Signaling Communications Network (SCN) architecture (G.7712)) synchronization, for SONET/SDH, 257-293 (See also Synchronization architectures, for SONET/SDH) of transport networks, 17-61 introduction to, 17-18 transport functional modeling in, 18-61 (See also Functional modeling, transport) ASICs. See Application Specific Integrated Circuits (ASICs) ASON (Automatic Switched Optical Networks), 10-11 routing in (See Routing, in ASON (G.7715 and G.7715.1)) Assignment, manual, synchronization status messages (SSM) in, 245 ASTN. See Automatically Switched Transport Network (ASTN) Asynchronous interworking function (IWF) asynchronous tributaries and, 477-478 in MEN CES, 477-478 synchronous tributaries and, 478 in MEN CES, 478 Asynchronous mapping, 200-203 Asynchronous network operation, 238-239, 239t Asynchronous-transparent mapped mode, in GFP, 165f, 168-169 ATCA backplane, 760 ATM Adaptation Layer, Type 2 (AAL-2), 156-157 ATM protocol data units, 156 Atomic function information points, 51, 52f Atomic function inputs and outputs, 51, 52f Attachment circuit (AC), in L2VPNs, 428 Autodiscovery. See Discovery, automated (G.7714) Automated discovery. See Discovery, automated (G.7714) Automatically Switched Transport Network (ASTN) architecture, 551-655 background on, 551-553, 553f control plane architecture (G.8080) in, 571-595 (See also Control plane architecture (G.8080)) control plane management in, 633-635, 636f
control plane technology in, 552-553, 553f discovery (G.7714) in, 604-611 (See also Discovery, automated (G.7714)) methods and protocols for, 640-643 (See also Discovery, automated (G.7714)) future of, 653-655 network requirements (G.807) in, 553-571 admission control in, 569-570 architectural context in, 554-555 architecture principles in, 564-567, 565f background on, 553-554 business and operational aspects in, 559-562 call and connection control in, 555-559, 556f, 558f connection management in, 567-568 naming and addressing in, 568-569 reference points and domains in, 562-564 routing and signaling in, 568 for signaling communications network, 570 support for transport network survivability in, 571 supporting functions and requirements in, 567-570 transport resource management in, 569 protocol analysis in, 637-640 approach to, 637-639, 638t requirements implications on protocol solutions in, 639-640 routing in (G.7715 and G.7715.1), 611-626 (See also Routing, in ASON (G.7715 and G.7715.1)) methods and protocols in, 651-652 service activation process elements in, 603-604 in signaling communications network (G.7712), 595-603 (See also Signaling Communications Network (SCN) architecture (G.7712)) mechanisms of, 652-653, 653f signaling (G.7713) in, 626-633 (See also Signaling (G.7713), in ASON) methods and protocols for, 643-651 (See also Signaling (G.7713), in ASON, methods and protocols for) Automatic Protection Switching (APS) channel, 303 Automatic Protection Switching (APS) protocol, 310-312
APS signal in, 310-311 external commands in, 311 priority in, 312 process states in, 311 in protection switching, 304-305 Automatic Switched Optical Networks (ASON), 10-11 routing in (See Routing, in ASON (G.7715 and G.7715.1)) Availability. See Storage Area Networks (SANs), SONET for; Storage networking Avalanche Photo-Diode (APD), 714 B Backplane Ethernet, 760, 762 Backplane interconnect, 736 Backplane topologies, 737-738 Backup. See Storage Area Networks (SANs); Storage networking Backward compatibility requirements, in MEN protection, 519 Backward Defect Indication (BDI), 92-93 in OTUk, 102-103 Backward error indication and backward incoming alignment error (BEI/BIAE) in ODUk overhead and processing, 93, 94t in OTUk, 103, 103t in tandem connection monitoring (TCM), 96, 96t Bands, wavelength, 705, 705t Bandwidth allocation, at 100 kbits/s granularity, for T-line over MEN, 462 Bandwidth broker, 560 Bandwidth evolution, in multiplex structures, 127-129, 127f-130f Bandwidth profiles application of, 363-365 Ethernet Connection (EC) attributes and, 386 in Ethernet services over MEN, 359-365 algorithm of, 360-362, 360f, 361f application of, 363-365, 364f configuring Customer Edge (CE) and, 365 disposition of service frames in, 362, 362t graphical depiction of, 360, 360t parameters of, 360-362, 362, 362t in Ethernet services over public WAN, Class of Service Identifier in, 364, 364f green, 361-362, 361f
in MEN, 525-526 red, 361-362, 361f yellow, 361-362, 361f Bandwidth provisioning, for T-line over MEN, 461-462, 463f bandwidth allocation at 100 kbits/s granularity in, 462 Ethernet multiplexing in, 462, 463f TDM multiplexing in, 462 Bathtub curve analysis of jitter, 746-748, 747f Beginning-Of-Life (BOL), 711 Bell System, frequency traceability and, 266 Bidirectional Line Switching Redundancy (BLSR), 509 Bidirectional switching, 304 in MEN protection, 519 BIP-8. See Bit Interleaved Parity (BIP-8) Bit errors, in MEN CES, 490 Bit Interleaved Parity (BIP-8), 92 in G.709 overhead bytes, 88, 88f in OTUk, 83f, 101 Bit recovery, in synchronization architectures for SONET/SDH, 258 Bit-synchronous mapping, 201 Black-box method, 699 Black-link method, 699 Bridged-source timing method DS1 and, 281-282, 281f in SONET/SDH, 281-282, 281f Broadcast service frame, 347-348 Buffer credit. See Storage Area Networks (SANs), SONET for; Storage networking Buffer underflow and overflow alarm, in MEN CES, 488-489 Building Integrated Timing Supply (BITS), 215, 260 Bundling, Ethernet client interfaces and, 404 Bundling Map, 354-355, 354f Business continuity. See Storage Area Networks (SANs); Storage networking C Call steady-state, 645 in UML, 586 Call/connection release, 629 Call control, in ASTN, 555-559, 556f, 558f Call disconnects, 604 Called parties, 560 Call identifier, 645
Calling/Called Party Call Controller (CCC), 581f, 582, 627, 627f Call objects, new, in GMPLS RSVP-TE (G.7713.2) signaling, 647 Call perspective, 562 Call requests, 604 Call segments, 564, 565f Call separation support, in GMPLS RSVP-TE (G.7713.2) signaling, 645-646 Call setup, 628 Capability, 624, 624t Capacity consumers, 33 Card-card communication, over backplanes, 736 Carriage of services, over transport networks, 7-10 Ethernet services architecture and definitions, 7-10 storage area services over SONET, 10 Carrier grade in fiber channel SANs, 537-538 of SONET/SDH infrastructure, 546 Carrier's carrier, 560 CBS. See Committed Burst Rate Size (CBS); Committed Burst Size (CBS) CE. See Customer Edge (CE) CE-bound IWF, 472 Centralized management systems, in G.8080, 571-573 CES. See Circuit Emulation Service (CES) CE VLAN ID, 348-350 for Ethernet services over MEN, 350 preservation of, 350, 350t, 352 CE VLAN ID/EVC map, 348-349, 349f broken, example of, 355, 355f CE-VLAN ID preservation, 350 CF. See Coupling Flag (CF) Channels, 694 Characteristic Information (CI), 51, 52f, 334, 378 Chip-chip interconnect, 736 Chip-to-chip communications standards, 661. See also Intra-Network Elements communications CIR. See Committed Information Rate (CIR) Circuit Emulation Service (CES) definition of, 457 over MEN (See Metro Ethernet Network Circuit Emulation Services (MEN CES)) PDH, 468t, 469, 470t
Circuit Emulation Service Interworking Function (CES IWF), 469 in MEN CES, 469 synchronization description of, 475-477, 475f, 475t Class of Service (CoS), 356-359, 385, 389 Drop Precedence (DP) and, 385, 399 frame delay performance objective, point-to-point, 358, 358f, 358t frame delay variation, point-to-point, 359, 359t frame loss performance, point to point, 359, 359t identifying, 356-357 Layer 2 control protocols in, 365-367 (See also Layer 2 control protocols) performance parameters in, 357 Class of Service (CoS) Identifier, 357, 525 bandwidth profile and, 364, 364f Client addressing. See also Address and addressing in GMPLS RSVP-TE (G.7713.2) signaling, 648 Client alignment, 30 Client data frames, GFP, 159, 160f Client-dependent procedures, GFP, 166-171 8B/10B client processing in GFP-T in, 169 adaptation of 64B/65B code blocks into GFP superblocks in, 169f, 170 asynchronous-transparent mapped mode in, 165f, 168-169 client signals supported by GFP and, 166, 166t frame-mapped mode (GFP-F) in, 167, 167f generating GFP-T 64B/65B codes in, 169-170, 170f transparent-mapped mode (GFP-T) in, 167-168, 168f Client encoding, 29 Client-independent procedures, GFP, 164-166, 165f frame multiplexing in, 166 GFP frame delineation in, 164-165, 165f link scrambler in, 166 Client interfaces, for Ethernet services over public WAN, 401-405 bandwidth profile in, 404 bundling in, 404 Layer 2 control protocol processing in, 404-405 multiplexed access in, 402, 403f
UNI service attributes, common in, 402, 402t UNI service attributes, service-dependent in, 402, 402t VLAN mapping in, 404 Client labeling, 29 Client management frames, GFP, 162 Client/server relationship, 31, 33 Client signal, supported by GFP, 166, 166t Client signal fail (CSF), GFP, 163, 163f Clock gapped, 202 regular, 202 Clock backup modes, in synchronization for SONET/SDH, 286-292, 289t. See also Holdover mode Clock fast mode, 292. See also Holdover mode Clock hierarchy, 249-250 Clock modes, SDH Equipment Clock (SEC) and ODUk Clock (ODC), 218-219 Clock noise, 207-208 Clock recovery, timing engine (TE) functions and, 270 CM. See Color Mode (CM) Coarse Wavelength Division Multiplexing (CWDM), standards for, 699 CO-CS. See Connection-oriented circuit-switched (CO-CS) network Coding gain, 70-73, 71f-73f measured via Eb/N0, 71, 72f measured via OSNR, 72-73, 73f measured via Q factor, 70-71, 71f, 72f Cold-pluggable device, 723 Colorblind, 362 Color Mode (CM), 360 Comment Resolution, 775 Committed Burst Rate Size (CBS), 8 Committed Burst Size (CBS), 391t bandwidth profiles and, 360 Committed Information Rate (CIR), 8, 360 Communications, intra-Network Element, 11-12 Compatibility longitudinal, 692 physical layer, 700-701 transverse, 693-694 physical layer, 701-703, 702f Compatibility, transverse vs. longitudinal, 700-712 application code terminology related to distance in, 709, 710t background on, 700
optical fiber types and recommendations for, 703-705, 705t optical interface recommendations for, 706-709, 707t, 708t optical interfaces in, 708, 708t physical layer, 700-703, 702f power budget design in optical path penalty modeling and verification in, 711-712 worst-case design approach to, 710-711 signal classes and, 706, 707t Compliance test methodology, for high-speed serial interconnects, 742-748 bathtub curve analysis of jitter in, 746-748, 747f eye mask in, 742-744, 743f jitter modeling conventions in, 744-746 (See also Jitter) Component architecture, in G.8080, 575 Computational viewpoint, 578-579 Conceptual data processing partitioning, within Network Element, 664-665, 664f Congruent topology, 600, 600f DCN, 600, 600f Connection, 31-32, 31f degenerate subnetwork, 33 end-to-end, 555 E-NNI, 630, 631t Ethernet, 381-386 in Ethernet Private LAN (EPLAN) service, 391, 391t in Ethernet Private Line (EPL) service, 387, 388t in Ethernet Virtual Private LAN (EVPLAN) service, 391, 392t in Ethernet Virtual Private Line service (EVPL), 388, 389t layer, in Printed Circuit Board (PCB) interconnects, 739-740, 740f link, 33, 34f local type of, 625 in MEN, 335 network, 33, 34f permanent, 555 selector, 44 soft permanent, 556, 556f subnetwork, 31, 33, 34f protection of, 307-308, 308f, 309f switched, 552, 555, 556f termination, 32 timing, Ultramapper™ vs. TSWC01622, 273, 274f
virtual, 426, 426f Connection control, in ASTN, 555-559, 556f, 558f Connection Controller (CC), 581, 581f, 628 Connection dimension model, 32-35, 34f Connection function, 30, 30f in MEN, 337 Connection Identifier Information Element (CIIE), 644 Connectionless layer networks, modeling, 60-61 Connectionless networks, 19 in G.8080, 575-576 Connectionless Trail, 335 Connection management, 28-29 in ASTN, 567-568 Connection-oriented circuit-switched (CO-CS) network, 392 Connection-oriented networks, 19 Connection-oriented packet-switched (CO-PS) network, 392, 394 emerging, 400 functional characteristics of, 396 necessary modifications to support, 398-400, 401t Connection perspective, 562 Connection Point (CP), 32 in MEN, 335 termination, 32 Connection separation support, in GMPLS RSVP-TE (G.7713.2) signaling, 645-646 Connection service, 552 Connection setup, 628 Connectivity monitoring, Ethernet Connection (EC) attributes and, 386 Connectivity restoration instant, in MEN, 502, 503f Connectivity restoration time, in MEN, 502, 503f Connectivity verification, discovery and, 605 Consent, 775 Consequent action, 55 Constant bit rate (CBR), 191 Consumer Edge VLAN Identifier (CE VLAN ID). See CE VLAN ID Containers, 120 Contiguous concatenated (CCAT) containers, 127 Continuous Wave (CW) laser, 715 Contributions, to study groups, 773-774 Control, of optical transport networks, 10-11 Control domains, signaling, 646
Control Entity Logical Adjacency (CELA), establishment of, 610 Control frames, GFP, 164 client-dependent procedures in, error control in, 171 client-independent procedures in, 164-166 (See also Client-independent procedures, GFP) idle frame, 164 Controlled octet slip, 234 Controlled slip, 212 Control packet, 141 Control plane, 337, 552-553, 553f MPLS, 453-454 signal communications network message delivery in, 597-599, 598f Control plane architecture (G.8080), 571-595 background on, 571-576, 574f centralized management systems in, 571-573 component architecture in, 575 in connectionless networks, 575-576 end-to-end configuration using network management and control plane functionality in, 573-574, 574f fully distributed model for, 573 goal in, 574 Reference Model for Open Distributed Processing (RM-ODP) and, 574-575 component overview in, 580-583, 581f (See also specific components) control plane view of transport network in, 576-578, 577f, 578f distribution models in, 585-586 example of components in action in, 586-588, 586t, 587f, 588f general component properties and special components in, 580, 580f identifier spaces in, 588-593, 590f, 592f (See also Identifier spaces) identifying components in, 578-579 interlayer modeling in, 583-585, 584f restoration architecture in, 593-595, 594f Control plane component identifiers, 590f, 592 Control plane management, 633-635, 636f Control plane technology, 552-553, 553f Control plane view, of transport network, 576-578, 577f, 578f Coordinated Universal Time (UTC) frequency of, 217
timing traceability and, 267 CO-PS. See Connection-oriented packet-switched (CO-PS) network Core header, GFP, 160 Core HEC (cHEC) field, 160 Correlated, bounded, high-probability jitter (CBHPJ), 745 Correlated, bounded Gaussian jitter (CBGJ), 746 Coupling Flag (CF), 8, 360 Cross-fertilization, of concepts and ideas, 1 Crosstalk far-end (FEXT), 738 near-end (NEXT), 738 CSMA/CD, 367 C-types, new, in GMPLS RSVP-TE (G.7713.2) signaling, 647 Current synchronization network, 249 Customer-controlled loopbacks, 492-493, 492f Customer Edge (CE), 8, 344, 524-525, 525f. See also CE entries Customer Edge VLAN Identifier (CE VLAN ID). See CE VLAN ID CWDM. See Coarse Wavelength Division Multiplexing (CWDM) D Data. See Storage Area Networks (SANs), SONET for Data and PAD field, 370 Data Communications Network. See DCN (Data Communications Network) Data (database) growth of, 527-528 organization of, 527 storage of, 528 Data dependent jitter (DDJ), 745, 746 Data field, in Ethernet services over MEN, 370 Data-friendly next-generation SONET/SDH, 4-6 Data Link Connection Identifiers (DLCIs), 427 Data Plane, 337 Data replication, 533 Data starvation, 488 DCN (Data Communications Network), 597 reliability considerations in, 602-603 security considerations in, 603 topologies of, 599-602 congruent, 600, 600f focused (hub and spoke), 600-601, 601f
Index full mesh, 599, 599f hybrid (multitier), 601-602, 602f DCN identifiers, 590f, 592 Decision-feedback equalization (DFE), 757 De-emphasis, at transmitter, 749-753, 750f, 751f, 753f Defect, 55 Degenerate subnetwork, 33 Degenerate subnetwork connections, 33 Degradation detection, 728-732 Degrade condition, 500 Degrade condition threshold, in MEN protection, 517-518 Delayed contributions, 773-774 Delivery, in fiber channel SANs in-order, 536 lossless, 537 Demultiplexer layer, in L2VPNs, 430-431 Dense Wavelength-Division Multiplexing (DWDM), 3, 216, 694-695 in OTN, 313 Dense Wavelength Division Multiplexer Optical Add-Drop Multiplexer (DWDM-OADM), 63 Dense Wavelength Division Multiplexing (DWDM), standards for, 695-697 Department of Defense (DOD), in synchronization architectures for SONET/SDH, 267 Destination address field, 369 of Ethernet media access control layer, 369 Desynchronizer, 201 Detection, in network protection, 498 Detection time, in MEN, 502 Detection time T1, 305, 306f Deterministic jitter (DJ), 744-745 Differential delay, in VCAT and LCAS, 131-132, 132f compensation of, 145-146 detection of, 144-145 Differential delay buffers, in VCAT and LCAS overview of, 147 sizing of, 149 structure and management of, 146-147, 147t Differential Group Delay (DGD), 693, 701 Digital information, growth of, 527-528 Direct Attached Storage (DAS), 528, 529, 530f Direct-source timing method, in SONET/SDH, 280, 280f
Disaster recovery. See Storage Area Networks (SANs); Storage networking Disconnect/hang-up request, 604 Discovery Agent adjacency, 610 Discovery, automated (G.7714), 604-611 across administrative boundaries, 611 architecture of, 606-607, 606f background on, 604-605 connectivity verification and, 605 Layer Adjacency Discovery Methods in, 640-643, 641f methods and protocols for, 640-643, 641f types of, 607-611 control entity logical adjacency establishment in, 610 layer adjacency discovery in, 607-610, 608f, 609f service capability exchange in, 610-611 Discovery Agent (DA), 581f, 583, 606 Dispersion Shifted Fiber (DSF), 704 Distance, application code terminology related to, 709, 710t Distance extension alternatives, in SANs, 538-541 legacy private line in, 539, 542t SONET/SDH, 539, 541-542, 542t storage over IP in, 539, 540-541, 540f, 542t WDM in, 539-540, 542t Distance extension requirements, in Storage Area Networks (SANs), 536-538. See also under Storage Area Networks (SANs) Distributed Feedback (DFB), 713 Distribution models, in control plane architecture in G.8080, 585-586 Diversity support, 625-626 Domains, 563-564 in ASTN, 562-564 Do Not Revert (DNR), 311 Downstream node, 81, 82f Droop, 545 Drop Precedence (DP), 385 DS1, 261 bridged-source timing method and, 281-282, 281f extended super frame (ESF) of, 261 line/external timing method and, 282 super frame (SF) of, 261 threshold AIS generation and, 283 DS1 interfaces stratum 2 holdover transient vs. seconds at, 288, 289t
stratum 3/3E holdover transient vs. seconds at, 288, 289t DS-3 client signal, PDH, on STM-N server, 36, 37f, 39-40 DSLAM, 394, 394f Dual-Dirac distribution, 746-747 Dual-In-Line (DIL), 713 Dual protected services, in MEN CES, 494, 494f Dual-star topology, 737 Dual unprotected services, in MEN CES, 493-494, 493f ODUk multiplexing, 104-116 Duplication. See Storage Area Networks (SANs); Storage networking DWDM. See Dense Wavelength-Division Multiplexing (DWDM) DWDM-OADM. See Dense Wavelength Division Multiplexer Optical Add-Drop Multiplexer (DWDM-OADM) E E1, 261. See also DS1 Eb/N0, measuring coding gain with, 71, 72f Erbium Doped Fiber Amplifier (EDFA), 694 EBS. See Excess Burst Size (EBS) ESCON (Enterprise Systems Connection), 529, 531t Edge LSR, 453 Editor, 771-772, 771f eGFP. See Extended GFP (eGFP) frame format Egress service frame, 345 EIR. See Excess Information Rate (EIR) E-LAN. See Ethernet LAN Service (E-LAN) Electro-absorption Modulated Lasers (EMLs), 721-722 Element-level processing, 53 Element Management Function (EMF), 54 fault management process in, 56-57, 57f Element Management Systems (EMSs), 593 E-Line. See Ethernet Line Service (E-Line) Embedded clock, 258-259 Embedded Communications Channels (ECCs), 597-598, 598f EMLs (electro-absorption/externally modulated lasers), 715, 721-722 Emphasis, 749-750 Emulated Circuit Demultiplexing Function (ECDX), 471 in MEN CES, 471 Emulation mode structured, 459, 460
unstructured, 459, 460 Encapsulation layer, in L2VPNs, 429 End Of Life (EOL), 710-711 End Of Life (EOL) specifications, 701 End-to-end configuration, 573-574, 574f End-to-end connection, 555 End-to-end delay, in MEN CES, 489 End-to-End Path Protection (EEPP) in MEF protection, 509-511, 510f OA&M-based, in MEN, 522 packet 1+1, in MEN, 523 End-to-end principle, 564-567, 565f End-to-end service associations, calls as, 564-566, 565f Engineering and technology viewpoint, 578 E-NNI call and connection attributes, 630, 631t Enterprise Systems Connection (ESCON), 529, 531t Enterprise viewpoint, 578 Environmental effects, of Printed Circuit Board (PCB) interconnects, 740-742, 741f Environmental variation, 742 EPL. See Ethernet Private Line (EPL) service EPLAN. See Ethernet Private LAN (EPLAN) service Epsilon model, 711 Equalization for backplane channel, 757 decision-feedback, 757 on eye opening, 756, 757f at receiver, 754-756, 755f, 756f Equipment control, 50-52 Equipment packaging, 39-40, 40f Equipment supervisory process, 53-60 application example of, 58-59, 60f basic concepts of, 53-54, 54f terminology and constructs of, 55-56 utilization of concepts of, 56-58, 56f, 57f Equivalent protection path, 504 ESCON, transport over OTN in. See Multiplex structures, of OTN Ethernet (Services). See also Metro Ethernet architecture and definitions of, 7-10 Backplane, 760, 762 basic tutorial of, 367-371 physical layers in, 368, 368t media access control layer in, 368-370 over metro networks, 7-8 over MPLS networks, 9, 425-455 (See also Ethernet (Services), over MPLS networks)
over public wide-area networks, 8-9 SONET services for Storage Area Networks in (See Storage Area Networks (SANs), SONET for) Ethernet (Services), over Metro Ethernet Networks (MEN) background on, 343 Ethernet basic tutorial and, 367-371 media access control layer in, 368-370 (See also Ethernet media access control layer) physical layers in, 368, 368t VLAN ID in, 370-371, 371f future work of, 367 model of, 343, 344f service features for (See also Layer 2 control protocols) all-to-one bundling map in, 350-351, 351f bandwidth profiles in, 359-365 (See also Bandwidth profiles, in Ethernet services over MEN) CE-VLAN ID preservation in, 350 Class of Service in, 356-359 (See also Class of Service (CoS)) Ethernet LAN Service (E-LAN) and, 356 Ethernet Line Service (E-Line) and, 356 feature constraints of, 355-356 maps at UNIs in an EVC in, 355, 355f Layer 2 control protocols in, 365-367 service multiplexing in, 352-367 (See also Service multiplexing) services model for, 343-349 customer edge (CE) and, 344 Ethernet Virtual Connection (EVC) identification at UNI in, 348 Ethernet Virtual Connection (EVC) in, 346 (See also Ethernet Virtual Connection (EVC)) service frame in, 345 (See also Service frame) User Network Interface (UNI) and, 344-345 Ethernet (Services), over MPLS networks, 9, 425-450 E-LAN Service in, 436, 437f E-Line Service in, 436 Ethernet Virtual Connection (EVC) in, 436 L2VPNs over MPLS backbone in, 428-435 (See also Layer 2 Virtual
Private Networks (L2VPNs), over MPLS backbone) Metro Ethernet services in, 436, 437-449, 437f (See also Metro Ethernet services over MPLS) Virtual Private Networks (VPNs) in, 425-427 classification of, 426-427, 427f multiservice converged packet switched backbone in, 427 traditional layer 2 (L2VPNs), 425-426, 426f VPLS for Metro Ethernet in, 449-450, 450f Ethernet (Services), over public WAN, 373-422 client interfaces in, 401-405, 402t (See also Client interfaces, for Ethernet services over public WAN) emerging technology for, 421-422 Ethernet client interfaces in, 401-405, 402t bandwidth profile of, 404 bundling in, 404 Layer 2 control protocol processing in, 404-405 multiplexed access in, 402, 403f UNI service attributes in, 405 VLAN mapping in, 404 network-to-network interface (NNI), Ethernet transport and, 405-411, 406f-408f, 410f, 411f OAM and, 411-419 domain service vs. network and, 413, 414f generic message format of, 418, 419f mapping maintenance entities to, 417, 418t point-to-point Ethernet flow reference model in, 414, 415f, 416f protection and restoration for, 419-421 Layer 2 and service restoration in, 421 service provided by transport network in, 420 reason for use of, 373-374 services for, 375-377, 376-377t service types and characteristics of, 379-391, 380f, 381f Ethernet Connection (EC) in, attributes of, 381-386 (See also Ethernet Connection (EC)) Ethernet Private LAN (EPLAN) service in, 389-391, 390f, 391t
Ethernet Private Line (EPL) service in, 387-388, 387f, 388t Ethernet Virtual Private LAN service in, 391, 392t Ethernet Virtual Private Line Service (EVPL) in, 388, 389f standards activity for, 377-378 terms for, 378-379 transport network models supporting Ethernet connectivity services in, 392-401 (See also Ethernet connectivity services) Ethernet bridges, 368 Ethernet client interfaces, 405, 405t Ethernet Connect Functions (ECFs), 507 Ethernet Connection (EC), 379, 381-386 bandwidth profile in, 386 connectivity monitoring in, 386 vs. Ethernet Virtual Connection (EVC), 379 link type in, 385-386 network connectivity in, 382-384, 383f, 384f preservation in, 386 separation, customer and service instance in, 385 summary of, 381, 382t survivability in, 386 transfer characteristics in, 385 UNI list in, 386 Ethernet connectivity services, transport models supporting, 392-401 DSLAM uplink aggregation in, 394, 394f EoS/PoS/DoS aggregation in, 394, 395f extended GFP (eGFP) frame format in, 400, 400f extended GFP (eGFP) frame format-required modifications for CO-PS network in, 400, 401t layered architecture for EPL services in, 397, 399f MDP frame format-required modification of CO-PS in, 400, 401t MPLS data plane (MDP) in, 399, 400f multiple-customer flow aggregation in, 393, 393f packet switching overlay for EoS in, 397, 398f single customer flow aggregation and, 392, 393f SONET/SDH physical layer transport for EoS in, 397, 397f Ethernet flow, 327
Ethernet Flow Termination Function (EFT), 472 in MEN CES, 472 Ethernet LAN Service (E-LAN), 356, 356t, 436, 437f, 512. See also Ethernet Line Service layer (ETH layer) Ethernet Line Service (E-Line), 356t, 436 in layer 2 control protocols, 356 Ethernet Line Service (E-Line) emulation, walk-through example of, 435f, 441-443, 443f Ethernet Line Service layer (ETH layer), 10, 328 in MEN, 328 Ethernet Line timing, 475f, 475t Ethernet MAC frame, GFP-F encapsulation of, 409, 410f Ethernet mapping, Generic Framing Procedure (GFP) and, 271 Ethernet media access control layer, 369f data and PAD field in, 370 destination address field in, 369 frame check sequence field in, 370 length/type field in, 370 preamble field in, 369 source address field in, 369 start frame delimiter field in, 369 with tag control information, 370, 371f Ethernet modes, raw vs. tag, in E-Line services, 439 Ethernet multiplexing, for T-line over MEN, 462, 463f Ethernet over SDH/SONET (EoS), functional model of, 410, 411f Ethernet over Transport (EoT) service, 406 IaDI NNI for, 406, 406f IrDI NNI for, 406, 407f server layer networks of, 407, 407f Ethernet physical interface (ETY), 407 Ethernet physical layer network (ETY), 379 Ethernet Private LAN (EPLAN) service expected connection characteristics of, 391, 391t illustration of, 389, 390f Ethernet Private Line (EPL) service, 387-388 connection characteristics of, 387, 388t GFP in, 184-185, 185f illustration of, 387, 387f layered architecture for, 397, 399f Ethernet pseudowires (PWs), via LDP, in E-Line services, 440-441, 441f
Ethernet/TDM transport system, hybrid, 154-155, 154f Ethernet Virtual Connection (EVC), 8, 327, 346, 436, 498-499 bandwidth profile in, 363, 364f vs. Ethernet Connection (EC), 379 identification at UNI of, 350 in T-Line over MEN, 458 Ethernet Virtual Private LAN (EVPLAN) service, 391 expected connection characteristics of, 391, 392t Ethernet virtual private line service (EVPL), expected connection characteristics of, 388, 389t Ethernet Wide Area Network (E-WAN), in MEN, 329 ETH interface signal structure, 409, 410f ETH layer. See Ethernet Line Service layer (ETH layer) ETH link, in MEN, 340, 341f ETY (Ethernet physical interface), 407 ETY (Ethernet physical layer network), 379 EVC. See Ethernet Virtual Connection (EVC) Events, 501 Event timing, in MEN, 501-503, 503f Excess Burst Size (EBS), 360 Excess Information Rate (EIR), 8, 360 Explicitly routed LSP, 454 Extended attachment unit interface (XAUI), 759 Extended GFP (eGFP) frame format, 399, 400, 400f Extended super frame (ESF), 261 DS1 and, 261 Extension header, GFP, 161-162 Extension HEC (eHEC) field, 162 External/line timing. See Line/external timing method Externally Modulated Laser (EML), 715 External Network-to-Network Interface (E-NNI), in MEN, 329, 332 External timing, in MEN CES, 476 External timing configurations, in SONET/SDH, 279-286, 284t. See also Timing configurations, external, in SONET/SDH External timing mode, in SONET/SDH, 259f, 260 External timing option, 475f, 475t Eye mask, 742-744, 743f Eye opening (diagram), 750-752, 752f, 753f
equalization on, 756, 757f F Fabry-Perot, 713 Facility Data Link (FDL), in MEN CES, 487-488 Fail condition, 500 Failure, 55 Failure types, in MEN, 500-501 degrade condition (soft link failure), 500 fail condition (hard link failure), 500 node failure, 501 False frame synchronization, in GFP, 175-176 Fan-out, of timing distributor (TD) functions, 271 Far-end crosstalk (FEXT), 738 Fault, 55 Fault, optical in conventional transmitters and receivers, 728-731 in optically amplified systems, 731-732 Fault cause, 55 Fault detection instant, in MEN, 501, 503f Fault management process Element Management Function (EMF), 56-57, 57f Trail Termination Function, 56, 56f Fault monitoring, 54 FCIP, 541, 541f, 542t FEC. See Forward Error Correction (FEC) Fiber Channel (FC), 10, 530, 531t, 534-535, 534f, 535f over IP, 539, 540-541, 540f, 542t Fiber Channel IP (FCIP), 541, 541f, 542t Fiber Channel Storage Area Networks (SANs), distance extension requirements in, 536-538. See also under Storage Area Networks (SANs) Fiber Connectivity (FICON), 530, 531t Fibre Channel. See Fiber Channel (FC) FICON (Fiber Connectivity), 530, 531t Fixed infrastructure, 34 Flexibility, network, 33-34 in fiber channel SANs, 537 Flicker frequency modulation (FFM), 209 Flicker phase modulation (FPM), 209 Flow, 335, 378 Flow Domain (FD), 335, 378 definition of, 378 multipoint-to-multipoint connectivity and, 384
Flow Domain Function, 337 Flow point definition of, 378 OAM and, 417 Flow Point/Flow Point Pool, 335 Flow Point Link, 335 Flow Point Pool Link, 335, 378 Flow termination, 378 Flow Termination Point, 336 Focused topologies, 600-601, 601f Forced switch, in MEN, 501 Forward Error Correction (FEC), 4, 67-73, 664 coding gain in, 70-73, 71f-73f measured via Eb/N0, 71, 72f measured via OSNR, 72-73, 73f measured via Q factor, 70-71, 71f, 72f in interfaces for OTNs, 67-73 (See also Forward Error Correction (FEC)) theoretical description of, 68-70, 69f use of, 67-68 Forwarders, in L2VPNs, 431-432, 432f Forwarding, in synchronization of status messages (SSM), 244 Forwarding Equivalence Class (FEC), in MPLS networks, 451 Forwarding information base (FIB), 453-454 Forwarding plane, MPLS, 454 Four-photon mixing, 704 Four-Wave-Mixing (FWM) effect, 704-705 Fragmentation, in L2VPNs, 431 Frame acquisition delay, in GFP, 179-182, 180f, 181f Frame alignment overhead, in OTUk, 100, 100f, 101f Frame alignment signal (FAS), in OTUk, 100, 100f Frame check sequence (FCS), 345 Frame check sequence field, 370 Frame delay, 357, 358f in MEN CES, 489-490 Frame delay performance objective, for point-to-point EVC, 358, 358f, 358t Frame delay variation, 357 Frame delay variation performance objective, for point-to-point EVC, 359 Frame delineation, in GFP, 164-165, 165f Frame delineation loss (FLD), in GFP, 174-175, 175t Frame Error Ratio (FER), 490 Frame formats, GFP, 159-163. See also GFP frame formats Frame jitter, in MEN CES, 489-490
Frame loss, 357 Frame loss performance objective, for point-to-point EVC, 359 Frame-mapped mode, in GFP (GFP-F), 167, 167f Frame multiplexing, in GFP, 166 Frame relay, 365 Frame unavailability (FUA), in GFP, 176-178, 178f Free-run, in MEN CES, 476 Free-run mode, internal timing in, 261 Frequency accuracy, SDH Equipment Clock (SEC) and ODUk Clock (ODC), 218-219 Frequency Domain Multiplexed (FDM) systems, 190-191 Frequency drift rate, 206 Frequency justification in 4 x ODU1 to ODU2 multiplexing, 111-112, 112t in ODU1/ODU2 to ODU3 multiplexing, 115, 116t Frequency offset, 206 Frequency tolerance, 206 Frequency traceability, 262-264, 263f, 264f Bell System and, 266 SONET/SDH, clock distribution and, 267, 267f time interval error (TIE) and, 264 timing loops in, 265 Full mesh topology, 599, 599f, 737 Full restoration time, 504 Function, 20 Functional elements, generic MEN, 336-337 Functionality, 29-30, 30f Functional modeling, transport, 18-61 application examples of, 40-50 OTN equipment for ODU2 links and subnetworks in, 49-50, 49f, 50f ring topology in SONET/SDH in, 40-43, 41f-43f STM-64 switched connection services via OTN mesh network in, 45-49, 46f-48f transport function characteristics implemented on ADM in, 41f, 43-44, 44f, 45f basic concepts of, 20-29 G.805-based modeling approach in, 20, 60 layering in, 21-24, 23f partitioning in, 24-29, 25f, 26t (See also Partitioning)
topology and function in, 20 connection dimension model in, 32-35, 34f connections and points in, 31-32, 31f equipment control in, 50-52 equipment packaging in, 39-40, 40f equipment supervisory process in, 53-60 (See also Equipment supervisory process) examples of, 36-38, 37f, 38f functionality of, 29-30, 30f ITU-T Recommendation G.805 for, 18-20, 60 ITU-T Recommendation G.809 for, 19, 60 ITU-T Recommendation G.8010 for, 20 modeling connectionless layer networks in, 60-61 objective of, 18 sublayers and function decomposition in, 35, 35f, 36f use of, 18-19 Functional modeling specification technique, 2-3 Functions, 31 Fuzzy networking demarcation points, 662, 662f G G.709 frame structure, 78f, 79, 80t G.709 overhead bytes, 81-98. See also G.709 overhead bytes directionality of information flow in, 81 ODUk overhead and processing in, 90-94 backward error identification and backward incoming alignment error (BEI/BIAE) in, 93, 94t ODUk frame structure in, 90, 90f ODUk overhead in, 90, 91f path monitoring (PM) byte descriptions in, 92-93, 92f path monitoring status (STAT) in, 94, 94t OPUk overhead bytes and client mapping structure in, 82-87, 83f frequency justification in, 84-86, 85t, 86t mapping CBR2G5 signal onto OPU1 in, 86, 87f mapping CBR10G signal onto OPU2 in, 86, 87f mapping CBR40G signal onto OPU3 in, 86, 87f OPUk overhead in, 83, 84t
similarly valued/formatted fields within G.709 frame in BIP-8, 88, 88f trail trace identifier (TTI), 89, 89t tandem connection monitoring (TCM) in, 95-98 automatic protection switching and protection communication channel (APS/PCC) in, 97-98, 98t backward error indication and backward incoming alignment error (BEI/BIAE) in, 96, 96t fault type and fault location reporting communication channel (FTFL) in, 98 general communication channels (GCC1, GCC2) in, 97 tandem connection monitoring ACTivation/deactivation (TCM-ACT) in, 97 TCM monitoring status (STAT) in, 96, 97t use of, 81 G.7713.1 PNNI signaling, 643-644 G.7713.3 GMPLS CR-LDP, 648-649 G.8080 rerouting domain model, 594, 594f Gapped clock, 202 Gaussian jitter, 745 correlated, bounded, 746 uncorrelated, unbounded, 745, 746 General communication channel 0 (GCC0), in OTN interfaces, 104 General Secretariat, ITU, 769, 769f Generic Framing Procedure (GFP), 5-6, 8, 153-187, 679 applications of, 184-186 in Ethernet private lines, 184-185, 185f in packet rings, 186, 186f in virtual leased lines, 185, 186f background on, 155-158 for fixed-length PDUs, 157 other traffic adaptation approaches in, 156-157 packet transport on public networks in, 155-156, 156f for small, fixed-length PDUs, 157 for small PDUs, 157-158 for variable-length PDUs, 158 Ethernet mapping and, 271 in Ethernet services over public WAN, 374 formats and procedures in, 158-171
client-dependent procedures in, 166-171 (See also Client-dependent procedures, GFP) client-independent procedures in, 164-166, 165f (See also Client-independent procedures, GFP) control frames in, 164 frame formats in, 159-163 (See also GFP frame formats) future directions in, 187 high-level functional overview of, 158, 158f implementation considerations in, 171-174 scrambler options in, 172-174, 173f, 173t virtual framer management in, 171-172 overview of, 153-155, 154f performance of, 174-184 false frame synchronization in, probability of, 175-176 frame acquisition delay in, 179-182, 180f, 181f frame unavailability (FUA) in, probability of, 176-178, 178f GFP frame delineation loss (FLD) in, probability of, 174-175, 175t link efficiency in, 182-184, 183f scrambler resynchronization delay in, 182 in SONET, 544 VCAT, advantages of, 144 Generic Routing Encapsulation (GRE), 434 GFP. See Generic Framing Procedure (GFP) GFP frame formats, 159-163 basic, 159, 159f client data frames in, 159, 160f client management frames in, 162 client signal fail (CSF) in, 163, 163f core header in, 160 extension header in, 161-162 payload area in, 160 payload frame check sequence (FCS) in, 162 payload header in, 161 payload information in, 162 GFP-T 64B/65B codes, generating, 169-170, 170f GFP type field, 161 Gigabit Ethernet, 530, 531t GigaBit Interface Converter (GBIC), 719-720, 721f
Global optical transport network timing, 6-7 Global Positioning System (GPS) accuracy of, 267 external timing mode and, 260 Global Positioning System (GPS) timing, in synchronization of optical networks, 248 GMPLS CR-LDP (G.7713.3), 648-649 GMPLS RSVP-TE (G.7713.2) signaling, 644-648 background on, 644 call and connection separation support in, 645-646 client addressing in, 648 G.7713.2 extensions in, 646-647 messages and procedures (ResvTear/ResvErr) in, 647 new call objects in, 647 new C-types in, 647 session paradigm in, 648 signaling control domains in, 646 Golden PLL, 744 Governance, of ITU, 769, 769f Green bandwidth profile, 361-362, 361f G.Sup39, 700-712. See also Compatibility, transverse vs. longitudinal H Hard link failure, 500 Hard state, 648 Hierarchical tunnels, VPN, 433-434 Hierarchy clock, 249-250 of International Telecommunication Union (ITU), 768-772, 769f, 771f Optical Transport (OTH), 4, 64 Plesiochronous Digital (PDH), 3, 63 standards for, 692-693 in routing in ASON (G.7715 and G.7715.1), 619-621, 620f in standards development process, 777-779, 777f of stratum levels, 257 Higher-Order Virtual Container (HOVC), 470 High-speed serial interconnects, 12-13, 735-764. See also Serial interconnects, high-speed Hold-in range, 218 SEC and ODC, 218 Hold-off instant, in MEN, 502, 503f Hold-off time, 296 in MEN, 502
Hold-off timer, 309-310 Hold-off time T2, 305, 306f Holdover, in MEN CES, 476-477 Holdover mode clock fast mode in, 292 exit from input reference switch and, 292 input signal qualification in, 291 lock-on time stabilization in, 292 phase offset calculation in, 291 internal timing in, 261, 287-289, 292 clock fast mode in, 292 exit from, 291-292 initiation of, 287 maintenance of, 289-290, 289t ovenization in, 290 stratum 3/3E holdover transient vs. seconds at DS1 interface in, 288, 289t temperature compensation and, 290 lock fast mode in, 292 ovenization in, 288 Holdover value, calculation of, 288 Hop-by-hop routed LSP, 454 Hub and spoke topologies, 382, 600-601, 601f hybrid (two-tier), 602, 602f Hybrid Ethernet/TDM transport system, 154-155, 154f Hybrid Fiber Coax (HFC) links, 157 Hybrid (multitier) topologies, 601-602, 602f Hybrid (TDM + cell/packet based) Network Element architecture, 667-668 I Identifiers, 568, 589 Subnetwork Point Pool (SNPP), 590-591 Identifier spaces, 588-593 categories and types of, 589-590, 590f control plane component, 590f, 592 data communications network, 590f, 592 management plane, 590f, 593 name and address and, 589 relationships of, 589-590, 590f transport resource, 590-591, 590f, 592f use of, 588-589 Idle frame, GFP, 164 IEEE 802.1Q, 348 IEEE 802.1Qad, in future of Ethernet services, 367 IEEE 802.1Q tag, 345 IEEE® 802.3ae™ Clause 47, XAUI, 759-760
iFCP, 541, 542t Impairment instant, in MEN, 501, 503f In-band approaches, 596 Incoming alignment error (IAE), in OTNs, 104 Incoming label, 451 In Force, 774 Information flow, directionality of, 81 Information viewpoint, 578 Infrastructure, fixed, 34 Infrastructure, optical transport network, 2-7 functional modeling specification technique in, 2-3 global optical transport network timing in, 6-7 multiservice, 3-6 data and TDM-friendly next-generation SONET/SDH, 4-6 optical transport hierarchy, 4 Ingress service frame, 345, 363, 363f bandwidth profile in, per UNI, 363, 363f In-order delivery, in fiber channel SANs, 536 Input reference switch, in holdover mode, 292 Input signal, qualification of, in holdover mode, 291 Instantaneous phase error, 207 Integrated solutions, for optical interfacing, 716-721. See also under 140 Mbit/s to 2.5 Gbit/s technology Intelligent switching network (IN), 557 Inter Domain Interface (IrDI), 66-67, 67f, 406 NNI for EoT service in, 407f Interfaces, for Optical Transport Networks, 12, 63-117. See also Optical interfaces background on, 63-64 definition of, 562 forward error correction in, 67-73 (See also Forward Error Correction (FEC)) G.709 frame structure in, 78f, 79, 80t G.709 overhead bytes in, 81-98 (See also G.709 overhead bytes) general communication channel 0 (GCC0) in, 104 ODUk multiplexing in, 104-116 (See also ODUk multiplexing) OTN hierarchy overview in, 76-78, 76f-78f OTN standards in, 64-66, 65f OTUk overhead and processing in, 98-101
frame alignment overhead in, 100, 100f, 101f frame alignment signal (FAS) in, 100, 100f multiframe alignment signal (MFAS) in, 100, 101f OTUk frame structure in, 98-99, 99f scrambling in, 99-100, 99f section monitoring byte descriptions in, 101-104 backward defect indication (BDI) in, 102-103 backward error indication and backward incoming alignment error (BEI/BIAE) in, 103, 103t Bit Interleaved Parity (BIP-8) in, 83f, 101 incoming alignment error (IAE) in, 104 Trail Trace Identifier (TTI) in, 77f, 101, 102f standardized, 66-67, 67f tandem connection monitoring in, 73-75, 74f, 75f Interlayer modeling, in control plane architecture in G.8080, 583-585, 584f Intermediate-reach, 709, 710t Internal Network-to-Network Interface (I-NNI), in MEN, 329, 332 Internal timing, in holdover mode, 261, 287-289, 292. See also Holdover mode, internal timing in Internal timing mode, free-run in, 261 Internal timing reference, 475f, 475t International Telecommunication Union (ITU), 768-775, 769f, 770 founding of, 1 hierarchy of, 768-772, 769f, 771f membership in, 772 recommendations of (See ITU-T recommendations) standardization sector of, 64 standards development by, 777-783 (See also Standards development process) standards of (See ITU-T standards, optical interface) Internet Service Provider (ISP), 559 Internet Small Computer Systems Interface (iSCSI), 530, 531t, 540, 540f, 542t Interoperability in network restoration, 316-317, 317f in signaling (G.7713) in ASON, 649-651 Interworking, in signaling (G.7713) in ASON, 649-651
Interworking Function (IWF), 457 asynchronous and asynchronous tributaries, 477-478 and synchronous tributaries, 478 CE-bound, 472 Circuit Emulation Interworking Function (CES IWF), 469 MEN-bound, 472 synchronous and asynchronous tributaries, 477 and synchronous tributaries, 477 Interworking Function (IWF) processor, 475f, 475t Intra Domain Interface (IaDI), 66-67, 67f, 406 NNI for EoT service, 406f Intra-Network Element communications, 11-12, 661-688 2.5 Gbit/s systems in, 668-672 (See also 2.5 Gbit/s technology and systems) 10 Gbit/s systems in, 672-677 (See also 10 Gbit/s technology and systems) 40 Gbit/s systems in, 681-688 (See also 40 Gbit/s technology and systems) background on, 661-662 Metro Ethernet Network (MEN), 326-341 (See also Metro Ethernet Network (MEN) architecture) Network Element design and interface architecture in, 664-668 (See also Network Element (NE)) conceptual data processing partitioning within Network Element in, 664-665, 664f hybrid (TDM + cell/packet based) Network Element architecture in, 667-668 packet based Network Elements in, 665-666, 665f TDM based Network Elements in, 666, 666f requirements placed on Network Elements by network in, 662-664, 662f SPI-4 Phase 2 (OC-192 System Packet Interface) in, 679-681, 680f System Framer Interface-4 Phase 2 (SFI-4 Phase 2) in, 677-679, 678f iSCSI (Internet Small Computer Systems Interface), 530, 531t, 540, 540f, 542t Isolated pointer adjustment, 205 ITU Council, 769, 769f ITU-D, 769f, 770
ITU G.Sup39, 700-712. See also Compatibility, transverse vs. longitudinal ITU-T. See International Telecommunication Union (ITU) ITU-T recommendations, 771 approval of, 774-775 G.651, 703 G.652, 703-704, 705t G.653, 704, 705t G.654, 704, 705t G.655, 705, 705t G.656, 705, 705t G.691, 707t, 708, 708t, 721 G.692, 696-697, 707t G.693, 707t, 708, 708t, 709 G.694.1, 707t G.694.2, 699, 707t G.695, 699-700, 707t, 709 G.696.1, 697, 707t G.698.1, 697, 707t G.699.1, 709 G.709, 24 G.709 FEC, 68-69 G.805, 18-20, 25, 60, 336 G.807, 553-571 (See also under Automatically Switched Transport Network (ASTN) architecture) G.808.1, 296 (See also Protection switching) G.809, 19-20, 60, 336 G.811, 213, 214, 215 G.813, 214 G.825, 214 G.841, 303, 312 G.851-01, 575 G.872, 23-24, 64 G.873.1, 312 G.955, 692-693, 707t G.957, 694-695, 707t, 708, 708t G.959.1, 707t, 708, 708t G.7042/Y.1305, 140, 141 G.7712, 595-603 (See also Signaling Communications Network (SCN) architecture (G.7712)) G.7713, 626-633 (See also Signaling (G.7713), in ASON) G.7714, 604-611 (See also Discovery, automated (G.7714)) methods and protocols for, 640-643 (See also Discovery, automated (G.7714); Discovery, automated
(G.7714), methods and protocols for) G.7715, 611-626 (See also Routing, in ASON (G.7715 and G.7715.1)) G.7715.1, 611-626 (See also Routing, in ASON (G.7715 and G.7715.1)) G.7715 and G.7715.1, 611-626 (See also Routing, in ASON (G.7715 and G.7715.1)) G.8010, 20 G.8080, 314-315 I.630, 312-313 Y.1720, 312 ITU-T standards, optical interface, 691-733 background on, 691 development process for, 767-783 (See also Standards development process) historical perspective on, 691-700 All Optical Networks (AON) in, 700 Coarse Wavelength Division Multiplexing (CWDM) in, 699-700 Dense Wavelength Division Multiplexing (DWDM) in, 695-697 Optical Transport Network (OTN) in, 697-699 Plesiochronous Digital Hierarchy (PDH) in, 692-693 SDH/SONET in, 693-695 implementation of, 712-728 10 Gbit/s technology in, 721-728 (See also 10 Gbit/s technology) 40 Gbit/s technology in, 728 140 Mbit/s to 2.5 Gbit/s technology in, 713-721 (See also 140 Mbit/s to 2.5 Gbit/s technology) background on, 712-713 optical fault and degradation detection in, 728-732 conventional transmitter and receiver faults in, 728-731 optically amplified system faults in, 731-732 recommendations in, overview of, 706-709, 707t, 708t transverse vs. longitudinal compatibility in, 700-712 (See also Compatibility, transverse vs. longitudinal) IWF processor, 475f, 475t
J Jitterizer, 196
Jitter, 744-746 alignment, 195, 196-198, 198f bathtub curve analysis of, 746-748, 747f correlated, bounded, high-probability, 745 data dependent, 745, 746 definition of, 259, 744 deterministic, 744-745 Gaussian, 745 correlated, bounded, 746 uncorrelated, unbounded, 745, 746 ITU-T recommendations on, for OTN, SDH, and PDH and, 214-216, 215t in loop/line timing mode, 260 modeling conventions for, 744-746 (See also Jitter) peak-to-peak, 209-210 periodic, 745, 746 random, 745 RMS, 209-210 sinusoidal, 196 timing, 195, 200 in timing signal imperfections, 206 uncorrelated, bounded, high-probability, 745 waiting time, 202 Jitter accumulation for PDH clients of SONET/SDH networks, 227-231 for SDH clients of OTN, 231-233 STM-N and OTUk, 219-227, 222f-225f Jitter buffer, 490 Jitter generation, 199-200 Jitter generation and transfer, ODCr, 219-227 Jitter network limit and tolerance, STM-N and OTUk, 219-227 Jitter tolerance, 196-198, 198f Justification +1/0/-1 byte, 202 +2/+1/0/-1, 202 negative, 201 positive, 201 Justification control, 201 Justification structure in 4 x ODU1 to ODU2, 109, 109f, 109t in ODU1/ODU2 to ODU3 multiplexing, 114, 114t L L2VPNs. See Layer 2 Virtual Private Networks (L2VPNs) Label, 451 in MPLS networks, 451-452, 452f
Label disposition, 453 Label Distribution Protocol (LDP), 434 establishing Ethernet PWs via, 440-441, 441f Label encoding, in MPLS networks, 452, 453f Label imposition, 453 Label stack, MPLS, 451, 452f, 453f Label stack operations, in MPLS networks, 453 Label swapping, 453 Label switched path (LSP), 434 in MPLS networks, 454, 455f Label switched router (LSR), 434 in MPLS networks, 453 Label-to-FEC binding, 451 LAN extension, 351 Lasers, 713-714 Last call, 775 Latency, 536 low, in fiber channel SANs, 536 Layer(s), 21-24, 23f atomic functions in, 32, 32f collapsing, 35, 36f expanding, 35, 35f Layer 2 control protocols bridges vs. routers as CE in, 367 in Class of Service (CoS), 365-367 for Ethernet services over MEN, 365-367 handling of, 366 service features for, 365-367 Spanning Tree Protocol and, 421 Layer 2 Virtual Private Networks (L2VPNs), 425-426, 426f Layer 2 Virtual Private Networks (L2VPNs), over MPLS backbone, 428-435 attachment circuit (AC) in, 428 demultiplexer layer in, 430-431 encapsulation layer in, 429 forwarders in, 431-432, 432f fragmentation and reassembly in, 431 MPLS tunnels in, 434 MPLS tunnels in, carrying PWs over, 435, 435f native service processing (NSP) in, 431 payload convergence in, 429 pseudowire (PW) in, 428-429, 429f pseudowire (PW) preprocessing in, service-specific, 431 sequencing functions in, 430 timing in, 430 VPN tunnels in, 433
hierarchical, 433-434 motivation for, 433 protocols for, 434 Layer 3 Virtual Private Networks (L3VPNs), 426 Layer Adjacency Discovery (LAD), 607-610, 608f, 609f Layer Adjacency Discovery Methods, 640-643, 641f protocol for (G.7714.1), 642-643 type 1 in, overhead in server layer, 641 type 2 in, test-signal method in, 641 Layer connection, in Printed Circuit Board (PCB) interconnects, 739-740, 740f Layer network connectionless, modeling, 60-61 in MEN, 334 Layer network model, in MEN, 327-329 application services layer (APP layer) in, 329 Ethernet Services Layer (ETH layer) in, 328 transport services layer (TRAN layer) in, 328-329 Layer planes, MEN, 337 Layer-specific characteristics, 625-626, 626t LCAS. See Link Capacity Adjustment Scheme (LCAS) LDP. See Label Distribution Protocol (LDP) Leased lines, virtual, GFP in, 185, 186f Least mean square (LMS) adaptation, 757 Legacy private line, 539, 542t Length/type field, of Ethernet media access control layer, 370 Limited protection path, 504 Linear extension header, 162 Line card integrated TDM/Packet, with different switch fabrics, 667, 667f using packet/cell switch fabric, 665, 665f using TDM switch fabric, 666, 666f Line/external timing method, 282 SONET/SDH and, 282-285, 283f threshold AIS generation and, 283 Line/external timing mode, SONET/SDH and, 259f, 260-261 Line terminating element (LTE), SONET/SDH and, 260-261, 263 Line timing, 474 from CE, 475f, 475t, 476 from MEN, 475f, 475t, 476 Link, 33 in MEN, 335, 340
Link aggregation, link protection based on, in MEF, 514-515, 515f Link Aggregation Control Protocol (LACP), 515 Link Aggregation Group (LAG), 514-515 Link attributes, 624, 624t Link availability, 625 Link capacity, 625 Link Capacity Adjustment Scheme (LCAS), 5, 8, 137, 317, 318f in Ethernet services over public WAN, 420 ITU-T recommendation G.7042/Y.1305, 140, 141 Link Capacity Adjustment Scheme (LCAS), in multiplex structures, 140-143 details of, 141-142, 142f, 143f implementers guide for, 144-152 (See also under Multiplex structures, of OTN) link capacity decrease in planned, 140-141 temporary, 141 link capacity increase in, 140 Link Connection (LC), 33, 34f Link efficiency, in GFP, 182-184, 183f Link flow, 379 Link performer, 579 Link protection based on link aggregation, in MEF, 514-515, 515f Link redundancy, 514, 514f Link Resource Manager (LRM), 581-582, 581f, 606-607 Links, 28, 34, 507 Link scrambler, in GFP, 166 Link type, Ethernet Connection (EC) attributes and, 385-386 Link weight, 625 Local Area Network (LAN), transport over OTN in. See Multiplex structures, of OTN Local Area Network (LAN)-based VPN, virtual, 156 Local Area Network (LAN) storage, 529 Local client adaptations supported, 626 Local connection type, 625 Lock fast mode, 292 Lock-on time stabilization, 292 Lockout, in MEN, 501 Long-haul, 709, 710t Long haul core/backbone, 662, 662f Long-haul/long-reach, 709, 710t Longitudinal compatibility, 692
physical layer, 700-701 vs. transverse compatibility, 700-712 (See also Compatibility, transverse vs. longitudinal) Long-reach, 709, 710t Loopbacks, 491-493, 493f customer-controlled, 492-493, 492f provider-controlled, 491, 492f Looping, in synchronization of status messages (SSM), 244-245 Loop/line timing mode, SONET/SDH and, 259f, 260 Loop timing mode, SONET/SDH and, 259f, 260 Lossless delivery, in fiber channel SANs, 537 Lower-Order Virtual Container (LOVC), 470 LSP switching, 646 M Mach Zehnder (MZ) modulators, 721 Main Path Interfaces (MPI), 702, 702f Maintenance entity (ME), 416-417 Management, of optical transport networks, 10-11 Management plane, 337, 554 Management plane identifiers, 590f, 593 Management-related interfaces naming conventions for, 59 in termination sink function, 58-59, 60f Mandatory usage, 624, 624t Manual switch, in MEN, 501 Manufacturing variation, 739 Mapping. See also specific types asynchronous, 200-203 bit-synchronous, 201 in synchronization, 200-203 at UNI, 355, 356t Material loss, in Printed Circuit Board (PCB) interconnects, 739 Maximum Time Interval Error (MTIE), 210-211, 213 definition of, 262 time-delay and, 262 Maximum Transmission Unit (MTU), 431 ME. See Maintenance entity (ME) Mean Time To Frame (MTTF), in GFP, 179-182, 180f, 181f Media access control layer, of Ethernet, 368-370. See also Ethernet media access control layer Media-Independent Interface (XGMII), 759 MEF. See Metro Ethernet Forum (MEF)
MEF protection schemes, 521 Membership, ITU, 772 MEN. See Metro Ethernet Network (MEN) MEN-bound IWF, 472 MEN CES. See Metro Ethernet Network Circuit Emulation Services (MEN CES) Messages, in GMPLS RSVP-TE (G.7713.2) signaling, 647 Message set, in synchronization of status messages (SSM), 243-244, 243t Metro Area Network (Metro Core), 662, 662f transport over OTN in (See Multiplex structures, of OTN) Metro core/backbone, 662, 662f Metro Edge, 662, 662f Metro Ethernet Forum (MEF), 1 charter of, 325, 326f circuit emulation over, 9 network resiliency of, 9-10 Metro Ethernet Network (MEN), 323-325, 344 circuit emulation services in, 324-325 Ethernet services over, 7-8 network resiliency in, 323-324 traffic and performance management in, 324, 524-526, 525f transport services layer (TRAN layer) in, 328-329 VPLS for, importance of, 449-450, 450f Metro Ethernet Network (MEN) architecture, 326-341 components of, 334-337 processing, 336-337 topological, 334-335 transport, 335-336 layer network model for, 327-329 application services layer (APP layer) in, 329 Ethernet Services Layer (ETH layer) in, 328 transport services layer (TRAN layer) in, 328-329 MEN layer relationship to architecture model components in, 337-341 MEN network reference model and topological components in, 338-339, 338f, 339f MEN reference link model in, 340, 341f operational planes and MEN layer networks in, 337 reference model for, 326-327, 326f reference points in, 329-333
definition and uses of, 329-330, 330f Ethernet Wide Area Network (E-WAN) in, 329 External Network-to-Network Interface (E-NNI) in, 329, 332 Internal Network-to-Network Interface (I-NNI) in, 329, 332 Network Interworking Network-to-Network Interface (NI-NNI) in, 329, 332-333 other access arrangements in, 333 Service Interworking Network-to-Network Interface (SI-NNI) in, 329, 333 Service Node Interface (SNI) in, 333, 334f User-Network Interface (UNI), 330-331, 331f Metro Ethernet Network (MEN)-bound Interworking Function (IWF), 472 Metro Ethernet Network Circuit Emulation Services (MEN CES), 9, 324-325, 457-496, 466-496. See also Metro Ethernet Network Circuit Emulation Services (MEN CES) alarms in for buffer underflow and overflow, 488-489 for structured service, 488 for unstructured service, 488 asynchronous IWF and asynchronous tributaries in, 477-478 asynchronous IWF and synchronous tributaries in, 478 CES Interworking Function in, synchronization description of, 475-477, 475f, 475t Circuit Emulation Interworking Function (CES IWF) in, 469 customer-operated, 465, 465f customer-operated CES in, 465, 465f definition of, 457 direction terminology of, 472 efficiency of, 496 Emulated Circuit Demultiplexing Function (ECDX) in, 471 end-to-end delay in, 489 Ethernet Flow Termination Function (EFT) in, 472 Facility Data Link (FDL) in, 487-488 general principles of, 466-467 loopbacks in, 491-493, 493f customer-controlled, 492-493, 492f
provider-controlled, 491, 492f mixed-mode, 465-466, 466f mixed-mode CES in, 465-466, 466f PDH Circuit Emulation Service in, 468t, 469, 470t protection in, 493-496 scenario 1: dual unprotected services, 493-494, 493f scenario 2: dual protected services, 494, 494f scenario 3: single protected service, 494-495, 495f scenario 4: single-to-dual interface service, 495-496, 495f service impairment in, 489-490 bit errors in, 490 frame delay and frame jitter in, 489-490 Frame Error Ratio and IWF behavior in, 490 frame loss in, 489 TDM, errors within MEN causing, 489-490 service interface types in, 467, 467f service quality in, 496 SONET/SDH Circuit Emulation Service in, 469-471, 470t, 471f synchronization in, 472-475, 473f, 473t, 474t synchronized administration in, 478-487, 479t multi-service-provider-owned network in, 480, 480t, 483-487, 483f, 484t separate and diverse, 479 service clock preservation in, 479 service timing-private network in, 480, 480t, 486t, 487 single service-provider-owned network in, 480-483, 480t, 481f, 481t synchronization traceability in, 479-480 synchronization trail in, 479 transport timing-private network in, 480, 480t, 486t, 487 synchronous IWF and asynchronous tributaries in, 477 synchronous IWF and synchronous tributaries in, 477 TDM Access Line Service (TALS) in, 463, 464f operational modes of, 464, 464f TDM line service (T-Line) in, 458-463 (See also TDM line service (T-Line)) TDM service interface examples in, 467-468, 468t
TDM Service Processor (TSP) in, 468-469 TDM signaling in, 490-491 Metro Ethernet Network (MEN) resiliency, 497-524 background on, 497-499 event timing in, 501-503, 503f failure types in, 500-501 framework for protection in, 521-524 aggregated line and node protection (ALNP) in, 522 background on, 521 MEF protection schemes in, 521 OA&M-based End-to-End Path Protection (EEPP) in, 522 packet 1+1 End-to-End Path Protection (EEPP) in, 523 shared mesh protection in, 523-524 Protection Reference Model (PRM) in, 505-516, 505f (See also Protection Reference Model (PRM)) protection types in, 499-500 requirements for protection mechanisms in, 516-520 network-related, 517-520 (See also Network-related requirements, for MEN protection) service-related, 516-517 resource selection in, 501 shared-risk link group (SRLG) in, 503 SLS commitments in, 504 timing issues in, 504 Metro Ethernet services, 436, 437f VPLS importance in, 449-450, 450f Metro Ethernet services over MPLS, 437-449 E-LAN service emulation walk-through example in, 448-449, 448f emulation of E-LAN services using VPLS in, 443-447 avoiding VPLS forwarding loops in, 446 VPLS PW encapsulation in, 447 VPLS PW setup in, 447 VPLS reference model in, 443-446, 444f, 445f emulation of E-Line Services using VPWS in, 438-441 E-Line service emulation walk-through example in, 435f, 441-443, 443f Ethernet modes (raw vs. tag) in, 439 Ethernet PWs via LDP in, establishing, 440-441, 441f
VLAN tag processing in, 439-440 VPWS reference model in, 429f, 435f, 438, 439f Mid-span-meet, 693 Mixed mode service, 465 Mixing products, 704 m:n protection architecture, 300, 301f m:n protection type, 500 Monitoring, nonintrusive, 44 MP2MP protection, in MEF, 512-514, 512f, 514f MPLS. See Multiprotocol Label Switching (MPLS) MPLS control plane, 453-454 MPLS data plane (MDP), 399, 400f MPLS forwarding plane, 454 MPLS label stack, 451, 452f, 453f MPLS networks. See Multiprotocol Label Switching (MPLS) networks MPLS signaling protocols, 451 MPLS tunnels, 434 carrying PWs over, 435, 435f MTIE. See Maximum Time Interval Error (MTIE) Multicast service frame, 347-348 Multidomain path diversity, 632, 632f Multi-Frame Alignment Signal (MFAS) in OTUk, 100, 101f in VCAT multiframe, 137, 139f Multi-Frame Indicator (MFI), 132 Multilayer survivability, 318-319 Multi-Longitudinal-Mode (MLM), 713 Multiplexed access, in Ethernet client interfaces, 402, 403f Multiplexing, in synchronization, 200-203 Multiplexing mode, 459, 460-461, 461f Multiplex Section (MS), SDH, 214 Multiplex Section overhead (MS-OH) overhead bytes in, 122f, 123 in SDH frame, 120, 121f Multiplex Structure Identifier (MSI) OPU2, in 4 × ODU1 to ODU2 multiplexing, 110-111, 110f, 111f OPU3, in ODU1/ODU2 to ODU3 multiplexing, 115, 116f Multiplex structures, of OTN, 119-152 applications of, 119-120 bandwidth evolution in, 127-129, 127f-130f implementers guide for VCAT and LCAS in, 144-152 alignment within VCG in, 148-149
differential delay buffers in, overview of, 147 differential delay buffers in, sizing of, 149 differential delay buffers in, structure and management of, 146-147, 147t differential delay in, compensation of, 145-146 differential delay in, detection of, 144-145 distribution/reconstruction order in, controlling, 150-151, 151t member status in, 151-152 processing time in, 149-150 Link Capacity Adjustment Scheme (LCAS) in, 137, 140-143 details of, 141-142, 142f, 143f link capacity decrease in, planned, 140-141 link capacity decrease in, temporary, 141 link capacity increase in, 140 new clients in, 130-131, 130t Synchronous Digital Hierarchy (SDH) structure in, 120-127 overhead bytes in, 122-123, 122f overview of, 120-121, 121f pointers in, 123-125, 124f substructuring in, 126-127 VC-n structure in, 125-126, 125f, 126f Virtual Concatenation (VCAT) in, 131-139 additional benefits of, 136 details of, 137, 138t, 139t differential delay in, 131-132, 132f origins and value of, 131 payload distribution and reconstruction in, 133-134, 133f, 134t, 135t restrictions of, 136-137 VCAT LCAS and GFP advantages in, 144 Multipoint-to-multipoint Ethernet services and, 382, 384f topology of network portion for, 384, 384f Multipoint-to-multipoint EVC, 351 Multiprotocol, 452 Multiprotocol Label Switching (MPLS) as CO-PS candidate, 399 data plane (MDP), 400, 400f, 401t Multiprotocol Label Switching (MPLS) networks, 451-455 benefits of, 455
Ethernet services over, 9, 425-455 (See also Ethernet (Services), over MPLS networks) forwarding equivalence class in, 451 L2VPNs over (See Layer 2 Virtual Private Networks (L2VPNs), over MPLS backbone) label encoding in, 452, 453f label in, 451-452, 452f label stack operations in, 453 label switched path (LSP) in, 454, 455f label switched router (LSR) in, 453 Metro Ethernet services over (See Metro Ethernet services over MPLS) MPLS control plane in, 453-454 MPLS forwarding plane in, 454 Multiprotocol Label Switching (MPLS) tunnels, 434 carrying PWs over, 435, 435f Multiservice converged packet switched backbone, in VPN, 427 Multiservice optical transport network infrastructure, 3-6 data and TDM-friendly next-generation SONET/SDH in, 4-6 optical transport hierarchy in, 4 Multiservice platforms packet over SONET in, 273 timing distributor example of, 272, 272f TSWC01622 and, 272-273 Multi-service-provider-owned network, in MEN CES, 480, 480t, 483-487, 483f, 484t service timing in, 483-485, 486t transport timing in, 485-487, 486f, 486t Multiservices Provisioning Platforms (MSPPs), 663-664 Multi-Source Agreement (MSA), 717 Multispan configuration, 702, 702f Multitier topologies, 601-602, 602f Mult timing method, 285-286, 285f N n:1 protection type, 500 Name, 589, 590 Naming, in ASTN, 568-569 Narrowly spaced signals, 694 Native Service Processing (NSP), 431, 439 Near-end crosstalk (NEXT), 738 Negative justification, 201 Negative pointer adjustment, 204 Negative stuff, 201 Net coding gain (NCG). See Coding gain
Network. See also specific networks flexibility of, 33-34 Network Attached Storage (NAS), 529, 530f Network Call Controller (NCC), 581f, 582, 584-585, 584f Network Call Correlation Information Element, 643 Network connection, 33, 34f Network connectivity, Ethernet Connection (EC) attributes and, 382-384, 383f, 384f Network Element (NE), 204, 507. See also Intra-Network Elements communications design and interface architecture in, 664-668 conceptual data processing partitioning within, 664-665, 664f hybrid (TDM + cell/packet based) architecture of, 667-668 packet based, 665-666, 665f TDM based, 666, 666f in MEN, 329 in SDH network, 136 Network Element (NE) architecture, 268-279 example of, 268-269, 268f large, 278-279, 279f medium, 277-278, 278f small, 276-277, 277f system architecture of, 275-276, 276f timing distributor (TD) functions in, 270-275, 272f, 274f application example of, 271, 272f fan-out in, 271 synchronization selection of, 271 synthesis in, 271 system block architecture of TSWC01622 and, 272, 272f timing engine (TE) functions in, 269-270 Network engineering, synchronization, 249-250 Networking. See also specific networks on processing efficiency with CPU cycles supporting backup and defragmentation, 529 on storage resource use, 528-529 Network Interworking Network-to-Network Interface (NI-NNI), in MEN, 329, 332-333 Network Management Systems (NMSs), 593 Network Processor (NP), 665
Network protection, 295-296. See also Protection switching Network-related requirements, for MEN protection, 517-520 backward compatibility requirements in, 519 bidirectional switching in, 519 degrade condition threshold in, 517-518 effect on user traffic in, 520 network topology in, 520 protected failures in, 517 protection control requirements in, 518-519 protection schemes in, management requirements for, 520 QoS in, 520 robustness in, 519 transport layer protection mechanism interactions in, 518 Network resiliency in fiber channel SANs, 537 in MEN, 323-324 Network restoration. See Restoration, network Network survivability. See Survivability, network Network synchronization. See Synchronization Network-to-Network Interface (NNI), 8, 316, 317f, 375 Ethernet transport and, 405-411, 406f, 407f, 408f, 410f, 411f Network topology, in MEN protection, 520 New Data Flag (NDF), 123 Node, 507, 577. See also Network Element (NE) Node attributes, 623-624, 624t Node clock, 215 Node failure, 501 Node identification (node ID), 623 Noise clock, 207-208 phase, 207-209 Noise enhancement, 754, 755f Noise power (N), 738 Nonintrusive monitoring, 44 Non-return-to-zero (NRZ), 191-192. See also NRZ entries Nonrevertive operation, 304 Nonservice delimiting, 439 No Request (NR), 311 Normal contributions, 773 Notify message, 629
NRZ 1.25G, 706, 707t NRZ 2.5G, 706, 707t, 708 NRZ 10G, 706, 707t, 708t NRZ 40G, 706, 707t, 708t NRZ OTU3, 708t Null extension header, 162 Null signal, in protection switching, 310 O OAM. See Operations, Administration, and Management (OAM) OA&M-based End-to-End Path Protection (EEPP), in MEN, 522 OAM&P, 665 Observation time, in source traceability, 262 OC-192 System Packet Interface SPI-4 Phase 1, 674-677, 675f SPI-4 Phase 2, 679-681, 680f OC-768 System Packet Interface, SPI-5, 685-687, 685f OChs. See Optical Channels (OChs) Octet slip, 234 ODCr jitter generation and transfer, 219-227 ODU. See Optical Channel Data Unit (ODU) layer network ODU1/ODU2 to ODU3 multiplexing, 112-116 structure in, 112, 113f ODU1 to ODU2 justification rate, 105 ODU1 to ODU3 justification rate, 106-107 ODU2 links and subnetworks, OTN equipment for, 49-50, 49f, 50f ODU2 server trail, 47 ODU2 subnetwork, 47-49, 47f-48f ODU2 to ODU3 justification rate, 105-106 ODUk Clock (ODC), 214, 219 ODUk multiplexing, 104-116 4 × ODU1 to ODU2 multiplexing in, 107-112 (See also ODUk multiplexing) 4 × ODU1 to ODU2 justification structure in, 109, 109f, 109t frequency justification in, 111-112, 112t OPU2 multiplex structure identifier (MSI) in, 110-111, 110f, 111f OPU2 payload structure identifier (PSI) in, 110 structure in, 107, 108f multiplexing data rates in ODU1 to ODU2 justification rate in, 105
ODU1 to ODU3 justification rate in, 106-107 ODU2 to ODU3 justification rate in, 105-106 ODU1/ODU2 to ODU3 multiplexing in, 112-116 frequency justification in, 115, 116t justification structure in, 114, 114t OPU3 multiplex structure identifier (MSI) in, 115, 116f OPU3 payload structure identifier (PSI) in, 114 structure in, 112, 113f OIF. See Optical Internetworking Forum (OIF) OIF SxI-5, 758 OIF TFI-5, 759 OMS. See Optical Multiplex Section (OMS) layer One-to-One Map, 352-354, 352f, 353f On-the-fly restoration, 315 Open Shortest Path First (OSPF), 454 Operational aspects, of ASTN architecture, 559-562 Operational planes, MEN, 337 Operational Support System (OSS), 561 Operations, Administration, and Management (OAM) definition of, 378 domain service vs. network and, 413, 414f generic message format of, 418, 419f mapping maintenance entities to, 417, 418t point-to-point, Ethernet flow reference model and, 414, 415f, 416f Operations, Administration, Management and Provisioning (OAM&P), 665 Operation types, in protection switching, 304 Optical Amplifier (OA), 695, 702, 702f Optical Channel Data Unit (ODU) layer network, 24 Optical Channels (OChs), 4, 23 Optical Channel Transport Unit (OTU) layer network, 24 Optical Cross-Connects (OXCs), 63 Optical faults, 728-732 in conventional transmitters and receivers, 728-731 faults in optically amplified systems, 731-732 Optical fibers, types and recommendations for, 703-705, 705t Optical interfaces, 12, 708, 708t
Optical interface specification. See ITU-T standards, optical interface Optical interface standards, ITU-T. See ITU-T standards, optical interface Optical Internetworking Forum (OIF), 1, 317 Optical laser transmitter. See also ITU-T standards, optical interface Continuous Wave (CW), 715 electro-absorption/externally modulated (EMLs), 715, 721-722 Optical Multiplex Section (OMS) layer, 23, 77, 77f Optical Network Elements (ONEs), in physical layer transverse compatibility, 702-703 Optical path penalty, modeling and verification of, 711-712 Optical receiver. See ITU-T standards, optical interface Optical Signal-to-Noise Ratio (OSNR), 72-73, 73f, 732 measuring coding gain with, 72-73, 73f, 732 Optical Supervisory Channel (OSC), 570 Optical Transmission Section (OTS) layer, 23, 77, 77f Optical Transport Hierarchy (OTH), 4, 64 Optical Transport Network (OTN), 63 3R points in, 65, 65f architecture of, 23 hierarchy overview in, 64-66, 65f, 76-78, 76f-78f infrastructure of (See Infrastructure, optical transport network) layer network trails in, 76, 77f layers and containment relationships in, 76 in Recommendation G.805, ITU-T, 19 standards for, 64-66, 65f, 697-699 survivability of, 313, 314f Optimal path, 632 Option 1 networks, 217 Option 2 networks, 217 OPU2 multiplex structure identifier (MSI), in 4 × ODU1 to ODU2 multiplexing, 110-111, 110f, 111f OPU2 payload structure identifier (PSI), in 4 × ODU1 to ODU2 multiplexing, 110 OPU3 multiplex structure identifier (MSI), in ODU1/ODU2 to ODU3 multiplexing, 115, 116f OPU3 payload structure identifier (PSI), in ODU1/ODU2 to ODU3 multiplexing, 114
OPUk frame structure, 82, 83f OPUk overhead, 83, 84t OSNR (Optical Signal-to-Noise Ratio), 72-73, 73f, 732 measuring coding gain with, 72-73, 73f, 732 OTH. See Optical Transport Hierarchy (OTH) OTM-n.m server signal, STM-N client signal on, 37-38, 38f OTN. See Optical Transport Network (OTN) OTN equipment, for ODU2 links and subnetworks, 49-50, 49f, 50f OTN mesh network, STM-64 switched connection services via, 45-49, 46f-48f OTS. See Optical Transmission Section (OTS) layer OTU. See Optical Channel Transport Unit (OTU) layer network OTUk frame structure, 98-99, 99f OTUk jitter accumulation, 219-227, 222f-225f OTUk jitter network limit and tolerance, 219-227 OTUk overhead and processing, 98-101 frame alignment overhead in, 100, 100f, 101f frame alignment signal (FAS) in, 100, 100f multiframe alignment signal (MFAS) in, 100, 101f OTUk frame structure in, 98-99, 99f scrambling in, 99-100, 99f Outgoing label, 451 Out-of-band approaches, 596-597 Ovenization, in holdover mode, 290 holdover value calculation and, 288 phase offset computation and, 288 Overhead, 120 Overhead area, in SDH frame, 120, 121f Overhead bytes, in SDH structure, 122-123, 122f OXCs. See Optical Cross-Connects (OXCs) Packet 1+1 End-to-End Path Protection (EEPP), in MEN, 523 Packet based Network Element, 665-666, 665f Packet/cell switch fabric, line card using, 665, 665f Packet over SONET, 273 Packet rings, GFP in, 186, 186f
Packet switched network (PSN), 439 Packet switching overlay, for EoS, 397, 398f Packet transport, on public networks, GFP in, 155-156, 156f PAD field, of Ethernet media access control layer, 370 Parties, 560 Partitioning, 24-29, 25f, 26f. See also Partitioning layering and, 27 link, 25-26, 26f parallel, 26, 26f recursive, 25, 25f serial and component, 26-27, 27f topology abstraction with, 27-29, 28f Path computation service approach, 622 Path diversity multidomain, 632, 632f simple, 631-632, 632f Payload area GFP, 160 in SDH frame, 120, 121f Payload convergence, in L2VPNs, 429 Payload distribution and reconstruction, in VCAT, 133-134, 133f, 134t, 135t Payload frame check sequence (FCS), GFP, 162 Payload header, GFP, 161 Payload information, GFP, 162 Payload Structure Identifier (PSI), 83, 120 OPU2, in 4 × ODU1 to ODU2 multiplexing, 110 OPU3, in ODU1/ODU2 to ODU3 multiplexing, 114 Payload Type (PT), 83 PCB. See Printed circuit board (PCB) PDH. See Plesiochronous Digital Hierarchy (PDH) PDH Circuit Emulation Service, 468t, 469, 470t in MEN, 468t, 469, 470t PDH DS-3 client signal, on STM-N server, 36, 37f, 39-40 PDH generation networks, 551 PDU Length Indicator (PLI) field, 160 Peak-to-peak jitter, 209-210 Per Class of Service Identifier bandwidth profile, application of, 364, 364f Per EVC bandwidth profile, application of, 363, 364f Performance, in fiber channel SANs, 538 Performance management, in MEN, 324, 524-526, 525f
Performance monitoring, in MEN CES, 487-489 end-to-end delay in, 489 Facility Data Link (FDL) in, 487-488 Performance parameters, 53-54 Per ingress UNI bandwidth profile, 363-365 Periodic jitter (PJ), 745, 746 Permanent connection (PC), 555 Permanent Virtual Circuit (PVC), 395 Phase, 192 Phase error, instantaneous, 207 Phase-error function, 192 Phase noise, in synchronization, 207-209 Phase offset computation, in holdover mode, 288, 291 Physical layer longitudinal compatibility, 700-701 Physical layer standards (PHYs), Ethernet, 760 Physical layer transverse compatibility, 701-703, 702f PICMG, 760 Plane, 552, 554 control and transport, separation of, 556-557 management, 554 transport, 554 Plenipotentiary Conference, ITU, 769 Plesiochronous Digital Hierarchy (PDH), 3, 63 standards for, 692-693 Plesiochronous operation, 238-239, 239t Plesiochronous timing, 267 PNNI signaling, G.7713.1, 643-644 Pointer, in SDH structure, 123-125, 124f Pointer adjustments, 203-205 isolated, 205 negative and positive, 204 Points, 31-32, 31f Point-to-multipoint, Ethernet services and, 382 Point-to-multipoint EVC, 367 Point-to-point Ethernet services and, 384 OAM and, 414, 415f Point-to-point EVC frame delay performance objective and, 358-359 frame loss performance objective and, 359 Point-to-Point protocol (PPP), 452 Polarization Mode Dispersion (PMD), 704 Policy, in network protection, 498 Port controller, 580
Positive justification, 201 Positive/negative/zero, 202 Positive pointer adjustment, 204 Positive stuff, 201 Power budget design optical path penalty modeling and verification in, 711-712 worst-case design approach to, 710-711 Power Spectral Density (PSD), 206, 207 PRC/PRS autonomy, 240-241, 240f Preamble field, 369 Pre-emphasis, 749 Premises (enterprise networks), 662, 662f Premium SLA, 465 Primary Reference Clock (PRC), 213, 215-216, 217 Primary Reference Source (PRS), 217 Primary site. See Storage Area Networks (SANs); Storage networking Primitives, 580 Printed Circuit Board (PCB), 677 Printed Circuit Board (PCB) interconnects, 736, 738-742 environmental effects in, 740-742, 741f layer connection in, 739-740, 740f material loss in, 739 Printed Circuit Board (PCB) traces, 12 Priority-based switching, 241 Priority tagged, 349 Priority tagged frame, 371 Priority tagged service frame, 348 Private line, 380, 381f, 381t in all-to-one bundling map, 351 in Ethernet services over public WAN, 380 legacy, 539, 542t Private service, 380 Processing components, in MEN, 336-337 Propagation delay, 132, 132f Protected failures, in MEN, 517 Protection, network, 295-296 definition of, 593 in MEN, 497-524 (See also Metro Ethernet Network (MEN) resiliency) in MEN CES, 493-496 scenario 1: dual unprotected services, 493-494, 493f scenario 2: dual protected services, 494, 494f scenario 3: single protected service, 494-495, 495f scenario 4: single-to-dual interface service, 495-496, 495f
vs. restoration, 314 Protection control requirements, in MEN, 518-519 Protection path, 132, 132f Protection Reference Model (PRM), 505-516 MEF protection mechanism in, 507-516 aggregated line and node protection (ALNP) in, 508-509, 508f application protection constraint policy (APCP) in, 516 end-to-end path protection (EEPP) in, 509-511, 510f link protection based on link aggregation in, 514-515, 515f MP2MP protection in, 512-514, 512f, 514f topology in, 507 transport in, 506 use and structure of, 505, 505f Protection switching, 295-313, 420 architectures of, 297-303 1+1, 298, 298f (1:1)n, 300, 302f 1:n, 298-299, 299f m:n, 300, 301f ring, 302-303, 303f Automatic Protection Switching Protocol (APS) in, 310-312 APS signal in, 310-311 external commands in, 311 priority in, 312 process states in, 311 classes of, 306-309 subnetwork connection protection (SNC-P) in, 307-308, 308f, 309f trail protection in, 306, 307f unidirectional path switch ring (UPSR) in, 309 definition of, 295-296 examples of, 312-313 hold-off timer in, 309-310 network objectives in, 297 null signal in, 310 parameters of, 303-305 operation types in, 304 protocol types in, 304-305 switching types in, 303-304 temporal model in, 305, 306f Protection types, MEN, 499-500 1+1, 499 1:1, 500 1:n, 500
m:n, 500 n:1, 500 Protocol analysis, in ASTN, 637-640 approach to, 637-639, 638t requirements implications on protocol solutions in, 639-640 Protocol Controller (PC), 580, 581f, 582-583, 607 Protocol data unit (PDU), 154 Generic Framing Procedure (GFP) for with fixed-length PDUs, 157 with small, fixed-length PDUs, 157 with small PDUs, 157-158 with variable-length PDUs, 158 Protocol neutral, 634 Protocol rules, in synchronization of status messages (SSM), 244 Protocol types, in protection switching, 304-305 Provider Bridges, in future of Ethernet services, 367 Provider-controlled loopbacks, 491, 492f Provider Edge (PE), 525, 525f. See also Ethernet (Services), over public WAN Provider provisioned VPN (PPVPN), 426-427, 427f Pseudowire (PW) establishing Ethernet, via LDP, 440-441, 441f in L2VPNs, 428-429, 429f over MPLS tunnels, 435, 435f VPLS PW encapsulation in MEN and, 447 VPLS PW setup in MENs and, 447 Pseudowire (PW) preprocessing, service-specific, 431 Public Switched Telephony Network (PSTN), 557-559 Public wide-area networks, Ethernet services over, 8-9 Pull-in range, 218 SEC and ODC, 218 Pull-out/hold-in ranges, SDH Equipment Clock (SEC) and ODUk Clock (ODC), 218 Pull-out range, 218 PW. See Pseudowire (PW) PW label, 435 PW-PDU, 429, 429f Q Q factor, 732 measuring coding gain with, 70-71, 71f, 72f
Quality, in synchronization areas, 237t, 239-240 Quality-based switching, 242 Quality of Service (QoS), 9, 11 in MEN protection, 520 VPN tunnels on, 433 Quasi-static, 218 Query, 629 QueryIndication, 629 QueryRequest, 629 Questions, 771, 771f R R-1, 516 R-2, 516 R-3, 516 R-4, 516-517 R-5, 517 R-6, 517 R-7, 517-518 R-8, 518 R-9, 519 R-10, 519 R-11, 519 R-12, 519 R-13, 519 R-14, 520 R-15, 520 R-16, 520 R-17, 520 R-18, 520 R-19, 520 Radiocommunication Advisory Group (RAG), 769f, 770 Radiocommunication Assembly (RA), 769f, 770 Radiocommunication Sector (ITU-R), 769f, 770 Radiocommunication Standardization Bureau (BR), 769f, 770 Radio Regulations Board (RRB), 769, 769f, 770 RA identifier (RA ID), 620 Random jitter (RJ), 745 Random walk frequency modulation (RWFM), 209 Rapid spanning tree protocol (RSTP), 323-324 Rapporteur, 771-772, 771f RC identifier (RC ID), 620 Reachability information, 624 Reassembly, in L2VPNs, 431
Receiver, equalization at, 754-756, 755f, 756f
Recommendations. See ITU-T recommendations
Recovery-point objective (RPO), 538-539
Recovery time, 504
Recovery-time objective (RTO), 538-539
Recovery time T5, 305, 306f
Red bandwidth profile, 361-362, 361f
Redundant Array of Independent Disks (RAID), in storage, 529, 530f
Redundant service access, example of, 353f, 354
Reed-Solomon code, 68-70
Reference acceptance, 242
Reference duplication, 241-242
Reference link model, MEN, 340, 341f
Reference Model for Open Distributed Processing (RM-ODP), 574-575
Reference points, 31; in ASTN, 562-564; in MEN, 329-333: definition and uses of, 329-330, 330f; Ethernet Wide Area Network (E-WAN) in, 329; External Network-to-Network Interface (E-NNI) in, 329, 332; Internal Network-to-Network Interface (I-NNI) in, 329, 332; Network Interworking Network-to-Network Interface (NI-NNI) in, 329, 332-333; other access arrangements in, 333; Service Interworking Network-to-Network Interface (SI-NNI) in, 329, 333; Service Node Interface (SNI) in, 333, 334f; User-Network Interface (UNI) in, 326-327, 330-331, 331f
Reference selection, 241-242
Regenerator Section overhead (RS-OH): overhead bytes in, 122f, 123; in SDH frame, 120, 121f
Regenerator section (RS), 214
Regular clock, 202
ReleaseIndication message, 629
ReleaseRequest message, 629
Rerouting, 571
Rerouting domain model, G.8080, 594, 594f
Resilience, network: in fiber channel SANs, 537; in MEN, 323-324
Resource class, 625
Resource Reservation Protocol (RSVP), 434
Resource selection, in MEN, 501
Restoration, network, 296, 314-317: advantages of, 314; definition of, 296, 593; interoperability in, 316-317; ITU recommendations on, 314-315; in network protection, 498; on-the-fly, 315; preplanned routing with centralized route calculations in, 315; vs. protection, 314; restoration time in, 315-316; techniques of, 315
Restoration architecture, in G.8080, 593-595, 594f
Restoration time categories, for Ethernet services protection, 516-517
ResvTear/ResvErr, in GMPLS RSVP-TE (G.7713.2) signaling, 647
Retiming, in synchronization architectures for SONET/SDH, 270
Reversion, in MEN, 503
Reversion instant, in MEN, 502
Revertive mode, in MEN, 501
Revertive operation, 304
Ring extension header, 162
Ring network, SSM-based restoration in, 245-246, 246f, 247f
Ring protection architecture, 302-303, 303f
Ring protection schemes, SONET/SDH, 41-42, 41f, 42f
Ring topology, in SONET/SDH, 40-43, 41f-43f
RMS jitter, 209-210
Robustness, in MEN protection, 519
Routing, in ASON (G.7715 and G.7715.1), 611-626: architecture of, 615-619, 616f-618f; hierarchy in, 619-621, 620f; information exchange in, 621-626 (See also Routing information exchange); methods and protocols of, 651-652; requirements for, 611-614: architectural, 612-613, 612f, 613f; protocol in, 613-614, 614f, 615f
Routing, in ASTN, 568
Routing adjacency, 615, 616f
Routing Area (RA), 577, 615-618, 616f-618f
Routing Controller (RC), 581, 581f
Routing control topology, 615
Routing information exchange, 621-626: fundamentals of, 621-623; general attributes of, 623; layer-specific characteristics in, 625-626, 626t; link attributes of, 624, 624t; node attributes of, 623-624, 624t
Routing performer (RP), 615, 616f
3R points, 65, 65f
RSVP session, 646
RZ 40G, 706, 707t, 708t
S

S1 and ESF SSMs, translation between, 284t, 285
SAN. See Storage Area Networks (SANs)
SAN islands, 532-534
SAN protocol. See Storage Area Networks (SANs)
Satellite timing, in synchronization of optical networks, 248-249
Scalability, network, in fiber channel SANs, 538
Scrambler options, in GFP, 172-174, 173f, 173t
Scrambler resynchronization delay, in GFP, 182
Scrambling: in OTUk, 99-100, 99f; of SDH/SONET frame, 122
SCSI (Small Computer Systems Interface), 530, 531t
SDH, 3, 63; in Recommendation G.805, ITU-T, 19
SDH Equipment Clock (SEC), 214
SDH Equipment Clock (SEC) and ODUk Clock (ODC) frequency accuracy and clock modes, 218-219
SDH Equipment Clock (SEC) and ODUk Clock (ODC) pull-in and pull-out/hold-in ranges, 218
SDH generation networks, 551
SDH Multiplex Section Shared Protection Rings (MS-SPRING), 303, 303f
SDH structures, 120-127: multiplex, 128-129, 129f; overhead bytes in, 122-123, 122f; overview of, 120-121, 121f; pointers in, 123-125, 124f; substructuring in, 126-127; VC-n structure in, 125-126, 125f, 126f
Secondary site. See Storage Area Networks (SANs); Storage networking
Section monitoring byte descriptions, in OTNs, 101-104: backward defect indication (BDI) in, 102-103; backward error indication and backward incoming alignment error (BEI/BIAE) in, 103, 103t; Bit Interleaved Parity (BIP-8) in, 83f, 101; incoming alignment error (IAE) in, 104; Trail Trace Identifier (TTI) in, 77f, 101, 102f
Selection, in Synchronization Status Messages (SSM), 244
Selector connection function, 44
Separation, of control and transport planes, 556-557
Sequence diagrams, 587
Sequence Number (SQ), 133
Sequencing functions, in L2VPN over MPLS backbone, 430
SERDES (Serializer/Deserializer), 12, 76-77, 665, 735
SERDES Framer Interface-5 (SFI-5), 682-685: signals in, receive direction, 682-684; signals in, transmit direction, 684-685
SERDES-Framer Interface (SFI-5), 758
SERDES integrated circuits, in 300-pin transponder, 722
Serial interconnects, high-speed, 12-13, 735-764: architecture of, 737-742: Printed Circuit Board (PCB) interconnects in, 738-742 (See also Printed Circuit Board (PCB) interconnects); topologies of, 737-738; background on, 735; backplane interconnect in, 736; chip-chip interconnect in, 736; compliance test methodology for, 742-748: bathtub curve analysis of jitter in, 746-748, 747f; eye mask in, 742-744, 743f; jitter modeling conventions in, 744-746 (See also Jitter); higher and higher speeds for, 762-764; interconnect extension using de-emphasis and equalization in, 748-757: de-emphasis at transmitter in, 749-753, 750f, 751f, 753f; equalization at receiver in, 754-756, 755f, 756f
usage models in, 756-757
standards-based, 758-762: Backplane Ethernet in, 760; IEEE 802.3ae Clause 47, XAUI in, 759-760; OIF SxI-5 in, 758; OIF TFI-5 in, 759; summary of, 760-761, 761t, 762t
Serializer/Deserializer (SERDES). See SERDES
Service activation process elements, in ASTN, 603-604
Service (call) perspective, 562
Service Capability Exchange (SCE), 610-611
Service clock, 258-259
Service clock preservation, in MEN CES, 479
Service configuration requirements, for Ethernet services protection, 516
Service Connectivity, in VPNs, 426
Service delimiting, 439
Service frame: broadcast, 347-348; delivery transparency in, 345; egress, 345; error detection in, 346; in Ethernet over MEN, 345; format of, 345-346; ingress, 345, 363, 363f; in MEN, 526; multicast, 347-348; priority tagged, 348; unicast, 346-349 (See also Unicast service frame); untagged, 348
Service impairment, in MEN CES, 489-490: bit errors in, 490; frame delay and frame jitter in, 489-490; Frame Error Ratio and IWF behavior in, 490; frame loss in, 489; TDM, from errors within MEN, 489-490
Service interface types, in MEN CES, 467, 467f
Service Interworking Network-to-Network Interface (SI-NNI), in MEN, 329, 333
Service Layer, in VPNs, 426
Service Level Agreement (SLA), 8, 9, 374, 561: in Ethernet Private Line (EPL) service, 387; network protection in, 497
of storage network extension, 546, 547t
Service Level Specification (SLS), 324: network protection in, 497-498
Service multiplexing, 352-355: bundling map in, 354-355, 354f; in Ethernet services MEN, 352-367; one-to-one map in, 352-354, 352f, 353f
Service Node Interface (SNI), in MEN, 333, 334f
Service provider networks, connectionless networks, 19
Service quality, in MEN CES, 496
Service-related requirements, for MEN protection, 516-517
Services model, for EVC identification at UNI. See Ethernet Virtual Connection (EVC), identification at UNI of
Service-specific PW preprocessing, 431
Service timing: in multi-service-provider-owned network, 483-485, 486t; in private network, 480, 480t, 486t, 487; in single service network, 482
Session paradigm, in GMPLS RSVP-TE (G.7713.2) signaling, 648
SetupConfirm message, 628
SetupIndication message, 628
Severely Errored Seconds (SES), 304
SFI-4 specification, 722
SFI-5 (SERDES Framer Interface-5), 682-685, 758: signals in, receive direction, 682-684; signals in, transmit direction, 684-685
Shared medium, 367
Shared mesh protection, MEN, 523-524
Shared redundancy, in MEF protection, 511
Shared-risk link group (SRLG), in MEN, 503
Short-haul, 709, 710t
Short-haul/intermediate-reach, 709, 710t
Signal, 738: in links, 625; narrowly spaced, 694; widely spaced, 696
Signal classes, overview of, 706, 707t
Signal class NRZ 1.25G, 706, 707t
Signal class NRZ 2.5G, 706, 707t, 708
Signal class NRZ 10G, 706, 707t, 708t
Signal class NRZ 40G, 706, 707t, 708t
Signal class NRZ OTU3, 708t
Signal class RZ 40G, 706, 707t, 708t
Signal Degrade (SD), 304, 310
Signal Fail (SF), 304, 310
Signaling, in ASTN, 568
Signaling attributes, 630, 631t
Signaling Communications Network (SCN) architecture (G.7712), 595-603: background on, 595-596; control plane message delivery in, 597-599, 598f; DCN reliability considerations in, 602-603; DCN security considerations in, 603; DCN topologies in, 599-602: congruent, 600, 600f; focused (hub and spoke), 600-601, 601f; full mesh, 599, 599f; hybrid (multitier), 601-602, 602f; mechanisms of, 652-653, 653f; signaling methods in: in-band, 596; out-of-band, 596-597
Signaling control domains, in GMPLS RSVP-TE (G.7713.2) signaling, 646
Signaling (G.7713), in ASON, 626-633: application example of, 631-633, 632f; attributes of, 630, 631t; background on, 626-627; call and connection control sequences in, basic, 628-629, 629f; call and connection management operations in, 627-628, 627f; methods and protocols for, 643-651: G.7713.1 PNNI signaling in, 643-644; G.7713.2 GMPLS RSVP-TE signaling in, 644-648 (See also GMPLS RSVP-TE (G.7713.2) signaling); G.7713.3 GMPLS CR-LDP in, 648-649; interoperability and interworking in, 649-651
Signaling methods, in Signaling Communications Network architecture (G.7712): in-band, 596; out-of-band, 596-597
Significant instant, 192
Simple path diversity, 631-632, 632f
Single protected service, in MEN CES, 494-495, 495f
Single service-provider-owned network, in MEN CES, 480-483, 480t, 481f, 481t: service timing in, 482; transport timing in, 487-488
Single-to-dual interface service, in MEN CES, 495-496, 495f
Sinusoidal jitter, 196
SLA. See Service Level Agreement (SLA)
Slip, 212: (controlled) octet, 234; controlled vs. uncontrolled, 212
SLS commitments, in MEN, 504
SLS restoration instant, in MEN, 502, 503f
SLS restoration time, in MEN, 503, 503f
Small Computer Systems Interface (SCSI), 530, 531t
Small Form-Factor Pluggable (SFP) devices, 718-719, 719f
Small Form Factor (SFF) devices, 717-718, 717f
SNPP identifiers, 590-591
Soft link failure, 500
Soft Permanent Connection (SPC), 556, 556f
SONET, 3, 63: capabilities of, 542-543; data growth and, 527-528; as ideal distance extension protocol, 542-547, 542t, 543f; for Storage Area Networks (SANs), 531-548, 532f (See also Storage Area Networks (SANs), SONET for); storage area services over, 10; storage networking and, 528-531, 530f, 531t (See also Storage networking); for voice communication, 542
SONET, as distance extension protocol, 542-547, 542t, 543f: additional benefits of, 546-547, 547t; capabilities of, 542-543; in SAN extension application, 543, 543f; Service Level Agreement (SLA) for, 546, 547t; standards in, 544-547: flow control in, 545; Generic Framing Procedure (GFP) in, 544; Virtual Concatenation (VC) in, 544-545
SONET Minimum Clock (SMC), 214, 272
SONET multiplex structures, 129
SONET/SDH: data- and TDM-friendly next-generation, 4-6; Inter-Data-Center connectivity based on, 543, 543f; physical layer transport for EoS and, 397, 397f; ring topology in, 40-43, 41f-43f; standards for, 693-695; for Storage Area Networks (SANs), 539, 541-542, 542t; storage over, 541-542, 542t
SONET/SDH Circuit Emulation Service, 469-471, 470t, 471f: in MEN, 469-471, 470t, 471f
SONET/SDH frequency traceable clock distribution, 267, 267f
Source address field, of Ethernet media access control layer, 369
Source traceability, 262-265, 262f, 263f: clock distribution of, 266, 266f; frequency relationship in, 263, 263f; timing loop and, 264, 264f; observation time and, 262; timing loops in, 264, 264f; wander region in, 262
Spanning Tree Protocol (STP), 323, 366, 366f: in Ethernet services over public WAN, 421; Layer 2 control protocol and, 421
Spanning Tree Protocol (STP) BPDUs, 513
Spanning tree topology, 249
Specification, optical interface. See ITU-T standards, optical interface
SPI-3 signal descriptions, 669-672: receive direction in: clock and data signals, 671; discrete control/status signals, 671-672; transmit direction in: clock and data signals, 669; discrete control/status signals, 669-671
SPI-4 Phase 1, 674-677, 675f
SPI-4 Phase 2, 679-681, 680f
SPI-5, 758: OC-768 System Packet Interface, 685-687, 685f
Split-horizon bridging, 512-513, 512f
SSM (Synchronization Status Message), 218, 261
Standardized interfaces, 66-67, 67f: ITU-T, optical interface, 691-733 (See also ITU-T standards, optical interface)
Standards: ITU-T, 64 (See also ITU-T recommendations; ITU-T standards, optical interface); layer 1, 64-66, 65f; optical network, 64-66, 65f
Standards development process, 13, 767-783: approval of recommendations in, 774-775; background on, 767-768; contributions and other input documents in, 773-774
history of, 779-780
industry forums in, 776-783: elections/hierarchy in, 777-779, 777f; human nature and, 781-782; membership in, 780-781; message in, 776; reasons for joining, 779-780; teamwork in, 782-783
International Telecommunication Union (ITU) in, 768-775 (See also International Telecommunication Union (ITU))
meetings in, 772-773
Standard Single-Mode Fiber (SMF, SSMF), 703-704
Start frame delimiter field, of Ethernet media access control layer, 369
Star topology, 737
Steady-state call, 645
STM-1, 708t
STM-4, 708t
STM-16, 708t
STM-64, 708t
STM-64 switched connection services, via OTN mesh network, 45-49, 46f-48f
STM-N client signal, on OTM-n.m server signal, 37, 38f
STM-N jitter accumulation, 219-227, 222f-225f
STM-N jitter network limit and tolerance, 219-227
STM-N regeneration, 219-227
STM-N server, PDH DS-3 client signal on, 36, 37f, 39-40
STM-N structure, 127-128, 127f
Storage, over IP, 539, 540-541, 540f, 542t
Storage Area Networks (SANs), 5, 661: transport over OTN in (See Multiplex structures, of OTN)
Storage Area Networks (SANs), SONET for, 10, 531-548, 532f: distance extension alternatives in, 538-541: legacy private line in, 539, 542t; SONET/SDH, 541-542, 542t; storage over IP in, 540-541, 540f, 542t; WDM in, 539-540, 542t; as distance extension protocol, 542-547, 542t, 543f: additional benefits of, 546-547, 547t; capabilities of, 542-543; in SAN extension application, 543, 543f; Service Level Agreement (SLA) for, 546, 547t
standards in, 544-547: flow control in, 545; Generic Framing Procedure (GFP) in, 544; Virtual Concatenation (VC) in, 544-545
distance extension requirements in, 536-538: carrier grade in, 537-538; in-order delivery in, 536; lossless delivery in, 537; low latency in, 536; network flexibility and resilience in, 537; scalability and performance in, 538; throughput in, 537
ERP and CRM applications in, 531-532
factors driving extension of, 532-534
fiber channels in, 534-535, 534f, 535f
structure of, 532, 532f
use of, 531
Storage arrays, 528
Storage networking, 528-531 (See also specific types): approaches to, 529-531, 530f, 531t; benefits of, 528; example of, 528-529; processing efficiency with, 529
Stratum 2 holdover transient vs. seconds, at DS1 interfaces, 288, 289t
Stratum 3/3E holdover transient vs. seconds, at DS1 interfaces, 288, 289t
Stratum levels, hierarchy of, 257
Structure-agnostic emulation, 459
Structure-aware emulation, 459
Structured emulation mode, 459, 460
Structured service, alarm for, in MEN CES, 488
Study groups, 771, 771f: contributions and other input documents for, 773-774; meetings of, 772-773
Stuff control, 201
Subnetwork, 24-25, 28, 31, 33, 335, 564: in MEN, 335; RA and, 577
Subnetwork connection protection (SNC-P), 307-308, 308f, 309f
Subnetwork connection (SNC), 31, 33, 34f
Subnetwork performer, 579
Subnetwork Point Pool (SNPP), 590-591
Subscriber/customer equipment, 327
Substructuring, in SDH VC-n structures, 126-127
Superblock, 169f, 170
Superframe (SF), 261
Super-rate signals, 127
Survivability, network, 295-319: definition of, 295; Link Capacity Adjustment Scheme (LCAS) in, 317, 318f; multilayer, 318-319; network protection in, 295-313 (See also Protection switching); network restoration in, 314-317: advantages of, 314; definition of, 296; interoperability in, 316-317, 317f; ITU recommendations on, 314-315; vs. protection, 314; restoration time in, 315-316; techniques of, 315; in optical transport networks, 313, 314f; in transport networks, 6-7
Survivability, transport network, support for, in ASTN architecture, 571
Switched connection service, 552, 555, 556f
Switched Connections (SCs), 570
Switch fabrics: different, 667, 667f; packet/cell, 665, 665f; TDM, 666, 666f; TDM/Packet, 667, 667f
Switching: bidirectional, 304; priority-based, 241; protection (See Protection switching); quality-based, 242; unidirectional, 303
Switching operation time T3, 305, 306f
Switching time, 296
Switching transfer time T4, 305, 306f
Switching types, in protection switching, 303-304
Switch initiation, 296
SxI-5, 758
Synchronization, 189-252: alignment jitter in, 195; closing remarks on, 251-252; digital transmission in, 191-194, 193f, 194f; ITU-T recommendations on timing and jitter for OTN, SDH, and PDH and, 214-216, 215t; jeditterizer in, 196
jitter generation in, 199-200; jitter tolerance in, 196-198, 198f; mapping and multiplexing in, 200-203; in MEN CES, 472-475, 473f, 473t, 474t; network engineering in, 249-250; pointer adjustments in, 203-205; priority-based switching in, 241; quality-based switching in, 242; reference acceptance in, 242; reference duplication and reference selection in, 241-242; reliable distribution of, 233-250: need in, 234-235, 234t; synchronization areas in, 235-241 (See also Synchronization areas); satellite timing in, 248-249; sinusoidal jitter in, 196; synchronization network engineering and, 189-191; Synchronization Status Messages (SSM) in, 243-247: manual SSM assignment in, 245; message set in, 243-244, 243t; SSM-based restoration applied in a ring network in, 245-246, 246f, 247f; SSM forwarding in, 244; SSM looping in, 244-245; SSM protocol rules in, 244; SSM selection in, 244; timing and jitter requirements for SONET/SDH and OTN in, 216-233: history and background on, 216-218; jitter and wander accumulation for PDH clients of SONET/SDH networks in, 227-231; jitter and wander accumulation for SDH clients of OTN in, 231-233; SEC and ODC frequency accuracy and clock modes in, 218-219; SEC and ODC pull-in and pull-out/hold-in ranges for, 218; STM-N and OTUk jitter accumulation in, 219-227, 222f-225f; STM-N and OTUk jitter network limit and tolerance in, 219-227; STM-N regeneration and ODCr jitter generation and transfer in, 219-227; timing jitter in, 195, 200; timing performance characterization in, 209-212: Maximum Time Interval Error (MTIE) in, 210-211, 213
peak-to-peak and RMS jitter in, 209-210; time variance (TVAR) and time deviation (TDEV) in, 211-212, 213; timing signal imperfections in, 206-209: fundamentals of, 206-207; phase noise in, 207-209; transfer, generation, and network limit in, 196-200, 198f, 199f; wander network limits and wander performance in, 212-213
Synchronization architectures, for SONET/SDH, 257-293: bit recovery in, 258; clock backup modes, implications of, 286-292, 289t (See also Holdover mode); clock recovery in, 270; concepts of, 257-261, 258f, 259f; Department of Defense (DOD) and, 267; distribution in, 266-268, 266f, 267f, 270: plesiochronous timing and, 267; external timing configurations in, 279-286, 284t: bridged-source timing method and, 281-282, 281f; direct-source timing method and, 280, 280f; line/external timing method and, 282-285, 283f; mult timing method and, 285-286, 285f; external timing mode in: Global Positioning System (GPS) and, 260; guidelines for, 292-293; network element (NE) architecture in, 268-279, 268f, 276f (See also Network element (NE) architecture): clock routing in, 275, 276f; example of, 268-269, 268f; large, 278-279, 279f; medium, 277-278, 278f; small, 276-277, 277f; system architecture of, 275-276, 276f; timing distributor (TD) functions in, 270-275, 272f, 274f (See also Timing distributor (TD) functions); timing engine (TE) functions in, 269-270: retiming and, 270; timing loops in, 258; timing recovery and, 258, 258f; timing traceability in, 261-265
definition of, 261; source traceability in, 262-265; TSWC01622 in, 272, 272f
Synchronization areas, 235-241: definitions of, 235, 238; PRC/PRS autonomy in, 240-241, 240f; synchronization reference chains in, 235-238, 237t; synchronous and plesiochronous operation in, 238-239, 239t; traceability and quality in, 237t, 239-240
Synchronization domain (SD), 475f, 475t
Synchronization network, current, 249
Synchronization network engineering, 189-191: short history of, 190-191
Synchronization plan, 250
Synchronization protection, requirements for, 249
Synchronization reference chain, 235-238, 237t
Synchronization Status Message (SSM), 218, 261
Synchronization Supply Unit (SSU), 215, 260
Synchronization traceability, in MEN CES, 479-480
Synchronization trail, in MEN CES, 479
Synchronized administration, in MEN CES, 478-487, 479t: multi-service-provider-owned network in, 480, 480t, 483-487, 483f, 484t; separate and diverse, 479; service clock preservation in, 479; service timing-private network in, 480, 480t, 486t, 487; single service-provider-owned network in, 480-483, 480t, 481f, 481t; synchronization traceability in, 479-480; synchronization trail in, 479; transport timing-private network in, 480, 480t, 486t, 487
Synchronizer, 201
Synchronous Digital Hierarchy (SDH). See SDH; synchronization in (See Synchronization architectures, for SONET/SDH)
Synchronous IWF: asynchronous tributaries and, 477; synchronous tributaries and, 477
Synchronous operation, 238-239, 239t
Synchronous Optical Network (SONET). See SONET
synchronization in (See Synchronization architectures, for SONET/SDH)
Synchronous Payload Envelope (SPE), 470: in SDH frame, 120, 121f
System Framer Interface-4: phase 1 (SFI-4 Phase 1), 672-674, 673f; phase 2 (SFI-4 Phase 2), 677-679, 678f
System Packet Interface (SPI), OC-192: SPI-4 Phase 1, 674-677, 675f; SPI-4 Phase 2, 679-681, 680f
System Packet Interface-3 (SPI-3) signal descriptions, 669-672. See also SPI-3 signal descriptions
System Packet Interface-5 (SPI-5), 758
T

Tag, for CO-PS network, 396
Tandem Connection Monitoring (TCM), 4, 73-75, 74f, 75f: in OTNs, 73-75, 74f, 75f
TDEV. See Time deviation (TDEV)
TDM, 3: for T-Line over MEN, 462
TDM Access Line Service (TALS), 463, 464f: operational modes of, 464, 464f
TDM-based Network Element, 666, 666f
TDM Fabric to Framer Interface, 687-688, 688f
TDM-friendly next-generation SONET/SDH, 4-6
TDM line service (T-Line), over Metro Ethernet Networks, 458-463, 458f: bandwidth provisioning for, 461-462, 463f: bandwidth allocation at 100 kbit/s granularity in, 462; Ethernet multiplexing in, 462, 463f; TDM multiplexing in, 462; definition and use of, 458; operational modes of, 459-461, 459f: multiplexing, 459, 460-461, 461f; structured emulation in, 459, 460; unstructured emulation in, 459, 460
TDM Line timing, 475f, 475t
TDM service interface examples, in MEN CES, 467-468, 468t
TDM Service Processor (TSP), in MEN CES, 468-469
TDM Service Processor (TSP) block, 458
TDM signaling, in MEN CES, 490-491
TDM switch fabric, line card using, 666, 666f
TDM systems, 190
Telecommunication Development Advisory Group (TDAG), 769f, 771
Telecommunication Development Bureau (BDT), 769f, 770
Telecommunication Development Sector (ITU-D), 769f, 770
Telecommunication Management Network (TMN) Recommendations, 574
Telecommunication Standardization Advisory Group (TSAG), 769f, 770-771
Telecommunication Standardization Bureau (TSB), 769f, 770
Telecommunication Standardization Sector (ITU-T), 769f, 770
TeleManagement Forum (TMF), 1
Temperature compensation, in holdover mode, 290
Temporal model, protection switching in, 305, 306f
Temporary Document, 774
Termination and Adaptation Performer (TAP), 581f, 583, 606
Termination connection point (TCP), 32
Termination flow point (TFP), for Ethernet service, 414-415
Termination function, 30, 30f, 32, 32f (See also Ethernet (Services), over public WAN): expansion of, 35, 35f; in MEN, 336-337
Termination sink function, management-related interfaces in, 58-59, 60f
TFI-5 (TDM Fabric to Framer Interface), 687-688, 688f
Thermo-Electric Cooler (TEC), 713-714
Threshold AIS generation, 283: DS1 and, 283; line/external timing method and, 283
Throughput, in fiber channel SANs, 537
Through timing mode, SONET/SDH and, 259f, 260
Time-delay, Maximum Time Interval Error (MTIE) and, 262
Time deviation (TDEV), 211-212, 213: traceability and, 262
Time Division Multiplexing (TDM). See TDM
Time interval error (TIE), frequency traceability and, 264
Timer, hold-off, 309-310
Times, 501
Time variance (TVAR), 211-212
Timing: in global optical transport network, 6-7; ITU-T recommendations on, for OTN, SDH, and PDH, 214-216, 215t; in L2VPN over MPLS backbone, 430; in MEN: issues with, 504; relationships in, 503, 503f; of performance characterization, 209-212: Maximum Time Interval Error (MTIE) in, 210-211; peak-to-peak and RMS jitter in, 209-210; time variance (TVAR) and time deviation (TDEV) in, 211-212; satellite, in synchronization of optical networks, 248-249; service: in multi-service-provider-owned network, 483-485, 486t; in private network, 480, 480t, 486t, 487; in single service network, 482; sources for, 259-261, 259f; transport: in multi-service-provider-owned network, 485-487, 486f, 486t; in private network, 480, 480t, 486t, 487; in single service network, 487-488
Timing configurations, external, in SONET/SDH, 279-286, 284t: bridged-source timing method and, 281-282, 281f; direct-source timing method and, 280, 280f; line/external timing method and, 282-285, 283f; mult timing method and, 285-286, 285f
Timing connections, Ultramapper™ vs. TSWC01622, 273, 274f
Timing distributor (TD) functions, 270-275, 272f, 274f: application example of, 271, 272f; fan-out of, 271; synchronization selection of, 271; synthesis of, 271; system block architecture of TSWC01622 and, 272, 272f
Timing engine (TE) functions: synchronization distribution and, 270; timing reference and, 270: clock recovery and, 270
Timing jitter, 195, 200
Timing loops, 249: in frequency traceability, 265; in source traceability, 264, 264f; in synchronization architectures for SONET/SDH, 258
Timing modes, SONET/SDH Network Element (NE) and, 259, 259f
Timing recovery, synchronization and, 258, 258f
Timing traceability, 261-265: definition of, 261; source traceability in, 262-265
TMF. See TeleManagement Forum (TMF)
Topological components: definition of, 379; in MEN, 334-335, 338-339, 338f, 339f
Topology, 20, 382, 382f (See also specific topologies): backplane, 737-738; EPLAN and, 389, 390f; of network portion, multipoint-to-multipoint, 384, 384f
Total cost of ownership (TCO), 533
Traceability: frequency, 261-265 (See also Frequency traceability); observation time and, 262; source, 262-265 (See also Source traceability); synchronization, in MEN CES, 479-480; in synchronization areas, 239; time deviation (TDEV) and, 262; timing, 261-265 (See also Timing traceability)
Traditional Approval Process (TAP), 774
Traffic Descriptors Information Element, 644
Traffic-engineered LSP, 454
Traffic management, in MEN, 324, 524-526, 525f
Traffic Manager (TM), 665
Traffic Policing (TP), 583
Trail, 33, 34f: in MEN, 335
Trail protection, 306, 307f
Trail signal fail (TSF) signals, 44
Trail Termination Function (TTF), 30, 30f: fault and performance processes in, 56, 56f; in MP2MP protection, 512-514, 514f
Trail Termination Point, in MEN, 335-336
Trail Trace Identifier (TTI), 92: in G.709 overhead bytes, 89, 89t
in OTUk, 77f, 101, 102f
TRAN layer, 328-329: in MEN, 328-329
TRAN link, in MEN, 340, 341f
Transceiver transponder. See Transponders
Transient behavior, 645
Transmission control protocol (TCP), 540, 540f
Transmission delay, 296
Transmitter, de-emphasis at, 749-753, 750f, 751f, 753f
Transparent LAN service, 350
Transparent-mapped mode (GFP-T), in GFP, 167-168, 168f
Transponders: 2.5 Gbit/s, 720-721; 200-pin, 724-725; 300-pin, 722-724, 723f: in SERDES integrated circuits, 722; Very Short Reach (VSR), 724
Transport, 498
Transport capabilities. See Multiplex structures, of OTN
Transport Capability Exchange (TCE), 610
Transport component (transport entity), in MEN, 335-336
Transport Connection Functions (TCFs), 507
Transported payload capabilities. See Multiplex structures, of OTN
Transport functional modeling. See Functional modeling, transport
Transport layer protection mechanisms interactions, in MEN, 518
Transport network models, supporting Ethernet connectivity services. See Ethernet connectivity services, transport models supporting
Transport networks, 498: automatically switched (See Automatically Switched Transport Network (ASTN) architecture)
Transport network survivability, 6-7
Transport plane, 554
Transport resource identifiers, 590-591, 590f, 592f
Transport resource management, in ASTN, 569
Transport services layer (TRAN layer), 328-329: in MEN, 328-329
Transport timing: in multi-service-provider-owned network, 485-487, 486f, 486t
in private network, 480, 480t, 486t, 487; in single service network, 487-488
Transverse compatibility, 693-694: vs. longitudinal compatibility, 700-712 (See also Compatibility, transverse vs. longitudinal); physical layer, 701-703, 702f
Tributary Unit Groups (TUG), in SDH VC-n structures, 126-127
Tributary Unit (TU), 127
Trunk link, in MEN, 340, 341f
TSWC01622, 272-273: of Agere Systems, 272; in synchronization architectures for SONET/SDH, 272-273, 272f, 275; in timing distributor (TD) functions, 272, 272f; vs. Ultramapper™, 273, 274f
Tunnel label, 435
Tunnels and tunneling, 366, 366f: MPLS, 434: carrying PWs over, 435, 435f; VPN, 433: hierarchical, 433-434; motivation for, 433; protocols for, 434
Tussle, 566
Type HEC (tHEC) field, 161

U

Ultramapper™, 273, 274f, 275: vs. TSWC01622, 273, 274f
Uncontrolled slip, 212
Uncorrelated, bounded, high-probability jitter (UBHPJ), 745
Uncorrelated, unbounded Gaussian jitter (UUGJ), 745, 746
UNI. See User Network Interface (UNI)
Unicast service frame, 346-349: basic concept of, 347, 347f; definition of, 346; identification at UNI of, 348-349: CE-VLAN ID/EVC map for, 348-349, 349f; CE-VLAN ID for, 348; identifying at UNI in, 348; multipoint-to-multipoint, 347-348, 347f; point-to-point, 347, 347f
UNI client (UNI-C), 331, 331f
Unidirectional path switched ring (UPSR), 309
Unidirectional switching, 303
UNI/E-NNI Transport Resource names, 591, 592f
UNI list, Ethernet Connection (EC) attributes and, 386
UNI network (UNI-N), 331, 331f
UNI reference point, in MEN, 326-327
UNI Transport Resource name, 590
Universal Time (UTC) frequency, 217
Unstructured emulation mode, 459, 460
Unstructured service, alarm for, in MEN CES, 488
Untagged service frame, 348
Upstream node, 81, 82f
Usage, 624, 624t
User access failure, restoration from, 632, 632f
User Network Interface (UNI), 8, 316, 317f: in Ethernet, 8; in Ethernet services over MEN, 344-345; in Ethernet services over public WAN, 379; in MEN, 326-327, 330-331, 331f; service attributes of, 402, 402t
User Network Interface (UNI) reference point, in MEN, 326-327
User_priority field, 357
User traffic, MEN protection of, 520
User/transport/forwarding plane, 337

V

VC-4-Xc structure, 128, 128f
VC-12 client traffic, carried on VC-4 server trails, 42-43, 42f, 43f
VCAT. See Virtual Concatenation (VCAT)
VC-n structure, in SDH, 125-126, 125f, 126f
Very High Speed Integrated Circuit (VHSIC), 3
Very Short Reach (VSR), 699, 729
Very Short Reach (VSR) transponders, 724
VHDL. See VHSIC Hardware Description Language (VHDL)
VHSIC. See Very High Speed Integrated Circuit (VHSIC)
VHSIC Hardware Description Language (VHDL), 3
Virtual Concatenation Group (VCG), 131 (See also Virtual Concatenation (VCAT)): alignment within, 148-149
Virtual Concatenation (VCAT), 5, 8, 131-139: additional benefits of, 136; advantages of LCAS and GFP in, 144; details of, 137, 138t, 139t; differential delay in, 131-132, 132f
  in Ethernet services over WAN, 374-375, 420
  implementers guide for, 144-152 (See also under Multiplex structures, of OTN)
  origins and value of, 131
  payload distribution and reconstruction in, 133-134, 133f, 134t, 135t
  restrictions of, 136-137
  in SONET, 544-545
Virtual connections (VCs), 426, 426f. See also Ethernet (Services), over public WAN
Virtual container, 120
Virtual container overhead bytes, in SDH VC-n structures, 126
Virtual framer management, in GFP, 171-172
Virtual LAN-based VPN, 156
Virtual Private LAN Service (VPLS), 428
Virtual Private LAN Service (VPLS) forwarding loops, avoiding, 446
Virtual Private LAN Service (VPLS) PW encapsulation, 447
Virtual Private LAN Service (VPLS) PW setup, 447
Virtual Private LAN Service (VPLS) reference model, 443-446, 444f, 445f
Virtual leased lines, GFP in, 185, 186f
Virtual Private Network (VPN), 425-427
  classification of, 426-427, 427f
  multiservice converged packet switched backbone in, 427
  traditional layer 2 (L2VPNs), 425-426, 426f
  virtual LAN-based, 156
Virtual Private Network (VPN) backbone, 425-426
Virtual Private Network (VPN) Edge Device, 426
Virtual Private Network (VPN) tunnels, 433
  hierarchical, 433-434
  motivations for, 433
  protocols for, 434
Virtual Private Service, 380, 381f, 381t
  in Ethernet services over public WAN, 380
Virtual Private Wire Service (VPWS), 428
Virtual Private Wire Service (VPWS) reference model, 429f, 435f, 438, 439f
Virtual Tributary (VT), 470
Virtual wire, 457
VLAN ID, 370-371, 371f
  in Ethernet services over public WAN, 386
VLAN mapping, 404
VLAN tag processing, 439-440
VPLS. See Virtual Private LAN Service (VPLS)
VPN. See Virtual Private Network (VPN)

W

Waiting time jitter, 202
Wait-to-restore period, 304
Wait-To-Restore (WTR), 311
Wait to restore (WTR) time, in MEN, 503
WAN. See Ethernet (Services), over public WAN; Wide Area Network (WAN)
Wander, 206, 212
  source traceability and, 262
Wander accumulation
  for PDH clients of SONET/SDH networks, 227-231
  for SDH clients of OTN, 231-233
Wander network limits, 212-213
Wander performance, 212-213
WAN flow control system, 545
Wave Division Multiplexing (WDM), 539-540, 542t
Wavelength bands, 705, 705t
White contributions, 773
White frequency modulation (WFM), 209
White phase modulation (WPM), 208-209
Wide Area Network (WAN), 662, 662f
  Ethernet connectivity and, 373-375
  transport over OTN in (See Multiplex structures, of OTN)
Widely spaced signals, 696
Working Documents, 774
Working parties, 771-772, 771f
World Telecommunication Development Conference (WTDC), 769f, 770
World Telecommunication Standardization Assembly (WTSA), 769f, 770

X

X2 device, 726, 727f
XAUI, 725-726, 759-760
XENPAK, 725-726, 725f
XENPAK-MSA, 725
XFP device, 727-728, 728f
XGMII, 759
XGP device, 726
XPAK device, 726

Y

Yellow bandwidth profile, 361-362, 361f