Table of Contents
Frame Relay for High-Speed Networks, by Walter Goralski. John Wiley & Sons © 1999. ISBN: 0471312746. 410 pages.
"Everything you need to know about frame relay technology, in plainspoken English." —Pete Loshin
Frame Relay for High-Speed Networks
Introduction
Chapter 1: What Frame Relay Can Do
Chapter 2: The Public Data Network
Chapter 3: Frame Relay Networks
Chapter 4: The Frame Relay User-Network Interface
Chapter 5: Frame Relay Signaling and Switched Virtual Circuits
Chapter 6: Congestion Control
Chapter 7: Link Management
Chapter 8: The Network-Network Interface (NNI)
Chapter 9: Voice over Frame Relay
Chapter 10: Systems Network Architecture and Frame Relay
Chapter 11: Internet Protocol and Frame Relay
Chapter 12: Asynchronous Transfer Mode and Frame Relay
Chapter 13: The Future of Frame Relay
Bibliography
Acronym List
Frame Relay for High-Speed Networks Walter Goralski Copyright © 1999 by Walter Goralski. All rights reserved. Published by John Wiley & Sons, Inc. Published simultaneously in Canada. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-, fax (978) 750-4744. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-, (212) 850-, fax (212) 850-, E-Mail:
[email protected]. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold with the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional person should be sought. Publisher: Robert Ipsen Editor: Marjorie Spencer Assistant Editor: Margaret Hendrey Managing Editor: Micheline Frederick Text Design & Composition: North Market Street Graphics Designations used by companies to distinguish their products are often claimed as trademarks. In all instances where John Wiley & Sons, Inc. is aware of a claim, the product names appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and registration.
Acknowledgments

There are a number of people who should be thanked for their assistance in the making of this book. First I would like to thank Hill Associates, Inc. for creating an intellectual work environment where personal growth is always encouraged and for a climate that makes the writing of books like this possible in the first place. I owe special thanks to the Hill reviewers who painstakingly waded through the individual chapters and found many of my errors. These are Clyde Bales, Rod Halsted, Gary Kessler, Hal Remington, Harry Reynolds, Ed Seager, Tom Thomas, and John Weaver. Any mistakes that remain are my own. On the publishing side, Marjorie Spencer supplied the vision that produced this text. Margaret Hendrey saw the book-writing process through, although my primitive figure-drawing skills must have been a real challenge. Finally, Micheline Frederick has produced this fine volume from my raw material.
My family, Jodi, Alex, and Ari, has also come to grips with the fact that I am now a writer in addition to all the other roles I play from day to day, and that I am now entitled to all the rights and privileges that a writer of my stature deserves (such as quiet). Thank you one and all.
Introduction

What do an accountant sitting in a small home office accessing the Web, a sales representative at a branch office checking the latest home office price list, and a hospital worker ordering medication from a supplier's remote server all have in common? Whether they are aware of it or not, more and more often the links from place to place in these environments are provided by a frame relay network. In the near future, this list might easily extend to corporate executives holding a videoconference, commercial artists manipulating images, and even college students calling home. In fact, all three of these things are done on frame relay networks now, just not routinely or in all environments. At a lower level than the user application scenarios above, a frame relay network can support LAN interconnectivity for intranet router-to-router traffic, carry financial transactions for a corporate SNA network, or carry digital voice overseas, and all faster than before and at a fraction of the price of almost any alternative. How can frame relay do all of this so well? That is what this book is about.

It should come as no surprise that networking needs have changed radically in the past few years. After all, the end systems that perch on almost every worker's desktop and in every home, school, and library have changed drastically. Systems that were considered state-of-the-art 2 or 3 years ago struggle mightily even to run the games of today, let alone the new applications. Audio and video support is not a luxury, but a must, if only so that the multimedia tutorial for a new application can be appreciated (and no one reads the manual anyway). A video game requires more power than a supercomputer had 20 years ago. A palmtop assistant draws on more computing power than an IBM mainframe could in 1964. In 1991, a 66 MHz computer with a color monitor, a 500-megabyte hard drive, 16 megabytes of RAM, and a modest 2x CD-ROM drive cost more than $13,000. And so on. And in computing, as in almost nowhere else outside of the electronics industry in general, prices fall as power rises.

As illuminating (or boring) as these examples might be, the sole point is that the typical device that uses a network has changed radically in the past 10 or 20 years, from alphanumeric display terminal to multimedia color computer. Yet how much has the network changed in that same time period? Hardly at all, and most of those changes apply to the local area network (LAN), where the cost of use is minimal after installation, and not to the wide area network (WAN), where the cost of use in the form of monthly recurring charges can be crippling. In a very real sense, frame relay (and other fast packet WAN technologies like ATM) represents the application of modern computing power not to the devices at the endpoints of the network, but inside the network itself.

Computer networks are so common today that it is hard to imagine a time when they were not. Yet the history of computer networks could be said to begin with the invention of what eventually became the Internet in 1969. This was only four years after Time magazine, in a story about computers (remarkable in itself), boldly predicted that "eventually" computers would be able to talk to each other over "communication links." In the 1980s, the idea of networking computers was not so new, and by the end of that decade, it was more or less expected.
In the 1990s, the rise of the client/server phenomenon has enabled a whole raft of new applications requiring both networks and computers, from electronic messaging to remote database access to training videos.
Of course, this new emphasis on clients and servers and the networks that connected them led to the creation of many networks. In fact, there turned out to be perhaps too many networks. It seemed like every application, not only voice and video but the many types of data, required a slightly different network structure in terms of bandwidth (speed), errors, and delay (latency). In the expanding economy of the 1980s, the immediate reaction when faced with a new network application was to build a completely new network specifically groomed for that application. This typically meant employing time division multiplexing (TDM) to share the total bandwidth available among sites by dedicating the maximum amount available or the minimum amount required to adequately support the application. This form of channelized networking came to dominate the WAN scene in the 1980s.

But just as the economy needed to regroup before it could advance to new heights in the 1990s, so did networking. Maintaining all of these essentially parallel networks proved wasteful and enormously costly. Few people used the tie-lines between office PBXs at 3 a.m. Yet the bandwidth these channels represented was still dedicated to the voice network. Many organizations, attempting to perform out-of-hours tasks such as backing up remote servers over the channelized networks, were unable to utilize the bandwidth locked up in other channels. Frame relay solves this problem by flexibly, or dynamically, allocating the total bandwidth based on the instantaneous (well, within a few milliseconds) demand of all active applications. Idle applications consume no bandwidth at all. Frame relay is not the only way to solve the problems posed by the time-consuming and wasteful task of maintaining parallel networks; it has just proved to be the most popular. And frame relay addresses more than just the need to integrate the total communications infrastructure. Frame relay can also:

1. Reduce costs. A great deal of this cost reduction comes from the elimination of the need for parallel communications networks. But there is more to it than that. Frame relay replaces a complex, incomplete web of dedicated private lines with a "cloud" of general and total connectivity, not only within the United States, but around the world. Organizations that could only dream of leasing a dedicated private line to Europe now enjoy affordable communications with major European capitals thanks to frame relay.

2. Improve network performance. It used to be true that everyone involved in a network was so happy that it worked at all that they had little inclination to care how the network was performing. And even if they did care, there was little in the way of performance tuning methodologies or software to assist them in their task. The problem with creating custom networks for each application is that they were all slightly different in their performance tuning needs. But frame relay is the same network for all its various applications. Not only is frame relay faster than almost anything else, it is also more tunable than a collection of individual, parallel networks.

3. Make the network as a whole more reliable. In a network composed of an interconnected mass of individual leased private lines, the failure of one critical link can be devastating to the network as a whole. Part of the allure of public networks is that they are more resilient and robust than private networks. Everyone knows that the public voice network suffers internal link failures all the time.
Yet with a few widely publicized exceptions, notable only due to their rarity, these failures have no impact on overall voice network service. Since frame relay is almost always a public network service, it shares this characteristic with the public voice network.

4. Make the network more future-proof. Many networks are difficult to scale for new and more applications in terms of speed and connectivity. Even simple reconfiguration can be a time-consuming and costly process. Frame relay networks can react to sudden changes quite rapidly, often within 24 hours and as the result of a simple telephone call. And nothing is more painful than watching competing organizations become more successful while one's own organization is saddled with technology that is outdated and perhaps even considered obsolete. Frame relay is not only a member of the fast-packet family, but is intended to be interoperable with the other major fast-packet network technology, ATM.
5. Make the network more manageable. Network management centers often resemble a hospital emergency room. Nothing much happens until the patient is wheeled in on a cart. Then swarms of personnel swing into action to save the victim. Without adequate network management hardware and software at their disposal, network managers often attempt little more than fixing problems as they are reported. Frame relay network management techniques offer a way of detecting problems like link failures literally as they occur (not when the users finally get around to calling), and areas of network congestion can actually be anticipated before they become problems. And most of the management tasks associated with frame relay can be outsourced to the service provider (this is a cost savings benefit as well).

6. Provide a more structured migration path. All networks must be upgraded eventually. But to what? And how? Frame relay offers a certain migration path from the old network to frame relay, regardless of whether the old network was SNA or routers linked by leased private lines, or X.25, or almost anything else. In most cases, the changes required are simple and few. And once the transition to frame relay is made, the fast packet aspects of frame relay virtually guarantee a long and useful life for the new network.

If it sometimes seems like frame relay is enjoying the kind of success that ATM and a few other WAN technologies wish they had, it is because this is undoubtedly true. Frame relay is sometimes called "the first public data networking solution that customers actually like and will pay for." Frame relay has also been called "the first international standard network method that works" and "X.25 on steroids." All of these statements are quite accurate, and all contribute to the continued success of frame relay networks.
Overview

Usually it is easy to accurately gauge the popularity of any given network technology. One can just walk into a bookstore (online or otherwise) and check out the number of titles or the shelf space that books on a particular topic occupy. The shelves are typically overflowing with books on the Internet, Web, TCP/IP, Java, and the like. Only a dedicated search will turn up books on the less popular technologies and topics such as token ring LANs or Fibre Channel. There are exceptions to this general rule, of course. Among LAN technologies, Ethernet is one of these exceptions. In spite of the immense popularity of Ethernet LANs, to the extent that the label "not Ethernet" attached to a new LAN technology is a sentence of death or at least condemnation to a niche market, there are few books about Ethernet or related technologies like 10Base-T. In the WAN technology arena, the same is true of frame relay networks. There are a handful of texts, and that's about it. Yet not only does frame relay remain one of the most popular public WAN technologies of all time, frame relay continues to expand its horizons into the voice and video application areas as well.

This book is intended to address the apparent imbalance between the high level of frame relay network popularity and the lack of formal, book-length sources of information on frame relay networks. A good, current, complete book on frame relay is definitely needed to help networking personnel understand exactly how frame relay works in order to assess both what frame relay does today and what it is capable of in the future. Customers and end users should know the basics of frame relay configuration and have a realistic grasp of frame relay capabilities. Service providers should know the basics of frame relay network operation and have a clear understanding of just what the network is doing at all times. Everyone should have an appreciation of the enormous power of frame relay networks to adequately address most of the issues involving modern networking needs.

Most existing books on frame relay are quite short, especially when compared to the weighty volumes dedicated to even such niche technologies as Java or intranets. These frame relay books tend either to emphasize the bits and bytes of network operation at the expense of the application of frame relay to various networking areas, or to give checklists for user and customer frame relay service preparation without ever mentioning exactly how the frame relay network provides such services. This book will attempt to give equal time and emphasis to both aspects of frame relay. Those used to assessing and appreciating technology at the network node and data unit level will find plenty to occupy them here. Those used to evaluating and admiring technology at the service and effectiveness level, in the sense of "what can I do better with frame relay, or what can I do that I cannot do now?", will also find that this book has much to offer.
So a new book on frame relay is much needed. Two aspects of frame relay are important in this regard. First, there is the fact that frame relay is a public data service. Private frame relay networks can be built, just as private voice telephony networks can be, and have been, but the emphasis here and in the networking spotlight is on this public aspect of frame relay. There have been other public data services, notably X.25 packet switching and ISDN. Neither has ever achieved the stature that frame relay has in a short amount of time. This book will explore the reasons for this.

Second, frame relay was standardized and first implemented as a public data service. But frame relay has become a vehicle not only for data traffic in the traditional sense; it now carries all sorts of digital information, such as voice and video and graphics and audio. Since frame relay is widely deployed by telephone companies, the application of voice over frame relay might come as a surprise. In fact, frame relay was conceived as a method of carrying voice packets inside frames, so no one need be shocked that the promise of frame relay is now being fulfilled. Naturally, the reasons and methods behind this use of frame relay for mixed traffic types will be investigated in full in this book.
Who Should Read This Book

This book is intended for readers with at least a passing familiarity with the operation of modems, LANs, and other basic networking principles. But outside of these basics, little is required to fully understand and appreciate this text. There are few equations, and these few require no more than basic high-school algebra to understand completely. It is anticipated that most of the readers of this book will be those who currently manage networks in organizations, need to understand more about frame relay networks in general, or work for service providers who offer frame relay services. This does not rule out readers with a general interest in networks, of course. While not specifically intended for a university or college audience (there are no individual chapter summaries, study guides, or questions, for example), there is nothing that prevents this book from being used at an undergraduate or even graduate level in a comprehensive networking curriculum. In fact, the author compiled much of the material in this book from teaching a graduate course on current telecommunications networks at a large university. In short, if you have a wide area network, use a wide area network, or need to assess frame relay networks, this is the book for you.
How This Book Is Organized

This book consists of thirteen main chapters. The first chapter is a necessarily long look at the factors which have made frame relay the most popular public data networking solution of all time. The length is due to the fact that many of the topics are dealt with at an introductory level and presume no more than an acquaintance with LAN and WAN technologies like Ethernet and the Internet. The chapter begins with a look at the popularity of LANs and the Web, and progresses to the methods most organizations use to link these networks to each other and to the Internet: point-to-point private lines. The limitations of this solution are investigated in light of the major characteristics of current network traffic: mixed audio/video/data traffic and "bursty" traffic patterns. This leads to a need for a type of "bandwidth on demand" for the underlying network, which is hard to provide on private line networks. Finally, the benefits of frame relay are shown to coincide nicely with the limitations of private line networks, setting the stage for the rest of the book.

Chapter 2 looks into key aspects of public data networks. Again some effort is made to handle these topics at an introductory level, but always the intent is to create a basic level of knowledge for all readers of the rest of the book. Just what makes a network public or private is explored, along with an introduction to the differences between networking with circuits as opposed to packets. Frame relay is positioned as a fast packet network technology. Once the differences between circuit switching and packet switching are examined, the chapter introduces the X.25 public packet-switched network as a so-called slow packet network. After a brief examination of X.25 networks, frame relay is discussed as a fast packet technology, one which is capable of both broadband (very high) speeds and flexible, dynamic bandwidth allocation (the more proper term for "bandwidth on demand"). The relationship between X.25 and frame relay by way of a frame structure called LAPD is also examined.

Chapter 3 explores the overall structure of an entire frame relay network. The chapter begins with a look at the key components of a frame relay network: FRADs and network switches. How they combine to offer applications adequate quality of service (QOS) is outlined as well. This leads to a discussion of private routers and public switches as network nodes. In fact, the whole switch versus router "controversy" is addressed here at a very basic level (again for readers who may not be as familiar as they would like with these terms and concepts). The chapter concludes with an exploration of permanent virtual circuits (PVCs) and switched virtual circuits (SVCs) for frame relay networks. In this context, signaling protocols are introduced and connections for various traffic types are discussed. A major point of this chapter is that as data becomes more and more "bursty" in nature, and as new voice techniques make voice (and even video) look more and more like data, it makes little sense to send "data" packets over private line circuits.

Chapter 4 examines all aspects of the way a customer connects to a frame relay network: the frame relay user-network interface (UNI). The chapter begins with a look at a key concept in frame relay networking, the committed information rate (CIR). The relationship of the CIR to the frame relay UNI port speed is also examined. Several "rules" for configuring CIRs on the UNI are discussed.
The important service provider concepts of “regular booking” and oversubscription are detailed with respect to the UNI. Finally, the major issues relating to the physical configuration of the UNI are investigated. These include diverse routing, dial backups, ISDN access, multi-homing, inverse multiplexing, and the use of analog modems for remote frame relay access.
Chapter 5 investigates the topic of the signaling protocols used on a frame relay network. The standard signaling protocol for frame relay, Q.933, is detailed. But the whole issue revolves around the use of Q.933 for switched virtual circuits in a frame relay network. The chapter includes a complete discussion of frame relay call control using Q.933 to set up, maintain, and release switched virtual circuits. One of the main topics in this chapter is the delay in offering switched virtual circuit services to frame relay network users and customers. All of the reasons are explored, with the emphasis on billing complexities and resource determination. The chapter also looks into the possible merging of frame relay and ISDN services in the future.

Chapter 6 explores congestion control, a key aspect of all networks, in some detail. After defining congestion as a global network property, the chapter examines the relationship between congestion control and its more localized counterpart, flow control. Several mechanisms used in networks for congestion control and flow control are investigated, always pointing out their possible use in frame relay. Then the specific frame relay mechanisms for flow control and congestion control are discussed: the use of the discard eligible (DE) bit and the use of the forward and backward explicit congestion notification bits (FECN/BECN). The recommended frame relay network actions regarding FECN and BECN to maintain service quality are discussed, as well as the current limitations of FECN and BECN usefulness. The chapter closes with a look at frame relay compression techniques, which are commonly seen as a way to prevent network congestion.

Chapter 7 concerns managing a frame relay network. There is much more to networking than the delivery of bits, and this chapter makes this point clearly. Managing the frame relay network includes methods for detecting link failures and responding to these link failures once they are detected. Key parts of frame relay network management covered in this chapter include the role of the Simple Network Management Protocol (SNMP) and the Management Information Base (MIB) database in the frame relay network elements. The standards for frame relay network management are discussed: the Link Management Interface (LMI) from the Frame Relay Forum (FRF) and Annex D from ANSI. Issues of support for each in frame relay equipment, the impact on the network, and user awareness are also mentioned. Finally, Service Level Agreements (SLAs) for frame relay networks between customers and service providers are detailed.

Chapter 8 examines the frame relay Network-to-Network Interface (NNI). Although it might seem that such a topic would be of concern only to frame relay network service providers, it is in fact a topic of immediate interest to users and customers. Because of the current state of telecommunications deregulation in the United States, it is necessary to examine the relationship between local exchange carriers (LECs) and interexchange carriers (IXCs), as well as various competing local and national entities such as competitive LECs (CLECs) and Internet service providers (ISPs). After exploring the current concerns and issues facing the LECs, the chapter contains a discussion of local access and transport area (LATA) limitations as they apply to frame relay. Several methods of providing multi-LATA frame relay networks in spite of current restrictions are outlined, from private lines to "gateways" to full NNI interoperability agreements.
Chapter 9 explores the various ways in which a frame relay network can be used to support voice and video as well as more traditional data applications. The whole issue of voice over frame relay is placed in the context of the historical effort to deliver adequate packetized (or more correctly, packet-switched) voice services. The related G.728, G.729, and H.323 ITU-T standards are introduced, with the intent of providing a means to evaluate voice over frame relay equipment and associated voice quality issues. Finally, the chapter details some of the early efforts to provide adequate video service over a frame relay network. This introduces a brief discussion of MPEG-2 video streams and frame relay equipment designed for this purpose.

Chapter 10 investigates the key relationship between IBM's Systems Network Architecture (SNA) protocols and frame relay. It is a fact that IBM's endorsement of frame relay as an SNA transport greatly encouraged the spread of frame relay. This chapter investigates the role of SNA in the world of LANs and routers. The position of X.25 public packet-switched networks with regard to SNA is considered, both in the United States and in the rest of the world, where private line networks were and remain somewhat scarce. The chapter closes with a look at the distinctive way that SNA networks use the frame relay DE bit and the FECN/BECN bits.
Chapter 11 explores the use of frame relay for various Internet-related network interactions. The relationship between the Internet, private intranets and extranets, and the Web is discussed. The chapter also examines the concept of a Virtual Private Network (VPN) and how frame relay can support the mixed traffic types commonly encountered on Web sites. Various roles for frame relay in this Internet-dominated environment are investigated, such as the use of frame relay by ISPs, using frame relay to construct VPNs, and using frame relay for multimedia traffic.

Chapter 12 examines the relationship between the two major "fast packet" networking methods, frame relay and ATM. A brief overview of ATM network features is given, involving a discussion of the frame versus cell controversy. The chapter next explores mixed traffic (voice, video, data, and so on) and mixed networks (all types of traffic on the same physical infrastructure), the environment that ATM was primarily invented for. The chapter considers the use of ATM rather than frame relay for LAN connectivity and introduces the ATM methods for providing such connectivity. Finally, the chapter ends with a consideration of using an ATM network to provide frame relay services to customers. This leads to a discussion of linking frame relay users and networks with ATM, and also of linking ATM users to frame relay users for interoperability. All of the relevant issues regarding this ATM-frame relay interaction are also investigated.

Chapter 13 closes the book with a look at the future of frame relay. The relationship of frame relay to a number of other technologies is examined closely, especially frame relay and the newer, high-speed LANs such as Gigabit Ethernet. The future of frame relay in a world dominated by IP is discussed as well, and IP and frame relay seem like a fairly good match. A more detailed look at an even closer relationship between frame relay and ATM networks and users is explored, with the conclusion that service providers can continue to "build ATM and sell frame relay" for the foreseeable future.

A bibliography includes a variety of sources of frame relay information: standards as well as other types of information such as Web sites and white papers. The book also includes a complete list of acronyms used in the text.
Summary

Frame relay is one of the most popular public data network services of all time. Frame relay can address many of the issues and problems that have come to be associated with the private line and channelized networks that are still common today. Frame relay can reduce overall network costs through integration, and yet at the same time improve network performance and reliability, ease network management tasks, provide a more future-proof network infrastructure, and offer a graceful migration path. Frame relay can be used for SNA networks, interconnections for bursty LAN client/server traffic, international networks, and even voice communications. This combination of lower cost and at the same time higher network quality is hard to beat. This is just a brief description of the capabilities of a frame relay network. The first chapter elaborates on each of these frame relay benefits and provides a framework for understanding exactly what it is about frame relay functions that allows frame relay to perform so well in a modern networking environment.
Chapter 1: What Frame Relay Can Do

Overview

This chapter looks at the current networking environment more or less in terms of technological popularity. It is intended mainly to allow readers to appreciate the capabilities of frame relay when it comes to addressing the limitations inherent in some of these popular networking methods. At the same time, this chapter introduces some of the basic terms and concepts that will come up over and over again in the ensuing chapters. Since many of these terms and concepts might be quite familiar to the intended audience, some might be tempted to skip this chapter altogether. However, there are at least three sound reasons why this chapter should be read.

First, some of the terms and concepts might have changed in meaning, often subtly, over the past few years. For instance, the whole idea of what a router does and what a switch does has changed from what these network devices were understood to do a few years ago. Second, although the terms and concepts might be familiar, some readers might wish to reacquaint themselves with them anew, as a type of review. Finally, and most importantly, different network professionals often have different perspectives on the use of terms and concepts when they speak and write. Now, no author can claim that his or her own use of a term or concept that is not firmly defined by standards is more correct or should be favored over others. But for the purposes of this book, unless the terms and concepts used by the author coincide to a large degree with the terms and concepts used by the reader, confusion could easily result in the later chapters. For example, many sources distinguish between network security devices known as application gateways and proxy servers. In this book, these terms are used almost interchangeably (in the sense that all proxy servers are a form of application gateway). (Lacking firm definitions for each term, everyone is free to use these terms as they wish.) But certainly for the purposes of this book, the meaning of these terms should be known to the reader.

Along these same lines, at least one quirk of technology terminology must be mentioned right away. Most customers and users have Local Area Networks (LANs) that are based on a particular LAN type from the Institute of Electrical and Electronics Engineers (IEEE) 802 committee for international LAN standards. This LAN type, known as 10Base-T, is the most popular LAN type ever invented. Most users refer to this as Ethernet, the parent technology that evolved into 10Base-T, resulting from the work of the IEEE 802.3 committee. Sometimes 10Base-T LANs are called Ether-type LANs, but this variation is not common. In this book, the term Ethernet applies not to a proprietary LAN technology involving heavy coaxial cable but to a 10Base-T LAN based on central hubs and Unshielded Twisted-Pair (UTP) copper wire. This use of the term is essentially correct in any case because most 10Base-T LANs do not use the recommended and specified IEEE 802.3 frame structure, but the much simpler and more practical Ethernet frame structure. So most LANs that are physically 10Base-T use Ethernet frames, which surely justifies the use of the term Ethernet to designate these LANs.
The Network Needed

By now most readers should be convinced that the old ways of building private networks out of leased private lines, if not quite impossible yet, will be quite impossible soon. The bandwidth pressure will keep going up, to the point that some organizations have begun to deploy Storage Area Networks (SANs) that link servers at multigigabit speeds. But the faster that servers can get information, the faster they must give it out in order to keep going themselves. The connectivity pressure will build also as the use of Web-based information becomes more and more a way of life. Any-to-any connectivity is no longer a luxury but a necessity, to the extent that many organizations have turned to the Internet itself for public connectivity, with all the hazards for security that this entails.

The problem is that the type of network needed today is not the voice-engineered network that the service providers built. It is hardly their fault, however. The national network was built to handle three-minute telephone calls, not 12-hour Internet sessions. This network is often called the Public Switched Telephone Network (PSTN). Private lines are essentially just little pieces of bandwidth sold to customers, which then can no longer be used for handling telephone calls or anything else for that matter. There is much debate over the future of the PSTN in a world where more and more fax and voice traffic travels the Internet instead of the PSTN, especially internationally. Some claim there have been more bits representing data than voice on the PSTN anyway, at least since 1995. One telephone service provider says that by 2001 or so, more than half of all access lines (also called local loops) will terminate not at a telephone, but at a PC or some advanced PC-like device.

So the PSTN is not the network that is needed to solve bandwidth and connectivity problems. But what is? Many think the network needed is the Internet, pure and simple. This section will examine some of the needs of modern client/server, LAN-based networking and see if the Internet totally fits the bill.
Bursty Applications

A definition of bursty traffic has already been given, but it will take only a moment to repeat. A bursty application is one that over time generates a torrent of traffic (usually consisting of packets) which flows between end-user devices, typically located on LANs. Between bursts, long periods of relative inactivity occur in which little to no traffic at all passes between the LANs. This is why high-speed LANs linked by lower-speed WAN private lines work so well in the first place. If the LAN traffic is bursty enough, packets can be buffered until the burst is over, at which time the packets can be sent out on the much lower speed link to the other LAN. In actual practice, it is more common to place these packets in a frame, and it is the frames that are properly spoken of as being buffered. A packet is, by definition, the content of such a frame.

Just how bursty is LAN traffic? Very bursty, as it turns out. A more or less accepted measure of burstiness (the term is widely used, but always seems as if it should be more technical) is to measure the peak number of bits per second between two network end points (say a client and a server) and divide this number by the average number of bits per second seen during some specified time interval. This peak-to-average ratio is a good measure of the degree of burstiness for the application or applications used between the client and server during the observation period.
It would be nice if there were a standard definition for this process of measuring burstiness, but there is none. The peak bit rate is easy to define: Whenever bits flow, they always flow at the peak rate allowed by the link, be it 64 kbps, or 1.5 Mbps, or 45 Mbps. It is the average that causes problems. The average is the total number of bits sent in the observation period divided by the length of the observation period. The trouble is that there is no standard observation interval established to measure the total bits. One is as good as another. It could be a minute, an hour, or a second. It could be a day. And once an interval is chosen, whether it represents a Monday or another weekday, or a weekend, would be important too. Each one would have the same peak bit rate, but widely varying averages and therefore also widely varying burst ratios. An example of this varying burst ratio is shown in Figure 1.7.
Figure 1.7 Different intervals, different peak-to-average ratios.

Consider trying to figure out the average speed of a car. The car starts and stops, sometimes cruises, and sometimes sits in a parking space. Commuting is a bursty activity. The average speed of a car driven 10 miles to work and back over a 10-hour period is very low, but no one complains about the car being slow. But when everyone bursts at once, as during rush hour, congestion results and everyone complains about the slowness of the network, which is only marginally slower than the average speed over the entire 10-hour period.

The PSTN is like the highway. It is not made for bursty traffic. The PSTN was made for non-bursty, constant bit rate traffic, not the variable bit rates that bursty applications are noted for. This works fine for digital telephone calls—at least it did in the past—since these full-duplex calls even digitized the periods of silence that flowed in the opposite direction when one party was just listening. So the link used all of the 64 kbps of the voice channel bandwidth (it was the voice digitization process that determined the basic speed for digital private lines) all the time, which is the essence of networking with circuits, or just circuit switching. There will be much more on circuit switching in the next chapter.

But even the fixed-bandwidth, circuit-switched PSTN is due for a change. The latest international digital voice standards from the International Telecommunication Union's Telecommunication Standardization Sector (the ITU-T) feature voice compression (voice that uses less than the normal 64 kbps, usually much less, like 8 or 13 kbps) and silence suppression (no bits are sent when there is no speech). What this does, of course, is make voice start to look pretty much like bursty data applications. So it now starts to make sense to put voice into packets like everything else, since packets were more or less designed primarily for bursty applications. The clearest sign of the transition from constant bit rate voice to bursty, variable bit rate voice is the practice of trying to put voice calls on the Internet or LANs in general, not on the PSTN.

Naturally, it is always desirable to have a certain quality to the voice call, which is most often characterized as toll-quality voice. It makes no difference to the voice bits whether they are transported over a circuit or in packets, as long as the users will use the service. There are two key aspects of delivering toll-quality voice over networks designed for data, like the Internet. First, the voice packets must have a low and stable enough delay to allow users to converse normally. Second, the voice packets must all arrive intact and in sequence, since the real-time nature of voice services prevents any retransmission of missing voice packets.

The Internet was and still is designed primarily for data packet delivery. There have been efforts to improve the basic quality of service provided by the Internet for mixed types of traffic, but little of this work has been completed or implemented yet. In the meantime, the Internet works the same for all bursty applications, whether data or packetized voice. The problem is providing adequate quality of service on the Internet for mixed traffic types. And as Web sites, e-mail, and all sorts of networked information combine text and graphics, and sound and moving images, this issue becomes important enough to merit a section all its own when considering the type of network needed for modern telecommunications.
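To put numbers on the interval problem, here is a minimal sketch in Python. The traffic trace, the link speed, and the observation windows are all invented for this illustration; nothing here comes from actual measurements.

    # Peak-to-average ("burstiness") ratio for the same traffic source,
    # measured over different observation windows. The trace and the
    # windows are invented for this illustration.

    LINK_RATE_BPS = 64_000  # bits only ever flow at the link's peak rate

    # One hour of activity as (start_second, duration_seconds) bursts.
    BURSTS = [(0, 5), (600, 2), (1800, 10), (3540, 3)]

    def burst_ratio(window_seconds):
        """Peak rate divided by average rate over the observation window."""
        bits_sent = sum(duration for start, duration in BURSTS
                        if start < window_seconds) * LINK_RATE_BPS
        average_bps = bits_sent / window_seconds
        return LINK_RATE_BPS / average_bps

    for window in (60, 600, 3600):  # one minute, ten minutes, one hour
        print(f"{window:>5}-second window: burstiness ~ {burst_ratio(window):.0f}:1")

The peak never changes; only the denominator does. The same source measures as 12:1 over a minute and 180:1 over an hour, which is exactly why a quoted burstiness figure means little unless the observation interval that produced it is quoted too.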
Mixed Traffic Types

In the good old days of networking, voice went on the voice network and data went on the data network. The two networks were absolutely distinct, whether public or private. When a private line or access line was ordered, one of the first questions asked was, "Is this line going to be used for voice or data?" Even today, home modem users sometimes are shocked to find that their local telephone company will not even take a report of trouble on a modem line such as, "I can only connect to my Internet service provider at 9.6 kbps this afternoon when it was 33.6 kbps this morning." This access line was bought as a voice line, not a data line, and only voice troubles such as, "I hear a lot of static on the line," can be reported as troubles. Data lines sometimes require additional work to be done on installation and can be harder to troubleshoot, so the cost of provisioning a data line has traditionally been higher than for a voice line of the same length. Therefore, a voice line works for modem data most of the time, but when it fails or degrades, there is a different process of troubleshooting and repair.

One of the reasons for this voice and data separation was regulatory, both within and without the United States. Most voice service providers could process voice calls (e.g., switch them, provide caller identification, etc.), but could not do anything with data bits (or voice bits either, for that matter) except pass them unchanged and unprocessed through the PSTN. This was called raw bit transport or basic transport services, and all the service provider could supply in this regulated environment was a leased private line provisioned for the user's data bits. This was certainly true until the service providers received permission to offer their own public data network services. But this is a topic for the next chapter.

In addition to regulatory issues, there were technical reasons as well. The PSTN was designed for constant bit rate voice calls, not bursty data packets. Putting bursty data packets on all-the-bandwidth circuits was extremely wasteful of network resources, but it worked. Perhaps it would be better to try to put voice calls on the data networks as a stream of packets. In fact, it was just this quest for packetized voice that culminated in frame relay. But this is also a topic for the next chapter. For now, it is enough to point out that trying to put voice traffic on data networks is the classic example of mixed traffic types.

It made less and less sense to have two circuits going everywhere, one for voice and one for data. Even data was no longer just pure text. There were graphical data with engineering drawings, image data with photographs, moving picture data with action content, audio data with the soundtrack, and so forth. It became harder and harder to give the correct quality of service to each of these traffic types on the same network. Some of the differences in the quality of service needed by these different traffic types are listed in Table 1.1. They are all so different in their needs that separate networks have been built just to carry each type of traffic, as shown in parentheses.

Table 1.1 Different Networks, Different Quality of Service
Traffic Type              Error Sensitivity   Delay Sensitivity   Bandwidth Need   Burstiness
64 kbps voice (PSTN)      Low                 High                Low              None
Analog video (cable TV)   Low                 High                High             None
Digital video (HDTV)      Medium              High                Very High        Moderate/None
Packet data (Internet)    High                Low                 Low/Medium       High
Some schemes, such as Asynchronous Transfer Mode (ATM), were invented just to address the issue with a totally new type of network using fixed-length cells instead of variable-length frames to carry packets. The trouble with anything new, especially technologies, is how to migrate easily to the new system and how to integrate newer parts that have already migrated with older parts that have not. The issue of backward compatibility ultimately halted ATM almost in its tracks, except for specific applications for which it was still best suited.

So if some way could be found to easily mix traffic types—be it voice, video, data, or audio—on the same physical network, the potential for such a networking scheme was enormous. There would just be a network, not a voice network, or a data network, or a video network, but just a network. And the way to build such a network is no mystery. All that need be done is to take the most stringent characteristic required for each traffic type, then build that network.

This is where the Internet, built as it was for data applications, and older ones at that, struggles to support the quality of service needed (for instance) to deliver toll-quality voice. So, as well suited as the Internet is for bursty traffic, the Internet fails when it comes to providing a transport service that is low in errors and has low and stable delays. Since more and more traffic appears bursty due to compression and so on, the burstiness issue is not the limitation it once appeared to be. The issues of errors and delay on the Internet are the important ones. But much can be done at the application level to alleviate the effects of missing and incorrect information, and of high and varying delays as well.

But there is still the issue of adequate bandwidth to support the service traffic type. Here is where the current version of the Internet fails miserably. All packets look exactly the same to the Internet. No packet stream can reserve the proper amount of bandwidth it needs to get the minimal error rate it requires (lack of bandwidth leads to packet discards) and the low and stable delay it needs (lack of bandwidth leads to a lot of time spent in buffers). It would be nice if each application could identify its needs to the network. It would also be nice if each application could get the proper amount of bandwidth only when it needed it, like during a burst. This would be flexible bandwidth allocation (circuit networks allocate peak bandwidth all the time), also known as dynamic bandwidth allocation. But at least some requisite bandwidth is allocated, and not just scrambled for as in most packet networks like the Internet. This dynamic bandwidth allocation is often called bandwidth-on-demand.
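The most-stringent-characteristic rule can be made concrete with a small sketch. The severity rankings below are this sketch's own assumptions, loosely following Table 1.1, and the traffic entries simply restate that table.

    # "Build the network for the most stringent characteristic": combine
    # the per-traffic-type needs from Table 1.1 into one composite
    # requirement. The severity rankings are this sketch's assumptions.

    SEVERITY = {"None": 0, "Low": 1, "Low/Medium": 2, "Medium": 2,
                "Moderate/None": 2, "High": 3, "Very High": 4}

    TRAFFIC = {
        "64 kbps voice (PSTN)":    {"errors": "Low",    "delay": "High",
                                    "bandwidth": "Low",        "burstiness": "None"},
        "analog video (cable TV)": {"errors": "Low",    "delay": "High",
                                    "bandwidth": "High",       "burstiness": "None"},
        "digital video (HDTV)":    {"errors": "Medium", "delay": "High",
                                    "bandwidth": "Very High",  "burstiness": "Moderate/None"},
        "packet data (Internet)":  {"errors": "High",   "delay": "Low",
                                    "bandwidth": "Low/Medium", "burstiness": "High"},
    }

    # Keep the most demanding level of each characteristic seen anywhere.
    composite = {}
    for needs in TRAFFIC.values():
        for characteristic, level in needs.items():
            if SEVERITY[level] > SEVERITY[composite.get(characteristic, "None")]:
                composite[characteristic] = level

    print(composite)

Run it, and the composite comes out as high error sensitivity, high delay sensitivity, very high bandwidth need, and high burstiness: in effect, a network at least as demanding as its most demanding traffic type in every dimension at once.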
Bandwidth-on-Demand

The Internet certainly does provide a form of bandwidth-on-demand, or flexible bandwidth allocation. But the bandwidth that current applications demand during bursts is simply not available to all at the same time on the Internet. The whole concept of bandwidth-on-demand means not only releasing bandwidth ordinarily tied up in the form of dedicated private lines when it is not needed, but also getting enough bandwidth during bursts to allow the most bandwidth-hungry applications, like video, to function properly.

This need for increased bandwidth frequently leads to discussions about the need for broadband networks. Unfortunately, the meaning of the term broadband keeps changing as well. Still officially defined as speeds above 1.5 Mbps in the United States and above 2 Mbps almost everywhere else, this speed is not very useful for even simple applications today. Some 20 percent of all e-mail messages received by certain individuals have attachments such as documents or video clips that are larger than 1 megabyte (8 megabits). So even checking the morning's e-mail can be a time-consuming task. So, for most practical purposes, the term broadband is usually employed today to indicate network bandwidths that are at least 10 Mbps, the common Ethernet speed that all PCs and other network devices can easily operate at.

But here is where the flexible bandwidth allocation comes in: If there is a burst, the whole 10 Mbps might be used. If two sources burst at the same time, each can get 5 Mbps, and so on. But if there are many sources bursting, each must get a minimum bandwidth to enable it to function under the load. This is the essence of bandwidth-on-demand: No single application is assigned a fixed amount of bandwidth. The bandwidth allocated is based on the instantaneous demand of the given application.
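As a minimal sketch of this sharing rule, assuming a 10 Mbps pipe, an even split among bursting sources, and an invented 0.5 Mbps guaranteed floor (real networks use far subtler policies):

    # Bandwidth-on-demand: active sources split the pipe evenly, but each
    # is guaranteed a minimum. Pipe size, floor, and even-split policy
    # are simplifying assumptions for this sketch only.

    PIPE_MBPS = 10.0   # the shared pipe (the "broadband" 10 Mbps above)
    FLOOR_MBPS = 0.5   # minimum each bursting source must receive

    def share_per_source(bursting_sources):
        """Bandwidth handed to each active source; idle sources get nothing."""
        if bursting_sources == 0:
            return 0.0
        return max(PIPE_MBPS / bursting_sources, FLOOR_MBPS)

    for n in (1, 2, 5, 40):
        print(f"{n:>2} bursting source(s): {share_per_source(n):.2f} Mbps each")

    # Note that 40 sources x 0.5 Mbps = 20 Mbps of guarantees on a
    # 10 Mbps pipe: honoring a floor forces the network either to limit
    # admissions or to oversubscribe and queue, a tension that returns
    # with the committed information rate (CIR) in Chapter 4.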
While the Internet can easily provide this flexible bandwidth allocation, there are two problems. First, the bandwidth available is nowhere near broadband speeds, especially if the official definition is extended to 10 Mbps. Second, there is no reservation system for assigning a minimal bandwidth (and thus error rate and delay) to an application at all. Steps have been taken to correct this situation on the Internet, but large-scale and widespread deployment will take years. What is an organization to do in the meantime? This book suggests that the answer is frame relay. Frame relay can replace a maze of private lines, each with dedicated bandwidth between sites, with a collection of logical links over the switched services that frame relay provides. And frame relay can still furnish the links to and from the Internet that all organizations need today to pursue their daily activities.
Point-to-Point Private Lines

In the United States, the most common service that organizations expect public network service providers to furnish is the simple point-to-point private line. The term "public network service providers" is used in preference to older terminology such as "carriers" or even "telephone companies" (telcos). Not all providers of frame relay network services are certified as common carriers, although most are, especially the major ones. Certainly, not all providers of network services are, or even were, telephone companies in the traditional sense. It is true that network service providers include many Incumbent Local Exchange Carriers (ILECs, those who held the original franchise for telephone service in a given area), Interexchange Carriers (IXCs), Competitive LECs (CLECs), and Regional Bell Operating Companies (RBOCs, former pieces of the AT&T Bell System that were split off under the Divestiture of 1984). But more and more service providers are now Internet Service Providers (ISPs) and even power company utilities.

Even the term "point-to-point private line" has a lot of variations. Essentially, it is a circuit that connects two different customer sites. The bandwidth is private in the sense that all of the bits on the circuit are the customer's and they pass unchanged through the service provider's network, not in the sense of ownership of the physical link in any way. The private line philosophy and approach is sometimes called all the bandwidth, all the time, since this is what the customer is paying for. The lines are not purchased, in the sense of ownership, but leased from the service provider for a fixed amount per month. A typical lease runs two or three years and is routinely renewed. Sometimes these private lines are called leased lines or dedicated circuits, but the terms essentially mean the same thing. There are even multipoint private lines that are more or less only used for IBM SNA networks today. In a multipoint configuration, a private line connects a central site to several remote locations in a special fashion. Sometimes these multipoint configurations are called multidrop lines, but the idea is the same. The differences between point-to-point and multipoint private line configurations are shown in Figure 1.3.
Figure 1.3 Typical point-to-point and multipoint private line use.

In the United States, where bandwidth is plentiful, the leasing of service provider bandwidth for private use is common practice. The service provider basically sells a piece of its network to the customer, agreeing that the only bits sent on that circuit will be the customer's bits. This practice is both common and relatively inexpensive. Outside of the United States, this practice is neither common nor inexpensive. Many other countries do not have the extensive telecommunications infrastructure that the United States has. So selling all the bandwidth, all the time on a particular circuit to an individual user is not always in the service provider's best interest. Typically, the service provider has just adequate network capacity for customers sharing these lines for voice purposes and has little to spare for private users. And even if there were enough facilities to allow large-scale sale of these public facilities for private use, the prices are kept high enough by financial and regulatory considerations to discourage all but the most economically well-off organizations from using private lines outside of the United States.
Bandwidth Limits

Enough has been said previously in this book and in many other places about the incredible increased demands on network bandwidth that have been placed by the Web, intranets, and newer applications like videoconferencing. This section will not repeat any of the numerous examples that have forced the popular Ethernet LAN technology to move from 10 Mbps first to 100 Mbps and now to 1,000 Mbps with Gigabit Ethernet. The point here is not so much how bandwidth demands have forced LANs to change, but how little effect this pressure has had on the WANs that connect them.

Ethernet LANs running at 10 Mbps have been around since the late 1970s, but most organizations did not have enough need for networking on the local, building-wide level until the mid to late 1980s. The change, of course, was brought about by the rise of the PC and the whole movement away from hierarchical, mainframe- or minicomputer-centered networks to the more peer-oriented, distributed networks that PCs could more easily use. The early LANs of the mid-1980s ran at the 10 Mbps Ethernet speed or the more modest 4 Mbps that IBM used for its token ring LANs. IBM mainframe shops ran token ring because token ring supported SNA sessions, and very well at that; IBM would not endorse SNA on Ethernet (Ethernet struggled with the need for stable, low delays that SNA sessions required). But almost everyone else ran Ethernet, which cost about half as much as token ring to support the same number of users. The general guideline for building Ethernet LANs in those days was one Ethernet segment (the shared 10 Mbps bandwidth on a length of cable) for every 200 or 300 users. And why not? No PC in existence could consume more than a fraction of the 10 Mbps bandwidth on the cable.

Even in those early days of client/server networking, there was a need to link separate LANs over a wide area. The natural solution was to use the same private line approach that had been used to build SNA networks in the previous decade, when networks first became popular. In the mid-1980s, the most popular and affordable private lines were still analog and ran at speeds that are almost laughable today: 4,800 and 9,600 bits per second. These 4.8 and 9.6 kbps lines were fine for SNA and minicomputer networks, where interactions between terminal and host (the central computer) were measured not in megabytes, but in a few hundreds and thousands of bytes. The most popular terminals of the day displayed only alphanumeric characters (not graphics) in 80 columns and 25 rows. So the whole screen only generated 2,000 characters (16,000 bits) of traffic at a time, and most transactions between terminal and host only exchanged a few hundreds of bytes because not all of the screen was sent back and forth.

But LANs were different. The processing power of PCs was not limited to simple exchanges of text, although many early PCs were used in exactly this way—to supply simple terminal emulation capabilities to access the organization's mainframe or minicomputer. Once users began to use PCs for word processing, spreadsheets, and other more familiar client/server applications, this basic terminal emulation would no longer do. PCs were now loading programs from a remote server, exchanging files almost routinely, and overwhelming the lower-speed links that had been adequate for centralized approaches to networking. So it became more common to link LANs with the newer digital private lines, available almost everywhere starting in 1984 and running at 64 kbps.
So many LANs were initially linked in the late 1980s with 64 kbps private lines. How could this possibly work if the LANs were running at 10 Mbps and the links between them ran at 64 kbps, apparently about a 150:1 bottleneck in speed? There were two main reasons. First, LAN traffic, and all data traffic in general, is bursty, as previously mentioned, so not all users needed the bandwidth between the LANs at the same time. Second, even when they did, there were 200 or 300 users (and other devices like servers and printers) sharing the 10 Mbps on the Ethernet anyway. So each user only got about 50 kbps (10 Mbps divided by 200) or about 33 kbps (10 Mbps divided by 300) under heavy loads in any case. The actual traffic patterns varied, of course, and a more precise analysis would take the statistical nature of the bandwidth usage into account, but the main point is still the same: The restricted WAN private line bandwidth was adequate for early LAN connectivity because the traffic was bursty and there were large numbers of users sharing the LAN bandwidth (so no one needed or got a lot of bandwidth).
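The arithmetic behind this sizing argument is easy to check. The short Python sketch below is purely illustrative; the speeds and user counts are just the round numbers used in the text:

    # Rough per-user bandwidth share on a shared Ethernet segment,
    # using the round numbers from the text (not measured values).
    ETHERNET_BPS = 10_000_000   # shared 10 Mbps segment
    WAN_BPS = 64_000            # 64 kbps digital private line

    for users in (200, 300):
        share_kbps = ETHERNET_BPS / users / 1000
        print(f"{users} users: about {share_kbps:.1f} kbps each "
              f"(WAN link: {WAN_BPS // 1000} kbps)")

    # 200 users: about 50.0 kbps each (WAN link: 64 kbps)
    # 300 users: about 33.3 kbps each (WAN link: 64 kbps)

Both per-user shares are in the same neighborhood as the 64 kbps private line itself, which is why the apparent 150:1 bottleneck was tolerable for bursty traffic.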
Things rarely stay the same for long, however, and the pace of change in the computing world can be breathtaking. In a relatively short time, PC capabilities, performance, and applications increased to the point that 200 or 300 users could no longer share a single 10 Mbps Ethernet segment. One solution was to further segment the LAN into more 10 Mbps Ethernets, connected by a network device known as a bridge. The bridge made two or more separate LAN segments look and act as one big LAN. By the late 1980s, it was common to put no more than 20 or 30 PCs on an Ethernet segment. There were still 200 or 300 PCs in the building, but now they were scattered among 10 or so segments connected by bridges.

Bridges could be used not only to connect LAN segments in the same building, but to connect remote LAN segments as well, as long as both were Ethernets or token rings. These remote bridges had no accepted common name; IBM called them split bridges, but the purpose was the same: They all made separate LAN segments behave like one big LAN. Only the users needing remote LAN access needed to use the bridge connecting other sites. Such access was not common in those days, however, as great efforts were made to preserve the administrative and functional lines introduced by departmental computing, a concept first made popular in the minicomputer arena. Suppose only 20 or 30 users needed to access a remote server over the WAN bridge. Each user could now burst at 500 kbps (or 333 kbps), because that is what the users got on their shared 10 Mbps Ethernet segment. But the WAN link was still only 64 kbps in most cases. The result was a world in which users were instantly aware of whether the remote server they were accessing was on the next floor or across the country. Remote servers took forever to respond. It was only due to the bursty nature of the traffic that this scheme worked at all.

The applications that client/server users were used to on the LAN kept evolving as well. By the early 1990s, Windows and other GUIs were becoming common. It is hard to appreciate today the impact that GUIs had on network demands. Sending the letter "A" from client to server was no longer just a matter of sending 8 bits. Information about the font, the size, the color, and so forth had to be sent as well. And things got even worse with graphics and images. LANs were now flooded with all manner of traffic, not just simple text.

There were two main ways that organizations approached this newest crisis. First, organizations could segment the LAN even further and restrict the number of users on each segment, so segments of as few as 2 or 3 users were seen. This was not a very good solution, however, because each segment required a port on a bridge to enable communications with other segments. If segments of 20 users each required 10 bridge ports to connect 200 total users, then segments of 2 users each required 100 ports. Most bridges at the time could easily support up to 16 or so ports, but few products went beyond this. This whole movement to segment Ethernet LANs with bridges is shown in Figure 1.4. Now users that had up to 5 Mbps locally on their segment were totally overwhelming the 64 kbps private lines, no matter how bursty their traffic patterns were. At this time many organizations began to upgrade their 64 kbps private lines to 1.5 Mbps (called a full T1) out of sheer desperation. And in fact, this approach worked very well, mainly due to bursts, but also due to some other factors discussed later.
Figure 1.4 LAN segmentation and bridging.
The second way that LAN administrators dealt with the WAN bandwidth limitations in the early 1990s was first to encourage, then quickly adopt, the new 100 Mbps Ethernet standard. Once this was available, a heavily accessed server or even a real power-user client device could get 10 times the bandwidth previously available on the old Ethernet segment. So, if a site had two extremely busy servers, these devices could be put on an Ethernet hub port with their own 100 Mbps Ethernet segment. And remote users could access these servers, each fully capable of slamming out information at 100 Mbps, at a leisurely 64 kbps in most cases, and only 1.5 Mbps in the best of circumstances. This was an automatic bottleneck of roughly 66:1 in the 1.5 Mbps case and more than 1,500:1 in the case of the 64 kbps private line.

At the time, any increase in private line speed beyond 1.5 Mbps was often impractical. Either the higher-speed circuits were not available from the service provider (bandwidth is not unlimited) or the price was far beyond the budget of even the most prosperous organizations. The next full step up was 45 Mbps (T3), and little was available in between. In some places, multiples of the T1 bandwidth of 1.5 Mbps were available, most often 6 Mbps, but this fractional T3 (often called FT3 or NxDS1) service was and still is not widely available and remains expensive in any case.

The irony of the situation was not lost on either the LAN administrators or the WAN service providers. The fact was that there was, and is, plenty of bandwidth available on the national network infrastructure. It is just that leasing it out as private lines, with all the bandwidth, all the time, is the least efficient way of using this bandwidth. No server needs 100 Mbps all the time: Traffic is bursty. But if a burst is not carried efficiently, the delay defeats the whole purpose of remote access. So a high-speed WAN is still needed, but not as the dedicated bandwidth represented by private lines. And now with Gigabit Ethernet, the problem will only get worse. Clearly, there must be a better solution to this bandwidth problem than private lines.
Connectivity Limits

The corporate networking environment changed in the 1980s, quickly and dramatically. Most private corporate networks had been built to accommodate the IBM SNA environment. Even when other computer vendors' networking products were used (for example, WANGnet and HP networking products), the resulting networks looked pretty much the way SNA networks did: hierarchical, star-shaped wide area networks (WANs). These networks typically linked remote corporate sites to a central location where the IBM mainframe or other vendor's minicomputer essentially ran the whole network. In this sense, the networks were hierarchical, with the central computer forming the top of a pyramid and the remote sites communicating, if at all, only through the central computer.

Building private networks for this environment was easy and relatively inexpensive. Private lines were leased from the telephone companies to link all the remote sites to the central site. Although the private lines were priced by the mile (that is, a 1,000-mile link cost more than a 100-mile link), there were various ways to keep this variable expense from imposing too much of a burden on the private network. The situation is shown in Figure 1.5. In the figure, five sites are networked together with point-to-point leased lines. The connectivity needed is simple: Hook all four of the remote sites to the remaining central site where the main corporate computer is located.
Figure 1.5 Central site connectivity.
A quick look at the figure makes it easy to see how many links are needed to create this network. There are four private leased lines in the figure. Fewer point-to-point lines cannot be used to link each remote site directly to the central location. These links are the main expense when building and operating such private corporate networks.

But the links are not the only expense. There is also the expense associated with the communication ports on each computer in the network. This expense is minimal at each remote site. A glance at the figure shows that each remote site needs only one communications port and related hardware. The situation is different at the central computer site, however. Here the central computer needs a communications port and related hardware to link to each of the other remote sites. And, of course, the central computer must be able to be configured with the total number of ports needed for remote site connectivity. This last requirement was not usually a problem, especially in the IBM mainframe environment.

But what if the number of remote sites grew to 10? 20? 100? How many links and ports would be needed to deploy such a hierarchical network then? As corporations, and the corporate networks with them, merged, expanded, and otherwise grew throughout the 1970s and into the 1980s, this became a real issue for corporate data network designers. Fortunately, it is not necessary to draw detailed pictures of these larger networks to figure out exactly how many links and ports are needed. There is a simple mathematical relationship that gives the number of links and ports needed to join any number of sites into a hierarchical, star-shaped network. If the number of sites (including the central site) is designated by the letter N, then the number of links needed is N − 1. For instance, in Figure 1.5, N = 5 and the number of links needed is N − 1 = 4. The number of communication ports needed throughout the network is given by 2(N − 1), read as 2 times N − 1. When N = 5, the number of ports is 2(N − 1) = 8. Fully half of these ports (given by N − 1 = 4) are needed at the central site. It is now easy to figure out that when the number of sites is 20 (N = 20), the number of links is 19 (N − 1 = 19) and the number of ports is 38 (2(N − 1) = 38), with 19 of them (N − 1) at the central site. If N = 100 (100 locations on the corporate network, a figure easily approached in larger corporations and governmental agencies as well), the number of links is 99 and the number of communications ports is 198, with 99 at the central site. These networks were large and expensive, but not prohibitively so. A small program at the end of this section runs these numbers, for both this star arrangement and the full mesh discussed next.

What has all of this to do with frame relay? Simply this: The rise of corporate LANs and client/server computing in the 1980s meant that building private corporate networks as hierarchical stars was no longer adequate. This is not the place to rehash the evolution of LANs and client/server computing in detail, nor is it necessary. It is enough to understand that LANs and personal computers (PCs) running client/server software (for example, a database client package talking to a database server, or even corporate e-mail applications) are best served by peer-to-peer networks. It can no longer be assumed in a client/server LAN environment that all communications will necessarily be between a remote site and a central location, as the hierarchical networks assumed.
In a client/server environment with LANs connected by WANs, literally any client could need to access any server, no matter where in the corporation the client or the server PC happened to be. Client/server LANs at corporate sites that need to be connected are better served by peer, mesh-connected networks. This need for a different type of private corporate network created real problems. The number of links and ports needed for peer, mesh-connected private LAN networks was much higher than the modest link and port needs in the older hierarchical, star environment. Figure 1.6 points out the problem. A mesh network consists of direct point-to-point links between every pair of locations (LANs) on the corporate network. This way, no client and server are separated by more than one direct link. But the associated numbers for the required links and ports have exploded.
Figure 1.6 Full mesh connectivity.

The figure shows that the peer network requires 10 links and four communications ports at each location to establish the required mesh connectivity. The total number of ports is now 20. The formulas, again based on well-understood mathematical principles, are now N(N − 1)/2 = (5 * 4)/2 = 10 for the number of point-to-point links needed to mesh-connect N = 5 sites, and N(N − 1) = 5 * 4 = 20 for the total number of communications ports needed (four at each site). For 20 sites (N = 20), the numbers would be (20 * 19)/2 = 190 for the links and 20 * 19 = 380 for the ports (19 at each site). While it would not be impossible to configure 19 ports for each site, the hardware expense alone would be enormously high, even prohibitive. And most network managers would instantly balk at paying for 190 WAN leased lines, priced by the mile, to link the sites together. For 100 sites (N = 100), the numbers would require an astonishing 4,950 links ((100 * 99)/2) and 9,900 communications ports (100 * 99). Each site would need hardware to support 99 links to other sites, an impossible task for any common communications device architecture.

Again, various strategies were employed by corporations to keep LAN connectivity expenses in line. Partial meshes were deployed in varying backbone and access network configurations. These measures were only partially successful, in the sense that any economic gain within the corporate network was offset by the loss of efficiency and productivity on the part of those using the network. The traffic jams and bottlenecks that resulted from a minimum link configuration hindered users immensely.

The situation is only worse when the need for international connectivity in many organizations today is taken into account. In many cases, marketing, sales, and even personnel departments deal with issues not only inside the United States, but also around the world. The cost of international private lines, even of modest speeds, is astronomical, and such lines are not even available in many of the countries where connectivity is most commonly needed. One of the attractions of frame relay is that it actually makes international connections affordable.

Today, one of the most pressing needs in any organization is Internet access. Usually this is provided by linking the organization to an Internet service provider (ISP). Each site that needs Internet connectivity must be linked to an ISP. This access could be through another site, but this creates enormous traffic jams and bottlenecks at the site with the Internet access. It would be better to allow each site to link to the ISP site directly. But consider the problem if one of the sites in the previous figure were an ISP site: A lot of resources would be consumed by all of the dedicated private lines required between each site and the ISP. Frame relay's logical connectivity, in contrast, allows each site to be linked over one physical link that carries the traffic for many logical links.
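As promised above, here is a minimal Python sketch that runs the link and port counts for both topologies. It uses only the formulas given in the text; the printed output is included as comments:

    def star_counts(n):
        """Hierarchical star: N - 1 links and 2(N - 1) ports (N - 1 at the center)."""
        return n - 1, 2 * (n - 1)

    def mesh_counts(n):
        """Full mesh: N(N - 1)/2 links and N(N - 1) ports (N - 1 at every site)."""
        return n * (n - 1) // 2, n * (n - 1)

    for n in (5, 20, 100):
        s_links, s_ports = star_counts(n)
        m_links, m_ports = mesh_counts(n)
        print(f"N = {n:3}: star {s_links} links / {s_ports} ports, "
              f"mesh {m_links} links / {m_ports} ports")

    # N =   5: star 4 links / 8 ports,    mesh 10 links / 20 ports
    # N =  20: star 19 links / 38 ports,  mesh 190 links / 380 ports
    # N = 100: star 99 links / 198 ports, mesh 4950 links / 9900 ports

The star grows linearly with the number of sites; the mesh grows with the square of the number of sites, which is exactly the explosion the figures illustrate.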
Technology Winners

In theory, any technology should be totally neutral and independent of users. That is, users should be able to pick and choose the best solution from a whole suite of technologies to fit their situation. In practice, a given technology is always involved in a kind of popularity contest with other methods for the user's attention. Users always pick something comfortable and familiar over something uncomfortable and alien. When it comes to LANs, the good news is that users know Ethernet. The bad news is that they don't want to know anything else. There are other examples along the same lines, but the point is the same.

Some technologies are obviously helpful and immediately prosper. For example, within 40 years of the invention of the typewriter, no major author submitted handwritten manuscripts. Some technologies appeal only to niche audiences and languish, never quite catching on, but never quite disappearing either. Things like electric cars could easily be put in this category. And other technologies simply become obsolete, some rather quickly. The famous Pony Express mail service ran only 18 months before the telegraph put it out of business. There is nothing inherent in a technology that immediately labels it a winner or a loser, so this section does not claim that anything inherent in these technologies doomed their competitors. In many cases, competing methods are still around.

This section examines three popular technologies that all work together today to form a basic structure for an organization's network from top to bottom. These are Ethernet LANs, the Internet/intranets/Web (all three have much in common), and frame relay. They are presented in this order, but only because the main topic of this book is frame relay.
Ethernet LANs

Whether the LAN is called Ethernet or not, the general structure of the LAN is shown in Figure 1.1. A LAN is defined in this book as a private network where even the cable that the bits flow on is totally owned and operated by the organization using the LAN. LANs span small geographical areas, usually no more than a few miles, but typically much smaller areas. LANs are almost exclusively confined to a single building and usually a single floor, especially those based on 10Base-T. In the figure, desktop devices, either clients or servers, are attached to the central 10Base-T hub using up to 100 meters (about 328 feet) of unshielded twisted-pair (UTP) category 3 (Cat 3) or category 5 (Cat 5) copper wire. Usually, the hubs are located in telecommunications closets, but sometimes they are out in the general office space. Note that several hubs could be joined together with what is known as a backbone LAN, which might be 10Base-T, but could easily be something else.
Figure 1.1 An Ethernet (10Base-T) LAN.
Now, not all LANs are Ethernet or 10Base-T. In fact, not all networks used in a building are LANs. Many are based on Wide Area Network (WAN) technologies and older network technologies which have come to be called legacy networks, meaning "what I had before there were LANs." Many of these legacy networks are based on proprietary, vendor-controlled network protocols such as IBM's System Network Architecture (SNA), Digital Equipment's DECnet, or others. There is nothing wrong with basing networks on these protocols. In fact, if they did not work and work well, they would not have thrived enough to become legacy protocols in the first place. But because they did work well, many organizations still retain the legacy SNA network that their financial business runs on, leaving the LAN applications to address more exciting, but routine, user needs.

Ethernet-based LANs caught on because they were less expensive than token ring LANs, the major competition from IBM. Token rings were built in IBM shops, especially those where SNA session support to LAN-attached PCs was required. The reasons for using token ring for SNA are not important here; what is important is that token ring LANs initially needed entirely new building wiring, known as IBM Shielded Twisted-Pair (STP), to be run. This was much more expensive both to purchase and to install than the cable needed for Ethernet, especially once Ethernet, originally run on a thick coaxial cable, became 10Base-T and ran on quite inexpensive unshielded twisted-pair copper wire. Eventually, IBM allowed token ring to be run on UTP also, but by then the advantage gained by Ethernet could not be overcome. The advantage in wiring eventually grew to include Network Interface Cards (NICs). As more manufacturers turned to Ethernet in preference to token ring, price competition was much fiercer. It is not an exaggeration to say that for every token ring vendor, there were 10 Ethernet vendors. This brief discussion of the whole Ethernet-token ring controversy could include considerations of a political nature as well. What is most important is that by the early 1990s, when people thought "LANs," they thought "Ethernet."

This basic Ethernet structure has been around since 1990 or so. Over the years, the 10 Mbps Ethernet speed has jumped to 100 Mbps and has jumped again to 1,000 Mbps (which is 1 Gbps). The new Gigabit Ethernet has excited a lot of equipment vendors and users, but the upgrade to 1 Gbps will not just be a matter of swapping out one network interface card (NIC) for another, as was done when 10 Mbps Ethernet became 100 Mbps Ethernet. In most cases when moving from 10 Mbps to 100 Mbps, the same building wiring was used and the NICs had a small toggle switch on them to allow users to easily switch from 10 Mbps to 100 Mbps. But Gigabit Ethernet will probably require not only new copper wire to be run (and shorter runs at that), but at least some fiber optic cable as well.

The success of any LAN scheme bearing the name Ethernet is virtually assured, as previously noted. The good news for LAN equipment vendors is that users and customers understand and appreciate Ethernet. So, as much as people struggle with Gigabit Ethernet, it will be considered definitely worth the effort. If ever there was a technology winner, Ethernet LANs are a prime example. When it comes to WANs, the impact of Ethernet is being seen everywhere.
Most sites have LANs, and even a number of residences have begun to appreciate the fast connectivity that LANs provide over a limited distance. The distance limit is the key. Even 10 Mbps Ethernets can hardly be expected to network effectively over low-speed, dialup modems of the type used to surf the Web. The lowest speed seriously considered is known as DS-0, which runs at 64 kbps. Even this is barely adequate today; that it works at all is due to the fact that most LAN-based applications are what are known as bursty applications. The term bursty means that over time the application is seen to generate bursts of traffic, bursts consisting of data units usually called packets, which flow between the LANs. Between bursts, long periods of relative silence occur in which little or no traffic at all passes between the LANs. So if the LAN traffic is bursty enough, packets can be buffered (saved in a memory location dedicated to communications) until the burst is over, at which time the packets can be sent out on the much lower speed link to the other LAN.
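A toy calculation shows why this buffering works at all. The burst size below is invented purely for illustration; only the 64 kbps (DS-0) link speed comes from the text:

    # A 100 KB burst arrives from the fast LAN and drains out of a
    # buffer onto a 64 kbps WAN link. The burst size is hypothetical.
    BURST_BYTES = 100_000
    WAN_BPS = 64_000

    drain_seconds = BURST_BYTES * 8 / WAN_BPS
    print(f"Drain time: {drain_seconds:.1f} seconds")   # about 12.5 s

    # The scheme only works if the next burst arrives after the buffer
    # has emptied, which is exactly what the long silences between
    # bursts in LAN traffic provide.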
More and more applications are being run on LANs, where they are usually built on what is known as the client/server model. In the client/server model of network computing, all processes running on any type of computer at all can be classified as clients or servers. Since all modern computers and operating systems can run multiple programs or processes at the same time, there can even be clients and servers running on the same machine at the same time. The difference between clients and servers is that clients talk and servers listen for initial requests. Server processes have become so complex and performance-conscious that it is much more common to implement servers on a dedicated computer of their own, one machine for each major server type.

It is usually easy to tell the clients from the servers in an organization. The clients typically have people sitting in front of them doing work. The servers are typically found in the back office or, if located in the general office space, have signs on them saying "do not touch." It is a truism of client/server computing that no one can actually do work on a server, but no one can do work without it. For the servers listen to client requests for information, which the servers provide over the network for the clients to process.

The importance of the client/server model of network computing is not so much in the mere fact that there are clients and servers on LANs. It is in this implication: Any client must be able to access any server it is authorized to reach in order to perform its job. So, for instance, a client PC in the Human Resources department of some organization must be able to reach all of the Human Resources servers, whether in the same building or not. Naturally, if the servers are not in the same building as the clients, a WAN is needed to reach them. In practice, the servers might be literally anywhere in the organization. In fact, they might even be outside of the organization. This is especially true when speaking of the latest phenomena in the realm of client/server computing: the Internet, intranets, and the World Wide Web.
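The clients-talk, servers-listen distinction can be made concrete with a minimal Python socket sketch. Everything here (the loopback address, port number, and message contents) is invented for illustration:

    import socket
    import threading

    HOST, PORT = "127.0.0.1", 5050        # hypothetical loopback address/port

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((HOST, PORT))
    srv.listen()                          # the server listens...

    def serve_one():
        conn, _ = srv.accept()            # ...for an initial client request
        with conn:
            request = conn.recv(1024)
            conn.sendall(b"reply to " + request)

    threading.Thread(target=serve_one, daemon=True).start()

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect((HOST, PORT))         # the client talks first
        cli.sendall(b"request")
        print(cli.recv(1024))             # b'reply to request'

    srv.close()

The roles, not the machines, define the model: here both processes share one computer, yet one clearly listens for initial requests and the other clearly initiates them.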
Internet/Intranet/Web

The Internet is a global, public collection of computer networks that are all interconnected according to the familiar client/server model. E-mail clients send to e-mail servers. File transfer clients fetch software packages from file transfer servers. And so on. All of these clients and servers must be running software that complies with the protocols of the Internet Protocol Suite, which most people know as TCP/IP, after the two major protocols themselves.

The key to the position of the Internet in networking today is its current public status. The Internet began as a military network in 1969 and only "went public" in a big way in the early 1990s. The software available for the Internet was standardized, inexpensive (often bundled with other software and so "free"), and well understood. It was not long before the software and protocols used for Internet access were also used for client/server computing on LANs. Of course, the client/server interactions within an organization were inherently private, not public. They were not characteristic of an Inter-net, but rather an intra-net, within the same organization. So the application of Internet client/server applications and protocols among the clients and servers within a single organization became known as an intranet. This preserved the privacy of the organization's internal network while at the same time taking advantage of the effectiveness and popularity of the Internet Protocol Suite (TCP/IP).

The key here was the concept of a client being authorized to access a given server. In a private network, such authorization is implicit in the client's membership in the network, but it is not granted haphazardly. For example, few organizations allow all clients to access the server with employees' salary information, since the potential for abuse is much too high.

There are occasions when it is even appropriate to allow clients from outside the organization to access servers within the organization over the Internet. An example would be a manufacturer and its suppliers. An automobile manufacturer might require client access to a tire maker's server in order to place orders, check shipments, and perform various other tasks. This permission can be granted to such external clients, as long as the proper safeguards are in place to prevent abuses and further exposure to outside threats. This arrangement is known as an extranet, reflecting the external client relationship to the intranet servers. A lot of the security for these arrangements is provided by Virtual Private Networks (VPNs) based on various more or less standard security methods. A VPN can provide a way to exchange private information with confidence over the public Internet. The relationship between clients and servers across the Internet, within an intranet, and on an extranet is shown in Figure 1.2.
Figure 1.2 Internet, intranet, and extranet.

Today, much of the activity and popularity of the Internet, and of intranets and extranets as well, is firmly based on the popularity of the World Wide Web, or just the Web for short. In a few short years, the Web has invaded the public consciousness to such an extent that it is hard even now to imagine how people got along without it. The familiar child's question of "What did people do before there was television?" will soon be replaced with "What did people do before there was the Web?" So grade schools assign Web homework, goods are sold over the Web, and even stocks can be traded on secure Web sites. Naturally, any transaction involving finances requires a high degree of security, such as that provided by a VPN.

The Web browser constitutes a kind of universal client that can be run on any type of computer. So instead of needing a separate client program for e-mail, file transfers, and remote login (after all, the servers are all separate), a user really only needs a simple Web browser to perform all of these functions and more. The Web supports a wide range of multimedia content, from streaming audio Web radio stations to Web video movies. All of this information and versatility is rolled into one point-and-click Graphical User Interface (GUI) that is as simple to use as a television remote.

Together, the Internet, intranets, and the Web form another technology winner along with Ethernet LANs. But there is a third technology winner as well, one that can provide an organization with better security than almost all intranets and extranets. This is public frame relay network services.
Frame Relay

Frame relay is typically used as a virtual private data service that employs virtual circuits, or logical (rather than physical) connections, to give frame relay users the look and feel of a private network. The shared nature of logical connections (virtual circuits) is an important concept in frame relay and is fully discussed later in this chapter. In most cases, the frame relay network is built by a public network service provider and use is contracted on a multiyear basis (2- and 3-year contracts are most common) to customers. This relieves the customer of building his or her own private network with purchased network nodes (switches or routers) and leased private lines. This is not to say that frame relay networks cannot be built as totally private networks, just that very few private frame relay networks exist. Most frame relay networks are public, which gives customers a lot of flexibility and economic advantages, in the same way that taking a public taxi is more flexible and economical than buying a car.

Frame relay is the first public network service designed specifically for bursty LAN applications. Frame relay supports all common data applications and protocols typically found in LAN, SNA, and other data network environments. Frame relay support for voice is not uncommon, and video support might become common as well. The virtual network nature of frame relay allows for the consolidation of previously separate networks such as LANs and SNA into a one-network solution. Frame relay supports all mission-critical data applications, whether based on distributed or centralized computing. Frame relay supports LAN interconnectivity, high-speed Internet access, and traditional terminal-to-host or SNA connectivity. The bursty nature of these data applications allows users to take advantage of the special features that frame relay was designed with.
Common frame relay network configurations include LAN-to-LAN, terminal-to-host (host being the most common term for an IBM mainframe or other central computer), LAN-to-host, and even host-to-host. The typical types of applications that users run on frame relay include document or file sharing, remote database access, office automation (such as order entry, payroll, and so on), interactive sessions, e-mail, presentation or graphics file sharing, and bulk file transfers. The customers that form the best base for frame relay services have three major characteristics in common: They have five or more dispersed locations that need connectivity; they want to consolidate separate networks into one integrated network; and they need full or nearly full mesh connectivity between the sites.

The popularity of frame relay can be appreciated when compared to other public data network solutions. The first public network designed specifically for data is arguably the X.25 standard for public packet-switched data services. X.25 public Packet-Switched Data Networks (PSDNs) were built in almost every country around the world, but in the United States, X.25 use remained rare. People preferred to build totally private networks out of privately owned switches (and then routers) linked by point-to-point leased private lines. Even outside of the United States, X.25 networks were plagued by annoying incompatibilities between national versions, lack of availability in key locations, and the failure of the service providers to market the solution effectively. Outside of a few niche applications where it thrives to some extent even today, X.25 in the United States became a public network without a public.

Then along came the Integrated Services Digital Network (ISDN), which was supposed to lead the telephone companies out of the wilderness and into the public consciousness as providers of voice service with unprecedented quality, and all the data service support anyone could ever need. Most of the data service support was provided by X.25 packet switching, hardly a winner on its own. And most people, in the United States at least, were quite happy with the quality of their voice service already. The promised integrated services like video and fax services were either already available in other forms or soon came from other sources such as the cable TV companies. After almost 15 years, ISDN was still not available everywhere. Outside of the United States, with a few notable exceptions like Germany, ISDNs were plagued by annoying incompatibilities between national versions, lack of availability in key locations, and the failure of the service providers to market the solution effectively (one does see a pattern here).

In the face of the less than rousing reception given these previous public data network solutions, the sudden success of frame relay took most service providers by surprise (actual shock in some cases). Inside and outside of the United States, frame relay enjoyed smoother international connectivity, great availability, and brilliant marketing tactics. Frame relay's success, although surprising in its scope, should not have been totally unanticipated. Users had been faced with increasing difficulties in linking their LANs with private lines for some time. Frame relay, unlike X.25 and ISDN, filled an immediate need. This construction of private networks with point-to-point private lines requires some further exploration.
Frame Relay Benefits

The time has come to bring this chapter to a close with a look at just what frame relay can do to help resolve all of the issues regarding network limitations surveyed to this point. This section is not intended to replace the list of frame relay benefits given in the Introduction. Think of this more as the foundation for all the benefits listed earlier. There are four main benefits that frame relay networks offer to organizations. These are in terms of bandwidth, connectivity, economics, and applications. The concluding sections look at each one in turn.
Bandwidth

The need for increased bandwidth, and even broadband networks in some instances, is a fact of life in most organizations today. The question is more one of the best way to get the increased bandwidth. Faster private lines, the traditional answer, are wasteful in the United States and simply not an option in many other parts of the world. When the need for bandwidth for international connectivity is factored in, paying for private line bandwidth is even more of a problem.

The attraction of frame relay is that there is no dedicated bandwidth as on private lines. The total amount of bandwidth is divided according to the needs of the currently running and bursting applications. This is no more than bandwidth-on-demand in action. Since a customer is not paying for dedicated bandwidth, but shared bandwidth, frame relay networks typically use relatively modest speeds for linking customer sites (since sharing cuts down on the overall bandwidth need). But where larger, broadband speeds are needed, frame relay can be used as well. Access from a customer site to a frame relay network can be at 64 kbps, at simple multiples of this basic rate (called fractional T1 (FT1) or Nx64), at 1.5 Mbps (full T1), at 2 Mbps (mostly outside of the United States), and even at 45 Mbps (full T3). Higher speeds are being considered, but 45 Mbps should remain the top speed for some time to come. Of course, a 45 Mbps link used to access a frame relay network is no more than a private line leading not to another customer site, but to the frame relay network itself. So the cost is kept manageable due to the relatively short length of the link. When the fact that all the logical connections share the 45 Mbps bandwidth on a dynamic basis is added, it turns out that frame relay has more than enough bandwidth for any application.
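The Nx64 (fractional T1) rates are simple multiples of the basic 64 kbps rate, and a few lines of Python enumerate the menu. This is an illustration only; note that a full T1's 24 channels give the 1.536 Mbps payload behind the quoted "1.5 Mbps" (the T1 line rate itself is 1.544 Mbps):

    # Nx64 (fractional T1) access rates: N channels of 64 kbps each.
    for n in (1, 2, 4, 6, 12, 24):
        print(f"{n} x 64 kbps = {n * 64} kbps")
    # 24 x 64 kbps = 1536 kbps, the payload of a full T1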
Connectivity

Frame relay connections are logical connections, not the physical connections of a private line network. Logical connections are sometimes called virtual circuits, and the terms are more or less interchangeable. This book prefers the terminology of logical connections because the term virtual has become overworked in the networking industry. There are virtual LANs, virtual private networks, and even several forms of virtual reality. However, the later use of the terms Permanent Virtual Circuit (PVC) and Switched Virtual Circuit (SVC) will be unavoidable. Whatever they are called, the connectivity provided by frame relay networks is logical, not physical. These logical connections form the virtual circuits which replace the dedicated or physical circuits that form the basis of the private line network. This logical connectivity is shown in Figure 1.8.
Figure 1.8 Private lines and logical connections.

Now all access is to the public network, not using point-to-point private lines (or the multipoint private lines still in use with SNA) for all site connectivity, but with logical connections or virtual circuits established on the access link. This is the essence of public networking in general: All reachable end points are contacted through the same local access link. Think of telephone calls on the PSTN, which uses the same idea. Frame relay sites connect to each other by connecting to the frame relay network. This is a huge advantage over private line networking.

Consider an organization with a need to establish an office in Paris, for example. A private line would not only be expensive, but probably not even available at anything near broadband speeds. But frame relay can link the sites together with 1.5 Mbps access in the United States and 2 Mbps access in Europe. This is a pretty neat trick, because the two ends run at different speeds. This is hard to do with physical connections, but easy to do with frame relay virtual circuits. One other connectivity example should suffice. AT&T says that some 40 percent of all telephone calls to Europe from the United States are not voice calls, but fax messages. These are expensive, to say the least, and form a considerable part of the voice network budget. With frame relay, much if not all of this fax traffic can be sent over the frame relay network itself, with the resulting savings to the voice budget.

The Internet can be used for all of the connectivity advantages mentioned in this section. After all, the Internet is also a public network which reaches the world through local access links. But the Internet does not offer even the beginnings of the security that frame relay users count on routinely. Internet security must be added by users, often at considerable expense. And many studies have shown that commercial Internet security products fall far short of their claims to comply with the most basic security standards. Also, the Internet handles logical connections much differently than frame relay does. There is no connection path set up between the routers on the Internet as there is between the switches on a frame relay network. There is no minimum bandwidth guarantee either, much to the dismay of those who would rely on the Internet for things better done with frame relay. So think of the Internet as an adjunct to frame relay, not a competitor. Indeed, it is a rare frame relay network that does not include at least one virtual circuit to an Internet service provider. But frame relay provides better security and performance than the Internet.
Economics
With enough money, there is no reason to favor one technology over another. This is especially true of standards. After all, they all work. There is no single right way to build a LAN or WAN. But since this is not an ideal world, money does matter. And often one technology is favored over another because a small initial economic advantage gets magnified over time until it becomes much more expensive to do anything else. Certainly the small economic edge that Ethernet enjoyed over token ring at the beginning of LANs (token ring chipsets were just more complex and therefore more expensive) was magnified over and over. Frame relay has held on to an economic edge over ATM for some time now and probably will for some time to come.

Some of the discussion in the preceding sections has touched on the economic benefits of frame relay. It should be enough to list these benefits and add a little more information about each aspect. Following are the "big 5" savings possibilities of frame relay. There are more, but they have less impact than these main ones.

1. Bandwidth savings. Since there is no dedicated bandwidth, applications can share a pool of bandwidth on the access line. This can lead to a significant cost savings.

2. Connection savings. Since there are no physical connections between sites, there is no need to have multiple links running to remote sites. All sites are reached through the same access link.

3. International savings. Private lines to other parts of the world are expensive. Frame relay logical connections can reduce this expense. And if frame relay can be used for fax and/or voice, the savings are even greater.

4. Network savings. Frame relay is a logical network. Connections to new sites can usually be added in a week or so, and logical connections rearranged literally overnight. (In contrast, private lines typically require 30 to 60 days to "rearrange.")

5. Management savings. Most of the details of managing the day-to-day activities on a network, such as routing around failures and avoiding congestion, are now handled on the public network on behalf of all customers. Some frame relay users rely on the service provider for all of their network management needs.
Applications

One of the benefits of frame relay that is just beginning to be explored is the wide range of applications that a frame relay network can support. Most people are aware that frame relay supports not only bulk file transfer and delay-sensitive transactions, but is perfectly suited for IBM SNA network support as well. Many now are aware that faxing and voice telephony can be done on frame relay also. Only a few are becoming aware that frame relay can support very good video services, such as corporate video conferencing. What took so long? The answer is simply that frame relay was designed to be first and foremost a data service. It was only after the success of frame relay in that role that frame relay equipment manufacturers and service providers began to explore the use of frame relay for voice and video. The whole point is that if an organization needs a network platform that is secure, fast enough for almost any application, and virtually future-proof, then frame relay is the way to go.
Chapter 2: The Public Data Network

Overview

This chapter will further investigate the role of frame relay as a public data network. There are many types of networks, naturally. But most of them can be classified according to whether they were designed and intended to provide public or private service, and whether they were designed and intended primarily to deliver voice, data, or some other kind of service on circuits or with packets. Frame relay was primarily designed and intended as a public data network service.

This is an important point because of the current trend to use frame relay not only for data, but also for voice. Voice over frame relay works very well, in most cases much better than voice delivered over the public Internet, for example. There is a reason for this, which will be detailed in this chapter. For now, it is enough to point out that what is known as cell-based frame relay delivers voice that is just about as good as many international telephone calls, but at a fraction of the price of using public telephone company voice services. So frame relay is not just a data network anymore.

The second half of the "frame relay was primarily designed and intended as a public data network service" equation is the public aspect of frame relay. Yet many people speak of their frame relay network as the corporate backbone for what remain essentially private network services delivered only to the authorized users of the network within the corporation. Obviously, frame relay is a public network service that can be used in conjunction with a private, corporate network. In some cases the term virtual private network is applied to a frame relay network configuration, but it is more common to apply this term to companies linked over the Internet. Nevertheless, one of the most common uses of frame relay is to link a corporate network to the Internet in a systematic and cost-effective manner.

This chapter will look into frame relay as a public data network and examine how it came to be characterized in this way in the first place. It will enable readers to understand how frame relay can also be used today as a voice and video network, and as a virtual private network that offers privacy to employees while at the same time opening up the organization to the Internet and World Wide Web. This versatility is hard to match with many other network technologies and, of course, is one of the reasons for frame relay's popularity today.
Networking with Circuits and Packets

Frame relay, like X.25 before it, is a packet-switching technology. Frame relay is usually classified as a fast packet technology, for the simple reason that frame relay is fast enough to carry packetized voice and video. ATM is another of these fast packet technologies. The relationship between frame relay and ATM will be explored in much more detail in a later chapter (Chapter 12). ATM can and does form the switch-to-switch backbone technology for many frame relay networks. This form of cell-based frame relay makes the voice and video capabilities of a frame relay network that much more attractive to users, since ATM is designed for such mixed traffic environments.

Before X.25 came along, people built data networks out of circuits, just like the old days of the telegraph network. In other words, all the telephone companies provided was the raw bandwidth on wires to send and receive frames with packets inside. If there was any switching or routing of messages, this had to be done by user-provided equipment. In the telegraph network, this was the equivalent of taking a message from one place and sending it out on the telegraph wire to the next hop down the line. This process was repeated until the message eventually got where it was going. The alternative of a direct wire link everywhere was hardly technically or economically feasible. Yet a method for networking without circuits was clearly needed for data.

Data is not like a telephone call. Data is bursty, but human telephone calls are not. Circuits used for voice are almost constantly in use, at least in one direction, when someone talks and another listens. But circuits designed primarily for voice, when used for simple PC dialups, are filled with idle periods between bursts. These idle periods could be used for other data packets, except for the fact that the bandwidth on a leased, private line belongs exclusively to the customer leasing the line.

Circuits also connect only one point to another, in most cases. This is fine for the PSTN because it was intended to connect the device at the end of one access line to the device at the end of the other access line. It matters little whether the devices are telephones, modems, or fax machines, as long as they are compatible. But one major aspect of data networks in general, especially those built on the principles of the OSI RM, is that they need to connect the client device at the end of one access line to everything. As more and more people use PSTN access lines for Internet access and not telephone calls, it is clear that circuit-switched networks like the PSTN, with their all the bandwidth, all the time approach, are not the best way to handle this traffic.

On data networks like the Internet, information is sent and received in the form of packets. Packets are variable-length units that have some maximum and minimum size. The term packetized means that everything sent around the network, from data to voice and video, is in the form of packets. Voice and video have traditionally been handled by circuit-switched networks (such as the PSTN and cable TV networks) rather than packet-switched networks. Once packetized using state-of-the-art digitization and compression techniques, this type of voice and video more closely resembles bursty data than anything else.

Circuits are ill-suited for packets. Packets can go anywhere, but circuits only go to one place at a time. Circuits reserve all of the bandwidth, all of the time, to one place at a time.
Long circuits are paid for by the mile, so it is too expensive to provide one circuit for each potential user, and a lot of expensive bandwidth on these circuits sits tied up for bursty data applications.
To save money and make more efficient use of long and expensive circuits, packet switching was invented. Who invented packet switching is the subject of intense debate. Surely the Internet people were pioneers, and IBM certainly popularized the process with its own networking products. The whole concept was standardized with X.25 in the 1970s. With packet switching, individual packets to all destinations could be switched from place to place based not on what circuit the packet represented, but on what end application the packet was carrying information for. This is how a single link on a packet-switched network can connect one end device to everything. The packet switch (now called a router on the Internet) could send a packet literally anywhere on the network, based on the information carried in the packet header. Only one link into the network cloud was needed for all of these activities. This user-network interface (UNI) link was still only a circuit in most cases, because packets still must flow on something.

The individual address information attached to every packet gives another perspective for distinguishing circuit switching from packet switching. Circuit switches, such as PSTN local exchanges, switch the entire bandwidth of the circuits (all the bandwidth, all the time) from one place to another. Packet switches only switch the individual packets that form the content of the frames. So packets can be mixed to a whole host of destinations on the same link, which may still be a PSTN circuit, dialed or leased. The basic differences between circuit switching and packet switching are shown in Figure 2.7, which contrasts circuit-switched networks (e.g., the PSTN) with packet-switched networks (e.g., frame relay).
Figure 2.7 Circuit switching and packet switching.
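The heart of the difference can be sketched in a few lines of Python: a packet switch consults a forwarding table keyed by the destination address carried in each packet's header, so one inbound link can reach many destinations. All addresses and port names below are invented for illustration.

    # Toy packet switch: each packet is forwarded by its header address,
    # independently of any circuit. Table entries are invented examples.
    forwarding_table = {
        "host-A": "port-1",
        "host-B": "port-2",
        "host-C": "port-2",   # many destinations can share one outgoing link
    }

    def switch_packet(packet):
        out_port = forwarding_table[packet["dest"]]
        print(f"packet for {packet['dest']} forwarded to {out_port}")

    # Packets to different destinations arrive mixed on the same UNI link.
    for dest in ("host-A", "host-C", "host-B"):
        switch_packet({"dest": dest, "payload": "..."})

A circuit switch, by contrast, has nothing to consult per packet: the entire bandwidth of the inbound circuit is committed to one outbound circuit for the duration of the call.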
X.25: The Slow Packet Network

In the early days of data communications protocols, there were many private protocols designed to address a particular need or specific private network. For example, the airline industry employed IPARS, the International Programmed Airline Reservation System, which was specific to a particular application and network. No one gave any thought to using IPARS on any other network or in any other context except the one for which it was developed.

But at the same time, several companies and telephone administrations in North America and Europe implemented a number of public data networks. The idea was to provide a data service that paralleled the voice service's public connectivity and degree of standardization. These are commonly known today as packet-switched data networks (PSDNs), but other names were common in the past. The names all acknowledge that PSDNs function by switching packets through the network. The network node in a PSDN is called a switch, not a router, bridge, or any other data communications device.

X.25 was intended to be an international standard for building a PSDN in which any data user could contact any other data user, literally anywhere in the world served by a PSDN, to exchange information as easily as voice users used the voice network to exchange information by means of speech. The ideas behind X.25 especially appealed to European telecommunications administrations, where the number of relatively small nations made adoption of a single data standard very attractive to users. In the United States, there was less of a need, economically or politically, to consider alternatives to private data networks.
It is important to realize that the X.25 standard for PSDNs specifies one main thing: the user's connection to the network in a standard fashion. This means that even if different vendors provide the customer premises equipment and the network node switch, they can interoperate as long as they both comply with the X.25 interface standard. How one network node sends X.25 packets to another switch (network node) is beyond the scope of the X.25 standard, which only specifies the User-Network Interface (UNI). Even the important consideration of how one PSDN should send packets to another PSDN (for instance, a PSDN in another country or one run by another service provider) is covered by a separate (but related) international standard known as X.75.

X.25 was developed in the early 1970s by the CCITT (now the International Telecommunications Union-Telecommunications Standardization Sector: ITU-T) and published in 1976. The assumptions made by the designers reflected the state of networking at the time. There were two main ones. First, end-user devices were of limited intelligence in terms of processing power. Second, the communications network connecting the users to the X.25 networks, and the communications network between the X.25 switches themselves, was extremely error-prone and unreliable. Therefore, X.25 contained a lot of provisions that took much of the communications responsibility away from the end-user device and included a lot of error-checking and flow-control services.

Since X.25 was intended to mimic the voice network in function, X.25 was intentionally made a connection-oriented service. This meant that, just as in the public telephone network, one user had to establish a connection to another user across the network before data transfer could begin. Because end devices were so limited in capability and the network similarly limited, this decision made perfect sense. Why send a packet halfway across the country only to find out that the destination could not accept any traffic, for whatever reason? X.25 is a connection-oriented protocol, as opposed to other common protocols such as TCP/IP, which is connectionless, at least within the network itself. There had to be connections, node by node, or else packets would not flow. The connections in X.25 were logical, not physical, and the familiar frame relay terms of permanent virtual circuits (PVCs) and switched virtual circuits (SVCs) were fundamental to the architecture of X.25 as well. The idea of a datagram in TCP/IP, which basically means a connectionless packet, does not exist in X.25.

The X.25 protocol is a layered protocol. The three layers of X.25 are shown in Figure 2.8. At the bottom, the Physical Layer describes the type of connector used and the way that data bits in the form of 0s and 1s are sent to and from the X.25 network switch. Several possibilities are listed in the figure. At the Data Link Layer, LAPB (Link Access Procedure-Balanced) defines a particular frame structure that bits are organized into when sent to and from an X.25 PSDN, as well as various functions that the frames can also provide, such as setting up SVCs, sending data, and control functions. The upper layer is the Network Layer, and here X.25 defines a packet structure that forms the content of certain kinds of frames. The protocol which determines the rules for sending and receiving frames over the X.25 network interface between user and switch is known as the X.25 PLP (Packet Layer Protocol).
Figure 2.8 The layers of the X.25 protocol stack. The X.25 LAPB frame structure is probably the most important part of X.25 for understanding the development of frame relay. This is shown in Figure 2.9.
Figure 2.9 The structure of the X.25 LAPB frame.
This structure has appeared over and over again in data communications protocols. The structure is simple enough, yet possesses all of the capabilities needed to carry the X.25 packet over a series of links between X.25 switches. This is, in fact, a key point. In all layered data communications protocols, the frame is sent on a link-by-link basis only. That is, at every step along the way on an X.25 network, a frame is created by the sender and essentially destroyed (in the act of processing the frame) by the receiver. It might seem impossible to ever send anything intact from a source to a final destination if this is the case. But the key to understanding how layered protocols operate, and how frame relay relates to X.25, is to realize that it is the packet, as the contents of the constantly invented and destroyed frames, that is sent across the network end to end from a source to a destination. Many protocol developers are even in the habit of calling any protocol data unit (PDU) that travels the length of a network from end to end intact, through switches and other network devices, a packet. The packet label seems to apply whether the actual PDU is a TCP/IP Layer 3 data unit (datagram) or an ATM Layer 1 data unit (cell). The term “packet” is used generically in many cases simply to indicate a data unit that leaves one end-user location, flows over a network, and arrives at another end-user location intact. So the first step in understanding the relationship of frame relay to X.25 is to realize that in frame relay it is the frame that flows intact from source to destination on a frame relay network. In X.25 this function is performed by the packet, or frame contents. In an X.25 packet-switching network, packets are switched on the network. In a frame relay network, frames are relayed across the network much more efficiently, since the frames no longer need to be created and destroyed on a link-by-link basis just to get at the packet inside.
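For readers who find code clearer than prose, here is a minimal sketch of this create-and-destroy cycle in C. It is purely illustrative and assumes nothing about the real LAPB bit layout; the structures and field names are invented for the example.

/* Sketch: a packet travels end to end; the frame around it is
 * rebuilt on every link. Field names are illustrative, not the
 * actual X.25/LAPB formats. */
#include <stdio.h>

struct packet {                /* Layer 3: survives end to end  */
    char src, dst;             /* network addresses             */
    char data[32];
};

struct frame {                 /* Layer 2: lives on one link    */
    unsigned char link_addr;   /* link-level address            */
    struct packet payload;     /* the packet rides inside       */
    unsigned short fcs;        /* frame check sequence (stub)   */
};

/* Each switch "destroys" the arriving frame by unwrapping it,
 * then builds a brand-new frame for the next link. */
static struct frame reframe(struct frame in, unsigned char next_link)
{
    struct packet p = in.payload;           /* old frame is now gone */
    struct frame out = { next_link, p, 0 }; /* new frame, next hop   */
    return out;
}

int main(void)
{
    struct packet p = { 'A', 'C', "hello" };
    struct frame f = { 1, p, 0 };   /* frame on the link A to B */

    f = reframe(f, 2);              /* switch B: new frame, link B to C */

    printf("packet %c->%c \"%s\" arrived on link %u\n",
           f.payload.src, f.payload.dst, f.payload.data,
           (unsigned)f.link_addr);
    return 0;
}

In frame relay, by contrast, the switch forwards the frame itself, so the unwrap-and-rewrap step in the middle disappears.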
X.25 to Frame Relay X.25 public packet switching is often cited as the parent protocol of frame relay. X.25 is a slow packet technology while frame relay is a fast packet technology, and so on. But the road from X.25 to frame relay leads through the Integrated Services Digital Network (ISDN). ISDN was a plan to merge access to digital voice and digital data networks and bring some packet-switching capabilities and services to the circuit-switched voice network. This is not the place to debate the failure or success of ISDN with regard to this envisioned merging of packets and circuits. What is important for frame relay discussions is that the layers of X.25 were adapted for use as a signaling protocol and packet data protocol for use with an ISDN. The original LAPB protocol became Link Access Procedure-D channel (LAPD). LAPD was used on ISDN signaling channels to set up, maintain, and terminate ISDN connections. LAPD frames could also carry user data when not otherwise used for this signaling purpose. The LAPD frames now could carry signaling packets to and from the ISDN switch. All user devices sharing the integrated access to an ISDN signaled using LAPD. There is an important aspect of the LAPD frame structure that holds a key to understanding the relationship between X.25 and frame relay, through ISDN. This is the address field. The ISDN LAPD address field has a structure as shown in Figure 2.10.
Figure 2.10 The structure of the ISDN LAPD address field. It is obvious from the figure that there are two different fields involved in the ISDN LAPD address structure. The first is the SAPI (Service Access Point Identifier, 6 bits in length), and the second is the TEI (Terminal Endpoint Identifier, 7 bits in length). These identifiers are just numbers, from 0 to 63 in the case of the SAPI and from 0 to 127 in the case of the TEI field. But why should a frame, which only flows from a single location to another single location (point-to-point), have such a complicated address structure?
This was one of the innovations of ISDN itself. While it is true that all frames flow over the same point-to-point link on a data network, it is not the case (and cannot be) that all packets must do the same. Otherwise, a separate physical point-to-point link to every possible destination would have to be configured at the source for these frames to flow on. In fact, this is the essence of a private network. But this is not the essence of ISDN D-channels and X.25. The parent protocol LAPD, like its child protocol frame relay, allows the multiplexing of connections from a single source location over a single physical link to multiple destinations. The greatest benefit of this approach is to make more efficient use of expensive links and cut down on the number needed in the network. The two fields in the LAPD frame address in ISDN deal with the two possible kinds of multiplexing that a customer site sharing a single physical network link must accommodate. First, there may be a number of user devices at a customer site. Second, there may be a number of different kinds of traffic that each of these devices generates. For instance, even though all information is 0s and 1s, some of these digits may represent user data and some may represent control signaling to the network itself. The TEI field deals with the first of these multiplexing possibilities. The TEI field addresses a specific logical entity on the user side of the ISDN interface. Typically, each user device is given a unique TEI number. In fact, a user device can have more than one TEI, as would be the case with a concentrator device with multiple user ports. The SAPI field deals with the second multiplexing possibility. The SAPI field addresses a specific protocol understood by the logical entity addressed by a TEI. These are Layer 3 packet protocols, and the SAPI provides a method for all ISDN equipment to determine the structure of the packet contained in the frame. Taken together, the TEI and SAPI address fields make it possible for all network devices on the network to (1) determine the source or destination device of a particular frame on an interface (the TEI), and (2) determine the source or destination protocol and packet structure on that particular device (the SAPI). What all this has to do with frame relay will become apparent later. For now, it is enough to point out the functions of the TEI and SAPI LAPD frame address fields.
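The two-octet address field is easy to build and take apart in code. The C sketch below assumes the bit positions given in ITU-T Q.921, including the C/R (command/response) and EA (address extension) bits that the text has not discussed; the SAPI and TEI values chosen are just examples.

/* Sketch: packing and unpacking the two-octet LAPD address field.
 * SAPI (6 bits) and TEI (7 bits) are as described in the text; the
 * C/R and EA bit positions assumed here follow ITU-T Q.921. */
#include <stdio.h>

/* Pack SAPI/TEI into the two address octets. EA = 0 in the first
 * octet ("more address follows"), EA = 1 in the second ("last"). */
static void lapd_pack(unsigned sapi, unsigned tei, unsigned cr,
                      unsigned char out[2])
{
    out[0] = (unsigned char)(((sapi & 0x3F) << 2) | ((cr & 1) << 1));
    out[1] = (unsigned char)(((tei & 0x7F) << 1) | 1);
}

int main(void)
{
    unsigned char addr[2];

    /* SAPI 0 is call-control signaling in Q.921; TEIs from 64 up
     * are assigned dynamically by the network. */
    lapd_pack(0, 64, 0, addr);
    printf("address octets: %02X %02X\n",
           (unsigned)addr[0], (unsigned)addr[1]);

    /* Unpack: recover the two multiplexing identifiers. */
    printf("SAPI=%u TEI=%u\n",
           (unsigned)(addr[0] >> 2), (unsigned)(addr[1] >> 1));
    return 0;
}

The same two-level idea (which device, then which protocol on that device) is what lets one physical ISDN link carry many logical conversations at once.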
X.25 Limitations While the X.25 public, packet-switched data network protocol was a workable and international standard way of transferring data from place to place, X.25 was mostly ignored in the United States and was slow to catch on in the rest of the world as well. There were a number of reasons for this, some of which had to do with the way early data networks were built in the United States and some of which had to do with inherent limitations in the way that X.25 functioned. The X.25 protocol was designed to be a public networking standard which would operate well over noisy, error-prone, copper-based networks. The philosophy of the X.25 designers was to make the network intelligent enough to perform the necessary error recovery and flow-control operations on behalf of the end-user equipment, which had only modest capabilities along these lines when X.25 was developed. The problem with X.25 adoption in the United States was that the private corporations that had the greatest need for the kind of service that a public X.25 network could provide were addressing these same issues with other solutions. Instead of public networking, companies were busily building their own private networks out of point-to-point dedicated links. The bandwidth on these links was leased from the public telephone carriers. Thus, these carriers were in the uncomfortable position of cutting into their leased line revenues by setting up a public, packet-switched network service in direct competition with their private line business. Corporations, which were normally in intense competition with other companies in the same lines of business, were much more comfortable with the data security that private lines provided, at least on the surface (all corporations’ data was still mixed together on the telephone carriers’ backbone network, of course). And although these private lines were leased by the mile, and so could be very expensive in some instances, corporations in the United States at that time were enjoying record profits and undergoing a heady expansion phase along with the entire U.S. economy. (There were exceptional companies that embraced public data networks, usually in the computer/network industry itself.)
As far as the error-prone links were concerned, companies adapted to that environment more or less successfully as well. The data applications developed for this error-rich situation performed their own error recovery at the end points of the network. Since most, if not all, links were point-to-point connections between sites, there was little delay or overhead added to a network as a result of these error recovery mechanisms. Although this extensive error recovery increased the price of network devices, the same corporate profits which bankrolled the links also paid for the expensive network devices at the ends of the link. Besides, the philosophy went, end devices would still perform their own error-checking and flow control even if connected over a public network. After all, no computing device should be so trusting of a network as to process a piece of data delivered without extensive error checking and flow control in any case. These perceived limitations that prevented widespread deployment and use of X.25 public data networks were not so pronounced, or even present, in the rest of the world. The capabilities of the United States telecommunications network and the money available to the users of data network services were not common throughout the world. In fact, the United States was exceptional in this regard. Because the limitations of X.25 mentioned to this point were mainly matters of perception, X.25 public data services became more common throughout the world, especially in Europe. However, even in countries where X.25 networks were extensively deployed and relatively well accepted by the data user community, the technical limitations of X.25 quickly became apparent:

- A copy of each frame was kept link-by-link on the X.25 network.
- Full error checking was done hop-by-hop on each frame.
- All switches had to examine the packet (frame contents).
- The X.25 network had to do flow control in the network.
- Logical connections between sites were at the packet layer (Layer 3 of the OSI RM).

One of the limitations was that a copy of the frame had to be kept in each network node (the X.25 switch) until a message was received from the next node that the frame was received without errors. Of course, if the frame was received with errors, another copy was sent, and perhaps another, and another, until one was received without errors or a retry counter was exceeded. Many X.25 switches had to have large and expensive buffers to hold all of these frame copies within the network. Next, this full error-checking procedure had to be performed on each hop through the X.25 network. A hop is simply a link between adjacent X.25 switching nodes. In the United States, 10 or even 15 X.25 switches could be found on coast-to-coast links. This elaborate error-checking procedure could slow throughput to a crawl. Each X.25 switch had to examine the X.25 packet inside the X.25 frame to determine the proper output port to send the packet out on toward the correct destination. This meant that each X.25 switch had to assemble the entire packet (which could be spread across several frames), examine the header, and then disassemble the packet into another potential series of frames for the outbound link. Trying to limit packet size to fit in a single frame was not an elegant solution, since this produced that many more packets to be switched by the X.25 network, but this is exactly what was usually done. The X.25 network also did flow control, which prevented any sender from overwhelming a receiver.
This seemingly innocent feature meant that a sender could pump data into the X.25 network at 9600 bits per second (bps) that was intended for a destination that could only accept data at 1200 bps. The X.25 network was forced to buffer this extra data until it could be sent to the destination, again at an added expense for the additional memory needed in each X.25 network node (although the X.25 network could eventually slow the sender). Finally, the fact that the logical connection was located inside the packet itself meant that the full X.25 Packet Layer Protocol (PLP) had to be implemented in every X.25 network node. This layer also had to do error control, and the additional processing power needed to switch the packet correctly only added delay to the network as a whole.
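A short sketch may make this buffering burden concrete. The C fragment below loosely models the hold-a-copy-until-acknowledged discipline of a single X.25 hop; the simulated link behavior, function names, and retry limit are all invented for illustration.

/* Sketch: why each X.25 node needed large buffers. A copy of every
 * frame is held until the next node acknowledges it; on an error the
 * copy is resent, up to a retry limit. Timeouts are not modeled. */
#include <stdio.h>
#include <stdbool.h>

#define MAX_RETRIES 3

/* Stand-in for the next hop: pretend the link corrupts the first
 * two copies of every frame sent. */
static bool next_hop_accepts(int attempt) { return attempt >= 2; }

static bool send_with_arq(const char *frame_copy)
{
    for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
        printf("sending \"%s\" (attempt %d)\n", frame_copy, attempt + 1);
        if (next_hop_accepts(attempt)) {
            /* Positive acknowledgment: the buffered copy can go. */
            printf("acknowledged; buffer released\n");
            return true;
        }
        /* Error reported: keep the buffered copy and try again. */
    }
    return false;  /* retry counter exceeded */
}

int main(void)
{
    send_with_arq("frame 7");
    return 0;
}

Multiply this per-frame bookkeeping by every virtual circuit on every hop, and the memory and processing costs described above follow directly.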
All these limitations conspired to make X.25 an adequate network technology for modest data networking needs, but wholly inadequate for voice, video, and even the kinds of connectivity needed in the mid-1980s for the new breed of applications that ran on the faster premises networks that appeared at this time: the bursty LAN applications. The preceding discussion of X.25 limitations might seem harsh. But it must be pointed out that X.25 is still a widely used, international standard, packet-switching technology that works very well in any number of network situations. Most telex (teletype or teletypewriter) applications employ X.25 for short text messages, and sometimes even more. Equipment is very cost-effective and comes in 240 port models, surely a sign of a viable technology. There are many features that make X.25 attractive today, beyond simple standards issues, that are hard to find in newer, less mature, networks. These include things like the creation of closed user groups (for public network security), signaling connectivity based on universally unique telephone numbers (no separate network address needed), call redirect and forwarding (for finding mobile users), and other features. It is hard to send a message to an oil platform or other remote location without using X.25 today. In fact, X.25 will be found today wherever the infrastructure to support frame relay is lacking.
Modern, Packet-Switched Data Networks As interesting as the history of the telegraph might be, the type of data network most relevant to frame relay is the modern, packet-switched data network, sometimes abbreviated PSDN, because the international standard for PSDNs is recommendation X.25 from the International Telecommunications Union (ITU). And X.25 is really the direct parent technology of frame relay. Frame relay has not been called “X.25 on steroids” and “X.25 for the 1990s” for nothing. This section will introduce not only X.25, but also many of the key concepts of modern data networking, such as the Open Systems Interconnection Reference Model (OSI RM) from the International Organization for Standardization, usually (but incorrectly) abbreviated ISO. At the end of this section, not only will the stage be set for introducing frame relay’s distinctive features, but the incentive for doing so will be absolutely clear. It is necessary to fast-forward into the 20th century to do so, not because networking did not evolve in the interim, but because most of the attention was applied to voice, not data, networking. First, however, a simplified, illustrated, and painless look at the OSI RM is in order. Even those familiar with the OSI RM might still want to read this section carefully, as there are many misunderstandings about just what the OSI RM does and what it is for. And of course, without a thorough knowledge of the OSI RM, most of the interesting features of frame relay are utterly meaningless. Simply put, the OSI RM is a standard way of bridging the gap between a software program (application) running in the local memory address space of a computer and the hardware connector on the back of the machine. The OSI RM creates the bridge by breaking down this software-to-hardware task into a series of layers, seven in all, that have strictly defined functions that are to be implemented in a standard fashion. The layers of the OSI RM are illustrated in Figure 2.3.
Figure 2.3 The OSI RM. All the OSI RM does in a computer is to take bits from an application program and send these bits out to a communications port on the back of the computer, and vice versa. What is so complex about this seemingly simple task that it requires seven layers to perform? Most critics of the OSI RM would indeed maintain that the seven-layer OSI RM needlessly complicates what is basically a trivial computing task. However, this also trivializes the very act of networking to the point where complex networks become impossible to build. To see why networking can be so complex, this section will gradually build up the reasoning behind the structure of the OSI RM, with the ultimate goal of better understanding the layers of the X.25 and frame relay protocol architecture.
At the bottom of the OSI RM is the Physical Layer. This is the hardware. Actually, it is a full description of the connector on the back of the machine, such as the RS-232C connector. Technically, this is now the EIA-232D or EIA-232E connector. The RS-232C connector is a D-shaped, 25-pin connector that is normally attached by a cable to a modem for communications purposes. Note that the cable, the modem, and any other sending or receiving telecommunications equipment is not really part of the OSI RM. Such components (including multiplexers and so on) are usually referred to as the sub-physical layer network. These components must be there, of course, but with a few modem exceptions, the OSI RM does not standardize their form or function. They are covered by other standards and will not be considered further. The OSI RM Physical Layer specification is divided into four parts: mechanical, electrical, functional, and procedural. The mechanical part specifies the physical size and shape of the connector itself, the number and thickness of the pins, and so forth, so components will plug into each other easily. The electrical specification spells out what values of voltage or current determine whether a pin is active or what exactly represents a 0 or 1 bit. The functional specification determines the function of each pin or lead on the connector (pin 2 is send; pin 3 is receive, etc.). The procedural specification details the sequence of actions that must take place to send or receive bits on the interface to or from a modem or other network device (first pin 8 is activated, then pin 5 goes high, and so on). The RS-232C interface is a common implementation of the OSI RM Physical Layer that includes all these elements. So far, the data communications task does not seem too tough. It would be nice if there were a simple way to make the application program send bits to the connector (or connectors, since there may be more than one, e.g., COM1, COM2, etc.). And there is. All programming languages, from Basic to C++, allow the application programmer to write to a port. That is, there is a simple program statement to send bits for any purpose out the connector. For example, a statement such as “Write (Port 20$, “A”)” will send the internal bit sequence representing the letter “A” out of the send data pin on the connector. But the data communications task is trickier than this. It might not be such a good idea to just send raw bits from an application program to a communications port all in one statement. This method has several problems, some of which are not obvious. First, the statement mentioned will not check to see if there is really a modem attached to the port at all. It executes and completes when the data is delivered to the port. The application itself must check for error conditions—all possible error conditions—and take action in the application program to correct them. Second, there is no way to determine if the data has actually been delivered across the network to the destination system (which is presumably running a receiving application program). Third, since there are different internal bit configurations for the letter “A” (7-bit ASCII and 8-bit EBCDIC, for example), there is no guarantee that even if the bits get through to the destination system that the other machine will understand the delivered bits as the letter “A.” There are other, subtle problems, but these are the main ones. Maybe the decision to write to a port was not a wise one.
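For comparison, here is roughly what “writing to a port” looks like in modern POSIX C; this is an updated stand-in for the pseudo-statement above, and the device path is purely illustrative. Note how little it guarantees: the call succeeds once the byte reaches the local driver, and nothing confirms that a modem is attached, that the data arrived anywhere, or that the far end read 0x41 as an “A.”

/* Sketch: raw "write to a port," POSIX style. Nothing here checks
 * for a modem, confirms delivery, or agrees on character codes. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/ttyS0", O_WRONLY | O_NOCTTY);
    if (fd < 0) {
        perror("open");          /* the only error we can see... */
        return 1;
    }
    if (write(fd, "A", 1) != 1)  /* ...and a purely local one at that */
        perror("write");
    close(fd);
    return 0;
}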
This is exactly the rationale behind the OSI RM and all layered communications protocol architectures like TCP/IP. The OSI RM offers a way to write application programs with statements like “send e-mail” or “get a file” in a networked environment in a standard fashion. These statements are now standard library functions or subprograms that are linked to the application program at compile (and link) time. This saves a lot of time and effort in the network program development cycle. But what are all those other layers for? The answer involves understanding how modern data communications networks were built in the 1990s. Consider a very simple network as shown in Figure 2.4. The figure shows two systems—A and B—connected at their physical ports by a cable. The systems may be two feet, 200 miles, or even 2,000 miles apart. There may be modems and multiplexers (muxes), and all manner of sub-physical layer network devices in between; it makes no difference to the OSI RM. The only important thing is that when bits are sent on the port on System A, they must show up at System B, and whatever bits are sent out of the port on System B end up at System A.
Figure 2.4 A very simple network.
However, bits are just bits. There should be a way for System A to tell System B: “Here come some bits, here they are, and this is the end of the bits.” In other words, the unstructured bit stream is organized into a data unit known as a frame. Of course this task (frame organization and interpretation) is much too difficult for the Physical Layer to handle in addition to its assigned tasks (and it is not part of the specification). So the ISO committee invented a layer above the Physical Layer just to send and receive frames: Layer 2 of the OSI RM, the Data Link Layer. These frames are officially known as Layer 2 Protocol Data Units (L2-PDUs) in OSI RM language. The frame structures of all Layer 2 protocols (the official OSI RM Layer 2 protocol is known as HDLC [High-level Data Link Control]) have many features in common. The frames all have a header, a body, and a trailer. On LANs, the frame header usually contains at least a source and destination address (known as the physical address or port address since it refers to the physical communication port), although this is absent in HDLC. There is also some control information. The control information is data passed from one Layer 2 to another Layer 2 and not data originating from a user. The body contains the sequence of bits being transferred across the network from System A to System B. The trailer usually contains at least some information used in detecting bit errors (such as a Cyclic Redundancy Check [CRC]). There is always some maximum size associated with the frame that the entire unit must not exceed (because all systems must allocate space for the data in memory). So the task of the Data Link Layer in the OSI RM is to transfer frames from system to system across the network. The network in the original OSI RM must consist of point-to-point links between adjacent systems. Actually, the OSI RM allows what are known as multidrop or multipoint links also, but these are seldom seen today except in older networks and do not change the main points of the discussion. The only impact of these multipoint links is that the Data Link Layer (DLL) allows up to 256 systems on a single link. Obviously, in the figure whatever bits are sent out the port on System A arrive at System B, and whatever bits show up on the port on System B must have come from System A. The frame source and destination addresses in this instance are not even needed. The situation changes in the network shown in Figure 2.5. There are now two point-to-point links in the network and System C has been added to the network. System B now has two communications ports. System A and System C are now nonadjacent systems (i.e., they are not connected directly by a single one-hop point-to-point link). The question is: Can System A send a frame to System C? If not, then the network needs a direct link from System A to System C, and a large network will quickly become hopelessly complex. If so, then it is by no means obvious that the Data Link Layer is capable of doing this. In fact, the original definition of the Data Link Layer requires adjacent systems.
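Sketched in C, the generic frame shape just described (header with addresses and control, body, trailer with an error check) might look like the following. The layout and sizes are illustrative only; the error check shown is the bitwise CCITT-style CRC used by HDLC-family protocols, without the rest of real HDLC framing (flags, bit stuffing, and so on).

/* Sketch of a generic Layer 2 frame: header, body, trailer. */
#include <stdio.h>
#include <string.h>

#define MAX_BODY 128              /* every frame has a size ceiling */

struct l2_frame {
    unsigned char dst, src;       /* header: physical/port addresses */
    unsigned char control;        /* header: Layer 2 control info    */
    unsigned char body[MAX_BODY]; /* user (or upper-layer) data      */
    unsigned short fcs;           /* trailer: frame check sequence   */
};

/* Bitwise CRC using the CCITT polynomial x^16 + x^12 + x^5 + 1. */
static unsigned short crc16_ccitt(const unsigned char *p, size_t n)
{
    unsigned short crc = 0xFFFF;
    while (n--) {
        crc ^= (unsigned short)(*p++) << 8;
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000) ? (crc << 1) ^ 0x1021 : crc << 1;
    }
    return crc;
}

int main(void)
{
    struct l2_frame f = { 0x03, 0x01, 0x00, "some user data", 0 };
    f.fcs = crc16_ccitt(f.body, strlen((char *)f.body));
    printf("FCS = 0x%04X\n", (unsigned)f.fcs);

    /* The receiver recomputes the CRC; a mismatch means bit errors. */
    printf("check %s\n",
           f.fcs == crc16_ccitt(f.body, strlen((char *)f.body))
               ? "OK" : "bit errors detected");
    return 0;
}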
Figure 2.5 A more complex network. In the network illustrated in the figure, System B plays a critical role. System B can no longer assume that all frames arriving from System A are destined for System B. Some will obviously be for System C. System B will have to send these frames out the opposite port and on to System C. In the OSI RM, Systems A and C are now End Systems (ES) and System B is now an Intermediate System (IS). In 1979, when the OSI RM was developed, these systems were envisioned to be multiuser systems one and all, with many terminals attached to the host computer. As it turned out, this was asking a lot of System B. In many cases, there was simply not enough computing power to efficiently handle multiple communications links and many user terminals at the same time. The solution was to dedicate System B exclusively to the networking task. That is, there were no users on System B. System B merely took frames from an input port and determined the proper output port to resend the frames on. System B became a network node in modern language. There was another problem with System A sending frames to System C on a network as simple as this one. Recall that frames contain the physical or port address of the source and destination. Could System A really be expected to know the port address of System C? What if it changed? And could System B be expected to know the proper links to send frames out to all possible destination port addresses in a large network? Probably not. The OSI RM addressed this issue as before: The ISO added another layer on top of the Data Link Layer: Layer 3, the Network Layer.
The name Network Layer is a little confusing. The original name for this layer was the Routing Layer, since it addressed the need to route data through the network from a source to a destination. While this describes the layer’s main function, routing is not all that Layer 3 does. So the name was changed to reflect this reality. The Network Layer does not use the frame address to determine the destination for data. This may seem surprising, but the problem was that the physical address gave no indication of location on the network. Physical address “2125551212” could be anywhere in the world. It would be nicer if the address used by the Network Layer was similar to a telephone number: Anything starting with “212” was in Manhattan, for example. System B would route the data addressed to anything starting with “212” to Manhattan and let other systems in New York worry about just where 555-1212 was. System B now becomes a router. But in the tradition of public data network terminology, this network node is called a switch when it is used on a public carrier’s data network. The exception is the Internet, where all public network nodes are called routers. The ISO addressed this physical address problem by inventing a network address for Layer 3 (actually, the OSI RM calls it a Network Service Access Point (NSAP), but it is used as a network address). Every system in the network has a network address, whether end system or intermediate system. Systems could have many physical or port addresses, but still needed only one network address in most cases. The routing function of System B means simply this: The Data Link Layer on Port 1 of System B receives a frame from System A (which has System B’s Port 1 address as a destination). Inside the body of that frame is yet another data unit, a Layer 3 Protocol Data Unit (L3-PDU) known as a packet (in OSI RM) or datagram (in TCP/IP). This PDU has a header and body, but no trailer. In the header is the source and destination network address (System A and System C). System B looks the destination address up in a table (known, not surprisingly, as the routing table), and finds out which output port to forward the PDU out on. System B then puts the packet or datagram inside another frame (with System C’s physical address) and sends it to System C. The situation is now as it appears in Figure 2.6, with the layers of the OSI RM filled in below the network nodes and the sub-physical network indicated by simple links. (If the systems have applications running on them, more layers are needed.)
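The routing step lends itself to a short sketch in C, using the telephone-style addresses from the example above. The table entries, prefixes, and port numbers are invented; a real Layer 3 does far more (and uses far faster lookups), but the lookup-then-forward idea is the same.

/* Sketch: the routing step. The network address is location-
 * structured (like a phone number), so the node matches leading
 * digits against its routing table to pick an output port. */
#include <stdio.h>
#include <string.h>

struct route { const char *prefix; int out_port; };

static const struct route routing_table[] = {
    { "212", 2 },   /* Manhattan-bound traffic -> port 2 */
    { "518", 3 },   /* another region          -> port 3 */
    { "",    1 },   /* default route           -> port 1 */
};

static int route_lookup(const char *dest_addr)
{
    for (size_t i = 0; i < sizeof routing_table / sizeof *routing_table; i++)
        if (strncmp(dest_addr, routing_table[i].prefix,
                    strlen(routing_table[i].prefix)) == 0)
            return routing_table[i].out_port;
    return -1;  /* unreachable: the default entry always matches */
}

int main(void)
{
    /* System B receives a packet for 2125551212, consults the table,
     * then re-frames the packet for the chosen output port. */
    printf("2125551212 -> port %d\n", route_lookup("2125551212"));
    printf("5185550000 -> port %d\n", route_lookup("5185550000"));
    return 0;
}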
Figure 2.6 Layers in a data network. There is one more layer needed to get data across even this simple network. Something on the end systems had to handle the interface between the network and the application software so that long files were broken up to be sent across the network, electronic mail was sent to the proper network address, and so on. This function was not needed on the Intermediate Systems (no one sent e-mail to a router), but only on the End Systems where traffic originated and terminated. Of course, the ISO created another layer to do this: Layer 4, the Transport Layer (originally called the End-to-End Layer). This layer prepares messages (Layer 4 PDUs) for transport across the network. The other three layers, 5 through 7, have little relevance to X.25 and even less so for frame relay, but their functions as envisioned in the OSI RM should be outlined. It is important to realize that these layers are never implemented separately, but are always bundled in a single library function, which is essentially what the Internet protocol suite (TCP/IP) does. The Session Layer (Layer 5) contains what are known as state variables about a connection (session) on the network. For example, the Session Layer would know that 3 of 4 files intended to be transferred across the network have been sent successfully, and the network failed (the session terminated) halfway through the last file. The Session Layer would only send the rest of the last file when the network came back up again. This is essentially a way of keeping track of the history of a connection.
The Presentation Layer (Layer 6) takes care of all differences in internal data representation (e.g., 7-bit ASCII and 8-bit EBCDIC) between different systems. It does so by translating all data into a common representation (known as Abstract Syntax Notation One [ASN.1]) and sending this across the network, where the receiving system translates it back to the proper representation for the destination computer. A similar function occurs when a native Englishman and a German converse freely in French, which both happen to know. The Application Layer (Layer 7) is really misnamed. It is not really a layer at all, and it contains no application programs. Rather, there are various compartments containing Application Program Interface (API) verbs that are appropriate for network tasks. For example, “Send e-mail” and “Get a file” are separate Application Layer APIs that may be used in the application program (which runs above the OSI RM) and are linked to the application program before it is run. Of course, there is no need to include the e-mail compartment in a file transfer application program. All modern network protocols are implemented in layers, whether OSI RM layers or not. This simplifies the overall networking task and releases the application programmer from the chore of writing a complete network implementation in each and every application program. This layered approach is followed by X.25 and frame relay. In a layered protocol, there are usually two or more possibilities (or options) that a network designer may choose to implement at several layers. When one of these possibilities is actually chosen at each layer between the application program and the network, this forms a protocol stack, a term which precisely reflects the layered nature of the protocol (the layers are stacked one on top of another).
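As a toy illustration of the Presentation Layer idea, the C fragment below maps a few letters between the two representations mentioned above. Real presentation services are far more general; the mapping here covers only uppercase A through I, purely for illustration.

/* Sketch: the Presentation Layer in miniature. The same letter "A"
 * is 0x41 in ASCII but 0xC1 in EBCDIC; translating through an agreed
 * common form lets unlike machines interoperate. */
#include <stdio.h>

static unsigned char ascii_to_ebcdic(unsigned char c)
{
    if (c >= 'A' && c <= 'I')          /* EBCDIC A-I = 0xC1-0xC9 */
        return (unsigned char)(0xC1 + (c - 'A'));
    return c;  /* unmapped: pass through in this toy example */
}

int main(void)
{
    printf("ASCII 'A' = 0x%02X, EBCDIC 'A' = 0x%02X\n",
           (unsigned)'A', (unsigned)ascii_to_ebcdic('A'));
    return 0;
}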
Early Public Data Networks The first public network built in the United States for telecommunications purposes was not a voice network at all. It was a data network: the telegraph network. It was built on the principles of Samuel Finley Breese Morse, but it was not even the first public telecommunications network in the world. The first national telecommunications networks were built in Europe during the late 1700s and early 1800s. These were true data networks, commonly known as optical telegraph networks. These optical telegraphs were sophisticated semaphore systems capable of sending messages across hundreds of kilometers in an hour or so. The message speeds were limited by the need to relay the messages from tower to tower along the message’s route and the complexity of the encoded message. But these systems were much faster than any other form of communications available. The most elaborate systems were built in France and Sweden. These networks could take four minutes to transmit a message such as, “If you succeed, you will bask in glory.” The messages were sent as a series of numbers which had to be looked up in code books and written down. By 1799, the code books had grown to three volumes with 25,392 entries. This clearly pointed out the need for a system that was based not on codes but on alphabetic representations, but this possibility was never explored. Besides, the use of code numbers provided a measure of security for what was essentially a broadcast medium. By 1800, the maximum speed attainable for a message was about 20 characters per minute, or 2.67 bps (bits per second) in modern terms. Several important concepts and advances came out of these first public data networks. The idea of compressing information (the code books) was proven to be a vital and viable concept. The whole area of error recovery and flow control (a sender must never overwhelm a receiver) was pioneered in these early systems. And the concept of encrypting sensitive information was first used on a large and systematic scale on these networks. The first practical electrical telegraphs merely translated the codings of the optical systems to a new medium. The semaphore towers were replaced by pole-mounted strands of copper and iron (cheaper and stronger) wire. As soon as electricity was shown to be a predictable physical entity by scientists such as Michael Faraday, engineers began working on schemes to use it to send messages over the wires. One important side effect of this activity was the exposure of people to this new technology. In 1824, a New York University art professor named Samuel Finley Breese Morse attended a lecture on electromagnetism, which set his mind in motion. The limitations of communication over distance were made painfully obvious to Morse in the following year. His wife died when he was out of town and it took days for him to learn of it. By 1837, his ideas had reached the patentable stage. He had strung 1700 feet of wire around his room at NYU. That same year, he staged a public demonstration of his device. Morse had grappled with the code book problem. His associate and assistant, Alfred Vail, soon hit upon an ideal solution. Instead of transferring coded letters and numbers, which had to be looked up in voluminous code books, Vail represented simple text by means of dots and dashes, where a dash was defined as three times the duration of a dot. A spool of paper at the receiver printed out the dots and dashes as they were sent. 
These dots and dashes can be easily thought of as the 0s and 1s of modern binary codes.
In 1838, Morse demonstrated a working telegraph to a Congressional committee in the Capitol building in Washington, D.C. By this time, the telegraph was working over 10 miles of wire filling the room. After some delay and bickering, including a seriously proposed amendment to fund the study of hypnotism and the possibility of the end of the world in 1844, Congress approved $30,000 to run a telegraph line between Washington, D.C. and Baltimore, Maryland in March of 1843. This 40-mile run would be a true test of the technology’s capabilities. The first official telegram was sent on May 24, 1844 between Vail at the Baltimore and Ohio Mount Clare railroad station and Morse in the Supreme Court chamber of the Capitol. The famous message “What hath God wrought?” was not a Morse inspiration. To prevent possible collusion, the assembled dignitaries in Washington decided to go along with Morse’s suggestion that a spur-of-the-moment message be sent and returned. The expression was selected by Annie Ellsworth, daughter of a government official who was a longtime friend of Morse. Vail immediately echoed it back and the witnesses cheered. By May of 1845, the line had been extended to Philadelphia. It cost about $50 per mile to build a telegraph line, so expanding service was not an enormous burden. Rates remained based on message size. In England, by contrast, two networks had been set up by September 1847. The rate structure of the Electric Telegraph Company was based on distance, but this proved too expensive for most potential customers. By 1850, a maximum rate of 10 shillings was imposed; this was dropped to 2 shillings by 1860. This whole distance-sensitive versus flat-rate pricing issue comes up again and again in networking. Frame relay pricing is typically distance-insensitive. Message transfer remained slow, mostly due to the laborious task of interpreting the paper tape dots and dashes into letters and words. In 1848, a 15-year-old boy in Louisville, Kentucky, became a celebrity of sorts when he demonstrated the odd ability to interpret Morse Code directly by ear alone. Soon this became common and speeds of 25 to 30 words per minute were achievable. Figuring a 5-character word and roughly 8 bits per character, this rate works out to almost 20 “bits” per second, impressive for its day. It compares very favorably to the 2.67 bits per second rate of optical telegraphs. By 1858, newer mechanical senders and receivers boosted the rate on the telegraph lines up to 267 bits per second. Data compression was used on the telegraph lines as well. There was no systematic code use, but an ad hoc abbreviated writing taken from the newspaper industry was widely used. It was known as Phillips code after the Associated Press’ Walter P. Phillips. Operators could tap out “Wr u ben?” for “Where have you been?” and even “gx” for “great excitement.” The code was only used internally and customers were still charged by the word. The success of the telegraph spawned a whole new kind of business as well. In 1886, a young telegraph operator named Richard Sears took possession of a shipment of watches refused by a local jeweler. Using his telegraph, he soon sold them all to fellow operators and railroad employees. In six months, he had made $5,000, quit his job, and founded the company that later became Sears, Roebuck, and Company. A killer application for the telegraph had been found. This was the national network in the 1870s: an all-digital, unchannelized, public data network that the public used to sell goods as well as to communicate.
The Public Voice Network in the United States The national AT&T telephone network in the United States, usually called the Bell system, was regulated as a unit by the states and the federal government from 1913 until 1984. Each state regulated telephone service quality and rate structure for calls that were initiated and terminated within the boundaries of the individual state. For interstate telephone calls, where one end of the call was within one state and the other end of the call was in another, regulation was handled by the federal government. Before 1934, this was done by the Interstate Commerce Commission. But after the Communications Act of 1934 was passed, control and regulation passed into the hands of the Federal Communications Commission (FCC). In 1984, as a result of a decades-long battle between the FCC, Department of Justice, and AT&T, and with the new competitive long-distance companies such as MCI joining in, a federal judge and the United States Department of Justice split up the Bell system into, effectively, AT&T Long Lines and seven newly organized Regional Bell Operating Companies (RBOCs). The local independents more or less continued as they were, but there were major changes in how the RBOCs and independents handled long-distance calls. The RBOCs could still carry local calls end-to-end on their own facilities. For all other long-distance calls the RBOCs had to hand off the call to a long-distance carrier, which could not be an RBOC. Furthermore, the RBOCs and independents had to let their subscribers use not only AT&T for long-distance service, but also any of the competitive long-distance carriers such as Sprint and MCI. In fact, any long-distance carrier that was approved by the FCC could offer long-distance services in any local service area if it had a switching office close enough. There was no firm definition at the time of just what a “local call” was, or what “close enough” was, so the court and Department of Justice provided one. The entire United States was divided into about 240 areas with about the same number of calls within each. Calls inside these areas, known as Local Access and Transport Areas (LATAs), could be carried on facilities wholly owned by the RBOCs. All calls that crossed a LATA boundary had to be handed off to a long-distance company; these companies were now called the Interexchange Carriers (IXCs, or sometimes IECs). The local companies, RBOCs and independents alike, were collectively the local exchange carriers (LECs). This whole structure neatly corresponded to the two-tier, local/long-distance structure already in place. In order to carry long-distance traffic from a LEC, the IXC had to maintain a switching office within the LATA. This switching office was called the IXC Point of Presence (POP). The POPs formed the interface between the LECs at each end of the long-distance call, and the IXC switching and trunking network in between. For the most part, LATAs were contained within a single state, but there were exceptions. Under a rule called equal access, any subscriber served by a LEC had to be able to route calls through the IXC of their choice, as long as the IXC maintained a POP within the originating LATA. If the chosen IXC did not have a POP in the destination LATA, the IXC could decline to carry the call (rarely), or hand the call off in turn to another IXC with a POP in the destination LATA. Naturally, the second IXC charged the first for this privilege.
It soon became apparent that there were just too many LATAs anyway; as late as 1993, only AT&T had a POP in every LATA in the United States. But the system was in place, and cynics noted that the LATA structure closely mirrored the AT&T Long Lines switching office distribution. With the breakup of the Bell system in 1984, it became common to speak of the entire system of telephones and switches in the United States as the PSTN (Public Switched Telephone Network).
One other point should be made here. In many cases, the practice developed of running trunks not directly to other local exchanges (although this practice also continued based on calling patterns), but to a more centrally located local exchange. Usually, this local exchange received a second switch, but one which only switched from trunk to trunk, and not from loop to loop or loop to trunk. These trunk switching offices were called tandems, and the practice of switching trunks without any loops was said to take place at a toll office or tandem office. Usually, a call routed through a toll office was a toll call. The term toll call is exactly analogous to the term toll road. A toll road is just a road, but it costs more to drive on it, above and beyond the road use taxes assessed against drivers. In the same fashion, a toll call is just a telephone call, but it costs more to make it, above and beyond whatever the subscriber pays for local service. The amount of the toll usually depended on distance and duration of the call. Keep in mind that these calls were distinct from long-distance calls, which crossed a LATA boundary. A toll call stayed in the same LATA, but cost more (there were a few odd LATA arrangements, but these need not be of concern in this general discussion). Also, the tandem-toll office arrangement offered a convenient way for IXCs to attach POPs to the LECs’ networks. Instead of running trunks from a POP to each and every local exchange, an IXC could simply link to the area’s tandem or toll office. Since the tandem or toll office existed to tie all of the local loops in the area together, this guaranteed that all subscribers would be able to make interLATA calls through that IXC’s POP, at least on the originating end. From the IXC perspective, this preferred point of trunk connectivity was called the serving wire center, since the POP was served from this switching office. Again, this term was used from the IXC perspective. To the LECs, a wire center was just a big cabling rack (called a distribution frame), where trunks and loops connect to the switching office in the local exchange. In other words, a wire center is nothing special to the LEC, but is quite important to the IXC. Many IXCs maintain trunks to several wire centers in a LATA, all in the name of efficiency and to cut down on the number of links needed. Today, the PSTN has a structure similar to the one shown in Figure 2.2. The local exchanges (LEs, also called central offices [COs]) and toll offices inside the LATA make up the first tier of the PSTN, the LEC portion. Since the Telecommunications Act of 1996, service providers may be any entity approved or certified by the individual states to become a LEC. Newer companies are Competitive LECs (CLECs) and the former service provider in a given area becomes the Incumbent LEC (ILEC). Terms such as Other LEC (OLEC) are sometimes used as well. ILECs, CLECs, OLECs, or some other exotic alphabet combination may still be RBOCs, Independents, ISPs, or even cable TV and power companies in various parts of the United States today. There are some 1300 LECs operating in the United States today, but many of these LECs are quite small, with only a few thousand subscribers, and are located in quite isolated areas.
Figure 2.2 The PSTN today. The second tier of the PSTN comprises the IXCs’ networks. The IXC POP in the LATA could handle long-distance calls for all subscribers in the LATA. The IXC had to have its own backbone network of switches and links as well. The acknowledged leaders in this arena are AT&T, MCI, and Sprint. Sprint remained an oddity for a while because Sprint is also a LEC in some parts of the country, a rare mix of local and long-distance services. There are some 700 IXCs in operation in the United States today, but most of them have POPs in only a handful of LATAs. Many will still handle calls originating within a LATA where they have a POP, destined to almost anywhere, but they frequently hand such calls off to another IXC.
A few points about Figure 2.2 should be emphasized. All lines shown on the figure are trunks, not local loops, since they connect switches rather than user devices to switches. Although shown as a single line, each trunk may carry more than one voice conversation or channel. In fact, many can carry hundreds or even thousands of simultaneous voice conversations. Also, IXC B, since it does not have a POP in the leftmost LATA, cannot carry long-distance calls from anyone in that LATA. The same is true for IXC A in the rightmost LATA. In the center LATA, customers will be able to choose either IXC A or IXC B for long-distance calls, either under equal access arrangements or by presubscription. Presubscription automatically sends all calls outside the LATA to a particular IXC. Recently, the practice of deceptive IXC presubscription switching known as slamming has been universally condemned by regulators and most IXCs alike. As a minor point, note that a POP need not be linked to a toll office switch. The actual trunking of the POP all depends on calling traffic patterns, expense, and other factors. At the risk of causing confusion, it should be pointed out that LEC B and LEC C, for instance, could both exist within the same LATA, depending on state ruling and certification. In this case, one would be the ILEC and the other the CLEC (or OLEC), and both would compete for the same customer pool within the LATA for local service. Finally, there is no significance at all to the number and location of LEs, POPs, and so on, nor to the links between them. The figure is for illustrative purposes only. The PSTN, at both the LEC level and the IXC level, is what is known as a circuit-switched or circuit-switching network. Much more will be said about circuit-switching in the next section in order to compare and contrast this practice with packet-switching. But any discussion of the current PSTN architecture would be incomplete without introducing one of its most distinctive features. What has all this to do with frame relay? Just this: When considering frame relay services, an organization must be aware of whether the service being evaluated is proposed by a service provider that can carry the frame relay service outside the LATA if required. While there are ways for frame relay service providers that are regulated LECs to cross LATAs with frame relay services, the organization might be better served by having a national frame relay service provider rather than a LEC. It should further be noted that there are always regulatory plans for allowing LECs to offer interLATA services, especially for advanced data services like frame relay.
Public and Private Networks Privacy is a vital issue in all aspects of life. There is private property and public property. Different rules of behavior and different individual rights apply to each. Very often an important rule of law revolves around the key issue of whether an act was performed in private or in public. The concept of ownership is critical in determining if public or private rules of conduct apply. For instance, if the property is a private home, privacy is expected and certain behaviors accepted. However, if the property is a park, then privacy is never assumed and other rules of conduct apply. But who owns a network? The answer is not as straightforward as it first seems. Consider a simple scenario with two PCs in homes linked by dialup modems over the Internet. Is the resulting “network” public or private? Clearly, the PCs are privately owned by individuals and so are the modems. But the local access lines (local loops) remain the property of the local telephone company. (The quotes reflect the fact that local service providers are delivering more than simple telephony today, and in a short while the majority of local access lines might terminate at nontelephone devices.) The Internet is a global public network owned by everyone and no one. In this case, two PCs networked together are a mix of private and public property. Yet no one would hesitate to label this scenario as an instance of a public networking environment. But exactly what is it that makes this a public network and not a private one? The answer, quite simply, comes down to who owns the network nodes. While simple to state, the answer is actually a little more complex. What exactly, for instance, is a network node?
Common Network Characteristics The nice thing about talking about networks, whether voice or video or data, is that all networks look pretty much the same: they share certain structural and architectural characteristics that make them appear quite similar to each other. This is not to say there are not significant differences in the function and operation of voice, data, and video networks. There obviously are. But all networks share characteristics that make them networks in the first place. Every network discussion eventually represents the network as an ill-defined cloud that user devices access. This “network cloud concept” was first introduced for very good reasons in the 1970s when public X.25 packet-switched networks first challenged private line networks in the United States. X.25 eventually lost, but the failure arguably paved the way for the success of frame relay. In any case, there were no details about the functioning of network components inside the cloud, which was why it appeared as a cloud in the first place. The philosophy was that users did not have to concern themselves with the inner workings of the network. All the users needed to worry about was whether all other users that they cared about were reachable through the cloud. All networks are simple in overall structure, even inside the cloud that hides their inner workings. However, each major network has its own set of terms and acronyms for network components that turn out to do much the same thing. Consider, for example, the general network shown in Figure 2.1. Some of the details of just what is inside the cloud are presented in the figure also.
Figure 2.1 A network. The figure shows that inside the cloud there are devices known as network nodes. Outside of the cloud, there are other devices usually called user devices or end devices. This is how the boundaries of the cloud are determined. Network nodes go inside the cloud and user devices belong outside the cloud. Usually, but not always, a user device links to one and only one network node. Network nodes, on the other hand, may have multiple links to other network nodes, but again not universally. So another way to draw the cloud is with devices with only one link outside the cloud and devices with more than one link inside the cloud. The user devices are linked to network nodes by a communications link known as the User-Network Interface (UNI). The UNI link can run at a variety of speeds, is supported on a number of different transmission media from coaxial cable to copper to fiber, and can be used up to some standardized or designed distance. Network nodes link to each other by a link known as a Network-Network Interface (NNI). These links vary by speed, media, and distance also. There is only one exception to this general network diagram. Older local area networks (LANs) did not conform to this generic wide area network (WAN) structure of user device linked to network node. But since the early 1990s, most newer LANs do indeed look exactly like this, although the entire cloud may only encompass a single building or even a single floor. In most LANs, the network nodes are the hubs (which may be linked together) and the user devices are PCs. Whether LAN or WAN, a key aspect of networks is that all users can be reached through the network. Different networks vary in the use of different hardware and software required by users for access, the type of network nodes, and what the UNIs and NNIs are called, but most of the differences would be lost on those not intimately familiar with the various protocols. Note that user devices do not communicate with the network nodes directly at the application level, but rather with other end users. Also, some of the network nodes do not link directly to users, but only to other network nodes. It is common to call network nodes with user interfaces edge or access devices and network nodes with links only to other network nodes backbone devices or nodes, but this is only by convention. Network nodes of all types take traffic from an input port, determine where it goes next through some rule or set of rules, then put the traffic onto an output port queue. Now, the easiest way to determine where the traffic goes is to look up the destination in a table maintained and updated in the network node itself. There are many variations of this theme, but these variations are well beyond the scope of this discussion. Up to this point, the terminology presented has mostly been that of data networks, where both UNIs and NNIs are usually leased private telephone lines (there are variations here as well). However, this figure can also be used to represent the public switched telephone network (PSTN) used around the world today for general voice and dialup PC Internet connectivity. The user devices would be telephones, fax machines, or computers with modems in the PSTN, which all fall into the category of Customer Premises Equipment (CPE). In the United States, the form that the CPE might take is completely up to the customer. Any approved device from any manufacturer may be used, as long as it conforms with some basic electrical guidelines.
The CPE in the United States is owned and operated by the user, and is beyond the direct control of the network service provider. (This does not mean that a service provider cannot furnish the CPE for free as part of a service, but the customer is also free to reject such offers.) In other countries, the CPE can be provided and owned by the service provider under a strict set of regulations, although deregulation is helping users in those countries gain more control over the CPE.

In the PSTN itself, the network node is a voice switch. A switch is a type of network node that functions in a particular way in terms of how traffic makes its way from an input port to an output port, the way the table is maintained, and so on. At least that is how most network people see it. There is a continuing controversy about what exactly is meant by the term switch, especially between the voice and data networking communities. The controversy extends to the companion term router, which is another type of network node, and the type usually found in the Internet. To data network people, a switch is more or less a fast router.
This is not the place to discuss the merits of switching or routing. It is enough to point out that a router is a type of network node found on the Internet that performs networking tasks differently than a switch, which is the basic network node type of the PSTN. Because of the use of the term switch applied to the PSTN, the network node on most public networks, both voice and data, is traditionally called a switch. This applies to frame relay, of course. In the voice networks, instead of the edge and backbone structure of data network nodes seen in frame relay networks, the PSTN uses terms such as local exchange and toll exchange or long distance to distinguish voice switches that have links to users from those that do not. All users link to a local exchange, usually called a central office (CO) in the United States. The local exchanges link to toll offices or tandems, or long distance switches, with the actual terms varying depending on the detailed structure of that portion of the PSTN.

In the PSTN, the UNI is the access line or local loop. Some people reserve the term line for digitized loops where analog voice is represented as a stream of bits, while others reserve the term loop for purely analog user interfaces. Whether it is called a loop or line, the user interface is not normally a leased, point-to-point, private line, as in a frame relay or other data network. Rather, the local loop supports a switched, dialup connection that is capable of reaching out and touching almost every other telephone in the world by dialing a simple telephone number. One such destination may be the local Internet service provider if the loop or line is connected to a PC with a modem or other specialized network interface device. Naturally, the ISP needs to be connected to a PSTN local exchange as well for this to happen at all. This is how PC users can use the PSTN to access the Internet.

In the PSTN, the network node interface becomes the trunk. There is little to no physical difference between loops or lines and trunks in the PSTN. The difference is in how the physical facilities are used. Lines and loops connect users to the network. Trunks connect network nodes (voice switches) to each other. Trunks are typically high-speed links for one main reason: Trunks must aggregate a lot of traffic from thousands of users and ship it around the network to other network nodes efficiently. The same is true of NNIs in general, and frame relay is no exception. In fact, it is the way in which frame relay performs this traffic aggregation, with dynamic bandwidth allocation, that makes frame relay so attractive in the first place.

Before taking a closer look at private networks, it will help to summarize the differences in network terminology not only between the PSTN and frame relay, but among many new technologies in general. These differences are shown in Table 2.1. The table adds a few terms to the commonly used terms for the PSTN and the Internet. X.25, frame relay, and ATM all fall into the category of packet-switching data networks, just like the Internet. Their network nodes are switches, not routers. However, as will be shown shortly, a router can be a user device on frame relay (or even ATM) networks. It should be noted that X.25 networks can use a special X.75′ (X.75 prime) interface as a network node interface, but this is not mandated or universal. Frame relay also defines a specialized device for network access CPE known as the Frame Relay Access Device (FRAD).
However, the FRAD is not really a user device in some senses, because users do not sit down at a FRAD to do work. Note that of the three packet data networks, only ATM defines a Network Node Interface (NNI). Oddly, frame relay does define an NNI acronym, but as a network-to-network interface, which handles the interface from one frame relay network to another. This will be explored in more depth later.

Table 2.1 Network Terminology
Network                 Network Node         User Device (Usually)            User-Network Interface                      Network Node Interface
PSTN                    Local exchange       Telephone                        Local loop                                  Trunk
Internet                Router               PC client or server with modem   Dialup modem or leased line on local loop   Leased line
X.25 packet switching   Packet switch        Computer                         X.25 interface                              (undefined)
Frame relay             Frame relay switch   Router, FRAD                     UNI                                         (undefined)
ATM                     ATM switch           Router, PC                       UNI                                         NNI
LANs since ca. 1990     Hub                  Router, PC                       Horizontal run                              Riser, backbone
As mentioned previously, LANs looked very different from WANs before about 1990. Pre-1990 LANs were mostly shared-media, distributed networks, and some still are. But today most LANs conform to the network node model by way of the hub. There are no firm equivalents for lines and trunks, however, and the “horizontal” and “riser” terminology in the table is used for convenience only.
Private Networks

The PSTN is a public network. But this is not the only type of voice service there is, especially in corporate environments. There are also what are considered to be private voice networks, although it will be shown that there are still public elements in them. Since it will be important in understanding how public frame relay fits into a private networking environment, this section will discuss the mixing of public and private network elements a little more closely.

It is not uncommon for a larger organization with multiple sites and more than 100 or so employees per site to employ a private voice switch as CPE, called a Private Branch Exchange (PBX), at each site. With only one site, and with smaller numbers of employees, other methods can be more cost-effective, but even these organizations aim to grow into the types of organizations and companies that require PBXs. The PBX itself can save money by allowing extension handsets (telephones) to call each other without needing to access the public network voice switch every time someone on the third floor calls someone on the second floor. So every telephone does not need an access line to the local central office. Instead, all that is needed is the proper number of trunks to the PSTN for the number of people who are talking outside of the organization at one time, typically about 1 in every 5 on the telephone, or even more.

In addition to saving on outgoing access lines, the PBX can save money on incoming lines as well. Instead of having a separate line for every employee, requiring each person to answer his or her own phone when it rang, a central call attendant takes the incoming calls, says “Corporation X, good morning,” and switches the call to the proper person or department. If there is no answer or the line is busy, the call reverts to the attendant. These three features (internal calling, attendant, and revert) are the essential features of all PBX systems. Many PBX systems add more features, so many in some cases that no one knows what they all are, let alone how to use them.

When there are multiple locations, some way must be found for the PBXs to link to each other. This is usually done by leasing private lines between the locations, from a local exchange carrier (LEC) if the sites are close enough, or from an interexchange carrier (IXC) if the sites are far enough apart. These leased lines are called tie-lines. Note that although the tie-lines form part of a private voice network, the lines remain part of the public network and revert to full carrier ownership again when the lease runs out. Typically, users must dial a special prefix number such as “8” before using the tie-line system. Some more elaborate systems have their own internal numbering schemes. Other PBXs just analyze the dialed number and use the tie-line network automatically if it makes sense. In any case, if the tie-line network is congested, not appropriate, or otherwise unavailable, the PSTN is usually just a “9” away.
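The 1-in-5 concentration rule of thumb mentioned above lends itself to quick arithmetic. A back-of-the-envelope sketch follows; the site sizes are invented for the example.

```python
# Rough PSTN trunk sizing for a PBX site, assuming the rule of thumb
# that about 1 in every 5 people is on an outside call at the busiest
# moment. Real sizing would use measured traffic, not a flat ratio.
import math

def trunks_needed(employees, concurrency=1 / 5):
    return math.ceil(employees * concurrency)

for site in (100, 250, 500):          # hypothetical site sizes
    print(site, "employees ->", trunks_needed(site), "trunks")
# 100 employees -> 20 trunks
# 250 employees -> 50 trunks
# 500 employees -> 100 trunks
```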
What’s the point of discussing both the public voice network and the private voice network? Because the comparison holds the key to understanding what distinguishes them. The PSTN switches, all of them, belong to the telephone service provider. The PBX switches, all of them, belong not to the network, but to the end-user organization. Note that this configuration does not rule out connections between the PBX network and the PSTN. The same is true of a private data network with dialup links to the Internet; the connection alone does not make it a public network. To distinguish public networks from private networks, find out who owns the network nodes. If the end-user organization owns the network nodes, this is a private network scenario. If the service provider owns the network nodes, this is a public network situation. It does not matter if the service is voice, data, or a mixture. It does not matter if the end-user organization owns the CPE. What counts is the network nodes. In frame relay, if the service provider owns the network nodes (which is usually true), the frame relay service is a public service.
Fast Packet Technologies

Certainly enough has been said to this point about bursty applications and LANs. But as voice becomes compressed and silence suppressed, and packet video techniques such as MPEG-2 become more common, there will be little on a network that is not well suited for packet-switching networks. Therefore, X.25 should be all over the place. Yet it is not. X.25 is not seriously considered the packet technology of choice because it is too slow, for the reasons outlined in the previous section. To do voice and video effectively, end-to-end delays must be held to some stable minimum, and information loss must be minimal as well. A packet network fast enough to handle voice and video is a fast packet network. The delays are low and stable, and the information loss is very low as well. Fast packet networks are sometimes called broadband networks, but the equivalence does not always hold. Broadband networks have lots of bandwidth and are used for multimedia applications that require these large bandwidths. Fast packet networks can have modest bandwidths and still be used for voice and video. However, as time goes by, all fast packet networks will have to employ broadband bandwidths just to keep up with the types of devices that users are networking together.
Broadband Needs

A lot of the previous sections have dealt with theory and abstract concepts. The time has come to be very concrete and see why networks are rapidly evolving toward broadband capabilities. Consider as an example the PC that sits in front of me as I write this. I use this example because I consider myself to be neither a power user nor one of those who clings to a beloved PC long after it has ceased to be useful. I consider myself a typical PC user. Table 2.2 shows the PCs that I have used over the years. The table shows how much the most commonly networked device today, the simple PC, has evolved in the years since IBM first introduced its PC in 1981. The table traces the evolution in terms of the most common random access memory (RAM) size, CPU speed, hard disk size (if any), and size of the operating system itself for each year listed. Most importantly, the table then gives the maximum theoretical number of bits that a serial port on one of these machines would be able to produce and consume. This uses the common IBM guideline of 1 bit per second for every 2 Hz of CPU speed. The first IBM PCs had 64 kilobytes of RAM, an 8 MHz 8088 chip, and only a 5 Megabyte hard drive, if one was present at all. PC-DOS fit on a single 5.25-inch 360-kilobyte floppy. One of the reasons that token ring LANs ran at 4 Mbps is the theoretical speed limit of these early PCs. By 1987, a PC had 256 K of RAM, ran a 16 MHz 80286 or even an 80386, had a 40 Meg hard drive, and ran DOS 3.1 from a high-density 5.25-inch 1.2 M floppy.

Table 2.2 The Evolution of the User Network Device
Feature                     1981      1987      1992      1995      1998
RAM                         64 K      256 K     1 M       8 M       16-32 M
CPU speed                   8 MHz     16 MHz    32 MHz    133 MHz   300 MHz
Hard disk                   (5 Meg)   40 Meg    80 Meg    500 Meg   2-3 Gig
OS size                     360 K     1.2 M     7 M       70 M      ~200 M
Theoretical peak bit rate   4 Mbps    8 Mbps    16 Mbps   66 Mbps   116 Mbps
The typical 1992 PC model had 1 Meg of RAM, ran a 32 MHz 80386, had an 80 Meg hard drive, and ran Windows 3.1 from a 7 Meg directory on the hard drive. By 1995, the PC had 8 Meg of RAM, ran a 133 MHz Pentium, sometimes with MMX, and had a 500 Meg hard drive, which was needed just to hold the 70 Meg or so of Windows 95. Today, most PCs come with 16 or 32 Meg of RAM, run 300 MHz Pentium IIs, and have 2 or 3 gigabyte hard drives. No one knows how big Windows 98 will become. The whole point is that as the systems being networked change, so must the network. Theoretical bit rates have quickly grown from 4 Mbps to 8 Mbps to 16 Mbps. Today's PCs can pump out and take in anywhere from 66 Mbps to 116 Mbps and beyond. The network must somehow be able to keep up and scale with the requirements of the end-user devices.
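The peak-bit-rate row of Table 2.2 follows directly from the 1-bit-per-2-Hz guideline quoted above, as this small sketch shows (the rounding of 66.5 Mbps down to 66 Mbps for the 1995 entry is my reading of the table, not something the guideline dictates):

```python
# The IBM serial-port guideline: roughly 1 bit per second of peak
# throughput for every 2 Hz of CPU clock speed.
def peak_bit_rate_mbps(cpu_mhz):
    return cpu_mhz / 2

for year, mhz in [(1981, 8), (1987, 16), (1992, 32), (1995, 133)]:
    print(year, peak_bit_rate_mbps(mhz), "Mbps")
# 1981 4.0 Mbps
# 1987 8.0 Mbps
# 1992 16.0 Mbps
# 1995 66.5 Mbps (the table rounds this to 66 Mbps)
```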
Flexible, Dynamic Bandwidth Allocation

One of the key features and benefits of the frame relay protocol is flexible bandwidth allocation. This is often called bandwidth-on-demand, but this term is not nearly as accurate or descriptive as the term flexible bandwidth allocation. The term bandwidth-on-demand implies that bandwidth is created when needed, but this is not what frame relay does; that is impossible. What frame relay does is dynamically allocate existing bandwidth on an as-needed basis.

Suppose that point-to-point links were not needed to each and every site that a given site’s router or bridge needed to connect to. There would be fewer periods of idle patterns sent over the link, since when one LAN-based client-server application was silent, another might be sending data. In a leased-line network, one link would be idle while the other was busy. In this scenario, there is only one physical link with logical connections, so the link has fewer periods of idleness. There are no longer any physical channels to remote sites, but rather a number of logical channels or virtual connections. There is no longer a need to send long periods of idle channel or interframe fill bit patterns across the expensive leased line. As long as the average bandwidth use remains below the peak bandwidth use, this scheme is very effective. This idea of virtual circuits sharing one physical access link to a network is distinctive of packet networks in general and frame relay in particular. The potential gain in networking efficiency should never be underestimated. The X.25 protocol attempted this same kind of efficiency as well, but the other limitations of X.25 prevented this feature from being used to its full promise. It remained for frame relay to strip off most of the hop-by-hop error-checking and flow-control processing overhead present in X.25.

Following are the key aspects of frame relay flexible bandwidth allocation:
- Often misleadingly called bandwidth-on-demand
- Does not create bandwidth, but dynamically allocates existing bandwidth on an as-needed basis
- More efficient use of physical connectivity
- No longer a physical channel, but a logical channel (virtual circuit) linking sites
- Less need to send special idle channel bit sequences

Since frame relay became available at the exact time that corporations were both looking to minimize private line costs and yet increase the efficiency of their remaining network links, frame relay services have flourished in spite of the fact that the vast majority of frame relay networks are public network services. Frame relay has broken the private line way of thinking in many environments today.
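The efficiency argument can be made concrete with a toy simulation. This is a sketch under invented assumptions (four bursty virtual circuits, each active 20 percent of the time), not a model of any real network; the point is simply that a shared access link idles far less than any one dedicated leased line would.

```python
# Toy illustration of flexible bandwidth allocation: several bursty
# virtual circuits share one physical access link, so the link idles
# only when every circuit happens to be silent at once.
import random

random.seed(1)
SLOTS = 10_000        # time slots observed
CIRCUITS = 4          # virtual circuits sharing the one access link
BURST_PROB = 0.2      # each circuit is active about 20% of the time

idle_dedicated = [0] * CIRCUITS   # idle slots per dedicated leased line
idle_shared = 0                   # idle slots on the shared access link

for _ in range(SLOTS):
    active = [random.random() < BURST_PROB for _ in range(CIRCUITS)]
    for i, is_active in enumerate(active):
        if not is_active:
            idle_dedicated[i] += 1
    if not any(active):
        idle_shared += 1

print("idle % per dedicated line:", [round(100 * n / SLOTS) for n in idle_dedicated])
print("idle % on shared link:", round(100 * idle_shared / SLOTS))
# Each dedicated line idles about 80% of the time; the shared link
# idles only about 0.8**4 = 41% of the time.
```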
Chapter 3: Frame Relay Networks

Overview

Frame relay networks are a form of connection-oriented, fast packet network. They are based on the older X.25 networks and are also intended to be public data networks. Some frame relay networks might also be considered broadband networks, but few frame relay networks fit the definition perfectly. However, most frame relay networks are still fast enough, in terms of network nodal processing delays and stability, to deal quite well with compressed voice and video applications along with data. In a very real sense, frame relay is the result of years of effort to enable public packet-switching networks such as X.25 to handle packetized voice (and today also video).

This chapter will establish the overall structure of a frame relay network. The chapter will emphasize not only the physical structure of the network, but also the need to maintain an adequate quality of service (QoS) for voice and video services while at the same time handling extremely bursty data applications. This QoS discussion introduces the concepts of routing and switching. There is much talk today about “Layer 3 switching” in routers and “adding routing” to a fast packet switch in ATM or frame relay. This chapter is the place to explore the relationship between routing and switching once and for all. The chapter will also detail exactly how frame relay evolved from X.25 and how it still retains some evidence of this process. The chapter will end with a look at frame relay connections. Both permanent virtual circuits (PVCs) and switched virtual circuits (SVCs) are discussed. The need for both PVCs and SVCs is examined, along with consideration of the frame relay protocols that need to be implemented to allow for SVC service.
Private Routers and Public Switches

Frame relay is typically a public network service. The essence of a public network service is that the service provider owns and operates the network nodes. The basic network nodes in the public frame relay network are called switches. A lot of the reasoning behind the use of the term switch for a frame relay network node is historical. Traditionally, the network nodes for services provided by the telephone companies (itself an increasingly historical term in these days of deregulation) have been called switches. So there are central office switches, ISDN switches, and X.25 packet switches. Today, the term switch has come to mean any network node whose primary method of operation involves setting up connections as paths for information to follow from source to destination. So today there are Ethernet switches, Layer 3 switches, and the like.

Private networks also have plenty of network nodes. The essence of a private network is that the end user’s organization owns and operates the network nodes. The basic network nodes in private networks today have a variety of names. In a small LAN, the network nodes are called hubs. When a private network is used to connect LANs, the network node used to connect these LANs was most often a bridge in the past, but it is the router today.

At first, these characterizations of public switch and private router seem obviously wrong. Is not the network node of the public Internet the router? And is not a Fast Ethernet (100 Mbps) hub called an Ethernet switch? The answer is yes to both questions. But this does not mean that the origins of these terms are not correct, only that their current usage has little to do with their original context. The Internet network node is called a router because a company named cisco decided that this is what the device should properly be called. Until then, Internet routers were called gateways; this term can still be seen in various Internet acronyms such as IGP (Interior Gateway Protocol) and BGP (Border Gateway Protocol) that apply exclusively to routers. One of the reasons for the change is that the OSI RM defined a gateway as a network connectivity device that operated at all seven layers of the OSI RM. But Internet gateways (routers) operated only at the bottom three layers of the OSI RM. So the change was made, and successfully, largely due to cisco’s enhanced standing in a field it basically created single-handedly. (One of the reasons that bridges faded as LAN interconnection devices is that many routers could also function as bridges if so configured. Such devices were once called brouters; today the bridging capability of all modern routers is a given, and so the term was mercifully dropped.)

The Ethernet switch or switching hub is called a switch because LAN equipment manufacturers were looking for a term to distinguish how these LAN network nodes operated internally from other types of LAN hubs. The term gateway did not apply and the term router was already taken. Ironically, the most descriptive and accurate term for what a switched Ethernet hub does (simple bridging) was avoided, since by then everyone knew that a router was a more advanced network device than a bridge (and this was true). The only term left that had ever been applied to network nodes at all was switch. So the very private LAN hub that employed bridging between each individual LAN port became known, for better or worse, as a LAN switch.
And no matter how much more accurate the term single-port bridging hub might be, LAN switch it remains and will remain. The term Layer 3 switch applied to what otherwise appears to be an ordinary router is simply a repetition of this naming crisis. Routers operate at Layer 3. But this device is radically different, so what do we call it? Well, Layer 3 switch is not taken, and it certainly points out the router relationship (Layer 3). In this instance, the term router switch, which is basically the same thing as Layer 3 switch, was avoided as too confusing.
So frame relay network nodes are switches, and users employ routers at the ends of leased lines to connect their LANs. But a private router can be a software FRAD. And frame relay switches can be used to link customers’ routers to public Internet routers. Given the converging terminology previously noted (router switch), does all of this mean there is no difference at all today between switches and routers? Not at all. And because the frame relay switch and the customer router acting as FRAD have such a close relationship, being at either end of the UNI, this relationship is worth exploring in a little more depth.
What Is a Router and What Is a Switch?

As has already been established, the network nodes in a frame relay network are called switches. On the Internet, the network nodes are called routers. More accurately, these are IP routers, since IP is the OSI RM Layer 3 protocol used in these routers. Why should any of this matter? The answer to this question is of vital importance for organizations building networks for their users and for the service providers who build the infrastructures that link the users together. If frame relay and other technologies such as ATM are to survive and prosper in the world of the Internet, the position of public switches in relation to IP routers must be considered.

Switches and routers can be compared in a number of ways. It is important to realize that even though this section emphasizes the differences between switches and routers, both are still network nodes that can be used in a wide variety of networks and under a wide range of circumstances.

Switches usually (there are exceptions) have the following characteristics. They are hardware-based; processing happens very quickly at the chipset level with a minimum of added overhead processing. Switches were created by the telcos for use on a public WAN, and standards are governed by the ITU. The tables that are used for routing traffic through the switch are set up by a signaling protocol when the connection between the users is initially made. So switches are typically connection-oriented, and no data transfer takes place until this connection between users is set up. Connections might be of the permanent virtual circuit (PVC) type or switched virtual circuit (SVC) type (on-demand connection is a more accurate, but seldom used, term). Both PVCs and SVCs are connections in either case, no matter how they are established. All data units are distributed from input ports to their proper output ports by a simple, quick lookup of a connection identifier in the routing table, which makes the hardware implementation so attractive. This simplicity is a result of the connection-oriented approach of the switch environment. The connection identifiers often have what is called local significance only, which means they can be used over and over on the network as a whole and thus must be changed as the data units flow from node to node across the network. Examples of networks that use switches with these characteristics as network nodes include X.25, frame relay (naturally), ATM, ISDN, and many other mostly public network services.

The behavior of routers can be contrasted with switches almost point by point. Routers usually have the following characteristics, but as with switches, with some exceptions. Routers are mostly software-based (but this is changing), and processing happens more slowly at the CPU level with some added overhead processing. Routers were created for use on the public Internet (and called gateways until cisco popularized the name router), and router standards are governed by Internet organizations. The switching tables (the contrasting terms are used intentionally here) that are used for routing traffic through the router from input port to output port are created by a routing protocol that periodically contacts neighboring network nodes (other routers) for the purpose of exchanging this routing information. Routers are typically connectionless, and data transfer between users can take place at any time without establishing connections between the routers.
Note that there may be permanent or switched (on-demand) connections that exist end-to-end between users or applications, but there are no logical connections at all between the routers themselves, just physical links.
Because of the connectionless approach used in the router environment, the data units are distributed from input ports to their proper output ports by a set of rules which are applied to a global network address that must be present in each and every data unit. The fact that this global routing information is present in each data unit is a major reason behind the flexibility of routing and makes the software implementation so attractive. The network addresses handed out in a router environment have global significance, so they must be carefully allocated to users on the network. There can be no overlap among network addresses in a router network, a fact which alone adds administrative complexity to the network. Examples of networks that use routers with these characteristics as network nodes include IP, IPX, and a few others. Note that these network protocols started out as proprietary or closed protocols, whereas most protocols based on switches as nodes started out expressly as public and open protocols.

Today, the trend is toward convergence between switches and routers as network nodes. That is, routers have begun to take on the characteristics of a kind of connectionless switch routing data units with global addresses, while switches have begun to take on the characteristics of connection-oriented routers switching data units with local addresses (connection identifiers). As previously mentioned, some former LAN connectivity devices displaying the characteristics of both bridges and routers were called brouters. The term has thankfully disappeared, but perhaps the same approach could be taken with regard to network nodes that combine the characteristics of a switch and a router. This swouter device would do a lot of the data unit processing in hardware at the chipset level, but would have many tables to look things up in as well. This device could handle connections or flows of IP packets. The point is clear: If this device is neither a traditional switch nor a traditional router, then what exactly is it? Switches and routers have already merged in function to the point where more and more equipment manufacturers are not calling their new products a switch or a router at all. The device might be a packet processor or nodal processor, but not merely a switch or router.

Two examples are instructive. Start with a normal, premises-based IP router. Add the hardware module that cisco calls its route switch processor card. Then add some software to handle IP version 6 flows, which are basically a type of on-demand connection between routers. If a frame relay UNI is added, the result is more than a FRAD, but something less than a full-blown public frame relay switch. Or start with a frame relay switch in the public frame relay network. Everyone has a router, but perhaps no potential customers want to buy a FRAD, or they are reluctant to change their router configuration. No problem. Just add some software to the frame relay switch to handle IP and other traditionally connectionless protocols. This frame relay switch-based software needs the IP routing tables to perform its task, of course, and the IP routing protocols would need to be added to maintain the routing tables properly. Is this frame relay switch now an IP router? Or is it rather (and this seems to be what it really is) a new type of central office FRAD or CO FRAD?

It should also be noted that in a router network, each data unit is usually routed independently.
But in a switch network, only the call setup message of the signaling protocol is routed independently. In a switch, all subsequent traffic follows the same path. A router can also do this, using a concept called flows, as previously noted.
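The two lookup styles just contrasted can be sketched side by side. This is a deliberately simplified illustration; the connection identifiers, addresses, and port names are invented, and the router’s prefix match here is far cruder than real longest-prefix matching.

```python
# Switch: the table is keyed by (input port, incoming connection
# identifier); the identifier has local significance only and is
# rewritten on the way out.
switch_table = {("port-1", 17): ("port-3", 42)}

def switch(in_port, label, frame):
    out_port, out_label = switch_table[(in_port, label)]
    return out_port, out_label, frame      # same frame, new local label

# Router: the table is keyed by destination network; every packet
# carries a globally significant address, so no per-connection state
# is needed in the node.
routing_table = {"10.1.0.0/16": "port-2", "10.2.0.0/16": "port-4"}

def route(dest_addr, packet):
    for prefix, out_port in routing_table.items():
        network = prefix.split("/")[0].rsplit(".", 2)[0]  # crude /16 match
        if dest_addr.startswith(network + "."):
            return out_port, packet
    raise KeyError("no route to " + dest_addr)

print(switch("port-1", 17, "a frame"))     # ('port-3', 42, 'a frame')
print(route("10.2.5.9", "a packet"))       # ('port-4', 'a packet')
```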
Routing and Switching on a Frame Relay Network

All the pieces are now in place to understand both how a frame relay network operates and how this operation is an improvement on how an X.25 packet-switching network operates. X.25 networks operate by taking source X.25 packets from an end-user device and placing them inside frames based on the Link Access Procedure-Balanced (LAPB) standard. The term balanced means that the same types of messages can flow from either end of the link, making the two ends peers from the networking perspective. The LAPB frames are sent as a stream of bits from the source device to the network X.25 packet switch.
At the public X.25 switch, each arriving frame is checked for errors and, if none are detected, an acknowledgment is sent back periodically to the source saying, in effect, “the last x frames were good.” The sender must wait after sending x frames until this acknowledgment appears. There is always the possibility of a Negative Acknowledgment (NACK) appearing as well. This usually prompts the sender to resend at least one, and probably more, frames due to the detected error. All this occurs at Layer 2 of the OSI RM. Inside the frames are the packets, or more likely a piece of a packet. There is a measure of flow control done at this level as well. Flow control simply means that the user premises device can never send frames and packets into the network faster than the network can handle them. One of the simplest ways to perform flow control at this level is to merely delay an acknowledgment so the sender cannot send any more frames or packets.

The public X.25 switch now assembles the entire packet and examines the connection identifier at Layer 3. The connection identifier is looked up in a table (the switching or routing table) and the proper output port determined. There might be packet-level acknowledgments involved as well. Then the public switch repackages the X.25 packet in an appropriate frame for sending to another public X.25 switch. These frame-level procedures are not quite the same as LAPB, and are vendor-specific, but must perform the same error checking and acknowledgment sequence as on the user access link. There might be an arbitrary number of public X.25 switches to traverse until the packet arrives at the switch that has the X.25 access link to the destination. At the destination side of the network, this entire packet-in-frame error detection, acknowledgment, and flow control procedure is repeated. So all three layers of the OSI RM are involved at each hop between nodes along the way. This link-by-link, or hop-by-hop, error and flow control is characteristic of all older network protocols such as X.25 and is shown in Figure 3.6.
Figure 3.6 Information flow in X.25. There is nothing at all wrong with doing things the hop-by-hop way on a network. In fact, it saves the end devices the tasks associated with error and flow control, since the network is doing all of this on behalf of the users. With X.25 it was even possible to have a source sending at 9,600 bps (which shows how long X.25 has been around) and a destination receiving at 4,800 bps. The network would buffer and store the excess bits until the receiver was ready for them. Try doing that with a leased line! However, today’s source and destination PCs and routers are much more capable than they were just a few years ago. The complete Layer 2 and Layer 3 processing required in X.25 now slows the network down more than it decreases the burden on the end systems. All in all, there are some 10 decisions that an X.25 switch must make to process a packet through an X.25 network node. About six of the decisions are at Layer 2 and four of them are at Layer 3. The details are unimportant here. What is important is that most of these decisions have to do with the error and flow control that must be done hop-by-hop throughout the X.25 network. The philosophy in a frame relay network is radically different. Instead of the network performing error and flow control hop-by-hop, the frame relay network makes these procedures the responsibility of the end-user device. (Some frame relay texts say that the end user is responsible, conjuring up the image of an office worker feverishly working to resend information through the frame relay network.) It is the end-user device, such as the host attached to the router, that performs the error and flow control tasks end-to-end across the frame relay network.
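To make the X.25 hop-by-hop discipline concrete, here is a toy model of the sliding-window acknowledgment just described. The window size of 7 is an assumption chosen for illustration (a common X.25 default), and the frame contents are invented.

```python
# Toy model of hop-by-hop acknowledgment on one X.25 link: at most
# WINDOW frames may be outstanding, an ACK slides the window forward,
# and a NACK forces frames to be resent.
WINDOW = 7   # assumed window size for this sketch

class Link:
    def __init__(self):
        self.unacked = []

    def send(self, frame):
        if len(self.unacked) >= WINDOW:
            raise RuntimeError("window full: sender must wait for an ACK")
        self.unacked.append(frame)

    def ack(self, count):
        del self.unacked[:count]       # "the last count frames were good"

    def nack(self, index):
        return self.unacked[index:]    # frames to resend, bad frame onward

link = Link()
for n in range(7):
    link.send(f"frame-{n}")            # an eighth send would raise here
link.ack(5)                            # frames 0-4 confirmed, window slides
print(link.nack(0))                    # ['frame-5', 'frame-6'] to resend
```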
In a frame relay environment, the scenario is as follows. Frame relay networks operate by taking traffic from a site CPE device (e.g., a router or FRAD) placed inside frames based on the Link Access Procedure-Frame Relay (LAPF) standard. There are several levels or types of service a frame relay network can offer; the most basic is based on LAPF core. This is the type of frame relay network described here. The LAPF frames are sent as a stream of bits from the source device to a network frame relay switch.

At the public frame relay switch, each arriving frame is checked for only two things. First, if the frame contains any errors at all, it is just discarded. No notification of this frame discard is sent to the source. Second, the arriving frame is checked to see if the Data Link Connection Identifier (DLCI) in the frame header has a routing or switching table entry. If the DLCI is not in the table, then the frame is discarded. If no errors are detected and the DLCI has a table entry, the frame is switched to the proper output port. Unlike in X.25, no acknowledgment is sent back to the source. So there is no flow control or error control done within the network. If missing frames are to be resent, it is the task of the end systems to decide which frames are missing and what to do about them (voice frames can hardly be resent!). The same logic applies to flow control: If the destination system wants the source system to slow down, then it is the responsibility of the destination to inform the source of the need to slow down the sending process. The frame relay network will convey this information inside the frames to the source, but the frame relay network is never aware of the contents of the frames that the network transports on behalf of the users.

There is no need for the public frame relay switch ever to look inside a frame and assemble or process the entire packet. The connection identifier is all that is needed to allow the frame relay switch to determine the proper output port from the switching or routing table lookup. There are no packet-level acknowledgments or flow control done in the network at all. The public frame relay switch does no repackaging of packets; it only relays frames from input port to output port. The frame-level procedures used between the frame relay switches are not quite the same as LAPF core, but are vendor-specific, just as in X.25 and most other public networks. There still might be an arbitrary number of public frame relay switches to traverse until the frame arrives at the destination. At the destination side of the network, the frame content is subjected to error detection, acknowledgment, and flow-control procedures if necessary, based on the application. But this is an end-to-end function of the user devices, not a hop-by-hop function of the network itself. So only the bottom two layers of the OSI RM are involved at each hop between nodes along the way. This end-to-end error and flow control is characteristic of all newer fast packet protocols such as frame relay. The frame relay information flow is shown in Figure 3.7.
Figure 3.7 Information flow in frame relay.

Note that the end-to-end layer in a frame relay network is the Network Layer and not the Transport Layer. This means that IP packets, or any other OSI RM Layer 3 data units, now become the end-to-end data transfer unit through the frame relay network. So any mechanisms that the TCP/IP protocol has in place to handle error control and flow control work just as before (if the network was previously run on leased lines or the like), and the frame relay network is totally transparent to the IP routers. This simple transparency is both a benefit and a liability to the network and router alike, and will be explored more fully in the next few chapters.
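The whole LAPF-core forwarding decision (check the frame, check the DLCI, discard silently or relay) is compact enough to sketch. This is only an illustration: the CRC-32 check stands in for the real frame check sequence, and the DLCI numbers and ports are invented.

```python
# Sketch of the frame relay switch decision: an errored frame or an
# unknown DLCI means silent discard; otherwise the frame is relayed
# to the output port with the outgoing DLCI for the next hop.
import zlib

dlci_table = {16: ("port-2", 17)}   # incoming DLCI -> (out port, out DLCI)

def relay(frame_bytes, dlci, fcs):
    if zlib.crc32(frame_bytes) != fcs:
        return None                 # errored frame: discard, no notification
    if dlci not in dlci_table:
        return None                 # unknown DLCI: discard, no notification
    out_port, out_dlci = dlci_table[dlci]
    return out_port, out_dlci, frame_bytes

payload = b"user data"
print(relay(payload, 16, zlib.crc32(payload)))  # ('port-2', 17, b'user data')
print(relay(payload, 99, zlib.crc32(payload)))  # None: no such DLCI
```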
The Frame Relay Access Device

The Frame Relay Access Device (FRAD) is the user’s view of the frame relay network. Users do not see the frame relay switch, nor do they usually even see the UNI. What users see is the FRAD. And even then, the FRAD might very well be the same familiar router down the hall. But in many cases, the FRAD is a special frame relay network device that terminates the UNI at the customer site; it is the CPE of the frame relay network. A lot of times, evaluating a frame relay network is really a two-step process. First, examine and choose the service provider. Then, examine and choose the FRAD vendor(s) (it is much easier to mix and match FRAD vendors than frame relay switch vendors).

A quick definition of a FRAD is easy to provide. A FRAD has at least one port capable of supporting the hardware needed for the UNI link and the software needed for understanding frame relay protocols, and one or more non-frame relay ports, usually LAN ports. So a FRAD has at least one UNI port and one or more non-UNI ports. Many FRADs, especially smaller, less expensive models, have exactly one UNI port and perhaps four non-UNI ports, all usually just 10Base-T Ethernet. More expensive FRADs have more sophisticated UNI options and configurations, including dial backups, and a wider range of non-UNI ports and/or more of them as well.

The point of this section is to describe the different types of FRADs that can be found on the premises of a typical frame relay customer. This does not mean that several criteria for deciding which FRAD is right for which situation are not covered here; it just means the emphasis is on description, not selection. It is possible to divide all of the FRADs marketed by vendors into roughly the following categories and subcategories:

Software FRADs (Routers)
- FRADs with only basic features
- FRADs with more advanced features

Hardware FRADs
- FRADs (Traditional FRADs)
- M-FRADs (Multiservice FRADs)
- V-FRADs (Voice FRADs)

So FRADs fall into two major categories, software FRADs and hardware FRADs. In each major category, several variations are possible. Each of the major types of FRAD is discussed in a little more detail here. Several of the more advanced features will be explored in more detail in later chapters. This section only indicates support options.
Software FRADs (Routers)
Most frame relay services are used to link organizations’ LANs together. Even when frame relay is used to support SNA, PBX voice, or even more exotic services, the basic LAN connectivity, private line replacement role of frame relay is still present. The device that is most often used to act as the external gateway between LANs is the router. In fact, gateway was the older term used for router until a company called cisco (properly spelled with a lowercase “c,” but seldom seen that way) essentially invented the router and, more importantly for the LAN interconnection industry, the market for routers. This is not necessarily an endorsement of cisco routers, but more of an acknowledgment that cisco outsells every other router vendor put together. Routers are the network nodes of the Internet, and router has become the common term for the network nodes for LAN interconnections of any type, from leased private lines to virtual private networks. Older LAN connectivity schemes used bridges, but these devices had such limitations when compared to routers that once the pricing was right, people took advantage of the benefits of routing almost immediately.

Note that the function of routers on LAN internetworks and the Internet is exactly the same as the role of switches in a public data network. Both are network nodes. This is not a coincidence. Some would even claim that there is basically no difference at all between routers and switches. More was said on this subject earlier in this chapter, where the position was developed that there are still some consistent differences between routers and switches, but that the differences are diminishing over time.

Because both routers and frame relay switches are network nodes, it might seem logical that they should be able to interface directly. And they can. But since the router is not a frame relay switch, but a CPE device, the interface between them must be the UNI. Since the router is also a device that has one or more non-UNI ports for LAN attachment, it is easily seen that the router fits the one-UNI, one-or-more-non-UNIs definition of a FRAD and can perform the same function as a FRAD. When a router performs this function, it is sometimes known as a software FRAD; that is the term used here.

Why software FRAD? Because a router typically has at least one serial WAN port that runs an appropriate WAN protocol such as PPP (Point-to-Point Protocol) at the frame level (Data Link Layer 2) that the serial router port on the other end of the link understands. If the serial WAN port is now to be used as a frame relay UNI, PPP will no longer do. The protocol that runs at the frame level now must understand frame relay frames and nothing else. This is not usually a problem. All major router vendors (a phrase that can just mean “cisco,” but in this case means almost everybody) support and bundle the frame relay protocol on their serial WAN ports.

The use of a router as the CPE on a frame relay UNI is quite attractive. This is especially true if the frame relay network is replacing an existing private line network between the routers, and many frame relay networks do just that. What usually happens is that the customer can terminate service on all of the other links, which are usually 56-64 kbps private lines, and keep one to form the UNI. The router port is reconfigured as a frame relay UNI, restarted when service begins, and the UNI is up and running. What could be simpler? This graceful migration path aspect of frame relay must not be underestimated.
It should be noted that, with the exception of the frame structure, the UNI link performs exactly as it did before it became a frame relay UNI. That is, the link carries frames represented as bits. But now the link terminates at a local frame relay switch instead of at another router port hundreds or even thousands of miles away. Because private lines are paid for by the mile, not only are there fewer links in the frame relay network, but they are also a fraction of their former length (and price). The UNI still requires a standard Digital Service Unit/Channel Service Unit (DSU/CSU) arrangement on the premises if the UNI is a digital link such as a 56-kbps DS-0, which the vast majority of UNIs are. The DSU/CSU forms the boundary between the service provider network and the CPE (the router or FRAD). The position and function of the DSU/CSU on a frame relay UNI is shown in Figure 3.4.
Figure 3.4 The DSU/CSU on a frame relay UNI.

The DSU/CSU takes bits coded in one way, suitable for short distances and some media, sent on the serial port, and converts them to and from bits coded in a way suitable for longer distances and other media. The DSU/CSU is strictly a Physical Layer 1 device in the OSI RM. The function and position of the DSU/CSU is worth mentioning because the CPE is customer property and cannot be directly managed or configured by the service provider in most cases. But the DSU/CSU can be directly managed, since it is seldom changed. This makes the DSU/CSU an attractive place to try to manage frame relay UNIs; several service providers have attempted to do just that.

Software FRADs do not even have to be routers, but most of them are. Almost any device that has a serial port can be used as a FRAD, as long as the vendor provides or supports software to generate frame relay frames and understand the switch at the other end of the UNI (there is more to frame relay and networking in general than just data transfer). Bridges, mainframe Front-End Processors (FEPs), minicomputers, and other equipment can be software FRADs, and many are, especially IBM SNA FEPs. But SNA and frame relay are a topic for a later chapter.

Earlier, software FRADs were distinguished by the presence or absence of advanced features. In the case of software FRADs, advanced features can translate to many things that hardware FRADs offer routinely and easily. Most routers that function as software FRADs, as convenient as this is when frame relay is used for LAN interconnectivity and private line replacement (which is just about always), are unable to provide more than simply a way to package LAN frames or packets into frame relay frames on one side of the network and haul them out again on the other. However, it has already been pointed out that there is more to networking than simple data transfer. The basic frame relay functions will transfer data across the network, but how will the routers detect congestion on the frame relay network? How will the routers deal with missing frame relay frames that are dropped because of errors? Most software FRADs do not deal with these issues; in fact, they are not really worried about these issues at all. Routers simply route. Let the end systems worry about what to do about errors and congestion. The problem is that routers are the end systems to the frame relay network. But simply putting frame relay software in a router will not necessarily make the router a particularly good FRAD.

Nevertheless, the use of a router as a software FRAD is common and accepted. There are several benefits to this use of routers. First, the frame relay software is usually bundled with the router, at least on all but the very low-end routers. So no extra hardware is needed and the cost of a separate FRAD is not incurred. One router is typically the gateway off the premises for network traffic, so making this the gateway to the frame relay network makes sense also. Certainly this is the simplest configuration. Finally, routers have been around for a number of years; there is widespread understanding and support for routers in the networking community. Sometimes, the limitations of software FRADs only become apparent when the frame relay network becomes so successful that frame relay network access must be expanded to more users, perhaps all of them.
The biggest limitation is that there can be only one frame relay UNI per router, unless definite steps are taken to provide more than one UNI. Many customers, familiar with private line environments, do not think about having more than one UNI per router. Also, many router-based FRAD implementations are not totally compliant with frame relay specifications. The router might be able to handle basic frame transfer, but not much else. There are usually few really advanced frame relay features on the router acting as a FRAD, which is only understandable. Routers are built and marketed as routers, not as FRADs, after all. Finally, because routers are basically connectionless, best-effort packet delivery platforms, there is little quality of service (QoS) support in a software FRAD. In this context, QoS means that the application is able to obtain from the underlying frame relay network the bandwidth, error rate, and delay characteristics that the application needs to function. QoS in networks in general, and frame relay QoS in particular, will be defined more fully in a later section of this chapter.
In fairness to the router industry, it should be noted that nothing prevents a router from becoming as good a FRAD as anything else, except perhaps in the area of support for other services. That is, voice telephony ports on a router will be rare for the time being. The strength of software FRADs depends on the amount of effort put into them by the individual router vendor.
Hardware FRADs

Many of the basic features of FRADs have been covered in the software FRAD section. A lot can be covered by contrast rather than detailed description. That is not to say that hardware FRADs are not as important a topic as using routers as FRADs; it is simply a reflection of the natural tendency to use the familiar router in a role that can also be served, and sometimes better served, by a separate, dedicated frame relay network access device. Today, the FRAD marketplace is broken up loosely (very loosely) into three major categories: “traditional” FRADs, multiservice FRADs, and voice FRADs. The term “traditional” refers to the FRAD as a stand-alone hardware device, which might nevertheless have very advanced, state-of-the-art capabilities. In fact, the perspective employed here on these divisions is not based on any formal definitions at all, only the author’s individual view of the marketplace. Formal definitions may evolve, but for now the dividing line between the various categories of FRADs remains quite fluid.
Traditional FRADs

It is surely a measure of how far hardware FRADs have come that it is now necessary to call the simplest packaging of FRAD capabilities a traditional FRAD. Although the leading edge of the market has progressed far beyond the simple packaging of the basic FRAD, such traditional FRADs remain in heavy use on many frame relay networks. The basic package should remain common in low-end FRAD offerings for many years to come. The term “traditional” applied to hardware FRADs has nothing to do with the frame relay features that determine frame relay standards compliance or support for options. All hardware FRADs offer such compliance (at least they should) with frame relay standards; option support is another issue altogether. Rather, the term “traditional” applies to a FRAD that only supports data services and treats all data traffic identically in the FRAD itself (i.e., there are no priorities for individual virtual circuits). The major features of a traditional hardware FRAD are shown in Figure 3.5.

The main difference between a hardware FRAD and a software FRAD, such as a router, is support for more types of non-UNI ports than LAN ports alone, although such a FRAD might still have nothing but LAN ports, generally more than one. Even X.25 or telex ports can be accommodated on some models of this type of FRAD. One of the distinguishing characteristics of a traditional FRAD is that traffic from all non-UNI ports is treated exactly the same inside the FRAD itself. In other words, if two frame relay frames containing client/server LAN traffic are already waiting to be sent on the UNI into the frame relay network, and a port connected to an IBM AS/400 generates a delay-sensitive unit of SNA traffic representing a financial transaction, there is no way for the SNA traffic to leapfrog and thus gain preference in the output queue over any other traffic in the FRAD.
Figure 3.5 The stand-alone FRAD with advanced state-of-the-art features.
So from the traditional FRAD’s perspective on the frame relay network (which is effectively the network’s perspective as well, since the FRAD is the interface to the frame relay network), all traffic is created equal. Under light loads, traffic moves through quickly, transactions and e-mail alike. Under heavy loads, traffic moves through more slowly, perhaps slowly enough to result in SNA session restarts. SNA session restarts slow transaction processing to a crawl and make more work for all the components of the network, frame relay and non-frame relay components alike. Perhaps if some way to distinguish bulk e-mail traffic from delay-sensitive SNA transactions within the FRAD were possible, the frame relay network users would be a much happier group. E-mail could easily wait while SNA session traffic was delivered more quickly. Perhaps it would be better if there were a way in the FRAD itself to acknowledge inherent differences in the type of service that an application needs and prioritize the virtual connections on the network. This FRAD would not only prioritize connections, but also dynamically allocate more or less bandwidth to an application as the application uses the frame relay network. This type of FRAD is sometimes called the M-FRAD, or multiservice FRAD.
Multiservice FRADs

The first thing that should be said about M-FRADs is that they are not necessarily the same as multiservice access concentrators. M-FRADs support frame relay UNIs, pure and simple. Multiservice access concentrators generally support both frame relay and ATM UNIs. M-FRADs still support delay-sensitive traffic streams like voice and video only if these services are delivered to and originate from LAN-attached PCs. So multiservice support in an M-FRAD still revolves around data packet delivery. Multiservice access concentrators have voice and video support, but the voice and video support is usually handled by specialized ports with not only non-UNI interfaces, but non-LAN ports as well. This is not to say that a multiservice access concentrator with a frame relay UNI and only LAN ports cannot be used in the same fashion as an M-FRAD. It simply acknowledges that the multiservice access concentrator is a more general device, while the M-FRAD is a more specialized device.

The key feature and benefit of an M-FRAD is that the device can prioritize the frame relay service given to individual connections based on the needs of the traffic on the connection. Typically, this need is for data traffic priorities, but nothing prevents M-FRADs from supporting voice traffic as well, usually by encapsulating digital voice inside data packets. It is hard to be precise when there are no accepted definitions, and a large measure of common sense is needed when evaluating these types of FRADs. Before M-FRADs, almost all FRADs had a few common characteristics. Some of these features have been discussed already. First, these traditional FRADs provided a single type of service (first in, first out) for all traffic. Next, the number of UNIs and non-UNI ports serviced by the FRAD was relatively small, so central site concentration was awkward. Finally, these FRADs all relied solely on permanent virtual circuits (PVCs) for connectivity. Usually, the M-FRAD at least distinguishes between LAN traffic and SNA sessions, often called legacy data traffic by the product vendors. More sophisticated M-FRADs can give priority to interactive client/server database access over bulk file transfers, or even to SNA transactions to one mainframe over SNA transactions to another mainframe. The key is that the M-FRAD is aware not only of frame relay frames, but also of the differing connection identifiers (the DLCIs) of the frame relay connections or virtual circuits.
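A per-DLCI priority queue of the sort an M-FRAD maintains for its single UNI can be sketched in a few lines. The DLCI numbers and priorities here are invented for the example; real products map traffic to priorities in vendor-specific ways.

```python
# Sketch of M-FRAD output queueing: frames waiting for the UNI are
# served by the priority of their DLCI instead of first in, first out.
import heapq
import itertools

priority_by_dlci = {30: 0, 20: 1}    # 30 = SNA transactions, 20 = bulk e-mail
arrival = itertools.count()           # preserves FIFO order within a priority
uni_queue = []

def enqueue(dlci, frame):
    heapq.heappush(uni_queue, (priority_by_dlci[dlci], next(arrival), dlci, frame))

enqueue(20, "e-mail frame 1")
enqueue(20, "e-mail frame 2")
enqueue(30, "SNA transaction")        # arrives last, but is sent first

while uni_queue:
    _, _, dlci, frame = heapq.heappop(uni_queue)
    print(dlci, frame)
# 30 SNA transaction
# 20 e-mail frame 1
# 20 e-mail frame 2
```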
Voice FRADs The last type of specialized hardware that might be encountered in FRADs is voice capability. These voice FRADs, or just V-FRADs, typically have a harmonica-style interface for hooking up 50-pair twisted-pair copper wire from an organization's PBX. These cables can carry up to 24 voice channels from the PBX into and across the frame relay network to a similar device on the other end of the PVCs used for voice. Ordinarily, these voice channels would be carried on tie-lines, which are nothing more than leased lines used for voice purposes between PBXs. Most often, the 24 voice channels would be represented by 64 kbps digital voice and the 24 channels would be carried by a 1.5 Mbps DS-1 circuit in the United States.
Naturally, if an entire DS-1 acting as a frame relay UNI were used to carry regular 64 kbps voice conversations, then when 24 people were on the telephone at one site, all data transfer would cease. This is what the V-FRAD is for. The V-FRAD will take the 64 kbps voice and further compress it to anywhere from 4 kbps to about 13 kbps, depending on the V-FRAD vendor and desired voice quality. So 24 compressed voice channels should only take up between 96 kbps and 312 kbps on the 1.5 Mbps UNI. The voice compression is done by a special board in the V-FRAD known as a Digital Signal Processor (DSP). Although the term DSP might seem to imply that any digital signals at all could be subjected to this process, in practice only digital voice signals are processed by current DSPs. The compression must be done by hardware because of the delays that might otherwise be introduced by attempting to perform this task in software. For smaller installations using only a 64 kbps UNI, the DSP boards usually have individual modular jacks for handling only a few telephones instead of a whole T-1 interface. Voice over frame relay is facing a real challenge from Voice over IP (VoIP) proponents and equipment. However, doing VoIP with adequate quality typically means that the organization must use a managed Internet Service Provider's service. This usually translates to the ISP providing a separate access line and backbone network in the form of routers connected by leased lines. In other words, the VoIP in this case has nothing to do with the Internet, other than the fact that the ISP also happens to provide Internet access. The attraction of doing voice over frame relay is that the voice is more intimately tied in with the basic service. That is, the voice over frame relay is delivered over the same network, from UNI to backbone, as the data service. This is not often true with VoIP today. More will be said about voice over frame relay in Chapter 9. The discussions in this section should be used for information purposes, not as a blueprint for one FRAD or another. As time goes on, all FRADs will develop support for multiservice priorities and support for nonpacket-based telephony. So the distinctions between FRAD, M-FRAD, and V-FRAD will blur over time. In some truly state-of-the-art packages, the differences between M-FRAD and V-FRAD have already begun to merge into the same device, just with different boards for the different functions. Figure 3.5 shows what passes for a state-of-the-art hardware FRAD today. Some of the features have yet to be discussed in detail, but this is the place to deal with the overall features that a frame relay customer should expect to find, or at least find available, for the CPE device at the customer end of the UNI. The chassis is a standard rack-mountable or standalone unit whose cost will vary widely based on features and functions supported. All include a main and backup power supply, and redundancy is always advisable for installations using voice over frame relay (otherwise voice communication is cut off with loss of building power). Typically, any network management capabilities are also built into this main system board, but there are exceptions. In fact, the network management capabilities are the main distinction between software and hardware FRADs, as will become apparent.
If the unit supports multiple queues for giving priority to one traffic stream or physical port over another, there is typically a separate board for that function, although some units combine this capability with the main system board. The rest of the slots are configurable on a mix-and-match basis depending on the number and type of premises connections, and the number and type of UNI connections (some FRADs can support multiple UNIs). For the premises side of the network, the connectors almost universally include one or more 10Base-T LAN connectors, often one or more token ring connectors, and might include more exotic LAN and device connector types such as Fibre Channel. For the WAN side of the network, the FRAD supports multiple UNI connector types, usually depending on the speed of the UNI itself. The most common is the V.35 connector for a 56-64 kbps UNI, but 1.5 Mbps and 45 Mbps UNIs are also supported, of course. Usually there is only one UNI connector, but more are possible. There would also be a voice DSP board for compressed voice at 8 kbps or so, depending on the compression method used. The remaining slots (if any) would be used for expansion. And, of course, there is no requirement for the DSP board (for instance) to be present until there are plans for voice support, and so on for the other optional services.
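The bandwidth arithmetic behind compressed voice on the UNI is worth making explicit. A short sketch, using the compression range quoted for V-FRADs above (the exact rates depend on the vendor and the desired voice quality):

CHANNELS = 24          # one T-1 worth of PBX voice channels
UNI_KBPS = 1536        # payload rate of a 1.5 Mbps DS-1 UNI

for rate_kbps in (4, 8, 13):          # typical compressed-voice rates
    total = CHANNELS * rate_kbps
    share = 100 * total / UNI_KBPS
    print(f"{CHANNELS} channels at {rate_kbps} kbps = {total} kbps "
          f"({share:.0f}% of the UNI)")

# 24 x 4 kbps = 96 kbps, 24 x 8 kbps = 192 kbps, 24 x 13 kbps = 312 kbps,
# so even with all 24 channels in use, most of the DS-1 remains for data.
# Uncompressed, 24 x 64 kbps = 1536 kbps would fill the UNI completely.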
More details on FRADs are available from the individual vendors, from the various trade magazines that periodically review such devices, and from the Frame Relay Forum (FRF). The FRF is a vendor consortium interested in vendor interoperability of frame relay devices; it issues implementation agreements (IAs) covering a wide variety of frame relay topics. More information on the Frame Relay Forum will be found in the Bibliography to this book. Regardless of the future of FRADs, all FRADs share a common purpose. The FRAD exists to allow users to access the frame relay network. The frame relay network exists to give each user the quality of service (QoS) that he or she needs to allow applications to function as designed. Because a lot of time will be spent going over a frame relay network's techniques for delivering the proper QoS, this is a good place to say a few words about what QoS on a network precisely is.
Quality of Service and Networks Having the proper quality of service (QoS) on a network is a lot like having nice weather on a vacation. Not only does the term "nice weather" mean different things to different people, it means different things depending on what the vacation is all about. Obviously, nice weather for skiing in the Rockies is not the same as nice weather for sunbathing in the Bahamas. So it is with networks. The QoS needed for bulk file transfers like remote server backups is not the same as the QoS needed for packetized voice. Also, people being what they are, no one complains about too much nice weather or excellent QoS. But everyone notices when the weather or QoS fails to live up to expectations. And the network, like the travel agent, is always held responsible if the weather or QoS disappoints a user expecting one thing and given another. Analogies are nice tools for comparison, but they can only be pushed so far before they become either inadequate or just annoying. This section will say no more about vacations, especially since no one has ever confused setting up a network with taking a vacation. There is no official definition of QoS. For the purposes of this book, QoS will be defined as the ability of a network to deliver, to each and every user application that specifies a series of QoS parameters, the network resources needed to meet those parameters. This definition sounds complicated, but it really is not. All it really means is, as an example, that if a user tells the network that this application needs a delay of 20 ms across the network, plus or minus 1 ms, the network will make sure that this happens. If not, then the user has a legitimate complaint and there might be a rebate on the monthly bill or some other penalty involved. Guaranteeing QoS is not easy on the part of the network. The network must not only look around (using whatever mechanism is employed for this purpose) and see whether this 20 ms delay is even possible to deliver, but also make sure that no other applications granted a QoS in the future are allowed to affect the QoS just given out. In other words, the network cannot suddenly stop delivering 20 ms delays because a whole raft of other users are now demanding 20 ms delays also. The delivery of QoS is so difficult to accomplish consistently on large public networks that the Internet as structured today cannot do it at all, and networks such as ATM, designed for QoS delivery from the ground up, can only deliver QoS under certain circumstances. This lack of QoS support is the main reason that the Internet, and IP networks in general, are considered to be unreliable. It seems odd that a network like the Internet, characterized by dynamic rerouting around network node (router) failures, is considered unreliable, while switched networks that drop connections when switches fail are considered reliable. But this is only because the term "unreliable," when used in an Internet or IP context, simply means that the network is unreliable when it comes to delivering user-desired QoS parameters such as stable delays (or guaranteed delivery!). The circuit-switched PSTN, although perhaps failure-prone, is much more reliable at delivering the QoS parameters that voice (especially) requires in terms of stable delays.
There is no real agreement as to exactly what parameters go into QoS. This sounds odd, but it is true. Most would agree that at least bandwidth, delay, delay variation (jitter), and error performance (in terms of cell/packet/frame loss) belong on the list of QoS parameters. Some add at least one or sometimes even two more. Reliability concerns, this time in the sense of network availability, have become more acute with the recent widely publicized outages of portions of the Internet and public frame relay networks. In fact, without reliability, the ability of a network to deliver any other QoS parameters becomes pretty much moot. In some routing protocols, reliability is a metric that can be maximized when routing decisions are made. There is also a good argument for adding security to the list of QoS parameters. The venerable IP protocol has had a bit configuration available in the IP packet header telling routers to "maximize security" when routing the packet for more than 20 years. Router vendors have never implemented this option, but that does not mean it is unimportant. It could even be argued that, given today's dependence on the Internet and other public networks for commerce and finance, if security is not a QoS parameter, it soon must be. So a comprehensive list of QoS parameters will have not four, but six items. These are:
1. Bandwidth (number of bits per second the application needs, e.g., 8 kbps).
2. Delay (maximum amount of time it can take to reach the destination, e.g., 20 ms).
3. Delay variation or jitter (amount of time the delay is allowed to vary, e.g., 1 ms).
4. Errors or information loss (percentage of cells/packets/frames the network can lose, e.g., 0%).
5. Reliability (annual percentage of time that the network must be available, e.g., five nines or 99.999%).
6. Security (degree of protection afforded to information on the network, e.g., double encryption).
The actual values of these parameters will differ from application to application, and the ability of a given network architecture to support them will differ from network to network. For instance, delays are so variable on the Internet that it makes absolutely no sense for applications to specify delay variation limits. What does all of this discussion of QoS have to do with frame relay? Primarily, it makes the point that frame relay occupies a sort of halfway point between network architectures with no QoS delivery mechanisms at all, like the Internet and other IP-based best-effort networks, and network architectures that were invented specifically to deliver precise QoS performance to all applications, such as ATM. This means that while there is no mechanism in frame relay for an application using a frame relay PVC to inform the frame relay network of its QoS needs, there are some basic bandwidth reservation mechanisms built into frame relay. Even a multiservice FRAD can only guarantee that some PVCs will receive priority queuing over other PVCs, not that the network delay will be lower than X at all times. This is a form of relative QoS, not the kind of absolute QoS that ATM can deliver. But arguments that frame relay has no QoS guarantees, which seem to put frame relay into the same category as the Internet or other IP networks, are just wrong. This line of thought overemphasizes the lack of the complete set of explicit QoS parameters that is present (but not always used) in ATM networks. It is true that the best that can be said for frame relay QoS is that the QoS is probabilistic, not deterministic as in ATM.
So a frame relay network might be able to deliver frames in under 20 ms (for example) 99.4 percent of the time. While this is very good QoS performance, it is not ironclad. The 0.6 percent of a year that the delay is allowed to exceed 20 ms works out to 52.56 hours. If the QoS is not met for several 8-hour days when critical business activities are scheduled, this 99.4 percent might be of little consolation to the user. Yet the service provider has met the letter of the Service Level Agreement (SLA).
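The 52.56-hour figure is just arithmetic over the 8,760 hours in a year. A quick sketch verifying it, along with the sub-hour downtime allowed by the 99.99 percent availability figure cited below:

HOURS_PER_YEAR = 365 * 24   # 8,760 hours

def annual_shortfall_hours(met_pct):
    """Hours per year that a guarantee met only met_pct percent
    of the time is allowed to go unmet."""
    return HOURS_PER_YEAR * (100 - met_pct) / 100

print(annual_shortfall_hours(99.4))    # 52.56 hours over the 20 ms limit
print(annual_shortfall_hours(99.99))   # 0.876 hours, under an hour a year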
Frame relay service providers routinely use terms and conditions like "delay is less than 40 ms 99.5 percent of the time" and "PVC availability will be 99.99 percent annually" (this works out to less than an hour of downtime a year). Some SLAs are quite explicit: "Delay is 10 ms plus 0.05 ms per 100 km of route miles from source to destination." All of these conditions require that some mechanism be put in place to monitor the QoS level available to each and every application, in order to verify compliance and detect violations. This is one of the reasons that frame relay, although a public network service, allows users to gather more performance information about their portion of the network than ever before (how else could a customer ever determine route miles on the network?). The good news is that in most cases all of the network management mechanisms work extremely well. More details on SLAs, frame relay network management, and related topics will be discussed in Chapter 7. It might be a good idea to close this section with a look at the service guarantees that a frame relay service provider would typically offer as opposed to the service guarantees that a typical Internet service provider would offer. These examples come from no specific source or service provider. However, they are certainly representative of the types of figures one would expect to see in a virtual private network proposal for a network based on frame relay as opposed to one based on the Internet or the IP protocol in general. This comparison is made in Table 3.1. Note that service providers commonly distinguish between network availability and user or application availability. This is an attempt to say that just because an individual user or site has a bad month or year, overall the network is doing just fine. Also, the Internet, Best column has the absolute best guarantees from any number of widely known business Internet service providers. So when it comes to QoS, frame relay is not the best network architecture available, but neither is it the worst. Frame relay QoS mostly provides only absolute bandwidth guarantees, but bandwidth is probably the most critical of the six QoS parameters when it comes to correct day-to-day user application operation.
Table 3.1 QoS Levels in Frame Relay and on the Internet or with IP
Parameter            Frame Relay      Internet, Typical   Internet, Best
Delay                60 ms one way    No guarantee        150 ms or less one way
Errors (Loss)        99.99% (1)       No guarantee        Individual case basis
Reliability:
  Network            99.99%           99%                 100% (2)
  User/application   99.9%            No guarantee        100% (2)
Penalty?             Yes, detailed    No                  Almost the same as frame relay

(1) This applies only to traffic that conforms to the committed information rate (CIR).
(2) The ability of any service, let alone the Internet or IP, to be 100% reliable is remote; 99.7% or 99.5% is more often the best.
The Frame Relay UNI and NNI A basic frame relay network is composed of three simple elements. The elements are the access link, the port connection, and the associated virtual connections (which are almost all PVCs today). The access link and UNI arrangement is such an important piece of the frame relay network service that the access link will be discussed more fully in the next chapter. This chapter will emphasize the frame relay switch port connection and virtual connections (or circuits). Although the two elements of port connection and virtual connections will be discussed separately, the port connection and virtual connections have no meaning unless used together in a frame relay network. Think of the port connection as a hardware aspect of frame relay and virtual connections as a software aspect of a frame relay network. It takes both to do anything useful. The relationship between the frame relay access link (UNI), port connection, and virtual circuits is shown in Figure 3.1. The access link runs between the customer premises frame relay access device, or FRAD, and the frame relay network service provider's switch. The port connection is the actual physical connection on the switch to which the FRAD device is attached by the access link. Finally, the virtual connections (PVCs) are what allow all user traffic from the CPE to be sent into the network on the same access link, yet be delivered almost anywhere in the world. In frame relay, the virtual connections are identified by a Data Link Connection Identifier (DLCI), which will be discussed further later on.
Figure 3.1 A typical frame relay network user connection. The port connection forms the user entry point into a frame relay network. The port connection is usually associated with a single customer site, but not always. In other words, it is possible to have two sites linked to the frame relay network through one port, or even one site linked through two ports, but neither of these situations is common. In most cases, one site gets one port, no matter if there are many users, applications, or protocols sharing the network. The key with frame relay is that many logical connections will share a single physical port connection. These logical or virtual connections (the PVCs) will carry traffic to many remote locations. And the nice thing about frame relay is that all of a particular site’s traffic—regardless of originating users, application, or protocol—will use the same PVC to send and receive traffic to a particular site in the vast majority of cases. In spite of all this connectivity, there is no dedicated bandwidth allocated to these individual users, applications, or protocols on the frame relay network. Dedicated bandwidth (all the bandwidth, all the time) is a characteristic of private line networks, but not of frame relay. Instead, the port connection on the frame relay switch will dynamically allocate the frame relay network capacity to meet the changing needs of all the users on the network, not just this one port. This is simply a way of saying that in frame relay, there is no dedicated bandwidth, but there is dedicated capacity on the network. The idea of dedicated capacity will be more fully explored in Chapter 4.
Capacity is determined by the port speed, which sets the total amount of information a user can send into the network in a given unit of time (usually a full second). For example, a port speed of 64 kbps effectively allocates a capacity of 64,000 bits each second to all of the users attached to that port connection. There is no possible way that any user or application could ever send more than this number of bits into the network in a second. Most frame relay service providers allow port connection speeds as a set of multiples of 64 kbps (56 kbps in some cases). These speeds are essentially based on something called fractional T1 (FT1) speeds. While it is not important to know exactly what this means, it is important to know what speeds are represented. These speeds are usually 56/64 kbps, 128 kbps, 256 kbps, 384 kbps, 512 kbps, 768 kbps, 1024 kbps, and 1536 kbps. Some providers do not support all of these speeds and some support other speeds, but this set is very common. Figure 3.1 showed only one site using a frame relay network across the UNI. To be more complete, there would need to be at least two sites and UNIs linked across the network. There must be at least one switch, and usually there are many. The switches link to each other over a network node interface, which is undefined in frame relay. Almost any protocol and hardware can be used, as long as it is supported by the switch vendor(s) of both switches and provides adequate Quality of Service (QoS) to users. Ironically, one of the most common uses of ATM today is to provide such a backbone for frame relay switches to connect to. When ATM forms the frame relay backbone, the arrangement is known as cell-based frame relay and has many benefits for service providers and users alike. The relationship between ATM and frame relay will be explored more fully in Chapter 12. For the sake of completeness, an entire (but very small) frame relay network is shown in Figure 3.2. Note that ATM is used as the backbone technology, but this is just one of the possibilities. In this case, the frame relay switches themselves become the end devices on the ATM network.
Figure 3.2 A frame relay network. Public network standards typically spend a lot of time detailing exactly what should happen in terms of the software and what the hardware arrangements from the CPE to the network switch port should be. The CPE-to-switch interface defines the UNI and is clearly a key part of any network architecture. Without firm and complete UNI specifications, CPE devices could not be as varied as they are in terms of vendors and optional capabilities, and service providers could only support a small subset of all possible CPE configurations. The frame relay UNI allows this interoperability to take place. But the frame relay network node interface is another story. Normally, the network node interface could be abbreviated NNI, but this is not a wise idea when speaking of frame relay networks. The acronym NNI does in fact exist in frame relay, but means network-to-network interface. So the frame relay NNI means something entirely different than the acronym NNI does in other technologies, such as ATM. Actually, ATM was the first major network architecture to define a standard network node interface. Most other network architectures, especially public network architectures, never bothered to define a standard network node (switch-to-switch) interface. The reason is very simple: Such a standard interface on public networks was not felt to be in the best interests of the network.
Such an attitude sounds quite odd given the current climate and push toward standardization at all costs. But this attitude grew out of the voice network, and the philosophy was later applied to X.25 packet-switching networks and frame relay, among other types of networks. The approach to standardization on the public voice network did not emphasize interoperability. Instead, the approach emphasized innovation. The feeling was that if standards are too strictly defined, no one will ever do anything radically different, since obviously this new approach would not fit in with the currently defined standard. If standards are more loosely defined, then innovation can proceed with less concern for interoperability. Consider the PSTN as an example. Once people could buy their own telephones, the interface from telephone to voice switch was fully and strictly defined, right down to the voltages. But there was still no standard way for the voice switches to talk to each other. Each central office or local exchange switch vendor had its own proprietary way of signaling and trunking between switches, and each felt that its way was the absolute best possible way of performing this task. This situation encouraged vendors to freely innovate and explore other methods of switch interfaces, since the only concern was for backward compatibility with their older products, at least until the older switches could be phased out. But what about interoperability? Proprietary voice switch interfaces meant that a multivendor environment was difficult to achieve. If it had to be done, vendor A's switch had to translate everything into vendor B's talk or vice versa before any interswitch communication could take place. And this translation process is exactly what was done. At first, this would seem a chaotic situation, especially to those used to a standards-dominated world. What saved the PSTN was the fact that there were only about a half dozen public voice switch vendors, so multivendor translation was not as big a problem as it would be in the LAN world with 60 or more Ethernet hub vendors. Large public networks like the PSTN were seldom multivendor environments anyway. There were few alternatives, as just mentioned, and no one cared to build a network where intervendor finger-pointing between the switch vendors at each end of the link slowed troubleshooting and repair times to a crawl. The customers (and regulators) would not stand for it. So most large public networks standardized on one vendor or another, and that was that. The proprietary approach was extended to X.25 public data networks, then to frame relay as well. So frame relay switch vendors are free to innovate any way they choose on the network node switch-to-switch interface. The only real requirement is that the two ends of the link understand each other. Ironically, in spite of the lack of standards for use on the network node interface between frame relay switches, there is one standard that is forbidden for use between frame relay switches. This is the frame relay UNI. The precise reasons for this are beyond the scope of this discussion, but the prohibition revolves around the fact that two frame relay switches are total peers, while the UNI requires one end of the link to be CPE. The UNI relationship is a peer one with regard to data transfer, but not so with regard to network management and so forth. How did ATM become a common backbone for frame relay networks?
One major reason is that the pressure today in the industry is not toward innovation but toward multivendor interoperability. So proprietary interfaces, while tolerable, are not always the first choice. Also, as services grow, there are more network nodes than ever. Multivendor environments are more common in the data world, where a huge switch has 256 ports, not 10,000 or even 40,000 as on a large voice switch. Therefore, if no standard network node interface exists, it might still be a good idea to use something else that is standard to tie all of the nodes together. That is one role of ATM in a frame relay network. ATM provides the standard network node interface that frame relay lacks. Each frame relay switch essentially becomes a user on the ATM network. There is much more to the relationship between frame relay and ATM than just an ATM backbone for frame relay switches. But the positioning of ATM as backbone for frame relay is enough for now.
In spite of the previous discussion, there is an NNI acronym in frame relay; NNI means Network-to-Network Interface. The frame relay NNI is used whenever two different service providers' frame relay networks need to communicate. After all, they might be using different switch vendors' products, and proprietary interfaces will not work. And although multivendor public network environments are not all that common, they are not particularly rare either. For instance, a service provider might change switch vendors at some point. It would hardly be possible or intelligent to discard the previous vendor's equipment. The vendors might be isolated by area or function, but the switches must still communicate when required. Translation can be used in this situation, but there are potentially more frame relay switch vendors than voice switch vendors. And the more vendors, the greater the need for standard interoperability between them. Some standard way must be found to allow two different frame relay networks, or portions of networks, to communicate. The relationship and uses of the frame relay UNI, NNI, and interswitch interfaces are shown in Figure 3.3. Note the presence of a simple router as a FRAD. The possible FRAD configurations are a major theme of this chapter and the next. The other FRADs in the figure have multiple ports that connect other devices, probably routers, but also other things, especially IBM SNA components. The figure shows a private frame relay network as well as public frame relay networks. Nothing prevents an organization from buying and installing its own internal frame relay network. Only leased private lines are needed to tie the switches together and link the FRADs to the switch ports. But if the private network needs to access users on a public frame relay network, there must be a standard interface between them if the switch does not understand each and every proprietary protocol in use. This is one role of the NNI.
Figure 3.3 Frame relay UNIs and NNIs. The primary role of the NNI is shown in the figure as well. The two public frame relay networks could belong to two frame relay service providers, perhaps a LEC and an IXC. Alternatively, the two public frame relay networks could belong to the same service provider, and could even serve the same geographical area. But in this case, the two clouds contain all of vendor A's switches in one cloud and all of vendor B's switches in the second cloud. This is a job for the NNI as well. Both uses are common.
The Frame Relay Protocol Stack Only a few related topics remain to give a good overall description of how a frame relay network actually works. It has already been pointed out that the vast majority of public frame relay networks (and even private ones) offer only Permanent Virtual Circuits (PVCs) for connectivity. The frame relay service providers that do offer Switched Virtual Circuits (SVCs) are few and far between, and usually place many restrictions on the number of SVCs that can be established, where the endpoints are located, and so on. Of course, PVCs have no call setup delays, while SVC signaling messages must be processed by the network to determine routes and network resources, establish switch table entries, and engage billing procedures. All of these issues will be discussed more fully later. All that remains here is to show that SVC support depends on the exact frame relay protocol stack that a frame relay service provider supports. Mention has already been made of LAPF core, the basic frame protocol run on any frame relay UNI at all. The fact is that LAPF core simply transfers frame relay frames around a PVC-defined network. That is, no SVCs are possible in a frame relay network employing only LAPF core. It is often said, and not incorrectly, that frame relay is defined at the bottom two layers of the OSI RM. This is not inaccurate if the data transfer aspect of frame relay is being discussed. But there is more to networking than data transfer, much more in fact. Networks must be managed with some form of network management technique. The techniques could be added on, but network management is more efficient and consistent if the techniques are part of the network specification itself. A network must also be controlled with some form of signaling protocol so that users and network can inform each other of their intentions in terms of connectivity; this is the task of the signaling protocol. In any case, there is no room at Layer 2 of the OSI RM for these management and control functions. Yet these functions must be performed in the frame relay network nodes, the switches themselves. With regard to the frame relay protocol stack, these functions can be considered to be at Layer 3, the Network Layer, although most texts are fond of insisting that there is no Layer 3 in frame relay switches at all. Here is how the frame relay protocol stack actually looks. X.25 is a fairly faithful representation of what the OSI RM should do at the Physical, Data Link, and Network Layers. In X.25, the layers can be represented by V.35 or X.21 at the Physical Layer, the X.25 LAPB at the Data Link Layer, and the X.25 Packet Layer Protocol (PLP) at the Network Layer. Frame relay can perform all data transfer tasks with a subset of the full OSI RM Data Link Layer. Network management is done in frame relay as a small subset of OSI RM Layer 3 and only on the UNI. This basically means that frame relay network management on the UNI consists of a small set of messages inside special frames sent back and forth on the UNI. LAPF on its own supports only manually configured PVCs. If SVCs are to be supported in frame relay, they can take one of two forms. The Integrated Services Digital Network (ISDN) signaling protocol specification on which frame relay signaling is based is called Q.931. Frame relay adapts and extends the Q.931 ISDN signaling protocol as Q.933.
If the frame relay SVCs are established with signaling messages that form a subset of the full Q.931 ISDN call control message set, technically a Q.933 subset, this is known as non-ISDN SVC support. With non-ISDN SVC support, there is no relationship between a service provider’s ISDN signaling network (and billing system) and its frame relay SVC offering. But at least there are frame relay SVCs. However, a service provider can make its frame relay network a part of its ISDN, with frame relay playing the same role as X.25 as a packet-bearer service. This requires the full implementation of Q.933, however. With ISDN-compliant SVC support, there is a close relationship between a service provider’s ISDN signaling network (and billing system) and its frame relay SVC
offering. In this case, the user can use the service provider’s ISDN to establish frame relay SVCs. The relationships between all of these frame relay protocol stack permutations are shown in Figure 3.8.
Figure 3.8 The OSI RM, X.25, and frame relay.
Chapter 4: The Frame Relay User-Network Interface Overview The most visible portion of a frame relay network from the user's perspective is the UNI. The user cannot see the frame relay switch at the other end of the UNI, nor the internal trunking network tying all of the switches together. But the user has direct access to the premises end of the frame relay UNI. In many frame relay network situations, the main concern of the user is: "How are the UNIs?" In some ways the UNI is the most visible portion of the frame relay network from the service provider's perspective as well. This is where many if not most of the network management efforts are focused. The UNI is also the portion of the network that requires the most care and attention when configuring PVCs for users. And the UNI is the part of the frame relay network that requires the greatest lead time if a physical link that can form the basis of a UNI is not already in place. In many frame relay network situations, the main concern of the service provider is also: "How are the UNIs?" This chapter explores all of the details of the UNI itself and any other issues involved with connecting a user to a frame relay network. This is the first look at the frame relay frame structure and the concept of the Data Link Layer connection identifiers used in frame relay. UNI configuration issues are examined in this chapter, especially the key frame relay concept of the Committed Information Rate (CIR), which is often confusing. Ways of handling CIRs are discussed, with a long look at the related concepts of regular booking and oversubscription. Finally, all of the currently supported options for the UNI are investigated, from simple links to multiport UNIs.
Regular Booking and Oversubscription Every DLCI defined on a UNI must have a CIR. The CIR can be zero in some cases, but this is not the same as a DLCI not having a CIR at all. The CIR can be equal to, but cannot exceed, the access line rate of the UNI. However, the CIR on a DLCI is typically set at anywhere from 33 to 50 percent of the UNI speed, or even lower in many cases. CIRs can be adjusted on a monthly or even weekly basis, and often should be as traffic patterns change on the frame relay network. But suppose a 64 kbps frame relay UNI is configured with four DLCIs representing PVCs. Each must have a CIR associated with it. If the user or customer does not have a preference, the CIR is typically set at 50 percent of the access line rate and adjusted periodically as the network is used. But if each of the four DLCIs has a CIR of 32 kbps, the total of the CIRs defined on the UNI is 128 kbps. This greatly exceeds, and is in fact twice as much as, the access line rate of the UNI. How can this scheme possibly work? It works, and usually quite well, due to the extremely bursty nature of frame relay traffic. As long as all four DLCIs are not bursting at the same time, the 64 kbps UNI can handle all of the traffic. If all four DLCIs do burst at once, the FRAD must either buffer the excess frames or discard them at the source before the frames even enter the frame relay network. Most FRADs typically have anywhere from 1 to 4 megabytes of buffer space just to handle these bursts. Sometimes, a frame relay service provider will know from experience that some applications or users will regularly burst simultaneously. The multiple bursts will result in lost information if the UNI line rate is consistently exceeded. In this case, the service provider might adopt a policy known as regular booking for the UNI. With regular booking, the sum of the CIRs defined on all of the DLCIs configured on the UNI cannot exceed the line rate of the UNI. So four DLCIs on a UNI with regular booking cannot have CIRs that add up to more than 64 kbps. The CIRs need not all be equal, of course, but the sum of the CIRs cannot exceed 64 kbps. If another DLCI must be added to a UNI with regular booking, the DLCI can only be added by decreasing the CIRs on one or more other DLCIs, to preserve the relationship that the total of the CIRs equals the UNI speed, or the new DLCI must be added with a CIR of zero if appropriate or allowed. It is important to note that even with regular booking, applications running on DLCIs can still exceed their CIRs, producing DE = 1 frames that might be discarded by the frame relay network under certain conditions. But with regular booking, all DLCIs can receive their CIRs at the same time, which is usually the whole idea. If the sum of the CIRs defined on all of the DLCIs configured on the UNI is allowed by the service provider to exceed the line rate of the UNI, this is known as oversubscription. The actual amount by which the sum of the CIRs can exceed the UNI line rate is known as the oversubscription policy of the frame relay service provider. For example, an oversubscription policy of 200 percent means that the sum of the CIRs cannot exceed twice the UNI line rate, or 128 kbps on a 64 kbps UNI. Oversubscription policies of 300 percent are common and even 400 percent is not all that unusual. An oversubscription policy of 100 percent is essentially the same as regular booking.
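A booking policy like this reduces to a simple admission check when a new PVC is provisioned. Here is a minimal sketch in Python, assuming the policy percentages used in the examples above; the function name and interface are illustrative only:

def may_add_pvc(uni_kbps, existing_cirs_kbps, new_cir_kbps,
                policy_pct=100):
    """Return True if the new CIR fits under the provider's booking
    policy. policy_pct=100 is regular booking; 200 allows the sum of
    the CIRs to reach twice the UNI line rate, and so on."""
    ceiling = uni_kbps * policy_pct / 100
    return sum(existing_cirs_kbps) + new_cir_kbps <= ceiling

# Four 32 kbps CIRs on a 64 kbps UNI: rejected under regular booking,
# accepted under a 200 percent oversubscription policy.
print(may_add_pvc(64, [32, 32, 32], 32))                  # False
print(may_add_pvc(64, [32, 32, 32], 32, policy_pct=200))  # True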
The Frame Relay Committed Information Rate DLCIs are defined on a UNI. As many DLCIs are defined as are needed for connectivity to all reachable remote sites, within limits. The limitations are dictated by a couple of factors. First, there are only so many DLCIs that can be realistically supported on a UNI of a given speed, no matter how much buffering is used. After all, it is a lot to expect of a 56/64 kbps UNI that it replace 67 private lines (for instance), no matter how bursty or intermittent the traffic. Second, each DLCI defined on a UNI has to include a certain Committed Information Rate (CIR) parameter. The CIR is probably the most difficult concept to grasp immediately when it comes to frame relay. This is due to the fact that in the world of leased private lines there is no equivalent concept. A lot of works on frame relay, from books to course notes to magazine articles, spend so much time describing the parameters and operational details involved in CIRs that people are often left with the impression that CIRs have some deep mathematical significance or are so complex that no one could or should attempt to understand them at a functional level. But none of this is true. The purpose of CIRs is easy to grasp and the function of the CIR is easy to understand. Think of the following discussion as the simplified, illustrated, and painless introduction to frame relay CIRs. Every DLCI must have a CIR. A PVC has a CIR assigned at service provision time. An SVC has a CIR configured when the demand connection is set up (the CIR is requested in the frame relay Q.933 call setup message). Some service providers allow a CIR of zero on a DLCI. This is not the same as saying there is no CIR at all. A zero CIR just contributes nothing to the CIR total on a given UNI. The CIR is often defined as a dedicated amount of bandwidth on a DLCI. This is not really true. There is no dedicated bandwidth in a frame relay network. Dedicated bandwidth (all the bandwidth, all the time) is more a characteristic of the private line networks that frame relay usually replaces. It is more accurate to say that the CIR represents a commitment on the part of the frame relay network to dedicate a given capacity for the transmission of user information at a certain rate. The terms committed, information, and rate in this definition are precisely where the concept of a CIR originally comes from. Every DLCI, PVC or SVC, on a frame relay network will have a CIR associated with it. This CIR assignment should reflect the average amount of traffic between the two sites connected by the DLCI. The CIR is a key part of exactly how a frame relay network can replace (for instance) three dedicated private lines with one access link to a public frame relay network running at the same speed as any one of the three private lines. This works because, on average, each of the private lines may only be in use one-third of the time. The trick is to determine the appropriate CIR for each DLCI. This is not always easy; there is always a risk that the CIR chosen may be wrong. Fortunately, CIRs are relatively easy and quick to change, and there are some general guidelines. Another way to interpret the CIR is as a statistical measurement of throughput over time. This interpretation introduces a number of technical parameters used in computing an actual CIR. All of these parameters are important, but only for understanding the deep theoretical underpinnings of frame relay. This discussion will be based more on examples than on abstractions.
Frame relay specifications establish a number of traffic descriptors that are used by the service provider to determine when the traffic arriving on a UNI is within the configured class of service on the DLCI. This specific frame relay class of service is not to be confused with the general Quality of Service (QoS) concept that applies to all networks from the Internet to ATM. Following is a list, with definitions, of the class of service parameters:
Access rate. This is the physical speed of the UNI. The most common speed is 56/64 kbps, but speeds as high as 45 Mbps are supported in current frame relay specifications.
Committed Rate Measurement Interval (Tc). This is the interval of time over which the network computes the information arrival rates and applies the CIR. Usually, this interval is measured in fractions of a second.
Committed Burst Size (Bc). This is the maximum number of bits that the network can absorb in the time interval Tc and still guarantee to deliver on the DLCI, under normal conditions. Usually, normal conditions means that the frame relay network is not congested from a series of bursts occurring in too short a time period.
Excess Burst Size (Be). This is the maximum number of bits that the network can absorb above and beyond the committed burst size Bc and attempt to deliver on the DLCI. Note that bursts within the committed size are guaranteed delivery, but delivery of excess bursts is only attempted. There is no penalty for nondelivery of excess bursts.
These four basic parameters are combined in various ways to determine the amount of network resources that must be dedicated to support a particular CIR on a given DLCI. For instance, the frame relay standards specify that the CIR be set equal to the committed burst size Bc divided by the committed rate measurement interval Tc, or Bc/Tc. The usual time unit is a fraction of a second, or a full second, but this varies from service provider to service provider. The following examples will use one full second, which makes the math much easier when dealing with CIRs. There is also an Excess Information Rate (EIR), which is defined as the excess burst size Be divided by Tc. The CIR and EIR give results expressed in bits per second, usually kbps. Neither the CIR nor the EIR, nor even the sum of the CIR and the EIR, assigned to a DLCI can ever exceed the access rate. This only makes sense. The network cannot receive more bits, excess or not, per unit time than the UNI can transmit. However, the CIR and EIR together can add up to less than the access rate of the UNI, although this is not common. Why is so much effort applied to each DLCI on a UNI in terms of CIR? Because the CIR is directly, and the EIR indirectly, used to determine the status of the Discard Eligible (DE) bit in the frame relay header. If the DE bit is zero, then the network should deliver the frame to the destination under all but the most dire network conditions, a guarantee that covers such routine conditions as congestion and backbone link failures. This is what the customer is paying for, of course. On the other hand, if the DE bit is set to 1, the network can discard the frame under congested conditions (and other threatening circumstances). Under particularly drastic conditions, the frame relay network can even discard DE = 0 frames, but of course this action might result in rebates to customers on the affected links if it occurs routinely. Sometimes users familiar with private line networks or other forms of networking recoil at the very thought that a network might discard data, even if only under certain conditions. But this misses the point. The fact is that all networks must discard data if the alternative is to crash the node or the whole network. Routers must discard packets when their buffers become full, and routinely do so. The Internet would grind to a halt (and sometimes seems to anyway) if all packets sent into the network had to be delivered.
Normally, anywhere from 10 percent to even 20 percent packet loss on the Internet is considered acceptable during certain peak periods and traffic loads. The packets must be resent by the endpoint applications if necessary. The problem with haphazard discard mechanisms is that they often discard precisely the wrong traffic. A router seeking to free up buffer space often goes after smaller packets. The philosophy seems to be that freeing up a little buffer space as needed to soldier on is preferable to a wholesale wiping out of large packets that might contain large amounts of user data. But what often happens is that the smaller packets contain the network-level or application-level acknowledgments that destinations send to sources to inform the source that everything up to a given point has been received properly. If an acknowledgment is discarded, the net result is often a barrage of resend traffic that swamps the network worse than the original congestion that triggered the discard process in the first place.
So the DE bit provides a mechanism that frame relay networks can use to establish a loose system of priorities for frames that might be discarded before others. In other words, the rule is to discard frames with DE = 1 before those with DE = 0, but only if such action becomes necessary to avoid severe network congestion. This is what the CIR is for. FRADs that respect the CIR will generate only traffic flows that consist of DE = 0 frames. Only above the CIR will frames be tagged DE = 1 at the network end of the UNI. It is hard enough to understand CIRs when they are described in words alone. A few graphics and examples might make the concept of CIRs more real and meaningful. Always keep in mind that the CIR is designed for bursty traffic and represents a statistical smoothing of traffic over time. A person might drive to work in an hour, at 60 miles per hour, and do the same returning home. But for the eight hours that the person is at work, the car in the employee lot is not moving at all. So the burst speed is 60 miles per hour, but the smoothed or average speed is only 12 miles per hour over all 10 hours. This does not mean that the roads should be built for 12 miles per hour instead of 60 miles per hour, any more than a 64 kbps UNI should operate at only 8 kbps. But it does mean that traffic on highways, as on frame relay networks, is bursty and prone to congestion at certain times. The CIR for a DLCI may be set at 32 kbps while the UNI physical access link into the frame relay port connection on the frame relay switch runs at 64 kbps. The important thing to remember about CIRs is that when information is sent on this UNI, it must always be sent at 64 kbps. It is physically impossible for the link to do otherwise. The CIR of 32 kbps means that frames containing data may flow on this DLCI into the frame relay network at 64 kbps, but only for one-half second at a time. This obviously preserves the CIR of 32 kbps, since one half of 64 kbps is 32 kbps. This is just another way of saying that, for this CIR on this DLCI, the committed burst size Bc is 32,000 bits and the committed rate measurement interval Tc is 1 second (Bc/Tc = 32 kbps). This CIR of 32 kbps should not be seen as much of a limitation. An Ethernet frame has a maximum size of about 1,500 bytes, or 12,000 bits, which would fit comfortably (and transparently) inside a frame relay frame. So a CIR of 32 kbps established for a particular DLCI means that the frame relay network will guarantee delivery of about two and a half Ethernet frames per second on this DLCI. This is not only quite good in terms of capacity, it is a substantial amount of traffic between two LANs at different sites. As long as the FRAD sends no more than two 12,000-bit frames on this DLCI per second, the frames will be marked (or tagged; both terms are used) as DE = 0 (do not discard). It is entirely possible that a user application will generate information to be sent to a remote location at more than 32 kbps. More accurately, the application will send for more than one-half second on a 64 kbps UNI. In frame relay terms, the application will burst above the CIR. In the example using Ethernet frames, perhaps three Ethernet frames (36,000 bits) are sent in the same one-second interval. At the network end of the UNI, the third frame would be marked DE = 1, since the whole frame is considered to be above the CIR.
The frame might still make it through the network under most circumstances, but if the frame is discarded at the ingress switch, or any other switch, the user has no grounds for complaint, since no commitment has been made on the part of the service provider to deliver frames above the CIR. If a fourth Ethernet frame arrives, it, too, must be marked DE = 1, since the bit total for the interval is now 48,000 bits. This example assumes that the sum of the CIR and EIR on the DLCI adds up to 64 kbps, which is normal. It is possible that the EIR might be set to only 16 kbps, however. (This is allowed, but not common.) In this case, the fourth Ethernet frame would be totally ignored, its bits not even buffered, and no response would ever be made to the user end of the UNI. Given the added complexity when the CIR and EIR do not add up to the access rate on the UNI (is this frame within or outside the EIR limit?) and the questionable practice of doing bad things on a network without informing the sources, the guideline that CIR + EIR = access rate is almost universally followed.
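The per-interval accounting just described can be captured in a few lines of Python. This is a simplified sketch, assuming a Tc of one full second and whole-frame accounting as in the Ethernet example; real switches apply the measurement continuously rather than to neat one-second batches:

def mark_frames(frame_bits, cir_bps, eir_bps, tc_s=1.0):
    """Classify frames offered during one measurement interval Tc.
    Bits up to Bc (CIR * Tc) leave as DE = 0, bits up to Bc + Be
    (Be = EIR * Tc) are tagged DE = 1, and anything beyond Bc + Be
    is dropped at the edge of the network."""
    bc = cir_bps * tc_s        # committed burst size in bits
    be = eir_bps * tc_s        # excess burst size in bits
    sent, marks = 0, []
    for bits in frame_bits:
        sent += bits
        if sent <= bc:
            marks.append("DE=0")       # committed: deliver
        elif sent <= bc + be:
            marks.append("DE=1")       # excess: delivery only attempted
        else:
            marks.append("dropped")    # beyond Bc + Be: discarded
    return marks

# Four 12,000-bit Ethernet frames in one second, CIR = EIR = 32 kbps:
# the first two are committed, the third and fourth are tagged DE = 1.
print(mark_frames([12000] * 4, 32000, 32000))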
Figure 4.7 shows how the example relates to the general frame relay class of service parameters. Note that when frames are sent, they are always sent at the UNI access rate, so their slopes are identical. Note that it is in the best interests of all involved if the FRAD respects the CIR for the DLCI and only DE = 0 frames are generated. However, the user applications have no idea of CIR or DLCI. If the CIR is to be respected, usually the FRAD must buffer the traffic. Here is an area where hardware FRADs generally perform better than software FRADs, such as routers. Typically, hardware FRADs respect CIRs and routers do not. So routers on frame relay networks usually have higher resend rates than hardware FRADs, although there is much more involved in resend rates than pure CIR considerations.
Figure 4.7 The CIR and the DE bit. What happens next to a frame marked DE = 0 or DE = 1 varies according to the frame relay network. The fate of the DE = 1 frame depends heavily on the network switch and the current state of the frame relay network in terms of congestion. In some cases, frames sent in excess of the CIR will be stored in buffers on the network switch until there is capacity on the backbone link to forward them. But the DE = 1 frames are the first ones to be discarded in the case of network congestion. In other cases, these excess frames are immediately discarded, an option known as early packet discard. This is quite effective in avoiding congestion on the frame relay backbone, if somewhat drastic during low traffic periods. In no circumstances whatsoever can the CIR chosen for a DLCI exceed the port speed at either end of the frame relay UNI. Furthermore, a site with a UNI and frame relay port speed of 64 kbps connected by a DLCI to a remote site with a UNI and port speed of 256 kbps cannot have a CIR associated with this DLCI in excess of 64 kbps. This only makes sense. The 64 kbps UNI and port could never keep up with a DLCI receiving frames from across the network at up to 256 kbps. So obviously the CIR in this example cannot exceed 64 kbps in either direction on the DLCI. CIR values are typically available from 0 to 1.536 Mbps, with many values in between. Most frame relay providers offer a few sub-64 kbps rates, as well as the same increments of 64 kbps in which the frame relay port speeds are available. There is no technical reason to restrict the CIR to fractions and multiples of 64 kbps. It is just easier for the hardware, software, and network administrators to deal with. Some service providers do not offer a CIR of zero or CIRs higher than about 1 Mbps (1,000 kbps). A CIR of zero basically means that each and every frame sent into the network is automatically marked discard eligible, or DE = 1. The benefits of using a CIR of zero are twofold. First, a CIR of zero might be appropriate if the user has no idea what the average traffic between two sites actually is. Second, there might be only a very small amount of traffic between two sites, so that any CIR at all could be considered overkill.
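The rule that a CIR can never exceed the slower of the two port speeds on a PVC reduces to a one-line check. A trivial sketch (the function name is illustrative):

def max_cir_kbps(local_port_kbps, remote_port_kbps):
    """The CIR on a DLCI is bounded by the slower of the two port
    speeds at the ends of the PVC (64 kbps in the 64/256 example)."""
    return min(local_port_kbps, remote_port_kbps)

print(max_cir_kbps(64, 256))   # 64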
The Frame Relay Frame Frame relay is a network service that occupies the lower two layers of the OSI RM. This model breaks down all network communication functions into seven functional layers, of which only a few are needed within the network itself. Each layer provides services to the layer above and obtains services from the layer below. Each layer is defined as a set of functions, which may be provided by many different actual protocols, as long as the protocol provides the services defined at that layer. Within the context of the OSI RM, frame relay is a Data Link Layer (Layer 2) protocol. Other Layer 2 protocols used in other network technologies include Synchronous Data Link Control (SDLC), which is used in SNA networks, LAPB, which is used in X.25, and High-level Data Link Control (HDLC), which is intended to be used in pure OSI RM environments. In addition, LANs employ a slightly different Layer 2 structure to accommodate such common LAN protocols as token ring or Ethernet. All Layer 2 protocols are distinguished by their use of a common Protocol Data Unit (PDU) structure known as the frame, and each uses a distinctive frame structure; frame relay is no exception. The frame relay frame structure is more similar to WAN protocols such as SDLC or LAPB than it is to the LAN frame structures. This is only understandable, given frame relay's relationship to X.25 and WAN protocols in general. What is different is the detailed structure of each field within the frame relay frame itself. Frame relay uses an adaptation of Link Access Protocol-D Channel (LAPD), a version of HDLC originally developed for ISDN. This is known as LAPF in frame relay networks. Actually, the LAPF specification for frame relay is a subset of the full LAPD protocol, which is specified in the Q.921 international standard for the Digital Subscriber Signaling System, since frame relay networks are not required to do error recovery. LAPF does define a series of core functions that all frame relay networks must perform. In many texts, the core functions are called the basic operations of frame relay. These functions or operations are basic or core in the sense that a frame relay network must perform these tasks for each and every user data frame transferred through the network. The five LAPF core functions are:
Use a Frame Check Sequence (FCS) to check for errors.
Check for valid frame length (another error condition).
Discard invalid or errored frames.
Provide frame delimiting, alignment, and transparency.
Multiplex and demultiplex frames.
All of these have been discussed to some extent already, except perhaps the last point. Briefly, all frame relay network nodes must check that the frame relay frame is within the allowable frame size parameters. These nodes must also use a specific Frame Check Sequence (FCS) to check frames for transmission errors. These errors are not corrected within the frame relay network. The node must discard invalid or errored frames. The source and destination frame relay equipment is expected to deal with these error conditions, not the frame relay network nodes.
The frame relay network nodes must also be able to delimit (detect the start and end of) frames, align the frame (that is, make sure the frame is formatted correctly), and ensure transparency. Transparency means that the frame relay node can make no assumptions about the content of the frame relay frame and cannot even look at the frame contents when processing the frame. So the frame contents are totally transparent to the frame relay network, and literally anything should be able to be sent over the network, even voice and video.

The last point about the ability of frame relay networks to multiplex and demultiplex frames simply means that a router or FRAD will send all frames to all reachable destinations on the same physical link (the UNI). All traffic from a particular site is multiplexed onto the UNI at the source and all arriving traffic to a particular site is demultiplexed to the destination application. All multiplexing and demultiplexing in frame relay is based on the connection identifier (DLCI).

There are other functions commonly performed by other Data Link WAN protocols. These include such functions as frame sequencing, window sizing, and acknowledgments. These are not performed by the frame relay network nodes at all. This is one of the secrets of making frame relay into a fast packet protocol, since the network nodes are relieved of these functions and operate much faster as a result. Of course, no network protocol can afford to ignore these important functions. The point is that frame relay networks do not handle them within the network at the switches. These functions are performed at the endpoints of the network, at layers above the Data Link Layer. The combination of higher-layer protocol functions and intelligent end systems controls end-to-end transport of data and makes the end system responsible for error recovery, windowed flow control, acknowledgments, and so on. Actually, the LAPF specification makes allowances for a frame relay transfer service to provide some of these functions to the Layer 3 (Network Layer) protocol. But since these functions apply to signaling messages, the frame relay network usually only provides the core functions for user frames. TCP/IP, or some other higher-layer protocol architecture, takes care of these additional functions at the Network Layer and above.

The frame relay UNI sends and receives frame relay frames. The frame relay frame format is very simple in structure. First there is a special flag indicating the beginning of a frame. The flag is the bit configuration 01111110, or 7E in hexadecimal notation. Then there is a two-octet frame relay header field. An octet is defined as 8 bits and is preferred to the more common term "byte," which can have various sizes in some contexts. In some documentation, the frame relay header is called the address field, but it will be called a header here because this is the more common term. The header is followed by the frame payload or information field, which is variable in size up to some maximum, most often 4096 octets. The information field is followed by a trailer two octets in length. This frame check sequence (FCS) contains a 16-bit Cyclical Redundancy Check (CRC-16) that provides a very good error-detection mechanism for finding frames that have bit errors in them. The whole frame ends with another 7E flag octet. Immediately following an ending flag, a new frame relay frame might begin, although such rapid operation is rare.
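As a concrete illustration of the FCS check, here is a minimal sketch of the 16-bit CRC used by HDLC-family protocols (the CRC-CCITT of RFC 1662, computed bit by bit for clarity). The function names are illustrative, and the sketch assumes the FCS is transmitted low-order octet first, per the usual HDLC convention.

```python
# Minimal sketch of the 16-bit FCS check used by HDLC-family protocols,
# which LAPF also uses. Bit-by-bit for clarity, not speed.

def fcs16(data: bytes) -> int:
    fcs = 0xFFFF
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs ^ 0xFFFF

def frame_is_valid(frame: bytes) -> bool:
    """frame = header + payload + 2-octet FCS (7E flags already stripped)."""
    if len(frame) < 4:   # too short to hold a header and an FCS
        return False
    body, fcs_field = frame[:-2], frame[-2:]
    received = fcs_field[0] | (fcs_field[1] << 8)  # low octet sent first
    return fcs16(body) == received
```

A frame that fails this check is simply discarded by the network node; recovery, if any, is left to the endpoints, as described above.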
The structure of the frame relay frame is shown in Figure 4.1.
Figure 4.1 The frame relay frame.

The frame relay header has an interesting structure itself. There are even a few variations allowed on the frame relay header, but all networks must support the basic structure. The basic frame relay header structure is shown in Figure 4.2.
Figure 4.2 The basic frame relay header.

The fields in the figure are labeled by the acronyms used to identify the parts of the frame relay header. These can be confusing at first, so a closer look at these fields and their functions is needed.

DLCI. This is the Data Link Connection Identifier. It is a 10-bit field that contains the logical connection number of the PVC (or SVC, when an on-demand connection is made) that the frame is following across the network. This number ranges from 0 to 1023, but some connection numbers are reserved for special functions. The DLCI is examined by each frame relay switch on the network to determine the correct output port to relay the frame onto. The DLCI number may be changed by the switch as well, and usually is. This is all right, as a particular DLCI has local significance only on a frame relay network. The reason that the DLCI is split between two octets is that the field is a concatenation of the SAPI and TEI fields from the ISDN LAPD frame structure, which are not used in LAPF.

C/R. This is the Command/Response bit. It is not used in the current definition of the frame relay protocol. Again, it is an artifact of the X.25 roots of frame relay.

EA. This is the Extended Address bit. These bits are at the end of each header octet and allow the DLCI field to be extended to more than 10 bits. The last EA bit is coded as a "1" and all previous EA bits are coded as "0s". In the figure, there are only two EA bits, but other header configurations are allowed in frame relay. These other possibilities are discussed later.

FECN and BECN. These are the Forward Explicit Congestion Notification and Backward Explicit Congestion Notification bits. These bits help with congestion control on the frame relay network. The use of these bits can be quite complex and will be discussed more fully in a later chapter. Some real controversies about the proper and effective use of FECN and BECN have come up in the past few years, so this attention is certainly warranted.

DE. This is the Discard Eligibility bit. It is used to identify frames that may be discarded by the frame relay network under certain conditions. The use of this bit will be discussed further later in this chapter.

The FECN, BECN, and DE bits are distinguishing characteristics of the frame relay protocol. They represent something new in the philosophy of just what a network can and should do in special circumstances.

One of the features of the frame relay header that strikes people as extremely odd is the fact that the DLCI field is split between the two octets of the frame relay header. As it turns out, the reason for this is quite important for the understanding of why frame relay is considered to be an improvement over LAPD. The easiest way to appreciate this is to compare the structure of the ISDN LAPD frame and header to the frame relay frame and header structure just outlined. The structure of the ISDN LAPD frame is shown in Figure 4.3. The similarities with the frame relay frame structure are immediately obvious.
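To make the bit positions concrete, here is a minimal sketch of unpacking the two-octet header just described. The function name is illustrative; the field layout follows the Q.922 structure in Figure 4.2.

```python
# Sketch of unpacking the basic two-octet frame relay header.

def parse_header(octet1: int, octet2: int) -> dict:
    return {
        # Upper 6 DLCI bits live in octet 1, lower 4 bits in octet 2:
        # the legacy of the LAPD SAPI/TEI split discussed below.
        "dlci": ((octet1 >> 2) << 4) | (octet2 >> 4),
        "cr":   (octet1 >> 1) & 1,   # Command/Response, unused
        "fecn": (octet2 >> 3) & 1,   # forward congestion notification
        "becn": (octet2 >> 2) & 1,   # backward congestion notification
        "de":   (octet2 >> 1) & 1,   # discard eligibility
        "ea":   octet2 & 1,          # 1 = header ends with this octet
    }

# DLCI 18 with DE set: octet1 = 000001|0|0, octet2 = 0010|0|0|1|1
print(parse_header(0b00000100, 0b00100011))
```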
Figure 4.3 The ISDN LAPD frame structure.

The basic frame structure has appeared over and over again in many data communications protocols. The structure is simple enough, yet possesses all of the capabilities needed to carry the packet over a series of links between switches. This is in fact a key point. In all layered data communications protocols, the frame is sent on a link-by-link basis only. That is, at every step along the way on a network, a frame is created by the sender and essentially destroyed (in the act of processing the frame) by the receiver to get at the packet inside the frame. It might seem impossible to ever send anything intact from a source to a final destination if this is the case. But the key to understanding how layered protocols operate, and how frame relay relates to X.25 and LAPD, is to realize that it is the packet, as the contents of the constantly invented and destroyed frames, that is sent across the network end-to-end from a source to a destination.
Many protocol developers are even in the habit of calling any protocol data unit that travels the length of a network from end-to-end intact, through switches and other network devices, a packet. The packet label seems to apply whether the actual PDU is a TCP/IP Layer 3 data unit (formerly called a datagram or connectionless packet) or an ATM data unit (the cell). The term “packet” is used generically in many cases to indicate a data unit that leaves one end-user location, flows over a network, and arrives at another end-user location intact. This is where the term “fast packet switching” comes from. As mentioned in the previous chapter, the first step in understanding the relationship of frame relay to X.25 is to realize that in frame relay it is the frame that flows from source to destination on a frame relay network. In X.25 this function is performed by the packet or frame contents. In an X.25 packet-switching network, packets are switched on the network. In a frame relay network, frames are relayed across the network much more efficiently, since the frames no longer need to be created and destroyed on a link-by-link basis to get at the packet inside. Frame headers are examined, processed, and modified, but all of this happens at the frame level. There is another important aspect of the ISDN LAPD frame structure that holds a key to understanding the relationship between X.25 (through ISDN) and frame relay. This is the ISDN LAPD frame address field; its structure is shown in Figure 4.4. Again, this structure should be compared to the frame relay header structure.
Figure 4.4 The ISDN LAPD frame address field.

Note that this is the LAPD frame address field. In frame relay, it is common to call this field the header, although plenty of frame relay documentation retains this address designation for the frame relay header. In either case, header or address field, the function is the same in both LAPD and frame relay: to tell the network devices what to do with the information inside the frame. It is obvious from Figure 4.4 that there are two different fields involved in the ISDN LAPD address structure. The first is the Service Access Point Identifier (SAPI), which is 6 bits in length, and the second is the Terminal Endpoint Identifier (TEI), which is 7 bits in length. These identifiers are just numbers, from 0 to 63 in the case of the SAPI and from 0 to 127 in the case of the TEI field.

But why should a frame, which only flows from a single location to another single location (point-to-point), have such a complicated address structure? This was one of the innovations of LAPD itself. While it is true that all frames flow over the same point-to-point link on a data network, it is not the case (and cannot be) that all packets must do the same. Otherwise, a separate physical point-to-point link to every possible destination must be configured at the source for these frames to flow on. And in fact this is the essence of a private network based on leased lines. But this is not the essence of LAPD, which is packet-switched, not circuit-switched. The parent protocol LAPD, like its child protocol frame relay, allows the multiplexing of connections from a single source location over a single physical link. The greatest benefit of this approach is to make more efficient use of expensive links and cut down on the number needed in the network.

The two fields in the LAPD frame address deal with the two possible kinds of multiplexing that a customer site sharing a single physical network link must accommodate. First, there may be a number of user devices at a customer site. Second, there may be a number of different kinds of traffic that each of these devices generates. For instance, even though all information is 0s and 1s, some of these digits may represent user data and some may represent control signaling to the network itself.
The TEI field deals with the first of these multiplexing possibilities. The TEI field addresses a specific logical entity on the user side of the interface. Typically, each user device is given a unique TEI number. In fact, a user device can have more than one TEI, as would be the case with an ISDN concentrator device with multiple user ports. The SAPI field deals with the second multiplexing possibility. The SAPI field addresses a specific protocol understood by the logical entity addressed by a TEI. These are Layer 3 packet protocols, and the SAPI provides a method for all equipment to determine the structure of the packet contained in the frame.

Taken together, the TEI and SAPI address fields make it possible for all network devices on the network to first determine the source or destination device of a particular frame on an interface (the TEI), and then determine the source or destination protocol and packet structure on that particular device (the SAPI). When frame relay was invented, the TEI and SAPI fields were combined, which is the main reason why the DLCI field is split between the two octets of the frame relay header field. The two-level structure of the TEI and SAPI, which in truth was marginally effective in ISDN, was combined into the flat address space of the 10-bit DLCI. The overall structure of the ISDN LAPD address field had to be preserved because of a desire to eventually replace LAPD running on the D-channel of ISDN with frame relay. The frame relay field structure allows networks to handle both frame relay DLCIs and ISDN LAPD SAPIs. Ten bits can count 1024 things, and the DLCI is just a number between 0 and 1023 that is used by the frame relay network to identify a virtual circuit connection.
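For comparison with the frame relay header sketch above, here is the corresponding unpacking of the ISDN LAPD address field, again with an illustrative function name. The SAPI and TEI occupy the same bit positions that the flattened 10-bit DLCI later took over.

```python
# Sketch of the ISDN LAPD address field: SAPI in the first octet,
# TEI in the second. Frame relay flattened these two identifiers
# into the single 10-bit DLCI occupying the same bit positions.

def parse_lapd_address(octet1: int, octet2: int) -> dict:
    return {
        "sapi": octet1 >> 2,        # 6-bit protocol identifier (0-63)
        "cr":   (octet1 >> 1) & 1,  # command/response
        "tei":  octet2 >> 1,        # 7-bit terminal identifier (0-127)
        "ea":   octet2 & 1,         # 1 = end of address field
    }
```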
The Data Link Connection Identifier

The number of Data Link Connection Identifiers (DLCIs) possible on a UNI, 1024, might seem like a lot. But in actual practice the number of DLCIs that can be assigned on a given UNI of a certain speed is strictly limited. One reason is that some DLCIs are reserved for network management and control functions. This is certainly understandable. The second reason that DLCIs are limited on a UNI has to do with bandwidth on the UNI. Frame relay offers flexible bandwidth allocation, to be sure, but that does not mean that the UNI's bandwidth is unlimited. When many source devices generate bursts of traffic all at once, any frames that cannot be sent in a given time frame across the UNI to the frame relay switch must be buffered or just dropped. Either strategy is allowed and strictly implementation-dependent. Naturally, the more DLCIs defined from sources, the greater the chance that many of them will be bursting at once. If the excess traffic is delayed through buffering, the end-to-end delays will rise through the network. If the excess bursts are discarded, most data applications respond by resending copies, which only makes the problem worse. Other applications, such as voice and video, hardly function at all when traffic losses reach about 10 percent or so (even less for commercial voice and video applications).

The number of DLCIs allowed on a UNI depends on the service provider's policy with regard to the speed of the UNI and the Customer Premises Equipment (CPE) capabilities. In actual practice, the CPE is not much of a limitation, especially if FRADs with adequate buffers are used. Routers routinely buffer excess traffic even in a leased-line environment, and hardware FRADs come with a variety of buffer sizes available. So the number of DLCIs that can be supported on a UNI of a particular speed boils down to what the service provider says is the maximum. It is in neither the customer's nor the service provider's best interest to have high delays or high information loss on the UNI due to over-configured connections. The majority of frame relay UNIs are still 56/64 kbps links. In many cases, the maximum number of DLCIs that can be configured on this speed of UNI is 50. But in most cases, no more than 10 DLCIs on a 56/64 kbps UNI is a good guideline to work with. If this seems like a limitation, remember that a DLCI identifies a logical connection to a destination. In a leased-line environment, 10 separate 56/64 kbps links would be needed instead of 10 DLCIs on one UNI.

DLCI values are used according to the pattern in Table 4.1, as specified by ITU-T Recommendation Q.922.

Table 4.1 DLCI Values and Their Uses
DLCI Value      Assigned Use
0               In-channel signaling and management (ANSI/ITU-T)
1–15            Reserved for future use
16–991          User connections (only 512 to 991 when on an ISDN D-channel)
992–1007        Frame relay management at Layer 2
1008–1022       Reserved for future use
1023            In-channel virtual circuit management (Frame Relay Forum)
The table is more complex than it seems at first. There appear to be two DLCIs that are used for link-management purposes. There are. DLCI 0 is used for signaling messages for demand connections (SVCs) and for some link-management functions. The use of this DLCI for this function is defined by the American National Standards Institute (ANSI) and the International Telecommunication Union-Telecommunication Standardization Sector (ITU-T). The ITU-T sets international standards for telecommunications and ANSI adapts them for use within the United States. So the use of DLCI 0 for in-channel signaling and management is official. But the table also notes that DLCI 1023 (all 1s in the DLCI field) is used for "in-channel virtual circuit management," which is almost the same purpose and function, and even wording, as used for DLCI 0. This is because DLCI 1023 is used by the industry consortium formed to bring frame relay equipment and services to market as quickly as possible, the Frame Relay Forum (FRF). It is enough to note here that the FRF issues various Implementation Agreements (IAs) regarding certain aspects of frame relay. FRF IAs are not really standards, but formal agreements among the members to make frame relay equipment and create frame relay networks that do certain things in certain ways. IAs are a good example of a de facto standard, one established not by rule, but by common use. With regard to DLCI 1023, the FRF has said that this connection identifier is to be used for the link management function. Link management provides such key UNI functions as verifying that the UNI and each DLCI is up and running, detecting new PVCs, and so on. Since link management is so important to frame relay, Chapter 7 will be devoted to it.

DLCIs 992 through 1007 are reserved for what is known as Layer 2 management. This sounds odd, since all of frame relay is at Layer 2 of the OSI RM, but that is exactly the point. Frame relay user DLCIs carry virtual circuit traffic end-to-end across the frame relay network for the users, and such frame content is transparent to the frame relay network. DLCIs 0 and 1023 carry information back and forth from the customer premises equipment to the frame relay switch across the UNI. But how can one frame relay Layer 2 function at the premises end of a UNI communicate with its counterpart at the other end of another UNI? This is what DLCIs 992 through 1007 are for: to allow frame relay equipment at one UNI to communicate with frame relay equipment at another UNI over the frame relay network. Frames with these DLCIs contain neither user traffic nor local UNI management and control information.

DLCIs 16 through 991, 976 in all, are assigned for user connections. They may be PVCs or SVCs (demand connections). There is one exception. If the frame relay frame is part of an ISDN and uses the ISDN D-channel for frame transport, which is allowed, then the user-assigned DLCIs must be between 512 and 991. This automatically forces the SAPI value to be between 32 and 61 from the ISDN perspective. This is just another way of saying, "Frame relay frames on the ISDN D-channel must use SAPIs 32 through 61." While this limits the number of connections on an ISDN, these 480 remaining connections should not be a limitation for currently defined ISDN link speeds. Practical considerations regarding link traffic will be a limitation long before logical link connection numbers will.
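A small helper, sketched directly from Table 4.1, makes the assignments easy to check. The function name and return strings are illustrative.

```python
# Classify a two-octet-header DLCI value by its assigned use (Table 4.1).

def classify_dlci(dlci: int) -> str:
    if dlci == 0:
        return "in-channel signaling and management (ANSI/ITU-T)"
    if 1 <= dlci <= 15 or 1008 <= dlci <= 1022:
        return "reserved for future use"
    if 16 <= dlci <= 991:
        return "user connection (PVC or SVC)"
    if 992 <= dlci <= 1007:
        return "frame relay Layer 2 management"
    if dlci == 1023:
        return "in-channel virtual circuit management (Frame Relay Forum)"
    raise ValueError("DLCI out of range for a 10-bit field")

print(classify_dlci(18))    # user connection (PVC or SVC)
print(classify_dlci(1023))  # in-channel virtual circuit management ...
```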
This brings up another point about DLCIs and frame relay headers in general. What if a UNI or other frame relay network link is not running at a relatively low speed like 56/64 kbps, but at a much higher speed like 45 Mbps or even higher? At these higher speeds, might not logical connectivity in terms of assigned DLCIs become an issue before physical traffic considerations? In other words, might not a high-speed link be able to handle 976 bursty DLCIs with ease? How would customers react to having to purchase another 45 Mbps link just because the original had run out of DLCIs, if the link were only 25 percent or so loaded? These are all legitimate concerns, of course. Fortunately, frame relay has an answer: to allow the frame relay network to extend the frame relay header beyond two octets. This allows more than 10 bits to be used for the DLCI. There are various methods to do this, some using three octets and some using four. The method recommended by both the ITU-T and ANSI employs a four-octet frame relay header. Normally, this extended frame relay header is found on high-speed links and on frame relay NNIs. The four-octet frame relay header structure is shown in Figure 4.5.
Figure 4.5 The extended frame relay header format.

For backward compatibility, the four-octet frame header has the first two octets essentially the same as the two-octet version, except that the second EA bit is now 0. This EA = 0 lets the receiver know that an extended header is in use. All of the bits in the third octet, except for the last EA = 0 bit, are used to extend the DLCI. This gives a 17-bit DLCI capable of enumerating some 128,000 virtual connections. If this is not enough, six more bits in the fourth octet can be used to further extend the DLCI to a total of 23 bits. This gives about 8 million virtual connections, almost certainly severe overkill. So this fourth and final octet can instead be used for Data Link-Core (DL-Core) control functions, similar in intent and function to the Layer 2 management connections, but of course able to be used on all DLCIs (because this field is present on all extended DLCIs), not just individual ones. The D/C bit indicates to the receiver whether the fourth octet is used for extended DLCI (D) or DL-Core control (C) purposes. No DL-Core control procedures have yet been defined, however, which renders this field and function relatively useless. It should be noted that when the extended DLCI formats are used, the higher reserved DLCI values move up to take the highest positions. So FRF link management would be on DLCI 128,000 or 8 million, as the case may be.

DLCIs identify virtual connections on a frame relay network. So each connection, whether PVC or SVC, must have a DLCI in the allowed user range. There is one other major DLCI topic to be discussed, however, and a topic which can be confusing to those used to LAN and connectionless IP environments such as the Internet. LAN Layer 2 addresses and IP Layer 3 addresses have global significance. That is, there can be no two LAN or IP devices on the same network with identical addresses, just as there can be no two telephones in the world with identical telephone numbers. How could LAN and IP traffic, or voice connections, be routed if it were otherwise? But DLCIs have local significance only. This topic is important enough to deserve a section of its own.
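Before turning to local significance, here is a minimal sketch of assembling an extended DLCI from the four-octet header just described. The function name is illustrative, the EA-bit validation is skipped for brevity, and the bit positions follow the four-octet ITU-T/ANSI layout above.

```python
# Sketch of assembling the extended DLCI from a four-octet header.

def parse_extended_dlci(header: bytes) -> int:
    """header: the four octets of the ITU-T/ANSI extended format."""
    dlci = ((header[0] >> 2) << 4) | (header[1] >> 4)  # 10 bits, as in Figure 4.2
    dlci = (dlci << 7) | (header[2] >> 1)              # 7 more bits from octet 3
    if (header[3] & 0b10) == 0:                        # D/C = 0: octet 4 extends DLCI
        dlci = (dlci << 6) | (header[3] >> 2)          # final 6 bits: 23-bit DLCI
    # D/C = 1 would mean octet 4 carries DL-Core control bits instead,
    # leaving the 17-bit DLCI from the first three octets.
    return dlci
```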
Local Significance
When a packet is sent across the Internet from one router to another, the packet header contains the full, globally unique source and destination IP addresses. The same is true of all LAN frames, from Ethernet to token ring and beyond. If there are two sources or destinations with the same LAN or IP address, the system breaks down and there will be unreachable places on the network. This is the essence of global significance. But this is not the case with frame relay frames. The frame relay DLCI is a connection identifier, not a source and/or destination address. DLCIs have local rather than global significance. Only one DLCI is needed to connect any two sites because a connection is defined as a logical relationship between two endpoints. Now, there is a separate DLCI on each UNI at both ends of the network, but these are just opposite ends of the same connection.

None of this discussion about local significance implies that there are no network addresses on frame relay networks. When a connection is established on any kind of network, the connection must know just which two network endpoints to connect. This may be done by hand at service provision time (which makes this connection a PVC) or dynamically by means of a signaling protocol (which makes the connection an SVC). Many public frame relay networks employ network addresses that look like telephone numbers, with area codes, office codes, and access line numbers. Since only 1,024 DLCI values exist on a standard UNI and there will obviously be more than 1,024 users on large public frame relay networks, the DLCIs must be of local significance only. Similarly, the word "mother" is of local significance only as well. Everyone has a mother, but most people mean a different person when they say "mother" (unless they are siblings). On a frame relay network, there may be many DLCI = 17 connections, but as long as there is only one DLCI = 17 connection on a given UNI or NNI, there is no problem.

Frame relay switches use the DLCI to route the frame on a path from input port to output port. All that needs to be done is to look up the DLCI in the header of the incoming frame in a table. The table entry essentially gives two pieces of information. First, it contains the switch output port number on which the frame will be sent out of the switch. Second, it contains the value of the new DLCI that will be inserted into the outgoing frame in place of the original DLCI value. This must be done to preserve the purely local use of DLCIs. Of course, frame relay switches can use whatever protocol they wish to communicate among themselves. In the most general case, only the DLCIs on the two UNIs actually matter, from the user perspective.

The result is not chaos for the user, however. At the origination point on the frame relay network, the router will interpret the IP destination address of the packet (for example) and place the DLCI for the PVC to the destination into the frame relay DLCI header field. Although the DLCI might change as the frame flows through the network switches, the switch at the destination UNI will replace the arriving frame header DLCI with the proper DLCI for the frame on the destination UNI and send it to the user. While DLCIs change, they do not do so haphazardly. The use of locally significant DLCIs on a frame relay network is illustrated in Figure 4.6. There is more than one DLCI = 18 connection, but only one DLCI = 18 can exist at any one time on each UNI.
All the FRADs need to know is that when they need to send a frame to Site B, for example, they send it on DLCI = 17, and when they need to send a frame to Site C, they send it on DLCI = 18. All that a destination such as Site D needs to know is that when it receives a frame with DLCI = 18, it came from Site C, and when it receives a frame with DLCI = 19, it came from Site B. If two DLCIs happen to actually match at opposite UNIs, it is more likely the result of a coincidence than planning.
Figure 4.6 DLCIs and local significance.
It should be noted that Site A and Site D in Figure 4.6 cannot communicate directly over the frame relay network. No DLCI, and so no connection, has been defined between them. The sites might still be able to exchange information, but only through an intermediate site (from the FRAD perspective; to the frame relay network, all sites are endpoints) such as Site B or Site C. Such partial mesh connectivity might be provided for reasons of expense (DLCIs cost little, but they are not free), traffic patterns (Site A rarely communicates with Site D), or even security (all Site A to Site D traffic must pass through Headquarters Site C). On the other hand, full logical mesh connectivity with DLCIs can be provided as well, as long as traffic requirements are respected.

DLCIs are, by definition, bidirectional, as is also shown in Figure 4.6. This means that if a frame relay PVC is configured from Site A to Site B, and Site B receives from Site A on DLCI = 20, then the same DLCI number leads back to Site A. Oddly, frame relay service providers are fond of charging customers for a connection from A to B, then also for a connection from B to A, as if DLCIs were, like ATM connections, unidirectional. But regardless of how the frame relay connections are bundled and charged for, DLCIs are bidirectional. This does not mean that a customer can actually use a DLCI from B back to A if it has not been paid for. It simply means that special care needs to be taken in the network to prevent this bidirectional capability, not to enable it.
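The table lookup described above is simple enough to sketch. Everything here is illustrative (the table values, the dictionary-based frame), but it shows why locally significant DLCIs cause no chaos: each switch consults only its own table and rewrites only the header.

```python
# Sketch of the per-port connection table consulted by a frame relay
# switch: the incoming (port, DLCI) pair selects an outgoing port and
# the new, locally significant DLCI written into the outgoing header.

switching_table = {
    # (in_port, in_dlci): (out_port, out_dlci) -- illustrative values
    (1, 17): (3, 44),
    (1, 18): (2, 18),   # matching DLCIs on two links is pure coincidence
    (2, 44): (1, 17),
}

def relay(in_port: int, frame: dict) -> tuple:
    out_port, out_dlci = switching_table[(in_port, frame["dlci"])]
    frame["dlci"] = out_dlci   # rewrite the header; payload is untouched
    return out_port, frame

print(relay(1, {"dlci": 17, "payload": b"..."}))  # (3, {'dlci': 44, ...})
```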
UNI Options

Frame relay networks are WAN technologies. Since frame relay networks may span long distances, it is common for a frame relay customer to obtain services from a long-distance public service provider. The three largest in the United States are AT&T, MCI, and Sprint. These companies handle long-distance voice services and offer frame relay services. If the frame relay network spans a small enough distance, the entire frame relay network may fall within a single Local Access and Transport Area (LATA). Within a LATA, a local exchange carrier (LEC) such as NYNEX, Bell Atlantic, or GTE may offer a full-service frame relay network without the need for involving an interexchange carrier (IXC) like AT&T, MCI, or Sprint. However, most frame relay networks easily span LATAs or even states. In these cases, the carrier chosen to provide frame relay network services is often the customer's long-distance service provider. However, given the new deregulated environment in the United States and around the world, it is no longer a given that a multi-LATA or multistate frame relay network always involves an IXC. And with the rise of NNI agreements between LECs and IXCs when it comes to their frame relay networks, it is always in the customer's best interest to seek the lowest price for frame relay, whether from LEC or IXC.

But even if the LEC or IXC is the ultimate supplier of the frame relay service, the local access portion of the frame relay network, the UNI, is commonly obtained from the LEC. The only issue that arises is how the customer site UNI is linked to the frame relay service. And even this LEC access is not always the only way to go. LEC control of the UNI local access portion of the frame relay network is slowly changing in many parts of the United States. Several companies have achieved co-carrier status with the incumbent LEC (ILEC) in many states. This section will consider all the possibilities for providing a local physical connection into the wide area frame relay network.
Leasing Local Access

The simplest way to provide the connectivity is to lease a local loop from the LEC to the frame relay service provider's Point of Presence (POP) within the LATA. The POP is where the frame relay switch is located. In the simplest case, the LEC is both the supplier of the UNI and the supplier of the frame relay network service. But when an IXC is the provider of the frame relay service, the UNI is still most often a LEC-provided leased line from the customer site to the frame relay POP. The choice of leased-line speeds runs from low-speed access (56/64 kbps) to high-speed (a full T1 at 1.5 Mbps, or even a full T3 running at 45 Mbps), with a number of possibilities in between. The high-speed alternative may be un-channelized (just a single 1.5 Mbps or 45 Mbps channel) or channelized (multiple lower-speed channels). Channelized access still allows non-frame relay services to share the same local access line as the frame relay service.

There is no need to provide exactly the same arrangement at each and every site linked to the frame relay network. This identical arrangement may be desirable from a network management standpoint merely for the sake of consistency, but it is not a requirement. Depending on the specific needs of each site, a number of local access alternatives may be used on a frame relay network.
The most popular port speed for frame relay networks to date, based on the number of ports sold, is 56/64 kbps. Over 50 percent of all frame relay ports still run at this speed. Of course, the easiest way to access a 56/64 kbps frame relay switch port is with a leased 56/64 kbps DS-0 circuit. The "56" in the 56/64 kbps low-speed access refers to the fact that in some parts of the United States, full 64 kbps clear channel digital circuits are not available. In these places, only 56 kbps is used, which is 7/8ths the speed of the full 64 kbps DS-0 circuit. The difference in performance is minor in most applications.

Normally, a DS-1 (or T1, as many customers know it) is channelized, or broken up into 24 DS-0 channels, each running at 64 kbps (56 kbps in some U.S. locations). Although few sites may need 24 links running at 64 kbps, the price crossover point is usually only three DS-0 circuits. In other words, if four separate DS-0s are running to a site, it is actually cheaper to lease a single DS-1 with 24 DS-0 channels, even though 20 of them sit idle. Think of it as buying a dozen eggs when the recipe only calls for four because a dozen is cheaper. The other eight can be kept in the refrigerator for a while until needed. An un-channelized DS-1 runs at 1.544 Mbps and offers a data transfer rate of 1.536 Mbps, the remaining 8 kbps representing overhead. An un-channelized local access loop can only support the frame relay network traffic, however.

It is typically attractive to use a channelized DS-1 to provide integrated access to the frame relay network, especially for initial frame relay network implementations. Here is how. Because of the price differential between individual DS-0s and a 24-channel DS-1, many companies have DS-0 channels in DS-1s that are unused. Many of these channels are used for voice tie-lines from PBXs or for SNA networks. In most cases, some of these otherwise idle channels can be used for frame relay network access. Naturally, if the DS-1 channels run to the AT&T POP, the frame relay service provider must be AT&T. This concept of integrated access is illustrated in Figure 4.8.
Figure 4.8 Integrated access to a frame relay network.

A nice feature of this arrangement is that it offers more scalability than single DS-0 circuits alone. Frame relay port speeds are usually available in multiples of 64 kbps. If there are six spare channels on a DS-1, any number up to all six may be used for the frame relay access link, as long as the frame relay port speed is upgraded to match. This may require a separate multiplexer box, but it would offer access link speeds of 64 kbps, 128 kbps, 192 kbps (rare as a port speed offering), 256 kbps, 320 kbps (also rare), or 384 kbps.

One word of caution is in order, however. The channels assigned for the frame relay access link should be contiguously slotted. This means that in the preceding example, channels 4, 5, 6, and 7 on a DS-1 could be used to give an access link speed (and matching port) of 256 kbps (4 × 64 kbps), but not channels 4, 9, 12, and 15. These last four channels are not contiguous (not adjacent to one another). There is usually a severe performance degradation when using noncontiguous channels.

Leasing a point-to-point local access link to an IXC from a LEC, whether as a series of DS-0s or as a single channelized or un-channelized DS-1, is no longer the only possibility in many areas within the United States. The alternative local service providers, which used to be called Competitive Access Providers (CAPs) but now almost universally prefer the term Competitive LECs (CLECs), also can supply local access to an IXC's frame relay POP.
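The contiguous-slot rule and the speed arithmetic can be captured in a few lines. The function and its names are illustrative, not any carrier's provisioning tool.

```python
# Sketch of the contiguous-slot rule for fractional T1 access: the chosen
# DS-0 channels must be adjacent, and the access speed is 64 kbps (or 56
# kbps with Bipolar-AMI) times the number of channels.

def access_speed_kbps(channels: list, kbps_per_channel: int = 64) -> int:
    chans = sorted(channels)
    if any(b - a != 1 for a, b in zip(chans, chans[1:])):
        raise ValueError("channels must be contiguous, e.g. 4, 5, 6, 7")
    return len(chans) * kbps_per_channel

print(access_speed_kbps([4, 5, 6, 7]))   # 256
# access_speed_kbps([4, 9, 12, 15]) would raise: not contiguous
```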
Diverse Routing

A frame relay UNI will typically replace a larger number of point-to-point leased lines. But the single UNI might be considered a single point of failure on the frame relay network. That is, if the UNI fails, the entire site is suddenly cut off from the frame relay network. In order to address this issue, the local access link that constitutes the UNI might be diversely routed to avoid single points of failure along the path from customer site to frame relay switch.
So, in many cases the local access provider can offer attractive alternatives to point-to-point local access links. Many of these local service providers have installed rings of fiber optic cable in major metropolitan areas. These fiber rings have two advantages over point-to-point access configurations based on copper wire or coaxial cable. In the first place, the fiber rings are much less susceptible to service outages. The rings are configured so that if a link between two adjacent network nodes is broken, service is not disrupted. Information is wrapped back around onto an alternate path fiber in a matter of milliseconds (60 milliseconds is not uncommon). Repair crews can go about their tasks in an unhurried fashion without the threat of irate customers (or regulatory limits) forcing the crews to cut corners or rush cable splices. Second, the fiber optic cable itself offers much lower error rates than copper media. The higher quality is quite noticeable to users. While typical copper media have bit error rates of about 1 in a million, most fiber networks have bit error rates of about 1 in a billion, which is fully 1000 times better. This translates to 1/1000th of the bit errors encountered on a copper access link. The twin advantages of quality and automatic service rerouting have made the use of fiber rings for service access (frame relay or not) very attractive for a wide range of customers.

One other point should be made when it comes to leasing the local access portion of the public frame relay network. This concerns the format of the transmission frame used to carry the frame relay frames from the customer premises to the public frame relay switch. Whether higher speeds or fiber rings are available or not, most users choose a DS-1 or lower speeds for the local access portion of the frame relay network. All DS-1s in the United States have two critical parameters that must be matched between the customer's premises equipment DS-1 port and the public service provider's DS-1 port on the frame relay switch. The two parameters are frame format and line coding. Many installation dates for frame relay services have been missed because of miscommunication regarding these two parameters.

Older DS-1s employ a frame format known as the D4 superframe. Some service providers refer to this as SF, or superframe formatting, but this is the same thing as D4. Newer DS-1s support a frame format known as Extended SuperFrame (ESF). ESF is always preferred because of its superior manageability and problem isolation features. In some areas, ESF may not be an option, while other areas will offer a choice. Either frame format will work with frame relay, as long as both ends are configured to use D4 or ESF.

Older DS-1s also employ a line coding technique known as Bipolar-AMI (Alternate Mark Inversion). Line coding is used to represent the digital 0s and 1s with electricity to carry the data over long distances. To make Bipolar-AMI function over the up to several miles between a customer site and a serving office, it was necessary to limit the bandwidth of each DS-0 channel of the 24 present in a DS-1 to 56 kbps instead of the 64 kbps possible. This is no problem when the DS-0s are used for voice, but it imposes a limit on the speed of data when the DS-0 channel is used for this purpose. The solution was to develop a new form of line coding known as Binary 8-Zero Substitution (B8ZS), which allows each DS-0 to operate at a full 64 kbps. This clear channel capability is available in many areas, but not all.
As with ESF, B8ZS is always preferred, to enable the user to get the maximum functionality out of his or her access link. The Bipolar-AMI limitation applies to higher speeds as well. If a fractional T1 access link is needed at 256 kbps (4 × 64 kbps), with Bipolar-AMI the total bandwidth available will only be 224 kbps. While perhaps not critical, the difference may be noticeable. It is important to realize that frame relay will function equally well whether accessed with Bipolar-AMI and D4 framing or B8ZS and ESF, or any combination of the two. These two alternatives have been deployed independently in many cases, so care is needed. The whole point is to make sure there is a match between the way the premises equipment functions and is configured, and the way the service provider's equipment functions and is configured. With any network, the fewer surprises, the better.
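Continuing the earlier access_speed_kbps sketch, the Bipolar-AMI penalty is easy to quantify: each DS-0 yields only 56 kbps, so the same four contiguous channels deliver 224 kbps rather than 256 kbps.

```python
# Same hypothetical helper as above, with the Bipolar-AMI rate per channel.
print(access_speed_kbps([4, 5, 6, 7], kbps_per_channel=56))  # 224
```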
Dial Backup
Diverse routing on a fiber ring will go a long way toward avoiding the single point of failure that a UNI represents. However, rings are not available in all locations from all frame relay service providers. In such cases, it is possible to configure a dial backup port that can be used when the leased-line UNI is out of service. Most hardware FRADs will support a dial backup port, especially for 56/64 kbps access lines. Naturally, the frame relay service provider must provision a number of ports at the local switch to allow such access to take place. These dial backup ports are often perceived as a tempting target for hackers or crackers, or others who seek to enter networks without authorization, so they should be used with caution and with added security measures such as password protection and encryption. The actual use of the dial backup might even be totally transparent to the customer.

There are two ways to dial around UNI failures in frame relay. The first method is the most common and simply substitutes a dialed 56/64 kbps link for the leased-line UNI in the event of a failure. The second, found in some older frame relay networks, extends the dialed connection from one customer site FRAD directly to another customer site FRAD. The drawback of this approach is that no other sites on the customer's frame relay network are accessible beyond the two sites directly linked. This option is seldom used in newer frame relay networks.
Multihoming and Multiport UNIs

Sites with single UNIs and without fiber rings can also have a backup UNI that is used only when the main UNI is unavailable. In fact, this second UNI does not even have to lead to the same frame relay switch site as the main UNI. This practice is known as multihoming. Multihoming not only can protect from UNI failure, but it also can protect from a frame relay switch outage. The major drawback of a multihomed UNI is that the customer might be paying for connectivity that is not used to its fullest. The DLCIs defined on a primary UNI must be duplicated on the other UNI leading to the second switch. So there is added protection against failures, but not added efficiency.

The whole idea behind diverse routing of UNIs, dial backups, and multihoming is to provide a customer site with protection against UNI failures. The latest way to provide such UNI protection is to use inverse multiplexing for frame relay, which is called the multiport UNI by the Frame Relay Forum. The multiport UNI is actually two local access links that behave like one UNI. The two links can be diversely routed, of course, but they are not usually multihomed. Both the customer site FRAD and the service provider switch must support the multiport UNI option. There is only one set of DLCIs defined. Under normal operating conditions, both links are used to handle traffic on all defined DLCIs. In its most basic form, the multiport UNI uses two 64 kbps links to provide what seems to be a 128 kbps UNI. In the event of a failure on a single 64 kbps link, the only thing that the user might notice is a decrease in throughput as the UNI operates at 64 kbps until the second link is restored to service. This basic operation of a multiport UNI is shown in Figure 4.9.
Figure 4.9 The multiport UNI.

It is possible to configure four-port multiport UNIs or even other combinations. In the case of four-port UNIs, the throughput is 256 kbps. Multiport UNIs can also be used to provide fractional T1 access speeds where fractional T1 speeds are not ordinarily available. This has become one of the most common uses of multiport UNI equipment. Of course, the protection against failures is still an attraction.
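A quick sketch of the capacity arithmetic (the function and its names are illustrative): several physical links behave as one UNI, so a link failure degrades throughput rather than cutting the site off.

```python
# Sketch of multiport UNI capacity under link failures.

def multiport_capacity_kbps(link_states: list, link_kbps: int = 64) -> int:
    """link_states: True for each working link, False for a failed one."""
    return sum(link_kbps for up in link_states if up)

print(multiport_capacity_kbps([True, True]))   # 128: both links up
print(multiport_capacity_kbps([True, False]))  # 64: degraded, not cut off
print(multiport_capacity_kbps([True] * 4))     # 256: a four-port UNI
```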
Analog Modems and Switched Access
It sometimes comes as a surprise to those used to private line networks that frame relay works as well as it does with UNI speeds as low as 56/64 kbps. Of course, this is due to the bursty nature of client/server LAN applications, and to the efficiency and effectiveness of modern compression techniques applied to voice and video digital data streams. But the bursty nature of frame relay applications extends to UNI speeds even lower than 56/64 kbps. It is even possible for a frame relay UNI to operate at analog modem speeds as low as 33.6 kbps. Many users have home PCs that include 56 kbps modems, whether or not compliant with the V.90 standard. Even these modems still operate at 33.6 kbps upstream, out of the PC. It is only in the downstream direction that 56 kbps modems function at the full 56 kbps, and only under certain circumstances at that.

Many of these same home users still need to access their employer's corporate network. In this case the home PC is the client and the server might be based on the organization's frame relay network. Such small office, home office (SOHO) workers or telecommuters no longer need be left out of the organization's network. While it is true that home workers could access a corporate network over the Internet, this type of access is considered quite insecure and not to be trusted for many types of transactions, especially those of a financial or confidential nature. The lack of any quality of service guarantees at all on the Internet has already been discussed. So frame relay access over analog modems is desirable from both a security and a QoS perspective.

In the case of analog access, the home PC simply runs a frame relay software package and dials in to a special modem-based frame relay switch port. A PVC is defined on this port to run to the home worker's server site within the organization. The PVC has a DLCI and a CIR, naturally. It is even possible to configure more than one PVC on the analog UNI, but this is not common, due to both traffic load and security considerations. The connectivity is still through the PVC and not through any type of frame relay SVC arrangement, but this is seen as a security feature and not as a limitation.

The use of analog modems to provide dial access to PVCs brings up other alternative frame relay UNI arrangements. These alternate arrangements are all distinguished by providing switched access to the PVCs defined on the UNI. The whole suite of possibilities includes both dialup digital access (such as that used for backing up a dedicated UNI channel) and dialup analog access. The term switched access is used to avoid confusion with true SVC service on a frame relay network, established through the use of a signaling protocol based on Q.933.

The remaining way that a frame relay UNI can be provided in a switched access environment is by way of an Integrated Services Digital Network (ISDN). ISDN as a service offering has suffered from a variety of woes over the years, but has enjoyed renewed popularity for high-speed access to the Internet. In this same fashion, ISDN can be used for access to a frame relay network instead of (or along with) Internet access. Using ISDN to support frame relay is actually a very good fit with the intentions of ISDN. The original data protocol used on an ISDN was X.25, the forerunner of frame relay by way of LAPD. Why not replace the functions of X.25 inside an ISDN with frame relay?
The main drawbacks are: (1) merging ISDN and frame relay service offerings might have revenue repercussions for the service providers who now have separate incomes from the two services, and (2) putting extremely bursty and long holding-time frame relay traffic through a network switch designed primarily for voice might not be the smartest thing to do. So tight frame relay integration with ISDN will not happen soon, if at all. Usually, ISDN just provides access to a remote frame relay switch.

ISDN access to frame relay would involve having a FRAD connected not to a leased private line UNI, but to the site's ISDN network termination type 1 (NT1) device. The FRAD could share the ISDN access line (typically a 1.5 Mbps Primary Rate Interface or PRI, but not always) with the site's PBX or other ISDN devices. All the voice calls would still use 64 kbps ISDN B-channels on the PRI, but the FRAD could also use a B-channel for access to an ISDN device that represents the UNI on a frame relay network. In this case the frame relay network replaces the X.25 network cloud at the other side of the ISDN. If SVCs are supported, the SVCs must still be set up by the frame relay network. If only PVCs are supported, the PVCs must still be configured separately. All the ISDN does is provide the access method. ISDN access to a frame relay network is shown in Figure 4.10. This is basically what the relevant ITU-T Recommendation calls "Case A frame mode service."
Figure 4.10 ISDN access to frame relay.

It is even possible to provide packet data or D-channel frame relay support on an ISDN. In this case, messages on the ISDN D-channel are in the form of frame relay frames and not packets inside LAPD frames. All of these UNI options are simply ways to gain access to the PVCs, and the DLCIs that represent the PVCs, defined on the frame relay network. Leased lines remain the most common method by far. But there is another way to provide connections and their DLCIs on a frame relay network. This method involves the use of a signaling protocol, Q.933, to provide demand connections on frame relay networks. Although such SVCs are not yet common, the topic of SVCs and signaling on a frame relay network deserves a chapter all its own.
Chapter 5: Frame Relay Signaling and Switched Virtual Circuits

Overview

Frame relay is a form of fast packet switching. This means that frame relay switches, the network nodes of a public frame relay network, are capable of switching packets fast enough to satisfy any application carried inside the packets, including compressed voice and video. Packet switching has been around for a while in the form of X.25; it has its roots as far back as IBM's SNA and early Internet protocols before TCP/IP. The essence of packet switching is that individually addressed data units called packets all flow on the same shared link, one after another on this virtual circuit or virtual channel, without the applications whose content is inside the packets needing any dedicated bandwidth or channel to function correctly. (Networks that rely on channels with dedicated bandwidth to function correctly are known as circuit-switched networks.)

In connection-oriented packet protocols like X.25 and frame relay, the individual address is a locally unique connection identifier, the data link connection identifier (DLCI) in the case of frame relay. In connectionless packet protocols like TCP/IP and many LAN-based protocols, the individual address is a globally unique end-system (host in TCP/IP) identifier, the fully qualified IP address in the case of TCP/IP. Packets are routed along connection paths in packet networks like frame relay. Packets are all routed independently in packet networks like the TCP/IP-based Internet.

Packet networks like frame relay need a connection to be set up between source and destination, and typically between each and every network node, before the first packet makes its way from source to destination. The frames in frame relay all say something like, "Send this packet (or piece of packet) inside the frame on connection 22." Packet networks like the Internet do not need any connection at all between source and destination before packets are sent into the network. There might be an end-to-end connection at a higher layer than is present in an IP router (that is, there might be a connection at the TCP layer), but the point is that there are no IP connections between routers or users. The packets in IP networks all say something like, "Send this packet from source address A to destination address B."

Note the choice of wording in the previous paragraph. In both cases, the fundamental unit of exchange is the packet. The packets are inside frames in all cases, but the frame is of minimal interest to the application since frames only flow hop-by-hop on each link through the network, whatever type of network it may be. It is the packet that is the fundamental unit that leaves here and arrives there unchanged by the network. In both cases the networks route the packets or the frames containing the packets. The Internet is as much a packet-switching network as frame relay. But the Internet network node is called a router and the frame relay network node is called a switch. The reasons for this terminology difference have already been discussed and need not be repeated here. However, there are fundamental differences between connection-oriented networks like frame relay and connectionless networks like the Internet. The presence or lack of network-level connections is the most fundamental difference of all. Of course, in packet switching the connection is not a circuit or channel, but a virtual or logical channel or circuit.
Connection-oriented networks such as frame relay require the presence of virtual circuits to provide the path for the user data transfer frames to follow across the network. The virtual circuit also provides a basis for the bandwidth guarantees and the quality of service that a frame relay network provides, but this is a different matter. What is important here is the presence of a connection (a frame relay virtual circuit) in a frame relay network to provide the path through the network for user data to follow.
The question is: where do the connections come from? This seemingly simple question actually has profound implications for the future not only of frame relay networks, but of all connection-oriented networks in general. This book has already pointed out that there are really two distinct types of virtual circuits. The permanent virtual circuit (PVC) is the packet-switching equivalent of the leased private line. The PVC is available 24 hours a day, seven days a week, so there is always a path established through the network from A to B. There are also switched virtual circuits (SVCs). The SVC is the packet-switching equivalent of a dialup modem connection. The SVC is only available after a connection phase or call setup phase, which takes some finite amount of time for the network to complete, so it is not really transparent to the user. After a data transfer phase, in which the information of importance to the user application is exchanged across the network, there is a further disconnect phase or call clearing phase, transparent to the user, after which there is no longer any connectivity between A and B across the network.

Different terminology has been applied to different types of connection-oriented networks. But whether connection or setup, disconnect or clearing, the ideas are exactly the same. The whole connection-transfer-disconnect process is usually known as call control in many networks, including frame relay. There are three distinct phases to the whole call control process. The first is the call establishment phase, which sets up a connection. The second is the information (or data) transfer phase, where the endpoints exchange whatever information (voice, video, or data) the connection was established to exchange in the first place. Finally, there is a call clearing phase, after which the endpoints are disconnected, exactly the same state as they were in before the process began. The concept of call control for SVCs in a frame relay network is shown in Figure 5.1.
Figure 5.1 Switched virtual circuit call control.

There are some implementation quirks to this connection process that should be pointed out. The whole process starts off when a user issues a call setup request across the local UNI into the network. This originator process is not as simple as a user typing in the frame relay network address of the recipient and pressing Enter. In frame relay the originator is usually a software process in the FRAD (or, rarely, software in the end-user device) that sends a signaling message across the UNI to the signaling software process in the frame relay switch (the local network node in this example). If the two endpoints are not serviced by the same network node, then the call setup message must be sent through the network to the remote frame relay switch. In Figure 5.1, this signaling message shuttling is indicated by a broken line.

The only special action that the frame relay network has to take is to find the remote network node and UNI switch port where the recipient device is located. All the local network node has to go on is the destination frame relay network address. The recipient of the call setup request message might be in the next town or across the country. The switches' signaling protocol routing table must be able to find the destination anywhere on the frame relay network. It must also be able to determine the best path at the moment over which to set up the connection. The local frame relay switch must be able to route new call setup requests around congestion or failed network links, for example. And, if the connection is billed by connection time (no other fair means of charging has ever been implemented successfully), then the billing software must be engaged as well in anticipation of the new connection.
At this point, with the exception of the billing part, it might start to sound as if the routing of a frame relay call setup request (and all networks that support SVCs must do the same) is suspiciously like the routing of an IP packet through the Internet. Both setup message routing and IP routing employ full, globally unique network addresses to route each data unit independently; both involve routing table lookups; both rely on updated network topology information to choose the best path; both dynamically reroute around network failures; and so on. Of course, the reason for this similarity is that routing setup messages and routing IP packets are essentially the very same process! That is, all connection-oriented switching networks that support SVCs must also support connectionless routing. Otherwise there could never be any new SVCs established at all. The main difference between connection-oriented SVC networks like frame relay and connectionless IP networks like the Internet is that IP networks treat every packet like a call setup request. Once this realization is made, people's familiarity with how IP routers function makes it much easier to understand exactly what a frame relay switch does with a call setup request. The call setup message emerging from the network at Step 1a in Figure 5.1 is nothing more than the delivery of a connectionless packet over the frame relay network, with all the features that the process implies, such as a routing protocol running between switches to provide topology updates, and so on.

Early Internet documents were often fond of pointing out that while connection-oriented networks like the PSTN had to support both connectionless and connection-oriented networking, the Internet only had to implement connectionless services. So, obviously, the Internet's structure was much simpler than the PSTN's. This line of thinking is seldom seen today, especially with all the talk in IP circles about IP flows (flows are basically IP connections, but purists cringe at the very thought of connection-oriented IP), but that does not mean the argument is not valid.

Note that these features to enable SVCs are totally independent of the end user, FRAD, or frame relay switch support for the frame relay signaling protocol itself. The presence of a signaling protocol alone does not mean that a frame relay network suddenly can support SVCs. Of course, the absence of such signaling protocol support in any needed component does prevent the network from supporting SVCs.

In any case, Step 2 shows the local network node issuing a call proceeding message to the originator. In the PSTN, older voice switches can still generate call proceeding tones, which are meant to indicate to the user that the network is working on the request, and which in truly old switches were the actual signaling tones themselves. In most modern voice switches, there is just silence on the line. In frame relay, the call proceeding message tells the originator to hold on because the far end is being contacted but has not responded yet. Although not shown in the simple figure in this text, it is common for the originator to receive the call proceeding message before the call setup message has made its way through the network to the recipient. This does not materially change the procedure. Step 3 shows the recipient accepting the new connection. The recipient can also choose to reject the call setup request for a variety of reasons.
Network printers routinely reject call setups when they are already printing a job (they can't reasonably intersperse pages anyway) or when they are simply out of paper. End computers might reject new connections due to current traffic load. The reasons are many and have nothing to do with the network itself, in most cases. It is worth noting that the network can also reject a request for a new connection, mostly due to traffic congestion considerations (the requested CIR cannot be guaranteed). In Figure 5.1, the recipient accepts the connection and sends a connect message back to the originator. This message now follows the path set up through the network by the routing of the outbound call setup message, so no independent routing is required for the connect message. Upon receipt of the connect message by the originator, the connection is ready for use. The messages exchanged to this point also contain all of the information needed to allow the endpoints to determine the DLCIs used in both directions, the CIRs, and so forth.
Once the connection is no longer needed, one party or the other issues a disconnect message, which initiates the call clearing phase. The decision to disconnect is usually made by the end users, but the network can release connections that remain unused for predefined periods of time. Since disconnects by the originator are more common, this is the interaction shown in the figure. The disconnect message follows the same path as the information through the network and pops out at the recipient. Thereafter, the two network nodes involved (sometimes there is only one, as previously noted) operate independently. Step 2 shows the local node issuing a release to the originator of the disconnect message. If this message is issued before the network knows that the disconnect has been received by the recipient, the process is known as an abrupt release. If the network waits until the recipient issues its own release, as shown in Step 1b, then this is known as a graceful release, which is somewhat uncommon. If a network relies on SVCs to generate revenue and conserve network resources (typical), then the network naturally wants to release the buffers and bandwidth tied up with a given connection as soon as possible. In either case, the originator issues a release complete in Step 3 while the recipient independently receives a release complete in Step 1c. There is no requirement for the sequencing of the two release-complete messages. Once either end issues a disconnect, that's enough for the network.

The actual implementation of a signaling protocol on a network can be quite complex. This simple example cannot begin to address all of the issues involved at each step of the process. Some, but not all, of the details regarding these issues and some of the additional information conveyed in the messages themselves will be discussed later. However, all of the details are not necessarily needed to understand the functioning of frame relay SVCs in general.

So connections come from a connection setup phase between user (user A, for example) and network. This phase is only needed when there is no PVC or preexisting SVC between A and B established on the network. Most frame relay networks today offer PVC service and that is all. But some frame relay service providers have begun to offer SVCs, at least in limited circumstances. The reasons for these limitations will become apparent once a full discussion of signaling protocol implementation is completed.
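Before turning to the message formats, a short sketch may help fix the difference between the two release styles just described. Everything below is illustrative; only the message names come from the text.

    # Sketch of the two release styles: message names follow the text,
    # the sequencing logic is illustrative only.

    def clear_call(graceful):
        log = ["originator: disconnect"]
        if graceful:
            # Graceful: wait for the recipient's release before answering
            # the originator (uncommon; ties up resources longer).
            log += ["recipient: release", "network -> originator: release"]
        else:
            # Abrupt: release the originator at once, before the network
            # knows the recipient has seen the disconnect (typical).
            log += ["network -> originator: release", "recipient: release"]
        # Both ends finish independently; no required ordering here.
        log += ["originator: release complete", "recipient: release complete"]
        return log

    for style in (False, True):
        print("graceful" if style else "abrupt", clear_call(style), sep=": ")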
Frame Relay Signaling Messages

The SVCs established using the FRF.4 subset of full Q.933 look pretty much like Q.933 messages, but FRF.4 messages are not all Q.933 types in all circumstances. Q.933 messages, in turn, look like the Q.931 messages first established for ISDN signaling on the D-channel, but again Q.933 is only a subset of full Q.931. It should be noted that none of this is actually in the FRF.4 document, which is fond of referring readers to some section of Q.933. But Q.933 is fond of referring readers to sections of Q.931 (and even the related Q.932), so without all three documents in hand, it is hard to figure out exactly what is supposed to be going on. This section of the book should therefore bring a lot of ideas and information together.

All frame relay signaling messages have a common format. They are all carried inside a frame relay frame flowing on DLCI = 0, so they are relatively easy for a FRAD and a network to discover (there are other things that flow on DLCI = 0, but these are discussed in a later chapter). When the frame relay frame is used for signaling information (DLCI = 0), the first two octets after the frame relay header are a Control field. Since more than just signaling protocol messages use DLCI = 0, this Control field allows receivers to figure out just what is inside the frame. When used to carry an SVC-related signaling message, the Control field takes the form of what is known as an Information frame or I-frame. The I-frame structure is identified by a 0 bit at the end of the first octet (the least significant bit) of the Control field. The second octet ends with what is known as the Poll bit. When set (1), the Poll bit tells the receiver that the sender expects a response frame. When the Poll bit is not set (0), the receiver need not respond to the frame. The other seven bits in each octet form the N(S) and N(R) fields. All information frames are numbered, and these sequence numbers are used as a means for receivers to figure out whether any frames are missing in a given sequence of signaling messages. The N(S) field is the sequence number of the frame being sent and the N(R) field is the sequence number of the next-expected I-frame (signaling message). The numbers cycle from 0 to 127, then repeat. The overall structure of a frame relay frame carrying an FRF.4 signaling message is shown in Figure 5.2. The similarity with the LAPD frame is notable, and just another confirmation of frame relay's origins.
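As a quick illustration, the two-octet Control field just described can be decoded with a few shifts and masks. This Python sketch assumes the bit layout given above (the sequence numbers occupy the upper seven bits of each octet); it is not taken from any standard's reference code.

    # Decoding the two-octet Control field (modulo-128 I-frame format).

    def parse_control(octet1, octet2):
        if octet1 & 0x01:
            raise ValueError("not an I-frame: LSB of first octet must be 0")
        ns = octet1 >> 1          # send sequence number, 0..127
        nr = octet2 >> 1          # next-expected sequence number, 0..127
        poll = octet2 & 0x01      # 1 = sender expects a response frame
        return ns, nr, poll

    # Example: sending frame 5, expecting frame 9 next, response requested.
    print(parse_control(5 << 1, (9 << 1) | 1))   # -> (5, 9, 1)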
Figure 5.2 Frame relay frame carrying an FRF.4 SVC signaling message.

The figure shows not only the two-octet Control field structure, but also the overall structure of the entire Information field as well. All frame relay signaling messages, identified by the DLCI = 0, I-frame format, start out with a five-octet signaling message header. The header has three fields. The first is a one-octet Protocol Discriminator field, which is used to identify the exact signaling protocol used. For frame relay, this field is set to 00001000, which is simply the number 8 (08h in hexadecimal).
The next three octets in the required signaling message header form the Call Reference field. The first four bits of this field are always 0000 and the next four bits give the exact length of the call reference value itself. In frame relay, this value is 0010, or 2, meaning that the call reference value itself is two octets long. The value is carried in the final two octets (16 bits) of the three-octet Call Reference field. One of these bits is a flag bit, leaving 15 bits for the actual call reference value itself. The flag bit is set to 0 at the originator side of the frame relay network and set to 1 at the destination side of the frame relay network. This prevents a phenomenon known as glare, which can happen when both endpoints happen to pick the same call reference value for an incoming and an outgoing call. The call reference number is essentially how the frame relay network identifies a connection internally, a mechanism that works beyond the DLCI number. Like DLCIs, call reference values are of local significance only; there can be many calls with the same call reference value around the network, but only one of a given value on a given UNI.

At first glance, it might not be apparent why a number other than the DLCI would be helpful to the network when SVCs are supported on a frame relay network. After all, PVCs work just fine with only the DLCI to go on. The key here is that SVCs come and go, and each SVC needs a DLCI only while it is established and the connection is being billed. There are fewer than a thousand DLCIs that can be used for SVCs, which sounds like a lot, but really isn't. If a frame relay SVC is used as much as a typical Web session (or for a typical Web session), about 30 minutes or so, then 50 users will establish, use, and release 100 connections per hour across a frame relay UNI. Over a 10-hour work day, that works out to 1000 SVC connections, more than could be tracked by expecting the network to give each one a distinct DLCI for the life of the connection. Admittedly, this example seems high, but the point is that at some level of SVC activity, billing by DLCI alone could cause confusion on the part of the network. So the call reference system, with more than 32,000 distinct values to use (15 bits), gives the frame relay service provider a much greater range of identifiers with which to track SVCs and their DLCIs internally and bill users properly. An SVC will have different call reference values on each UNI, however, and the standards do not specify exactly how a frame relay network could or should use these call reference values, other than to say that call reference values are temporary identifiers of SVCs. The internal use of the call reference values to track SVCs is up to the individual hardware and software vendors, and service providers.

The last octet of the five-octet signaling message header shown in Figure 5.2 is the Message Type field. The first bit of this field is always 0 and the other seven bits are used to indicate to the receiver whether the signaling message is a call setup, disconnect, or whatever. After the five-octet header, the frame relay signaling message has many possible structures, depending on the value of the Message Type field in the signaling message header. The rest of the Information field consists of a variable number of Information Elements (IEs), and each IE has a variable length depending on its type. There has to be at least one IE present in all signaling message types.
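Putting the pieces together, the five-octet signaling message header can be sketched as follows. The Python below is illustrative; the message type value 0x05 is the Q.931 code for Setup, which Q.933 reuses, but treat the code and the flag bit position (the Q.931 convention) as assumptions, since the text here does not spell them out.

    # A sketch of building the five-octet signaling message header of
    # Figure 5.2.

    def build_header(call_ref, message_type, from_destination=False):
        assert 0 <= call_ref < 2**15             # 15 usable bits
        flag = 0x80 if from_destination else 0x00   # glare-avoidance flag bit
        return bytes([
            0x08,                            # protocol discriminator: 08h
            0x02,                            # 0000 + call reference length (2)
            flag | (call_ref >> 8),          # flag bit + high 7 bits of value
            call_ref & 0xFF,                 # low 8 bits of call reference
            message_type & 0x7F,             # first bit 0 + 7-bit message type
        ])

    # 0x05 is assumed here to be Setup, per Q.931.
    print(build_header(call_ref=12345, message_type=0x05).hex())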
All of the IEs are either one octet long (and of two formats, Type 1 or Type 2 IEs) or two or more octets long (a variable-length IE). The single-octet IEs begin with a 1 bit and the variable-length IEs all begin with a 0 bit. A complete signaling message in frame relay is simply a frame on DLCI = 0 with the Control field for an I-frame that contains the five-octet signaling message header and one or more IEs.

IEs themselves can become a bewildering array of the most seemingly arcane information that could be imagined. Many IEs seem to encompass the most minute details of connection behavior. Both of these statements are true. The point is that a frame relay network supporting SVCs must be able to gather all of the same types of information from the user requesting a connection as from a user requesting a PVC. With PVCs, however, the interaction is human to human, and it is relatively easy to see what will work well and what is not such a good idea. The DLCIs must be unique on the UNI, the CIR must not exceed the booking policy, the total DLCIs must not exceed the supported number of connections, and so on. With SVCs, the frame relay network has to figure all of this out on-the-fly, in real time, without the guiding hand of a human anywhere in the process to say "Wait a minute! This is a dumb thing to do, and the UNI or switch might fail..."
Frame Relay Information Elements
FRF.4 uses a subset of the full Q.933 signaling message types (call setup, disconnect, etc.) to handle SVCs. Fortunately, FRF.4 also uses a subset of the full range of IEs established for Q.933 SVCs. Some of the IEs are mandatory (M) for a given message type and must appear, while others are optional (O) and can be absent. All of the IEs used in FRF.4 are more than one octet long, so there are no Type 1 or Type 2 single-octet IEs in FRF.4. Each variable-length IE used in FRF.4 has the common format shown in Figure 5.3.
Figure 5.3 FRF.4 Information Element (IE) format.

All of the FRF.4 IEs start with a 0 bit, indicating a multiple-octet IE. The second octet always contains the length of the contents of the IE itself, not the length of the entire IE, as might be expected. The remaining octets, and there might be many, contain the values of all the fields of the IE itself. All of the IEs have distinct numerical identifiers. Because some are mandatory and some are optional, they might or might not be present in a signaling message. So, to make life easier for receivers, when loaded into a single signaling message all of the IEs must be in ascending numerical order. This makes it easy for a receiver to tell whether a given IE is present. Table 5.3 shows the IE identifier coding for the FRF.4 IEs used for SVCs, the mandatory (M) or optional (O) fields by message type, and the IE's maximum length in octets (where applicable).

Table 5.3 FRF.4 Information Elements Used for SVCs
SVC message types: 1 = SETUP, 2 = CALL PROCEEDING, 3 = CONNECT, 4 = DISCONNECT, 5 = RELEASE, 6 = RELEASE COMPLETE

Identifier   Information Element               1  2  3  4  5  6   Max. Length
000 0100     Bearer Capability                 M                  5
000 1000     Cause                                      M  M  M   32
001 1001     Data Link Connection Identifier   M  M  M            6
100 1000     Link Layer Core Parameters        O     M            27
100 1100     Connected Number                        O
100 1101     Connected Subaddress                    O            23
110 1100     Calling Party Number              O
110 1101     Calling Party Subaddress          O                  23
111 0000     Called Party Number               O
111 0001     Called Party Subaddress           O                  23
111 1000     Transit Network Selection         O
111 1100     Low Layer Compatibility           O                  14
111 1110     User-user                         O     O            131
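The encoding rules of Figure 5.3 and the ordering rule above are simple enough to sketch. In this illustrative Python fragment, the identifier is the DLCI IE from Table 5.3, but the contents octets are placeholders, not a real DLCI encoding.

    # Encoding a variable-length IE per Figure 5.3: identifier octet
    # (leading 0 bit), then the length of the contents alone, then the
    # contents themselves.

    def encode_ie(identifier, contents):
        assert identifier < 0x80            # leading 0 bit: multi-octet IE
        return bytes([identifier, len(contents)]) + bytes(contents)

    def encode_message_ies(ies):
        # IEs must appear in ascending identifier order, as the text notes.
        out = b""
        for identifier, contents in sorted(ies):
            out += encode_ie(identifier, contents)
        return out

    dlci_ie = encode_ie(0b0011001, [0x07, 0x04])   # contents are placeholders
    print(dlci_ie.hex())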
Anyone familiar with the tables presented in Q.933 or even Q.931 can appreciate the compactness of the FRF.4 IE list. There are a minimal number of IEs, and most are optional. Only a few are needed for all message types and, of these, only the SETUP message has a complex set of options to deal with. A few words about the function of each IE are definitely in order.

Some of the IEs are relatively self-explanatory in nature and function. For instance, an SVC has no DLCI assigned initially, so it only makes sense that the Setup, Call Proceeding, and Connect signaling messages must contain a DLCI information element. In general, the DLCI IE value can be requested by the user or assigned by the network. The network will try to grant the user's DLCI request, but always reserves the right to allocate another DLCI value for the SVC. FRF.4 only uses the DLCI IE in the network-to-user direction, however. Likewise, the Disconnect, Release, and Release Complete messages must have a Cause associated with them to inform the endpoints why the connection is being dropped.

Frame relay network addresses can consist of what basically amounts to a site identifier (this UNI) and an additional subaddress (this port on the FRAD or this software process on an end-user device). The presence of these address IEs as options in Setup and Connect messages is therefore neither exciting nor remarkable. In an equal-access environment, where users have the right to choose an IXC (transit network) regardless of the LEC used on each end of the frame relay network, the presence of a Transit Network Selection IE is only to be expected. In fact, this IE was included in FRF.4 for future use only, but can be present nonetheless.

The remaining four IEs (Bearer Capability, Link Layer Core Parameters, Low Layer Compatibility, and User-user) are a little more complex. The Bearer Capability is the basic IE and must be present in the Setup message. This IE is used by the network to identify a network service. There are lots of services that could be supported by a fast packet network such as frame relay. For now, the Bearer Capability IE indicates frame mode as the transfer mode, that is, the means by which information travels across the network. The Bearer Capability IE also says that these frames are Layer 2 protocol frames, and that the information transfer capability of the SVC will be unrestricted digital information. Taken together, the Bearer Capability IE defined in FRF.4 is just another way of saying that the network does not need to look inside the information frames for any reason whatsoever.

The Link Layer Core Parameters IE is the most complicated and must be present in the Connect message in the network-to-user direction. This IE is optional in the user-to-network direction (few users would know these parameters anyway). There are four main network parameters that must be set for each and every DLCI on the frame relay network. For PVCs, these parameters can be established through human contact, a service agreement, or some other mechanism. For SVCs, these parameters must be established on-the-fly, in real time. The four main parameters are the Frame Relay Information Field (FRIF) maximum size, the throughput needed for the connection (call), the committed burst size, and the excess burst size. The committed and excess burst sizes are used to determine if and when frames may be tagged as discard-eligible or discarded outright, as described in the previous chapter.
The throughput parameter is the equivalent of the CIR and allows the network to determine the proper CIR for the new connection. All four parameters are specified (and can differ) in both directions, outbound and inbound. This provides further evidence of the inherently bidirectional nature of frame relay connections, no matter how they might be billed by the service provider.
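A sketch of the parameter set may make the bidirectional point clearer. The field names and sample values below are invented for illustration; only the four parameters themselves come from the standard.

    # The four Link Layer Core Parameters, carried independently for each
    # direction as described above.

    from dataclasses import dataclass

    @dataclass
    class CoreParams:
        frif_max: int      # maximum frame relay information field size, octets
        throughput: int    # bits per second; plays the role of the CIR
        bc: int            # committed burst size, bits
        be: int            # excess burst size, bits

    # Outbound and inbound need not match: an asymmetric SVC.
    outbound = CoreParams(frif_max=1600, throughput=64_000, bc=64_000, be=32_000)
    inbound  = CoreParams(frif_max=1600, throughput=16_000, bc=16_000, be=8_000)
    print(outbound, inbound, sep="\n")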
The Low Layer Compatibility IE is a number of fields that in some ways resemble the fields of the Bearer Capability IE. That is, this IE gives the network and the other end of the SVC further information about the Layer 2 and Layer 3 protocols that will be used on the new connection. The Setup message can include this information and additional details, such as the user's data rate or what flow-control mechanisms the end users intend to employ. The whole intent is to allow an intended destination on an SVC to decide whether it makes any sense to accept the connection at all if there are concerns that the two end processes cannot communicate due to lower-layer incompatibilities. Higher-layer incompatibilities might still be a problem, of course, but this is not the concern of the frame relay network itself under FRF.4.

Finally, the User-user IE provides a means for the users at the ends of a not-yet-established SVC to transfer up to 131 octets worth of information, in order to provide some miscellaneous information from one user to another. For example, the User-user IE can be used to convey a password to an endpoint that is needed before the destination will accept the SVC from an originator. The User-user IE can also be used to fill the pipe in the most efficient fashion. This IE is optional in the Setup and Connect messages. Remember that to a user, the call setup delay is added to the overall end-to-end delay through the network. It might take only 1 second to transfer the user information, but if the call setup delay is 5 seconds long, the user perceives the delay for the SVC service to be 6 seconds. Therefore, if some user data can be transferred as the connection is being set up, this reduces the perceived delay on the part of the user. In fact, for short interactions, the use of the User-user IE can mimic a kind of connectionless service on the frame relay network, since all call setups are routed independently through the frame relay network.

Those interested in more details about the actual bit structures of frame relay signaling messages for SVCs and all of the IEs are referred to the relevant ITU-T, ANSI, and FRF documentation listed in the bibliography. For the purposes of this chapter, it will be enough to show a frame relay SVC call setup message with all its IEs, mandatory and optional. This call setup message is shown in Figure 5.4.
Figure 5.4 Frame relay SVC call setup message.

Some texts tend to become excited about signaling protocols and messages for SVCs. But the real excitement of signaling protocols and messages for SVCs is not in their structure, but in their use. This topic will occupy the rest of this chapter.
The Q.933 Signaling Protocol and Frame Relay Forum's FRF.4

It has already been mentioned that the frame relay LAPF core provides a basic, PVC-based data transfer service from FRAD to FRAD across the frame relay network. In order to offer SVC-based data transfer services, it is necessary for the frame relay network and FRAD to support some form of signaling protocol. At this point there are two main possibilities for adding this SVC support. The two main methods are to use ISDN to access a frame relay network and set up frame relay SVCs (non-ISDN SVCs), or to make frame relay the data service part of the ISDN network and set up frame relay SVCs the same way that any other connections are made on the ISDN (ISDN SVCs). If ISDN is used to access a frame relay switch, or even if point-to-point leased lines are used as a UNI, then it is possible to use the Frame Relay Forum's User-to-Network SVC Implementation Agreement (FRF.4). The FRAD and network understand the same signaling protocol; it is based on the signaling protocol used in the ISDN scenarios. This chapter will outline the use of non-ISDN and ISDN SVCs, but will mostly emphasize the use of the Frame Relay Forum's FRF.4 as the way that frame relay service providers and FRAD vendors currently implement SVCs in frame relay.

The ITU-T tends to see frame relay as another thing that ISDN can do. There is nothing wrong with this, but service providers have tended to deploy, market, and sell ISDN and frame relay in an entirely separate fashion. The ITU-T recommendation that addresses frame relay signaling issues and establishes the signaling protocol for frame relay networks is Q.933, which has the mind-blowing title of Integrated Services Digital Network (ISDN) Digital Subscriber Signalling System No. 1 (DSS 1)—Signalling Specifications for Frame Mode Switched and Permanent Virtual Connection Control and Status Monitoring. As the title promises, there is much in Q.933 that concerns PVCs and status monitoring. This is in line with the ITU-T philosophy of considering anything that is not information transfer on a network to be signalling (note the double "l"). The related topics in Q.933 are further considered in later chapters. For now, it is what Q.933 has to say about SVCs that is of interest.

Q.933 says that there are two ways that a Frame Relay Bearer Service (FRBS) can use ISDN to establish demand connections (SVCs). The term FRBS refers to the information transfer aspects of a frame relay network as part of an overall ISDN. Q.933 calls these two ways Case A and Case B. Case A uses ISDN to access a Remote Frame Handler (RFH), which is the frame relay switch. Once this initial ISDN connection is established using ISDN Q.931 messages, the signaling endpoint can generate the proper frame relay Q.933 signaling messages to establish an SVC. Case B considers the frame relay switch (frame handler to Q.933) to be local to the ISDN switch and therefore essentially integrated with the ISDN switch. So Q.933 messages can be used directly and immediately, without the need for an ISDN Q.931 connection first. This still works because Q.933 is a subset of the full Q.931 signaling protocol. The problem is that neither Case A nor Case B is often encountered in the real world of ISDN and frame relay. More typically, there are ISDN lines and there are frame relay UNIs. Signaling messages sent on one just don't find their way to the other.
So real-world frame relay networks usually follow the Frame Relay Forum's User-to-Network SVC Implementation Agreement (FRF.4), which requires no ISDN relationship in the frame relay network or FRADs at all. The FRF.4 document is essentially a subset of full Q.933 signaling, since not everything is needed for SVCs when there is no ISDN around. FRF.4 basically says:
- If ISDN is used, it is used only in Case A scenarios (no frame relay-ISDN integration).
- There are a few exceptions to the full Q.933 Case A signaling message scenarios.
- Q.933 signaling messages are always sent inside LAPF core frames on DLCI 0.
- A UNI can have both PVCs and SVCs established at the same time.
- End systems will have either E.164 network addresses (telephone numbers) or X.121 network addresses (the same as used in X.25), which look like 10-digit telephone numbers anyway.

There are a lot of other things addressed by FRF.4, but most of the document concerns how to pare down the full Q.933 signaling messages to get a useful subset that does not require or even rely on ISDN to function. For example, the full Q.933 message set employs 11 message types in three major categories. FRF.4 keeps the three categories, but cuts the number of message types down to eight. The message types are shown in Table 5.2. As is evident in the table, the message types dropped by FRF.4 are concerned with connection establishment and are mostly geared toward ISDN. For instance, Progress messages allow an originator to tell if an attempted connection is blocked due to lack of network resources. The Alerting message is the Q.933 equivalent of the telephone ringing, and so on. The Frame Relay Forum decided that frame relay SVCs could be supported (and established more quickly) without this additional messaging overhead.

Table 5.2 Q.933 and FRF.4 Message Types
Message Category      Q.933                      FRF.4
Call establishment    Setup                      Setup
                      Call Proceeding            Call Proceeding
                      Progress                   (not used)
                      Alerting                   (not used)
                      Connect                    Connect
                      Connect Acknowledgment     (not used)
Call clearing         Disconnect                 Disconnect
                      Release                    Release
                      Release Complete           Release Complete
Miscellaneous         Status                     Status
                      Status Enquiry             Status Enquiry
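For illustration, the FRF.4 subset of Table 5.2 can be captured as a simple lookup keyed on the Message Type octet described earlier. The numeric codes shown are the Q.931 values, which Q.933 inherits; treat them as an assumption here, since the table gives only the names.

    # The eight FRF.4 message types of Table 5.2, keyed by their (assumed)
    # Q.931/Q.933 message type codes.

    FRF4_MESSAGE_TYPES = {
        0x05: "Setup",             # call establishment
        0x02: "Call Proceeding",
        0x07: "Connect",
        0x45: "Disconnect",        # call clearing
        0x4D: "Release",
        0x5A: "Release Complete",
        0x7D: "Status",            # miscellaneous
        0x75: "Status Enquiry",
    }

    def decode_message_type(octet):
        return FRF4_MESSAGE_TYPES.get(octet & 0x7F, "not used by FRF.4")

    print(decode_message_type(0x05))   # -> Setup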
Who Needs SVCs Anyway?

Frame relay PVCs are logical (or virtual) connections on a frame relay network that are available at any time to send information to a remote site located at the other end of the PVC. In this sense, PVCs are the equivalent of dedicated, point-to-point leased lines on a private data network. But the use of leased lines comes at a price. A leased line will only ever lead to one other network location. Sending information somewhere else with private lines requires another private line and the associated expense. The alternative is to employ SVCs or switched services to reach other locations on the network on an intermittent basis. In this context, the term "switched services" applies to what are loosely called dialup services employing modems and the public switched telephone network (PSTN) to send data over the voice network.

The use of switched circuits instead of point-to-point links is best known from the public voice network. To place a voice telephone call, the user picks up the handset, dials a number, and waits for a connection. The number represents the network address of the remote location (people seldom think of telephone calls in this way, but this is exactly what a telephone number is). The PSTN represents this number in a signaling protocol like Signaling System 7 (SS7) understood by public voice networks and uses this information to indicate to the remote location that a request for a voice connection has been made (the telephone rings). If the connection is successful ("Hello?"), the users may then transfer information (which is why people call in the first place, but again few people think of telephone calls this way). When the transfer, which is usually two-way, is completed, the connection is terminated by hanging up the handset at either end.

What is not obvious about this scenario is that this is exactly what happens on a frame relay network using SVCs instead of PVCs. With SVCs, there is no need to establish a PVC with an associated and dedicated DLCI number at service provision time at all. Instead, the locations only need to establish an access link to a frame relay switch port connection at each end of the network. Connections and paths can then be established dynamically as needed between the sites using a special frame relay signaling protocol. In fact, there is no need to restrict this SVC process to a particular set of remote locations. Literally any location on the frame relay network can receive a connection request, even if the frame relay network address is not known to the originator! This may seem hard to figure out at first, but a similar thing frequently happens on the public voice network when telemarketers just call everyone attached to a given telephone company central office switch. The network address telephone numbers are generated and dialed one after another. The signaling protocol attempts to make the connection (rings the phone) regardless of where or who is making the connection request. This is a security threat in frame relay, not merely an annoyance.

So SVCs are connections that are dynamically negotiated (in terms of CIR [bandwidth] and other parameters) and established between locations attached to the frame relay network. SVCs cannot be established unless the remote location has a frame relay access link and port connection of its own, of course.
This does not mean that PVCs will or should go away. PVCs can still be used to establish virtual private networks (VPNs) between corporate offices on the frame relay network, although VPNs are typically thought of as private Internet or intranet entities today. SVCs would handle the less frequent need to establish connections outside of this corporate network. For example, SVCs could easily handle supplier-to-customer traffic or user support needs as well. The use of SVCs in frame relay networks could cut down on users' PVC costs as well, since PVCs must be paid for on an ongoing basis (although this is generally a very small cost compared to the cost of the frame relay service itself).

Although SVCs can function similarly to dialup modem PSTN services, there are important differences. SVCs do not allow dialup access into the frame relay network over the frame relay UNI, which is often implied in SVC descriptions. The sites still must be connected to the frame relay network with dedicated access links, as with PVCs. The use and deployment of dialup access services is a separate issue and development that is totally independent of the concept and use of SVCs.

This book has emphasized the use of PVCs for frame relay network services instead of SVCs. There are several good reasons for this emphasis. First, PVCs are available now, and very inexpensively (usually a few dollars a month for a PVC in each direction between two sites). Second, the use of SVCs is sometimes seen as a security risk to corporate sites (although security can be added in a number of ways). Finally, the full standard frame relay Q.933 implementation of the SVC signaling protocol is not easy to make work effectively in large frame relay networks. The pros and cons of frame relay SVCs are listed in Table 5.1.

There is a lot of merit to the pro-SVC position in the table. Each PVC established must have a table entry in each frame relay switch. These table entries must be held in memory for speed of access. Too many PVCs can slow the network by making the lookup process slower and can increase service costs by requiring memory upgrades. SVCs keep table sizes to a minimum. And there is no other way to effectively reach sites that were unplanned at service provision time without the use of SVCs. The PVC provisioning process can take days to implement, although 24 hours is a more typical timeframe. But during this period of time a site may literally be unreachable from some places on the frame relay network, even though the connectivity is there.

Table 5.1 Pros and Cons of Frame Relay SVCs

Pros:
- Needed to keep the size of PVC tables to a minimum.
- Needed to reach sites not planned for at service provision time.
- Needed to make frame relay as flexible as possible.

Cons:
- PVCs are inexpensive and table sizes are immaterial.
- SVCs are another unnecessary security risk.
- Adding SVC signaling protocol support to a frame relay network is not easy.

Finally, SVCs are needed to make frame relay as flexible as possible and to ensure long-term customer acceptance. Imagine voice services or Internet access networking today without telephone numbers or dialup modems! As voice comes to frame relay, SVCs will become even more necessary. In spite of these very good arguments, it seems unlikely that frame relay SVCs will become common anytime soon, if at all. The fact remains that since PVCs are so inexpensive in most cases, and memory priced so reasonably, there is little need for SVCs in the foreseeable future, at least as long as the number of PVCs required remains manageable.
The security risk is real enough, also. The use of PVCs does not pose the possible risks that switched services entail (as in many businesses with dial-in network ports). Of course, frame relay is positioned as a public substitute for a private leased-line network. Frame relay has been so successfully marketed as a private-line replacement that it might be difficult if not impossible to reposition the frame relay service as a switched service also. Finally, adding support for the SVC signaling protocol to all frame relay networks will be neither simple nor inexpensive. And SVC support on one frame relay network does not ensure universal connectivity unless an NNI with SVC support to all other frame relay networks is implemented.

It appears that none of these things will happen soon, and maybe not at all. But this does not mean that SVCs may not be desirable in some frame relay network configurations, especially very large ones with huge PVC needs and very intermittent (and unforeseen) site interactions. The potential customer should be clear about the frame relay service provider's position when it comes to SVCs.
Whither Frame Relay SVCs?

At this point in the chapter it seems clear that the frame relay signaling protocol needed by equipment vendors and service providers in order to offer SVCs on a frame relay network is ready to go. Yet, with the major exception of MCI in 1997, the major frame relay service providers have not offered SVCs. Even MCI, whose HyperStream SVC frame relay service was first offered in late 1997 with CIRs from 16 kbps to 6 Mbps, had no plans to charge SVC customers on a usage basis until mid-. Prior to this the MCI SVC service was strictly on a fixed-rate basis, regardless of traffic load and connection time.

An informal survey has shown that some 75 percent of all frame relay UNIs have five or fewer PVCs configured on them. About 90 percent have fewer than 20 PVCs defined to remote locations. CIR limits and traffic load have much to do with this, of course, but there are many networks that routinely connect to more than 20 remote sites, although not often directly. And that is one of the attractions of frame relay: the ability to logically mesh-connect sites directly. SVCs can certainly be useful to overcome PVC limits and provide greater (and more flexible) connectivity to sites with low traffic volumes. Therefore, the lack of SVC services is not due to user indifference.

It is true that all the details of the frame relay signaling protocols have yet to be worked out. But this is not a big stumbling block. There is nothing to prevent a vendor from developing a proprietary signaling standard between its own switches. And since few (if any) multivendor ATM switch networks exist, incomplete standards would become a nonissue. In the frame relay world, the network node interface is beyond the scope of the frame relay standards (NNI is the Network-Network Interface in frame relay), so vendors would have to develop or adapt their own signaling protocols in any case.

Perhaps the problem with the absent SVC offerings is a lack of signaling protocol implementations. This is certainly true of end-user devices. But in most cases the end user would hardly be expected to set up his or her own SVCs with user device-generated signaling messages. And hardware FRADs can easily be built to comply with FRF.4, if not all of Q.933. Certainly MCI had no trouble finding FRF.4 and Q.933 software and hardware for its pioneering SVC offerings. So there must be some other reason why SVCs are only offered by the rare frame relay service provider, and only then with limits such as flat-rate billing (then why bother with SVCs at all?).

In fact, there is a very good reason why SVCs have not yet appeared in force in frame relay networks. The problem is not the lack of full standards, nor the implementation of these standards. One of the problems is the issue of routing SVCs on these networks. This will be called the "signaling protocol routing problem" in this chapter.

Just what is the signaling protocol routing problem? Connection-oriented networks like frame relay (and ATM) do not independently route packets from source to destination as TCP/IP routers and the Internet do. Rather, connection-oriented networks use a signaling protocol call setup message to establish a path through a network from source to destination. It is only this call setup message that needs to be independently routed. The path, once set up, is used by all packets sent from source to destination for the duration of the "call." The question is: What is the best way to route a call request from source to destination?
This is an unanswered question in frame relay and ATM networks and defines one of the signaling protocol routing problems.
Another problem is the fact that SVCs cannot realistically be charged for in the same fashion as PVCs. PVCs are available 24 hours a day, 7 days a week. So it only makes sense for service providers to bill for PVCs at a recurring, flat monthly rate. But SVCs can be established at the customer's whim and used as long as the customer likes. It follows that some other billing method must be used for SVCs. What this alternate billing method should be is open to some debate, as will be shown later. There are good reasons why simple call-time duration is not a good method of determining frame relay (or ATM, for that matter) SVC charges. This represents another unsolved problem in frame relay networks: the billing and administration problem.

Therefore, there are two main problems that must be solved before SVCs become common on frame relay networks: the signaling protocol routing or Resource Allocation problem and the Billing problem. Both are so important that they deserve the capital letters. However, it should be pointed out that these are only terms used in this book, not industry-standard terms. Others may call them by other names, and some may even prefer to think of them as issues, since frame relay and ATM have no problems at all. The goal here is to promote understanding of what these issues or problems are. Until these two problems are resolved, and resolved in a standard and common fashion, there will be no widespread deployment of SVCs in either frame relay or ATM networks, especially for data SVCs.

What is the big deal about resource allocation and billing? The telephone companies have allocated voice and data resources for years. While it is true that congestion and busy trunks (fast busy) do occur, this has hardly prevented telephone signaling deployment. Also, the telephone companies have automated the billing process with computers for more than 30 years and, in fact, along with the power utilities, were the first major corporate entities to use computers in this fashion. It is also true that billing errors occur, but again, this has not stopped either the deployment of signaling protocols or the sending of (incorrect) bills. Surely there must be something fundamentally different between the public switched telephone network (PSTN) and frame relay networks if resource allocation and billing are such problems.

It turns out that there is a difference. In fact, there is such an extreme difference between resource allocation and billing in the PSTN and in frame relay networks that few even want to think about offering SVCs until these twin problems are solved. The problems will be explored one at a time. That way, each one can be better understood and used to see what the trouble with SVCs seems to be.
The Resource Allocation Problem

Resource allocation in the PSTN has already been mentioned. This section will offer more details on how resource allocation is performed on the public voice network, which will help in understanding the problem with regard to frame relay networks later. A very simple telephone network is shown in Figure 5.5.
Figure 5.5 A very simple telephone network.

The network in the figure is simple enough, but it has all the elements needed to illustrate how the PSTN signaling protocol interacts with the physical network of trunks and switches to perform resource allocation. There are four central office (CO) switches with both user local loops (lines) and links between the central office switches (trunks). The figure could add a few wire centers, toll offices, tandems, and IXC POPs, but the principles are the same no matter what the configuration.
Each central office switch has only so many voice channels on the trunks between them, of course. There may be as many as 10,000 local loops with telephones on each central office (maybe even more today), but since the average business or residential phone is only in use a few hours a day (and many residential phones are in use only 60 minutes or so), it makes no sense at all to have one trunk for every local loop. Besides, even the simple voice network in the figure would need not 10,000, but 20,000 trunks: 10,000 to each central office it was attached to. After all, it would be just as likely that someone attached to (or served by) Central Office A would call someone attached to Central Office B as Central Office D. So a lot fewer trunks are needed between central offices than the lines they each serve. But how many trunks are needed?

The science of traffic engineering answers this question. (Actually, traffic engineering sometimes seems like a mystic art requiring advanced enlightenment.) Traffic engineering is used in voice networks to say things like, "With 10,000 phones on Central Office A, 600 trunks are needed to Central Office B to make sure that 99.6 percent of the calls get through." The phrase "get through" is the key. This is exactly the point. If calls are blocked, the resulting busy signals generate no revenue, even though a lot of network resources are used to switch the call. These resources include things like switch digit registers (otherwise there is no dial tone), screening software (need to dial a "1" first?), initial billing record software (toll-free, collect, or bill originator?), trunks, and so on. So maybe 0.4 percent of calls blocked for lack of trunks is okay, maybe not. If not, the traffic engineer can recommend raising the trunk count to 650, 700, or whatever by installing more facilities between Central Office A and Central Office B (for instance). It really does not matter what technology is used to add trunks (T-carrier, Sonet fiber, or microwave), since all of the trunks will be broken up into 64 kbps digital voice channels anyhow.

But installing new trunk capacity is often an expensive and time-consuming task, whatever the media chosen. This being the case, the number of voice channels within the trunks between Central Office A and Central Office B remains very stable over time. A T3 will have 672 voice channels, two T3s will have 1344 voice channels, and so on. Each voice channel will have 64 kbps of bandwidth, minimal delays, be fairly tolerant of bit errors, and so forth. In other words, the voice channels are built for the voice QoS parameters. New facilities and provisioning can change this trunk channel number, but not day-by-day or hour-by-hour.

There is one major exception to this stability of trunk voice channels: outages. If a T3 between Central Office A and Central Office B is lost, then there are 672 fewer voice channels right away (and 672 more right away when the T3 is repaired). Calls may be blocked in the interim, which has an enormous and immediate impact on revenues and service. Service outages may have tariff impacts, leading, in turn, to fines and/or other financial penalties such as customer rebates. This outage effect is so important to resource allocation that network control centers have been created to deal with the effects of these outages and inages on phone calls. The point is that varying the resources available on the network causes problems, whether through addition or subtraction. This is important to remember.
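The arithmetic behind statements like "600 trunks for 99.6 percent" is the classic Erlang B formula, which gives the probability that an offered call finds every trunk busy. The sketch below uses the standard recursion; the offered load is invented for illustration and is not the book's figure.

    # Erlang B blocking probability, computed with the standard recursion
    # B(0) = 1;  B(n) = A*B(n-1) / (n + A*B(n-1)).

    def erlang_b(offered_erlangs, trunks):
        b = 1.0
        for n in range(1, trunks + 1):
            b = (offered_erlangs * b) / (n + offered_erlangs * b)
        return b

    # 10,000 phones each busy a few minutes per hour might offer ~550 erlangs.
    offered = 550.0
    for trunks in (580, 600, 650):
        print(trunks, "trunks -> blocking", round(erlang_b(offered, trunks), 4))

More trunks drive the blocking probability down sharply, which is exactly the trade-off the traffic engineer is weighing against the cost of new facilities.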
But why make trouble for ourselves? These trunks are all Sonet fiber rings today, which provide automatic protection switching, right? There are still trunk outages, but not as many service outages. Okay, consider this aspect for a moment. Assume no trunks flip-flop in and out of service at all. Then the resource allocation on the network when a user makes a phone call will go as follows.

Suppose a user on Central Office A wants to make a call to another user on the same Central Office A. This is just an SVC, of course. It is a type of virtual circuit known as an equivalent circuit in the telephony world, but it is an SVC nonetheless. The DTMF (touch-tone) signaling protocol is usually used to initiate the establishment of this SVC. In this case, resource allocation is very simple. Resource allocation affects only Central Office A resources, since there are no other central offices, and therefore no trunks, involved at all. The switch software in Central Office A looks around and says: "Hey! I can give the caller dialtone! I can ring the called party! I can supervise the call until someone hangs up! No problem!" (Signaling protocols always go "Hey!" They are really quite rude.) And it is not really any problem at all. This is because Central Office A is dealing only with a local knowledge of the network, not global knowledge of resource states elsewhere. Even if Central Office A cannot give dialtone, or ring the called party, or whatever, the decision-making process is still the same and just as easy.
Now consider what must take place when a user of Central Office A wants to make a call (establish an SVC) to another user serviced by Central Office B. Now Central Office A's resource allocation job is much harder. Why? Because Central Office A must now allocate resources based on a global knowledge of network resources, not only a knowledge of local resources as before. This is what makes the whole SVC routing process difficult. Consider the sample network with the added trunk availability information shown in Figure 5.6.
Figure 5.6 The example network with trunk availability information.

The resource allocation process at Central Office A could now go something like: "Well, there are no trunks available to Central Office B, so I'll give them the fast busy and they'll try again later." The assumption that the users would try again later used to be a good one. What else were the users going to do? But maybe the "they'll call later" assumption is not such a good one today. There are other things that a user can do besides make a phone call with the incumbent carrier. People can use their cell phones instead or make a call over the cable TV network. They can send e-mail. Maybe the problem will be the same, but maybe not. Flat-rate call or not, no service provider is happy to deny service, for revenue or tariff reasons, or both.

But service providers need not fret. There is a better way. The resource allocation software in Central Office A could have a table that says: "If you can't route a call through the A-B trunks, give it to the A-D trunks." Central Office A will surely know that there are plenty of A-D trunks available, since it can "see" one end of the trunk group directly. The switch at Central Office D will have a table to route the call to Central Office C, and Central Office C will pass the call to Central Office B, which will complete the call. No busy trunks along the way. No lost revenue. Not a bad plan. And all that had to be done was to build a routing table with a topology database in each central office so that each switch had knowledge of other paths over which to route the call.

But this does not entirely solve the problem. More smarts need to be built into each central office switch. Here is why. Consider the trunk availability situation shown in Figure 5.7. Now when someone on Central Office A dials the number of someone on Central Office C, the task of resource allocation becomes very complex and difficult indeed. The Central Office A switch will attempt to set up the call through Central Office D or Central Office B. But it is easy to see that a call routed (or switched) through Central Office D will not and cannot be completed. The correct way for Central Office A to route the call is through Central Office B. The challenge for the implementation of the signaling protocol with regard to resource allocation is this: How is Central Office A to know the proper way to route the call globally?
Figure 5.7 A more realistic resource allocation scenario.
Central Office A must know about the global conditions of the trunks on the network. For instance, it makes no sense for a central office switch in New York to route a call to San Francisco through Kansas City if all the trunks to Phoenix are busy. The central office switch should route the call through Chicago, where plenty of trunks to Seattle are available. This situation comes up all the time in the voice world. As it turns out, there are several ways to deal with this resource allocation problem in the voice world.

The best way would seem to be for the call setup packet that is used to route the call to just reroute itself as it threads its way through the network. That is, the call setup packet, even if sent to Central Office D, would look at the trunk situation and say: "Hey! Wait a minute! I can't go anywhere from here. Better go back to Central Office A and start over." But in the real world, this would not work. What if there are three sets of trunks out of Central Office D that the call setup packet could try, not just one? How long would it take to try all possible routes? As it turns out, this method takes much too long to set up calls within the international guidelines used by the PSTN.

Of course, there are other ways. A database could be set up in each central office and updated periodically. The database could be used exactly like the routing table in a router to determine the proper path a call should take. This is a very robust and efficient approach. It is basically what Signaling System No. 7 (SS7) does today, with Service Switching Points (SSPs) as clients, Service Control Points (SCPs) as the database servers, and Signaling Transfer Points (STPs) as the routers.

Perhaps the central offices could be arranged in a hierarchy, with more and more trunks at higher levels. The central offices in the figures could be called level 5 central offices. If a call gets to Central Office D and there is no trunk available to Central Office C, the call could be routed to a higher-level switch at level 4 of the hierarchy. Higher levels would be developed and used as needed. This was essentially the structure used in the Bell System prior to Divestiture in 1984, and it was very successful. However, it required that all of the trunks and level switches be controlled by the same organization, completely out of the user's control. Of course, this was true of the AT&T Long Lines network, but it was no longer possible after 1984 with Equal Access and Carrier Selection.

In fact, resource allocation can be handled quite well in the PSTN with a combination of these approaches. Since the voice network is engineered for peak loads (Mother's Day, Thanksgiving, and New Year's are always neck-and-neck), most of the time there are plenty of trunks to go around. Resource allocation decisions can be made locally without too much trouble. But when congestion occurs (or, ideally, right before it occurs), a network management center could see the trend and distribute traffic more efficiently. This could be as simple as adjusting a few routing table entries and parameters in the central office switch to say: "Hey! Send more stuff to Central Office B and less to Central Office D." This would also require both a Network Management Center (NMC) and communications with the central offices. But most telcos have NMCs and central office links for other reasons already. And they could put a big traffic map on the wall to impress tour groups.
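The tabled alternate-route idea is easy to sketch. In the illustrative Python below, each switch holds an ordered list of next hops per destination and tries them in turn, using only its local view of trunk availability, which is exactly the limitation that Figure 5.7 exposes.

    # Tabled alternate routing at one switch, using local knowledge only.
    # The topology and all values are invented for the four-switch example.

    ROUTES = {
        # at Central Office A: direct route first, then the alternate via D
        ("A", "B"): ["B", "D"],
        ("A", "C"): ["B", "D"],
    }

    def route_call(switch, destination, trunks_available):
        for next_hop in ROUTES[(switch, destination)]:
            if trunks_available.get((switch, next_hop), 0) > 0:
                return next_hop
        return None   # all routes busy: block the call (fast busy)

    trunks = {("A", "B"): 0, ("A", "D"): 40}   # A-B trunks all busy
    print(route_call("A", "B", trunks))        # -> "D": route around congestion

Note that the switch can only "see" its own trunk groups; it has no idea whether the call can get beyond D, which is why the simple table is not enough.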
In spite of this slightly tongue-in-cheek approach to routing alternatives in the PSTN, all of the methods suggested are instructive. This is how routing works in some real-world portions of the global PSTN. All real-world routing algorithms in the global PSTN use a concept known as trunk reservation (TR). Each link between switches has a TR value. If a direct route is not available to route a call, a TR-permissible alternative is sought, based on TR values, which constantly fluctuate with traffic conditions. If no current TR values are suitable, the call is blocked. The PSTN routing algorithms in use today differ in their way of choosing from the set of TR-permissible alternative routes. But this is the only way they differ.

For example, from the 1980s until the early 1990s, AT&T used a routing algorithm called Dynamic Non-Hierarchical Routing (DNHR), which replaced the older hierarchical Long Lines level switch structure previously described. In DNHR, the day is divided into 10 time periods. The TR parameters vary from time period to time period, based on updates from a central location (the NMC), which reflect weightings due to current traffic load on the network, traffic forecasting rules, and manual intervention.
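Trunk reservation itself can be reduced to a one-line test: an alternate route is permissible only if its link has more idle trunks than its current TR value demands. The numbers in this sketch are invented; only the TR concept comes from the text.

    # Trunk reservation in miniature: reserve tr_value trunks for direct
    # traffic; alternates may use only the capacity above that threshold.

    def tr_permissible(idle_trunks, tr_value):
        return idle_trunks > tr_value

    def pick_route(direct_idle, alternates):
        if direct_idle > 0:
            return "direct"
        for name, idle, tr in alternates:
            if tr_permissible(idle, tr):
                return name
        return None   # no TR-permissible alternative: block the call

    # Direct route full; first alternate too loaded to be TR-permissible.
    print(pick_route(0, [("via D", 3, 5), ("via B", 12, 5)]))  # -> "via B"

The algorithms named below differ only in how the set of permissible alternates is built and ordered, not in this basic test.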
All DNHR switches use special signaling messages to propagate TR parameters among the switches, which number in the hundreds. Since DNHR switches tend to be highly mesh-connected in terms of trunking, DNHR switches will pare down the full alternate route set to a more manageable subset. There is no rethreading of calls, but a special crankback message is used when a call is blocked at another switch to prevent this from happening repeatedly. In Great Britain, British Telecom (BT) uses a routing algorithm known as Dynamic Alternate Routing (DAR) that depends more on actual current traffic loads than forecasting and minimizes the number of messages sent between the switches. DAR picks one alternate route all the time and uses it until a call is blocked on it. DAR then selects another route at random and the process repeats. In the early 1990s, AT&T implemented a Real-Time Network Routing (RTNR) algorithm. RTNR increases the number of messages exchanged between NMC and switches, and also among switches themselves, but is much better for completing calls than either DNHR or DAR. Both DNHR and DAR tend to pick the same alternative route over and over. RTNR, which included another routing algorithm called Least Load Routing (LLR), distributes traffic more evenly. Two new routing algorithms are claimed to be even better than RTNR. Bell Northern Research has developed Dynamically Controlled Routing (DCR) for the Trans-Canada Network. In DCR, a central computer tracks link status and gathers update messages every 15 seconds. Bellcore has developed State-Dependent Routing (SDR), which assigns TR values based on cost. Costs are determined from information gathered every five minutes and calculated by a large nonlinear program running on another processor. Because of this time lag, real-time rerouting operation is not possible.

This is the whole point of this section about PSTN routing algorithms. Where is the DNHR, DAR, RTNR, DCR, or SDR for frame relay (or ATM) networks? These routing algorithms will not just port over into the frame relay and ATM worlds. This is because resource allocation, even in the small voice network example with only four switches, is enormously complex. What makes it possible at all is the fact that the resources being allocated are (in the vast majority of cases) fixed increments of 64 kbps. Imagine how much more complicated the task would be if the trunks were not channelized into 64 kbps circuits. In fact, this is exactly the case when we replace the voice central office switches with frame relay or ATM switches. There are no more channels on the trunks, Sonet or otherwise, connecting the switches. There are just virtual circuits representing channels as a stream of frames or cells. But not all connections on a frame relay network work best by emulating circuits designed to deliver Constant Bit Rate (CBR) services. Frame relay is designed for a whole array of services, especially data services, that are Variable Bit Rate (VBR) services. Most are extremely bursty data applications.

How do bursty VBR data connections make resource allocation so different on a frame relay or ATM network? Here’s how. In the Q.931 signaling protocol, used with narrowband ISDN, the call setup message is only required to specify how many 64 kbps units of bandwidth the connection needs. This is one of the main bearer capability parameters. The digital switches only need to compute the effect of this parameter on the TR number to successfully route the call setup message.
In the Q.933 signaling protocol used in frame relay networks, users are allowed to specify much more than bandwidth bearer capability, and indeed they must if they are to take advantage of frame relay’s VBR dynamic bandwidth allocation (also misleadingly known as bandwidth-on-demand) capabilities. Here are the fields that must be specified in a Q.933 call setup message in order for the frame relay network to provide efficient VBR services:

Maximum frame-mode information field size
Throughput
Minimum acceptable throughput
Committed burst size
Excess burst size
Transmit window value
Retransmission timer value

What has all this to do with resource allocation? What was once a quick look at a small number of fields in the ISDN Q.931 call setup process is now a long and involved process of examination and analysis with Q.933. This must be done to determine the effect of the VBR connections on the current available trunk bandwidth. It is fine to say that frame relay network connections allow for dynamic bandwidth allocation, but the network only has a fixed amount of bandwidth to play around with. The problem is that the VBR flow of frames or cells may vary drastically over short periods of time. The question is no longer one of how much bandwidth per unit time a connection will consume, as in channelized, CBR circuit connections. Some VBR connections may tolerate more delay in exchange for more capacity (just buffer these frames). Some VBR connections send fewer frames, but these information units must be delivered quickly within a bounded delay (and so cannot be buffered for long, if at all). Since ATM from the start, and frame relay more recently, have had standard services defined for mixing CBR (uncompressed voice/video) and VBR connections on the same unchannelized trunking system, these connections may have vastly different resource requirements.

The challenge of resource allocation in frame relay networks is to determine, based on the call setup message field parameter values, the total drain on network resources in terms of buffers and bandwidth that the VBR connection will consume. Only then can the call be routed properly, whatever the routing algorithm used. This must be done in an acceptable amount of time, of course. The holdup on SVC implementation in frame relay and ATM networks is directly due to this resource allocation problem. There is currently no accepted or efficient way to determine the equivalent capacity of a frame relay or ATM VBR connection in terms of fixed time division multiplexed trunks. If the connection could be expressed as equivalent bandwidth on a fixed bandwidth trunk network, the existing TR routing mechanisms could be used. And this must be done quickly enough, based on global network capacity, current load, and congestion potential, to satisfy all types of service connections, including voice and video-on-demand, among others.

PVCs make this task a little easier, but still not trivial. Resource allocation for frame relay network PVCs is done at service provision time, which usually means between the contract signing and next Monday. In the interval, network engineers feverishly try to figure out the load each PVC will add to the network. But it is easier for humans, especially those who have designed the network in the first place, to obtain and use the global knowledge needed to make PVC routing decisions. More facilities may eventually have to be added, but this should be predictable in a PVC world.

Consider the simple frame relay network in Figure 5.8. Notice that this network is as simple in structure as the previous voice network. But now the central office switches are frame relay (or ATM) switches. The figure could also show the current state of network resources, not in terms of trunk channels available, but in terms of frame capacities, queue depths, buffer maximums, service times, and so on. The question for SVCs is: Can the SVC request be granted based on the current global state of the network resources and, if so, how should it be routed?
Figure 5.8 Frame relay and resource allocation.

Right now, this is an unsolved problem in frame relay for general cases. Notice that the problem is totally independent of the presence or absence of a standard signaling protocol.
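To make the difficulty concrete, here is a hedged sketch of what an SVC admission check against a single trunk might look like. The equivalent-capacity formula below, a committed rate plus a weighted share of the excess burst, is invented purely for illustration; finding a formula that is both accurate and fast is exactly the unsolved problem.

```python
# Hypothetical SVC admission check against a single trunk. The
# equivalent-capacity formula is invented for illustration; finding a
# good, fast formula is the unsolved resource allocation problem.

def equivalent_bps(throughput_bps, excess_burst_bits):
    """Crude stand-in: committed rate plus a weighted share of the
    allowed excess burst, amortized over a one-second interval."""
    burst_weight = 0.3        # invented weighting, not from any standard
    return throughput_bps + burst_weight * excess_burst_bits

def admit(svc_requests, trunk_bps):
    used = 0.0
    for name, throughput, excess_burst in svc_requests:
        need = equivalent_bps(throughput, excess_burst)
        if used + need > trunk_bps:
            print(f"{name}: blocked (needs about {need:.0f} bps)")
        else:
            used += need
            print(f"{name}: admitted (trunk now {used:.0f}/{trunk_bps} bps)")

# Three VBR SVCs competing for one 1.536 Mbps trunk (T1 payload).
admit([("svc-1", 256_000, 512_000),
       ("svc-2", 512_000, 1_024_000),
       ("svc-3", 768_000, 768_000)], 1_536_000)
```

A real estimator would also have to weigh the committed burst size, frame size, and delay tolerance of each connection, and do so across every trunk on every candidate route, quickly enough for call setup.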
The Billing Problem

Suppose for the sake of argument that the SVC resource allocation problem has been solved to everyone’s satisfaction. Further suppose that these calls take no longer to set up than regular PSTN voice calls. After all, if software can be written to beat chess grand masters, surely the resource allocation problem is not unsolvable. Indeed it is not. But SVCs may still not be right around the corner for all frame relay networks. This section will discuss the reasons why.

The nice thing about PVCs is that they are available for customers to use all the time. Therefore, a service provider can bill the customer for each PVC at a fixed monthly rate and not fear an angry customer or face a lawsuit. But SVCs are very different. SVCs are not available for constant use, by definition. SVCs should be billed by some criteria other than fixed monthly rates. In the voice network, two criteria are commonly used: time and distance. Users pay more based on how long they talk (talk longer, pay more) and how far apart the endpoints are (generally, the farther apart, the higher the rate, although the distance boundaries are arbitrary). Perhaps these criteria can also be used for SVCs in frame relay (and even ATM) networks.

Consider time first. In a voice network, it is not a bad assumption to make that if one end is not sending information (talking), the other is. So for the total duration of the call, one end or the other is basically active at all points in time. So duration is a valid and fair way to bill. But frame relay connections carry more than voice. Bursty data are the rule rather than the exception. Long periods of time may pass before either end of the connection is active. Is it fair to bill the customer for nothing? After all, the voice network does. If people forget to hang up, the tab just runs higher and higher. Customers may not be too happy about this, but it only happens rarely. So what’s the big deal about frame relay SVCs being billed based on time? Who cares if the customers don’t like it. Too bad. Take it or leave it.

But it isn’t that simple. This is mainly because of the call setup/holding time ratio, an important consideration for all switched services. There should be a shorter term to describe it, or even another dreaded acronym (CS/HTR?). But it seems that all are stuck with call setup/holding time ratio. In a PSTN, processing and routing a call setup message takes a lot of effort, as shown in the previous section. But no revenue is earned by the service provider until the call is completed, that is, when the called party picks up the phone and says “Hello?” All blocked or abandoned calls (abandoned because they took too long to route and the caller thought something was wrong) consume resources but generate no income at all. Disconnects also consume resources, but not nearly as many as connections, and will be ignored in the following discussion. The cost of routing a call setup request must be balanced by the revenue generated during the duration of the call. Flat-rate local service is an apparent exception, but these calls tend to be easy to set up and long in duration (so there are fewer of them per unit time), so they are still profitable. For example, if it takes 10 seconds to route a call (ring the destination) and the average holding time (conversation) is 10 minutes, this gives a call setup/holding time ratio of 10:600 or 1:60, which is a good number. The cost to the customer must offset the cost of the call setup time as well as the cost of the call itself to the service provider.
But what if the holding time shrank to 5 seconds instead of 10 minutes? The call setup/holding time ratio would then be 10:5 or 2:1. This is not a good ratio whether calls are charged by flat rate or duration. In either case, the revenue generated by the holding time might not be adequate to offset the call setup costs and leave any profit at all.
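The arithmetic is worth making explicit. Here is a minimal sketch of the two examples from the text; treating a ratio at or above 1 as unprofitable is a simplification made only for this example.

```python
# The call setup/holding time ratio arithmetic from the text. Treating
# a ratio at or above 1 as unprofitable is a simplification made only
# for this example.

def cs_ht_ratio(setup_s, holding_s):
    return setup_s / holding_s

for setup_s, holding_s in [(10, 600), (10, 5)]:
    ratio = cs_ht_ratio(setup_s, holding_s)
    verdict = "profitable" if ratio < 1 else "setup costs dominate"
    print(f"setup {setup_s} s, holding {holding_s} s -> "
          f"ratio {ratio:.3f} ({verdict})")
```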
What’s the point? Consider an SVC on a frame relay or ATM network with the following pattern of activity as shown in Figure 5.9. Notice that the bursts of traffic at each end of the call are separated by a long period of idleness. No work is done by the network on behalf of the users for the entire 10 minutes of inactivity on the SVC. There are no idle patterns in frame relay. But the customer must pay for the SVC for the entire duration of the call.
Figure 5.9 Bursty traffic and SVCs.

How long will it take before the users do what is shown in Figure 5.10? If the cost of the two calls is low enough, users will do it. And the lower the cost, the more tolerable the second call setup delay becomes. But the call setup/holding time ratio may no longer be adequate to cover the costs, and the network is doing a lot more work than it did before. Since the users can establish SVCs with almost as much bandwidth as they like, the users will compensate by increasing the bandwidth on the SVC if the call setup delay is too high.
Figure 5.10 Short holding time SVCs for bursty traffic.

So maybe duration is not the best method to bill for frame relay or ATM SVCs. What about distance? The distance between endpoints must be determined as each SVC is established in order to bill by miles alone. But would a 10-Mbps, 2-hour, 10-mile SVC cost more than a 10-Mbps, 10-hour, 2-mile SVC? If not, why not? Without duration or something else as an adjunct second parameter, distance makes little sense by itself. Users may respond by establishing their own short-hop relay points to offset the costs of long connections, defeating the whole public network purpose. Also, with few exceptions, existing frame relay networks’ PVC prices are distance-insensitive. Is there any compelling reason why the SVCs on these networks would not be distance-insensitive also? What user would migrate from PVCs to such an SVC system? This price structure is likely to remain in place even after SVCs become common.

But what other parameter could be used as a fair and yet profitable criterion for SVC billing? How about traffic load? What could be fairer? Send more frame relay frames, pay more. A potential problem with frame relay is the presence of the discard eligibility (DE) bit. If a frame is counted for billing purposes at the User-Network Interface (UNI) on the sending side, but is tagged as DE, the frame may never make it to the destination. The frame relay network may discard the DE frame under certain conditions like congestion. Also, a discarded DE frame will probably have to be resent, adding to the sender’s billing cost. But if the frame is counted at the destination UNI, billing information must be gathered for each connection terminating there. This information must then be correlated with the proper ingress UNI in order to send the proper sender the bill. This is not an impossible task, but it is certainly complicated. In fact, most SVCs on a frame relay network that are billed by frame counts will probably be billed at the sending UNI, and that is that.

So what is the answer to the SVC billing problem? There is no generally accepted answer, most prominently in the case of public frame relay (and ATM) networks of arbitrary size.
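A sketch of what billing by frame counts at the sending UNI might look like follows. The counter structure, the per-frame rate, and the choice to bill DE-tagged frames at a discount are all assumptions invented for the example; the text above only argues that counting at the ingress UNI is the practical choice.

```python
# Hypothetical usage-based billing at the sending (ingress) UNI.
# Counting here avoids the egress-correlation problem described above,
# at the price of billing for DE frames that may never be delivered.
# The rates and the DE discount are invented for illustration.

RATE_PER_FRAME = 0.0001      # dollars per committed (DE = 0) frame
DE_DISCOUNT = 0.5            # DE = 1 frames billed at half rate (assumed)

class IngressBillingCounter:
    def __init__(self):
        self.counts = {}     # dlci -> (de0_frames, de1_frames)

    def frame_sent(self, dlci, de_bit):
        de0, de1 = self.counts.get(dlci, (0, 0))
        self.counts[dlci] = (de0 + (de_bit == 0), de1 + (de_bit == 1))

    def invoice(self):
        for dlci, (de0, de1) in sorted(self.counts.items()):
            charge = RATE_PER_FRAME * (de0 + DE_DISCOUNT * de1)
            print(f"DLCI {dlci}: {de0} committed + {de1} DE frames "
                  f"-> ${charge:.4f}")

meter = IngressBillingCounter()
for de in [0, 0, 0, 1, 1]:   # five frames on DLCI 19, two above the CIR
    meter.frame_sent(19, de)
meter.invoice()
```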
Conclusion
Until the resource allocation and billing problems are solved for the general case in both large and small frame relay networks, SVCs will not be implemented widely, in spite of service provider claims and “me too” announcements. SVCs will appear in some situations, most notably single-switch environments or in cases where the service offered is still in fixed bandwidth increments (such as voice, 10 Mbps Ethernet, and so on). Routing and billing for frame relay networks will remain a topic of intense research for the next few years. The fact that frame relay networks still employ frames instead of numerous small cells will make usage-based billing easier for frame relay service providers.

This might be the place to simply list all the issues outlined in this chapter that make frame relay SVC offerings difficult to implement and use:

Security on the inbound connections
DLCI and CIR limits on the UNI
Call setup/holding time ratios for bursty traffic
Network-to-network SVCs
Signaling messages with no priority and subject to DE rules
Billing system issues
Competition from inexpensive PVCs
Chapter 6: Congestion Control

Overview

The issue of congestion control in networks is not limited to frame relay networks, of course. All networks, from private line networks to public X.25 networks to brand new ATM networks, must deal with the problem of congestion. Typically, the standards and related documents that define the network service itself will also outline or even detail the mechanisms to be used for dealing with congestion. This chapter will describe the mechanisms established in frame relay to handle congestion.

Congestion means that there is too much traffic in the network for the network to deal with effectively. Some texts are fond of dividing periods of congestion into those of mild congestion and severe congestion. This is a little like dividing drinkers into those that appear mildly drunk and those that appear to be severely drunk. There could be debates about whether lampshade dancing or falling off the barstool belongs in the mild or severe category, but the important thing is that neither a mild drunk nor a severe drunk should ever get behind the wheel of a car. The trouble with the mild and severe congestion approach is that someone might perceive mild congestion as a less drastic condition of the network or even somehow okay. In truth, all congestion is harmful to network and users alike and should be avoided at all costs. And in fact it is much easier to avoid congestion than it is to alleviate congestion once it has occurred, even mild congestion.

When a packet-switching network is congested, it slows down. This means that it takes longer for traffic to find its way through the network nodes. Since packets are generally presented one after another to the network, this slowing down is seen by the users as a reduction in effective bandwidth and a lengthening of the delay through the network. The change in the characteristic throughput of the network might be gradual or abrupt, but more than this is needed to distinguish mild from severe congestion.

Normally, the relationship between the traffic load offered to a packet network and the network throughput is what is known as linear. This means that a doubling of offered load results in a doubling of throughput between senders and receivers. There is more traffic in the network at the doubled load, but if the network is designed correctly, more traffic is not necessarily a bad thing. Triple the offered load or input, triple the throughput, and so on. But what if the offered load continues to increase to its maximum possible value? This should not happen often, if at all, in a packet network designed for bursty applications. But what if, for one reason or another, all senders are bursting all the time? Obviously, no one expected this when the packet network was designed. If they did expect this all-the-bandwidth-all-the-time situation, the result would be the same as with circuit switching. There would be no sense in using packet networks to recreate circuit networks.

At some point, under heavy loads, the linear relationship between offered load and throughput breaks down. The relationship becomes nonlinear. A doubling of offered load at this level of traffic activity does not result in a doubling of throughput. Usually, the increase in throughput will now be much less than a doubling. In fact, under extreme conditions, doubling the offered traffic load may actually decrease the throughput from what it was before the traffic increase occurred!
Some texts refer to the point where the load-to-throughput relationship goes nonlinear as the onset of mild congestion and refer to the point where the load-to-throughput relationship goes negative as the onset of severe congestion. This is shown in Figure 6.1. To be mathematically correct, the figure should technically have curves instead of straight line segments in the nonlinear sections of the figure.
Figure 6.1 The relationship between network load, throughput, and congestion.

Again, the approach taken in this chapter is that mild congestion in a network is already too much for the network to function as designed. It cannot be stressed enough that even regions of mild congestion are to be avoided if at all possible on modern packet-switched networks. Flirting with mild congestion at today’s higher network speeds is asking for a network disaster to strike.

Congestion control is related to a network concept known as flow control. The two are really distinct phenomena, but it is common for flow-control problems to cause congestion and customary for a network to try to address problems of congestion control with flow-control remedies, as will become apparent. Flow control is a local property of a packet network. The principle of flow control means that no sender should ever be able to send faster than a receiver can receive. This only makes sense. Why send packets or frames at 1.5 Mbps if a receiver can only handle 128 kbps? If a receiver is being overwhelmed, flow-control mechanisms provide the receiver with a way of telling the sender to “slow down!” until more packets or frames can be digested. But it takes time for receivers to issue slow down messages and for senders to react to them. In the meantime, bits keep flowing into the network. The extra bits either build up in the network or must be discarded by the network, which usually means that the sender must resend them.

Flow control never became much of a network issue until packet-switched networks like X.25 and the Internet came along. This is because when two end devices are connected with a leased private line, they “see” each other directly across the network. The end devices are always connected at the exact same speeds: 64 kbps to 64 kbps or 1.5 Mbps to 1.5 Mbps. The end devices could send all day without the network caring, since the bit pipe always matched at both ends. Flow control was still a user issue in this environment. If the link were between a print server and a remote printer, for example, the link could be operational when the printer was out of paper. The printer needed a way to tell the print server “slow down to nothing!” until there was paper in the printer again. There were mechanisms to do this, of course, but none of this concerned the circuit-switched network in the least. Private lines cost the same whether they deliver data bits or idle patterns.

Packet-switched networks are different. The user devices do not “see” each other directly across the network cloud. The user devices see only the cloud at the end of the UNI. But one UNI could run at 64 kbps and another could run at 256 kbps. If a sender is sending across the UNI at 256 kbps for an extended period (packet-wise) to a destination serviced by a 64 kbps UNI, the extra bits can easily build up in the network. And it’s not really the sender’s fault. The UNI across the network remains essentially invisible to the sending device, which can only see the cloud at the end of the UNI. The X.25 packet-switching standard allowed the network to absorb some extra bits from a sender and parcel them out to a receiver as best it could. The bits simply stayed in an X.25 switch buffer until they could be sent to the destination. But this type of network flow control usually occurred at modest speeds, say 4800 bits per second on one end and 2400 bits per second on the other.
Since packet applications are inherently bursty, this approach worked, and still works, in X.25 packet-switched networks. The only trick is to make sure that the number of packets a sender can send without hearing from the receiver does not exceed what the network can buffer, allowing for the time it takes the sender to react to the slow down message.
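That trick can be put in numbers. A back-of-envelope sketch, with every figure invented for illustration:

```python
# Back-of-envelope window sizing for X.25-style network buffering.
# All numbers are invented; the point is only that the window must not
# exceed what the network can buffer while a "slow down" takes effect.

sender_bps = 4800            # sending side line rate
frame_bits = 1024            # frame size in bits (assumed)
buffer_frames = 8            # frames the network will buffer (assumed)
reaction_s = 0.5             # time for a slow-down message to take effect

# Frames that keep arriving while the slow-down message is in flight:
frames_in_flight = (sender_bps * reaction_s) / frame_bits

safe_window = int(buffer_frames - frames_in_flight)
print(f"{frames_in_flight:.1f} frames arrive during the reaction time;")
print(f"safe window is at most {safe_window} frames")
```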
But at fast packet speeds, the buffering approach to flow control becomes almost impossible. Buffering at 1.5 Mbps or 45 Mbps is a problem. The time it takes for a destination to tell an originator to “slow down,” and for the originator to actually do so, may result in plenty of lost information due to buffer overflows. In the interest of network efficiency, frame relay and ATM discard extra traffic above the level that the connection, PVC or SVC, was established to handle. In frame relay, the acceptable traffic level is set by the committed information rate (CIR) on the connection. This approach turns the issue of flow control back into a user issue, as it was in circuit switching, instead of a network issue, as it was in older packet-switching networks.

Note that flow control concerns this sender and that receiver only. Flow control is a strictly local phenomenon. All of the other users on a network can be experiencing adequate service, but this pair of end devices is hopelessly bogged down. But congestion affects all users, regardless of who or what is causing the congestion. Congestion is a more global, but not necessarily universal, phenomenon. Some of the related concepts of flow control and congestion are illustrated in Figure 6.2. Flow control is a local property of a network, while congestion is a more global property of a network. No sender may be sending faster than a receiver can receive, but there is just too much traffic in the network. This is why most networks handle congestion by using flow control. Flow control makes senders slow down. In the case of flow control used for congestion control, the receiver is not the actual traffic destination; the network itself is the receiver of the sender’s traffic.

It is true, however, that congestion might be restricted to a single network node or group of nodes. In that case, the relief method must inform the senders that are contributing to the congestion to slow down while allowing other senders that are not contributing to the congestion to continue functioning as before. Frame relay employs such a relief method that targets only the specific senders that contribute to the congested node or nodes (it is hoped that congestion in a frame relay node is alleviated before the congestion spreads to other nodes).
Figure 6.2 Flow control and congestion control in frame relay.

The only other way to handle congestion is to speed up the output. Since most networks output at the maximum value at all times anyway (there is little incentive not to), the only real way to speed up output is to discard traffic. Of course, receivers detecting missing traffic that they need will respond by asking the senders to resend all of the missing traffic, and usually much more traffic besides, even though that additional traffic was actually delivered intact. Fragments of IP packets inside frame relay frames, for example, cannot be resent individually; all the fragments must be resent. If only one fragment out of 10 is discarded due to network congestion, the net result will be a load of 20 fragments on the network instead of only 10, even though only one was discarded. This is one of the main reasons that congestion is better to avoid than to attempt to alleviate.

It should be noted that user-to-user flow control mechanisms must continue to function regardless of the flow control mechanisms used by the network. Printers do still run out of paper.
Flow Control Mechanisms

The most common form of flow control in use today is the windowing flow control method. Most network protocols are called windowing protocols for this reason. The members of this group include such popular network protocols as TCP/IP, SNA (although IBM calls this flow control mechanism pacing), and even frame relay in some cases. Frame relay is an oddity on this list because for data transfer, frame relay networks never bother with flow control within the frame relay network. The protocols employed at the endpoints of the frame relay network, such as TCP/IP or SNA, must handle this crucial function.

Windowing is the process whereby a receiver grants permission to the sender to transmit a given number of data units (typically frames or packets) without waiting for the receiver to respond. The sender also establishes a send window tuned to the size of the receiver window. If the receiver’s window size is 4, just to give a simple and not terribly precise example, the sender can send no more than 4 frames or packets or whatever to the receiver across the network without having to stop. The efficiency of this process is clear, especially when compared to older stop-and-wait flow control protocols that forced a sender to stop and wait for an individual acknowledgment from the receiver for each and every data unit sent across the network. This acknowledgment performed double duty. Not only did the acknowledgment inform the sender that the data unit had arrived intact and so could be safely deleted from a sender’s buffer, but the acknowledgment also informed the sender that it was okay to now send another data unit without fear of overwhelming the receiver.

The stop-and-wait method provides admirable flow control: Since every data unit is individually approved by the receiver, it is next to impossible to overwhelm a receiver with data units at all. But obviously, if the network delay is measured in hundreds of milliseconds or even whole seconds, the stop-and-wait flow control method is not very efficient at all. Stop-and-wait protocols spend a huge amount of time that could otherwise be spent sending just waiting around for acknowledgments to slowly make their way back across the network. Windowing protocols make more efficient use of network resources by allowing one acknowledgment to convey more information to the sender. When windows are just the right size, an acknowledgment should arrive just as the sender has filled the send window, allowing the process to continue and fill the pipe between sender and receiver across the network.

Windowing protocols have been around since at least X.25, and the roots of windowing go back beyond X.25 in the arena of non-international standards. While usually seen as a huge improvement over stop-and-wait approaches, the X.25 window sizes were quite modest in most cases. A typical value was 2; this was still twice as efficient as stop-and-wait, and X.25 was universally applauded for that. Wild and crazy X.25 networks used window sizes of 3 or even 4, but the effectiveness of larger window sizes was limited by the high error rates on these networks. This was because the error recovery mechanism in X.25 was called go-back-N. With go-back-N, if a window size of 4 was used (another simple example), and the second frame was received with errors, the third and fourth frames were ignored by the receiver, even if received without errors.
The receiver in the meantime sent a message to the sender to go back to frame 2 and resend everything from frame 2 on in the window. So high bit error rates on a network set a practical limit on just how big a window could and should be.
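A toy simulation of exactly this go-back-N example, a window of 4 with frame 2 damaged, shows the cost of ignoring intact frames. The framing and the single-error assumption are set up only to reproduce the example in the text.

```python
# Toy go-back-N illustration matching the example in the text: a window
# of 4, frame 2 arrives damaged, so frames 3 and 4 are ignored even
# though they arrived intact, and the sender resends from frame 2 on.

def go_back_n(frames, window, damaged, max_rounds=10):
    next_expected = 1
    sent_total = 0
    for _ in range(max_rounds):
        window_frames = list(range(next_expected,
                                   min(next_expected + window, frames + 1)))
        if not window_frames:
            break
        print(f"send {window_frames}")
        sent_total += len(window_frames)
        for f in window_frames:
            if f in damaged:
                damaged.discard(f)          # assume the resend succeeds
                print(f"  frame {f} damaged -> go back to {f}")
                break
            next_expected = f + 1           # accepted in order
    print(f"{frames} frames delivered, {sent_total} transmissions")

go_back_n(frames=4, window=4, damaged={2})
```

Four frames are delivered at the cost of seven transmissions; frames 3 and 4 cross the network twice even though they arrived intact the first time.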
Whether stop-and-wait or go-back-N, the emphasis in these flow control methods is on the receivers. This also makes sense because the whole goal of flow control is to prevent senders from overwhelming receivers. But not all flow control is under the control of the receiver. This is true of all windowing protocols, but not all protocols, especially older ones, are windowing protocols. Also, informing the receiver alone of flow control issues is not always the best strategy. What if the receiver cannot or does not communicate with the sender? Examples of such interactions are not as far-fetched as they might seem. Remote weather stations dump huge amounts of information into a central site, which can easily become overwhelmed. Yet few simple weather stations are able to receive from the other site. The same is true of burglar alarms, remote sensors, and even television sets. This point will be important later in the discussion of frame relay congestion control.

Senders always want to send as fast as they possibly can. In windowing protocols, slowing down is an unnatural act that must be imposed upon a sender by a receiver. The receiver can do a few things to slow down a sender, and one or more of these methods is routinely built into all data communications protocols. First, the receiver can withhold an acknowledgment for some time period after the acknowledgment could have been sent. Since the sender cannot send beyond the window without an acknowledgment, this effectively slows the sender down. However, the acknowledgment delay cannot be set too high or else the sender will assume the acknowledgment itself has been lost and resend the whole window in a kind of “maybe you didn’t hear me” mode. Second, the receiver can shrink the window size. This permits the sender to send a smaller amount of information than before and has the desired result. Of course, once the window is set to a minimum, this mechanism no longer functions at all. Finally, the receiver can send a message to the sender explicitly saying, “slow down.” This is quite effective. The problem is that if there is a specific message to “speed up” which is lost on the network, the application can remain sluggish.

It should be noted that there are actually two levels of acknowledgments, and therefore potential flow control, in most windowing protocols. There is a window established hop-by-hop in the network, between each network node and end system, usually at the frame or packet level. There is also a window established end-to-end through the network between end systems only, usually at the packet level or above. The details are not important. What is important is that the decreased error rates in modern networks have allowed fast packet networks like frame relay to speed themselves up by relaying data units without hop-by-hop flow control or error correction within the network between network nodes (frame relay switches in the case of frame relay). But this does not mean that there is no end-to-end flow control or error correction at protocol layers above frame relay. In fact, it is because these upper layers still must perform flow control and error correction at some layer that the network itself can dispense with this function. The network can now concentrate on congestion control rather than internal flow control. This approach is shown in Figure 6.3. Note that frame relay does no internal flow control between frame relay switches.
If a sending frame relay switch sends faster than another frame relay switch can receive, the only recourse available is for the receiving switch to discard traffic. The only question is which traffic should be discarded first.
Figure 6.3 Hop-by-hop and end-to-end flow control.
Many of these points about frame relay networks have been made earlier. But the points about flow control are important enough to repeat here in this chapter about congestion control. Frame relay uses flow control concepts to address congestion control issues in order to prevent haphazard discarding of user traffic at a congested frame relay switch. This only makes sense because discarding user information needed to complete an action at the receiver will only result in the end-to-end error recovery mechanisms telling the sender to repeat at least the missing data units and at most everything in the window from that point on (go-back-N).

One more point about flow control should be made. Just as the limitations of stop-and-wait flow control led to the development and deployment of go-back-N methods, the limitations of go-back-N flow control led to the development and deployment of selective retransmission methods. The major difference is that while go-back-N requires otherwise perfectly good data units to be resent under most circumstances of data unit loss (whether due to error or network discards), selective retransmission does not. With selective retransmission, only the missing data units in a window need to be resent. However, due mainly to the complexities and processing overhead of the selective retransmission approach, implementation has been rare outside of specialized networks such as wireless networks, where resending a data unit that arrived intact is quite counterproductive. The higher bit error rates and frequently exorbitant air time costs on wireless networks combine to make go-back-N very expensive and inefficient.
Flow Control in Frame Relay

The first and foremost flow control mechanism in frame relay is the CIR. It has already been noted that frame relay networks do not have hop-by-hop flow control between the frame relay switches. Yet the CIR is a flow control mechanism of sorts. There is no contradiction here because the CIR is enforced in frame relay at only one point in the entire network, the switch port side of the sending UNI, and nowhere else. The role of the CIR in frame relay has already been discussed in some detail in Chapter 4. The CIR represents the amount of usable bandwidth an end system can employ on a frame relay network without fear of discards under normal circumstances. In most cases, normal circumstances mean without congestion. Since congestion control is the main topic of the whole chapter, the role of the CIR under congested conditions on a frame relay network is deferred until later. For now, assume that the frame relay network is indeed operating under normal circumstances (i.e., without congestion).

How does the CIR perform flow control? The goal of flow control is to prevent senders from overwhelming receivers. This is one thing when all senders and receivers are linked at the same speed, as on a private line network. But frame relay allows UNIs to operate at a wide range of speeds, from as low as 56 kbps (and lower if analog modem access is considered) to as high as 45 Mbps (and beyond if Sonet access is considered), with numerous speeds in between. There is no requirement that the UNI speeds match at each end of a virtual circuit on frame relay.

The way that the CIR is used for flow control is shown in Figure 6.4. Here a sender (Sender A) on a 256 kbps (fractional T1 or FT1) UNI has a DLCI that leads to the receiver (Receiver B). Since each DLCI needs a CIR, the CIR is established between the sender and the frame relay switch port at the end of the UNI. There is a local DLCI at the receiving UNI as well, but it is important to remember that there is no CIR between network and user on the remote UNI. The CIR is purely a user-to-network concept on the sending side (but it is true that any traffic from a virtual circuit on the 64 kbps UNI will have its own CIR enforced at that UNI).
Figure 6.4 The CIR as a form of flow control.

What would happen if the CIR from Sender A to Receiver B were set at 128 kbps? It is easy to see from the figure that data units could easily enter the network at 128 kbps, since the sending UNI is running at 256 kbps. Of course, traffic could never leave the network at this rate, since the receiving UNI operates at only 64 kbps. The extra data units will accumulate until discarded at the egress UNI, unless there is some mechanism for the receiver to tell the sender to slow down. This is the role of the CIR in frame relay flow control: a CIR configured with the remote UNI in mind (here, 64 kbps or less) keeps the sender from overwhelming the receiving end in the first place. Note that there is no easy way for a sending frame relay PVC to know exactly what the UNI physical speed is on the other side of the network. Care in configuration is definitely required. Note further that this use of the CIR has nothing whatsoever to do with the concept of oversubscription on the UNI.
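How the ingress switch port enforces the CIR can be sketched using the committed burst size (Bc) and excess burst size (Be) already encountered in the Q.933 parameter list, measured over the interval Tc = Bc/CIR. The counter handling below is an illustration, not any vendor’s implementation.

```python
# Sketch of CIR enforcement at the ingress switch port. Frames within
# the committed burst (Bc) per interval Tc pass as DE = 0; frames in
# the excess burst (Be) are tagged DE = 1; anything beyond that is
# discarded. The counter details are illustrative only.

CIR = 128_000                 # bits per second committed
BC = 128_000                  # committed burst size in bits per Tc
BE = 64_000                   # excess burst size in bits per Tc
TC = BC / CIR                 # measurement interval

def police(frames_bits):
    """frames_bits: sizes of frames arriving within one interval Tc."""
    bits_seen = 0
    for size in frames_bits:
        bits_seen += size
        if bits_seen <= BC:
            print(f"{size}-bit frame forwarded, DE = 0")
        elif bits_seen <= BC + BE:
            print(f"{size}-bit frame forwarded, tagged DE = 1")
        else:
            print(f"{size}-bit frame discarded at ingress")

print(f"measurement interval Tc = {TC:.0f} second(s)")
police([48_000] * 5)          # 240 kbits offered within one interval Tc
```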
Of course, the situation in Figure 6.4 could easily be interpreted as just another form of congestion control since the frame relay network is really the one being protected (the receiving UNI cannot operate faster than its physical speed anyway). The fact that the CIR is enforced at the ingress port makes this point less emphatic, however. Nevertheless, the CIR is a form of flow control and should be considered a flow control issue, if only because the CIR is a strictly local parameter on the network, between this sender and that receiver. This local characteristic is the essence of flow control. There is one other situation where a frame relay network can and should employ flow control. This is when LAPF reliable mode is used to provide reliable information transfer across a frame relay UNI. The only information transferred across the UNI in reliable mode is the Q.933 call setup signaling message. In this case, full data link layer frame transmission techniques, virtually identical to those employed in ISDN LAPD, are used to establish windows, withhold acknowledgments, and so on to perform flow control on the UNI. Reliable mode uses go-back-N recovery for lost data units. LAPF frames used for reliable mode contain a control field used for these purposes, while normal LAPF frames do not have a control field at all. Theoretically, a frame relay network service provider is allowed to implement reliable mode on any virtual circuit DLCI on the UNI, even those carrying user traffic. But, up to this point in practice, reliable mode has only been used on the UNI for DLCI = 0 Q.933 call setup signaling messages. (Further use of reliable mode for some actions on the frame relay NNI will be discussed in Chapter 8.)
Congestion Control Mechanisms

Congestion control mechanisms are closely related to and interact with flow control mechanisms. But this does not mean the mechanisms are the same. Network protocols can employ one of two main methods of congestion control: implicit congestion control and explicit congestion control. With implicit congestion control, the end devices on the network use some indirect source of information to determine if congestion exists anywhere along the path through the network from source to destination. With explicit congestion control, the end devices on the network use some direct source of information to determine if congestion exists anywhere along the path through the network from source to destination.

Consider implicit congestion control first. Since networks like frame relay simply relay frames through the network, the higher layer protocols at the ends of the network are not notified of network congestion directly. End systems are all concerned with flow control, not so much with network congestion. End systems routinely monitor the delay through the network, but for purely selfish reasons. Nevertheless, there are two main ways that higher layer protocols can infer that congestion within the underlying network is taking place. First, the end systems can monitor how the round trip delay changes over time. The round trip delay might rise or fall, of course, but if the delay is slowly but surely rising over time without corresponding drops, this is a pretty good sign that network congestion is occurring. Second, the end systems can monitor the number of lost data units they send across the network. If the lost data units rise above the number expected due to the long-term error rate on the network, this is also a sign of network congestion. Both methods rely on the fact that congested networks slow down and ultimately discard data units. Because neither mechanism can be said to rely on explicit information about network congestion, this is definitely implicit congestion control. There might be other causes for the same delay rise and data loss effects.

Even in the case of implicit congestion control, however, response is usually swift and often effective. Senders realizing that delays are rising and/or lost data units are above the expected level will typically perform a quick backoff followed by a slow increase back up to their former sending rates. The quick backoff is usually to 50 percent of the former rate, and if the delay continues to rise and/or the data unit loss is high, another 50 percent backoff can be assessed (now the send rate is 25 percent of the uncongested rate). The slow increase means that senders can slowly increase their sending rate back up to their former levels if the delays begin to fall and/or data unit loss reverts to the levels expected due to pure error rates. The nice feature of implicit congestion control is that the higher layer protocol employing such methods is totally independent of the type of network used to deliver the higher layer data units. Implicit congestion control is employed in many network protocols, including the most popular of all: TCP/IP.

Now consider explicit congestion control. The whole idea is to avoid the network having to rely on implicit methods and extreme measures such as discarding information to inform the senders to slow down.
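As an aside before leaving implicit methods behind, the quick-backoff/slow-increase behavior just described is simple enough to sketch in a few lines. The 50 percent backoff figure comes from the text; the 5 percent recovery step per interval is an invented parameter.

```python
# Implicit congestion control at a sender: quick multiplicative backoff
# (50 percent, per the text) on signs of congestion, then a slow
# additive increase back toward the original rate. The 5 percent
# recovery step is an invented parameter.

def adjust_rate(rate_bps, base_bps, congested):
    if congested:                       # rising delay or lost data units
        return rate_bps * 0.5           # quick backoff
    return min(base_bps, rate_bps + 0.05 * base_bps)   # slow increase

rate = base = 256_000
for tick, congested in enumerate([False, True, True] + [False] * 6):
    rate = adjust_rate(rate, base, congested)
    print(f"tick {tick}: {'congested' if congested else 'clear':9s} "
          f"-> {rate / 1000:.0f} kbps")
```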
There is no universal mechanism for explicit congestion control and the sender notification that accompanies the method. The notification could be conveyed in the form of a special message or packet or frame. The only requirement is that the congestion notification be received and acted on by the senders and receivers that are in a position to actually do something about the congestion. For instance, it makes no sense to send congestion notifications to senders on the East Coast if none of them are contributing to congestion in Idaho.
Because there is no standard for explicit congestion control and the congestion control mechanism in frame relay is explicit congestion control and notification, this discussion is best pursued by leaving the realm of the abstract for the surer footing of frame relay itself.
Congestion Control in Frame Relay

Frame relay networks use an explicit congestion control method to prevent the frame relay network from filling switch buffers with frames that cannot be delivered before the frames must be discarded to make room for more arriving frames. The frame relay network uses a mechanism of explicit congestion notification (sometimes abbreviated ECN) to convey congestion information to both senders and receivers at the end of a particular DLCI (the label for a PVC or SVC).

The first point to be made about frame relay network congestion is that it has both a location and a direction. This might seem odd at first. But determining if a frame relay network is congested is mainly a process of determining the status of buffers on the output ports of frame relay switches. Why output ports? Because the whole idea is to get traffic out of the network. Frame relay frames sitting in an output buffer represent work already done by the network node. If the frame is discarded, the switch might have to perform the work all over again. The most important output buffer is the one at the receiving UNI. A frame that has reached the receiving UNI’s output buffer means that the frame relay network has done all that it needs to do to deliver the frame to the user (actually, the FRAD) at the other end of the UNI and collect the fee for delivering the frame successfully. Congestion here means that this work might have to be repeated.

The location of congestion in a frame relay network is very specific. Congestion occurs at this output port. A frame relay switch might have eight output ports, only one of which is considered congested. Naturally, all eight could be congested, or only one or two. The point remains that congestion occurs in frame relay networks on an output port by output port basis. All DLCIs that have a path mapped through the frame relay network using that port will be affected. It does not matter which DLCI might be flooding the switch with traffic: congestion affects all user connections serviced on the port. Note that in the opposite direction on a given DLCI, there might be no congestion at all. The output port buffers leading back in the opposite direction might be totally empty. This is why frame relay congestion has both a location and a direction.

It is true that if one output port in one direction is congested at some point in time, the effects will ripple back through the frame relay network and congest other output ports. That is the nature of congestion. The aim of congestion control and the notification process is to prevent this ripple effect from occurring. In keeping with the whole mild and severe congestion philosophy, most frame relay switches have two levels of congestion. The levels are based on the percentage of output port buffers in use at any particular point in time. There is no magic number attached to these percentages. Most frame relay switch vendors set the defaults at 50 percent and 80 percent of output port buffer capacity, respectively. Few frame relay service providers ever see fit to change the defaults. The concepts of normal, mildly congested, and severely congested frame relay output buffers are shown in Figure 6.5.
Figure 6.5 Mild and severe congestion in frame relay.
It is clear from the figure that only two of the four output ports are congested. This means that only those DLCIs that actually have frames routed through those two output ports (Ports #1 and #3) are currently experiencing congestion. Unless steps are taken, all output ports might join the congested category in the near future. But not all frames arriving on the input ports are switched through the congested output ports. Of the two DLCIs defined on input port #4, for instance, one is routed to the severely congested output port #3 and the other to the uncongested output port #4. Perhaps both DLCIs originate at the same customer site. But traffic on one DLCI is experiencing congestion and maybe increasing delays (output port #3) while the other DLCI is not (output port #4). Ideally, congestion control in this situation should have a way to inform the sender and/or receiver on the paths experiencing congestion that the congested condition exists. That way, only the connections affected will have to throttle back their sending until the congestion is alleviated. Frame relay uses the mechanism of explicit congestion notification to accomplish this task.
Figure 6.6 The FECN, BECN, and DE bits.

The FECN bit is used to inform the receiver of the presence of congestion anywhere on the path from sender to receiver. The BECN bit is used to inform the sender of the presence of congestion anywhere on the path from sender to receiver. Both are necessary just to make sure that the sender always gets the message and has an opportunity to slow down before severe congestion occurs. User FRADs always generate frame relay frames with the FECN and BECN bits set to a 0 bit. Any frame relay switch can set these bits to a 1 bit on frames traveling in both directions through the congested port. In other words, if DLCI 19 from User A to User B is mapped to a congested output port, then not only will all frames traveling through this output port from User A to User B have the FECN bit set, but all frames that travel on DLCI 19 from User B to User A will have the BECN bit set, even though there is no congestion at all in that direction. Uncongested frame relay switches will ignore the FECN and BECN bits and never change them back to a 0 again under any circumstances.

In most frame relay networks, the setting of the FECN and BECN bits occurs when the output port buffers are 50 percent full. As a simple example, consider a frame relay switch with four output ports that has output buffers for 10 frames on each output port. If there are five frames in the buffer, then the switch will begin to set the FECN bit to a 1 bit on all frames switched to that output port. All frames traveling in the reverse direction on the affected DLCIs will have their BECN bits set to a 1 bit. Frames switched to and from other ports on the switch will not be affected by this activity.
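The 50 percent rule is easy to render in code. In the sketch below, the frame and queue structures are invented; the behavior, FECN set on frames headed out the congested port and BECN set on frames moving the other way on the same DLCIs, follows the description above.

```python
# Sketch of explicit congestion notification at one switch output port
# with room for 10 frames, per the example in the text. The frame and
# queue structures are invented for illustration.

BUFFER_CAPACITY = 10
ECN_THRESHOLD = 0.5           # typical default: 50 percent occupancy

class Frame:
    def __init__(self, dlci):
        self.dlci, self.fecn, self.becn = dlci, 0, 0

def port_congested(queue_depth):
    return queue_depth / BUFFER_CAPACITY >= ECN_THRESHOLD

def switch_frame(frame, queue_depth, congested_dlcis):
    """Forward direction: frame is being queued on this output port."""
    if port_congested(queue_depth):
        frame.fecn = 1                    # warn the receiver downstream
        congested_dlcis.add(frame.dlci)   # remember to BECN the reverse flow
    return frame

def reverse_frame(frame, congested_dlcis):
    """Opposite direction on the same DLCI: warn the sender."""
    if frame.dlci in congested_dlcis:
        frame.becn = 1
    return frame

congested = set()
f = switch_frame(Frame(dlci=19), queue_depth=6, congested_dlcis=congested)
r = reverse_frame(Frame(dlci=19), congested)
print(f"forward: FECN={f.fecn}  reverse: BECN={r.becn}")
```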
FRADs that receive frames on a given DLCI with the FECN bit set know that somewhere on the path between themselves and the other FRAD there is an output buffer handling arriving frames that is more than 50 percent full. FRADs that receive frames on a given DLCI with the BECN bit set know that somewhere on the path between themselves and the other FRAD there is an output buffer handling departing frames that is more than 50 percent full. The two meanings are somewhat different and affect just which sender should slow down in the case of FRADs that exchange information in both directions. Obviously, there is no benefit derived from having senders slow down if this will not help to alleviate the congestion in the network. A FRAD could receive frames with both the FECN and BECN bits set to a 1 bit. This means that one or more output buffers in both directions are more than 50 percent full. A FRAD could receive frames with the FECN and/or BECN bits set to a 1 bit on one or more DLCIs. Since DLCIs are used to label all connections, this information can be used to slow senders that are directly contributing to the congestion and no other connections. This interplay of buffers, FRADs, and FECN/BECN bits is shown in Figure 6.7.
Figure 6.7 Buffers, FRADs, and BECN/FECN bits.

Note that the FECN bit tells a receiver that there is congestion, and that is all. It is up to the higher layer protocol to first detect the bit, then invoke some delayed acknowledgment or reduced window strategy to force the sender to slow down. Unfortunately, there are few higher layer protocols that are capable of detecting whether frame relay frames arrive with the FECN and/or BECN bit set to a 1 bit in the first place. And TCP/IP, the most common higher layer protocol, definitely does not and will not detect the status of these bits. This is because TCP/IP highly prizes its “runs the same on anything” reputation. Adding features and functions only for frame relay networks would violate this principle.

Without the cooperation of the receiver and sender higher layer protocols, there is only so much a FRAD can do to respond to the FECN and BECN bits. If user applications continue to send in an unrestricted manner, the FRAD can either attempt to buffer the extra traffic, perform its own version of CIR enforcement (no frames that would be tagged DE = 1 by the switch are sent), or use a combination of the two approaches. Modern FRADs do some amount of buffering of frames, usually quite a bit, in an effort to smooth bursts in a process known as traffic shaping. Many FRADs also respect the CIR, and even set the DE bit to 1 when the switch would in any case, although this is not technically the job of the FRAD. In most cases where a FRAD, whether router or dedicated hardware device, responds at all to the FECN and/or BECN bits, it is by slowing down to the CIR level established for each DLCI affected by the congestion. Bursts above the CIR on those DLCIs are buffered, if possible, or else discarded. After all, frames above the CIR are not guaranteed delivery anyway. Who cares if it is the FRAD or the switch that actually performs the discard?
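Where a FRAD does react, the response just described, throttling each affected DLCI back to its CIR and buffering the rest, might look like the following outline. The shaper state and per-second accounting are assumptions made for the sketch, not any product’s design.

```python
# Sketch of a FRAD reacting to BECN: traffic on the affected DLCI is
# throttled back to the CIR, with bursts above it buffered (shaped).
# The state handling illustrates the behavior described in the text;
# the per-second counter reset is omitted to keep the sketch short.

class DlciShaper:
    def __init__(self, dlci, access_bps, cir_bps):
        self.dlci = dlci
        self.access_bps, self.cir_bps = access_bps, cir_bps
        self.limit_bps = access_bps       # unrestricted until BECN arrives
        self.sent_this_second = 0
        self.backlog = []                 # frames held back for later

    def becn_received(self):
        self.limit_bps = self.cir_bps     # slow down to the CIR

    def congestion_cleared(self):
        self.limit_bps = self.access_bps  # resume the full access rate

    def offer(self, frame_bits):
        if self.sent_this_second + frame_bits <= self.limit_bps:
            self.sent_this_second += frame_bits
            return "send"
        self.backlog.append(frame_bits)   # shape: hold the burst back
        return "buffer"

shaper = DlciShaper(dlci=19, access_bps=256_000, cir_bps=128_000)
shaper.becn_received()                    # congestion reported upstream
for _ in range(3):                        # 192 kbits offered this second
    print(shaper.offer(64_000))
```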
Which Traffic to Discard?

The role of the FRAD in congestion control has introduced the function of the DE bit in the whole congestion control process. The DE bit comes into play because frame relay applications routinely burst above the CIR. As previously mentioned, this is okay, but there is no guarantee that frames sent in excess of the CIR on a given DLCI will actually get there. The CIR essentially forms a bandwidth quality of service guarantee on the frame relay network, although the CIR mechanism is not perfect and even traffic conforming to the CIR might be discarded under congested conditions. Here is why.
The FECN and BECN bits provide a mechanism that effectively kicks in at 50 percent buffer capacity, which should be early enough to avoid severe congestion and a need to discard any user traffic. However, it is clear that if any traffic is to be discarded, it should be the frames tagged upon entry to the network as DE = 1 (may be discarded under certain conditions). After all, if the rates that users pay for DLCIs are higher for higher CIRs, then users can attempt to cut corners and pay for a lower rate CIR and simply burst above this rate constantly. If all DE = 1 frames go through anyway, what is the point of a CIR at all? Maybe all CIRs should just be zero (all frames are tagged DE = 1). In fact, several major frame relay networks only support CIR = 0 service.

So a discard of user traffic should focus on DE = 1 frames. These are frames that the network has not “promised” to deliver, so there should be no repercussions in terms of service level agreements (SLAs) or tariffs. If the DE = 1 frames do not arrive, the missing frame contents (but not necessarily the frame relay frames themselves) will be detected by the receiving higher layer protocol (IP for instance). Then normal error-recovery procedures are taken by the higher layer protocol to either compensate for the missing frames’ information (freeze a video, silence on a voice call) or ask for a retransmission from the sender (routine in data applications). This is why customer premises equipment and applications that respect the CIR have such value in a frame relay network.

Notice that 50 percent buffer capacity on an output port does not trigger the frame relay switch to do anything at all with the frame relay frame contents in the buffer. The switch just begins setting FECN and BECN bits on traffic in both directions, and continues to accept traffic for the overloaded port and send the contents of the buffer out onto the network. Ideally, if all FRADs, protocols, and applications properly interpret and react as recommended to the FECN and BECN bits, this action should be enough. But what if it isn’t? What if the congested buffer continues to fill, from 50 percent to 60 percent to 70 percent and beyond?

At 100 percent buffer capacity, which in our example has been set at 10 frames queued to go out the output port, bad things will assuredly happen. Since buffers are memory areas dedicated for communications purposes, the symptoms of full buffers closely resemble the symptoms a simple PC suffers when it runs out of main memory. The system can hang, crash, or reboot itself. So can frame relay switches. Naturally, this does not mean the frame relay switch will always do one thing or another. The switch could even just switch frames incorrectly. But why take chances? The trick is to never allow the output buffers to reach 100 percent capacity and see what happens in a production environment.

Most frame relay switches will begin to set the FECN and/or BECN bits at 50 percent buffer capacity, or five frames in the example switch. The next major mechanism takes over at 80 percent of buffer capacity. At 80 percent capacity, or when eight frames are now in the buffer, the buffered frames are scanned for the status of the DE bit. If the bit is set (DE = 1), then the frame is discarded. Actually, the term “discarded” is much too concrete to describe what really happens in most cases. Typically, the buffer space is freed, usually by returning the pointer to that buffer space to the pool of free buffers on the port in the switch.
Either way, though, the result is the same: The frame is gone. It is instructive to consider some of the possible outcomes of this DE = 1 scan procedure. After all buffer contents are scanned, the buffer could actually be totally empty! This would be the case if all the frames in the buffer were tagged as DE = 1, which often happens on networks where CIR = 0 is widely used. The process does not halt when buffer occupancy falls below 80 percent, or 50 percent, or some other number; it completes only when all frames in the buffer have been examined for possible discard. Not only is severe congestion alleviated, but the buffer is given a new lease on life. And if the applications have reacted to the FECN and/or BECN bits, it should be a while before the port is again congested. Alternatively, the scan could leave the buffer just as full as it was before the DE = 1 scan commenced! This is not as odd as it sounds, and it actually has a message for those concerned with frame relay configuration and traffic management.
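For illustration only, here is a minimal sketch of the output-port policy just described, assuming the ten-frame example buffer. The Frame fields and class names are invented for the sketch and are not taken from any actual switch implementation.

# A minimal sketch of the output-port buffer policy described above.
# The 10-frame buffer and the Frame fields are illustrative assumptions.
from collections import deque
from dataclasses import dataclass

@dataclass
class Frame:
    dlci: int
    de: int = 0      # Discard Eligibility bit
    fecn: int = 0    # set on frames heading toward the congested point
    becn: int = 0    # set on frames heading away from it (reverse traffic)

class OutputPort:
    CAPACITY = 10                      # 100 percent = 10 queued frames

    def __init__(self):
        self.queue = deque()

    def enqueue(self, frame):
        self.queue.append(frame)
        fill = len(self.queue) / self.CAPACITY
        if fill >= 0.5:
            # 50 percent: begin congestion notification, keep forwarding.
            frame.fecn = 1             # (BECN goes on reverse-direction traffic)
        if fill >= 0.8:
            # 80 percent: scan the entire buffer and free all DE = 1 frames.
            self.queue = deque(f for f in self.queue if f.de == 0)

Note that the 80 percent scan examines every queued frame, matching the all-or-nothing character of the procedure described in the text.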
The output buffers that are of concern here are the buffers located on the network side of the UNI. The destination FRAD is at the other end of the UNI, and the frame is considered to be delivered once it has arrived intact at the FRAD. There could be many other output buffers inside the frame relay network, and often there are. But since there is no requirement to use frame relay to link frame relay switches together inside the cloud (this would be neither a true UNI nor an NNI anyway), the output buffer at the destination UNI is the only output buffer definitely under the control of frame relay specifications.

Recall that only frame relay frames bursting above the CIR are tagged as DE = 1. What if everyone everywhere only sends frames into the frame relay network at the CIR rate? There are many ways this could be accomplished. FRADs could buffer frames before they enter the network, applications could pace themselves, the CIR could be vastly overconfigured, and so forth. In any case, if many DLCIs all lead to a central site, as is often the case, and all sending FRADs respect the CIR, then there will be no DE = 1 frames in the buffer at all, even if the buffer occupancy exceeds 80 percent. Therefore, there will be no frames to throw away. What can be done about congestion in these all-DE = 0 circumstances?

Here is where the "under normal conditions" clause of the CIR promise of assured delivery and guaranteed bandwidth takes effect. Buffers that are congested yet contain no DE = 1 frames at all are not normal in frame relay terminology. If senders can burst, they will; if they do not, that is not normal. And under non-normal conditions, the switch is basically free to do whatever it likes to alleviate the congestion, especially if the alternative is to hang, crash, or reboot the switch. Some switches will just proceed to skim off the excess DE = 0 frames above the 50 percent level. This makes sense because that is the last traffic to enter the switch, and such a procedure does not penalize user traffic that has been sitting in the buffer patiently waiting to be sent. Other switches will simply flush the entire buffer and start from scratch.

Switch vendors often become creative when dealing with DE = 1 frames on the frame relay network. One major frame relay switch vendor made a feature, known as early packet discard, the default action on the sending UNI. The thinking behind this feature seems to have been: "Why wait until output buffers congest at the egress switch to look for DE = 1 frames? Let's find them at the input port at the ingress switch when we screen for CIR compliance. Then any frame that would have been tagged as DE = 1 can be discarded earlier rather than later." The feature worked very well. In fact, it worked so well that no one with a CIR = 0 was ever able to send a single frame through the network. All CIR = 0 traffic was duly tagged as DE = 1 in one process and tossed in the bit bucket by the next process. Since the service provider had nearly every customer configured for CIR = 0 at the time of the switch cutover, users were less than pleased. The only action that could be taken was to either disable the early packet discard or give everyone a non-zero CIR. The service provider quickly handed out CIRs, usually at 50 percent of the UNI physical line rate, since the benefits of early packet discard were still attractive to the service provider.

Users faced with frame relay networks that routinely discard frames tagged as DE = 1 quickly learn a few things.
First, it is always a good idea to provide the network with a few frame relay frames to discard if the network starts hunting down DE = 1 frames. Some FRADs allow users to set the DE bit to a 1 bit intentionally, even if the frame is sent below the CIR. Of course, the FRAD cannot set a DE bit to a 0 bit if the frame is above the CIR and would ordinarily have its DE bit set to a 1 bit. Second, DE bits can be used as a very raw form of priority when set by the FRAD. If a single DLCI carries (for example) IP packets with voice samples in some packets and bulk file transfer data in others, it only makes sense to tag the file transfer frames as DE = 1 and the voice frames as DE = 0, even if all are sent below the CIR. Congestion can occur anywhere in the network and has no respect for real-time requirements.

The discarding of DE = 0 traffic by a frame relay switch is a short-term solution to the problem of congested buffers with no discardable traffic. But there is a long-term solution, one that requires the cooperation of those responsible for network management, planning, and configuration. They must cooperate, whether they work for the customer or the service provider, to treat the repeated situation of 80 percent buffer capacity with no DE = 1 frame to discard as a clear message and a mandate for change on the frame relay network.
This is the message that the consistently 80 percent or more congested output buffer with all DE = 0 frames is sending to the responsible personnel: The receiving UNI's port speed and physical line rate are underconfigured. In plain terms, all senders are respecting their CIRs. The network has allocated sufficient resources to deliver these frames to the destination UNI efficiently. The destination UNI just cannot get rid of them fast enough to prevent congestion. The only long-term solution is to increase the port speed and physical line rate of the destination UNI. This is usually not a trivial task. The proper new speed must be determined (256 kbps instead of 128 kbps? A full T1?). Facilities must be provided and paid for. The receiving FRAD must be configured properly. The cutover must be coordinated and managed. All of these issues are more properly design issues and are not discussed further.
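As an aside, the FRAD-side DE tagging just described is simple to sketch. In the fragment below, the traffic_type labels and the tag_de_bit helper are hypothetical illustrations (reusing the Frame class from the earlier sketch), not a real FRAD configuration interface.

# A minimal sketch of FRAD-side DE tagging as a crude priority scheme.
# The traffic_type values are hypothetical labels, not a real FRAD API.
def tag_de_bit(frame, traffic_type, above_cir):
    if above_cir:
        frame.de = 1    # network rules: traffic above the CIR is always DE = 1
    elif traffic_type == "bulk":
        frame.de = 1    # voluntarily discardable, even below the CIR
    else:
        frame.de = 0    # voice and other real-time traffic stays DE = 0
    return frame

The point of the "bulk" branch is exactly the first lesson above: give the network something safe to throw away.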
What FRADs Can Do about Congestion So far, the discussion about frame relay congestion might give the impression that FRADs are completely at the mercy of the frame relay network and dependent on the higher layer protocols when it comes to congestion control. Therefore, it seems ironic that FRADs are the frame relay network components that are the targets of the FECN and BECN bits. The whole concept of layered protocols and layer independence makes it difficult for the FRAD, operating at the lower layers, to inform the applications operating at the higher layers that they should slow down or speed up. The issue is more than just getting senders to pay attention. Just what constitutes an adequate slowdown? When should senders be allowed to speed up again? These issues have also been mentioned, but so far the answers presented have been more along the lines of general terminology ("usually a 50 percent slowdown") and not much else. Obviously, it would be in everyone's best interest if some standard mechanism were established to allow for vendor independence and customer evaluation of devices on common ground. Standards are always important, especially for public networks. As it turns out, such guidelines do exist. Annex A of ANSI's T1.618 specification on frame relay actually specifies how user devices (routers and/or FRADs) and networks should use and act on the FECN and BECN bits. Since more and more frame relay equipment manufacturers and vendors have pledged to react to the FECN and BECN bits, routinely ignored until relatively recently, this is a good place to discuss what frame relay equipment should do about congestion and information loss.
Forward Explicit Congestion Notification (FECN) Use by User and Network According to the specification, the user device (FRAD) compares the number of frames in which the FECN bit is set to a 1 bit (congestion) to the number of frames in which the FECN bit is set to a 0 bit (no congestion) over a defined measurement period. During this measurement period, if the number of FECN bits set to a 1 bit equals or exceeds the number of FECN bits set to a 0 bit, the user device should reduce its sending rate to 0.875 of its previous value. By the same token, if the number of FECN bits set to a 0 bit equals or exceeds the number set to a 1 bit, the user device is allowed to increase its sending rate by 1/16 (0.0625) of that rate (reflecting the slow start used to restore the sending rate). The measurement interval is to be approximately four times the end-to-end network delay. So if the end-to-end delay is 40 ms, the interval should be about 160 ms or so.

As for the frame relay network use of the FECN bit, the frame relay switch constantly monitors the size of each queue in the buffers based on what is called the regeneration cycle. A regeneration cycle starts when an output buffer goes from idle (empty) to busy (one or more frames). A measurement period is also defined, from the start of the previous regeneration cycle to the present time within the current cycle. During this measurement period the average size of the output buffer queue is computed according to a formula. When this average size exceeds a predetermined threshold value, the output link is considered to be in a state of incipient congestion. At that point, the FECN bit on outgoing frames is set to 1 and remains set to 1 until the average queue size falls below the preestablished threshold.
The ANSI T1.618 specification defines an algorithm to be used to compute the average queue length. The algorithm involves making a series of calculations: the queue length update, the queue area update, and the average queue length update. This process makes use of the following variables:

t = current time
t_i = time of the ith arrival or departure event
q_i = number of frames in the system after the ith event
T0 = time at the beginning of the previous regeneration cycle
T1 = time at the beginning of the current regeneration cycle

For the sake of completeness, the actual algorithm consists of the following three calculations:

1. The queue length update: beginning with q_0 = 0,
   If the ith event is an arrival event, q_i = q_(i-1) + 1
   If the ith event is a departure event, q_i = q_(i-1) - 1

2. The queue area (integral) update:
   Area of the previous cycle = sum of q_(i-1) × (t_i − t_(i-1)) over the interval [T0, T1)
   Area of the current cycle = sum of q_(i-1) × (t_i − t_(i-1)) over the interval [T1, t)

3. The average queue length update:
   Average queue length over the two cycles = (area of the previous cycle + area of the current cycle) / (t − T0)
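One possible reading of this averaging procedure can be sketched in a few lines of code. The congestion threshold of 2.0 frames below is an arbitrary illustrative value, not a number from the specification.

# A sketch of the T1.618 average queue length computation described above.
# The congestion threshold of 2.0 frames is an arbitrary illustrative value.
class QueueMonitor:
    def __init__(self, threshold=2.0):
        self.threshold = threshold
        self.q = 0              # q_i: frames in the system after the latest event
        self.last_t = 0.0       # t_(i-1): time of the previous event
        self.t0 = 0.0           # T0: start of the previous regeneration cycle
        self.t1 = 0.0           # T1: start of the current regeneration cycle
        self.prev_area = 0.0    # queue area over [T0, T1)
        self.cur_area = 0.0     # queue area over [T1, t)

    def event(self, t, arrival):
        # Queue area update: q_(i-1) * (t_i - t_(i-1)) added to the current cycle.
        self.cur_area += self.q * (t - self.last_t)
        if arrival and self.q == 0:
            # Idle-to-busy transition begins a new regeneration cycle.
            self.prev_area, self.cur_area = self.cur_area, 0.0
            self.t0, self.t1 = self.t1, t
        # Queue length update.
        self.q += 1 if arrival else -1
        self.last_t = t

    def congested(self, t):
        # Average queue length over the previous and current cycles.
        if t <= self.t0:
            return False
        avg = (self.prev_area + self.cur_area) / (t - self.t0)
        return avg > self.threshold

When congested() returns True, the switch would begin setting the FECN bit on outgoing frames for that link.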
Backward Explicit Congestion Notification (BECN) Use by User and Network According to the specification, if a user receives "n" consecutive frames with the BECN bit set to a 1 bit, the user should step its traffic down below the current sending or offered rate. The step count (S) governs a sequence of reductions applied in the following order:

0.675 times throughput
0.5 times throughput
0.25 times throughput

In the same fashion, traffic can be built back up after receiving "n/2" consecutive frames with the BECN bit set to a 0 bit. The rate is increased by a factor of 0.125 times the sending rate. The value of S is calculated according to the following formulas:
where

IRf = information rate in the forward direction
IRh = information rate in the backward direction
S = step function count
Thf = throughput in the forward direction agreed during call establishment
Thh = throughput in the backward direction agreed during call establishment
EETD = end-to-end transit delay
N202f = maximum information field length in the forward direction
N202b = maximum information field length in the backward direction
Arf = access rate forward
Arh = access rate backward
Bef = excess burst size forward
Beh = excess burst size backward
Bcf = committed burst size forward
Bch = committed burst size backward
Fh/Ff = ratio (either expected or measured over some implementation-dependent period of time) of frames received to frames sent

The same document recommends that, for network use of the BECN bit, the frame relay network begin setting the BECN bit to a 1 bit prior to experiencing serious congestion and having to discard frames. Of course, if congestion ever reaches the point of creating severe problems, the network will start to discard frames, and frames with the DE bit set to 1 should be the first to go.
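The consecutive-count bookkeeping on the user side can be sketched as follows, under stated assumptions: n is given an arbitrary value of 4, the rate never rises above the agreed throughput, and the step sequence follows the reductions listed above.

# A sketch of user reaction to BECN as described above: step the rate
# down after n consecutive BECN = 1 frames, step it back up after n/2
# consecutive BECN = 0 frames. n = 4 is an assumed illustrative value.
STEPS = [0.675, 0.5, 0.25]      # successive fractions of the throughput

class BecnThrottle:
    def __init__(self, throughput, n=4):
        self.throughput = throughput
        self.rate = throughput
        self.n = n
        self.ones = 0           # consecutive BECN = 1 frames seen
        self.zeros = 0          # consecutive BECN = 0 frames seen
        self.step = 0           # index into STEPS

    def on_frame(self, becn):
        if becn:
            self.ones += 1
            self.zeros = 0
            if self.ones >= self.n and self.step < len(STEPS):
                self.rate = STEPS[self.step] * self.throughput
                self.step += 1
                self.ones = 0
        else:
            self.zeros += 1
            self.ones = 0
            if self.zeros >= self.n // 2:
                # Build back up by 0.125 of the current sending rate.
                self.rate = min(self.throughput, self.rate * 1.125)
                self.zeros = 0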
Windowing and FECN/BECN There are actually four situations to consider in the interplay between windowing protocols and the FECN and BECN bits:

FECN with no traffic loss.
FECN when traffic loss is detected.
BECN with no traffic loss.
BECN when traffic loss is detected.

Consider each case in order. Most end-user protocols employ some form of windowing protocol for end-to-end flow control. In this environment, when a FRAD or user device employs FECN with no information loss detected, it compares the number of frames received with the FECN bit set to a 1 bit to the number received with the FECN bit set to a 0 bit during a measurement interval. The measurement interval should be equal to two window turns. A window turn is the maximum number of frames that can be sent before an acknowledgment is required. If the number of frames with the FECN bit set to a 1 bit is greater than or equal to the number of frames with the FECN bit set to a 0 bit, then the device reduces the window size to 0.875 of its current value. But if the number of frames received with the FECN bit set to a 1 bit is less than the number of frames with the FECN bit
set to a 0 bit, the device increases the window size by one frame, as long as this action would not exceed the maximum window size for the connection. After each adjustment, the whole process repeats itself.

Next, consider the case where the user device has detected not only FECN bits but also some lost traffic and missing information. Not all user devices can do this, but if a frame is missing and the device realizes it, the device should reduce the window size to 0.25 of its current value. But if the device realizes that the frame relay network is providing congestion notification (early frame relay switches did not even use the FECN/BECN bits) and no frames with the FECN bit set to a 1 bit were received during the measurement interval (as previously defined), the device should conclude that the information loss is not due to network congestion. This conclusion is based on the fact that the network would normally send frames with the FECN bit set to a 1 bit if congestion were occurring. So frame loss without any indication of FECN bits set to a 1 bit is assumed to be due to errors on the network and not congestion. If there is further indication of congestion (FECN bits set to 1), then the window size is reduced by a factor of 0.625 instead of 0.25.

The third possibility is when BECN bits indicate congestion but no frames are detected as missing. In this situation, the step count S (as previously defined) is used to adjust the sending rate. The step count S can have several values, but in this example S is assumed to be one window turn. If a frame is received with the BECN bit set to a 1 bit, the device reduces the window size by 0.625. The device will continue to reduce the window size if S consecutive frames with the BECN bit set to a 1 bit are received. Naturally, the window cannot be reduced to less than one frame, so the process eventually halts. But when frames are received with the BECN bit set to a 0 bit, the device increases the window size by one frame after receiving a total of S/2 frames.

The final case is where there are BECN bits indicating congestion and frame loss is detected. Again, this assumes that the user device is capable of detecting the lost traffic. In this case, the device reduces the sending rate to 0.25 of the current rate. This occurs whether the sending rate is being reduced due to congestion notification (BECN bits set to a 1 bit) or the frame relay network does not support frame operations that can set the BECN bit.

Most frame relay networks rely on the user device (for example, a router or desktop PC attached to a FRAD) to perform flow control operations on the UNI. In many cases, the transport layer (end-to-end layer) in the user device performs the flow control function. But if the FECN/BECN bits are to be interpreted and acted on by this transport layer, some mechanism must be put in place for the frames with the FECN and/or BECN bits set to a 1 bit to notify the transport layer of their status. This plan, as simple as it seems, is not so easy to implement. The plan requires changes to higher layer functions and new code to go along with them. Also, many transport layer protocols will time out at the sending end if acknowledgments are delayed or missing. The result is a lot of retransmissions of discarded traffic. This only makes network congestion worse, as the identical traffic pattern that caused the congestion is reintroduced into the network by the unsuspecting transport layer.
So it is not only the FRAD or the router that needs to adjust sending rates in response to frame relay network congestion.
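To make the windowing arithmetic concrete, here is a minimal sketch of the first case (FECN with no loss) under stated assumptions: the measurement interval is two window turns, and the window limits are invented for illustration.

# A sketch of the FECN windowing case described above: count FECN bits
# over a measurement interval of two window turns, then shrink the window
# to 0.875 of its value or grow it by one frame. Window limits are assumed.
class FecnWindow:
    def __init__(self, window=32, max_window=64):
        self.window = window
        self.max_window = max_window
        self.fecn_set = 0
        self.fecn_clear = 0
        self.interval = 2 * window      # two window turns, in frames

    def on_frame(self, fecn):
        if fecn:
            self.fecn_set += 1
        else:
            self.fecn_clear += 1
        if self.fecn_set + self.fecn_clear >= self.interval:
            if self.fecn_set >= self.fecn_clear:
                self.window = max(1, int(0.875 * self.window))
            elif self.window < self.max_window:
                self.window += 1
            # Start a fresh measurement interval after each adjustment.
            self.fecn_set = self.fecn_clear = 0
            self.interval = 2 * self.window

The loss-detected cases would simply replace the 0.875 factor with the 0.25 (or 0.625) reductions described above.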
Consolidated Link Layer Management (CLLM) This chapter has been concerned with many aspects of frame relay congestion notification and control. Notification is handled in frame relay by conveying the FECN and BECN bits in the frame relay frame header to the endpoint devices on a connection. The congested DLCI is indicated by the simple fact that the frame relay header also contains the DLCI of the affected connection. It has been shown that a given sender might receive FECN and/or BECN indications on only one DLCI defined on a UNI, or on more than one DLCI, or on all of them. The use of FECN and BECN is a simple yet powerful notification mechanism that can be used to avoid congestion on the frame relay network. But there is one circumstance where this simple FECN/BECN mechanism just will not work. Mention has already been made of one-way user traffic applications, and several examples have been given. Other user applications, such as the famous network printer (more correctly, networked print server), can pose a special challenge for FECN/BECN as well. The problem is this: How can the status of the FECN and BECN bits be sent to both ends of a frame relay connection efficiently if there is little or no traffic in one direction? Both ends of a frame relay connection must be notified of congestion, since the frame relay network has no idea (and should not have any idea) whether flow control is performed by the sender or the receiver. But if one end never or only rarely receives frames from its counterpart across the network, how can the FECN/BECN information be sent at all? The situation is shown in Figure 6.8.
Figure 6.8 The trouble with FECN/BECN and one-way user traffic.

If there are no or few user frames flowing in one direction on the network, then the FECN/BECN congestion notification method will not work. Some sources say that frame relay networks must provide out-of-band network management messages. But the term out-of-band is typically used by service providers to refer to bandwidth that is not usable for user or bearer traffic, and there is no such thing in frame relay: no bandwidth is dedicated to control functions. In frame relay, out-of-band effectively means not on a user DLCI. So the problem of no or insufficient user frames on user DLCIs to carry timely information such as FECN and BECN status has to be solved by using a nonuser DLCI to convey this type of information. The only question left is exactly which DLCI should be used. ANSI has established that DLCI 1023 (all 1 bits in the DLCI field) is to be used to convey not only FECN and BECN types of information, but also notice of a whole range of network situations that the end devices should be aware of. CLLM frames and messages are sent periodically on all UNIs. Response to CLLM congestion notifications is to be the same as that specified by ANSI and detailed in the previous section. A full discussion of the CLLM protocol is not needed in this chapter. It is enough to note that the CLLM frames on DLCI 1023 can solve the one-way traffic problem. Use of CLLM to address this issue is shown in Figure 6.9.
Figure 6.9 The one-way problem solved with CLLM.

As it turns out, there is a whole family of messages used for out-of-band frame relay network control and management. Certainly the Q.933 signaling messages discussed in the previous chapter belong to this category, and the family of link management protocols falls into this category as well. Link management is the topic of the next chapter in this book.
Frame Relay Compression Although this chapter deals with congestion control in frame relay networks, this section is about compression in frame relay networks. What has one to do with the other? The answer is easy enough. Congestion occurs when too many user traffic bits are sent into the network. Compression algorithms remove redundancies in user traffic and thus reduce the number of bits sent into the network. So the effective and standard use of compression applied to user traffic can reduce the risk of congestion in a frame relay network.

The Frame Relay Forum is the source of the standard way to do compression on the contents of frame relay frames. FRF.9 is the Data Compression Over Frame Relay Implementation Agreement (IA). Now, there is nothing to stop FRAD vendors from using whatever data compression techniques they wish in their equipment. The problem is that a remote FRAD from another vendor might not be able to decompress the frame contents correctly in all circumstances. This multivendor interoperability is what IAs like FRF.9 are all about.

It should also be noted that nothing prevents the higher layer applications at the endpoints of the frame relay network from using whatever form of data compression they wish. In fact, applications often use compression techniques such as the .zip file format to carry information across a network, not just a frame relay network. In many cases, an IP packet inside a frame relay frame carries a portion of the zipped file. But FRF.9 does not apply just to the content of the packet inside the frame relay frame. FRF.9 applies to the entire content of the frame relay frame, including the packet if present, and that is the difference.

FRF.9 defines a compression technique called the Data Compression Protocol (DCP). The same document also specifies how to encapsulate the DCP content inside a frame relay frame. FRF.9 also defines a default Data Compression Function Definition (DCFD) to cut down on some of the options that are available to implementers of FRF.9. FRF.9 applies only to frames having a Control field (a Q.933 Annex E frame) and does not apply to signaling or other types of control messages. Use of the Control field enables receivers to determine which DLCIs have FRF.9 frames as their basic traffic unit. FRF.9 works with both PVCs and SVCs, and also works when used in conjunction with another network such as an ATM network.

The DCP itself is divided into two sublayers, the DCP Function sublayer and the DCP Control sublayer. From the user perspective, the Function sublayer is the most important. The DCP Function sublayer performs the actual encoding and decoding (compression/decompression) of the frame contents and can use a wide variety of public and proprietary compression algorithms to do so. The DCP Control sublayer manages and coordinates the whole process. These control services include:

Identification of the various DCP contexts and the exact format of the frame contents.
A form of anti-expansion protection, so that messages sent on a DLCI that are not compressed will not be subjected to the decompression process at the receiver.
Negotiation of the exact form of DCP to be used on a connection, including the options to be used and the precise DCP Functions supported.
What is called synchronization of the sender and receiver, so that missing frames with compressed content can be detected and resynchronization performed.
The default DCFD is described in Annex A of FRF.9. Unless the two end devices agree to do otherwise, frame relay data compression will consist of a 1- to 3-octet DCP Header without extensions. The values of the fields in the DCP Header are all given default values, and the data that follows the DCP Header can be compressed or uncompressed, as the contents dictate. The default compression protocol used is LZS-DCP (the popular Lempel-Ziv compression method from Stac, Inc., adapted for DCP use) with a few user-settable parameters. Full FRF.9 implementation is quite complex. Those interested in the details of frame relay compression in terms of formats, codings, and procedures are referred to the relevant sections of FRF.9 itself. Many frame relay networks rely on user applications to generate compressed information, and many applications do. Web sites use zip-file formats, voice applications can use 8 kbps or even lower rate voice, and digital video has a number of more or less built-in compression techniques. But for those who want frame relay networks to be able to address compression issues directly, FRF.9 is always available.
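To see the congestion benefit in rough numbers, consider the following sketch. Python's zlib (a Lempel-Ziv-family compressor) stands in for LZS-DCP here purely for illustration, and the repeated payload is invented.

# A rough illustration of compression reducing the bits offered to the
# network. zlib is a stand-in; FRF.9 itself specifies LZS-DCP, and the
# payload below is an invented example of highly redundant traffic.
import zlib

payload = (b"GET /index.html HTTP/1.0\r\nHost: example.com\r\n\r\n") * 20
compressed = zlib.compress(payload)

print(len(payload), "octets before compression")
print(len(compressed), "octets offered to the frame relay network")
# Redundant traffic like this shrinks dramatically, which directly
# reduces buffer occupancy at congested output ports.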
Chapter 7: Link Management Overview Frame relay link management performs a crucial function on the standard interfaces defined in frame relay networks. The two standard interfaces concerned with link management are the user-network interface (UNI) and the network-network interface (NNI). It is always good to remember that the interswitch interfaces between frame relay network nodes are undefined by frame relay standards, and vendors are free to explore and improvise as they wish for this network node interface. So link management remains strictly a UNI and NNI concern. The Local Management Interface (LMI) is a specific form of frame relay link management procedure, and LMI is sometimes used as a generic term for frame relay link management. The fact is that frame relay, amidst an acronym-laden world of networks, has no convenient acronym for the link management procedures on the UNI and NNI interfaces.
Introduction There is more to networking than delivering bits representing user information from source to destination. It is easy to think only of data delivery when talking about networks, since this data delivery function is the one most dear to users' hearts and minds. But this delivery of information must be both controlled and managed. Some of these control functions were discussed in the previous chapters on frame relay signaling protocols (connection control, flow control, and congestion control). This chapter focuses on the management side of network architectures, in particular the management of the UNI and NNI. Each standard interface, UNI or NNI, has its own unique set of requirements in terms of link management operation. The link management operations and functions for the UNI are fully treated in this chapter. Most of the link management operations and functions for the NNI are discussed in this chapter as well, but some of the details are left until the next chapter, which focuses on the NNI itself. The scope of the link management procedures is shown in Figure 7.1.
Figure 7.1 The scope of the link management procedures.

The figure shows that link management procedures run on the frame relay UNI and NNI. Link management does not, however, run between ports of frame relay switches within the network cloud. Vendors are more or less free to use their own vendor-specific procedures between frame relay switches. This lack of a frame relay standard between switches has given rise to what the Frame Relay Forum calls cell-based frame relay, where the frame relay switches are linked inside the cloud by an ATM network based on ATM standards and specifications. Because ATM has a more complete set of service classes for delivering QoS to applications than frame relay standards define, there is some advantage to linking frame relay switches over an ATM network. So the lack of a standard link management procedure between frame relay switches does not imply that a frame relay network is not manageable internally. Far from it. It is just that the network internally uses more equipment-oriented procedures that are linked either directly or indirectly to the service provider's operations, administration, maintenance, and provisioning (OAM&P) hardware and software for the purposes of managing the frame relay network.

What exactly is link management for? Some books and articles make it seem that link management is simply a way for FRADs to discover that the UNI into the frame relay network has failed, even when the link is not currently active. But just thinking about the UNI for a few minutes will lead to the realization that there must be more to link management than that. And indeed there is. After all, the frame relay UNI is a synchronous bidirectional communications link, like an ISDN digital subscriber line (DSL) or a leased private line running SNA. All synchronous links constantly stream idle patterns (technically, "interframe fill") of bits back and forth over the link when they are not sending live traffic frames. If the patterns disappear, even when there is no live traffic, then both ends of the link immediately know that the link has failed. If this works in other networks, why does frame relay need more?
The simple answer is that frame relay needs more than simple idle or fill pattern disappearance to indicate UNI link failure because frame relay does more than just provide a passive end-to-end bit pipe for users. Users on a frame relay network are at the ends of two separate local UNIs and have no direct view of conditions inside the frame relay network cloud. Perhaps a simple analogy will make this increased need for information in frame relay networks evident.

After a hard day's work and a satisfying dinner, some people like to sit in front of their television set and watch their favorite cable TV channel (others, strange as it seems, prefer to read and/or write). But sometimes the picture on the channel just disappears, replaced by a blank screen. Now the cable TV user is faced with a number of questions that need answers before an alternative form of relaxation is considered, selected, and pursued. Some of these questions, but probably not all, are mentioned here.

Is the effect local to the channel connection itself? That is, if the channel is changed, are there pictures on any other channels? If there are many channels, this might take a while to determine with absolute confidence.

Is the effect local to the cable TV connection? That is, are neighbors' cable TVs affected or not? This might involve making telephone calls to several neighbors (assuming the telephone service provider is not the cable TV company!), which might prove nothing if the neighbor says "I don't have (or I'm not watching) cable TV."

How long will the channel/system be unavailable? That is, did a lightning strike temporarily cause the outage on a small branch of the network, or did a tornado take out the central head-end site entirely for what will be an indeterminate period?

This example list could easily be extended, but the point has been made. Just realizing that a network link is down is not the same as knowing the extent, cause, severity, and probable duration of the outage. The proper response might vary based on the actual value of one or more of these variables. One set of values leads to reading a book for a change, while another leads to the immediate cutover to a satellite dish and system.

A FRAD at the end of a frame relay UNI always knows exactly when the link has failed: the interframe fill pattern present before is now gone. But what if a remote UNI has failed? Then some PVCs will continue to carry traffic while others cannot. What if it is the frame relay switch that has suddenly failed? Then all UNIs might lose their expected bit patterns in one direction, but how is the FRAD to determine this? How widespread is the failure? And so on. Again, the point is that whether the FRAD should employ a dialup ISDN link to the frame relay switch or perform some total disaster recovery procedure depends entirely on the answers to these questions. But usually only the network has access to the information that the FRAD needs to determine the proper course of action. There must be some standard mechanism defined in frame relay to allow the FRAD to query the frame relay network for the types of management information that let the FRAD, and the users supported by the FRAD, make informed decisions when considering and choosing alternative courses of action. This is what the link management procedures do.
Managing the Links In spite of its name and usage, link management is not really frame relay network management. No one would claim that link management alone should or could be used to manage a frame relay network. Full network management protocols include ways and means for network personnel to perform activities such as initialization, configuration, troubleshooting, and so on for the network in question. This is not to say that the link management procedures are not helpful to network management personnel as they go about their tasks. But link management in and of itself is not the frame relay version of something like the Internet protocol suite's Simple Network Management Protocol (SNMP) used for devices that run TCP/IP. Instead, link management is more of a basis for information that can be made available to network managers by way of SNMP. This theme will be discussed more fully at the end of the chapter. First it is necessary to explore the link management procedures themselves in more detail.

Link management is so important in frame relay networks that not one or two, but three, organizations have established specifications as to how frame relay user devices (FRADs) and the frame relay network should exchange link management information. In actual practice there are only two link management protocols, however, and there is never any concern about needing to run all three at the same time. Why should there be three in the first place? It is mainly because the three organizations worked on their versions of link management independently and at different, but overlapping, times. All of this might seem somewhat mysterious, but it happens all the time in standards groups and vendor consortiums.

Frame relay link management procedures have had an interesting history. The initial experiments with frame relay networks included no link management procedures at all, or proposed performing link management in a variety of proprietary and mutually exclusive ways. At first glance, it might seem strange that a standard network scheme based on ITU-T standards would not include link management procedures. There was a good reason for that apparent oversight, however. The ITU-T saw frame relay as a part of an overall ISDN, as has been previously mentioned. Frame relay was basically the new and improved version of the X.25 packet switching included as part of an ISDN. So a lot of the link management types of information needed for network management could be bundled with the overall ISDN procedures. To the ITU-T, all non-information data units are used for "signaling," so link management procedures were simply other types of signaling messages.

The close relationship envisioned between frame relay and ISDN posed somewhat of a problem for early implementers interested in frame relay. X.25 networks could be built and used apart from ISDN. Why could frame relay networks not also be built and used apart from ISDN? But then how could adequate link management information be provided specifically for a frame relay network without the benefit of having an ISDN to rely on? X.25 procedures would not just port over to the frame relay environment because all of Layer 3 was essentially absent on frame relay devices. Therefore, it was clear that straight ISDN procedures could not be used on frame relay networks.
As a result, a group of equipment vendors got together and decided to create their own link management procedures to use until the standards organizations such as ANSI (in the United States) and the ITU-T (then the CCITT) came up with their own ways to perform link management procedures without the presence of an ISDN. The two groups cooperated about as well as could be expected, given the scope of their task, but on one point neither was willing to give an inch. This sticking point involved the choice of DLCI used to convey the link management information.
The ITU-T, in line with the philosophy of "all non-user stuff is signaling," wanted link management to use DLCI 0. Traditionally, signaling always used the lowest connection or device number available, and there was nothing lower than DLCI 0 for this task. On the other hand, ANSI saw the link management procedures as more closely aligned with network management and troubleshooting procedures. Traditionally, network management and exception reporting are done on the highest connection or device number available. To ANSI, signaling in the form of call control belonged on DLCI 0, but link management belonged on some other, higher-numbered DLCI. In most implementations of frame relay, there is no connection number higher than ten 1 bits, or DLCI 1023.

The debate over which DLCI to use for link management was spirited, to say the least. But this debate was hardly a matter of intense importance to the equipment vendors interested in manufacturing frame relay equipment before the new millennium. Four of these early frame relay equipment manufacturers got together and formed the Frame Relay Implementers' Forum (FRIF). The companies were router vendor Cisco Systems (sometimes, and more properly, seen as lowercase "cisco"), computer maker Digital Equipment Corporation (DEC, now part of Compaq), switch vendor Northern Telecom (now Nortel), and packet voice pioneer StrataCom (now part of cisco). All in all, they were a mix of vendors that could build a complete frame relay network for many types of traffic among themselves. Once the group began to grow and nonimplementers such as service providers and even end users joined, the organization became known as the Frame Relay Forum (FRF). The FRF worked in terms of Implementation Agreements, or specifications, not standards or official recommendations with the force of law. The FRF quickly decided to endorse the use of DLCI 1023 for the Local Management Interface (LMI) specification (call control signaling would still use DLCI 0, but initially the FRF just used PVCs, of course). The FRF published its version of the LMI link management procedures in September 1990. Due to the prominence of cisco in the FRF and the router industry, and the fact that 50 percent of the original Gang of Four, as they were known, is now just cisco, this LMI method that uses DLCI 1023 is sometimes known as cisco LMI.

Although the equipment vendors decided to create the consortium LMI (yet another name for the DLCI 1023 LMI procedures), this did not stop ANSI and the ITU-T from proceeding on their own. By October of 1991, ANSI had produced a link management Annex D to its frame relay specification, T1.617. Either as a result of subtle pressure from the ITU-T, or because ANSI realized that the ITU-T was right all along, Annex D specified the use of DLCI 0 for link management. To complete the triumvirate, the ITU-T completed work on link management Annex A in June of 1992, naturally using DLCI 0 as well. The ITU-T recommendation for link management has much in common with the ANSI version.

So, although three organizations were involved with the link management specifications, only two DLCIs were involved: DLCI 0 (ITU-T and ANSI) and DLCI 1023 (consortium LMI or cisco LMI). This was okay because no frame relay network should ever have to try to run both the ANSI and ITU-T versions on any single UNI or NNI. The rules regarding which to use when are quite clear. If both ends of the UNI or NNI are within the borders of the United States, then ANSI Annex D is used on the UNI or NNI.
If both ends are not within the borders of the United States, then ITU-T Annex A must be used. Since any individual UNI or NNI cannot belong to both classes at the same time, a UNI or NNI is said to be running either Annex D from ANSI or Annex A from the ITU-T. One could hardly wish to use DLCI 0 for both types at the same time anyway (although it is technically possible). But the same logic does not apply to the LMI from the founders of the Frame Relay Forum (hereafter just FRF LMI). This FRF LMI could be run anywhere, at any time, on any UNI or NNI, since the FRF LMI was only an implementation agreement and not really a standard. The FRF LMI came first, and by the time ANSI and the ITU-T were ready, the FRF LMI had a huge embedded base of FRADs and switches all merrily using DLCI 1023 for link management purposes. Most vendors soon decided that, in the best interests of service providers and customers alike, and to better conform to national and international standards, FRF LMI should be treated as an interim solution on the road to ANSI Annex D or ITU-T Annex A. Converting from FRF LMI to the ANSI or ITU-T versions was usually a software upgrade, but since the FRAD was CPE and controlled by the customer, there was no easy way to coordinate or enforce the upgrade process. Older FRADs usually supported only FRF LMI. Today, most FRADs allow configuration for either FRF LMI or the ANSI/ITU-T annexes.
While FRADs typically use either FRF LMI or the ANSI/ITU-T annexes, but not both at the same time, frame relay switches must be able to detect whether a FRAD on a switch port is using FRF LMI on DLCI 1023 or the ANSI/ITU-T annexes on DLCI 0, and respond accordingly. Not so long ago, this coordination was a monumental task and had to be done by manual configuration at the customer premises and at the switch. Now all frame relay switches have an auto-detection feature to configure themselves for the correct version in use on a UNI. The user can even upgrade from a FRAD using FRF LMI to a FRAD using ANSI Annex D, and the switch will go right along with the change from DLCI 1023 to DLCI 0.

The days of FRF LMI appear to be numbered. The Frame Relay Forum itself notes that FRF LMI is not an implementation agreement and does not even maintain a version of the LMI documentation on its Web site. So the future of FRF LMI is uncertain, except perhaps in the original vendors' equipment packages (e.g., cisco products), and even then mostly due to embedded-base considerations. In view of this, the rest of this chapter will focus on the ANSI and ITU-T specifications, noting a few of the basics regarding FRF LMI.

Table 7.1 lists some of the major differences between FRF LMI and ANSI Annex D/ITU-T Annex A. The only entry not discussed in detail so far is the fact that FRF LMI is a unidirectional, or asymmetric, protocol only. Both the ANSI and ITU-T link management procedures are unidirectional on the UNI, but can be bidirectional, or symmetric, on the NNI. This means that with FRF LMI (asymmetric), a different type of message is always sent from FRAD to network than from network to FRAD. The ANSI/ITU-T annexes, on the other hand, can be either unidirectional (asymmetric on the UNI, with different messages depending on direction) or bidirectional (symmetric on the NNI, with the same messages in both directions). In this case, symmetric means that the same types of messages go back and forth on the NNI in either direction. With asymmetric operation, the message depends on which device originates it.

Table 7.1 Link Management Specifications for Frame Relay Networks

COMMON NAME   ORIGIN         USAGE                   DLCI   OPERATION
LMI           Gang of Four   Any UNI                 1023   Unidirectional
Annex D       ANSI T1.617    U.S. UNI/NNI            0      Uni/Bidirectional
Annex A       ITU-T Q.933    International UNI/NNI   0      Uni/Bidirectional
It is somewhat confusing that LMI is used in this chapter both as the name of the FRF specification and as if it were a generic acronym for link management procedures, but there is no easy alternative acronym to use. When dealing with the FRF LMI specifically, especially in contrast to the Annex D or Annex A link management procedures, every effort will be made to avoid confusion.
Link Management Messages In frame relay, the link management messages are carried by LAPF unnumbered information (UI) frames, meaning these frames contain a control field in addition to and immediately following the frame relay header (address field). The link management messages always flow on either DLCI 0 or DLCI 1023, depending on whether LMI or Annex A/Annex D link management is used. It is the reception of frames on either DLCI 0 or DLCI 1023 that indicates to the receiver that the control field is present in the first place. Recall that link management messages are considered by ANSI and the ITU-T to be simply another type of signaling message that contains this control field. There are two main types of link management messages that are present in the information field of a DLCI 0 or DLCI 1023 frame. These are the Status Enquiry and Status messages, which were included as part of the full signaling message suite. There is also an Update Status message that exists in LMI, but this section will emphasize the link management messages used with ANSI Annex D or ITU-T Annex A. Figure 7.2 shows the general structure of a frame relay frame containing a link management message.
Figure 7.2 Link management message frame format.

The main purpose of the link management messages is to allow the customer device, the FRAD, at the end of a UNI to determine first of all whether the UNI link to the frame relay switch is up and running, and second, which DLCIs are currently defined on the UNI. Note the emphasis on the customer premises device's role in the link management process. After all, if a UNI fails at 3:00 a.m. on a Sunday morning, there is not much chance of the customer finding out before 8:00 a.m. on Monday that user data will not pass across the UNI. This is probably the worst possible time for anyone to realize that a link to a network is not available. Therefore, the link management messages allow customer equipment to detect UNI failures at any time of day or night. Variations of the basic UNI forms of link management messages are defined for use on the NNI also, but the UNI usage is presented first.

The user device always sends the Status Enquiry message to the frame relay network on the UNI. The purpose of this message is fairly self-explanatory: What is the status of the UNI link and the DLCIs defined upon it? The network always responds with a Status message. This explains the asymmetric nature of the link management messages used on a UNI: Status Enquiry messages are always from users; Status messages are always from the network. Some of the Status Enquiry messages are simple keep alive messages (and are called such in LMI) that detect basic UNI connectivity. These Status Enquiry messages are called Link Integrity Verification (LIV) messages in Annex A and Annex D. There is no information about DLCI status provided by these keep alive messages, only sequence numbers to allow both ends of the UNI to determine if any Status Enquiry or Status messages have been missed. Status messages with DLCI information are called full Status messages.

How often does a FRAD on a UNI send a Status Enquiry message to the frame relay switch, and when is the response a simple LIV Status or a full Status message? This depends on the value of a timer and a counter. Both are configuration parameters that must be coordinated in the FRAD and at the frame relay switch port, although both have default values that are seldom changed. The timer is the Polling Interval Timer (T391) and the counter is the Polling Interval Counter (N391). The numbers do not imply that there are 390 other timers and counters in frame relay; that is simply what they are called. LMI calls them the nT1 timer and the nN1 counter, but their purpose and use are identical to those of their ANSI/ITU-T counterparts.
Most link management message exchanges are simple LIV exchanges to make sure the UNI is up and running. Every 10 seconds, which is the T391 default value, the user premises device (FRAD) sends a Status Enquiry message on the UNI containing a sequence number from the FRAD and a sequence number representing the last sequence number received from the network side of the UNI. The network responds as quickly as it can with a matching Status message containing a sequence number from the network and a sequence number representing the last sequence number received from the FRAD. The sequence numbers increment with each exchange and eventually roll over and start again. Each end of the UNI should see each pair of numbers increase without gaps in the sequence under normal circumstances, of course. Gaps represent missing LIV messages. The T391 timer can be set in the FRAD between 5 seconds and 30 seconds, but the default of 10 seconds is seldom modified.

After some number of LIV messages, set by the N391 counter with a default value of 6, the Status Enquiry message requests a full Status message response from the network. So, once per minute with the default values, the network sends a full Status message to the FRAD on the UNI. The full Status message contains information that should reflect the status of every DLCI established on the UNI. Although there is a limit on the number of DLCIs that can be reported in a full Status message, this limit is seldom of concern in real, working frame relay networks. The N391 counter is set in the FRAD and can vary from 1 to 255, but the default of 6 is seldom modified. Taken together, the Status Enquiry and Status message pairs allow a FRAD to detect a UNI failure quickly and determine, minute by minute, whether a particular DLCI is available.

There is a related timer in the frame relay switch, the T392 timer (how long should the switch wait for a Status Enquiry message?), which must exceed the T391 timer value in the FRAD. This way the network always sees a Status Enquiry message before the switch records one as missing; otherwise, the UNI could be declared down by the network when it is in fact still available. The T392 timer, called nT2 in LMI, can be set between 5 and 30 seconds, with a default of 15 seconds.

The frame relay switch also maintains two other counters, the N392 and N393 counters. These count the number of missing Status Enquiry messages (N392) and the total number of expected Status Enquiry messages (N393). In LMI, the N392 counter is the nN2 counter and the N393 counter is the nN3 counter. The default value of the N392 counter is 3, but it can be set between 1 and 10. The default value of the N393 counter is 4, but it can be set between 1 and 10 as well. An alarm is sent to the frame relay network operations center (NOC) if there are three Status Enquiry messages in error (the N392 threshold) out of four Status Enquiry events (the N393 count). Usually, this boils down to an alarm when three consecutive expected Status Enquiry messages are missing (30 seconds with default timers).
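The FRAD side of this polling cycle, with the default T391 and N391 values, boils down to logic like the following sketch. The send_status_enquiry function is a hypothetical placeholder, not a real FRAD API.

# A sketch of the FRAD polling cycle described above, using the default
# T391 (10 s) and N391 (6) values. send_status_enquiry() is a hypothetical
# placeholder for whatever actually builds and transmits the message.
import time

T391 = 10          # Polling Interval Timer, in seconds (default)
N391 = 6           # Polling Interval Counter (default)

def send_status_enquiry(full):
    kind = "full Status" if full else "LIV (keep alive)"
    print("Status Enquiry sent, requesting a", kind, "response")

def polling_loop():
    cycle = 0
    while True:
        cycle += 1
        # Every N391st enquiry asks for a full Status report;
        # the rest are simple link integrity verification exchanges.
        send_status_enquiry(full=(cycle % N391 == 0))
        time.sleep(T391)

With the defaults, this yields one full Status report per minute, exactly as described above.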
Link Management Message Formats The structure of the frame relay link management messages looks very much like the structure of the frame relay signaling messages described in Chapter 5. This is no accident, of course, because to the ITU-T and ANSI, link management messages are just another kind of signaling (i.e., non-user-information-carrying) frame. This section will detail the structure of the ANSI and ITU-T Status Enquiry and Status messages. Some mention will be made of the LMI message formats, but only in passing.

The overall structure of the ITU-T Q.933 Annex A and ANSI T1.617 Annex D Status Enquiry message is shown in Figure 7.3. Note the similarities between the two. Consider the ITU-T Annex A format first. The DLCI is 0, and the Control field value of 3 in hexadecimal (03h) indicates that this is an unnumbered information (UI) frame (regular signaling messages are sent as Information [I] frames that have send and receive sequence fields in the Control field). The Protocol Discriminator field that follows is still present and used to identify the exact signaling protocol used. As in all frame relay signaling messages based on Q.931, this field is set to 0000 1000, which is nothing more than the number 8 in hexadecimal (08h).

The next octet in the Status Enquiry message header forms the Call Reference field. The first four bits of this field are always 0000, and the next four bits give the exact length of the call reference value itself. For link management messages, the Call Reference is always exactly zero in all 8 bit positions. The all-zero Call Reference not only means that the Call Reference field is just this one octet long, but also serves as a Dummy (the term is used consistently in the standards) Call Reference value. The Dummy Call Reference makes sense in link management messages because the Call Reference value is used to track demand connections internally on the frame relay network. There is no connection to track in an exchange of Status Enquiry and Status messages, however, so the Dummy Call Reference value keeps the field intact but lets the network effectively ignore it.
Figure 7.3 ITU-T Annex A and ANSI Annex D Status Enquiry message.

The next octet is the Message Type. For a Status Enquiry message, this field is set to 0111 0101, or 75 in hexadecimal (75h). The Status message sent in response has the message type field set to 0111 1101 (7Dh). Note that these three fields (protocol discriminator, dummy call reference, and message type) make up the signaling message header, as in all Q.933 signaling messages, although the presence of the dummy call reference field makes the signaling message header for link management messages only three octets total instead of the usual five octets used for SVC signaling.
After the three-octet header, the Status Enquiry message consists of two Information Elements (IEs), in keeping with the whole link-management-is-signaling philosophy. Both IEs have the familiar IE structure: a 0 bit followed by a 7-bit IE identifier in the first octet, a second octet giving the length in octets of the IE contents, then the octets of the contents themselves. Both of the Status Enquiry IEs must always be present.

The first IE in the Status Enquiry message is the report type IE, which is always three octets in length. The first octet is set to 0101 0001 (51h). The second octet gives the length of the report type contents, a single octet for this IE. The third octet is the type of report field and indicates mainly whether this is a link integrity verification (LIV) Status Enquiry message (0000 0001) or a full Status message (0000 0000). There is also a type of report known as the single PVC asynchronous status report, which can be used to request status information on a particular DLCI; however, this type of report is for further study and not used at this time.

The second and final IE in the Status Enquiry message is the LIV IE; it is used mainly to exchange the send and receive sequence numbers used for LIV purposes. This IE is four octets long. The first octet is set to 0101 0011 (53h). The second (length) octet is set to two, for the two octets of contents that follow. These final two octets carry the send sequence number (this Status Enquiry's number) and the receive sequence number (the send number from the last Status message received from the network). Both numbers sequence from 0 to 255, then repeat. They roll over in about 40 minutes with the default value of one Status Enquiry every 10 seconds.

The only major difference between the structure of the ITU-T Annex A Status Enquiry message and the ANSI Annex D Status Enquiry message is the presence in the ANSI format of the locking shift field (considered to be an IE all its own). The locking shift field is a single octet that appears between the message type field and the report type IE, and so forms a type of header extension to the signaling message header in ANSI implementations. The locking shift field is used to indicate that the IEs that follow are coded according to the ANSI formats (called codeset 5) instead of the ITU-T Q.931 formats previously described. The locking shift octet has a structure all its own. The first bit is a 1 bit, followed by 3 bits set to 001, which is the shift identifier. The fourth bit is set to a 0 bit, which triggers the locking shift action itself. Finally, the last 3 bits indicate the codeset invoked by the shift in coding. In ANSI Annex D, this field is set to 101, the number 5 in decimal. So the whole locking shift field, or IE, is 1001 0101, or 95 in hexadecimal (95h). This locking shift basically adds 50h to (or pastes a "5" in front of) the codes in the IEs that follow. So, for example, in ANSI Annex D the report type IE identifier is 01h instead of 51h, and the LIV IE is coded as 03h instead of the 53h used in ITU-T messages.

The LMI Status Enquiry message is not illustrated but is a kind of blend of the ITU-T and ANSI coding and formats. That is, the locking shift octet is not present, but the coding follows the ANSI values anyway. So LMI uses 01h in the report type field for an LIV message instead of the ITU-T coding of 51h, and so on. Of course, LMI messages flow on DLCI 1023 and not DLCI 0.
The LMI Status Enquiry message is not illustrated, but it is a blend of the ITU-T and ANSI coding and formats. That is, the locking shift octet is not present, but the coding follows the ANSI values anyway. So the LMI uses 01h in the report type field for an LIV message instead of the ITU-T coding of 51h, and so on. Of course, LMI messages flow on DLCI 1023 and not DLCI 0. Finally, the LMI sets the protocol discriminator field to 0000 1001 (09h). These LMI differences (no locking shift, but ANSI coding, and a protocol discriminator of 09h) are consistent, so no further discussion of LMI details is necessary. Those still interested in LMI details are referred to cisco's own documentation and implementation. (Be warned, however, that there are quirks in cisco's LMI implementation that are vendor-specific.)

What response does the Status Enquiry sent from the FRAD evoke from the switch? This depends on whether the Status Enquiry is an LIV, or keepalive, message sent every 10 seconds (the default) or a full Status request sent every minute (the default). In both cases the response is a Status message, but the contents of an LIV Status message and a full Status message are radically different in size and use. Consider the LIV Status message first.
The LIV Status message is basically the mirror image of the Status Enquiry message shown in Figure 7.3. That is, the message flows on DLCI 0, it rides in a UI frame, and it has a signaling message header three octets in length. The biggest difference is that the message type field in the signaling message header is coded as 0111 1101 (7Dh) instead of the 0111 0101 (75h) of the Status Enquiry message. Both the report type and LIV IEs are present, in their familiar shapes and sizes. Naturally, the send and receive sequence numbers reflect the network perspective, not the user perspective. In the LMI version, the locking shift field (IE) is not present, the message flows on DLCI 1023, and the protocol discriminator field is set to 0000 1001 (09h).

The full Status message is more interesting, but more complex. It contains information about the status of each and every PVC, by DLCI, established on the UNI. (This works on the NNI as well, as will be shown later.) Consider the case where a FRAD knows that the UNI is up and running, based on the success of the LIV exchanges, but frames sent on a particular DLCI receive no response. Clearly, something is wrong, but what? Before the customer initiates various troubleshooting activities, the full Status message at least provides a basis for where such a troubleshooting process could and should begin. Perhaps the situation is more of a "never mind" condition at the endpoints of the PVC rather than a network trouble or failure.

The Status Enquiry message that requests the full Status report from the network is unremarkable. The only notable difference from the formats illustrated in Figure 7.3 is that the type of report field in the report type IE is set to 0000 0000 (00h) instead of the 0000 0001 used for LIV polls. With the default polling counter, this full Status request is sent in place of every sixth LIV message. The LMI format is the same and uses the ANSI coding for the IEs, as previously mentioned. The real action in the full Status message is in the response from the frame relay network to the full Status Enquiry message from the customer premises device. The overall format of the full Status message is shown in Figure 7.4, again for both the ITU-T and ANSI formats. The LMI format is mentioned later.
Figure 7.4 ITU-T Annex A and ANSI Annex D full Status message. The familiar signaling message header is present in the network's full Status message, coded with 08h (0000 1000) as the protocol discriminator, 00h (0000 0000) as the dummy call reference, and 7Dh (0111 1101) as the message type, since this is a Status message and not a Status Enquiry. The ANSI form has the locking shift. The report type IE identifier is either 51h (0101 0001) for ITU-T Annex A or 01h (0000 0001) for ANSI Annex D. In both cases, the type of report field indicates full Status, or 00h (0000 0000). The LIV IE, 53h (0101 0011) for ITU-T or 03h (0000 0011) for ANSI, is also present and still carries the send and receive sequence numbers.
The most interesting part of the full Status message follows the initial IEs. For the rest of the full Status message, up to the maximum frame relay frame size allowed on the UNI, there are simply repeated PVC Status IEs. These are usually five-octet IEs, but some forms can extend to seven or even eight octets. The purpose of these IEs is to give the FRAD at the customer end of the UNI a minute-by-minute (using the default timers and counters) update on the state of all the DLCIs the network knows about on the UNI. DLCIs are always reported in numerical order, making it easier for receivers to tell whether the status of a particular DLCI is being reported.

However, there is an apparent problem here. Most frame relay networks employ a DLCI numbering scheme from 0 to 1023, allowing about 975 user DLCIs after reserved values are subtracted. But most frame relay networks also allow a maximum frame relay information field size of 4096 octets, and some allow much smaller sizes (e.g., 1600 octets). If each full Status message uses 10 octets for the signaling header and initial IEs, then the full Status message can only report on the status of some 817 DLCIs (4086 octets at 5 octets per DLCI). So it might seem that more than 150 DLCIs could exist on a UNI that the network could never report on, since there is no provision for extending the reporting fields into a second full Status frame sent in response to a Status Enquiry. Fortunately, this is only a problem on paper and not in the real world. No one has ever seen a UNI with anywhere near 800 DLCIs defined. And even if this were possible without worrying about congestion, it would be unlikely that all of the DLCIs would be PVCs. And SVCs, or demand connections, need not be reported in the full Status messages.

The LMI format of the full Status message again mixes some aspects of the ITU-T form (no locking shift) and the ANSI form (ANSI coding of IEs), and sets the protocol discriminator field to 0000 1001 (09h). Oddly, the LMI has an option to add three additional octets to the five octets of the Status IE. The three octets are used to indicate the PVC bandwidth on the DLCI reported. The PVC bandwidth is defined as the minimum bandwidth that the frame relay network has assigned to the DLCI, expressed in bits per second. This field is also an option in ANSI Annex D, but it has pretty much disappeared today.

The Status IE has a structure, too, as shown in Figure 7.5. Thankfully, there are only two differences between the ITU-T Annex A IE and the ANSI Annex D version, so a separate figure for each is not used. The first difference is in the identifier code, which is 57h (0101 0111) in the ITU-T version and 07h (0000 0111) in ANSI, thanks to the locking shift. The second difference is in the use of the final bit of the last octet of the IE, which will be discussed shortly.
Figure 7.5 The Status IE. Consider the ITU-T format of the last three octets first. The last bit in each of the three octets is the extension bit, which should be familiar from the address extension (EA) bits in the frame relay header. Here, as there, the function of this bit is to indicate to the receiver whether the DLCI numbering is extended into the next octet (0) or not (1). In these three octets, the extension bit pattern is 0 1 1, meaning that the DLCI is extended into the second octet (0), but no further (1), and that the third octet is the last of this IE (1). As with the frame relay header, the first (high-order) 6 bits of the DLCI are in the first octet and the last (low-order) 4 bits of the DLCI are in the second octet when 10-bit DLCIs are used. Naturally, this system allows the Status message to report the larger DLCIs used in numbering schemes longer than 10 bits. The other four bits in the first two octets are just spare bits and must be set to 0, although why they are called spare and not reserved is anyone's guess. The use of the term "spare" makes it seem as if the bits could be used if some other bits are busy doing other things at the time, but this is never the case, of course.
The third and final octet is what the Status IE is all about. The last 4 bits are more or less ignored for link management purposes; they are the extension bit and 3 spare 0 bits. The first 4 bits, after essentially five octets' worth of labels and overhead, are the status bits themselves. The New (N) bit is set to 1 if the DLCI just enumerated has not been reported on by the network before. The Delete (D) bit is set to 1 if the network has deleted the DLCI just enumerated (the ITU-T says that this bit is only meaningful for the optional [and unused] single PVC report). If the D bit is set to 1, the status of the other bits in this octet is ignored. The Active (A) bit is set to 1 if the DLCI just enumerated is active. This really means that the DLCI is available for normal use. If the A bit is set to 0, then the user is supposed to stop sending information on the DLCI, so effectively this is the enable/disable bit.

The only real difference between ITU-T and ANSI (and LMI) PVC Status reporting is the use in ANSI Annex D (and LMI) of the fourth status bit, which is reserved in the ITU-T version. In ANSI (and LMI) this is the Receiver not ready (R) bit. The R bit is set when all is otherwise well on the DLCI, but senders should not send for reasons of congestion. Now, the frame relay header has the FECN and BECN bits that perform pretty much the same function, so it is easy to see why the ITU-T decided that there was little value added by the R bit. In fact, it could be argued that equipment might be confused by not receiving FECNs and/or BECNs while at the same time getting R bit status indicating congestion on DLCIs every minute through link management messages.

Taken as a whole, the full suite of link management messages forms an efficient mechanism that allows FRADs to track the status of the UNI on a minute-by-minute basis. It should be noted that in all modern frame relay CPE packages, the use of either ITU-T Annex A or ANSI Annex D is a configurable parameter. Some devices also support LMI, although as time goes by this is increasingly restricted to a single vendor (cisco) and to that vendor's specific implementation of LMI. However, since link management is strictly local, the two UNIs at the ends of a PVC might have differing link management procedures configured and still function properly. In other words, a cisco LMI UNI can interoperate easily with a UNI using ANSI Annex D on the other side of the PVC. All current frame relay switches detect which link management procedure is in use and configure themselves to respond properly to the Status Enquiry message type the switch receives.

The local nature of the Status messages has caused some problems in frame relay, but not with interoperability, since the various types can easily be used on different UNIs. (Of course, different types cannot be used on the same UNI at the same time.) Just because a new DLCI has been configured on a UNI at one side of the network does not mean the configuration process has been completed at the other UNI, nor everywhere else in between. So service providers must be careful not to set the A bit to 1 until the PVC labeled by the DLCI is ready to go end-to-end through the network. Otherwise, frames flowing on the DLCI will be discarded somewhere in the network without any direct indication to the sender that this is taking place!
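The bit-level decoding can be sketched as follows. This is a hypothetical illustration that follows the chapter's orientation (the extension bit as the low-order bit of each octet, mirroring the frame relay header) and assumes the four status bits occupy the high-order positions in the order N, D, A, R; a real implementation should follow the exact bit positions in the Annex A or Annex D figures.

    # A sketch (not from the book) of decoding the three content octets
    # of one PVC Status IE carrying a 10-bit DLCI. Bit positions are
    # assumptions stated in the text above this listing.

    def decode_pvc_status(octets):
        """Decode DLCI and N/D/A/R bits from a PVC Status IE's contents."""
        o1, o2, o3 = octets[0], octets[1], octets[2]
        # High 6 DLCI bits in the first octet, low 4 bits in the second
        dlci = (((o1 >> 2) & 0x3F) << 4) | ((o2 >> 4) & 0x0F)
        return {
            "dlci": dlci,
            "new": bool(o3 & 0x80),      # N: first report of this DLCI
            "delete": bool(o3 & 0x40),   # D: DLCI deleted (other bits ignored)
            "active": bool(o3 & 0x20),   # A: DLCI usable; 0 means stop sending
            "rnr": bool(o3 & 0x10),      # R: ANSI/LMI receiver not ready
        }

    # Example: DLCI 100, reported as new and active
    print(decode_pvc_status(bytes([0x18, 0x41, 0xA1])))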
Use of the link management procedures described in this chapter on the frame relay NNI is deferred until the next chapter, which contains a full discussion of the frame relay NNI.
What about Switched Virtual Circuits? A lot of discussion in earlier chapters focused on the whole PVC versus SVC issue in frame relay. Some service providers have begun supporting frame relay SVCs, so the issue is an important one. But for the purposes of this chapter, the important point about frame relay SVCs is that all of the previously described link management procedures are concerned with PVC status. However, once an SVC has been set up and given a DLCI, it becomes indistinguishable from a PVC from both the user and network perspectives. So how can PVC link management procedures be extended to SVCs?
As it turns out, the extension can be made fairly easily. The only tricky part is that frame relay SVCs are tracked internally to the network by call reference, a kind of connection identifier that supersedes the local DLCI values on each UNI and is unique in the network. PVCs, on the other hand, do not require call reference tracking because they are set up by hand and therefore employ the dummy (all-zero) call reference value. Does this SVC call reference value need to be conveyed to the end equipment along with the status of the SVC? If so, then modifications to the existing link management message formats are needed. If not, then PVC and SVC DLCIs can be mixed in a single full Status message with no problem. It will be interesting to see how SVC Status messages are handled by service providers once SVC service offerings become more common.
Customer Network Management (CNM) Networks involve more than just the transport of user data frames. The flow of data must be controlled and managed, which is what the last few chapters have mainly been about. Networks fail, and even worse, sometimes do not fail, causing more problems than a simple service outage. This "does not fail" problem sounds odd at first, but it is a characteristic of all large, modern network architectures, public or private. In the old days, networks failed all the time, but the problems were easy to find and correct. Today, networks hardly ever fail outright; but when they do, the problem is not always obvious and the cure might be complex. Network management must be able to manage networks of all shapes and sizes.

Frame relay networks are complicated by the fact that users cannot see and control the entire link end-to-end, as users usually could with private, leased-line networks. So early frame relay service providers had to deal with the perception that if a customer bought and used frame relay, the customer's network management center should still be able to control and troubleshoot the frame relay network, in spite of the network being public and appearing to the user as a collection of local UNIs. The control issue was handled by making the PVC provisioning process simple and efficient. Usually, service providers can configure new PVCs in 24 hours, or even less. The troubleshooting issue was touchier. After all, an argument could be made that if the public network service provider is doing its job correctly, why would customers even have to concern themselves with troubleshooting in the first place? Troubleshooting is the job of the service provider's network operations center, not the customer's. Of course, if the troubleshooting process were that good, then customers migrating from private line networks to frame relay could just disband the NOC and go home.

Naturally, nothing is that simple. In most cases, frame relay networks are a combination of public and private components. The FRAD is typically a router owned and operated by the customer. The frame relay switch is firmly under the control of the service provider. In many frame relay networks, the UNI might be provided by another service provider altogether. Yet all must work together, not only when the network is running well, but when the network is not doing what it is supposed to. This is the whole idea behind the Customer Network Management (CNM) features offered by most frame relay service providers: The service provider makes it easier for the customer's network operations personnel to manage the public portions of the frame relay network.
Troubleshooting and Frame Relay CNM services can even extend to troubleshooting. A lot of network management activity in frame relay involves finding out where the trouble lies: in the network cloud, on the UNI itself, or in the FRAD. Usually, this is a three-step process:

1. Is the FRAD (router) connected to the DSU/CSU? This is the network equivalent of the "is it plugged in to the electrical socket?" troubleshooting step in the PC and LAN world. But it is still a good place to start.

2. Is the FRAD (router) exchanging Status Enquiry and Status messages with the frame relay switch? If not, this is the sign that the UNI itself has failed. Note that both UNIs must be checked in this regard.

3. Are all of the DLCIs active? A UNI outage affects all DLCIs, of course. But if a complaint is about a particular DLCI, the full Status message should reveal any problems. Again, both UNIs need to be checked.

Assuming all is well, or appears well, up to this point, a couple of further steps can be taken to verify end-to-end connectivity through the frame relay network (a sketch of these ping checks appears at the end of this section):

1. Try to ping the remote router (FRAD) if the router and network support TCP/IP, which they usually do. Ping is just a simple TCP/IP control message that is echoed back by the target device. Typically, a whole stream of pings is sent, and the disappearance of a large number of responses is a sign of network congestion. A word of caution about using ping today: Many routers will filter out pings because of well-documented denial-of-service threats to TCP/IP routers (just the right number and type of pings can bring a router to its knees quickly). So not being able to ping a router may not always be a sign that something is wrong: Perhaps everything is right.

2. Try to ping end-to-end, from user device to user device. In a perfect world, this step would be tried first. Unfortunately, users will either not bother to try this before calling network operations, or will claim they did and that it failed. But that does not mean network operations, following a simple try-this, try-that approach, should not finish up a no-trouble-found process with this test. The problem could be in the end device itself. Or the trouble could be just intermittent enough to disappear while the previous steps are being followed. (No one ever claimed network management was easy.)

Each step involves a number of subactivities that focus on one aspect or another of the overall problem. Of course, if the FRAD is not a router, other methods must be used. Most CNM offerings from frame relay service providers do not totally outsource troubleshooting activities to the service provider, except for small customers. Generally, if the customer already has a network operations group, that group will continue its role as the troubleshooting focal point for users. In that case, the CNM service offered will emphasize frame relay performance statistics.
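Returning to the two ping checks above, here is a minimal sketch of automating them with the operating system's ping command. The host names are hypothetical placeholders, and as just noted, a filtered ping proves nothing by itself.

    # A sketch (not from the book) of the two connectivity checks.

    import subprocess

    def reachable(host, count=5):
        """Send a burst of pings; report whether any response came back."""
        result = subprocess.run(
            ["ping", "-c", str(count), host],   # use -n instead of -c on Windows
            capture_output=True, text=True)
        return result.returncode == 0

    for step, host in [("remote FRAD/router", "remote-router.example.com"),
                       ("end-to-end device", "remote-host.example.com")]:
        status = "responds" if reachable(host) else "no response (check filters)"
        print(f"{step}: {status}")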
Network Performance As important as troubleshooting is, there is more to network management than just find-it-and-fix-it. The overall health of the whole network needs to be assessed periodically as well. Is a given CIR adequate on a particular DLCI? Is there significant congestion on the frame relay network? What are the busiest periods on the network? All of these statistics should be gathered as well, perhaps not by the same people fielding calls from irate users, but by some group at the customer's network operations center (NOC).

Customers can gather statistics about frame relay network performance on their own, of course. Several packages exist, priced from about $5,000 to $20,000, excluding hardware. All will provide customers with service-level verification (is the network available when no one is using it?) and performance statistics (delay, delay variation, etc.). CNM marketing efforts focus on a couple of facts no one can deny. First of all, the frame relay service provider has experience with frame relay that a new customer does not have, by definition. Why train personnel in frame relay when the service provider has plenty? Second, a new customer has little to no hardware or software in its NOC specifically for frame relay network operation, again by definition. Why spend $20,000 when a small monthly expenditure can provide the same information? These two simple but powerful facts combine to the effect that few frame relay network services are sold without some form of CNM providing enhanced (beyond basic uptime/downtime ratios) performance statistics.

Generally, frame relay customers can get information from frame relay service providers in any of three main ways. In all cases, these reports apply only to the customer's PVCs and SVCs (if supported) on the public frame relay network. In fact, more than one method can be used in combination. The three main methods are:

1. The customer receives periodic written reports from the frame relay service provider. This is the original way, but it has the severe drawback of delay. On Thursday, who cares why the network was slow on Monday? The information is needed faster. Yet several major frame relay service providers only offer this type of information to smaller customers.

2. The customer is given online access to the service provider's report engine. This used to be an option offered only to larger customers, but no longer. A favorite method was to install a simple PC with dialup access to the service provider's NOC. When the customer's network operations staff took a call or needed to generate written reports of their own, they could simply dial in and access the information they needed. Even troubleshooting could be done this way in some cases. The information available to the customer was typically updated daily.

3. The customer uses an ordinary Web browser to access a secure Web site with the service provider's reports. This option has caused a revolution in CNM circles. Implementation is simple and inexpensive for the service provider (who cannot afford a new Web site?) and trivial for the customer (who does not have a Web browser?). No separate link or port is needed, just Internet access. Security is provided by standard TCP/IP methods, which, while not state-of-the-art, can be made effective.
Naturally, the most popular option today is the Web-based method. Information is usually updated every 15 minutes, practically real-time compared to other methods. Information on busy day, busy hour, traffic peaks, and traffic variations is typically gathered. The overall architecture of the Web method is shown in Figure 7.6. Note that access to the Web site need not necessarily be through the frame relay network itself, although this is of course possible. Also note that if FRAD information is included, the customer must allow access to this CPE device.
Figure 7.6 Frame relay statistics on the Web.
Service Level Agreements How often can a network fail before the network is unusable? How much delay is too long? How long should it take to return a failed UNI to service after it is down? Managing a network involves formulating answers for these and many more questions just like them. None of these questions have right or wrong answers, naturally. If a network carrying mostly e-mail is out of service for an hour, some might not even notice. But a system performing credit card validations out of service for an hour at a retail store could cause howls of protest. The same is true of end-to-end delay through the network. All networks, and frame relay is no exception, have varying service quality from time to time as traffic comes and goes, links fail and are restored, and so forth. So it is important for the service provider to guarantee that the frame relay network supports all of the user applications run across it adequately.

This is where the idea of a Service Level Agreement (SLA) enters. SLAs are essentially promises on the part of the overall frame relay service customer to each major user group that the level of service provided by the network will be within the bounds established by the SLA. The customer gets its own quality of service (QoS) guarantees from the frame relay service provider, of course. Typically, a frame relay tariff or contract will specify such QoS parameters as overall network availability (99.995%, or about 26.28 minutes per year of downtime, with links restored in less than 4 hours), bit error rate (BER) (10^-11, or only 1 bit in every 100 billion in error), block error rate (the error rate on frames, usually figured as the maximum size of the frame in octets, times 8 bits per octet, times the BER), burst capacity (usually up to the UNI rate), delay (usually a flat upper bound of less than 40 milliseconds), and information loss (99.999% delivery at CIR, or only 1 in 100,000 frames with DE = 0 discarded).

There are variations within all of these categories, of course, and some major frame relay service providers compute delay along the lines of 1 millisecond per 100 route (cable) miles plus 0.5 milliseconds per frame relay switch. While nice in concept, few potential customers have any idea what their route actually is or how many switches there are between source and destination UNIs. Other service providers distinguish between user availability and overall network availability, which acknowledges that although the frame relay network as a whole might experience less than half an hour of downtime in a year, individual users might have more extended outages, especially if the UNI is from another service provider altogether. The QoS parameters offered by frame relay service providers generally fall into the following categories:

Bandwidth (the CIR, burst capacity)

Delay (upper bound, but no limit on delay variation)

Information loss (bit errors, block/frame errors, delivery rate at CIR)

Reliability (availability, restoral times)
These four items are fairly comprehensive as far as network QoS goes. Only delay variation (jitter) and security are not really addressed. However, PVCs form a kind of closed user group that affords some measure of security for virtual private networks (VPNs) and the like. As in all such cases, if the network does not provide adequate QoS for the application, then each and every application must address the shortcoming if the user is to make the application work on the network. For example, users must add jitter buffers to the end equipment on frame relay networks if stable delays are needed for the application (e.g., voice). Usually, the SLA takes these overall performance statistics and parcels them out to the individual users and departments using the network. However, the SLA concept has also been extended to the relationship between customer and service provider.

There are some issues to be resolved before SLAs become a common part of frame relay services, however. First, service guarantees above and beyond what the network normally gives users are only available at a premium cost. Delays of less than 40 milliseconds may be had, for instance, but only by routing PVCs carefully. Second, there are no standard terms and metrics for SLAs at all. Is it bit errors or block errors? Network availability or PVC availability? And so on. Third, given the lack of standard definitions, direct comparisons between service providers are difficult. Is 99 percent of frames delivered with less than 40 milliseconds of delay better than an absolute upper delay bound of 45 milliseconds? For all traffic types? Finally, it is hard to determine exactly when an SLA has been violated, especially in terms of individual frame delay and disappearing traffic. The Frame Relay Forum has developed standard SLA performance definitions for frame relay networks, and the ITU-T has done some work in this area already. Eventually, SLAs will become the basis for a method of creating traffic priorities and service classes on frame relay, much as in ATM networks. For now, SLAs remain more or less a contract bargaining chip in pricing negotiations.

Today, SLAs usually define penalties in the form of rebates on monthly bills. A full day's credit for a one-hour outage or a full week's credit for a four-hour outage are not uncommon terms. Some SLAs involve a form of disaster recovery for the network, but this usually only helps to avoid failures of a few UNIs or trunks between frame relay switches. One major frame relay service provider has three levels of disaster recovery service available, as follows:

UNI protection. For a one-time fee of several hundred dollars, plus a charge of several thousand dollars if the other circuit is activated, the customer gets a second, backup UNI to use if the primary UNI fails. But the second UNI still leads to the same service provider's frame relay switch.

Backup PVCs. For a one-time fee of less than one hundred dollars, plus a charge of several thousand dollars if the secondary is activated, each customer PVC and the CIR associated with it is mapped to a secondary route through the network. There is a small monthly maintenance fee for this service.

Dynamic PVCs. This option is the same as the second and the costs are essentially the same, although the monthly charges can rise quickly. The difference is that the backup PVC, which presumably kicks in when severe congestion occurs, has a larger CIR than the primary one.
The philosophy is that the larger CIR will handle the bursts that caused the problems on the first circuit. The whole subject of frame relay SLAs and disaster recovery options will continue to be a topic of considerable interest to frame relay service providers, customers, and users.
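The availability arithmetic above is easy to verify, and the rebate terms just mentioned reduce to a simple schedule. The sketch below is illustrative only; the rebate thresholds are assumptions modeled on the terms described, not any particular provider's contract.

    # A sketch (not from the book) of SLA arithmetic: downtime allowance
    # from an availability guarantee, plus an assumed rebate schedule.

    MINUTES_PER_YEAR = 365 * 24 * 60

    def downtime_allowance_minutes(availability_pct):
        """Yearly downtime permitted by an availability guarantee."""
        return MINUTES_PER_YEAR * (1 - availability_pct / 100)

    def rebate_days(outage_hours):
        """Credit, in days of the monthly bill, for a single outage."""
        if outage_hours >= 4:
            return 7        # a full week's credit
        if outage_hours >= 1:
            return 1        # a full day's credit
        return 0

    print(round(downtime_allowance_minutes(99.995), 2))   # -> 26.28 minutes
    print(rebate_days(1.5))                               # -> 1 day's credit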
The Service Level Definitions IA (FRF.13) The popularity of frame relay service means that frame relay is available from telephone companies (both local and long distance), ISPs, and even other types of companies. The diversity of service providers has made it difficult to assess the quality of the frame relay service delivered, whether contracts are honored at all times, and even how one frame relay service compares with another. So the Frame Relay Forum has established FRF.13, the Service Level Definitions IA, to define a series of "transfer parameters" that can be used to "plan, describe, and evaluate frame relay products and offerings."
FRF.13 makes use of many of the parameters established in the ITU-T's X.144 recommendation (User Information Transfer Performance Parameters for Data Networks Providing International Frame Relay PVC Service). But of course the FRF.13 guidelines apply to any frame relay PVCs, not just international ones. FRF.13 is mainly concerned with how the parameters should be defined, not so much with how they are used to compare or measure frame relay services, although there are elements of the latter. The parameters in FRF.13 are used to measure four main service elements: frame transfer delay, frame delivery ratio, data delivery ratio (not quite the same thing), and service availability. In other words, the frame relay network delay, the effects of errors, and the reliability of the network. A frame relay service provider or equipment vendor basically has to agree to define these parameters in the way that FRF.13 suggests and to use them according to those definitions whenever the parameters are mentioned. There are 10 parameters defined in FRF.13, each assigned to one of the four respective characteristics of frame relay, as follows:

1. Frame transfer delay:

Frame Transfer Delay (FTD). The time between a defined frame entry event and a defined frame exit event.

2. Frame delivery ratio:

Frame Delivery Ratio (FDR). The ratio between the total frames delivered by the network and the total frames offered to the network.

Frame Delivery Ratio within CIR (FDRC). The ratio between the total frames that conform to the CIR delivered by the network and the total frames that conform to the CIR offered to the network.

Frame Delivery Ratio above CIR (FDRE). The ratio between the total frames in excess of the CIR delivered by the network and the total frames in excess of the CIR offered to the network.

3. Data delivery ratio:

Data Delivery Ratio (DDR). The ratio between the total payload octets delivered by the network and the total payload octets offered to the network.

Data Delivery Ratio within CIR (DDRC). The ratio between the total payload octets in frames that conform to the CIR delivered by the network and the total payload octets in frames that conform to the CIR offered to the network.

Data Delivery Ratio above CIR (DDRE). The ratio between the total payload octets in frames in excess of the CIR delivered by the network and the total payload octets in frames in excess of the CIR offered to the network. (Note that the difference between frame ratios and data ratios involves whether the frame contains user octets as payload or not. Missing SE frames count as missing frames, but not as missing data.)

4. Service availability:

Frame Relay Virtual Connection Availability (FRVCA). A formula allows excluded outage time (time the network is not available due to scheduled maintenance or failures beyond the control of the frame relay network) to be subtracted from the calculation. Only outage time directly due to faults in the network is tracked by the service availability parameters.

Frame Relay Mean Time to Repair (FRMTTR). A simple ratio between the outage time, as defined above, and the number of outages in the measurement period. If there are no outages, FRMTTR = 0 for the interval.
Frame Relay Mean Time Between Service Outages (FRMTBSO). The ratio between the measurement interval (less excluded outage time and outage time, as defined above) and the number of outages in the measurement period. If there are no outages, FRMTBSO = 0 for the interval.

Time intervals are measured in milliseconds for delay and in minutes otherwise for the purposes of calculation. Delay is measured using a standard 128-octet frame payload, unless customer and service provider agree otherwise. FRF.13 establishes a standard, basic frame relay network model of two access circuit sections (UNIs) and the access network section (cloud) in between. More elaborate NNI situations are also defined. Several standard measurement reference points are established in FRF.13 (e.g., the Egress Queue Input Reference Point, or EqiRP). There is a section on how hybrid private and public frame relay networks are to be treated (the private network looks like a UNI to the public network, which is now measuring public "edge-to-edge" service levels as well as public-private "end-to-end" levels). The actual mechanism for implementing FRF.13 is not discussed in the document.

While some things are open to negotiation in FRF.13, others are not. When it comes to delay, for example, customer and service provider are free to have an SLA that determines the "object dimension" (delay per PVC? per UNI? both? more?) and "time dimension" (day? week? business-day week? 7-day week?) over which delay is calculated. On the other hand, the SLA must describe at least:

The measurement domain (objects and times)

Applicable reference points (UNIs and NNIs, and so on)

The delay measurement mechanism

Identification of connections subject to the delay measurement

Measurement frequency

Frame size used

Information about all four characteristic service level areas can be aggregated in the form of a report, although FRF.13 itself just says the information is "reported." Some possible formats are given in an annex to FRF.13. For example, a delay report might measure each connection once every 15 minutes, 24 hours a day, for 30 days. There is a lot of flexibility here. Only a few more points are needed about FRF.13. The intent is to take much of the guesswork out of SLAs and out of comparing frame relay services and service providers wherever they are in the world. However, FRF.13 is just a step in this direction, although a giant one. For example, although FRF.13 references X.144, the FRF.13 definition of an Outage Count is not the same as X.144's. But it will surely matter whether a frame relay service provider complies with FRF.13 or not.
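As a rough illustration, the delivery-ratio and availability parameters reduce to simple arithmetic once the counts are in hand. The sketch below uses hypothetical inputs; FRF.13 itself defines the parameters, not how the counts are collected.

    # A sketch (not from the book) of FRF.13-style calculations.

    def frame_delivery_ratio(delivered, offered):
        """FDR: frames delivered / frames offered. FDRC/FDRE simply
        restrict the counts to frames within or in excess of the CIR."""
        return delivered / offered if offered else 1.0

    def frmttr(outage_minutes, outages):
        """Frame Relay Mean Time To Repair; defined as 0 with no outages."""
        return outage_minutes / outages if outages else 0.0

    def frmtbso(interval_minutes, excluded_minutes, outage_minutes, outages):
        """Frame Relay Mean Time Between Service Outages; 0 with no outages."""
        if not outages:
            return 0.0
        return (interval_minutes - excluded_minutes - outage_minutes) / outages

    print(frame_delivery_ratio(99_990, 100_000))     # -> 0.9999
    print(frmttr(45.0, 3))                           # -> 15.0 minutes
    print(frmtbso(30 * 24 * 60, 120.0, 45.0, 3))     # minutes between outages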
Simple Network Management Protocol, the Management Information Base, and Frame Relay Network management is a crucial part of any network. But where does the network management hardware and software get the raw material, the information, that it needs to present a coherent picture of "how the network is doing" to the network operations personnel? In frame relay networks, as in most networks today from the Internet to SNA, the answer is with the Simple Network Management Protocol (SNMP), the related Management Information Base (MIB), and the network management software running in the operations workstations.

SNMP has its roots in the TCP/IP protocol suite and the Internet. Introduced in 1989 to manage routers (then called gateways) on the Internet, SNMP is now the industry-standard network management protocol not only for routers, but for LAN hubs, modems, multiplexers, FRADs, ATM switches, and even end-user computers. One reason for this is SNMP's profound simplicity. The original version is now known as SNMPv1, and whenever just "SNMP" is used today, SNMPv1 is understood. The newer version, SNMPv2, was standardized in 1993 but has only slowly appeared. SNMPv2 is much more complex than SNMPv1, but much better suited to managing complex and large networks with many LANs, routers, hubs, modems, and so on. Other enhancements, such as better authentication and encryption, were part of SNMPv2 but are now an option in SNMPv3.

SNMP is built on the client/server model. In SNMP, the client process is the central network management software running on a system in a network operations center (usually abbreviated NOC). The server process runs in each and every SNMP-manageable device on the network, which need not be a TCP/IP network. The network can support and/or use any network protocol at all; TCP/IP is only needed for SNMP communications with the managed device. Many network devices today, such as frame relay switches, employ TCP/IP only as a vehicle for SNMP network management.

The SNMP network management application is run on one or more central network management workstations. Typically, the SNMP server running in the network device, known as the agent, is asked by the network management software to supply some piece of information about the device, and the agent replies with the current status of that piece of information. There is only one exception to this "ask me and I'll tell you" mode of operation: SNMP defines a set of alarm conditions known as traps on the managed device. A trap is a special message sent to the network management client software without the managed device waiting to be polled.

There is a standard database of information kept in each SNMP-managed device. This network management information database is technically known in SNMP as a set of objects. But most people refer to this database of information about a managed device as a Management Information Base, although strictly speaking the MIB is only a description of this database's structure. The MIB is really a piece of paper that says things like: "The first field is an integer and represents the number of frames processed; the second field is 20 characters long and represents the manufacturer of the device," and so on.
However, once a MIB is implemented (written, compiled, and linked like any other program) and installed in a managed device, the MIB fields (objects) take on current, valid values (926, acme.com, etc.). Note that the agent is only able to access current values in the MIB. Any historical network management information must be kept on the network management workstation. This is a way of keeping the size of the MIB to a minimum in the managed device. The whole idea of SNMP and MIBs is shown in Figure 7.7.
Figure 7.7 SNMP and the MIB. There are four main steps shown in Figure 7.7 before any network management software package can know anything for certain about the state of the network. First, the network management software must send an SNMP message to the managed network device, based on its IP network address. (Some network devices have IP addresses solely for SNMP purposes.) The SNMP poll can be generated automatically and periodically by the network management software, or the message can be generated by a point-and-click on the part of the network operations staff. The agent software in the device accesses the database of managed objects, defined by the MIB, and returns the current value of the database field in another SNMP message. At this point, the network management software knows nothing more than a number. Only by comparing the received value with some historical information kept in a database, usually on the network management station itself, can the raw information be made meaningful. In this case, the fact that the SNMP poll an hour earlier recorded the value "6" is used to realize that an additional two bad frames have been logged by the network device in the past hour. This example, although quite simple, is basically the way that SNMP operates most of the time. Occasionally, alarms generated by SNMP traps are sent to the network management station without waiting for a poll cycle.

There are two main types of MIB defined in frame relay. The first is the MIB found in the frame relay FRAD, or frame relay Data Terminal Equipment (DTE) to the MIB writers. The second MIB type is the one found in the frame relay network itself. Both are discussed in some detail in the following paragraphs. The customer's network operations staff always has access to the local FRAD's MIB and might also have access to the MIBs in FRADs all over the network. The service provider's network operations center has access to the MIB in the frame relay network itself, and sometimes to the FRAD MIBs as well (depending on the degree of network management provided). Seldom, if ever, does a customer have access to the frame relay network's MIB, although this is slowly becoming a feature of premium frame relay services in the network management area. The general idea of the use of the frame relay MIBs by both customer and service provider network management software is shown in Figure 7.8.
Figure 7.8 The frame relay MIBs. Figure 7.8 shows three UNIs attached to FRADs, all using the frame relay MIB. The LANs attached to each FRAD (which are often routers anyway) have at least one, and maybe more, stations running some form of SNMP-based network management software. The network management software is often based on HP's OpenView graphical user interface. OpenView is not really a network management application in and of itself, but it is often used as the basis for many software vendors' network management packages. In any case, the network management station(s) send SNMP messages to the agents, which access the MIBs in the managed devices on the network.
Both MIB types are illustrated in the figure. The DTE MIB in each FRAD can be accessed by the customer owning the FRAD over a normal DLCI configured specifically for network management. In CNM situations, the DTE MIB information can be gathered by the service provider. However, the service MIB within the network is accessed exclusively by the service provider's network management software, except in rare circumstances. Note that access to the service MIB, which is not necessarily located in each and every frame relay switch, is through a regular LAN and FRAD arrangement (and the service provider's FRAD also has a DTE MIB which must be managed!). Most network equipment vendors define their own private extensions to the MIB defined in SNMP for a specific network device. The private MIB fields in the database are usually low-level, hardware-specific extensions with information such as whether the device is on battery backup, has experienced a fan failure, and so on. Most frame relay devices sold today include the standard frame relay MIBs that are accessible by most SNMP manager software products. The MIB forms the source of raw material for the frame relay network management software.
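The manager-side logic of Figure 7.7 can be sketched in a few lines. The poll values here are fed in by hand as stand-ins for real SNMP GET operations, and the device and object names are hypothetical; the point is that the agent only ever returns current values, so the history lives on the management station.

    # A sketch (not from the book) of comparing a polled counter with the
    # previous poll stored in the management station's own history.

    history = {}   # (device, object name) -> last polled value

    def record_poll(device, obj, value):
        """Store the newest polled value; return the change since the last poll."""
        key = (device, obj)
        previous = history.get(key)
        history[key] = value
        return None if previous is None else value - previous

    # First poll establishes the baseline; an hour later the counter reads 8.
    record_poll("frad1.example.com", "badFrames", 6)
    print(record_poll("frad1.example.com", "badFrames", 8))   # -> 2 new bad frames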
Chapter 8: The Network-Network Interface (NNI) Overview Frame relay networks are quite simple. They consist of a series of devices supporting a number of standard interfaces. The standard interfaces are only two in number: the User-Network Interface (UNI) from customer premises to frame relay switch and the Network-Network Interface (NNI) between frame relay switches in two different frame relay networks. Much of frame relay focuses on the UNI, for the obvious reason that this is where the users and customers are. But there is a lot happening at the NNI that affects the users and customers as well. This chapter looks at the NNI in more detail.
Introduction The NNI connects two different frame relay network clouds. Sometimes it is claimed that the different networks must be from two different service providers, and they often are, but this is not always true. Many frame relay service providers employ frame relay switches from two different manufacturers. This usually happens when the equipment from a former vendor of choice is still present while equipment from a new vendor of choice is moving in. It could also happen that one service provider has purchased another, one with a different vendor for frame relay switches. In any case, the problem is that interswitch interfaces in frame relay networks are not covered by frame relay standards, so multivendor interoperability is not a feature of the frame relay network as a whole. In this case, the NNI forms the standard interface needed on the links between the two vendors' equipment on the overall network. Without the NNI, customers with the same service provider might not be able to connect to each other, hardly a good idea. So a single NNI can connect two frame relay networks, whether or not they are owned and operated by the same service provider.

There are two other configurations where the NNI is used; in these situations there are usually two NNIs. In regulated environments, where the line between local service and long-distance service (the term is almost meaningless today, but entrenched) is firmly drawn and closely watched, one NNI would connect frame relay customers on one local frame relay network with a long-distance service provider's frame relay backbone, and another NNI would connect to the other local frame relay network (which might belong to the same local service provider or even to a third frame relay service provider). In international environments, a local (national) frame relay service provider would employ an NNI to reach an international frame relay service provider's backbone, which would attach by another NNI to a second local (national) frame relay service provider in the second country. All three of these NNI arrangements are shown in Figure 8.1. It should be noted that there could be many variations in these configurations, especially the last two. There might be only two service providers instead of three, for example. But the figure is representative of just where the frame relay NNI is used.
Figure 8.1 The three major uses of the NNI. As an aside, it should be pointed out that the acronym NNI applied to frame relay is not the same as the acronym NNI applied to ATM networks. In frame relay, NNI stands for network-network interface and defines the standard interface between distinct frame relay networks, not switches per se. In ATM networks, NNI stands for network node interface and defines the standard interface between each and every ATM switch within an ATM network cloud. The acronym B-ICI (Broadband InterCarrier Interface) is used in ATM to define the role that the NNI (network-network interface) plays in frame relay. This is unfortunate and confusing, but a fact of networking life.
NNI Implementation There are two main concerns in interconnecting two frame relay clouds with the NNI. First and foremost, the transfer of user information must take place transparently, so that the users are not aware that a PVC connecting two UNIs actually consists of separate segments, as the frame relay documentation calls them. The multisegment PVC has one segment, with associated CIRs, excess bursts, and so on, defined within each and every frame relay cloud between source and destination UNI. Each NNI configuration contains at least two PVC segments, by definition, and many NNI configurations actually have three segments for each PVC, as can be seen in Figure 8.1. Second, the multisegment PVCs must be managed just like any other PVCs, which includes both link management and network management considerations. Not only do the separate UNIs not see the frame relay inner workings, the users are never aware of the presence of the NNI between frame relay clouds. Yet all of the PVCs that are mapped onto the NNI(s) must be activated, maintained, and so forth like any other PVCs. And supporting SVCs on the NNI is especially challenging, mainly due to the need to coordinate billing and resource allocation not only between individual switches, but between entire networks.

Initial frame relay documentation said little about the NNI. In packet switching, the X.25 protocol itself ran on the UNI, and a separate protocol, X.75, ran on the links between public packet data networks (the packet-switching NNI). While standards like the LAPF core and portions of Q.933 addressed UNI issues, little was done to develop the X.75 equivalent for the frame relay NNI until well into the 1990s. As with many aspects of frame relay, the Frame Relay Forum (FRF) addressed the issue head on. ANSI had provided a phase 1 NNI document which allowed data transfer without any protocol changes and with a slight modification to link management procedures. But the FRF implementation agreement (IA) mechanism was a better fit for providing transparent PVC service to users, given the structure and procedures of the FRF group. Currently, the FRF NNI is defined in FRF IA 2.1, the Frame Relay Network-to-Network (NNI) IA. This document is intended as a blueprint for the ITU-T recommendation on the frame relay NNI, and FRF IA 2.1 is sometimes called Q.frnni 1. Care must be taken when using the FRF IA 2.1 documentation. The document is usually separated into a body and an annex, although the pages run sequentially between the two (1 through 50). Both body and annex are equally important when it comes to understanding the features and functions of the frame relay NNI.
Multi-network PVCs A key concept in any study of the frame relay NNI is the multi-network PVC (the term "multi-network" is always hyphenated in FRF documentation). Any frame relay network configuration that has an active NNI must include multi-network PVCs. Every multi-network PVC consists of two or more PVC segments, one in each frame relay network cloud. The concatenation of these separate PVC segments forms the multi-network PVC. A multi-network PVC always starts and ends at a UNI. The idea of a multi-network PVC is shown in Figure 8.2. Two typical local provider-to-interexchange (labeled "regional") carrier NNIs are illustrated. The local service provider could be an incumbent LEC, a regional Bell operating company (RBOC), or any other entity prohibited by regulation, law, or both from carrying user bits out of a defined area, usually a local access and transport area (LATA) in the United States. All such local entities must employ the facilities of an interexchange carrier (IXC) to carry bits across a LATA boundary, even if the same local entity receives the bits again in the destination LATA. This handoff to the IXC must occur whether the LATAs are across the country or adjacent and within the same state.
Figure 8.2 Multi-network PVC segments. The PVC that runs between the two UNIs is now a multi-network PVC consisting of three distinct segments that are under the control of three distinct frame relay service providers. Note that there could be only two service providers involved if the local frame relay service provider were the same at both ends, just in two different LATAs. In any case, two PVC segments run from UNI to NNI, while the third, in the middle, runs from NNI to NNI. Each segment is set up by the appropriate service provider, yet the multi-network PVC must appear to be one construct to the end users. This requires coordination of all the parameters that go into PVC configuration, from CIRs to burst levels to DE policies. Even something as simple as the PVC leaving one network as DLCI 47 and entering the next as DLCI 89 requires careful bookkeeping and configuration (see the sketch below).

At first glance, the task might seem to be impossible, given how little real communication and coordination exists when trying to provision even a multi-LATA private line, which essentially boils down to, "We need four wires to and from the same place by the end of the month." In many cases, dates become targets rather than actual deadlines. But at least frame relay benefits from the years of experience the service providers (some of them, anyway) have had with X.25 multi-network circuits (logical connections). In X.25, the role of the UNI is played by X.25 itself, while X.75 assumes the role of the NNI. X.25 and X.75, like any public packet network service, had to deal with multiple service providers and multiple national networks years before frame relay was even imagined. The success of X.75, although it came after many years of hard work, shows that multi-network PVCs can function, and function well, in frame relay environments.
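The DLCI bookkeeping at the NNI amounts to a per-interface cross-connect table. The sketch below uses hypothetical entries (including the DLCI 47-to-89 example just mentioned); real switches keep this mapping in their configuration databases.

    # A sketch (not from the book) of an NNI cross-connect table that maps
    # the DLCI a frame arrives with on one network's PVC segment to the
    # DLCI of the next segment.

    nni_cross_connect = {
        # DLCI on the incoming segment -> DLCI on the outgoing segment
        47: 89,
        101: 16,
    }

    def relabel(dlci_in):
        """Translate a frame's DLCI as it crosses the NNI."""
        if dlci_in not in nni_cross_connect:
            raise ValueError(f"no PVC segment configured for DLCI {dlci_in}")
        return nni_cross_connect[dlci_in]

    print(relabel(47))   # the PVC leaves one network as 47, enters the next as 89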
Network-Network Interface Coordination The success of X.75 does not mean that configuring multi-network PVCs in frame relay is somehow easy or routine, however. Far from it. X.25 and X.75 do not offer anything like the flexible bandwidth allocation (bandwidth-on-demand) capabilities that are characteristic of frame relay networks. This means that frame relay networks have to coordinate not only DLCI numbers, but also bandwidth commitments, congestion procedures, and everything else related to the PVC. Here is the minimum list of settable frame relay connection parameters that must be configured in a coordinated fashion at the NNI:

The DLCI numbers at each end of the PVC segments that meet at the NNI

The CIR, committed and excess burst levels (Bc, Be), and time-measurement interval (Tc), in each direction on each PVC segment

The maximum frame size in each PVC segment (usually either 1600 or 4096 octets)

All signaling timers and counters used in link management (N391, T391, etc.)

In addition, the networks should also support the standard implementations of the FECN and BECN bits. All actions of the networks regarding the setting and use of the DE bit to perform traffic shaping and congestion control must be spelled out and coordinated as well.

This is a good place to list the default values of the counters and timers used on the NNI, along with their ranges. All relate to the same Status message use described in Chapter 7, and their use is essentially the same as on the UNI (T392 should be set greater than T391, etc.). These values are established by the Frame Relay Forum in FRF.2.1, the Frame Relay Network-to-Network (NNI) Implementation Agreement. The names and default values are listed in Table 8.1. Service providers rarely tinker with these values. Both of the frame relay networks at the ends of the NNI will generate Status Enquiry messages based on the T391 value but, of course, there is no need to coordinate the network clocks at this level to any degree. There is not even any requirement for the N391 counter to have the same value in both networks. But both sides of the NNI must have the same values for N392, N393, T391, and T392 (see the sketch after Table 8.1).

The NNI discussion so far has only addressed PVC issues. What about SVCs? With SVCs, all of these parameters must be agreed upon and set up not in the time interval between service ordering and service provisioning, which can take several business days, but during the processing of the call setup message, which should take no more than about 10 seconds (the usual call setup target for switched services of all types). And SVCs require additional coordination tasks, such as engaging billing systems and consideration of reciprocal billing arrangements between different service providers.

Table 8.1 NNI Parameters for Link Management

INFORMAL NAME    PARAMETER   RANGE    DEFAULT VALUE   COMMENT
Polling cycle    N391        1–255    6               Determines full status interval
Errors           N392        1–10     3               Missing cycles (<= N393)
Events           N393        1–10     4               Should be close to N391
Polling timer    T391        5–30     10              Seconds between SEs
Timeout          T392        5–30     15               Seconds till SE is "missed"
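That coordination rule is simple enough to check mechanically. The sketch below compares two hypothetical NNI configurations built from the Table 8.1 defaults; the dictionaries stand in for each network's switch configuration.

    # A sketch (not from the book) of checking the rule stated above: both
    # sides of the NNI must agree on N392, N393, T391, and T392, while
    # N391 is free to differ.

    MUST_MATCH = ("N392", "N393", "T391", "T392")

    def check_nni_params(side_a, side_b):
        """Return the parameters on which the two ends disagree."""
        return [p for p in MUST_MATCH if side_a.get(p) != side_b.get(p)]

    net_a = {"N391": 6, "N392": 3, "N393": 4, "T391": 10, "T392": 15}
    net_b = {"N391": 12, "N392": 3, "N393": 4, "T391": 10, "T392": 15}

    mismatches = check_nni_params(net_a, net_b)
    print("NNI coordinated" if not mismatches else f"fix: {mismatches}")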
So it might seem that multi-network PVCs are possible, but that multi-network SVCs are rather far off in terms of service offerings. It is true that UNI-to-UNI SVCs in a multi-network environment will only become common on most frame relay networks in the future. But the use of SVCs on the NNI itself (outside of the UNIs) might happen sooner than the use of SVCs on and between UNIs. This apparent paradox can be resolved by realizing that manually negotiating and coordinating PVC segment parameters is actually more complex than having a standard signaling protocol in place to handle it all. A full discussion of the use of SVCs on the NNI in an otherwise PVC world is deferred until later in this chapter.
NNI Link Management

Link management procedures on the UNI were covered in Chapter 7. The use of Status Enquiry and Status messages in implementation agreements and standards such as the LMI or ANSI Annex D signaling protocols, to verify that the UNI is still there and that the proper DLCIs are defined on it, was detailed there and need not be repeated. What is necessary is to realize that UNI link management procedures are not ideal for the NNI.

UNI link management will not work on the NNI because of the inherently asymmetrical (or unidirectional) implementation of the Status Enquiry and Status messages. The Status Enquiry message always originates on the UNI from the customer premises equipment (the FRAD). The frame relay network generates the Status messages, full or otherwise, in response to the Status Enquiry. Obviously, on the NNI there is no premises equipment or FRAD that can or will generate the Status Enquiry messages. So for the NNI, other arrangements must be made.

On the NNI, the entire link management system of Status Enquiry and Status messages could be changed. This hardly makes any sense, however. A frame relay switch port could easily be used to terminate an NNI as well as a UNI. Why make link management so radically different that totally different protocols and software would have to be used on the NNI? It would be much better and easier to keep the changes to the link management messages to an absolute minimum, and this is what was done. On the NNI, the Status Enquiry and Status messages are still used, but in a symmetrical or bidirectional fashion. The frame relay switch at either end of the NNI can both issue and respond to Status Enquiry messages. These symmetric procedures are the simple trick that allows link management procedures to function properly over the NNI with as few changes as possible.

The Frame Relay Implementers Forum (FRIF, forerunner to the current Frame Relay Forum organization) defined the Local Management Interface (LMI) strictly for use on the UNI (few if any NNIs even existed then). What this boils down to is that there is no option for symmetrical use in LMI, and LMI frame relay switches only ever issue Status messages. So LMI can only be used on the UNI. But both ANSI Annex D and ITU-T Annex A allow for symmetrical operation of their versions of the link management messages. The trick is to make each end of the NNI take turns "acting like the FRAD" and issue the Status Enquiry messages at the appropriate intervals. Although ANSI and ITU-T documents favor the terms asymmetrical for the UNI and symmetrical for the NNI, current FRF documentation says that link management on the UNI follows user-to-network procedures and on the NNI follows bidirectional procedures.

All of the normal link management timer and counter rules still apply on the NNI; they just apply to both ends of the NNI at the same time. So both ends of the NNI must issue Status Enquiry messages (a poll cycle) on DLCI 0 at the defined interval (default 10 seconds), and request a full Status report every N391 poll cycles (default 6). The full Status message response has exactly the same format and function as it does on the UNI. All PVC DLCIs are listed, along with their status (active, meaning available; inactive, meaning temporarily disabled; or new). The availability of PVCs on a network in the middle of any pair of UNIs in a multi-network PVC should propagate out properly to the FRADs at each end of the network. Obviously, if any PVC segment is unavailable, then the whole multi-network PVC is also unavailable.
The question of which PVC segment has failed is not an immediate concern of link management procedures, but rather the concern of overall frame relay network management. These bidirectional procedures used on the NNI are shown in Figure 8.3. These procedures apply to PVC status reporting only at present. Status Enquiry and Status messages for SVCs will most likely be reported on DLCI 0, although current FRF NNI documentation is silent on this subject.
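As a rough illustration of the bidirectional procedures, here is the polling loop that each end of the NNI effectively runs, in Python. The send_status_enquiry transmit hook is hypothetical; only the T391 and N391 defaults come from the standard documents.

# Sketch of the polling loop each end of the NNI runs. Unlike the UNI,
# *both* switches act as the "FRAD side": each sends Status Enquiries
# on DLCI 0 every T391 seconds and asks for a full Status report every
# N391th cycle. send_status_enquiry is a hypothetical transmit hook.

import itertools
import time

def nni_poll_loop(send_status_enquiry, t391: int = 10, n391: int = 6) -> None:
    for cycle in itertools.count(start=1):
        full = (cycle % n391 == 0)        # every N391th poll: request full status
        send_status_enquiry(dlci=0, request_full_status=full)
        time.sleep(t391)                  # default polling interval: 10 seconds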
Figure 8.3 Bidirectional NNI link management procedures.

In the United States, the NNI messages follow the ANSI Annex D format. On international NNIs, ITU-T Annex A messages must be used. Both are widely supported by equipment vendors.

PVCs that span one or more NNIs can basically be in one of the same three states as PVCs on the UNI. The PVC reported in the full Status message on the two end-user UNIs can be new, meaning that the DLCI has been assigned on the NNI but the PVC should not yet be used for live information. The PVC can be inactive, meaning the DLCI labeling the PVC could be used for user information, but not right now. An inactive PVC has been temporarily disabled, although there are no firm rules as to just when a temporarily disabled PVC becomes "go to Plan B" disabled. Finally, the PVC reported to the UNIs that spans an NNI (or series of NNIs) can be active.

Fortunately, the Frame Relay Forum has established a set of rules to determine just when a new PVC across an NNI becomes an active PVC from the UNI perspective. The guidelines are given in FRF.2.1. Five conditions must be met before a PVC defined on an NNI is reported with the active bit set to a 1 bit (A = 1) in the full Status message sent to the endpoints of the UNIs (a sketch of the combined check appears after the next paragraph). These are detailed as follows:

1. All PVC segments in between must be configured. In other words, if just one segment is configured, a UNI mapping to that single segment will report the DLCI labeling the PVC as new, but not yet active.
2. Link integrity verification (i.e., Status Enquiry/Status pairs) occurs on all NNIs and both UNIs involved in the multi-network PVC. This pair of messages has to be exchanged N393 times. The N393 monitored events counter has a default value of 4, so 4 pairs of Status Enquiry/Status messages in a row have to be exchanged to satisfy this condition. At the default timer value, this process takes 40 seconds at each UNI and NNI.
3. All of the UNIs and NNIs involved have to be up and running (operational). This condition only makes sense. Configuration can still occur on the switches at the ends of a link that is down.
4. All of the PVC segments in between must also be operational. This is the internal counterpart of the previous condition. The path between two frame relay switches in the same cloud could be running proprietary protocols or be defined on an ATM network. This condition requires the mapping of inter-switch problems to the frame relay PVC level (somehow).
5. The Active bits in the Status messages on all NNIs involved are currently set to a 1 bit (A = 1). This should prevent the situation where a DLCI on the end-user UNIs is reported as active when the PVC segment across some NNI is still not ready for user information.

If any of these conditions is not met, then the PVC cannot be reported as active to the end users across the UNI. Frame relay PVCs are by definition bidirectional, so if a PVC exists in one direction on a network, a PVC must also exist in the other direction. However, the status of any given PVC is independent in each direction. So a PVC from Network A to Network B might be active, but the associated PVC from Network B to Network A on the same NNI might not be. The FRF.2.1 document also says that whenever the status of any PVC changes on an NNI, the very next Status Enquiry message will be answered with a full Status message, regardless of where the two ends of the NNI are in the full Status polling interval.
The intent is to allow knowledge of a failed PVC to propagate to the affected UNIs as rapidly as possible, since all frames sent into the networks at the ends of a multi-network PVC will be discarded in the interval between failure (technically, inactive status) and notification.
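Taken together, the five conditions amount to a single logical AND, as the following Python sketch shows. The Interface and Segment records are invented for illustration; FRF.2.1 states the conditions in prose rather than as data structures.

# Hedged sketch of the FRF.2.1 "report active" decision for a
# multi-network PVC. The records below are invented for illustration.

from dataclasses import dataclass

@dataclass
class Interface:            # a UNI or NNI along the PVC's path
    kind: str               # "UNI" or "NNI"
    operational: bool       # link is up and running
    good_cycles: int        # consecutive verified SE/Status exchanges
    n393: int = 4           # monitored events counter (default 4)
    active_bit: int = 0     # A bit last reported on this interface

@dataclass
class Segment:              # a PVC segment inside one network
    configured: bool
    operational: bool

def report_active(segments, interfaces) -> bool:
    """True only when all five FRF.2.1 conditions hold."""
    return (
        all(s.configured for s in segments)                      # condition 1
        and all(i.good_cycles >= i.n393 for i in interfaces)     # condition 2
        and all(i.operational for i in interfaces)               # condition 3
        and all(s.operational for s in segments)                 # condition 4
        and all(i.active_bit == 1
                for i in interfaces if i.kind == "NNI")          # condition 5
    )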
Figure 8.4 shows the sequence of events that occurs on the NNI and UNIs when a PVC is added. The figure shows only a single NNI between two frame relay networks, but the intent here is just to be illustrative. Only the relevant Status Enquiry (SE), full Status (FS), and other messages are shown. As soon as the PVC is configured in Network A, the presence of a new (N) but inactive (I) PVC is made known to the customer by means of the SE and FS messages (represented as FS[22,N,I]). However, on the NNI, the new DLCI is reported as new (N) and active (A), since from the perspective of Network A, the PVC on the NNI is ready to go.
Figure 8.4 Adding a PVC in a multi-network NNI environment.

However, Status Enquiries sent on the NNI from Network A to Network B will not evoke any new information about the new PVC (noted as FS[no info]). This does not mean that the full Status message is empty, only that it contains no information of interest about the new PVC, which has not been configured in Network B yet. Naturally, the subsequent NNI full Status messages also drop the new (N) bit. On the near UNI at the left of the figure, note that once the DLCI has been reported as new (N), the DLCI just becomes inactive (I) in subsequent full Status messages.

But once the multi-network PVC has been configured on Network B, the full Status message sent on the UNI at the right of the figure reports not only that the DLCI exists and is new (N), but also that the new PVC is now active (A), since all of the segments of the multi-network PVC are now reported as active end-to-end and all the links are up (SE and FS pairs are flowing without serious interruptions). All that is left is to report the new PVC in Network B as new (N) and active (A) to Network A on the NNI (note that the DLCI, DLCI 45, must match across the NNI) and to report the new PVC as active (A) to the near UNI on the left.

Note the status of the Active bits for each new PVC on the UNIs. One of the problems with NNI arrangements is that it has been rather difficult to coordinate the setting of the A bits on the far UNI with the setting of the A bits on the near UNI. There is only one bit set in the full Status message on the NNI, the N bit, that the local network (Network A in the figure) can use to trigger the setting of the A bit on the near UNI. FRF.2.1 contains many very good examples of multi-network PVC deletion, NNI failure and restoral, and so forth, along the lines of the PVC addition illustrated here. Interested readers are referred to the full FRF.2.1 document for details.
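To make the Figure 8.4 sequence easier to follow, here is a rough replay of the reports in Python. The message contents follow the text above (DLCI 22 on the near UNI, DLCI 45 on the NNI); the far-UNI DLCI of 33 and the printing scaffolding are invented for illustration.

# Illustrative replay of the Figure 8.4 sequence. Each step prints
# what the full Status (FS) report says on each interface.

steps = [
    ("PVC configured in Network A only", [
        ("near UNI",    "FS[22,N,I]"),   # new to the user, but not yet usable
        ("NNI, A to B", "FS[45,N,A]"),   # A's half of the NNI is ready to go
        ("NNI, B to A", "FS[no info]"),  # B knows nothing about DLCI 45 yet
    ]),
    ("PVC then configured in Network B", [
        ("far UNI",     "FS[33,N,A]"),   # all segments up: new AND active
        ("NNI, B to A", "FS[45,N,A]"),   # B's half now reported to A
    ]),
    ("Next poll cycle on the near UNI", [
        ("near UNI",    "FS[22,A]"),     # the A bit finally propagates back
    ]),
]

for event, reports in steps:
    print(event)
    for interface, status in reports:
        print(f"  {interface}: {status}")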
Event-driven NNI Procedures

One of the most annoying characteristics of multi-network PVCs defined across one or more NNIs is the need to propagate timely PVC status information to the UNIs at each end. This process takes time. With the default Status Enquiry timer of 10 seconds, it can take up to 10 seconds for a failed multi-network PVC on the NNI to convey its failed status (I) to the other network. (The NNI does not require a Status Enquiry requesting a full Status message in order to generate a full Status report if the need arises.) This does not sound like a lot, but a UNI operating at 1.5 Mbps or 45 Mbps can dump a huge number of octets into the network in even 5 seconds. And even if CIRs are respected and no FECN or BECN bits are set, the network has no choice but to discard traffic destined for a failed PVC across an NNI. So time is of the essence.

There is also the need to sift through a full Status report and check the status of each bit associated with each PVC's condition on the NNI. As opposed to a typical UNI, there might be literally hundreds of PVCs defined on the NNI (though only those with live traffic will be filling the NNI pipe). So even finding a PVC problem can take time across the NNI. The full Status message also tends to eat up precious bandwidth on the NNI. What can be done to make the detection of NNI problems faster and more efficient?

The Frame Relay Forum has addressed this issue and made a technique called event-driven procedures an option on the NNI. In addition to providing a faster and more efficient method of reporting PVC status on an NNI, the event-driven procedures address several issues that are important in international frame relay environments. There are five key provisions of the event-driven option set for the NNI. These procedures can be used instead of the normal bidirectional procedures on the NNI. All of the operational details for the event-driven procedures are described in full in FRF.2.1. The five key provisions can be outlined as follows:

No status polling needed. It is not necessary for the networks on each end of the NNI to poll (send SE) for status information. Status changes are reported when detected, without waiting. Also, the NNI link management messages are sent using the full LAPF data link protocol (I-frames, S-frames, etc.) for error detection and recovery. Otherwise, it would be hard to detect missing Status messages, since they can arrive totally unexpectedly. This method is not perfect, and messages could still be lost due to buffer overflows and so on. But at the link level, Status message transfer is considered reliable.

No full Status reports used. There is no longer any need for a receiving network to sift through Status information about PVCs that have been around since the NNI was provisioned. Only PVCs that have changed status (A to I, for example) are reported to the network at the other end of the NNI.

Support for more DLCIs. A frame relay network with many UNIs but only one high-speed NNI can end up with lots of PVCs configured on the NNI. The event-driven NNI can support DLCIs of 17 or 23 bits in length, large enough for many cross-network connections.

Additional diagnostics and network identification. When a multi-network PVC fails, how long will it be out? And in which network did the problem occur? The event-driven Status messages add a national network identifier field to identify the network to the UNIs. There is also an inactive reason field, coded as shown in Table 8.2.
Table 8.2 The Inactive Reason Field in the Event-driven NNI Procedures

CODE (DECIMAL)   MEANING
0   PVC inactive in adjacent network. The status report received had A = 0 status from a network that does not support the reason coding or identification.
1   PVC deleted in adjacent network. The status report received had a deleted status from a network that does not support the reason coding or identification.
2   Interface inactive to adjacent network. The NNI is down and the other network does not support the reason coding or identification.
4   PVC not operational in this network. Check the network identification field.
5   PVC deleted in this network. Check the network identification field.
6   Interface inactive to adjacent network. Check the network identification field.
Protected new (N) bit. Missing an N bit set to a one bit can delay the reporting of an active PVC to the UNI, or even possibly cause a frame relay network to miss the new PVC altogether. Since the event-driven procedures on the NNI use the full LAPF data link protocol, N bit transfer is now also reliable, at least at the link level.

The coding of the inactive reason field as applied to an NNI is shown in Table 8.2. These codes can also be generated by problems on the UNIs, but UNI issues are not discussed here. Values that do not appear are reserved. The coding reports the network identification if the information is available. Otherwise, the field essentially mimics the status of the regular status bits (active, deleted). When the event-driven NNI is first established, the status of all the PVCs is reported in full. But after this initial status report exchange, only PVC segments with a changed status are ever reported again.
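In software, the inactive reason field is just a lookup. The sketch below restates Table 8.2 as a Python dictionary; the describe_inactive helper is invented for illustration.

# Decoding the inactive reason field from Table 8.2. The code values
# and meanings come straight from the table; the helper is invented.

INACTIVE_REASON = {
    0: "PVC inactive in adjacent network (no reason/ID support there)",
    1: "PVC deleted in adjacent network (no reason/ID support there)",
    2: "Interface inactive to adjacent network (no reason/ID support there)",
    4: "PVC not operational in this network -- check network ID field",
    5: "PVC deleted in this network -- check network ID field",
    6: "Interface inactive to adjacent network -- check network ID field",
}

def describe_inactive(code: int) -> str:
    return INACTIVE_REASON.get(code, f"reserved code {code}")

print(describe_inactive(4))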
NNI Agreements and InterLATA Frame Relay

In the United States today, one of the overriding concerns with frame relay services is what to do when all of the sites to be connected by the frame relay network are not within the same local access and transport area (LATA). Transport of bits between LATAs must currently be handled not by the local exchange carrier (LEC) but on interexchange carrier (IXC) facilities. So, even if the frame relay service is handled locally by the same LEC at all sites, interLATA frame relay traffic must pass through an IXC one way or another. The LECs are constantly asking the states and federal government to ease this restriction, especially for data networks such as frame relay. But so far, the LATA construct remains firmly in place. IXCs cannot directly offer local services within a LATA without a LEC involved, and the LECs cannot offer interLATA frame relay services without IXC involvement.

Why should anyone care? Mainly because multiple service provider arrangements are not always the most efficient solution to a network situation, not only in terms of troubleshooting, but also in terms of simple administration, configuration, and even pricing. In many cases, customers have tended to favor IXC frame relay service offerings in multi-LATA, and especially in regional and national, frame relay environments. In this case, the UNI is simply a leased access line from the LEC to the IXC frame relay point of presence (POP). Once at the POP, the IXC can carry the frame relay traffic on its own facilities to the destination UNI, also a LEC leased line. In fact, the IXC frame relay network can be totally within a LATA anyway, as long as the local access is handled by an incumbent LEC (ILEC) or a state-certified other LEC (OLEC, or Competitive Access Provider, CAP). These are all just LECs for the purposes of this discussion. The interLATA situation from the IXC perspective is shown in Figure 8.5.

Such IXC interLATA frame relay works and works well. The link management procedures quickly detect UNI outages, and the LEC usually responds to such outages rapidly and efficiently. But from the LEC perspective, this arrangement leaves a lot to be desired. Although involved in the frame relay network, the LEC at either end of the network is basically just a passive bit carrier. The LEC line is usually only a few miles in length, so costs are low. There is no LEC frame relay at all in the figure.
Figure 8.5 InterLATA frame relay from the IXC perspective.

So the LECs have all sought to figure out how to sell frame relay solutions to their customers with interLATA requirements. What most LECs have come up with is three interLATA alternatives that allow the LECs to sell frame relay and still serve sites in more than one LATA. The three approaches are:

1. NNI to the IXC's frame relay network.
2. The extended UNI by means of a private line.
3. The hub in the LATA (backhauling with a private line).

Each method has attractions and deserves to be considered more fully.
NNI to Interexchange Carrier Frame Relay

This method is nothing more than the familiar configuration of two NNIs between three frame relay networks. Of course, there is more to an NNI than just getting a link to another service provider's frame relay switch. The NNI is not a UNI. Usually, when two frame relay service providers want to establish an NNI between their frame relay networks, the first step is a series of negotiations between the frame relay operations personnel representing the two service providers. Most of the process is unexciting: Is the NNI to be ANSI Annex D or ITU-T Annex A? What values should be used for the parameters? Is FRF.2.1 fully supported? Will there be event-driven procedures? And so on. But some of the items that must be agreed upon are more critical. How long should it take for a new multi-network PVC to be configured? Which organizations will be the primary points of contact between the two networking entities? What are the proper escalation procedures to resolve stubborn problems? There are no right or wrong answers to any of these questions. There might be lots of wrong answers, but there are no right answers.

A particularly tricky problem is reciprocal billing and traffic arrangements. Reciprocal billing in frame relay is not usually a billing option at all. It is more or less a simple agreement that "I'll carry your traffic if you'll carry mine," along the lines of peering arrangements in an Internet service provider (ISP) situation. Of course, the asymmetrical nature of traffic in most client/server networks means that one LEC could be handling a lot more traffic outbound on a UNI than inbound. This could be a problem if customer billing becomes use-sensitive and is paid for on the sending side. Again, this is like a replay of all the access charge debates among the LECs and ISPs. For now, flat-rate PVCs raise no serious challenges to NNI agreements. When and if SVCs become common, however, billing issues on the NNI will rise to the forefront.

Many of the major frame relay service providers have NNI agreements with each other. The only way to be sure whether an NNI agreement exists between Carrier A and Carrier B, however, is to ask. For a while, several notable IXCs would not sign any NNI agreements at all with the LECs or other IXCs. A customer needs a LEC leased line for access to the IXC POP, so what? This is all the LEC should do, the philosophy went. Fortunately, those days of aloof frame relay service providers seem to be over.
Extended UNI

The concept of the extended UNI turns the LEC-private-line-UNI-to-the-IXC-POP concept on its head. Why not carry the UNI on a long IXC leased line to the LEC frame relay POP in another LATA? This arrangement is shown in Figure 8.6.
Figure 8.6 The extended UNI from the LEC perspective.

The IXC is now reduced to the role of providing a private line to the LEC frame relay switch in another LATA. Although a different LEC is shown in the figure, both LECs in the extended UNI arrangement are typically the same. The extended UNI is simple, needs no NNI agreements with the IXCs, and allows LECs to compete with IXCs for interLATA frame relay business. The cost of the IXC leased line can be high, however, although much depends on the distance involved and the area of the United States.
Hub in the LATA
The third and final LEC arrangement for interLATA frame relay also does not require an NNI agreement with the IXC. This is the concept of the hub in the LATA. In this arrangement, all of the PVCs within a LATA that must connect to another LATA lead to a designated location, the hub. Hub locations are linked by IXC leased lines. The hub locations can link the frame relay switch ports together directly. In this case, there is an NNI, but it remains firmly under the control of the LEC (or LECs, if there are two LECs involved). Alternatively, the hub locations can have ordinary routers, and the routers are linked by the IXC leased line. Both methods will work, but in the case of router connectivity, there is no frame relay NNI to be concerned with. Both of these arrangements, NNI on IXC leased line and router-to-router connectivity, are shown in Figure 8.7.
Figure 8.7 The hub in the LATA.

The NNI on IXC private line basically reverts to the classic NNI between two frame relay clouds belonging to the same service provider, although the presence of the NNI at least holds out the possibility that the two local frame relay service providers could be different LECs. In the second case, at the bottom of Figure 8.7, both routers are firmly under the control of the same LEC organization. The absence of the NNI makes this more or less mandatory, although exceptions could probably be found.

No matter which interLATA arrangement is chosen, each offers a way for local frame relay service providers to serve their customers across LATA boundaries. Such arrangements will be common until the interLATA traffic restrictions are dropped, at least for data services like frame relay. In the meantime, only well-established interLATA corridors (e.g., between northern New Jersey and New York City) do not require one of these methods for a LEC to offer interLATA frame relay.
Switched PVCs (SPVCs)

By now it should be apparent that configuring and maintaining PVCs on an NNI is a labor-intensive task. Usually, the connection between frame relay networks is a single physical link, typically running at either 1.5 Mbps or 45 Mbps in the United States. But each and every multi-network PVC that exists has to be mapped onto this connection. This is true even if the PVC is only used between the UNIs once a month for payroll summaries, or even once every three months for quarterly spreadsheet consolidation. If some method could be found to both simplify administration of the NNI PVCs and make this process more efficient, the NNI would become a more effective interface between frame relay networks.

So the Frame Relay Forum has come up with the concept of Switched Permanent Virtual Circuits (SPVCs) for use on the NNI. The idea of SPVCs is detailed in FRF.10, titled "NNI SVC Implementation Agreement." Do not be misled by the presence of the term "SVC" in the title of the document. To the users on their UNIs, the SPVC is totally invisible. In fact, to both users and network administrators, SPVCs look like regular PVCs; this resemblance is intentional. SPVCs have some characteristics of SVCs, however, and (more importantly) can span networks because SPVCs are on the NNI.

SPVCs use frame relay SVC signaling on the NNI to establish calls (frame relay connections) across the NNI. But SPVCs do not require the user to signal the network about user connection parameters such as CIRs or burst values. Instead, the SPVC uses a call request agent process in the switch (or network as a whole) at the end of the NNI to relay these parameters across the NNI. So the signaling protocol uses the proper parameters from the existing PVCs when establishing the connection, which could terminate in another frame relay network, and usually does. The concept of SPVCs is shown in Figure 8.8.
Figure 8.8 The concept of a switched PVC (SPVC).

Note that there are no longer any PVC segments on the NNI, and therefore no need for individual network administrators to match DLCIs, CIRs, and so on for each PVC on the NNI. The SPVCs now come and go on the NNI as PVCs within the frame relay network become active from the user perspective. Of course, as with regular or UNI-based SVCs, SPVCs do need the frame relay networks to support resource allocation and call setup routing. But it is easier to do this for a handful of NNIs than for each UNI and link between frame relay switches.

A nice feature of SPVCs is that they offer a way to back up the NNI and make it more fault-tolerant. The official word for this fault-tolerant feature of SPVCs is resilience. SPVC resilience can detect a failed SVC on an NNI (called an SVC-NNI) and release the SPVC all the way back to its point of origin, which is the call request agent itself. The agent can then consult its routing tables to find out if another NNI is available to the destination network and re-establish the SPVC. If there is an alternate route, the SPVC can be established on that NNI. This works whether the NNI failure is due to a physical path failure, a line card failure, or even a switch or total site failure. It is up to the network whether the new SPVC should use a new line, switch, or node to reconnect over the alternate NNI. The alternate NNI could be used for live traffic, of course, and does not have to be reserved only for reasons of NNI resilience, although capacity planning is always a good idea in any event.

SPVCs have a number of distinct features. SPVCs are released when a PVC is quiet and established only when the PVC is carrying live traffic. Although the SPVC call setup contains the frame relay network addresses of the endpoints, SPVCs are not established from site to site, but actually from DLCI to DLCI. This only makes sense, because the SPVCs really link PVCs, not sites. Agents are configured to accept calls only from the proper DLCIs. The call setup request used with the SPVC signaling protocol can tell the called party (the agent) to use a specific DLCI, to use any DLCI at all, or to pick its own DLCI.

The Frame Relay Forum has even established an SPVC reference model for the NNI. A reference model is just a useful way to show all of the possible ways that a technological method can be helpful in a variety of situations, without necessarily going into all of the details about how this is to be accomplished. The SPVC reference model establishes five unique SPVC applications. For SPVC purposes, the reference model classifies frame relay UNIs and NNIs as either PVC-UNIs (only PVCs defined), SVC-UNIs (SVCs allowed also), PVC-NNIs (only PVCs defined), or SVC-NNIs (SVCs allowed also). Three of the five SPVC applications include SPVC to SPVC interoperability, while the other two do not. With SPVC interoperability, the SPVC agents reside in two separate frame relay clouds under the control of two different frame relay service providers. The five SPVC applications (cases) are shown in Figure 8.9.
Figure 8.9 The five SPVC applications.

These are the five SPVC configurations defined in the SPVC reference model:

1. SPVC to SPVC, from PVC-UNI to PVC-UNI, across an SVC-NNI.
2. SPVC to SPVC, from PVC-UNI to PVC-NNI, across an SVC-NNI.
3. PVC to SPVC, from PVC-NNI to PVC-NNI, across an SVC-NNI (an SPVC interoperability scenario).
4. SPVC to SVC, from PVC-NNI to SVC-UNI, across an SVC-NNI (an SPVC interoperability scenario).
5. SPVC to SVC, from PVC-UNI to SVC-UNI, across a PVC-NNI and an SVC-NNI (an SPVC interoperability scenario).

All are multi-network SPVCs, but only the last three cases involve SPVC interoperability. Presumably, SVC-NNIs are to be deployed only when the same service provider controls both clouds. This makes sense, because billing and resource allocation issues are at least manageable with the same service provider involved on both ends.

A lot of the details of SPVC procedures have yet to be worked out. The implementation agreement is more a blueprint document than a detailed, step-by-step set of instructions for SPVC deployment. There is nothing really wrong with this approach; it is a common way of developing complex specifications in other technologies as well, from the Internet to ATM. So all issues and aspects of SPVC interoperability, for instance, are set aside as an area for further study. What this boils down to is that SPVC implementation will be a gradual and evolutionary technology, not a frame relay enhancement that will appear overnight.
This is not to say that the SPVC specification is devoid of details or helpful information. For example, the details about when an SPVC is to be disconnected due to active bit status or a link integrity failure are completely spelled out. The active status bits will still flow over the PVC-NNI even when SPVC procedures are in use. When the active bit indicates that the PVC status is inactive (the active bit is set to a 0 bit), then an SPVC disconnect takes place at the end of the link that supports SPVCs. The DLCI mapped to that SPVC is also set as inactive. This procedure makes sure that an inactive PVC also disconnects the affected SPVC and tags the other end of the SPVC as inactive as well.

The link integrity failure rule is similarly spelled out. Whenever a PVC-NNI or PVC-UNI interface that feeds a particular SPVC is declared inoperative by the link management procedures, all of the SPVCs that lead to or from that PVC-NNI or PVC-UNI are disconnected, again by SPVC procedures. Whichever end of the link detects the failure initiates the disconnection sequence. Whenever an SPVC has been disconnected, reconnection is attempted periodically to see if the PVC-NNI or PVC-UNI link has been restored. This periodic retrying of the SPVC is necessary because no mechanism exists to reliably inform the SPVC signaling agents that the PVC-NNI or PVC-UNI link has been restored to service.

So the SPVC implementation agreement can be very detailed on some aspects of SPVC behavior. But many more of the operational details are not spelled out in the document or have been marked out for future study. These brief specifics have been presented to show that SPVCs are quite complex. The Frame Relay Forum document describing them is one of the largest of all the implementation agreements. This section has only given the main ideas behind the concept of SPVCs. As always, interested readers are referred to the documentation itself.
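Those two teardown rules, plus the periodic reconnection attempt, translate naturally into a short Python sketch. The Spvc class and its hooks below are invented for illustration; FRF.10 specifies the behavior in prose, not this API, and the 30-second retry interval is an arbitrary choice.

# Hedged sketch of the SPVC teardown and retry rules described above.

import time

class Spvc:
    def __init__(self, dlci: int):
        self.dlci = dlci
        self.connected = True

    def disconnect(self, reason: str) -> None:
        self.connected = False
        print(f"SPVC (DLCI {self.dlci}) released: {reason}")

def on_active_bit(spvc: Spvc, active_bit: int) -> None:
    # Rule 1: A = 0 on the feeding PVC means disconnect the SPVC and
    # mark the mapped DLCI inactive at the other end as well.
    if active_bit == 0 and spvc.connected:
        spvc.disconnect("active bit cleared on feeding PVC")

def on_link_failure(spvcs: list[Spvc]) -> None:
    # Rule 2: a failed PVC-NNI or PVC-UNI takes down every SPVC it feeds.
    for spvc in spvcs:
        if spvc.connected:
            spvc.disconnect("link integrity failure")

def retry_loop(spvc: Spvc, try_setup, interval: int = 30) -> None:
    # Nothing tells the signaling agents when the link comes back, so
    # the agent simply re-tries the call setup periodically.
    while not spvc.connected:
        spvc.connected = try_setup(spvc.dlci)
        if not spvc.connected:
            time.sleep(interval)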
International Frame Relay

Frame relay has been called "the first international network technology that actually works," although this is a very extreme statement. What this means is that a FRAD located in the United States can successfully exchange frame relay frames with a FRAD located in, say, Paris or London. There could be two local (national) frame relay service providers, one in each FRAD's country, and probably an international frame relay service provider in between, but not necessarily. It is just as likely that the United States frame relay service provider would have an NNI agreement with the national frame relay service provider in the other country. The point is that the high degree of standardization of the UNI and NNI (particularly the NNI in this situation) makes it possible for international networks to be constructed with frame relay.

The significance of this frame relay characteristic should never be underestimated. Even X.25 packet switching, which was every bit as standardized as frame relay is today, struggled in international deployments. Major international X.25 service providers had to deal with Belgian X.25, German X.25, French X.25, and so on. The X.75 standard between X.25 networks was supposed to take care of transferring packets from one X.25 network to another, and it did. But X.75 could not hide all of the differences between the X.25 packets being shuttled back and forth between different countries. International X.25 networks had to deploy a series of version-specific gateways between different nations, often at great effort and expense. Even so, X.25 is a very successful international standard.

The relatively easy international network deployment with frame relay has been an enormous benefit to users, customers, service providers, and vendors alike. Vendors can manufacture frame relay equipment that can be used anywhere, without adapters or differing configurations. Service providers can deploy the exact same hardware and software anywhere they please, subject to local regulation, of course. Customers can plan international expansion with fewer concerns than ever before. And users can perform their functions without worrying about if and how information will get where it is going.

Not only does international frame relay work, but customers often have a choice of service providers as well. The increasing atmosphere of deregulation around the world has in turn led to an explosion of frame relay service providers. It is not necessary to list every frame relay service provider in the world to show the truth of this statement. Deregulation of national service providers and the growth stimulated by competitive telecommunications services markets have been evident everywhere. The situation is far from perfect, however. Many smaller countries have only one frame relay service provider, and some have none at all. But many of the most industrialized countries in a given region have half a dozen choices of frame relay service providers, and some have more than that. This is not to say that all frame relay service providers have equal service area coverage or availability. But at least they exist in environments where none existed at all not too long ago.

Rather than listing service providers around the world to show the viability of international frame relay, this section will focus on one section of the globe. Consider Central and South America. Table 8.3 shows the number of frame relay service providers in each country, as of mid-1998.
Note that smaller countries are still underrepresented in the frame relay arena. But also note the activity in the frame relay marketplace of the larger countries in the region. There are as many countries with more than two frame relay service providers as there are countries with none or one.

Table 8.3 Frame Relay Service Providers in Central and South America

NUMBER OF SERVICE PROVIDERS   NUMBER OF COUNTRIES   COUNTRIES
0        10   Belize, Cuba, El Salvador, French Guiana, Guyana, Honduras, Paraguay, Suriname, Uruguay, Virgin Islands
1        3    Grenada, Nicaragua, Panama
2 to 3   6    Bolivia, Costa Rica, Dominican Republic, Ecuador, Guatemala, Peru
4 to 6   1    Puerto Rico
7 to 9   6    Argentina, Brazil, Chile, Colombia, Mexico, Venezuela
Why should anyone care about international frame relay? The answer is simple: maintaining steady revenue growth usually requires businesses to expand their markets. Market expansion can only take a business so far in the country it was born in, since there are only so many people. The rest of the expansion must take place abroad. Some companies have been so successful at international marketing and presence that one is hard-pressed to recall that Shell Oil is a Dutch company and Nestlé a Swiss one. Although the following comments apply mostly to U.S. companies considering frame relay for international use, the general themes apply to all countries.

Most of the frame relay service providers in the United States offer international connectivity. This is usually delivered through strategic partnerships with either an international frame relay service provider or a group of service providers whose members form an international consortium. But this is just the minimum requirement for connectivity. There are other issues to resolve and questions to be asked. There are issues of support, connectivity, and invoicing to consider. Each is important enough to deserve a section of its own.
International Support Issues

A network without adequate support is like an automobile without a reliable mechanic. The car will run just fine until the first problem arises. Then it might never run properly again. Many international frame relay network issues involve support considerations. Among these support considerations are the items mentioned in this section, although this is not an exhaustive list by any means.

How are troubles reported, and who will handle these reports? Perhaps the domestic service provider will handle the reports of troubles at international sites. Perhaps these troubles need to be reported to the other country's service provider. It is always better to have a single point of contact for any network, and most frame relay service providers will provide just that. There might be a separate point of contact for international trouble reporting, however.

Are there proper escalation procedures for the international portions of the network? Not all countries have the elaborate and detailed escalation procedures common in the United States. Some countries' approaches to ongoing and long-standing problems never quite reach the level of urgency that a stranded customer might desire. A lot of the differences here are cultural. Not everyone thinks that a frantic pace is necessarily a good thing.

What is the time frame if on-site support is needed? This issue relates to the previous concern about escalation. In remote and rural areas of some countries, on-site help has to be flown in to a site, and scheduled air traffic might be minimal. In extreme situations, the technicians might have to fly in from another country altogether.
Is there a help desk staffed by people on the ground in each country? Strategic partnerships help a service provider offer services in places where it has no presence of its own. But it makes no sense for a user in Greece, for example, to have to call France or even the United States to ask a simple question about the network. Flying specialists around (sometimes called tiger teams, after the Flying Tigers of World War II fame) is one thing. Making everyone place international calls for simple help desk questions is quite another.

Are the technicians required to speak the language of the country they support? Many of the difficulties associated with international networks are basic language difficulties. It helps somewhat that most acronyms are international in scope (NNI or TCP/IP), but when it comes to support and help issues, accuracy of problem description is very important. While many people around the world routinely learn English as a second language, lack of English skills is not a sign of a poor education or a lack of intelligence.

Do the frame relay reports include statistics for the international portions of the network? Network monitoring is important in any network, and especially in frame relay, where there are really no dedicated resources at all for the bursty traffic that characterizes it. Statistics for CIR adjustments, and even for UNI and NNI physical access link sizing, are needed for more than just the domestic portions of the network.

Is the network truly a whole, and are all features supported everywhere? Frame relay service providers differ in the way that DE frames are handled or FECN/BECN bits are set and handled. Some service providers support zero CIR and others do not. When there is more than one frame relay service provider involved, as in international frame relay networks, it is important to determine what the features are as a whole, not just on one portion of the network or another.
International Connectivity Issues

Support issues are not the only important considerations that need to be addressed with international frame relay. The type of connectivity provided between the two domestic networks is important as well. Among these considerations are the items mentioned in this section, but again this is not an exhaustive list.

Is there a true NNI, or is there a type of interLATA arrangement? Recall that in the absence of a true NNI arrangement, various leased private-line configurations can allow two frame relay networks to attach to one another. However, all of the link management information that would ordinarily flow across the NNI is then absent. This might still be acceptable to all parties involved, but the only way to be sure is to make certain all parties are informed of the presence or absence of an NNI agreement.

Are the international connections redundant and/or diverse? Nothing is worse than funneling all traffic between major areas through a single bit pipe, no matter how large. There might be two or more high-speed NNIs between networks, but if both are only channels within a single international fiber link, there is little to prevent prolonged outages in the case of disaster. Redundancy and diversity are even more important in international situations than they are domestically. This applies to transit countries as well (a transit country is not the source or destination of any of the bits, but carries the backbone bits right through it).

Where are the gateways between the networks located? Not all frame relay switches in a given network support NNIs. In fact, it makes a lot of sense to minimize the number of NNIs that must be configured and supported, especially in purely PVC frame relay networks. These NNI frame relay switches are often called gateways, but the term is not really used in the same way as it is in LANs or SNA networks. That is, there is no protocol conversion going on at the frame relay gateway location, just the end of one or more NNIs. The location of these NNI endpoints is important for a couple of reasons. First, it is always better to minimize the distance between the UNI and the NNI in terms of hops, for both delay and reliability reasons. Second, it makes no sense to route PVCs in a totally roundabout fashion to reach a given destination.
International Invoicing Issues
Among the more mundane considerations regarding international frame relay networks are the billing and invoicing issues. The routine nature of the billing task does not mean invoicing is unimportant, however. Again, there are issues that matter more in international situations than in purely domestic ones.

Is there a single invoice for all charges? If the consortium wants to present itself as a single entity, then there should be a single invoice presented to the customer. All of the interexchange carriers quickly found out in voice service situations that customers still wanted a unified bill for services, both local and long-distance, no matter how many service providers were really involved. The same is true for frame relay billing.

Will international sites count toward any domestic use discounts? Many service providers offer discounts when a certain traffic level is reached. With international frame relay, does traffic generated overseas count toward the domestic traffic quota? What about domestic traffic and overseas discounts? The savings due to such cross-discounting practices can be significant, if the frame relay service providers offer this type of package.

Are invoices organized by location? Many organizations like to have an idea of which department is generating what amount of network traffic. Sometimes this is so the organization can run its own internal charge-back schemes against individual departmental budgets, but usually it is just so that planners can have an idea of where the future of the network should be headed. With frame relay, invoices can even be broken down and sent to the different locations abroad as well as domestically.

Are invoices electronic? An invoice that is just a piece of paper in an envelope is not all that helpful. Paper gets lost, misfiled, shredded accidentally (or intentionally), and so on. An invoice on a diskette, CD, or even good old 9-track tape is much more useful and permanent. The electronic invoice can be further processed for different departments, locations, and so forth. The electronic invoice can be archived, distributed, and even used to determine trends and make forecasts. The same thing could theoretically be done with paper, of course, but much less efficiently.

Are invoices in different languages and currencies? The language barrier has already been mentioned with regard to support issues. But not everyone with a checkbook can read English (or French, for that matter). Why send an English invoice to a department in Tokyo if no one there can read it? The same logic applies to the invoice billing amount as well. If a local department of a global corporation in Malaysia, staffed by Malaysians, is to pay for its local portion of the frame relay network service, it only makes sense for the department to receive a bill in ringgits. Exchange rates might be volatile, sometimes extremely so, but this is not an argument against local currencies, only an argument for timely billing and collection agreements.
The Future of the NNI

It might seem a little odd to consider the future of the NNI in isolation. After all, how could the future of the NNI be substantially different from the future of frame relay as a whole, or the future of the UNI? But in fact, the future of the NNI is very critical to the future evolution of not only the UNI, but the entire future of frame relay as a whole. This is why.

Frame relay is designed for bursty traffic patterns, especially those patterns from LAN traffic. No surprise here. But many DLCIs, whether they represent PVCs or SVCs, are mapped onto a limited number of NNIs between frame relay clouds. Not only is there a valid concern about the sheer number of DLCIs to be configured, maintained, and managed at the NNI, which is addressed by the idea of switched PVCs (SPVCs), but there is an overriding concern about the aggregation of a statistically variable amount of traffic onto these NNIs with their limited speeds. Only so much traffic can be handled by an NNI at a given time. Frame relay networks tend to respond to congestion by discarding traffic. But discarding is necessarily only a short-term solution for dealing with bursts, because the more bursters there are, the better the chances that a large number of these bursters are bursting all at the same time. On lazy summer evenings, the fireflies all still burst at random and at their leisure, but given enough of them, the chances are pretty good that one or more fireflies will be emitting their burst of light at the same time. So bursty traffic on a UNI is not the same as bursty traffic on the NNI.

SPVCs will help somewhat, but only with the administrative tasks. The need will still exist to provide the NNI with more bandwidth than exists on the UNIs in terms of physical line rate. The fastest UNI in common use today in the United States runs at 1.5 Mbps, although speeds up to 45 Mbps are supported. In most of the rest of the world, these speeds translate to 2 Mbps and 34 Mbps, respectively. As the amount of fiber optic cable deployed continues to grow in leaps and bounds, and the cost of higher-speed broadband UNIs becomes financially attractive, there will be more and more UNIs running at the higher of the available speeds. But how many 45- or 34-Mbps UNIs to every customer site can be supported on a limited number of NNIs running at the same speed?

Therefore, not only will the NNI still be around in the world of high-speed UNIs, but the number of NNIs will have to increase as well. This is because the number of frame relay service providers around the world will not decrease over time, but increase, especially with the new atmosphere of deregulation in more and more countries. There may be local pockets of consolidation, of course, but the overall trend will be up, and perhaps sharply up in some locations. So, having the peak speed for the frame relay UNI and NNI set at the same value is probably not a good long-term strategy. Also, it is hardly a good approach to require major work to be done by standards organizations and the Frame Relay Forum every time a simple change in the speed of the NNI and UNI is contemplated. What is needed is for the NNI and UNI speeds to be automatically scaleable, or tied to a standard that evolves independently from frame relay. That way, the maximum speed of an NNI or UNI could be tied to something like Sonet or SDH (Synchronous Digital Hierarchy, the international standard version of North America's Sonet).
The frame relay NNI and UNI could scale to different speeds of Sonet/SDH, with the maximum allowable speed on the NNI always exceeding the maximum allowable speed on the UNI. The same would be true of the minimum speeds, but this time the UNI could always be slower than the NNI minimum speed. ATM is already aligned with Sonet/SDH in exactly this way. Work is underway to align frame relay and Sonet/SDH for the UNI and NNI in much the same way.
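The firefly argument above can be made concrete with a little probability. Assuming, purely for illustration, that the UNIs behave as independent on/off sources, the binomial distribution gives the chance that enough of them burst at once to exceed what the NNI can carry. The prob_at_least helper and the numbers used are invented for this sketch.

# Back-of-the-envelope version of the firefly argument: with n
# independent UNIs each bursting a fraction p of the time, the odds
# that more than k burst simultaneously grow quickly with n.

from math import comb

def prob_at_least(n: int, p: float, k: int) -> float:
    """P(at least k of n independent on/off sources bursting at once)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 50 UNIs, each bursting 10% of the time, NNI sized for 10 simultaneous
# bursts: the probability that an 11th burst arrives and must be dropped.
print(f"{prob_at_least(50, 0.10, 11):.4f}")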
There is even another reason that higher-speed and scaleable NNIs and UNIs are important. Frame relay was designed for bursty data traffic patterns. But frame relay is also a fast packet network, so it is capable of transporting packets containing not only data, but voice and video as well. Voice over frame relay is already an important application on frame relay networks. Video will follow as well. The use of frame relay networks for voice communications is important enough to merit a chapter of its own.
Chapter 9: Voice over Frame Relay

Overview

Things were much simpler when it came to networking voice not too long ago. There was a whole infrastructure, the public switched telephone network (PSTN), that was totally built and optimized for transporting digital voice. The basic building block of the network links was the DS-0 voice channel running at 64 kbps, which was the international standard. Various DS-0 packages could be built up from this basic structure to create a DS-1 (24 voice channels) or a DS-3 (672 voice channels, or 28 DS-1s) in North America, and similar packages in the rest of the world (e.g., 30 voice channels in an E-1). The DS-0 voice channels connected central office digital switches, which were simply big computers all optimized to handle thousands of streams of 64 kbps digital voice from local loop to trunk, trunk to trunk, and so on. Analog equipment linked to this network (mostly residential and small business telephones) had its analog signals digitized as soon as the voice hit the network and converted back to analog just as the digital signal left the network for an analog local loop. In between, the digital circuit-switched PSTN kept everything humming along.

Ironically, the access lines, or local loops, leading to the digital backbone of the PSTN were overwhelmingly analog, and still are today in most places. The only way that digital signals representing the data from a computer could make it into the digital portions of the PSTN across analog access lines was by making the data look like voice. This is what a common modem did and still does. Anyone who has picked up a telephone line that has been dialed by a computer modem is familiar with the squeaks, hisses, and whistles that represent how the computer's digital data sounds. Even when access lines are digitized, as with ISDN, the result is still most often organized into a given number of voice channels operating at 64 kbps. And private lines, even when used for frame relay network access (for example), still adhere to the basic 64 kbps of the DS-0.

Today things are not so simple, however. People not only talk on the PSTN, they talk on the Internet as well, while downloading Web pages. Some businesses use ATM fast packet networks for voice while they transfer invoices. Many businesses install managed IP networks that basically boil down to private lines organized not into circuit-switched networks, but rather into packet-switched networks running the IP protocol everywhere, which again carry voice packets and data packets over the same link at the same time. And many businesses also talk over their data frame relay network, which has at least as good a voice quality as voice over managed IP networks, and usually much better. So even the simple act of picking up the telephone and calling someone is more complex than it was only a few years ago.

Something must have happened to cause the change. But what? How can voice, which once filled a 64 kbps voice channel all by itself, now share a 64 kbps Internet access line or frame relay UNI with file transfers and the like? Simply putting the voice in packets or frame relay frames cannot be the answer: 64 kbps digital voice is still 64 kbps inside of packets or frames. Packetized voice was once a dream. What has happened to voice to make it a reality? The answer is easy to state. Digitized voice is now being subjected to various Digital Signal Processing (DSP) steps after the initial 64 kbps stream is obtained.
Using advanced techniques such as silence suppression and new compression algorithms, voice is no longer the sole occupant of the 64 kbps DS-0 link. Voice subjected to this DSP treatment requires only 8 kbps of bandwidth, and sometimes only 2 kbps. Motorola engineers foresee the day when voice occupies no more than 1 kbps on a digital link. This shrunken voice is now just another packet content, and perhaps not even the most important packet content on the network link, and maybe now that link is only a frame relay UNI. Perhaps the voice-oriented and optimized PSTN is a needless extravagance in a world where the quest for more data bandwidth has led right up to the door of the PSTN central office. Perhaps what the world needs now is not a global network optimized for voice carrying data that sounds like voice, but rather a global network optimized for fast packets carrying voice that looks like packets. In fact, this transition is well underway in many countries already.

This chapter explores the whole voice-DSP phenomenon as it applies to frame relay. It is good to keep in mind that this phenomenon of efficient packetized voice is not confined to frame relay. However, before exploring how the Frame Relay Forum specifies that voice is to be carried over a frame relay network, it might be a good idea to take a closer look at how voice at 8 kbps and even below has come to pass. Why was digital voice tied to 64 kbps in the first place? After all, if DSP is so easy to accomplish, 8 kbps voice should have happened long ago.
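The arithmetic behind these claims is simple enough to show directly. The sketch below uses only the rates cited in the text and ignores packet and frame overhead, so the call counts are best-case illustrations.

# How many DSP-compressed calls fit in one DS-0 worth of bandwidth,
# at the bit rates cited above (overhead ignored for simplicity).

DS0_KBPS = 64

for rate in (64, 8, 2, 1):
    print(f"{rate:>2} kbps voice: {DS0_KBPS // rate:>2} calls per DS-0")
# 64 kbps voice:  1 calls per DS-0
#  8 kbps voice:  8 calls per DS-0
#  2 kbps voice: 32 calls per DS-0
#  1 kbps voice: 64 calls per DS-0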
FRF.11 Compliance Requirements

V-FRADs must implement a minimum set of capabilities to comply with FRF.11. These compliance requirements fall into two classes, Class 1 and Class 2. Class 1 is for high-speed UNIs and NNIs (essentially speeds above 64 kbps) and Class 2 is for lower-speed UNIs and NNIs (essentially 64 kbps). Both classes use the same basic subframe structure and allow receivers to discard optional frames (not discussed here, but detailed in FRF.11). Above and beyond these similarities, there are some differences.

Class 1 (high-speed) V-FRADs must support G.727 (ADPCM) encoding, but other encodings are optional. Senders must be able to send 32 kbps G.727 voice, but receivers must be able to understand 32 kbps, 24 kbps, and 16 kbps encodings. These V-FRADs must support signaling bits in the signaled payloads, but support for fax, data, and dialed digits is optional.

Class 2 (low-speed) V-FRADs must support G.729 or G.729A (CS-ACELP) encoding. These V-FRADs must support signaling bits and dialed digits in the signaled payloads, but support for fax and data is optional.
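The class requirements just described can be summarized as data. The sketch below shows one possible way a V-FRAD configurator might encode them; the REQUIRED table and compliant helper are invented, although the mandatory items follow the text above.

# Invented summary of the FRF.11 Class 1/Class 2 mandatory items.

REQUIRED = {
    1: {"encodings": {"G.727"},            # Class 1: ADPCM mandatory
        "payloads": {"signaling bits"}},
    2: {"encodings": {"G.729", "G.729A"},  # Class 2: either CS-ACELP variant
        "payloads": {"signaling bits", "dialed digits"}},
}

def compliant(klass: int, encodings: set, payloads: set) -> bool:
    """Does a V-FRAD's capability set meet the mandatory items?"""
    req = REQUIRED[klass]
    # At least one mandatory encoding, plus all mandatory payload types.
    return bool(req["encodings"] & encodings) and req["payloads"] <= payloads

print(compliant(2, {"G.729A"}, {"signaling bits", "dialed digits"}))  # True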
Voice over Frame Relay Implementation Agreement (FRF.11)

In 1997, the Frame Relay Forum finalized FRF.11, the implementation agreement that spelled out how V-FRAD vendors were to support voice over frame relay networks. V-FRADs existed prior to 1997, of course, but FRF.11 added an element that was heretofore missing from the V-FRAD equation: vendor interoperability. V-FRAD vendors' equipment that complied with FRF.11 specifications should be interoperable. No longer would customers be locked into one vendor's V-FRAD equipment always and forever. At least that's the idea. FRF.11 sets forth a number of goals:

1. Allow the transport of compressed voice inside frame relay frames.
2. Allow a variety of voice compression schemes to be used.
3. Work well even on low-speed (64 kbps) UNI frame relay connections.
4. Allow for the multiplexing of up to 255 subchannels (individual voice calls) on a single DLCI.
5. Allow the loading of multiple voice calls into a single subchannel or different subchannels in the same frame.
6. Allow data subchannels on a multiplexed DLCI used for voice.

The big feature here was the support of multiple subchannels inside a single frame. This allows compressed voice from a variety of sources to be carried inside the same frame relay frame, over the same DLCI, to the same location. Without this subchannel feature, a separate DLCI would be required for each voice stream. Since all frame relay service providers limit the number of DLCIs supported on a UNI, the subchannel feature keeps voice from exhausting that limit. Additional frame formats are established in FRF.11 for the transfer of nonvoice information such as signaling bits (the ABCD signaling bits used on digital trunks), the digits pressed on the telephone keypad, facsimile service (called, appropriately, fax relay), and more.

FRF.11 defines a voice over frame relay protocol stack, the VoFR stack, as the source and destination of the voice inside the frame relay frames. In a very real sense, the V-FRAD is a package that includes not only some or all of the Q.933 protocol, but also the VoFR stack. FRF.11 defines three distinct types of packages that can be V-FRADs. These three types are shown in Figure 9.6. Each of the types is still capable of interacting with the other two types. The differences are just in packaging, not in capability. The types of devices shown as examples of the device type are representative, not mandated. End-system devices are basically PCs or servers with voice over frame relay software. The type of voice interface shown in each device is not part of the FRF.11 specification.
Figure 9.6 The voice over frame relay reference model.

Voice over frame relay supports five types of digital voice. None of these are defined in FRF.11, but all are defined by various international ITU-T standard recommendations. These are shown in Table 9.2, in ascending order by normal bit rate usage. So ACELP uses the least amount of bandwidth and PCM the most. No absolute bit rates are shown, as several of the standard methods allow multiple bit rates. Currently, FRF.11 includes no standard procedure for allowing one V-FRAD to figure out exactly what digital voice technique another is using. The choices must be configured manually by some local administrator.

Table 9.2 Digital Voice Choices in Voice over Frame Relay

TECHNIQUE   (ACRONYM MEANING)                                                ITU-T STANDARD
CS-ACELP    (Conjugate structure—algebraic code-excited linear predictive)   G.729
LD-CELP     (Low delay—code-excited linear prediction)                       G.728
MP-MLQ      (Multipulse—maximum likelihood quantizer)                        G.723.1
ADPCM       (Adaptive differential pulse code modulation)                    G.726/G.727
PCM         (Pulse code modulation)                                          G.711
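Since FRF.11 provides no way for one V-FRAD to discover what the far end is using, both ends of a voice DLCI must be configured by hand to match. A small sketch of such a sanity check follows; the names and structure are invented for illustration.

# A tiny sketch of checking that two manually configured V-FRADs agree
# on the voice technique from Table 9.2 (names invented for illustration).
TABLE_9_2 = {
    "CS-ACELP": "G.729",
    "LD-CELP": "G.728",
    "MP-MLQ": "G.723.1",
    "ADPCM": "G.726/G.727",
    "PCM": "G.711",
}

def check_dlci_config(local_technique, remote_technique):
    if local_technique not in TABLE_9_2:
        raise ValueError(f"unknown technique: {local_technique}")
    if local_technique != remote_technique:
        # A mismatch produces unintelligible noise, not an error code.
        print(f"mismatch: {local_technique} vs {remote_technique}")
        return False
    print(f"both ends using {local_technique} ({TABLE_9_2[local_technique]})")
    return True

check_dlci_config("CS-ACELP", "CS-ACELP")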
For voice purposes, frame relay frames are divided into two categories: primary payloads and signaled payloads (FRF.11 consistently spells it "signalled"). Primary payloads can be the voice itself, fax information, voice band modem data (a voice subchannel used to carry digital information from a modem), or digital data. Special procedures allow for frames that would have fax or modem data inside a subchannel to be recognized as such at the sender. The sending V-FRAD can then take the appropriate steps so that the receiving V-FRAD does not try to interpret the arriving frame content as a digitized voice signal. This process is called the demodulation of fax and modem data. The digital data inside the primary payload frames is not regular data traffic hitching a ride on voice frames. These are usually normal, but digital, voice signaling messages that need to be conveyed across the frame relay network, such as from one PBX to another. But since the frame relay network never looks at the digital data inside the frame anyway, almost any type of digital information could be carried as long as the formatting rules are respected.

Signaled payload frames contain things like dialed digits, whether tone or pulse telephone numbers or digits pressed during a call, signaling bits (ABCD, for example), alarms or fault indications, digital signaling messages, fax data (fax data can also be sent inside primary payload frames), or a Silence Information Descriptor (SID) subframe. The SID subframe inside the signaled payload frame is how voice over frame relay handles silence suppression. The SID subframes are used to indicate the end of a talk-spurt or voice burst, and allow receivers to start to generate comfort noise. They can be generated periodically to adjust the comfort noise. The SID subframe is also used to signal Voice Activity Detection (VAD) to tell the receiver that the formerly silent speaker has started talking again. If VAD is not supported, then SID is not used. Just what goes in what is shown in Figure 9.7.
Figure 9.7 Primary payload and signaled payload contents.

All of the content originates and terminates with voice over frame relay service users, although the user is actually a device. The primary and signaled payloads form the voice over frame relay service itself, and these payloads are carried over the frame relay service just like any other frames.
Mixing Voice and Data Any UNI on a frame relay network can support multiple DLCIs representing PVCs. Each DLCI can potentially support voice over frame relay service. A single DLCI can carry more than one voice channel or subchannel. Other DLCIs can carry normal bursty data traffic across the frame relay network at the same time. How the voice over frame relay protocol stack shares the UNI with data connections is shown in Figure 9.8. The DLCIs shown are used for example only.
Figure 9.8 Multiplexing voice and data over frame relay.

Recall that the data content of a voice over frame relay subchannel can be modem data, but FRF.11 also allows subchannels to carry the digital data usually carried inside the normal frame relay data frames.
Subframes and Subchannels Frame relay frames containing voice over frame relay payloads do not have to carry all subchannels inside a single frame relay frame. This is because a frame relay frame actually carries one or more subframes when a given DLCI is used for voice over frame relay. Subframes carry a subframe header in front of the subchannel information. So a voice over frame relay subchannel plus a subframe header equals a subframe. Since the subframe header identifies the subchannel (perversely called the Channel Identification [CID] field) carried in the subframe, there is no need to statically map the subframes inside a frame relay frame. Even better, the subchannel information can be of variable length, the better to minimize the delay filling subchannels with voice samples (shorter subframes) or maximize bandwidth efficiency (longer subframes). The format of the subframe header is shown in Figure 9.9. There are four types, based on whether the extension indication (EI) bit and length indication (LI) bits are set to a 1 bit. If the EI bit is set to a 0 bit, then the channel identification (CID) field is six bits long and can support 64 subchannels, from 0 to 63. If the EI bit is set to a 1 bit, then the CID field is 8 bits long and can support 256 subchannels, from 0 to 255. Subchannels (CIDs) 0 through 3 are reserved and cannot be used for user information. Note that the extra two bits appear in the second octet. Even if only 64 subchannels are supported, the EI bit must be set to a 1 bit if the payload type field is to be included. Otherwise, the absence of the payload type field implies that the payload type field is all 0s (meaning this is a primary payload containing voice, modem data, or fax).
Figure 9.9 The voice over frame relay subframe format.

So if the EI bit is set to a 1 bit and the LI bit is not, the payload type field is present in the second octet. A variable number of payload octets follow. If the EI bit is set to a 0 bit and the LI bit is set to a 1 bit, the payload length is present in the second octet (CID numbering is thus limited to 63 or less, and a primary payload type is implied). A variable number of payload octets follow, specified by the value in the payload length field. If both the EI and LI bits are set to a 1 bit, the payload type field is present in the second octet and the payload length field is present in the third octet. A variable number of payload octets follow, specified by the value in the payload length field. Finally, if neither the EI nor the LI bit is set to a 1 bit, then the variable number of payload octets immediately follows the first octet. When the LI bit is not set, the payload extends to the end of the frame relay frame.

When the length indication (LI) bit is set to a 1 bit, then the payload length field is present. Otherwise, the payload length field is absent. The LI bit is used in two ways. First, when the frame relay frame contains more than one subframe, the LI bit is set to a 1 bit in all but the last subframe. The LI bit is set to a 0 bit in the last subframe in the frame relay frame to indicate the end of a series of subframes inside a frame relay frame. The length of this last payload is just the rest of the information field of the frame relay frame. Second, when the frame relay frame contains only one voice over frame relay subframe, then the LI bit must be set to a 0 bit. The length of the subframe payload then becomes equal to the rest of the information field of the frame relay frame.

The coding for the payload type field is shown in Table 9.3. Note that there is no separate coding for alarms or fault-signaled payloads. These are carried in payload type 2, used for all forms of signaling bits.

Table 9.3 Payload Type Field Codings

BITS   DECIMAL VALUE   CONTENT OF SUBFRAME
0000   0               Primary payload (voice, data, fax)
0001   1               Dialed digits
0010   2               Signaling bits
0011   3               Fax relay
0100   4               Silence information descriptor (SID)
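To make the EI/LI rules concrete, here is a minimal Python sketch of subframe header construction. It follows the rules just described, but the exact bit positions within the octets are simplifying assumptions for illustration; the authoritative layout is in FRF.11 itself.

# A minimal sketch (not production FRF.11 code) of building a subframe:
# header octets per the EI/LI rules above, followed by the payload.
def encode_subframe(cid, payload, payload_type=0, last_in_frame=True):
    ei = 1 if (cid > 63 or payload_type != 0) else 0   # extension octet needed?
    li = 0 if last_in_frame else 1                     # length present unless last
    header = bytearray()
    # First octet: EI bit, LI bit, then the low six bits of the CID.
    header.append((ei << 7) | (li << 6) | (cid & 0x3F))
    if ei:
        # Extension octet: two high CID bits plus the 4-bit payload type.
        header.append((((cid >> 6) & 0x03) << 6) | (payload_type & 0x0F))
    if li:
        # Length octet so the receiver can locate the next subframe.
        header.append(len(payload) & 0xFF)
    return bytes(header) + payload

# Two subchannels multiplexed into one frame relay frame on one DLCI;
# only the last subframe omits the length and runs to the end of the frame.
frame_payload = (
    encode_subframe(cid=5, payload=b"\x01\x02", last_in_frame=False) +
    encode_subframe(cid=6, payload=b"\x03\x04", last_in_frame=True)
)
print(frame_payload.hex())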
Some examples of voice over frame relay subframe use are shown in Figure 9.10. This subframe header structure is somewhat awkward, but it works. The actual structure of the subframe (and subchannel) payload itself, as determined by the payload type field, is detailed in a series of appendices to FRF.11. Those interested in coding details are referred to this document.
Figure 9.10 Examples of subframe use.
The Odd Case of Facsimile Services (FAX)

Facsimile services (fax) occupy a sort of shadowland between data and voice, and digital and analog, services. A fax machine digitizes an image on paper, usually by a simple coding scheme like 0 = black and 1 = white. The resulting string of 0s and 1s is then compressed using a simple algorithm like Run-Length Encoding (RLE) to remove long strings of 0s and 1s. The final result is sent over the PSTN, usually on a regular dialup analog voice circuit, using a 9.6 or 14.4 kbps modem built into the fax machine. This is pretty much what the most common fax machines, Group III fax machines, do.

Modems are also used to send 0s and 1s from computer to computer. So what is the difference with fax traffic? As it turns out, fax calls look much more like voice calls than data calls to the PSTN. Dialup data calls, especially to the Internet, have longer and longer holding times as the Web becomes more of a necessity in daily life than a luxury. Lately, long data calls and Web sessions have caused numerous problems for PSTN voice switches. But fax calls are quite short, even though 0s and 1s are sent. A two- or three-page fax can be sent in less time than a typical human conversation takes. Even huge faxes are still shorter in terms of holding time than most Web sessions. And fax calls are usually much more tolerant of jitter than human beings. A considerable number of telephone calls are made for fax purposes. Some 40 percent of all telephone calls to Europe from the United States, for instance, are fax calls.

Some have argued that since e-mail can now append images to text, and fax software is routinely bundled with PC modems and new PCs, the days of the fax machine are numbered. But this is not necessarily so. In many cases, a signed fax carries more weight in a court of law than an e-mail message. In parts of the world where logogram (also called pictogram) languages like Chinese are common, faxing is much more common than e-mail. Faxes can easily handle handwritten (or hand-scrawled) input, no matter how complex or mixed with text and diagrams. A fax is by definition an exact copy of what was sent, right down to the typos. Faxes are tangible, meaning that there is no risk that the computer might crash and take the only copy of the document before it can be printed out. Finally, there are millions of fax machines already out there that will not disappear overnight.

What has all of this to do with voice over frame relay? Just this: Even if frame relay is deemed too poor in terms of voice quality for human conversation, frame relay might be just right for fax traffic. If 40 percent of a company's calls to Europe are faxes, putting this traffic on the frame relay network could save an enormous amount of money. All that is necessary is to either create a DSP board for faxing, or even take an analog signal from a Group III fax and subject it to the same DSP as ordinary sound. Of these two approaches, the special DSP fax board approach is preferred, since the number of analog-to-digital conversions (and the inevitable distortions) will be fewer.
Frame Relay Voice Considerations

Why would anyone go through all the trouble of putting voice on a frame relay network? First and foremost, the potential cost savings are substantial. Doing voice over the frame relay network as opposed to using private tie-lines between corporate PBXs is very attractive. Tie-lines are dedicated trunk circuits used to carry voice calls between an organization's PBXs. The 64 kbps DS-0 private lines used for tie-lines are priced by the mile: The longer the tie-line, the more the DS-0 costs per month. But most frame relay DLCIs are paid for on a flat monthly basis, although tiered pricing is not uncommon. Since the DLCI is, for the most part, distance-insensitive, it makes no difference if the frame relay tie-line is 100 miles long or 1000. And the voice inside the frame relay frames shares the same UNI as the frames carrying data. So no separate link is usually needed. The voice just rides the frame relay UNIs and across the frame relay network(s) in between.

This is why silence-suppressed and compressed voice are so important to voice over frame relay considerations. It makes no sense to put a PBX tie-line across a frame relay UNI if the voice can still consume 64 kbps of bandwidth. Since most UNIs are running at 64 kbps, once someone started talking, no data could ever be sent across the network. But if the voice occupies only 8 kbps or so of the UNI's bandwidth, and only when someone is actually speaking, then a frame relay PBX tie-line makes a lot of sense. Savings of anywhere from 30 percent to 50 percent of tie-line voice costs are frequently mentioned, but results vary widely. So if the PBX tie-line network for a major corporation offers voice internally at 5 cents per minute (a fairly representative cost), then the frame relay PBX tie-line could do the same job and cost between 2 and 2.5 cents per minute. Of course, more than one tie-line is normally employed between PBXs. But usually these are configured for peak use periods and often sit idle for long periods of time. Even if 4 or 5 voice calls are in progress on the UNI, the 8 kbps for each is consumed in bursts, since all of the people might not be speaking at the same time.

All this might sound too good to be true. Why not just terminate service on all the PBX tie-lines, hook the PBX up to a box that contains the DSP cards needed to generate 8 kbps ACELP voice, and funnel this into the FRAD over a DLCI to the remote location? Better yet, why not build this DSP support into a special FRAD, a voice FRAD or V-FRAD? The V-FRAD would still have data ports for Ethernet LANs, of course, but would also have a port for a 50-pin harmonica connector to the corporate PBX. Now, this type of V-FRAD would typically have a full 1.5 Mbps UNI, since the PBX cable can carry 24 voice conversations. Still, at 8 kbps peak, these 24 conversations, even if all active at once, only consume 192 kbps of the 1.5 Mbps. This leaves plenty of bandwidth for data. This PBX-to-PBX arrangement is shown in Figure 9.4.
Figure 9.4 PBX-to-PBX using V-FRADs.

But before everyone rushes out and buys V-FRADs for voice over frame relay, there are at least six concerns that must be addressed by every organization that contemplates using a frame relay network service for voice. Following is a list of these concerns in no particular order. Each issue is serious enough to deserve a short section of its own for further discussion.
1. What about the fact that 8 kbps voice just does not sound as good as 64 kbps PCM voice?
2. What about the fact that frame relay networks will simply discard any frames received by a frame relay switch that contain bit errors?
3. What about the fact that the frame relay network will discard frames sent on a DLCI above the CIR if the network is congested?
4. What about the fact that frame relay network delays are higher and much more variable than PSTN delays?
5. What about the fact that many frame relay service providers will not support frame relay voice?
6. What about voice over IP?
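Before taking up these concerns one by one, a quick Python sketch makes the tie-line arithmetic from the PBX example above concrete. The bandwidth numbers come from the text; the 5 cents per minute figure is the representative internal cost mentioned earlier.

# The PBX tie-line arithmetic sketched: 24 calls at an 8 kbps peak on a
# full 1.5 Mbps (T1) UNI, plus the often-quoted 30 to 50 percent savings.
UNI_KBPS = 1544        # full T1 UNI
CALLS = 24             # one PBX harmonica cable's worth of trunks
KBPS_PER_CALL = 8      # compressed ACELP voice, peak

voice_kbps = CALLS * KBPS_PER_CALL
print(f"voice peak: {voice_kbps} kbps, "
      f"leaving {UNI_KBPS - voice_kbps} kbps for data")   # 192 and 1352 kbps

for savings in (0.3, 0.5):
    # The text quotes frame relay tie-line costs of 2 to 2.5 cents/minute.
    print(f"{savings:.0%} savings: {5 * (1 - savings):.1f} cents/minute")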
8 kbps Voice Is Not as Good as 64 kbps Voice

Once upon a time, the telephone companies all preached that enormous sums of money had to be spent to make sure that voice quality was always the best possible at all times under all conditions. If customers are far away, reach them. If thunderstorms produce static, minimize it. If the line hums, fix it. Someone with a truck came out to fix my phone if the voice was muffled, and so on. (Why didn't the electric company send someone out in a truck to adjust my toaster if my toast was too dark?) Spending these enormous sums required enormous revenues to allow the telephone companies to eke out profits that were the envy of every other industry in the world.

But as soon as deregulation came along and telephone companies were forced to become as cost-conscious as other industries, a funny thing happened. It turned out that voice quality was not as important as it once seemed to be. No trucks rolled to fix premises wiring and customer-owned telephones, and people still talked. Now, a lot of this had to do with the fact that the PSTN had years of regulated revenues to improve facilities. But cellular networks were built after 1984 and they sounded pretty lousy compared to landline service. After all, even long distance rated an MOS of 4 compared to a local cellular MOS of 3 or so. (The MOS, or Mean Opinion Score, measure of voice quality is discussed later in this chapter.) Cellular cost many times more than regular landline service. But it sold like hotcakes.

Oddly, it is probably the acceptance of cellular telephone service quality that has made voice over frame relay sound so good. And once voice over frame relay sounds good enough, the cost saving acts like a magnet to cost-conscious organizations. So voice quality over frame relay does not seem to be an issue at all.
Frame Relay Switches Discard Frames Received with Bit Errors

If a frame relay frame is received at a frame relay switch port and the frame CRC check fails, the frame is simply discarded. No error message is sent to the receiver or sender. Data applications must perform their own recovery techniques and usually deal with missing frames in a sequence by just re-sending the missing data units or packets. Voice applications do not have an option to resend voice samples, of course, and missing voice samples result in silence or dead air at the receiver. So discarded voice samples must be kept to a minimum in order for users to carry on an intelligible conversation.

Frames sent over a frame relay network containing voice are crammed with voice samples. Each sample represents at least 1/8000th of a second of voice. A 1500-octet frame relay frame (the most common size ordinarily seen) will contain at least 0.2 seconds of voice and probably more. Usually, no one talking on the telephone would notice that 1/8000th of a second of a conversation is missing (a golf ball is in contact with the driver for only about 4/8000th, or 1/2000th, of a second). But many people will notice 0.2 of a second of silence (the eye blinks in about 0.1 of a second, and people notice that all the time). If a noticeable period of silence happens once during a 5-minute conversation, users will shrug and perform their own error recovery ("What did you just say?"). But if it happens frequently enough, more and more time is consumed with repeated information, and if the missing voice sample interval decreases, even these resends must be repeated. So this is a legitimate concern.
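The arithmetic behind these numbers is easy to check. A short sketch, assuming one octet per 64 kbps PCM sample:

# How much speech does one discarded frame destroy? At 64 kbps PCM each
# octet is one 1/8000-second sample; compressed voice packs even more
# speech into every octet, so a lost frame hurts more, not less.
def speech_lost_seconds(frame_octets, codec_kbps=64):
    octets_per_second = codec_kbps * 1000 / 8
    return frame_octets / octets_per_second

print(speech_lost_seconds(1500))                 # ~0.19 s of PCM voice
print(speech_lost_seconds(1500, codec_kbps=8))   # ~1.5 s if voice is 8 kbps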
What saves voice over frame relay is that more and more UNIs and NNIs, and trunks between frame relay switches, are provisioned on fiber, usually SONET fiber rings. These fiber rings have extremely low bit error rates, and discarding voice samples between frame relay switches is unlikely to be a problem. An area of more concern is the tail end of the UNI, which is typically still some kind of copper link, since few users have direct fiber interfaces on their premises equipment, and FRADs are no exception. But if the copper ends of the UNI are kept short enough, even this need not be a major concern. Many commercial buildings already have fiber in the equipment room in the basement, and many have fiber in the riser as well. As fiber makes its way slowly but surely closer and closer to the premises equipment, discards due to errors should become even more uncommon.
Frame Relay Discards Frames above the CIR If the Network Is Congested

The effects on voice of frame relay frames with the DE bit set to a 1 bit and discarded under congested conditions are the same as the effects of frame discards due to errors. The cause is different, but the effect is the same: noticeable periods of silence and dead air during the call. The solution here is to make sure that the frames carrying voice sent on a DLCI (1) have a CIR, and (2) respect the CIR at all times. Recall that if a DLCI is allowed and configured to have a CIR of zero, each and every frame sent into the network on the DLCI should have its DE bit set to a 1 bit. (Oddly, there are cases where zero CIR DLCI frames have been observed to enter a frame relay network with the DE bit set to a 0 bit, but this appears to be a zero CIR implementation issue.) It is the frames with the DE bit set to a 1 bit that are the first to go when congestion occurs on the frame relay network.

Most frame relay voice DLCIs should have a CIR set to the peak bit rate of the compression method, usually 8 kbps. This is a nice round number from the frame relay network perspective and all major frame relay service providers support 8 kbps CIRs. Some voice techniques digitize at 13 kbps, for example, and these DLCIs would require a CIR of 16 kbps, if the service provider supports 16 kbps CIRs (most do). So an 8 kbps CIR on a voice DLCI is more or less a requirement. This CIR must be respected at all times, and should be if the voice technique is truly 8 kbps. In other words, the traffic seen on the voice DLCI should never burst above the CIR. This would result in some of the voice frames being tagged discard eligible and defeat the whole purpose.

Since silence suppression is a feature of frame relay voice, it might be tempting to try to lower the CIR on the voice DLCI even further. After all, the 8 kbps is needed only when people are actually talking. Most conversations are 50 percent silence. This temptation should be resisted. First of all, few frame relay service providers, if any, support a CIR of less than 8 kbps. And the risk of voice frame discards is just too great. Accept the 8 kbps savings as good enough.
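The CIR selection rule just described amounts to rounding the codec's peak rate up to the next CIR the provider actually supports. A minimal sketch, assuming a provider that sells CIRs in 8 kbps steps:

# Pick a voice DLCI CIR: round the codec's peak rate UP so voice frames
# are never sent above the CIR (and never tagged discard eligible).
import math

SUPPORTED_CIR_STEP_KBPS = 8   # assumption: provider sells 8, 16, 24 ... kbps

def voice_cir_kbps(codec_peak_kbps):
    steps = math.ceil(codec_peak_kbps / SUPPORTED_CIR_STEP_KBPS)
    return steps * SUPPORTED_CIR_STEP_KBPS

print(voice_cir_kbps(8))    # 8 kbps ACELP  -> 8 kbps CIR
print(voice_cir_kbps(13))   # 13 kbps codec -> 16 kbps CIR, as in the text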
Frame Relay Delays Are Higher and More Variable

The delays through the PSTN, and the tie-lines derived from the same facilities, are very low and very stable, at least for landline circuits. This low delay is the result of two factors. First, the speed of bits in a wire, copper or fiber, is very high. Most signals representing bits propagate at about two-thirds of the speed of light in a vacuum, or about 200,000 kilometers per second (about 120,000 miles per second). So a fiber link 1200 miles long has a delay of about 10 milliseconds from the time the first bit of a voice sample is sent to the time it arrives at the end of the link. Second, the ITU-T has established the maximum time for bits to enter and leave a switching node as 450 microseconds, or about half a millisecond. Central office switches and digital cross-connects are examples of switching nodes. So bits from a voice sample that flow through 15 switching nodes are delayed only about 7.5 milliseconds as they make their way through the network. So if a voice circuit passes through 15 switching nodes separated by 1200 cable miles, then the total end-to-end delay would be about 17.5 milliseconds. In actual practice, Bellcore considers the United States to be about 20 to 30 milliseconds wide for coast-to-coast circuits; of course, most circuits are much shorter and therefore faster in terms of delay. The United States is quite large in telecommunications terms anyway. Only Russia, China, and Brazil are in the same category.
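The delay arithmetic is simple enough to sketch; the propagation speed and per-node budget below are the figures from the text.

# Back-of-the-envelope landline delay: propagation at roughly 120,000
# miles/second plus about half a millisecond per switching node.
PROPAGATION_MILES_PER_MS = 120.0   # ~2/3 the speed of light
NODE_DELAY_MS = 0.5                # ITU-T switching-node budget (~450 us)

def end_to_end_delay_ms(cable_miles, switching_nodes):
    return (cable_miles / PROPAGATION_MILES_PER_MS
            + switching_nodes * NODE_DELAY_MS)

print(end_to_end_delay_ms(1200, 15))   # ~17.5 ms, matching the text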
The delay on these landline voice circuits is not only low, but also very stable. This is because voice is distorted if voice samples arrive closer and closer together (clumping) or further and further apart (dispersion). Clumping raises the voice to Minnie Mouse levels and dispersion lowers it. For this reason, all voice switching nodes are engineered to have very stringent limits on jitter, or delay variations. Voice samples do not queue up in a buffer at a central office switch trunk interface, for instance. Varying buffer lengths are the leading cause of delay variations in data networks such as frame relay.

Voice circuits on geosynchronous satellites commonly used for international circuits are an exception. The uplink and downlink propagation delay, even at the speed of light, is considerable. The end-to-end satellite delay is 250 milliseconds, or a full quarter of a second. This is still useable for voice, but barely. The delay coincides with the psychological timeout in people's brains that is interpreted as, "They stopped speaking; it's my turn." But if the speakers hear nothing from the other end after this quarter-second pause (and they won't, due to the propagation delay), they assume the other person has nothing to say, so they begin speaking again. Conversations on satellite voice circuits have many instances of simultaneous speech beginnings, apologies, and unvoiced thoughts about how rude those people in other countries are. It should be noted that newer satellite systems used for voice, the Low Earth Orbit Systems (LEOS), have delays much more like landlines than geosynchronous satellites. This is because of the much shorter uplink and downlink distances, of course.

How do frame relay delays stack up against landline and geosynchronous satellite delays? Well, they are about in the middle between landline and geosynchronous satellite voice circuits. Most frame relay networks will have delays between 120 and 200 milliseconds, even on international frame relay circuits. This is because few frame relay links are provisioned on satellite, most using undersea fibers instead. The undersea links have much lower delays than geosynchronous satellites, naturally. So the delay for frame relay voice is not too bad. The real problem is jitter. The delays vary between 120 and 200 milliseconds constantly as bursty traffic entering and leaving the frame relay network causes the output buffer queues to shrink and grow. Frame relay was designed and engineered for bursty, delay-tolerant data, not voice. The characteristic delay for landline voice in the United States, frame relay, and geosynchronous satellite circuits is shown in Table 9.1.

Table 9.1 Frame Relay Delays and Jitter

VOICE TRANSPORT            CHARACTERISTIC DELAY      JITTER CONCERNS
PSTN landline              20–30 milliseconds        Minimal
Frame relay                120–200 milliseconds      Major
Geosynchronous satellite   250 milliseconds          Minimal
There are two ways around the jitter issues with voice over frame relay. One is used at the sender; the other is employed at the receiver. The first method is to give the DLCI carrying voice priority over all other traffic entering the network from the V-FRAD at the UNI. This ensures that the voice samples are not delayed any more than necessary at the V-FRAD, as the voice samples might be if they had to queue up behind four large IP packets inside of Ethernet frames. The second method is to employ a jitter buffer at the receiver. A jitter buffer is used to hold the voice samples arriving from the frame relay network and adjust their playout so that they appear to have arrived with the same delay. Some frames arrive quickly when bursts on the network are at a minimum. These samples are held in the jitter buffer longer. Some samples arrive more slowly as traffic builds. These samples are held for a shorter period. The concept of a jitter buffer is shown in Figure 9.5.
Figure 9.5 Jitter buffers.

The amount of delay the jitter buffer makes all voice samples appear to arrive with is a key parameter in making the whole scheme work. If the playout delay is set too low (say at 150 milliseconds), then late-arriving samples will have to be discarded anyway. If the playout delay is set too high (say at 300 milliseconds), the jitter buffer adds needless delay to an already admittedly slow network. When jitter buffers are used, there is always a tradeoff between missing voice samples and adding more delay.

Jitter in most packet-switching networks is also the result of serial delay. Serial delay occurs when a delay-sensitive frame relay frame is delayed at the V-FRAD by a data frame. There are several things that can be done in V-FRAD products to minimize the effect of serial delay. The V-FRAD should have separate queues and buffer areas for voice, data, and fax. The DE bit could be set to a 1 bit on data frames, even when sent under the CIR, in order to make sure there are enough frames for the network to discard when congestion occurs without affecting the voice. Smaller data frames also minimize the effects of the serial delay.
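The playout tradeoff can be sketched in a few lines. The following toy model, with an assumed 220 millisecond playout delay, holds early arrivals in the buffer and discards late ones.

# A toy jitter buffer: every voice frame is played at send_time plus a
# fixed playout delay. Early arrivals wait; arrivals after the playout
# point are discarded, illustrating the tradeoff described above.
PLAYOUT_DELAY_MS = 220   # assumption: a little above the worst expected delay

def playout(arrivals):
    """arrivals: list of (send_time_ms, arrival_time_ms) per voice frame."""
    for send_time, arrival_time in arrivals:
        playout_time = send_time + PLAYOUT_DELAY_MS
        if arrival_time <= playout_time:
            hold = playout_time - arrival_time   # time spent in the buffer
            print(f"play at {playout_time} ms (held {hold} ms)")
        else:
            print(f"late by {arrival_time - playout_time} ms: discarded")

# Network delay varying between 120 and 200 ms, plus one 240 ms straggler:
playout([(0, 120), (20, 220), (40, 200), (60, 300)])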
Many Service Providers Will Not Support Frame Relay Voice

Frame relay switches inside a frame relay network never have a need to look inside a frame relay frame. This is the essence of frame relaying as opposed to packet switching, which must look inside frames to find the packets. Yet many frame relay service providers insist in their frame relay service proposals that they will not support voice over their frame relay network. But how can they possibly stop anyone from doing it? How can they possibly know what's inside the frame anyway? They can't and they don't. The V-FRADs are customer premises equipment anyway, firmly beyond the control of the service provider. So what does the lack of frame relay voice support mean?

This is what it boils down to. When a frame relay service provider says that it will not support voice over its frame relay network, it usually means at least two things. First, when the service provider is given the list of locations to be linked by PVCs and assigned DLCIs, if the service provider finds out that any of the DLCIs are to be used to carry voice, the service provider will not configure them. Second, if the DLCI is configured (why tell the service provider what the DLCI will be used for?) and the DLCI is used for voice, the frame relay service provider will not take a trouble report on that DLCI. Neither of these statements should be taken as absolutes; both need a few more words of explanation anyway.

Frame relay service providers that do not support voice over frame relay can refuse to configure the DLCIs. This hardly seems like a way to win new customers and retain old ones. So, in actual practice this is interpreted to mean that the service provider will partner with whomever is providing the V-FRAD equipment to the customer. Technically, the service provider is not configuring the voice DLCIs directly for the customer. But since the service provider is configuring the DLCIs for the V-FRAD vendor as a third party in the network, the result is pretty much the same. This third-party arrangement possibility is sometimes spelled out in the frame relay network proposal, but sometimes it is not. It cannot hurt to ask if indirect voice support is possible through the third-party arrangement. The service provider might have had a bad experience with a particular V-FRAD vendor and nothing but good experiences with other V-FRAD vendors, so this partnering relationship between some V-FRAD vendors and the service provider is by no means a given.
The second restriction, not taking trouble reports on DLCIs used for voice, is related to the first. If the voice quality deteriorates on the frame relay network, who is to blame? The V-FRAD vendor or the service provider? If the frame relay service provider does not support voice over frame relay, there is no sense in even asking the service provider to try to determine the cause of the problem. But if there is a third-party relationship with the V-FRAD vendor, the service provider will tell the customer to call the V-FRAD vendor to report the problem. Of course, if the V-FRAD vendor determines that the problem is within the frame relay network itself, the service provider will take a trouble report on the DLCI from the V-FRAD vendor, again depending on the relationship between the V-FRAD vendor and the service provider. Note that this procedure presumes that troubles with frame relay voice are caused by the V-FRAD equipment. But if the problem is caused by some component of the frame relay network, why would only the voice DLCIs on a UNI be affected? And, if all DLCIs on a UNI are in fact affected, then just report a trouble on the data DLCIs and be done with it. When one is fixed, so should be the others (there can be many more complexities to a UNI problem that have not been discussed here). So lack of frame relay voice support is not as drastic an issue as it first appears. It is really a way for the frame relay service providers to more or less force the V-FRAD vendors to be part of the overall solution.
Voice over Internet Protocol

It might seem strange to discuss voice over IP in a chapter on voice over frame relay, but the issue must be addressed squarely. The fact of the matter is that many more people are considering putting voice traffic on the Internet inside IP packets than packaging voice inside frame relay frames. As widespread as frame relay is, frame relay cannot begin to compare with the universal availability of Internet access. The number of Internet service providers dwarfs the number of frame relay service providers, and frame relay service is usually more costly than Internet access.

Proponents of voice over IP point out that IP voice can be implemented on any desktop device with an empty slot and TCP/IP software, which is essentially the vast majority of desktop devices today. And frame relay is often used to provide Internet access anyway. Why bother with expensive V-FRADs at all? Simply use the same DSP chips to digitize the voice at the desktop and ship the voice over the Internet in IP packets the same as everything else. There are even IP to/from PSTN gateways available so that a person at his or her desk can call home over the Internet.

Proponents of voice over frame relay point out that the voice quality available over the public Internet is quite poor, much poorer than a V-FRAD can provide. And one V-FRAD can do the work of many DSP PC cards, since not everyone will be talking at the same time. Even if voice inside IP packets is the chosen method, IP packets can ride inside the frame relay frames just as comfortably as anything else. Even Internet service providers that encourage IP voice recommend that the voice be provisioned on their managed (meaning "having adequate resources to ensure some level of service quality") Internet service, which is essentially a series of dedicated links between non-Internet routers carrying a limited number of customers' traffic. Such a managed Internet network looks pretty much like a public frame relay network to the frame relay service providers, only with routers in place of the frame relay switches.

Which side is correct? They both are. Voice over IP makes more sense in some cases, especially when there is no frame relay at all in the picture. And voice over frame relay makes more sense in other cases, especially when there is already a cost-justified frame relay network between the sites that need to be linked. For links to the PSTN, the same gateway approach can be used.
The Digitization of Voice

Digital voice is not an idea that arose with digital computers. The basic ideas were well understood even before World War II, but the war intervened before the idea could be explored more fully. The whole stimulus for developing digital voice was the simple observation that digital signals are much less susceptible to noise than analog signals. The reasons are not all that important. What is important is that engineers fed up with the static, squeaks, and pops of analog circuits quickly realized that digitizing the analog voice immediately brought an increase in perceived voice quality on the line. This increased voice quality carried a hefty price tag in the days before microprocessors made computing power inexpensive. So it was not until the late 1960s that digitization of the PSTN could begin in earnest. But it was the ties that the PSTN had to early computer processors that more or less made 64 kbps digitized voice inevitable.

Voice is an analog signal. Analog signals rise and fall between some maximum and minimum value, but any value in between is allowed. Computers use digital signals that allow only a small number of valid values, such as just a 0 or a 1 (combinations like 00, 10, 01, and 11 are also digital signals). This is the secret to digitized voice's greater immunity to noise. Noise is only another allowed analog value. All analog equipment must pass the noise along with the voice. But if the noise is kept within the digital signal's allowed values, then the noise can be eliminated at the receiver on a digital link. This is true whether the digits represent voice or data. These aspects of analog and digital signals are shown in Figure 9.1.
Figure 9.1 Analog signals and digital signals.

The human vocal tract—the throat and tongue and lips—makes analog sounds in the form of acoustical pressure waves. These acoustical waves are made into analog electrical waves by the transducer in the telephone, which exactly mirrors the rise and fall of the acoustical pressure wave. The telephone receiver turns these analog electrical waves back into acoustical pressure waves at the listener's end. So the digitization of voice involves taking these electrical waves and translating them into a string of 0s and 1s. The analog signal is held to a given value for an instant of time while the value of the signal is sampled. This sample-and-hold technique is then used to convert the value of the analog signal at that point in time to a string of 0s and 1s. This process is repeated over and over again without stopping during the entire duration of the voice conversation, even when one party on the call is listening.
But how many 0s and 1s are needed to do the job? If more than enough bits are used, then fewer voice conversations are carried on the same facilities and revenues will suffer. If not enough bits are used, then the voice will lose the distinctive qualities needed to recognize different people and eventually will become unintelligible at very low bit rates.

An important mathematical formula established what is known as the Nyquist sampling rate. This states that in order to have enough information to completely and reliably reconstruct the analog signal from a series of discrete samples, the samples must be taken at twice the rate of the highest frequency component in the analog signal (otherwise a phenomenon known as aliasing could occur, destroying all of the noise benefits of digitization). For analog voice, the highest frequency component was established at around 3300 cycles per second (Hertz, abbreviated Hz). The low analog frequency component for the PSTN, 300 Hz, did not enter into the formula. This range of frequencies was chosen even though the human voice can generate sounds from about 100 Hz to about 10,000 Hz (10 kHz), since studies had shown that about 80 percent of the power of the human voice is in the 3000-Hz (3-kHz) range between 300 and 3300 Hz. So expanding the bandwidth (the range of frequencies used for voice networks) above and below this 3-kHz range had little benefit in terms of perceived voice quality. For practical digitization purposes, the voice passband was set at 4000 Hz. So the Nyquist sampling rate for the 4 kHz voice passband had to be 8000 times per second. In other words, 8000 times per second, the electrical analog wave representing the acoustical pressure wave was held and sampled. Sampling more often is called oversampling and is not used for voice, but is used for other types of analog audio signals, such as music. Sampling below the Nyquist rate results in aliasing; this is sometimes done intentionally in some applications.

But where does the 64 kbps come from? If each sample-and-hold at the Nyquist rate generated one bit, then only 8 kbps would be needed to represent voice. But more than one bit per sample is needed, because of the odd distribution of the range in amplitude of the electrical signals mirroring the pressure waves generated by the human voice.
64 kbps Pulse Code Modulation Voice

There are about 6000 documented human languages (strangely, almost half are found in New Guinea and the immediate vicinity). But the human voice can only make a limited range of sounds, no matter how the words in a given language are actually formed. Some of the sounds are called unvoiced sounds and are formed by the lips and tongue. Most consonants, such as "t," "p," "d," and so on, are unvoiced sounds. Other sounds are called voiced sounds and are formed in the throat by the vocal cords. Most vowel sounds, such as "a," "e," "i," and so on, are voiced sounds. Syllables consist of the unvoiced and voiced sounds in various combinations. Syllables are in turn combined to make words in whatever language the speaker is using.

It is the voiced sounds (vowels) that carry most of the power of the human voice. But it is the unvoiced sounds (consonants) that allow listeners to distinguish similar sounding words (was that "tall" or "fall"?). Unfortunately, voiced sounds are about 20 times more powerful than unvoiced sounds. A simple word like "salt," for example, generates about 20 times more electrical energy for the "a" sound than for the initial "s" sound. One bit can represent two values, two bits four values, and so on. Even at the Nyquist rate, one bit per sample cannot adequately represent signals that range from 1 to 20. So more bits were needed.

Early implementations of voice digitization used 8-bit computer processors to accomplish the analog-to-digital conversion. So it only made sense to generate 8 bits per Nyquist sample. Since 8 bits were generated 8000 times per second, digital voice was accomplished at 64 kbps. Even then, the 20 to 1 ratio of voiced to unvoiced sounds posed challenges. Without any modifications, 9 bits would be needed. The answer was to stretch the range covered by the 8 bits using a process known as companding (compressing the voiced ranges and expanding the unvoiced ranges). The net result was known as Pulse Code Modulation (PCM) voice running at 64 kbps. The idea behind companding is shown in Figure 9.2. Two main types of companding exist, one used in North America (µ-law) and the other used almost everywhere else (A-law). Both generate 64 kbps PCM voice, but are not directly compatible with each other, of course.
Figure 9.2 The companding concept.

The Nyquist sampling rate coupled with 8-bit PCM companding will always result in 64 kbps voice. But 64 kbps PCM voice is not the only way to represent analog voice. The processing power of modern microprocessors harnessed in DSP chipsets makes it not only possible to break the 64 kbps barrier, but also attractive.
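The compressing half of companding can be sketched with the µ-law formula (µ = 255, the North American flavor). This is only the continuous formula for illustration, not the exact segmented G.711 encoding used in real codecs.

# The mu-law companding curve: quiet, unvoiced sounds get proportionally
# more of the 8-bit code space than loud, voiced sounds do.
# (8000 samples/second x 8 bits/sample = 64 kbps, as described above.)
import math

MU = 255.0

def mu_law_compress(x):
    """x is a linear sample in [-1.0, 1.0]; returns the companded value."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

# A weak "s" sound and a voiced "a" roughly 20 times stronger:
for sample in (0.04, 0.8):
    code = round(mu_law_compress(sample) * 127)   # map into an 8-bit range
    print(f"linear {sample:>5} -> companded code {code}")
# The 20-to-1 amplitude ratio shrinks to roughly 2-to-1 in code values.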
Voice Quality below 64 kbps

Voice at bit rates below 64 kbps has been around for a while also. The trick is to preserve the voice quality while reducing the bit rate. Suppose PCM voice generated only 4 bits per sample, not 8 bits per sample. The resulting bit rate would be only 32 kbps. Voice quality would suffer somewhat (there would be fewer ranges for the companding of voiced and unvoiced sounds), but maybe the fall-off in voice quality would not be bad. Certainly the doubling of the capacity of all DS-0 digital links would be an immediate benefit to the service provider.

This issue raises the question of how voice quality is measured. The answer is surprising to those used to the digital world of "it has to be a 0 or a 1." With voice quality measurements, "sort of like a 1, but maybe a 0" is about the best one can hope for. Voice is an analog phenomenon. This has been stated several times already. Digital quality is easy to measure. Put a known string of 0s and 1s through the network. Compare what went in with what comes out. A network that matches bits with 99 percent accuracy has better quality than a network that matches bits with only 95 percent accuracy (actually, both are pretty horrible, but this is just an example). Analog quality is not so absolute. The only way to compare 32 kbps PCM voice with 64 kbps PCM voice is to change the digital signal back to analog sound and ask a number of people to listen to it, usually through a regular telephone handset. Then ask them to rate the quality on a scale of 0 to 5, where 0 is "what sound?" and 5 is "the best telephone I ever heard in my life." Various levels can be introduced to the system until it is almost like rating a new dance tune: "I liked the beat so I gave it a 4.3." Average together (compute the arithmetic mean of) the ratings from a statistically significant cross-sampling of the customer population (old and young, male and female, etc.) and this yields the Mean Opinion Score, or MOS, of the voice quality.

Analog voice quality has been measured on the MOS scale for a long time. It is well known and relatively easy to determine. In actual practice, the "5" rating is usually considered local quality, where both parties are serviced by the same central office or two central offices linked by a few trunks. The "4" rating is toll quality; most long-distance calls fall into this category. The "3" rating is usually considered cellular quality, the "2" rating is down to almost the intercom ("you want fries with that?") level, and the "1" rating is barely intelligible and requires constant repetition of phrases, as prompted by the listener. Most regulators expect to see consistent MOS ratings of 4 or better for landline services, and 3 or better for wireless telephony. Naturally, a service provider with a 4.3 will quickly claim "better" voice quality than a service provider with a 4.0, even though both essentially provide toll quality voice. So service providers quickly discovered that while 64 kbps PCM voice consistently gave a MOS of 4.5, 32 kbps PCM voice had a MOS of about 3.0. Dropping PCM voice to 16 kbps produced a MOS of about 1.5.
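Computing a MOS is nothing more than averaging, as a trivial sketch shows. The listener ratings below are made up purely for illustration.

# A Mean Opinion Score is the arithmetic mean of listener ratings.
ratings = [4, 5, 4, 3, 4, 5, 4, 4, 3, 4]   # hypothetical scores, 0 to 5

mos = sum(ratings) / len(ratings)
print(f"MOS = {mos:.1f}")   # 4.0: toll quality on the scale described above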
What has all this to do with the current rise of voice below 64 kbps? As it turns out, many techniques other than PCM have been developed over the past 15 years or so to digitize voice at less than 64 kbps. Some are very simple (sample less than Nyquist or send only changes in the PCM values), while others are quite sophisticated (take the 64 kbps stream and remove the redundancies, effectively compressing it below 64 kbps). The most promising of these compression techniques was called Linear Predictive Coding (LPC). The details of LPC are far beyond the scope of this book, but LPC involves both sender and receiver trying to predict what sound will come next, then the sender only sending the difference between the predicted sound and the actual analog sound being digitized. Very low bit rates were achieved with LPC devices even in the mid-1980s (16 kbps or even lower). But the high cost of the chipsets used and the processing power required to run the LPC algorithms made LPC a very expensive way to save voice bandwidth. But two things have happened in the 1990s to make LPC schemes attractive. First, new DSP chips have dropped in price and increased in power. Second, and even more important, the new LPC algorithms have become sophisticated enough to flatten the slope of the MOS drop-off due to using fewer bits to represent voice, as shown in Figure 9.3.
Figure 9.3 Flattening the slope of the MOS.

Only PCM and a newer LPC technique (Algebraic Code-Excited Linear Predictive, or ACELP) are labeled, but the effect is the important point, not the details. With new techniques such as ACELP implemented in new DSP chips, digitized voice at 8 kbps or even 4 kbps is possible and attractive, with little fall-off in perceived voice quality.
Silence Suppression

There is one other piece of the voice over frame relay puzzle to examine before tackling exactly how the Frame Relay Forum says voice samples should be carried on frame relay networks. This is the concept of silence suppression; it starts with the realization that in normal voice conversations, a full-duplex medium (the voice channel can carry 64 kbps of sound in both directions at the same time) is used in a half-duplex manner (one person talks while the other politely listens—most of the time). Even though a voice channel takes up 64 kbps in both directions at the same time, usually one 64 kbps link is carrying nothing but silence. The silence is packaged up as 64 kbps, naturally, but it is silence nonetheless.

Technically, of course, there is not really silence in the sense of no sound at all. The sender still generates background noises even when not speaking (a radio or television audio in the next room, perhaps), and even what is called ambient sound, the faint hums that human ears pick up all the time from all directions (tightly covering the ears even when all alone is startling because the ambient sounds disappear). Silence suppression will cut off transmission in one direction when the speaker falls silent. This could even be in both directions when a call is on hold, for instance. Why package up background noise just to take up 64 kbps of bandwidth? In packet networks, the 64 kbps can easily be used to send packets on other virtual circuits. But the silence being suppressed in silence suppression is not only the sound of the human speaker, but also the ambient background noises. This sometimes results in listeners thinking that the line has totally gone dead ("are you still there?"). The most advanced silence suppression devices will actually inject some ambient noise (perhaps even taken from the beginning of the conversation) into the receiver. Advanced silence suppression can even suppress silence between words and syllables.
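At its simplest, silence suppression is an energy threshold, as the toy sketch below shows. Real voice activity detectors are far more sophisticated, and the threshold value here is an arbitrary assumption.

# Toy silence suppression: an energy threshold decides whether a block of
# samples is speech (packetize and send) or silence (send nothing, or a
# SID subframe so the far end can generate comfort noise).
SILENCE_THRESHOLD = 0.01   # assumed mean-square energy cutoff

def is_speech(samples):
    energy = sum(s * s for s in samples) / len(samples)
    return energy > SILENCE_THRESHOLD

talk_spurt = [0.3, -0.4, 0.5, -0.2]
background = [0.01, -0.02, 0.01, 0.0]
print(is_speech(talk_spurt))   # True  -> send the voice frames
print(is_speech(background))   # False -> suppress (comfort noise instead)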
Silence suppression when coupled with compression techniques has the effect of making voice look like bursty data from the network perspective. Simply place the voice samples inside a normal packet (or frame). A burst of low bit-rate packets or frames flow in one direction (speaker to listener, but it seems like client to server), then a burst of low bit-rate packets or frames flows in the other direction, just as in data applications. Is it any wonder that serious observers have suggested that voice should now be another low bit-rate application on a global network optimized for data? This is the whole idea behind voice over frame relay. Since most frame relay networks have already been cost-justified based on data applications only, putting voice over the network becomes a financially intelligent thing to do if the cost of the added equipment is reasonable for the amount of voice traffic envisioned.
Video over Frame Relay

If voice over frame relay is becoming accepted, common, and reliable, can video over frame relay be far behind? Just as television followed radio over wireless technologies, should not video follow voice onto frame relay? Yes, and yes again. There are several vendors of video over frame relay equipment already and many more can be expected to follow. The same economies are the incentives: fewer leased lines, lower costs, and simpler management of the network as a whole.

Discussing the options involved in supporting video over frame relay is more complicated than the voice options, however. This is mainly because the video marketplace is not as unified as the voice market. Although there is not any universal agreement about the proper types and categories of video services, most video networking experts would distinguish between at least four types of video. Some of these types can be supported over frame relay networks immediately, while others might take a while to find their way onto frame relay networks. The main reason is the bandwidth each of them demands. A few words about each of the four main types of video are in order.

First and most basically, there is very low-bit rate video telephony of the type designed and supported on ISDN B-channels. Most equipment designed for this market operates at the modest speeds of 64 kbps or 128 kbps, speeds that are universally available even to the home in ISDN environments. The digital coding of the analog video is handled by H.261, part of the H.323 specification. The screen size of the image is 176 x 144 pixels, or about 1/8th of the size of a PC screen (the image can be enlarged, but only with considerable distortion). The system delivers about 7.5 frames per second, which results in very choppy and blurry motion. About the best that can be said for this today is that it works. But no one would care to watch it for hours on end.

A step up in quality is video conferencing, which is just video telephony with more bandwidth available. Designed for ISDN support, video conferencing uses H.261 coding at speeds of 384 kbps to 1.5 Mbps, and in many cases only these two speeds. The image is 352 x 288 pixels, twice as large as video telephony, but still only 1/4th of the PC screen. Video conferencing can handle between 10 to 30 frames per second, but usually the higher number (full-motion video) is reserved for 1.5 Mbps connections. The low end is not much better than video telephony, but 30 frames per second yields substantial improvements in video quality, at least enough to allow the use of video conferencing for several hours at a time.

The top of the video line today, at least in terms of affordable equipment, is MPEG-2 encoded analog broadcast television signals. This video quality is what users of satellite television systems enjoy and, in most cases, the picture is better than that available on analog cable television systems. MPEG-2 is a digital video standard that supports full-screen (or more) 720 x 576 pixel images at a full-motion rate of 30 frames per second. The bit rate of 4 to 9 Mbps is fine for LANs, but difficult to provision on frame relay UNIs.

Finally, waiting in the wings (but not for long), is high-definition television, or HDTV. This features a screen of movie proportions (16 x 9 aspect ratio) so that motion pictures need not be letterboxed or edited for television. The 1920 x 1080 image is at a higher resolution than anything else (HDTV still uses MPEG-2), and of course HDTV runs at 30 frames per second.
For transmission purposes, 20 Mbps is needed, and distributing production HDTV images requires 100 Mbps. Again, it should be pointed out that other methods and categories could easily be added to this list (for example, Motion-JPEG and several MPEG profiles). But these are the main categories used by most users and service providers.
Of the four digital video systems detailed, both video telephony and video conferencing can easily be handled on existing 1.5 Mbps or 2 Mbps frame relay UNIs right now. Several vendors make the equipment; none of them really require a special video FRAD, for the simple reason that most end-user digital video equipment has LAN connections already. Simply put the video equipment on a LAN, hook the LAN to the FRAD, link the FRAD to a UNI running at 1.5 Mbps or 2 Mbps, and away it goes. Caution is needed when attempting to run either 64 kbps video telephony over a 64 kbps UNI or 1.5 Mbps or 2 Mbps video conferencing over a 1.5 Mbps or 2 Mbps UNI, since the video packets can easily cause everything else mapped to the same UNI to freeze, perhaps for the duration of the video session. But 384 kbps video will run fine over a higher-speed UNI, as many users have been pleased to discover.

As for MPEG-2 and HDTV, most frame relay customers can only dream for now. Not only is the end equipment expensive, almost prohibitively so for HDTV, but few frame relay customers or service providers have the UNIs necessary for this type of video stream. Now, two 2 Mbps UNIs or three 1.5 Mbps UNIs can be inverse-multiplexed for MPEG-2 (the multiport UNI discussed in Chapter 4). But then the speed of any NNIs encountered becomes the problem. And a 45 Mbps UNI (and NNI) is a given for HDTV. Needless to say, 45 Mbps UNIs and NNIs remain rare and prohibitively expensive except for the most ardent video users and specialized customer markets, such as major television and motion picture studios. Of course, this will change over time, but only slowly.

Although bandwidth is the overwhelming consideration for video over frame relay, it is not the only consideration. In fact, all of the voice concerns also apply to video. Frame loss causes images to freeze and jump. Delay variations, or jitter, cause images to speed up and slow down. In the case of video telephony and video conferencing, careful configuration of CIR and jitter buffers is needed. For MPEG-2, the standard fortunately contains its own built-in methods for dealing with loss and jitter. The video over frame relay requirements are summarized in Table 9.4. Although lower speed or inverse-multiplexed UNIs are possible for some frame relay video, the table lists minimum recommended UNI and NNI speeds.

Table 9.4 Video over Frame Relay

APPLICATION          CODING     MAXIMUM               FRAMES       MAXIMUM          "MINIMUM"
                     STANDARD   SCREEN SIZE           PER SECOND   BIT RATE         UNI/NNI
Video telephony      H.261      176 x 144 (1/8th)     7.5          64–128 kbps      1.5–2 Mbps
Video conferencing   H.261      352 x 288 (1/4th)     10–30        0.384–1.5 Mbps   1.5–2 Mbps
MPEG-2 video         MPEG-2     720 x 576 (full)      30           4–9 Mbps         45 Mbps
HDTV                 MPEG-2     1920 x 1080 (movie)   30           20–100 Mbps      45+ Mbps
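For planning purposes, the table reduces to a simple lookup. The Python sketch below encodes the "minimum" UNI speeds from Table 9.4 and flags which video types a given UNI can reasonably carry; the function and its names are invented for illustration, though the numbers come straight from the table.

    # Minimum UNI speeds taken from Table 9.4; the helper itself is illustrative.
    MIN_UNI_BPS = {
        "video telephony":    1_500_000,   # 1.5-2 Mbps UNI
        "video conferencing": 1_500_000,   # 1.5-2 Mbps UNI
        "MPEG-2 video":      45_000_000,   # 45 Mbps UNI
        "HDTV":              45_000_000,   # 45+ Mbps UNI
    }

    def supported_video(uni_bps):
        """Return the video types from Table 9.4 that fit on this UNI."""
        return [v for v, minimum in MIN_UNI_BPS.items() if uni_bps >= minimum]

    print(supported_video(2_000_000))    # ['video telephony', 'video conferencing']
    print(supported_video(45_000_000))   # all four types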
It might seem surprising that voice and video, which previously required their own private networks to function properly, can be accommodated as well as they are on frame relay networks. But perhaps this should not come as much of a surprise. The most popular private networking standard for data of all time, IBM's Systems Network Architecture (SNA), not only rides easily on frame relay, it actually thrives in a frame relay environment. So much so that the migration of SNA applications from private, leased-line networks onto frame relay networks helped boost frame relay into the spotlight. The time has come to examine the relationship between SNA and frame relay more closely.
Chapter 10: Systems Network Architecture and Frame Relay

Overview

IBM's Systems Network Architecture (SNA) is the most successful vendor-specific network protocol architecture in history. From humble beginnings in 1974, SNA has found its way onto some 50,000 networks, based on IBM licensing agreements. The revenues from VTAM licensing alone, a key component of SNA networks, would at one point have constituted a Fortune 200 company all on its own. As common as the Internet protocol suite has become as a vehicle for all forms of networking, public and private alike, some 60 percent of all data still flows over SNA networks, according to most observers. Now, a great deal of this SNA traffic still finds its way onto router-based networks running the Internet protocol suite, including the public Internet itself in some cases. But the point is that most banks, insurance companies, and other institutions with strong financial interests and activities more or less revolve their networking around SNA. There may be plenty of IP and routers around, but the systems that pay the bills are running SNA. Even companies that make and market pure client/server PC-based products often process the order and invoice the customer on their mainframe-based SNA network.

IBM often calls the mainframe a host, which can be confusing to those familiar with the Internet use of the word host to denote anything that runs the TCP/IP Internet protocol suite. In this chapter, mainframe is the preferred term, and all uses of the term host will be qualified.

These facts about SNA might come as somewhat of a surprise to those familiar only with client/server computing, connectionless LANs, and Internet protocols. It was common in the early 1990s to consider the IBM mainframe a dinosaur and the days of big iron computers numbered. Why sit users in front of green phosphor dumb terminals (most often 3278-2 IBM terminals) and process everything on a remote, million-dollar mainframe? Client/server computing could give the same users a powerful client PC at almost the same cost as the 3278-2 (the main cost in both is the monitor itself) and minimize dependence on a remote mainframe by using a server located down the hall. A multimegabit LAN instead of a multikilobit WAN connected the components of the network. Client/server was peer networking at its best, since the server was just another PC, but one dedicated to serving up resources to the client community. Potentially, any PC could be a server, although such pure peer networks were usually limited to about 10 users or fewer, not because of limitations in the technology, but mostly due to limitations in processor power and the way humans interact with networks.

Regardless of how it was used, peer networking contrasts starkly with SNA's hierarchical networking approach, where an all-powerful mainframe is the center of the network universe and controls every interaction. The details will be dealt with later, since they directly concern how SNA and frame relay interact.
Router as Frame Relay Access Device for SNA

It is all well and good to talk about using host FRADs and terminal FRADs for linking SNA devices over frame relay, or to discuss upgrading and/or changing the VTAM and NCP netgens (for network generation) and cluster controller configurations for frame relay, but this is not what usually happens to SNA networks today when frame relay migration occurs. Buying special SNA FRADs for each site is expensive, and making massive changes to SNA netgens is enough to cause SNA administrators many sleepless nights (mistakes are easy to make, and there have been few tools to detect errors until the SNA network goes live on Monday morning). But most sites already have a router connecting the LANs at the site. If the router connectivity is provided by frame relay, it only makes sense to use the router as the FRAD not only for non-SNA traffic, but for the SNA traffic as well.

The previous section discussed some of the performance issues involved with SNA and FRAD routers. This section explores some of the configuration options for SNA and routers as FRADs. In this section, the term router is used to denote a device used to connect LANs running the Internet protocol suite (TCP/IP). Now, routers can easily understand and route multiple protocols, including non-routable SNA, but in the current environment of electronic commerce (e-commerce), ubiquitous Web sites, and frenzies over virtual private networks (VPNs), the use of IP-only routers on intranets is more or less a given. So in this context, SNA is carried over the frame relay network inside IP packets. The router itself must spoof any SNA polling used on the network.

Four scenarios are explored. All have advantages and disadvantages, which will be discussed in each section. None are perfect, but all work. In all cases, links labeled "FR" are understood to carry SNA traffic inside of IP packets inside of frame relay frames. Links labeled "SDLC" carry SNA frames and rely on SNA polling for link control. LANs labeled "LLC" (Logical Link Control, which is the ISO Data Link Layer adapted by the IEEE for LAN use) carry SNA traffic inside of LAN frames. Finally, links labeled "QLLC" (Qualified LLC, the SNA method of carrying SNA information inside X.25 frames) carry SNA traffic inside of X.25 packets. The four scenarios are:

Router as terminal FRAD
Router as host FRAD
Router as multiprotocol host/terminal FRAD
Router as X.25 terminal FRAD

Not all router products from all router vendors will support all of these scenarios. But most major router vendors will have one product or another that will fit one or more of these scenarios.
Router as Terminal Frame Relay Access Device

Most SNA remote sites have a cluster controller attached to the central mainframe site, at least in those organizations whose business revolves around financial information and transaction support. But since the SNA cluster controller and its leased line were installed, the site has probably also gained a LAN or two, and at least one router to link remote clients and servers to the site.
This router network might be on frame relay already; the next step is to replace the multidrop private lines used for SNA with some form of frame relay connectivity. The concern is that all of the personnel familiar with SNA configuration and upgrades tend to be concentrated at the central site. Therefore, it would be nice to have a scenario whereby the changes required by frame relay migration at the remote sites are minimal, while all of the complexities involved are handled at the central mainframe site, where support is strongest. This is exactly what the router as terminal FRAD approach is all about.

The router at each remote site is upgraded (if it is not capable of such support already) to handle SNA SDLC poll spoofing on at least one port. All LAN protocol support remains the same. A separate PVC should be configured for the SNA traffic, not only because the SNA connectivity is being added on top of the existing PVCs, but because of all the considerations previously discussed. No changes are needed to the cluster controller configuration, just a cable swap from private line to router port (watch out for compatibility here).

At the central site, the FEP itself is upgraded (again, if it is not capable of such support already) to handle a frame relay UNI directly from the FEP. The requirement here is version 7.1 of the FEP's Network Control Program (NCP). The NCP version must be carefully coordinated and work together with the mainframe VTAM version, so if an upgrade is needed, it is not a trivial task. All of the changes are made at the central site. Also, note that the routers, although a key component of the network, are not really SNA components of the network and thus could be a concern.

The idea behind this approach is shown in Figure 10.8 with a simple, four-site (one central, three remote) SNA network. Note that a separate UNI is needed at the central site, even though a router and UNI might be present at the central site already. This is because the FEP is not a multiprotocol router and only understands SNA, and also because the traffic patterns of SNA are fundamentally different from client/server traffic patterns. Even here, there are variations. Routers exist that can totally replace the FEP. These routers are channel-attached to the mainframe, just as an FEP is. The router can also provide LAN interconnection at the same time on other ports. Such routers will be mentioned again in the third scenario.
Figure 10.8 The router as terminal FRAD.

The major advantages of this scenario are:
1. No changes to the remote cluster controllers.
2. All changes made at the support-rich central site.
3. Poll spoofing supported at remote site.
4. SNA priorities can be supported in terminal FRADs.

The major disadvantages of this scenario are:
1. Could require upgrades to central site hardware and software.
2. Requires complex changes to central site configurations.
3. Usually requires new, high-speed UNI at central site.
4. Non-SNA router component becomes point of failure concern.
Router as Host Frame Relay Access Device
This scenario is basically the flip side of the previous one. Potentially, there is much more to be gained by getting rid of the many private lines needed to connect all of the SNA remote sites to the central mainframe site. At the same time, upgrading the NCP and VTAM to frame relay-compliant levels might not be an option. Some older mainframe applications, perhaps the very ones that form the lifeblood of the business, are untested or just not supported on newer versions of VTAM or NCP. At the same time, upgrading a basic cluster controller for frame relay support is not much more complicated than telling a router to run a serial port with frame relay instead of some other serial line protocol. So it still makes sense to put the frame relay UNI support on each remote cluster controller and put a router as a host FRAD at the central site. This requires a new, but low-speed, UNI to each remote site (actually, the short UNI replaces the long private line, but multidrop must be taken into consideration). But perhaps smaller offices and older networks have no large LANs or routers already in place at every site anyway. So each remote site gets a UNI and frame relay support on the cluster controller.

At the central site, on the other hand, the router spoofs and answers the polls from the FEP. The router also handles all of the frame relay UNI functions. No changes to hardware, software, or configuration are needed for the SNA equipment at the central site. This consideration alone might justify the approach.

There is another case where this scenario makes perfect sense: where the central site has a LAN-attached FEP or other IBM processor that supports LAN attachments, such as an AS/400. More details about AS/400s and frame relay will be examined later, but for now the fact that many AS/400s have LAN interfaces means that they can easily be attached to the router running frame relay and carry AS/400 traffic across the frame relay network at the same time (although AS/400 terminals are not 3278-2s).

The idea behind this approach is shown in Figure 10.9 with the same simple, four-site (one central, three remote) SNA network. Note that a frame relay UNI is needed at each remote site, even though a router and UNI might be present at the remote site already (not shown in the figure). There are variations here as well. Routers can be deployed as host FRADs and terminal FRADs at the same time, and so forth.
Figure 10.9 The router as host FRAD.

The major advantages of this scenario are:
1. No SNA changes to the central site at all.
2. All changes simple and made at the remote sites.
3. Poll spoofing supported at central site.
4. SNA applications are not aware of the change.

The major disadvantages of this scenario are:
1. Could require upgrades to remote cluster controllers.
2. Remote sites usually lack expertise and support.
3. Requires new, but low-speed, UNI at each remote site.
4. Non-SNA router component becomes point of failure concern.
Router as Multiprotocol Host/Terminal Frame Relay Access Device
In many situations, as has already been mentioned, sites with SNA equipment also have routers, routers that are capable of either multiprotocol support or multiprotocol tunneling (stick everything inside an IP packet and just route IP, also called encapsulation). But it is not yet just an IP world, or even an IP and SNA world. There are other network protocols in use, including IPX (NetWare's packet protocol), VIP (Banyan Vines' packet protocol), AppleTalk, DECnet, and so on. Even IBM supports protocols besides SNA. Before there was SNA, there was Bisync (the Binary Synchronous protocol, or BSC). BSC survived SNA not only for reasons of conversion complexity and expense, but also because Bisync frames could be typed by a human being on a keyboard using standard key combinations (Ctrl-A for Start of Header, for example). This was one reason that early ATM networks (the teller machines, not the switching concept) used the Bisync protocol, even though Bisync was more or less considered obsolete even then. So there are other protocols than SNA to be dealt with.

Why not use the router not only as the host FRAD and terminal FRAD, but also as the multiprotocol router that forms the backbone of the organization's network? The router could handle the multiple protocols either by actually routing multiple packet types (the old way) or by multiprotocol tunneling (the new way); it does not matter from the frame relay perspective. This scenario is pretty much the frame relay dream with SNA. The routers now converge not only all LAN traffic at a site, but also the SNA traffic. Of course, the decision whether to have separate PVCs for SNA is still a crucial one. The same goes for SNA PVC priorities, buffer handling, and so forth. Adding SNA traffic might require higher-speed UNIs, especially at the central site, higher CIRs if no new PVCs are used, or both.

As previously mentioned, there are routers that are more or less intended for use at an SNA central site in just this fashion. These routers have the usual LAN and frame relay UNI ports, as well as something extra. They also have an IBM bus-and-tag channel attachment port for hooking the routers directly to the mainframe. In this case, the router takes the place of the FEP altogether. The world's most feature-free router has now been replaced with a state-of-the-art, feature-rich, muscle processor that handles the SNA tasks with style. Of course, buffer and PVC optimization for SNA are essential in such a router.

The idea behind this approach is shown in Figure 10.10 with the four-site SNA network. Note that the major feature is integration of SNA and non-SNA traffic, which maximizes the connectivity offered by the routers (FRADs). The router FRAD at the host site passes along the SNA traffic to the FEP. It is crucial to enforce SNA priorities in the router FRADs and to try to obtain differentiated services from the frame relay service provider in this scenario. The actual blend of LAN-based equipment and protocols can vary, of course. The central router FRAD can spoof to the FEP and therefore insulate mainframe applications from the changes, or the FEP can use the frame relay interface, with the router FRAD treating the FEP interface as another "UNI." This is not shown in the figure.
Figure 10.10 The router as multiprotocol FRAD.

The major advantages of this scenario are:
1. Natural application of router-based networking.
2. Can be used to minimize changes at central and remote sites.
3. Merges SNA and non-SNA traffic onto one unified network.
4. Router/FRAD becomes network management focal point for all.

The major disadvantages of this scenario are:
1. Router FRADs may be invisible to SNA network.
2. SNA must compete with bursty, non-SNA traffic loads.
3. Priorities and/or differentiated services might not be enough for SNA.
4. Non-SNA router component becomes point of failure concern.
Router as X.25 Terminal Frame Relay Access Device

The last two scenarios considered in this section address the relationship of SNA over frame relay with SNA over X.25. Outside of the United States, and even within the United States in some cases, SNA networks were built on or migrated to slow packet networks based on X.25. SNA equipment that uses X.25 employs a protocol known as Qualified Logical Link Control (QLLC), which is yet another variation on the ISO Data Link Layer functions, but one that allows X.25 packet content to be delivered to the SNA equipment itself, not to the users. With QLLC, some packets are qualified and contain control-type information, while the others are delivered to the user process itself.

In cases where X.25 with QLLC has been deployed, the router can still act as a terminal FRAD, this time converting frame relay to X.25 with QLLC and back again. The central site would most likely convert the FEP to frame relay (with all of the associated NCP and VTAM release concerns), but the major network change would be exchanging X.25 PSDN service for frame relay service. Of course, a spoofing router FRAD could be used at the central site. Note that there is no need to reconfigure the remote cluster controller for SDLC support.

The idea behind this approach is shown in Figure 10.11 with the same four-site SNA network. Note that this scenario does not address the presence of other routers at the sites, nor the presence of non-SNA traffic. These issues have already been addressed in previous scenarios, and it is enough to note that the same concerns apply here, just substituting X.25 for SDLC. The router FRADs at the remote sites convert from X.25 with QLLC to frame relay. The X.25 "UNIs" now become frame relay UNIs. There is not much more to it than that.
Figure 10.11 The router as X.25 terminal FRAD.

The major advantages of this scenario are:
1. No changes to the remote cluster controllers (still X.25/QLLC).
2. All changes made at the support-rich central site (FRAD or FEP).
3. Poll spoofing supported at remote site.
4. SNA priorities can be supported in X.25 terminal FRADs.

The major disadvantages of this scenario are:
1. Could require upgrades to central site hardware and software.
2. Requires complex changes to central site configurations.
3. Might require new, high-speed frame relay UNI at central site.
4. Non-SNA router component becomes point of failure concern.
SNA and Frame Relay

Frame relay and SNA fit very well together, so well that some have noted that the rise of frame relay has coincided with the migration of SNA networks from private line environments to public packet network environments. This is totally understandable. Frame relay is an international standard fast packet networking technology. SNA is IBM's vendor-specific networking method. Formerly, the term proprietary was often applied, but newer versions of SNA are more open than ever before. In the case of SNA, vendor-specific refers to the fact that IBM determines what SNA is or is not, not some international standards organization. While SNA is as standardized in its own way as frame relay, SNA is not an international standard in the same sense that frame relay, X.25, or even ATM is.

Frame relay networks are singularly indifferent to the content of frame relay frames. Yet SNA traffic is often singled out as deserving of special mention when it comes to frame relay. What's so special about SNA traffic? Well, as has been pointed out, SNA has been the most common way of building the private corporate networks on which companies essentially run their businesses. The fact that frame relay is a public networking alternative to these private networks has a potentially enormous impact on the users and builders of SNA networks. Furthermore, about half of all frame relay networks in use today have at least some SNA traffic on them. Finally, IBM itself has traditionally limited SNA link support to little besides private lines. X.25 was the major exception. Since frame relay evolved from X.25, IBM quickly embraced frame relay as a better way to support SNA links in a public packet-switching environment.

SNA is actually a whole family of hardware, software, and protocols that fit together to allow remote terminals to access information resident on mainframe computers over a WAN in a very secure, very fast, and very stable manner. When considering only the WAN, SNA networks revolve around two key components. These are the 3174 cluster controller (technically now called the enterprise controller by IBM) and the 37X5 (3705, 3745) FEP (technically called the communication controller [COMC] by IBM). These two devices, the cluster controller at the remote site and the FEP at the central mainframe location, have typically been linked by multidrop or point-to-point leased private lines using IBM's data link protocol for SNA, Synchronous Data Link Control (SDLC). The SDLC protocol sends frames between FEPs and cluster controllers for exactly the same reasons that frame relay sends frames over a public frame relay network between any two devices. The SNA frame structure used in SDLC is quite different from the frame structure used in frame relay. There are similarities, however, including the use of a header and information field, and address information in the header, among other things. The polling sequence in SNA is a special type of small SDLC frame that is exchanged repeatedly between FEP and cluster controller.

The FEP also routes SNA traffic, and is every bit as much of a router in that sense as any Internet-based network node. The FEP is basically the network node, or router, of the SNA network. Yet the phrase "SNA is a nonroutable protocol" is often heard, even among internal IBM networking groups.
While it is true that SDLC frames in SNA cannot be routed in the same sense that TCP/IP packets can be routed connectionlessly between LANs or on the Internet, SNA is not really a nonroutable protocol. A more accurate and balanced evaluation would be that routers send packets between separate networks, but SNA FEPs basically switch frames between all the components of one big SNA network. There are more technical differences, and not a few exceptions to this statement, but this is all that needs to be pointed out when considering the relationship of SNA to frame relay, since the most common frame relay FRAD is the router.
Even as recently as two or three years ago, migrating from multidrop SNA to frame relay was a challenging task. How were the router FRADs to be linked to the SNA FEPs and cluster controllers? What about session timeouts and so forth? Most of these issues now have answers, or at least guidelines. These guidelines basically fall into one of the following major categories:

Frame relay delays
Delay variations
The router as FRAD
SNA and PVCs
SNA and FECN/BECN/DE

The following sections look at each of these issues in more detail.
Frame Relay Delays

Whether it's called latency or delay, this boils down to the same thing: How long does it take for a frame relay frame to find its way through the network? Private lines have extremely low delays compared to most current packet-switched networks, fast packet or not. A private line is essentially a long stretch of copper wire with some fiber thrown in the middle, and the copper-to-fiber conversions (electro-optical interfaces) happen at blazingly fast, chip-level speeds. Switching node processing is minimal, so leased private lines have delays of around 20 to 30 milliseconds, even on coast-to-coast circuits in the United States.

Frame relay networks have higher delays, although not necessarily in the hundreds-of-milliseconds range. The concerns about the slower frame relay network are offset by the fact that the FEP now only needs to service one port to reach all remote cluster controllers, and the task of actually generating an SNA poll on each PVC can be offloaded even further from the mainframe than the FEP. This will be explained in more detail shortly. The net result is that many SNA customers actually find that user response times improve when migrating from private lines to frame relay for SNA. This pleasant result encourages the use of frame relay for SNA even more. All things considered, in spite of all the hype, SNA users experienced a wide range of results with frame relay. User response times improved, stayed about the same, or deteriorated. It all depended on the interplay of many factors, including the UNI bandwidth to the central FEP site and to the many remote sites, the volume of the traffic on the network, the sizes of the messages inside the frame relay frames, and the exact methods used to carry things like SNA polls across the frame relay network.

The buffering and frame processing that take place at frame relay switching nodes make frame relay fundamentally different from private lines used for SNA in terms of delay characteristics. On frame relay, delays are determined mostly by two types of serial delays, or insertion delays. Both terms are used for the same phenomenon. Private lines have only one serial delay: at the source. Frames, SDLC frames or otherwise, are placed in an output buffer when ready for transmission. If queued behind a frame already being sent bit-by-bit on the link (hence the term serial delay), the frame must wait until the frame in progress is finished, no matter if the frame in the queue has a higher priority than the one in progress. By definition, there is no subunit structure to a frame: A frame is a first-order bit structure. Once the first bit of a huge but low-priority frame is on the wire, the bit cannot be pulled back in favor of bits representing a small but high-priority frame. Serial delay is a lot like a 200-car freight train delaying an automobile at a railroad crossing. Once the engine has started through the crossing, no cars go anywhere until the caboose goes by.
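Since serial delay is nothing more than frame size divided by port speed, the freight-train analogy is easy to quantify. The Python sketch below does the arithmetic; the 4000-octet frame and the port speeds are arbitrary examples chosen for illustration.

    # Serial (insertion) delay: time to clock one frame onto the link.
    def serial_delay_ms(frame_octets, port_bps):
        """Time to place one frame on the wire, in milliseconds."""
        return (frame_octets * 8) / port_bps * 1000

    # A 4000-octet bulk frame ahead of a small SNA poll on a 64 kbps port:
    print(f"{serial_delay_ms(4000, 64_000):.0f} ms")     # 500 ms before the poll moves
    # The same frame on a 1.5 Mbps port:
    print(f"{serial_delay_ms(4000, 1_536_000):.1f} ms")  # about 20.8 ms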
Replacing private lines with frame relay does nothing to or for the serial delay on a network. If large frames got in the way of SDLC polls before, they will with frame relay also. The issue is that frame relay networks add a second serial delay, or insertion delay, to the delay always present at the source. In frame relay, there is a second serial delay at the destination UNI on the other side of the network. Note that serial delays can also occur on the backbone links between frame relay switches. But these queues and buffers are usually modest, and enough traffic on a frame relay network to cause added delays on the backbone usually triggers FECN/BECN congestion bit settings long before the backbone delays become a problem. To separate the two frame relay delays, which are both plain serial delays, it is common to refer to them as the ingress delay (onto the source UNI) and the egress delay (onto the destination UNI). These delays in both private line and frame relay environments are shown in Figure 10.3.
Figure 10.3 Ingress and egress delay.

The amount of added delay will depend on other factors such as frame size (larger trains take longer to pass the crossing), serial port speed, and other, non-SNA traffic activity levels. In turn, the added delay can not only totally offset any response time gains from migrating to frame relay, but adversely affect SNA sessions. For example, some applications must exchange many small packets before a screen refresh occurs and the user sees "transaction completed" or some such message. If each packet is delayed just a little bit more, the cumulative effect is that it takes what seems like ages for the "transaction completed" message to appear.

The good news is that few SNA applications today are as chatty as they were in the past. In fact, compared to other protocols, especially those intended for bandwidth-rich LAN environments, SNA transactions are downright lean. After all, SNA was designed for low-speed WANs, not high-speed LANs (although SNA zips along quite nicely on LANs, of course). Still, most SNA transactions are inherently asymmetrical, like most other good client/server applications. Dumb terminals cannot generate large amounts of information anyway. But mainframes can and do generate a lot of information. So SNA transactions send small amounts of data into the mainframe, but deal with a lot leaving the mainframe. Typically, a 3278-2 will send between 100 and 200 octets into the network (the contents of a "fill in the blanks" form, for instance), and the mainframe will respond with between 700 and 1000 octets (the whole customer history, for instance), although a full screen is about 2000 octets long. The bottom line is that customers migrating from multidrop private line SNA to frame relay should not see a rise in end-to-end delay, given the common sizes of SNA transactions. If response times do rise, the speed of the UNIs is the first place to look for relief, but message sizes are important too.

This section presumes that each and every poll is still sent across the frame relay network. But with most implementations of SNA over frame relay, this is neither necessary nor common. In most cases, the polls that do not result in transaction traffic being sent between FEP and cluster controller (called unproductive polls in SNA) are not sent across the frame relay network at all. The FRAD, the router, or even the FEP and cluster controller themselves in some cases can all spoof the polling sequence at each end. These three possibilities are illustrated in Figure 10.4.
Figure 10.4 Poll spoofing in frame relay.

Poll spoofing can be done on any packet network, including the Internet. There are even variations on poll spoofing, all of which involve the fact that the SNA sessions are effectively terminated at the spoofing device. In other words, at the mainframe FEP site the spoofing could be performed by a host FRAD device that more or less does only poll spoofing, while at the remote sites with the cluster controllers, the spoofing can be done by a router, since many cluster controllers already have LAN interfaces anyway. Spoofing at the remote cluster controller sites can also be done by a specially equipped terminal FRAD. Here, the term host is used in the IBM sense of mainframe.

It should be noted that IBM has even developed a standard way of carrying SNA traffic inside of IP packets across a router network. The method is totally independent of frame relay network use (frame relay could still link the routers) and is mentioned here only for completeness. The technique is called Data Link Switching (DLSw) and proposes a standardized tunneling technique (although sometimes it is called an encapsulation technique) for SNA and also LAN-based NetBIOS traffic. The details of DLSw operation are not important here. What is important is to stress that poll spoofing, session termination, and frame relay support for SNA are possible not only with the methods and techniques described here. There are also vendor-specific implementations of approaches that make the merging of SNA, frame relay, and router networks easier. The main point here is not how to do it, but the issues involved.

A major feature of any method is that delays across the frame relay portion of the SNA network are kept low within the network itself in two main ways. First, frame relay service providers frequently offer delay-bounded services ("less than 40 milliseconds") which are not very different from private line delays. Second, poll spoofing can reduce the amount of traffic that needs to be sent across the frame relay network, improving delays by minimizing the effects of serial delays at both ends of the network.
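To make the spoofing idea concrete, here is a deliberately simplified Python sketch of a terminal FRAD answering polls locally. Real SDLC uses addressed RR and I-frames with send and receive sequence numbers; the class, its names, and the two-case logic below are invented for illustration only.

    # A toy poll spoofer: answer unproductive polls locally, forward only
    # frames that actually carry transaction data across the PVC.
    class SpoofingFRAD:
        def __init__(self, network_link):
            self.network_link = network_link   # callable that sends across the PVC

        def handle_poll(self, pending_data):
            """Called each time the locally attached SNA device polls."""
            if pending_data is None:
                # Unproductive poll: answer locally; nothing crosses the network.
                return "RR (nothing to send)"
            # Productive poll: forward the real data over the frame relay PVC.
            self.network_link(pending_data)
            return "I-frame forwarded"

    frad = SpoofingFRAD(network_link=lambda data: print(f"PVC <- {data!r}"))
    print(frad.handle_poll(None))            # RR (nothing to send)
    print(frad.handle_poll(b"transaction"))  # forwards the data, then: I-frame forwarded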
Delay Variations

Not only are the delays experienced on a frame relay network higher than on private line networks, they are also more variable, serial delay effects aside. This is because private lines do not have bursts of traffic. A 64 kbps leased line always sends 64 kbps in the familiar circuit "all the bandwidth, all the time" approach. There is no way to burst above the line rate, nor any advantage to sending less than 64 kbps of information per second. In frame relay terms, a 64 kbps private line is a circuit with one PVC (itself) and a CIR equal to the peak rate of 64 kbps. Bursts into a frame relay network introduce delay variations, mostly due to serial delays at the ingress and egress ports, as previously discussed. How much delay is introduced depends on many factors, including overall network traffic loads and local activity. Delays in frame relay can be guaranteed to be below some value (what is the delay when all buffers are full?), but are not guaranteed to be stable.
But SNA sessions time out if messages are not received within a certain timeframe. It seems as if the solution would be to raise the timeouts to the point where delay variation is not a problem. This establishes a kind of voice jitter buffer approach to transactions. But then the network responds sluggishly to errors and other faults. Yet, if the timeout is set too low, duplicate SNA messages inside frames might be sent while the original has just been delayed. While many other protocols have procedures established to detect and eliminate delayed duplicates, SNA has no such provision. SNA sessions are dropped and must be restarted if duplicate frames are received.

Bursting results in discard eligible (DE) frames in frame relay as well. These are the first frames to be discarded by the frame relay network when many bursting sources swamp the network with traffic. Discarding pieces of SNA transactions is a sure way to slow the network to a crawl and infuriate users and customers.

So how should frame relay PVCs carrying SNA traffic be configured to account for delay variations? Basically, make sure that a PVC carrying SNA traffic is never allowed to burst above its assigned CIR. This approach might not be the most efficient when it comes to frame relay bandwidth utilization on a UNI, but it is the safest when it comes to session timeouts and delay variations. In practice, most SNA PVCs on frame relay have a 32 kbps CIR configured on a 64 kbps UNI at the remote cluster controller site. The situation at the central mainframe and FEP site is more complex and will be discussed next.
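The "never burst above the CIR" advice amounts to shaping the SNA PVC at its CIR. Below is a minimal token-bucket sketch in Python using the 32 kbps CIR just mentioned; the class and its one-second credit cap are assumed, illustrative choices, not any vendor's actual algorithm.

    # Shape an SNA PVC at its CIR: send only when enough credit has accrued.
    class CIRShaper:
        def __init__(self, cir_bps):
            self.cir_bps = cir_bps
            self.tokens = 0.0       # accumulated credit, in bits
            self.last_time = 0.0

        def try_send(self, frame_bits, now):
            """Return True if the frame may enter the network now."""
            # Credit accrues at the CIR, capped here at one second's worth.
            self.tokens = min(self.cir_bps,
                              self.tokens + (now - self.last_time) * self.cir_bps)
            self.last_time = now
            if frame_bits <= self.tokens:
                self.tokens -= frame_bits
                return True         # within CIR: frame enters, DE stays 0
            return False            # above CIR: hold the frame in the FRAD buffer

    shaper = CIRShaper(cir_bps=32_000)       # 32 kbps CIR on a 64 kbps UNI
    print(shaper.try_send(8_000, now=1.0))   # True: fits within accrued credit
    print(shaper.try_send(32_000, now=1.1))  # False: must wait for more credit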
The Router as Frame Relay Access Device

Most SNA networks that migrate to frame relay do not fit into the mesh-connectivity frame relay PVC patterns of LAN connectivity requirements. Of course, if the FEP and cluster controllers are already on LANs, the situation is more complex, but this section only addresses hierarchical SNA to frame relay issues. Since the central IBM mainframe is the focal point of the entire network, and SNA cluster controllers cannot communicate directly with each other, but only through the mainframe site, SNA frame relay PVCs tend to have a one-to-many pattern. The FEP site UNI has a PVC for each and every remote UNI leading to a cluster controller. On the other hand, the remote UNIs only need a single PVC leading to the FEP site. So SNA networks tend to aggregate and concentrate large amounts of traffic onto the UNI at the FEP site.

But this is not a problem for frame relay. UNIs linked by PVCs do not have to match speeds, just CIRs on the PVCs. But how many PVCs can be supported on the central site UNI? This all depends on the number of PVCs at the central site, the oversubscription policy of the service provider, and the maximum number of PVCs supported on a UNI of a given speed. Fortunately, whatever the needed number of PVCs turns out to be (10 or 20 or 50), a central site UNI with a higher port speed can be provisioned and configured. So many SNA networks have 64 kbps UNIs at the remote sites and 128 kbps or 256 kbps UNIs at the central site. A full DS-1 running at 1.5 Mbps is not uncommon for the central site. Many 32 kbps CIRs for many PVCs can be carried on a UNI running at this speed. One of these PVCs is shown in Figure 10.5, with a router as the FRAD, since this is most common. The issue is that the router also handles non-SNA, LAN-to-LAN traffic between the two sites as well. Otherwise, why use a router as the FRAD? Use special SNA host and terminal FRADs, or the FEP and remote cluster controllers themselves.
Figure 10.5 The router as FRAD for SNA.

What's wrong with this picture? Only that most routers will try to fill the pipe whenever they can, regardless of whether the serial port on the router is configured for frame relay or another private line protocol. In other words, unless the router is SNA-aware, the router will just blast SNA traffic (along with everything else) into the network at 1.5 Mbps. Routers have no respect for CIRs, or even awareness of CIRs, unless they are especially intended for frame relay CIR operation (and a few are).
So frame ingress is at 1.5 Mbps. But frame egress is only at 64 kbps maximum (the line rate) and 32 kbps minimum (the CIR)! So the frame relay switch at the end of the central 1.5 Mbps UNI becomes a potential congestion point. It is important to realize that most routers do not ignore frame relay CIRs out of spite. This situation was never a problem before. Consider two routers directly linked by a 64 kbps leased line, as shown in Figure 10.5. The router might become congested under heavy traffic loads, so the router always tries to fill the pipe to prevent its own congestion. After all, a private line gives the router "all the bandwidth, all the time." But frame relay does not give the router all 1.5 Mbps into and out of the network. Traffic builds up at the ingress node, buffers fill, FECNs and BECNs flow, and frames are discarded, much to the detriment of SNA transactions.

This mismatch between network fill rate and drain rate, as it is sometimes called, is not unique to SNA over frame relay. Extreme client/server situations can have the same effect. But the effect is most common in SNA, where central sites are the rule. Oddly, even if the router gives priority to the SNA traffic, ingress congestion is a real problem. The traffic just arrives at the congestion point faster. It is not enough for a router to give priority to SNA PVCs at the router. The frame relay network itself must be able to prioritize PVCs, perhaps as part of a differentiated services quality of service offering.

So what's the answer? How can a router adequately handle not only LAN-to-LAN traffic, but also delay-sensitive SNA interactive transactions? There are two answers. First, use a router that is especially designed both to respect CIRs and to assign priority to SNA traffic. This way, if any source at the central site, SNA or not, generates traffic above the CIR for that PVC, the router can buffer the traffic, not just dump it into the network. Today, the low cost of memory has resulted in a line of routers specifically designed for frame relay that can buffer megabytes of frames, not to introduce more delays, but to alleviate them. The second answer is to assign a separate PVC just for the SNA traffic, even though the PVC obviously follows the same path through the frame relay network as a PVC used for non-SNA traffic to the same site. This is discussed in the next section.
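A back-of-the-envelope calculation shows how quickly the mismatch builds a backlog. The numbers below (a router filling a 1.5 Mbps central UNI toward a single PVC draining at a 32 kbps CIR, for half a second) are illustrative only.

    # Fill rate versus drain rate: the ingress switch buffer grows at the difference.
    FILL_BPS = 1_536_000    # router fills the central UNI at port speed
    DRAIN_BPS = 32_000      # the remote PVC drains at its CIR
    BURST_SECONDS = 0.5     # an assumed half-second burst from the router

    backlog_bytes = (FILL_BPS - DRAIN_BPS) * BURST_SECONDS / 8
    print(f"Backlog after {BURST_SECONDS}s: {backlog_bytes / 1000:.0f} kB at the ingress switch")
    # Prints: Backlog after 0.5s: 94 kB at the ingress switch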
SNA and PVCs One of the crucial factors in the success of running SNA over frame relay is to assign a separate PVC specifically for the SNA traffic, even if this results in multiple PVCs running between the same sites. This is especially true when the service provider’s frame relay switch buffers data at the ingress port on a per-PVC basis, which many frame relay switches do. The whole idea is shown in Figure 10.6. There are a number of reasons for using a separate PVC for SNA traffic, some of them not so obvious.
Figure 10.6 Separate PVCs for SNA.

First and foremost, SNA traffic tends to be less bulky than other types, and stable response times are needed to prevent session timeouts and restarts. Also, the bulkier the non-SNA traffic appears to the frame relay network, such as in the case of file transfers or Web pages, the more reason there is for the use of a separate PVC for SNA. Finally, the smaller the CIR at the low-speed egress port, and the slower the drain of the output buffers, the greater the chance that SNA traffic will find itself serially delayed by the non-SNA traffic. Of course, the previously mentioned considerations that separate PVCs enable source FRAD priorities and service provider differentiated services for SNA traffic are still a factor.
Unfortunately, there are no hard and fast rules. In the end, it all depends on traffic patterns. But PVC cost should not be part of the consideration. Two PVCs are not always twice as expensive as one PVC. And even when they are, there is not much of a burden represented by the few dollars per month that most service providers charge for PVC maintenance (in truth, little effort is required to maintain a table entry: Make sure it stays there!). Now, this might sound harsh when large SNA networks might require PVC connectivity with 300 remote sites, but compared to the monthly cost of the UNIs alone in this case, PVC costs are small.

Sometimes PVCs are priced by CIR, with higher CIRs costing more. This pricing scheme makes sense, because the more resources reserved on the network for a PVC, the fewer PVCs that can be configured and sold. On the other hand, some frame relay service providers just say, "Pick the CIR that makes the most sense; the cost is the same." The philosophy here is that there is no savings incentive to underconfigure CIRs and no benefit in overconfiguring CIRs. It is claimed that this flat-rate CIR approach makes for more right-sized CIRs, but no reliable studies have been done contrasting the two approaches. In any case, CIR-sensitive pricing could actually encourage separate SNA PVCs. In many cases, two PVCs with correspondingly lower CIRs will not cost twice as much as a single PVC with a larger CIR, and the performance benefit could be great, considering the router, FRAD, and frame relay switch buffering policies. Of course, things might not be so simple. At the central site, with potentially hundreds of remotes to reach with PVCs, doubling the number of PVCs configured could seriously challenge the UNI's CIR oversubscription limits, the service provider's maximum PVCs per UNI policy, and the ability of the FRAD-router vendor to adequately service and buffer traffic to and from all of the PVCs.

However, there is one practice that has been considered in many SNA over frame relay situations where separate PVCs have been assigned for SNA and non-SNA traffic that should not be encouraged. Sometimes network administrators will limit the window size of protocols used for non-SNA file transfer and other bulk information transfer applications. The idea is that since these protocols expect to use the entire bandwidth available on a leased line, some attempt should be made to make these protocols realize that the bandwidth is now limited by the PVC and its CIR. But in most cases it is simply better to allow the protocols involved to perform their own window-setting, which is built in and based on real-time measurements of round-trip delays across the network and the like. Attempting to guide these activities has often resulted in more harm than good.
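The central-site arithmetic mentioned above is straightforward to check in advance. The Python sketch below sums the CIRs on a UNI and compares the total against the port speed; the 200 percent oversubscription limit is purely an assumed example, since actual limits vary by service provider.

    # Does the sum of the CIRs exceed the UNI port speed, and by how much?
    def oversubscription(port_bps, cirs_bps, limit_ratio=2.0):
        total_cir = sum(cirs_bps)
        ratio = total_cir / port_bps
        return ratio, ratio <= limit_ratio

    # 60 remote sites, each reached by one 32 kbps SNA PVC, on a 1.5 Mbps UNI:
    ratio, ok = oversubscription(1_536_000, [32_000] * 60)
    print(f"CIR sum is {ratio:.0%} of port speed; within policy: {ok}")
    # Prints: CIR sum is 125% of port speed; within policy: True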
Systems Network Architecture and FECN/BECN/DE

It has already been noted that IBM has encouraged the use of frame relay for SNA connectivity in many ways. However, IBM itself has changed strategies several times over the years, and the end result has been that there is now a lot of obsolete information circulating about how IBM SNA equipment handles frame relay features like the FECN/BECN and DE bits. It is true that IBM equipment designed or optioned for frame relay use was the first frame relay equipment that respected the FECN/BECN congestion indicators. IBM FEPs, routers, and cluster controllers all throttled back sending rates in response to FECN/BECN bits arriving in frames, as specified in frame relay traffic-shaping documentation. Of course, in the cases where host FRADs and terminal FRADs were in use, since their very presence was totally transparent to the SNA network, it was not the FEP and cluster controllers that buffered information, but the FRADs themselves. Many documents claimed that only such equipment responded to FECN/BECN bits, however. Other routers, it was claimed, did not process the FECN/BECN bits and just blindly sent frames into a congested network in true best-effort fashion. The point is that many routers and other frame relay devices now do process FECN/BECN bits and change their behavior accordingly (typically by buffering frames).
However, merely adding FECN/BECN awareness to routers does not make them more suitable (or less suitable) for SNA traffic. This is because responding to FECN/BECN bits is not really an adequate way to handle network congestion in the first place. Any type of network congestion, whether considered mild (FECN/BECN set) or severe (throw away DE traffic), is sending a clear message to the network administrators: Something is not right. It could be undersized UNIs, or oversized CIRs, or both. Relying on FECN/BECN behavior modifications to address network congestion issues is a lot like relying on automobile accidents to reduce automobile traffic during rush hour, since there are now a few fewer cars that can enter the highway. In both cases, the root cause of the problem must be found and addressed. There could be non-SNA router broadcasts that should be filtered out, or other circumstances that should be changed.

SNA over frame relay's handling of the DE bit has been an oddity and has changed over the years since frame relay support was first introduced into SNA. Initially, SNA equipment attached to a frame relay network had a unique way of using and setting the DE bit, independent of the frame relay network's handling of the DE bit. SNA equipment set the DE bit to a 1 bit (okay to discard) on all frames sent into the mainframe at the central site. SNA equipment set the DE bit to a 0 bit (don't discard, please) on all frames sent out of the mainframe at the central site. The intention here was simply to acknowledge the needs of SNA transaction sessions. Traffic heading into the mainframe from a remote user represented work not yet done; if these frames were discarded, the most that needed to be done was for the user to press a key to enter the transaction again. On the other hand, traffic heading out of the mainframe to the remote user represented work done (perhaps a completed transaction), and if these frames were discarded the user might inadvertently trigger the same transaction over again. This attempt to repeat work done would necessitate a rolling back of the session, an undoing of transactions, and a lot of extra work in general. The DE bit use here was designed to minimize the loss of work done and maximize the loss of work to be done, if it came to that point on the frame relay network.

Some criticized this nonstandard use of the DE bit. But at the time, all the documentation stated was that the DE bit was to be used to avoid and/or alleviate congestion on the frame relay network. IBM was just seeking to make sure that there was enough DE traffic to discard when and if the network had to perform such a DE scan of congested buffers. This early SNA use of the DE bit is shown in Figure 10.7.
Figure 10.7 Early SNA use of the DE bit.

Results from this use of the DE bit were mixed at best. It has already been pointed out that most of the congestion concerns in SNA environments on frame relay are at the central site ingress UNI and remote site egress UNI. Sending outbound frames with the DE bit set to a 0 bit (don't discard, please) had little effect on these congestion points, because none of the frames from the central site were tagged as DE. There was little congestion risk on the inbound traffic to the central site (but more risk at the central site egress UNI), so having a lot of DE traffic there did not matter much. And the larger UNI size at the central site allowed large amounts of remote traffic to aggregate there comfortably. Only when large amounts of non-SNA traffic were mixed with SNA traffic, or central site UNIs ran at the same speed as the remote UNIs, did the SNA use of the DE bit make a noticeable difference. In any case, IBM dropped this practice. But mention of this old IBM practice does pop up from time to time.

Ironically, many routers that feature SNA support now allow the DE bit to be set to a 1 bit (okay to discard) on frame relay frames carrying non-SNA traffic before they enter the frame relay network. The intent is the same as before: Provide the network with enough traffic to discard under severe congestion while at the same time protecting the crucial SNA transactions. There is nothing wrong with either approach. Only the effectiveness of any technique is the issue.
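Both DE-marking policies reduce to a one-line rule, which the Python sketch below captures side by side. The function and policy names are invented for illustration; only the rules themselves come from the discussion above.

    # Two DE-marking rules: 1 = okay to discard, 0 = please keep.
    def de_bit(frame_is_sna, toward_mainframe, policy):
        if policy == "early-ibm":
            # Direction-based: sacrifice work not yet done, protect work done.
            return 1 if toward_mainframe else 0
        if policy == "modern-router":
            # Protocol-based: sacrifice non-SNA traffic, protect SNA transactions.
            return 0 if frame_is_sna else 1
        raise ValueError(policy)

    print(de_bit(True, True, "early-ibm"))        # 1: inbound work can be re-entered
    print(de_bit(True, False, "modern-router"))   # 0: SNA transaction protected
    print(de_bit(False, False, "modern-router"))  # 1: non-SNA traffic is expendable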
SNA, Private Lines, and X.25

SNA excels at transaction support. Building such support into what were, in the 1970s, error-filled and failure-prone networks running at modest speeds (4800 or 9600 bps on analog links, in most cases) was not easy. So IBM took drastic measures to make sure that SNA networks functioned as intended, since users focused complaints on the very visible equipment and not on the almost transparent network of links connecting the equipment. What this boiled down to was limited support for the many possible ways that SNA components could be networked together, mainly the FEPs and the cluster controllers. Lack of support meant then what it does now: We won't configure it, and if it gets configured, we won't take a trouble report on it. Since most customers were totally dependent on IBM and IBM alone to resolve all but the simplest problems with SNA software and hardware (SNA is not called vendor-specific for nothing), this had the intended effect of IBM's word becoming law. It was not done out of nastiness, but merely to try to control the interactive nature of the SNA network: a very tricky process to begin with.

So IBM would only support SNA networks linking FEPs and cluster controllers in a limited number of ways, which eventually became basically two on the WAN side: leased private lines and X.25 public packet-switched data networks (PSDNs). Of the two, private line support was simpler and preferred. But outside of the United States, especially where the effects of World War II were still being felt, public facilities available for private lines were scarce and expensive. Most European countries embraced the concept of PSDNs based on X.25 for this very reason: Packet switching delivered services to the widest range of users at the minimum cost of facilities. IBM really had no choice but to support X.25 for SNA as well as private lines, although SNA support for X.25 was more complex and required special SNA components. But at least it worked.

Even within the United States, linking each and every cluster controller to a central mainframe with a point-to-point private line (paid for by the mile) could be a pricey proposition. And having a port on the FEP for every cluster controller was costly as well. So it was common practice to configure the WAN links from an FEP serial port as multipoint or multidrop lines. Multidrop lines had the FEP at one end and multiple cluster controllers at the other, forming a type of limited multicast environment. That is, when the FEP (called the primary) sent, all cluster controllers heard the message. But when a cluster controller sent (the secondaries), only the FEP got the message. So multidrop lines are multicast outbound, but unicast inbound. In practice, up to about 32 cluster controllers could share a single port on the FEP; large SNA networks still had multiple FEPs with many ports.

The architecture of a multidrop line in an SNA network is shown in Figure 10.2. Instead of two ports and two lines stretching 100 and 200 miles, only the 200-mile line and the drop count toward the mileage. The drop is typically only a few miles long, and the bridging between the drops is done at the service provider's switching office (this is not shown in the figure).
Figure 10.2 Multidrop lines and SNA.

The use of multidrop lines is virtually unique to SNA networks, and the practice of using multidrop lines is becoming scarce today, even in SNA. There are two main reasons: cost elements and the use of digital lines. In the early days of networking, the voice service providers wanted to encourage the use of public voice facilities for data networking as much as possible. After all, this translated into more revenue and bigger markets. If computer giant IBM had customers that wanted multidrop lines to preserve FEP ports, then the telephone companies would offer them at low prices. Revenues might be lower, but at least they were there. If the choice was between a multidrop line or no networking line sale at all, then multidrop it would be. And without multidrop, the FEP costs skyrocketed.

In an analog environment, using 4800 or 9600 bps modems for SNA links, multidrops are easy to configure and install. Analog signals are easily split and combined using simple metallic bridge clips at the main distribution frame in the service provider's switching node (central office). However, once digital links became more commonly used for SNA networks in the mid-1980s, it was not as easy to configure and install multidrop links. In a digital environment, such as one using 56 kbps or 64 kbps SNA links, signals are not so easily split and combined. The dropping is now done in a digital cross-connect that must be configured by a skilled technician. As a result, many service providers began to subtly (or not so subtly) restructure their private line rates so that there was less and less of a price differential between point-to-point and multidrop configurations between the same sites. In some cases, the difference was exactly zero.

So the major incentive to use multidrop lines in SNA became port reductions on the FEP. But here the revolution in hardware pricing led to less and less expensive gear of all types, FEPs included. It was no longer unthinkable for an FEP to have a single port for each and every cluster controller. In fact, some customers liked the idea, since losing a single port or line to failure would only affect whatever was at the end of the point-to-point line. Instead of losing 128 multidrop terminals, maybe only 32 were lost until repairs were made. However, IBM pricing discouraged too much of this, and the FEP came to be called the "world's most feature-free and expensive router" by the mid-1990s.

What has all of this to do with frame relay? Well, by the early 1990s both multidrop and point-to-point network links had become a major expense in all but the smallest of SNA networks. X.25 packet networks could provide the same connectivity for a fraction of the cost, since any FEP site only needed one link into the PSDN cloud to carry all remote virtual circuits to the cluster controllers. The problem was first in finding PSDN services in the United States, and second in making sure that the delays were low and stable enough for SNA sessions and transactions to complete. The second issue is due to SNA's polled environment. The mainframe (actually, the FEP) controlling the network had to ask each and every cluster controller periodically (once every few seconds) "do you have anything to send?" The FEP basically spent its day doing this.
When anyone at an IBM terminal pressed “enter,” the cluster controller passed the information along to the FEP when polled, then waited for a reply from the mainframe such as “I did it” or “what was that again?” But the cluster controller could not wait for long, not for the interactive transactions that the SNA network was built to support in the first place. Transactions more or less required subsecond response times, and usually got them. If a poll from the FEP was delayed, or a response from the mainframe was delayed on the way back, SNA networks generated session timeouts which essentially rolled the transaction back to square one and made the user start all over. Not only did this annoy the user, it might annoy the company’s customer, who might be on the telephone and now be required to repeat information already given. So there were many issues to consider when migrating SNA to a public packet-switched network. The need to beat the clock of the session timeout was a critical one. SNA had been engineered to run over private lines with low and stable delays in the 10s of milliseconds at worst. The variable and generally higher delays encountered in packet networks would lead to numerous session timeouts if something was not done. If the session timeout value was increased everywhere, the remaining private lines were slowed unnecessarily. And raising the timeout only worked to a point. At higher values, the SNA network’s ability to respond rapidly to genuine faults deteriorated. Any packet network that wished to support SNA traffic adequately had to be a fast packet network like frame relay, not a slow packet network like X.25.
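The timing arithmetic is worth making concrete. The numbers in the short Python sketch below are purely illustrative assumptions (no SNA manual specifies them); the point is that stable private-line delays of tens of milliseconds leave the session timer plenty of room, while the variable delays of a slow packet network can eat the whole budget.

SESSION_TIMEOUT_S = 2.0      # assumed session timer value
HOST_TURNAROUND_S = 0.3      # assumed mainframe/FEP processing time

def transaction_survives(one_way_delay_s: float) -> bool:
    """Poll out, response back, reply out, acknowledgment back:
    roughly four one-way network trips per transaction."""
    return 4 * one_way_delay_s + HOST_TURNAROUND_S < SESSION_TIMEOUT_S

print(transaction_survives(0.020))   # private line, ~20 ms: True
print(transaction_survives(0.600))   # congested X.25 hop, ~600 ms: False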
SNA Concepts and Components
In the simplest SNA hierarchical network, the powerful mainframe sits at the top of a pyramid with a few communications controllers attached. In IBM SNA documents, the communications controller is often abbreviated as COMC, but most people call this device the front-end processor, or FEP. FEPs are attached to the mainframe by high-speed channels, the physical connections being special IBM bus-and-tag cables. Attached to the FEP by a network of one kind or another are remote cluster controllers (CCs), which might number in the hundreds. Technically, the cluster controllers form the endpoints of the SNA network. But attached to the cluster controllers are the 3278-type terminals, usually by IBM coaxial cable, unshielded twisted-pair (UTP) wire, or (more and more often today) some form of LAN. A large SNA network might number more than a thousand terminals, all with users busily entering transactions processed on the mainframe. This is another reason for the persistence of SNA: Try to squeeze 1,000 simultaneous users that need to have their every message tracked for transaction purposes onto the most powerful server. It has been argued that the IBM mainframe is really the most powerful server money can buy today. The contrast between SNA hierarchical networks and peer-oriented, LAN-based, client/server networks is shown in Figure 10.1. The SNA pyramid is flattened to a complex array of interconnected LANs bristling with routers, clients, and servers. The price of distributed computing is complexity. Democracy is always messier than government by the word of the king.
Figure 10.1 SNA and client/server. Although SNA networks typically have a mainframe as their focus, this is not always the case. A newer processor known as the IBM AS/400 is in many respects as powerful as a mainframe in terms of transaction support but shares many of the characteristics of client/server architectures. So AS/400s can be attached to LANs, and AS/400s require no external FEP to drive a network. The special case of AS/400s and frame relay will be considered later in this chapter. Whenever IBM equipment is mentioned in an SNA context, the traditional SNA hierarchical network structure excluding AS/400s is assumed.
Who Cares about SNA Anyway?
So why has SNA persisted and even thrived to some extent in the world of peer networking? The reason is simple enough: transactions. A related concept called sessions comes into play as well, but other protocols handle sessions also. Few handle transactions well at all. IBM defines a transaction as any activity that changes the financial standing of an organization; that is the definition employed here. If there is money involved, it's a transaction. From a network perspective, a request to "Enter your credit card number" is not a transaction. No money has changed hands yet, although it most likely will at some point. On the other hand, "Take $100 from accounts payable and put it in accounts receivable" is most definitely a transaction. Money has moved around, and must be moved right across the network if accounts receivable is in Boston and accounts payable in Dallas. The key difference is that transactions must be idempotent. This Latin-derived term means roughly "the same power": however many times the operation is attempted, the net effect must be the same as performing it exactly once. Transactions must be done once, even if the network fails, and they must be done only once. Otherwise the books will not balance at the end of the day. If the $100 leaves Boston on the network but does not arrive in Dallas before a network crash occurs, then some mechanism must exist to either complete the transaction when the network is restored, or back out the entire transaction and go back to the start. Transactions are atomic actions (atomic is ancient Greek for "un-cuttable") and cannot be half-done. Repeatedly asking users to enter information might annoy the user but will not cause irreparable harm. Repeatedly spewing $100 bills out of an ATM machine will please users, but will annoy the bankers. The mechanism that tracks transaction completion in SNA is called a session. An SNA session will track the history of all the transactions on a network connection. If things go wrong, the record can be replayed forwards or backwards so that at the end of the day, all of the accounts add up to their proper sums, ready for the next day's activities. One of the reasons that Internet protocols struggle with transaction support is that the connectionless nature of IP makes session support at the higher protocol layers that much more difficult (but by no means impossible, only much harder). Now, SNA networks can do more than process many transactions very quickly. SNA networks can still perform bulk file transfers, handle e-mail, and in short do everything expected of a network today. This chapter emphasizes frame relay support for interactive, transaction-based SNA traffic only because this is the type of traffic most critical to users and the most difficult to support adequately. If frame relay can support interactive SNA traffic, then frame relay can support other types of SNA traffic just as well. The popularity of the Internet and its protocols means that many people are not even familiar with the basics of the SNA network components. Since much of this chapter depends on at least a passing familiarity with simple SNA concepts and network components, a few words about SNA itself are in order.
AS/400s and Frame Relay
AS/400s have been called mini-mainframes, and this description is quite good. Almost all that a sizable mainframe can do on a raised floor with water-cooling and an army of personnel to operate it, an AS/400 can do in general office space with a few administrators, though not as much or as fast. But if there are not thousands of remote users at 3278-2 terminals to support, but only several hundred remote users with PCs attached to LANs, AS/400s make a lot of sense. An AS/400 can still drive an SNA network with SDLC polling, although AS/400 terminals belong to a different family than the 3278 family. But the AS/400 is more LAN-friendly than most mainframes. Most AS/400s already sit on LANs. The AS/400s use routers and IP packets for many applications, but many AS/400s still push SDLC polled traffic out of other multidrop ports to the AS-family of cluster controllers. These mini-mainframes can basically fit into one or more of the SNA scenarios with routers previously mentioned, and usually with a lot less work. Why bother to mention AS/400s and frame relay in a special fashion at all? It is simply because of IBM's early deployment of Advanced Peer-to-Peer Networking (APPN) on the AS/400 product line. APPN was an attempt to take advantage of the peer features of LANs for AS/400 networks, rather than relying on older, hierarchical SNA topologies as all of the larger mainframe products did. The details of APPN are unimportant, especially considering the rise of the Internet protocol suite and IP-based LANs and WANs. What is important as far as AS/400s are concerned is that APPN, once considered "the future of SNA," has more or less been shouldered aside by IP as "the future of everything." This is not to suggest that those with AS/400s should banish APPN. It is just that running APPN over IP routers over frame relay strikes many as quite odd and as having little additional benefit to anyone. Of course, it is possible to run APPN over frame relay (but carefully: APPN expects ATM), relying on special FRADs to separate APPN and other LAN traffic, just as before. But this approach does not even acknowledge that the AS/400 probably is linked to a router on a LAN having a frame relay interface already. The root issue is that APPN network nodes, such as an AS/400, want to be their own router. So hooking an AS/400 APPN network node (router) to a special FRAD (router) to access frame relay and keep other traffic apart (traffic that might make its way into the AS/400 on the LAN ports!) seems very strange indeed. So AS/400s more or less must choose between the APPN-ATM approach and the IP router-frame relay approach. Increasingly, AS/400 network administrators are going with IP and frame relay, although there is nothing wrong with APPN and ATM. It just makes no sense at all to attempt to mix them. This chapter has explored some of the ways that SNA can fit into a frame relay network. Specifically, this SNA over frame relay discussion involved taking the content of SNA SDLC frames and transporting them inside frame relay frames. Such content can easily share a PVC with other types of frame relay frame content, such as IP packets, although the wisdom of doing this has been a concern.
This brings up the whole issue of how the receivers of frame relay frames can distinguish between arriving frames containing IP packets, SNA information, and anything else. A router used as a FRAD can understand all of these, of course. The frame relay network never looks at the frame content. But this does not mean that the FRAD, router or not, does not look inside the frame, and indeed it must. Just because the content of the frame relay frame is transparent to the frame relay network does not mean that all frame contents are created equal or are indistinguishable. The time has come to look at frame relay multiprotocol encapsulation.
Chapter 11: Internet Protocol and Frame Relay
Overview
Not so long ago, there were many ISO Network Layer (Layer 3) protocols to choose from when creating a network. These packet layer protocols formed the heart of any network. According to the OSI Reference Model, it was the packet that actually traversed the network from source to destination, while the frame changed hop-by-hop between network nodes (switches or routers). It was the packet that held the globally unique source and destination network addresses for connectionless protocols and call setups, and the packet held the locally significant connection identifier in connection-oriented protocols. It was the packet that actually carried the user's message from one end of the network to the other. No wonder so much interest in networking focused on the network, and the best format and processing rules for a packet. For a while, it seemed as if each vendor of network equipment had its own best packet structure for its equipment, whatever it was. Novell NetWare featured IPX packets, which could actually combine LAN Media Access Control (MAC) layer physical addresses with Network Layer addresses to allow for more efficient determination of which Layer 3 address went with which Layer 2 (MAC) address on a LAN. Banyan Vines had VIP, and DECnet and AppleTalk also had their own vendor-specific packet structures. Even IBM could claim that SNA Path Control Information Units (PIUs) looked pretty much like any other packet, with the notable exception of the network address. In fact, it was the network address structure that effectively made each packet layer protocol distinct. These network software and equipment vendors did not create a raft of protocols just for spite or to make interoperability difficult, however. They were all following the major guideline of protocol development, which stated that each protocol should be tuned and made specifically for the environment it would be run on. Even this practice had its root in the computer operating system world. MVS was made for IBM mainframes, MacOS was for Macintoshes, and even Windows 98 remains firmly tied to the Intel PC architecture. So SNA was for networks with IBM equipment, AppleTalk was for networks of Macs, and NetWare was for DOS-based PCs on a LAN. Who knew better how best to network IBM equipment than IBM? A whole generation of routers went to market with multiprotocol capabilities so that routers receiving frames with IPX packets inside, or VIP, or whatever could distinguish the protocols and use the proper address structure and routing table to look up the destination address in the packet and forward the packet inside another frame. These routers became the engines of enterprise networks, those with multiple sites, multiple platforms (PCs, Macs, mainframes, etc.), and multiple Network Layer protocols. Multiprotocol routers challenged the design and performance capabilities of existing architectures, since the various routing tables—all of them—had to be held in memory all the time. This struck some as rather wasteful. Why fragment the router resources among many protocols, some of which were seldom if ever seen, while the router spent 90 percent of its effort on one main protocol? Why not just decide once and for all on a main router protocol and be done with it? If other protocols needed to be supported, why not stick everything inside one major router protocol and route that? This is, in fact, what happened in the mid-1990s, and frame relay played a role in this development.
Frame relay did not play the major role in this single-protocol router development, however. That role went to the Internet and World Wide Web. For the Internet, and its Web portion, was not a multiprotocol network at all. It was a single protocol network, and the single protocol was the Internet protocol suite (TCP/IP). Want to attach to the Internet and access the Web? Do you run TCP/IP? If the answer was "yes," then you could. Internet routers did not have to be multiprotocol routers, just IP routers, since IP was the Network Layer protocol of the Internet. Once Internet and Web access became essential for everyone everywhere, IP routers became the router of choice for everything. IP routers could still handle other protocols, but only by sticking them inside IP packets and routing the IP packets. The destination, of course, had to know what the contents of the arriving IP packet were, but the routers in between did not. Networks based on IP routers were now intranets instead of enterprise networks, with multiple sites and multiple platforms, but only a single protocol that mattered: IP. Frame relay helped this process along, however. The frame relay switches that linked IP routers never looked inside the frame relay frame. So IP or not, it did not matter to the frame relay network. And frame relay virtual circuits passed all frames between all sites over a single physical link, the UNI. The network way to say this is that frame relay multiplexes at the frame level, not the packet level as did X.25 and other packet protocols. X.25 packets could contain other higher-layer protocols, but X.25 switches only switched X.25 packets. Frame relay switches relayed frames, so the multiplexing of many types of packets was easily accomplished by the FRAD or router. Taken together, the rise of the Internet, Web, and frame relay helped to move the router industry along to a point where multiprotocol routers still exist, but are being slowly supplanted by IP routers. This chapter explores the relationship between IP routers and frame relay networks in more detail, and also considers the various ways that a router or FRAD can figure out if an arriving frame relay frame has IP inside (more and more likely) or something else (still possible, of course).
Address Resolution Protocols and RFC 2427
All hosts and routers running the Internet protocol suite (TCP/IP) spend a lot of time ARPing. The Address Resolution Protocol (ARP) is used when a device running IP knows the destination's IP address, but does not know the physical or MAC layer address associated with the destination IP address. This happens because IP addresses are in the packets, but the MAC layer addresses are on the frames, and the frames are the first order bit structures. Consider sending a simple e-mail message to a co-worker at corpx.com. The e-mail "to" field has
[email protected] and that's all. What's the IP address of the mail server that goes in the packet? A set of servers called the Domain Name System (DNS) provides that information. So that is the destination IP address. But it's the frame that is sent on the LAN. What is the MAC address on the NIC that is installed in the target mail server? ARP allows devices attached to the LAN to send out a broadcast MAC frame that basically says, "Who has the IP address in this ARP packet?" Since it's a broadcast, every device on the LAN gets it. If the IP device is turned on (the mail server should be!), then the device responds with an ARP message of its own to the requesting device saying, "I have that IP address and my MAC address is in the source field of this frame." So the sender can cache the MAC address or just ARP again when the need arises. There is also a Reverse ARP (RARP) for associating MAC addresses with IP addresses when the device has no convenient place (like a hard drive) to store that information when it is powered down. The problem is that ARP and RARP only work on LANs, not on WANs like frame relay. How can a router FRAD determine the IP address of a remote device when the router knows only that the router port is connected to DLCI 18? RFC 2390 addresses this problem. The addition of ARP support to multiprotocol encapsulation was one of the major changes between RFC 1294 and RFC 1490. This information is now in RFC 2390, which describes frame relay ARPs in full. The problem with ARPing in virtual circuit networks like frame relay is the inability to easily broadcast frames containing the ARP request ("who has...?") to every endpoint on the network. For many of these network architectures, multicast capabilities have been defined but are seldom implemented. For IP purposes, frame relay and ATM fall into the category of Non-Broadcast Multiple Access (NBMA) networks. This means that there are point-to-point virtual circuits (nonbroadcast) and all the virtual circuits are supported on the same physical link or UNI (multiple access). Given the importance of ARP, this section will close with a consideration of the standard ARP method used in frame relay, Inverse ARP or InARP.
Inverse ARP (InARP)
IP routers, the most common FRADs, reach all remote routers on the same physical port, the frame relay UNI. The DLCIs representing the PVCs are different for each remote router, of course, but this is of no help to the router when it comes time to route an IP packet. IP routing tables are all based on the next-hop router, as determined by the destination IP address, not the frame-level address of the next-hop router or destination. The IP packet has been removed from its frame for this routing table lookup, but the packet must be placed in a frame relay frame with the proper DLCI to reach the next-hop router.
Now, the DLCIs defined on a UNI are local to the router and are known since they have to be configured there. But some method is needed to associate the IP address at the other end of the DLCI with the local DLCI for IP routing purposes. Obviously, some form of ARP must be used to determine the IP address at the other end of the DLCI. Since this is a lower-layer to IP determination, rather than an IP to lower-layer determination, the protocol needed is a form of RARP for frame relay and not really ARP. In frame relay, the procedure used to determine the IP address at the end of a DLCI is called Inverse ARP or InARP. InARP was expressly invented for frame relay use and was originally defined in RFC 1293. When RFC 1293 and RFC 1294 were more or less merged and improved to produce RFC 1490, InARP came right along. Now, InARP has its own RFC again: RFC 2390. It is important to realize exactly what InARP is doing. A routing table in an IP router usually has only one entry per serial port in a point-to-point leased-line world. That is, only one next-hop router is at the end of the serial port. But frame relay allows the router to reach any other router on the frame relay network through the same serial port. The issue is not which serial port, but which DLCI is associated with the next-hop router. Once understood in this light, there is really not much to InARP itself. Once an IP router becomes aware of a new PVC and its DLCI (which might be the initial or only DLCI on the UNI), an IP router compliant with RFC 2390 will send an InARP message on the DLCI. Usually, the link management procedures will provide the trigger for the InARP process. If there is more than one DLCI, an InARP message is sent on each one. The InARP message asks, "What's the IP address associated with this DLCI?" If the IP router on the other end of the DLCI is there, the router replies with the information. The sending router uses the information to properly populate its routing table entries. The receiving router also fills in its own table with the sending router's information provided in the arriving InARP message, such as the sending IP address and the DLCI leading there (why repeat the process needlessly?). All other ARP rules apply (what to do if no reply is received, timeouts, etc.). A typical InARP exchange is shown in Figure 11.11.
Figure 11.11 An InARP Exchange. It should be noted that InARP requires the use of multiprotocol encapsulation. Otherwise InARP requests could not be separated from other IP packets generated by users. This is why InARP appeared in RFC 1490 in the first place. InARP uses SNAP encapsulation, so the NLPID is 0x80. The OUI for InARP is 0x00-00-00 (the Ether-type OUI) and the PID for InARP is 0x08-06, the same as for ARP itself. There is one quirk in the RFC 2390 documentation regarding InARP that should be mentioned. The actual routing table entries used by IP routers for frame relay are not the DLCIs, but the Q.922 addresses of the remote routers. The Q.922 address is basically the 10-bit DLCI split into its 6 high-order bits and 4 low-order bits with the extended address (EA) bits set properly and all other bits in the 2-octet frame relay header (FECN, BECN, etc.) set to 0 bits. So the router has an easier time constructing the frame relay header from the Q.922 address than from the DLCI itself. But it can be jarring to see a nice decimal DLCI like 50 turn up as something like 0x0C-21 when expressed as a Q.922 address. Table 11.5 shows the rules for converting a DLCI (50) to a Q.922 address (0x0C-21).

Table 11.5 From DLCI to Q.922 Address

VALUE AT START OF STEP   STEP                                           RESULTING BITS
50                       Express as 10 binary bits                      00 0011 0010
00 0011 0010             Split into 6/4 bit pattern                     000011 0010
000011 0010              Add C/R (0) and EA (0) to first 6 bits         00001100 0010
00001100 0010            Add FECN/BECN/DE (0s), EA (1) to last 4 bits   00001100 00100001
00001100 00100001        Express as two hex bytes                       0x0C-21
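The conversion in Table 11.5 is mechanical enough to capture in a few lines of code. The Python sketch below (the function name is ours) follows the table exactly, with C/R, FECN, BECN, and DE all left at 0:

def dlci_to_q922(dlci: int) -> bytes:
    """Build the 2-octet Q.922 address for a 10-bit DLCI."""
    if not 0 <= dlci <= 1023:
        raise ValueError("DLCI must fit in 10 bits")
    high6 = (dlci >> 4) & 0x3F        # 6 high-order DLCI bits
    low4 = dlci & 0x0F                # 4 low-order DLCI bits
    octet1 = high6 << 2               # append C/R = 0 and EA = 0
    octet2 = (low4 << 4) | 0x01       # append FECN/BECN/DE = 0 and EA = 1
    return bytes([octet1, octet2])

assert dlci_to_q922(50) == b"\x0c\x21"   # DLCI 50 becomes 0x0C-21, as in Table 11.5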
Note that InARP only provides IP address and DLCI correspondences. What about multiprotocol routers that also must build and maintain routing tables for routed protocols such as IPX? RFC 2390 is silent about this issue. Sometimes table entries for other protocols must be built by hand. Some router vendors support the use of InARP for other protocols in a variety of ways.
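To make the InARP exchange concrete, here is a minimal Python simulation of the table-filling behavior just described. The Router class, its direct method calls, and the addresses are hypothetical stand-ins for a real protocol stack; the point is that a single request fills in the mapping tables at both ends of the PVC.

class Router:
    def __init__(self, name: str, ip: str):
        self.name, self.ip = name, ip
        self.ip_by_dlci = {}                  # local DLCI -> IP at the far end

    def send_inarp_request(self, dlci: int, peer: "Router", peer_dlci: int):
        # "What's the IP address associated with this DLCI?"
        self.ip_by_dlci[dlci] = peer.receive_inarp(peer_dlci, self.ip)

    def receive_inarp(self, dlci: int, sender_ip: str) -> str:
        # Fill in our own table from the arriving request (why repeat the process?)
        self.ip_by_dlci[dlci] = sender_ip
        return self.ip

a = Router("A", "192.168.1.1")
b = Router("B", "192.168.2.1")
a.send_inarp_request(dlci=18, peer=b, peer_dlci=21)   # one PVC, two local DLCIs
print(a.ip_by_dlci)   # {18: '192.168.2.1'}
print(b.ip_by_dlci)   # {21: '192.168.1.1'}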
Ether-type Encapsulation
Ether-type encapsulation deserves a section all its own, although much of the discussion is related to SNAP encapsulation techniques. This is because early frame relay equipment vendors, operating in an environment without the guidance of RFC 1490 or FRF.3.1, were free to use whatever multiprotocol encapsulation techniques they desired, as long as both FRADs understood what was going on. What the more aggressive frame relay equipment vendors came up with was multiprotocol encapsulation support using Ether-types. This technique works fine, but is more limited in supporting the many traffic types of RFC 2427 and FRF.3.1. But if these limitations are acceptable, then Ether-type encapsulation can be, and is, still used. Of course, frame relay devices that use straight Ether-type encapsulation are not compliant with RFC 2427 or FRF.3.1. Even purely vendor-specific (proprietary) encapsulation techniques are sometimes encountered in frame relay networks. Multiprotocol encapsulation was not a problem unique to frame relay. Consider two routers linked by a point-to-point leased line running a Layer 2 protocol such as ISO's HDLC (High-Level Data Link Control) or even the IETF's PPP (Point-to-Point Protocol). If the routers can route more than IP, or are bridging between the two ports, more than one protocol will appear inside the frame. All of the protocols must be identified, exactly as in the case of frame relay. The solution was to insert an Ether-type header between the WAN frame header and the LAN data inside the frame. They were called Ether-types because most routers even then connected Ether-type LANs, either IEEE 802.3 or Ethernet itself. Cisco calls this technique serial line encapsulation. Ether-types are administered by the IETF and listed in the Assigned Numbers RFC. They were originally used in the real Ethernet defined by DEC, Intel, and Xerox to specify the content of the Ethernet frame. When the IEEE standardized Ethernet as IEEE 802.3, this simple scheme was dropped in favor of the OUI/PID structure using SNAP. Some common Ether-types are listed in Table 11.4.

Table 11.4 Common Ether-types

CODING    MEANING
0x08-00   IP packet inside
0x08-06   ARP inside
0x80-9B   AppleTalk packet inside
0x80-F3   AppleTalk ARP inside
0x81-37   Novell IPX packet inside
0x81-38   Novell SAP packet inside
0x81-4C   SNMP inside
Most of the values are self-explanatory. SNMP is the Simple Network Management Protocol used in most network management software today, including that for frame relay itself. SAP is the Novell Service Advertising Protocol, a way that NetWare servers make their presence and network addresses (and MAC addresses as well) known to clients. ARP is the Address Resolution Protocol, a common way for Layer 3 protocols to associate network addresses at Layer 3 with LAN MAC addresses when SAP is not used. Note that Ether-types have a simple, 2-octet structure. When used for frame relay multiprotocol encapsulation, all that needs to be done is to insert the correct Ether-type after the frame relay header but before the data itself. This is shown in Figure 11.10 for an IPX packet, again on DLCI = 18 with no FECN, BECN, or DE bits set to a 1 bit.
Figure 11.10 Ether-type encapsulation. There is some potential for confusion between SNAP and Ether-type encapsulation. For some protocols, the Ether-type value is also used as the PID in the SNAP encapsulation method. Note that the Ether-type value (0x81-37) used for the IPX packet in the figure is the same as the PID (0x81-37) used in the SNAP encapsulation technique. The key difference is that Ether-type encapsulation uses only the 2-octet Ether-type label, while the SNAP method uses the NLPID and OUI (0x00-00-00 for IPX) as well as the Ether-type PID, all in a standard, 8-octet package. Ether-type use does not provide the same level of interoperability as the RFC 2427 and FRF.3.1 standard encapsulation techniques. But Ether-type encapsulation works well enough that it is still around. It is easy to understand and implement. However, Ether-type encapsulation is limited to defined Ether-types. RFC 2427 and FRF.3.1 are the preferred methods of multiprotocol encapsulation in frame relay today.
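For comparison with the standard methods, here is a rough Python sketch of Ether-type encapsulation for the Figure 11.10 case. The helper name is ours; the 2-octet header bytes encode DLCI 18 with all other header bits at 0, and the FCS is omitted.

FR_HEADER_DLCI_18 = bytes([0x04, 0x21])    # DLCI 18, FECN/BECN/DE = 0, EA bits set

def ethertype_encapsulate(ether_type: int, payload: bytes) -> bytes:
    # Just the 2-octet Ether-type between the frame relay header and the data
    return FR_HEADER_DLCI_18 + ether_type.to_bytes(2, "big") + payload

ipx_frame = ethertype_encapsulate(0x8137, b"...IPX packet...")
assert ipx_frame[2:4] == b"\x81\x37"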
RFC 2427 and RFC 1490
It is hard to overestimate the importance of RFC 1490 in IP networking in general and frame relay in particular. RFC 1490 has been around since mid-1993 and has become quite familiar to all involved in frame relay and all other types of multiprotocol encapsulation. This being the case, it might be a good idea to outline the changes that RFC 2427 makes to RFC 1490 before exploring the newer RFC 2427 in more detail. A summary of the changes RFC 2427 makes to RFC 1490 appears in Appendix C of RFC 2427. The good news is that RFC 2427 does not change any of the multiprotocol header formats, so those familiar with RFC 1490 formats will feel right at home with RFC 2427. However, some of the more awkward language of RFC 1490 has been changed, and some of the functions described in RFC 1490 that did not relate directly to multiprotocol encapsulation (like fragmentation) have been removed. The major changes that RFC 2427 makes to RFC 1490 are:

- Stations no longer have to accept SNAP encapsulation when an NLPID is available.
- Fragmentation for frame relay networks was removed, and replaced by a reference to FRF.12, which is the current Frame Relay Fragmentation Implementation Agreement.
- RFC 1490 implied that frame relay address resolution could be used for both PVCs and SVCs, but the method was lacking for SVCs. RFC 2427 explicitly acknowledges this situation.
- Encapsulation was added for Source Routing Bridged PDUs, such as found in token ring. The NLPID and PID lists in RFC 2427 Appendix A were updated to reflect this.
- Language concerning bridging encapsulation use of canonical and non-canonical MAC destination addresses was clarified.
- Inverse ARP now has its own RFC (RFC 2390) and was removed from RFC 2427.
- A security section has been added.
RFC 2427 and FRF.3.1 in Action
So far the discussion of frame relay multiprotocol encapsulation has been rather abstract. The time has come to look at the techniques in action and detail just how data traffic is carried inside frame relay frames. RFC 2427 will be examined primarily, but the code points established in FRF.3.1 will also be included. RFC 2427 specifies that all frame relay frames conforming to its multiprotocol encapsulation methods have the format defined by Q.922 Annex A. This annex says that the frame relay address (or header) octets are to be followed by a control octet with the value 0x03, which is the RFC 2427 way of saying "the value 3 in hexadecimal notation." In bits, this is 0000 0011. In terms of Q.922, this control octet value indicates that this is an Unnumbered Information (UI) frame relay frame. A UI frame is not protected by sequence numbering for network-level error recovery, exactly as expected in fast packet relaying networks like frame relay. The UI octet was added to RFC 2427 strictly for Q.922 compatibility. The control octet is not required by the standard frame relay service definition and can actually take on values other than 0x03, although this value is by far the most common. Following the UI octet is the NLPID octet itself. The 8 bits of the NLPID can be used to indicate any one of 256 network layer protocols, but never any more, obviously. The frame relay frame data immediately follows the NLPID field, assuming the type of information being carried inside the frame relay frame has an NLPID coding. This NLPID method also allows for an optional, 1-octet pad field to be placed between the UI octet and the NLPID octet. This pad octet is used in the SNAP method and is always coded as 0x00, or all zero bits. There can be no pad octets, or one, or even several pad octets present. The purpose of the pad octets is to align the content of the frame relay frames on a convenient boundary for more efficient processing. In particular, IP packets are organized into 4-octet or 32-bit units (called words in IP), not single octets. Frame relay frames using the NLPID encapsulation method with and without pad octets are shown in Figure 11.4.
Figure 11.4 Encapsulation with the NLPID. But what if the NLPID is coded as all zeros? Then the NLPID will look just like a pad to the receiver. To prevent this from happening, an NLPID value of 0x00 is invalid as far as frame relay encapsulation is concerned. So receivers on frame relay networks can discard 0x00 octets following the control (usually UI) octet until a non-zero octet is reached. This must be the NLPID. Officially, however, an NLPID value of 0x00 is known as the Null Network Layer or Inactive Set in ISO documents. Some of the common NLPID values are listed in Table 11.1.

Table 11.1 Some Common NLPID Values

CODING   MEANING
0x00     Null Network Layer or Inactive Set (unused in frame relay)
0x08     Q.933 (signaling) inside
0x80     SNAP used for encapsulation
0x81     ISO CLNP inside
0x82     ISO ES-IS routing protocol inside
0x83     ISO IS-IS routing protocol inside
0x8E     IPv6 inside
0xB0     Data inside is compressed according to FRF.9 IA
0xB1     Data inside is fragmented according to FRF.12 IA
0xCC     IPv4 inside
0xCE     Ether-type inside (use prohibited in frame relay by FRF.3.1)
0xCF     PPP inside (RFC 1973)
Ether-type protocols (0xCE), commonly used on Ethernet-type LANs such as 802.3 and real Ethernet, are to be carried by SNAP encapsulation according to FRF.3.1, so this NLPID should not be used with frame relay. Only protocols with direct NLPID values are to be used; the 0xCE value is included here for information only. Of all the values, probably the 0xCC (1100 1100) coding is the most important in frame relay. This is how IP packets (technically, IPv4 packets) are supposed to be carried inside frame relay frames. The newer IPv6 is not discussed further. An IPv4 packet inside a frame relay frame using NLPID encapsulation is shown in Figure 11.5. The DLCI value is 18, and all other frame relay header bits (FECN, BECN, and DE) have been set to a 0 bit for clarity.
Figure 11.5 An IP packet transported with RFC 2427 NLPID encapsulation. While most routers support RFC 2427, it is not always a given that when routers from the same vendor are linked over frame relay, RFC 2427 is universally used for multiprotocol encapsulation. Routers can be configured to use other encapsulation techniques besides RFC 2427, from purely vendor-specific methods to methods developed before RFC 2427 (and RFC 1490) and FRF.3.1 came along. More details about these other techniques will be discussed at the end of this chapter.
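A minimal sketch of the NLPID method for the Figure 11.5 case might look like the following Python. The helper name is ours; the first two bytes encode DLCI 18 with all other header bits at 0, and the FCS is left to the link hardware.

FR_HEADER_DLCI_18 = bytes([0x04, 0x21])   # DLCI 18, FECN/BECN/DE = 0, EA bits set
UI = 0x03                                 # Q.922 Unnumbered Information control octet
NLPID_IPV4 = 0xCC                         # "IPv4 inside"

def nlpid_encapsulate(ip_packet: bytes) -> bytes:
    return FR_HEADER_DLCI_18 + bytes([UI, NLPID_IPV4]) + ip_packet

frame = nlpid_encapsulate(b"...IPv4 packet...")
assert frame[2] == 0x03 and frame[3] == 0xCC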
RFC 1490, FRF.12 Fragmentation, and Frame Sizes
RFC 1490 contained more than coding tables to be used when encapsulating data units inside of frame relay frames. RFC 1490 also had information on frame relay frame fragmentation. Now, frame fragmentation sounds like an oxymoron, terms that just do not belong together, like jumbo shrimp or working vacation. By definition, frames are first order data units. This means that frames are self-contained data units. Every frame is processed independently from every other frame upon arrival at a destination. That's what makes it a frame. There is no subunit of a frame. So how could a frame possibly be marked as "piece 1 of 3" or "piece 2 of 3" or whatever? A destination could then buffer the arriving frame pieces until the whole frame was present. But obviously a frame relay frame, like every other frame, has no place to put such an indicator in the frame header anyway.
Packets, on the other hand, can be fragmented all over the place, within certain guidelines. It is the packet inside the frame that has the "piece 1 of 3" (usually just one bit to signal "more coming" and "done") indication. Receivers can then buffer not the frames, which are all processed and handed off to the correct higher-layer protocol, but the packet fragments inside the frames. Only when a packet is complete will the receiver process the correctly (it is hoped) reconstructed packet. That's how it works. So what's the problem with frame relay? Well, the packet is not the end-to-end data unit in a frame relay network passing unchanged from source to destination. The frame relay frame is the end-to-end data unit in a frame relay network passing unchanged from source to destination. So there is no way to easily change the size of a frame relay frame as it makes its way through a frame relay network or a series of frame relay networks connected by NNIs. The maximum frame relay frame size is typically set at 4096 octets. But this does not mean that most frame relay frames are that large. Since most FRADs are routers, and most routers connect Ethernet (or rather Ether-type) LANs, it makes sense to optimize frame relay frame sizes for Ethernet LANs. This is commonly done; the most common default frame relay frame size is not 4096 octets, but about 1500 octets, the same size as the maximum Ethernet LAN MAC frame. In real frame relay networks, the maximum message size inside the frame relay frame, excluding the header and FCS trailer, is configurable from 16 to 4472 octets. The most common default value for the maximum message size is 1600 octets. This nicely accommodates the 1518-octet maximum Ethernet frame size. (As an aside, the 1600-octet maximum limits the number of PVCs reporting Full Status on a UNI to 317, not the 892 PVCs supported with a 4472-octet message size.) But what would happen if a frame relay network with a 4096-octet maximum message size attempted to send a 4000-octet frame relay frame across an NNI to a frame relay network with a 1600-octet maximum message size? It is readily apparent that the 4000-octet frame relay frame could not be sent onto the second network. The frame relay switch at the destination end of the NNI would immediately discard the arriving frame as being too large for the network (correctly so). There are only two ways around the problem. First, a method for determining the minimum maximum message size across the series of NNI-connected networks could be devised. This is what IP networks connected by a series of routers do. But frame relay is intended to be a fast relay network (and the default minimum maximum message size in frame relay is only 262 octets). So a second method is needed: frame fragmentation. RFC 1490 included guidelines for fragmenting frame relay contents, essentially as a series of subframes similar to the approach taken in FRF.11 for voice over frame relay. In any case, RFC 1490 fragmentation is rarely used or even supported today. Frame fragmentation in frame relay is now detailed in Frame Relay Forum FRF.12 on frame fragmentation. FRF.12 is the recommended way to fragment frame relay frame contents regardless of encapsulation method. FRF.12 is a somewhat improved and simplified version of RFC 1490 fragmentation, and RFC 2427 contains only a reference to FRF.12. Oddly, most discussions of frame relay fragmentation today do not revolve around maximum frame sizes and the NNI.
Most of the talk is about the use of frame relay fragmentation in conjunction with FRF.11 for voice over frame relay (VoFR). This is due to the fact that serial delays for voice frames queued behind large data frames on a UNI can be very bad for the voice quality. But if FRF.12 fragmentation is used, then the data frame can be broken up into smaller chunks and the delays experienced by the VoFR frames will be correspondingly less. Note that this FRF.12 support must be configured at both ends of the frame relay network.
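The serialization arithmetic behind this is simple. Assuming an illustrative 64 kbps UNI and an 80-octet fragment size (both assumptions for the example, not FRF.12 requirements), the Python sketch below shows how much fragmentation shrinks the worst-case wait for a queued voice frame.

PORT_BPS = 64_000   # assumed UNI speed

def serialization_delay_ms(octets: int) -> float:
    return octets * 8 * 1000 / PORT_BPS

print(serialization_delay_ms(1600))   # 200.0 ms stuck behind a full data frame
print(serialization_delay_ms(80))     # 10.0 ms worst case with fragmentation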
Subnetwork Access Protocol Encapsulation
As has been pointed out already, not all protocols have NLPID code points. These protocols include most LAN-bridged traffic types, such as Ethernet frames and token ring frames. In order to carry these bridged traffic types inside frame relay frames, SNAP encapsulation is used. SNAP encapsulation is only to be used when there is no direct NLPID method for encapsulation.
SNAP does have an NLPID code point. The NLPID value of 0x80 (1000 0000) is used when SNAP encapsulation is used inside the frame relay frame. In fact, many frame relay networks almost exclusively pass traffic back and forth all day using only the NLPID values of 0xCC for "IPv4 packet inside" and 0x80 for "LAN frame inside." The SNAP encapsulation header is quite large compared to the 2 octets (or 3 or more with the optional pad octets) of the NLPID format: it is 8 octets long in all, since the SNAP portion itself adds 5 octets. The 5 octets of the SNAP header are divided into two fields. The Organizationally Unique Identifier (OUI) is 3 octets long and the following Protocol Identifier (PID) field is 2 octets long. The structure of the SNAP encapsulation header, which employs one pad octet to indicate its use, appears in Figure 11.6.
Figure 11.6 SNAP encapsulation. The OUI values are determined and administered by the IEEE. Initially, the OUIs were assigned by the IEEE for use in the MAC address when a manufacturer or vendor of LAN hardware or software applied for an OUI. In LANs, the OUIs allow for NIC vendors to all manufacture a card with serial number "00 00 01" (for example) right on the board and not worry about other "00 00 01" cards from other vendors, because all are preceded by the 3-octet OUI. So NICs with "00 00 1C 00 00 01" from Vendor A and "00 00 F4 00 00 01" from Vendor B are unique at the OUI level. But vendors more or less resisted using OUIs for identifying protocol stacks and other software-based products. And in truth, it made little difference who made the software, as long as it was standard and worked. So some OUIs used in the SNAP header are organized by protocol. For example, OUI 0x00-00-00 is used to indicate that the 2-octet PID following identifies some type of Ethernet (Ether-type) protocol. Other OUI values can be used to specify a PID unique to that particular vendor, of course. Now there could be a problem. The PID octets that follow the OUI can be used to indicate the presence of routed traffic such as "IP packet inside." This is why FRF.3.1 establishes a strict hierarchy for multiprotocol encapsulation methods. IP packets always should use the NLPID method and not the SNAP method, just to make life easier for all. When used for transporting bridged traffic, the SNAP header uses the OUI value of 0x00-80-C2. This value is registered to the IEEE 802.1 LAN committee and is used for all LAN frame types inside frame relay. A list of all OUIs is contained in an RFC itself, which changes periodically to incorporate changes. This Assigned Numbers RFC typically has a nice round number such as RFC 2100 or RFC 2200 so it is easier to anticipate and find when the current version is needed. Immediately following the OUI and PID octets is the information field, which should be a LAN frame in frame relay encapsulation. The whole is followed by the 16-bit (2-octet) frame relay Frame Check Sequence (FCS). Now, LAN frames also contain their own FCS, usually 32 bits (4 octets) long. LANs typically have lower bit error rates than WANs, but the higher bit rates at which LANs operate make the use of the bigger FCS a good idea. But including both FCSs in frame relay might not be such a good idea. The presence of double FCSs increases receiver processing time to accomplish the same goal. And the different FCS sizes make it possible that a frame relay frame with undetected errors at the FCS 16-bit level might actually contain bit errors that are only detectable at the FCS 32-bit level. So the PID values used with OUI 0x00-80-C2 include an option to preserve the MAC frame FCS or just rely on the frame relay frame's 16-bit FCS for error detection. Given the lower speed of frame relay UNIs compared to LANs, and the elimination of FCS conflicts, the option to strip off the MAC frame FCS is not much of a risk. However, the MAC frame FCS must be
stripped off at the source and added back on at the destination, so this solution is by no means more efficient. The valid PIDs for OUI 0x00-80-C2 used for bridged traffic are established and administered by the IEEE 802.1 LAN committee, as expected. These are relatively few in number and are listed in Table 11.2.

Table 11.2 PID Values for OUI 0x00-80-C2

PRESERVING MAC FCS   NOT PRESERVING MAC FCS   MAC FRAME TYPE
0x00-01              0x00-07                  802.3 ("Ethernet")
0x00-02              0x00-08                  802.4 (Token bus)
0x00-03              0x00-09                  802.5 (Token ring)
0x00-04              0x00-0A                  FDDI
(none)               0x00-0B                  802.6 (DQDB)
(none)               0x00-0D                  Frame relay fragments
(none)               0x00-0E                  Other bridged PDUs (802.1[d] or 802.1[g])
(none)               0x00-0F                  Source Routing Bridged PDUs
Although listed in the "not preserving MAC FCS" column, frames using distributed queue dual bus (DQDB) 802.6 metropolitan area network (MAN) frame structures must preserve the FCS because an indication that the FCS is present is located in the DQDB header. Note that the use of the PIDs does not imply that the router FRAD is actually bridging in the usual sense over the frame relay network, only that the bridged traffic encapsulation method is used. This is a subtle but sometimes important distinction. In many LANs today, bridged traffic types support virtual LANs (VLANs) over a frame relay network. The numbering scheme does not leave much room for growth, but just covering 802.3 (Ethernet) and 802.5 (token ring) covers about 90 percent of all the LANs that have ever been built.
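The double-FCS point can be illustrated directly. The sketch below computes both checks over the same bridged frame, assuming the usual X.25/HDLC-style 16-bit FCS for frame relay and the standard 32-bit Ethernet CRC; the payload is just a placeholder.

import binascii

def fcs16(data: bytes) -> int:
    """16-bit X.25/HDLC-style FCS (the kind assumed here for frame relay)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF

mac_frame = b"...bridged 802.3 frame..."
print(hex(fcs16(mac_frame)))             # the 16-bit frame relay FCS
print(hex(binascii.crc32(mac_frame)))    # the 32-bit MAC FCS it may duplicate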
Subnetwork Access Protocol Use
SNAP encapsulation in frame relay is used for two main types of traffic. First and foremost, the SNAP technique is used to transport bridged Ethernet frames across a frame relay network, even though the FRADs might be routers. Second, the SNAP technique is used to transport packets that do not have an NLPID code point, such as vendor-specific, but very common, packet types. Most of the packet types of interest are Ether-type protocols and as such have been given PIDs under the 0x00-00-00 Ether-type OUI. For example, Novell's IPX packet structure is an Ether-type and uses PID 0x81-37. Both of these examples using DLCI = 18 are shown in Figure 11.7.
Figure 11.7 SNAP encapsulation for 802.3/Ethernet frames and IPX packets.
As shown in the figure, a SNAP encapsulation header used for transporting an Ethernet frame with the 32-bit MAC FCS intact would have the control field set to UI (0x03), a pad octet (0x00), an NLPID of 0x80 (SNAP inside), an OUI of 0x00-80-C2 (IEEE 802.1), and a PID of 0x00-01 (Ethernet MAC with FCS) following the frame relay header. Following the MAC frame itself, the frame relay frame ends with the 16-bit FCS. A SNAP encapsulation header used for transporting an IPX packet would have the control field set to UI (0x03), a pad octet (0x00), an NLPID of 0x80 (SNAP inside), an OUI of 0x00-00-00 (Ether-type), and a PID of 0x81-37 (IPX packet inside) following the frame relay header. Following the IPX packet itself, the frame relay frame ends with the 16-bit FCS.
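Both SNAP cases can be sketched in a few lines of Python. The helper and constant names are ours; the octet values come straight from the text, and the FCS is again omitted.

FR_HEADER_DLCI_18 = bytes([0x04, 0x21])   # DLCI 18, FECN/BECN/DE = 0, EA bits set
UI, PAD, NLPID_SNAP = 0x03, 0x00, 0x80

def snap_encapsulate(oui: bytes, pid: int, payload: bytes) -> bytes:
    return (FR_HEADER_DLCI_18 + bytes([UI, PAD, NLPID_SNAP])
            + oui + pid.to_bytes(2, "big") + payload)

OUI_8021 = bytes([0x00, 0x80, 0xC2])        # IEEE 802.1: bridged traffic
OUI_ETHERTYPE = bytes([0x00, 0x00, 0x00])   # Ether-type protocols

bridged = snap_encapsulate(OUI_8021, 0x0001, b"...802.3 frame with FCS...")
routed_ipx = snap_encapsulate(OUI_ETHERTYPE, 0x8137, b"...IPX packet...")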
Q.933 Encapsulation
Between NLPID encapsulation and SNAP encapsulation, frame relay frames can transport IP packets, bridged LAN frames, and even vendor-specific packet formats such as IPX. But there are still Layer 2 (frame) and Layer 3 (packet) protocols that exist and are not covered by one of these two methods. So FRF.3.1 has established a third method of encapsulation to be used as a type of final resort when all else fails. This is Q.933 encapsulation. So, for all protocols lacking an NLPID or SNAP encapsulation, whether bridged (OUI 0x00-80-C2) or Ether-type (OUI 0x00-00-00), the NLPID code point of 0x08 for Q.933 encapsulation should be used. This NLPID should never be confused with the value of 0x80 used for SNAP inside, but of course it is. So, it's 0x08 for Q.933 and 0x80 for SNAP, and that is that. This third and last major encapsulation technique for frame relay has a 6-octet structure. The first 2 octets are the control (UI) and NLPID octets, as expected. Then there follow 2 octets for Layer 2 (L2) protocol identification and 2 octets for Layer 3 (L3) protocol identification. The whole Q.933 encapsulation structure is shown in Figure 11.8.
Figure 11.8 Q.933 encapsulation. The Q.933 format is mainly used for IBM protocols such as SNA and APPN, but there are other uses as well. The Q.933 method also has an escape coding to allow for any vendor-specific or user-defined protocol to be carried inside frame relay frames. Use NLPID or SNAP if it fits, Q.933 for SNA and literally everything else. Since SNA and related protocols are not open standards, they have no NLPID or SNAP values in the OUI 0x00-80-C2 universe. They are not Ether-types; therefore, they have no code points in the OUI 0x00-00-00 world either. This is because IBM generally used token ring for everything, not Ethernet. But Q.933 is a good match because the OUI is not used at all in Q.933 encapsulation. Most of FRF.3.1 addresses IBM protocol encapsulation using Q.933. IBM has reserved several values for the L2 and L3 fields of the Q.933 encapsulation method. These values are shown in Table 11.3.

Table 11.3 IBM SNA and NetBIOS Values

LAYER 2   LAYER 3   IBM PROTOCOL
0x4C-80   0x70-81   SNA traffic between FEPs
0x4C-80   0x70-82   SNA traffic from FEP to cluster controller
0x4C-80   0x70-83   APPN traffic
0x4C-80   0x70-84   NetBIOS traffic
0x4C-80   0x70-85   HPR traffic with 802.2 LLC headers
0x50-81   0x70-85   HPR traffic without 802.2 LLC headers
All of the details about when one value or another is used are far beyond the scope of this book. But 0x4C-80 is used when the Layer 2 protocol is IEEE 802.2 Logical Link Control (LLC) frames, which can be the contents of the LAN MAC frames. A value of 0x50-81 indicates the absence of the Layer 2 protocol altogether. When Layer 2 is 0x4C-80 (LLC used), then the value of the first 8 bits (first octet) of the Layer 3 field is 0x70. The 0x70 value means that the protocol is user specified, the user being IBM in all these cases. SNA was discussed in the previous chapter. APPN is Advanced Peer-to-Peer Networking, a sort of router-based version of hierarchical SNA. NetBIOS is IBM's (and Microsoft's) implementation of the middle layers of the OSI RM (roughly Layer 5). Finally, HPR is High Performance Routing, a kind of extension to APPN designed for fast packet networks such as ATM. Purists might cringe at these brief descriptions, but this is enough to know about IBM protocols for frame relay encapsulation purposes. The details on how SNA traffic is carried inside frame relay frames are quite complex and require a much deeper understanding of SNA than has been presented in this chapter and the previous one. Those interested in the deeper levels of SNA, APPN, and other IBM protocols with regard to frame relay are referred to the sources listed in the bibliography. Before moving on to the related topic of Ether-type encapsulation, a comparison of the NLPID, SNAP, and Q.933 encapsulation header formats is shown in Figure 11.9.
Figure 11.9 NLPID, SNAP, and Q.933 encapsulation.
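For completeness, here is the third method in the same style as the earlier sketches: Q.933 encapsulation of SNA traffic between FEPs, using the Table 11.3 identifiers. The helper name is ours, the header bytes encode DLCI 18, and the FCS is omitted.

FR_HEADER_DLCI_18 = bytes([0x04, 0x21])
UI, NLPID_Q933 = 0x03, 0x08               # note: 0x08 for Q.933, not the 0x80 of SNAP

def q933_encapsulate(l2_id: bytes, l3_id: bytes, payload: bytes) -> bytes:
    return FR_HEADER_DLCI_18 + bytes([UI, NLPID_Q933]) + l2_id + l3_id + payload

sna_between_feps = q933_encapsulate(bytes([0x4C, 0x80]), bytes([0x70, 0x81]),
                                    b"...SNA PIU...")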
What's inside the Frame?
Networks exist to link senders and receivers. Frame relay networks exist to link senders and receivers exchanging frame relay frames. But that does not mean that the networking task is as symmetrical as the information flow. Most of the complexity in networking is at the receiver. Senders always know what they are sending (at least they should). Receivers only know what they are receiving if they can only ever receive one thing. Why should this be a problem? Consider the simple network shown in Figure 11.2.
Figure 11.2 A simple multiprotocol network. The figure shows two routers as FRADs linked by a frame relay network. These are multiprotocol routers, since frame relay must be able to handle anything, not just IP. The sender can package up an IP packet or an IPX packet inside a frame relay frame and send it through the network on the DLCI representing the PVC that leads to the receiving router. Only one DLCI is needed, of course, since all of the packets inside the frame relay frames are intended for the other router anyway. But now the receiver has a problem. There are two separate processes available to pass the content of the frame along to. An IP packet makes no sense to IPX, since the packet header format, address structure, and almost everything else is unique to IPX packets and the software knows only IPX. The same is true of an IPX packet passed to IP. Why can't IP and IPX figure out which is which? Because by then it's too late: It's the job of the lower layer in all layered protocols to figure out which higher layer to hand traffic off to. Even IP packets can contain TCP or UDP segments. IP has to figure out which is which, not TCP or UDP themselves. So the frame relay protocol has to figure out what's inside: IP or IPX. Of course, a totally separate DLCI could be defined for each and every protocol, and this is sometimes done, as in SNA. But to multiply the number of PVCs by the number of protocols used (IP has not totally driven out everything else just yet) would be prohibitively complex to set up, and the resulting jungle of PVCs would be expensive to pay for and maintain. What if a protocol were added or deleted? And even if only IP is used, the router FRAD must be able to distinguish protocols, even if it never actually has to. But frame relay networks never know what's inside the frames they relay! How are receivers to know? Giving a separate PVC and DLCI to each protocol in use is not very practical. If only one PVC and DLCI per location is to be used, then the sender must place something inside the frame relay frame that informs the receiver exactly what the format of the rest of the frame is and which higher-layer process to hand the contents off to. The main question is what that something should be. The receiver would then always know what is encapsulated inside the arriving frame relay frame.
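A receiver-side sketch makes the problem concrete. This hypothetical Python dispatcher peeks at the octets following the 2-octet frame relay header, using the NLPID conventions detailed in this chapter (pad skipping, 0xCC for IPv4, 0x80 for SNAP, 0x08 for Q.933); a real FRAD does the equivalent in firmware.

def dispatch(frame: bytes) -> str:
    i = 3                          # skip the 2-octet header and the control octet
    while frame[i] == 0x00:        # discard pad octets (0x00 is not a valid NLPID)
        i += 1
    nlpid = frame[i]
    if nlpid == 0xCC:
        return "IPv4 packet inside: hand off to IP"
    if nlpid == 0x80:              # SNAP: the OUI and PID identify the contents
        oui = frame[i + 1:i + 4]
        pid = int.from_bytes(frame[i + 4:i + 6], "big")
        if oui == b"\x00\x00\x00" and pid == 0x8137:
            return "IPX packet inside: hand off to IPX"
        return "SNAP: dispatch on OUI/PID (bridged frame or other protocol)"
    if nlpid == 0x08:
        return "Q.933 encapsulation: dispatch on the L2/L3 identifiers"
    return "unknown NLPID: discard"

print(dispatch(bytes([0x04, 0x21, 0x03, 0xCC]) + b"...IP..."))
print(dispatch(bytes([0x04, 0x21, 0x03, 0x00, 0x80, 0x00, 0x00, 0x00, 0x81, 0x37])))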
As it turns out, there are two main ways of performing this multiprotocol encapsulation in frame relay. The first is by using the Q.933 methods established by the ITU-T for frame relay. These methods are geared mainly for SVC use in frame relay. The signaling protocol used to set up frame relay SVCs includes Information Elements (IEs) designed to allow the endpoints to agree on the OSI RM Layer 2 and Layer 3 protocols that will be used on the new connection. But even service providers that support frame relay SVCs generally do not implement these IEs. Almost universally, the preferred method of frame relay multiprotocol encapsulation is defined in RFC 2427 (which replaces RFC 1490) and Frame Relay Forum Implementation Agreement FRF.3.1.
RFC 2427 and FRF.3.1
The Internet Engineering Task Force (IETF) has established RFC 2427, called Multiprotocol Interconnect over Frame Relay, as a standard way to encapsulate a wide range of protocols, both Layer 2 and Layer 3 alike, inside of frame relay frames. The IETF was mostly interested in how IP packets found their way inside frame relay frames and back out again, but to the IETF's credit, the method they came up with works for almost anything. RFC 2427 allows many Layer 2 and Layer 3 protocols to be mixed on a single PVC and the DLCI identifying the PVC. With RFC 2427, a receiving FRAD or router will always know exactly what is inside an arriving frame relay frame. RFC 2427 replaces an earlier RFC 1490 (and the even earlier RFC 1294) that should no longer be used for multiprotocol encapsulation. The Frame Relay Forum had no desire to reinvent the wheel and duplicate the IETF's methods, which worked well. So the Frame Relay Forum essentially extended RFC 1490 (the work predated RFC 2427) in a number of ways, mainly by adding on some standard methods of encapsulation for SNA and its successor architecture for distributed client/server networks, APPN (Advanced Peer-to-Peer Networking). The result was FRF.3.1. It is sometimes claimed that RFC 1490 and FRF.3.1 are in some way identical or that FRF.3.1 merely endorses RFC 1490. But anyone looking at the two documents will immediately see the profound differences. Everything important is in RFC 1490 (and now RFC 2427), period. All FRF.3.1 does is extend the concepts established in RFC 1490 for SNA and APPN use. This made sense, because the IETF focused on IP and LAN interconnectivity applications for frame relay. The Frame Relay Forum added the SNA and APPN encapsulation techniques for the IBM environment. So RFC 2427 is the main multiprotocol encapsulation document, and FRF.3.1 adds some details. If anything, this shows what a great job the IETF did with RFC 1490 and now RFC 2427. There was little that the Frame Relay Forum needed to add. RFC 2427 allows for the standard encapsulation not only of IP packets, but also of all IEEE LAN frame formats, the IEEE's Logical Link Control (LLC, the LAN version of the OSI RM's Data Link Layer 2), the Fiber Distributed Data Interface (FDDI), the OSI RM's Connectionless Network Layer Protocol (CLNP, the OSI RM's equivalent of IP), and even more. There are even provisions for vendor-specific Layer 3 protocols, such as AppleTalk or IPX. Why are so many encapsulation methods needed? Well, one of the major reasons is that not only do routers route IP packets, but many routers can also handle bridging between LANs linked by these types of routers. Routers link LANs at Layer 3, the Network Layer, of the OSI RM. But LANs are only defined by the IEEE as existing at the lower two layers of the OSI RM, the Physical Layer (Layer 1) and the Data Link Layer (Layer 2). Now in LAN use, the Data Link Layer becomes the Logical Link Control (LLC, Layer 2a) and Media Access Control (MAC, Layer 2b) layers, but there are still only two layers for all LANs. That is why routers can easily link two different types of LANs together: Above the bottom two layers of the OSI RM, there are no LANs at all! As long as the Network Layer (Layer 3) protocol is the same, or the routers understand more than one protocol, any packet can make its way from one type of LAN to another through a series of routers. Before there were routers, bridges were commonly used to link LANs. Routers, if used at all, were called gateways.
Oddly, when routers came along, most of the acronyms that were used for the protocols that ran on gateways were not changed. So there are still protocols like the Border Gateway Protocol (BGP) and whole families of protocols known as Interior Gateway Protocols (IGPs) that all date from the time when routers were gateways. Bridges were used to link LANs for one very good reason: The IEEE said so. For example, the Token Ring specification (IEEE 802.5) said that token ring LANs were to be linked by Source Routing Bridges (SRBs). The "Ethernet" group (IEEE 802.3) decided that Ethernet LANs would be linked by transparent bridges. The reason for the quotes around "Ethernet" is that there are important differences between real
Ethernet and IEEE 802.3 "Ethernet" LANs, which will be explored later. However, people always say "Ethernet" for any Ether-type LAN (the preferred term), so that is the usage employed here (and the quotes will be dropped). Anyway, bridges only bridge between similar LANs: Ethernet-to-Ethernet or token ring-to-token ring. Bridges make one big LAN out of two separate LANs by linking the LANs at Layer 2 of the OSI RM. Since LANs are very different at Layer 2 of the OSI RM in terms of speed, MAC frame format, and various other ways, translating between different LAN types at the bridge level was difficult and discouraged by the IEEE. At the router level (Layer 3), on the other hand, all LANs look the same, and the problems of linking dissimilar LANs totally disappear. The idea to "route when you can, bridge if you must" acknowledges this fundamental difference in bridge and router operation.

Once routers began to find their way among the bridges onto LAN internetworks, backward compatibility became an issue. So routers all supported bridging on the router ports that led to or replaced bridges. Sometimes these devices were called brouters, but the term thankfully vanished relatively quickly as routers took over the LAN interconnection backbone. Efforts by bridge vendors to counteract the router steamroller were futile. Bridge vendors pointed out that routers are protocol-specific, since the Layer 3 protocols had to match. This was the flip side of the "bridges are LAN-specific" argument used by the router vendors. In any case, once processor power got to the point where routers and bridges cost about the same, customers embraced routers, which could still bridge if need be, since bridging is a lower layer function.

Routers connect two LANs, but they remain two LANs. Network Layer (Layer 3) addresses must be assigned to enforce this. Two LANs running the IP protocol connected by a router must have separate network IDs in their IP addresses. All Layer 3 protocols basically work the same way in this regard. The differences between bridging and routing as LAN interconnection techniques are shown in Figure 11.3.
Figure 11.3 Bridging and routing between LANs.
Both RFC 2427 and FRF.3.1 allow a FRAD, whether router or bridge, to handle bridged traffic as well as routed traffic. So a frame relay device can just as easily put an Ethernet frame inside a frame relay frame (bridged traffic) as it can put an IP packet inside a frame relay frame (routed traffic). In fact, the multiprotocol encapsulation methods allow for both types of traffic—bridged traffic and routed traffic—to be transported over the same DLCI on a frame relay network. This is the essence of multiprotocol encapsulation.
The Network Layer Protocol Identifier and Subnetwork Access Protocol
So a FRAD, whether router or bridge, has to take a targeted Protocol Data Unit (PDU) of any of a number of types and encapsulate the PDU inside a frame relay frame. The PDU is identified by a kind of label or header attached to the front of the PDU content before it is packed inside the frame relay frame. Now the question is what the format of the label should be and how large the label should be. Common formats are needed to limit the amount of work receivers must do on the frame relay network, and the larger the label, the more overhead there will be for small data units such as mouse clicks or acknowledgments.
As it turns out, there are two separate but related methods that are used for this encapsulation label or header. Both are used, and must be used, in both RFC 2427 and FRF.3.1.

The first method was established by the same people who created the OSI RM, the International Organization for Standardization (often referred to as ISO). The header structure is called the Network Layer Protocol Identifier or NLPID. The NLPID establishes a standard form and coding method that can be used inside any frame, not only frame relay frames, to identify the exact form of the Layer 3 network layer PDUs (packets) inside the frame. But the NLPID does not cover all of the frame relay possibilities. Frame relay frames do not have to contain packets (routed traffic); they can contain other frames, such as Ethernet LAN frames (bridged traffic). So there is also a second method used for frame relay multiprotocol encapsulation, this one established by the Institute of Electrical and Electronics Engineers (IEEE), which is in charge of all international LAN standards through ANSI and ultimately the ISO 8802 series. The IEEE uses something known as the Subnetwork Access Protocol (SNAP) format to indicate what is inside the frame relay frame. The SNAP header includes coding for all standard LAN frame types, so it is not limited to network layer PDUs (packets).

Taken together, the NLPID and SNAP header formats are capable of identifying anything that a frame relay frame can carry, at least in the traditional data arena. There is no standard NLPID or SNAP coding yet for voice and video inside frame relay, but that is not really the job of the NLPID or SNAP headers anyway. But if the NLPID covers standard OSI RM packets, and the SNAP format covers all standard bridged formats for IEEE LANs, what about vendor-specific packets and LAN frames? Vendor-specific protocols are not necessarily standards, at least not in the same sense as IP or token ring. In this case, an extension of the NLPID format is used. So FRF.3.1 easily extended the NLPID coding to include SNA protocols.

Sometimes there are protocols that could conceivably be encapsulated using either NLPID or SNAP formats. But there should never be any confusion about when the NLPID header or the SNAP header should be used. FRF.3.1 establishes a firm hierarchy for deciding which format to use. In order of preference, these are:

1. NLPID is the first choice. The code listings in ISO TR 9577 are to be consulted. If the protocol is listed, NLPID is used. IP, CLNP, and other packets are always encapsulated this way. The NLPID header format is two octets long.
2. SNAP comes next. The NLPID actually lists a value that says "SNAP is used." The SNAP codes are in RFC 2427 itself. Even some vendor-specific protocols are listed. Bridged LAN frames, DECnet, IPX, AppleTalk, and others are always encapsulated this way. The SNAP header format is eight octets long.
3. NLPID according to Q.933 is the last resort. If worst comes to worst, there are NLPID options that allow almost any OSI RM Layer 2 and Layer 3 PDU format to be specified. The Q.933 header format is six octets long.

The multiprotocol encapsulation techniques established for frame relay are quite flexible. Almost any data protocol can be transported inside a frame relay frame using the same DLCI and yet be distinguished by the FRAD at the receiving side of the network, as the sketch below illustrates.
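On the receiving side, this hierarchy boils down to an ordered check of the first few octets after the frame relay header. Here is a minimal, illustrative Python classifier; the NLPID values are the ISO TR 9577 codes used by RFC 2427, while the function itself and its returned labels are invented for this sketch.

NLPID_SNAP = 0x80   # SNAP header follows (bridged frames, IPX, AppleTalk, ...)
NLPID_CLNP = 0x81   # ISO CLNP packet
NLPID_Q933 = 0x08   # Q.933 identifies the Layer 2/3 protocols (last resort)
NLPID_IP   = 0xCC   # IPv4 packet

def classify_payload(payload: bytes) -> str:
    # RFC 2427 payload: control octet, optional pad, then the NLPID.
    assert payload[0] == 0x03          # Q.922 UI control field
    i = 1
    if payload[i] == 0x00:             # skip the optional pad octet
        i += 1
    nlpid = payload[i]
    if nlpid == NLPID_IP:
        return "routed IPv4 packet"
    if nlpid == NLPID_CLNP:
        return "routed CLNP packet"
    if nlpid == NLPID_SNAP:
        oui, pid = payload[i+1:i+4], payload[i+4:i+6]
        return "SNAP: OUI=%s PID=%s" % (oui.hex(), pid.hex())  # bridged or vendor L3
    if nlpid == NLPID_Q933:
        return "Q.933-identified protocol (last resort)"
    return "unknown NLPID 0x%02x" % nlpid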
How IP Conquered the World
In spite of the enormous popularity of IP, it is not yet totally an IP world. Now, IP is very versatile, and the significance of TCP/IP in the networking world is that TCP/IP was the first general-purpose network protocol, one which ran on almost every network, from LANs to WANs. In fact, in 1984 TCP/IP was more or less married to Unix, the first general-purpose operating system which ran on almost every computer architecture, for this very reason (and also because they were both more or less free). If the networking goal is to allow users on systems from Macs to mainframes to easily communicate (interoperability is a separate issue), then have everyone run IP. TCP/IP is usually bundled with Unix, so run Unix as well. Moreover, since Unix was a product of the regulated AT&T Bell Labs, and TCP/IP development was paid for with U.S. government dollars, both were essentially free for the asking. By the 1980s, colleges and universities were asking for Unix and TCP/IP in droves, not because educational institutions were stingy and did not want to pay for vendor-specific operating systems and network software (although that reason has been proposed), but because educational institutions had a wide variety of computing equipment running all manner of operating systems and networking protocols. Some of the equipment was donated, other gear came in low-bid packages, but all of it had to link together somehow, whether by LAN or WAN or both. With Unix and TCP/IP in place, not only did all students learn one operating system and network protocol, but attaching to the Internet was a trivial task when the time came. How could TCP/IP, and IP in particular as the Network Layer, perform this trick and run on everything? The answer is in the structure of the TCP/IP protocol stack, which is shown in Figure 11.1. TCP/IP takes the seven layers of the OSI RM and turns them into four layers. The closest corresponding OSI RM layers are shown in the figure.
Figure 11.1 The Internet protocol suite (TCP/IP).
The Network Access layer of TCP/IP (the layers are never numbered, only named) forms the lowest layer of the stack. Note the many choices—not all of which are shown—for the frame types that IP packets can be placed in. Other protocol stacks (e.g., X.25, SNA) define the frame structure as well as the packet structure. TCP/IP does not; this is the secret to running on everything. With two exceptions, SLIP (Serial Line Internet Protocol) and PPP (Point-to-Point Protocol), all the TCP/IP standards documents specify is how to put IP packets inside a particular type of frame at the sender and how to take the IP packets out again at the destination. It is a simple enough trick, but one that took years to figure out. The set of standards documentation for TCP/IP, known as RFCs (Requests for Comments, although no one is really asking anyone to comment), is issued by the Internet Engineering Task Force (IETF). Several RFCs pertaining to frame relay will be discussed in this chapter.
At the IP layer, or Internetworking Layer, the major protocol is IP itself. But, oddly enough, there are other protocols that occupy compartments at this layer. There are two major categories: the Internet Control Message Protocol (ICMP), which handles control information (error messages and the like), and the Address Resolution Protocols (ARPs) that enable an IP network address to be associated with a connection identifier or MAC-layer physical address at the Network Access layer (whatever the actual network). The ARP family is a little more complicated than that, but the perspective here is on frame relay. ICMP messages are transported inside IP packets and so belong to the IP layer itself; the ARPs ride directly inside Network Access layer frames but are grouped with the IP layer as well. Above the IP layer is the TCP layer or Transport Layer. There are two main protocols here. TCP is for connection-oriented, sustained sequences of messages (flows) between source and destination. UDP is for connectionless, small request-and-response pairs of messages. Note that connection-oriented TCP easily uses connectionless IP for best-effort or unreliable services. So there are TCP connections end-to-end across an IP router network, but no connections at all between the IP routers themselves. Every packet is routed independently between routers, with no call setup at all. To top it all off, the Application Services form a standard set of applications for things like e-mail, remote login, and file transfer. Some of the applications must use TCP (flows) and some must use UDP (request-response), but the TCP/IP applications do not correspond particularly well with OSI RM layers (no real sessions, no real APIs, etc.). Consider this brief look at the TCP/IP stack as an effort to at least make some of the more obscure points about IP and frame relay more concrete. Just as IP packets can be sent inside almost any type of frame, frame relay frames can carry almost any type of packet.
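For the curious, the "compartments" above the IP layer are visible in every IP packet: a single Protocol field octet in the IPv4 header tells the receiver whether ICMP, TCP, UDP, or something else is inside. A tiny illustrative sketch, where the dictionary holds the standard IANA protocol numbers:

IP_PROTO = {1: "ICMP", 6: "TCP", 17: "UDP"}  # standard IANA protocol numbers

def ip_payload_type(ip_header: bytes) -> str:
    # The Protocol field is the 10th octet (offset 9) of an IPv4 header.
    proto = ip_header[9]
    return IP_PROTO.get(proto, "other (%d)" % proto)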
Frame Relay for Internet Access
No discussion of IP and frame relay would be complete without a deeper look at how frame relay networks can be used for Internet access. More and more ISPs are providing Internet access over frame relay as well as over leased lines or dialup circuits, for all of the same reasons that frame relay is used in other situations: distance-insensitive pricing (usually) and logical rather than physical connectivity. When things like flexible reconfiguration, routers as FRADs, and a network designed for bursty traffic are factored in, frame relay is a very attractive infrastructure for Internet access. There are two main ways that frame relay is used for Internet access through an ISP. Many other ways are possible, but the two discussed here are the most common. In the first scenario, all sites that have FRADs with DLCIs leading to other sites also have a DLCI leading to the ISP for Internet access. There can be NNIs along the way to the ISP, but this is not common. Usually the ISP has frame relay service from the same frame relay service provider as the customer. This method is quite efficient because heavy Internet traffic to and from one site should not affect other sites. However, the Internet traffic filtering and screening that is common today for security purposes is more difficult to establish and maintain, since each site has its own access to the Internet. In the second scenario, there are still DLCIs set up for Internet access, but all of the DLCIs lead to a central site, which has the only FRAD that can access the ISP's FRAD over the frame relay network. This scenario makes central administration of Internet security concerns that much easier, but of course creates a potential bottleneck at the central site. Both access methods are shown in Figure 11.12.
Figure 11.12 Frame relay for Internet access.
This mention of frame relay and the Internet opens up another issue concerning frame relay. The evolving Internet has begun to employ more and more ATM networking for the Internet backbone. Major corporations, government agencies, and educational institutions have ATM networks in many cases as well. There are even mixed frame relay and ATM environments. What is or should be the relationship between frame relay and ATM? This is another topic worth exploring in depth.
Chapter 12: Asynchronous Transfer Mode and Frame Relay
Overview
Asynchronous Transfer Mode (ATM) and frame relay have been tied to each other for a long time and for a number of reasons. Both ATM and frame relay are usually mentioned as fast packet technologies, technologies designed to process and switch packets much faster than X.25 or older packet-switching architectures. Both ATM and frame relay were created at about the same time, in the late 1980s and early 1990s, when various standards organizations began to consider what the networks of the future should look like. Finally, both ATM and frame relay are often used in the same network, either because service providers use both ATM and frame relay to supply their network services, or because users themselves employ both ATM and frame relay in different parts of the same network. This chapter considers the relationship between ATM and frame relay. The emphasis is not so much on what the ATM and frame relay relationship should be, because few if any would claim to know that. Rather, the focus of the chapter is on what the relationship between ATM and frame relay actually is in the world today.

One of the most basic and key differences when discussing frame relay and ATM at the same time is the different terminology used when talking about even simple things like standards and services. This is because the frame relay organizations chose to use the same terms for everything relating to frame relay, while the ATM groups generally chose different terms for the same types of things. This is easiest to show by example. All network services, from X.25 to ATM and beyond, have three basic building blocks. There is the fundamental technology upon which the network is based. Then there is the standard which refines the technology for interoperability reasons. Finally, there are the services that customers buy and the network must support. The terms used in frame relay and ATM for the fundamental technology, the standard, and the services are shown in Table 12.1. Notice the different approaches taken by each network group. For whatever reason, in frame relay the same term is used for almost everything. So frame relay technology is used to build standard frame relay networks to deliver frame relay services to users. But ATM is very different. The underlying foundation technology of ATM is cell relay. As frame relay relays variable-length data units called frames through a network, ATM relays fixed-length data units called cells through a network. But cell relay simply says "there should be cells" and says little about how big the cells should be, what the cell header must look like, and so on. The international standard from the ITU-T for cell relay is ATM. ATM says that the cell is always 53 octets long (there have been larger and smaller cells used in non-ATM cell relay networks, believe it or not). The header is always the first 5 octets of the cell, leaving 48 octets as the cell payload. There is no cell trailer, and so forth. Finally, the service definitions that describe what users can do with an ATM network are specified as part of the Broadband ISDN (B-ISDN) series of ITU-T recommendations. So, cell relay technology is used to build standard ATM networks to deliver B-ISDN services to users.

Table 12.1 Frame Relay and ATM
NETWORK ASPECT        IN FRAME RELAY    IN ATM
Service definition    Frame Relay       B-ISDN
Standard              Frame Relay       ATM
Technology            Frame Relay       Cell Relay
The two approaches are just different, not better or worse. When frame relay people say “frame relay,” more information is needed to decide if the technology, standard, or service is being referred to. When ATM people say “ATM,” there is no question that it is the standard cell relay technology that is being talked about. But this raises an important issue. If B-ISDN services are what ATM networks deliver, how come there are many users with ATM networks without B-ISDN services? The fact is that B-ISDN has been slow in arriving, much slower than ATM itself, which has been criticized as being too little, too late in the fast packet world. Most ATM networks employ the sort of interim services defined by groups like the ATM Forum until B-ISDN arrives (if at all). ATM is left in the lurch as the technology without a service to sell, while frame relay (with its fast and transparent frame delivery) and the Internet and Web (with everything) are taking over the world. What is it about ATM that makes it so attractive for frame relay on the one hand but so difficult to do well on the other? Why not just sit down and whip up some B-ISDN services tomorrow? To consider the answers to these questions, a closer look at ATM is needed.
ATM Networks and Frame Relay
ATM service definitions come from two main organizations, but most of the service definitions that ATM service providers really care about come from the ATM Forum. Examples are ATM services such as Circuit Emulation Service (CES) and Voice and Telephony over ATM (VTOA). While the ATM Forum has defined a number of services for ATM networks, the ATM Forum has not defined any ATM services directly relating to frame relay. This is mainly because the ATM Forum has focused on service definitions that involve ATM cells directly, from customer premises to customer premises. It is the Frame Relay Forum that has focused on services that allow frame relay and ATM networks to interoperate. If both ATM and frame relay are fast packet technologies, and large numbers of users buy both ATM and frame relay services, then the need for some form of interworking between ATM networks and frame relay networks is obvious. Otherwise, how could users who have purchased frame relay for LAN interconnectivity and ATM for high-speed imaging from the same service provider be expected to send an image from the ATM network to a frame relay LAN? How could an organization formed by the merger of an ATM shop and a frame relay shop be expected to still perform normal and routine client-to-server access if the client is on a frame relay network and the server is on an ATM network?

The entire relationship between frame relay and ATM has been explored by the Frame Relay Forum. It has defined two major types of ATM and frame relay interworking. The first is somewhat confusingly called frame relay network interworking, and is described and defined in the FRF.5 implementation agreement (FRF.5 IA). The second method is called frame relay service interworking, and is described and defined in the FRF.8 implementation agreement (FRF.8 IA). Both methods are discussed in detail in the following sections. Both methods, network and service interworking, employ AAL 5 for transporting frame relay frames across an ATM network as a series of cells.

Some of the terminology used is rather convoluted, and acronym strings are often used when the underlying idea is really very simple. For example, when a frame relay frame enters an AAL 5 device on an ATM network, the user information is extracted from the frame relay frame. After all, all the user cares about is sending the information across the ATM network, not the frame relay frame header and trailer. However, the AAL immediately turns around and reconstructs the frame relay frame header and trailer. This is done by the Service-Specific Convergence Sublayer (SSCS), which forms the upper part of the convergence sublayer (CS) of the AAL. Because this SSCS performs its magic only on frame relay frames, it is called the frame relay SSCS or FR-SSCS. So the FR-SSCS PDU is just the variable-length frame relay protocol data unit (frame) that is ATM AAL 5's reconstruction of the frame relay header, content, and trailer. All of the details are in the ITU-T's specification of the FR-SSCS, I.365.1. This process is frame relay service-specific because only the FR-SSCS PDU is really a frame relay frame in disguise. But lots of different variable-length data units use AAL 5. So AAL 5 also contains a Common Part Convergence Sublayer (CPCS) which treats all SSCS data units exactly the same way: The CPCS sticks an 8-octet trailer on the SSCS PDU.
Once this AAL 5 CPCS trailer has been stuck onto the frame relay frame (or FR-SSCS PDU, as the documents insist), the whole thing becomes the CPCS PDU (note the absence of any reference to frame relay, due to the common treatment of everything at this level of ATM). The whole CPCS PDU must be a multiple of 48 octets in order to fit nicely inside an integer number of ATM cell payloads. Padding is used between the FR-SSCS PDU and the CPCS trailer to make this 48-octet-multiple length always true.
It is the nature of standards documentation to say "FR-SSCS PDU" instead of "the frame relay frame from the ATM perspective" and "CPCS PDU" instead of "the frame relay frame with an AAL 5 trailer stuck on it." Then again, the same standards' nature makes books like this required. Anyway, once the CPCS PDU has been formed, the result is still variable length, not cells (technically, cell payloads). Now the segmentation and reassembly (SAR) sublayer of the AAL gets into the act and chops up the CPCS PDU into a whole number of 48-octet ATM cell payloads. These are called SAR PDUs instead of cell payloads, of course. Finally, the cell payloads are passed to the ATM layer, where a 5-octet cell header is pasted on, and the stream of cells representing the frame relay frame is sent into the ATM network by the ATM physical layer. Perversely, the ATM PDUs are always called cells. But all of the cells have a SAR PDU inside. This general frame relay frame to/from ATM cell processing using AAL5 is shown in Figure 12.5.
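A short sketch may make the arithmetic concrete. The Python below pads the FR-SSCS PDU, appends the 8-octet CPCS trailer, and chops the result into 48-octet SAR payloads. The trailer layout (UU, CPI, Length, CRC-32) follows the AAL5 description above, but note that the ordinary zlib CRC-32 used here merely stands in for the exact CRC computation specified in I.363.5.

import struct, zlib

def aal5_segment(fr_sscs_pdu: bytes):
    # Pad so that PDU + 8-octet trailer is a multiple of 48 octets.
    pad_len = (-(len(fr_sscs_pdu) + 8)) % 48
    padded = fr_sscs_pdu + b"\x00" * pad_len
    uu, cpi = 0, 0                          # CPCS user-to-user and common part indicator
    # CRC-32 covers everything up to (but not including) the CRC field itself.
    crc = zlib.crc32(padded + struct.pack("!BBH", uu, cpi, len(fr_sscs_pdu)))
    trailer = struct.pack("!BBHI", uu, cpi, len(fr_sscs_pdu), crc)
    cpcs_pdu = padded + trailer
    assert len(cpcs_pdu) % 48 == 0
    # Each 48-octet slice becomes one SAR PDU, that is, one ATM cell payload.
    return [cpcs_pdu[i:i + 48] for i in range(0, len(cpcs_pdu), 48)]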
Figure 12.5 Frame relay frame to ATM cells.
At the AAL across the network, where the cells become frames again, the receiving ATM layer strips off the cell headers and passes the arriving 48-octet payloads to the AAL 5 process. The cell payloads (SAR PDUs) are accumulated until the ATM layer says "stop!" (The receiving ATM layer is able to identify the last payload in a sequence because the sending ATM layer sets a bit in the ATM cell header to halt the accumulation process.) The payloads are now pasted together by the receiving AAL5 process, and the CPCS trailer is processed. This trailer contains things like a way for the users of the AAL5s to communicate directly (most likely the SSCS software is saying "voice inside these frame relay frames" and so on), a length field so that the receiver can check that all payloads have been received, and even another CRC field on the whole FR-SSCS PDU (the frame relay frame), since the frame relay FCS is not checked by the ATM network. Also, the frame relay FCS is 2 octets (16 bits), while the AAL5 CRC is 4 octets (32 bits). If all is well, then the CPCS trailer is dropped, and the FR-SSCS PDU fields are used to generate the proper frame relay header for the user information that started the whole process in the first place. This is how frame relay frames make their way across an ATM network. With these basic concepts (and acronyms) in hand, the details of frame relay-ATM network interworking and service interworking will be more understandable.
Frame Relay-Asynchronous Transfer Mode Network Interworking
Network interworking between frame relay users and ATM users is basically a matter of using AAL5 to carry frame relay frames across an ATM network. The ATM device that encapsulates the frame relay frame inside a series of ATM cells is called the Interworking Unit (IWU). The IWU contains the Interworking Function (IWF) of interest; sometimes the labels are used interchangeably. Frame relay-ATM network interworking can take place from IWU to IWU or from IWU to an ATM end system (an ATM host). Naturally, the ATM host must be able to properly generate and process the AAL5 PDU needed, instead of relying on a network-based IWU to perform this task. These ATM hosts are sometimes called frame relay-aware ATM hosts. So there are two variations on frame relay-ATM network interworking: one in which there are two IWUs and two frame relay FRADs as endpoints, and one in which there is only one IWU and a frame relay-aware ATM host attached directly to the ATM network.
IWU to IWU Network Interworking
Consider the first, IWU-to-IWU scenario. The ATM network is totally transparent to the FRADs, and the IWUs (or IWFs in some documentation) are located somewhere else in the networks and not in the end systems themselves. The end protocol stacks are the same, and still frame relay, just as if there were no ATM network present at all. The IWU handles all of the frame relay-to-ATM mapping using AAL5, including mapping the frame relay FECN, BECN, and DE bits to their ATM cell header equivalents, the EFCI and CLP bits. The frame relay status signaling and network resource management functions (called operations, administration, and maintenance [OAM] in ATM) are also mapped between frame relay and ATM. The IWU-to-IWU network interworking scenario is shown in Figure 12.6. The protocol stacks in each device are also shown. Note that the end systems are utterly unaware of the presence of the ATM network between them. Most commonly, the IWU resides either within the ATM switch (as a port adapter) or on the frame relay switch port that supports the ATM UNI leading to the ATM network. Note that the presence of a full frame relay network on one end is unnecessary, as shown by the dotted network cloud. The IWU can communicate with the FRAD directly.
Figure 12.6 Frame relay-ATM network interworking from IWU to IWU.
Discard Eligible and Cell Loss Priority Bit Mapping
A lot of time and effort is spent in frame relay network interworking to get the DE and CLP bit translations exactly right. The IWU is allowed to operate in one of two modes in order to map frame relay DE bits to ATM CLP bits and back again.

In Mode 1, the receiving IWU must look at both the CLP bits in the headers of the arriving stream of cells that carry the FR-SSCS PDU (the frame relay frame) and the DE bit of the FR-SSCS PDU as reconstructed by the ATM AAL5 process. Now, an ATM switch can set the CLP bit in any cell to CLP = 1 when the arriving rate exceeds a certain value, just as a frame relay switch can and does with the DE bit. So a series of cells that represents a frame relay frame could have none, one, or more of the arriving cells tagged as CLP = 1 (discard eligible). But the frame relay frame inside the cell stream could have arrived at the source IWU with DE = 0 or with DE = 1 already. What should happen to frame relay frames that arrive with one or more cells tagged CLP = 1 but contain a frame relay frame with DE = 0? In Mode 1, the rule is that if one or more ATM cells arrive with CLP = 1, or the FR-SSCS PDU has DE = 1, then the frame relay frame (technically, the Q.922 core frame) reconstructed by the receiving IWU will have the DE field set to a "1" bit. Table 12.2 shows these rules for IWU frame relay-ATM network interworking Mode 1. Note that there are rules for mapping DE to and from CLP in both directions, at both IWUs. An "X" means that the status of the DE or CLP bit does not matter.

In Mode 2 of the frame relay-ATM network interworking scenario, the receiving IWU does not map the status of the arriving CLP bits to the frame relay frame. The arriving FR-SSCS PDU's DE bit is mapped unchanged to the reconstructed frame relay frame at the receiving IWU. In Mode 2, it does not matter to the frame relay devices whether the ATM network has generated CLP = 1 cells. All that matters is whether the originating frame relay network has tagged the frame as DE = 1. Table 12.3 shows these rules for IWU frame relay-ATM network interworking Mode 2. As before, there are rules for mapping DE to and from CLP in both directions, at both IWUs. An "X" means that the status of the DE or CLP bit does not matter. But now a "Y" means that the CLP bit can be set to a "0" or "1" bit; the status of the CLP bit is unimportant to the receiving IWU anyway.

Table 12.2 DE and CLP for Network Interworking Mode 1
FRAME RELAY TO ATM DIRECTION (APPLIES TO ALL CELLS IN A SERIES)

from frame relay (Q.922)    DE = 0     DE = 1
FR-SSCS PDU gets set as     DE = 0     DE = 1
CLP gets set as             CLP = 0    CLP = 1

ATM TO FRAME RELAY DIRECTION (ONE OR MORE ARRIVING CELLS)

CLP arrives as              CLP = 0    CLP = 1    X
FR-SSCS PDU arrives as      DE = 0     X          DE = 1
to frame relay (Q.922)      DE = 0     DE = 1     DE = 1
Table 12.3 DE and CLP for Network Interworking Mode 2

FRAME RELAY TO ATM DIRECTION (APPLIES TO ALL CELLS IN A SERIES)

from frame relay (Q.922)    DE = 0     DE = 1
FR-SSCS PDU gets set as     DE = 0     DE = 1
CLP gets set as             CLP = Y    CLP = Y

ATM TO FRAME RELAY DIRECTION (ONE OR MORE ARRIVING CELLS)

CLP arrives as              CLP = X    CLP = X
FR-SSCS PDU arrives as      DE = 0     DE = 1
to frame relay (Q.922)      DE = 0     DE = 1
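Tables 12.2 and 12.3 reduce to a few lines of logic at the receiving IWU. A sketch of the ATM-to-frame relay direction, with function names invented for illustration:

def receive_de_mode1(any_cell_clp_set: bool, fr_sscs_de: int) -> int:
    # Mode 1: DE = 1 if ANY arriving cell had CLP = 1, or the FR-SSCS PDU had DE = 1.
    return 1 if (any_cell_clp_set or fr_sscs_de == 1) else 0

def receive_de_mode2(any_cell_clp_set: bool, fr_sscs_de: int) -> int:
    # Mode 2: arriving CLP bits are ignored; the FR-SSCS DE is copied unchanged.
    return fr_sscs_de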
FECN and EFCI Bit Mapping
The mapping of frame relay FECN bits to ATM EFCI bits is another matter altogether. There is no BECN equivalent in ATM, so the BECN bits are always passed transparently in the FR-SSCS PDU across the ATM network between IWUs. But obviously, the receiving IWU should know if the arriving cells, or the frame relay frame inside, encountered congestion on the frame relay network feeding the sending IWU, on the ATM network linking the IWUs, or on both networks. The rules here are simple enough. The sending IWU always maps the FECN bit unchanged into the FR-SSCS PDU. The sending IWU always generates ATM cells with the EFCI set to a "0" bit (no congestion). At the receiving IWU, the status of the EFCI bit in the last cell header of an arriving cell series is what counts (if the EFCI bit is not right now indicating congestion on the ATM network, then what is the sense of telling the receiving frame relay network that the congestion was once there?). So if the last EFCI is set to a "1" bit, or the FECN field in the received FR-SSCS PDU is set to a "1" bit, then the receiving IWU will set the FECN bit to a "1" bit (congestion) on the reconstructed frame relay frame (Q.922 core frame). These rules for frame relay-ATM network interworking FECN to and from EFCI mapping are shown in Table 12.4. The use of the "X" indicates that the status of the bits does not matter.

Table 12.4 FECN and EFCI for Network Interworking

FRAME RELAY TO ATM DIRECTION

from frame relay (Q.922)    FECN = 0    FECN = 1
FR-SSCS PDU gets set as     FECN = 0    FECN = 1
EFCI gets set as            EFCI = 0    EFCI = 1

ATM TO FRAME RELAY DIRECTION (LAST CELL IN A SERIES)

EFCI arrives as             EFCI = 0    EFCI = X    EFCI = 1
FR-SSCS PDU arrives as      FECN = 0    FECN = 1    FECN = X
to frame relay (Q.922)      FECN = 0    FECN = 1    FECN = 1
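The receiving-side FECN rule in Table 12.4 hinges only on the last cell of the series, and in the same sketch style it is one line:

def receive_fecn(last_cell_efci: int, fr_sscs_fecn: int) -> int:
    # FECN = 1 if the LAST cell's EFCI = 1, or the FR-SSCS PDU carried FECN = 1.
    return 1 if (last_cell_efci == 1 or fr_sscs_fecn == 1) else 0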
IWU to Frame Relay-Aware Network Interworking
The second frame relay-ATM network interworking scenario outlined in FRF.5 concerns the case where there is only one IWU. The other end of the frame relay-ATM combination network is an ATM end system or host that is now frame relay-aware. There is still an AAL5 and FR-SSCS process involved, but it is now performed directly on the ATM end system and not in some IWU. The advantage of this approach is that only those ATM hosts that need to communicate with frame relay devices need to be frame relay-aware. The drawback is that each ATM host needing frame relay communications must be outfitted with this frame relay awareness. So for limited frame relay-ATM interactions, ATM host scenarios are cost-effective. But at some level of penetration, the second-IWU approach makes more financial (and logistical) sense. The structure of the second frame relay-ATM network interworking scenario is shown in Figure 12.7. The protocol stacks in each device are also shown.
Figure 12.7 Frame relay-ATM network interworking from IWU to ATM host. Note that all the interworking functions are integrated into the ATM end system. All of the mappings of the FECN and DE bits to EFCI and CLP previously discussed still apply, in both modes, but they are done by the ATM end system. All status signaling and ATM OAM procedures are done here as well. The ATM end system basically has an FR-SSCS with AAL5 built on top of the regular ATM layers.
Network Interworking Issues
There are a few issues that need to be considered when frame relay-ATM network interworking is used, regardless of scenario or mode. The content of the frame relay frame that becomes the FR-SSCS PDU still uses regular FRF.3.1 (RFC 2427) multiprotocol encapsulation in scenario 1 (IWU to IWU). The IWU-to-IWU interworking is absolutely transparent to the frame relay applications, so there is no change at all required to the frame relay protocol stack used in this form of network interworking. Naturally, with scenario 2 (IWU to ATM host) the ATM end system must still employ the proper frame relay application protocol above the FR-SSCS process. This is the essence of the term frame relay-aware as applied to the ATM host in this scenario. A final consideration involves mapping frame relay DLCIs to ATM VPI/VCIs. The VPI/VCI taken together form the ATM Virtual Channel Connection (VCC) from one ATM end system (or IWU) to another. With frame relay-ATM network interworking, there are two ways to map frame relay DLCIs to ATM VCCs.
The first mapping mode is one-to-one mapping. In this mode, each and every frame relay DLCI is mapped by the IWU to a separate ATM VCC. This is simple and easy, and accomplished with a translation table in the IWU (or ATM host). The second mapping mode is many-to-one mapping. The presence of the DLCI in the FR-SSCS PDU means that multiple frame relay DLCIs can be mapped to one VCC. This makes a lot of sense, because the IWU or even the frame relay-aware ATM host is only one device on the ATM network. Why map DLCIs to different VCCs if they all lead to the same place on the ATM network? With many-to-one mapping, it is said that the DLCI field of the FR-SSCS PDU forms a submultiplexing field from the ATM VCC perspective. In one sense, however, all frame relay-ATM network interworking is really many-to-one! This is because all DLCI link management messages that flow on DLCI=0 must be carried from IWU to IWU, or from IWU to frame relay-aware ATM host, end-to-end along with the user traffic. So at least two DLCIs will map to the same VPI/VCI values, and probably even more if many-to-one mapping is employed.
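The two mapping modes amount to two shapes of translation table in the IWU. A minimal sketch, with made-up DLCI and VPI/VCI numbers:

# One-to-one: each DLCI gets its own VCC, written here as a (VPI, VCI) pair.
one_to_one = {16: (0, 100), 17: (0, 101), 18: (0, 102)}

# Many-to-one: several DLCIs share one VCC; the DLCI kept inside the FR-SSCS
# PDU header then acts as the submultiplexing field at the far end.
many_to_one = {0: (0, 200), 16: (0, 200), 17: (0, 200)}  # DLCI 0 rides along too

def vcc_for(dlci: int, table: dict) -> tuple:
    # Look up which VCC should carry this DLCI's frames.
    return table[dlci]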
Frame Relay-ATM Service Interworking
A related way of linking frame relay and ATM users is by means of frame relay-ATM service interworking. Frame relay-ATM service interworking is defined in FRF.8 (FRF.8 IA) from the Frame Relay Forum. Like network interworking, frame relay-ATM service interworking connects frame relay devices to ATM devices. However, frame relay-ATM service interworking does not send frame relay frames across the ATM portion of the network. Instead, it is the content of the frame relay frames that is sent to the ATM devices. Since the frame relay frame is neither preserved nor reconstructed in service interworking, there are no frame relay mapping functions that need to be performed in the IWU, and an FR-SSCS is not needed. However, that is not to say that the DE to and from CLP or FECN to and from EFCI mappings are not important. These mappings are just simpler, because there is never any real DE or FECN to check against the CLP or EFCI, since the frame relay frame in the form of the FR-SSCS PDU is not used in service interworking. However, the rules still exist, and in one of two modes as well.
Service Interworking Discard Eligible and Cell Loss Priority Mapping
Consider DE and CLP mapping in the frame relay-to-ATM direction first. The IWU can use either Mode 1 or Mode 2 for DE and CLP mapping. All IWUs must support Mode 1 and can optionally support Mode 2. If both modes are supported in the same IWU, then either mode can be used on a connection-by-connection basis. In Mode 1, the DE field value in an arriving frame relay frame is mapped to the ATM cell header CLP field. Every cell generated by AAL5 as part of the series must be sent with the same value. In Mode 2, the value of the DE field in the arriving frame relay frame is immaterial. The IWU always sets the CLP to either a "0" or "1" bit on a connection-by-connection basis. This decision regarding the "0" or "1" bit is made when the frame relay-to-ATM connection is set up. In the ATM-to-frame relay direction, things are simple also. There are again two modes, the first mode mandatory and the second mode optional. If both modes are supported in the same IWU, then either mode can be used on a connection-by-connection basis. In Mode 1, if one or more cells belonging to the same frame (series of AAL5 cell payloads) arrive at the IWU with the CLP bit set to a "1" bit, then the IWU sets the DE field in the frame relay frame to a "1" bit also. In Mode 2, the value of the CLP field in the arriving cell AAL5 payloads is immaterial. The IWU always sets the DE bit to either a "0" or "1" bit on a connection-by-connection basis. This decision regarding the "0" or "1" bit is made when the ATM-to-frame relay connection is set up.
Service Interworking FECN and EFCI Mapping
Next consider FECN and EFCI mapping. The frame relay-to-ATM direction is examined first. The IWU can use either Mode 1 or Mode 2 for FECN and EFCI mapping. All IWUs must support Mode 1 and can optionally support Mode 2. If both modes are supported in the same IWU, then either mode can be used on a connection-by-connection basis. In Mode 1, the FECN field value in an arriving frame relay frame is mapped to the ATM cell header EFCI field. Every cell generated by AAL5 as part of the series must be sent with the same value. However, since the EFCI field is used to indicate congestion on the ATM network itself, which might actually be present, the use of this mode requires careful consideration of the effects on the ATM end-user equipment. The receiving ATM equipment might only be able to tell the sending IWU to "slow down," not necessarily the frame relay source! In Mode 2, the value of the FECN field in the arriving frame relay frame is immaterial. The IWU always sets the EFCI field to a "0" bit (no congestion) for every ATM payload on every connection. However, when operating in either mode, the IWU always has the option of setting the EFCI bit to a "1" bit (congestion) whenever the IWU thinks it should. So even if the FECN is not set to a "1" bit, the IWU can always set the EFCI bits to a "1" bit if there is congestion on the ATM network. This IWU consideration always overrides the previous rules. In the ATM-to-frame relay direction, things are even simpler. There is only one mode and one rule. If the EFCI field in the last ATM cell belonging to the same frame (series of AAL5 cell payloads) arrives at the IWU set to a "1" bit, then the IWU sets the FECN field in the frame relay frame to a "1" bit also. In frame relay-ATM service interworking, as in frame relay-ATM network interworking, the frame relay BECN bit is not mapped to any ATM value. However, in the ATM-to-frame relay direction, the value of the BECN bit is always set to a "0" bit (no congestion).
Transparent Mode and Translational Mode
Only one IWU is ever needed for frame relay-ATM service interworking, and the ATM network devices have no awareness of frame relay as either the origin or destination of application information. In other words, frame relay devices see only frame relay, and ATM devices see only ATM, but in reality the endpoints are on different types of networks entirely. Because there is no frame relay frame to transport or reconstruct on the ATM network, service interworking does not really employ an FR-SSCS at all in the IWU or the ATM host, although the sublayer position is still defined. All that is used in the frame relay-to-ATM direction is the null-SSCS AAL5 frame, which is just a way of saying, "Take the user information out of the frame relay frame and stick an AAL5 trailer on it." The same null-SSCS AAL5 works in the other direction as well, from ATM to frame relay. When the IWU receives a series of AAL5 payloads, the AAL5 trailer is stripped off and the reassembled contents passed to the Q.922 core function in the IWU. The whole process sounds more complex than it really is. The important point is that the frame relay frame, in the form of the FR-SSCS PDU, never presents itself to, nor is generated by, the ATM host in service interworking. Frame relay frame-to-ATM cell mapping is all performed in one IWU.

There are also two modes of frame relay-ATM service interworking. These are called transparent mode and translational mode. Transparent mode is the simpler of the two. In the transparent mode of frame relay-ATM service interworking, the contents of the frame relay frames (in the frame relay-to-ATM direction) or the AAL5 PDUs (in the ATM-to-frame relay direction) are passed without change from one form to the other by the IWU. The transparent mode is useful when the end-user application protocols used above the frame relay and ATM protocol layers are identical. That is, the protocol inside the frame relay frame or ATM cell payloads is transported in exactly the same fashion, regardless of whether frame relay or ATM is used. However, there are not many cases where frame relay and ATM carry even the same higher-layer protocols in precisely the same fashion. Even the extremely common case of IP packets sent with frame relay-ATM service interworking cannot use transparent mode. This is because frame relay encapsulates IP packets according to RFC 2427 (and FRF.3.1 IA), while ATM encapsulates IP packets according to RFC 1483 (the method also surveyed in the later RFC 1932).
Whenever the encapsulation of the information inside the frame relay frames and the ATM cells is not exactly the same, the translational mode of frame relay-ATM service interworking is required. The two encapsulation methods must be translated to one another so that the endpoints, one a frame relay device and the other an ATM device, remain unaware of the differences in payload packaging. The IWU used in frame relay-ATM service interworking can be placed in one of three main positions in a mixed frame relay and ATM network. These positions are shown in Figure 12.8, along with the protocol stacks used in each device. The IWU (or IWF) can sit between the frame relay network and the ATM device, between a frame relay network and an ATM network, or between the frame relay device and the ATM network.
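For the common routed-IP case, translational mode is essentially a header rewrite. The sketch below swaps the RFC 2427 routed-IP label for the LLC/SNAP form that RFC 1483 uses on the ATM side; the constant values are the published ones, but the function is only an illustration of the idea, not a complete IWU.

RFC2427_ROUTED_IP = bytes([0x03, 0xCC])        # Q.922 UI control + NLPID(IP)
RFC1483_ROUTED_IP = bytes([0xAA, 0xAA, 0x03,   # LLC header for SNAP
                           0x00, 0x00, 0x00,   # OUI (none; EtherType follows)
                           0x08, 0x00])        # EtherType: IPv4

def translate_fr_to_atm(fr_payload: bytes) -> bytes:
    # Rewrite an RFC 2427 routed-IP payload into its RFC 1483 form.
    if not fr_payload.startswith(RFC2427_ROUTED_IP):
        raise ValueError("not RFC 2427 routed IP; other cases need other rules")
    ip_packet = fr_payload[len(RFC2427_ROUTED_IP):]
    return RFC1483_ROUTED_IP + ip_packet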
Figure 12.8 Frame relay-ATM service interworking.
Note the presence of the optional translational mode of operation in the IWU. If translation from frame relay RFC 2427 (and FRF.3.1 IA) encapsulation to ATM RFC 1483 encapsulation is needed, as is most likely, this is where the translation takes place. FRF.8 details all of the possibilities. The details are meaningless without a knowledge of AAL5 encapsulation, and such details are far beyond the scope of this chapter. Interested readers are referred directly to the relevant sections of FRF.8.
Related Issues
All frame relay-ATM service interworking is essentially one-to-one mapping between DLCIs and ATM VPI/VCIs (VCCs). The DLCI in the FR-SSCS PDU is not available for many-to-one mapping in service interworking. But what about the frame relay link management flows and status messages on DLCI=0? Since one end of the service interworking scenario is always non-frame relay, it makes little sense to try to map status messages to ATM VCCs. However, frame relay link management and status messages should be mapped to ATM OAM flows, and back again in the ATM-to-frame relay direction. This is not a trivial task.

One further frame relay-ATM service interworking issue is rather subtle, but important. Frame relay frame contents will require more bandwidth when packaged as a stream of ATM cell payloads than they do on the frame relay network. The cell header overhead must be considered, as well as the padding and AAL5 trailer, although the padding and trailer effects should be minimal compared to the cell header overhead tax. Why would anyone care? Well, frame relay connections are defined by CIR, Bc, and Be. Most ATM connections that carry frame relay types of information are defined using the Peak Cell Rate (PCR), Sustainable Cell Rate (SCR), and the Maximum Burst Size (MBS). ATM parameters are measured in units of cells per second. Now, the ATM equivalents of the frame relay CIR and other parameters can be used to translate frame relay connection bandwidth requirements into ATM parameters. However, any such conversion must take into account the increased bandwidth needed due to the ATM overhead involved. If done improperly, many cells containing frame relay frame contents will potentially be tagged with the CLP bit set to a "1" bit (discard okay) by the ATM switches in the ATM network. This will happen even when the frame relay source is scrupulously observing the CIR. In a worst-case scenario, a lot of frame relay traffic is discarded by the ATM network and there is nothing that the frame relay users can do about it!
So, based on the maximum size of the frame relay frame sent on the connection, which determines the maximum number of ATM cell headers that must be added, careful calculations must be carried out to make sure that the ATM parameters add sufficient bandwidth to the frame relay CIR so that traffic does not disappear on the ATM network. The ATM Forum has released some guidelines to assist in this traffic load mapping process. FRF.8 includes many pages on how frame relay status messages are passed to ATM devices, how a frame relay IP ARP is mapped to and from an ATM IP ARP, what can go wrong, and related issues. All of these issues should be investigated in the pages of FRF.8 itself.
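The overhead arithmetic is easy to make concrete. The sketch below counts the cells needed for a frame's contents and the resulting inflation factor, using only the AAL5 numbers already discussed (8-octet trailer, 48-octet payloads, 53-octet cells); it is no substitute for the ATM Forum's traffic-mapping guidelines.

import math

def atm_cells_for(frame_len: int) -> int:
    # Cells needed for frame_len octets of contents plus the 8-octet AAL5 trailer.
    return math.ceil((frame_len + 8) / 48)

def atm_bandwidth_inflation(frame_len: int) -> float:
    # Octets on the ATM wire divided by octets of frame contents.
    return atm_cells_for(frame_len) * 53 / frame_len

# For example, 100 octets of contents need 3 cells (159 octets), an inflation
# of about 1.6, while 1,600 octets need 34 cells (1,802 octets), about 1.13.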
ATM and Cell-based Frame Relay
If "ATM is for everything," then ATM should be for frame relay too. And ATM is. In many service provider networks, an ATM network of ATM backbone switches can provide the standard interface connectivity between service provider switches that is lacking in frame relay. Frame relay does not define a standard, network node-to-network node interface between frame relay switches. X.25, the frame relay precursor, did not either, and if it is assumed that a network will consist of switches from a single vendor (or even from one dominant vendor), then vendor-specific interfaces will not be a problem. The same vendor's equipment should interoperate painlessly with its kin, and small groups of other vendors will have to emulate the dominant vendor's interface (assuming it is well-enough known) in order to sell their devices at all. This approach worked fine for voice switches for years, as AT&T dominated the market in the United States, and there were never more than about a half-dozen public local exchange voice switch vendors in the first place. But LANs and data networking changed all this. With things like Ethernet LANs, there were 60 or 70 vendors. And if the OSI RM did nothing else, it demonstrated the power and wisdom of interoperability through standardization. When there are many and conflicting choices, people and organizations suffer from a type of paralysis and are reluctant to buy anything, because they might buy the wrong package and be left with devices that are as useful as the Sony Betamax in the basement.

So if frame relay switch vendors wanted to avoid the vendor-specific switch interface dilemma, they had to come up with something that a service provider could use as a standard interface between frame relay switches, no matter whose they were. Of course, this standard interface remains an option. If a service provider wanted to use the vendor-specific interface, that was all right too. Faced with the choices of adapting the frame relay network-network interface (NNI) for interswitch communications, inventing a new, frame-based interswitch protocol, or using ATM as the frame relay interswitch protocol, most frame relay switch vendors chose the ATM option. Frame relay NNI procedures were overkill for the relative simplicity of frame relay switch-to-switch communications. No one knew how long it would take to develop a new interswitch protocol for frame relay (some work has been done and continues). In the meantime, ATM was available, ATM was the international standard for cell relay, and ATM was for everything.

There are actually two ways to employ ATM to link frame relay switches. The first makes the frame relay switches look like ATM premises devices on the ATM network. The frame relay switches have only an ATM interface, however. The second way, which is becoming very common, is to make the frame relay switch into an ATM switch at heart. Now the ATM (frame relay) switch UNIs are still frame relay UNIs, but the links between the ATM switches with frame relay UNIs are now ATM network-node interfaces (NNIs). This sounds complex, so a few more words might be needed to straighten it all out.
Frame Relay Switches and ATM
Frame relay frames are sent between frame relay switches within the same organization in one of two main ways: with frame-based protocols or with a cell-based protocol (ATM). Usually, the frame-based protocol will be the vendors' own way of sending frame relay frames between the frame relay switches. These frames are sent on normal, point-to-point private lines. If one frame relay switch must be able to send frames to another, a direct link is needed.
But all this does is replicate the user’s mesh-connectivity problem at the service provider network level. As the frame relay network grows, so does the number of links needed to connect all of the frame relay switches together. Constructing partial meshes will create natural bottlenecks unless the user traffic patterns somehow mimic the physical connectivity, which is rare. The attraction of ATM as a standard interswitch protocol extends to the attraction of using the ATM backbone connectivity for frame relay switch connectivity. Since most service providers today have extensive ATM backbone networks for their merged voice, video, and data traffic, frame relay can simply ride the ATM backbone as another data service type. In this scenario, each frame relay switch has at least one ATM UNI. The ATM UNIs lead to the service provider’s ATM backbone cloud. Of course, all of the frame relay switches still have a number of frame relay UNIs which lead to the customers’ FRADs. This method of cell-based frame relay is shown in Figure 12.3.
Figure 12.3 Frame relay switches as ATM users. Note that this onion structure presents only frame relay services to the users. However, this method makes good use of the service provider’s ATM backbone. Now ATM provides the virtual connectivity needed for the frame relay network without meshed links, and ATM can offer quality of service that other types of networks cannot. However, since the users remain firmly in the frame relay world of variable-length data units, cells are of no direct benefit to the frame relay users. Although the reduced ATM delay is obviously a plus, lower delays are not exclusive to cell-based networks.
ATM Switches and Frame Relay
ATM backbone network switches shuttle cells very rapidly from input port to output port. The ATM network switches are as indifferent to the content of the cells as frame relay switches are to the content of the frame relay frames. It is the edge ATM devices, the customer premises equipment with the AALs, that load and process the content of the ATM cell payloads. But what if the customer premises equipment is still a FRAD? Then variable-length frame relay frames would show up at an ATM switch on the service provider's ATM backbone. But since ATM is for everything, why not just take the FRAD's frame relay traffic and pipe it into the ATM switch network? This is the second approach to cell-based frame relay. All of the frame relay switches are actually ATM switches on the service provider's backbone. How do the frame relay frames become cells? They become cells with a special ATM device known as a port adapter. The frame relay port adapter (there are several types of ATM switch port adapters) will take the frame relay frames to and from the frame relay UNI and perform the proper AAL manipulations to allow the frame relay traffic to enter and exit the ATM switch. This approach is shown in Figure 12.4.
Figure 12.4 Frame relay and ATM switch port adapters.
Note that the main difference between this approach and the first is that the switches themselves are frame relay switches in the first case and ATM switches in the second case. All of the service provider’s switches in this second method are ATM switches, but some of the users on the switch, and perhaps even all of them, are frame relay users. This certainly demonstrates the flexibility of ATM. Since frame relay users now are really using an ATM switch (but through a frame relay UNI at each end), the quality of service advantages of ATM networks should be more readily apparent to them. It is sometimes claimed that cell-based frame relay of one sort or another is more or less a requirement for voice and video over frame relay. However, given the rapid improvement of voice and video digitization techniques and application timing compensation, it no longer seems obvious that cell-based frame relay will automatically provide better voice and video services than something else.
ATM Networks
ATM, like almost everything else, is a layered protocol. Frame relay, and even the Internet protocol suite, have layers that more or less closely align with some or all of the layers of the OSI RM. Frame relay, for instance, uses an abbreviated version of the OSI RM's Physical Layer (L1) and Data Link Layer (L2) for data transfer, and a very light version of the Network Layer (L3) for switched virtual circuits and network management functions. In contrast, the layers of the ATM protocol stack have no relationship at all with the layers of the OSI RM. The reason is simple enough: ATM is for everything, and the OSI RM is for data. So out with the OSI RM and in with ATM. The layers of the ATM protocol stack are shown in Figure 12.1. The main tasks of each layer are also shown in the figure. In a world where all applications could generate, understand, and receive ATM cells directly, there would be no need for the ATM Adaptation Layer (AAL). Since the vast majority of existing hardware and software still generate, understand, and receive only variable-length data units (packets and frames), the cells have to come from somewhere. This is the job of the AAL: make the cells outbound (segmentation) and interpret the cells inbound (reassembly). The AAL also usually slaps on some additional headers and trailers before segmenting variable-length data units, a process done at the convergence sublayer (CS) of the AAL. Initially, there were many types of AAL envisioned, basically one for every major traffic type on an ATM network. ATM connections must be established between like AAL types. AAL5, for example, will not interoperate with AAL1 across the ATM network. Recently, the philosophy has been to employ only one or two AALs for every application, usually AAL1 for ATM voice and AAL5 for everything else. Of course, there are still many different traffic types. But now these differences are approached more from the ATM layer perspective instead of the AAL.
Figure 12.1 The layers of the ATM protocol stack.

The ATM layer is the heart and soul of ATM. This is where the switching and multiplexing of the cells occurs. Cells arriving from the AAL are switched and multiplexed on the virtual circuits from ATM layer to ATM layer until the cells reach their destination AAL. ATM virtual circuits are labeled by a virtual path identifier (VPI) for site-to-site connectivity and a virtual channel identifier (VCI) for device-to-device connectivity.

While once it was planned that there should be many AALs all sharing the same ATM layer, this proved to be impractical. The AALs are only needed in the customer premises equipment (CPE) where the cells are made and consumed, not in the network ATM switches themselves, where the cells are just relayed around. So the service providers had no real access to or control over the ATM cell contents, which is, of course, all the users themselves really cared about. The issue was resolved by the ATM Forum, an almost exact analog of the Frame Relay Forum for ATM, with the definition of cell services at the ATM layer that the service provider could offer to the AALs in the CPE. So now there are five services defined at the ATM layer itself:

1. CBR (Constant Bit Rate). This service is for applications such as noncompressed, 64 kbps PCM voice, which always generate the same number of bits per unit time and need low and stable delays.
2. VBR-rt (Variable Bit Rate, real time). This service is for applications such as compressed video, which generate a variable number of bits per unit time, but which still need a bounded and stable delay (real time).

3. VBR-nrt (Variable Bit Rate, non-real time). This service is for applications which generate a variable number of bits, but which do not require bounded or stable delays to function. However, a certain minimum amount of bandwidth is always needed for the application. Video can still go here, but again with no delay guarantees.

4. UBR (Unspecified Bit Rate). The ATM version of flying standby. No delay limit is placed on the cell traffic, nor is any minimum bandwidth reserved for these applications. Things like credit card verification are often put into this category.

5. ABR (Available Bit Rate). This service is for applications such as traditional LAN-to-LAN client-server traffic. There is a minimum amount of bandwidth assigned, and applications can burst above this if more bandwidth is available. (Sounds—and is—a lot like frame relay service.)

The precise structure of the ATM cell header is unimportant for the purposes of this discussion. It is enough to note that the ATM cell header contains many of the same features as the frame relay frame header, as might be expected from two fast packet technologies created with the same goals and aims in mind. So the frame relay connection identifier, the DLCI, becomes the ATM VPI/VCI. The frame relay DE bit becomes the ATM Cell Loss Priority (CLP) bit but has exactly the same function. So CLP=0 means "Please don't discard this cell," and CLP=1 means "Okay to discard the cell under congested conditions" on the ATM network. Finally, the FECN bit becomes the ATM Explicit Forward Congestion Indicator (EFCI) bit, just to make sure everyone is paying attention. Again, the function and intent is the same for EFCI as for FECN.

Oddly, to those familiar with frame relay, there is no BECN bit in ATM. This is because frame relay connections are bidirectional by definition, so that a DLCI from A to B always spawns a like-numbered DLCI from B to A on the frame relay network. But since ATM is for everything, and many applications are unidirectional (alarm systems, video cameras, etc.), VPI/VCIs are by definition unidirectional. So an ATM connection established from A to B will not automatically produce a connection from B to A unless the ATM network is explicitly told to do so by hand or by signaling protocol. So the frame relay BECN, without a guaranteed return path, is simply dropped in ATM.

The lowest layer of the ATM protocol stack is the physical layer. It carries the ATM cells inside certain standard and approved transports such as DS-3, E-3, or Sonet/SDH (OC-3c is the B-ISDN recommendation). The DS-3 (for example) is still a DS-3, but carries ATM cells instead of 28 DS-1s or a series of very fast frame relay frames.

Typically, the CPE of an ATM network is a small ATM switch all by itself. That is, the CPE can easily complete voice calls and data connections within the same building without sending the cells to, and getting them back from, the service provider. This is essentially how ATM can be a LAN as well as a WAN. However, the full networking capabilities of ATM premises devices, as opposed to the simple functioning of devices such as FRADs, meant that ATM started out as more expensive than almost any other type of networking and has tended to stay that way.
Naturally, the fact that no existing hardware, software, or applications even knew what a cell was (while everything understood variable-length frames) meant that ATM has faced an uphill battle for LAN acceptance since Day One.
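The field correspondences described above (DLCI to VPI/VCI, DE to CLP, FECN to EFCI, BECN dropped) are mechanical enough to sketch. The mapping table and the values below are invented for illustration; real interworking follows the FRF.5 and FRF.8 agreements.

```python
from dataclasses import dataclass

@dataclass
class FrameRelayHeader:
    dlci: int   # connection identifier
    de: int     # Discard Eligibility (1 = okay to discard)
    fecn: int   # forward congestion notification
    becn: int   # backward congestion notification

@dataclass
class AtmCellHeader:
    vpi: int    # site-to-site path
    vci: int    # device-to-device channel
    clp: int    # Cell Loss Priority (same role as DE)
    efci: int   # same role as FECN

# Hypothetical table mapping each DLCI to a VPI/VCI pair.
DLCI_TO_VPIVCI = {16: (1, 100), 17: (1, 101)}

def map_fr_to_atm(fr: FrameRelayHeader) -> AtmCellHeader:
    vpi, vci = DLCI_TO_VPIVCI[fr.dlci]
    # DE copies to CLP and FECN copies to EFCI; BECN has no ATM
    # counterpart because ATM connections are unidirectional.
    return AtmCellHeader(vpi=vpi, vci=vci, clp=fr.de, efci=fr.fecn)

print(map_fr_to_atm(FrameRelayHeader(dlci=16, de=1, fecn=1, becn=1)))
```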
ATM Network Architecture

ATM networks consist of two main equipment types and two main interfaces. The main equipment types are the premises ATM switch, also known as the edge ATM switch, and the backbone ATM switch, also known as the core ATM switch. Usually, the edge switches have the AALs and thus the required non-ATM interfaces (e.g., 10 Mbps Ethernet) for premises use and only one ATM interface, while the backbone switches have no need of AALs but are bristling with large numbers of ATM interfaces.
The two main interfaces are the User-Network Interface (UNI) and the Network Node Interface (NNI). Note that the acronym NNI, familiar in frame relay as the network-network interface, does not mean the same thing in ATM. ATM, virtually uniquely among networking protocols, defines a standard interface between the ATM switches themselves. Such an interface has traditionally been left as vendor-specific in other protocols (e.g., X.25, frame relay, and many others). Again, the "ATM is for everything" philosophy forced the ATM standards groups at the ITU-T to address every detail of the network. Vendor interoperability, a luxury in purely data networks, becomes a must in a world where literally everything from voice to video to data runs on an ATM switch platform. This general architecture of an ATM network is shown in Figure 12.2. There are many details and variations that are of interest when the main topic is ATM, but for understanding ATM and frame relay, this level of detail is quite sufficient.
Figure 12.2 The ATM network architecture.

Note that there is a network-network interface defined for ATM networks. The network-network interface is not called the NNI (that's the network node interface in ATM). The network-network interface between different service providers' ATM networks is called the Broadband Intercarrier Interface (B-ICI) in ATM. The "B" reflects the fact that ATM was initially developed as the transport network for so-called broadband ISDN services (B-ISDN), services which were supposed to appear between approximately 1990 and 2020, until everyone figured out in 1995 that everything B-ISDN promised for tomorrow, the Web delivered today. But that's another story. The point is that where ATM was once firmly tied to B-ISDN, ATM networks can be used today to provide high-speed backbone WAN services for voice networks, private data networks, or even public frame relay networks.
ATM Characteristics

ATM is intended as the switching and multiplexing standard technology for B-ISDN. Until B-ISDN comes in all its glory, a lot of emerging broadband (lotsa bandwidth) and multimedia (low delays) applications can be supported on ATM networks. ATM has the same bandwidth-on-demand virtual circuit support as frame relay, but the data units are fixed-length cells and not variable-length frames. ATM has virtual circuit identifiers, similar to DLCIs. However, in ATM the connection identifiers are hierarchical and are used for site-to-site connectivity (the Virtual Path Identifier or VPI) and device-to-device connectivity (the Virtual Channel Identifier or VCI). So ATM switches can switch all traffic between sites just by VPI. The site ATM switch can distribute traffic to the proper device based on the VCI.

It is sometimes said that "ATM is for everything" and this is not far off. ATM is one of the few network technologies that was not specifically intended for voice or video or data, but for all three and more besides. Frame relay was designed as a data network, and although voice support has been added, that is the whole point: ATM builds in what others have to add on. Also, ATM networks use exactly the same hardware, software, and protocols whether the clients and servers being linked are down the hall (LAN) or across the country (WAN). Every other protocol has been designed as either a LAN or a WAN, so no one (for example) would ever dream of building a frame relay LAN.

Finally, the fixed size of the ATM cell was intended to make hardware (chip-level) implementations much easier and more effective. Chips always process fixed numbers of bits, and even though routers and other network nodes that process variable-length data units have become faster and faster, an ATM switch can always be designed to run even faster due to the fixed length of the ATM cells being processed. ATM was never designed to run at one or two fixed speeds. So ATM devices and switches can run at a wide variety of speeds, which makes network and switch upgrades easy and cost-effective. In contrast, other technologies such as frame relay and Ethernet required years of work and much rewriting of standards to increase their speeds from 1.5 Mbps to 45 Mbps and 10 Mbps to 100 Mbps, respectively.
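A few lines illustrate the two-level identifier just described: backbone switches can forward on the VPI alone, while the edge switch at a site fans cells out by VCI. The table contents here are invented for the example.

```python
# Backbone switch: forwards on the VPI alone, ignoring the VCI.
backbone_table = {1: "port-to-site-A", 2: "port-to-site-B"}

# Edge switch at site A: distributes arriving cells by VCI to devices.
site_a_table = {100: "router-1", 101: "pbx", 102: "video-codec"}

def backbone_forward(vpi: int, vci: int) -> str:
    return backbone_table[vpi]   # VCI is untouched in the core

def edge_forward(vci: int) -> str:
    return site_a_table[vci]     # VPI was already consumed in the core

print(backbone_forward(1, 101))  # port-to-site-A
print(edge_forward(101))         # pbx
```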
Frame Relay and Compression Revisited

This chapter will close with a closer look at data compression and frame relay. Some general comments about data compression and frame relay closed Chapter 6 of this book in a congestion control context. But now, armed with a more detailed knowledge of frame relay encapsulation techniques, more of the details of frame relay data compression can be investigated. The relationship between frame-relay-ATM interworking and compression of the data content inside frame relay frames is not as far-fetched as it may seem at first. When frame relay frames are sent onto an ATM network, the bandwidth required increases due to the ATM cell header overhead tax. There is a real benefit in minimizing the size of the frame relay frame, and thus minimizing the impact and severity of the ATM tax.

Nothing in frame relay requires or forbids the content of a frame relay frame used for data transfer to be compressed. Note that this statement applies to the data contents of frame relay frames. Voice and video, when carried inside frame relay frames, have their own set of compression rules, many of which have been discussed in Chapter 9.

While it is common enough for data to be compressed for purposes such as hard-drive conservation, compression of files for LAN transfer is almost unheard of, given the abundance of bandwidth on most LANs. But when files are to be sent on a WAN, things are different. It is always a benefit to the user, and even to the service provider, to employ data compression for WAN file transfer. Users will have faster file transfers and service providers will be able to carry more traffic per unit time. The problem is that unless each user can employ some form of data compression, and compatible forms at that, frame relay WANs can do nothing about data compression at all. That is, until the Frame Relay Forum came up with the FRF.9 IA, the Data Compression Over Frame Relay Implementation Agreement. FRF.9 can be used to employ data compression between FRADs while the users remain totally unaware that any data compression is taking place at all.

FRF.9 outlines two key processes relating to data compression. First, FRF.9 describes a Data Compression Protocol (DCP) that can be used for data compression. Second, FRF.9 describes a Default Data Compression Function Definition (DCFD) for compatibility purposes between frame relay devices that wish to use data compression. The DCP defines a way to encapsulate FRF.9 compressed data inside an unnumbered information (UI) frame relay frame in the same fashion as FRF.3.1 encapsulates multiple protocols without using data compression. In a sense, FRF.9 defines another multiprotocol encapsulation, this time for data compressed by multiple methods, one of which forms the default method of data compression.

The DCP itself is composed of two sublayers: a DCP control sublayer and a DCP function sublayer. The DCP control sublayer is mainly in charge of:

Encapsulating the DCP PDU inside frame relay frames for transport

Negotiating optional DCP formats and procedures, as well as optional DCP functions and parameters

Making sure the sender and receiver DCP peers always know what is going on

Allowing compressed and uncompressed data to share the same frame relay connection (called anti-expansion protection)

Identifying various DCP contexts (sets of parameters and the like)
The DCP function sublayer is mainly in charge of:

Encoding the user data for compression using a variety of algorithms

Decoding the arriving compressed data correctly

DCP supports not only publicly available compression schemes, but a wide variety of proprietary techniques as well. Since there is so much diversity in actual compression implementations, each specific DCP function sublayer requires a totally separate DCFD (Data Compression Function Definition) document to make sure vendors of FRF.9 products can all interoperate with one another when using a specific type of compression.
Data Compression Protocol Basics

The whole trick of data compression in frame relay is to establish a special Network Layer Protocol Identifier (NLPID) that allows receivers to determine whether data compression according to FRF.9 is taking place or not. This is done in FRF.9 by making NLPID 0xB0 (1011 0000) the identifier for determining the use of FRF.9 data compression inside a frame relay frame that conforms to FRF.3.1 multiprotocol encapsulation rules, as discussed in the previous chapter. The structure of a frame relay frame employing FRF.9 data compression is shown in Figure 12.9. This is now an FRF.9 frame. This is a generalized format and there are variations on this basic theme.
Figure 12.9 Data compression using FRF.9.

The DCP header plays a key role in the process, naturally. The Data Compression Context Identifier (DCCI) is optional, and if present is always 2 octets in length. If the DCCI is absent, then the default data compression method is being used inside the frame. The bulk of the frame is made up of the compressed data itself, called the Data Compression Function Definition (DCFD) data.

The most important octet in the whole process is the first octet of the DCP header, which must always be present. For the purposes of data compression, it makes more sense to discuss the bits in reverse order, from right to left instead of from left to right. The last bit is the control/data (C/D) bit, which allows receivers to recognize whether the arriving frame contains compressed data or some control information. If there is any compressed data in the frame, the C/D bit is set to a "0" bit. If the C/D bit is set to a "1" bit, this means that the two DCCI octets are present in the DCP header. The next 3 bits to the left of the C/D bit are reserved for future use. The next 2 bits to the left are the Reset-Acknowledgment (R-A) and Reset-Request (R-R) bits. These are used so that the decompression processes can resynchronize with each other and basically say, "I have no idea where we are here. Go back to square one." Why 2 bits? Two are used so that each receiving process can independently request a reset of the other and acknowledge a reset request from the other end.
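Pulling the bit positions together (including the C/U and E bits described in the next paragraph), here is a minimal sketch of checking the FRF.9 NLPID and unpacking the mandatory first octet of the DCP header. The exact placement of R-A versus R-R within their 2-bit field follows the right-to-left description in the text and should be checked against FRF.9 itself.

```python
NLPID_FRF9 = 0xB0  # NLPID announcing FRF.9 data compression

def is_frf9_frame(nlpid: int) -> bool:
    return nlpid == NLPID_FRF9

def parse_dcp_first_octet(octet: int) -> dict:
    """Unpack the first (mandatory) DCP header octet as described in the
    text; R-A/R-R placement within their field is an assumption."""
    return {
        "E":   (octet >> 7) & 1,  # extension: always 1 (header not extended)
        "C/U": (octet >> 6) & 1,  # 1 = payload compressed, 0 = uncompressed
        "R-R": (octet >> 5) & 1,  # reset request
        "R-A": (octet >> 4) & 1,  # reset acknowledgment
        # bits 1-3 are reserved for future use
        "C/D": octet & 1,         # 0 = data frame, 1 = control information
    }

print(parse_dcp_first_octet(0b1100_0000))  # E=1, compressed data frame
```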
The compressed/uncompressed (C/U) bit comes next, moving from right to left. This bit provides the anti-expansion protection needed so that compressed and uncompressed frames can be mixed on the same DLCI while still using the FRF.9 frame format. If the C/U bit is set to a "1" bit, then the data inside the frame has been compressed. If the C/U bit is set to a "0" bit, then the data inside the frame has not been compressed, and an attempt by the receiving process to apply decompression rules will lead to errors. The C/U bit is a nice way of allowing the application processes above the FRF.9 process to communicate in normal, uncompressed mode without having to turn the FRF.9 processes completely off at each end of the virtual circuit.

The first bit of the DCP header is the extension (E) bit, used mainly for compatibility with Q.931 specifications regarding the coding of variable-length information elements. This bit indicates whether a header extends into other octets. Since the DCP header is never extended in this way, the E bit is always set to a "1" bit (no extension: last octet of header).

The FRF.9 document includes many pages of rules for allowing endpoints to configure themselves for various public and private compression schemes. These details are unimportant for the purposes of this chapter. As always, interested readers are referred to the relevant documentation. FRF.9 closes with an Annex that details the default DCFD (Data Compression Function Definition). As mentioned in Chapter 6, the default compression method is the Lempel-Ziv compression method from Stac, Inc. (LZS), adapted for DCP use.

This book has examined frame relay from many perspectives. The functioning of the frame relay protocols and interfaces has been explored, and the relationships between frame relay and other networking technologies such as SNA, IP, and ATM have all been investigated. However, one aspect of frame relay has been almost ignored until the very end of this book. This is the future of frame relay. What do the next few years—and beyond—hold for users, customers, and service providers when it comes to frame relay?
Chapter 13: The Future of Frame Relay

Overview

Technologies all have a life cycle and network technologies are no exception. A new technology is born, flourishes to a greater or lesser extent, then is eclipsed by the next new thing down the pipe. What seems like a great idea at the time is later seen as just a quaint approach to a given situation. The timeline for a technology from birth to obsolescence can be fast or slow, depending on many factors, some of which are only peripherally related to technology. Social and economic factors always play a role.

It is often thought that rapid obsolescence is the hallmark of modern technology. But this is not necessarily true. Many older technologies had the technological life span of a gnat, but they were important nonetheless. Consider the fabled Pony Express in the United States. Set up to span the prairies and deserts between the Mississippi River and gold-rush California with fast mail service in the early 1860s, the Pony Express had the misfortune of trying to compete with the telegraph soon after enormous sums had been spent on horses, riders, way stations, watering spots, and so on. After only 18 months of operation, the Pony Express was obsolete. Priority messages used the telegraph. Low-priority mail used the stagecoaches as before.

In spite of the demise of the Pony Express, most technologies never really disappear, especially those that have achieved the status of international standards. International standards usually just fade into irrelevance. Anyone can build networks using X.25 packet switching, but few bother to create an infrastructure based on the old way when frame relay represents the new way of packet switching, and fast packet switching at that.

What about frame relay itself? What does the future hold for fast packet switching at Layer 2 of the OSI-RM with variable-length frames? Is frame relay still on its way up as a technology, or has frame relay passed the midpoint of its useful life span, destined to spend the rest of its existence scrambling for an ever-decreasing niche marketplace while customers and service providers spend their network dollars elsewhere? Of course, if the future of all technologies were crystal clear, then the well-documented troubles of companies like Novell and Apple would never have come as a surprise to anyone. But the future of frame relay encompasses some trends that promise to keep frame relay a viable alternative in the network marketplace, with one possible exception discussed at the end of this final chapter. The exception is Gigabit Ethernet. The reason that Gigabit Ethernet might pose a threat to frame relay is not obvious, so a more complete discussion is needed.

So, this chapter is not and cannot be a blueprint for the future of frame relay. But this chapter can at least describe some of the new options and uses of frame relay ("improvements" would not be a bad word to use) that are available already or will be in the near future. Generally, the more flexible and future-proof a technology is, the brighter its future in the long run. Here is how frame relay is attempting to future-proof itself. All of these points are discussed in full in this chapter. In the future, users, customers, and service providers can look for frame relay to:

Support a range of UNI and NNI speeds based on Sonet/SDH.

Use ATM backbones with frame relay access.
Support more and better voice and video than ever before.

Be used more and more for global networking.

Employ more multicast virtual connections for a variety of purposes.

Explore using an IP/Internet backbone with frame relay access.

Find its place in a Gigabit Ethernet world.

Most of these items seem pretty much to follow well-known trends and Frame Relay Forum initiatives, and should really come as no surprise to anyone. But the last two items are admittedly an attempt to generate controversy. Frame relay with an ATM core network is one thing, but frame relay over IP? Why not skip frame relay and just use IP everywhere? And Gigabit Ethernet is a LAN technology, right? How could a LAN ever replace a WAN technology like frame relay? Well, this is the place to find out.
IP/Internet Backbone for Frame Relay

Surely the most startling aspect of this issue is that anyone would take frame relay UNIs—with their CIR bandwidth guarantees and virtual circuits with stable paths and therefore bounded delays—and funnel the UNI traffic over an IP or even Internet backbone—which has neither bandwidth nor delay guarantees of any type at all. Why subject user traffic arriving on frame relay to the perils of IP and the Internet? Yet this movement of frame relay access to an IP backbone is exactly what might happen, and very rapidly, in the near future of frame relay. Where once ATM was put forward as the backbone technology that would eventually conquer the world and ultimately replace frame relay UNIs, now IP seems to be replacing ATM in this equation. The simple answer for this IP replacement of ATM would be that while ATM is for everything, IP does everything, and right now, without waiting for a glorious B-ISDN future. However, there is more to it than that.

Consider the use of frame relay for access at the UNI. Frame relay software is bundled with almost all routers. A single UNI provides reachability to all user endpoints. So far, IP can do the same things. But when CIRs and bounded delays are added to the equation, the balance can tip in frame relay's favor, especially where traffic like SNA is concerned. So there are sound reasons for using frame relay for access instead of just linking IP routers with point-to-point leased lines.

However, on the backbone the frame relay advantage is not as obvious or clear-cut. A lot of bandwidth can cover for a multitude of QoS sins. Frame relay switch vendors have had a rather sedate and laid-back approach to screaming port speeds. After all, if frame relay works as advertised, then an 8 kbps CIR can do the work of many 64 kbps leased lines with no problem. This is one reason for the movement to ATM backbone switches for frame relay access: ATM port speeds are scaleable into the Sonet/SDH ranges while frame relay is not (although this should change soon). But today regular IP routers can support the same blinding port speeds as ATM switches. In fact, the current debate involves, "Is a router with ATM interfaces a switch? Or is an ATM switch that routes IP a router?" In many cases, the big backbone node can be either an ATM switch or an IP router; it just depends on who calls it what. So IP routers can offer larger bit pipes on the backbone for frame relay access.

Ironically, it is precisely this perceived lack of QoS guarantees that has led the IP people to develop newer and better ways to bring QoS to the IP world. Most of the exciting breakthroughs in network node queuing (Weighted Fair Queuing or WFQ), network resource allocation (Resource Reservation Protocol or RSVP), and so forth come not from the ATM groups, but from the IP groups. In fairness, it could be pointed out that IP is busily adding what ATM already has in many cases. But the currency of IP initiatives holds out the promise of being better suited for what users are actually doing on the network.

That brings the discussion back to frame relay. Only some 1 percent of the traffic on frame relay is currently voice. But about half of all frame relay frames carry IP packets, and many of these IP packets contain voice (and even video); this share is growing all the time. Of course, voice over frame relay is more mature than IP telephony, but this apparent edge will not last long. Many more resources are being devoted to IP telephony than to frame relay voice.
So why optimize a whole network backbone for frame relay when most of the traffic on it might become IP for everything very soon? Why not optimize the backbone (at least) for IP? The high-speed routers and Sonet/SDH links could then be put to work right away.
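The queuing advances mentioned above are easy to caricature. The sketch below is a simple weighted round-robin scheduler that captures only the flavor of WFQ (real WFQ computes per-packet virtual finish times); the queue names and weights are invented for the example.

```python
from collections import deque

# Hypothetical per-class queues with weights: each service round drains
# packets from every queue in proportion to its weight, so the voice
# class gets three transmission opportunities for every one data gets.
queues = {
    "voice": (3, deque(["v1", "v2", "v3", "v4"])),  # weight 3
    "data":  (1, deque(["d1", "d2", "d3", "d4"])),  # weight 1
}

def service_round(queues: dict) -> list[str]:
    sent = []
    for weight, queue in queues.values():
        for _ in range(weight):
            if queue:
                sent.append(queue.popleft())
    return sent

print(service_round(queues))  # ['v1', 'v2', 'v3', 'd1']
print(service_round(queues))  # ['v4', 'd2']
```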
The whole concept of frame relay access to an IP backbone is shown in Figure 13.3. The similarity with frame relay access to an ATM switch backbone is intentional. Just replace those expensive ATM switches with more cost-effective IP routers and stand back.
Figure 13.3 Frame relay access to an IP backbone.

It all makes perfect sense in many ways. Most frame relay networks use routers acting as FRADs to connect LANs. And what is running on almost all these LANs is IP. Frame relay has PVCs that need to be configured before anyone can communicate at all. Each PVC must also have a CIR configured, based on the assumed traffic load that the applications using the PVC will generate. But traffic loads, and even connectivity needs, can change rapidly. This is why CIRs must be constantly monitored, reviewed, and managed, at least until SVCs become much more common in frame relay than they are now.

As long as frame relay is used in a star pattern, with many remote sites linked by PVCs to a central location, this PVC/CIR burden is not too bad. More than half of all frame relay networks are still simple stars, thanks in large part to the prevalence of frame relay for SNA. But once IP routers are used for all site connectivity, and many-to-many LAN traffic patterns shift and change with user needs (any client to any server) on almost a day-by-day basis, establishing mesh-connected frame relay PVC patterns becomes a real pain, both financially (most service providers charge a monthly PVC maintenance fee) and administratively (someone has to set all of this up, watch it, and coordinate the changes with the service providers, network-wide).

IP routers are connectionless devices. The IP routing protocols exchange information and keep all of the routers informed of changes in reachability, topology, and so on. So with frame relay access to IP backbones, the local users can still configure their PVCs and CIRs on each frame relay UNI, but the service provider can just rely on the IP router backbone to provide connectionless routing, which is what IP is intended for anyway. After all, IP was invented as a WAN protocol for just these types of backbones. Routers were then called "gateways" and did not presume that IP was running anywhere else than on the WAN backbone. So in one sense, this frame relay access to IP backbone movement is a return to IP roots.

The more unpredictable the traffic flow, the more an IP backbone makes sense. The use of PVCs and CIRs on the backbone tends to freeze user traffic patterns into what some network management group determined previously. IP suffers from no such limitations, as the success of the Internet shows, even as many have predicted the Internet's imminent collapse. Real-world user traffic flows today are more unpredictable than ever. So let the local managers configure their PVCs on their UNIs. The service providers can coordinate at the local access switch and let the IP backbone routers run free. Maybe then the service provider monthly frame relay PVC maintenance fee (how do you "maintain" a switching table entry? See if it's still there once in a while?) will more accurately reflect the actual work involved.
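The administrative argument is easy to quantify. A star of N sites needs N-1 PVCs, while a full mesh needs N(N-1)/2, each with its own CIR to configure and monitor; a connectionless IP backbone needs none at all.

```python
def star_pvcs(sites: int) -> int:
    return sites - 1                 # every remote site homes to the hub

def full_mesh_pvcs(sites: int) -> int:
    return sites * (sites - 1) // 2  # one bidirectional PVC per site pair

for n in (5, 20, 100):
    print(n, "sites:", star_pvcs(n), "star PVCs vs",
          full_mesh_pvcs(n), "full-mesh PVCs")
# 100 sites: 99 star PVCs vs 4950 mesh PVCs, each with a CIR to manage
```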
Multicast Virtual Connections

About the only Frame Relay Forum Implementation Agreement (FRF IA) of any significance that has not been discussed in detail in this book is FRF.7, the Frame Relay PVC Multicast Service and Protocol Description IA. This is mainly because such multicast services are relatively new and require special frame relay devices not only to switch frames, but also to manufacture and switch multiple copies of the frames as well.

Multicast services are useful on networks when one traffic stream must be sent to multiple endpoints. Broadcasts differ from multicasts in the sense that broadcasts are just sent to everyone on the network, while multicasts are only sent to a defined subset of endpoints on the network. Multicast services are most often talked about in a video service context, but many data applications can also benefit from multicasts, even simple e-mail sent to a variety of users on different networks. Without multicasts, many copies of user information must be sent on the network to reach every destination. Since congestion is always a network-wide concern, multicasts can cut down on much of the traffic generated by the source, although at the price of added network node complexity. And large multicasts do not do much to cut down on network traffic near the recipients.

FRF.7 defines multicast services only on PVCs, and the multicast groups must all be set up in advance. So, there is no signaling protocol for multicast frame relay service as yet. Each multicast group has a number of members and each group also has a service description that determines how frames sent to each member should be treated. A service description would include such parameters as, "Should frames be discarded if the member is unavailable?"

In its simplest form, one frame relay user forms the root of the multicast group and all other members of the group form the leafs (preferred) or leaves (sometimes). (Apparently, in networking "leaves" fall from trees while "leafs" are aggregate units like the Toronto Maple Leafs.) The root initiates the multicast while the leafs receive the frames. The root only sends one copy of the frame into the frame relay network, but many copies pop out at the leafs. The job of replicating the frames arriving from the root is the responsibility of the frame relay network multicast server. The multicast server can be a server at a central location in the network that handles all multicasts, or it can be deployed as a distributed service with several multicast servers providing this function. FRF.7 says that there is no restriction as to where the multicast server(s) actually reside. They can even be external to the frame relay network. For the purposes of this discussion, the multicast server will be positioned as a single device inside the frame relay network. A single centralized multicast server is shown in Figure 13.1.
Figure 13.1 A single, centralized multicast server.
Frame relay multicast service is not limited to the simple one-in, many-out model in the figure. That model is a one-way multicast. In one-way multicast, all frames sent by the root to the multicast server are sent to the leafs. But no frames at all are ever sent in the leaf-to-root direction. This does not imply that there is no communication at all from leaf to root. It just means that if such leaf-to-root communication is needed (and it probably is), then a separate PVC is used for this purpose. This cuts down on the amount of work presented to the multicast server.

There is also a two-way multicast service defined in FRF.7. As before, the root sends to all the leafs through the multicast server. But now each leaf can also reply to the root, although none of these leaf frames are sent to the other leafs at all. The effect is much like a multipoint SNA link, with central polling at the root and leaf traffic all converging on the root.

Finally, there is true N-way multicast service. All members of a multicast group are now peers. All frames sent from any member, root or leaf, to the multicast server are sent to all other members of the group. It has been suggested that this type of multicast service could be used to allow routers to update routing tables more efficiently, since only one copy need ever be sent, not one copy on each DLCI to each router. In fact, it has been further suggested that N-way multicasts are a form of connectionless service support on the connection-oriented frame relay network. In any case, multicasting is a tool. How it is used is totally up to the user. Any valid DLCI can be used to identify a multicast PVC, which always leads to the multicast server, not the multicast group endpoints. This DLCI is now a multicast DLCI (MDLCI) to distinguish it from other, user-to-user DLCIs on the UNI. It should be pointed out that long ago, even before the Frame Relay Forum existed, there was a proposal to use DLCI 1019 for multicast services.

Why aren't multicast services already widespread on frame relay networks? Certainly, the perception of frame relay as a point-to-point private line replacement has much to do with this. The need to define and configure all the PVCs ahead of time is a limitation as well. And the multicast server is not only a service provider expense, and a significant one, but also a natural bottleneck on the frame relay network. Finally, while multicast services do cut down on traffic into the frame relay network, multicasting obviously does absolutely nothing about traffic sent out of the frame relay network. And when it comes to network congestion, traffic leaving the network is the critical factor.

So why is multicast here as the future of frame relay? Only because there are some services that make absolutely no sense at all without multicast capability. If these services become popular, or if frame relay shakes off its image as a perceived data-only service, then multicasts must be there and ready to contribute. For example, if frame relay is used on a large scale for video applications such as videoconferencing or broadcast television to residences, then multicast support is essential. And while the thought of residential frame relay might raise eyebrows, the idea is not as strange as it seems. Many organizations with telecommuters or home office workers already support analog dial-in frame relay UNIs.
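Stepping back to the three multicast modes for a moment, their replication rules are compact enough to sketch. The member names here are invented for the example; a real multicast server replicates by DLCI, not by name.

```python
def replicate(sender: str, frame: bytes, root: str,
              leafs: list[str], mode: str) -> dict:
    """Decide who receives a copy of a frame under the three FRF.7
    modes described above (a sketch only)."""
    if mode == "one-way":
        # Only the root transmits; every leaf gets a copy.
        return {leaf: frame for leaf in leafs} if sender == root else {}
    if mode == "two-way":
        # Root reaches all leafs; a leaf reaches only the root.
        if sender == root:
            return {leaf: frame for leaf in leafs}
        return {root: frame}
    if mode == "n-way":
        # Every member is a peer: copies go to all other members.
        return {m: frame for m in [root] + leafs if m != sender}
    raise ValueError(mode)

print(replicate("leaf-2", b"report", "hq",
                ["leaf-1", "leaf-2", "leaf-3"], "two-way"))
# {'hq': b'report'} -- leaf traffic converges on the root, SNA-style
```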
Newer and faster Digital Subscriber Lines (DSLs) such as Asymmetric Digital Subscriber Line (ADSL) are also starting to make a mark in the residential services arena. There are many variations of ADSL, all under the "xDSL" umbrella. So why not allow xDSL access to frame relay? The support could be added as simply as running frame relay on the home PC and placing a frame relay switch behind the xDSL central office equipment rack, the Digital Subscriber Line Access Multiplexer (DSLAM). This architecture is shown in Figure 13.2. Of course, frame relay has to compete with IP and ATM for the same role in the xDSL environment.
Figure 13.2 Frame relay and xDSL.
What has all this to do with multicast? If frame relay is used for data services to the home, why not use frame relay for video services to the home as well? Why not one-way multicast for premium or standard channels? Two-way for interactive, pay-per-view services? N-way for community meetings? If any of this makes financial sense and becomes socially acceptable, then it will happen somewhere. If this last section seems somewhat speculative about the future of frame relay, the last two items of the list will be pure conjecture. No one knows how far IP and the Internet will push frame relay networks. And no one knows how users will accept Gigabit Ethernet, or even appreciate why Gigabit Ethernet (a LAN) can challenge frame relay (a WAN) in any way at all. But at least all of the issues can be explored.
Truly Global Networking

Some of the aspects of frame relay as the first international standard that really works have been examined earlier. For the purposes of this section, this just means that frame relay networks can easily link FRADs located in different countries around the world. In fact, sometimes frame relay is the only realistic way to create a global network today. Naturally, the Internet is there and available also. But while frame relay is in one sense as public a network as the Internet, there is little residential frame relay where grade school students download book reports at all hours of the day or night. No one offers frame relay for $20 per month. And frame relay has a CIR to guarantee bandwidth between FRADs, while the Internet does not even guarantee reachability between Internet clients and servers. Where are the keepalives for the Internet? So for organizations concerned with these issues, frame relay is preferred to the public Internet. However, some service providers now offer managed Internet services which basically create a separate IP network with limited links to the public Internet. Such managed IP networks try to offer the same types of QoS as frame relay does, but of course not in a standard fashion (yet).

The most interesting aspect of global frame relay services is their improvement over the last few years. Geographical coverage is much improved, as are the access options to choose from and overall network reliability. Denser coverage translates to shorter UNIs and lower access charges. Many service providers, realizing the enormous attraction of international voice over frame relay, have begun customizing services just for frame relay voice and fax. Higher-speed UNIs are available and service level agreements are open to negotiation.

The best source for up-to-date information on global frame relay is beyond a doubt the Data Communications magazine Web site. Data Communications has been required reading for years for anyone involved with, well, data communications. Their excellent global frame relay resource is www.dat.com/global_networks/improvements.html.

There are some 11 frame relay service providers that can supply global frame relay almost anywhere in the world. About half are non-United States based. In one recent year, one service provider added 37 countries to its coverage area and another added 20 new countries. Even traditionally hard-to-reach places such as India and some African countries are now fully reachable by frame relay. Many of the service providers have NNI agreements in place. However, many of the more advanced services, such as voice and fax, which rely on known QoS parameters and compatible equipment, are available only in countries serviced directly by the offering service provider. UNI switch port speeds up to 512 kbps and even 2 Mbps are not unknown.

So frame relay will be used more and more for global networking. This will be even truer for faxing and other non-data services. In fact, some organizations might even employ frame relay as a form of Internet bypass for global networks if Internet congestion and security become more of a concern.
Voice and Video Support

Frame relay was designed as a data network, which just reflects frame relay's X.25 roots. Voice support has been added, and very successfully, in both vendor-specific and standard (or at least implementation agreement) ways. Video support in frame relay is not as far along. It needs more bandwidth than voice, naturally, but video over frame relay seems as inevitable now as voice over frame relay did a few years ago.

Look for more support for frame relay voice and video, and better quality voice and video, than ever before. This is not a trivial thing, because many of the biggest frame relay service providers are voice telephony companies first and foremost. Many grit their teeth as they configure voice DLCIs, but competitors are waiting in the wings if the service provider refuses to support frame relay voice (some still do). There is no need to repeat all of the details of voice over frame relay covered earlier in this book. The quality of compressed voice is always improving and, given sufficient bandwidth on the backbone network, frame relay voice can easily achieve toll quality. Video techniques are evolving as well, and certainly the new digital television (DTV) standards will eventually be applied to frame relay, perhaps very rapidly. Even high-definition television (HDTV) has a place in frame relay, but Sonet/SDH-speed UNIs will be a requirement for this HDTV traffic.

Ironically, it should be the presence of more and more cell-based frame relay that improves the quality of frame relay voice and video the most. ATM backbones cut down on packetization delays (wait until a packet is filled) and serialization delays (wait behind a larger, but lower priority, data unit) due to the small size of the ATM cells. So once the voice and video is through the frame relay UNI and onto the ATM backbone, frame relay QoS should no longer be much of an issue at all.

A related topic is fax services and frame relay. A fax relay service is defined for voice over frame relay, and the next few years should see a jump in importance for this particular type of voice service. Many data networks support a faxing service; even the older data services that eventually evolved into Internet service providers would take an e-mail message and fax it almost anywhere in the world. In spite of the popularity of e-mail, fax lives and prospers for many reasons. A five-year-old can be taught to use a fax machine in a few minutes, but it is more difficult to teach the same child to start a PC, log in, and send e-mail. Look for faxing, especially international faxing between the United States and Europe, and the United States and Asia, to become a more important frame relay service and traffic type.
Asynchronous Transfer Mode Backbones for Frame Relay

Both frame relay and ATM share the common technology label of fast packet networks. But while frame relay was firmly intended as a data networking solution, ATM was intended as a network for everything. This claim for ATM's versatility is due to ATM's relationship with B-ISDN, of course. And since ATM is for everything, ATM is for frame relay traffic also, as was fully explored in the previous chapter. So cell-based frame relay, which uses an ATM backbone network to connect frame relay UNIs, either with frame relay switches and ATM interfaces or with ATM switches and frame relay port adapters, is neither outlandish nor uncommon. Most of the money spent on ATM equipment goes for service provider backbone ATM gear, rather than for ATM premises devices, but there are exceptions.

The more different types of traffic are mixed on the same physical network, the more ATM makes sense. ATM offers bounded delays, assured bandwidth, and limits on jitter that are the envy of other network technologies. So it only makes sense for service providers to pipe their frame relay traffic onto their ATM backbones. The presence of the ATM backbone means that service providers will look for more and more traffic for the ATM backbone. There is therefore somewhat of a sense of self-fulfilling prophecy with regard to ATM backbones. More and more service providers build ATM backbones that carry more and more of their traffic, which leads to more and more of a need for ATM backbones, and so on.

But the advantages of the ATM backbone in terms of mixed traffic types and quality of service (QoS) guarantees do not necessarily extend to the customer premises, as many ATM service providers have discovered the hard way. The most complex and expensive part of an ATM network is the ATM Adaptation Layer (AAL), where the cells are made and processed. Because cells have never existed in networking before, there is no legacy equipment, LAN or WAN, that can make cells as easily as these same devices can make variable-length packets and frames in a bewildering array of formats, frame relay frames included. When faced with the choice between buying expensive new premises equipment that performs AAL processing (while still keeping the legacy routers and other devices in many cases!) or simply reconfiguring the serial port to run frame relay instead of something else, users and customers have chosen the frame relay alternative.

Many frame relay service providers have adjusted to this reality. With an ATM switch backbone and frame relay port adapters, the service provider can build ATM and sell frame relay. The AAL process is now in the network, but economies of scale apply because one AAL5 port adapter can handle many frame relay UNIs at the same time. The customer doesn't need to buy expensive new ATM equipment. There are winners all around.

Sometimes frame relay customers will claim that their frame relay networks will someday migrate to a totally ATM network. But as time goes on, there is less and less of an incentive for customers to ever use ATM directly. Large, public ATM networks are relatively new and unexplored territory for service providers with over 100 years of network experience, but all with circuit switching. Voice is the most expensive service to carry over ATM, and the falling price of frame relay network voice, and even voice network voice, makes using ATM for voice almost unthinkable today.
Many features of frame relay, such as data compression, have yet to find their way into ATM. And finally, users have found that abundant and cheap bandwidth can cover a multitude of QoS shortcomings, and that given enough bandwidth, concerns about low and stable delays simply evaporate.
The only undisputed advantage that ATM holds over frame relay today is that if the goal is to get the customer’s bits into the service provider’s backbone the cleanest way, then ATM is the way to go. An ATM UNI direct from premises ATM edge device to backbone ATM switch port requires no conversions, no port adapters, no interworking units, or any of the other devices that add complexity and reliability risks to the network as a whole. But when all is said and done, expect frame relay to hold its own no matter how much ATM is deployed in the service provider backbone. The threat to frame relay from ATM, if there ever was a battle along these lines, is all but gone.
Frame Relay Sonet/SDH UNI and NNI

The original frame relay specifications basically replaced X.25 packet switches with frame relay switches. The speeds of the ports on the switches and the links connecting the switch ports remained the same as they were with X.25. User ports ran at 64 kbps or less, but in no case could the maximum speed exceed 2.048 Mbps (E1), even on links between switches. In North America, this top speed for frame relay ports and links was 1.544 Mbps (T1). Early on, it was claimed that only frame relay could allow users to access a packet network at speeds above 64 kbps. Several research networks immediately started running X.25 at a full 2.048 Mbps, but it was obvious that it made more sense to deploy frame relay than to rewrite the X.25 specifications for the higher speeds. Soon frame relay supported not only E1 or T1 speeds for the UNI and NNI, but E3 (34 Mbps) and T3 (45 Mbps) speeds as well.

But this is the entire point. Every time a network technology wishes to change the speed at which it runs, the process usually takes years and has to be repeated at some point in the (near) future. This standard speed approach made sense when network speeds changed little if at all from one year to the next. Why have many versions of hardware all running at different and incompatible speeds? But the pace of technological change today makes these types of frozen speeds impractical. Imagine if the standard definition of a PC said, "The processor must run at 300 MHz." Before any 400 MHz PCs could be sold, the definition would have to be revised and approved by all PC manufacturers. And if my company has no interest in making 400 MHz machines, why should I approve the speed increase?

The standard speed definition problem does not apply only to WAN technologies. In 1992, Ethernet ran at 10 Mbps, not 100 Mbps. When the first rumblings about 100 Mbps Ethernet were heard, it took quite a while, and a lot of intense lobbying and committee work, for the 100 Mbps Ethernet standard to emerge. Naturally, the company that started the whole thing and whose work provided the framework for 100 Mbps Ethernet had an initial advantage in the marketplace. But this was a lot of time and effort just to add a "0" in the right places.

Remarkably, ATM has never suffered from these speed definition problems. ATM, as part of the overall B-ISDN specification, was aligned with the Synchronous Optical Network (Sonet) in the United States and its close relative, the Synchronous Digital Hierarchy (SDH), almost everywhere else. Since Sonet/SDH define a scaleable set of standard speeds, not a single, fixed bit rate, ATM switches and links could always relay cells as fast as Sonet/SDH could carry them. ATM has never needed a separate specification to allow ATM networks to run faster (but ATM has needed new specifications to allow an ATM network to run slower than Sonet/SDH).

In any case, it will not be long before frame relay, as another fast packet technology, has its speeds aligned with Sonet/SDH as well. Some of the more widely anticipated frame relay equipment will run at the speeds listed in Table 13.1. Note that the Sonet designations differ somewhat from their SDH counterparts. The acronym OC stands for Optical Carrier and STMO stands for Synchronous Transport Module-Optical. Bit rates above OC-12c are approximate.

Table 13.1 Frame Relay and Sonet/SDH Speeds

BIT RATE       SONET     SDH
51.84 Mbps     OC-1      Not defined
155.52 Mbps    OC-3c     STMO-1
622.08 Mbps    OC-12c    STMO-4
2.4 Gbps       OC-48c    STMO-16
5 Gbps         OC-96c    (STMO-32)
10 Gbps        OC-192c   (STMO-64)
It will take a while for the standards and implementation agreements to be changed to allow Sonet/SDH to be used directly with frame relay FRADs and switches. However, given the wide availability and popularity of Sonet/SDH, along with the need for higher aggregate bit rates on frame relay networks for both UNIs and NNIs alike, the shift of frame relay to Sonet/SDH speeds is more or less a given.
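One reason the Sonet/SDH alignment is attractive is that the rates scale linearly: OC-n runs at n times 51.84 Mbps (the concatenated "c" variants carry a single payload at the same line rate), which is how the Table 13.1 values are derived.

```python
# Sonet line rates scale linearly with the OC level; the gigabit table
# entries (2.4, 5, and 10 Gbps) are rounded versions of these results.
OC_BASE_MBPS = 51.84

for n in (1, 3, 12, 48, 96, 192):
    rate = n * OC_BASE_MBPS
    print(f"OC-{n}: {rate/1000:.3f} Gbps" if rate >= 1000
          else f"OC-{n}: {rate:.2f} Mbps")
```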
A Gigabit Ethernet World (?)

Gigabit Ethernet is a new LAN technology that uses frames and runs at 1000 Mbps (1 Gbps) on fiber optic cable and even some unshielded twisted-pair (UTP) copper wire, but over restricted distances. Frame relay is a WAN technology that uses different frames, runs at up to 45 Mbps (and usually much slower), and spans the globe. How can a LAN technology in any way impact frame relay?

The key is in the speed differential. Bursty data-only LAN applications running on 10 Mbps Ethernets serviced by FRADs linked to a frame relay network by 64 kbps UNIs function just fine. The speed ratio of about 150 to 1 from LAN to WAN is masked by the burstiness of the traffic. But replace the LANs with 100 Mbps Ethernet, and the ratio jumps to 1500 to 1. Even a full 1.5 Mbps UNI is still a 66 to 1 bottleneck which bursts can easily overwhelm. This is because a 100-millisecond burst at 100 Mbps generates 10 times more bits than the same length burst at 10 Mbps. Gigabit Ethernet generates 100 times more bits per unit time than 10 Mbps Ethernet. And buffering always adds delay, and variable delays at that. So Gigabit Ethernet speeds can swamp the fastest frame relay UNI.

But so what? Users have to use a WAN to link LANs, right? And Gigabit Ethernet is a LAN. Not so fast. While Gigabit Ethernet is still a LAN, the use of fiber optics for transmission makes Gigabit Ethernet a formidable LAN indeed. Gigabit Ethernet hubs are really switches, and closely resemble the nodes used in a technology called Fibre Channel. Gigabit Ethernet fibers can extend 10 kilometers between hubs (about 6.2 miles). Now the threat to frame relay becomes clearer. Most frame relay frames contain IP packets, but so do Gigabit Ethernet frames. If the major concern of users becomes, "How do I best link my Gigabit Ethernet sites together?" then frame relay has cause for alarm.

There is plenty of bandwidth in most service provider backbones in the form of fiber optic cables. The problem in Gigabit Ethernet scenarios is that frame relay UNIs do not have enough bandwidth available to handle even bursty LAN traffic between Gigabit Ethernet sites. So why not eliminate the frame relay UNI? In fact, since Sonet/SDH links even at 622.08 Mbps do not align well with Gigabit Ethernet speeds, why not just deploy direct-to-fiber Gigabit Ethernet interfaces? The 10 kilometers will easily reach a backbone location in most cases. Once there, the service provider can just pipe the Gigabit Ethernet frames over its fiber backbone, just like any other bits (although plenty of them). In many cases, this new Gigabit Ethernet traffic can ride the same fibers as existing frame relay and ATM traffic. This is possible thanks to a new and improved form of Wavelength Division Multiplexing (WDM) called dense WDM (DWDM). With DWDM, the same fiber can be used to carry 32 or 64 times the amount of traffic as previously. The world of direct-to-fiber Gigabit Ethernet is shown in Figure 13.4.
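To put numbers on the burst argument above: how long does a single LAN burst take to drain through a frame relay UNI? A quick back-of-the-envelope calculation, using the 100-millisecond burst from the text and a T1-speed UNI:

```python
# How long does a 100 ms LAN burst take to drain through a T1 UNI?
def drain_seconds(lan_mbps: float, burst_s: float, uni_mbps: float) -> float:
    burst_bits = lan_mbps * 1e6 * burst_s
    return burst_bits / (uni_mbps * 1e6)

for lan in (10, 100, 1000):                  # Ethernet generations, Mbps
    print(f"{lan:>4} Mbps LAN, 100 ms burst -> "
          f"{drain_seconds(lan, 0.1, 1.544):6.2f} s behind a T1 UNI")
# At 1000 Mbps, a single 100 ms burst needs over a minute to drain at
# T1 speed, and every queued bit is added (and variable) delay.
```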
Figure 13.4 Direct-to-fiber Gigabit Ethernet.

Now, this type of doomsday scenario depends on the acceptance and popularity of Gigabit Ethernet, true enough. But the magical Ethernet name just about ensures the acceptance and popularity of Gigabit Ethernet, just as it did with Ethernet's slower cousins. Of course, all of these fiber channels carrying Gigabit Ethernet frames will need to be cross-connected or switched, but making frame relay frames or ATM cells just to be able to do so, and more slowly, with existing equipment seems a waste. Interestingly, the future of IP seems secure, since most Gigabit Ethernet frames will still carry IP packets. Plentiful fiber backbone bandwidth will make QoS concerns less and less of an issue.

In spite of IP backbones and Gigabit Ethernet, the future of frame relay in the near term seems secure. Frame relay will not only prosper in the data networking arena; it will also be able to handle other traffic types, such as voice or video, with relative ease. As for the longer term, frame relay will have to bend and adapt to a future of even more all-encompassing uses of IP and the rise of Gigabit Ethernet. Either way, the most interesting days of frame relay are still before it.
Bibliography

Those interested in either pursuing the study of frame relay beyond the scope of this book or seeking a deeper understanding of the concepts presented here are urged to make use of as many of these sources as possible. Many are available essentially for free on the Internet and/or Web.
Frame Relay Forum

The Frame Relay Forum maintains a number of online specifications. These are also available in PostScript or Adobe Acrobat formats.

www.frforum.com/

FRF.1.1 User-to-Network (UNI) Implementation Agreement
FRF.2.1 Network-to-Network (NNI) Implementation Agreement Version 2.1
FRF.3.1 Multiprotocol Encapsulation Implementation Agreement (MEI)
FRF.4 Switched Virtual Circuit Implementation Agreement (SVC)
FRF.5 Frame Relay/ATM Network Interworking Implementation Agreement
FRF.6 Frame Relay Service Customer Network Management Implementation Agreement (MIB)
FRF.7 Frame Relay PVC Multicast Service and Protocol Description
FRF.8 Frame Relay/ATM PVC Service Interworking Implementation Agreement
FRF.9 Data Compression Over Frame Relay Implementation Agreement
FRF.10 Frame Relay Network-to-Network SVC Implementation Agreement
FRF.11 Voice over Frame Relay Implementation Agreement
FRF.12 Frame Relay Fragmentation Implementation Agreement
FRF.13 Service Level Definitions Implementation Agreement
ANSI
ANSI Standards, www.ansi.org/
ANSI has a number of specifications pertaining to Frame Relay. These are not available online on the Web; they must be purchased from ANSI:
T1.606, ISDN—Architectural Framework and Service Description for Frame Relay Bearer Service
T1.606, Addendum, Frame Relay Bearer Service: Architectural Framework and Service Description
T1.617, ISDN—DSS1 Signaling Specification for Frame Relay Bearer Service
T1.617, Annex D, Additional Procedures for PVCs Using Unnumbered Information Frames
T1.618, ISDN—Core Aspects of Frame Protocol for Use with Frame Relay Bearer Service
Acronym List

Acronyms, A-J
AAL	ATM Adaptation Layer
AAL1	ATM Adaptation Layer 1
AAL5	ATM Adaptation Layer 5
ABR	Available Bit Rate
ACELP	Algebraic Code-Excited Linear Predictive
ACS	Access Circuit Section
ADPCM	Adaptive Differential Pulse Code Modulation
ADSL	Asymmetric Digital Subscriber Line
AIS	Alarm Indication Signal
AMI	Alternate Mark Inversion
ANS	Access Network Section
ANSI	American National Standards Institute
API	Application Program Interface
APPN	Advanced Peer-to-Peer Networking
ARP	Address Resolution Protocol
ASCII	American Standard Code for Information Interchange
ASN.1	Abstract Syntax Notation 1
ATM	Asynchronous Transfer Mode
B8ZS	Binary 8 Zero Substitution
Bc	Committed Burst
Be	Excess Burst
BECN	Backward Explicit Congestion Notification
BER	Bit Error Rate
BGP	Border Gateway Protocol
B-ICI	Broadband Inter Carrier Interface
B-ISDN	Broadband Integrated Services Digital Network
bps	bits per second
BRI	Basic Rate Interface
B-TE	B-ISDN Terminal Equipment
CAP	Competitive Access Provider
CAS	Channel Associated Signaling
CBR	Constant Bit Rate
CC	Cluster Controller
CCITT	Consultative Committee for International Telegraph and Telephony (obsolete)
CCS	Common Channel Signaling
CD-ROM	Compact Disk-Read Only Memory
CDV	Cell Delay Variation
CELP	Code Excited Linear Prediction
CES	Circuit Emulation Service
CI	Congestion Indication
CID	Channel Identification
CIR	Committed Information Rate
CLEC	Competitive Local Exchange Carrier
CLLM	Consolidated Link Layer Management
CLNP	Connectionless Network Layer Protocol
CLP	Cell Loss Priority
CNM	Customer Network Management
CO	Central Office
COMC	Communications Controller
COS	Class of Service
CPE	Customer Premises Equipment
CPCS	Common Part Convergence Services (or Sub-layer)
CRC	Cyclic Redundancy Check
C/R	Command/Response bit
CRV	Call Reference Value
CS	Convergence Services (or Sub-layer)
CS-ACELP	Conjugate Structure—Algebraic Code Excited Linear Predictive
CSU	Channel Service Unit
DAR	Dynamic Alternate Routing
DCE	Data Communications Equipment
DCFD	Data Compression Function Definition
DCP	Data Compression Protocol
DCR	Dynamically Controlled Routing
DDR	Data Delivery Ratio
DE	Discard Eligibility
DLCI	Data Link Connection Identifier
DLSw	Data Link Switching
DNHR	Dynamic Non-Hierarchical Routing
DQDB	Distributed Queue Dual Bus
DS	Digital Signal
DSAP	Destination Service Access Point
DSL	Digital Subscriber Line
DSLAM	Digital Subscriber Line Access Multiplexer
DSP	Digital Signal Processor
DSS	Digital Subscriber Signaling
DSU	Data Service Unit
DTE	Data Terminal Equipment
DTMF	Dual Tone Multi-Frequency
DTP	Data Transfer Protocol
DXI	Data Exchange Interface
EA	Address Extension bit
E-ADPCM	Embedded Adaptive Differential Pulse Code Modulation
EBCDIC	Extended Binary Coded Decimal Interchange Code
ECN	Explicit Congestion Notification
EETD	End-to-End Transit Delay
EFCI	Explicit Forward Congestion Indicator
EI	Extension Indication
EIA	Electronics Industry Association
EIR	Excess Information Rate
ES	End System
ESF	Extended Superframe
FAX	Facsimile (Group 3)
FCC	Federal Communications Commission
FCS	Frame Check Sequence
FDDI	Fiber Distributed Data Interface
FDR	Frame Delivered Ratio
FECN	Forward Explicit Congestion Notification
FEP	Front End Processor
FFS	For Further Study
FMBS	Frame Mode Bearer Service
FO	Fault Outage
FR	Frame Relay
FRAD	Frame Relay Access Device
FRBS	Frame Relay Bearer Service
FRF	Frame Relay Forum
FRIF	Frame Relay Information Field
FRIF	Frame Relay Implementers Forum
FRMTBSO	Frame Relay Mean Time Between Service Outages
FRMTTR	Frame Relay Mean Time to Repair
FR-SSCS	Frame Relay—Service Specific Convergence Sub-layer
FRS	Frame Relay Service
FRVCA	Frame Relay Virtual Connection Availability
FS	Full Status
FSBS	Frame Switching Bearer Service
FT1	Fractional T1
FT3	Fractional T3
GCRA	Generic Cell Rate Algorithm
GUI	Graphical User Interface
HDLC	High Level Data Link Control
HDTV	High Definition Television
HPR	High Performance Routing
IA	Implementation Agreement
ICMP	Internet Control Message Protocol
ICS	Internetwork Circuit Section
IE	Information Element
IEC	Interexchange Carrier
IEEE	Institute of Electrical and Electronics Engineers
IETF	Internet Engineering Task Force
I/F	Interface
IGP	Interior Gateway Protocol
ILEC	Incumbent Local Exchange Carrier
ILMI	Interim Local Management Interface
InARP	Inverse Address Resolution Protocol
IP	Internet Protocol
IPX	Internetwork Packet Exchange
IS	Intermediate System
ISDN	Integrated Services Digital Network
ISO	International Organization for Standardization
ISP	Internet Service Provider
ITU	International Telecommunication Union
ITU-T	ITU Telecommunications Sector
IWF	InterWorking Function
IWU	InterWorking Unit
IXC	Interexchange Carrier
JPEG	Joint Photographic Experts Group
Acronyms, K-Z
kbps	Kilobits per Second
LAN	Local Area Network
LAPB	Link Access Procedure Balanced
LAPD	Link Access Procedure D-channel
LAPF	Link Access Procedure Frame relay
LATA	Local Access Transport Area
LD-CELP	Low Delay—Code Excited Linear Prediction
LE	Local Exchange
LEC	Local Exchange Carrier
LI	Length Indication
LIV	Link Integrity Verification
LLC	Logical Link Control
LLR	Least Load Routing
LMI	Local Management Interface
LP	Loss Priority
LPC	Linear Predictive Coding
lsb	Least Significant Bit
LZS	Lempel-Ziv-Stac
MAC	Media Access Control
MBS	Maximum Burst Size
M-FRAD	Multiservice FRAD
Mbps	Megabits per Second
MIB	Management Information Base
MOS	Mean Opinion Score
MPEG-2	Motion Picture Experts Group 2
MP-MLQ	Multi Pulse Maximum Likelihood Quantizer
msb	Most Significant Bit
MTU	Maximum Transmission Unit
NACK	Negative Acknowledgment
NCP	Network Control Program
NLPID	Network Layer Protocol Identifier
NMC	Network Management Center
NNI	Network to Network Interface
NPC	Network Parameter Control
NT1	Network Termination 1
OAM	Operations, Administration and Maintenance
OAM&P	Operations, Administration, Maintenance, and Provisioning
OC	Optical Carrier
OLEC	Other Local Exchange Carrier
OSI-RM	Open Systems Interconnection Reference Model
OUI	Organizationally Unique Identifier
PAD	Packet Assembler and Disassembler
PBX	Private Branch Exchange
PC	Personal Computer
PCI	Protocol Control Information
PCM	Pulse Code Modulation
PCR	Peak Cell Rate
PDU	Protocol Data Unit
PID	Protocol Identifier
PIU	Path Information Unit
PLP	Packet Level Protocol (or Procedures)
POP	Point of Presence
PPP	Point-to-Point Protocol
PSDN	Packet Switched Data Network
PSPDN	Packet Switched Public Data Network
PSTN	Public Switched Telephone Network
PVC	Permanent Virtual Connection
QLLC	Qualified Logical Link Control
QoS	Quality of Service
RAM	Random Access Memory
RARP	Reverse Address Resolution Protocol
RBOC	Regional Bell Operating Company
RFC	Request for Comments
RFH	Remote Frame Handler
RS	Recommended Standard
RTNR	Real-Time Network Routing
SAN	Storage Area Network
SAP	Service Access Point
SAPI	Service Access Point Identifier
SAR	Segmentation and Reassembly
SCP	Service Control Point
SCR	Sustainable Cell Rate
SDH	Synchronous Digital Hierarchy
SDLC	Synchronous Data Link Control
SDR	State Dependent Routing
SDU	Service Data Unit
SE	Status Enquiry
SID	Silence Information Descriptor
SLA	Service Level Agreement
SNA	Systems Network Architecture
SNAP	Sub-Network Access Protocol
SNMP	Simple Network Management Protocol
Sonet	Synchronous Optical Network
SPVC	Switched Permanent Virtual Circuit
SSAP	Source Service Access Point
SSCP	System Services Control Point
SSCS	Service Specific Convergence Services (or Sub-layer)
SSP	Service Switching Point
STM	Synchronous Transport Module
STMO	Synchronous Transport Module Optical
STP*	Shielded Twisted Pair
STP*	Signaling Transfer Point
*STP appears for both Shielded Twisted Pair and Signaling Transfer Point. The intended definition is given by the context of the text.
SVC	Switched Virtual Connection
TA	Terminal Adapter
TCP/IP	Transmission Control Protocol/Internet Protocol
TE	Terminal Equipment
TEI	Terminal Endpoint Identifier
TNS	Transit Network Section
TR	Trunk Reservation
U-Plane	User Plane
UBR	Unspecified Bit Rate
UDP	User Datagram Protocol
UI	Unnumbered Information
UNI	User to Network Interface
UPC	Usage Parameter Control
UTP	Unshielded Twisted Pair
VAD	Voice Activity Detection
VBR	Variable Bit Rate
VC	Virtual Connection
VCC	Virtual Channel Connection
VCI	Virtual Channel Identifier
V-FRAD	Voice Frame Relay Access Device
VLAN	Virtual Local Area Network
Vocoder	Voice coder/decoder
VoFR	Voice over Frame Relay
VoIP	Voice over IP
VPC	Virtual Path Connection
VPI	Virtual Path Identifier
VPN	Virtual Private Network
VTAM	Virtual Telecommunication Access Method
VTOA	Voice and Telephony over ATM
WAN	Wide Area Network
WWW	World Wide Web
xDSL	x-type Digital Subscriber Line
ITU-T Recommendations
www.itu.ch
The ITU has a number of specifications pertaining to Frame Relay. These are not available online on the Web; they must be purchased from the ITU:
ITU-T Recommendation X.144 (1995), User information transfer performance parameters for data networks providing international frame relay PVC service.
ITU-T Recommendation I.122 (1988), Framework for providing additional packet mode bearer services.
ITU-T Recommendation I.233 (1991), Frame mode bearer services.
ITU-T Recommendation I.233.1 (1991), ISDN frame relaying bearer service.
ITU-T Recommendation I.370 (1991), Congestion management for the ISDN frame relaying bearer service.
ITU-T Recommendation X.36 (1994), Interface between Data Terminal Equipment (DTE) and Data circuit-terminating Equipment (DCE) for public data networks providing frame relay data transmission service by dedicated circuit.
ITU-T Recommendation X.76 (1995), Network-to-network interface between public data networks providing the frame relay data transmission service.
ITU-T Recommendation Q.922, ISDN Data Link Layer Specification for Frame Mode Bearer Services.
ITU-T Recommendation Q.933, ISDN Signaling Specification for Frame Mode Bearer Services.
Internet RFCs
Internet RFCs related to Frame Relay can be found in a number of locations on the Internet.
RFC 2427, C. Brown and A. Malis, “Multiprotocol Interconnect over Frame Relay,” September 1998. (Obsoletes RFC 1490 and RFC 1294) (Status: DRAFT STANDARD)
RFC 2390, T. Bradley, C. Brown, and A. Malis, “Inverse Address Resolution Protocol,” September 1998. (Obsoletes RFC 1293) (Status: DRAFT STANDARD)
RFC 2115, C. Brown and F. Baker, “Management Information Base for Frame Relay DTEs Using SMIv2,” September 1997. (Obsoletes RFC 1315) (Status: DRAFT STANDARD)
RFC 1604, T. Brown, “Definitions of Managed Objects for Frame Relay Service,” March 1994. (Obsoletes RFC 1596) (Status: PROPOSED STANDARD)
RFC 1586, O. deSouza and M. Rodrigues, “Guidelines for Running OSPF Over Frame Relay Networks,” March 1994. (Status: INFORMATIONAL)
RFC 1483, J. Heinanen, “Multiprotocol Encapsulation over ATM Adaptation Layer 5,” July 1993. (Status: PROPOSED STANDARD)
Miscellaneous Sources
Among the URLs consulted in the course of writing this book that were not standards documents, the following were the most helpful. Of course, like all URLs, the information often moves around. These URLs were all valid as of this writing.

Cisco maintains a lot of frame relay information on several Web sites, from configuration guides to troubleshooting techniques. www.cisco.com/univercd/cc/td/doc/cisintwk/itg_vl/itg_frml.html contains fine information on troubleshooting Frame Relay connections. www-europe.cisco.com/warp/public/service/troubleshooting/ts_fr.html contains similar step-by-step information on troubleshooting Frame Relay networks. Finally, www.alef0.cz/cprodocs/data/ciscorpo/software/phas_1/cg/cfrelay.html was probably the most valuable of all. It contains everything one needs to know about configuring a Cisco router to support Frame Relay, from LMI keepalive intervals to DLCI priority levels to IP tunneling, with sample commands and responses to help the reader along at each step.

Hill Associates has an excellent introductory course on Frame Relay. This course was an important source for the ideas presented in this book. Hill Associates is located at www.hill.com.

Data Communications magazine publishes periodic lab test results on both FRADs and Frame Relay network services. Not only do they cover traditional data aspects of Frame Relay, but also the latest uses of Frame Relay, such as voice and using Frame Relay as the basis for virtual private networks (VPNs). Many past articles are archived at www.data.com.