Lecture Notes in Computer Science
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, University of Dortmund, Germany
Madhu Sudan, Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Moshe Y. Vardi, Rice University, Houston, TX, USA
Gerhard Weikum, Max-Planck Institute of Computer Science, Saarbruecken, Germany
4712
Yevgeni Koucheryavy Jarmo Harju Alexander Sayenko (Eds.)
Next Generation Teletraffic and Wired/Wireless Advanced Networking 7th International Conference, NEW2AN 2007 St. Petersburg, Russia, September 10-14, 2007 Proceedings
Volume Editors
Yevgeni Koucheryavy, Jarmo Harju
Tampere University of Technology, Institute of Communications Engineering
Korkeakoulunkatu 1, 33720 Tampere, Finland
E-mail: {yk, harju}@cs.tut.fi
Alexander Sayenko
Nokia Research Center
Itämerenkatu 11-13, 00180 Helsinki, Finland
E-mail: [email protected]
Library of Congress Control Number: 2007934040
CR Subject Classification (1998): C.2, C.4, H.4, D.2, J.1, K.6, K.4
LNCS Sublibrary: SL 5 – Computer Communication Networks and Telecommunications
ISSN 0302-9743
ISBN-10 3-540-74832-6 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-74832-8 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. Springer is a part of Springer Science+Business Media springer.com © Springer-Verlag Berlin Heidelberg 2007 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 12120567 06/3180 543210
Preface
We welcome you to the proceedings of NEW2AN 2007, the seventh international conference on Next Generation Teletraffic and Wired/Wireless Advanced Networking, held in St. Petersburg, Russia. Significant contributions were made in various aspects of networking in next-generation teletraffic. The presented topics encompassed several layers of communication networks: from the physical layer to transport protocols and the modeling of new services. New and innovative developments in enhanced signaling protocols, QoS mechanisms, cross-layer optimization, and traffic characterization were also addressed within the program. In particular, issues of QoS in wireless and IP-based multi-service networks were dealt with, as well as financial aspects of future networks. It is also worth mentioning the emphasis placed on wireless networks, including, but not limited to, cellular networks, wireless local area networks, personal area networks, mobile ad hoc networks, and sensor networks. The call for papers attracted 113 papers from 29 countries. With the help of the excellent Technical Program Committee and a number of associated reviewers, the best 39 high-quality papers were selected for publication, resulting in an acceptance ratio of 35%. The conference was organized in 13 single-track sessions. The technical program of the conference benefited from two keynote speakers: Rod Walsh, NOKIA, Finland, and Saverio Mascolo, Politecnico di Bari, Italy. Moreover, a panel session on emerging wireless technologies, organized and moderated by Alexander Sayenko, NOKIA, Finland, brought the wireless domain of the conference to a new level. We wish to sincerely thank the Technical Program Committee members and associated reviewers for their hard work and important contribution to the conference. This year the conference was organized in cooperation with ITC (International Teletraffic Congress), IEEE, and COST 290, with the support of NOKIA (Finland) and BalticIT Ltd. (Russia).
The support of these organizations is gratefully acknowledged. Finally, we wish to thank the many people who contributed to the organization of NEW2AN. In particular, Jakub Jakubiak (TUT) carried a substantial load handling submissions and reviews and maintaining the Web site, and he did an excellent job in compiling the camera-ready papers and interacting with Springer. Sergei Semenov (NOKIA) is to be thanked for his great efforts in linking the conference to industry. Many thanks go to Natalia Zaharova (Monomax Meetings & Incentives) for her excellent local organization efforts and for preparing the conference's social program. We believe that the work done for the seventh NEW2AN conference provided an interesting and up-to-date scientific experience. We hope that all the
participants enjoyed the technical and social conference program, the Russian hospitality, and the beautiful city of St. Petersburg.

September 2007
Yevgeni Koucheryavy Jarmo Harju Alexander Sayenko
Organization
International Advisory Committee
Ian F. Akyildiz, Georgia Institute of Technology, USA
Nina Bhatti, Hewlett Packard, USA
Igor Faynberg, Alcatel Lucent, USA
Jarmo Harju, Tampere University of Technology, Finland
Andrey Koucheryavy, ZNIIS R&D, Russia
Villy B. Iversen, Technical University of Denmark, Denmark
Paul Kühn, University of Stuttgart, Germany
Kyu Ouk Lee, ETRI, Korea
Mohammad S. Obaidat, Monmouth University, USA
Michael Smirnov, Fraunhofer FOKUS, Germany
Manfred Sneps-Sneppe, Ventspils University College, Latvia
Ioannis Stavrakakis, University of Athens, Greece
Sergey Stepanov, Sistema Telecom, Russia
Phuoc Tran-Gia, University of Würzburg, Germany
Gennady Yanovsky, State University of Telecommunications, Russia
Technical Program Committee
Mari Carmen Aguayo-Torres, University of Malaga, Spain
Ozgur B. Akan, METU, Turkey
Khalid Al-Begain, University of Glamorgan, UK
Tricha Anjali, Illinois Institute of Technology, USA
Konstantin Avrachenkov, INRIA, France
Francisco Barcelo, UPC, Spain
Thomas M. Bohnert, University of Coimbra, Portugal
Torsten Braun, University of Bern, Switzerland
Georg Carle, University of Tübingen, Germany
Chrysostomos Chrysostomou, University of Cyprus, Cyprus
Ibrahim Develi, Erciyes University, Turkey
Roman Dunaytsev, Tampere University of Technology, Finland
Eylem Ekici, Ohio State University, USA
Sergey Gorinsky, Washington University in St. Louis, USA
Markus Fidler, NTNU Trondheim, Norway
Giovanni Giambene, University of Siena, Italy
Stefano Giordano, University of Pisa, Italy
Ivan Ganchev, University of Limerick, Ireland
Andrei Gurtov, HIIT, Finland
Vitaly Gutin, Popov Society, Russia
Martin Karsten, University of Waterloo, Canada
Andreas Kassler, Karlstad University, Sweden
Maria Kihl, Lund University, Sweden
Vitaly Kondratiev, Baltic-IT, Russia
Tatiana Kozlova Madsen, Aalborg University, Denmark
Yevgeni Koucheryavy, Tampere University of Technology, Finland (Chair)
Jae-Young Kim, Purdue University, USA
Jong-Hyouk Lee, Sungkyunkwan University, Korea
Vitaly Li, Kangwon National University, Korea
Lemin Li, University of Electronic Science and Technology of China, China
Leszek T. Lilien, Western Michigan University, USA
Saverio Mascolo, Politecnico di Bari, Italy
Maja Matijašević, University of Zagreb, FER, Croatia
Paulo Mendes, DoCoMo Euro-Labs, Germany
Ilka Miloucheva, Salzburg Research, Austria
Dmitri Moltchanov, Tampere University of Technology, Finland
Edmundo Monteiro, University of Coimbra, Portugal
Seán Murphy, University College Dublin, Ireland
Marc Necker, University of Stuttgart, Germany
Mairtin O'Droma, University of Limerick, Ireland
Jaudelice Cavalcante de Oliveira, Drexel University, USA
Evgeni Osipov, RWTH Aachen, Germany
George Pavlou, University of Surrey, UK
Simon Pietro Romano, Università degli Studi di Napoli "Federico II", Italy
Stoyan Poryazov, Bulgarian Academy of Sciences, Bulgaria
Alexander Sayenko, University of Jyväskylä, Finland
Dirk Staehle, University of Würzburg, Germany
Sergei Semenov, NOKIA, Finland
Burkhard Stiller, University of Zürich and ETH Zürich, Switzerland
Weilian Su, Naval Postgraduate School, USA
Veselin Rakocevic, City University London, UK
Dmitry Tkachenko, IEEE St. Petersburg BT/CE/COM Chapter, Russia
Vassilis Tsaoussidis, Demokritos University of Thrace, Greece
Christian Tschudin, University of Basel, Switzerland
Kurt Tutschku, University of Würzburg, Germany
Lars Wolf, Technische Universität Braunschweig, Germany
Linda J. Xie, University of North Carolina, USA
Additional Reviewers A. Akan A. Amirante F. Araujo M. Bechler B. Bellalta A. Binzenhoefer C. Blankenhorn J. Brandt M. Bredel G. Bruck F. Cercas M. Ciurana S. D’Antonio L. De Cicco M. Dick S. Enoch J.T. Entrambasaguas C. Esli D. Ficara A. Fonte I. Goldberg L. Grieco X. Gu S. Gundavelli A. Gutscher
R. Henjes T. Hoßfeld T. Hossmann J. Jakubiak I. Jawhar R. Jurdak A. Kuwadekar M. Kaschub D. Kyung Kim E.S. Lohan F.J. Lopez-Martinez S. Luna M. Maggiora A. Malkov I. Martin-Escalona L. Martucci D. Milic L. Miniero C. Mueller J. Munilla A. Ruzzelli J.J. Sanchez Sanchez L. Servi D. M. Shila J. Silva
T. Ozsahin V. Palmisano P. Papadimitriou I. Psaras J. Riihijärvi D. Schlosser T. Staub B. Soret A. Spedalieri G. Stette L. Tavanti A. Tsioliaridou N. Vassileva F. Velez M. Waelchli O. Wellnitz L. Wood M. Wulff H. Xiong C.Y. Yang S. Yerima I.P. Zarko E. Zola
Table of Contents
Teletraffic I
Effects of Spatial Aggregation on the Characteristics of Origin-Destination Pair Traffic in Funet . . . . . . . . . . . . . . . . . . . . . . . . . Ilmari Juva, Riikka Susitaival, Markus Peuhkuri, and Samuli Aalto
1
Users Dimensioning and Traffic Modelling in Rural e-Health Services . . . I. Martínez, J. García, and E. Viruete
13
Empirical Observations of Traffic Patterns in Mobile and IP Telephony . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Poul E. Heegaard
26
Teletraffic II
On the Number of Losses in an MMPP Queue . . . . . . . . . . . . . . . . . . . . . Andrzej Chydzinski, Robert Wojcicki, and Grzegorz Hryn
38
On-Line State Detection in Time-Varying Traffic Patterns . . . . . . . . . . . . . D. Moltchanov
49
The Drop-From-Front Strategy in AQM . . . . . . . . . . . . . . . . . . . . . . . . . . Joanna Domańska, Adam Domański, and Tadeusz Czachórski
61
TCP Protocol in Wireless Systems
TCP Congestion Control over 3G Communication Systems: An Experimental Evaluation of New Reno, BIC and Westwood+ . . . . . . . . . Luca De Cicco and Saverio Mascolo
73
Cross-Layer Enhancement to TCP Slow-Start over Geostationary Bandwidth on Demand Satellite Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei Koong Chai and George Pavlou
86
TCP Performance over Cluster-Label-Based Routing Protocol for Mobile Ad Hoc Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Vitaly Li and Hong Seong Park
99
WiMAX
An Analytic Model of IEEE 802.16e Sleep Mode Operation with Correlated Traffic . . . . . . . . . Koen De Turck, Stijn De Vuyst, Dieter Fiems, and Sabine Wittevrongel
109
Real Life Field Trial over a Pre-mobile WiMAX System with 4th Order Diversity . . . . . . . . . Pål Grønsund, Paal Engelstad, Moti Ayoun, and Tor Skeie
121
On Evaluating a WiMAX Access Network for Isolated Research and Data Networks Using NS-2 . . . . . . . . . Thomas Michael Bohnert, Jakub Jakubiak, Marcos Katz, Yevgeni Koucheryavy, Edmundo Monteiro, and Eugen Borcoci
133
Performance Evaluation of the IEEE 802.16 ARQ Mechanism . . . . . . . . . Vitaliy Tykhomyrov, Alexander Sayenko, Henrik Martikainen, Olli Alanen, and Timo Hämäläinen
148
QoS Topics in Fixed Networks
Evaluating Differentiated Quality of Service Parameters in Optical Packet Switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Poul E. Heegaard and Werner Sandmann
162
GESEQ: A Generic Security and QoS Model for Traffic Priorization over IPSec Site to Site Virtual Private Networks . . . . . . . . . . . . . . . . . . . Jesús A. Pérez, Victor Zárate, and Angel Montes
175
Routes Building Approach for Multicast Applications in Metro Ethernet Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Anatoly M. Galkin, Olga A. Simonina, and Gennady G. Yanovsky
187
Wireless Networking I
Performance Modelling and Evaluation of Wireless Multi-access Networks . . . . . . . . . Remco Litjens, Ljupco Jorguseski, and Mariya Popova
194
Analysis of a Cellular Network with User Redials and Automatic Handover Retrials . . . . . . . . . Jose Manuel Gimenez-Guzman, Ma Jose Domenech-Benlloch, Vicent Pla, Vicente Casares-Giner, and Jorge Martinez-Bauset
210
Stochastic Optimization Algorithm Based Dynamic Resource Assignment for 3G Systems . . . . . . . . . Mustafa Karakoc and Adnan Kavak
223
Adaptive Resource Reservation for Efficient Resource Utilization in the Wireless Multimedia Network . . . . . . . . . Seungwoo Jeon, Hanjin Lee, and Hyunsoo Yoon
235
Teletraffic III
A Discrete-Time Queueing Model with a Batch Server Operating Under the Minimum Batch Size Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dieter Claeys, Joris Walraevens, Koenraad Laevens, and Herwig Bruneel
248
Derivatives of Blocking Probabilities for Multi-service Loss Systems and Their Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . V.B. Iversen and S.N. Stepanov
260
Rare Events of Gaussian Processes: A Performance Comparison Between Bridge Monte-Carlo and Importance Sampling . . . . . . . . . . . . . . . Stefano Giordano, Massimiliano Gubinelli, and Michele Pagano
269
AdHoc Networks I
A Forwarding Spurring Protocol for Multihop Ad Hoc Networks (FURIES) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Helena Rifà-Pous and Jordi Herrera-Joancomartí
281
Direct Conversion Transceivers as a Promising Solution for Building Future Ad-Hoc Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Oleg Panfilov, Antonio Turgeon, Ron Hickling, and Lloyd Linder
294
Location Tracking for Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . . . Kil-Woong Jang
306
An Incentive-Based Forwarding Protocol for Mobile Ad Hoc Networks with Anonymous Packets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jerzy Konorski
316
Wireless Networking II
Providing Seamless Mobility Using the FOCALE Autonomic Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . John Strassner, Dave Raymer, and Srini Samudrala
330
Evaluation of Joint Admission Control and VoIP Codec Selection Policies in Generic Multirate Wireless Networks . . . . . . . . . . . . . . . . . . . . . . B. Bellalta, C. Macian, A. Sfairopoulou, and C. Cano
342
A Novel Inter-LMD Handoff Mechanism for Network-Based Localized Mobility Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Joong-Hee Lee, Jong-Hyouk Lee, and Tai-Myoung Chung
356
AdHoc Networks II
Improvement of Link Cache Performance in Dynamic Source Routing (DSR) Protocol by Using Active Packets . . . . . . . . . . . . . . . . . . . . . . . . . . Dimitri Marandin
367
tinyLUNAR: One-Byte Multihop Communications Through Hybrid Routing in Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Evgeny Osipov
379
Wireless Topics
On the Optimality and the Stability of Backoff Protocols . . . . . . . . . . . Andrey Lukyanenko
393
Maximum Frame Size in Large Layer 2 Networks . . . . . . . . . . . . . . . . . . . . Karel Slavicek
409
Analysis of Medium Access Delay and Packet Overflow Probability in IEEE 802.11 Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Gang Uk Hwang
419
EU Projects Experience
Communications Challenges in the Celtic-BOSS Project . . . . . . . . . . . . Gábor Jeney, Catherine Lamy-Bergot, Xavier Desurmont, Rafael López da Silva, Rodrigo Álvarez García-Sanchidrián, Michel Bonte, Marion Berbineau, Márton Csapodi, Olivier Cantineau, Naceur Malouch, David Sanz, and Jean-Luc Bruyelle
431
NGN Topics
Performance Analysis of the REAchability Protocol for IPv6 Multihoming . . . . . . . . . Antonio de la Oliva, Marcelo Bagnulo, Alberto García-Martínez, and Ignacio Soto
443
Controlling Incoming Connections Using Certificates and Distributed Hash Tables . . . . . . . . . Dmitrij Lagutin and Hannu H. Kari
455
Design and Implementation of an Open Source IMS Enabled Conferencing Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A. Buono, T. Castaldi, L. Miniero, and S.P. Romano
468
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
481
Effects of Spatial Aggregation on the Characteristics of Origin-Destination Pair Traffic in Funet
Ilmari Juva, Riikka Susitaival, Markus Peuhkuri, and Samuli Aalto
Networking Laboratory, Helsinki University of Technology, P.O. Box 3000, FI-02015 TKK, Finland
[email protected]
Abstract. In this paper we analyze measurements from the Finnish University Network (Funet) and study the effect of spatial aggregation on the origin-destination flows. The traffic is divided into OD pairs based on IP addresses, using different prefix lengths to obtain data sets with various aggregation levels. We find that typically the diurnal pattern of the total traffic is followed more closely by the OD pairs as their volume increases, but there are many exceptions. The Gaussian assumption holds well for all OD pairs when the aggregation level is high enough, and we find an approximate threshold for OD pair traffic volume after which they tend to be Gaussian. Also the functional mean-variance relation holds better when the aggregation level is higher. Keywords: Measurements, Traffic characterization, Gaussianity, Mean-variance relation.
1 Introduction
Origin-Destination (OD) pair traffic refers to the traffic flow that traverses between two nodes in a network. Depending on the aggregation level, these nodes can be, for example, hosts, routers, or ISPs. The main feature of measuring OD pair traffic is that traffic has to be aggregated both in time and space. Diurnal variation of Internet traffic is usually studied at a coarse level of temporal aggregation with sample intervals of some minutes, whereas packet-level dynamics have to be studied at a very fine granularity of time. When aggregation in space is considered, traffic flowing between two hosts is an example of a very fine level of spatial aggregation, whereas ISP-level studies represent coarse-grained aggregation. In many areas of traffic engineering, the nature of OD pair traffic plays an important role. For example, in load balancing, shares of OD pair traffic are moved from one route to another. The idea of traffic matrix estimation is to estimate the OD traffic flows from the measured link flows. The existing estimation techniques make several assumptions about the OD pair traffic, including Gaussianity, a functional mean-variance relationship, and independence of the traffic samples. Evidently, the validity of these assumptions in real traffic traces depends both on the level of temporal and spatial aggregation. A few papers have studied the characteristics of OD pair traffic earlier. First, Feldman et al. [3] characterize point-to-multipoint traffic and find that a few demands account
Corresponding author.
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 1–12, 2007. © Springer-Verlag Berlin Heidelberg 2007
for 80% of the total traffic and that the traffic volumes follow Zipf's law. Daily profiles of the greatest demands also vary significantly from each other. Bhattacharyya et al. characterize Point-of-Presence-level (POP) and access-link-level traffic dynamics in [4]. They also find that there are huge differences in the traffic volumes of the demands. In addition, the larger the traffic volume of an egress node, the larger the variability of the traffic during the day. Finally, Lakhina et al. [5] analyze the traffic of two backbone networks. Using Principal Component Analysis (PCA) they demonstrate that OD flows can be approximated by a linear combination of a small number of so-called eigenflows. In addition, they observe that these eigenflows fall into three categories: deterministic, spiky, and noisy. We have also previously studied the characteristics of measured Funet link traffic: in [7] we studied the characteristics of the aggregate link traffic and in [8] OD pair traffic at a fixed spatial aggregation level. Even though the aforementioned measurement studies answer some of the questions related to OD pair traffic, a full understanding of how spatial aggregation changes the characteristics of OD pair traffic is still missing. To this end, in this paper we study the effect that aggregation in space has on the OD pair traffic characteristics. Thus the main contribution of this paper is to locate boundaries for certain traffic behavior characteristics as a function of the aggregation level. The traffic of a link in the Funet network is divided into OD pairs with different prefix lengths. Traffic characteristics are often analyzed at short time scales. We take the vantage point of traffic engineering and traffic matrix estimation, in which the relevant time scale is minutes, instead of seconds or less.
We show that while the diurnal pattern of the OD pairs is not always the same as that of the total traffic, the correlation is, in general, better when the OD pair's traffic volume is larger. The Gaussian assumption, on the other hand, is shown to hold well for all OD pairs exceeding a certain size. For the relation between mean and variance we found that the larger the aggregation level, the better the relation holds. The rest of the paper is organized as follows. In Section 2 we explain the measurement methodology and introduce the data set used in the study. Section 3 studies the magnitudes of OD pairs, while Sections 4 and 5 study how the aggregation affects the diurnal pattern and Gaussianity of the OD pairs. In Section 6 the existence of a mean-variance relation is studied. Finally, Section 7 concludes the paper.
2 Measurements and Original Data
The traffic traces of this paper were captured by Endace DAG 4.23 cards from the 2.5 Gbit/s STM-16 link connecting the nodes csc0-rtr and helsinki0-rtr in the Funet network¹. The link is bidirectional, and we denote the direction from helsinki0-rtr to csc0-rtr by d0 and the opposite direction by d1. Further details of the measurement process are available in earlier work based on the same measurements [7]. We divide the traffic of the link into origin-destination pairs by identifying the origin and destination networks of packets by the left-most bits in the IP address. Let l denote the number of bits in this network prefix, also called the network mask. Different levels of
For details about Finnish university network (Funet), see www.csc.fi/suomi/funet/verkko. html.en
aggregation are obtained by changing the prefix length l. The maximum length of the network prefix is 24 bits. With this resolution, there are 2^24, or over sixteen million, possible origin networks. On the other hand, with the prefix length l = 1 there are only two networks and thus four possible OD pairs. Our procedure for selecting OD pairs for further analysis from the original link traffic is the following. Combining both directions, the N most active networks in terms of traffic sent are selected and an N × N traffic matrix is formed, where N ≤ 100. This is enough to include all the significant OD pairs. From the obtained traffic matrix, at most the M greatest OD pairs in terms of sent traffic are selected for further analysis. We select M = 100, except in Section 6, where we use M = 1000. Note that for a very coarse level of aggregation the number of all OD pairs remains under 100. The measurements capture the traffic of two days: November 30th 2004 and June 31st 2006, with the main focus being on the first day. The traffic is divided into origin-destination pairs using different prefix lengths and aggregated in time to a one-minute resolution. For each prefix length l and direction d0/d1 separately, we denote the original measurement data by x = (x_{t,k}; t = 1, 2, ..., T, k = 1, 2, ..., K), where x_{t,k} refers to the measured bit count of OD pair k over a one-minute period at time t minutes. Let us consider the traffic of individual OD pairs. As in [1], we split the OD pair bit counts x_{t,k} into components,
$$x_{t,k} = m_{t,k} + s_{t,k} z_{t,k},$$
where m_{t,k} refers to the moving sample average, s_{t,k} to the moving sample standard deviation, and z_{t,k} to the sample standardized residual of OD pair k. The averaging period was chosen to be one hour. Thus,
$$m_{t,k} = \frac{1}{60} \sum_{j=t-30+1}^{t+30} x_{j,k} \qquad \text{and} \qquad s_{t,k} = \sqrt{\frac{1}{60} \sum_{j=t-30+1}^{t+30} \left(x_{j,k} - m_{j,k}\right)^2}.$$
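The moving-average decomposition above can be sketched in code as follows. This is an illustrative Python sketch, not the authors' implementation; the function name is ours, and for simplicity the window is clipped at the series edges and the window mean is used inside the variance, whereas the paper uses the per-sample moving mean m_{j,k}.

```python
import math

def decompose(x, w=60):
    """Split a one-minute bit-count series x into a moving mean m, a moving
    standard deviation s, and a standardized residual z, using a centered
    w-sample window (w = 60 corresponds to the one-hour averaging period)."""
    n = len(x)
    m, s, z = [], [], []
    for t in range(n):
        # indices j = t - w/2 + 1, ..., t + w/2, clipped to the series edges
        lo, hi = max(0, t - w // 2 + 1), min(n, t + w // 2 + 1)
        window = x[lo:hi]
        mt = sum(window) / len(window)
        st = math.sqrt(sum((xj - mt) ** 2 for xj in window) / len(window))
        m.append(mt)
        s.append(st)
        z.append((x[t] - mt) / st if st > 0 else 0.0)
    return m, s, z
```

For a constant-rate series the residual z is identically zero, which is a quick sanity check of the decomposition.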
Fig. 1. One day original traffic trace and moving average of the studied link. Left side: direction d0 , right side: direction d1 .
The traces of the total traffic on the first measured day in the studied link for directions d0 and d1 are shown on the left and right sides of Figure 1, respectively. The figure also depicts the moving sample averages of the traces. The diurnal variation of the traffic at this level of aggregation is clearly visible. The busiest hour of the day is in the middle of the day, from 11 a.m. to 12 noon, in both directions.
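The prefix-based classification of packets into OD pairs used in these measurements can be sketched as follows. This is an illustrative Python sketch (not from the paper); the function names and the example addresses are ours.

```python
def network_prefix(ip: str, l: int) -> int:
    """Return the left-most l bits of a dotted-quad IPv4 address as an
    integer, i.e. the network identifier for prefix length l."""
    a, b, c, d = (int(part) for part in ip.split("."))
    addr = (a << 24) | (b << 16) | (c << 8) | d
    return addr >> (32 - l)

def od_pair(src: str, dst: str, l: int) -> tuple:
    """Identify the OD pair of a packet by the l-bit prefixes of its
    source and destination addresses."""
    return (network_prefix(src, l), network_prefix(dst, l))

# With l = 1 there are only two possible networks (first bit 0 or 1),
# so at most four OD pairs, as noted in the text:
assert od_pair("128.0.0.1", "10.0.0.1", 1) == (1, 0)
```

Increasing l refines the classification, up to the 24-bit resolution with 2^24 possible origin networks.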
3 Magnitudes of OD Pairs
In this section we study the size of the OD pairs at different aggregation levels. We are interested in how the traffic is distributed in the address space, and whether power-law behavior is observable in the sizes of the OD pairs, which would mean that the decrease in OD pair size as a function of rank should be linear on the log-log scale. For OD pair k we define the volume X_k as the average number of bits transferred per second over one day,
$$X_k = \frac{1}{60T} \sum_{t=1}^{T} x_{t,k}.$$
When the level of aggregation is very coarse (l ≤ 4), the number of non-zero OD pairs is smaller than 100 and we are able to study the volumes of the complete traffic matrix. In Figure 2 we have depicted traffic matrices for the cases from l = 1 to l = 4. In the density graphs, the darker the color, the more traffic is sent, while white indicates that there is no traffic between the networks. When l = 1, the classification into OD pairs is done based on the first bit of the network prefix. The density plot shows that most of the traffic in the link originates and terminates in the network whose first prefix bit is 1. On the other hand, there is no traffic at all between networks with first bit 0. As we increase l, the density plots become sparser, since the non-zero OD pairs form only a minor part of all possible OD pair combinations in the traffic matrix. One reason for the sparseness is that the measurements are not network-wide, but from just one link. Next we consider the volumes of the OD pairs with different values of l. In Figure 3 the OD pairs are sorted from greatest to smallest and their volumes are plotted on the log-log scale, when the prefix length varies from l = 4 to l = 22. For every level of aggregation there are approximately 15 very significant OD pairs and after that the volumes decrease. We note that for l ≥ 10 the decrease is quite linear.
Fig. 2. Traffic volume sent between the origin and destination network for different prefix lengths l. Black: a lot of traffic, white: no traffic. Direction d0 .
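The per-pair volume X_k and the rank ordering used in this section can be computed directly, e.g. as in the following illustrative Python sketch (function name and toy data are ours, not from the paper).

```python
def od_volumes(x, T):
    """x maps an OD pair key to its list of T per-minute bit counts x_{t,k}.
    Returns the OD pairs sorted by the average rate
    X_k = (1 / (60 T)) * sum_t x_{t,k}  [bits per second], largest first."""
    vols = {k: sum(counts) / (60.0 * T) for k, counts in x.items()}
    return sorted(vols.items(), key=lambda kv: kv[1], reverse=True)

# Toy example with three OD pairs over T = 2 minutes:
x = {"a": [120, 240], "b": [60, 60], "c": [600, 600]}
ranked = od_volumes(x, 2)
assert ranked[0] == ("c", 10.0)  # 1200 bits over 120 seconds
```

Plotting the ranked volumes on a log-log scale then reveals whether the rank-size decrease is linear, i.e. whether a power law is a reasonable fit.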
% of total traffic
On the left side of Figure 4 the volume of the greatest OD pair for each aggregation level l is plotted. Decrease in the volume as a function of l is first very steep even in the logarithmic scale, but then it saturates until l changes from 16 to 17 where the volume drops again. In general, as compared to the hypothetical situation where all link traffic is divided evenly among all possible OD-pairs, the decrease is moderate. On the right side of Figure 4 we show the percentage that the 15 greatest OD pairs comprise of the total link traffic as a function of l. Even for finer resolutions, such as l = 16, these 15 pairs form a significant part of the traffic. As a result of this section we can say that the classification of the link traffic based on origin and destination pairs produces "mice" and "elephants", which is a well known phenomenon from earlier Internet measurement studies. However, the power-law assumption is valid only for finer granularity of aggregation, such as l ≥ 10, where the traffic volumes are smaller.
Mbps
200 100 50 0
5
10
15 l
20
90 80 70 60 50 40 30 0
5
10
15
20
l
Fig. 4. Left side: The volumes of the greatest OD pairs as a function of prefix length l. Right side: The percentage of traffic of 15 greatest OD pairs as a function of l. Direction d0 .
6
I. Juva et al.
4 Diurnal Variation of the OD Pair Traffic In [8] we observed that at a fine aggregation level of l = 22 none of the OD pairs seemed to follow the diurnal variation of the total link traffic, in which the traffic peaks in the midday. We concluded that the strong diurnal variation in the link traffic is more explained by the variation in the number of active on-off OD pairs than diurnal pattern within these OD pairs. However, we would expect that when increasing the aggregation level, at some point the diurnal pattern should become visible in the OD pairs. In this section we study in more detail the diurnal variation of the OD pairs at different levels of OD pair aggregation. This is done by comparing the daily profiles of the OD pairs and the corresponding profile of the total link traffic, shown in the lower row of Figure 1. As an example, we plot the moving sample averages of the four largest OD pairs with aggregation levels l = 4 and l = 8 for direction d0 in Figure 5. At the coarse level of aggregation we can see different types of diurnal patterns. Pairs 3 and 4 have a diurnal variation close to the variation of the total link traffic, while pairs 1 and 2 are not so close. At the resolution l = 8 only the fourth OD pair follows the diurnal pattern of the link traffic. To better understand how the diurnal variation changes as the aggregation level l increases, we study the correlation between two time series; the moving sample average of the total link traffic, and moving sample average of the OD pair k. The correlation coefficient between any two time series x = (xi , i = 1, ..., n) and y = (yi , i = 1, ..., n) is defined as n (xi − x)(yi − y) n r(x, y) = n i=1 . (1) 2 (x − x)2 i=1 i i=1 (yi − y) On the left side of Figure 6 we plot the correlation coefficients for all OD pairs with all aggregation levels l and directions d0 and d1 as a function of the volume of the OD pair. 
For small OD pairs both positive and negative correlations exist, but for large OD pairs the correlations are positive, as we would expect. However, the dependence between the correlation and the volume of the OD pair is not strong. On the right-hand side of the same figure the mean of the correlation coefficients for the OD pairs with
Fig. 5. The moving sample average for the 4 greatest OD pairs. Prefix length l = 4 (upper) and l = 8 (lower). Direction d0 .
Effects of Spatial Aggregation on the Characteristics of OD Pair Traffic
Fig. 6. Testing diurnal variation. Left side: OD pairs' correlation to total link traffic as a function of the traffic volume. Right side: Average correlation of OD pairs with different prefix lengths l.
a given prefix length l is plotted. We can see that the mean correlation decreases as a function of l, as the earlier figures indicated. As a conclusion of this section we can state that as the aggregation level of the traffic becomes coarser, the diurnal traffic pattern of the OD pairs moves closer to the variation of the total link traffic. However, there is no clear bound in OD pair volume or in prefix length after which we can say that the daily behavior is similar to the familiar profile found in the link traces.
5 Gaussianity

A typical assumption in traffic engineering is that traffic follows a Gaussian distribution. Specifically, traffic matrix estimation techniques make this assumption to simplify statistical calculations. In [7] the aggregated link traffic was found to follow the Gaussian distribution very closely. However, when we studied the origin-destination flows in [8], only a small portion of them were anywhere close to Gaussian, typically only the larger flows. Due to the Central Limit Theorem we might assume that when the aggregation of individual non-Gaussian flows is large enough, the aggregate will indeed follow the Gaussian distribution. In [9] the authors studied the number of users required for the aggregate to be Gaussian and found that "a few tens of users" is typically sufficient. We study the different aggregation levels in terms of traffic volume in order to determine how much traffic is needed to yield Gaussian behavior. In this paper, the Gaussianity of each OD pair is evaluated by the Normal-quantile (N-Q) plot of the standardized residual z_{t,k}. The original sample (denoted by x in the equation) is ordered from the smallest to the largest and plotted against a, which is defined as

a_i = \Phi^{-1}\!\left(\frac{i}{n+1}\right), \quad i = 1, \ldots, n,

where Φ is the cumulative distribution function of the Gaussian distribution. The vector a contains the quantiles of the standard Gaussian distribution, thus ranging approximately from −3 to 3. If the considered data follow the Gaussian distribution, the N-Q plot should be linear. Goodness of fit with respect to this can be calculated by the linear correlation coefficient r(x, a), defined in (1), and the value r^2 is used as a measure of the goodness of fit, an approach used in [10] and in our earlier works [7,8]. In [9] the
Fig. 7. Testing Gaussianity: Goodness of fit values r^2 as a function of OD pair traffic volume
Fig. 8. Testing Gaussianity: Distribution of r^2 values for OD pairs of different traffic volumes (panels: 0-1 kbps, 1 kbps-1 Mbps, 1 Mbps-10 Mbps, > 10 Mbps)
authors studied this method and found that, although simple, it is sufficiently accurate to determine the validity of the Gaussian assumption. They note that when r^2 > 0.9, the more complex Kolmogorov-Smirnov test also usually supports the assumption that the traffic is Gaussian. In Figure 7 the size of the OD pair traffic volume (bits per second) is plotted against the goodness of fit value r^2 of the Gaussian assumption. We can see from the figure that the larger flows are always close to Gaussian, with r^2 values easily over 0.90. The largest OD pair with r^2 < 0.90 has a traffic volume of 17.5 Mbps. The vertical line in the figure is located at 10 Mbps, which seems to be an approximate threshold after which an overwhelming majority of the OD pairs have r^2 > 0.90, with r^2 > 0.98 for many of the OD pairs, as seen in the histogram of Figure 8. For OD pairs of size 1 Mbps
Fig. 9. Testing Gaussianity: Average OD pair traffic volumes and goodness of fit values r^2 as a function of prefix length l. Direction d0.
to 10 Mbps there is still a lot of Gaussian traffic, while for OD pairs smaller than 1 Mbps no Gaussian behavior is observable. For the smallest OD pairs the fit is almost always near zero, as these are typically flows that have one or a few bursts of traffic and are idle the rest of the time. In Figure 9 the average OD pair traffic volumes and the average r^2 values are shown as a function of the prefix length. The average is taken over those largest OD pairs that comprise 80 percent of total traffic. For the link direction d0, depicted in the figure, the first six cases, with prefix lengths from 1 to 6, have an aggregation level high enough so that their average traffic volume is over 10 Mbps, and the r^2 values for the first seven cases exceed 0.9. For the d1 direction, the first six are over 10 Mbps and the same six are over 0.9, while the seventh is almost exactly 0.9. In general, the ten-megabit threshold also seems to apply approximately to averages: an average of 10 Mbps implies that the goodness of fit is better than 0.90. However, in both directions the values decline rather slowly from good to reasonable to adequate, until a steep drop occurs from adequate to bad values between network prefixes of 15 and 20 bits. While Figure 9 is in linear scale and fails to depict any observable change in the mean flow size in this region, Figure 4, in logarithmic scale, shows a steep drop in the maximum size of the OD pair. To summarize, while it is impossible to set a concrete threshold, it seems that in our data the majority of the OD pairs with at least 10 Mbps of traffic are fairly Gaussian.
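The N-Q goodness-of-fit measure used above can be sketched as follows. This is a minimal illustration with synthetic samples; the sample sizes and the two example distributions are assumptions, not the measured OD pairs:

```python
import numpy as np
from statistics import NormalDist

def gaussian_fit_r2(x):
    """r^2 of the N-Q plot: correlate the sorted sample against the
    standard Gaussian quantiles a_i = Phi^{-1}(i / (n + 1))."""
    n = len(x)
    nd = NormalDist()
    a = np.array([nd.inv_cdf(i / (n + 1)) for i in range(1, n + 1)])
    xs = np.sort(np.asarray(x, dtype=float))
    r = np.corrcoef(xs, a)[0, 1]
    return r ** 2

rng = np.random.default_rng(1)
gaussian_like = rng.normal(50, 5, 500)       # large, well-aggregated flow
bursty = np.where(rng.random(500) < 0.05,    # mostly idle, rare bursts
                  rng.exponential(10, 500), 0.0)

print(gaussian_fit_r2(gaussian_like))  # close to 1
print(gaussian_fit_r2(bursty))         # clearly lower, below the 0.9 rule
```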
6 Mean-Variance Relation

Traffic matrix estimation is an underdetermined problem if no extra information is available. A typical way to obtain the extra information is to use the mean-variance relation: a functional relation is assumed between the mean λ and the variance Σ of an OD pair's traffic volume. Although this spatial mean-variance relation is a key assumption in many traffic matrix estimation techniques [1,11,12,13], evidence of its validity is contradictory. Cao et al. [1] found the relation sufficiently valid to justify its use, but their study is of a local area network. Gunnar et al. [6] found the relation valid in a study of a Global Crossing backbone network, while Soule et al. considered the validity
not sufficient in their study [14]. We found the relation to hold moderately well in the Funet network, with an average goodness of fit value around r^2 = 0.80 [8]. That study, however, was done at a high resolution, leading to rather small traffic volumes. Now we have extended the measurement data to higher aggregation levels, which probably gives more relevant results, as this is similar to a typical traffic matrix estimation environment, where a backbone network with large traffic volumes is considered. The commonly used power law relation can be written as Σ = φ · diag{λ^c}. The power law relation for the OD pair i is σ_i^2 = φ · λ_i^c, and its logarithm is log σ_i^2 = c log λ_i + log φ. Thus, if the relation held, the points would fall on a line with slope c and intercept log φ in the log-log scale. This is a simple linear regression model, and we can measure the validity of the mean-variance relation with the linear correlation goodness of fit value r^2 used in the previous section. For each prefix length, the mean and the variance are calculated for each one-hour period in the 24-hour trace. In Figure 10 the values are depicted for one selected hour and two selected prefix lengths, with one point in the plot representing the mean and the variance of one OD pair for that hour. For the longer prefix (l = 18), r^2 = 0.80, which is in line with previous results. It can be seen that the values deviate significantly more from the regression line, making the fit worse. However, for the shorter prefix (l = 7), depicted in the same figure, the fit is much better, about r^2 = 0.95. In Figure 11 the average goodness of fit values are shown as a function of the network prefix length l. As the prefix gets longer, there are more OD pairs, with the average size of an OD pair obviously getting smaller. Recall that the average OD pair sizes for different prefixes are shown in Figure 9. For the longer prefixes the fit of the mean-variance relation is around 0.75 to 0.80.
As the resolution gets coarser, the goodness of fit values improve to over 0.90, in some cases as high as 0.95. The OD pair traffic volumes at these aggregation levels are still less than 100 Mbps, and as the growth is approximately linear as a function of the aggregation level, we may conclude that for larger traffic flows the fit is at least as good, probably better. Table 1 shows the values of the exponent parameter c for different aggregation levels. The parameter stays relatively constant, and the values fall within the range of parameter values reported for other networks [1,6,14].
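The log-log regression behind these fits can be sketched as follows. This is a minimal illustration; the synthetic means, the noise model, and the parameter values φ = 0.5 and c = 1.7 are assumptions for the example:

```python
import numpy as np

def fit_power_law(means, variances):
    """Fit sigma_i^2 = phi * lambda_i^c by linear regression in
    log-log scale; return (c, phi, r2)."""
    x, y = np.log(means), np.log(variances)
    c, log_phi = np.polyfit(x, y, 1)       # slope c, intercept log(phi)
    y_hat = c * x + log_phi
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return c, np.exp(log_phi), 1 - ss_res / ss_tot

# Synthetic OD pair means (Mbps) following the power law with noise.
rng = np.random.default_rng(2)
lam = rng.uniform(0.1, 100, 200)
sigma2 = 0.5 * lam ** 1.7 * rng.lognormal(0, 0.3, 200)

c, phi, r2 = fit_power_law(lam, sigma2)
print(f"c = {c:.2f}, phi = {phi:.2f}, r^2 = {r2:.2f}")
```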
Fig. 10. Mean-variance relation in log-log scale. Left: l = 7, r^2 = 0.95; right: l = 18, r^2 = 0.80.
Fig. 11. Testing the mean-variance relation: Goodness of fit values r^2 as a function of prefix length l. Direction d0 on the left side, d1 on the right side.
Table 1. Estimates of the mean-variance relation's exponent parameter c for different prefix lengths l

l:  4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19   20
c: 1.64 1.60 1.60 1.60 1.66 1.71 1.72 1.76 1.77 1.73 1.75 1.76 1.73 1.71 1.67 1.66 1.71
We can conclude that there is a clear dependency between the mean-variance relation fit and the aggregation level. Most importantly, there is a strong functional mean-variance relation in the cases where the aggregation level is high.
7 Conclusion

In this paper we have analyzed the origin-destination pair traffic in the Funet network, and in particular the effects that spatial aggregation has on its characteristics. The Gaussian assumption holds better when the aggregation level is higher. An approximate threshold, after which all OD pairs are at least fairly Gaussian, would appear to be around traffic volumes of 10 to 20 Mbps. This means that for many traffic engineering and traffic modeling tasks, where we consider much larger traffic flows, the Gaussian assumption is justified, but it probably cannot be used in cases with smaller traffic volumes due to the low aggregation level. The diurnal variation of the OD pairs follows the diurnal pattern of the total traffic more closely when the aggregation level is higher. However, there is no clear-cut boundary as for the Gaussianity assumption. We can point out, though, that it would be ill-advised to assume in any scenario that diurnal patterns are similar for all OD pairs, or that the busy hours of different flows coincide. We validated the spatial power law assumption between the mean and the variance of the OD pairs. Particularly at large aggregation levels it holds well. This is an essential result for traffic matrix estimation techniques, which rely on this very assumption. Our results also show that the exponent parameter remained roughly constant regardless of the aggregation, and was within the range of values obtained for it in the literature. To conclude, we can state that the more aggregated the traffic becomes, the better behaved it is in general, in the sense that the assumptions studied hold better.
References
1. Cao, J., Davis, D., Wiel, S.V., Yu, B.: Time-varying Network Tomography. Journal of the American Statistical Association 95, 1063-1075 (2000)
2. Barthelemy, M., Gondran, B., Guichard, E.: Spatial Structure of the Internet Traffic. Physica A: Statistical Mechanics and its Applications 319 (2003)
3. Feldmann, A., Greenberg, A., Lund, C., Reingold, N., Rexford, J., True, F.: Deriving Traffic Demands for Operational IP Networks: Methodology and Experience. IEEE/ACM Transactions on Networking 9(3), ACM Press, New York (2001)
4. Bhattacharyya, S., Diot, C., Jetcheva, J., Taft, N.: Pop-Level and Access-Link-Level Traffic Dynamics in a Tier-1 POP. In: Proceedings of the ACM Internet Measurement Workshop (IMW), ACM Press, New York (2001)
5. Lakhina, A., Papagiannaki, K., Crovella, M., Diot, C., Kolaczyk, E.D., Taft, N.: Structural Analysis of Network Traffic Flows. In: SIGMETRICS/Performance 2004, New York, USA (2004)
6. Gunnar, A., Johansson, M., Telkamp, T.: Traffic Matrix Estimation on a Large IP Backbone - A Comparison on Real Data. In: IMC 2004, Taormina, Italy (October 2004)
7. Juva, I., Susitaival, R., Peuhkuri, M., Aalto, S.: Traffic Characterization for Traffic Engineering Purposes: Analysis of Funet Data. In: NGI 2005, Rome, Italy (2005)
8. Susitaival, R., Juva, I., Peuhkuri, M., Aalto, S.: Characteristics of OD Pair Traffic in Funet. In: ICN 2006, Mauritius (extended version to appear in Telecommunication Systems) (2006)
9. van de Meent, R., Mandjes, M., Pras, A.: Gaussian Traffic Everywhere? In: ICC 2006, Istanbul, Turkey (2006)
10. Kilpi, J., Norros, I.: Testing the Gaussian Approximation of Aggregate Traffic. In: 2nd ACM SIGCOMM Internet Measurement Workshop, Marseille, France (2002)
11. Vardi, Y.: Network Tomography: Estimating Source-Destination Traffic Intensities from Link Data. Journal of the American Statistical Association 91, 365-377 (1996)
12. Liang, G., Yu, B.: Pseudo Likelihood Estimation in Network Tomography. In: IEEE Infocom (2003)
13. Juva, I., Vaton, S., Virtamo, J.: Quick Traffic Matrix Estimation Based on Link Count Covariances. In: Proceedings of ICC 2006, Istanbul, Turkey (2006)
14. Soule, A., Nucci, A., Cruz, R., Leonardi, E., Taft, N.: How to Identify and Estimate the Largest Traffic Matrix Elements in a Dynamic Environment. In: SIGMETRICS/Performance 2004, New York, USA (2004)
Users Dimensioning and Traffic Modelling in Rural e-Health Services

I. Martínez, J. García, and E. Viruete

Communications Technologies Group (GTC), Aragon Institute for Engineering Research (I3A), D.204 - Dpt. IEC, Ada Byron Building, Univ. Zaragoza (CPS.UZ), 50018 Zaragoza, Spain
{imr, jogarmo, eviruete}@unizar.es
Abstract. The development of e-Health services in rural environments, where broadband networks are usually not accessible, requires a specific analysis of available resources to improve Quality of Service (QoS) management. This work quantifies the maximum number of simultaneous users that fulfill the specific QoS levels of common e-Health services, including both store-and-forward and real-time telemedicine applications. The analysis also proposes variations in the modelling of traffic distributions with regard to the number of multiplexed users. The results obtained in this study permit accurate user dimensioning, which is necessary to optimize performance and to guarantee the QoS requirements in this kind of service, where network resources are limited.

Keywords: e-Health, QoS, rural services, traffic model, user dimensioning.
1 Introduction

The great advance of new technologies in recent years has made it possible to increase the quantity and improve the quality of e-Health services in very varied assistance scenarios (rural environments, tele-assistance, home assistance, etc.) [1]-[3]. Each of these heterogeneous environments includes different Types of Service (ToS) that require specific analyses and precise estimations of the Quality of Service (QoS) level that they can offer [4], [5]. In order to achieve that objective, it is crucial to study two aspects: the specific nature of the information to transmit and the exact behaviour of the networks transporting it. Regarding the first aspect, a particular description of the traffic models and parameters associated with the service is required. With regard to the second, the network parameters that allow estimating QoS levels have to be studied to guarantee the feasibility, the efficiency and the precise parameter range for the correct behaviour of e-Health services [6]-[8]. In this line, a widespread idea is to manage and adaptively vary the transmission of the information generated by applications (codecs, transmission rate, compression levels, etc.) to adapt it to the network resources (capacity, available bandwidth, performance, etc.). This concept would permit improving the QoS of e-Health communications, approaching their optimum behaviour at every moment [9], [10].

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 13-25, 2007. © Springer-Verlag Berlin Heidelberg 2007

In the last
years, this idea has been developed in multimedia scenarios over best-effort networks like the Internet, but a detailed analysis in a rural environment, like the one presented in this article, contributes quantitative results to optimize QoS and to model the traffic sources of e-Health applications. Rural scenarios (characterized by long distances to the hospital) are one of the most representative environments in which new technologies allow improving health services by bringing the hospital and the patient closer together, benefiting users massively, irrespective of their location. In this context, a study that fixes specific models depending on the type of traffic and the volume of information transferred as a function of the available resources is required to correctly develop services and to dimension the maximum number of users that can be granted guaranteed QoS in the most adverse situations. The analysis presented in this article has been carried out thanks to an ad-hoc tool [11], [12] that allows integrating the results obtained from experimental measurements (developed at the network laboratory of the University of Zaragoza) and simulated traces (developed with the Network Simulator (NS-2) tool using specific traffic and network models). This integrated methodology permits characterizing, optimizing and modelling the service from the two main points of view: application traffic and communication networks. Section 2 describes the characteristics of the rural scenario, the use cases and the traffic parameters (from the point of view of the application and of the network). Section 3 analyzes the optimum application parameters that fulfill QoS requirements depending on network conditions. These parameters serve as the starting point for Section 4, where the maximum number of system users is obtained. The different traffic models for this environment are presented in Section 5.
Finally, the results obtained and their translation into adaptive mechanisms to guarantee QoS are discussed in Section 6.
2 Description of the e-Health Rural Scenario

The features of the rural scenario correspond to a communication between a non-specialist physician (situated in a rural medical centre) and the reference hospital, in order to offer tele-consulting with a medical specialist or patient tele-care, see Fig. 1. The rural medical centres are situated in a remote place with fixed interconnection technologies (Public Switched Telephone Network, PSTN, or Digital Subscriber Line, DSL). These accesses are often based on narrowband technologies [13], [14]. Thus, for every user connection, a maximum transmission rate to the hospital (upstream) of r ≤ 64 kb/s is considered at the access point. These different user connections are multiplexed towards the remote hospital server, which requires more capacity (C = k·64 kb/s, with k ≥ 1). In addition, every user connection may include different ToS, grouped into two main categories: Store-and-Forward (SF) services and Real-Time (RT) services. SF services are used for applications without time requirements (e.g. transmission of medical tests to an Electronic Healthcare Record (EHR) database). RT services are used by applications that require minimum thresholds of delay and packet loss (biomedical signal transmission, medical video-conference, etc.). In order to study most rural situations, several Use Cases (UC) are proposed, see Fig. 1.
In every UC, it is useful to take into account the study of service performance (according to the occupation factor of network resources, ρ) in order to evaluate the number of simultaneous users (N) that may be multiplexed while keeping their individual QoS level.

2.1 Use Cases

Based on the technical description of the rural scenario, several real situations are proposed (UCs, see Fig. 1). The UC descriptions are the following:
− UC1. The most frequent UC consists of the remote transmission (to the reference hospital) of medical tests (ECGs, ECHOs, digital images) acquired at the medical centre (SF.Data).
− UC2. Including UC1, it adds the transmission of clinical/administrative data and remote access to the EHR database (RT.EHR).
− UC3. It consists of UC2 plus an RT videoconference with a medical specialist for diagnostic support (RT.Media), which includes audio (RT.Audio) and video (RT.Video) services.
− UC4. Including UC3, it is usual to add the RT acquisition and transmission of specific vital signals (pulse, blood pressure) in order to complete the patient diagnostic (RT.Bio).
These previously defined UCs include SF and RT services and permit evaluating and quantifying the optimum performance areas, depending on N and ρ, that guarantee the recommended QoS. The result of this evaluation will also permit modelling the traffic parameters that characterize the service, proposing new traffic models, and designing optimum algorithms according to the variable conditions of the system. To conduct this study, it is necessary to define the main traffic parameters that take part in the
Fig. 1. Evaluation scenario for a rural e-Health service between a Primary Healthcare Centre and a hospital; including transmission of medical tests, patient information, and biomedical signals, EHR updating and audio/video-conference
Fig. 2. Application traffic descriptors and network QoS parameters associated to the evaluation of rural e-Health scenarios
scenario, their specific values in the rural context, and the variable QoS to optimize network resources.

2.2 Traffic Descriptors

The service model used in this study is based on previous contributions detailed in [15]; it has been designed from the technical results obtained in [11], [12] and from the main traffic descriptors and conclusions on QoS in related works [16]-[19]. All these proposed QoS models include the performance of the application as well as of the network technologies and, from both points of view, a generic evaluation scheme for rural e-Health scenarios is proposed in Fig. 2.

A. Application Parameters
− Data size (S). Amount of data (in its original format) generated by the traffic generator (source application).
− Packet size. The transfer unit size of the Internet Protocol (IP) datagram, using the TCP (SMSS) or UDP (s) protocol, depending on the information type (the final packet sizes are calculated adding network and physical layer headers).
− Data rate. It may be defined considering several parameters: the Peak Data Rate (PDR), which is the maximum data rate (the inverse of the smallest interval Δt between the timestamps of two consecutive packets), and the Sustained Data Rate (SDR), which is the transmission data rate measured in a reference time interval (T = t_{i+n} − t_i), see (1).
− Maximum Burst Size (MBS). It is defined as the maximum number of packets that may be transmitted at PDR while guaranteeing SDR. The relation factor of both parameters is defined as the Burst Tolerance (BT), see (1).

PDR = 1/T,  SDR = 1/T_s,  MBS = ⌊1 + BT/(T_s − T)⌋,  with BT = (MBS − 1)·(1/SDR − 1/PDR)   (1)
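The relations in (1) can be illustrated numerically. A minimal sketch follows, with the packet rates chosen only as example values (they are not measurements from the paper):

```python
import math

def burst_tolerance(mbs, pdr, sdr):
    """BT = (MBS - 1) * (1/SDR - 1/PDR), from Eq. (1)."""
    return (mbs - 1) * (1.0 / sdr - 1.0 / pdr)

def max_burst_size(bt, pdr, sdr):
    """MBS = floor(1 + BT / (Ts - T)) with T = 1/PDR, Ts = 1/SDR."""
    t, ts = 1.0 / pdr, 1.0 / sdr
    # Small epsilon guards against floating-point round-off at the boundary.
    return math.floor(1 + bt / (ts - t) + 1e-9)

# Example (assumed values): peak 100 packets/s, sustained 40 packets/s.
pdr, sdr = 100.0, 40.0
bt = burst_tolerance(7, pdr, sdr)        # tolerance for a 7-packet burst
print(bt, max_burst_size(bt, pdr, sdr))  # round-trips back to MBS = 7
```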
B. Network Parameters
− End-to-End Delay (EED) [20]. It is defined as the time from when a packet is transmitted until the packet is received at its destination. It is the addition of several delays: access, processing, buffering, etc. The EED is complemented by other parameters such as jitter (the EED variance between consecutive delays; for RT services, a probability P[jitter > 20 ms] < 10% must be guaranteed).
− Packet Loss Rate (PLR) [21]. It is the number of lost packets with regard to transmitted packets. Thus, the EED-PLR combination is decisive in the QoS study.
− BandWidth (BW) and Available BW (ABW) [22]. BW represents the capacity (C) for all the communications that share the link, and ABW is the capacity not used, which is available for new input connections. Moreover, it is usual to define the effective capacity (Ce) as the real resources for data transmission measured in a reference time interval.
− Occupation factor (ρ). It is normally used for link occupation comparisons related to the available resources, and is a good indicator of service efficiency and performance [23], [24]. In a stable system without packet loss, ρ is limited by a maximum value (ρmáx). Moreover, the control bits are usually distinguished from the information bits; thus, ρ is usually normalized to its maximum value, see ρ* in (2).

ρ* = ρ/ρmáx = Ce/Ce,máx < 1,  with ρ = Ce/C and ρmáx = Ce,máx/C,   (2)
and C = r·k → Ce = r·ke → Ce,máx = r·ke,máx
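Equation (2) can be illustrated with a short helper; the ke and k values below are example assumptions:

```python
def occupation(ke, k, ke_max=1.0):
    """Normalized occupation rho* = rho / rho_max = Ce / Ce_max, Eq. (2).

    With Ce = ke * 64 kb/s (effective capacity actually used) and
    C = k * 64 kb/s (link capacity), rho = ke / k and rho_max = ke_max / k,
    so the capacity C cancels out of the normalized factor.
    """
    rho = ke / k
    rho_max = ke_max / k
    return rho / rho_max

# Example: effective rate 48 kb/s (ke = 0.75) on a 2-channel link (k = 2).
print(occupation(0.75, 2))  # 0.75
```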
3 Parameters Optimization From the specific characteristics of the rural scenarios and some conclusions obtained in previous works [25], this paper proposes new considerations for the traffic descriptors focused on application parameters: data size (S), packet size (SMSS for TCP, and s for UDP), data rate (1/Δt), and burst lengths (bs, bt, and MBS). The variation range considered in this study is detailed in Appendix I. 3.1 SF Services In order to study the main parameters related to SF services, UC1 (that only includes SF.Data) was analyzed. Thus, the influences of the SF parameters (SMSS, Δt and MBS) were evaluated according to EED and ρ* thresholds for different congestion levels: low-light (PLR<0.03) and medium-high (PLR<0.10). Firstly, ρ* was evaluated for: MBSi={4, 7, 11 (packets)}, Δtj={10, 20, 30 (ms)}, and SMSSk={53, 512, 1024, 1500, 2000, 2500 (B)}, and without considering user simultaneousness yet (N=1, r ≤ 64kb/s). The results obtained for each (MBSi , Δtj) duple, indicated as MBSi tj in the legend, are shown in Fig. 3. For the most critical situation (PLR<0.10), there is a better behaviour (higher normalized link occupation) in accordance with the decrease of MBS and Δt (in all trends, the best results are obtained with Δt1=10ms). This yields that, with higher rates, the link utilization is higher and the efficiency improves. This conclusion seems a logic result due to the fact that user rate only depends on the user connection (in the individual access). The discussion about SMSS is not so straightforward because efficiency is high but similar for SMSS2, SMSS3 and SMSS4. These results advise not to discard any SMSS in subsequent evaluations. Moreover, ρ* notably decreases with SMSSk > 1500B, due to the fragmentation of IP packets.
Secondly, the proposed SF evaluation is completed with the EED analysis. In this case, the selected MBS value influences EED more than the possible Δt values. Again, the lowest MBS values yield the best results (lowest delay, see Fig. 4); this permits discarding MBS3. Moreover, the best results are obtained with low packet sizes, SMSS ≤ 1500 B (for an accepted delay variation range in SF services, EED < 180 s). In this case, there are more significant differences depending on the SMSS value. Therefore, the optimal values that can be selected from this study are: Δt1 = 10 ms, MBS1 = 4 and MBS2 = 7, and SMSS2 = 512 B and SMSS4 = 1500 B (the two extreme values, which are the most relevant and also the most technologically representative). These values can be considered the default parameters of the system without user multiplexing (N = 1), but the next step is to evaluate whether this traffic model is valid with N multiplexed users and/or with multiple simultaneous RT services.
Fig. 3. Occupation ρ* depending on SMSS, for the variation range of the MBS and Δt parameters
Fig. 4. EED depending on SMSS, for the variation range of the MBS and Δt parameters
3.2 RT Services

Continuing with the rural scenario and the previous SF premises, the RT services (UC2, UC3, and UC4) are added to the study in order to evaluate the global influence of the RT parameters (s, Δt and MBS) and whether they fulfill the recommended EED and PLR thresholds. Firstly, the evaluation of the buffer sizes (Q) that guarantee the EED and PLR requirements constitutes an interesting analysis to dimension the system. From the experimental and simulated tests, the most relevant results correspond to RT.Media services (distinguishing between RT.Audio and RT.Video), which impose the highest restrictions. Thus, the rural scenario was evaluated with MBSAi = {4, 7 (pps)} and sAj = {100, 240, 300, 400 (B)} for RT.Audio, MBSVi = {5, 10, 15, 30 (fps)} and sVj = {1024, 1280, 1500, 4000 (B)} for RT.Video, and a uniform inter-packet time Δt3 = 15 ms in both cases. The results obtained for each (MBSi, sj) duple are indicated as MBSi sj in the legend, see Fig. 5 and Fig. 6.
Secondly, and in order to analyze the performance limits, evaluation situations with N = 1, 2, and 3 (user connection rate r ≤ 64 kb/s) were included, although only the last case (N = 3) was critical for the QoS study. Thus, for this last case, Fig. 5 (for RT.Audio) and Fig. 6 (for RT.Video) show the evolution of Q depending on the recommended EED and PLR thresholds. In both cases, the trends show that, when the buffer size increases, EED increases linearly and PLR decreases suddenly. This EED/PLR tradeoff conditions the optimal number of simultaneous users and implies the selection of those applications that guarantee QoS:
− RT.Audio service. For MBSA1 = 4 and with Q ≥ 8, all the sizes sAi guarantee QoS. However, for MBSA2 = 7, only Q = 12 (for sA1 and sA2) or Q = 10 (for sA3) are valid combinations because, for sA4, there is no situation that guarantees QoS.
− RT.Video service. For MBSV2 = 10 (and lower values) and with 12 ≥ Q ≥ 9, all the sizes sVi guarantee QoS. However, for MBSV3 = 15, only Q = 10 (for sV1 and sV2) is a valid combination because, for sV3 and sV4, there is no situation that guarantees QoS.
Fig. 5. EED and PLR depending on Q, for different MBS and s combinations (RT.Audio)
Fig. 6. EED and PLR depending on Q, for different MBS and s combinations (RT.Video)
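The EED/PLR trade-off read off from Figs. 5 and 6 can be illustrated with a toy discrete-event simulation. This is not the authors' simulator: the ON-OFF source model, the mean OFF gap, and the parameter values below are this sketch's assumptions. An RT source sends bursts of up to MBS packets into a finite FIFO buffer drained at the 64 kb/s link rate; sweeping the buffer size Q shows the mean end-to-end delay growing while the loss ratio falls.

```python
import random

def simulate_buffer(q_size, pkt_bits, dt_ms, mbs, link_kbps,
                    n_bursts=2000, seed=1):
    """Finite FIFO buffer of q_size packets fed by an ON-OFF RT source:
    bursts of up to `mbs` packets of `pkt_bits` bits, spaced `dt_ms` apart,
    separated by exponential OFF gaps.  Returns (mean EED in ms, PLR)."""
    rng = random.Random(seed)
    service_ms = pkt_bits / link_kbps        # transmission time per packet
    t, departures = 0.0, []                  # times queued packets finish service
    delays, lost, sent = [], 0, 0
    for _ in range(n_bursts):
        t += rng.expovariate(1.0 / (10.0 * dt_ms))   # OFF period (assumed mean)
        for _ in range(rng.randint(1, mbs)):         # ON period: one burst
            t += dt_ms
            sent += 1
            departures = [d for d in departures if d > t]  # purge served packets
            if len(departures) >= q_size:
                lost += 1                    # buffer full -> packet dropped
                continue
            start = max(t, departures[-1]) if departures else t
            departures.append(start + service_ms)
            delays.append(start + service_ms - t)
    return sum(delays) / len(delays), lost / sent

# Sweep two buffer sizes for an RT.Audio-like source (illustrative values):
eed_lo, plr_lo = simulate_buffer(q_size=2, pkt_bits=12000, dt_ms=15,
                                 mbs=7, link_kbps=64)
eed_hi, plr_hi = simulate_buffer(q_size=12, pkt_bits=12000, dt_ms=15,
                                 mbs=7, link_kbps=64)
# larger buffers trade extra delay for fewer drops, as in Figs. 5-6
```

Because both runs share a seed, the arrival sequence is identical, so the comparison isolates the effect of Q alone.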
I. Martínez, J. García, and E. Viruete
4 Users Dimensioning From the conclusions obtained in the previous section for SF and RT services, this section evaluates the global performance of each ToS according to the multiplexing degree (for different values of link capacity, C=k·64 kb/s with k≥1 and C≤2 Mb/s, and the most restrictive situation, PLR<0.10). The occupation factor (ρ) is a good indicator for fairly comparing available resources and measuring link efficiency. In this case, the graphics represent not the normalized factor (ρ*) but the relative factor (ρN), which depends on the number of users (N), see (3), since the interest lies in evaluating its quantitative evolution according to the degree of simultaneousness.

ρN = N·ρ = N·(Ce/C) = N·(ke/k),  with Ce = ke·64 kb/s (ke ≤ 1) and C = k·64 kb/s (k > 1)   (3)
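Plugging numbers into (3): a user with effective rate Ce = 64 kb/s (ke = 1) on a C = 256 kb/s link (k = 4) contributes ρ = 1/4, so N = 3 users give ρN = 0.75. A minimal helper, purely illustrative:

```python
def rho_N(n_users, k_e=1.0, k=1):
    """Relative occupation factor rho_N = N * (Ce/C) = N * ke/k, from eq. (3)."""
    if k_e > 1 or k < 1:
        raise ValueError("eq. (3) assumes ke <= 1 and k >= 1")
    return n_users * k_e / k

# three 64 kb/s users on a 256 kb/s link:
occupancy = rho_N(3, k_e=1, k=4)   # -> 0.75
```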
The results obtained for each performance threshold (selected by ρN) are presented in Fig. 7. The figures show the evolution of the allowed number of users for each UC according to the available network resources (indicated by the link capacity, C).
Fig. 7. Recommended performance areas for the variation range of capacity (C) and number of users (N), depending on the useful thresholds and relative link occupancy factor (ρN)
Regarding the lowest performance thresholds, see Fig. 7(a) and Fig. 7(b), the recommended values of N are very high because the network conditions permit a large number of users. If new ToS are added, N decreases notably because network resources are shared proportionally among the services. If the required performance is higher, see Fig. 7(c) and Fig. 7(d), the variation range of N decreases (as shown by the evolution of the circled areas), implying a considerable increase of network resources in order to accept new users. In these cases, the quantitative relation between N and C is practically linear with k: for each 64 kb/s of link capacity, the system guarantees QoS with a maximum of k users (ρN > 0.90). These results make it possible to quantify the maximum values of N and, therefore, to dimension the number of simultaneous users that can be admitted in each UC of the rural environments while guaranteeing QoS. Moreover, the curves presented suggest several recommended performance areas for a given efficiency threshold and network occupancy level.
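The near-linear N–C relation noted above can be made explicit: for a target relative occupancy ρN and a per-user effective rate Ce = ke·64 kb/s, the maximum number of simultaneous users is ⌊ρN·C/Ce⌋. A hedged helper (the capacity grid mirrors Fig. 7, but the per-user rate is illustrative, not the paper's exact per-UC rates):

```python
import math

def max_users(capacity_kbps, user_rate_kbps=64, rho_n=0.90):
    """Largest N with N * user_rate / capacity <= rho_n (cf. eq. (3))."""
    return math.floor(rho_n * capacity_kbps / user_rate_kbps)

# Sweep the link capacities used in Fig. 7:
capacities = [64, 128, 192, 256, 384, 512, 640, 768,
              1024, 1280, 1536, 1792, 2048]
table = {c: max_users(c) for c in capacities}
# e.g. at C = 2048 kb/s and rho_N = 0.90, at most 28 users of 64 kb/s fit
```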
5 Traffic Modelling Throughout this study, the results correspond to experimental measurements obtained in the test laboratory and evaluated by multiple simulations based on traffic models recommended in the literature and detailed in Appendix I. This last section checks the utility of these models for high values of N (as required in the context of the published works), and analyzes their validity for a more limited number of users (a specific feature of rural scenarios for e-Health services). Thus, the main application parameters, previously characterized, have been considered: MBS and Δt (for SF services); and s, MBS and Δt (for RT services). Firstly, SF services are usually modelled as Constant Bit Rate (CBR) services with exponential MBS (ON-OFF models), and uniform s and Δt with an exponential mean. The K-S test [26] shows (detailing the values of mean and maximum deviation with respect to the theoretical distribution) that, for s and Δt, the mean follows an exponential distribution independently of N, see Table 1(b). However, this conclusion is not valid for MBS, which fits a log-normal distribution better for a small number of simultaneous users (N<15), see Table 1(a). Secondly, for RT services, RT.Bio follows a uniform CBR model with constant rate, RT.EHR follows a multiple model with three levels (session, page and packet), RT.Audio follows a constant CBR model, and RT.Video is characterized as a Variable Bit Rate (VBR) model with exponential mean. Theoretically, the aggregation of these RT services would imply a complex model characterized by their main parameters: s, following a Pareto distribution; MBS, following an exponential distribution with constant mean; and Δt, following an exponential distribution with exponential mean.
The K-S test, for the s parameter, confirms this trend towards a Pareto distribution; but for the MBS and Δt parameters, the K-S test shows that they fit a geometric distribution better for low values (N<13 and N<14, see Table 2(a) and Table 2(b), respectively). In summary, it is remarkable that the SF services can be modelled as CBR (characterized by Δt with exponential mean; and MBS with exponential mean for high values of N, and log-normal mean for low values of N). Moreover, the aggregation of RT services follows a multiple model characterized by s (Pareto distribution), and MBS
Table 1. K-S test applied to SF services: (a) MBS; (b) Δt. For each number of simultaneous users N ∈ {4, 6, 8, 10, 12, 13, 14, 15, 16, 18, 20}, the table reports the mean and maximum K-S deviation with respect to the log-normal (LOG), geometric (GEO), and exponential (EXP) theoretical distributions.
Table 2. K-S test applied to RT services
: (a) MBS; (b) Δt. For each number of simultaneous users N ∈ {4, 6, 8, 10, 12, 13, 14, 15, 16, 18, 20}, the table reports the mean and maximum K-S deviation with respect to the log-normal (LOG), geometric (GEO), and exponential (EXP) theoretical distributions.
and Δt (exponential distribution for high values of N, and geometric distribution for low values of N). Although these differences with respect to the original models are not significant enough to require re-evaluating the entire study, the results obtained are interesting enough to specify more accurate models according to the number of users. These models will make it possible to optimize the design of new e-Health services by allowing the dynamic selection of the application codecs that best fit their specific model.
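The K-S procedure behind Tables 1–2 reduces to computing the empirical CDF of a measured parameter and taking the maximum absolute deviation from a candidate theoretical CDF. A stdlib-only sketch (the sample data and the candidate fits are synthetic, for illustration only):

```python
import math, random

def ks_statistic(samples, cdf):
    """Maximum deviation D between the empirical CDF and a theoretical CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # compare against the ECDF just before and just after the jump at x
        d = max(d, abs((i + 1) / n - f), abs(i / n - f))
    return d

random.seed(0)
dts = [random.expovariate(1 / 15.0) for _ in range(200)]   # synthetic Δt, mean 15 ms
rate = 1.0 / (sum(dts) / len(dts))                         # fitted exponential rate

d_exp = ks_statistic(dts, lambda x: 1 - math.exp(-rate * x))
d_uni = ks_statistic(dts, lambda x: min(x / max(dts), 1.0))  # naive uniform fit
# the exponential model fits these samples far better: d_exp << d_uni
```

The table entries would then be the mean and maximum of such D values across repeated measurement runs, per candidate distribution.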
6 Discussion and Conclusions This paper has presented a quantitative analysis of the maximum number of simultaneous users of common e-Health services that can be served with QoS guarantees in rural scenarios. The results obtained in this study make it possible to propose several optimum performance areas as a function of available network resources and the required thresholds
for efficiency and link occupancy. A set of common telemedicine applications (in combinations that define different use cases) has been considered, and the influence of their traffic descriptors on the QoS levels has been studied. Moreover, the probability distribution models associated with the traffic descriptors have been evaluated, showing their validity for high values of N and proposing some modifications when the number of users is limited, as may happen in rural scenarios. In summary, the methodology proposed (in the specific context of rural e-Health services, but valid for other generic multimedia scenarios) can be applied to the optimum design of new services, by adjusting user multiplexing according to the network resources available over time, and by proposing new adaptive QoS mechanisms. Acknowledgments. This work has been supported by project TSI2004-04940-C02-01 from the Inter-ministerial Commission of Science and Technology (CICYT) and the European Regional Development Fund (ERDF), Pulsers II IP IST-27142 from the VI Framework, and FPU grant AP-2004-3568 from the State Secretary of Universities and Research.
References 1. Yamazaki, T., Matsuda, J.: Adaptive QoS management for multimedia applications in heterogeneous environments: a case study with video QoS mediation. IEICE Trans. Comm. E82-B (11), 1801–1807 (1999) 2. Jennet, P., et al.: A study of a rural community’s readiness for telehealth. J. Telemed. Telecare 9(5), 259–263 (2003) 3. Jennet, P., et al.: Delivery of rural and remote health care via a broadband Internet Protocol network – views of potential users. J. Telemed. Telecare 11(8), 419–424 (2005) 4. Kosuga, M., Yamazaki, T., Ogino, N., Matsuda, J.: Adaptive QoS management using layered multi-agent system for distributed multimedia applications. In: Proc. International Conference on the Parallel Processing, pp. 388–394 (1999) 5. Viruete, E.A., Fernández, J., Martinez, I.: Evaluation of QoS in Internet accesses for Multimedia applications EQoSIM. In: Proc. IEEE Consumer Communications and Networking Conference, vol. 1, pp. 356–360 (2006) 6. Maheu, M., Whitten, P., Allen, A.: E-health, telehealth, and telemedicine: a guide to startup and success. In: Jossey-Bass (ed.), San Francisco, USA, 362.102821-E103 (2001) 7. Wright, D.: The ITU’s report on telemedicine and developing countries. J. Telemed. Telecare 4(1), 75–79 (1998) 8. Slipy, S.M.: Telemedicine and interconnection services reduce costs at several facilities. Health Management Technology 16(8), 52–55 (1995) 9. Taylor, P.: Evaluating telemedicine systems and services. J. Telemed. Telecare 11(4), 167– 177 (2005) 10. Suthon, S-W., Ong, G-M., Pung, H-K.: An adaptive end-to-end QoS management with dynamic protocol configurations. In: 10th IEEE International Conference on Networks ICON, pp. 106–111 (2002) 11. Martínez, I., García, J.: SM3-Quality of Service evaluation tool for Telemedicine-Based New Healthcare Services. In: International Congress on Computational Bioengineering ICCB, pp. 1163–1173 (2005)
12. Martínez, I., Valero, A., Viruete, E., Fernández, J., García, J.: QoSM3: Traffic Modelling Tool for e-Health Services. Telematic Engineering Event JITEL, 423–430 (2005) 13. Bai, J.: PSTN technologies: Health evolution. IEEE Trans. Inf. Technol. Biomed. 2(4), 250–259 (1999) 14. Swartz, D.: Digital Subscriber Lines: DSL in telemedicine. Telemedicine Today 6(2), 28–30 (1998) 15. Martínez, I.: Contributions of traffic models and QoS control for the new healthcare service telemedicine-based. PhD thesis, Zaragoza University (2006) 16. Vogel, A., Kerhervé, B., von Bochmann, G., Gecsei, J.: Distributed Multimedia Applications and Quality of Service – A Survey. IEEE Multimedia 2(2), 10–19 (1995) 17. Aurrecoechea, C., Campbell, A.T., Hauw, L.: A survey of QoS architectures. IEEE Trans. Inf. Techn. Biomed. (2002) 18. Xiao, X., Ni, L.M.: Internet QoS: a big picture. IEEE Network 13(2), 8–18 (1999) 19. Seitz, N.: ITU-T QoS standards for IP-based networks. IEEE Communications Magazine 41(6), 82–89 (2003) 20. Ikenaga, T., et al.: Performance evaluation of delayed reservation schemes in server-based QoS management. IEEE GLOBECOM 2, 1460–1464 (2002) 21. Guérin, R.A.: QoS Routing in Networks with Inaccurate Information: Theory and Algorithms. IEEE/ACM Transactions on Networking 7(3), 605–617 (1999) 22. Mandisodza, R.L.K., Reed, M.J.: Evaluation of buffer management for RT audio transmission over IP Networks. Communication Networks and Services (2001), http://www.iee.org/oncomms/pn/communications (last accessed 30/06/06) 23. Chao, H.J., Guo, X.: Quality of Service Control in High-Speed Networks. John Wiley, Chichester (2002) 24. Kalapriya, K., Raghucharan, B.R., Lele, A.M., Nand, S.K.: Dynamic Traffic Profiling for Efficient Link Bandwidth Utilization in QoS Routing. In: Asia-Pacific Conference on Communication (APCC), pp. 17–38 (2003) 25. Martínez, I., García, J., et al.: Application Parameters Optimization to Guarantee QoS in e-Health Services. In: Int. Conf. IEEE Engineering in Medicine and Biology Society EMBS, pp. 5222–5225 (2006) 26. Romeu, J.L.: K-S: A goodness of fit test for small samples. START Reliability Analysis Center 10(6), 123–126 (2003)
Appendix I. Models Used in This Study (The service model used in this study is based on the contributions of [15], and it has been designed from the technical conclusions of [11]–[19].)

ToS           params                 model                                                        values in this study
SF typeI      S (MB), r (b/s)        OnOff [bs – expo]; OnOff [s,Δt – unif/expo]                  SMSS={53,512,1500}, Δt={10,20,30}, MBS={4,7,11}
SF typeII     S (MB), r (b/s)        OnOff [bs – expo/pareto]; OnOff [s – expo/lognrm]            SMSS={1024,2k,2k5}, Δt={5,10,15,30}, MBS={1,15,30,60}
audio typeI   r (b/s), s (b)         CBR [bs – exponential]; CBR [s,Δt – unif(expo)]              sA={100,240,300,400}, Δt={10,15,30}, MBSA={3,4,5,7}
audio typeII  r (b/s), s (b)         CBR [bs – expo/ray]; On–Off [s – expo/unif]                  sA={100,240,480}, Δt={5,10,15,30}, MBSA={3,4,7}
video typeI   r (b/s), PDR, BT       VBR [bt – unif/nrm]; VBR [s – expo/weib]                     sV={800,1024,1500}, Δt={5,10,15,30}, MBSV={5,10,15,30}
video typeII  r (b/s), PDR, BT       VBR [bt – expo/gama]; VBR [s – pareto]                       sV={1024,1280,4000}, Δt={5,10,15,30}, MBSV={1,15,30,60}
web           session, page, packet  [Δt expo / s logn]; [Δt gam / s paret]; [Δt expo / s unif]   s={40,53,512,1500}, Δt={50,75,100,150}, MBS={20,25,30}
image         r (b/s)                CBR/VBR [bs/s – unif/nrm]                                    s={200,512,1024}, bs={1,3,5,10,15}
bio typeI     r (b/s), s (b)         CBR/VBR [bs/s – unif/unif]                                   s={512,800,1500}, bt={1,6,12,15,30,60}
bio typeII    r (b/s), s (b)         CBR [bs/s – unif/unif]                                       s={40,80,100,200,400}, bt={10,20,30}

r = data rate (b/s), S = data size, s = packet size (bits), bs = burst size (packets), bt = burst time (ms), Δt = inter-packet time (ms), MBS = Maximum Burst Size.
Empirical Observations of Traffic Patterns in Mobile and IP Telephony Poul E. Heegaard Norwegian University of Science and Technology, Dept. of Telematics, N-7491 Trondheim, Norway
[email protected]
Abstract. This paper provides recent empirical traffic data and observations of telephony traffic patterns in mobile and IP telephony. These are compared with older telephony patterns from the Public Switched Telephone Network (PSTN) to investigate the potential evolution of, and impact on, traffic characterisations due to the technology change from fixed to mobile phones, changes in quality from fixed phones to mobile and IP telephones, the change in tariffs from usage-based to flat-rate subscriptions, and the appearance of alternative message-based communication means. The results show daily and weekly traffic profiles that differ from PSTN telephony. In particular, the profile of international calls has changed significantly. Furthermore, the average call holding times show significant variations over the day for flat-rate subscriptions. Finally, the results indicate that the Short Message Service (SMS) seems to serve as a supplement to phone calls, in particular in the evenings, which might change the call holding time distribution and traffic intensities.
1 Introduction
A traditional communication service like telephony is still a popular service offered to customers, but the conditions are changing compared to the traditional, circuit-switched Plain Old Telephone Service (POTS). The communication technology is changing (VoIP, GSM, GPRS), the users and terminals are mobile, the terminals are changing (cell phones, IP-based soft phones), the codecs and speech quality are changing, the tariff profile is changing (flat-rate subscriptions), and a variety of other communication services is appearing (email, SMS, MMS, chat). When evaluating a service like telephony it is very important to be able to characterise it correctly. Furthermore, to do proper traffic engineering, traffic planning and modelling, and traffic forecasting, in-depth knowledge of traffic patterns is still very important. There exist many publications on measurement methods, on traffic model parametrisation, a great number on data traffic measurement with focus on the packet level, and some on the session level. However, few measurements of the telephony service have been reported in the last decade, and the question in this paper is therefore whether the observations from the previous measurements still hold, or whether the traffic pattern with respect to e.g. daily and weekly
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 26–37, 2007. © Springer-Verlag Berlin Heidelberg 2007
variations in traffic, arrival rate, and call holding times has changed as a consequence of the changed conditions. The traffic engineering E-series recommendations of ITU-T [1,2] include four major tasks: traffic demand characterisation, grade of service (GoS) objectives, traffic control and dimensioning, and performance monitoring recommendations. Figure 1 shows the interrelations between these traffic engineering tasks. Traffic control and dimensioning take input from traffic demand characterisation and the grade of service specification. Performance monitoring is required to check whether the grade of service objectives are fulfilled and to provide input for refining the traffic demand characterisation and the traffic control and dimensioning processes.
Fig. 1. Traffic engineering tasks [1]
This paper focuses on traffic demand characterisation and the potential impact of observed changes in the traffic patterns on the traffic control and dimensioning of e.g. call control servers, mobile channels, or PSTN gateways. The traffic characterisation is conducted by means of statistical models of large populations of network service users. The evolution of traffic patterns is discussed relative to the basic assumptions made in the traffic engineering recommendations. Section 2 describes the measurement data. Section 3 presents traffic characterisations based on the data in Section 2, including daily and weekly variations in traffic intensities, how the average call holding time changes, and how the standard traffic demand profiles in the ITU-T recommendations compare with the profiles observed in the IP telephony measurements. Section 4 gives some closing comments.
2 Traffic Measurement Data
Measurements of telephony usage have been conducted for many years. Early computerised measurements were reported e.g. in the Holbæk [3] and CARAT [4] studies, where the telephony service was measured on analogue PSTN exchanges with no terminal mobility. The measurements of IP and mobile telephony are compared with the Holbæk and CARAT measurements. The IP telephony data in this paper are extracts from Call Detail Records (CDR) provided by Telio, a Norwegian IP telephony operator, and the scaled and aggregated hourly mobile telephony statistics are provided by Netcom, a Norwegian mobile operator. The data are anonymous both with respect to the originating and terminating numbers and to the identity and position of base stations. The data are scaled by an unknown factor in order not to reveal any details about the absolute traffic volumes of the operators. The CDR data consist of approximately 1 million outgoing VoIP entries originated in the IP network and terminated in the Public Switched Telephone Network (PSTN). The IP data set, I, consists of records, Ik = {ts, tc, d}, that contain information about the k'th call session, including the start time, ts, the call holding time (the duration of a call), tc, and the terminating region, d, (Domestic, Europe, USA, Asia, Africa, Australia) in the PSTN. The data were collected in July 2005. The majority of the users are residential users. The CDR does not include IP-IP traffic, which at that time was approximately 4% of the total traffic. The data from the mobile operator contain hourly records from a small number of mobile stations in Norway.
The mobile GSM data set, M, consists of records Mk = {b, t, nl, e, nc, nhi, nho, nsi, nso} that contain aggregated and scaled information for the k'th hour, including the base station identity (anonymous), b, the time of the logged record, t, the number of lost calls, nl, the traffic volume in Erlang, e, the number of calls, nc, the number of handovers (incoming, nhi, and outgoing, nho), and the number of SMS messages (incoming, nsi, and outgoing, nso). The data were collected over a two-week period in September 2005. Neither the location of the base stations where the data were collected, nor the ratio between business and residential users, is known. The details of the loss ratio (which is very low) and the handover rate (approximately 2 handovers per originating call) are not discussed in this paper.
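The CDR records Ik = {ts, tc, d} can be represented and aggregated per hour of day along the lines below. The field names and record layout are this sketch's assumptions, not Telio's actual schema:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class CallRecord:
    start_s: float      # ts: session start time, seconds since midnight
    holding_s: float    # tc: call holding time (duration), seconds
    region: str         # d: terminating region (Domestic, Europe, USA, ...)

def calls_per_hour(records):
    """Arrival counts keyed by (hour of day, terminating region)."""
    counts = Counter()
    for r in records:
        counts[(int(r.start_s // 3600) % 24, r.region)] += 1
    return counts

# three toy records: two around noon, one at 22:00
cdrs = [CallRecord(12 * 3600 + 60, 180.0, "Domestic"),
        CallRecord(12 * 3600 + 900, 600.0, "Europe"),
        CallRecord(22 * 3600, 1200.0, "Europe")]
hourly = calls_per_hour(cdrs)
```

Summing such counts over the 31-day collection period, per 15-minute or hourly bin, yields intensity profiles of the kind plotted later in Figure 2.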
3 Traffic Demand Characterisation
This paper focuses on the traffic demand characterisation among the traffic engineering tasks briefly discussed in the introduction. Here this includes:
– traffic intensities and variation, i.e. the number of calls per time unit and how this varies over the day and week;
– average call holding times and distribution, i.e. the average duration of a call, how it varies over the day and call category, and how the call holding times are distributed;
– traffic demand profiles, i.e. the traffic volumes in Erlang and how this varies over the day and week.
The traffic characterisations in this section are based on the empirical data from the traffic measurements described in Section 2. Section 3.1 presents and discusses the daily and weekly variations in traffic intensities. Section 3.2 shows how the average call holding time changes, and Section 3.3 compares the standard traffic demand profiles in the ITU-T E.523 recommendation with the profiles observed in the IP telephony measurements.
3.1 Traffic Intensities
In traffic modelling it is quite common to assume that the call arrival process is a time-homogeneous Poisson process. It is mathematically convenient, can be justified by the Palm-Khintchine theorem, and has been confirmed by many previous measurements (e.g. the Holbæk measurements [3], also included as examples in Iversen's textbook on teletraffic theory [2]). It is well known that the arrival process is time-inhomogeneous, since the intensity varies as a function of time of day and day of week, but it can be homogeneous in quasi-stationary time periods of 20-30 minutes (e.g. in the busy period). The traffic intensity measurement principle in E.500 [5] assumes that both the arrival and departure processes are stationary in the observation period. The traffic intensity typically varies as a natural effect of the daily life cycles of the users/subscribers. This was clearly observed in the Holbæk measurements [3]: in the business areas the traffic decreased at lunch time (12:00) and at office closing hours [6]. In residential areas the traffic increased after dinner time (17:00) and decreased when the television news was broadcast (20:00) [4]. It is also observed for IP and mobile telephony; see Figure 2, where this is compared to the variations of the arrival intensities observed in the Holbæk measurements. The arrival intensity is averaged over 15-minute periods for the IP telephony and the Holbæk measurements, and over 60-minute periods for the mobile telephony measurements. The arrival processes have a daily profile with peak intensities at different times of day. The IP domestic peak arrival intensity is at 12:00, mobile traffic at 15:00, and IP international traffic at 22:00. Furthermore, it is observed that the arrival intensity for all traffic types drops to almost zero at night, even for international traffic where the destinations might be in different time zones.
The reason is probably that most of the international destinations are in Europe (belonging to almost the same time zone), and that the IP data set I contains calls originated in Norway only. The main observation from this section confirms the previous measurements: the arrival intensity varies as a function of time of day and day of week, and the profiles and peak hours are different for domestic fixed, mobile, and international calls. This must be carefully considered before assuming a time-homogeneous Poisson process in a traffic model of a mixture of domestic fixed, mobile, and international calls.
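A time-inhomogeneous arrival process of the kind this section cautions about can be generated by thinning (the Lewis-Shedler method): draw candidate arrivals at the peak rate λmax and accept each with probability λ(t)/λmax. The diurnal profile below is a made-up stand-in for the measured intensity curves, not fitted to the paper's data:

```python
import math, random

def daily_rate(t_hours):
    """Illustrative diurnal intensity (calls/min): low at night, peaked midday."""
    if 6 <= t_hours <= 22:
        return max(0.0, 10.0 * math.sin(math.pi * (t_hours - 6) / 16))
    return 0.1

def nonhomogeneous_poisson(rate, t_end_h, lam_max, rng):
    """Lewis-Shedler thinning: candidates at lam_max (per minute),
    each accepted with probability rate(t)/lam_max."""
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(lam_max * 60)   # candidate gap in hours
        if t >= t_end_h:
            return arrivals
        if rng.random() < rate(t) / lam_max:
            arrivals.append(t)

rng = random.Random(42)
calls = nonhomogeneous_poisson(daily_rate, 24.0, 10.0, rng)
midday = sum(1 for t in calls if 11 <= t < 15)
night = sum(1 for t in calls if 0 <= t < 4)
# midday traffic dwarfs night-time traffic, matching the shape of Fig. 2
```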
Fig. 2. Variations of the number of calls in IP telephony compared to the Holbæk measurements (1969) [3]
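Beyond call counts, the Erlang profiles shown later (Figures 6-7) follow from the same CDRs by crediting each call's holding time to the 15-minute bins it overlaps and dividing by the bin length. A sketch under the same assumed record layout of (start, holding time) pairs in seconds:

```python
def erlang_profile(calls, bin_s=900, day_s=86400):
    """calls: iterable of (start_s, holding_s) pairs.  Returns the mean
    number of concurrent calls (Erlang) in each bin of the day."""
    n_bins = day_s // bin_s
    busy = [0.0] * n_bins
    for start, hold in calls:
        t, end = start, start + hold
        while t < end:
            b = int(t // bin_s) % n_bins     # wrap past midnight
            edge = (t // bin_s + 1) * bin_s  # end of the current bin
            busy[b] += min(end, edge) - t    # seconds of this call in bin b
            t = edge
    return [s / bin_s for s in busy]

# one 30-minute call starting at noon fills two 15-minute bins completely:
profile = erlang_profile([(12 * 3600, 1800)])
# profile[48] == profile[49] == 1.0 (bins 48 and 49 cover 12:00-12:30)
```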
3.2 Call Holding Times
In many traffic models it is assumed that the call holding time distribution is exponential and that the expected holding time is independent of the time of day. This is mathematically convenient, and the loss formulae are insensitive to the holding time distribution. Previous measurements have shown that this is not necessarily the case and that the empirical distribution tends to be more heavy-tailed than the exponential distribution. Other model distributions such as hyper-exponential, Weibull, and normal distributions have been applied [2]. E.g. in [7], the author showed that mixtures of log-normals fit the call holding time better than the exponential. Typically, the model distribution is fitted to the busy hour of the empirical distribution. More recent measurements from mobile systems focus less on the call holding time and more on the channel holding times in the base stations. This includes both new calls and handover of calls from neighbouring base stations. The arrival process of channel allocation in mobile cells [8,9] and the channel holding time distribution [9,10,11] are similar to the call arrival process and holding time distribution, and the daily and weekly variations follow similar patterns. In the IP telephony (VoIP) measurements the focus is mostly on the effects of IP packet drops on the VoIP session quality (see e.g. [12,13,14]). The measurement techniques and analysis are focused on the IP packet level and packet sessions, and less on the IP telephony service level. Compared to previous measurements, it can be seen in Figure 3 that the average call holding time is significantly lower during the daytime than in the evenings and at night. As can be observed from the coefficient of variation in Figure 4, the accuracy of the average call holding time does not change significantly over the day. Hence, the variation in the average call holding times is not due to
Fig. 3. The daily variation in the average IP telephony call holding times compared with the Holbæk measurements (1969) [3]
Fig. 4. Call holding time of international calls with average and coefficient of variations (CoV)
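The per-hour averages and coefficients of variation plotted in Figures 3-4 reduce to simple bucket statistics over the CDRs. The holding-time samples below are synthetic, and the short-day/long-evening split is an assumption made purely to mimic the observed shape:

```python
import math, random
from collections import defaultdict

def holding_time_stats(calls):
    """calls: iterable of (hour_of_day, holding_time_s) pairs.
    Returns {hour: (mean, coefficient_of_variation)} per time-of-day bucket."""
    buckets = defaultdict(list)
    for hour, tc in calls:
        buckets[hour].append(tc)
    stats = {}
    for hour, xs in buckets.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        stats[hour] = (mean, math.sqrt(var) / mean)   # CoV = std / mean
    return stats

rng = random.Random(7)
calls = [(10, rng.expovariate(1 / 120.0)) for _ in range(500)]   # short daytime calls
calls += [(21, rng.expovariate(1 / 900.0)) for _ in range(500)]  # long evening calls
s = holding_time_stats(calls)
# the evening mean clearly exceeds the daytime mean, as in Fig. 3,
# while the CoV stays near 1 in both buckets, as in Fig. 4
```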
inaccuracy, but might be explained by the fact that the IP data set I contains a significant number of private calls and some calls to destinations in time zones other than Norway's. This variation in average call holding times was not observed in Holbæk [3], where the amount of private international calls was much lower, mainly due to the high tariffs, but maybe also partly because fewer people travelled abroad 30 years ago. In Figure 5 the average call holding times are given for flat-rate subscriptions (IP domestic to PSTN and IP international to Europe) and for usage-based subscriptions (IP to mobile, IP international to USA, and pure mobile). The plots show how the average values vary over a 24-hour period. The average values
Fig. 5. Daily variation in call holding time
from the IP data set I are measured over 15-minute periods, while in the mobile set M they are averaged over 60-minute periods. It can be observed that the call holding time is sensitive to tariffs. This was also observed in [6], where the traffic was reduced through a reduction in the average call holding times when the tariff was increased. Furthermore, when tariff differentiation was introduced, a significant increase in traffic was observed every evening when the tariff rate was reduced (at 17:00).
3.3 Traffic Demand Profiles
The traffic variations over the day, weeks, and seasons are mainly determined by the changes in the users' context and demand for communication. It is interesting to see if and how the traffic variation has changed since the measurements in [3,4,6]. The changes in the traffic as a function of time of day from the IP data set I records are given in Figure 6 for domestic IP telephony calls (within Norway) and for international IP telephony calls. More than 83% (in Erlang) of the international terminations are at destinations in Europe, approximately 15% at destinations in the US, and less than 2% towards the rest of the world. The figure gives no absolute numbers but plots the average over 31 days and how the average varies in 15-minute intervals over a day. To retain readability, the accuracy is not included, but this does not change the trends observed in the plots. Figure 6 also includes the daily mobile phone traffic variations from the M records. The Erlang values in the plots are scaled by a random value, but the variation in volume as a function of time of day is correct. The traffic variations in IP and mobile telephony are compared with the PSTN telephony measurements from CARAT [4]. In Figure 7 the weekly variation in traffic for mobile and IP telephony is given, where the IP telephony traffic is subdivided according
Empirical Observations of Traffic Patterns in Mobile and IP Telephony
IP telephony, domestic
33
mobile telephony (scaled)
Lunch break
traffic [Erlang]
IP telephony, interational
Dinnertime News broadcast (once every everning)
CARAT
00
02
04
06
08
10
12
14
16
18
20
22
00
Time of day [hour]
Fig. 6. Comparison of daily traffic variation from old CARAT measurement (from [4]) and recent IP and mobile telephony measurements [Erlang]
Fig. 7. Weekly variation in mobile calls, domestic IP telephony calls, and international IP telephony with termination in Europe or USA [Erlang]
to region of termination (Europe and USA). The CARAT measurements were collected when telephony was expensive, when most people had regular working hours, and when Norway had only one television broadcast channel with a single news broadcast in the evening (20:00). In [3,4] peaks were observed at 10:00 and 14:00, with a decrease around lunch time (noon in Norway). In the domestic IP telephony, peaks are observed at noon (12:00) and in the late evening (22:00). This might be due to the fact that a significant number of the subscribers in the IP data set I are residential (private) users. Compared to the recent traffic it can be observed that communication, and probably also working habits, have changed;
34
P.E. Heegaard
Fig. 8. Standard hourly two-way traffic distribution patterns, extract from [15]. This is compared to the one-way (rescaled) IP international traffic originated in Norway and terminated in Europe (0-1 hour time difference) and USA (6-9 hours time difference).
e.g. the busy hour in mobile telephony is in the afternoon and no effect of the lunch break is observed. This might have a technical explanation, since users now bring their terminal (the phone) with them to lunch, which was previously impossible. Figure 7 shows a typical business-area profile; the mobile traffic has almost the same pattern from Monday to Friday with a significant decrease during the weekend. A similar profile is not observed for the IP telephony traffic, where the total traffic only slightly decreases on Friday, a bit more on Saturday, but the rest of the week stays at approximately the same level. Observe in particular that the international traffic pattern is almost insensitive to the weekday. From Figure 6 it is observed that international calls contribute significantly to the overall traffic, with a peak volume approximately at the same level as domestic traffic. Approximately 83% of the international calls are terminated in Europe (the majority in the same time zone), so the increase is not mainly due to time differences but probably an effect of the type of call (private) and nature of
call (e.g. family affairs); see also the discussion of call holding times in Section 3.2. The mobile traffic has its main peak between 12:00 and 16:00 and a significant decrease in the late evenings. The observed profiles differ from the standard traffic profiles for international traffic streams in the E.523 recommendation [15]. In Figure 8 the daily variation in traffic demand for international calls in IP telephony is plotted against the traffic demand of international calls where the time difference between the originating and terminating countries varies. The international IP traffic is rescaled (plotted relative to the peak volume) and divided into terminations in Europe, with at most one hour time difference, and terminations in the USA, with 6 to 9 hours time difference. The peak hours of the international traffic are in the late evening for both the European and American terminations, independent of the time difference. This is probably because the traffic measurements include only calls originating in Norway, and all the subscribers are private customers.
4 Closing Comments
It is obvious that the traffic pattern has changed over the 30 years from 1975 to 2005. Several causes can be identified, but some of them are hard or impossible to assess from the current measurement sets. It seems that, not surprisingly, if the price depends on the length of the call (usage-based minute price), then the call will be shorter than if it does not (flat rate, or paid by someone else). See Figure 5, where it is observed that the call holding times are rather constant during daytime for all categories, but a significant increase can be observed in the evenings and at night for the flat-rate categories. For usage-based pricing the average call holding times are rather constant over a 24-hour period. Another effect is observed in Figures 3 and 5, where it is quite clear that in the flat-rate cases, where the price does not dominate the call holding times, the times increase significantly in the evenings and at night, when people tend to have more time to talk than during a busy day. It can also be speculated that if the speech quality is bad then the call will be shorter than if the quality is good, that a call addressing important and/or difficult issues will be longer than a call with short messages, and that demographic and cultural differences will be reflected in the speech patterns and the call holding time distribution. Unfortunately, this cannot be studied in the current data sets. The average number of Short Message Service (SMS) messages in 2005 was approximately 1000 per year per registered customer in Norway [16], or approximately three messages per day per customer. Hence, SMS is a popular service that might serve as a supplement for (short) phone calls. In Figure 9 the number of calls, nc, and the number of SMS messages, ns, in the mobile data set M are plotted as a function of time of day, together with the ratio between them, ns/nc.
On average, the ratio ns/nc between the number of SMS messages and calls over a 24-hour period is approximately one message per call. However, significant differences are observed in the daily profiles of nc and ns, in particular in the evenings. The
Fig. 9. The ratio between the number of SMS messages and calls
number of calls decreases in the evenings while the number of SMS messages increases. The ratio ns/nc increases correspondingly in the evenings and reaches its maximum value (>300%) at midnight. The different profiles of nc and ns are an indication that SMS supplements or substitutes for phone calls in the evenings. This confirms the intuition that, since SMS is an asynchronous means of communication, it is considered convenient in the evenings and at night, when you are afraid of disturbing the callee (B-party). It is also likely that SMS messages substitute for short calls consisting of a single message or question. A detailed study of the call holding time distribution, in particular for holding times close to zero, would have provided more insight into this. Unfortunately, the data in M were too aggregated for such an investigation. The weekly variations (results not shown) of the ratio ns/nc have a lower peak on Fridays and Saturdays. Maybe people are less afraid of disturbing the callee late on Friday or Saturday evening than during the rest of the week? The traffic intensities still vary over the days and weeks (the data sets were too limited to observe seasonal variations), but (quasi-)stationary periods can be identified as stated in the ITU-T traffic engineering recommendations. The average call holding times show significant variations over the day and are much higher in the late evenings than during the day. The international traffic demand profile of the IP telephony traffic is significantly different from the ITU-T recommendations. To give a more comprehensive explanation of the traffic pattern variations, additional details about the arrival and departure processes and the mixture of users (e.g. business vs. residential) are required, and more detailed measurements with finer granularity must be collected and in-depth statistical investigations conducted.
References

1. ITU: Overview of recommendations on traffic engineering. ITU-T Recommendation E.490.1 (2003)
2. Iversen, V.B.: Handbook in Teletraffic Engineering. ITC / ITU-D (2005)
3. Iversen, V.B.: Analysis of real teletraffic processes based on computerized measurements. Ericsson Technics 29(1), 13–64 (1973)
4. Kosberg, J.E.: Measured data of subscriber behaviour from CARAT (in Norwegian). In: 1st Nordic Teletraffic Seminar (NTS-1) (1977)
5. ITU: Traffic intensity measurement principles. ITU-T Recommendation E.500 (1998)
6. Bø, K., Gaustad, O., Kosberg, J.E.: Some traffic characteristics of subscriber categories and the influence from tariff changes. In: 8th International Teletraffic Congress (ITC'8), Melbourne, Australia, pp. 324/1–324/8 (1976)
7. Bolotin, V.A.: Telephone circuit holding time distributions. In: Labetoulle, J., Roberts, J.W. (eds.) 14th International Teletraffic Congress, Antibes Juan-les-Pins, France, pp. 125–134 (1994)
8. Barceló, F., Sanchez, J.I.: Probability distribution of the inter-arrival time to cellular telephony channels. In: 49th Vehicular Technology Conference (VTC'99), Houston, TX, USA (1999)
9. Aschenbruck, N., Frank, M., Martini, P., Tolle, J.: Traffic measurement and statistical analysis in a disaster area scenario. In: 1st Workshop on Wireless Network Measurements (2005)
10. Barceló, F., Jordán, J.: Channel holding time distribution in public cellular systems. In: Kelly, P., Smith, D. (eds.) 16th International Teletraffic Congress (ITC'16), Edinburgh, Scotland, pp. 107–116. Elsevier, Amsterdam (1999)
11. Chlebus, E.: Empirical validation of call holding time distribution in cellular communications systems. In: 15th International Teletraffic Congress (ITC'15), Washington, DC, USA, pp. 1179–1188. Elsevier, Amsterdam (1997)
12. Cole, R.G., Rosenbluth, J.H.: Voice over IP performance monitoring. SIGCOMM Comput. Commun. Rev. 31(2), 9–24 (2001)
13. Maxemchuk, N.F., Lo, S.: Measurement and Interpretation of Voice Traffic on the Internet. In: ICC (1), pp. 500–507 (1997)
14. Marsh, I., Li, F., Karlsson, G.: Wide Area Measurements of VoIP Quality. In: Quality of Future Internet Services, Stockholm, Sweden (2003)
15. ITU: Standard traffic profiles for international traffic streams. ITU-T Recommendation E.523 (1988)
16. Jensen, W.: The Norwegian Telecommarket 2005 (in Norwegian). Technical report, Post- og teletilsynet (2005)
On the Number of Losses in an MMPP Queue Andrzej Chydzinski, Robert Wojcicki, and Grzegorz Hryn Silesian University of Technology Institute of Computer Sciences Akademicka 16, 44-100 Gliwice, Poland
[email protected] Phone: +48 32 237 11 54; Fax: +48 32 237 23 33
Abstract. We present a comprehensive analysis of the packet loss process in a finite-buffer queue fed by the Markov-modulated Poisson process. In particular, solutions for the number of losses in (0,t], the stationary number of losses and the loss ratio are presented in closed forms. Theoretical results are illustrated via numerical examples based on an IP traffic trace file. Keywords: queueing performance, MMPP traffic, packet losses.
1 Introduction

Packet losses are common in packet networking. They are caused by the limited buffering space in network devices and can seriously influence the performance of the network. Describing the packet loss process is an important task, as it enables better network design in terms of buffer sizing and management, congestion control mechanisms, protocols, etc. Calculations of packet loss characteristics can be carried out using simulation or analysis. Both of these approaches have their advantages and disadvantages. With simulation we may build a more accurate model, but obtaining simulation results is sometimes difficult (due to rare events) or time-consuming. In this paper we present an analysis of the packet loss process in a finite-buffer queue whose arrival process is given by the Markov-modulated Poisson process (MMPP). The MMPP was chosen due to its ability to mimic the complex statistical behaviour of recorded traffic traces. As was shown in [1], using the MMPP we can match not only the basic parameters of the traffic (mean rate, variance, higher moments) but also the shape of the marginal distribution and the autocovariance function. The main achievement presented herein is a closed-form formula for the transform of the average number of losses in the (0, t] interval. To the best of the authors' knowledge, this result is new. Using it we can easily obtain both the transient and the stationary number of packet losses, as well as the loss ratio, a commonly used QoS parameter. There is a vast amount of literature devoted to the applications of the MMPP [2]-[6], parameter fitting [1],[7]-[9] and its queueing behaviour [10]-[14] (these are just examples; the complete bibliography is much longer), but relatively little has been
Corresponding author.
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 38–48, 2007. © Springer-Verlag Berlin Heidelberg 2007
reported regarding the loss process in a finite-buffer queue fed by the MMPP. Typically, only the stationary loss ratio was studied using exact and approximate techniques [12,15,16]. The layout of this paper is as follows. The next section describes the arrival process, the queueing model and introduces the notation used throughout the paper. The major part of the paper (Section 3) then follows, presenting formulas for the number of packet losses in the transient and stationary regimes with proofs and comments. Section 4 then shows a numerical example that uses the MMPP parameterization based on an IP traffic trace file. The paper concludes in Section 5.
2 Arrival Process and Queueing System

The Markov-modulated Poisson process is constructed by varying the arrival rate of a Poisson process according to an m-state continuous-time Markov chain (see, for instance, [11]). When the Markov chain is in state i, arrivals occur according to a Poisson process of rate $\lambda_i$. The MMPP is usually parameterized by two $m \times m$ matrices: Q and Λ. In this parameterization, Q is the infinitesimal generator of the continuous-time Markov chain and Λ is a diagonal matrix that has the arrival rates $(\lambda_1, \ldots, \lambda_m)$ on its diagonal and zeroes elsewhere. By $J(t)$ we will denote the state of the modulating Markov chain at time t and by $P_{i,j}(n,t)$ the counting function for the MMPP, namely

$$P_{i,j}(n,t) = \mathsf{P}\big(N(t) = n,\ J(t) = j \,\big|\, N(0) = 0,\ J(0) = i\big),$$

where $\mathsf{P}(\cdot)$ is the probability and $N(t)$ is the total number of arrivals in (0, t]. We will also use the following matrix-form characteristics of the MMPP:

$$Z(s) = \left[\frac{(\lambda_i - Q_{ii})\,p_{ij}}{s + \lambda_i - Q_{ii}}\right]_{i,j}, \qquad p_{ij} = \begin{cases} 0 & \text{if } i = j,\\ Q_{ij}/(\lambda_i - Q_{ii}) & \text{if } i \neq j, \end{cases}$$

$$E(s) = \left[\frac{\Lambda_{ij}}{s + \lambda_i - Q_{ii}}\right]_{i,j},$$

where $Q_{ij}$ and $\Lambda_{ij}$ denote elements of the matrices Q and Λ, respectively.

In this report we deal with a single-server queue fed by an MMPP. The service time is distributed according to a distribution function $F(\cdot)$, which is not further specified, and the standard independence assumptions are made. The buffer size (system capacity) is finite and equal to b (including the service position). This means that if a packet at its arrival finds the buffer full, it is blocked and lost. We assume also that the time origin corresponds to a departure epoch.

In what follows, the crucial role will be played by the sequences of $m \times m$ matrices $A_k(s)$, $D_k(s)$ defined as:

$$A_k(s) = \big[a_{k,i,j}(s)\big]_{i,j}, \qquad a_{k,i,j}(s) = \int_0^{\infty} e^{-st}\,P_{i,j}(k,t)\,dF(t),$$

$$D_k(s) = \big[d_{k,i,j}(s)\big]_{i,j}, \qquad d_{k,i,j}(s) = \int_0^{\infty} e^{-st}\,P_{i,j}(k,t)\,(1 - F(t))\,dt.$$
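The MMPP construction above is easy to illustrate with a small simulation. The sketch below uses a toy two-state modulating chain with made-up rates (all parameter values are illustrative assumptions, not taken from the paper); for the two-state case the stationary distribution π, and hence the mean rate πΛ1, is available in closed form:

```python
import random

# Toy two-state MMPP: while the modulating chain is in state i, packets
# arrive as a Poisson process of rate lam[i].  The rates below are made-up
# illustration values, not parameters fitted in this paper.
q12, q21 = 1.0, 3.0            # modulating-chain transition rates
lam = [10.0, 100.0]            # per-state Poisson arrival rates

def simulate_mmpp(t_end, seed=1):
    """Return the arrival instants of the MMPP in (0, t_end]."""
    rng = random.Random(seed)
    t, state, arrivals = 0.0, 0, []
    while True:
        leave = q12 if state == 0 else q21     # rate of leaving current state
        total = lam[state] + leave             # rate of the next event
        t += rng.expovariate(total)
        if t > t_end:
            return arrivals
        if rng.random() < lam[state] / total:  # event is an arrival...
            arrivals.append(t)
        else:                                  # ...or a state switch
            state = 1 - state

# Closed-form stationary distribution of the two-state chain and the
# average arrival rate lambda = pi * Lambda * 1:
pi = (q21 / (q12 + q21), q12 / (q12 + q21))
mean_rate = pi[0] * lam[0] + pi[1] * lam[1]    # 0.75*10 + 0.25*100 = 32.5
```

Over a long horizon the empirical arrival rate of `simulate_mmpp` concentrates around `mean_rate`, which is the quantity denoted λ = πΛ1 below.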
From the practical point of view it is important that $A_k(s)$ and $D_k(s)$ can be computed effectively by means of the well-known uniformization method (see, for instance, [10]). We will also use the following notation:

$$\mathbf{0} = m \times m \text{ matrix of zeroes}, \qquad I = m \times m \text{ identity matrix}, \qquad \mathbf{1} = (1, \ldots, 1)^T, \text{ column vector of 1's},$$

$$\bar{A}_n(s) = \sum_{k=n}^{\infty} A_k(s), \qquad B_n(s) = A_{n+1}(s) - \bar{A}_{n+1}(s)\,(\bar{A}_0(s))^{-1},$$

$$R_0(s) = \mathbf{0}, \qquad R_1(s) = A_0^{-1}(s), \qquad R_{k+1}(s) = A_0^{-1}(s)\Big(R_k(s) - \sum_{i=0}^{k} A_{i+1}(s)\,R_{k-i}(s)\Big), \quad k \ge 1,$$

$$\pi = \text{the stationary distribution for } Q, \qquad \lambda = \pi \Lambda \mathbf{1}, \text{ the average arrival rate}.$$
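The $R_k(s)$ matrices defined above can be generated iteratively once $A_0(s), A_1(s), \ldots$ are available (e.g. computed via uniformization). A minimal pure-Python sketch for a fixed value of s; the 2×2 matrices used at the bottom are toy placeholders, not values computed from an MMPP:

```python
# Sketch of the R_k(s) recursion for a fixed s, with plain 2x2 matrices.

def mmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def madd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def msub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def minv(X):  # inverse of a 2x2 matrix
    d = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def r_sequence(A, n):
    """R_0 = 0, R_1 = A_0^{-1}, and
       R_{k+1} = A_0^{-1}(R_k - sum_{i=0}^{k} A_{i+1} R_{k-i}) for k >= 1."""
    A0inv = minv(A[0])
    R = [[[0.0, 0.0], [0.0, 0.0]], A0inv]
    for k in range(1, n):
        acc = [[0.0, 0.0], [0.0, 0.0]]
        for i in range(k + 1):
            acc = madd(acc, mmul(A[i + 1], R[k - i]))
        R.append(mmul(A0inv, msub(R[k], acc)))
    return R

# Toy A_k(s) values for some fixed s (placeholders, not from an MMPP):
A = [[[2.0, 0.0], [0.0, 2.0]],
     [[0.5, 0.0], [0.0, 0.5]],
     [[0.0, 0.0], [0.0, 0.0]]]
R = r_sequence(A, 2)   # here R[2] = A_0^{-1}(R_1 - A_1 R_1 - A_2 R_0)
```

With these diagonal toy matrices the recursion can be checked by hand: R_1 = 0.5·I and R_2 = 0.5·(0.5 − 0.25)·I = 0.125·I.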
3 Number of Losses

Let $X(t)$ denote the queue size at time t (including the service position, if occupied). Let $L(t)$ be the number of losses in (0, t] and $\Delta_{n,i}(t)$ be its average value provided $X(0) = n$ and $J(0) = i$, namely:

$$\Delta_{n,i}(t) = E\big(L(t) \,\big|\, X(0) = n,\ J(0) = i\big).$$

Moreover, let $\delta_{n,i}(s)$ denote the Laplace transform of $\Delta_{n,i}(t)$,

$$\delta_{n,i}(s) = \int_0^{\infty} e^{-st}\,\Delta_{n,i}(t)\,dt,$$

and $\delta_n(s)$ be the column vector $\delta_n(s) = (\delta_{n,1}(s), \ldots, \delta_{n,m}(s))^T$.
Theorem 1. The Laplace transform of the average number of losses in (0, t] in the MMPP/G/1/b queue has the form:

$$\delta_n(s) = \Big[R_{b-n+1}(s)A_0(s) + \sum_{k=0}^{b-n} R_{b-n-k}(s)B_k(s)\Big]\,M_b^{-1}(s)\,y_b(s) + \sum_{k=0}^{b-n} R_{b-n-k}(s)\,v_k(s), \qquad n = 0, \ldots, b, \qquad (1)$$
where vk (s) = Ak+1 (s)(A0 (s))−1 cb (s) − cb−k (s), ∞ ∞ 1 ck (s) = (i − b + k)Ai (s) · 1 + (i − b + k)Di (s) · 1, s i=b−k
yb (s) = E(s)
i=b−k
b−1
Rb−1−k (s)vk (s) − (I − Z(s))
k=0
b
Rb−k (s)vk (s),
k=0
Mb (s) = (I − Z(s))[Rb+1 (s)A0 (s) +
b
Rb−k (s)Bk (s)]
k=0
−E(s)[Rb (s)A0 (s) +
b−1
Rb−1−k (s)Bk (s)].
k=0
Proof of Theorem 1. Assuming $1 \le X(0) \le b$ and conditioning on the first departure time, we obtain the following system of integral equations:

$$\Delta_{n,i}(t) = \sum_{j=1}^{m}\sum_{k=0}^{b-n-1}\int_0^t \Delta_{n+k-1,j}(t-u)\,P_{i,j}(k,u)\,dF(u)$$
$$\quad + \sum_{j=1}^{m}\sum_{k=b-n}^{\infty}\int_0^t \big(k-b+n+\Delta_{b-1,j}(t-u)\big)\,P_{i,j}(k,u)\,dF(u)$$
$$\quad + (1-F(t))\sum_{j=1}^{m}\sum_{k=b-n}^{\infty} (k-b+n)\,P_{i,j}(k,t), \qquad n = 1, \ldots, b. \qquad (2)$$
The first summand in (2) corresponds to the case where the first departure time u is before t and the buffer does not get full by time u. This means that the number of arrivals in (0, u] must not be greater than b − n − 1. The second summand corresponds to the case where the first departure time u is before t and the buffer gets full by time u. In this case k ≥ b − n packets arrive in (0, u] and k − b + n of them are lost. Finally, the last summand corresponds to the case where the first departure time is after t. The probability of this event is equal to 1 − F(t), and the average number of lost packets is then equal to $\sum_{j=1}^{m}\sum_{k=b-n}^{\infty}(k-b+n)P_{i,j}(k,t)$. If X(0) = 0, then conditioning on the first event time in the MMPP (packet arrival or change of the modulating state) we have:
$$\Delta_{0,i}(t) = \sum_{j=1}^{m}\int_0^t \Delta_{0,j}(t-u)\,(\lambda_i - Q_{ii})\,p_{ij}\,e^{-(\lambda_i - Q_{ii})u}\,du + \sum_{j=1}^{m}\int_0^t \Delta_{1,j}(t-u)\,\Lambda_{ij}\,e^{-(\lambda_i - Q_{ii})u}\,du. \qquad (3)$$
We may now apply the Laplace transform to both sides of (2) and (3). After that, utilizing matrix notation, we arrive at:

$$\delta_n(s) = \sum_{k=0}^{b-n-1} A_k(s)\,\delta_{n+k-1}(s) + \sum_{k=b-n}^{\infty} A_k(s)\,\delta_{b-1}(s) + c_n(s), \qquad n = 1, \ldots, b, \qquad (4)$$

$$\delta_0(s) = Z(s)\,\delta_0(s) + E(s)\,\delta_1(s). \qquad (5)$$

Then, substituting $\tilde{\delta}_n(s) = \delta_{b-n}(s)$, we have:

$$\sum_{k=-1}^{n} A_{k+1}(s)\,\tilde{\delta}_{n-k}(s) - \tilde{\delta}_n(s) = \psi_n(s), \qquad n = 0, \ldots, b-1, \qquad (6)$$

$$\tilde{\delta}_b(s) = Z(s)\,\tilde{\delta}_b(s) + E(s)\,\tilde{\delta}_{b-1}(s), \qquad (7)$$

where

$$\psi_n(s) = A_{n+1}(s)\,\tilde{\delta}_0(s) - \sum_{k=n+1}^{\infty} A_k(s)\,\tilde{\delta}_1(s) - c_{b-n}(s).$$

Applying Lemma 3.2.1 of [17] with a slight change in notation, we conclude that the solution of the system (6) has the form:

$$\tilde{\delta}_n(s) = R_{n+1}(s)\,C(s) + \sum_{k=0}^{n} R_{n-k}(s)\,\psi_k(s), \qquad (8)$$

where C(s) is a column vector that does not depend on n. Now we are reduced to finding the unknowns C(s), $\tilde{\delta}_0(s)$ and $\tilde{\delta}_1(s)$. Substituting n = 0 in (8) we obtain

$$C(s) = A_0(s)\,\tilde{\delta}_0(s), \qquad (9)$$

while substituting n = 0 in (6) we have

$$\tilde{\delta}_0(s) = \sum_{k=0}^{\infty} A_k(s)\,\tilde{\delta}_1(s) + c_b(s). \qquad (10)$$

Substituting n = b and, subsequently, n = b − 1 into (8) and then applying condition (7), we obtain

$$\tilde{\delta}_0(s) = M_b^{-1}(s)\,y_b(s), \qquad (11)$$

and this in fact finishes the proof of Theorem 1.
Applying Theorem 1 and using limiting properties of the Laplace transform, we can easily obtain stationary characteristics of the loss process.

Corollary 1. The stationary number of packet losses per time unit (i.e. $\lim_{t\to\infty}\Delta_{n,i}(t)/t$) is equal to

$$\lim_{s\to 0^+} s^2\,\delta_{b,1}(s), \qquad (12)$$

where $\delta_{b,1}(s)$ is the first element of the vector $\delta_b(s)$ given in (1).
As the stationary characteristics do not depend on the initial state of the system, we can use any initial queue size and any modulating state instead of b and 1 in formula (12). In practice, X(0) = b is the best choice, as in this case (1) reduces to its simplest form, namely: $\delta_b(s) = M_b^{-1}(s)\,y_b(s)$. Similarly, we can obtain the loss ratio (LR), a very popular QoS parameter. We simply have:

$$\mathrm{LR} = \lim_{s\to 0^+} \frac{s^2\,\delta_{b,1}(s)}{\lambda}. \qquad (13)$$

Naturally, using Theorem 1 we can also obtain the transient number of losses. To accomplish that, an algorithm for the Laplace transform inversion has to be applied (see, for example, [18]).
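The limiting argument behind (12) can be sanity-checked on a toy transform (not the MMPP/G/1/b transform itself): if the mean number of losses grows linearly, Δ(t) = a·t, then its Laplace transform is δ(s) = a/s², and s²δ(s) recovers a as s → 0⁺:

```python
# Toy check of the limit in Corollary 1: for Delta(t) = a*t the Laplace
# transform is delta(s) = a / s**2, so s**2 * delta(s) equals the
# per-time-unit loss rate a.  (Illustrative only; the real delta_{b,1}(s)
# comes from Theorem 1.)
def stationary_loss_rate(delta, s=1e-8):
    """Approximate lim_{s->0+} s^2 * delta(s) by evaluating at a small s."""
    return s * s * delta(s)

a = 0.79219                       # per-ms loss rate, cf. (14) below in the text
delta = lambda s: a / (s * s)     # transform of Delta(t) = a*t
rate = stationary_loss_rate(delta)
```

For a genuine δ(s) the evaluation point s must be taken small enough that the limit has effectively been reached; here the product is exact up to floating-point rounding.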
4 Numerical Example

For numerical purposes we are going to use a parameterization of the MMPP fitted to aggregated IP traffic. Namely, using one million packet headers from the file FRG1137208198-1.tsh, recorded on Jan 14th, 2006, at the Front Range GigaPOP (FRG) aggregation point, run by PMA¹, the following MMPP parameters were fitted [19]:

$$Q = \begin{bmatrix} -172.53 & 38.80 & 30.85 & 0.88 & 102.00\\ 16.76 & -883.26 & 97.52 & 398.90 & 370.08\\ 281.48 & 445.97 & -1594.49 & 410.98 & 456.06\\ 23.61 & 205.74 & 58.49 & -598.93 & 311.09\\ 368.48 & 277.28 & 7.91 & 32.45 & -686.12 \end{bmatrix},$$

$$(\lambda_1, \ldots, \lambda_5) = (59620.6,\ 113826.1,\ 7892.6,\ 123563.2,\ 55428.2).$$

Basic characteristics of the original sample and its MMPP model are shown in Table 1. It is important that the autocorrelation function properly matches the original sample on several time scales (see Fig. 5 in [19]).

Table 1. Parameters of the original and MMPP traffic in Example 2

                   mean interarr. time [μs]   packet arrival rate, λ [packets/s]
original traffic   13.940                     71732
MMPP               13.941                     71729
It is assumed that b = 120 and the queue is served at a constant rate of 98689.5 pkts/s, which gives a link utilization ρ of 72.7%.

¹ Passive Measurement and Analysis Project, see http://pma.nlanr.net/
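As a cross-check on the fitted parameters, the stationary distribution π of Q and the mean rate λ = πΛ1 can be computed directly; Table 1 reports a mean rate of 71729 packets/s for the fitted MMPP. A sketch using plain Gauss–Jordan elimination (no external libraries; the solver routine is our own illustrative helper, not from the paper):

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    A = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))  # pivot row
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Fitted generator Q and per-state rates lambda_i from the text:
Q = [[-172.53, 38.80, 30.85, 0.88, 102.00],
     [16.76, -883.26, 97.52, 398.90, 370.08],
     [281.48, 445.97, -1594.49, 410.98, 456.06],
     [23.61, 205.74, 58.49, -598.93, 311.09],
     [368.48, 277.28, 7.91, 32.45, -686.12]]
lam = [59620.6, 113826.1, 7892.6, 123563.2, 55428.2]

# pi Q = 0 with sum(pi) = 1: transpose Q and replace one equation
# by the normalization condition.
M = [[Q[j][i] for j in range(5)] for i in range(5)]
M[4] = [1.0] * 5
pi = solve(M, [0.0, 0.0, 0.0, 0.0, 1.0])
mean_rate = sum(p * l for p, l in zip(pi, lam))  # Table 1: ~71729 pkts/s
```

Since π is a probability vector, `mean_rate` necessarily lies between the smallest and largest λᵢ; matching Table 1's 71729 pkts/s confirms the extracted Q and λᵢ are consistent.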
Now we can obtain numerical values. Firstly, using (13) we can compute the stationary loss ratio, LR = 1.1044%, and the mean number of packet losses per 1 ms:

$$\lambda \cdot \mathrm{LR} = 0.79219. \qquad (14)$$
Fig. 1. The mean number of losses per 1ms as a function of time for J(0) = 4 and initial system occupancy of 0, 25%, 50%, 75% and 100%, counting from the bottom.
Fig. 2. The mean number of losses per 1ms as a function of time for initially empty system (X(0) = 0) and five different states of the modulating chain, i. e. J(0) = i = 1, . . . , 5
Fig. 3. The mean number of losses per 1ms as a function of time for initially full system and five different states of the modulating chain J(0) = i = 1, . . . , 5
Both of these values are relatively high, taking into account the moderate value of the link utilization. This is caused, naturally, by the autocorrelated structure of the arrival process. Secondly, we can evaluate the impact of the initial queue size and modulating phase on the short-time behaviour of the loss process. In Figure 4 the function $\Delta_{n,i}(t)/t$, representing the mean number of losses per time unit, is depicted for five different initial queue sizes n and i = 4. Obviously, all the curves converge to 0.79219, but, as we can see, this convergence may be slow, especially for high initial queue sizes. Also, for high initial queue sizes we have a very high number of losses compared to the stationary value. In Figs. 2, 1, 3 the impact of the initial phase i on the function $\Delta_{n,i}(t)/t$ is depicted for initial buffer occupancies of 0, 50% and 100%, respectively. As we can see, the higher the initial arrival rate, the slower the convergence to the steady state (see the curves for i = 4 or i = 2). What is more interesting is that the function may not only be nonmonotonic, but it may also have more than one extremum (for instance, Fig. 3, i = 3). This is probably caused by the structure of the transition matrix Q. Finally, in Fig. 5 the stationary loss ratio as a function of the buffer size is shown for two link utilizations: 72.7% and 99%. As we can see, in the case of an autocorrelated arrival process and high link utilization this function may decrease very slowly, and even a large buffer does not eliminate losses completely.
5 Conclusions

We presented a study of the packet loss process in a finite-buffer queue whose arrival process is a Markov-modulated Poisson process. The MMPP was chosen due to its ability to model the complex statistical behaviour of network traffic.
The main result, which is the Laplace transform of the mean number of losses in (0, t], enables quick calculation of the stationary characteristics of the loss process and, by means of an inversion algorithm, also enables the calculation of the transient measures. As these losses are common in packet networking and the ability to compute the characteristics of the loss process is helpful in network design, we believe that this study is of practical importance. The results presented herein are devoted to the average number of losses. It is easily seen that they can be extended to the probability distribution instead of the average value. For this purpose, it is sufficient to use the function $\Delta_{n,i}(t, l) = P(L(t) = l \mid X(0) = n, J(0) = i)$
Fig. 4. The mean number of losses per 1ms as a function of time for J(0) = 4 and initial system occupancy of 0, 25%, 50%, 75% and 100%, counting from the bottom
Fig. 5. Stationary loss ratio versus the buffer size for two link utilizations: 72.7% and 99%
instead of Δn,i (t), and obtaining an analog of Theorem 1 for the transform of Δn,i (t, l) is straightforward.
Acknowledgment This work was supported in part by MNiSW under grant N517 025 31/2997.
References

1. Salvador, P., Valadas, R., Pacheco, A.: Multiscale Fitting Procedure Using Markov Modulated Poisson Processes. Telecommunication Systems 23(1-2), 123–148 (2003)
2. Skelly, P., Schwartz, M., Dixit, S.: A histogram-based model for video traffic behavior in an ATM multiplexer. IEEE/ACM Trans. Netw. 1(4), 446–459 (1993)
3. Kim, Y.H., Un, C.K.: Performance analysis of statistical multiplexing for heterogeneous bursty traffic in ATM network. IEEE Trans. Commun. 42(2-4), 745–753 (1994)
4. Kang, S., Sung, D.: Two-state MMPP modeling of ATM superposed traffic streams based on the characterization of correlated interarrival times. In: Proc. of IEEE GLOBECOM '95, pp. 1422–1426 (1995)
5. Shah-Heydari, S., Le-Ngoc, T.: MMPP models for multimedia traffic. Telecommunication Systems 15(3-4), 273–293 (2000)
6. Yoshihara, T., Kasahara, S., Takahashi, Y.: Practical time-scale fitting of self-similar traffic with Markov-modulated Poisson process. Telecommunication Systems 17(1-2), 185–211 (2001)
7. Ryden, T.: An EM algorithm for parameter estimation in Markov modulated Poisson processes. Comput. Stat. Data Anal. 21, 431–447 (1996)
8. Ge, H., Harder, U., Harrison, P.G.: Parameter estimation for MMPPs using the EM algorithm. In: Proc. UKPEW (2003)
9. Klemm, A., Lindemann, C., Lohmann, M.: Modeling IP traffic using the batch Markovian arrival process. Performance Evaluation 54(2) (2003)
10. Lucantoni, D.M.: New results on the single server queue with a batch Markovian arrival process. Commun. Stat., Stochastic Models 7(1), 1–46 (1991)
11. Fischer, W., Meier-Hellstern, K.: The Markov-modulated Poisson process (MMPP) cookbook. Performance Evaluation 18(2), 149–171 (1992)
12. Baiocchi, A., Blefari-Melazzi, N.: Steady-state analysis of the MMPP/G/1/K queue. IEEE Trans. Commun. 41(4), 531–534 (1992)
13. Lee, D.-S., Li, S.-Q.: Transient analysis of multi-server queues with Markov-modulated Poisson arrivals and overload control. Perform. Eval. 16(1-3), 49–66 (1992)
14. Kulkarni, L., Li, S.-Q.: Transient behaviour of queueing systems with correlated traffic. Perform. Eval. 27&28, 117–145 (1996)
15. Gouweleeuw, F.N.: Calculating the loss probability in a BMAP/G/1/N+1 queue. Commun. Stat., Stochastic Models 12(3), 473–492 (1996)
16. Nagarajan, R., Kurose, J.F., Towsley, D.: Approximation techniques for computing packet loss in finite-buffered voice multiplexers. In: Proc. of INFOCOM '90, pp. 947–955 (1990)
17. Chydzinski, A.: Time to reach buffer capacity in a BMAP queue. Stochastic Models 23, 195–209 (2007)
18. Abate, J., Choudhury, G.L., Whitt, W.: An introduction to numerical transform inversion and its application to probability models. In: Grassman, W. (ed.) Computational Probability, pp. 257–323. Kluwer, Boston (2000)
19. Chydzinski, A.: Transient analysis of the MMPP/G/1/K queue. Telecommunication Systems 32(4), 247–262 (2006)
On-Line State Detection in Time-Varying Traffic Patterns D. Moltchanov Institute of Communication Engineering, Tampere University of Technology, P.O.Box 553, Tampere, Finland
[email protected]
Abstract. Real-time traffic aggregates are often characterized by a time-varying nature. However, resources for this kind of traffic are usually allocated assuming busy-hour stationary traffic characteristics. This assumption may lead to inefficient use of network resources when static resource reservation is used. In this paper we propose an algorithm for on-line estimation of the traffic state in terms of a piecewise weakly stationary stochastic process. As the basic tool of the algorithm we use a change-point statistical test, allowing us to dynamically and automatically determine whether the traffic pattern changes and, if so, to estimate new parameters of the traffic pattern. Assuming that a network may assign the required resources on demand on a per-node basis, the proposed procedure helps to decide what amount of resources should be assigned to handle the traffic with given performance metrics. The proposed algorithm is well suited to the non-stationary behavior of aggregated traffic, where statistical parameters change in time. The computational complexity of the algorithm is low and it can be implemented on a traffic-class basis, making the proposal suitable for backbone routers.
1 Introduction
Traffic aggregates are known to exhibit high variability due to a number of phenomena, including self-similar, long-range dependent or non-stationary behavior. These properties of the traffic require overprovisioning of network resources in order to serve it with given performance metrics. It means that there must be enough resources to serve very large bursts in the traffic pattern. However, there are also long time spans during which the local average of the traffic aggregate may stay well below the mean of the whole process. For traffic exhibiting high variability, static resource allocation may lead to inefficient use of network resources. Aggregated traffic is also characterized by deterministic trends similar to those found in telephone traffic. It was demonstrated that there are clear daily variations in the traffic patterns [5]. The authors in [5] also noticed that there is a clear indication of a busy hour in link usage patterns. It is straightforward to expect that the busy hour may not be the same for different applications. When statistical characteristics of traffic change in time, static resource allocation based on the busy-hour assumption may lead to overprovisioning of network resources.

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 49–60, 2007. © Springer-Verlag Berlin Heidelberg 2007
The straightforward way to deal with non-optimal resource reservation is to allocate resources dynamically. However, even if the traffic description is allowed to be dynamically altered by the source using a kind of signaling protocol, optimal resource allocation still remains an issue. While the source-based descriptor may provide an appropriate representation of the traffic when it enters the network, it may not be suitable to characterize the traffic inside the network. Indeed, when the traffic passes network elements it is smoothed. In this case the network has a better view of the traffic. To provide optimal resource allocation we should allow the network to decide what is the minimum amount of resources that must be assigned to a given traffic aggregate to satisfy its performance requirements. To achieve optimal resource reservation we have to be able to estimate the current statistical characteristics of the traffic aggregate in real time. In this paper we propose a mechanism for dynamic estimation of the traffic state in terms of first- and second-order statistical characteristics. The algorithm is particularly well suited to non-stationary traffic, where one or more statistical characteristics change in time. Since state control is provided at per-aggregate granularity, the proposed method may scale well in the backbone. How statistical characteristics of the traffic aggregate are mapped to a resource reservation is outside the scope of this paper.

The rest of the paper is organized as follows. In Section 2 we provide motivation for this work by considering the dynamic nature of aggregated video traffic. A model for aggregated traffic is also proposed there. Change-point statistical tests are introduced in Section 3. Numerical examples of the EWMA test are considered in Section 4. Conclusions are drawn in the last section.
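For illustration, the change-point idea can be sketched as a generic EWMA control chart in textbook form (this shows the general technique only; the exact test and its parameter choices are developed in Section 3, and the smoothing weight `lam` and band width `L` below are conventional illustrative values):

```python
# Generic EWMA change-point detector (textbook control-chart form).
# An alarm is raised whenever the exponentially weighted average of the
# observations leaves the band mu0 +/- L*sigma0*sqrt(lam/(2-lam)).
def ewma_alarms(samples, mu0, sigma0, lam=0.1, L=3.0):
    """Return the indices at which the EWMA statistic leaves the band."""
    z = mu0
    width = L * sigma0 * (lam / (2.0 - lam)) ** 0.5  # asymptotic band width
    alarms = []
    for i, x in enumerate(samples):
        z = lam * x + (1.0 - lam) * z                # exponential smoothing
        if abs(z - mu0) > width:
            alarms.append(i)
    return alarms

# A level shift at index 50 is flagged within a couple of samples:
alarms = ewma_alarms([0.0] * 50 + [5.0] * 50, mu0=0.0, sigma0=1.0)
```

Once an alarm fires, the in-control mean and variance would be re-estimated from the post-change samples, which is the "estimate new parameters of the traffic pattern" step mentioned in the abstract.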
2 Time-Varying Nature of Aggregated Traffic

2.1 Service Configuration
The reference configuration of the service is presented in Fig. 1, where AN stands for access network. We assume that there is a network operator providing network resources for a service provider. The service provider provides a QoS-constrained service to a number of users across the network. As an example of the service we consider a video distribution system. We expect that routers along the path of the traffic aggregate implement a number of network services and that these services are well isolated from each other. We assume that there is a specifically designed service class to serve the video traffic aggregate. To represent the traffic arrival process we use video traffic traces available from the Technical University of Berlin [1]. We consider video traffic at the granularity of one second. Although we use H.263 VBR traces, we have checked that the conclusions stated in this paper remain valid for MPEG-4 VBR traces from the same traffic archive and for MPEG-1 VBR sequences from another archive [2].
On-Line State Detection in Time-Varying Traffic Patterns
51
Fig. 1. Reference configuration of the service
2.2 Traffic Aggregate: Varying Number of Sources
We generated traffic traces simulating the behavior of the video distribution system as explained below. We assume that at a certain time instant t0 there are no active sessions; this is the instant when the system enters its operational state. After t0, session requests start to arrive at the system. Each arrival requests an arbitrary video sequence; we used a uniform distribution over all traces to determine the requested sequence. Interarrival times of sessions are geometrically distributed with a certain probability p. We assumed that the session arrival process is time-varying, meaning that p varies in time. This is a natural assumption for any service where the session arrival rate differs with the time of day. We simulated 50 session arrivals. The first 15 and the last 15 arrivals were drawn from a geometric distribution with parameter p = 0.001; the 20 arrivals in the middle occur according to a geometric distribution with parameter p = 0.005. These parameters allow us to mimic the behavior of the system before, during and after the busy hour. Note that these parameters were not intended to emulate the real session arrival process of a video distribution system; the purpose was to show how the traffic looks when session arrivals are time-varying. Time-series of the generated traffic traces are shown in Fig. 2. Observing these traces it is natural to expect that the statistics of the aggregated traffic are time-varying. Indeed, when the rate of session arrivals varies significantly in time, it is unrealistic to assume that the aggregated traffic converges to a covariance-stationary process, except for some segments during which the number of active sessions remains constant. Note that statistical characteristics of the traffic aggregate may also vary due to other phenomena, including possible long-range dependent, self-similar or non-stationary properties.
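A minimal sketch of this session-generation procedure, assuming NumPy's geometric sampler for the interarrival times (the helper name `session_arrival_times` is ours, not the paper's):

```python
import numpy as np

def session_arrival_times(rng):
    """Arrival instants (in seconds) of 50 sessions: geometrically
    distributed interarrival times with p = 0.001 for the first and
    last 15 sessions and p = 0.005 for the 20 sessions in between,
    mimicking the periods before, during and after the busy hour."""
    ps = [0.001] * 15 + [0.005] * 20 + [0.001] * 15
    t, times = 0, []
    for p in ps:
        t += rng.geometric(p)   # geometric interarrival time, >= 1
        times.append(t)
    return times

times = session_arrival_times(np.random.default_rng(42))
```

Each session then starts streaming a uniformly chosen trace at its arrival instant; summing the active traces slot by slot yields aggregates like those in Fig. 2.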
In this paper, the actual cause of this variability is of no interest as long as the change detection mechanism is able to quickly and reliably detect points at which changes occur and subsequently estimate current traffic statistics. An important property of aggregated traffic is that changes in the mean value lead to changes in the variance and vice versa. Fig. 3 highlights this behavior: as one can observe, the dependency is almost linear. This dependency allows us to track changes in one parameter only, either the mean or the variance.
Fig. 2. Aggregated traffic from varying number of sources

Fig. 3. Dependence between mean and variance of aggregated traffic
2.3 Traffic Aggregate: Fixed Number of Sources
Consider the statistical characteristics of the aggregated traffic when the number of multiplexed sources is kept constant. To obtain aggregated traffic we arbitrarily chose 9 H.263 VBR traces from the archive and multiplexed them, synchronizing the starting times of all traces. We limited all traces to 1800 seconds. Time-series of two aggregated traces are shown in Fig. 4. Histograms of relative frequencies of these traces and their approximations by normal distributions are shown in Fig. 5, where fi,Y(Δ), i = 1, 2, ..., is the relative frequency corresponding to the i-th bin and Δ = (max∀i Y(i) − min∀i Y(i))/m is the width of the histogram bin. These approximations suggest that aggregated traffic from a fixed number of sources is normally distributed. The χ² goodness-of-fit test performed with level of significance α = 0.1 confirmed this hypothesis. Empirical normalized autocorrelation functions (NACF) for both traces are shown in Fig. 6a and Fig. 6b using solid lines with circles. One may note that the memory of the process is short and limited to a few lags. We approximate this behavior using a single geometrical term, i.e. y(i) = KY(1)^i, i = 0, 1, .... In Fig. 6 the approximating functions are denoted by solid lines. They exactly capture the lag-1 values of the NACF of the empirical processes and do not significantly overestimate or underestimate autocorrelation coefficients for larger lags. We also tested the traces for homogeneity. To do so, each trace was divided into two samples having the same number of observations. In our case this corresponds to
Fig. 4. Aggregated traffic traces from a fixed number of sources

Fig. 5. Histograms and their normal approximations

Fig. 6. NACFs and their approximations by a single geometrical term
900 observations. Then, we applied the χ² test for homogeneity of two samples. This test allows us to check the hypothesis H0 that the two parts have the same marginal distribution against the alternative hypothesis, H1, that they have different marginal distributions. The tests demonstrate that, with level of significance α = 0.1, the two parts have the same distribution. This result allows us to assume that when the number of multiplexed sources remains constant the aggregated video traffic constitutes a realization of a covariance-stationary stochastic process with normal distribution and geometrically decaying ACF. We note that our tests do not allow us to be statistically strict in our conclusion regarding covariance stationarity. Unfortunately, there are no effective methods to statistically test whether given
observations are stationary or not. However, the tests carried out confirm that the marginal distribution is normal. Moreover, this distribution depends only on the initial set of multiplexed traces and remains constant in time. The NACF tends to zero as time progresses, confirming that the underlying process is ergodic. All these conditions are necessary for an underlying process to be covariance stationary. In Section 4 we demonstrate one more test that also confirms our hypotheses.
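The homogeneity check above can be sketched as follows; `chi2_homogeneity` is a hypothetical helper (ours, not the paper's) that bins both halves on a common grid and computes the two-sample chi-square statistic, to be compared against the chi-square critical value with m − 1 degrees of freedom at the chosen level of significance:

```python
import numpy as np

def chi2_homogeneity(x1, x2, m=10):
    """Two-sample chi-square homogeneity statistic: bin both samples
    on a common grid of m bins and compare observed counts against
    pooled expected counts under H0 (same marginal distribution).
    Returns (statistic, degrees of freedom = m - 1)."""
    lo = min(x1.min(), x2.min())
    hi = max(x1.max(), x2.max())
    edges = np.linspace(lo, hi, m + 1)
    c1, _ = np.histogram(x1, edges)
    c2, _ = np.histogram(x2, edges)
    n1, n2 = len(x1), len(x2)
    stat = 0.0
    for a, b in zip(c1, c2):
        if a + b == 0:
            continue                       # empty bin contributes nothing
        e1 = (a + b) * n1 / (n1 + n2)      # pooled expected count, sample 1
        e2 = (a + b) * n2 / (n1 + n2)      # pooled expected count, sample 2
        stat += (a - e1) ** 2 / e1 + (b - e2) ** 2 / e2
    return stat, m - 1

# two halves of one normal sample should pass; a shifted half should not
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 1800)
stat_same, df = chi2_homogeneity(x[:900], x[900:])
stat_diff, _ = chi2_homogeneity(x[:900], x[900:] + 5.0)
```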
2.4 Model for Aggregated Traffic
To model aggregated traffic from a fixed number of sources we use a covariance-stationary autoregressive process of order one, AR(1). This process has a normal marginal distribution and geometrically decaying ACF, allowing us to suggest that it may produce a fair approximation of empirical data. These two properties of traffic patterns were found to produce the major impact on service performance in a network (see [4,3] among others). A process is said to be autoregressive of order one if it is given by

X(n) = φ0 + φ1 X(n − 1) + ε(n),  n = 0, 1, ...,    (1)

where φ0 and φ1 are constants and {ε(n), n = 0, 1, ...} are independently and identically distributed random variables having the same normal distribution with zero mean and variance σ²[ε]. For (1) to be weakly stationary it is sufficient to have |φ1| < 1. In this case

E[X(n)] = μX,  σ²[X(n)] = γX(0),  Cov(X(n), X(n + i)) = γX(i),    (2)

where μX and γX(i), i = 0, 1, ..., are constants. The mean, variance and covariance of AR(1) are related to φ0, φ1 and σ²[ε] as

μX = φ0/(1 − φ1),  σ²[X] = σ²[ε]/(1 − φ1²),  γX(i) = φ1^i γX(0).    (3)

Parameters of AR(1) models are related to statistical data as

φ1 = KY(1),  φ0 = μY(1 − φ1),  σ²[ε] = σ²[Y](1 − φ1²),    (4)

where KY(1), μY and σ²[Y] are estimates of the lag-1 autocorrelation coefficient, mean and variance, respectively.
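The fitting relations (4) and the generating recursion (1) translate directly into code; the following is a minimal sketch (the helper names `fit_ar1`/`generate_ar1` and the sample parameter values are our own illustrative choices):

```python
import numpy as np

def fit_ar1(y):
    """Estimate AR(1) parameters from a trace via Eq. (4):
    phi1 = lag-1 autocorrelation K_Y(1), phi0 = mu_Y*(1 - phi1),
    sigma2_eps = sigma2_Y*(1 - phi1**2)."""
    y = np.asarray(y, dtype=float)
    mu, var = y.mean(), y.var()
    phi1 = np.sum((y[:-1] - mu) * (y[1:] - mu)) / (len(y) * var)
    phi0 = mu * (1.0 - phi1)
    sigma2_eps = var * (1.0 - phi1 ** 2)
    return phi0, phi1, sigma2_eps

def generate_ar1(phi0, phi1, sigma2_eps, n, rng):
    """Generate n samples of X(k) = phi0 + phi1*X(k-1) + eps(k), Eq. (1)."""
    x = np.empty(n)
    x[0] = phi0 / (1.0 - phi1)           # start at the stationary mean, Eq. (3)
    eps = rng.normal(0.0, np.sqrt(sigma2_eps), n)
    for k in range(1, n):
        x[k] = phi0 + phi1 * x[k - 1] + eps[k]
    return x

# round-trip check: generate with known parameters, then re-estimate them
rng = np.random.default_rng(1)
trace = generate_ar1(phi0=100.0, phi1=0.6, sigma2_eps=64.0, n=50000, rng=rng)
est = fit_ar1(trace)
```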
3 Change-Point Statistical Tests

3.1 Basic Principles
To differentiate between fluctuations of the aggregated traffic around a constant mean and those fluctuations caused by changes in the mean value we propose to use change-point statistical tests. There are a number of change-point detection
algorithms developed to date. The common approach to this task is to use control charts, including Shewhart, cumulative sum (CUSUM) or exponentially-weighted moving average (EWMA) charts. The idea of control charts is to classify all causes of deviation from a target value into two groups: common causes and special causes. Deviation due to common causes is the effect of 'inherent' causes affecting a given process. Special causes are not part of the process, occur accidentally and affect the process significantly. Control charts signal the point at which special causes occur using two control limits. If the values of a certain function of the initial observations are between them, the process is 'in-control'; if some value falls outside, the process is 'out-of-control'. For detecting changes in aggregated traffic we assume the following. We consider common causes of deviation to be those resulting from the inherent stochastic nature of the multiplexed traffic from a fixed number of video sources. Special causes are all those causing changes in the parameters of the 'in-control' process; for example, among these are arrivals of new sessions adding new traffic and completions of ongoing ones subtracting some traffic. The control procedure is as follows. Initially, a control chart is parameterized using estimates of the parameters of the aggregated traffic process. When a change in a parameter occurs, the new process is considered 'in-control' and the control chart is re-parameterized according to the statistics of this process.

3.2 Change in the Mean Value
Assume that k observations of a stochastic process have the same distribution F0. Change-point algorithms test the null hypothesis that the currently observed observation k has distribution F0 against the alternative hypothesis that this observation has distribution F1. It is often assumed that F0 and F1 are known except for some parameters of F1; control charts are used for detecting changes in these unknown parameters. In our case the form of the distribution is known in advance and the unknown parameter is the mean value. Change-point tests often require observations to be independent. We have seen that aggregated video traffic is characterized by positive correlation, and autocorrelation makes control charts less sensitive to changes in the mean value. For detecting changes in the mean value of autocorrelated processes two approaches have been proposed. In the first approach, the control limits of the charts are modified to take the autocorrelation properties into account. The idea of the second approach is to fit the observations using a certain time-series model and subsequently test the residuals: if the model fits the empirical data well, the residuals are uncorrelated and control charts for independent observations can be used. The performance of change-point statistical tests for autocorrelated data has been compared in [6,7]. It was shown that modified control charts on the initial observations perform better when the autocorrelation is positive. For this reason, we use the first approach.
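For illustration, the second (residual-based) approach amounts to whitening the observations with the fitted AR(1) model and charting the residuals; a minimal sketch under our own naming (the paper itself adopts the first approach):

```python
import numpy as np

def ar1_residuals(y, phi0, phi1):
    """Residuals e(n) = Y(n) - phi0 - phi1*Y(n-1); for a well-fitted
    AR(1) model these are approximately i.i.d., so a standard chart
    for independent observations could be applied to them."""
    y = np.asarray(y, dtype=float)
    return y[1:] - phi0 - phi1 * y[:-1]

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation coefficient."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    return np.sum((x[:-1] - mu) * (x[1:] - mu)) / np.sum((x - mu) ** 2)

# whitening demo on a synthetic AR(1) trace with phi0 = 0, phi1 = 0.7
rng = np.random.default_rng(7)
eps = rng.normal(0.0, 1.0, 20000)
y = np.empty_like(eps)
y[0] = 0.0
for n in range(1, len(y)):
    y[n] = 0.7 * y[n - 1] + eps[n]
res = ar1_residuals(y, 0.0, 0.7)
```

The trace itself is strongly correlated at lag 1 while its residuals are not, which is exactly why charts on raw observations need modified limits.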
3.3 EWMA Control Charts
Let {Y(n), n = 0, 1, ...} be a sequence of observations. The value of the EWMA statistic at time n, denoted by LY(n), is given by

LY(n) = γY(n) + (1 − γ)LY(n − 1),    (5)

where γ ∈ (0, 1) is a constant. The EWMA test occupies a central place among control charts. Although, according to (5), the most recent value always receives more weight in the computation of LY(n), the choice of γ determines the effect of previous observations on the current value of the EWMA statistic. When γ → 1 all weight is placed on the current observation, LY(n) → Y(n), and the EWMA statistic degenerates to the original observations; as a result, the EWMA control chart behaves like a Shewhart chart. When γ → 0 the current observation gets only a little weight, and most weight is assigned to previous observations; in this case the EWMA control chart is similar to a CUSUM chart. Summarizing, EWMA charts provide more flexibility at the expense of additional complexity in determining one more parameter, γ. Due to this flexibility, we use EWMA control charts.

Assume that observations {Y(n), n = 0, 1, ..., N} are taken from a covariance-stationary process with mean E[Y] and variance σ²[Y] and can be well represented by an AR(1) model. If LY(0) = E[Y] it is easy to see that E[LY] = E[Y] = μY when n → ∞. The approximation of the variance of {LY(n), n = 0, 1, ...} for n → ∞ is [7]

σ²[LY] = σ²[Y] · [γ/(2 − γ)] · [1 + φ1(1 − γ)]/[1 − φ1(1 − γ)],    (6)

where φ1 is the parameter of the AR(1) process. The control limits E[LY] ± CY(n) are given by

E[LY] ± kσ[Y] √( [γ/(2 − γ)] · [1 + φ1(1 − γ)]/[1 − φ1(1 − γ)] ),    (7)

where k is a design parameter. To parameterize an EWMA control chart a number of parameters have to be provided. Firstly, the parameter γ determining the decline of the weights of past observations should be set. The values of k and γ determine the width of the control belts for a process with given σ²[Y] and μY. These parameters affect the behavior of the so-called average run length (ARL) curve that is used to determine the efficiency of a change detection procedure. The ARL is defined as the average number of observations up to the first 'out-of-control' signal. Values of k and γ for a given ARL, σ²[Y] and μY are provided in [7]. Finally, μY and σ²[Y] are not usually known in practice and must be estimated from empirical data; therefore, their estimates should be used in (7).
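Equations (5)–(7) can be sketched in code as follows; this is a minimal illustration, not the authors' implementation, and the function name, warm-up handling and the synthetic test trace with a mean shift are our own assumptions:

```python
import numpy as np

def ewma_chart(y, gamma, k, phi1, warmup=50):
    """Run an EWMA chart on observations y. Control limits follow
    Eq. (7) with mean and variance estimated over the warm-up period.
    Returns the EWMA series and the index of the first
    'out-of-control' signal after warm-up (or None)."""
    y = np.asarray(y, dtype=float)
    mu, sigma = y[:warmup].mean(), y[:warmup].std()
    half_width = k * sigma * np.sqrt(
        gamma / (2.0 - gamma)
        * (1.0 + phi1 * (1.0 - gamma)) / (1.0 - phi1 * (1.0 - gamma)))
    lo, hi = mu - half_width, mu + half_width
    L = mu                                   # L_Y(0) = E[Y]
    series, signal = [], None
    for n, obs in enumerate(y):
        L = gamma * obs + (1.0 - gamma) * L  # Eq. (5)
        series.append(L)
        if signal is None and n >= warmup and not (lo <= L <= hi):
            signal = n
    return np.array(series), signal

# synthetic AR(1) trace (phi1 = 0.5) with a mean shift of +3 at n = 1000
rng = np.random.default_rng(0)
eps = rng.normal(0.0, 1.0, 2000)
x = np.empty(2000)
x[0] = 0.0
for n in range(1, 2000):
    x[n] = 0.5 * x[n - 1] + eps[n]
x[1000:] += 3.0
series, signal = ewma_chart(x, gamma=0.1, k=3.0, phi1=0.5)
```

The detection lag after the shift illustrates the memory of the EWMA statistic: the smaller γ, the longer the lag but the fewer the false alarms.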
4 Numerical Examples

4.1 Aggregated Traffic: Fixed Number of Sources
In Section 2 we assumed that aggregated traffic from a fixed number of video sources constitutes a realization of a covariance-stationary process. Let us now provide one more argument for this conclusion by applying the EWMA change-point test to the traces shown in Fig. 4. We have already found that this traffic is normally distributed and exhibits significant lag-1 autocorrelation; due to these properties, control limits are computed according to (7). The warm-up period used to compute control limits was set to 50 observations. EWMA statistics for different values of γ and k are shown in Fig. 7. Note that k = 3, γ = 0.01 corresponds to an 'in-control' ARL of 137.91 for the first trace and 175.24 for the second trace; k = 3, γ = 0.001 corresponds to an 'in-control' ARL of 956.68 for the first trace and 1286.97 for the second one. One can observe that no change in the average value of traffic observations is detected even though the ARL value for k = 3, γ = 0.01 is relatively small. These results suggest that the mean value of the aggregated traffic from a fixed number of sources does not vary in time, as required by covariance stationarity.

Fig. 7. EWMA statistics for aggregated traffic from a fixed number of sources
4.2 Aggregated Traffic: Varying Number of Sources
Consider now aggregated traffic from a varying number of sources, shown in Fig. 2. EWMA statistics computed for these traces are shown in Fig. 8(a) and Fig. 8(b) (k = 3 and γ = 0.01). Fig. 8(c) and Fig. 8(d) demonstrate the same statistics for the first several session requests. Solid horizontal lines represent control limits,
Fig. 8. EWMA statistics for aggregated traffic from varying number of sources (γ = 0.01, k = 3)

Fig. 9. EWMA statistics for aggregated traffic from varying number of sources (γ = 0.001, k = 3)
boxes represent session arrivals. The first box indicates the time instant when the second session request arrives at the system; the second one denotes the time instant when the third session arrives. Let us consider the EWMA statistics in detail.
We started the control chart when the first session arrives, which occurred at second 2203 for trace 1 and at second 6615 for trace 2. The warm-up period used to compute the statistics of observations and the control limits of the charts was set to 50 observations. One may observe from Fig. 8(c) and Fig. 8(d) that both processes stay 'in-control' when the second sessions arrive and remain 'in-control' until new traffic adds up in the EWMA statistics and changes are detected. Note that there are gaps before the EWMA statistics exceed the upper control limit for both traces. This is due to the memory of the EWMA statistic, which helps avoid false signals but worsens the reactive properties of the chart. In general, as the value of γ gets smaller, this interval becomes larger while the probability of false change detection decreases. EWMA statistics computed for these traces with k = 3 and γ = 0.001 are shown in Fig. 9(a) and Fig. 9(b); the same statistics for the first several session arrivals are shown in Fig. 9(c) and Fig. 9(d). One may see that the changes in the mean values of the aggregated traffic patterns are successfully detected for these values of γ and k.
5 Conclusions
Based on the EWMA change-point statistical test, we developed a dynamic state detection algorithm for aggregated network traffic. We demonstrated the applicability of this test to the case when the mean value of the traffic process varies in time. The proposed approach is useful whenever the traffic aggregate experiences high variability. It can be implemented on a traffic-class basis, suggesting that it may scale well in backbone routers. Supplemented with dynamic resource allocation, the proposed approach may result in truly optimized operation of networks. A number of problems still remain open, among them the choice of the optimal length of the warm-up period and false alarms due to short-term changes. Finally, the resource allocation scheme has to be developed. This scheme must provide a way to deal with the periods of uncertainty during warm-up periods, when the parameters of a new 'in-control' traffic process are being estimated.
References
1. MPEG-4 and H.263 video traces for network performance evaluation. Technical University of Berlin (accessed 12.05.2006), available at: http://www.tkn.tu-berlin.de/research/trace/trace.html
2. MPEG traffic archive, University of Wuerzburg (accessed 12.09.2003), available at: http://www.info3.informatik.uni-wuerzburg.de/mpeg/traces/
3. Hajek, B., He, L.: On variations of queue response for inputs with the same mean and autocorrelation function. IEEE Trans. Netw. 6(5), 588–598 (1998)
4. Li, S.-Q., Hwang, C.-L.: Queue response to input correlation functions: discrete spectral analysis. IEEE Trans. Netw. 1, 522–533 (1997)
5. Thompson, K., Miller, G., Wilder, R.: Wide-area internet traffic patterns and characteristics. IEEE Network 11, 10–23 (1997)
6. Wieringa, J.: Control charts for monitoring the mean of AR(1) data. University of Groningen, Department of Econometrics (accessed 06.07.2005), available at: http://www.ub.rug.nl/eldoc/som/a/98a09/98a09.pdf
7. Wieringa, J.: Statistical process control for serially correlated data. PhD Thesis, University of Groningen, Department of Econometrics (accessed 18.10.2005), available at: http://dissertations.ub.rug.nl/files/faculties/eco/1999/j.e.wieringa/
The Drop-From-Front Strategy in AQM

Joanna Domańska¹, Adam Domański², and Tadeusz Czachórski¹,²

¹ Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Bałtycka 5, 44-100 Gliwice, Poland
{joanna,tadek}@iitis.gliwice.pl
² Institute of Informatics, Silesian Technical University, Akademicka 16, 44-100 Gliwice, Poland
[email protected]
Abstract. The article investigates the influence of the way packets are chosen to be dropped (end of the tail, head of the tail) on the performance, i.e. the response time, of RED and DSRED queues - two representative active queue management mechanisms used in IP routers. In particular, self-similar traffic is considered. The quantitative analysis is based on simulation and on Markov chain models solved numerically.
1 Introduction
The algorithms of queue management at IP routers determine which packet should be deleted when necessary. Active queue management, recommended now by the IETF, enhances the efficiency of transfers and cooperates with the TCP congestion window mechanism in adapting the flow intensity to the congestion in a network [1]. In classic RED, each incoming packet is considered to be dropped or marked. In [2] S. Floyd wrote: "when RED is working right the average queue size should be small, and it shouldn't make too much difference one way or another whether you drop a packet at the front of the queue or at the tail". Here, we reconsider the problem of choosing either tail or front packets in the presence of self-similar traffic. Section 2 gives basic notions of active queue management, Section 3 briefly presents the self-similar model used in the article. Section 4 gives Markov chain and simulation models of the two considered active queue management schemes: RED and Double-Slope RED (DSRED). Section 5 discusses numerical results; some conclusions are given in Section 6.
2 Active Queue Management
In passive queue management, packets coming to a buffer are rejected only if there is no space in the buffer to store them, hence the senders have no earlier warning of the danger of growing congestion. In this case all packets coming during saturation of the buffer are lost. The existing schemes may differ in the
choice of the packet to be deleted (end of the tail, head of the tail, random). During a saturation period all connections are affected and all react in the same way, hence they become synchronised. To enhance the throughput and fairness of link sharing, and also to eliminate the synchronisation, the Internet Engineering Task Force (IETF) recommends active algorithms of buffer management. They incorporate mechanisms of preventive packet dropping when there is still room to store some packets, to advertise that the queue is growing and the danger of congestion is ahead. The probability of packet rejection grows together with the level of congestion. The packets are dropped randomly, hence only chosen users are notified and the global synchronisation of connections is avoided. A detailed discussion of the goals of active queue management may be found in [1]. The RED (Random Early Detection) algorithm was proposed by the IETF to enhance transmission via IP routers. It was first described by Sally Floyd and Van Jacobson in [3]. Its performance is based on a drop function giving the probability that a packet is rejected. The argument avg of this function is a weighted moving average queue length, acting as a low-pass filter and calculated at the arrival of each packet as avg = (1 − w)avg' + wq, where avg' is the previous value of avg, q is the current queue length and w is a weight determining the importance of the instantaneous queue length, typically w ≪ 1. If w is too small, the reaction to arising congestion is too slow; if w is too large, the algorithm is too sensitive to ephemeral changes of the queue (noise). Articles [3,4] recommend w = 0.001 or w = 0.002, and [5] shows the efficiency of w = 0.05 and w = 0.07. Article [6] analyses the influence of w on queueing time fluctuations; obviously, the larger w, the higher the fluctuations. The RED drop function has two thresholds, Min_th and Max_th.
If avg < Min_th all packets are admitted; if Min_th < avg < Max_th the dropping probability p grows linearly from 0 to p_max:

p = p_max (avg − Min_th)/(Max_th − Min_th)

and if avg > Max_th all packets are dropped. The value of p_max also has a strong influence on RED performance: if it is too large, the overall throughput is unnecessarily choked, and if it is too small the danger of synchronisation arises; [4] recommends p_max = 0.1. The problem of the choice of parameters is still discussed, see e.g. [7,8]. The mean avg may also be determined in another way, see [9] for a discussion. Despite its evident advantages, RED also has such drawbacks as low throughput, unfair bandwidth sharing, introduction of variable latency and deterioration of network stability. Therefore numerous propositions of improvements of the basic algorithm have appeared; their comparison may be found e.g. in [10]. DSRED (double-slope RED), introduced in [11] and developed in [12], is one of these modifications. Three thresholds K_l, K_m and K_h (usually K_m = (K_l + K_h)/2) and a parameter γ determine the two slopes of the DSRED drop function:
p(avg) =
  0                         if avg < K_l
  α(avg − K_l)              if K_l ≤ avg < K_m
  1 − γ + β(avg − K_m)      if K_m ≤ avg < K_h
  1                         if K_h ≤ avg ≤ N

where

α = 2(1 − γ)/(K_h − K_l),  β = 2γ/(K_h − K_l)
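The RED and DSRED drop functions above can be transcribed directly; a small illustrative sketch (the function names and the numeric spot-checks are ours):

```python
def update_avg(avg, q, w):
    """Weighted moving average of the queue length (low-pass filter)."""
    return (1.0 - w) * avg + w * q

def red_drop_probability(avg, min_th, max_th, p_max):
    """RED drop function: 0 below Min_th, linear between the
    thresholds, 1 above Max_th."""
    if avg < min_th:
        return 0.0
    if avg < max_th:
        return p_max * (avg - min_th) / (max_th - min_th)
    return 1.0

def dsred_drop_probability(avg, k_l, k_m, k_h, gamma):
    """DSRED two-slope drop function; avg is assumed to be at most
    the buffer size N, for which the probability is 1."""
    alpha = 2.0 * (1.0 - gamma) / (k_h - k_l)
    beta = 2.0 * gamma / (k_h - k_l)
    if avg < k_l:
        return 0.0
    if avg < k_m:
        return alpha * (avg - k_l)
    if avg < k_h:
        return 1.0 - gamma + beta * (avg - k_m)
    return 1.0
```

Note that with K_m = (K_l + K_h)/2 the first slope reaches α(K_m − K_l) = 1 − γ exactly at K_m, so the two slopes join continuously.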
The double-slope function makes the algorithm more elastic (there are more parameters to fix); a drop function that is gentle at the beginning (for low congestion) enhances throughput and reduces queue waiting times. In this article, we present analytical (Markov chain based) and simulation models of RED and DSRED. We assume either Poisson or self-similar traffic. Because of the difficulty of analyzing RED mathematically [13], RED and DSRED are studied in an open-loop scenario.
3 Self-similarity of Network Traffic
Measurements and statistical analysis of network traffic, e.g. [14,15], show that it displays a self-similar character. It is observed on various protocol layers and in different network structures. Self-similarity of a process means that a change of time scale does not affect the statistical characteristics of the process. It results in long-range dependence and makes possible the occurrence of very long periods of high (or low) traffic intensity. These features have a great impact on network performance: they enlarge the mean queue lengths at buffers and increase the probability of packet losses, reducing this way the quality of services provided by a network. Also TCP/IP traffic is characterised by burstiness and long-term correlation [16]; its features are additionally influenced by the performance of congestion avoidance and congestion management mechanisms [17,18].

Let a process Xk represent the traffic intensity measured in fixed time intervals and let the aggregated process Xk^(m) be the average of the basic process over a group of m consecutive samples: Xk^(m) = (1/m)(X_{km−m+1} + ... + X_{km}), where k ≥ 1. There are several methods used to check if a process is self-similar. The easiest one is a visual test: one can observe the behaviour of the basic process Xk and the aggregated process Xk^(m); if these processes have the same character, i.e. the increase of m does not smooth the process, the process is self-similar. More formally, the difference between a short-range dependent and a long-range dependent (self-similar) process is as follows [15]: for the first process the sum of covariances Σ_{k=0}^{∞} cov(k) is convergent, the spectrum of the process S(ω) = Σ_{k=−∞}^{∞} R(k)e^{−jωk}, where R(k) is the autocorrelation function of the process, is finite at ω = 0, and the variance var(Xk^(m)) tends asymptotically for large m to var(X)/m. In the case of a long-range dependent process, the sum of covariances Σ_{k=0}^{∞} cov(k) is divergent, S(0) is singular, and var(Xk^(m)) tends asymptotically to var(X)/m^β, where 0 < β < 1. The parameter β is related to the Hurst parameter H (often used to characterise the self-similarity of a process): H = 1 − β/2 [15]. For 0.5 < H ≤ 1 the process is self-similar; the closer H is to 1, the greater is the degree of persistence of long-range dependence.

To represent the self-similar traffic we use here a model introduced by S. Robert [19,20]. The time of the model is discrete and divided into unit-length slots. Only one packet can arrive during each time-slot. In the case of a memoryless, geometric source, a packet comes into the system with fixed probability α1. In the case of self-similar traffic, packet arrivals are determined by an n-state discrete-time Markov chain called a modulator. It was assumed that the modulator has n = 5 states (i = 0, 1, ..., 4) and packets arrive only when the modulator is in state i = 0. The elements of the modulator transition probability matrix depend on only two parameters, q and a; therefore only two parameters should be fitted to match the mean value and the Hurst parameter of the process. If p_{ij} denotes the modulator transition probability from state i to state j, then it was assumed that p_{0j} = 1/a^j, p_{j0} = (q/a)^j, p_{jj} = 1 − (q/a)^j, where j = 1, ..., 4, p_{00} = 1 − 1/a − ... − 1/a^4, and the remaining probabilities are equal to zero. The passages from state 0 to the other states determine the process behaviour on one time scale; hence the number of these states corresponds to the number of time scales on which the process may be considered self-similar. The model was fitted to real data [21]. This model enables us to represent, with the use of few parameters, network traffic which is self-similar over several time-scales.
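Robert's modulator can be sketched as follows; this is an illustrative NumPy construction (the values q = 0.9 and a = 4 are arbitrary, not the fitted values from [21]):

```python
import numpy as np

def modulator_matrix(q, a, n_states=5):
    """Transition matrix of the modulator: p_{0j} = 1/a^j,
    p_{j0} = (q/a)^j, p_{jj} = 1 - (q/a)^j for j = 1..n-1,
    p_{00} = 1 - sum_j 1/a^j; all other entries are zero."""
    P = np.zeros((n_states, n_states))
    for j in range(1, n_states):
        P[0, j] = 1.0 / a ** j
        P[j, 0] = (q / a) ** j
        P[j, j] = 1.0 - (q / a) ** j
    P[0, 0] = 1.0 - P[0, 1:].sum()
    return P

def simulate_arrivals(P, n_slots, rng):
    """A packet arrives in a slot iff the modulator is in state 0."""
    state, arrivals = 0, np.zeros(n_slots, dtype=int)
    for t in range(n_slots):
        arrivals[t] = 1 if state == 0 else 0
        state = rng.choice(len(P), p=P[state])
    return arrivals

P = modulator_matrix(q=0.9, a=4.0)
arrivals = simulate_arrivals(P, 200, np.random.default_rng(0))
```

Note the construction requires q ≤ a and Σ_j a^{-j} ≤ 1 for all entries to be valid probabilities; both hold for the values above.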
4
Analytical and Simulation Models of RED and DSRED
The RED and DSRED queue mechanisms are represented by a single-server model based either on a discrete-time Markov chain or on simulation. The service time represents the time of packet treatment and dispatching; its distribution is geometric. The model of the incoming traffic was presented above. For both cases considered in the comparisons, i.e. for geometric interarrival time distribution (which corresponds to Poisson traffic in the case of continuous-time models) and for self-similar traffic, the considered traffic intensities are the same. A detailed discussion of the choice of model parameters is presented in [22]. In the Markov model, the chain state is defined by the number of packets in the queue, the integer part of the avg value, and by four flags u1, u2, u3, u4 approximating the fractional part of this value (as avg is a real number, it is impossible to attribute a state to each of the infinite number of its possible values) in the following way: the fractional part is approximated by
[(i − 1) · 0.25 + i · 0.25] / 2
where i is the index of the non-zero flag. If all flags are null, we take the integer value of avg. In the case of self-similar traffic this state definition is supplemented by a variable denoting the state of the modulator.
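To make the flag encoding concrete, here is a small sketch of one way the real-valued avg could be quantized to the state (integer part, flag index) and recovered via the midpoint formula above. The function names and the rule for picking the quarter interval are ours, not from the paper:

```python
def encode_avg(avg):
    # State = (integer part of avg, index i of the single non-zero flag);
    # i = 0 means all flags are null (avg is taken as an integer).
    k = int(avg)
    frac = avg - k
    i = 0 if frac == 0 else min(4, int(frac / 0.25) + 1)
    return k, i

def decode_avg(k, i):
    # Midpoint approximation of the fractional part: ((i-1)*0.25 + i*0.25)/2
    if i == 0:
        return float(k)
    return k + ((i - 1) * 0.25 + i * 0.25) / 2.0
```

For example, avg = 3.6 falls in the third quarter of the unit interval, so the state is (3, 3) and the decoded approximation is 3.625.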
The Drop-From-Front Strategy in AQM
The vector p of state probabilities is given by the system of linear equations p = p P, where P is the transition probability matrix, which is generally large (the number of states, hence the order of the matrix P, may be hundreds of thousands or millions), sparse and ill-conditioned; the use of well-known and broadly used numerical algorithms for algebraic and differential equation systems gives poor results. That is why a projection method using Krylov subspaces, as recommended in [23], was chosen. The method of Arnoldi is an orthogonal projection process onto the Krylov subspace. It may be used to compute approximations to the unit eigenvalue and the corresponding eigenvector of the matrix P. The matrix Hm (an upper Hessenberg matrix) represents the restriction of the linear transformation P to the subspace Km, so approximations of the eigenvalues of P can be obtained from the eigenvalues of Hm. The so-called Rayleigh–Ritz procedure is often used for extracting eigenvalue and eigenvector approximations from a given subspace: if λi is an eigenvalue of Hm and pi the corresponding eigenvector, i.e. Hm pi = λi pi, then λi is taken as an approximation to an eigenvalue of P, and Vm pi as an approximation to the corresponding eigenvector of P. To simplify the notation, we denote v = p(ti) and w = p(ti+1). The solution has the form w = e^P v (for simplicity, we omit here the constant τi = ti+1 − ti). Approximating e^P v by a polynomial of degree m − 1 in P (such a polynomial is a linear combination of the vectors v, P v, . . . , P^{m−1} v) yields an element of the Krylov subspace
Km(P, v) ≡ Span{v, P v, . . . , P^{m−1} v}.
The method reduces to finding the element of Km(P, v) that best approximates w = e^P v.
The set of base vectors of this subspace is denoted by Vm = [v1, v2, . . . , vm], with v1 = v/β where β = ||v||2, so that v = β Vm e1, and:
w ≈ Vm (Vm^T Vm)^{-1} Vm^T e^P Vm β e1 (1)
In the vector ei the i-th element is equal to 1 and the others are null. The set of base vectors Vm is obtained via Arnoldi's procedure [23]:
1. v1 = v/||v||2
2. For j = 1, 2, . . . , m do
     z = P vj
     For i = 1, 2, . . . , j do
       hij = vi^T z
       z = z − hij vi
     hj+1,j = ||z||2
     vj+1 = z/hj+1,j
The above algorithm is the modified Gram–Schmidt orthogonalization procedure; the obtained vectors vi are orthonormal, and the upper Hessenberg matrix Hm (of dimension m × m), composed of the coefficients hij, satisfies the equation
P Vm = Vm Hm + hm+1,m vm+1 em^T (2)
The set of vectors Vm is orthonormal, hence we can simplify eq. (1):
w ≈ β Vm (Vm^T e^P Vm) e1
If we approximate Vm^T e^P Vm by e^{Vm^T P Vm}, the sought vector w becomes:
w ≈ β Vm e^{Vm^T P Vm} e1
Also, because the set of vectors Vm is orthonormal, we may write
Hm = Vm^T P Vm (3)
and the solution may be expressed as:
w ≈ β Vm e^{Hm} e1
This solution still needs the calculation of a matrix exponential, but the size of the matrix is considerably smaller (m, the dimension of the Krylov subspace, is significantly smaller than n, the number of states of the considered system). Hence we can use any method advised for small systems, e.g. Padé approximation. In the above description the constant τ was omitted. It may easily be put back into the obtained solution because Vm^T (P τ) Vm = Hm τ, and the Krylov subspaces related to P and P τ are identical. Hence, the use of Krylov subspaces for transient states consists in:
– the use of the Arnoldi procedure to obtain the orthonormal set of base vectors Vm and the Hessenberg matrix Hm;
– the use of the Padé approximation to obtain e^{Hm τ};
– calculation of the state probability vector approximated by β Vm e^{Hm τ} e1.
To validate the Markovian results, we used the simulation package OMNeT++, written in C++ by A. Varga [http://www.omnetpp.org/]. Below we present some numerical results.
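The three steps above can be sketched as follows. This is a minimal implementation, without breakdown handling and for a generic matrix, assuming m is well below the chain size n; SciPy's expm supplies the Padé-based exponential of the small matrix Hm:

```python
import numpy as np
from scipy.linalg import expm  # Pade-based matrix exponential

def arnoldi(P, v, m):
    # Modified Gram-Schmidt Arnoldi: returns V_m (orthonormal basis of
    # K_m(P, v)) and the m x m upper Hessenberg matrix H_m = V_m^T P V_m.
    n = len(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        z = P @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ z
            z -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(z)
        V[:, j + 1] = z / H[j + 1, j]
    return V[:, :m], H[:m, :m]

def transient_step(P, v, tau, m):
    # Approximate w = exp(P tau) v by beta * V_m exp(H_m tau) e_1.
    beta = np.linalg.norm(v)
    V, H = arnoldi(P, v, m)
    return beta * (V @ expm(H * tau)[:, 0])
```

Only the m x m matrix H ever enters the exponential, which is the point of the method when P has hundreds of thousands of states.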
5
Numerical Results
Our goal is to capture the influence of the way a packet is chosen to be deleted (tail of the queue, head of the queue) on the RED and DSRED queueing times. The input traffic intensity (for geometric and self-similar traffic) was chosen as α = 0.5,
and due to the modulator characteristics, the Hurst parameter of the self-similar traffic was fixed to H = 0.78. The RED parameters had the following values: buffer size 250 packets, threshold values Minth = 100 and Maxth = 200, pmax = 0.1, w = 0.002 or w = 0.07. The parameter μ of the geometric distribution of service times (the probability of the end of service within the current time-slot) was μ = 0.25 or μ = 0.5. Due to the changes of μ, two different traffic loads (low and high) were considered. In the case of the DSRED policy, the traffic pattern and the buffer size are the same, the parameters are Kl = Minth = 100 and Kh = Maxth = 200, and the intermediate threshold is Km = 150. The shaping parameter γ had three values: γ = 0.15, 0.5, 0.85. Fig. 1 displays a comparison of analytical and simulation results. They are almost identical when probabilities are greater than 10^−10; for smaller values the simulation results are not significant (the simulation run involved 250 million packets), while the Markov model is able to give the probabilities of very rare events. If the mean queue length is relatively low, the influence of the dropping scheme on queueing time is negligible: the introduction of the drop-from-front strategy gives a 0.7% shorter mean queueing time in the case of RED and a 0.8% shorter mean queueing time in the case of DSRED, see Fig. 2.

Fig. 1. Queue distribution for RED queue: geometric source, α = 0.5, μ = 0.25, w = 0.07, analytic and simulation results
Naturally, the introduction of DSRED gives a shorter mean queue length and a shorter mean queueing time compared to RED. However, when the Poisson traffic is replaced by a self-similar one with the same intensity, preserving the same RED parameters, the queue length grows and the influence of the dropping scheme is more visible: the drop-from-front strategy reduces the mean queueing time by 16.4%. A comparison of response time distributions for the RED queue under both strategies is presented in Fig. 3 (left). The same comparison in the case of
the DSRED queue is presented in Fig. 3 (right). In this case the response time with the drop-from-front strategy is 18.1% shorter than with the tail-drop mechanism.

Fig. 2. Waiting times for RED (left) and DSRED (right) queues: drop-from-front and drop-from-tail strategies, geometric source, α = 0.5, μ = 0.5, w = 0.07, γ = 0.5.

Fig. 3. Waiting times for RED (left) and DSRED (right) queues: drop-from-front and drop-from-tail strategies, self-similar source, α = 0.5, μ = 0.5, w = 0.07, γ = 0.5.

Fig. 4. Waiting times for RED (left) and DSRED (right) queues: drop-from-front and drop-from-tail strategies, self-similar source, α = 0.5, μ = 0.5, w = 0.002, γ = 0.5.
Fig. 5. Waiting times for RED (left) and DSRED (right) queues: drop-from-front and drop-from-tail strategies, geometric source, α = 0.5, μ = 0.25, w = 0.07, γ = 0.5.

Fig. 6. Waiting times for RED (left) and DSRED (right) queues: drop-from-front and drop-from-tail strategies, self-similar source, α = 0.5, μ = 0.25, w = 0.07, γ = 0.5.

Fig. 7. Waiting times for RED (left) and DSRED (right) queues: drop-from-front and drop-from-tail strategies, self-similar source, α = 0.5, μ = 0.25, w = 0.002, γ = 0.5.
Table 1. Comparison of RED and DSRED (geometric and self-similar traffic)

                                      Mean     Variance   Loss         Mean     Variance
                                      queue    of queue   probability  waiting  of waiting
                                      length   length                  time     time
RED mu=0.5 w=0.002           GEO      64.92    1562.07    0.00389652   132.34   6340.05
                             SELF-S   130.61   6484.46    0.150939     308.49   16866.7
DSRED g=0.5 mu=0.5 w=0.002   GEO      54.65    1077.13    0.00455001   111.8    4385.54
                             SELF-S   89.53    3703.73    0.168627     218.26   10063.2
RED mu=0.25 w=0.002          GEO      199.84   82.33      0.500013     803.35   3731.58
                             SELF-S   169.86   5056.31    0.551741     760.58   34059.4
DSRED g=0.5 mu=0.25 w=0.002  GEO      150.01   131.036    0.500066     604.09   3910.51
                             SELF-S   136.21   3962.95    0.55552      617.071  31355
RED mu=0.5 w=0.07            GEO      64.43    1504.8     0.00390818   131.375  6109.14
                             SELF-S   123.79   5570.37    0.151645     293.89   13634.1
DSRED g=0.15 mu=0.5 w=0.07   GEO      53.07    977.58     0.004675     108.63   3985.63
                             SELF-S   76.1     2202.89    0.175138     186.39   4924.46
DSRED g=0.5 mu=0.5 w=0.07    GEO      54.87    1018.47    0.00457423   110.67   4150.01
                             SELF-S   83.01    2628.08    0.170974     202.25   6025.25
DSRED g=0.85 mu=0.5 w=0.07   GEO      57.91    1182.4     0.00426957   118.314  4811.09
                             SELF-S   100.03   3731.86    0.162186     240.73   8821.86
RED mu=0.25 w=0.07           GEO      199.75   3.47       0.500006     803      2467.85
                             SELF-S   163.2    4497.22    0.55187      735.69   25134.7
DSRED g=0.15 mu=0.25 w=0.07  GEO      129.41   24.57      0.499999     521.63   1958.88
                             SELF-S   109.52   2313.9     0.559453     501.39   14030.4
DSRED g=0.5 mu=0.25 w=0.07   GEO      150      40.05      0.499999     603.98   2554.41
                             SELF-S   130.26   3100.71    0.555498     590.9    18972.3
DSRED g=0.85 mu=0.25 w=0.07  GEO      170.59   24.58      0.499999     686.33   2453.79
                             SELF-S   144.43   3642.48    0.554022     652.57   21408.3
The change of the wq value (from 0.07 to 0.002) in the computation of the moving average results in a longer response time and a longer mean queue – see Table 1 – but the introduction of drop-from-front in place of tail-drop changes the results by only about 1%. A comparison of queueing time distributions in these cases is given in Fig. 4 (left – RED, right – DSRED). In the case of heavy traffic, for both the RED and DSRED mechanisms, irrespective of the wq value and of the traffic self-similarity, the drop-from-front strategy gives mean queueing times that are about two times shorter. Queueing time distributions for all considered cases are presented in Figs. 5, 6, 7.
6

Conclusions
The drop-from-front strategy, when applied in place of the tail-drop one, results in a reduction of the mean queueing time in the RED/DSRED mechanisms of active queue management. In the case of light load, the difference is more visible for self-similar traffic. In the case of heavy load, the difference is also substantial for short-range dependent traffic. Hence the application of the drop-from-front strategy in AQM mechanisms may be recommended for connections with real-time requirements, even if the quantitative results depend on the distribution of the packet size and thus may differ slightly from those presented here.
Acknowledgements. This research was financed by the Polish Ministry of Science and Higher Education, project no. N517 025 31/2997, and supported by the European Network of Excellence EuroFGI (Future Generation Internet).
References
1. Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., Zhang, L.: Recommendations on queue management and congestion avoidance in the Internet. RFC 2309, IETF (1998)
2. Floyd, S.: RED with drop from front (1998), ftp://ftp.ee.lbl.gov/email/sf.98mar11.txt
3. Floyd, S., Jacobson, V.: Random early detection gateways for congestion avoidance. IEEE/ACM Transactions on Networking 1(4), 397–413 (1993)
4. Floyd, S.: Discussions of setting parameters (1997), http://www.icir.org/floyd/REDparameters.txt
5. Zheng, B., Atiquzzaman, M.: A framework to determine the optimal weight parameter of RED in next generation Internet routers. Technical report, The University of Dayton, Department of Electrical and Computer Engineering (2000)
6. May, M., Bonald, T., Bolot, J.: Analytic evaluation of RED performance. In: IEEE Infocom 2000, Tel-Aviv, Israel (2000)
7. Feng, W.C., Kandlur, D.D., Saha, D.: Adaptive packet marking for maintaining end to end throughput in a differentiated services Internet. IEEE/ACM Transactions on Networking 7(5), 685–697 (1999)
8. May, M., Diot, C., Lyles, B., Bolot, J.: Influence of active queue management parameters on aggregate traffic performance. Research report, Institut de Recherche en Informatique et en Automatique (2000)
9. Zheng, B., Atiquzzaman, M.: Low pass filter/over drop avoidance (LPF/ODA): an algorithm to improve the response time of RED gateways. Int. Journal of Communication Systems 15(10), 899–906 (2002)
10. Hassan, M., Jain, R.: High Performance TCP/IP Networking. Pearson Education Inc., London (2004)
11. Zheng, B., Atiquzzaman, M.: DSRED: a new queue management scheme for next generation networks. In: The 25th Annual IEEE Conference on Local Computer Networks, pp. 242–251. IEEE Computer Society Press, Los Alamitos (2000)
12. Zheng, B., Atiquzzaman, M.: Improving performance of active queue management over heterogeneous networks. In: ICC 2001: International Conference on Communications, pp. 2375–2379 (2001)
13. Liu, C., Jain, R.: Improving explicit congestion notification with the mark-front strategy. Computer Networks 35(2–3), 185–201 (2001)
14. Stallings, W.: High Speed Networks, TCP/IP and ATM Design Principles. Prentice Hall, Upper Saddle River, NJ (1998)
15. Willinger, W., Leland, W.E., Taqqu, M.S.: On the self-similar nature of Ethernet traffic. IEEE/ACM Transactions on Networking (1994)
16. Abry, P., Flandrin, P., Taqqu, M., Veitch, D.: Wavelets for the analysis, estimation and synthesis of scaling data. In: Park, K., Willinger, W. (eds.) Self-similar Network Traffic Analysis and Performance Evaluation (1999)
17. Paxson, V., Floyd, S.: Wide area traffic: the failure of Poisson modeling. IEEE/ACM Transactions on Networking 3 (1995)
18. Feldman, A., Gilbert, A., Huang, P., Willinger, W.: Dynamics of IP traffic: a study of the role of variability and the impact of control. In: ACM SIGCOMM'99, Cambridge (1999)
19. Robert, S.: Modélisation Markovienne du Trafic dans les Réseaux de Communication. PhD thesis, Ecole Polytechnique Fédérale de Lausanne, Nr 1479 (1996)
20. Robert, S., Le Boudec, J.-Y.: New models for pseudo self-similar traffic. Performance Evaluation 30(1-2), 57–68 (1997)
21. Czachórski, T., Domańska, J.: Markovian models for long-range dependent traffic. Archiwum Informatyki Teoretycznej i Stosowanej 13(3), 297–308 (2001)
22. Domańska, J.: Procesy Markowa w modelowaniu natężenia ruchu w sieciach komputerowych. PhD thesis, IITiS PAN, Gliwice (2005)
23. Stewart, W.: An Introduction to the Numerical Solution of Markov Chains. Princeton University Press, Princeton, NJ (1994)
TCP Congestion Control over 3G Communication Systems: An Experimental Evaluation of New Reno, BIC and Westwood+

Luca De Cicco and Saverio Mascolo

Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Via Re David 200, Italy {ldecicco,mascolo}@poliba.it
Abstract. One of TCP's key tasks is to react to and avoid the network congestion episodes which normally arise in packet-switched networks. A wide literature is available concerning the behaviour of congestion control algorithms in many different scenarios, and several congestion control algorithms have been proposed in order to improve performance in specific scenarios. In this paper we focus on the UMTS wireless scenario and report a campaign of measurements that involved around 3000 flows and more than 40 hours of measurements using three different TCP stacks: TCP NewReno, which is the congestion control algorithm standardized by the IETF; TCP BIC, which is the default congestion control algorithm adopted by the Linux operating system; and TCP Westwood+, also available in the Linux kernel. The experimental evaluation has been carried out by accessing the public Internet using a UMTS card. Measurements of goodputs, RTTs over time, packet loss ratios, numbers of timeouts and Jain Fairness Indices are reported through cumulative distribution functions. Moreover, the efficiency of each TCP version in transferring files has been evaluated by varying the file size in the range from 50 KB up to 500 KB. The cumulative distribution functions reported in the paper show interesting results: 1) a single downlink flow is far from saturating the channel bandwidth; 2) the considered TCP stacks provide similar results; 3) the 90th (50th) percentile of the goodput of a single downlink flow is less than or equal to 230 kbps (120 kbps), compared to a nominal 384 kbps UMTS downlink channel. Keywords: TCP, congestion control, 3G, UMTS.
1
Introduction
Since 2001, the year that the first commercial Universal Mobile Telecommunications System (UMTS) network was deployed by NTT DoCoMo in Japan, many telecom operators have launched UMTS access to subscribers. As many
We would like to thank Fabio Ricciato for allowing us to use a server at FTW. This work has been partially supported by the MIUR-PRIN project no. 2005093971 "FAMOUS Fluid Analytical Models Of aUtonomic Systems" and by Financial Tradeware S.r.l..
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 73–85, 2007. c Springer-Verlag Berlin Heidelberg 2007
74
L. De Cicco and S. Mascolo
3G networks are emerging, it is important to evaluate how different Transmission Control Protocol (TCP) congestion control mechanisms behave in such networks. The UMTS network provides wide-area wireless Internet access with downlink speeds up to 384 kbps and round trip times in the order of 300 ms, thus providing a viable solution for multimedia and for Voice over IP (VoIP) applications. It is known that the efficiency of TCP as a transport protocol degrades when lossy links, such as wireless links, are present in the routing path [6][4]. For this reason, the UMTS link layer implements the Radio Link Control (RLC) protocol, which masks the lossy channel to upper layers through retransmissions. In this way, in-order packet delivery and a loss probability of less than 1% are provided [1]. The reliability of the link layer comes at the cost of a highly variable segment delay as seen at the transport layer when frames are retransmitted at the link layer, possibly leading to spurious timeouts [10]. The impact of spurious timeouts has been studied extensively and has often been considered one of the major causes of TCP throughput degradation. However, against this common belief, the authors of [5] have found that, in the case of a well-designed UMTS network and in the static scenario, the number of spurious timeouts is very low, thus having a negligible impact on TCP throughput. Another issue raised by the UMTS link layer is the variability of the available bandwidth, which is caused by the channel state scheduling. In [1] the authors address both rate and delay variability, showing the negative effects of delay variability on the throughput achieved by TCP. In this paper we present the results obtained through an extensive campaign of measurements over a live UMTS network. Both downlink and uplink measurements of goodputs, round trip times (RTT), queuing times, packet loss ratios, numbers of timeouts and Jain Fairness Indices (JFI) have been collected.
We have considered three different TCP stacks that are available in the Linux kernel: TCP NewReno [12], which is the congestion control algorithm standardized by the Internet Engineering Task Force (IETF); TCP BIC [16], which is the default congestion control algorithm adopted by the Linux operating system; and TCP Westwood+, which has been proposed in [13,11] and is also available in the Linux kernel. The rest of the paper is organized as follows: in Section 2 we summarize prior work on live UMTS measurements and briefly describe the considered congestion control algorithms; in Section 3 we describe the experimental testbed; Section 4 reports the experimental results, whereas a discussion of the results is given in Section 5. Finally, Section 6 draws the conclusions.
2

Related Work

2.1

Live UMTS Network Performance Evaluation
The academic literature contains a plethora of simulation studies about TCP performance over UMTS and GPRS, but very few papers have addressed the performance evaluation of a live UMTS network.
In [9] the authors report results obtained from an experimental investigation carried out in an early deployment of two 3G networks in near-ideal conditions, i.e. when no other user was accessing the network. The paper notices that the UMTS link is affected by very few spurious timeouts and that the employment of the Eifel algorithm did not provide any throughput improvement. In [7] the authors report goodput measurements obtained by accessing the public 3G/UMTS network to transfer files of different sizes; the paper focuses only on the downlink and does not report any data regarding RTT variability, the typical number of timeouts, or fairness indices. The authors of [8] provide IP- and TCP-level measurements of the UMTS downlink and uplink in both static and mobile scenarios. The paper focuses on finding the optimal settings for the MSS and the initial receiver window; throughput measurements are given for a limited number of considered scenarios.

2.2

TCP Congestion Control Algorithms
One of the most important tasks that TCP addresses is regulating the sending rate in order to avoid network congestion. In [14] Van Jacobson proposed a solution to the network congestion control problem that mainly consists of two distinct phases: a probing phase and a decreasing phase (the well-known Additive Increase Multiplicative Decrease – AIMD – paradigm [2]). During the probing phase the link capacity is probed using an exponential growth law, called slow start, or a linear growth law, called congestion avoidance. The congestion control algorithm switches from the probing phase to the decreasing phase when three duplicate acknowledgments (3DUPACK) or a timeout are experienced, indicating that a congestion event has taken place. During the decreasing phase the congestion window cwnd is multiplicatively decreased in reply to the congestion episode. During the congestion avoidance phase of TCP NewReno the congestion window is increased by one packet for each RTT, whereas in the slow start phase cwnd is doubled each RTT. The congestion window is halved after a congestion episode, whereas when a timeout is experienced cwnd is set to 1 segment and the slow start phase takes place. TCP Binary Increase Congestion Control (BIC) [16] is made of two parts: the binary search increase phase and the additive increase phase. In the binary search phase the congestion window setting is performed as a binary search problem. After a packet loss the congestion window is reduced by a constant factor b, cwnd_max is set to the window size before the loss, and cwnd_min is set to the value of the congestion window after the loss (cwnd_min = b · cwnd_max).
If the difference between the congestion window middle point (cwnd_max + cwnd_min)/2 and the minimum congestion window is lower than a threshold S_max, the protocol starts a binary search algorithm, increasing the congestion window to the middle point; otherwise the protocol enters the "linear increase" phase and increments the congestion window by one for each received ACK. If BIC does not get a loss indication at this window size, then the actual window size becomes the new minimum window; otherwise, if it gets a packet loss, the actual window size
76
L. De Cicco and S. Mascolo
becomes the new maximum. The process goes on until the window increment becomes lower than the S_min threshold, and the congestion window is then set to cwnd_max. If the window grows beyond cwnd_max, the protocol enters a new phase (max probing) that is symmetric to the previous one; that is, it uses the inverse of the binary search phase first and then the additive increase. TCP Westwood+ [11,13] is a sender-side modification of TCP NewReno in which the multiplicative decreasing phase is replaced by an adaptive decreasing phase. In particular, after a congestion episode cwnd is set such that the bandwidth available at the time of congestion is exactly matched. The available bandwidth is estimated by counting and averaging the stream of ACK packets. In particular, when three DUPACKs are received, the congestion window cwnd is set equal to the estimated bandwidth (BWE) times the minimum measured round trip time (RTT_min). After a timeout the slow start threshold is set equal to BWE · RTT_min and cwnd is set to one.
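The loss responses described above can be contrasted in a few lines. This is a simplified, per-event sketch: the constants b = 0.8 and S_max = 32 segments are illustrative placeholders, not values mandated by the cited papers, and per-ACK growth is collapsed into single steps:

```python
def newreno_after_3dupack(cwnd):
    # NewReno: multiplicative decrease -- halve the window.
    return cwnd / 2.0

def bic_after_loss(cwnd, b=0.8):
    # BIC: cwnd_max is the window before the loss,
    # cwnd_min = b * cwnd_max is the window after the reduction.
    return b * cwnd, cwnd  # (cwnd_min, cwnd_max)

def bic_growth_step(cwnd_min, cwnd_max, s_max=32.0):
    # Binary-search increase: jump to the midpoint when it is within
    # S_max of cwnd_min, otherwise take a "linear increase" step.
    mid = (cwnd_min + cwnd_max) / 2.0
    return mid if mid - cwnd_min < s_max else cwnd_min + s_max

def westwood_after_3dupack(bwe, rtt_min):
    # Westwood+: adaptive decrease -- set cwnd to match the estimated
    # bandwidth BWE over the minimum measured round trip time.
    return bwe * rtt_min
```

The qualitative difference is that NewReno's cut is blind, BIC's cut defines the bracket for its subsequent search, and Westwood+'s cut is driven by the bandwidth estimate rather than by a fixed factor.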
3
Experimental Testbed
In order to carry out our experiments we set up a machine at the FTW research center in Vienna, installing a Linux kernel with Web100 support. The TCP flows have been generated and received using a modified version of iperf [15], which uses the libnetmeas [3] library that we developed in order to automatically obtain instantaneous values of internal kernel variables such as cwnd, RTT, ssthresh and timeouts. We found out that the telecom operator uses a transparent proxy that probably implements some sort of split-stream solution, so that when a user downloads a file for the first time, the file is cached on a proxy located in the telecom operator network. This way the second download provides better results, since its path is shorter than that of the first download. For this reason we have not used an HTTP server as the sender and an HTTP client (such as wget) as the receiver. The scenarios and the testbed used in this investigation are aimed at characterizing both the uplink and downlink UMTS channels in the most common user scenarios. The user equipment (UE) is a Nokia 6630 mobile phone connected via USB 2.0 to a laptop located at the C3Lab, Politecnico di Bari (Italy), accessing the public UMTS network using a commercial card provided by a local mobile operator (Figure 1).
Fig. 1. Experimental Testbed
TCP Congestion Control over 3G Communication Systems
77
The UE was static and accessed the UMTS network in an indoor environment, so that handovers could not occur during measurements. The nominal value declared by the telecom operator for the downlink (uplink) channel is 384 kbps (64 kbps). Based on the results obtained in [8], we fixed the maximum segment size to 1500 bytes. Moreover, we set the initial receiver advertised window to the default value of 64 Kb, which is well above the bandwidth-delay product, so that we are sure that the bottleneck is located in the UMTS network. For each connection we collected a very rich set of measurements including goodputs, RTTs, queuing times, numbers of timeouts and retransmission ratios. We evaluated the Jain Fairness Index as follows [2]:
JFI = (Σ_{i=1}^{N} x_i)^2 / (N Σ_{i=1}^{N} x_i^2)
where x_i is the mean goodput obtained by the i-th flow accessing the downlink or the uplink. We evaluated the TCP congestion control algorithms in the following scenarios: i) one flow over the UMTS downlink or uplink; ii) two or four homogeneous flows sharing the UMTS downlink or uplink; iii) short file transfers of 50 KB, 100 KB, 200 KB and 500 KB files on both the UMTS downlink and uplink. For each scenario we ran experiments using the Linux kernel implementations of TCP BIC, TCP NewReno and TCP Westwood+ [11,13]. All tests, except those for the short file transfer scenario, lasted approximately 100 seconds each, adding up to more than 40 hours of active measurements of the downlink and uplink UMTS channels. In order to perform a fair comparison, we ran tests by rotating TCP congestion control algorithms and scenarios and repeating tests on different days and at different hours of the day.
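The Jain Fairness Index defined above can be computed directly from the per-flow mean goodputs (plain Python, nothing assumed beyond the formula):

```python
def jain_fairness_index(goodputs):
    # (sum x_i)^2 / (N * sum x_i^2): equals 1 when all flows obtain the
    # same goodput and 1/N when a single flow monopolizes the link.
    n = len(goodputs)
    total = sum(goodputs)
    total_sq = sum(x * x for x in goodputs)
    return total * total / (n * total_sq)
```

For two flows at 120 kbps each the index is exactly 1; a mildly unequal split such as 90 and 100 kbps still scores above 0.99, so values like the 0.86 reported later for four flows indicate a noticeable imbalance.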
4
Experimental Results
We have performed an extensive experimental evaluation of the three congestion control algorithms by collecting measurements of around three thousand flows over more than 40 hours of active measurements. In this section we report the results collected for both downlink and uplink.

4.1

Goodput, Link Utilization and Fairness
The case of downlink flows. Figures 2 (a), (b) and (c) report the cumulative distribution functions of the results obtained for the cases of one, two and four flows, respectively, sharing the UMTS downlink, whereas mean goodputs and standard deviations are summarized in Table 1. It is noteworthy that the three congestion control protocols provide similar results in all cases, the only remarkable difference being the case of the single flow (Figure 2 (a)), where TCP NewReno obtains 12% and 8% goodput
78
L. De Cicco and S. Mascolo
Goodput CDF − 1 flow downlink
Goodput CDF − 2 flows downlink
1 0.9
0.9 0.8
0.7
0.7
0.6
0.6 CDF
CDF
0.8
1 TCP Westwood+ TCP Bic TCP NewReno
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
0.1
0.1
0 0
50
100
150 200 goodput (kbps)
250
300
0 0
350
TCP Westwood+ TCP Bic TCP NewReno 50
100 150 goodput (kbps)
(a)
200
250
(b) downlink utilization
Goodput CDF − 4 flows downlink
1
1 0.9
0.9
0.8
0.8 Link utilization
0.7
CDF
0.6 0.5 0.4 0.3
0.6 0.5
0.2
TCP Westwood+ TCP Bic TCP NewReno
0.1 0 0
0.7
50
100 150 goodput (kbps)
200
250
TCP Westwood+ TCP Bic TCP NewReno
0.4
1
2
4 Number of Flows
(c)
(d)
Fig. 2. Cumulative distribution function of the goodput measured in the case of 1 flow (a), 2 flows (b), 4 flows (c) sharing the UMTS downlink; bandwidth utilization (d)
Table 1. Average and standard deviation values (in kbps) of goodput for the UMTS downlink channel

         New Reno                BIC                     Westwood+
#Flows   E[x]    σ(x)   Util.    E[x]    σ(x)   Util.    E[x]    σ(x)   Util.
1        122.03  65.38  31.8%    133.98  69.41  34.9%    138.58  77.10  36.1%
2        86.08   36.20  44.8%    85.82   40.51  44.7%    90.41   46.12  47.1%
4        80.13   38.53  83.5%    83.83   30.68  87.3%    82.46   40.60  85.9%
less than TCP Westwood+ and TCP Bic, respectively. The median values in the case of one, two and four flows are in the ranges [101, 121], [76, 87] and [71, 78] kbps, respectively, whereas the upper 10th percentile experiences bandwidth in the ranges [220, 249], [128, 146] and [121, 139] kbps, respectively. Figure 2 (d) shows the downlink utilization, computed by averaging the goodputs of all the experiments performed for each considered protocol when the number of flows sharing the link is one, two or four. It is worth noting that all three tested TCP variants provide less than 40% bandwidth
TCP Congestion Control over 3G Communication Systems
utilization in the single flow case. As the number of flows sharing the bottleneck increases, the utilization reaches about 90% in the four flows case. We will return to this topic in Section 5, where we discuss possible reasons for the very low link utilization in the one flow case. Finally, it is worth noticing that the goodputs reported by Catalan et al. [8] in the single flow case, averaged over three experiments, are similar to our results. Moreover, concerning the fairness indices, we have found that the considered TCP variants provide similar average values of the JFI in both the two and four flows cases: around 0.94 for the two flows case and around 0.86 for the four flows case.

The case of uplink flows. The evaluation of the UMTS uplink channel has led to the results depicted in Figure 3 (a). Also in the uplink scenarios the goodputs obtained by the considered TCP congestion control algorithms are very similar, with TCP Bic performing slightly better in the single flow case.
Fig. 3. Goodput cumulative distribution function for the uplink scenario (a): 1 flow in solid lines, 2 flows in dashed lines and 4 flows in dot-dashed lines. Link utilization (b)
In the case of uplink flows, Table 2 shows that the standard deviation of goodput is very low for all the considered TCP stacks in all cases. The median values in the case of one, two and four flows are respectively in the ranges [53, 55], [24, 26] and [14, 15] kbps, whereas the upper 10th percentile experiences bandwidths respectively in the ranges [57, 59], [31, 33] and [17, 20] kbps. Unlike the downlink case, the tested protocols provided nearly full uplink utilization even in the single flow case, as shown in Figure 3 (b). In the uplink scenario the Jain Fairness Index is very high for all the considered TCP stacks, being around 0.99 for the two flows case and around 0.95 for the four flows case.
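The fairness values quoted above follow Jain's index, JFI = (Σ xi)² / (n · Σ xi²), computed over the per-flow mean goodputs xi. A minimal sketch (the sample goodputs below are illustrative, not the paper's measurements):

```python
def jain_fairness_index(goodputs):
    """Jain's Fairness Index: (sum x_i)^2 / (n * sum x_i^2).

    Ranges from 1/n (one flow takes everything) to 1.0 (perfectly fair).
    """
    n = len(goodputs)
    total = sum(goodputs)
    return total * total / (n * sum(x * x for x in goodputs))

# Equal shares are perfectly fair; a skewed split lowers the index.
print(jain_fairness_index([29.4, 29.4]))  # ~ 1.0
print(jain_fairness_index([50.0, 10.0]))  # ~ 0.69
```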
Table 2. Average and standard deviation values (in kbps) of goodput for the UMTS uplink channel

         New Reno               BIC                    Westwood+
#Flows   E[x]   σ(x)  Util.     E[x]   σ(x)  Util.     E[x]   σ(x)  Util.
1        56.44  2.09  88.2%     57.51  2.11  89.9%     55.62  1.74  86.9%
2        29.40  3.95  91.9%     29.44  2.31  92.0%     29.32  2.19  91.6%
4        14.78  4.08  92.4%     14.82  2.81  92.6%     14.72  2.51  92.02%

4.2 RTT and Queuing Time
The case of downlink flows. Figures 4 (a), (b) and (c) show the cumulative distribution functions of the round trip times (RTT) and the queuing times, hereinafter denoted tq, for the downlink channel, while Table 3 collects the mean values, RTT standard deviations and average queuing times for the considered scenarios.
Fig. 4. RTT (solid lines) and queuing time (dashed lines) cumulative distribution function in the case of 1 flow (a), 2 flows (b) and 4 flows (c) sharing the UMTS downlink
Table 3. Average and standard deviation values (in ms) of RTT for the UMTS downlink channel, and average queuing times (in ms)

         New Reno                 BIC                      Westwood+
#Flows   E[RTT]  σ(RTT)  E[tq]   E[RTT]  σ(RTT)  E[tq]    E[RTT]  σ(RTT)  E[tq]
1        1550    1096    1248    1457    897     1137     1469    1110    1125
2        1297    624     953     1369    691     1024     1219    508     873
4        1338    488     995     1159    230     825      1102    221     765

(The round trip time is defined as the sum of a propagation time, which can be evaluated as the minimum RTT, and the queuing time tq.)
Even though the minimum RTT experienced in our evaluation is around 300 ms, the measured queuing times are very high and do not depend on the number of flows, suggesting excessive queuing even in the single flow case. TCP NewReno exhibits a slightly inflated RTT, around 100 ms higher in the one flow case, due to larger queuing time. In all evaluated scenarios, TCP Westwood+ provides lower queuing time than the other two algorithms. This result is consistent with the distinctive feature of Westwood+ of clearing the queue after a congestion episode.

The case of uplink flows. Figures 5 (a), (b) and (c) show the round trip time and queuing time cumulative distribution functions in the case of uplink flows, while Table 4 collects the values of mean RTT, RTT standard deviation and average queuing time. Similarly to the downlink scenario, TCP Westwood+ exhibits lower queuing time, except in the four flows case where TCP NewReno provides slightly lower queuing time. Unlike the downlink case, the RTT and queuing time tend to increase with the number of concurrent flows.
Fig. 5. RTT (solid lines) and queuing time (dashed lines) cumulative distribution function in the case of 1 flow (a), 2 flows (b) and 4 flows (c) sharing the UMTS uplink

Table 4. Average and standard deviation values (in ms) of RTT for the UMTS uplink channel, and average queuing times (in ms)

         New Reno                 BIC                      Westwood+
#Flows   E[RTT]  σ(RTT)  E[tq]   E[RTT]  σ(RTT)  E[tq]    E[RTT]  σ(RTT)  E[tq]
1        1471    113.2   927     1452    113.2   904      1368    87.1    828
2        1887    80.2    1416    1998    104.3   1521     1869    79.5    1364
4        2222    72.7    1556    2276    70.7    1621     2301    78.7    1587

4.3 Timeouts and Packet Retransmission Percentage
Table 5 summarizes the average number of timeouts obtained on the UMTS downlink and uplink in the considered scenarios. The results obtained in the
Table 5. Average number of timeouts for downlink and uplink (the duration of the connection is 100 s)

         New Reno            BIC                 Westwood+
#Flows   Downlink  Uplink    Downlink  Uplink    Downlink  Uplink
1        5.84      0.15      4.82      0.13      5.27      0.15
2        6.44      0.04      5.43      0.03      6.20      0.06
4        5.35      0.45      5.24      0.35      5.80      0.43
Table 6. Average retransmission percentage (%) and standard deviation in the case of downlink and uplink

         New Reno                      BIC                           Westwood+
         Downlink      Uplink         Downlink      Uplink          Downlink      Uplink
#Flows   E[p]   σ(p)   E[p]   σ(p)    E[p]   σ(p)   E[p]   σ(p)     E[p]   σ(p)   E[p]   σ(p)
1        7.13   4.89   0.04   0.12    6.75   2.97   0.03   0.10     7.79   6.05   0.03   0.10
2        7.96   5.28   0.02   0.12    7.97   4.41   0.02   0.12     10.12  6.86   0.02   0.10
4        10.95  5.85   0.40   0.98    10.05  3.77   0.23   0.52     11.69  4.57   0.27   0.55
downlink and uplink scenarios are very different and may explain the very different link utilizations reported in Section 4.1. In the downlink case the flows suffer around five timeouts throughout the 100 s duration of the connection, regardless of the number of flows and the congestion control algorithm used. In the uplink case, on the other hand, the number of timeouts is negligible. Due to space limitations we cannot show the cumulative distribution function of packet loss ratios, but we report the average value and the standard deviation in Table 6. The values reported in the table show no remarkable difference among the considered TCP congestion control algorithms. In the case of downlink flows, the retransmission percentage increases with the number of flows and is below 11%, whereas in the case of uplink flows the fraction of retransmitted packets is less than 1%.

4.4 Goodput Versus File Size
In this section we investigate the impact of the file size on TCP goodput. We have collected goodput measurements for file sizes ranging from 50 KB to 500 KB, in order to find out whether the slow-start phase degrades goodput in the case of small file transfers. Figures 6 (a) and (b) show results for the downlink and uplink cases, respectively. In the case of downlink flows, all considered TCP variants provide similar results and the goodput is essentially constant
Fig. 6. Goodput vs file size for (a) downlink and (b) uplink channel
when the file size increases, showing a maximum at 100 KB in the case of TCP NewReno and TCP Westwood+. Also in the case of uplink flows the goodput is essentially constant, increasing by only about 10 kbps when the file size grows from 50 KB to 500 KB. Thus, we can conclude that the slow-start phase does not remarkably affect the goodput on either the downlink or the uplink channel.
5 Discussion of Results

In the previous sections we have reported the results of our extensive UMTS evaluation, finding that the performance of the considered TCP congestion control algorithms is comparable in all the evaluated scenarios. Furthermore, we have found that in the case of uplink flows there are no remarkable issues, since all TCP variants provide satisfactory channel utilization and exhibit a very low number of timeouts. On the contrary, Section 4.1 has shown that in the downlink scenario all TCP variants provided a very low utilization of the UMTS link in the single flow case. To gain insight into the reason for the poor link utilization, let us consider the number of timeouts in the case of downlink flows, as summarized in Table 5. It can be seen that the number of timeouts is roughly constant and does not depend on the number of flows. This observation suggests that timeouts are not due to congestion; otherwise we would expect to measure more timeouts in the case of multiple flows over the downlink. Thus, we argue that the poor downlink utilization in the single flow case is mostly due to the high number of timeouts, which imposes an upper bound on the achievable goodput. In fact, the inflated RTT values due to large buffering, already discussed in Section 4.2, make the retransmission timeout longer, implying a very long time spent recovering from timeout events [10]. Using a simple argument, the average time spent resolving timeouts is the product of the average number of timeouts N_tout that affect the connection and the average retransmission timeout value T0:

    E[t_tout] = E[N_tout] · E[T0] ≈ 35 s
which, considering that the connections last 100 s, amounts to 35% of the connection time. It is worth noticing that this value matches the link utilization shown in Section 4.1 (see Figure 2 (d)).
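This estimate can be sanity-checked with the downlink timeout counts of Table 5 and an assumed average retransmission timeout; T0 is not reported in the paper, so the ~6.5 s value below is our assumption, consistent with the inflated RTTs of Table 3:

```python
# Back-of-envelope check of E[t_tout] = E[N_tout] * E[T0].
n_tout = 5.4        # rough average downlink timeout count per 100 s (Table 5)
t0 = 6.5            # assumed average retransmission timeout in seconds (not in the paper)
connection_s = 100.0

t_tout = n_tout * t0
share = 100.0 * t_tout / connection_s
print(f"~{t_tout:.0f} s spent in timeouts, i.e. ~{share:.0f}% of the connection time")
```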
6 Conclusions

In this paper an extensive TCP performance evaluation over a live UMTS network has been reported, measuring both downlink and uplink performance indices for three different TCP congestion control algorithms. The main findings can be summarized as follows: (i) the considered TCP congestion control algorithms performed similarly in both downlink and uplink scenarios; (ii) the UMTS uplink channel did not exhibit any remarkable issues, providing good channel utilization and a very low number of timeouts and packet retransmissions; (iii) a very high number of timeouts has been observed in the downlink measurements that does not seem to be caused by congestion; (iv) the UMTS downlink channel utilization is poor in the single flow case because of the joint effect of the very high number of timeouts and the RTT inflated by queuing.
References

1. Chan, M.C., Ramjee, R.: TCP/IP Performance over 3G Wireless Links with Rate and Delay Variation. Wireless Networks 11(1), 81–97 (2005)
2. Chiu, D.M., Jain, R.: Analysis of the increase and decrease algorithms for congestion avoidance in computer networks. Computer Networks and ISDN Systems 17(1), 1–14 (1989)
3. De Cicco, L.: Libnetmeas (2006)
4. Eckhardt, D.A., Steenkiste, P.: Improving wireless LAN performance via adaptive local error control. In: ICNP, pp. 327–338 (1998)
5. Vacirca, F., et al.: An algorithm to detect TCP spurious timeouts and its application to operational UMTS/GPRS networks. Computer Networks 50(16), 2981–3001 (2006)
6. Balakrishnan, H., et al.: A comparison of mechanisms for improving TCP performance over wireless links. IEEE/ACM Transactions on Networking 5(6), 756–769 (1997)
7. Pentikousis, K., et al.: Active goodput measurements from a public 3G/UMTS network. IEEE Communications Letters 9(9), 802–804 (2005)
8. Catalan, M., et al.: TCP/IP analysis and optimization over a precommercial live UMTS network. In: Proc. IEEE WCNC'05, vol. 3 (2005)
9. Kohlwes, M., et al.: Measurements of TCP performance over UMTS networks in near-ideal conditions. In: Proc. VTC 2005-Spring (2005)
10. Ludwig, R., et al.: Multi-layer tracing of TCP over a reliable wireless link. In: Proc. ACM SIGMETRICS 1999, pp. 144–154 (1999)
11. Mascolo, S., et al.: TCP Westwood: Bandwidth estimation for enhanced transport over wireless links. In: Proc. ACM MOBICOM, pp. 287–297 (2001)
12. Floyd, S., Henderson, T.: RFC 2582: The NewReno Modification to TCP's Fast Recovery Algorithm. Internet RFCs (1999)
13. Grieco, L.A., Mascolo, S.: Performance evaluation and comparison of Westwood+, New Reno, and Vegas TCP congestion control. ACM SIGCOMM Computer Communication Review 34(2), 25–38 (2004)
14. Jacobson, V.: Congestion avoidance and control. In: ACM SIGCOMM '88, Stanford, CA, August 1988, pp. 314–329. ACM Press, New York (1988)
15. Tirumala, A., Ferguson, J.: Iperf (2001)
16. Xu, L., Harfoush, K., Rhee, I.: Binary increase congestion control (BIC) for fast long-distance networks. In: Proc. INFOCOM 2004, pp. 2514–2524 (2004)
Cross-Layer Enhancement to TCP Slow-Start over Geostationary Bandwidth on Demand Satellite Networks

Wei Koong Chai and George Pavlou
Centre for Communication Systems Research, University of Surrey, GU2 7XH, UK
{W.Chai, G.Pavlou}@surrey.ac.uk
Abstract. It is well-known that the transmission control protocol (TCP) does not perform well in wireless and satellite environments. We investigate the use of cross-layer design involving the transport and medium access control (MAC) layers in the context of a geostationary bandwidth-on-demand satellite network to simultaneously enhance TCP performance and to improve bandwidth utilization. In this paper, we focus on the slow-start phase of the connection. In essence, we create a bandwidth pipe between the two layers so that through cross-layer interactions, the TCP connections are aware of the satellite resources available to them, thus adjusting their congestion window accordingly. Our proposal includes minimal changes to the original protocol, allowing easier integration and inter-working with existing infrastructure. Our evaluation results show a shorter slow-start duration with better bandwidth utilization. Although the performance gain is higher in a lossy satellite link, we also found that it is dependent on the network load. Keywords: Cross-layer design, satellite network, bandwidth on demand, TCP.
1 Introduction

The majority of Internet traffic uses the transmission control protocol (TCP). TCP was first conceived with terrestrial wired networks in mind, utilizing a closed-loop probing approach to slowly detect and utilize the available resources on the end-to-end route. The protocol relies on feedback from the receiver, in the form of ACK packets, to gradually increase the sending rate. Classical TCP uses the additive increase, multiplicative decrease (AIMD) algorithm, which increases the sending rate linearly and decreases it multiplicatively in the event of losses, which are assumed to be caused by congestion. In essence, it is a conservative protocol engineered to avoid severe Internet congestion. A TCP connection consists of four phases: slow-start, congestion avoidance, fast retransmit and fast recovery [1]. Each connection starts with a slow-start phase and proceeds to the congestion avoidance phase. The fast retransmit and fast recovery phases are invoked to combat network problems such as TCP segment loss or data re-ordering. For long-lived TCP connections, which mainly operate in the congestion avoidance phase, the effect of the slow-start phase may be negligible. On the contrary, for short-lived connections, TCP performance is highly dependent on the slow-start phase.

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 86–98, 2007. © Springer-Verlag Berlin Heidelberg 2007
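The AIMD rule described above can be sketched as a simple per-event window update (a textbook simplification with illustrative names, not the exact behavior of any TCP implementation):

```python
def aimd_update(cwnd, event, alpha=1.0, beta=0.5):
    """Classical TCP AIMD: add alpha per loss-free RTT, multiply by beta on loss."""
    if event == "rtt_ok":                # a full RTT of ACKs, no loss
        return cwnd + alpha              # additive (linear) increase
    if event == "loss":                  # loss interpreted as congestion
        return max(1.0, cwnd * beta)     # multiplicative decrease
    return cwnd

cwnd = 10.0
cwnd = aimd_update(cwnd, "rtt_ok")  # -> 11.0
cwnd = aimd_update(cwnd, "loss")    # -> 5.5
print(cwnd)
```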
Although TCP performs satisfactorily in wired networks, the proliferation of satellite networks as an important component of the global information infrastructure presents specific challenges to TCP. The operating conditions and characteristics of satellite radio links differ considerably from those of the wired links for which TCP was first conceived. First, segment losses can no longer be assumed to indicate congestion, since random losses caused by channel errors may not be negligible on satellite links. For instance, the channel quality of a Ka-band satellite link is especially susceptible to atmospheric events such as rain. Second, the long propagation delay of satellite links greatly lengthens the feedback loop of the TCP protocol. Since TCP adjusts its sending rate per reception of an ACK, the longer the round trip time (RTT), the slower it reacts to the current condition of the link. The problem is evident in a geostationary (GEO) satellite system, which has approximately 560 ms round-trip propagation delay, causing TCP to increase its sending rate only once in more than half a second even on a lightly loaded link. The situation worsens when there are multiple packet losses. From a satellite operator's perspective, this is highly inefficient: satellite bandwidth is an expensive commodity which should be efficiently utilized, and TCP should send data at the right rate so that the link is optimally utilized without causing congestion. In this paper, we focus on the slow-start phase and propose a cross-layer mechanism between the transport and medium access control (MAC) layers to enhance TCP performance while improving bandwidth utilization in the context of a bandwidth-on-demand (BoD) GEO satellite system. Section 2 reviews the literature on TCP enhancements for satellite networks. We detail our reference system architecture in Section 3 and elucidate our cross-layer enhancement in Section 4.
The performance of our proposal is then evaluated under different operating conditions via simulation, and the results are presented in Section 5. We summarize and conclude our work, with discussions of its applicability and weaknesses, in Section 6.
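The cost of the long GEO feedback loop can be quantified: slow-start roughly doubles cwnd every RTT starting from one segment, so reaching a window of W segments takes about RTT · ⌈log2 W⌉. A rough estimate using the 560 ms delay mentioned above (the 512 kbps link and 500-byte segments match the simulation setup of Section 5; the helper name is ours):

```python
import math

def slow_start_fill_time(link_bps, rtt_s, seg_bytes=500):
    """Approximate time for slow-start (cwnd doubling each RTT, starting at
    one segment) to reach the link's bandwidth-delay product."""
    bdp_segments = link_bps * rtt_s / (8 * seg_bytes)  # window that fills the pipe
    return rtt_s * math.ceil(math.log2(bdp_segments))

# A 512 kbps GEO link with a 560 ms RTT holds ~72 segments of 500 bytes,
# so slow-start needs about 7 RTTs (~3.9 s) just to fill it.
print(slow_start_fill_time(512_000, 0.56))
```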
2 Enhancing TCP for Satellite Networks

Given the aforesaid problems, TCP has been recognized as inadequate for satellite networks. Over the years, various proposals have attempted to solve these problems, following three general approaches. The first tunes specific TCP parameters without modifying the original TCP procedures. Since it involves minimal changes to the intrinsic working of the protocol, it is usually readily deployable. Its drawbacks are that such proposals usually target a specific problem while ignoring others, and that the improvement is limited. An example of this approach is [2], where a larger initial window is proposed. The second approach intervenes in the original AIMD mechanism of the protocol. TCP Peach [3], TCP Westwood [4] and TCP Hybla [5] are examples following this vein. These TCP variants often cope with various link characteristics but introduce integration and inter-working questions. The third approach isolates the satellite domain via third-party proxies that split the TCP connections [6] and uses within this domain a different transport protocol specifically designed to suit the satellite environment. Interested readers are referred to [7][8] and the references therein.
Recent literature shows a new approach: cross-layer design, which involves increased interaction between different layers of the protocol stack. The rationale behind this approach is that the current OSI model does not cater for sufficient adaptability to wireless satellite networks [9]. Given the variability of the wireless link, it is beneficial to have the lower layers inform the upper ones of the current link conditions, so that the upper layers can adapt their behavior (e.g. sending rate) toward an optimal operating equilibrium. For example, in [10] the cross-layer approach is employed between TCP and the physical layer to assign a common bandwidth resource to TCP connections over a satellite channel under different fading conditions. A TCP-MAC cross-layer resource allocation scheme is proposed in [11] to reduce the average file transfer time and to achieve a fair sharing of resources among competing flows. Ref. [12] suggests the use of cross-layer interactions between transport and MAC layers in a split-connection scenario, where random early detection (RED) [13] is introduced at the MAC layer to provide congestion indication to TCP senders. The cross-layer approach is also utilized in [14] for enhancing the performance of TCP Westwood over satellite. We refer to the ETSI BSM (Broadband Satellite Multimedia) protocol stack shown in Fig. 1 [15], which separates the network layers into satellite dependent (SD) layers and satellite independent (SI) layers, connected via the Satellite Independent Service Access Point (SI-SAP). We exploit the fact that the SD layers can derive the exact available resources. Hence, if this information is communicated to the transport layer, the TCP sender no longer has to be conservative in increasing its sending rate: it can readily fill up the available bandwidth without causing congestion.
Application UDP
TCP
Other
IPV4 / IPV6 Satellite Independent Adaptation SI-SAP
Satellite Dependent
Satellite dependent Adaptation Satellite Link Control Satellite Medium Access Control Satellite Physical
Fig. 1. Satellite BSM protocol stack [15]
3 Bandwidth-on-Demand Geostationary Satellite System

Our reference system, shown in Fig. 2, resembles the Digital Video Broadcasting Return Channel via Satellite (DVB-RCS) architecture [16]. In view of the next generation Internet, we assume the resource management scheduler is onboard the satellite. The network control center (NCC) provides control and monitoring functionality. Users are represented by satellite terminals (STs). The traffic gateway (GW) provides connection to other domains (e.g. public and private providers). The satellite return link utilizes multi-frequency time division multiple access (MF-TDMA), and the basic capacity unit of an MF-TDMA frame is the timeslot (TS).
Fig. 2. Reference satellite network configuration
(Figure content: one BoD cycle between the ST BoD entity and the satellite BoD scheduler, spanning the propagation delay — (1) the ST estimates and sends a resource request (SR); (2) the scheduler updates its request buffer; (3) the scheduler allocates resources; (4) the ST processes the received BTP; (5) the ST activates the new BTP.)
Fig. 3. BoD timing diagram
For efficient use of the satellite resources, a BoD scheme is specified, composed of two stages: resource requests from the STs and resource allocation by the scheduler. Based on the incoming traffic, the STs estimate the resources required and send a slot request (SR) to the scheduler; the computation of the SR is derived from [17]. The scheduler then allocates the TSs based on these requests, constructs the burst time plan (BTP) that contains the allocation information, and broadcasts it to all STs. Fig. 3 illustrates the evolution of the cyclic BoD process. We enable Free Capacity Assignment (FCA), which distributes the slots left unassigned after all received SRs have been served, so as to achieve higher efficiency in the system. The FCA thus reflects the spare capacity that can be utilized by the TCP connections.
4 Cross-Layer Enhancement for TCP Slow-Start Phase

4.1 Problem Statement

A TCP connection begins without the knowledge of available capacity and deploys the slow-start algorithm to probe the network with a small initial congestion window
(cwnd), which governs the TCP send rate. The value of cwnd is increased per reception of an ACK by

    cwnd = cwnd + 1

The connection exits the slow-start phase when cwnd exceeds the ssthresh value, which may initially be set to the advertised window. In a BoD satellite network, the MAC scheduler actually knows the exact availability of the bandwidth. Hence, by passing this information to the TCP sender, the sender can immediately increase its rate to fill up the available bandwidth without further probing. Formally, if cwnd_i, ssthresh_i and cwnd_i^incr are the current cwnd, the current ssthresh and the allowable cwnd increase for connection i respectively, where i ∈ I ≡ {1, 2, 3, …, N} and N is the number of connections in the network, we formulate the problem as

    Max(Z) = Σ_{i=1..N} cwnd_i^incr                                  (1)

subject to the constraints:

1. Σ_{i=1..N} (cwnd_i + cwnd_i^incr) ≤ C_total
2. cwnd_i ≤ ssthresh_i,               ∀i ∈ {1, …, N}
3. cwnd_i^incr ≤ TS_i^FCA,            ∀i ∈ {1, …, N}
4. cwnd_i^incr / cwnd_j^incr ≈ 1,     ∀i, j ∈ {1, …, N}

The first constraint ensures that the total sending rate after the increment of cwnd does not exceed the total network capacity C_total. The second constraint is included because we focus only on the slow-start phase of the connection. The third constraint is required because the extra cwnd increment of each connection cannot exceed the slots TS_i^FCA allocated to connection i by the FCA. The fourth constraint is important to achieve fairness for all connections.

4.2 Algorithm of Cross-Layer Enhanced Slow-Start Phase

There are several design issues to be addressed. The foremost is the need for a cross-layer interaction facility interfacing the transport and MAC layers; we use a simple cross-layer entity that acts as an intermediary between them. The second design issue is the timescale separation between the two layers. The TCP sender adjusts its cwnd every RTT, which is more than half a second for satellite networks, whereas the system response time of the BoD, i.e. the time between the sending of an SR and the activation of the corresponding BTP, is much shorter (96 ms in our case). Our solution operates on the transport layer timescale; since the updates from the BoD are more frequent, the TCP sender always has up-to-date information to act upon. Another issue is how the BoD should distribute its free slots.
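Under the formulation above, maximising Z while keeping the increments equal amounts to splitting the available headroom evenly across connections, capped per connection by its FCA slots and its remaining slow-start headroom. A simplified sketch (the function and the equal-split policy are our illustration, not the paper's exact algorithm):

```python
def fair_cwnd_increments(cwnds, ssthreshs, ts_fca, c_total):
    """Return one cwnd increment per connection (all quantities in segments).

    Constraint 1: sum(cwnd + incr) <= c_total (total capacity)
    Constraint 2: only slow-start connections (cwnd < ssthresh) grow
    Constraint 3: incr_i <= FCA slots granted to connection i
    Constraint 4: increments are (approximately) equal across connections
    """
    n = len(cwnds)
    headroom = max(0, c_total - sum(cwnds))  # capacity left (constraint 1)
    per_conn = headroom // n                 # equal split (constraint 4)
    return [min(per_conn, fca, max(0, ssthresh - cwnd))  # constraints 2 & 3
            for cwnd, ssthresh, fca in zip(cwnds, ssthreshs, ts_fca)]

# Two slow-start connections on a 100-segment link, 10 FCA slots each:
print(fair_cwnd_increments([4, 6], [64, 64], [10, 10], 100))  # -> [10, 10]
```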
Conventionally, this is done per ST, whereby the free slots are allocated to all connected STs in a round-robin fashion. Since all STs receive an equal amount of free slots, this is unfair to TCP connections residing in an ST that hosts a high number of connections. Alternatively, the free slot allocation can be done per TCP connection. This ensures fairness to all connections but burdens the satellite scheduler, since it has to keep track of all TCP connections in the network. We get around this by assuming each ST has the same number of TCP connections. Although unrealistic, our aim is to understand the benefits and effectiveness of the mechanism rather than to solve the details of its deployment; satellite operators can decide on this issue based on their own policies. In addition, we exercise caution in designing our mechanism in order to avoid a "spaghetti design" consisting of many convoluted cross-layer interactions. Calls for caution in cross-layer design have already been raised in [18]. Hence, it is still important that a certain level of modularity and integrity of the entire protocol stack is maintained; we cannot sacrifice long-term performance guarantees for an immediate short-term improvement in a particular performance metric. Although the TCP and BoD modules lie in non-adjacent layers, they are in fact inter-dependent: the TCP sending rate affects the resource allocation in the BoD scheduler, while the BoD resource allocation impacts the RTT seen by TCP. Our solution involves both horizontal (i.e. between the MACs of the STs and the satellite) and vertical (i.e. between transport and MAC layers within each ST) interactions. The horizontal plane ensures that the scheduler onboard the satellite allocates the requested slots appropriately, together with a fair allocation of spare capacity to all STs, while the cross-layer communication is implemented via the vertical plane.
Table 1 and Table 2 show the pseudo code for the STs and the satellite BoD scheduler, respectively.

Table 1. Pseudo code for ST involving both MAC and transport layers

MAC::BEGIN (t = request time)
  if (new BTP = TRUE) {
    read BTP;
    compute TSs allocated by request;
    match MAC frame to the TSs for transmission;
    compute TSs allocated by FCA;
    if (TSs allocated by FCA > 0) {
      translate TSs to cwnd;
      compute cwnd_incr;
    } else
      reset(cwnd_incr);
  }
  cross-layer entity(cwnd_incr);
  activate BTP;
  compute(SR); send(SR);
END

TRANSPORT::BEGIN (EVENT = recv ACK)
  if ((slow-start = TRUE) && (cwnd_incr > 0)) {
    cwnd = cwnd + cwnd_incr;
    reset(cwnd_incr);
  }
END
Table 2. Pseudo code for the satellite BoD scheduler

MAC::BEGIN (t = allocation time)
  while (TS available) {
    allocate(SR, TS);
  }
  if (total SR > C) {
    buffer unsatisfied SR;
  } else {
    distribute remaining slots by FCA;
  }
  construct(BTP);
  broadcast(BTP);
END
The core idea is to have the BoD module communicate the latest amount of free capacity allocated to the TCP sender, so that the sender knows exactly how much it can open up its cwnd without causing congestion. Thus, when a BTP is received, each BoD entity calculates the free slots allocated for its TCP connection(s) based on its record of past requests. The result is translated from slots to TCP segments via (2):

    cwnd_i^incr = ( TS_i^FCA × θ_TS × F_period × SFF ) / ( 8 × psize_TCP )        (2)
where θ_TS is the rate granularity of a slot in kbps, F_period is the frame duration in seconds, SFF is the number of frames within a super-frame and psize_TCP is the size of the TCP segment in bytes. The computed cwnd_i^incr value is truncated to the nearest integer. This information is passed to the cross-layer entity. When an ACK is received, the TCP sender fetches cwnd_i^incr from the cross-layer entity and opens up its cwnd accordingly. The cwnd_i^incr value will then be reset to avoid
Fig. 4. Timing diagram for interaction between ST and BoD scheduler
Cross-Layer Enhancement to TCP Slow-Start over Geostationary Bandwidth
multiple increases of cwnd. Fig. 4 illustrates the timing of the interaction between the STs and the BoD scheduler. Note that, even though cwnd_i^incr has been updated after the first system response time, we only allow the TCP sender to open its cwnd when an ACK is received. This ensures that the connection is working properly, since the reception of ACKs implies that the previous data has been correctly received.
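The slot-to-segment translation of (2) can be expressed as a short helper (a sketch on our part; the default parameter values are taken from the simulation setup in Section 5.1, and the explicit factor of 1000 converts the kbps slot granularity to bps):

```python
import math

def cwnd_increment(ts_fca, theta_ts_kbps=16, f_period_s=0.024, sff=4,
                   psize_tcp=500):
    """Translate free (FCA) slots into a cwnd increment in segments.

    ts_fca: slots allocated to this ST by free capacity assignment
    theta_ts_kbps: rate granularity of a slot (kbps)
    f_period_s: frame duration (s); sff: frames per super-frame
    psize_tcp: TCP segment size (bytes)
    """
    bits = ts_fca * theta_ts_kbps * 1000 * f_period_s * sff
    # divide by 8 for bytes, by the segment size for segments,
    # and truncate to an integer as the text prescribes
    return math.floor(bits / 8 / psize_tcp)
```

With the defaults above, one slot carries 192 bytes per super-frame, so ten free slots translate into an increment of three 500-byte segments.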
5 Performance Evaluation

5.1 Simulation Setup

We implement our cross-layer algorithm in the ns-2 [19] simulator with an extension of a BoD module. Fig. 2 shows the network topology used, where the bottleneck is assumed to be at the satellite segment. Two satellite link capacities are evaluated: 512 kbps and 2048 kbps, yielding 32 and 128 16-kbps slots per frame, respectively. Each frame is of 24 ms duration and a super-frame (SF) consists of four frames (i.e., SFF = 4; SF duration = 96 ms). We enable the fragmentation function. Each TCP segment is fragmented into multiple 48-byte MAC frames, each with a 5-byte header. The receiver MAC reassembles these frames before passing the segment to the upper layer. The link buffers are configured to be big enough to avoid link-layer losses. In a BoD network, there are basically three options for how requests are submitted to the scheduler: (1) via pre-assigned slots, (2) by piggybacking on data slots and (3) via contention in a random access paradigm. We use the first option. The FCA option is turned on. Spare capacity is distributed in a fair round-robin manner. We compare our new algorithm with TCP NewReno. The TCP segment size is 500 bytes. The effect of our solution is demonstrated by considering both long-lived and short-lived TCP sources. The receiver is set to acknowledge all segments received.

5.2 Preliminary Inspection of the Mechanism Behavior

Our preliminary evaluation of the scheme involves two simple scenarios where a TCP sender transmits data over an under-utilized network, over lossless satellite links with capacities of 512 kbps and 2048 kbps, respectively. The receiver advertised window is set to 64 kB. Fig. 5(a) compares the evolution of the cwnd value for the connection with and without our cross-layer mechanism, while Fig. 5(b) shows the instantaneous TCP throughput in each case.
Clearly, the TCP connection with the cross-layer mechanism manages to open up its cwnd quicker, thus achieving a higher throughput in a shorter time. Graphically, from Fig. 5(b), the area between the two plots for the same link capacity represents the throughput difference. For instance, the performance gain for the 2048 kbps link is 282.5 kB for the period between 2 s and 8 s (the shaded area). However, we also see that the throughput levels eventually coincide, providing the same performance; this implies that the cross-layer mechanism only offers a performance gain in the initial phase of the connection (i.e., the slow-start phase). Likewise, better bandwidth utilization is only achieved during the slow-start phase. We further present in Fig. 6 the time sequence of the connections in the two scenarios. In both cases, the cross-layer mechanism gives its TCP sender a "head start" in the packet transmission, thus obtaining a better throughput performance throughout.
We show in Section 5.5 that the performance gain increases in lossy satellite links, as slow-start events occur more often. Also, when we compare the two figures, we see that the performance gain is higher for the 2048 kbps satellite link. This is logical, since there are more free slots, enabling a bigger increase in the cwnd value. Hence, the achievable performance gain depends on the link capacity.
Fig. 5. Comparing (a) the cwnd evolution and (b) throughput achieved for TCP connection with and without cross-layer enhancement over two types of link capacity
Fig. 6. Slow-start with and without cross-layer for (a) 512kbps and (b) 2048kbps link
5.3 Impact on File Transfer Sessions

We evaluate the performance improvement of our cross-layer mechanism for file transfers. We set up file transfer protocol (FTP) sessions transferring files of different sizes, ranging from 10 kB to 10 MB, across satellite links with capacities of 512 kbps and 2048 kbps. For these simulations, the advertised window is set to a high value (6.4 MB). The performance improvements achieved are shown in Fig. 7. We see a general trend of decreasing performance improvement as the file size increases. This is because once the connection exits the slow-start phase, it behaves as the original TCP protocol and yields similar performance for the rest of the transfer time.
Cross-Layer Enhancement to TCP Slow-Start over Geostationary Bandwidth
95
Scrutinizing the performance improvement for the case with link capacity 2048 kbps, we see a gentle increasing slope for the transfer of very small files. This is due to the fact that, since the original transfer times for these files are already short, the file transfers complete before the congestion avoidance stage is reached. In other words, these cases do not attain the maximum benefit of the cross-layer mechanism. For the case with link capacity 512 kbps, we detect an extra slow-start event for file transfers of size 1 MB and above. This explains the "spike" in the figure.
Fig. 7. Performance improvement generally decreases as file size increases
Fig. 8. File transfer time comparison for TCP with and without cross-layer mechanism
5.4 Multiple Competing Connections Scenario

We investigate the aggregate performance of our cross-layer mechanism over an error-free satellite channel. Multiple TCP connections are set up to send 100 packets each over a 2048 kbps link, whereby each connection is started 20 ms apart. Intuitively, the effect of the cross-layer mechanism should decrease when there are many competing connections, since most TSs will be allocated, resulting in a low number of free slots. Fig. 8 confirms this deduction. Although the cross-layer mechanism ensures a better performance (Fig. 8 (upper)), its absolute gain decreases with more competing connections (Fig. 8 (lower)). Extrapolating the results, our proposal will not provide a performance improvement in a highly congested link, since there is no spare capacity. Thus, it is only beneficial to users under under-utilized network conditions.

5.5 Lossy Satellite Link Scenario

In the literature, there are two mainstream approaches to studying a lossy satellite link scenario. The traditional approach is to inject errors into packets, causing packet drops. This allows packet-level performance examination. The alternative approach is to model the attenuation effect on a satellite link as a decrease of bandwidth [20]. The reasoning behind this approach is that, with the advance of forward error correction (FEC) techniques, erroneous packets are recoverable at the expense of redundancy overhead. The worse the link condition, the higher the redundancy overhead used, resulting in less bandwidth devoted to carrying actual information. Since packet-level dynamics are important for our evaluation, we use the first approach.
We configure the 2048 kbps satellite link to vary across a range of frame error rates (FER) and compare the time needed to transfer a 10 MB file for a connection with and without the cross-layer mechanism in an otherwise unutilized link. We present the results in Fig. 9, which shows that the gain from using the cross-layer mechanism increases exponentially with the FER. This is directly related to the number of slow-start events taking place within the duration of the connection.
Fig. 9. File transfer time comparison over a lossy satellite link
6 Summary and Conclusions

In this paper, we propose a simple cross-layer mechanism between the transport (TCP) and MAC (BoD module) layers to enhance the performance of TCP connections in a GEO satellite network. The proposal speeds up the TCP slow-start by utilizing information acquired from the MAC layer via minimal cross-layer communication. Our proposal provides potentially significant improvements to TCP performance, especially for medium-sized file transfers over lossy satellite links. With minimal changes to the original protocol, it is also readily deployable. We also show when the mechanism is of no benefit to both users and satellite operators. From our evaluations, we summarize our findings regarding the mechanism in Table 3 below:

Table 3. Characterization of the cross-layer mechanism

    Operating conditions / Scenario           Gain
    Connection duration          Long         Low
                                 Short        High
    Link capacity                High         High
                                 Low          Low
    Link utilization             High         Low
                                 Low          High
    File size (file transfer)    Big          Low
                                 Small        High
    Number of connections        High         Low
                                 Low          High
    Link condition (FER)         High         High
                                 Low          Low
There remain some non-trivial practical issues to be considered. First, the mechanism operates under the assumption that the satellite link is the bottleneck link. Although this is usually the case, we have not investigated the penalty incurred (if any) if this assumption does not hold. In this sense, its applicability to the Internet is
unverified. Nevertheless, our results suggest that it is feasible for deployment in smaller networks that involve a single administrative domain. Second, the more aggressive approach used here poses questions of fairness, as traditional TCP senders are, by design, not able to utilize the free capacity. However, we argue that our proposal does not actually deprive them of resources. From a different perspective, it is actually an incentive for users to start using our mechanism as early as possible for better performance. In short, we have presented an initial study of the mechanism and shown its potential. Further in-depth investigations, possibly in a testbed, should be conducted to quantify the performance gain.

Acknowledgments. This work is performed within the framework of the SatNEx and ENTHRONE projects, funded by the European Commission (EC) under Framework Programme 6. The financial contribution of the EC towards this project is greatly appreciated.
References

1. Allman, M., Paxson, V., Stevens, W.: TCP congestion control. RFC 2581 (April 1999)
2. Allman, M., et al.: Increasing TCP's initial window. RFC 2414 (September 1998)
3. Akyildiz, I., Morabito, G., Palazzo, S.: TCP-Peach: a new congestion control scheme for satellite IP networks. IEEE/ACM Trans. on Networking 9, 307–321 (2001)
4. Casetti, C., et al.: TCP Westwood: end-to-end congestion control for wired/wireless networks. Wireless Networks 8, 467–479 (2002)
5. Caini, C., Firrincieli, R.: TCP Hybla: a TCP enhancement for heterogeneous networks. Int'l J. of Satell. Commun. and Networking 22, 547–566 (2004)
6. Henderson, T.R., Katz, R.: Transport protocols for Internet-compatible satellite networks. IEEE J. on Selected Areas in Commun. 17(2) (1999)
7. Allman, M., et al.: Enhancing TCP over satellite channels using standard mechanisms. RFC 2488 (January 1999)
8. Caini, C., et al.: Transport layer protocols and architectures for satellite networks. Int'l J. of Satell. Commun. and Networking 25, 1–26 (2007)
9. Giambene, G., Kota, S.: Cross-layer protocol optimization for satellite communications networks: a survey. Int'l J. of Satell. Commun. and Networking 24, 323–341 (2006)
10. Celandroni, N., Davoli, F., Ferro, E., Gotta, A.: Long-lived TCP connections via satellite: cross-layer bandwidth allocation, pricing and adaptive control. IEEE/ACM Trans. on Networking 14(5), 1019–1030 (2006)
11. Chini, P., Giambene, G., Bartolini, D., Luglio, M., Roseti, C.: Dynamic resource allocation based on a TCP-MAC cross-layer approach for DVB-RCS satellite networks. Int'l J. of Satell. Commun. and Networking 24, 367–385 (2006)
12. Peng, F., Wu, L., Leung, V.C.M.: Cross-layer enhancement of TCP split-connections over satellite links. Int'l J. of Satell. Commun. and Networking 24, 405–418 (2006)
13. Floyd, S., Jacobson, V.: Random early detection gateways for congestion avoidance. IEEE/ACM Trans. on Networking 1(4), 397–413 (1993)
14. Kalama, M., et al.: Cross-layer improvement for TCP Westwood and VoIP over satellite. In: Int'l Workshop on Satellite and Space Commun., pp. 204–208 (2006)
15. ETSI: Satellite Earth Stations and Systems (SES); Broadband Satellite Multimedia (BSM) services and architectures; Functional architecture for IP interworking with BSM networks. TS 102 292, V1.1.1 (2004-02)
16. ETSI: Digital Video Broadcasting (DVB); Interaction channel for satellite distribution systems. EN 301 790, V1.3.1 (2003-03)
17. Açar, G.: End-to-end resource management in geostationary satellite networks. PhD dissertation, University of London (November 2001)
18. Kawadia, V., Kumar, P.R.: A cautionary perspective on cross-layer design. IEEE Wireless Communications, 3–11 (2005)
19. The ns manual [Online]. Available: http://www.isi.edu/nsnam/ns/doc
20. Bolla, R., Davoli, F., Marchese, M.: Adaptive bandwidth allocation methods in the satellite environment. In: Proc. IEEE ICC, pp. 3183–3190 (2001)
TCP Performance over Cluster-Label-Based Routing Protocol for Mobile Ad Hoc Networks

Vitaly Li and Hong Seong Park

Department of Electrical and Computer Engineering, Kangwon National University, 192-1 Hyoja 2 Dong, Chuncheon, 200-701, Korea
[email protected],
[email protected]
Abstract. The performance of the TCP protocol on MANETs has been studied in numerous works. It has been shown that the performance of TCP on MANETs is poor, caused by two groups of factors: those inherited from wireless networks and those caused by node mobility. One cause of TCP degradation is the inability to distinguish packet losses due to congestion from those caused by node mobility and, as a consequence, broken routes. This paper presents the Cluster-Label-based Routing protocol, an attempt to compensate for the main source of TCP problems on MANETs: the multi-hop mobile environment. By utilizing the Cluster-Label-based mechanism for Backbone, CLR is able to concentrate on detecting and compensating for the movement of the destination node. The proposed protocol provides better goodput and delay performance than standardized protocols, especially in cases of large network size and/or high mobility rate. Keywords: TCP performance, routing, MANETs, cluster-label.
1 Introduction

Mobile ad hoc networks (MANETs) are formed by wireless mobile nodes communicating with each other without the presence of a fixed infrastructure. The nodes communicate directly while in transmission range of each other and use routing algorithms for multi-hop communications. Every node can function as an end system, a router, or both at the same time, where end-system nodes send or receive data and router nodes forward data. The IETF MANET working group [1] has standardized OLSR (Optimized Link State Routing) [2], AODV (Ad hoc On-demand Distance Vector Routing Protocol) [3] and DSR (Dynamic Source Routing) [4], where the first is proactive and the last two are reactive routing protocols. In proactive protocols such as OLSR or DSDV (Destination-Sequenced Distance Vector Routing Protocol) [5], all nodes maintain routing tables to all possible destinations regardless of the actual need for a route between source and destination nodes. In reactive protocols such as AODV or DSR, the route is obtained by the source node in an on-demand manner, only when there is data to send. In addition to the above-mentioned network layer protocols, a transport layer protocol such as TCP (Transmission Control Protocol) is also needed for reliable data

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 99–108, 2007. © Springer-Verlag Berlin Heidelberg 2007
V. Li and H.S. Park
communications. As the TCP protocol is the most widely used for current Internet applications, it is well tuned for handling wired connections. TCP attempts to determine the optimal available bandwidth using congestion control mechanisms such as slow-start and AIMD (additive increase, multiplicative decrease). Packet loss is used as a congestion indication, forcing the sender to decrease the congestion window. However, in the environment presented by MANETs, packet loss due to broken routes can result in the counter-productive invocation of TCP's congestion control mechanism, leading to underutilization of bandwidth and reduced protocol effectiveness. A number of studies in the literature address this problem [6-11]. It has been shown that the mobility of nodes causes most of the degradation of TCP performance in MANETs [6-7, 9, 11]. In addition to the problems inherited from wireless networks, the multi-hop environment in the presence of mobility poses new problems. This paper focuses on the routing stability problem. Since the route in a MANET is a set of links between nodes connecting source and destination, at any given time any of the nodes participating in data routing could move away, causing packet loss and, consequently, performance degradation. The common approach to solving the mobility problem is to use mechanisms to (a) detect and distinguish link failures, and (b) initiate the proper response. Examples of such schemes include Explicit Link Failure Notification (ELFN) [6] and TCP-F [12], where an intermediate node detects the packet loss and sends an explicit notification to the source node, so that the sender can distinguish between route failures and congestion and initiate a proper response. However, this approach has scalability problems in the case of large or dense networks with heavy traffic. This paper proposes a routing protocol called the Cluster-Label-based Routing Protocol (CLR) for improving TCP performance over MANETs.
The CLR is built on top of the backbone created by the topology control scheme called the Cluster-Label-based mechanism for Backbone (CLaB) [13]. The CLaB constructs and constantly maintains an overlay infrastructure based on interconnected clusters. Each cluster is assigned a unique identifier called a Cluster-Label, and the maintenance algorithm provides constant connectivity between Cluster-Labels. The CLR establishes a path from source to destination utilizing the cluster-labels rather than node IDs. Using such an approach leaves the problem of intermediate node movement to the CLaB; there remain the problems of destination node movement and route recalculation. In order to cope with these problems, destination-initiated movement notification is used, and the cluster-label routing table is constructed using proactive message exchanges. The rest of the paper is organized as follows. Section 2 gives an overview of the Cluster-Label-based mechanism for Backbone. Section 3 describes the Cluster-Label-based Routing protocol. Section 4 shows performance evaluations and Section 5 concludes the paper.
2 An Overview of the Cluster-Label-Based Mechanism for Backbone

The Cluster-Label-based mechanism for Backbone (CLaB) forms and maintains a backbone over an ad hoc network. The following definitions are used.
TCP Performance over Cluster-Label-Based Routing Protocol
Definition 1. A cluster is a set of nodes with a central node called the clusterhead. Any node in a cluster can communicate with the clusterhead directly.

Definition 2. A host cluster X for a node N is a cluster such that N and the clusterhead of X can communicate directly.

Definition 3. Two clusters are overlapping if there is at least one node that can communicate directly with the clusterheads of both clusters.

The goal of the CLaB is to create clusters and maintain connections between them in the presence of node movement. In addition, a unique identifier called a Cluster-Label is assigned to each newly created cluster. Hence, a node can be found in the network by the Cluster-Label of its host cluster. The creation and maintenance of a clustered network is based on a periodic exchange of local information between neighbor nodes. Nodes periodically exchange "Hello" packets with information about themselves, including the node's unique ID, status and host-cluster-related information. The status can be clusterhead, member or orphan, the last corresponding to neither clusterhead nor member. Each orphan node participates in the election of a clusterhead within its transmission range. Once the clusterhead is selected, every node in its transmission range silently becomes a member of the newly created cluster and is thus not eligible to participate in further clusterhead elections. If a node does not hear from any clusterhead for three "Hello" packet periods, it changes its status to orphan and starts the clusterhead election process again. For simplicity, the lowest-ID clustering algorithm [14] is used throughout the paper. However, it is worth noting that any clustering algorithm can be used as long as it permits overlapping clusters and can be executed locally. For each newly created cluster, a unique Cluster-Label is generated and spread among the cluster members by the clusterhead.
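The per-period status update, including the three-period orphan timeout, can be sketched as follows (a sketch; function and variable names are ours):

```python
def update_status(status, heard_from_clusterhead, silent_periods, limit=3):
    """Update a node's status after one "Hello" period.

    status: "clusterhead", "member" or "orphan"
    heard_from_clusterhead: whether any clusterhead was heard this period
    silent_periods: consecutive periods without hearing a clusterhead
    Returns the new (status, silent_periods) pair.
    """
    if heard_from_clusterhead:
        return status, 0
    silent_periods += 1
    # after `limit` silent periods a non-clusterhead reverts to orphan
    # and re-enters the clusterhead election process
    if silent_periods >= limit and status != "clusterhead":
        return "orphan", silent_periods
    return status, silent_periods
```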
Once the Cluster-Label is propagated by the clusterhead, the formation part is complete and the continuous maintenance part takes place. Each node n, on receiving "Hello" messages from neighbor nodes, collects and keeps the following information: l(n) is the set of neighbors of node n and c(n) is the set of cluster-labels of the host clusters of node n. Then
    p(n) = ∪_{i∈l(n)} c(i)
is the set of cluster-labels whose clusterheads reside within two hops of node n. The two-hop restriction is set in order to track overlapping clusters only. The "Hello" message from node n therefore contains its ID, information about its current state, c(n) and p(n). A clusterhead indicates its cluster by setting an additional field. The following relationship is defined:
    p(a) > p(b)   if p(b) ⊂ p(a)
    p(a) = p(b)   if p(b) ⊄ p(a) ∧ p(a) ⊄ p(b)
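This relation amounts to a proper-subset test on the two label sets; as a sketch (names are ours):

```python
def compare_labels(p_a, p_b):
    """Compare p(a) against p(b): '>' if p(b) ⊂ p(a), '<' if p(a) ⊂ p(b),
    and '=' when neither set properly contains the other."""
    if p_b < p_a:       # Python's < on sets is the proper-subset test
        return ">"
    if p_a < p_b:
        return "<"
    return "="
```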
On receiving the "Hello" message from a clusterhead, each node n checks whether p(n) > p(CH). If so, the node automatically sets its status to clusterhead
and starts propagating the "Hello" message. No confirmation is required from the previous clusterhead. The decision is based simply on a comparison of the local p(n) against the p(CH) received from the clusterhead. Since the maintenance part cannot ensure that the clusterheads of neighbor clusters are two hops away from each other, a newly selected clusterhead should discover and keep the set of three-hop connections with neighbor clusters. Let us denote the list of three-hop clusterheads as p3(CH). Then

    p3(CH) = ( ∪_{i∈l(CH)} p(i) ) \ p(CH)
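Both sets can be computed directly from the neighbor information carried in the "Hello" packets (a sketch; here l maps a node to its neighbor set and c maps a node to its host-cluster labels, names ours):

```python
def p_set(n, l, c):
    """p(n): cluster-labels whose clusterheads are within two hops of n."""
    labels = set()
    for i in l[n]:
        labels |= c[i]
    return labels

def p3_set(ch, l, c):
    """p3(CH): labels reachable via the neighbors' p-sets, minus p(CH)."""
    labels = set()
    for i in l[ch]:
        labels |= p_set(i, l, c)
    return labels - p_set(ch, l, c)
```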
Both p(CH) and p3(CH) are used for communications with neighbor clusters. From the procedure described above, the following definition of a cluster-label and a clusterhead can be stated:

Definition 6. A cluster-label is a unique identifier of a group of nodes with a single node carrying the role of the clusterhead. Once created, the cluster-label is further defined by the neighbor cluster-labels.

Definition 6 is essential in the sense that it not only defines the cluster-label but also the mechanism preventing its change. Since the maintenance of a cluster-label depends on the cluster-labels of neighbor clusters, the group of adjacent clusters forms a protection against node mobility. An example is shown in Figure 1. A virtual group of four clusters (A, B, C and D) is shown above the group of nodes. While the clusterhead of cluster A defines whether or not a node is a member of cluster A, the neighbor clusters define whether or not a node is the clusterhead of cluster A. In general, the maintenance part takes place between the cluster members and the current clusterhead sharing the same cluster-label. The clusterhead replacement is based on the ability to support existing connections with neighbor clusters, and thus the effect of a clusterhead replacement is minimal from the point of view of the backbone. Since both the formation part and the maintenance part rely on the periodic exchange of "Hello"
Fig. 1. Group of Clusters
packets between one-hop neighbors, which is similar to most routing protocols, the energy consumption does not exceed that of previously proposed routing protocols.
3 Cluster-Label-Based Routing Protocol

The main purpose of routing is to provide a stable connection between source and destination despite node movement. The Cluster-Label-based Routing Protocol (CLR) utilizes the advantages given by the CLaB mechanism. Each node in the network is defined by its unique identifier, called the node's ID, and by its location with regard to the backbone created by CLaB. Therefore, the task of CLR is to find the initial location of a node and track its movement through the clusters. The discovery part uses a reactive approach. Whenever a source node needs to acquire the route to a destination, it issues a Route Request (RREQ) packet to the clusterhead of its host cluster in the following form: {sourceID, destID}, in which sourceID and destID are the IDs of the source and destination nodes, respectively. The clusterhead then forwards the RREQ to the clusterheads of neighbor clusters. The process is repeated for each receiving clusterhead until the cluster of the destination node is found. The difference from traditional reactive routing algorithms is that each clusterhead attaches its cluster-label, instead of its ID, to the RREQ prior to forwarding. None of the other forwarding nodes (gateways) can change the RREQ. Upon receiving the RREQ, the destination node sends an RREP back to the source node using the reversed route, in the following form: {sourceID, sourceCL, CL1, CL2, …, destCL, destID}, in which sourceCL and destCL are the cluster-labels of the source's and destination's host clusters, respectively, and CL1, CL2, … are the cluster-labels of intermediate clusters. In essence, the route discovery is a form of controlled flooding, where the RREQ message is propagated through the whole network by forwarding from one cluster to another. Every clusterhead propagates only the first received RREQ message for a particular source/destination pair; therefore the problem of uncontrolled flooding, often present in reactive routing protocols, is absent.
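The clusterhead's RREQ handling can be sketched as follows (a sketch; the field and function names are ours, the source specifies only the packet contents):

```python
def handle_rreq(rreq, my_label, seen):
    """Clusterhead processing of a Route Request.

    Appends this cluster's label (not the node ID) to the RREQ before
    forwarding; only the first RREQ per source/destination pair is
    propagated, which keeps the flooding controlled.
    Returns the packet to forward, or None to suppress a duplicate.
    """
    key = (rreq["sourceID"], rreq["destID"])
    if key in seen:
        return None
    seen.add(key)
    # build the forwarded copy with this cluster's label appended
    return dict(rreq, labels=rreq["labels"] + [my_label])
```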
The route maintenance part is divided into two separate algorithms: notification and route re-calculation. Each clusterhead is responsible for propagating its p(CH) to the network, triggered by the permanent addition or deletion of records from p(CH). On receiving such information from all clusters, each node constructs the cluster-label routing table and is able to calculate the route to any cluster in terms of cluster-labels. Every node is able to detect its movement to a neighbor cluster by examining the "Hello" packet. If such a case is detected, the node issues a notification message called RCHG to each node in its active routing table. The RCHG message consists of the sender's ID and the cluster-label of the sender's host cluster. The node then recalculates the route to each destination using the pair {destID, destCL}. Since the route to the destination node is defined by the route to the destination's host cluster, the cluster-label routing table is used. On receiving an RCHG, a node checks whether the sender is in its active routing table. If not, the RCHG is ignored. Otherwise, the route is recalculated using the cluster-label routing table. The recalculation is required since the movement of a source or destination node can be quite chaotic, and simply appending the current host cluster's cluster-label could lead to a longer route, causing delay and throughput degradation.
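The endpoint's RCHG processing can then be sketched as (names are ours; route_to stands for a lookup in the cluster-label routing table):

```python
def handle_rchg(rchg, active_routes, route_to):
    """Process a movement notification (RCHG) at a route endpoint.

    rchg: {"senderID": ..., "clusterLabel": ...}
    active_routes: maps destination ID -> current cluster-label route
    route_to: function computing a fresh route to a given cluster-label
    Returns the recalculated route, or None if the sender is unknown.
    """
    dest = rchg["senderID"]
    if dest not in active_routes:
        return None          # sender not in the active routing table: ignore
    # recalculate the whole route instead of appending the new label,
    # to avoid route inflation under chaotic movement
    active_routes[dest] = route_to(rchg["clusterLabel"])
    return active_routes[dest]
```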
It is worth noting that the CLR uses both reactive and proactive approaches. Potentially this may lead to high overhead compared to other routing protocols. However, the route discovery broadcasts the RREQ through clusterheads only, which is a well-known technique to reduce the control overhead. On the other hand, the proactive part uses event-triggered directed transmissions. Since the CLaB takes care of the connections between neighbor clusters, such events occur rarely compared with proactive protocols such as DSDV or OLSR. Also, since information about cluster-labels (groups of nodes) is propagated rather than information about individual nodes, the amount of data is quite small. The same holds true with regard to the computation power and the amount of memory necessary for keeping the cluster-label routing table. It can be argued that the destination-initiated notification could be harmful for UDP connections, since the destination is often not aware of the route to the source node and therefore cannot send the RCHG. In such a case, the route recovery can be done in the standard manner by re-issuing the RREQ message from the source node. However, in this case the RREQ can be unicast to the previous host cluster of the destination node and broadcast from there, which can greatly increase the speed of route discovery. In previous work [13] it was suggested to use the previous clusterhead for packet forwarding and RCHG notifications, in a way similar to that often used in handoff procedures. However, such an approach does not work well for slow data rates, where the clusterhead could change before the next packet arrives.
4 Performance Evaluations

4.1 Simulation Model

All simulations are implemented using the ns2 network simulator [15]. The common simulation parameters are presented in Table 1.

Table 1. Simulation parameters

    Simulation time              1200 s
    Maximum link bandwidth       2 Mbps
    Number of nodes              200
    Mobility model               Random-waypoint
    Pause time                   30 s
    Number of TCP connections    40
    TCP version                  NewReno
    TCP window size              32
    Packet size                  1460 bytes
    Transmission range           250 meters
AODV, DSR and OLSR were used as targets for comparison. All three protocols are standardized by the IETF MANET working group. The first two protocols use a reactive approach for route discovery and maintenance: a route is obtained on demand when needed and dropped when not used. OLSR uses a proactive approach, meaning
that regardless of the actual need for a route, the routing table on each node contains routes to all possible destinations in the network. The information is updated through periodic data exchanges with neighbors.

4.2 Accuracy of Simulations

The presented results were obtained from simulations in the NS2 network simulator. To strengthen the obtained results, each point reported below is the average of 20 simulation runs. The network parameters for corresponding points in a graph are the same, while the initial distribution of nodes is random across the network area. TCP connections between nodes are fixed, i.e. node X always establishes a connection with node Y throughout a simulation set. The CLaB algorithm starts immediately after the start of the simulation, while the first attempt to establish a TCP connection is issued 5 seconds after the simulation starts. Since the initial placement of nodes differs with each simulation run, connection parameters such as the number of hops between connection participants differ as well.
Fig. 2. Average Goodput vs. Network Size (a) max speed – 5m/s (b) max speed – 10m/s (c) max speed – 15m/s (d) max speed – 20m/s
V. Li and H.S. Park
Another issue is the choice and accuracy of a simulation tool and hence the accuracy of the obtained results themselves. There are three major simulation tools widely used by the research community: NS2, Glomosim [17] (currently succeeded by QualNet [18]) and OPNET Modeler [16]. NS2 and Glomosim are free tools, while QualNet and OPNET Modeler are available under commercial licenses. The wide support and free license have made the NS2 network simulator a de facto standard for network simulations within academia. It has been argued [19] that network simulators often fail to represent a realistic network model, especially for MANETs, leading to inaccurate results. In particular, it is often difficult or impossible to configure realistic simulation and environment parameters such as terrain relief, radio propagation model, climate and mobility model. On the other hand, network simulators can present a fair general network model within certain boundaries. Since real evaluations on a testbed are very expensive in the case of MANETs, NS2 remains the primary research tool for an initial exploration of network performance.
Fig. 3. Average Packet Delay vs. Data Arrival Rate (a) max speed - 5m/s (b) max speed - 10m/s (c) max speed - 15m/s (d) max speed - 20m/s
4.3 Simulation Results

Figure 2 shows the average goodput versus network size. It is known that the route length affects the goodput. As the network size increases, the average route length increases accordingly. The results for small network sizes up to 1.4 km2 show that OLSR achieves the best goodput among the three compared protocols, though slightly worse than CLR. The gap is due to the high routing message overhead caused by OLSR and the numerous layer-2 contention and congestion periods caused by this overhead. Figure 2(d) corresponds to the high-speed case. It can be seen that for the very high speed of 20 m/s the goodput of the CLR routing protocol is decreased greatly, and that of the other three protocols decreases very rapidly. However, even in the high-speed case, CLR performs better than the three compared protocols. The second set of simulations shows the delay performance of a 1 km2 case with varying average packet arrival rate. Typical simulation scenarios consider a high packet arrival rate, for which throughput or goodput is the appropriate metric. However, some TCP applications issue data at a much slower rate and require a low per-packet delay rather than a high throughput. Figure 3 shows the average packet delay in a TCP flow. It is shown that for the low speed of 5 m/s the average TCP packet delay is similar for all protocols as long as the data arrival rate is high enough that the reactive routing protocols can detect a broken route. When the data arrival rate is low, the reactive protocols do not detect a broken route until after the route error notification. In the case of OLSR and CLR, however, the route update is independent of the data arrival rate and thus no such delay is incurred. The situation becomes even worse with increasing maximum node speed. The links between intermediate nodes become volatile more often and the delay appears even at relatively high data arrival rates. Again, the delay is not increased in the case of OLSR and CLR, since both protocols use a proactive approach and do not require sending data in order to detect a broken route.
5 Conclusions

This paper presents a performance analysis of TCP over the Cluster-Label-based Routing protocol for mobile ad hoc networks. The routing protocol is built on top of a backbone created by the Cluster-Label-based mechanism for Backbone. Route discovery is performed reactively, while route maintenance is performed proactively. The presented protocol is compared with the existing protocols OLSR, AODV and DSR, which are standardized by the IETF MANET working group. The performance evaluation is carried out by simulation and shows that in terms of goodput the CLR performs better than the compared protocols, especially in cases of high node mobility and large network size. The average delay imposed by the discovery of broken routes is comparable with that of proactive routing protocols such as OLSR. Overall, the obtained results, combined with previous studies on the Cluster-Label-based mechanism for Backbone, make the CLR a suitable protocol for MANETs in the case of large networks with low to very high node mobility.
References

1. Internet Engineering Task Force, MANET working group charter, http://www.ietf.org/html.charters/manet-charter.html
2. Clausen, T., Jacquet, P., Laouiti, A., Minet, P., Muhlethaler, P., Qayyum, A., Viennot, L.: Optimized Link State Routing Protocol (OLSR). IETF RFC 3626
3. Perkins, C.E., Belding-Royer, E.M., Das, S.R.: Ad-hoc On-Demand Distance Vector (AODV) Routing. IETF RFC 3561
4. Johnson, D.B., Maltz, D.A., Hu, Y.: The Dynamic Source Routing Protocol for Mobile Ad-hoc Networks (DSR). IETF Internet Draft, draft-ietf-manet-dsr-08.txt
5. Perkins, C.E., Bhagwat, P.: Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers. In: SIGCOMM Symposium on Communications Architectures and Protocols
6. Holland, G., Vaidya, N.: Analysis of TCP Performance over Mobile Ad Hoc Networks. In: 5th ACM/IEEE International Conference on Mobile Computing and Networking (1999)
7. Fu, Z., Meng, X., Lu, S.: How Bad TCP Can Perform in Mobile Ad Hoc Networks. In: ISCC'02, 7th International Symposium on Computers and Communications (2002)
8. Ng, P.S., Liew, S.C.: Re-routing Instability in IEEE 802.11 Multi-hop Ad Hoc Networks. In: LCN'04, 29th Annual IEEE International Conference on Local Computer Networks
9. Abdullah-al-Mamun, Md., Mahbubur Rahman, M., Tan, H.-P.: Performance Evaluation of TCP over Routing Protocols for Mobile Ad Hoc Networks. In: 1st Chinacom (2006)
10. Kim, D., Bae, H., Song, J.: Analysis of the Interaction between TCP Variants and Routing Protocols in MANETs. In: ICPPW'05, International Conference on Parallel Processing Workshops (2005)
11. Seddik-Ghaleb, A., Ghamri-Doudane, Y., Senouci, S.-M.: Effect of Ad Hoc Routing Protocols on TCP Performance within MANETs. In: IWWAN'06, International Workshop on Wireless Ad Hoc & Sensor Networks
12. Chandran, K., Raghunathan, S., Venkatesan, S., Prakash, R.: A Feedback-Based Scheme for Improving TCP Performance in Ad Hoc Wireless Networks. IEEE Personal Communications 8(1), 34–39 (2001)
13. Li, V., Park, H.S., Oh, H.: Cluster Label-based Mechanism for Backbone on Mobile Ad Hoc Networks. In: Braun, T., Carle, G., Fahmy, S., Koucheryavy, Y. (eds.) WWIC 2006. LNCS, vol. 3970, Springer, Heidelberg (2006)
14. Lin, C.R., Gerla, M.: Adaptive Clustering for Mobile Wireless Networks. IEEE Journal on Selected Areas in Communications 15(7), 1265–1275
15. The network simulator – NS-2, http://www.isu.edu/nsnam/ns
16. Zhu, C., Yang, O.W.W., Aweya, J., Ouellette, M., Montuno, D.Y.: A Comparison of Active Queue Management Algorithms Using the OPNET Modeler. IEEE Communications Magazine 40(6), 158–167 (2002)
17. Pandey, A.K., Fujinoki, H.: Study of MANET Routing Protocols by GloMoSim Simulator. International Journal of Network Management 15(6), 393–410 (2005)
18. Hsu, J., et al.: Performance of Mobile Ad Hoc Networking Routing Protocols in Realistic Scenarios. In: MILCOM'03, U.S. Army CECOM RDEC (2003)
19. Cavin, D., Sasson, Y., Schiper, A.: On the Accuracy of MANET Simulators. In: Proceedings of the Workshop on Principles of Mobile Computing (POMC'02), pp. 38–43. ACM, New York (2002)
An Analytic Model of IEEE 802.16e Sleep Mode Operation with Correlated Traffic Koen De Turck, Stijn De Vuyst, Dieter Fiems, and Sabine Wittevrongel SMACS Research Group, Department TELIN, Ghent University, Sint-Pietersnieuwstraat 41, B-9000 Gent {kdeturck,sdv,df,sw}@telin.ugent.be
Abstract. We propose and analyse a discrete-time queueing model for the evaluation of the IEEE 802.16e sleep mode operation in wireless access networks. This mechanism reduces the energy consumption of a mobile station (MS) by allowing it to turn off its radio interface (sleep mode) when there is no traffic present at its serving base station (BS). After a sleep period expires, the MS briefly checks the BS for data packets and switches off for the duration of another sleep period if none are available. Specifically for IEEE 802.16e, each additional sleep period doubles in length, up to a certain maximum. Clearly, the sleep mode mechanism can extend the battery life of the MS considerably, but also increases the delay at the BS buffer. For the analysis, we use a discrete-time queueing model with general service times and multiple server vacations. The vacations represent the sleep periods and have a length depending on the number of preceding vacations. Unlike previous studies, we take the traffic correlation into account by assuming a D-BMAP arrival process. The distribution of the number of packets in the queue is obtained at various sets of time epochs, as well as the mean packet delay and the mean number of consecutive vacations. We apply these results to the IEEE 802.16e sleep mode mechanism with downlink traffic. By means of some examples, we show the influence of both the configuration parameters and the traffic correlation on the delay and the energy consumption.
1 Introduction
The new IEEE 802.16e [1] standard for Broadband Wireless Access networks holds extensions to the original 802.16 standard for support of subscriber mobility. Among these extensions are an efficient handover procedure and an optional energy saving mechanism, the latter of which is the topic of this paper. The idea behind the energy saving mechanism is to allow a Mobile Station (MS) to enter sleep mode, which means that its radio interface is switched off during certain intervals of time. While in such a sleep period, the MS cannot be reached by its serving Base Station (BS) and any arriving traffic must be buffered there until the MS's sleep period ends. At the end of each sleep period, the MS switches its radio back on for a short time (listening interval) to check whether data packets are available at the BS. If not, the MS's sleep mode continues with
Fig. 1. Indication of busy, idle and sleep periods at the BS and sleep mode vs. awake mode at the MS. An idle period consists of a number of consecutive sleep periods and ends when arrivals have occurred during its last sleep period. During busy periods, the packets in the buffer are served and transmitted to the MS until the buffer is empty, after which a new idle period starts. The listening interval at the end of each sleep period is indicated by ‘L’.
another sleep period. However, if any traffic arrived during the last sleep period, the MS remains powered and enters awake mode. This allows the BS to transmit the packets in its buffer, which it does in an exhaustive manner, i.e. the BS keeps transmitting packets until its buffer is completely empty. At this point, the MS switches back to sleep mode. Thus, the MS alternates between sleep and awake mode, as illustrated in Fig. 1. Specifically for IEEE 802.16e, an exponential increase strategy is used for updating the lengths of the sleep periods. The first sleep period after entering sleep mode has length T0. Each subsequent sleep period, however, has twice the length of the previous sleep period, i.e. 2T0, 4T0, 8T0, . . . , until a maximum length Tmax is reached. After that, all additional periods have length Tmax. Also, if during a sleep period packets need to be sent from the MS in the uplink direction, the sleep mode is interrupted immediately and the MS stays in awake mode until all packets in both directions have been transmitted. Usually, however, the amount of uplink traffic is small compared to the downlink traffic, so we choose to ignore it in our analysis. An inherent drawback of the sleep mode mechanism is the degradation of QoS (Quality of Service). As the transmission of packets arriving at the BS must be postponed until the current sleep period of the MS is finished, it is clear that the overall packet delay will suffer. Hence, a trade-off needs to be made with respect to the sleep period lengths. Short sleep periods result in too many unneeded activations of the MS radio interface, which is less energy efficient, while very long sleep periods result in excessive packet delays. Therefore, it is important to be able to predict the influence of the sleep mode parameters on both MS energy consumption and packet delay. According to the IEEE 802.16e standard,
the sleep mode parameters are (among others) the initial sleep period T0 and the maximal sleep period Tmax, which are pre-negotiated each time the MS enters sleep mode. However, we assume here that these parameters remain fixed throughout the operation of the system. Quite a few authors have shown interest in the performance of the sleep mode operation, either in the case of IEEE 802.16 or other technologies. In [2], the average energy consumption of the MS is obtained in the case of downlink traffic only, as well as an approximate expression for the mean packet delay. The energy consumption of the MS in the case of both downlink and uplink traffic is considered in [3]. Both [2] and [3] model the incoming (and outgoing) traffic as a Poisson process. An accurate assessment of the delay experienced at the BS buffer, however, requires a queueing model. For IEEE 802.16e, in [4], the BS buffer is modelled as a continuous-time finite-capacity queue with a Poisson arrival process and deterministic service times. A semi-Markov chain analysis leads to expressions for the mean packet delay and the mean energy consumption of the MS. The analysis in [5] is based on an M/G/1/K queueing model with multiple vacations and exhaustive service. Similar work can also be found in [6], where the length of a vacation is assumed to depend on the previous vacation length. In [7], the sleep mode operation in Cellular Digital Packet Data (CDPD) services is evaluated. The difference with IEEE 802.16e is that the subsequent sleep periods do not increase in length. The system can thus be modelled as a queueing system with multiple vacations and an exceptional first vacation. The loss probability in both [5] and [7] is obtained as well. A simulation study of CDPD sleep mode performance is found in [8]. An alternative to the exponential increase of the sleep period lengths is evaluated by simulation in [9]. All of the above models, however, specifically assume an uncorrelated arrival process (i.e.
a Poisson process), which is known to be unrealistic for most types of data traffic. The main contribution of our paper is that we analyse the sleep mode operation for traffic modelled by a D-BMAP (Discrete Batch Markovian Arrival Process). This is a very general model that can capture most traffic characteristics up to any desired precision. As we demonstrate, the presence of correlation in the carried traffic stream has a major impact on the behaviour of the BS buffer, and hence also on the efficiency of the sleep mode operation. The model we propose is also generic with respect to the lengths of the subsequent sleep periods. Instead of restricting ourselves to the exponential increase strategy of the IEEE 802.16e standard, we assume the deterministic lengths of the first, second, third, . . . , sleep period to be free parameters of the model. This way, the model can capture any deterministic sleep period updating strategy (not only exponential) and also account for the presence of the listening intervals and an additional negotiating interval when the MS enters sleep mode. The remainder of this paper is organised as follows. In Sect. 2, we describe the queueing model under study. Section 3 presents the analysis of the buffer content and the mean packet delay. The energy consumption is studied in Sect. 4 and some numerical examples are discussed in Sect. 5.
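The exponential increase strategy described above is simple to state in code. The following sketch is our own illustration (the function name and parameters are not taken from the standard):

```python
def sleep_period_lengths(t0, t_max, n):
    """First n sleep period lengths under the exponential increase
    strategy: t0, 2*t0, 4*t0, ..., capped at t_max."""
    lengths, t = [], t0
    for _ in range(n):
        lengths.append(min(t, t_max))
        t *= 2
    return lengths

# e.g. T0 = 2 slots, Tmax = 16 slots
print(sleep_period_lengths(2, 16, 6))   # [2, 4, 8, 16, 16, 16]
```

In the generic model below, this sequence is just one admissible choice for the free parameters t_1, t_2, . . .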
2 Queueing Model
We consider a discrete-time model of an infinite-capacity single-server queue. Packets are served (transmitted) according to a first-come-first-served discipline. The service times of the packets are assumed to be independent and identically distributed (i.i.d.). Let the random variable S denote a service time; its probability mass function is denoted by s(n) = Pr[S = n slots], n ≥ 1, and its probability generating function by S(z) = Σ_{n=1}^∞ s(n) z^n. The main peculiarity of our model is that whenever the server has finished all its work, it goes to sleep; an internal timer is then started and the server awakes to check the queue content after a sleep period of t1 time slots. When upon awaking the server finds that there are still no packets, it goes to sleep again, this time for a sleep period of t2 slots. The next sleep period would be of length t3, and so on. In general, t_n, n ≥ 1, denotes the length (expressed in slots) of the nth sleep period after the queue has become empty. If after a sleep period the server finds a non-empty queue, it serves all packets present at that point and also all new packets that arrive while the server is working, until the queue becomes empty again and the whole procedure is repeated. We will refer to the entire time interval between two subsequent busy periods (i.e., where the server is busy doing work) as an idle period or vacation period, which is divided into a number of sleep periods. For convenience, we also define τ_n = Σ_{i=1}^n t_i, n ≥ 1, and τ_0 = 0. Note that in case the origin of the time axis is set to coincide with the start of the idle period, each τ_n marks the end of a short listening interval where the server checks for packets in the buffer. The definition of the t_n's and the τ_n's is further illustrated in Fig. 2.
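As a concrete illustration of these definitions (our own sketch; the shifted geometric service time used here is only an example, anticipating the numerical examples of Sect. 5), the mean service time E[S] = S′(1) can be checked numerically:

```python
def S(z, s=0.4):
    """PGF of a shifted geometric service time: s(n) = (1 - s) * s**(n - 1), n >= 1."""
    return (1 - s) * z / (1 - s * z)

# E[S] = S'(1) = 1/(1 - s); approximate the derivative by a central difference.
h = 1e-6
mean_service = (S(1 + h) - S(1 - h)) / (2 * h)
print(round(mean_service, 4))   # ~ 1.6667 = 1/(1 - 0.4)
```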
Fig. 2. The anatomy of an idle period
We will assume that packets arrive in the queue according to a Markov-modulated batch arrival process. We assume a finite number N of background states. The arrival process is completely defined by the values a(k, j|i); k ≥ 0; i, j ∈ {1, · · · , N}, denoting the probability that if the background state is i during a slot, there are k arrivals during this slot and the background state during the next slot is j. We put these probabilities in a matrix generating function A(z) with dimension N × N, whose entries are defined as follows:

[A(z)]_{ij} = Σ_{k=0}^∞ a(k, j|i) z^k .  (1)
This arrival process, whereby the distribution of the number of arrivals in a slot depends on the transition made by a Markov chain in that slot, is called the Discrete Batch Markovian Arrival Process (D-BMAP) [10]. Delayed access is assumed, i.e. arriving packets do not enter the buffer until the end of their arrival slot. Let σ denote a probability vector (of dimension 1 × N), with [σ]_i the steady-state probability of being in background state i during a slot. The vector σ satisfies

σA(1) = σ , and σ1 = 1 ,  (2)

with 1 a column vector of ones of appropriate size. The probability generating function of the number of arrivals while being in background state i is denoted by A_i(z), which can be found as the ith entry of the vector A(z)1. The average number of arrivals during a slot in background state i is E[a|i] = A′_i(1); the average number of arrivals during a random slot is E[a] = σA′(1)1.
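These quantities are straightforward to compute numerically. The sketch below is our own illustration; the two-state D-BMAP used as input is a hypothetical example (Bernoulli arrivals in state 1, none in state 2), not a model prescribed by the paper:

```python
import numpy as np

def stationary_vector(P):
    """Solve sigma P = sigma, sigma 1 = 1 for a stochastic matrix P = A(1)."""
    n = P.shape[0]
    # Replace one balance equation by the normalization condition sigma 1 = 1.
    M = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
    b = np.zeros(n); b[-1] = 1.0
    return np.linalg.solve(M, b)

def mean_arrivals(A, h=1e-6):
    """E[a] = sigma A'(1) 1, with A'(1) approximated by a central difference."""
    sigma = stationary_vector(A(1.0))
    dA = (A(1.0 + h) - A(1.0 - h)) / (2 * h)
    return float(sigma @ dA @ np.ones(dA.shape[0]))

# Hypothetical 2-state D-BMAP: Bernoulli(a) arrivals in state 1, none in state 2.
a, p, q = 0.5, 0.9, 0.2   # arrival prob, P[1->1], P[2->1]
A = lambda z: np.diag([1 - a + a * z, 1.0]) @ np.array([[p, 1 - p], [q, 1 - q]])

print(stationary_vector(A(1.0)))   # steady-state background probabilities
print(mean_arrivals(A))            # = sigma[0] * a
```

For this example the stationary vector is (2/3, 1/3), so the mean arrival rate is (2/3)·0.5 = 1/3 packets per slot.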
3 Buffer Analysis
The distribution of the buffer content at the beginning of an arbitrary slot can be found most elegantly by using a few intermediary steps. First, we will derive the matrix generating function V(z), with the following entries:

[V(z)]_{ij} = Σ_{k=0}^∞ v(k, j|i) z^k ,  (3)
where v(k, j|i) is the probability that an idle period ends with k packets in the queue and with background state j, provided that the background state was i during the first slot of the idle period, as indicated in Fig. 3. Below we will derive V(z) in terms of A(z), the τ_n's and the t_n's.
Fig. 3. The definition of V(z)
The second step is to analyse the buffer content at service completions, resulting in the 1 × N vector generating function Uc (z), whose ith entry equals the partial generating function of the buffer content at a service completion instant when the background state during the following slot is i. The last step is to study the buffer content at the beginning of an arbitrary slot, which has vector generating function U(z).
3.1 Derivation of V(z)
Consider a vacation period consisting of n sleep periods. The background state during the first slot of the vacation period is i; the background state during the first slot after the vacation period is j; k packets arrive during the nth sleep period. Then we can write:

[V(z)]_{ij} = Σ_{n=1}^∞ Σ_{k=1}^∞ v_n(k, j|i) z^k ,  (4)
where v_n(k, j|i) is the probability that an idle period consists of n sleep periods and ends with a queue content of k in background state j, given that it started in state i. We know that an idle period consists of n sleep periods if and only if there are no arrivals during the first τ_{n−1} slots and there is a positive number, say k, of arrivals in the last t_n slots. Let i′ be the background state during the first slot of the nth sleep period. Then we can write:

[V(z)]_{ij} = Σ_{n=1}^∞ Σ_{k=1}^∞ Σ_{i′=1}^N Pr[0 arrivals in τ_{n−1} slots, i′|i] × Pr[k arrivals in t_n slots, j|i′] z^k .  (5)
In view of the definition of A(z), we can rewrite the above expression as

[V(z)]_{ij} = Σ_{n=1}^∞ Σ_{i′=1}^N [A^{τ_{n−1}}(0)]_{ii′} [A^{t_n}(z) − A^{t_n}(0)]_{i′j} .  (6)
The inner sum in (6) is in fact nothing more than a matrix multiplication; hence we finally find the following expression for V(z):

V(z) = Σ_{n=1}^∞ A^{τ_{n−1}}(0) {A^{t_n}(z) − A^{t_n}(0)} .  (7)
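Equation (7) can be checked numerically by truncating the infinite sum (our own sketch; the two-state arrival model and the sleep lengths are hypothetical parameters). Since every idle period eventually ends with at least one packet present, each row of V(1) must sum to one:

```python
import numpy as np

def V_truncated(z, A, sleep_len, N=200):
    """Approximate V(z) of (7) by summing the first N terms.
    sleep_len(n) returns the length t_n of the nth sleep period."""
    mpow = np.linalg.matrix_power
    A0, Az = A(0.0), A(z)
    total = np.zeros_like(A0)
    tau = 0                                   # tau_{n-1}, with tau_0 = 0
    for n in range(1, N + 1):
        tn = sleep_len(n)
        total += mpow(A0, tau) @ (mpow(Az, tn) - mpow(A0, tn))
        tau += tn
    return total

# hypothetical ON-OFF arrivals: Bernoulli(0.5) in state 1, none in state 2
A = lambda z: np.diag([0.5 + 0.5 * z, 1.0]) @ np.array([[0.9, 0.1],
                                                        [0.2, 0.8]])
tn = lambda n: min(2 ** n, 16)                # 2, 4, 8, 16, 16, ... sleep lengths
V1 = V_truncated(1.0, A, tn)
print(V1.sum(axis=1))                         # each row sums to ~1
```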
In the special case that the t_n's for n above a certain threshold J are equal, i.e. t_n = t_J for n ≥ J, we can rewrite (7) as a finite sum:

V(z) = Σ_{n=1}^J A^{τ_{n−1}}(0) {A^{t_n}(z) − A^{t_n}(0)} + A^{τ_J}(0) (I − A^{t_J}(0))^{−1} {A^{t_J}(z) − A^{t_J}(0)} ,  (8)
where I denotes the identity matrix.

3.2 Derivation of Uc(z)

In order to derive an expression for the vector generating function Uc(z) of the buffer content at service completions, i.e. in a slot following the departure of a
packet, we note the following. When the buffer content is non-zero, say u, after a certain service completion, the buffer content after the next service completion will be u − 1 + α, where α is the number of packets that arrive during the service time. When the buffer is empty after a service completion, the server will go into a series of sleep periods, and eventually awake with v packets in the buffer; hence there will be v − 1 + α packets after the next service completion. Note that V(z) is the matrix generating function corresponding to the buffer content of v packets at the end of the idle period. The matrix generating function that corresponds to the number of packets α is given by the expression S(A(z)) = Σ_{k=1}^∞ A^k(z) s(k). These observations lead to the following equation for Uc(z):

Uc(z) = {Uc(z) − Uc(0)} S(A(z))/z + Uc(0) V(z) S(A(z))/z .  (9)

3.3 Derivation of U(z)
To find the distribution of the buffer content at the beginning of an arbitrary slot, we first derive the distribution at the beginning of an arbitrary service slot, and the distribution at the beginning of an arbitrary vacation slot. Then we combine these two. Note that the following relationship exists between the vector generating function Ub(z) of the buffer content at the beginning of a service and the vector generating function Uc(z) of the buffer content at the end of a service:

Ub(z) S(A(z))/z = Uc(z) .  (10)
Indeed, the matrix generating function S(A(z)) represents the number of packets that arrive during a service time, and the factor z^{−1} is due to the fact that one packet leaves the buffer at the end of the service. Combination of (9) and (10) then leads to

Ub(z) = Uc(z) − Uc(0) + Uc(0) V(z) .  (11)

The vector generating function UB(z) of the buffer content at the beginning of a random service (busy) slot satisfies the following relation [11]:

UB(z) = (1/E[S]) E_S[ Σ_{j=0}^{S−1} Ub(z) A^j(z) ]  (12)
      = (1/E[S]) Ub(z) (I − S(A(z))) (I − A(z))^{−1} ,  (13)

where E[S] is the mean length of a service time. In order to study the buffer content at the beginning of a vacation slot, let u_0 denote the probability that the buffer content is zero at the end of a service. Note that u_0 = Uc(0) 1. Also, we introduce the vectors v_n. The ith entry of v_n, [v_n]_i, is the probability that during an idle period, there are at least n sleep
periods, and the nth sleep period starts in background state i. The vectors v_n, n ≥ 1, can be expressed as

v_n = (1/u_0) Uc(0) A^{τ_{n−1}}(0) .  (14)
The (partial) vector generating function UV,n(z) (n ≥ 1) of the buffer content at the beginning of a random slot during the nth sleep period of a vacation can then be found as follows. Considering such a slot, we know that it could have any position j (0 ≤ j ≤ t_n − 1) within the sleep period with equal probability 1/t_n. Also, assuming the position of this slot is j, the matrix generating function of the number of arrivals since the start of the sleep period is A^j(z). We find

UV,n(z) = v_n (1/t_n) Σ_{j=0}^{t_n−1} A^j(z)  (15)
        = v_n (1/t_n) (I − A^{t_n}(z)) (I − A(z))^{−1} .  (16)
Hence, the vector generating function UV(z) of the buffer content at the beginning of a random slot in vacation is obtained as

UV(z) = (1/E[length vacation]) Σ_{n=1}^∞ t_n UV,n(z) ,  (17)

where we have to divide by the mean vacation length to obtain a normalized probability generating function. The mean E[length vacation] can be obtained by

E[length vacation] = Σ_{n=1}^∞ t_n v_n 1 .  (18)
Now, we have all the necessary parts to construct the vector generating function U(z) of the buffer content at the beginning of a random slot. The fraction of busy slots is equal to the load of the queueing system, ρ = E[a]E[S]. Hence, the fraction of vacation slots is equal to 1 − ρ. Therefore,

U(z) = ρ UB(z) + (1 − ρ) UV(z) .  (19)

3.4 Mean Queue Content and Mean Delay
The mean queue length E[U] is given by:

E[U] = U′(1) 1 .  (20)

In the context of the IEEE 802.16e sleep mode operation, a very important performance measure is the mean delay E[D], which we can find via Little's law:

E[U] = E[a] E[D] .  (21)
4 Energy Consumption
As stated earlier, the goal of the sleep mode operation is to reduce the energy consumption of a mobile device, by switching off its radio interface or antenna as much as possible. For this analysis, we assume that the main energy cost of the mobile device is to keep the radio interface activated, and that a listening slot (at the end of each sleep period) costs just as much energy as a random busy slot. The literature seems to suggest that these assumptions are fair (see e.g. [1]). Under these assumptions, the antenna activity rate Pr[active], i.e. the fraction of slots that the antenna is active, is a good measure for the energy consumption. In the reference case without sleep periods, this probability is equal to 1. In the general case, we can derive the following formula for Pr[active]:

Pr[active] = ρ + (1 − ρ) E[# sleep periods per vacation] / E[length vacation] .  (22)

The first term denotes the fraction of busy slots, while the second denotes the fraction of time that the antenna is activated during idle periods, i.e. at the end of each sleep period. The value E[# sleep periods per vacation] is equal to:

E[# sleep periods per vacation] = Σ_{i=1}^∞ v_i 1 ,  (23)

due to the definition of v_i, and the property that for any discrete random variable Y, E[Y] = Σ_{i=1}^∞ Pr[Y ≥ i]. The mean E[length vacation] is given by expression (18). Note that the vector Uc(0) is needed to calculate Pr[active], hence we cannot determine the activity rate without a buffer analysis, at least not when traffic is correlated.
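The formulas (22)–(23) can be exercised numerically once Uc(0) is available from the buffer analysis. In the sketch below, Uc(0) is a made-up vector chosen purely for illustration (as noted above, the true value requires the analysis of Sect. 3); the arrival model and sleep lengths are hypothetical as well:

```python
import numpy as np

def activity_rate(rho, Uc0, A0, t, J):
    """Pr[active] via (22)-(23), with v_n = Uc(0) A^{tau_{n-1}}(0) / u0.
    t[0..J-1] are t_1..t_J; t_n = t_J for n >= J (tail summed in closed form)."""
    mpow = np.linalg.matrix_power
    ones = np.ones(A0.shape[0])
    u0 = Uc0.sum()                     # u0 = Uc(0) 1
    n_periods = 0.0                    # E[# sleep periods per vacation], eq. (23)
    vac_len = 0.0                      # E[length vacation], eq. (18)
    tau = 0
    for n in range(J):
        vn = Uc0 @ mpow(A0, tau) / u0
        n_periods += vn @ ones
        vac_len += t[n] * (vn @ ones)
        tau += t[n]
    tJ = t[J - 1]                      # geometric tail for n > J
    tail = Uc0 @ mpow(A0, tau) @ np.linalg.inv(np.eye(A0.shape[0]) - mpow(A0, tJ)) / u0
    n_periods += tail @ ones
    vac_len += tJ * (tail @ ones)
    return rho + (1 - rho) * n_periods / vac_len

# hypothetical inputs: ON-OFF arrivals and a made-up Uc(0)
A = lambda z: np.diag([0.5 + 0.5 * z, 1.0]) @ np.array([[0.9, 0.1],
                                                        [0.2, 0.8]])
Uc0 = np.array([0.05, 0.15])          # illustration only; really obtained in Sect. 3
rate = activity_rate(rho=0.4, Uc0=Uc0, A0=A(0.0), t=[2, 4, 8, 16], J=4)
print(rate)
```

Since every sleep period lasts at least 2 slots here, the second term of (22) is below (1 − ρ)/2, so the result lies strictly between ρ and 1.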
5 Numerical Examples
In this section, we will show some practical results of our model. Note that we will use fairly simple assumptions, which makes it easier to interpret the results. Our model is however not restricted to these fairly idealistic assumptions, and can also handle more powerful traffic models, service time distributions etc. We will assume a so-called ON-OFF traffic model, i.e. the arrival process is modulated by a two-state Markov chain, and while the process is in state 1 (the ON state), arrivals occur according to a Bernoulli process with parameter a. When in state 2, the OFF state, no arrivals occur. The Markov chain spends a fraction σ of the time in the ON state. The coefficient of correlation between the states in two subsequent slots is 1 − 1/K, where K is a measure for the mean lengths of both ON and OFF periods, given by K/(1 − σ) and K/σ respectively. It can be easily verified that the matrix generating function A(z) for this traffic model equals

A(z) = ( 1 − a + az   0 ) ( 1 − (1−σ)/K   (1−σ)/K )
       ( 0            1 ) ( σ/K           1 − σ/K  ) .  (24)
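A quick numerical check of (24) (our own sketch; a, σ and K are free parameters): the vector (σ, 1 − σ) is stationary for A(1), and the eigenvalues of A(1) are 1 and 1 − 1/K, the lag-one correlation of the background chain.

```python
import numpy as np

def onoff_A(z, a, sigma, K):
    """Matrix generating function (24) of the ON-OFF model:
    Bernoulli(a) arrivals in the ON state, none in the OFF state."""
    arrivals = np.diag([1 - a + a * z, 1.0])
    P = np.array([[1 - (1 - sigma) / K, (1 - sigma) / K],
                  [sigma / K,           1 - sigma / K]])
    return arrivals @ P

a, sigma, K = 0.5, 2 / 3, 10
P = onoff_A(1.0, a, sigma, K)              # transition matrix of the background chain
pi = np.array([sigma, 1 - sigma])
print(pi @ P)                              # equals pi: (sigma, 1 - sigma) is stationary
print(sorted(np.linalg.eigvals(P).real))   # [1 - 1/K, 1.0]
```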
Fig. 4. In the left figure: Mean delay versus the load for a (2, 4, 8, 16, 32, 32, · · · ) strategy for different K's, where σ = 2/3 and E[S] = 5/3. Right figure: Antenna activity rate versus the load for the same parameters.
Further, we assume that the service times have a so-called shifted geometric distribution, i.e. its probability generating function is equal to

S(z) = (1 − s)z / (1 − sz) ,  (25)

where 0 ≤ s < 1 and E[S] = 1/(1 − s). For convenience, we introduce notations like (2, 4, 8, 8, · · · ) to denote a strategy in which the first sleep period equals 2 slots, the second 4 slots, and every sleep period thereafter equals 8 slots. In Fig. 4, the evolution of respectively the mean delay E[D] and of the antenna activity rate Pr[active] is plotted as a function of the load ρ for different values of the correlation parameter K. We observe that for high values of ρ, the mean delay grows faster as the traffic correlation increases. This is in line with the common wisdom that traffic correlation tends to deteriorate the system performance. An observation which is more specific to this model is that for low loads, the mean delay is fairly high, and then decreases to reach a certain minimum, before increasing again when ρ → 1. This is due to the fact that for very small loads, the system will almost always be in the longest sleep period (in this example, of 32 slots) when a packet arrives. The expected number of slots until the service of such a packet starts is equal to Tmax/2 (here: 16 slots), such that the expected delay for loads approaching zero is equal to Tmax/2 + E[S]. Numerical examples confirm this observation. When we look at the activity rate Pr[active], we see that regardless of the correlation, Pr[active] → 1 for ρ → 1, and Pr[active] → 1/Tmax for ρ → 0, which is what we expect. Perhaps more remarkable is that traffic correlation lowers the activity rate, and hence is good for the battery life. Figure 5 shows the influence of the correlation parameter K on the mean delay E[D] more directly. We choose σ = 2/3, E[S] = 5/3 and strategy (2, 4, 8, 16, 32, 32, · · · ), for different loads. We see that E[D] either converges to a certain value, or keeps increasing rapidly. The determining factor is the load during ON periods,
An Analytic Model of IEEE 802.16e Sleep Mode Operation
119
Fig. 5. Mean delay E[D] versus the correlation factor K for different loads (ρ = 1/9, ρ1 = 1/6; ρ = 4/9, ρ1 = 2/3; ρ = 2/3, ρ1 = 1; ρ = 7/9, ρ1 = 7/6), where σ = 2/3 and E[S] = 5/3, with a (2, 4, 8, 16, 32, 32, · · · ) strategy
Fig. 6. In the left figure: Mean delay versus the load for strategies A : (16, 16, · · · ), B : (2, 4, 8, 16, 16, · · · ) and C : (1, 2, 3, 4, · · · , 15, 16, 16, · · · ), for K = 30, σ = 2/3 and E[S] = 5/3. Right figure: Antenna activity rate versus the load for the same parameters.
ρ1 = aE[S]. If ρ1 > 1, the buffer builds up dramatically during ON periods and, as long as ρ itself is smaller than 1, drains again during OFF periods. These large fluctuations in buffer occupancy result in a large average delay, which becomes ever larger as the expected lengths of the ON and OFF periods increase (i.e. as K increases). In Fig. 6, we show how well different sleep mode strategies fare, as a function of the load. We already know what the influence of Tmax is when the load approaches zero. We now investigate the influence on the delay and the activity rate if this maximum is reached directly, i.e. strategy (Tmax, Tmax, · · · ), exponentially as in the IEEE standard, i.e. (2, 4, 8, · · · , Tmax, Tmax, · · · ), or more slowly, for example linearly, (1, 2, 3, · · · , Tmax, Tmax, · · · ). It appears that the activity rate versus the load, which is a straight line between (0, 1/Tmax) and (1, 1) when Tmax is reached directly, increases when Tmax
120
K. De Turck et al.
is reached more slowly, especially so for 'reasonable' ρ (i.e. 0.2 < ρ < 0.6); hence such slow-start strategies are less energy efficient. On the other hand, the mean delay in the same interval benefits considerably from slow-start strategies. The best strategy depends of course on the trade-off one is willing to make between energy consumption and quality of service, but it appears to us that sleep mode operation provides powerful instruments to reach an acceptable compromise.
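The trade-off described here can be explored numerically. The sketch below is our own simplification, not the paper's analytic model: it first checks that the shifted geometric service time has mean E[S] = S'(1) = 1/(1 − s), and then runs a toy slot-level simulation with Bernoulli (uncorrelated) arrivals, in which the antenna wakes for one listening slot after each sleep period and restarts its strategy whenever it finds traffic.

```python
import random

def pgf(z, s):
    """PGF of the shifted geometric service time, S(z) = (1-s)z / (1-sz)."""
    return (1 - s) * z / (1 - s * z)

# Sanity check: E[S] = S'(1) = 1/(1-s), via a central finite difference.
s, h = 0.4, 1e-6
mean_service = (pgf(1 + h, s) - pgf(1 - h, s)) / (2 * h)
print(f"E[S] check: {mean_service:.4f} vs {1 / (1 - s):.4f}")

def activity_rate(strategy, arrival_prob, s=0.4, slots=200_000, seed=1):
    """Fraction of slots the antenna is active in a simplified sleep-mode
    queue: Bernoulli arrivals, shifted geometric service, sleep periods
    taken from `strategy` (last entry repeats), one listening slot per wake."""
    rng = random.Random(seed)
    queue, active, sleep_left, idx = 0, 0, 0, -1
    for _ in range(slots):
        if rng.random() < arrival_prob:
            queue += 1
        if sleep_left > 0:                 # asleep: antenna off
            sleep_left -= 1
            continue
        active += 1                        # awake: listening and/or serving
        if queue > 0:
            idx = -1                       # traffic found: restart strategy
            if rng.random() < 1 - s:       # service completes this slot
                queue -= 1
        else:                              # still idle: next sleep period
            idx = min(idx + 1, len(strategy) - 1)
            sleep_left = strategy[idx]
    return active / slots

A = (16,)                 # jump straight to T_max
B = (2, 4, 8, 16)         # exponential increase, as in the IEEE standard
C = tuple(range(1, 17))   # linear increase
for name, strat in (("A", A), ("B", B), ("C", C)):
    print(name, round(activity_rate(strat, 0.24), 3))   # load rho ~ 0.4
```

Under this simplification, the slower-starting strategies spend noticeably more slots awake at moderate loads, in line with the energy-efficiency observation above.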
Real Life Field Trial over a Pre-mobile WiMAX System with 4th Order Diversity

Pål Grønsund1, Paal Engelstad2, Moti Ayoun3, and Tor Skeie4

1 Department of Informatics, University of Oslo, PO Box 1080, 0316 Blindern, Norway
2 Telenor R&I, PO Box 1331, Fornebu, Norway
3 Alvarion Ltd., 21a HaBarzel St., Tel Aviv, 69710 Israel
4 Simula Research Laboratory, PO Box 134, 1325 Lysaker, Norway

[email protected], [email protected], [email protected], tskeie@{ifi.uio.no, simula.no}
Abstract. Mobile WiMAX is a promising wireless technology approaching market deployment. Much discussion concentrates on whether mobile WiMAX will reach a tipping point and become 4G. As a pre-mobile WiMAX system was being delivered, we decided to set up a real life field trial and perform the most important measurements over the system setup. The system was delivered with 4th order diversity. In this paper we analyze physical system performance based on field trial measurements, especially at locations with non line of sight conditions in urban areas. We investigate the gain of 2nd and 4th order base station diversity and derive analytical expressions for it. The system path loss is plotted and found to approach the Cost 231 Hata model for urban areas. Throughput is also measured and analyzed. Sub-channelization in the uplink turns out to be an important feature for enhanced coverage. Keywords: Mobile WiMAX, field trial, throughput, diversity.
1 Introduction

Mobile WiMAX is a mobile broadband wireless access system which offers high throughput, wide coverage, flexible Quality of Service (QoS) support and extensive security. Mobility is added to the former success of fixed WiMAX, which uncovers a new range of mobile services in addition to the fixed broadband services. Mobile WiMAX is certified by the WiMAX Forum [1]; the certification mark is awarded to products based on IEEE 802.16 [2] that pass conformity and interoperability tests. There are two main classes of WiMAX systems, called fixed WiMAX and mobile WiMAX. Fixed WiMAX is targeted at providing fixed and nomadic services, while mobile WiMAX can also be used for providing portable and (simple and full) mobile services. The system studied here is a partial or pre-implementation of mobile WiMAX. The air interface of mobile WiMAX uses Orthogonal Frequency Division Multiple Access (OFDMA), which is very robust against multi-path propagation that causes frequency selective fading. Multiple access is implemented where sub-channels, consisting of a set of sub-carriers, are allocated for transmissions to and from

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 121–132, 2007. © Springer-Verlag Berlin Heidelberg 2007
122
P. Grønsund et al.
different users. Time Division Duplexing (TDD) provides an efficient use of the available bandwidth, where flexible amounts may be divided between uplink and downlink. The mobile WiMAX MAC layer is connection oriented and supports flexible QoS guarantees such as constant bit rate, guaranteed bandwidths and best effort. The pre-mobile WiMAX system studied in this paper operated in the 3.5 GHz frequency band, with a fully implemented TDD scheme where the uplink/downlink ratio could be flexibly adjusted over a total bandwidth of 5 MHz. A partial implementation of OFDMA was used in the uplink, with groups of 16, 8 or 4 sub-channels where one Subscriber Unit (SU) can transmit per symbol. Orthogonal Frequency Division Multiplexing (OFDM) was used in the downlink. The Base Station (BS) was provided with 4th order diversity, and could easily be configured with 2nd order or no diversity.

At the time of writing this article, as far as we know, no published material exists about mobile WiMAX and real life field trial measurements. As a pre-mobile WiMAX system was manufactured in early 2007, we decided to set up a test bed and perform the most important measurements to analyze the system performance. Throughput was measured with the transport protocol UDP, and physical performance was measured for the most important attributes, Received Signal Strength Indicator (RSSI) and Signal to Noise Ratio (SNR). At a subset of the locations we performed more extensive measurements to analyze the impact 2nd and 4th order diversity had on the system performance. The main contribution of this paper is to present measurement results and a comprehensive analysis of a real life pre-mobile WiMAX field trial, where expected path loss and throughput are presented. A second contribution is a study of the performance of diversity applied in WiMAX, and the proposal of analytical expressions for the expected gain with the use of different diversity orders.

The organization of the rest of this paper is as follows: Chapter 2 presents the system setup. The measurement procedures are given in chapter 3. Chapters 4 and 5 present the results for physical and throughput performance respectively. The impact of diversity and different diversity orders is given in chapter 6 before conclusions are drawn in chapter 7.
2 System Setup

2.1 System Description

BreezeMAX TDD delivered by Alvarion [3] is a WiMAX-ready platform operating in TDD mode with a 5 MHz bandwidth in the 3.5 GHz frequency band. It is a Point to MultiPoint (PMP) radio access system, where a BS serves mobile, nomadic and fixed Subscriber Units (SU). The BS was set up with 3 Access Unit Indoor Units (AU-IDU), each constituting a sector with 120˚ beamwidth. The BS was configured with 4th order diversity. Both 4th order transmit diversity using Cyclic Delay Diversity (CDD) and 4th order receive diversity using Maximum Receive Ratio Combining (MRRC) were used at the BS. This was set up with each of the three AU-IDUs connecting to 4 AU Outdoor Units (AU-ODU), which were paired such that AU-ODU 1 and 2 form one pair and
Fig. 1. AU-IDU connected to 4 AU-ODU paired 2x2 and connected to double polarized antennas
AU-ODU 3 and 4 form a second pair. Each pair was then connected to a dual polarization slant antenna as illustrated in Fig. 1. The total BS then consists of 1 NPU, which is the heart of the BS, 3 AU-IDUs, 12 AU-ODUs and 6 antennas. It can be seen from Fig. 1 that one dual polarization slant antenna was used for 2nd order polarization diversity. Two dual polarization slant antennas were used for 4th order diversity, with space diversity between the antennas and polarization diversity in each antenna. All AU-ODUs were set up with the same frequency and transmit power, and all share a common MAC and modem. Output power at the antenna port was 34 dBm for each AU-ODU. Total output power for each sector with four antennas is therefore 40 dBm. Each antenna has a 13 dBi gain. The measurements were performed using a Self Install (Si) CPE, which is a compact SU intended for indoor installations. The Si CPE includes 6 internal 60˚ antennas
Fig. 2. Sub-channelization, with one SU per symbol
providing full 360˚ coverage, and connects to the end-user equipment through a 100 Base-T Ethernet interface. It uses Automatic Transmit Power Control (ATPC), where the maximum transmission power is 22 dBm. Antenna gain is 9 dBi. Sub-channelization was implemented in the UL by using the OFDMA technique, with the limitation that only one SU can transmit per symbol (Fig. 2). The gain is therefore not multiple access over OFDMA, but that one SU with weak link quality can focus its power on fewer sub-channels to obtain connectivity. The sub-channels may be grouped in 16, 8 or 4. A 3 dB gain in RSSI should be obtained when halving the number of sub-channels used. OFDM with a 256-point FFT was used in the downlink. The TDD ratio was set to 50/50, thus half the bandwidth was available for DL and the other half for UL. The theoretical bitrates are listed for each modulation rate, together with the DL and UL bitrates when the TDD ratio is set to 50/50, in Table 1.

Table 1. Theoretical bitrate (5 MHz) in full TDD and with UL/DL TDD ratio 50/50

#   Modulation Rate   Bitrate (Mbps)   50/50 Bitrate (Mbps)
1   BPSK 1/2          1.92             0.96
2   BPSK 3/4          N/A              N/A
3   QPSK 1/2          3.84             1.92
4   QPSK 3/4          5.76             2.88
5   QAM16 1/2         7.68             3.84
6   QAM16 3/4         11.52            5.76
7   QAM64 2/3         15.36            7.68
8   QAM64 3/4         17.28            8.64
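The tabulated rates follow a simple pattern: each full-TDD entry equals 3.84 Mbps times the bits per modulation symbol times the coding rate, and the 50/50 entries are exactly half. A sketch (the 3.84 Mbps base coded-bit rate is inferred from the table, not stated in the paper):

```python
# Reconstruct Table 1 from the apparent underlying rule: bitrate =
# 3.84 Mbps * (bits per modulation symbol) * (coding rate); the 50/50
# TDD entries are exactly half. The 3.84 Mbps base rate is inferred
# from the table values, not stated in the paper.
BASE_MBPS = 3.84

SCHEMES = [            # (name, bits per symbol, coding rate)
    ("BPSK 1/2", 1, 1 / 2),
    ("QPSK 1/2", 2, 1 / 2),
    ("QPSK 3/4", 2, 3 / 4),
    ("QAM16 1/2", 4, 1 / 2),
    ("QAM16 3/4", 4, 3 / 4),
    ("QAM64 2/3", 6, 2 / 3),
    ("QAM64 3/4", 6, 3 / 4),
]

def bitrate_mbps(bits_per_symbol, coding_rate, tdd_share=1.0):
    return BASE_MBPS * bits_per_symbol * coding_rate * tdd_share

for name, bits, rate in SCHEMES:
    print(f"{name:9s} {bitrate_mbps(bits, rate):5.2f} "
          f"{bitrate_mbps(bits, rate, tdd_share=0.5):5.2f}")
```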
2.2 Measurement Area

The BS is set up in Hamar, Norway, for a pilot project to be run by the internet service provider NextNet AS. Hamar is a smaller city with a population of 27,600 and a total area of 351 square kilometers. The city centre, where most of the measurements were performed, consists of five-floor-high buildings, and most of the city may be considered rural. The BS is elevated 20 meters above ground level and is positioned on the roof of a building.
3 Measurements

To measure physical and throughput performance, the SU was placed on the dashboard of a car. We drove around Hamar and stopped at 40 locations where measurements were performed. Some of the measurements were also performed inside restaurants. We implemented a measurement procedure to be performed at each of the locations. This procedure first accesses the SU and performs the physical measurements. This includes measuring the UL and DL RSSI, UL and DL SNR, UL and DL modulation rate, transmission (Tx) power and the number of sub-channels used in the UL.
Secondly, the script performed throughput measurements with the transport protocol UDP for both UL and DL. Iperf [4] was used for the throughput measurements. We wanted to measure the difference and gain when using 4th order, 2nd order and no diversity. These measurements were also performed with the SU placed on the dashboard of the car. Upon arriving at a random location, we first measured without diversity, performing the physical and throughput measurements. We then switched to 2nd and 4th order diversity and performed physical and throughput measurements for each. Diversity measurements were performed at 8 locations.
4 Physical Performance

4.1 Received Signal Strength Indicator

RSSI was measured at each location for both DL and UL. It is useful to relate RSSI to the distance between SU and BS, which gives an idea of the system path loss. To get a view of the system performance, it is interesting to compare the RSSI versus distance relation with well established path loss models. We used the Free Space Loss (FSL) model and the Cost 231 Hata [5] models for urban and suburban areas as comparisons to our measurements. The UL RSSI values reported by the BS are the maximum RSSI from the best antenna, and not the combined RSSI from all antennas, which would reflect the real situation.
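For reference, the two comparison models can be evaluated directly. This is a sketch using the standard formulas with assumed antenna heights (20 m BS, as in Sec. 2.2, and a nominal 1.5 m SU height); note that COST 231 Hata is specified for 1500–2000 MHz, so applying it at 3.5 GHz, as done here, is an extrapolation:

```python
import math

def fsl_db(d_km, f_mhz):
    """Free Space Loss in dB (d in km, f in MHz)."""
    return 32.45 + 20 * math.log10(f_mhz) + 20 * math.log10(d_km)

def cost231_hata_db(d_km, f_mhz, h_base=20.0, h_mobile=1.5, urban=True):
    """COST 231 Hata path loss in dB (specified for 1500-2000 MHz;
    used at 3.5 GHz here as an extrapolation, as in the paper's Fig. 3)."""
    # Mobile antenna correction for a small/medium city.
    a_hm = ((1.1 * math.log10(f_mhz) - 0.7) * h_mobile
            - (1.56 * math.log10(f_mhz) - 0.8))
    c = 3.0 if urban else 0.0          # metropolitan correction term
    return (46.3 + 33.9 * math.log10(f_mhz) - 13.82 * math.log10(h_base)
            - a_hm + (44.9 - 6.55 * math.log10(h_base)) * math.log10(d_km) + c)

f = 3500.0  # MHz
for d in (0.5, 1.0, 2.0):
    print(d, round(fsl_db(d, f), 1),
          round(cost231_hata_db(d, f, urban=False), 1),
          round(cost231_hata_db(d, f, urban=True), 1))
```

As in Fig. 3, the FSL curve is the most optimistic, with the suburban and urban Hata curves predicting progressively higher loss at every distance.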
Fig. 3. RSSI versus Distance for DL (dark dots) and UL (grey triangles), plotted with FSL (topmost) and Cost 231 Hata models for suburban (middle line) and urban (bottom line) areas
Secondly, the UL RSSI report is a density RSSI; thus, when comparing the results one needs to normalize the measurements where sub-channelization is used. This means adding -3 dB when 8 sub-channels are used and -6 dB when 4 sub-channels are used. Fig. 3 shows DL and UL RSSI values plotted versus the distance between SU and BS, together with the well known path loss models configured with the properties of the system in use. From the comparison in Fig. 3 we see that our RSSI values are similar to the Cost 231 Hata models. This was expected, since we operate with a mobile system as used for constructing the Hata models. The measurements at farther distances are more similar to the Cost 231 Hata model for suburban areas and approach FSL. This conforms to our measurements, where locations at shorter distances were in urban areas and those at farther distances in rural areas. At farther distances there were also fewer obstacles between the SU and BS.

4.2 Signal to Noise Ratio

SNR may sometimes be a better measure than RSSI because background noise and interference on the received signal are considered. Only one BS with three sectors operating on different frequencies is present in our measurements, and the SU is operating alone in the system; thus little interference is expected. It is interesting to see how the SNR values versus distance (Fig. 4) follow the same pattern as RSSI versus distance.
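The UL RSSI normalization described in Sec. 4.1 amounts to subtracting 3 dB for each halving of the sub-channel group; a minimal helper (our own formulation of the rule stated above):

```python
import math

def normalize_ul_rssi(rssi_dbm, n_subchannels):
    """Normalize a density UL RSSI report to the full 16-sub-channel case:
    subtract 3 dB for each halving of the sub-channel group (16 -> 8 -> 4)."""
    assert n_subchannels in (4, 8, 16)
    return rssi_dbm - 3 * math.log2(16 / n_subchannels)

for n in (16, 8, 4):
    print(n, normalize_ul_rssi(-80.0, n))
```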
Fig. 4. SNR vs Distance and the logarithmic regressions for DL (black) and UL (light)
The points seem to follow a logarithmic decrease, as expected from the linearity between SNR and RSSI values. Since the SNR versus distance relation seems to follow a logarithmic decrease, we applied logarithmic regression to the UL points and to the DL points separately. The resulting equations are of the form:
y = a + b ln(d) ,   (1)
where 'a' is the SNR value in dB at zero distance and 'b' the coefficient relative to the distance 'd'. Logarithmic regression for SNR in the DL and UL resulted in the formulas:

SNR_DL = 20.8 - 4.65 ln(d) ,   (2)

SNR_UL = 9.1 - 5.78 ln(d) .   (3)
It can be seen that the coefficient for UL SNR (Eq. 3), -5.78, gives a greater decrease than that for DL SNR (Eq. 2), -4.65. A reason for this may be a higher gain from BS transmit diversity than from BS receive diversity. Another point is that UL SNR dropped to zero at locations with bad line of sight, which did not happen for the DL SNR values.
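The fit itself is ordinary least squares in ln(d). As a sketch, generating noise-free points from Eq. 2 at a few illustrative distances and refitting recovers its coefficients (the distances are our own, not measurement locations):

```python
import numpy as np

# Least-squares fit of y = a + b*ln(d): fit a degree-1 polynomial in ln(d).
a_true, b_true = 20.8, -4.65                    # Eq. 2 coefficients
d = np.array([0.2, 0.5, 1.0, 1.5, 2.0, 3.0])   # distances in km (illustrative)
snr = a_true + b_true * np.log(d)               # synthetic, noise-free DL SNR

b_fit, a_fit = np.polyfit(np.log(d), snr, 1)    # returns slope, then intercept
print(f"a = {a_fit:.2f}, b = {b_fit:.2f}")
```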
5 Throughput Performance

Throughput was tested with the transport protocol UDP at all locations. UDP is suitable for throughput testing purposes because it best approximates the actual bitrate, due to the minor overhead added by the protocol. It is interesting to see what bitrates we obtained at locations relative to the distance between the SU and BS. Fig. 5 plots UDP bitrate versus distance, where it can be seen that it is difficult to state a formula for the bitrate relative to the distance. All locations were randomly chosen and are considered to have NLOS conditions. The actual propagation paths towards the single deployed BS differed from location to location, and it is therefore difficult to state a model based on this small number of measurements. More suitable propagation paths could have been obtained if more BSs had been deployed.
Fig. 5. UDP bitrates plotted versus Distance for DL (black squares) and UL (light triangles)
It can be seen from Fig. 5 that UL throughput is much weaker than DL at most of the locations. One reason for this is that the BS transmits with greater power than the SU. Secondly, BS transmit diversity should be more effective than BS receive diversity. It shall also be noted that the DL always uses the full bandwidth, whereas the UL may use a smaller number of sub-channels. It is difficult to state any coverage range based on this number of measurements, but with NLOS conditions one should generally stay within 1 km to be able to perform adequately in the UL. It is important to note that the measurements were performed with an indoor unit in NLOS environments. Maximum and minimum UDP performance in the UL was 6.16 Mbps and 0.094 Mbps respectively, and 6.20 Mbps and 0.61 Mbps in the DL. UL averaged 1.59 Mbps and DL averaged 2.66 Mbps.
6 Diversity Impact

Measurements were performed at 8 locations, where no diversity, 2nd order and 4th order diversity were measured. It is interesting to survey the gain when adding an order of diversity. The following subsections investigate the gain in RSSI, SNR, modulation rate and throughput for different levels of diversity in both DL and UL.
Fig. 6. RSSI (a), SNR (b), Modulation Rate (c) and UDP bitrate (d) are plotted versus Diversity Order for DL (black) and UL (grey)
Fig. 6 plots the RSSI (a), SNR (b), modulation rate (c) and UDP bitrate (d) values obtained at all locations versus the diversity order in DL (black points) and UL (light points). We performed linear regression on the DL and UL points plotted in Fig. 6. The linear regression gave functions as listed in Table 2, where the variable 'x' is the diversity order, which may be 1, 2 or 4. The functions are of the form:

y = ax + b ,   (4)

where 'a' is the increasing factor for each added diversity order 'x', and 'b' is the parameter value for no diversity.

Table 2. Equations deduced with linear regression, where the diversity order is x ∈ {1, 2, 4}

            DL                     UL
RSSI(x)     1.41x - 84.5   (5)     N/A
SNR(x)      1.25x + 16.5   (6)     0.34x + 7.9   (7)
ModRate(x)  0.44x + 4.3    (8)     0.41x + 1.3   (9)
UDP(x)      0.35x + 2.6    (10)    0.13x + 0.5   (11)
The functions given in Table 2 are based on the assumption that the gain when going from no to 2nd order diversity is the same as that going from 2nd to 4th order diversity, and vice versa. We therefore also applied separate linear regressions to the no to 2nd order diversity points and to the 2nd to 4th order diversity points. The resulting regression coefficients, i.e. the 'a' parameter in Eq. 4, are listed in Table 3, together with the ratio between the 2nd to 4th order and no to 2nd order diversity gains (4th/2nd).

Table 3. Coefficients from linear regression applied on no to 2nd and 2nd to 4th order diversity in DL and UL for RSSI, SNR, Modulation Rate and UDP bitrate, with the ratio between the diversity shifts

Parameter        2nd     4th     4th/2nd
RSSI        DL   1.12    1.69    1.51
            UL   N/A     N/A     N/A
SNR         DL   1.31    1.19    0.9
            UL   0.19    0.5     2.63
ModRate     DL   0.12    0.75    6.25
            UL   0.06    0.75    12.5
UDP bitrate DL   0.33    0.36    1.09
            UL   0.12    0.14    1.17
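The 4th/2nd column of Table 3 is simply the ratio of the two per-shift regression coefficients; recomputing it serves as a quick sanity check:

```python
# Recompute the 4th/2nd column of Table 3 from the per-shift regression
# coefficients (no -> 2nd and 2nd -> 4th order diversity).
slopes = {
    ("RSSI", "DL"): (1.12, 1.69),
    ("SNR", "DL"): (1.31, 1.19),
    ("SNR", "UL"): (0.19, 0.50),
    ("ModRate", "DL"): (0.12, 0.75),
    ("ModRate", "UL"): (0.06, 0.75),
    ("UDP bitrate", "DL"): (0.33, 0.36),
    ("UDP bitrate", "UL"): (0.12, 0.14),
}
ratios = {key: g4 / g2 for key, (g2, g4) in slopes.items()}
for (param, link), r in ratios.items():
    print(f"{param:12s} {link}: {r:.2f}")
```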
It can be read from the ratios in Table 3 that it may be desirable to deduce formulas for the single diversity order shifts instead of using the formulas found in Table 2. The UL formulas deviate most, whereas the DL formulas are more valid. The following sub-sections analyse RSSI, SNR, modulation rate and UDP bitrate with regard to the performance of, and with, different diversity orders.
6.1 Received Signal Strength Indicator
Fig. 6(a) plots the RSSI values for no, 2nd and 4th order diversity in DL and UL. Linear regression applied to the DL RSSI values gave us Eq. 5, which tells us that a 1.41 dB increase in RSSI is expected on average for each added diversity order 'x'. An increase in DL RSSI was expected, since more power was transmitted when more antennas were used for diversity. A formula for the DL RSSI with deviations may be given as:
DL_RSSI(x) = 1.41(±0.28)x - 84.5 ,  x ∈ {0, 2, 4} ,   (12)
where 'x' is the diversity order. The UL RSSI reported by the BS is the maximum RSSI from the best antenna, and not the combined RSSI from all antennas. The expected result is therefore a coefficient describing a zero increase, and other measurements will better describe the real improvement. UL RSSI is therefore marked as not applicable in Table 2 and Table 3.

6.2 Modulation Rate
Fig. 6(c) plots the modulation rate as a function of diversity order for both DL and UL, and the respective linear regressions are given in Eq. 8 and Eq. 9. The modulation rates are identified by the numbers given in Table 1. From Table 3 we find large deviations between the no to 2nd and the 2nd to 4th order diversity shifts in both DL and UL. A much greater gain is found in the shift from 2nd to 4th than from no to 2nd order diversity. A reason for this may be that to advance one level in modulation rate from the previous one, an increase of 3 dB in RSSI is needed. The probability of this is less when increasing by 1.12 dB by shifting to 2nd order diversity than when increasing by 1.69 dB by shifting to 4th order diversity: 37% versus 56%, respectively. Eq. 8 and 9 are therefore not valid. Instead of linear regression, quadratic regression better models the modulation rate increase as a function of diversity order. The equations for the Rx and Tx modulation rates are of the form:

ModRate(x) = ax² + bx + c ,   (13)
which is a quadratic function with growth factor 'a', where 'c' is the average initial modulation rate and 'b' is a constant describing when the function starts to increase. The modulation rates for Rx and Tx are respectively given as:

ModRate_Rx(x) = 0.16x² - 0.19x + 4.5 ,   (14)

ModRate_Tx(x) = 0.17x² - 0.28x + 1.5 ,   (15)
where 'x' is the diversity order. Eq. 14 and 15 are illustrated together with the actual measurement points in Fig. 7. A faster-than-linear growth is found for the modulation rate as a function of diversity order, where the quadratic factor 'a' is similar for the Rx and Tx modulation rates. The
Fig. 7. Modulation Rate as a function of Diversity order for Rx (black) and Tx (light) from Eq. 14 and Eq. 15
'b' value is smaller in Eq. 15 than in Eq. 14, which means that the Tx modulation rate increase has a slower start than the Rx modulation rate increase as a function of diversity order. This corresponds to the observation for ModRate in Table 3, where the shift from no to 2nd order diversity has less gain than that from 2nd to 4th order diversity. The average initial modulation rate for no diversity, 'c', is naturally much greater in Eq. 14 than in Eq. 15, due to the greater DL link quality.

6.3 Signal to Noise Ratio
Fig. 6(b) plots SNR versus diversity order, where DL SNR and UL SNR as functions of diversity order are given in Eq. 6 and Eq. 7 respectively. DL SNR increases by 1.25 dB for each order of diversity, and UL SNR by 0.34 dB. These results are descriptive of the diversity gains, where BS transmit diversity is more effective than BS receive diversity, as expected.

6.4 Throughput
Fig. 6(d) shows the UDP bitrate as a function of the diversity order for DL and UL, as given in Eq. 10 and Eq. 11 respectively. Deviations between DL and UL are minor, as can be seen in Table 2. UDP bitrate may be the best measure of the performance gain from the different levels of diversity, since it is based on the overall physical performance. The average increase in DL UDP bitrate is 0.35 Mbps for each added order of diversity, and 0.13 Mbps for the UL UDP bitrate. BS transmit diversity is therefore found to be more effective than receive diversity, as should be expected.
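Two of the figures from Sec. 6.2 can be re-derived numerically: the 37%/56% level-up probabilities (under our assumption that the margin to the next 3 dB modulation threshold is uniformly distributed), and the values of the quadratic fits Eq. 14 and 15 at each measured diversity order:

```python
# Re-derive the 37% / 56% probabilities of advancing one modulation level
# (assuming a uniformly distributed margin to the next 3 dB threshold --
# our assumption), and evaluate the quadratic fits Eq. 14 and 15.
STEP_DB = 3.0   # RSSI increase needed per modulation level

def p_level_up(gain_db, step_db=STEP_DB):
    """P(crossing the next threshold) for a uniform margin in [0, step)."""
    return min(gain_db / step_db, 1.0)

def mod_rate_rx(x):   # Eq. 14
    return 0.16 * x**2 - 0.19 * x + 4.5

def mod_rate_tx(x):   # Eq. 15
    return 0.17 * x**2 - 0.28 * x + 1.5

print(f"2nd order: {p_level_up(1.12):.0%}, 4th order: {p_level_up(1.69):.0%}")
for x in (0, 2, 4):
    print(x, round(mod_rate_rx(x), 2), round(mod_rate_tx(x), 2))
```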
7 Conclusion

Field trial performance measurements have been performed with a pre-mobile WiMAX system, whose most important features are sub-channelization and BS
transmit and receive diversity. UDP bitrate for throughput and the link quality attributes RSSI and SNR have been measured at a range of random locations with NLOS conditions. Based on these results we have found that the Cost 231 Hata model for urban environments fits the system propagation. The diversity impact on system performance has been analyzed, and formulas for the expected performance gain as a function of diversity order have been proposed for RSSI, SNR, modulation rate and UDP bitrate. Sub-channelization in the uplink was shown to improve coverage, in that subscribers with weak link quality were able to focus the transmit power on a narrower bandwidth to obtain connectivity.
References

1. WiMAX Forum, www.wimaxforum.org
2. IEEE Std 802.16e-2005 and IEEE Std 802.16-2004/Cor 1-2005, Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems, pp. 1–864 (2006)
3. Alvarion, www.alvarion.com
4. Iperf, http://dast.nlanr.net/Projects/Iperf/
5. Hata, M.: Empirical Formula for Propagation Loss in Land Mobile Radio Services. IEEE Transactions on Vehicular Technology, 317–325 (1981)
On Evaluating a WiMAX Access Network for Isolated Research and Data Networks Using NS-2

Thomas Michael Bohnert1, Jakub Jakubiak2, Marcos Katz3, Yevgeni Koucheryavy2, Edmundo Monteiro1, and Eugen Borcoci4

1 University of Coimbra, Coimbra, Portugal {tbohnert,edmundo}@dei.uc.pt
2 Tampere University of Technology, Tampere, Finland [email protected], [email protected]
3 VTT Technical Research Centre of Finland, Oulu, Finland [email protected]
4 University Politehnica of Bucharest, Bucharest, Romania [email protected]
Abstract. IEEE 802.16 is still a very recent technology, and released hardware frequently supports the standards only partially. The same applies to publicly available simulation tools, in particular for NS-2. As the latter is the de-facto standard in science, and as we use it for our research in the context of the WEIRD project, we evaluate the IEEE 802.16 support available for NS-2. We present several general but also specific issues which are important in order to carry out reliable research based on these tools. In particular, we show in much detail where the modules deviate significantly and even fail totally.
1 Introduction
There is broad consensus among manufacturers, service and network operators that the Internet will be the sole future communication infrastructure. Indeed, Internet Service Providers (ISPs) continuously upgrade their networks with ever more powerful appliances in order to serve a dynamic and competitive market exhibiting sustainable growth in demand. This in turn stimulates ever more bandwidth-greedy services, like the much cited future of the World Wide Web (WWW), the Web 2.0, which in turn raise requirements on support by the underlying network. But ubiquitous, reliable and fast Internet access does not only foster the introduction of new services. It also motivates the consolidation of services traditionally delivered over dedicated distribution networks. Surely the most popular and intuitive example is telephony, which was originally transmitted over the

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 133–147, 2007. © Springer-Verlag Berlin Heidelberg 2007
134
T.M. Bohnert et al.
Public Switched Telephone Network (PSTN) but nowadays is being provided more and more over the Internet infrastructure. Convergence and its inherent positive synergy effect on revenue is the obvious motivation and indeed, the distribution of shares in the latter is the subject of fierce dispute [1]. Naturally, user expectations of service quality and availability do not change if a service provider decides to change its network infrastructure. As a consequence thereof, coming back to the telephony example and in times of cellular networks, customers expect the same "Anytime, Anywhere" experience if they subscribe to a Voice over IP (VoIP) service as they are used to from the PSTN. The above is just one example of why Broadband Wireless Access (BWA) is becoming increasingly important as a last mile access technology. In fact there are ample more, and their early precursors were sensed by service providers, operators and manufacturers already a few years ago. Consequently, standardization was initiated, and the IEEE 802.16 MAN family [2,?] is certainly the most powerful candidate recently finalised and published. Once available, manufacturers took over and the first products were released recently. Furthermore, an industry-exclusive initiative named the WiMAX Forum has been founded, mainly to promote products and ensure interoperability. Part of this work is the definition of a Network Reference Model (NRM), which is a complete "All IP End-to-End" infrastructure based on IEEE 802.16 as access technology [4,?], designed to interface with existing cellular networks. Although labelled as the future of BWA, IEEE 802.16 deployment is yet in an early stage. The principal reason is its novelty, and hence the advancement of this matter has recently motivated many research initiatives. One of them is the European research project called "WiMAX Extensions for Remote and Isolated Research Data Networks (WEIRD)" [6].
As the name implies, the aim of the WEIRD project is to deploy, evaluate and enhance WiMAX technology as an access technology for a set of European research units with remote and impervious testbeds. These sites are interconnected by the European research backbone network GÉANT2 [7] and the relevant National Research and Education Networks (NRENs). Hence the project network is a Europe-wide, complete end-to-end network with WiMAX as access technology. In brief, the project defines an extended architecture based on the NRM and the latest hardware appliances for the radio access network, i.e. WiMAX Base, Subscriber and Mobile Stations (BS, SS, MS). In accordance with each testbed's purpose, it defines a set of evaluation scenarios in order to scrutinise WiMAX technology under the most challenging conditions. For instance, WiMAX is used to provide BWA to a forest fire monitoring site in the Serra da Lousã, an impervious set of mountains in the heart of Portugal, which is prone to devastating forest fires during the summer. Different field monitoring sites are connected via WiMAX to sense precursors of a fire outbreak and, in that case, to support the coordination of the fire brigades' emergency actions by delivering high definition images to field personnel and VoIP for communication. Later in this paper, we present some of the scenarios in greater detail, but the subject treated as a whole can be found in [8].
On Evaluating a WiMAX Access Network
135
As mentioned above, one focus of WEIRD is to look beyond the horizon of currently available standards, specifications and implementations. For WEIRD in particular, this means deploying WiMAX, evaluating its performance, identifying shortcomings and devising appropriate solutions and prototypes to improve this technology in different dimensions. Naturally, carrying out this work in real systems is a complex and lengthy task and prevents rapid progress. Hence, network simulation has in recent years become the standard means for proof of concept and early-stage prototyping. But rapid innovation was not the only motivation to integrate network simulation into WEIRD. Another reason is to overcome access restrictions to WiMAX internals. In fact, as WiMAX is an emerging, very powerful technology, manufacturers are not willing to provide full access to hardware and implementation internals, i.e. their hardware drivers. Moreover, many of the currently available hardware pieces are rather prototypes and do not fully support all features defined in the standards. Given the aforementioned, in this paper we present our first results on WiMAX network simulation based on the de facto open-source standard tool in networking research, the Network Simulator 2 (NS-2) [9]. To date, two IEEE 802.16 modules have been released publicly, one by the United States' National Institute of Standards and Technology (NIST) [10] and another by the Networks and Distributed Systems Laboratory (NDSL) of Chang Gung University in Taiwan [11]. For much the same reasons as apply to WiMAX hardware, these modules are still at an early stage and do not implement all standardized features of IEEE 802.16. Hence, the very first task of the WEIRD simulation track was to scrutinise the applicability and utility of these modules, and these results are presented in the remainder of this paper. The paper is structured as follows. In Sec. 2 we present the WEIRD project and its architecture in more detail.
We then evaluate two different WEIRD scenarios, one using the NIST module (Sec. 3) and another using the NDSL module (Sec. 4). Finally, Sec. 5 concludes the paper.
2
The WEIRD System, Objectives and Architecture
The WEIRD system aims to be part of a full multi-domain network architecture, allowing fixed and mobile access in new scenarios. Among others, WEIRD targets end-to-end QoS-enabled services. The WEIRD business models should support different entities, each of which may offer high-level services or connectivity services, in the access and/or core transport. The proposed architecture allows the organisational and technical independence of the entities managing the network domains: Network Access Provider (NAP), Core Network Service Provider (NSP), etc. The WEIRD system is built upon a networking infrastructure, presented in Fig. 1. As illustrated, the architecture consists of three components which could be managed by different business entities, namely Customer Premises Equipment (CPE), Access Service Network (ASN) and Connectivity Service Network (CSN).
136
T.M. Bohnert et al.
Fig. 1. Simplified WEIRD Overall Network Infrastructure
They may be located within the CPE or linked to the CSN, in the case of Application Service Providers (ASP). The infrastructure includes mobility support. Figure 1 has been simplified and abstracted in order to emphasise the generic interfaces between entities as defined in the WiMAX Forum terminology, denoted by R1, R2, ..., R8 and fully described in [4]. The CPE can be composed of a single-user SS or multi-user SSs (MS), in case an SS offers access to LANs/WLANs with several users/hosts. The fixed or mobile SSs are wirelessly linked with Base Stations (BS). An ASN, linked through an ASN Gateway (ASN-GW) to the CSN, may control and aggregate several BSs, based on a wireline or wireless IP infrastructure. The ASN-GW plays both the data gateway and the control role for the ASN. In a mobile environment the CSN may be the Home CSN or the Visited CSN, respectively. Connectivity with other networks may be realized via an IP backbone. Application entities, clients and/or servers, can exist on the CPE side or in CSN networks. In the WiMAX Forum model, direct interfaces between different ASNs may also exist (denoted by R4). The goal of the considered architecture is, among others, to control and ensure end-to-end QoS-enabled services. WEIRD should achieve and control QoS within its scope: the WiMAX segment and the ASN. To do this, WEIRD defines corresponding interfaces with the CPE and CSN and runs appropriate QoS-oriented signalling over these interfaces. The WEIRD system offers different levels of QoS to high-level services/applications using the IEEE 802.16 classes of service (Unsolicited Grant Service (UGS), real-time Polling Service (rtPS), extended real-time Polling Service (ertPS), non-real-time Polling Service (nrtPS) and Best Effort (BE)). This architecture supports different applications, whether or not they are capable of signalling their QoS requirements (SIP/non-SIP-based applications, legacy, etc.), by offering appropriate Application Programming Interfaces (APIs).
The overall WEIRD architecture is structured as a multi-plane model; it is fully described in [12]. Vertically there are two macro-layers, or strata, i.e., the Application and Service Macro-Layer/Stratum and the Transport Macro-Layer/Stratum. Horizontally, there are three planes: Management (MPl), Control (CPl) and Transport/Data Plane (DPl). This structuring aims to decouple the applications
and high-level services from transport technologies, in order to support heterogeneity of the core and access network technologies [13,?]. The Application and Service Stratum includes the layers and functions for management, control and also operations on data, independently of the network transport. The applications generally have a graphical user interface (GUI), a media module and signalling modules. Some applications are QoS-signalling-capable (based on SIP or other protocols). Legacy applications are supported by a specially defined WEIRD agent, capable of signalling their requirements. The WEIRD API adapts the applications' data and control flows to the Transport Stratum. The Transport Macro-Layer/Stratum performs management, control of resources/traffic, as well as data operations, in order to transport the information flow through various networking infrastructures. The MPl performs medium- and long-term management functions: high-level service management at the Application and Service macro-layer and resource and traffic management at the Transport macro-layer. It provides coordination between all the planes. The CPl layers perform short-term control actions. In the Application and Service Stratum the CPl sets up and releases connections and restores a connection in case of a failure; in the Transport Stratum, the CPl performs the short-term actions for resource control and possibly traffic engineering and control, including routing. The DPl transfers the user/application data, but also the control- and management-related data, between the respective entities. Figure 2 shows a high-level view of the basic WEIRD control plane architecture. The Control Plane architecture horizontally covers the following entities: SS/MS, ASN (BS, ASN-GW) and CSN. The Application and Service Stratum contains mainly the session signalling (e.g., SIP), including SIP agents and AAA functions.
The Transport Stratum contains the following layers: Connectivity Service Control, a layer of blocks with a specific internal structure for the SS, ASN-GW and CSN (the main focus of WEIRD is on WiMAX and ASN network control, therefore CSC-ASN is the most important control block); QoS signalling, based on NSIS signalling as the vehicle for QoS messages; Mobility Control, including micro- and macro-mobility based on Mobile IP; and Resource Control, the lower layer whose task is to install resources in the network segments. Figure 2 does not include the RC for the CPE network and the CSN because these are specific to the CPE and CSN technologies. In the case of WiMAX, the RC communicates with the WRC via SNMP in order to install Service Flows in the WiMAX segment. A detailed description of the control architecture is given in [13,14]. The WEIRD approach to QoS resource control is dynamic, based on the idea of reserving/admitting/allocating resources on request, via SIP or NSIS signalling, in the WiMAX and ASN segments. The requests are checked for Admission Control in the CSC. When the Resource Manager of the Management Plane has done some provisioning beforehand (on the requested path), the CSC-ASN will admit/allocate resources (based on Service Flows in the WiMAX segment and logical traffic trunks in the ASN part) by taking a part of the available pre-provisioned resources. It is also possible that the request is completely new in terms of its scope; in such a case the AC applied by the CSC-ASN, if successful,
Fig. 2. WEIRD Control Plane Architecture: UA: User Agent; SIP: Session Initiation Protocol; API: Application Programmer Interface; CSC: Connectivity Service Controller; AC: Admission Control; AC*: this will exist only in an SS which manages multiple users, in order to control the CPE segment resource allocation; NSIS: Next Step in Signalling Modules; MIP: Mobile IP; (W)RC: (WiMAX) Resource Controller
Fig. 3. WEIRD Management Plane
will determine the installation of new service flows and new pipes in the ASN. Figure 3 shows a simplified picture of the Management Plane (MPl). It performs the classical network management functions (NMS) and the medium- to long-term resource management (RM). NMS/RM is thus composed of two subsystems:
a Conventional Network Management System (CNMS) with classical functions such as static network provisioning, network monitoring, alarm collection and management; and a Resource Manager, which is responsible for managing the reservation and allocation of connectivity resources in the ASN and WiMAX segments. The resource pre-provisioning is done by management actions, thus preparing in advance the resources to be used in the future by the high-level services. From the granularity point of view, provisioning can be done either at the aggregate level (the preferable method) or per individual flow; e.g., individually in the SS-BS zone of the chain, and usually at the aggregate level in the zones BS-(ASN-GW), (ASN-GW)-CSN and inside or between CSNs. This is the main role of the RM part of the NMS and it falls completely within the WEIRD scope. As described above, individual (per-call) resource allocations for different flows can be dynamically established on request by the Control Plane, while taking into account the pre-provisioning done previously by the RM (within the limits fixed by the RM). Resource provisioning by management is performed based on forecasting information about the expected amount of future calls. The proposed architecture is flexible in the sense that it allows extensions (currently not in the scope of WEIRD): agreements between domain managers can be established (SLA/SLS) on the amount of resources to be provisioned within each domain or between domains. Also, a Network Dimensioning module can map the physical topology and link capacity information into a logical map of traffic trunks, described as a matrix of virtual pipes, independent of the network infrastructure. Details are given in [12].
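The interplay between RM pre-provisioning and Control Plane admission control described above can be sketched as follows. This is a toy model under our own assumptions; the class and parameter names are illustrative and do not correspond to WEIRD interfaces.

```python
class CscAsnAdmissionControl:
    """Toy admission control against a pre-provisioned resource pool.

    The Resource Manager provisions an aggregate capacity in advance;
    the Control Plane then admits per-flow requests against it.
    """

    def __init__(self, provisioned_kbps):
        self.provisioned_kbps = provisioned_kbps  # set beforehand by the RM
        self.allocated_kbps = 0                   # sum of admitted flows

    def request(self, flow_kbps):
        # Admit only if the pre-provisioned pool still has room.
        if self.allocated_kbps + flow_kbps <= self.provisioned_kbps:
            self.allocated_kbps += flow_kbps
            return True   # CSC-ASN installs a service flow / traffic trunk
        return False      # rejected (or would trigger new provisioning)

ac = CscAsnAdmissionControl(provisioned_kbps=2000)
print(ac.request(1024))  # video flow admitted -> True
print(ac.request(1024))  # exceeds the remaining pool -> False
```

A rejected request would, in the full architecture, either fail or prompt the RM to provision additional resources along the path.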
3
Simulating the WEIRD Scenario 1: Forest Fire Prevention
Forest fires are a serious problem in the Mediterranean Basin, and currently used methods for fire detection have certain limitations, especially in remote, isolated areas. Thus several pilot projects have been developed in Portugal by the University of Coimbra, trying to push the use of new technologies in that area. Namely, the traditional fire detection systems are aided by the use of sensors, video and infrared cameras, coordinated remotely. However, the main drawbacks of such systems are usually the costs and limited image quality of GSM/GPRS communications and the difficulty of implementing radio links to transmit video in mountainous regions (both LOS and NLOS links). Therefore the deployment of a WiMAX network, proposed within the WEIRD project, seems to be the most promising communication medium for such an environment. Real-time voice, video and textual data, relayed over WiMAX networks, provide extensive communication means between mobile personnel and the central command station, offering fire fighters an invaluable advantage in managing field operations. This particular simulation scenario evaluates the feasibility of a fire prevention system based on the following actors: a Coordination Center (CC), a Surveillance Car (SC), a Helicopter (HC) and a Base Station (BS). The CC is an entity in
Fig. 4. Scenario scheme based on WEIRD specifications
charge of fire detection and managing the field operations of the fire brigade. The SC is a vehicle acting as a mobile watch tower, equipped with a digital video camera, a GPS receiver and possibly some sensors (wind, humidity, temperature). It also maintains a VoIP link with the CC. The HC, with equipment similar to that carried by the SC, further improves the effectiveness of field operations. Being highly mobile, the HC offers fast information updates under changing conditions and can provide images from a top-down perspective, otherwise unavailable. Based on this general description, our simulation consists of several stages:
– Initially, the SC patrols the remote area, maintaining a VoIP link with the CC. At that point in time the HC is placed close to the CC, waiting for orders.
– The SC notices a fire! The CC is informed over VoIP, video transmission is started, and the vehicle retreats from the endangered area.
– The CC orders the HC to the fire location, and the video camera mounted on the HC starts transmitting.
– While the fire brigade is fighting the fire, the HC circles around it, monitoring and providing assistance to the fire fighters in the field.
– As soon as the fire is extinguished, all video transmission is suspended and the helicopter returns to its base.
As mentioned before, all traffic is relayed over the BS installed on top of a nearby hill. Figure 4 illustrates the basic connection scheme (Base Station not included). In order to simulate this scenario we chose the NIST module as it, in contrast to NDSL, comes with IEEE 802.16 mobility support. To be more specific, the prerelease-092206 is based on the IEEE 802.16 standard (802.16-2004) [2] and the 802.16e-2005 [3] mobility extensions, including neighbour advertisement, scanning and handovers. While providing excellent mobility support, QoS features are so far left out entirely.
The default scheduler does not support any service-class differentiation but uses a simple first-in-first-served scheme for DL traffic and Round Robin for UL traffic. In brief, according to [15], the module implements several features, of which the most relevant are
Fig. 5. Datarate as a function of modulation scheme and CP
– WirelessMAN-OFDM physical layer with configurable modulation
– Time Division Duplexing (TDD)
– Management messages to execute network entry (without authentication)
– Fragmentation and reassembly of frames
The module allows adjusting parameters such as the modulation scheme, cyclic prefix, contention period length and channel bandwidth, which determine the achievable throughput boundaries. To illustrate this, we ran several simple, preliminary "MS to BS" simulations for a 7 MHz channel; the results are presented in Fig. 5. As one can see, the achievable data rates are a function of the cyclic prefix length, modulation order and modulation rate. Nevertheless, as we found out, those parameters are not subject to dynamic adjustment depending on link quality and (indirectly) the distance between SSs/MSs and BSs. In fact, one can only change those values manually, otherwise defaults are used, meaning that IEEE 802.16 Adaptive Modulation and Coding (AMC) is yet to be implemented. Moreover, modulation schemes are set on a per-BS basis, so all MSs connected to the same BS are forced to use the same modulation scheme. The main goal of this work, however, was to study the feasibility of applying NS-2 with WiMAX modules to simulate WEIRD scenarios. Thus, in this scenario setup the achievable throughput, and therefore the availability of services depending on bandwidth usage, was investigated. The examined parameters also include the order and rate of modulation, the length of the contention period and the cyclic prefix length used. Further, one has to note that, since the NIST module offers no IEEE 802.16-2004 QoS model support, the achievable throughput is a question of contention: in order to transmit BE data, the competing nodes need to issue BW requests in the contention slots of each IEEE 802.16 frame, see [2, Chap. 6.3.6] for details.
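The dependence of the raw data rate on modulation order, coding rate and cyclic prefix shown in Fig. 5 follows directly from the WirelessMAN-OFDM symbol structure (256-point FFT, 192 data subcarriers). A back-of-the-envelope sketch, assuming the nominal 8/7 sampling factor for a 7 MHz channel and ignoring framing, preamble and contention overhead:

```python
def ofdm_phy_rate(bw_hz, bits_per_symbol, coding_rate, cp_ratio):
    """Raw WirelessMAN-OFDM PHY rate in bit/s (framing overhead ignored).

    bits_per_symbol: 1 (BPSK), 2 (QPSK), 4 (16-QAM), 6 (64-QAM)
    cp_ratio: cyclic prefix G, one of 1/4, 1/8, 1/16, 1/32
    """
    n_fft, n_data = 256, 192      # total / data subcarriers
    sampling_factor = 8 / 7       # for channel widths that are multiples of 1.75 MHz
    fs = sampling_factor * bw_hz  # sampling frequency
    t_b = n_fft / fs              # useful symbol time
    t_s = t_b * (1 + cp_ratio)    # symbol time including cyclic prefix
    return n_data * bits_per_symbol * coding_rate / t_s

# 7 MHz channel, 64-QAM 3/4, G = 1/4
print(ofdm_phy_rate(7e6, 6, 3/4, 1/4) / 1e6)  # -> 21.6 (Mbit/s)
```

Shortening the cyclic prefix from G = 1/4 to G = 1/32 raises the rate by about 21%, which matches the qualitative trend in Fig. 5.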
Fig. 6. Video throughput (in Mbit/s) as a function of modulation scheme over time
Fig. 7. Voice throughput (in Mbit/s) as a function of modulation scheme over time
In accordance with the WiMAX equipment available to WEIRD, specifically Redline hardware, we limited the simulation to BE traffic as no other scheduling service is currently supported by that hardware, in fact one more reason to use the NIST module for this scenario. The module uses weighted round robin for prioritising traffic classes, assigning different weights to different traffic classes. The default fraction of the contention period assigned to BE traffic was only 20%; for the simulation we increased it to its maximum value so that all the bandwidth could be used for contending, otherwise most of the bandwidth would be wasted due to reservation for other, non-existent classes of traffic. In more detail, the following traffic types have been simulated:
– Video transmission: 1 Mbit/s, CBR traffic generator, 2 sources: SC and HC
– VoIP transmission: 64 Kbit/s, exponential on-off traffic generator, 2 bidirectional connections: SC–CC and HC–CC
The total offered traffic equals 2304 Kbit/s, of which 2048 Kbit/s is video traffic and 256 Kbit/s voice. Voice traffic was approximated using an exponential packet distribution, whereas video transmission used simple CBR. Figures 6 and 7 illustrate the throughput of voice and video traffic as a function of the modulation technique. One can clearly see from the fluctuations the effect of all nodes competing for bandwidth (in this case no cyclic prefix was used). The results also reveal that the radio channel was not saturated; rather, packets were dropped due to insufficient granted bandwidth, which for BE is essentially a function of the contention period. The module caused some other problems as well, though not as severe as with NDSL: e.g., the MAC layer statistics were not gathered when the print_stats_ switch was used, and the routing protocol, DSDV, needed an extremely long time to converge (over 80 seconds). Moreover, by default 80% of the available bandwidth is assigned to not yet implemented traffic classes. The NIST module has several shortcomings and requires additional work to become stable and reliable enough to handle the WEIRD scenarios, especially as the module is incomplete: some functionality is still missing (adaptive channel adjustment, traffic prioritising, ARQ, etc.), while other functionality is implemented only optionally or partially (bandwidth scheduler and flow handler). Only recently, one day before this paper's submission deadline, a new version of the module, prerelease-041507, was published. The authors mention several fixes and some new functions; however, due to the tight schedule we were unable to investigate this revision any further.
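The offered-load figure above can be reproduced with a short calculation, counting 1 Mbit/s as 1024 Kbit/s and each bidirectional VoIP connection as two 64 Kbit/s flows:

```python
video_kbps = 2 * 1024      # SC and HC video sources, 1 Mbit/s each
voice_kbps = 2 * 2 * 64    # two bidirectional VoIP connections, 64 Kbit/s per direction
total_kbps = video_kbps + voice_kbps
print(video_kbps, voice_kbps, total_kbps)  # -> 2048 256 2304
```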
4
Simulating the WEIRD Scenario A3: Monitoring Volcanic Unrest
Similar to forest fires, volcanic unrest poses a severe threat to nature and humans. As a pure matter of scale, lava flows can cover areas of several square kilometers and volcanic ash plumes reach elevations of tens of kilometers. One such example is the Hekla volcano in south central Iceland. A very active volcano, it has erupted approximately every tenth year in the last few decades. In Hekla's particular case, winds carry the volcanic plume so that ash falls on farmland in southern Iceland but also on Iceland's northern coast. Furthermore, volcanic ash lifted to great elevations and dispersed over thousands of kilometers in the stratosphere can severely impact air traffic in the Northern Atlantic and Europe. Although Hekla is extremely quiet between eruptions, precursors, in the form of seismicity and sudden changes in strain, can be observed around one hour in advance. This short but extremely important timeframe allows appropriate preparations and safety measures to be launched. Hence, Hekla is continuously monitored by permanent and mobile stations whose locations are shown in Fig. 8. Besides a single BS (yellow point) there are five permanent, so-called GPS
Fig. 8. The Hekla volcano and its vicinity [16]
stations plus one seismic station in Hekla's immediate vicinity. In the event of volcanic unrest, portable seismic devices are additionally deployed at three locations around the volcano. Video cameras are mounted and used to stream real-time pictures to a CC. The cameras cover the mountain from every perspective and also follow the ascension of the volcanic plume. In addition to video streaming, field personnel operating during emergencies are equipped with VoIP devices in order to communicate with each other but also with the coordination centre [16]. After the portable devices are mounted in an emergency, no terminal movement is foreseen. Hence, the NDSL module, version 2.03, released on 14 March 2007, lends itself to the evaluation of this WEIRD scenario, and vice versa. In fact, by pure design choice, this module does not support mobility but is, according to the published module documentation [17], based on [2]. While this is a major difference from the NIST module, another is that it supports the IEEE 802.16-2004 QoS model. Further, fragmentation as well as packing has been implemented, and its convergence sublayer supports IP-address-based service flow mapping. The most important features of the physical layer are OFDMA and distance-based AMC, in accordance with the results published in [18]. The overall objective of this evaluation was twofold. Our first aim was to examine the QoS model for different channel bandwidths in order to see if we could come up with a set of configuration guidelines for the real WEIRD testbed. The second was to learn how far the module lends itself as a basis for our research. Hence, we set up a set of simulations according to the map presented in Fig. 8. The same traffic models as in the previous scenario were used, and we kept the NDSL standard configuration for the physical as well as the MAC layer in order to get results comparable with [17], at least to some extent.
After running several simulations for slightly different traffic intensities, we calculated the bandwidth and discovered several peculiarities. In the first place we found that the trace files differed slightly from the standard (format) for
wireless networks [19, Chap. 16.1.7]. Precisely, many packets had the same time stamp (sending time). As this feature has not been documented in [17], and in order to compute exact results, we analysed the implementation. An immediate observation was that the current design differs considerably from [17], surely due to the many major revisions in the meantime. Nevertheless, as the objective of this paper was to scrutinise the utility of NS-2 WiMAX modules, this finding is a relevant point to mention. In fact, there is a significant reason behind this feature. The module has been designed so that each packet corresponds to one IEEE 802.16 MAC Protocol Data Unit (PDU). With respect to the NS-2 radio propagation models [19, Chap. 18], this implies that the smallest unit that can be lost is one such PDU. As the PDU size is configurable (100 bytes by default), one PDU can encapsulate anything from a single fragment up to many Service Data Units (SDU), in this case basically IP packets. This is in any case a far larger unit than the standard metric for radio link simulations, the Bit Error Rate (BER). Consequently, accurate PHY layer performance evaluations are precluded right from the beginning, a conclusion in line with the findings in [20], which in fact proposes a solution to this issue. Further, this explains the equal time stamps of different packets: one IEEE 802.16 frame is made up of several MAC PDUs, and all packets with equal time stamps therefore belong to a single IEEE 802.16 frame. After this discovery we were able to calculate the maximal capacity achievable for various channel settings, and we got intriguing results. In short, we were able to achieve unlimited capacity. This indicated an implementation error, which was confirmed after some further code analysis.
Very briefly, an error in the bandwidth request and assignment management maps the whole buffer content of a single MAC connection to a single Bandwidth Information Element (BIE) for that connection, rather than just the byte value of this BIE; for details see [2, Chap. 6.3.6]. As IEEE 802.16 connection queues are implemented using NS-2 PacketQueues, see [19, Chap. 7.1.2], which are theoretically of unlimited length, we could fill the queue with an unlimited number of packets between two IEEE 802.16 frames, which, in turn, mapped to a single BIE, are sent in the next frame. Hence, any offered traffic could be sent in a single IEEE 802.16 frame, practically resulting in unlimited capacity measurements. One more significant detail was revealed during our analysis. If SDUs are fragmented or packed into PDUs, the standard defines specific headers to be added in order to restore the SDUs correctly at the receiver. The size of these headers has to be taken into account when calculating the net bandwidth granted for a connection's data transmissions. The absence of this feature in the current implementation introduces another slight imprecision into capacity evaluations. Finally, during our analysis, we found one more or less undocumented feature, with respect to [17]. In fact, although not explicitly mentioned in the text, Fig. 2 in [17] indicates that the IEEE 802.16 connection queues are located behind a node's interface queue, essentially forming a cascade of queues. What is important to understand is that the interface queue is shared by all connections and hence masks the "multiple connections, multiple queues" feature defined for IEEE
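The capacity effect of this bandwidth-request error can be illustrated with a toy model of per-frame service for one connection. All names and sizes here are our own illustrative assumptions, not the module's code: a correct implementation sends at most the byte value carried in the BIE per frame, while the flawed accounting drains the entire (theoretically unbounded) connection queue against a single BIE.

```python
from collections import deque

def serve_frame(queue, bie_bytes, buggy):
    """Return the number of bytes sent in one frame for a single connection."""
    sent = 0
    if buggy:
        # Flawed accounting: the whole buffer content is mapped to one BIE.
        while queue:
            sent += queue.popleft()
    else:
        # Correct accounting: send no more than the granted byte value.
        while queue and sent + queue[0] <= bie_bytes:
            sent += queue.popleft()
    return sent

# Fill the (theoretically unlimited) PacketQueue between two frames
# with 10,000 packets of 100 bytes each:
q1 = deque([100] * 10_000)
q2 = deque([100] * 10_000)
print(serve_frame(q1, bie_bytes=1500, buggy=True))   # -> 1000000 ("unlimited" capacity)
print(serve_frame(q2, bie_bytes=1500, buggy=False))  # -> 1500
```

In the buggy case the measured per-frame throughput grows with the offered load without bound, which is exactly the "unlimited capacity" observation reported above.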
802.16 nodes. Most probably, packet drops are to be expected at this queue and not at each individual connection's queue (which are currently of infinite length anyway). Naturally, this has several implications for QoS, as traffic classes are mapped to connections, which should therefore be totally isolated and not share a common pool of resources, in this case the space of the interface queue.
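The PDU-level loss granularity discussed earlier in this section can also be quantified. Under the common assumption of independent bit errors, a given BER implies a per-PDU error rate of 1 - (1 - BER)^(8L) for an L-byte PDU; a short sketch (the 100-byte default PDU size is taken from the module, the BER value is purely illustrative):

```python
def pdu_error_rate(ber, pdu_size_bytes=100):
    """PDU error rate for independent bit errors (default: the 100-byte PDU)."""
    bits = 8 * pdu_size_bytes
    return 1 - (1 - ber) ** bits

# Even a modest BER translates into a substantial per-PDU loss probability:
print(pdu_error_rate(1e-4))  # -> ~0.077 for an 800-bit PDU
```

This is why a simulator whose smallest loss unit is a whole PDU cannot reproduce bit-level channel behaviour, as argued above and in [20].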
5
Conclusions
The major conclusion of this paper is that the currently publicly available NS-2 WiMAX modules are to be used with much care. In order to produce accurate, reliable and reproducible results, a sound understanding of wireless network simulation and, moreover, of the details of the IEEE 802.16 standard appears absolutely essential. Only with these prerequisites do the particular features of both modules, but also of NS-2 itself, become fully apparent. As elaborated in detail in the previous section, this has been particularly important for the NDSL module, which in its latest release is not applicable without a major revision. Undocumented features, deviations, simplifications and abstractions, all of them standard methods in simulation, narrow the applicability of both modules to a few, very specific applications. Nevertheless, we would finally like to express our sincere gratitude to both development teams, at NIST and at NDSL, for their efforts. Doubtlessly, making their work publicly available is a major contribution to the research community, especially for a technology as recent as WiMAX. Moreover, we would like to stress that we intentionally refrain from any "good/bad" or "better/worse" conclusion and would like our work to be understood as a contribution to further improving the quality of both modules and as helpful support for those deploying them.
Acknowledgement This work has been jointly supported by the EU IST Integrated Project WEIRD (WiMAX Extensions for Remote and Isolated Data Networks) and ESF COST 290 Action.
References 1. Bohnert, T.M., Monteiro, E., Curado, M., Fonte, A., Moltchanov, D., Koucheryavy, Y., Ries, M.: Internet Quality of Service: A Bigger Picture. In: 1st OpenNet QoS Workshop Service Quality and IP Network Business: Filling the Gap, Diegem, Belgium (2007) 2. Part 16: Air Interface for Fixed Broadband Wireless Access Systems. IEEE Standard for Local and metropolitan area networks, IEEE, Los Alamitos (2004) 3. Part 16: Air Interface for Fixed Broadband Wireless Access Systems, Amendment 2. In: IEEE Standard for Local and metropolitan area networks, IEEE Computer Society Press, Los Alamitos (2005)
4. WiMAX End-to-End Network Systems Architecture, (Stage 2: Architecture Tenets, Reference Model and Reference Points). WiMAX Forum (March 2006) 5. Gray, D., (ed.) Mobile WiMAX - Part I: A Technical Overview and Performance Evaluation. WiMAX Forum (June 2006) 6. Angori, E., Cimmino, A., Dinis, M., Guainella, E., Huusko, J., Monteiro, E., Spada, M.R.: WiMAX Extensions for Isolated Research Data Networks. In: International Congress ANIPLA 2006: Methodologies for Emerging Technologies In Automation, Rome, Italy (2006) 7. GEANT, European Research and Education Backbone, http://www.geant.net 8. Guainella, E., Borcoci, E., Katz, M., Neves, P., Curado, M., Andreotti, F., Angori, E.: WiMAX technology support for applications in environmental monitoring, fire prevention and telemedicine. In: IEEE Mobile WiMAX Symposium, Orlando FL, USA (2007) 9. Network Simulator 2 (NS-2), http://www.isi.edu/nsnam/ns/ 10. NIST Seamless and Secure Mobility, http://www.antd.nist.gov/seamlessandsecure.shtml 11. NDSL WiMAX Module for ns2 Simulator, http://ndsl.csie.cgu.edu.tw/ wimax ns2.php 12. D2.3 System Specification. Project Deliverable. WEIRD Consortium (May 2007) 13. Knightson, K., Morita, N., Towle, T.: NGN Architecture: Generic Principles, Functional Architecture, and Implementation. IEEE Communications Magazine, 49–55 (2005) 14. General Principles and General Reference Model for Next Generation Network. ITU-T Rec. Y.2011. 15. Rouil, R.: The Network Simulator NS-2 NIST Add On. IEEE 802.16 model (MAC+PHY). NIST (September 2006) 16. D.2.2 System Scenarios, Business Models and System Requirements (version 2). Project Deliverable. WEIRD Consortium (February 2007) 17. Chen, J., Wang, C., Tsai, F., Chang, C., Liu, S., Guo, J., Lien, W., Sum, J., Hung, C.: The Design and Implementation of WiMAX Module for NS-2 Simulator. In: ACM Valuetools 2006, Pisa, Italy, ACM Press, New York (2006) 18. 
Chen, J., Tan, W.: Predictive dynamic channel allocation scheme for improving power saving and mobility in BWA networks. Mobile Networks and Applications 12(1), 15–30 (2007) 19. Fall, K., Varadhan, K. (ed.) The NS Manual (formerly ns Notes and Documentation). The VINT Project (May 2007) 20. Betancur, L., Hincape, R.C., Bustamante, R.: WiMAX channel: PHY model in network simulator 2. In: 2006 workshop on ns-2: the IP network simulator, Pisa, Italy. ACM International Conference Proceeding Series, ACM Press, New York (2006)
Performance Evaluation of the IEEE 802.16 ARQ Mechanism Vitaliy Tykhomyrov1, Alexander Sayenko2, Henrik Martikainen1, Olli Alanen1, and Timo Hämäläinen1 1
Telecommunication laboratory, MIT department, University of Jyv¨ askyl¨ a, Finland {vitykhom, henrik.martikainen, olli.alanen, timo.hamalainen}@jyu.fi 2 Nokia Research Center, Helsinki, Finland
[email protected]
Abstract. The IEEE 802.16 technology defines the ARQ mechanism that enables a connection to resend data at the MAC level if an error is detected. In this paper, we analyze the key features and parameters of the ARQ mechanism. In particular, we consider the choice of the ARQ feedback type, the scheduling of ARQ feedbacks and retransmissions, the ARQ block rearrangement, the ARQ transmission window, and the ARQ block size. We run a number of simulation scenarios to study these parameters and how they impact the performance of application protocols. The simulation results reveal that the ARQ mechanism plays an important role in transmitting data over wireless channels in IEEE 802.16 networks.

Keywords: IEEE 802.16, ARQ, NS-2 simulator.
1 Introduction
IEEE 802.16 is a standard for broadband wireless access networks [1] that can provide high-speed wireless Internet access to home and business subscribers. It supports applications and services with diverse Quality-of-Service (QoS) requirements, such as Voice-over-IP (VoIP). The core components of an 802.16 system are the subscriber station (SS) and the base station (BS). The BS and one or more SSs can form a cell with a point-to-multipoint (PMP) structure. Over the air, the BS controls the activity within a cell, resource allocations to achieve QoS, and network admission based on network security mechanisms. An overview of the key 802.16 features is given in [5]. The automatic repeat request (ARQ) is a mechanism by which the receiving end of a connection can request the retransmission of a MAC protocol data unit (PDU), generally as a result of having received it with errors. It is a part of the 802.16 MAC layer and can be enabled on a per-connection basis. The 802.16 specification does not mandate the usage of the ARQ mechanism, meaning that it is a provider- and customer-specific decision. The 802.16 ARQ mechanism is controlled by a number of parameters. The specification defines them but does not provide concrete values and solutions.

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 148–161, 2007.
© Springer-Verlag Berlin Heidelberg 2007
This paper analyzes these parameters and studies the performance of the ARQ mechanism. In particular, the following parameters are considered: the ARQ feedback type, the scheduling of ARQ feedbacks and retransmissions, the ARQ transmission window size, the ARQ block size, and the ARQ block rearrangement. We focus on ARQ because, unlike the H-ARQ mechanism, it is available in all the 802.16 PHYs. This paper extends our previous research and simulation work on 802.16 networks. In [9], we presented a scheduling solution for the 802.16 BS. In [8], we analyzed the 802.16 contention resolution mechanism and proposed an adaptive algorithm to adjust the backoff parameters and to allocate a sufficient number of request transmission opportunities. The performance of the 802.16 ARQ mechanism has not been studied sufficiently, especially by means of extensive simulations. In [6], an analysis of the ARQ feedback types is presented. However, only UDP traffic, which is not sensitive to packet drops, is considered, and no algorithm to select the feedback type is given. In [7], the ARQ mechanism is analyzed in the context of real-time flows of small packets. The rest of the article is organized as follows. Section 2 presents the key features and parameters of the 802.16 ARQ mechanism; we analyze their impact on performance and propose a set of solutions. Section 3 presents a number of simulation scenarios to study the ARQ performance and analyzes the simulation results. Finally, Section 4 concludes the article and outlines further research directions.
2 IEEE 802.16 ARQ Mechanism

2.1 Basics of the ARQ Mechanism
If ARQ is enabled for a connection, the extended fragmentation subheader (FSH) or the extended packing subheader (PSH) is used. The extended type is indicated by the extended bit in the general MAC header (GMH). Regardless of the subheader type, there is a block sequence number (BSN) in the subheader that indicates the first ARQ block number in the PDU. A PDU is considered to comprise a number of ARQ blocks, each of which is of the same constant size, except the final block, which may be smaller. The ARQ block size is an ARQ connection parameter negotiated between the sender and the receiver upon connection setup. It is worth mentioning that the ARQ block is a logical entity: the block boundaries are not marked explicitly. The remaining block numbers in a PDU can be derived easily on the basis of the ARQ block size, the overall PDU size, and the first block number. Precisely for this reason, the ARQ block size is a constant parameter. Fig. 1 presents ARQ blocks with the fragmentation and packing mechanisms. Block numbers are given with respect to the BSN stored either in the FSH (see Fig. 1(a)) or the PSH (see Fig. 1(b)). It is important to note that while the 802.16d specification [1] defines the ARQ block size as any value ranging from 1 to 2040 bytes, the 802.16e specification [2] has limited it to power-of-two values ranging from 16 to 1024 bytes, i.e., 16, 32, 64, and so on.
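The block-number derivation just described can be sketched in a few lines. This is a hypothetical helper, assuming the 11-bit BSN arithmetic (modulus 2048) of the ARQ subheaders; the names are illustrative:

```python
ARQ_BSN_MODULUS = 2048  # the BSN is an 11-bit sequence number

def arq_blocks_in_pdu(bsn: int, payload_len: int, block_size: int) -> list[int]:
    """Return the BSNs of all ARQ blocks carried by a PDU payload.

    Blocks have the constant negotiated size except possibly the last,
    so the count is a ceiling division; no block boundaries are carried
    explicitly in the PDU."""
    num_blocks = -(-payload_len // block_size)  # ceil division
    return [(bsn + i) % ARQ_BSN_MODULUS for i in range(num_blocks)]

# A 100-byte payload with 16-byte blocks spans 7 blocks (6 full + 1 short):
print(arq_blocks_in_pdu(bsn=126, payload_len=100, block_size=16))
# → [126, 127, 128, 129, 130, 131, 132]
```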
Fig. 1. ARQ blocks with packing and fragmentation mechanisms: (a) fragmentation, (b) packing
2.2 ARQ Feedback Types
To request a retransmission of blocks (NACK) or to indicate a successful reception of blocks (ACK), a connection uses ARQ block sequence numbers. In turn, the sequence numbers are exchanged by means of ARQ feedback messages. The 802.16 specification defines the following feedback types: a) selective, b) cumulative, c) cumulative+selective, and d) cumulative+sequence. The selective feedback type acknowledges ARQ blocks received from a transmitter with a BSN and four 16-bit selective ACK maps. The BSN value refers to the first block in the first map. The receiver sets the corresponding bit of the selective ACK map to zero or one according to the reception of blocks with or without errors, respectively. The cumulative type can acknowledge any number of ARQ blocks: the BSN number in the ARQ feedback means that all ARQ blocks whose sequence number is equal to or less than the BSN have been received successfully. The cumulative+selective type simply combines the functionality of the cumulative and selective types explained above. The last type, cumulative+sequence, combines the functionality of the cumulative type with the ability to acknowledge the reception of ARQ blocks in the form of block sequences. A block sequence, whose members are associated with the same reception status indication, is defined as a set of ARQ blocks with consecutive BSN values. A bit set to one in the sequence ACK map entity indicates that the corresponding block sequence has been received without errors, and the sequence length indicates the number of blocks that are members of the associated sequence. When the ARQ feature is declared to be supported, the transmitting side, i.e., the receiver of the ARQ feedbacks, must support all the feedback types described by the 802.16 specification. The sender of the ARQ feedbacks may choose whichever format it will use. The WiMAX Forum recommendations [3] mandate the support of all the types except the selective ACK.
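As a small illustration of the selective type, the sketch below packs per-block reception flags into 16-bit ACK maps. The MSB-first bit order (first block after the BSN in the most significant bit) and the helper name are assumptions for illustration; the specification defines the exact field layout:

```python
def selective_ack_maps(received: list[bool]) -> list[int]:
    """Pack per-block reception flags into 16-bit selective ACK maps.

    Bit = 1 means the corresponding block was received without errors;
    the final map is zero-padded in the unreceived positions."""
    maps = []
    for start in range(0, len(received), 16):
        word = 0
        for i, ok in enumerate(received[start:start + 16]):
            if ok:
                word |= 1 << (15 - i)  # MSB-first, following BSN order
        maps.append(word)
    return maps
```

For instance, 16 correctly received blocks yield a single map of 0xFFFF, while 17 flags spill into a second, mostly empty map.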
Fig. 2. Example of ARQ feedback types
Fig. 2 presents an example in which every feedback type is applied to the same set of 32 ARQ blocks. Selective ACK can acknowledge these 32 blocks in two maps. Cumulative ACK cannot acknowledge all the blocks because there are negative acknowledgements; thus, only six blocks are encoded. Cumulative+selective ACK can send both positive and negative acknowledgements. However, since each selective map must cover 16 blocks, some blocks remain unacknowledged. For this particular example, cumulative+sequence ACK can acknowledge only 28 blocks; one message can hold four sequence maps at most, whereas each map can have either two or three sequences. This type does not work effectively in this case because the block sequences are very short.

2.3 Choosing the Feedback Type
Each feedback type has its advantages depending on the ARQ feedback transmission frequency, the error disturbance patterns, and the computational complexity. From the implementation point of view, the selective feedback type does not require much processing, because a connection simply puts information on the received blocks into a bitmap. On the other hand, a connection should rely upon the cumulative+sequence feedback type if resource utilization is of greater importance. However, it is more complex to implement because block sequences must be detected, which could be an obstacle for a low-power, low-capacity mobile device. In this section, we do not analyze these feedback types from the implementation complexity point of view, but rather propose an algorithm to choose an ARQ feedback type that achieves good resource utilization. Our algorithm is based on the following assumptions: a) it is always more efficient to send positive acknowledgements by means of the cumulative type, and b) the sequence map can encode more blocks than the selective one. Indeed, the cumulative type can encode any number of ARQ blocks by using just one BSN number. Consequently, the four sequence maps, each of which can have two sequences of 63 blocks, encode 504 blocks. If a map contains three short sequences, each of which can keep up to 15 blocks, then 180 blocks can be encoded. The proposed algorithm, a simplified form of which is shown in Fig. 3, works as follows:

1. If there are positive acknowledgements in the beginning of the ARQ transmission window, construct the cumulative part. If there are no negative acknowledgements, then a single cumulative feedback message is sent.
2. If there are remaining negative acknowledgements (optionally followed by positive and other negative acknowledgements), which we cannot send by using the cumulative part, then we have to choose a map type.
To make a decision, we construct the sequence maps and calculate whether the selective maps can acknowledge more blocks. The maximum number of blocks to acknowledge selectively is 64, and it should be a multiple of 16. As for the sequence part, there is a limit on the sequence length and on the number of sequences we can send in one message. Once the choice is made, we "attach" the map(s) to the cumulative part constructed at the previous stage. Eventually,
Fig. 3. Algorithm for choosing ARQ feedback types
we will have either the cumulative+sequence or the cumulative+selective feedback type.
3. Note that we can reach this stage in two cases. In the first case, there are no positive acknowledgements in the beginning of the ARQ transmission window, so there is no way to create the cumulative, cumulative+selective, or cumulative+sequence types. The second case is when neither the cumulative+selective nor the cumulative+sequence feedback type encodes all the blocks. Though rare, this can happen because both types have technical limitations. Regardless of the situation, we just create as many selective feedback types as necessary to acknowledge the remaining blocks. As mentioned above, four selective maps can acknowledge up to 64 blocks.

It is worth noting that the presented algorithm scales well to the SS capabilities. If the selective type is not supported, then stage 3 is never executed. If there is no support for one of the cumulative types, then stages 1 and 2 are simplified.
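The three stages can be condensed into a toy selector, assuming a per-window list of per-block ACK flags. This is a sketch only: the real map-capacity limits (map counts, sequence lengths) are reduced to a crude run-length heuristic, and all names are illustrative:

```python
def choose_feedback(acks: list[bool]) -> str:
    """Pick an ARQ feedback type for a window of per-block ACK flags."""
    if all(acks):
        return "cumulative"            # stage 1: nothing but positive ACKs
    first_nack = acks.index(False)
    rest = acks[first_nack:]
    # stage 2: the blocks after the cumulative part need a map; prefer the
    # sequence encoding when runs are long, the selective bitmap otherwise
    runs = 1 + sum(1 for a, b in zip(rest, rest[1:]) if a != b)
    map_type = "sequence" if len(rest) / runs > 2 else "selective"
    if first_nack > 0:
        return f"cumulative+{map_type}"
    # stage 3: NACKs at the start of the window -- only selective maps help
    return "selective"
```

A long run of NACKs after an ACKed prefix thus yields cumulative+sequence, an alternating pattern yields cumulative+selective, and a NACK in the very first position forces the pure selective type.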
2.4 Scheduling of ARQ Feedbacks and Retransmissions
While sending normal PDUs, retransmissions, and ARQ feedback messages, a connection should determine their order. Indeed, as the scheduler at the BS allocates resources to a connection, either uplink or downlink, the connection's internal priority mechanism should decide which message is more important. We propose to send the ARQ feedbacks first, then retransmissions, and finally the normal user PDUs. The reason we assign the highest priority to the ARQ feedbacks is that they do not require much space and they have a huge impact on the ARQ performance. As a sender receives a feedback, it knows the blocks that
Fig. 4. Queue structure to prioritize feedbacks and retransmissions
were received successfully and the blocks that are to be retransmitted. The successfully transmitted blocks can be removed from the retransmission buffer, and the associated resources are cleared (see Section 2.6). Furthermore, the sender adjusts the ARQ transmission window, which, in turn, influences the performance, because a connection cannot send more blocks than the ARQ window allows. The reason we assign a higher priority to retransmissions is that a receiver can reconstruct a MAC service data unit (SDU) from fragments and forward it to the upper layer only once all the fragments have been received. Furthermore, if the ARQ deliver-in-order option is turned on,¹ then a receiver is obliged to forward SDUs in the same order in which the sender transmits them. This means that even if a receiver successfully reconstructs an SDU from all its fragments, it has to wait for all the previous SDUs. The simplest way to organize these priorities is to introduce several internal subqueues within a connection queue, as Fig. 4 illustrates. It is an extended version of the 802.16 QoS architecture considered in [4,9]. Every time a PDU arrives at the connection queue, it is checked and, depending on its type, placed into the appropriate subqueue. When PDUs are dequeued, the queue checks first the subqueue with the ARQ feedbacks, then the subqueue with retransmissions, and only then the subqueue with normal PDUs. In other words, a connection queue internally implements strict priority queuing. An appealing feature of this approach is that it is completely transparent to the BS scheduler. Everything the BS scheduler needs to know to allocate resources is the connection QoS requirements, if any, and the queue size [9]. If there are several internal subqueues, then the BS scheduler will be informed about the aggregated queue size. This is especially the case for the uplink virtual queues that are maintained through bandwidth requests sent by SSs. An SS cannot report the size of each subqueue, but only the aggregated size. When a connection is allotted slots, it first sends the ARQ feedbacks. If there are remaining bytes in a data burst, the connection sends retransmissions, and only then are normal PDUs sent.

¹ It is anticipated that this option will be turned on for most services. Indeed, there is no sense in turning it off for UDP-based applications, such as VoIP: the VoIP receiver will just discard packets that arrive in the wrong order unless a sufficiently large input buffer is utilized, which is not typical for most interactive applications. In the case of TCP-based services, the absence of a packet can be treated as a packet drop; it will trigger a retransmission of this packet even though it may arrive later.
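A minimal sketch of such a connection queue, with illustrative PDU "kind" tags standing in for real message-type inspection:

```python
from collections import deque

PRIORITY = ("feedback", "retransmission", "normal")

class ConnectionQueue:
    """Per-connection queue with internal strict-priority subqueues."""

    def __init__(self):
        self._sub = {kind: deque() for kind in PRIORITY}

    def enqueue(self, kind: str, pdu: bytes) -> None:
        self._sub[kind].append(pdu)

    def size(self) -> int:
        # Only the aggregated size is ever reported (e.g., in a bandwidth
        # request), keeping the prioritization transparent to the scheduler.
        return sum(len(p) for q in self._sub.values() for p in q)

    def dequeue(self, burst_bytes: int) -> list[bytes]:
        """Fill a data burst: feedbacks first, then retransmissions,
        then normal PDUs."""
        out = []
        for kind in PRIORITY:
            q = self._sub[kind]
            while q and len(q[0]) <= burst_bytes:
                pdu = q.popleft()
                burst_bytes -= len(pdu)
                out.append(pdu)
        return out
```

With a 10-byte feedback, a 20-byte retransmission, and a 50-byte normal PDU queued, a 40-byte burst carries the first two and leaves the normal PDU waiting.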
2.5 ARQ Block Rearrangement
While retransmitting a PDU, a connection may face the problem that the allocated data burst is smaller than the PDU to be retransmitted. This may happen if the BS scheduler allocates data bursts of different sizes, which is usually the case for rtPS, nrtPS, and BE connections. Suppose that the BS allocates a data burst of three slots to a BE connection and the latter sends a PDU that spans the whole data burst. If this PDU encounters an error, the connection will retransmit it. However, if the BS scheduler later allocates a data burst of two slots, there is no way to retransmit the original PDU. Fortunately, the connection may rely upon retransmission with rearrangement, which allows fragmenting the retransmitted PDU on ARQ block boundaries. If the ARQ block size is sufficiently small, the connection may construct a smaller PDU. In this subsection we do not focus on the optimal ARQ block size, but rather consider a solution for the case where retransmission with rearrangement is not supported. The reason this functionality can be absent is that rearrangement involves much more complicated operations on PDUs in the retransmission buffer than PDU construction does. A sender must keep a set of ARQ timers for each ARQ block. If retransmission with rearrangement is not supported, then the sender can eventually associate all those timers with a PDU, which requires far fewer resources. Furthermore, rearrangement requires the sender to analyze a PDU and to search for the block boundaries on which it can be fragmented. This problem would not be so critical if the BS knew that a connection does not support rearrangement. However, there is no QoS parameter that would indicate it. On the one hand, the BS can guess that a connection does not rearrange PDUs by monitoring bandwidth request sizes and the number of received bytes.
On the other hand, a connection should not rely much upon this functionality, because it is not mandated by the specification. Thus, the only safe way is to limit the maximum size of transmitted PDUs.
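The block-boundary constraint of rearrangement can be illustrated with a back-of-the-envelope helper. The fixed `overhead` value and the function name are illustrative assumptions, not taken from the standard:

```python
def blocks_fitting_burst(pdu_len: int, block_size: int,
                         burst_bytes: int, overhead: int = 10) -> int:
    """How many whole ARQ blocks of a PDU awaiting retransmission fit
    into a (possibly smaller) data burst, fragmenting only on block
    boundaries. `overhead` stands in for the GMH/FSH bytes."""
    payload_room = burst_bytes - overhead
    if payload_room < block_size:
        return 0  # the burst cannot carry even a single block
    blocks_in_pdu = -(-pdu_len // block_size)  # ceil division
    return min(blocks_in_pdu, payload_room // block_size)
```

A one-slot burst of 108 bytes, for example, still moves six 16-byte blocks of a 324-byte PDU forward, whereas without rearrangement the whole retransmission would stall until a large enough burst appears.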
2.6 ARQ Transmission Window and ARQ Block Size
At any time, a sender may have a number of outstanding ARQ blocks awaiting acknowledgement. This number is limited by the ARQ transmission window, which is negotiated between an SS and the BS during connection setup. A sufficiently large ARQ window allows for a continuous transmission of data. A
connection can continue to send ARQ blocks without waiting for each block to be acknowledged. Conversely, a smaller ARQ window causes the sender to pause the transmission of new ARQ blocks until a timeout occurs or an ARQ feedback is received. Though it may seem that a large transmission window is always the best choice, it is worth noting that a large transmission window leads to increased memory consumption and processing load: every ARQ block must be stored in the retransmission buffer until a positive feedback is received. The ARQ transmission window and the ARQ block size parameters depend on each other. On the one hand, a connection may prefer to work with a small ARQ transmission window, which necessitates choosing a larger ARQ block size because the throughput may be limited by the transmission window size. A large block size also requires fewer resources, because a set of ARQ timers must be associated with each ARQ block at the sender and receiver ends. At the same time, a connection supporting retransmission with rearrangement may wish to work with a smaller ARQ block size, because that provides greater flexibility in splitting large PDUs into several smaller ones. Furthermore, the choice of the ARQ block size can be dictated by device peculiarities, such as the memory page size. These various requirements introduce a cyclic dependency between the two parameters. We anticipate that the ARQ block size should be the governing parameter, while the ARQ transmission window size should be adapted. The reason is that the ARQ block size has a set of discrete values, while the ARQ transmission window can accept any value within the specified range.
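The coupling between the window and the throughput can be made concrete with a back-of-the-envelope estimate (an illustrative formula, not taken from the specification): the blocks outstanding over one feedback round trip must fit inside the window.

```python
def min_arq_window(throughput_bps: float, feedback_delay_s: float,
                   block_size_bytes: int) -> int:
    """Smallest ARQ transmission window (in blocks) that keeps the link
    busy: the bytes in flight during one feedback delay, divided by the
    block size (rounded up)."""
    bytes_in_flight = int(throughput_bps / 8 * feedback_delay_s)
    return -(-bytes_in_flight // block_size_bytes)  # ceil division

# e.g., sustaining 20 Mbps with a 10 ms feedback delay and 64-byte blocks:
print(min_arq_window(20e6, 0.010, 64))  # → 391
```

The same estimate shows why small blocks demand a much larger window: with 1024-byte blocks the same link needs a window of only 25 blocks.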
3 Simulation
This section presents a simulation analysis of the 802.16 ARQ mechanism. To run simulations, we have implemented the 802.16 MAC and PHY levels in the NS-2 simulator. The MAC implementation contains the main features of the 802.16 standard, such as downlink and uplink transmission, connections, MAC PDUs, packing and fragmentation, the contention and ranging periods, the MAC-level management messages, and the ARQ mechanism. The ARQ implementation supports the ARQ blocks, the ARQ transmission window, retransmission with rearrangement, and all the ARQ feedback types. It also includes the prioritization of the ARQ feedbacks and retransmissions, and the algorithm to select the feedback type. The implemented PHY is OFDM. Fig. 5 shows the network structure we use in the simulation scenarios: the BS controlling the WiMAX network, the parameters of which are presented in Table 1, five SSs, and one wired node. The details of the scheduling algorithm at the BS are presented in [9]; in brief, the BS allocates resources fairly between the SSs based on their bandwidth request sizes. Each SS establishes one uplink and one downlink BE connection to the BS (each SS also establishes the basic management connection to exchange management messages with the BS). An SS hosts exactly one FTP-like application that sends data over the TCP protocol to the wired node. The reason we choose such an application type is that
Fig. 5. Network structure (five SSs with BE connections, the BS, and a destination node behind a 1 Gbps, 2 ms wired link)

Table 1. WiMAX network parameters

Parameter                        Value
PHY / Bandwidth                  OFDM / 7 MHz
Cyclic prefix length             1/32
Duplexing mode                   TDD
Frames per second                400 (2.5 ms per frame)
Slots per frame                  75
MCS                              64-QAM 3/4 (108 bytes/slot)
PER                              10%
Ranging transm. opportunities    1
Ranging backoff start / end      0 / 15
Request transm. opportunities    4
Request backoff start / end      3 / 15
Fragmentation / packing          ON / ON
PDU size                         as large as possible
CRC / ARQ                        ON / ON
ARQ feedback / ARQ types         standalone / all
ARQ block size / ARQ window      16 bytes / 1024
ARQ block rearrangement          ON
ARQ deliver in order             ON
it tries to send as much data as possible, thus utilizing all the available wireless resources. At the same time, the TCP protocol is very sensitive to the packet drops that can occur in the wireless part.

3.1 General ARQ Results
In this simulation scenario, we present general results concerning the ARQ performance. We analyze the throughput calculated at the wired interface of the BS, i.e., after the BS reconstructs the original packets from the received PDUs. If we enable errors at the PHY level but keep the ARQ mechanism disabled for the transport connections, there is no smooth data transmission. As Fig. 6(a) illustrates, the uplink data transmission is quite bursty; some SSs do not even send data for some periods of time. Such behavior is explained by the fact that the BS does not test whether a PDU is erroneous or not: it passes all the reconstructed SDUs to the wired node. Thus, the error
Fig. 6. Uplink connections throughput (errors at PHY): (a) no ARQ, (b) ARQ

Table 2. Amount of transferred data

ARQ / errors       – / –    + / –    – / +    + / +
Uplink data (MB)   21.602   21.089   9.813    18.525
detection and retransmission occur at the transport layer, which greatly affects the throughput. It is worth mentioning that the results shown in Fig. 6(a) are even somewhat optimistic, because there is only a small round-trip delay between the source subscriber stations and the destination wired node. As this delay becomes larger, the throughput declines accordingly. Fig. 6(b) shows the connection throughput when errors at the PHY level and the MAC ARQ mechanism are enabled. As follows from the figure, each BE connection achieves a smooth data transmission. Since there are errors in the PHY channel, the mean connection throughput is less than in the case when errors and ARQ are disabled, as can be seen from Table 2. Nevertheless, the ARQ mechanism ensures extremely good resource utilization. The fluctuations are explained by the fact that some PDUs are dropped and retransmitted. Table 2 provides a comparison of these subcases by using another criterion, the total amount of uplink data. As follows from the results, the absence of the ARQ mechanism when there are errors in the transmission channel (which is usually the case for wireless networks) results in a very low resource utilization. If there are errors, then the ARQ mechanism significantly improves the performance. Table 2 also presents an interesting subcase where ARQ is turned on but errors are turned off; its purpose is to show that ARQ introduces some overhead at the MAC level.

3.2 ARQ Block Rearrangement
In this subsection, we study the retransmission with rearrangement. The network parameters are the same as presented in Table 1. There are five SSs that send the
uplink data to the wired node through the BS. To demonstrate the importance of the ARQ block rearrangement, we turn this feature on and off and adjust the PDU size. Fig. 7(a) shows the throughput of all the uplink connections when the ARQ block rearrangement is turned off. As can be seen from the figure, the uplink connection throughputs are not smooth but change drastically; some uplink connections do not even transmit at times. This is a result of the insufficient size of the uplink data burst when a connection retransmits a PDU. As explained earlier, while a connection may transmit a large PDU, an attempt to retransmit the same PDU may fail if the BS later allocates an uplink burst of a smaller size.
Fig. 7. Uplink connections throughput (no ARQ block rearrangement): (a) large PDU, (b) PDU size is 324 bytes
If a connection does not support the ARQ block rearrangement, then a possible solution is to use a smaller PDU size. Fig. 7(b) shows the uplink throughput for exactly the same case, but now all the connections have a maximum PDU size of 324 bytes. As follows from the figure, all the connections achieve a more or less smooth data transmission. Bandwidth fluctuations are explained by the fact that a dropped PDU is retransmitted; besides, since a retransmitted PDU cannot be fragmented, some burst resources remain unused. If we compare Fig. 7(b) (small PDU, no ARQ block rearrangement) and Fig. 6(b) (unlimited PDU size, ARQ block rearrangement), we may notice that the ARQ block rearrangement has a significant impact on the performance: connections can use large PDUs of any size, thus decreasing the MAC-level overhead, and at the same time all the connections achieve a smooth data transmission.

3.3 ARQ Feedback Types
In this simulation subcase, we study the ARQ feedback types. The network parameters are the same as in the previous simulation scenarios; the only difference is that we run simulation scenarios with different ARQ block size values.
Table 3. The ARQ feedback type statistics

ARQ block size (B)          16     32     64     128    256    512    1024
Selective (%)               18.8   13.1   11.6   11.0   11.3   11.2   11.2
Cumulative (%)              64.6   69.3   70.2   70.8   70.9   71.0   71.0
Cumulative+selective (%)    0      0      0      0      0      0      0
Cumulative+sequence (%)     16.6   17.6   18.2   18.1   17.8   17.8   17.8
Total messages              23163  21734  21836  21552  21433  21219  21219
Table 3 shows the results for these simulation runs: the total number of ARQ feedback messages sent during each simulation run and the percentage of each ARQ feedback type. As can be seen, no cumulative+selective feedback message was sent during the simulation runs. As explained earlier, it is almost always more efficient to send acknowledgements by means of the cumulative+sequence type, which can encode more blocks than the cumulative+selective one. If there are only positive acknowledgements, then the cumulative feedback type is used; as follows from the table, the majority of the ARQ feedback messages are of this type. It is worth noting that there are also pure selective feedback messages. Having analyzed the ARQ dumps, we noticed that if there are negative acknowledgements in the beginning of the ARQ map, then the only way to send them is to use the selective type. Thus, the selective feedback plays an important role in the ARQ mechanism. Table 3 also shows that larger ARQ block sizes need fewer ARQ feedback messages; indeed, the ARQ feedbacks acknowledge blocks, not bytes. In turn, the resource utilization is somewhat better for the large ARQ block sizes.

3.4 ARQ Transmission Window
In this subsection, we study the impact of the ARQ transmission window on the throughput. The network parameters are the same as in the previous simulation scenarios; the only difference is that we vary certain parameters, such as the ARQ transmission window and block sizes. It is also worth noting that there is only one SS; otherwise, it would be difficult to present an analysis of the throughput of all the SSs. Fig. 8 presents the simulation results for this case. We run a separate simulation for each ARQ transmission window value and ARQ block size. Since the SS throughput fluctuates during a simulation run, it is averaged by using the exponentially weighted moving average algorithm. The figure indicates that large ARQ block sizes allow a connection to achieve its maximum throughput even for small ARQ transmission window values. Conversely, a small ARQ block size needs a large ARQ transmission window to achieve a high throughput. As the ARQ transmission window grows, the throughput increases linearly regardless of the ARQ block size; of course, it grows faster for larger ARQ block sizes.
Fig. 8. Throughput and the ARQ transmission window (different ARQ block sizes)
When the ARQ transmission window reaches a certain value, its further growth does not have an impact on the throughput, because the latter is limited by the overall network capacity.² Fig. 8 clearly illustrates that small ARQ transmission window values may prevent a connection from sending data even if it has slots allocated by the BS scheduler. Though it is not a huge problem for the BE connections, one should account for it if there is a QoS connection with bandwidth requirements.
4 Conclusions
In this paper, we have analyzed the performance of the 802.16 ARQ mechanism. We have shown that the ARQ mechanism can significantly improve the performance of TCP-based applications. Since the probability of an erroneous transmission over a wireless channel is much higher than over a wired medium, the ARQ mechanism should be enabled for TCP connections if a provider wants to ensure better QoS and to maximize the network utilization. Though we did not present simulation results for the UDP protocol, it is clear that its performance would not be affected by the absence of the ARQ mechanism, because UDP transmission does not depend on packet drops. We have proposed a solution for prioritizing user data, ARQ feedbacks, and retransmissions. The simulation results have also revealed the importance of the ARQ block rearrangement functionality; if an SS does not support it, additional care must be taken: the SS should choose smaller PDU sizes to achieve a smooth data transmission. We have also demonstrated that a connection must choose a sufficiently large ARQ transmission window size to utilize the allocated resources. While large ARQ blocks can utilize resources even with a small ARQ window, small ARQ blocks, such as those of 16 and 32 bytes, require a much larger ARQ window. The analysis of the simulation results for the ARQ feedback types has revealed the importance of the selective type: it is the only type a connection can send if there are negative acknowledgments in the beginning of the ARQ transmission window. Our future research will aim to study the optimal parameters of the ARQ mechanism, particularly for ARQ-enabled QoS connections. Also, it will be important to compare the results provided by the ARQ mechanism with those of the H-ARQ mechanism available in the OFDMA PHY.

² The theoretical maximum throughput for the PHY parameters presented in Table 1 is 22 Mbps. We achieve a maximum throughput of approximately 20 Mbps because a certain number of slots is reserved for synchronization preambles, the UL-MAP and DL-MAP messages, and the ranging and request contention opportunities.
Evaluating Differentiated Quality of Service Parameters in Optical Packet Switching

Poul E. Heegaard¹ and Werner Sandmann²

¹ Norwegian University of Science and Technology, N-7491 Trondheim, Norway
[email protected]
² University of Bamberg, D-96045 Bamberg, Germany
[email protected]
Abstract. Optical packet switching (OPS) is a promising technology for future generation network solutions. One of the great challenges to be addressed is differentiated quality of service (QoS), and a major concern in providing QoS is packet loss. In the case of differentiated services, packet losses should be significantly less likely for high priority classes than for low priority classes. However, as packet losses are very unlikely, they become rare events, which pose serious challenges to conventional analysis methodologies. We present an importance sampling scheme for fast simulation of packet loss rates in this setting. The change of measure adapts according to ant colony optimization, a metaheuristic inspired by the foraging behavior of ants. Thereby, our simulation does not require intimate a priori knowledge of the system and overcomes a general drawback of importance sampling. Within very moderate simulation time, accurate results are provided for an OPS model with varying parameter settings.
1
Introduction
The Internet has already reached an enormous complexity, and it is still rapidly growing. A wide variety of new services have emerged. Realtime and multimedia applications such as Internet telephony, videoconferencing, video-on-demand and many others make high demands on future networking solutions, such as increased capacity and appropriate quality of service (QoS) provisioning. In particular, higher QoS than for conventional applications is required. Hence, providing faster transmission technologies and supporting differentiated QoS are major challenges for the future-generation Internet. Optical networks [16,17,23] are promising candidates to meet these demands. Multiple wavelengths within a single fibre can be handled by wavelength division multiplexing (WDM), and architectures like optical packet switching (OPS) are capable of supporting high-speed operation by occupying wavelengths only for the time needed to transmit packets [14]. An important QoS attribute is the packet loss ratio due to contention. In OPS, contention occurs when two or more packets are routed towards the same wavelength at (partly) overlapping times. In the case of wavelength conversion the conflict is avoided unless all wavelengths on the fibre are occupied. In order to provide differentiated QoS such that packet losses are less likely for

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 162–174, 2007.
© Springer-Verlag Berlin Heidelberg 2007
services with higher QoS requirements, one can introduce preemptive priority levels. In the case of an inevitable packet loss, low priority class packets are dropped before high priority class packets. Hence, the packet loss ratio for high priority service classes is significantly smaller than for low priority classes. However, in any case, packet losses should occur with a small probability, which means that packet losses are rare events, in particular for high priority services. This must be carefully taken into account as early as at the stage of planning and designing well-engineered networks. In particular, performance evaluation of architectures is necessary to guarantee that all demands are properly met. The requirement of packet losses being rare poses serious challenges to conventional performance analysis methodologies. Explicit analytical solutions are usually not available at all for complex networks, even in the absence of rare events. Large deviations theory (LDT) [5,22] provides a framework for asymptotic investigations when the probability of the rare event converges to zero. However, in practice we are not interested in asymptotics but in specific probabilities in the range of, say, 10^-9 to 10^-12. Moreover, LDT results can be obtained only for rather small systems. Likewise, numerical methods are typically not suitable for rare event analysis in complex networks. They are, if at all, only applicable when various restricting assumptions are made, for instance assuming product form solutions. Hence, efficient rare event simulation techniques are needed. Direct simulation of rare events is not effective, since rare events occur too infrequently in simulations to compute reliable statistical estimates in reasonable time. Accelerated simulation is necessary in the sense that the simulation time needed to get estimates with the desired statistical accuracy must be significantly reduced.
Statistical accuracy is measured by properties like the relative error or the confidence interval half-width of the simulation estimator, which depend on the variance of the simulation estimator. Therefore, accelerating simulations means reducing the variance of the simulation estimator, and it turns out that Importance Sampling (IS) [3,8,9,12,13,20] is potentially well suited for this purpose. IS applies a change of measure, that is, the original system is simulated under an alternative probability distribution (measure), and the systematically biased results are appropriately weighted to yield unbiased estimates. Unfortunately, it is by no means guaranteed that a specific change of measure results in reduced variance; it may even increase the variance, in the worst case without bound. The crucial issue when applying IS is a proper choice of the alternative probability distribution, which has repeatedly proven to be an extremely difficult task. It often requires intimate knowledge and difficult pre-analyses of the system at hand that can then be exploited to determine a change of measure that usually only works well for this specific system. To be useful for a broad range of applications, to be used by non-experts in IS theory, and to offer the potential of integration into performance analysis tools, it is necessary to come up with "automated" methods not specifically designed for only one model that yield IS simulations with significantly reduced variance compared to direct simulation for a broad class of models.
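The weighting idea described above can be illustrated with a minimal, self-contained sketch: estimating P(X > 20) for X ~ Exp(1), a probability of about 2·10⁻⁹ that is far beyond direct simulation, by drawing samples from a tilted distribution and reweighting. The tail example and the choice Exp(1/t) as the IS measure are our own illustration, not taken from the paper.

```python
import math
import random

def is_estimate_tail(t, n=100_000, seed=1):
    """Importance sampling estimate of p = P(X > t) for X ~ Exp(1).

    Direct simulation would need roughly 1/p runs to see a single event.
    Instead we sample from Exp(rate*) with rate* = 1/t (mean t, so X > t
    becomes common) and reweight each hit by the likelihood ratio
    L(x) = f(x)/f*(x) = exp(-x * (1 - rate*)) / rate*.
    """
    rng = random.Random(seed)
    rate_star = 1.0 / t                    # IS measure: Exp(1/t)
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(rate_star)     # draw under the IS measure
        if x > t:                          # indicator of the rare event
            total += math.exp(-x * (1.0 - rate_star)) / rate_star
    return total / n

p_hat = is_estimate_tail(20.0)
# exact value for comparison: P(X > 20) = e^-20
```

The sample mean of indicator-times-likelihood-ratio is exactly the unbiased estimator described in Section 3.1, here for the simplest possible model.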
In this paper, we present a novel method for adaptively learning the change of measure, and we apply it to an optical packet switch supporting multiple service classes with full wavelength conversion and a preemptive drop policy to provide differentiated QoS. The adaptive method is based on ant colony optimization (ACO) [2,6,7], a metaheuristic that has been designed for solving combinatorial optimization problems. Essentially, it is a multi-agent system inspired by the foraging behavior of real ants, where artificial ants, additionally equipped with a memory and a solution evaluation facility, act as agents. In our method it is applied to efficiently find the most likely paths to the rare event, which are then particularly emphasized in the IS change of measure. The rationale behind this approach with respect to IS stems from LDT, where it can be shown that rare events, if they occur at all, occur along certain most likely (among a set of unlikely) paths that are, however, hard to determine analytically. The remainder of the paper is organized as follows. In Section 2 the considered OPS architecture and the preemptive drop policy are described and a corresponding model is presented. The novel adaptive simulation technique is introduced in Section 3. Section 4 contains numerical results, and finally, Section 5 concludes the paper and outlines further research directions.
2
Switch Architecture and Model
We consider a non-blocking asynchronous bufferless OPS with F input and output fibres, where each fibre provides W wavelengths of capacity C [bps] each. All available wavelengths at any output fibre are shared amongst S different service classes. If possible, contentions are handled by wavelength conversion such that packets are converted to idle wavelengths in case of multiple packets routed toward the same wavelength at (partly) overlapping times. The service differentiation is provided by mapping service classes to different preemptive priority levels. In the OPS this can be implemented by an electrical control unit that reads the packet header and determines the priority level and output port. The packet is then delayed by fiber delay lines (FDL) while the control unit checks the state of the output ports with respect to wavelength occupancy, and the laser is set accordingly. In Figure 1 a principal illustration of the OPS is given.

[Fig. 1. Optical Packet Switch: F input fibres I1, ..., IF and F output fibres O1, ..., OF, each carrying W wavelengths, interconnected via wavelength converters under an electrical control unit.]

If all wavelengths at output port Oi of fibre i are occupied, then a new packet arriving to Oi will be lost if the occupying packets have the same or higher
priority. If at least one of the packets at Oi has lower priority than the arriving packet, then that packet will be discarded (preempted) to make room for the new packet. Of course, this means that any priority class can preempt lower ones, starting with the lowest priority. In this paper we consider the packet loss ratio of each of the service classes at any single output fibre. Note that the preempted packet is deleted, which corresponds to "Lost Call Cleared" (LCC) as opposed to "Lost Call Hold" (LCH). The former has a higher loss ratio but bounded and smaller delay than the latter. This means that the system does not store the packet and that the packet must be resent by the end-system application or a higher level protocol if applicable. For model-based analysis we adopt the assumptions made and justified in [15], where a similar architecture was studied but only for two customer classes and state independent arrival rates. We introduce appropriate extensions to handle multiple service classes and state dependent arrival rates. Priorities for the service classes 1, . . . , S are such that a higher class index corresponds to a higher priority. Hence, service class S is the highest prioritized class. We assume that packets of service class i arrive according to a Poisson process with possibly state dependent arrival rate λi(·) and that the packet length of service class i is exponentially distributed with mean packet length Li. Thus, the service rate is μi = C/Li. With that we can formally specify a corresponding continuous-time Markov chain (CTMC), suitable for simulation, as follows.
Fig. 2. State transition diagram for S = 2 service classes with state dependent rates where all critical target states are marked. Diagonal transitions correspond to preemptions of class 1 packets by class 2 packets. State (0,W) is the only state where packets of the highest priority class can be lost.
Since we consider S service classes, the state space is S-dimensional, where x1 + · · · + xS ≤ W for each state x = (x1, . . . , xS). Completions of class i correspond to transitions from x = (x1, . . . , xS) to (x1, . . . , xi−1, xi − 1, xi+1, . . . , xS) with
transition rate μi(x). The state transitions for arrivals of class i in the case of at least one idle wavelength, that is, for x1 + · · · + xS < W, are from state x = (x1, . . . , xS) to state (x1, . . . , xi−1, xi + 1, xi+1, . . . , xS) with transition rate λi(x). If all wavelengths are occupied, that is, x1 + · · · + xS = W, and a class i arrival occurs, then the state changes only when xj > 0 for at least one j < i. Denote by j = min{k < i : xk > 0} the index of the lowest priority class that currently occupies a wavelength. Then state x = (x1, . . . , xS) changes to (x1, x2, . . . , xj − 1, . . . , xi + 1, . . . , xS) with transition rate λi(x). In all other cases where all wavelengths are occupied, arriving packets are lost. As packet losses occur when all wavelengths are occupied, we define the set R = {x ∈ N^S : x1 + · · · + xS = W} as the set of critical target (rare) states, and we shall aim at estimating probabilities of either the entire set or specific subsets corresponding to packet losses of specific service classes. For a specific service class the critical set is the set of all states where all wavelengths are occupied and none of them by lower priority classes. That is, for service class i, Ri = {x ∈ N^S : xi + · · · + xS = W}. In particular, for the highest priority class, packet losses can only occur in state (0, . . . , 0, W), which means that RS = {(0, . . . , 0, W)} consists of a single state. As an example, for the special case of two service classes, the state transition diagram with marked critical target states is shown in Figure 2, where class 2 is the highest priority class and thus can preempt class 1.
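The arrival dynamics described above can be summarized as a small state transition function. This is an illustrative sketch of the rules in this section, not code from the paper; the function name and state representation are ours.

```python
def arrival_transition(x, i, W):
    """Successor state when a class-i packet arrives in state x (sketch).

    x is a tuple (x1, ..., xS) of per-class wavelength counts, classes are
    numbered 1..S with a higher index meaning higher priority, and W is the
    number of wavelengths. Returns the next state, or None if the packet
    is lost.
    """
    x = list(x)
    if sum(x) < W:                 # an idle wavelength exists: just accept
        x[i - 1] += 1
        return tuple(x)
    # All wavelengths busy: preempt the lowest priority class below i, if any.
    lower = [k for k in range(1, i) if x[k - 1] > 0]
    if not lower:
        return None                # arriving packet is lost
    j = min(lower)                 # lowest priority occupying class
    x[j - 1] -= 1                  # its packet is dropped (LCC policy)
    x[i - 1] += 1
    return tuple(x)

# Example with W = 4 and S = 2: in the fully occupied state (1, 3),
# a class-2 arrival preempts a class-1 packet, a class-1 arrival is lost.
assert arrival_transition((1, 3), 2, 4) == (0, 4)
assert arrival_transition((1, 3), 1, 4) is None
```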
3
Efficient Simulation Technique
In this section, we start with the foundations of Importance Sampling (IS) and the requirements for its application to the OPS model, after which we proceed to the description of our novel fast simulation technique.

3.1
Importance Sampling
The most general description of Importance Sampling (IS) is in measure theoretic terms, from which all applications to specific model types and domains can be obtained as special cases. Consider two probability measures P and P* on a measurable space (Ω, A), where P is absolutely continuous with respect to P*. That is, for all A ∈ A, P*(A) = 0 ⇒ P(A) = 0. Then, the Radon-Nikodym theorem guarantees that the Radon-Nikodym derivative L = dP/dP* exists and

∀A ∈ A: P(A) = ∫_A L(ω) dP*.
The basic property exploited by IS is that for any random variable Y on (Ω, A),

E_P[Y] = ∫ Y(ω) dP = ∫ Y(ω) L(ω) dP* = E_{P*}[Y L],

where E_P and E_{P*} denote expectations with respect to the probability measures P and P*, respectively. Hence, when a simulation is performed under P*, the sample mean of Y·L is an unbiased estimator for E_P[Y]. In the context of
IS, the probability measure P* is called the IS measure, L is called the likelihood ratio, and choosing P* in place of P is referred to as the change of measure. It is known that theoretically there always exists an optimal change of measure yielding zero variance, but this cannot be applied in practice since the corresponding optimal IS measure explicitly depends on the unknown quantity to be estimated and is therefore not available for implementation. Even if it were, it might take a form that makes sampling from it difficult or impossible. As usual, estimating the probability of an event is done by estimating the expectation of its indicator function. Using the indicator function I(R) of the rare event R, its probability is expressed by E_P[I(R)], and IS is applied via P(R) = E_P[I(R)] = E_{P*}[I(R)L] with a corresponding likelihood ratio L. Then P(R) can be estimated by the unbiased sample mean of I(R)L under probability measure P*. In [8] the framework for stochastic processes including Markov processes has been given, cf. [20]. For Markov chains, either in discrete or continuous time, the probability measures P and P* are path distributions, and absolute continuity corresponds to the condition that all paths containing the event of interest that are possible under P, i.e. in the original model, must remain possible under P*. In continuous time this can obviously be achieved by the condition that for all positive transition rates in the original model the corresponding transition rates under IS are positive. Similarly, this condition applies to transition probabilities in the case of discrete time. In this paper, we deal with CTMCs as described in Section 2 and can thus restrict our presentation to continuous time. Consider a CTMC with initial distribution ν, transition rates q_ij and corresponding jump probabilities p_ij = q_ij/q_i, where q_i = Σ_{j ≠ i} q_ij is the total outrate of state i.
That is, the holding or sojourn time in state i is exponentially distributed with mean 1/q_i. One way to perform IS in this setting is to simulate an alternative CTMC with initial probability distribution ν*, transition rates q*_ij and jump probabilities p*_ij = q*_ij/q*_i such that the condition q*_ij = 0 ⇒ q_ij = 0 holds for all i, j. Then the likelihood ratio for a path consisting of X_0, X_1, . . . , X_k becomes

L = (ν(X_0) / ν*(X_0)) · Π_{i=1..k} [ q_{X_{i-1}} exp(−q_{X_{i-1}} τ_{X_{i-1}}) p_{X_{i-1},X_i} ] / [ q*_{X_{i-1}} exp(−q*_{X_{i-1}} τ_{X_{i-1}}) p*_{X_{i-1},X_i} ]
where τ_{X_{i-1}} is the generated exponential holding time in state X_{i-1}. Hence, the change of measure in IS for CTMCs consists of the choice of transition rates q*_ij such that estimators can be achieved that have significantly reduced variances compared to direct simulation estimators. In our case this means that alternative arrival and service rates λ*_i(x) and μ*_i(x) must be properly chosen. Note that with IS all rates may be state dependent even if they are state independent in the original system.

3.2
Adaptive Change of Measure
Adaptive approaches aim at learning a good change of measure without intimate knowledge of the model at hand. Starting with some initial change of measure, a couple of independent simulation runs are performed, and the change of measure is updated according to rules that depend on and thereby characterize the
specific adaptive method. Then the simulation is continued by making multiple independent simulation runs with the updated measure, and so on, until finally the method converges to a measure with which the actual simulation is performed. Most of the adaptive approaches reported in the literature aim at either directly minimizing the (estimated) variance of the IS estimator or a related property such as, for example, the cross-entropy between the actually used measure and the optimal one [18,19]. These approaches typically become computationally expensive and/or exhaust storage when applied to complex models without making too restrictive assumptions. Our technique makes use of LDT-based reasoning on how rare events occur. Although a purely analytical approach to rare event analysis is usually impossible for complex networks, LDT gives some valuable insights and guidelines. In particular, rare events typically occur via certain most likely paths, and it is known that the change of measure should mainly emphasize these paths. Indeed, several successful applications of IS have been reported where LDT was utilized to obtain a suitable change of measure. However, the drawback of this approach is that the change of measure is specifically designed for one particular model, and seemingly obvious modifications or extensions usually fail even for only slightly different or larger models. While it is difficult to determine the most likely paths analytically, they can be estimated. For example, in [11,10] the transition rates are changed in accordance with the importance of a path with this transition as the first step, where the target importance is determined by a "lookahead" approach inspired by the failure distance, a metric for the distance from any state to a target set of rare states that was introduced in [4]. In [4] all transition rates are changed after each simulated transition.
As the major drawback, the simulation speed-up strongly depends on an accurate computation of the failure distances, which involves the computation of minimum cut sets, in general an NP-hard problem. Hence, the applicability of this approach is quite limited. In [11,10] the path likelihood is estimated by determining the most likely path from the current state to any state in the set of rare states of interest. However, the computational demand of the lookahead approach increases as the dimensionality of the state space increases.

3.3
Search and Update Procedure
In our new technique the target importance is instead determined by ant colony optimization. More specifically, a search and update procedure inspired by the foraging behavior of (real) ants is applied and further equipped with an explicit memory and an evaluation facility for (artificial) ants. The ants search iteratively for paths in a connected graph (a network) between source nodes and destination nodes. The path quality is evaluated upon arrival at a destination node, and then each ant backtracks over the links along the reverse path back to the source node, leaving pheromones to guide future ants in their search for the same destination. The better the path, the stronger the pheromone updates. We apply a similar approach here to gradually let the change of measure adapt to the current model. The nodes are now states, and the links are state transitions; the source nodes
are the origin or regenerative states, and the destination nodes are given as the set R of rare events of interest. Let x = (x1, · · · , xS) be the state vector and π(x, y) a path between two states x and y. A path between state x and a state in the rare event set is denoted π(x, R), and the r-th sampled path is π^(r). The probability of a path π is p(π). The maximum normalized probability of a path from state x to a state in R, given that an arrival of service class i occurred in state x, is

p_max(π(x, R) | i-arrival) = ( Σ_{j=1..r} p(π^(j)(x, R) | i-arrival) ) / max_k { Σ_{j=1..r} p(π^(j)(x, R) | k-arrival) }

where for any path π(x, R) from x to the set of rare states, p(π(x, R) | i-arrival) denotes the probability of that path given that an i-arrival transition occurred in state x. The normalization factor is the largest accumulated target probability over all k-arrival transitions in state x. The ant colony optimization then applies the following procedure that is repeated for every simulation iteration r = 1, · · · , R:

repeat
– Sample a path π^(r) from x towards a target state in R; stop when hitting the origin states;
– if π^(r) contains states of the rare event set R, then for each state y ∈ π^(r) update p_max(π(y, R) | i-arrival);
until <end of simulation condition>.

The random search of the ants in every state is governed by a random proportional rule that is incrementally updated for every new path found by the ants. This proportional rule is determined by the normalized, possibly state dependent IS transition rates λ*_i(x) and μ*_i(x) in each state. Even in the case of state independent rates in the original model, that is, when λi(x) and μi(x) as denoted above are constant for all states x, the IS transition rates may be state dependent. In fact, only this flexibility of state dependent rates renders successful applications of IS possible in complex settings.
Each sampled path that includes visits to R invokes a recalculation of the maximum path likelihood for all states in the path. Then the transition rates are changed for all transitions in the path according to the following updating rule for the change of measure. Denote by 1_i the S-dimensional vector with entry 1 at component i and zero entries elsewhere, and define Δ_i(x) = max(μ_i(x + 1_i) − λ_i(x), 0). Then the arrival and service rates for IS are updated according to

λ*_i(x) = λ_i(x) + p_max(π(x, R) | i-arrival) · Δ_i(x),
μ*_i(x) = μ_i(x) − p_max(π(x, R) | i-arrival) · Δ_i(x − 1_i).

Note that only transition rates along the sampled path are updated, which strongly reduces the computational demands compared to other approaches. Initially, when no information about the target likelihood exists, the ants search the state space by a guided random walk. After this initial phase the ants use the updating rules.
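As a sketch, the updating rule above can be written out as follows, assuming the learned p_max value for the current state and class is already given. The function and parameter names are illustrative, not from a reference implementation.

```python
def update_is_rates(lam, mu, pmax, x, i):
    """One application of the Section 3.3 updating rule (sketch).

    lam(i, x) and mu(i, x) return the original arrival/service rate of
    class i in state x; pmax is the learned normalized path probability
    p_max(pi(x, R) | i-arrival). Returns the IS rates (lam*_i(x), mu*_i(x)).
    """
    def plus(state, j, d):
        s = list(state)
        s[j - 1] += d
        return tuple(s)

    def delta(state):
        # Delta_i(x) = max(mu_i(x + 1_i) - lambda_i(x), 0)
        return max(mu(i, plus(state, i, +1)) - lam(i, state), 0.0)

    lam_star = lam(i, x) + pmax * delta(x)
    mu_star = mu(i, x) - pmax * delta(plus(x, i, -1))
    return lam_star, mu_star

# Example with the Case I parameters of Table 1 (state independent
# lambda = 0.01, mu_i(x) = 0.99 * x_i), in state x = (2,) with pmax = 0.5:
lam = lambda i, x: 0.01
mu = lambda i, x: 0.99 * x[i - 1]
lam_star, mu_star = update_is_rates(lam, mu, 0.5, (2,), 1)
```

The larger pmax, the more probability mass the IS rates shift from service completions toward arrivals, emphasizing the learned paths toward the rare set R.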
4
Numerical Results
We have performed a series of simulation experiments for different numbers of service classes as well as different numbers of wavelengths. Here, we present numerical results for S = 1, 2, 3 service classes with W = 4, 8, 16 wavelengths. All service classes have different priority levels; service class i has priority i, from i = S (highest) down to i = 1 (lowest). The load level of each service class is varied such that the loss probabilities are in the range from 10^-6 to 10^-12.

Table 1. Parameter Settings

Case  W   S  λ3(x)       λ2(x)       λ1(x)   μ3(x)     μ2(x)      μ1(x)
I     4   1  0.01        -           -       0.99·x1   -          -
II    4   2  0.01        0.01        -       0.99·x1   0.99·x2    -
III   4   2  0.01        0.05        -       0.99·x1   0.95·x2    -
IV    4   2  0.01        0.005       -       0.99·x1   0.995·x2   -
V     4   3  0.01        0.01        0.01    0.99·x1   0.99·x2    0.99·x3
VI    4   3  0.01        0.05        0.05    0.99·x1   0.95·x2    0.95·x3
VII   4   3  0.01        0.005       0.005   0.99·x1   0.995·x2   0.995·x3
VIII  8   1  0.15        -           -       0.85·x1   -          -
IX    8   2  0.15        0.15        -       0.85·x1   0.85·x2    -
X     8   2  0.15        0.3         -       0.85·x1   0.7·x2     -
XI    8   2  0.15        0.05        -       0.85·x1   0.95·x2    -
XII   16  1  0.6(W−x1)   -           -       0.4·x1    -          -
XIII  16  2  0.6(W−x1)   0.6(W−x2)   -       0.4·x1    0.4·x2     -
The specific system parameters used within the simulation experiments are listed in Table 1, which includes the arrival and service rates used in the simulations. All simulation experiments were performed with state dependent service rates, and with both state independent and state dependent arrival rates. The state dependent arrival rates are such that λi(x) = (W − xi)·λi. The objective is to estimate the loss probabilities for the different service priority classes. For a service priority class i this means the probability that all wavelengths are in use and no packets of lower priority classes (j < i) can be preempted to make wavelengths available. Table 2 shows the simulation results obtained by our novel technique. Each simulation experiment focuses on one service (priority) class at a time. The sets of critical target rare event states for the different service classes with according priority levels p = 1, · · · , S are Rp = {x ∈ N^S : xp + · · · + xS = W}. For each simulation experiment, 3,000,000 independent simulation runs have been performed, from which the sample means X̄, their standard deviations S_X̄ and relative errors S_X̄/X̄ have been computed. Standard deviation and relative error are the common properties to measure the statistical accuracy of an estimation, here more specifically an estimation via simulation. All simulation experiments required only a few seconds to a few minutes of runtime.
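For illustration, the accuracy measures just described can be computed from the per-run IS observations I(R)·L as follows; this is a generic sketch with made-up sample values, not output from the paper's experiments.

```python
import math

def is_statistics(samples):
    """Sample mean, standard deviation of the mean, and relative error.

    samples are per-run IS observations I(R)*L; the relative error
    S_Xbar / Xbar is the accuracy measure reported alongside each estimate.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)   # sample variance
    std_of_mean = math.sqrt(var / n)                        # S_Xbar
    return mean, std_of_mean, std_of_mean / mean

# Made-up observations: most runs miss the rare set (weight 0), a few hit it.
mean, s_mean, rel_err = is_statistics(
    [0.0, 0.0, 2e-9, 0.0, 1e-9, 0.0, 3e-9, 0.0]
)
```

With millions of runs instead of eight, the relative error shrinks roughly with the square root of the number of runs, which is how the small values in Table 2 arise.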
Table 2. Simulation Results

        High priority, RS       Medium priority, RS−1   Low priority, R1
Case    X̄          S_X̄/X̄       X̄          S_X̄/X̄        X̄          S_X̄/X̄
I       4.29e-010  0.0001      -          -            -          -
II      4.26e-010  0.0006      6.82e-009  0.0177       -          -
III     4.07e-010  0.0014      6.04e-007  0.0049       -          -
IV      4.27e-010  0.0004      2.14e-009  0.0125       -          -
V       4.21e-010  0.0010      6.81e-009  0.0125       3.27e-008  0.0413
VI      3.88e-010  0.0020      5.81e-007  0.0070       6.63e-006  0.0114
VII     4.26e-010  0.0006      2.15e-009  0.0145       7.85e-009  0.1000
VIII    1.96e-011  0.0005      -          -            -          -
IX      1.64e-011  0.0015      4.09e-009  0.0181       -          -
X       1.28e-011  0.0028      2.30e-007  0.0119       -          -
XI      1.86e-011  0.0008      1.52e-010  0.0209       -          -
XII     6.91e-012  0.0106      -          -            -          -
XIII    1.89e-012  0.0767      5.18e-008  0.0386       -          -
The simulation results in Table 2 show the good performance of our novel simulation technique. All estimates are statistically accurate with small relative errors. The method works well both for state dependent and state independent arrival rates. It should be noted that the accuracy for the highest priority class with preemption depends on the load level of the lower priority classes, on the number of service classes, and on the number of wavelengths. The number of service classes and wavelengths also influences the accuracy of the low priority blocking probability estimates. As a final comment on the provided runtime gain, we would like to emphasize that it was not possible to compare our results to results from simulation experiments without IS, simply because in direct simulations no packet losses were observed within reasonable time. This can be easily explained by a rough argument as follows. If we consider an event that has a probability of 10^-11 to occur within a simulation run, then on average 10^11 runs must be simulated to observe only one single occurrence of this event. It becomes even worse if one wants to make statistically valid estimations. Assuming as a rule of thumb that at least one hundred observations of an event are necessary to provide reasonable statistics, then on average 10^13 simulation runs can be expected to be necessary to yield feasible results. Thus, since we only needed 3,000,000 runs, we have an improvement factor in the order of 10^7. Assume one minute as a reasonable representative average runtime for 10^6 runs. Then to obtain results of the accuracy of ours, direct simulation requires approximately 7,000 days, which is more than nineteen years. Of course, the above argumentation is only rough, but it rather tends to underrate the effort required by direct simulation than to overrate it (since only one hundred necessary observations of the rare event is quite optimistic in order
to get proper statistical accuracy). Hence, the simulation speed-up provided by our technique compared to direct simulation is indeed dramatic.
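The back-of-the-envelope comparison above, under the stated assumptions (one hundred required observations, one minute per million runs), can be reproduced as:

```python
p = 1e-11                       # target rare event probability
obs_needed = 100                # rule-of-thumb observations for valid statistics

direct_runs = obs_needed / p    # expected direct-simulation runs: 1e13
minutes = direct_runs / 1e6     # at one minute per million runs
days = minutes / (60 * 24)
years = days / 365
# roughly 7,000 days, i.e. about nineteen years, versus minutes with IS
```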
5
Conclusion
In this paper, we have considered the performance of an optical packet switch (OPS) with service differentiation. The service classes are mapped onto different preemptive priority levels, and the QoS performance attribute is the packet loss ratio for each service class. Acceptable QoS should guarantee low packet loss ratios for all service classes, and very low packet loss ratios for the high quality class. As packet losses are very unlikely, they become rare events, which pose serious challenges to conventional analysis methodologies. In order to tackle these challenges, a novel technique for efficient simulation of packet losses has been presented and applied to an OPS with multiple service classes where contentions are handled by full wavelength conversion and a preemptive drop policy. Our simulation technique makes use of Importance Sampling, which proceeds by choosing an alternative probability distribution for accelerated simulation and appropriately weighting the results to obtain unbiased estimators with significantly reduced variance compared to direct simulation estimators. The necessary parameters determining the alternative distribution are adaptively obtained by the ant colony optimization metaheuristic using a suitable search and updating procedure. An exceptionally strong feature of our method is that no intimate a priori knowledge of the system under consideration is required to set up the simulation. The parameters gradually adapt to the model as its state space is explored, and the Importance Sampling change of measure is set accordingly. This change of measure is then applied in the succeeding simulation trials. Very accurate results with small relative errors were obtained within only a few minutes for parameter settings where direct simulation would require decades of runtime. Further studies are planned and already started to demonstrate that the novel technique works well for different model structures.
For instance, for optical packet switched nodes with an upper bound on the number of wavelengths that may be allocated by a low-priority service class, first studies indicate the validity and efficiency of the method. It should also be noted that the applicability of the method is not restricted to exponentially distributed times: it is easy to incorporate phase-type distributions into the model and then to apply the ant colony optimization procedure in the same way as in the present paper. This will be another topic of future investigation. Further research also includes systematic studies of the properties of the Importance Sampling estimators obtained via ant colony optimization. Although a main motivation is the applicability to rare events with probabilities in the orders of magnitude of practical interest, it is of course also interesting and useful to examine asymptotic properties such as, e.g., those recently considered in [1,21], to gain insight into how the method behaves as the probability of interest converges to zero.
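The change of measure described above can be illustrated on a textbook rare-event problem that is unrelated to the OPS model of this paper: estimating P(X > t) for an exponential random variable by sampling from a biased exponential distribution and weighting each hit by the likelihood ratio. The function names and the heuristic choice of biased rate are our own illustrative assumptions.

```python
import math
import random

def is_estimate(t, lam=1.0, lam_is=None, n=20000, seed=1):
    """Importance-sampling estimate of P(X > t) for X ~ Exp(lam).

    Samples are drawn from the biased density Exp(lam_is) (the "change
    of measure"); each sample exceeding t contributes its likelihood
    ratio f(x)/g(x), which keeps the estimator unbiased while making
    the rare event frequent under the sampling distribution.
    """
    if lam_is is None:
        lam_is = 1.0 / t  # heuristic: place the biased mean at the threshold
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(lam_is)
        if x > t:
            total += (lam / lam_is) * math.exp(-(lam - lam_is) * x)
    return total / n

estimate = is_estimate(t=20.0)  # true value is exp(-20), about 2.1e-9
```

With the biased rate, roughly a third of the samples hit the event, whereas direct simulation would need on the order of 10^9 trials per hit; this is the same variance-reduction principle the paper applies, with ant colony optimization rather than a closed-form rule choosing the biased parameters.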
Evaluating Differentiated Quality of Service Parameters
173
References

1. Blanchet, J.H., Glynn, P.W., L'Ecuyer, P., Sandmann, W., Tuffin, B.: Asymptotic robustness of estimators in rare-event simulation. In: Proceedings of the 2007 INFORMS Simulation Society Research Workshop (to appear, 2007)
2. Blum, C.: Ant colony optimization: Introduction and recent trends. Physics of Life Reviews 2, 353–373 (2005)
3. Bucklew, J.A.: Introduction to Rare Event Simulation. Springer, Heidelberg (2004)
4. Carrasco, J.A.: Failure distance based simulation of repairable fault-tolerant systems. In: Proceedings of the 5th International Conference on Modeling Techniques and Tools for Computer Performance Evaluation, pp. 337–351 (1992)
5. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer, Heidelberg (1998)
6. Dorigo, M., Maniezzo, V., Colorni, A.: The ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 26(1), 29–41 (1996)
7. Dorigo, M., Stützle, T.: Ant Colony Optimization. The MIT Press, Cambridge (2004)
8. Glynn, P.W., Iglehart, D.L.: Importance sampling for stochastic simulations. Management Science 35(11), 1367–1392 (1989)
9. Hammersley, J.M., Handscomb, D.C.: Monte Carlo Methods. Methuen (1964)
10. Heegaard, P.E.: Efficient Simulation of Network Performance by Importance Sampling. PhD thesis, Norwegian University of Science and Technology (1998)
11. Heegaard, P.E.: A scheme for adaptive biasing in importance sampling. AEÜ International Journal of Electronics and Communications, Special Issue on Rare Event Simulation 52(3), 172–182 (1998)
12. Heidelberger, P.: Fast simulation of rare events in queueing and reliability models. ACM Transactions on Modeling and Computer Simulation 5(1), 43–85 (1995)
13. Juneja, S., Shahabuddin, P.: Rare event simulation techniques: An introduction and recent advances. In: Henderson, S.G., Nelson, B.L. (eds.) Simulation, Handbooks in Operations Research and Management Science, Ch. 11, pp. 291–350. Elsevier, Amsterdam, The Netherlands (2006)
14. O'Mahony, M.J., Simeonidou, D., Hunter, D.K., Tzanakaki, A.: The application of optical packet switching in future communication networks. IEEE Communications Magazine 39(3), 128–135 (2001)
15. Øverby, H., Stol, N.: Quality of service in asynchronous bufferless optical packet switched networks. Telecommunication Systems 27(2-4), 151–179 (2004)
16. Perros, H. (ed.): Special issue on optical networks. Computer Networks 50(2), 145–288 (2006)
17. Ramaswami, R., Sivarajan, K.N.: Optical Networks: A Practical Perspective, 2nd edn. Morgan Kaufmann, San Francisco (2001)
18. Rubinstein, R.Y.: Optimization of computer simulation with rare events. European Journal of Operational Research 99, 89–112 (1997)
19. Rubinstein, R.Y., Kroese, D.P.: The Cross Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning. Springer, Heidelberg (2004)
20. Sandmann, W.: Importance sampling in Markovian settings. In: Proceedings of the 2005 Winter Simulation Conference, WSC'05, pp. 499–508 (2005)
174
P.E. Heegaard and W. Sandmann
21. Sandmann, W.: Efficiency of importance sampling estimators. Journal of Simulation 1(2), 137–145 (2007)
22. Shwartz, A., Weiss, A.: Large Deviations for Performance Analysis. Chapman & Hall, Sydney, Australia (1995)
23. Sivalingam, K.M., Subramaniam, S.: Emerging Optical Network Technologies. Springer, Heidelberg (2004)
GESEQ: A Generic Security and QoS Model for Traffic Priorization over IPSec Site to Site Virtual Private Networks

Jesús A. Pérez, Victor Zárate, and Angel Montes

ITESM - Campus Cuernavaca, Department of Electronics, Temixco, Morelos, 62584, México
{jesus.arturo.perez, vzarate, A00373556}@itesm.mx
Abstract. Virtual Private Networks (VPNs) are replacing expensive leased lines, since they use the public Internet as the communication channel. However, the Internet by itself does not guarantee the security and QoS needed to send voice and video packets. In this paper we present a generic security and QoS model named GESEQ, based on the DiffServ model, which captures the design considerations that arise when delay-sensitive packets must be protected by IPSec tunnels and also treated preferentially to counter the increase in packet latency, jitter and packet loss under network traffic congestion. We built a particular test scenario and present its quantitative analysis to show that GESEQ works efficiently when sending voice and video packets with both security and QoS. Keywords: Differentiated Service, IPSec, QoS, Virtual Private Network.
1 Introduction

Computer networks have evolved enough to support multimedia applications such as IP videoconferencing over corporate networks and the Internet. As the Internet and corporate intranets continue to grow, applications beyond traditional data are envisioned, such as VoIP and IP videoconferencing. More users and applications choose the Internet as their communication channel every day, and the Internet needs the functionality to support both existing and emerging applications and services. Today, however, the Internet offers only best-effort service. A best-effort service provides no guarantee that a packet is delivered to the receiver in a timely manner, or delivered at all, because packets are frequently dropped during network congestion [1]. The use of VoIP and videoconferencing imposes QoS requirements on voice and video traffic. QoS is the network's capacity to offer the best service to a selected traffic flow [2]. Voice and video transmission with high latency and jitter, low bandwidth and heavy loss can result in unacceptable communication [3]. Real-time multimedia traffic is very sensitive to delay, and the best-effort service is not the right approach for this kind of traffic.

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 175 – 186, 2007. © Springer-Verlag Berlin Heidelberg 2007

The DiffServ model is used as one of
the most important QoS solutions, as it can give special treatment to delay-sensitive packets. Since it is nowadays recommended to include QoS tools over the link, protecting the information should also be a requirement, especially when transmitting data over the Internet or a shared WAN. The IP protocol by itself does not protect data on a public network. In order to prevent and mitigate some attacks, the IPSec Virtual Private Network (VPN) technology was developed. IPSec integrates security elements into the IP protocol such as origin authentication, data integrity, confidentiality, non-repudiation and anti-packet repetition [4]. But adding security to the network degrades the QoS of delay-sensitive packets. As a consequence, it is important to balance the security and QoS aspects at the same time. Many studies have been done to evaluate IPSec performance, but their results do not apply to our purposes, since our network infrastructure includes routers which create the IPSec tunnels. Also, the data to transmit is voice and video simultaneously in the same tunnel; the data is real time and not buffered, generated by one videoconference over wired media. The results in [5] only focus on voice traffic and not on video traffic. In [6] the test scenario consists of a wireless network and the IPSec tunnel creation is based on desktop nodes. The same happens in [7], where no network-layer equipment is included and the test did not include voice or video traffic either. The scenarios in [8,9] include wireless equipment, and no multimedia traffic was considered for the results. In [10] the results included streaming voice and video, but this kind of traffic is not real time like videoconference traffic, since the data is stored in a file before it is sent.
In this context, the objective of our work is to create a generic security and QoS model that can be implemented when delay-sensitive packets must be protected and treated preferentially over an IPSec Site to Site VPN. GESEQ bases its functionality on the IPSec VPN mechanisms and the DiffServ QoS tools. The structure of this work is as follows: Section 2 introduces basic concepts of IPSec VPNs. Section 3 describes the basics of QoS. Section 4 describes the GESEQ model. Section 5 depicts a particular test scenario that was built to validate our security and QoS model, and its implementation. Finally, Section 6 analyzes the results after implementing GESEQ and Section 7 describes the conclusions achieved.
2 IPSec Virtual Private Networks

IPSec was designed and created by the IETF as the security architecture for the Internet Protocol (IP). It defines the IP packet formats and infrastructure dedicated to providing authentication, data integrity, confidentiality and anti-packet repetition. IPSec is based on two encapsulation protocols: ESP (Encapsulating Security Payload) and AH (Authentication Header). AH provides origin authentication, data integrity and anti-packet repetition. ESP provides all the characteristics listed above and additionally provides confidentiality through data encryption.
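The encapsulation cost of ESP in tunnel mode can be estimated from standard field sizes (outer IP header, SPI/sequence number, IV, trailer, ICV). The sketch below assumes AES-CBC encryption with a 96-bit truncated HMAC-SHA-1 ICV; the field sizes follow RFC 4303/RFC 3602 and are illustrative assumptions, not figures taken from this paper.

```python
def esp_tunnel_overhead(packet_len: int) -> int:
    """Approximate bytes that tunnel-mode ESP (AES-CBC + HMAC-SHA-1-96)
    adds to an original IP packet of `packet_len` bytes.

    Assumed field sizes: outer IP header 20, ESP header (SPI + sequence
    number) 8, AES-CBC IV 16, pad-length/next-header trailer 2, ICV 12.
    The packet plus trailer is padded up to the 16-byte AES block size.
    """
    OUTER_IP, ESP_HDR, IV, TRAILER, ICV, BLOCK = 20, 8, 16, 2, 12, 16
    plain = packet_len + TRAILER
    padded = -(-plain // BLOCK) * BLOCK  # ceil to a multiple of BLOCK
    return OUTER_IP + ESP_HDR + IV + (padded - packet_len) + ICV

overhead = esp_tunnel_overhead(538)  # a 538-byte packet grows by 62 bytes here
```

Per-packet overhead of this order is part of what makes encrypted multimedia traffic sensitive to queuing and serialization effects on slow WAN links.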
3 Quality of Service

The usage of QoS tools enables a connection with a good degree of performance. The most important DiffServ QoS tools are: classification and marking, scheduling mechanisms, congestion avoidance and link-specific tools. In order to measure the QoS improvement after the QoS tools are implemented, it is necessary to define some parameters that quantify the performance. The usual ones are latency, jitter and packet loss. According to the recommended values, the one-way latency should not exceed 150 ms, jitter should not exceed 50 ms and packet loss should be below 1% [11]. When these limits are exceeded, it does not necessarily mean that the communication will be lost, but the quality of voice and video will be degraded in proportion to how far the recommended limit is exceeded. The premise of DiffServ consists in offering different network service levels to a packet, thereby enabling scalable service discrimination in the Internet without the need for per-flow state and signaling at every hop [12]. Packets of a particular service belong to a particular class, and the treatment of each class is described by PHBs (Per Hop Behaviors), with which the network node must comply. Meaningful services can be constructed by setting a field in the IP header (IPP or DSCP) upon network entry or at a network boundary, and conditioning the marked packets at network boundaries according to the requirements or rules of each class or service. In data networks carrying multimedia traffic, the latency is broken down into packetization delay (Pd_i), serialization delay (Sd_i) and propagation delay (Pnd_i). For every packet, its latency is calculated as Pd_i + Sd_i + Pnd_i. But since Pd_i cannot be measured with a sniffer, we did not include it in the results. The latency, jitter and packet loss of all packets were averaged in order to produce a more accurate output.
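The IPP and DSCP fields mentioned above share the same byte: the DSCP occupies its upper six bits, and the legacy IP Precedence is the top three of those. A small sketch of the codepoint arithmetic (helper names are ours; the values follow RFC 2474 and RFC 2597):

```python
def af_dscp(cls: int, drop: int) -> int:
    """Assured Forwarding codepoint AFxy -> DSCP value 8x + 2y (RFC 2597)."""
    return 8 * cls + 2 * drop

def ds_byte(dscp: int) -> int:
    """The DSCP sits in the six most significant bits of the ToS/DS byte."""
    return dscp << 2

def ip_precedence(dscp: int) -> int:
    """Legacy IP Precedence is the top three bits of the DSCP."""
    return dscp >> 3

# AF41 (used later in this paper for voice and video) is DSCP 34,
# i.e. DS byte 0x88 and IP Precedence 4
```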
The formulas used to calculate the packet latency and the packet jitter are shown next:

Latency = ( Σ_{i=1}^{N} (Sd_i + Pnd_i) ) / N

Jitter = ( Σ_{i=1}^{N} (At_i − At_{i−1}) ) / N
Here i is the packet number, N is the total number of packets transmitted during the measurement period, and At_i is the arrival time of packet i at the destination node. The next section explains our GESEQ model.
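Applied to a packet capture, the two formulas reduce to straightforward averages. The sketch below assumes per-packet lists of serialization delays, propagation delays and arrival times; it takes the magnitude of each inter-arrival difference, which the printed jitter formula leaves implicit.

```python
def mean_latency(sd, pnd):
    """Average of Sd_i + Pnd_i over the N captured packets."""
    return sum(s + p for s, p in zip(sd, pnd)) / len(sd)

def mean_jitter(at):
    """Average magnitude of At_i - At_{i-1} over the N captured packets."""
    n = len(at)
    return sum(abs(at[i] - at[i - 1]) for i in range(1, n)) / n
```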
4 GESEQ: Generic Security and QoS Model

GESEQ consists of a set of generic components specialized in allowing a secure transfer of multimedia traffic with QoS from a VPN customer perspective (figure 1). It includes a practical sequence of steps that can be followed for the given purpose. GESEQ can be split into three parts: security, QoS and management. It does not matter whether the security steps are implemented before the QoS ones or not, since both functionalities are independent. The management part of our model involves the administrative aspects of security and QoS.
4.1 Security

1. Determination of the protection scope: In order to provide security over the links, it is first necessary to determine the scope of the IPSec protection. The scope could be a telecommuter's network, a single LAN (or part of a LAN), an extranet or an intranet.

Fig. 1. GESEQ model

2. Determination of the IPSec tunnel devices: The security scope determines the IPSec devices required to protect the traffic. The most common hardware VPN devices are routers, PIX appliances and firewalls, but there are also dedicated VPN devices such as VPN concentrators.

3. Determination of the information to be protected: Only the critical information should be protected through the IPSec VPN, since encryption is a process that demands a lot of computational resources.

4. Configuration of security elements: The final step is to configure the security elements, such as the authentication and integrity mechanisms, the IPSec and key exchange protocols, and the encryption algorithm.
The peer authentication mechanisms validate the IPSec device identity; the device could be a router or any other IPSec device. The most common peer authentication mechanisms are digital certificates and pre-shared keys. The most popular packet integrity mechanisms are HMAC-MD5 and HMAC-SHA. The latter is more secure since it has more mathematical complexity, and that is why we recommend its use. The IPSec protocols (AH and ESP) encapsulate the original IP packets. To create a more secure link we suggest ESP in tunnel mode, since it also offers confidentiality through the usage of encryption algorithms.
Given that IPSec is an open standard, it can use many encryption algorithms, but a bidirectional pair of IPSec SAs can only use one at a time. The recommended encryption algorithms in GESEQ are the strongest ones: 3DES, AES and IDEA. In our experiments AES had the best performance, so GESEQ proposes AES as the best choice. The key exchange protocol generates and regenerates the encryption keys during the lifetime of every IPSec SA. Another of IKE's functions is to establish the IPSec SAs, which define the way the traffic is protected.

4.2 Quality of Service

1. Determination of the information that will need QoS: In order to provide a preferential or deferential treatment to the information, it is first necessary to determine what kind of treatment the information needs.

2. Determination of the QoS scope: The scope could include a complete or partial LAN, a wireless LAN or WAN links. The most common implementation is where channel speed is scarce, as in the case of WAN links. QoS is typically considered an end-to-end characteristic: if one link in the end-to-end path fails to provide it, then the application will not receive the QoS it needs [13].

3. Configuration of QoS tools: The next step is the configuration of the QoS tools: classification and marking, scheduling mechanisms, congestion avoidance and link-specific tools (figure 2).
The marking can be done in layer 2 or layer 3 (IP) headers, but it is recommended to do it in layer 3, since the network could be implemented with different layer 2 technologies such as ATM or Frame Relay. The marking can be applied to the IP Precedence bits (in the case of the ToS byte) or to the DSCP bits (in the case of the DS byte). Two commonly used scheduling mechanisms are Class-Based Weighted Fair Queuing (CBWFQ) and Low Latency Queuing (LLQ). CBWFQ enables the creation of up to 256 classes of traffic, each one with its own reserved queue. Each queue is serviced based on the bandwidth assigned to its class or kind of traffic. LLQ bases its functionality on CBWFQ but is better suited to transmitting real-time applications, since it creates a low latency queue dedicated to this kind of traffic and serves it faster than the other queues (64 is the recommended number). For that reason we suggest that LLQ be used in GESEQ. After defining the queuing technique, a percentage of bandwidth must be assigned to the traffic types. For multimedia packets the bandwidth percentage should be greater than for the remaining traffic, since in our model we want to favor the multimedia traffic. The congestion avoidance tools that we implemented are Weighted Random Early Detection (WRED), which drops packets randomly according to the weight of the IPP header value, and DSCP-based WRED, which discards packets randomly as well, but according to the weight of the DSCP header value. However, if possible RIO should be implemented, since it has better performance. Link-specific tools such as RTP-IPSec header compression (IPSec cRTP) are under development. Link fragmentation is only recommended when the link speed is lower than 768 kbps, since a 1500-byte data packet can take too long to be transmitted because of the serialization delay. Policers and shapers are traffic conditioners that check traffic violations by a defined customer. When the customer exceeds a determined amount of traffic, the policer drops the exceeding packets; the shaper, instead of dropping them, buffers the exceeding packets until they can be sent.

Fig. 2. QoS toolset

4.3 Management

In this part the VPN client must consider the administrative aspects of the transmission, such as accounting, charging and the establishment of the Service Level Agreement (SLA) between the client and the provider; there are cases where the VPN client network could cross several VPN providers [14]. An automated software agent, namely the VPN service broker, can be installed for VPN QoS management. We need to clarify that GESEQ does not guarantee that latency, jitter and packet loss will not exceed the recommended values. Instead, our model guarantees that under stress and traffic congestion the delay-sensitive packets will not increase their latency and jitter to a high degree.
5 Test Scenario and Implementation

The test scenario was designed with the intention of representing a real university campus or a small company network. Our main purpose was not to simulate a
network behavior; instead we wanted to create a real scenario where we could observe the performance of our topology once GESEQ had been implemented. The particular test scenario is shown in figure 3. It consisted of six nodes: two for the IP videoconference and the remaining four dedicated to generating extra traffic in order to induce network congestion. Videoconf. Node A and Videoconf. Node B held the IP videoconference through the IPSec tunnel. In our test scenario, node C established two HTTP sessions and one FTP session with node E. The HTTP sessions (through a web page) were used to download an 800 Mb file. The FTP session was also used to download a similar file, but using the FTP protocol. To generate more traffic, node C sent continuous ICMP packets (pings) to node E with a size of 10000 bytes (all ICMP packets in our scenario were of this size). Node D was the FTP and HTTP server on network segment 1. It also established two HTTP sessions with node E and generated ICMP traffic towards the same node. Node E had the role of FTP and HTTP server on network segment 2. It connected to node D to set up two HTTP sessions and one FTP session. It also generated ICMP traffic towards node D. Node F established two HTTP sessions with node D and also generated ICMP traffic towards node D. The IPSec VPN was implemented between two routers (A and C), creating one IPSec tunnel to protect all UDP multimedia information until it arrived at the destination node, in both directions. The FTP, HTTP, ICMP and remaining traffic was not protected by the IPSec VPN. Every WAN link was set to a speed of 1 Mbps using the routers' serial interfaces. The switched networks are 100 Mbps, so there is a bottleneck
Fig. 3. Network topology
in the core network. The videoconference nodes held an IP videoconference with VIGO VCON proprietary consoles, cameras and microphones. We used three Cisco 2621XM routers with IOS 12.3(14T). All nodes were Pentium IV machines with Windows XP as the operating system. The testing consisted of capturing the voice and video packets that traveled from Videoconf. Node B to Videoconf. Node A with the EtherPeek NX sniffer. We only considered the multimedia packets for the results reported in this article. The remaining traffic (FTP, HTTP and ICMP) was discarded, since it was injected only for congestion purposes. The sniffer node captured the incoming and outgoing traffic between the network segments through two network adapters (NICs). We established fifteen real-time IP videoconferences. Every videoconference lasted 2 minutes, during which all multimedia packets were captured, making a total of 30 minutes of captured videoconference packets. The first five videoconferences were run without the IPSec VPN and without QoS tools. The next five included only the IPSec VPN, with no QoS tools. The final five included both the IPSec VPN and the QoS tools. Following the proposed steps of GESEQ, the implementation is shown next.

5.1 Security

The VPN protection scope was the autonomous system between routers A and C, including all the traffic between them. Routers A and C established the IPSec tunnel, since router B just needed to forward the encrypted traffic. Only the voice (G.722) and video (H.263) packets were protected. The security elements were defined as follows: pre-shared keys were chosen for the peer (router) authentication, while HMAC-SHA was used as the packet integrity mechanism. IPSec ESP in tunnel mode was chosen to transport the encrypted multimedia packets. We used AES as the encryption algorithm and IKE as the protocol that generated/regenerated the encryption keys and the IPSec SAs.
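On Cisco IOS, the security elements above translate roughly into the following configuration sketch. This is our own illustrative reconstruction, not the authors' actual configuration: the key string, the names (TS, VPNMAP), the peer address, the access list and the interface are placeholders, and exact syntax varies between IOS releases.

```
! IKE phase 1: pre-shared keys, AES, SHA
crypto isakmp policy 10
 encryption aes
 hash sha
 authentication pre-share
crypto isakmp key SECRETKEY address 192.0.2.2

! IPSec phase 2: ESP in tunnel mode with AES and HMAC-SHA
crypto ipsec transform-set TS esp-aes esp-sha-hmac
 mode tunnel

! Protect only the UDP multimedia traffic between the two sites
access-list 110 permit udp 192.0.2.16 0.0.0.15 192.0.2.32 0.0.0.15

crypto map VPNMAP 10 ipsec-isakmp
 set peer 192.0.2.2
 set transform-set TS
 match address 110

interface Serial0/0
 crypto map VPNMAP
```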
5.2 Quality of Service

In our scenario there were five kinds of traffic (protocols): voice (G.722) and video (H.263), HTTP, FTP, ICMP and the remaining traffic. For our purposes, voice and video packets had the most preferential treatment. HTTP and FTP traffic also had priority, but with fewer privileges. ICMP and the remaining traffic did not have any preferential treatment. The QoS scope began and finished at the Fast Ethernet interfaces of the edge routers. Classification and marking were based on DSCP values. G.722 and H.263 traffic was marked as DSCP AF41, HTTP as AF11, FTP as AF12 and ICMP/remaining traffic as 0. The chosen congestion avoidance mechanism was DSCP-based WRED. The scheduling mechanism was LLQ; the packets were scheduled for transmission on the output interface. Since the multimedia packets were the preferential traffic, we decided to dedicate 50% of the total amount of bandwidth (1 Mbps) to them.
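In Cisco IOS Modular QoS CLI terms, the marking and LLQ decisions above correspond roughly to the following sketch. Again, this is our illustrative reconstruction, not the routers' actual configuration; the class names and the ACL number are placeholders.

```
! Classification and marking at the LAN edge
class-map match-all MULTIMEDIA
 match access-group 120        ! UDP ports used by the consoles (placeholder)
class-map match-all WEB
 match protocol http
class-map match-all FILE
 match protocol ftp

policy-map MARK-IN
 class MULTIMEDIA
  set ip dscp af41
 class WEB
  set ip dscp af11
 class FILE
  set ip dscp af12

! LLQ on the 1 Mbps WAN link: 50% priority for the AF41 traffic,
! DSCP-based WRED for everything else
class-map match-all AF41-TRAFFIC
 match ip dscp af41

policy-map LLQ-OUT
 class AF41-TRAFFIC
  priority percent 50
 class class-default
  fair-queue
  random-detect dscp-based

interface Serial0/0
 service-policy output LLQ-OUT
```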
5.3 Management

We configured the VPN and QoS mechanisms manually. We did not consider any SLA. The configurations of the edge routers and the intermediate router were different. The edge routers created the IPSec tunnels and applied DSCP markings to the incoming preferential packets at the Ethernet interface. They also applied the QoS tools at the output interface. The intermediate router did not create any IPSec tunnel but did apply the QoS tools at its two serial interfaces.
6 Results Analysis

In the particular test scenario we included three sub-scenarios to evaluate latency, jitter and packet loss:

a) Without the IPSec VPN and without QoS mechanisms.
b) With the IPSec VPN implemented and no QoS mechanisms.
c) With the IPSec VPN implemented and the QoS tools configured.

6.1 Latency

The latency parameter never exceeded the recommended 150 ms (figure 4) in sub-scenarios a and c. The exception was sub-scenario b. This behavior was expected, since the routers had to encrypt and decrypt multimedia packets and these packets were not treated preferentially. As a consequence, the latency in sub-scenario b was 252.18 ms for voice and 264.77 ms for video packets, exceeding the recommended 150 ms.

Fig. 4. Results of packet latency (voice/video, in ms: 81.3/95.5 without VPN and QoS; 252.2/254.8 with VPN, no QoS; 66.2/70.0 with VPN and QoS)
In sub-scenarios a and c the latency was satisfactory according to the recommended value. In sub-scenario a the latency was 81.27 ms for voice packets and 95.47 ms for video packets. Very interesting results were found in test sub-scenario c, where, having both VPN and QoS configured, latency reached 66.24 ms for voice and 70.02 ms for video packets.
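Part of each of these figures is per-hop serialization delay, which grows linearly with packet size. A quick calculation, using the 1 Mbps serial links of the test scenario and the voice/video packet sizes reported in this section (538 and 1300 bytes):

```python
def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time to clock one packet onto the link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000.0

voice_ms = serialization_delay_ms(538, 1_000_000)   # about 4.3 ms per hop
video_ms = serialization_delay_ms(1300, 1_000_000)  # about 10.4 ms per hop
```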
As we can see, the latency for the multimedia packets was better than in sub-scenario a. We attribute this behavior to the congestion management mechanism we used, since the multimedia packets were treated preferentially. Although the voice and video packets were encrypted and decrypted, the scheduling mechanism served their low latency queue faster and more frequently; and with the 50% of reserved bandwidth, the multimedia packets took less time to travel to the destination node. We can see that the video packets took more time to be sent than the voice packets. The reason is that video packets are longer: a voice packet (using the G.722 codec) occupied 538 bytes, while the average size of a video packet (with the H.263 codec) was 1300 bytes. The serialization delay tends to increase as the packet size grows. We see that even though the encryption process contributes to the packet latency, it is not the main reason for the increase in latency. The deployment of IPSec and its associated overhead is a main reason for the increased packet latency in sub-scenario b; however, this can be offset by using QoS mechanisms.

6.2 Jitter

The jitter for voice packets under the three sub-scenarios was greater than the recommended 50 ms (figure 5). However, we can observe that, once the QoS tools were implemented, the video packets reduced their jitter to 18.19 ms. Without QoS tools, jitter increased to about 30 ms in sub-scenarios a and b. We can also see that the implementation of the IPSec VPN did not affect the jitter for either voice or video packets.

Fig. 5. Results of packet jitter (voice/video, in ms: 60.12/30.96 without VPN and QoS; 60.07/30.28 with VPN, no QoS; 59.96/18.19 with VPN and QoS)
Despite this behavior, the end user did not notice bad voice quality, because videoconference consoles were used and 60 ms is not far from the recommended 50 ms. The similarity of the jitter results across scenarios was caused by using the same traffic injection rate in the three scenarios. The consoles have buffering memory that can compensate for changes in the arrival time. With buffers for jitter control, the asynchronous arrival times become synchronous, thereby improving the voice quality.
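The console buffering can be modeled as a fixed playout delay: each packet is played at its send time plus the buffer depth, so arrival-time variation up to that depth is absorbed. This is a simplified sketch of the idea, not the consoles' actual algorithm; the timings are made up for illustration.

```python
def playout_schedule(send_times, arrival_times, buffer_ms):
    """For each packet, return (playout time, arrived in time?).

    A packet whose arrival misses its playout instant would be skipped,
    so the buffer depth bounds the jitter the listener can tolerate.
    """
    return [(s + buffer_ms, a <= s + buffer_ms)
            for s, a in zip(send_times, arrival_times)]

# packets sent every 20 ms with jittery arrivals; a 60 ms buffer absorbs
# all but the second packet's lateness
schedule = playout_schedule([0, 20, 40], [35, 95, 70], 60)
```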
6.3 Packet Loss

The packet loss under the three sub-scenarios was almost null (figure 6). This percentage was based on the total number of packets transmitted by the origin node towards the destination node. The increment in sub-scenario b can be attributed to the IPSec sliding window (similar to the TCP sliding window). This window protects against anti-replay attacks and detects out-of-time packets. Without any mechanism to treat a multimedia packet preferentially, the packet competes for link access and waits (buffered) a certain time x until the transmission medium is available.

Fig. 6. Results of packet loss (nearly zero in all sub-scenarios; only sub-scenario b shows losses, about 0.1%)
If this time x is high, the packet runs a big risk of arriving out of time at the destination router. The router may then discard the packet in question, since it did not fit in time within the sliding window. Having the QoS tools implemented did not produce any packet loss: when the multimedia packets had to be scheduled at the output interface, they were sent immediately, before the remaining packets. One important factor that must be considered is the speed of the interfaces. Having interfaces with a speed mismatch among them will increase the packet loss.
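The drop behaviour attributed to the sliding window can be sketched as follows. This is a simplified model of the RFC 4303 anti-replay check (a set stands in for the window bitmap); the window size and sequence numbers are illustrative.

```python
class ReplayWindow:
    """Accept a packet only if its sequence number is new and not more
    than `size` behind the highest sequence number seen so far."""

    def __init__(self, size: int = 64):
        self.size = size
        self.highest = 0
        self.seen = set()

    def accept(self, seq: int) -> bool:
        if seq in self.seen or seq + self.size <= self.highest:
            return False  # replayed or too old: the router drops it
        self.seen.add(seq)
        self.highest = max(self.highest, seq)
        return True

w = ReplayWindow(size=64)
w.accept(100)   # in order: accepted
w.accept(30)    # 70 behind the newest packet seen: dropped as out of time
```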
7 Conclusion

We defined a generic model that offers the possibility of transmitting any kind of traffic with security and QoS, although for our purposes we focused on delay-sensitive traffic. GESEQ is versatile, since the security mechanisms and the QoS tools can be applied separately, depending on the network needs. One particular test scenario was created to prove the efficiency of GESEQ. Based on the observed performance results of the QoS parameters, we conclude that GESEQ allowed a better quality in the transmission of voice and video packets when they are encrypted under network traffic congestion. Under our particular traffic conditions, GESEQ improved the voice and video latency by 74%, the video jitter by 40% and the video packet loss by 0.01%. GESEQ performs well even when
the network is slightly or heavily congested; its performance remains very close to the results shown in this paper. We observed that when the multimedia information is protected with an IPSec VPN and no classification, marking, scheduling or congestion avoidance tools are implemented, the QoS parameters increase, thereby degrading the quality. In our future work we will produce a performance evaluation using different IPSec and QoS parameters for different network scenarios. We are also considering presenting the same kind of results as in Section 6.3 for the background traffic.
References

[1] Vegesna, S.: IP Quality of Service. Cisco Press (2001) ISBN 1-57870-116-3
[2] Cisco Systems: Wireless Quality-of-Service Deployment Guide (2000), www.cisco.com
[3] Mansour, Y., Patt-Shamir, B.: Jitter control in QoS networks. IEEE/ACM Transactions on Networking (TON) 9(4) (2001)
[4] Doraswamy, N., Harkins, D.: IPSec: The New Security Standard for the Internet, Intranets, and Virtual Private Networks. Prentice Hall PTR Internet Infrastructure Series. Prentice Hall, Englewood Cliffs (1999)
[5] Barbieri, R., Bruschi, D., Rosti, E.: Voice over IPsec: analysis and solutions. In: Proceedings of the IEEE Computer Security Applications Conference, pp. 261–270 (2002)
[6] Wei, Q., Srinivas, S.: IPSec-based secure wireless virtual private network. In: IEEE MILCOM 2002 Proceedings, vol. 2, pp. 1107–1112 (2002)
[7] Khanvilkar, S., Khokhar, A.: Virtual private networks: an overview with performance evaluation. IEEE Communications Magazine 42(10), 146–154 (2004)
[8] Khatavkar, D., Hixon, E.R., Pendse, R.: Quantizing the throughput reduction of IPSec with mobile IP. In: Circuits and Systems, MWSCAS-2002, vol. 3, pp. 505–508 (2002)
[9] Hadjichristofi, G.C., Davis, N.J., Midkiff, S.F.: IPSec overhead in wireline and wireless networks for Web and email applications. In: Conference Proceedings of the IEEE International Performance, Computing, and Communications Conference, pp. 543–547 (2003)
[10] Al-Khayatt, S., et al.: Performance of multimedia applications with IPSec tunneling. In: IEEE Proceedings of the International Conference on Coding and Computing, pp. 134–138 (2002)
[11] Szigeti, T., Hattingh, C.: End-to-End QoS Network Design: Quality of Service in LANs, WANs, and VPNs. Cisco Press (2005)
[12] Nichols, K., et al.: Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers. RFC 2474, IETF (1998)
[13] Davie, B.: Deployment Experience with Differentiated Services. In: Proceedings of the ACM SIGCOMM 2003 Workshops, pp. 131–136 (2003)
[14] Schneider, J.M., Preuß, T., Nielsen, P.S.: Management of Virtual Private Networks for Integrated Broadband Communication. In: Conference Proceedings on Communications Architectures, Protocols and Applications, ACM SIGCOMM, pp. 224–237 (1993)
Routes Building Approach for Multicast Applications in Metro Ethernet Networks Anatoly M. Galkin, Olga A. Simonina, and Gennady G. Yanovsky State University of Telecommunications, Telecommunication Networks Department, St. Petersburg, Russia
[email protected],
[email protected],
[email protected] http://www.sut.ru
Abstract. This study addresses the problems of multicast traffic transmission in Metro Ethernet Networks. We develop an approach to spanning tree construction that improves the QoS parameters for multicast traffic. Keywords: multicasting, Metro Ethernet, spanning trees.
1 Introduction Users' requirements are shifting to applications beyond plain Internet access: the amount of traffic from streaming and real-time applications has grown constantly over recent years. Each application places its own requirements on QoS parameters, different from legacy elastic traffic requirements. IPTV is one such example: it generates multicast traffic with hard QoS requirements. Problems of compound traffic transmission are thus brought to the forefront for carriers and service providers. The triple play concept implies simultaneous transmission of elastic traffic, audio traffic (VoIP and streaming) and video traffic (videoconference and streaming) across a single transport network; an effective QoS-aware transport network is therefore vital. Metro Ethernet is one of the most widely available transport mechanisms and has gained recognition for multimedia traffic transmission in recent years. Section 2 reviews the history of Ethernet development, outlines the Metro Ethernet architecture, discusses the suitability of Metro Ethernet transport for triple play services, and considers spanning tree protocols. In Section 3 we propose a scheme for spanning tree construction for multicast applications in Metro Ethernet: the idea is to place the multicast source at the root of the spanning tree instance mapped to the multicast VLAN. We show the advantage of our approach. Section 4 concludes the work.
2 Metro Ethernet 2.1 Metro Ethernet Features Carrier Ethernet (or Metro Ethernet) is one of the advanced technologies of the NGN transport plane. A non-commercial organization, the Metro Ethernet Forum, is responsible for Carrier Ethernet standards development [1]. Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 187 – 193, 2007. © Springer-Verlag Berlin Heidelberg 2007
Ethernet technology overstepped the limits of local area networks (LANs) a long time ago, driven by several important factors:
- a permanent increase of Ethernet capacity, from 3 Mbps in the beginning up to 10 Gbps nowadays and 100 Gbps in the near future;
- the appearance of cheap layer 2 switches and, consequently, the gradual replacement of CSMA/CD (common bus) with full duplex operation.
Nowadays Ethernet is perceived as a convenient and efficient tool that helps new services get to the market, since it allows multiple gradations of attributes and QoS management. But we should remember that modern Ethernet is not similar to the native Ethernet technology offered by Robert Metcalfe back in 1973 [2]. The Metro Ethernet Forum defines Carrier Ethernet as traditional Ethernet with five carrier-class attributes [3]:
• Scalability,
• Protection,
• Hard QoS,
• TDM support,
• Service management.
Almost all premises (MAN subscribers) own Ethernet-based LANs, and if metro carriers possess Ethernet transport as well, this simplifies the user-network interface (UNI). Besides, Ethernet is much cheaper than legacy solutions for metro networks (TDM or Frame Relay). The development of Metro Ethernet Networks (MEN) is strongly related to the implementation of the Spanning Tree Protocol (STP) family. In what follows, we consider the features of the STPs and see how these protocols can be applied in MENs. 2.2 Spanning Tree Protocols Native Ethernet networks use STP to determine forwarding paths through the network; it was originally proposed by Radia Perlman [4] and finally standardized in IEEE 802.1d. It is a layer 2 protocol implemented in switches (and formerly in bridges). Essentially, STP uses a shortest-path approach in the construction of a tree overlaid on a meshed Ethernet network. Initially, the spanning tree was used to avoid broadcast storms appearing due to loops in a network: STP prevents loops by blocking redundant links. As a result, load is concentrated on certain links while other links are not used; there is no load balancing mechanism. Each port of each non-root switch is in one of five states: disabled, blocked, listening, learning and forwarding. Only ports in the forwarding state can forward frames. This provides a single shortest path to the root. When the network topology changes, all nodes restart the spanning tree algorithm, which can introduce large delays (up to several minutes). It is obvious that STP has certain inherent disadvantages if used in a MEN:
• STP leaves many switch ports unused (non-efficient utilization of resources);
• poor resiliency;
• a lack of load balancing mechanisms;
• a lack of QoS support.
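The splitting of a meshed topology into forwarding and blocked links can be illustrated with a minimal sketch. The switch names, the `spanning_tree` helper and the BFS tie-breaking below are our own simplifications; real STP elects a root and exchanges BPDUs rather than running a centralized search:

```python
from collections import deque

def spanning_tree(adjacency, root):
    """Compute an STP-like shortest-path tree over a mesh of switches.

    Links on the tree become 'forwarding'; all remaining redundant links
    are 'blocked', which is how STP removes loops from the topology.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    forwarding = {frozenset((n, p)) for n, p in parent.items() if p is not None}
    all_links = {frozenset((n, m)) for n in adjacency for m in adjacency[n]}
    blocked = all_links - forwarding
    return forwarding, blocked

# A 4-switch mesh with redundant links B-D and C-D.
mesh = {"A": ["B", "C", "D"], "B": ["A", "D"],
        "C": ["A", "D"], "D": ["A", "B", "C"]}
fwd, blk = spanning_tree(mesh, "A")  # 3 forwarding links, 2 blocked
```

Note that the two blocked links carry no traffic at all, which is exactly the resource-utilization drawback listed above.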
Later, STP was significantly improved: an updated version, Rapid STP (RSTP), was issued. Initially it was defined in IEEE 802.1w and later standardized in 802.1d [5]. RSTP reduces the number of port states from five to three: discarding, learning and forwarding. RSTP decreases the convergence time to 3–5 seconds and sometimes to subsecond values (depending on the network topology) through a fast transition to the forwarding state. Moreover, unlike in STP, the topology change notification is propagated throughout the network simultaneously. But, like STP, RSTP still builds only one tree for the whole network; therefore, all its disadvantages are similar to those of STP except the convergence time. In LANs these problems are solved by a fat-tree hierarchy, meaning that the closer a link is to the root, the higher its bandwidth. While this is possible in LANs, it is not feasible in MANs. Some of the above-mentioned problems are solved by the improvements within Multiple STP (MSTP). Initially it was defined in 802.1s and currently it is defined in 802.1q [6]. MSTP divides a network into several regions, connected by a common STP. Each region has its own regional root, and inside a region there are many Multiple Spanning Tree Instances (MSTIs). Each MSTI is an RSTP instance; since MSTP runs RSTP as the underlying protocol, it inherits some of its drawbacks as well. But at any time several spanning trees can govern a region, and network administrators can perform manual load balancing inside a region by mapping certain traffic and/or certain customers to a specific spanning tree. Each MSTI is mapped to a VLAN. Traditionally a VLAN is assigned to a subscriber (VLAN per subscriber) and consolidates the subscriber's sites. But it is also possible to map a VLAN to a specific type of traffic (VLAN per service), meaning that different QoS can be provided for each type of traffic. The MSTP standard does not specify the details of mapping VLANs to spanning trees. In a MEN we can assign a VLAN to a spanning tree.
Thus we map a certain service to a certain spanning tree, and each spanning tree may have special properties appropriate for the service it carries. In the next section we develop a new approach to building a spanning tree for multicast traffic.
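The VLAN-per-service idea amounts to a lookup table from service, to VLAN, to MSTI. The sketch below is purely illustrative: the service names, VLAN IDs and MSTI numbers are hypothetical and not taken from any standard configuration:

```python
# Hypothetical VLAN-per-service mapping onto MST instances (MSTIs).
SERVICES = {
    "internet": {"vlan": 100, "msti": 1},
    "voip":     {"vlan": 200, "msti": 2},
    "iptv":     {"vlan": 300, "msti": 3},  # multicast tree, root at the source
}

def msti_for_vlan(vlan_id):
    """Return the MSTI that carries a given VLAN, or None if unmapped."""
    for service in SERVICES.values():
        if service["vlan"] == vlan_id:
            return service["msti"]
    return None
```

Each MSTI can then be built with properties suited to its service, e.g. the IPTV instance rooted at the multicast source as proposed below.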
3 An Efficient Approach to Spanning Tree Building for Multicast Traffic In IP networks, the IGMP protocol is used to establish membership in a multicast group, and various routing protocols (PIM-DM, MOSPF, CBT, PIM-SM, etc.) are used to build multicast trees with different approaches. In layer 2 networks, multicast traffic is treated as broadcast traffic and frames are forwarded to all ports. To solve this problem in L2 networks, different solutions have been developed, e.g. IGMP snooping [7], GMRP, etc. In what follows we examine two cases: when the multicast source coincides with the spanning tree root and when it does not. Let us depict a network as a tree (see Fig. 1). We suppose that a spanning tree instance has its root at switch A and a depth of d. We consider two variants. In the
first case the multicast source coincides with the spanning tree root; in the second case the multicast source is located at node B at depth b. One of the critical QoS parameters for multicast applications is delay, so the choice of the multicast source location in the spanning tree can be evaluated through it. We assume that each link contributes one unit of delay. The total cost of the paths from the root to all switches can be estimated as:

$$\sum_{k=0}^{d} k \cdot n_k \qquad (1)$$

where $n_k$ is the number of nodes at depth $k$.
Fig. 1. General spanning tree view
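Expression (1) can be checked directly against per-node hop counts. The tree, node names and helper functions below are illustrative only:

```python
from collections import deque

def depths(children, root):
    """Breadth-first depth of every node in a tree given as child lists."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            depth[child] = depth[node] + 1
            queue.append(child)
    return depth

def total_path_cost(children, root):
    """Expression (1): sum over depths k of k * n_k (one delay unit per link)."""
    depth = depths(children, root)
    d = max(depth.values())
    return sum(k * sum(1 for v in depth.values() if v == k) for k in range(d + 1))

# A small tree rooted at switch A: n_0 = 1, n_1 = 2, n_2 = 3,
# so (1) gives 0*1 + 1*2 + 2*3 = 8.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
```

Summing $k \cdot n_k$ over depths is the same as summing the individual node depths, which is what `depths` verifies.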
Consider the total cost of the paths from the multicast source for the two cases. First, we divide the tree into several subtrees: a subtree root is located at each depth along the path from node A to node B (Fig. 2), so in total there are b+1 subtrees. Let us denote by $n_k^{(i)}$ the number of nodes at depth $k$ that belong to the subtree whose root is at depth $i$. Then we can divide the sum (1) into b+1 parts:

$$\sum_{k=0}^{d} k \cdot n_k = \sum_{k=0}^{d} k \cdot n_k^{(0)} + \ldots + \sum_{k=i}^{d} k \cdot n_k^{(i)} + \ldots + \sum_{k=b}^{d} k \cdot n_k^{(b)} = \sum_{i=0}^{b} \sum_{k=i}^{d} k \cdot n_k^{(i)} \qquad (2)$$
Fig. 2. Total paths' cost in the first case

Fig. 3. Path cost in the second case
Let us find the total path cost in the second case, when the source of multicast traffic is located at node B (Fig. 3). The total cost of the paths from switch B to every other node is:

$$\sum_{k=0}^{d} (k+b) \cdot n_k^{(0)} + \ldots + \sum_{k=i}^{d} (k+b-2i) \cdot n_k^{(i)} + \ldots + \sum_{k=b}^{d} (k-b) \cdot n_k^{(b)} = \sum_{i=0}^{b} \sum_{k=i}^{d} (k+b-2i) \cdot n_k^{(i)} \qquad (3)$$
To determine which expression is larger, we take the difference between (3) and (2):

$$\sum_{i=0}^{b} \sum_{k=i}^{d} (k+b-2i) \cdot n_k^{(i)} - \sum_{i=0}^{b} \sum_{k=i}^{d} k \cdot n_k^{(i)} = \sum_{i=0}^{b} \sum_{k=i}^{d} (b-2i) \cdot n_k^{(i)} \qquad (4)$$
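A small numerical check of (2)-(4) on a toy topology of our own (node names and the BFS helper are hypothetical): with the multicast source B at depth b = 2, the total path cost measured from B exceeds the cost measured from the root, and the difference matches the right-hand side of (4):

```python
from collections import deque

def bfs_dist(adj, src):
    """Hop distance from src to every node in an undirected graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

# Toy spanning tree: root A with three leaves, and the multicast source B
# at depth b = 2 below A (via P1), with one leaf Z of its own.
adj = {
    "A": ["X1", "X2", "X3", "P1"],
    "X1": ["A"], "X2": ["A"], "X3": ["A"],
    "P1": ["A", "B"],
    "B": ["P1", "Z"],
    "Z": ["B"],
}
cost_root = sum(bfs_dist(adj, "A").values())  # expression (2): 9
cost_b = sum(bfs_dist(adj, "B").values())     # expression (3): 13
# Expression (4): sum of (b - 2i) * n_k^(i) = 2*4 + 0*1 + (-2)*2 = 4 > 0.
```

Here the subtree at depth 0 (A and its three leaves) outweighs the subtree at depth b (B and Z), so the difference is positive, in line with the argument below.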
Since i indexes the depths of the subtree roots, the contribution to the sum (4) for i in the range [0; b/2] is greater than the contribution for i in the range (b/2; b], because the number of nodes in the supertree (the part closer to the root) is much larger than in the lower subtrees. Hence the expression obtained in (4) is positive.
Fig. 4. Delay differences
Fig. 4 shows the dependences given by expression (4), where b is the depth of the multicast traffic source. The dotted curve dif1(b) corresponds to a smaller number of nodes n at each depth k than the other curve. If the source of multicast traffic is far from the spanning tree root, the delay difference is larger (see expression (4)); moreover, the more nodes there are at each depth, the larger the delay. Thus, in the second case the total path cost from the source to each node is larger, and so is the delay. This means that if a multicast source is placed at the root of a smaller subtree, the summary delays become larger. Hence, rather than putting the multicast source in a random place, it is better to build the spanning tree from the multicast source for QoS assurance. Moreover, when the spanning tree root and the multicast source coincide, reconfiguration can be avoided during stream traffic forwarding (for instance at a change of the root, its disconnection, or a quick topology change). This results in better network stability and predictability of the QoS parameters.
4 Conclusions The analysis of existing spanning tree algorithms has shown that one of the key issues is the choice of the multicast traffic source location. This study offers an approach to building the spanning tree from the multicast source. In particular, two cases are considered: when the root is chosen randomly and when the root of the spanning tree coincides with the root of the multicast tree. We obtained better results in terms of delays when the multicast traffic source coincides with the spanning tree root. Therefore, spanning tree instances mapped to the VLAN assigned to multicast traffic should be built with the root coinciding with the multicast source in order to improve the QoS parameters.
References
1. Metro Ethernet Forum, http://metroethernetforum.org
2. Metcalfe, R.M., Boggs, D.R.: Ethernet: Distributed Packet Switching for Local Computer Networks. Communications of the ACM 19(5), 395–404 (1976)
3. The Evolution of Ethernet to a Carrier Class Technology, Featuring Bob Metcalfe, available at http://www.netevents.tv
4. Perlman, R.J.: An algorithm for distributed computation of a spanning tree in an extended LAN. In: SIGCOMM 1985, pp. 44–53 (1985)
5. IEEE 802.1D, Standard for local and metropolitan area networks: media access control (MAC) bridges (2004)
6. IEEE 802.1Q, Standards for local and metropolitan area networks: virtual bridged local area networks (2003)
7. Jiang, X., Wang, J., Sun, L., Wu, Z.: IGMP Snooping: A VLAN-Based Multicast Protocol. In: IEEE High Speed Networks and Multimedia Communications, pp. 335–340 (2002)
Performance Modelling and Evaluation of Wireless Multi-access Networks Remco Litjens, Ljupco Jorguseski, and Mariya Popova TNO Information and Communication Technology, Delft, The Netherlands
Abstract. In support of the on-going research and development activity in the design and performance assessment of radio access network integration schemes, we present novel resource sharing models for the efficient simulation of multi-access networks involving gsm/edge, umts/hsdpa and 802.11a-based wlan systems. The applicability of the proposed models in determining call and system level performance measures is demonstrated by means of a limited set of illustrative numerical experiments revealing the significant performance gains when integrating distinct access networks with a rather simple radio access selection scheme.
1 Introduction
In light of the large installed base of diverse wireless access networks generally operating in isolation, recent research activity has been targeted towards the development and performance assessment of distinct levels of network integration. Such integration is pursued both at the call level, where radio access (ra) selection schemes are devised to ensure that calls are always best connected, i.e. served by the most appropriate of the available access networks throughout their lifetime [2,4,14,18]; and at the packet level, where multi-radio transmit diversity is applied in order to exploit the concurrent assignment of multiple ras via fast channel-aware cross-ra scheduling or parallel transmissions [6,19]. The applied ra selection and/or multi-radio transmit diversity scheme directly relates to the pursued performance objectives, which may cover any combination of coverage, capacity and quality of service enhancements, the desired balance of which depends on the network operators' policies. The complexity of contemporary access network technologies imposes that the development and performance assessment of access network integration schemes is carried out using simulation methods. Yet, in light of the joint consideration of multiple access networks, the applied simulation models should not be overly detailed in order to keep the simulation times required to obtain statistically reliable results within reasonable limits. The principal contribution of this paper is to propose new models for the efficient simulation of integrated gsm/edge, umts/hsdpa and wlan access networks. Additionally, the applicability of the proposed models is demonstrated in a limited set of numerical experiments revealing the significant performance gains when integrating distinct access networks with a rather simple ra selection scheme. Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 194–209, 2007. © Springer-Verlag Berlin Heidelberg 2007
The outline of the paper is as follows. In Section 2 the general setting is described, followed by detailed descriptions of the proposed resource sharing models for the gsm/edge, umts/hsdpa and 802.11a-based wlan in Sections 3, 4 and 5, respectively. Section 6 demonstrates the applicability of the proposed models with some illustrative numerical experiments revealing the benefit from integrating access networks. Section 7 ends the paper with some concluding remarks.
2 General Setting
In this section we describe the general setting of the proposed study in terms of the network layout and considered services, to provide a concrete context for the resource sharing models detailed in the upcoming sections. The overall network layout is depicted in Figure 1, showing co-sited gsm/edge (2g) and umts/hsdpa (3g) access points in a hexagonal layout, each serving three sectors. In the central cell, 56 adjacent 802.11a-based wlan hot spot cells are placed. It is stressed that the proposed technology models are readily applicable in a wide variety of network topologies. Three distinct services are considered, viz. speech telephony, video streaming and file download. The speech and video services are characterised by an autonomously sampled duration and fixed throughput requirements of $r_{speech}$ and $r_{video}$ kb/s, respectively. The data service is characterised by a randomly distributed file size and a minimum throughput requirement of $r_{data}^{min}$.
Fig. 1. The considered network layout with co-sited gsm/edge and umts/hsdpa cellular networks and a number of centrally located 802.11a-based wlan hot spots
3 GSM/EDGE Model
In this section we describe the proposed performance model for a gsm/edge access network [13]. Consider a given sector in the gsm/edge network. Let $C_{total}$ denote the total number of physical channels available at the sector, i.e. 8 times the number of assigned frequencies, comprising $C_{traffic}$ traffic channels and $C_{total} - C_{traffic}$ control channels. The $C_{traffic}$ traffic channels are shared by speech, video and data calls: speech calls are handled via gsm technology, while video and data calls are handled via edge technology. At a given time instant, the sector state is characterised by the number c of channels in active use and by the sets S, V and D of present speech, video and data calls, respectively. Let $s \equiv |S|$, $v \equiv |V|$, $d \equiv |D|$. The channel rate assigned to a video or data call $m \in V \cup D$ handled via edge technology and with a given location depends on the experienced signal-to-interference-plus-noise ratio (sinr), which is calculated by considering a fixed (and uniform) base station transmit power of $P_{max}^{2g}$ (in Watt) in the serving cell and an average base station transmit power of $(c_b/C_{total}) \times P_{max}^{2g}$ Watt in each co-channel cell b, effectuating a regular seven-cell reuse pattern and assuming all traffic handled by a given cell is uniformly spread over the cell's channel pool. The applied noise floor of $N^{2g}$ dBm includes both thermal noise and an assumed receiver noise figure of $N_f^{2g}$ dB. The obtained sinr is converted to deciBels and subsequently mapped to an attainable channel rate $r_m$ as depicted in Figure 2 (top left) [10]. In a given (admissible) sector state, the channel assignment for the different calls is determined as follows. Speech calls are assigned a single traffic channel. Video calls are assigned precisely the number of channels that matches their fixed throughput requirement, i.e. $c_m \equiv \lceil r_{video}/r_m \rceil$ for video call $m \in V$ with channel rate $r_m$.
Data calls fairly share the remaining traffic channels, providing each data call with $c_m \equiv \min\{c_{max}, \lfloor (C_{traffic} - s - \sum_{m' \in V} c_{m'})/d \rfloor\}$ traffic channels, where $c_{max}$ denotes the multislot capability of the edge terminals. The experienced throughput of data call $m \in D$ in the considered system state is then equal to $r_m c_m \geq r_{data}^{min}$, which is guaranteed by the admission control scheme: a newly arriving call of any type is admitted to the system if and only if the minimum channel requirement for the new and all on-going calls can be satisfied, applying the channel assignment as described above. A system level performance indicator of interest is the resource utilisation, which for the considered sector in the given state is equal to

$$\frac{c}{C_{total}} \equiv \frac{C_{total} - C_{traffic} + s + \sum_{m \in V} c_m + \sum_{m \in D} c_m}{C_{total}}.$$

The expected resource utilisation is determined by appropriate time-averaging over all sector states observed during the simulation and eventually averaging over all gsm/edge cells.
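The channel assignment rule above can be sketched in a few lines. The function name and arguments are ours, and the ceiling/floor behaviour reflects our reading of the assignment rule (one channel per speech call, ceil(r_video/r_m) per video call, fair integer share capped at the multislot capability for data):

```python
import math

def channel_assignment(c_traffic, speech, video_rates, data_count,
                       r_video, c_max):
    """Sketch of the per-sector channel assignment described above.

    Returns the channels given to each video call and the (equal) number
    of channels given to each data call, assuming an admissible state.
    """
    video_channels = [math.ceil(r_video / r) for r in video_rates]
    used = speech + sum(video_channels)
    if data_count > 0:
        per_data = min(c_max, (c_traffic - used) // data_count)
    else:
        per_data = 0
    return video_channels, per_data

# 30 traffic channels, 5 speech calls, two video calls with channel rates
# 24 and 12 kb/s against a 48 kb/s video requirement, 4 data calls,
# multislot capability 4.
vc, pd = channel_assignment(30, 5, [24.0, 12.0], 4, r_video=48.0, c_max=4)
# vc == [2, 4]; 30 - 5 - 6 = 19 channels remain, so 19 // 4 = 4 per data call.
```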
[Figure 2: edge, wlan and hsdpa channel rates (kb/s) versus sinr (dB), and the hsdpa multi-user diversity gain versus the number of hsdpa calls, for Ricean factors K = 0 (Rayleigh), 5, 10 and ∞]
Fig. 2. Applied mappings of the experienced s(i)nr to the channel rate for edge, hsdpa and wlan models. The bottom right chart depicts the multi-user diversity gain versus the number of data calls under proportional scheduling in the hsdpa model.
4 UMTS/HSDPA Model
In this section we describe the proposed performance model for the umts/hsdpa [16,17] access network, inspired by [5,8,7]. At a given time instant, the umts/hsdpa system state is characterised by the sets $S_b$, $V_b$ and $D_b$ of present speech, video and data calls in each sector b, respectively. Let $s_b \equiv |S_b|$, $v_b \equiv |V_b|$, $d_b \equiv |D_b|$. In the proposed model, speech and video calls are assumed to be handled over dedicated channels (umts) with a downlink $E_b/N_0$ requirement of $\gamma_{speech}$ and $\gamma_{video}$, respectively, while data calls are handled over hsdpa's shared channel. Restricted by a total downlink power budget of $P_{max}^{3g}$ Watt, a pilot channel power of $P_{pilot}^{3g}$ Watt and another $P_{common}^{3g}$ Watt invested in other common channels, transmit power control is applied to determine the required power settings for all dedicated channels, while hsdpa's shared channel is assumed to utilise all remaining power (given a presence of at least a single hsdpa call).
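The 'power-filling' rule for the shared channel reduces to a simple budget subtraction. The helper below is a sketch under our own naming, assuming power control for the dedicated channels has already converged:

```python
def hsdpa_power(p_max, p_pilot, p_common, dedicated_powers, hsdpa_calls):
    """HSDPA power-filling: the shared channel takes all power left over
    after the pilot, common and dedicated channels, but only when at
    least one HSDPA call is present (otherwise it transmits nothing)."""
    if hsdpa_calls == 0:
        return 0.0
    remaining = p_max - p_pilot - p_common - sum(dedicated_powers)
    return max(remaining, 0.0)

# 20 W budget, 2 W pilot, 1 W common channels, three dedicated channels.
p = hsdpa_power(20.0, 2.0, 1.0, [1.0, 0.5, 1.5], hsdpa_calls=2)  # 14.0 W
```

This also makes visible why the base station transmits at full power whenever an hsdpa call is present: the shared channel soaks up the whole remaining budget.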
An iterative power control scheme like the one presented in [3] is readily adapted to incorporate the hsdpa 'power-filling' mode. Considering the computational burden that power control calculations generally impose on 3g network simulations, it is noted that this burden is significantly reduced by the inclusion of hsdpa, as (i) a base station always transmits at full power when one or more hsdpa calls are present, accelerating the convergence of the power control scheme; and (ii) since hsdpa provides only a single (shared) channel, the number of channels whose transmit power needs to be determined is reduced significantly. The power control scheme converges if and only if the $E_b/N_0$ requirements of all speech and video calls can be satisfied. Upon convergence of the power control scheme, the required transmit powers for all dedicated channels are given and hence the hsdpa channel power is also known. Subsequently, the experienced sinr for each individual hsdpa data call is determined using the hsdpa channel power in the serving cell, the obtained intra-/inter-cell interference powers and an applied noise floor of $N^{3g}$ dBm, including both thermal noise and a receiver noise figure of $N_f^{3g}$ dB. In order to appropriately map the obtained sinr to an attainable channel rate, it is important to realise that in live networks hsdpa's adaptive modulation and coding scheme is performed in the base station and can therefore operate on the multipath fading time scale (unlike in e.g. gsm/edge networks). Since in the proposed system level simulation model multipath fading is not explicitly considered, the corresponding effects must be implicitly incorporated by converting the mapping of instantaneous sinr to instantaneous channel rate into a mapping of average sinr to average channel rate. Considering a Ricean multipath fading model, simple off-line simulations have been conducted to derive the mapping φK presented in Figure 2 (bottom left).
In addition to the average mappings that apply for Ricean factor K ∈ {0 (∼ Rayleigh fading), 5, 10}, the instantaneous mapping is also depicted for reference purposes (based on [1,9]); it can also be interpreted as the case of K = ∞ with pure line of sight (los) propagation and hence no multipath fading effects. It is stressed that this mapping φK indicates the potential rate a data call with a given sinr can attain, assuming exclusive availability of the hsdpa transport channel and no restrictions related to the use of spreading codes, two important aspects that are addressed below. In a umts/hsdpa sector all downlink transmissions share a single ovsf (orthogonal variable spreading factor) code tree. The desired orthogonality of spreading codes imposes that assigned codes may not have a forefather relation in the code tree, which effectively implies that an assigned code of spreading factor sf consumes a fraction 1/sf of the code tree capacity. The downlink dedicated channels of $r_{speech}$ = 12.2 kb/s (speech calls) and $r_{video}$ = 64 kb/s (video calls) use spreading codes with sf equal to 128 and 32, respectively, thus claiming fractions 1/128 and 1/32 of the code tree capacity. In addition, a fixed portion of the ovsf code tree is used for control and common channels, including a.o. the pilot, broadcast, paging and other signalling channels. In a typical umts/hsdpa sector, these channels jointly occupy a fraction 10/256 of the code tree. hsdpa's shared transport channel utilises a varying number of spreading codes of sf
16 in parallel, depending on need and availability. Given $s_b$ speech and $v_b$ video calls in a considered sector b, the available number of sf 16 codes that can be assigned to hsdpa traffic is equal to $c_{HSDPA} = \lfloor 16 \times (1 - [10 + 2s_b + 8v_b]/256) \rfloor$. Observe that if there is no umts traffic, i.e. if $s_b = v_b = 0$, then $c_{HSDPA} = 15$. The code availability limits the assignable channel rates. Assuming the mapping of channel quality to channel rate as given in [1], the maximum attainable bit rate $\psi(c_{HSDPA})$ (in kb/s) given an availability of $c_{HSDPA}$ sf 16 codes is given by the table below.

c_HSDPA       0     1      2      3     4       5     6     7
ψ(c_HSDPA)    0   230.5  465.5   871  1291.5  3584  3584  4859.5

c_HSDPA       8     9     10     11     12     13     14     15
ψ(c_HSDPA)  5709  5709  7205.5 7205.5 8618.5 8618.5 8618.5 12779
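The code-availability formula and the rate table transcribe directly into code. The function name is ours, and flooring the fractional code count is our assumption (consistent with the text's $c_{HSDPA} = 15$ for an empty sector):

```python
import math

# Maximum attainable bit rate (kb/s) per number of available SF-16 codes,
# taken from the table above.
PSI = [0, 230.5, 465.5, 871, 1291.5, 3584, 3584, 4859.5,
       5709, 5709, 7205.5, 7205.5, 8618.5, 8618.5, 8618.5, 12779]

def hsdpa_codes(speech_calls, video_calls):
    """SF-16 codes left for HSDPA after common channels (10/256 of the
    tree), speech calls (SF 128, 2/256 each) and video calls (SF 32,
    8/256 each) have claimed their share."""
    fraction_used = (10 + 2 * speech_calls + 8 * video_calls) / 256
    return math.floor(16 * (1 - fraction_used))

codes = hsdpa_codes(0, 0)   # no UMTS traffic: 15 codes remain
max_rate = PSI[codes]       # up to 12779 kb/s
```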
Channel sharing in hsdpa is coordinated by the packet scheduler. Besides the channel-oblivious round robin (rr) scheduler, which was also implicitly applied in the (gsm/)edge model, the fact that the hsdpa packet scheduler is located in the base station allows channel-aware scheduling, where the scheduling decisions can be based on the instantaneous channel conditions and thus follow multipath fading fluctuations. Such channel-aware scheduling provides multi-user diversity gain: the fact that the scheduler can choose who to serve based on actual channel gains allows a higher overall throughput. As with adaptive modulation and coding, we must model the impact of multipath fading and multi-user diversity implicitly, since the presented simulator operates on a more coarse-grained time scale than would be needed to model multipath fading explicitly. A widely implemented channel-aware scheduler is the so-called proportional fair (pf) scheduler, whose objective it is to provide fair channel access to all data calls, yet to choose the instances of service intelligently and achieve multi-user diversity gain. The modelled pf scheduler achieves these objectives by selecting the served data call based on the ratio of the instantaneous and the average sinr. As a consequence, the throughput experienced by a given data call is a function r (db , sinr) which depends only on the call’s average sinr and the number db of calls sharing the channel. The function r (db , sinr) is well-approximated by r (db , sinr) ≈ μK (db ) φK (sinr) /db , where 1/db indicates the fair access of all calls, μK (db ) ≥ 1 reflects the multi-user diversity gain and φK (sinr) gives the channel rate the call experiences if it had exclusive channel access. The function μK (db ) has been obtained via off line dedicated simulations and is depicted in Figure 2 (bottom right) for K ∈ {0, 5, 10, ∞}, where in the pure los case of K = ∞ no multipath fading and hence also no multi-user diversity exists. 
Combining the distinct model aspects covering adaptive modulation and coding, ovsf code tree sharing and channel-aware packet scheduling, the following expression gives the throughput of a data call m ∈ Db with a given average sinr, sharing the hsdpa transport channel with db − 1 other data calls, given a code availability of cHSDPA sf 16 codes and assuming a propagation environment characterised by Ricean factor K:
$$r_m = \min\{\mu_K(d_b)\,\phi_K(\mathrm{sinr}),\ \psi(c_{HSDPA})\}/d_b.$$

We deem the considered system state admissible if and only if the $E_b/N_0$ requirements of all speech and video calls can be satisfied with an aggregate downlink transmit power of no more than a fraction α of $P_{max}^{3g}$ and the experienced throughput $r_m$ of all data calls $m \in D_b$ in all sectors b exceeds the minimum requirement $r_{data}^{min}$. The 'α restriction' reflects a typical operational target level for the average downlink transmit power, leaving an intentional margin for incidentally needed power peaks to cope with the dynamics of multipath fading. Since the transmit power of hsdpa's shared channel is dynamically adapted on a millisecond time scale, the applied margin can be used for hsdpa data transfer. Admission control operates such that all and only newly arriving calls that would bring the umts/hsdpa network into an inadmissible state are rejected. For sector b, the resource utilisation in the given system state is expressed in terms of the used aggregate transmit power. More specifically, denote with $p_m$, $m \in S_b \cup V_b$, the transmit power used for the dedicated channels maintained in cell b and with $p_{hsdpa}$ the power used for the hsdpa channel (which is either zero or equal to all remaining power). Then the resource utilisation is equal to

$$\frac{P_{pilot}^{3g} + P_{common}^{3g} + \sum_{m \in S_b} p_m + \sum_{m \in V_b} p_m + p_{hsdpa}}{P_{max}^{3g}},$$

which is equal to 1 if $d_b > 0$. As for the gsm/edge model, the expected resource utilisation is determined by appropriate time-averaging over all system states observed during the simulation and averaging over all umts/hsdpa cells.
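The throughput expression combines the three model aspects in one min operation. The sketch below uses illustrative, uncalibrated numbers (the gain, rate and code values are not taken from the paper's experiments):

```python
def data_throughput(mu_k, phi_k_sinr, psi_c, d_b):
    """Per-call HSDPA data throughput from the expression above: the
    multi-user diversity gain mu_K(d_b) times the exclusive-access rate
    phi_K(sinr), capped by the code-limited rate psi(c_HSDPA), then
    shared fairly over the d_b data calls."""
    return min(mu_k * phi_k_sinr, psi_c) / d_b

# Illustrative numbers: 4 data calls, diversity gain 1.6, exclusive-access
# rate 6000 kb/s, 13 SF-16 codes available (8618.5 kb/s cap). Here the
# code limit binds: 1.6 * 6000 = 9600 > 8618.5.
r_m = data_throughput(1.6, 6000.0, 8618.5, 4)
```

In this example the ovsf code limit, not the radio channel, determines the shared-channel rate, illustrating why both caps belong in the model.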
5 802.11a WLAN Model
In this section we describe the proposed performance model for the 802.11a-based wlan [22]. It is noted that the modelling applies equally well to 802.11g-based networks, which use the same mac protocol and also feature the same set of channel rates (if applicable, the lower 802.11b channel rates are also readily incorporated in the model). Consider a given cell in the wlan network with calls contending for the shared medium according to the basic access mode of the distributed coordination function (dcf) [20,23]. The access point is assumed to apply round robin scheduling at the ip packet level. Denote with V and D the sets of present video and data calls, respectively (the wlan cells are assumed not to support the speech service, as the considered dcf mode is rather unfit to handle such real-time services) and let v ≡ |V|, d ≡ |D|. Considering a reuse plan with four non-overlapping frequencies, the channel rate assigned to a video or data call m ∈ V ∪ D handled via the wlan technology and with a given location depends on the experienced sinr, which is calculated by considering a fixed (and uniform) base station transmit power of $P_{max}^{wlan}$ Watt in the serving cell, co-channel interference and a noise floor of $N^{wlan}$ dBm which includes both thermal noise and a receiver noise figure of $N_f^{wlan}$ dB. The
Performance Modelling and Evaluation of Wireless Multi-access Networks
201
obtained sinr is converted to decibels and subsequently mapped to an attainable channel rate r_m as depicted in Figure 2 (top right) [25]. The proposed cell level wlan resource sharing model is a generalisation of that presented and validated in [23], extending the model to support heterogeneous calls both in terms of the associated service and considering a realistic diversity in the channel rates applied by the calls. Other sources modelling (only) tcp-based data transfers in 802.11-based wlans include [21,24]. The key distinction between the model developed in [23] and extended in this paper, and these models, however, is the observation that the access point generally contends for medium access with at most one other station, leading to somewhat less complex analytical expressions. In the applied notation, there are v_i (d_i) video (data) calls using channel rate r_i, i = 1, ..., 8 (see Figure 2 above), expressed in b/s. Denote with r_min ≡ min_i r_i the lowest bit rate that all stations understand. Let T^{data}_{packet,i}, T^{data}_{ack,i} and T^{data}_{collision,i} denote the time of a successful data packet transfer, the time of a tcp ack transfer and the time of a collision involving a data packet for a call with channel rate r_i, respectively (note that collisions may indeed occur in the downlink traffic scenario due to the time duplexing character of the dcf in combination with the uplink transmissions of tcp acks). For a data call with channel rate r_i, these event times are

T^{data}_{packet,i} = phy + r_i^{-1} [mac + x^{data}_{overhead} + x^{data}_{payload} + 6] + δ + sifs + phy + r_i^{-1} [ack + 6] + δ + difs,
T^{data}_{ack,i} = phy + r_i^{-1} [mac + x^{data}_{ack} + 6] + δ + sifs + phy + r_i^{-1} [ack + 6] + δ + difs,
T^{data}_{collision,i} = phy + r_i^{-1} [mac + x^{data}_{overhead} + x^{data}_{payload} + 6] + δ + eifs,
where phy denotes the physical header plus preamble in seconds, mac denotes the mac header size in bits, x^{data}_{overhead} (x^{video}_{overhead}) the size of the tcp/ip (udp/ip) headers in bits, x^{data}_{ack} is the size of a tcp acknowledgement in bits, x^{data}_{payload} (x^{video}_{payload}) is the payload size in bits of a data (video) packet, δ is the propagation delay between sender and receiver in seconds, sifs is the short interframe spacing, ack is the size of a mac acknowledgement in bits, difs = sifs + 2τ is the dcf interframe spacing (with τ the time slot duration) and eifs = sifs + phy + r_min^{-1} [ack + 6] + δ + difs is an extended interframe spacing applied by a station that receives an unintelligible packet. Let T^{video}_{packet,i} and T^{video}_{collision,i} be defined and derived analogously for video packets. The feasibility of the video throughput requirement r_video and the attainable data throughputs are determined by analysing a typical transmission cycle with 2m data packets, m tcp acks (considering delayed tcp acknowledgements) and m̃ video packets. An example of such a cycle with m = m̃ = 1 is depicted in Figure 3. The applicable values of m and m̃ (actually only their ratio turns out to matter) depend on the system state given by the d_i's and v_i's and will be
202
R. Litjens, L. Jorguseski, and M. Popova
Fig. 3. Illustration of a wlan transmission cycle
determined below for the case that d, v > 0 (the cases where d = 0 or v = 0 are readily assessed by analogy with the presented analysis). Using

T̄^{data}_{packet} = Σ_{i=1}^{8} (d_i/d) T^{data}_{packet,i},   T̄^{data}_{ack} = Σ_{i=1}^{8} (d_i/d) T^{data}_{ack,i},   T̄^{data}_{collision} = Σ_{i=1}^{8} (d_i/d) T^{data}_{collision,i},
obtained by taking the expectation over the different channel rates that the present data flows may be assigned, and analogously determined expressions T̄^{video}_{packet} and T̄^{video}_{collision}, the expected duration of a transmission cycle is equal to

T = 2m [(1 − (m̃/(2m+m̃)) (1/(1+cw_min))) T̄^{data}_{packet} + (m̃/(2m+m̃)) (1/(1+cw_min)) T̄^{data}_{collision}]
  + m̃ [(1 − (m/(2m+m̃)) (1/(1+cw_min))) T̄^{video}_{packet} + (m/(2m+m̃)) (1/(1+cw_min)) T̄^{video}_{collision}]
  + m (1 − 1/(1+cw_min)) T̄^{data}_{ack} + (2m + m̃) ((1+cw_min)/2) τ,

adequately conditioning on the success or failure of the different transmissions and further including the 2m + m̃ backoff periods (note that the backoff period associated with the uplink tcp acks coincides with the backoff period of a downlink data or video packet and hence should not be counted separately). Regarding the collision probability of a data or video packet, it is noted that such a collision occurs only if the preceding packet was the second (delayed acknowledgements) of two consecutive data packets associated with the same flow and the competing packets sample identical backoff counters (from a contention window of length cw_min = 15). Finally, note that the duration of a failed tcp ack transmission is already captured by T̄^{data}_{collision} or T̄^{video}_{collision}, which involve longer packets, and is therefore not included separately. Since within such a transmission cycle 2m data and m̃
video packets are transmitted, the expected net data throughput and net/gross video throughputs (in b/s) per call are as follows:

r^{data}_{net} = (1 − (m̃/(2m+m̃)) (1/(1+cw_min))) · 2m x^{data}_{payload} / (d T),   (1)

r^{video}_{gross} = m̃ x^{video}_{payload} / (v T),   (2)

r^{video}_{net} = (1 − (m/(2m+m̃)) (1/(1+cw_min))) · r^{video}_{gross}.   (3)
Note that the throughputs are the same for all calls associated with the same service type, regardless of their individual channel rates, which is due to the fact that the scheduler applies round robin on a packet basis, rather than on a time slice basis, as is done in e.g. edge or hsdpa networks (this is in line with the performance anomaly investigated in [15]). It is readily verified, however, that the uniform throughput level is higher in scenarios where the channel rates applied by the present calls are higher. In order to adequately incorporate the assumed constant bit rate (rvideo ) character of video calls, the transmission cycle parameters m and m
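For concreteness, the per-rate event times defined above can be evaluated numerically. The sketch below is an illustrative calculation (not the authors' simulator) using the 802.11a parameter values listed in Section 6; all function names are my own:

```python
# Assumed 802.11a-style parameters (values from the table in Section 6).
PHY   = 20e-6   # physical header plus preamble (s)
DELTA = 1e-6    # propagation delay (s)
SIFS  = 16e-6   # short interframe spacing (s)
TAU   = 9e-6    # slot time (s)
DIFS  = SIFS + 2 * TAU
MAC, ACK = 272, 112                                   # MAC header / MAC ack (bits)
X_DATA_OVH, X_DATA_PAY, X_DATA_ACK = 320, 11680, 320  # TCP/IP sizes (bits)

def eifs(r_min):
    """Extended interframe spacing after an unintelligible packet."""
    return SIFS + PHY + (ACK + 6) / r_min + DELTA + DIFS

def t_data_packet(r):
    """Successful data packet transfer followed by the MAC-level ack."""
    return (PHY + (MAC + X_DATA_OVH + X_DATA_PAY + 6) / r + DELTA + SIFS
            + PHY + (ACK + 6) / r + DELTA + DIFS)

def t_data_ack(r):
    """Successful TCP ack transfer followed by the MAC-level ack."""
    return (PHY + (MAC + X_DATA_ACK + 6) / r + DELTA + SIFS
            + PHY + (ACK + 6) / r + DELTA + DIFS)

def t_data_collision(r, r_min):
    """Collision involving a data packet, terminated by an EIFS."""
    return PHY + (MAC + X_DATA_OVH + X_DATA_PAY + 6) / r + DELTA + eifs(r_min)

for r in (6e6, 54e6):  # lowest and highest 802.11a channel rates (b/s)
    print(f"{r/1e6:.0f} Mb/s: packet {t_data_packet(r)*1e6:.1f} us, "
          f"collision {t_data_collision(r, 6e6)*1e6:.1f} us")
```

Averaging these per-rate times with the weights d_i/d then yields the T̄ quantities used in the cycle-length expression.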
must be video chosen such that the resulting gross video throughput rgross is indeed equal to rvideo . To this end, we substitute γ ≡ m/
m into expression (2) and note that the resulting expression depends only on the ratio γ ≡ m/
m and not on the individual values of m and m.
In a given scenario, where the di and vi , i = 1, · · · , 8, as well as all other variables in the above expression are known, video the equation rgross = rvideo · 10−3 is readily converted to the following quadratic equation after careful algebraic manipulations: 1 + cwmin xvideo video payload 0 = Tpacket + τ− γ2+ 2 rvideo v · 10−3 ⎡ ⎤ video video data video Tpacket Tcollision 2T + 2T − + 1+cwmin ⎦ γ+ ⎣ packetdata packet 1+cwmin cwmin Tack 2xvideo payload + 1+cw + (1 + cw ) τ − min rvideo v·10−3 min data
data
4Tpacket −
data
data
2Tpacket 2Tcollision 2cwmin Tack + + + 2 (1 + cwmin ) τ, 1 + cwmin 1 + cwmin 1 + cwmin
(4)
for which the single positive solution for γ can be shown to exist if and only if it is indeed possible to grant each video call a gross throughput of rvideo kb/s. Once γ is known, the desired net throughputs immediately follow from expressions (1) and (3) using m
= γ m. Admission control in the wlan operates such that a newly arriving call video or data call is rejected if and only if it would bring the wlan cell in an inadmissible state, i.e. iff no single positive solution for γ exists that solves equation data (4) or the uniform net data throughput rnet is lower than the minimum data min throughput requirement of rdata kb/s. In a wlan cell the resource utilisation can be defined as the fraction of time that actual transmissions occur on the shared channel, i.e. including data packets, mac or tcp layer acknowledgments, physical layer training sequences (in
phy), and so on, but excluding idle times, backoff times and interframe spacings. Consider a given cell state with v, d > 0 and let the derived transmission cycle consist of 2m tcp data packets, m tcp acks and m̃ = γm udp (video) packets. The transmission cycle then has length T as given above, while the actual resource utilisation time within the transmission cycle is given by

T_utilisation = T − m [2 + γ + (1 − 1/(1+cw_min))] (δ + sifs + δ + difs) − (2 + γ) m ((1+cw_min)/2) τ,

i.e. subtracting all idle times. Hence the resource utilisation is equal to T_utilisation/T. As for the gsm/edge and umts/hsdpa models, the expected resource utilisation is determined by appropriate time-averaging over all cell states observed during the simulation and eventually averaging over all wlan cells.
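As a numerical cross-check of equation (4): since r^{video}_{gross} is increasing in γ, the ratio γ can also be found by simple bisection on the constraint r^{video}_{gross}(γ) = r_video. The sketch below is my own simplification, assuming a single common channel rate R for all calls (so the averaged event times reduce to the per-rate ones) and using the parameter values of Section 6:

```python
# Bisect on gamma = m~/m so that the gross video throughput matches the target.
PHY, DELTA, SIFS, TAU = 20e-6, 1e-6, 16e-6, 9e-6
DIFS = SIFS + 2 * TAU
MAC, ACK, CW = 272, 112, 15
XD_OVH, XD_PAY, XD_ACK = 320, 11680, 320   # TCP/IP overhead / payload / ack (bits)
XV_OVH, XV_PAY = 224, 11776                # UDP/IP overhead / payload (bits)
R = 54e6                                   # assumed common channel rate (b/s)
EIFS = SIFS + PHY + (ACK + 6) / R + DELTA + DIFS

def frame(bits):  # one successful frame plus the MAC-level ack exchange
    return PHY + (bits + 6) / R + DELTA + SIFS + PHY + (ACK + 6) / R + DELTA + DIFS

T_DP, T_DA = frame(MAC + XD_OVH + XD_PAY), frame(MAC + XD_ACK)
T_VP = frame(MAC + XV_OVH + XV_PAY)
T_DC = PHY + (MAC + XD_OVH + XD_PAY + 6) / R + DELTA + EIFS
T_VC = PHY + (MAC + XV_OVH + XV_PAY + 6) / R + DELTA + EIFS

def cycle_length(g):
    """Expected cycle duration T with m = 1 and m~ = g (only the ratio matters)."""
    q = 1.0 / ((2 + g) * (1 + CW))
    return (2 * ((1 - g * q) * T_DP + g * q * T_DC)
            + g * ((1 - q) * T_VP + q * T_VC)
            + (1 - 1.0 / (1 + CW)) * T_DA
            + (2 + g) * (1 + CW) / 2.0 * TAU)

def gamma_for(r_video, v):
    """Smallest gamma granting each of v video calls r_video b/s gross."""
    lo, hi = 1e-9, 1e4
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * XV_PAY / (v * cycle_length(mid)) < r_video:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

g = gamma_for(64e3, v=2)
print(g, g * XV_PAY / (2 * cycle_length(g)))  # gamma and the achieved gross rate
```

Under the stated single-rate assumption this returns the same γ as the quadratic; the bisection route merely avoids the algebra.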
6 Numerical Experiments
In this section we present some numerical results from a few illustrative simulation experiments that have been conducted to demonstrate the applicability of the presented models and to give an indication of the performance gains that can be achieved from network integration. We concentrate hereby on the umts/hsdpa and wlan models. The network layout as depicted in Figure 1 is considered, with applied hexagonal cell radii of 0.60 kilometer for the 3g networks and 0.06 kilometer for the wlan. With regard to the propagation model, for the umts/hsdpa access network, assumed to operate in the 2100 MHz band, the cost 231 Walfisch-Ikegami [11] nlos (non line of sight) path loss model is used. As explained in Section 4, the adequate modelling of hsdpa's scheduling and fast adaptive modulation and coding schemes involves an assumption regarding the multipath fading process. We consider a Ricean fading model characterised by Ricean factor K. The considered 802.11a-based wlan access network operates in the 5 GHz band, for which the path loss is modeled according to the indoor model in [12] with a path loss exponent of 3.2. As part of the traffic model we assume exponentially distributed speech and video call durations with an average of 60 seconds, and fixed throughput requirements of r_speech = 12.2 kb/s (enhanced full rate/adaptive multirate codec) and r_video ≡ 64 kb/s, respectively. The data file size is lognormally distributed with an average of 500 kbits and a coefficient of variation of 1.5. A minimum throughput requirement of r^{min}_{data} ≡ 64 kb/s is considered for data calls. The overall call arrival process is Poisson with rate λ, while in the considered scenario the fractions of generated calls of the speech, video and data service types are equal to 0.3, 0.4 and 0.3, respectively. A comparison is made between a scenario with completely isolated access networks and a scenario with integrated access networks.
In the former case, the generated calls may be handled only by the ‘home operator’, which is assigned according to the following probability distribution: all speech calls have
umts/hsdpa:  P^{3g}_{max} = 20 W;  P^{3g}_{pilot} = 2 W;  P^{3g}_{common} = 2 W;  N^{3g} = −102 dBm;  Nf^{3g} = 5 dB;  α = 0.75;  γ_speech = 5 dB;  γ_video = 4 dB
wlan:  P^{wlan}_{max} = 0.1 W;  N^{wlan} = −96 dBm;  Nf^{wlan} = 5 dB;  phy = 20 · 10^{-6} s;  δ = 1 · 10^{-6} s;  τ = 9 · 10^{-6} s;  sifs = 16 · 10^{-6} s;  difs = 34 · 10^{-6} s;  eifs = 90.67 · 10^{-6} s
packet/frame sizes:  x^{data}_{overhead} = 320 bits;  x^{data}_{payload} = 11680 bits;  x^{data}_{ack} = 320 bits;  x^{video}_{overhead} = 224 bits;  x^{video}_{payload} = 11776 bits;  mac = 272 bits;  ack = 112 bits
the umts/hsdpa network as their home, while 2/3 (1/3) of all video and data calls have the umts/hsdpa (wlan) network as their home. In case a call cannot be admitted to its 'home network' for reasons of coverage or capacity, it is blocked. In the latter case a simple radio access selection scheme is applied, according to which a call still attempts admission at its 'home network' first, and only if it cannot be admitted there does it attempt to access the other access network. Regarding the call arrival process, we note that in the spatial dimension all calls are uniformly placed within the service area of their home operator. Other applied radio and protocol parameter settings are listed in the table above. From top to bottom, Figure 4 shows the ra-specific resource utilisation, the service-specific call blocking probability and the data throughput (average and 90th percentile) versus the aggregate call arrival rate λ for the isolated (left) and integrated (right) scenarios. A number of observations can be made from these charts. The resource utilisation is (obviously) increasing in the traffic load, with the umts/hsdpa curve flattening out at 75% (for λ ≥ 10 calls/s), in correspondence with the applied admission control scheme. In comparison, the resource utilisation in the wlan network is very low (there is no call blocking in the wlan cells for the considered range of traffic loads), due to the relatively high transport capacity. Comparing the isolated and integrated scenarios, we observe that the umts/hsdpa curves are identical, as there is no overflow traffic from the wlan to the umts/hsdpa network. For high λ's, the wlan resource utilisation is roughly twice as high in the integrated scenario when compared to the isolated scenario, as half of the video and data calls blocked in the umts/hsdpa cell lie within the wlan coverage area and can thus be handled by the wlan cells. For low λ's, i.e.
before the umts/hsdpa network fills up, the wlan resource utilisation is the same for both scenarios. All call blocking probability curves are increasing in λ, as is intuitively obvious. The ordering of the curves is in line with the effective resource claim of the different services in the umts/hsdpa network (where all blocking occurs), noting that hsdpa (data service) is a more resource-efficient mode of traffic handling than umts (speech, video services). Note that significant blocking occurs for the same range of λ's where the umts/hsdpa resource utilisation is at its maximum. Comparing the isolated and integrated scenarios, we note that the
Fig. 4. Results from a set of illustrative numerical experiments: the figure depicts the ra-specific resource utilisation, the service-specific call blocking probability and the data throughput (average and 90th percentile) versus the aggregate call arrival rate for the isolated (left) and integrated (right) scenarios
speech blocking probability curve remains the same, as the wlan does not handle speech traffic. For medium-to-high λ's, the video and data call blocking probabilities are roughly halved by integrating the networks, since of all the video and data calls with umts/hsdpa as a home network that would be blocked in the isolated scenario, about half lie within the wlan coverage area and can thus still be served.
Perhaps somewhat surprising intuitively, the average data throughput curves in the bottom charts are not monotonically decreasing in the traffic load (in contrast to the 90th data throughput percentile curve, which decreases monotonically to r^{min}_{data}). This is the net effect of two opposing trends. On the one hand, for each network individually the experienced throughput decreases in λ due to the increased competition for the shared resources. On the other hand, as λ increases, the fraction of served data calls handled by the wlan increases, since the blocking probability in the umts/hsdpa cell increases, so that the average data throughput is more and more dominated by the throughputs experienced in the wlan cells, which are significantly higher. As the figure shows, the former effect dominates for small λ, where the amount of blocking in the umts/hsdpa cell is still limited, while for high λ the number of data flows handled per time unit in the umts/hsdpa cell stabilises and the latter effect becomes more significant. Comparing the isolated and integrated scenarios, we observe that for medium-to-high λ's the average data throughput is increased by allowing overflow data traffic from the umts/hsdpa cell to be handled by the wlan cells. The 90th throughput percentile appears hardly affected by network integration, as the 10% of data calls with the worst throughput performance are likely to be handled in the umts/hsdpa cell, whose loading is unaffected by the network integration.
7 Concluding Remarks
We have presented resource sharing models for the efficient simulation of multi-access networks involving gsm/edge, umts/hsdpa and 802.11a-based wlan systems. The applicability of the proposed models in determining call and system level performance measures has been demonstrated by means of a limited set of illustrative numerical experiments, revealing the significant performance gains obtained when integrating distinct access networks with a rather simple radio access selection scheme. In continued research we aim to extend the proposed models to adequately incorporate broader service integration and differentiation capabilities, as well as to develop models for other technologies such as WiMAX and for the application of multi-radio transmit diversity and mimo schemes. Furthermore, we intend to utilise the developed models to devise and evaluate effective radio resource management schemes for multi-access networks.
Acknowledgements. The authors would like to thank Frank Roijers (tno ict, The Netherlands) for his constructive comments on the wlan model and Zhiyi Chen (Delft University of Technology, The Netherlands) for assistance with the simulation experiments.
References
1. 3GPP TS 25.214, Physical layer procedures (FDD), v5.8.0, Release 5 (2004)
2. Badia, L., Taddia, C., Mazzini, G., Zorzi, M.: Multi-radio resource allocation strategies for heterogeneous wireless networks. In: Proceedings of WPMC '05, Aalborg, Denmark (2005)
3. Bambos, N.D., Chen, S.C., Pottie, G.J.: Radio link admission algorithms for wireless networks with power control and active link quality protection. In: Proceedings of Infocom '95, Boston, USA (1995)
4. Baraev, A., Jorguseski, L., Litjens, R.: Performance evaluation of radio access selection procedures in multi-radio access systems. In: Proceedings of WPMC '05, Aalborg, Denmark (2005)
5. van den Berg, J.L., Litjens, R., Laverman, J.F.: HSDPA flow level performance: the impact of key system and traffic aspects. In: Proceedings of MSWiM '04, Venice, Italy (2004)
6. Berggren, F., Litjens, R.: Performance analysis of access selection and transmit diversity in multi-access networks. In: Proceedings of Mobicom '06, Los Angeles, USA (2006)
7. Bonald, T., Proutiere, A.: Wireless downlink data channels: user performance and cell dimensioning. In: Proceedings of Mobicom '03, San Diego, USA (2003)
8. Borst, S.C.: User-level performance of channel-aware scheduling algorithms in wireless data networks. In: Proceedings of Infocom '03, San Francisco, USA (2003)
9. Brouwer, F., de Bruin, I., Silva, J.C., Souto, N., Cercas, F., Correia, A.: Usage of link-level performance indicators for HSDPA network-level simulations in E-UMTS. In: Proceedings of ISSSTA '04, Sydney, Australia (2004)
10. Chuang, J., Timiri, S.: EDGE compact and EDGE classic packet data performance, 3G Americas, white paper (January 1999)
11. Damosso, E., Correia, L. (eds.): Digital mobile radio towards future generation systems, COST 231 final report, European Commission (1999)
12. Dobkin, D.: Indoor propagation and wavelength, RF Design (September 2002)
13. Furuskär, A., Mazur, S., Müller, F., Olofsson, H.: EDGE: enhanced data rates for GSM and TDMA/136 evolution. IEEE Personal Communications Magazine (1999)
14. Furuskär, A., Zander, J.: Multiservice allocation for multiaccess wireless systems. IEEE Transactions on Wireless Communications 4(1), 174–184 (2005)
15. Heusse, M., Rousseau, F., Berger-Sabbatel, G., Duda, A.: Performance anomaly of 802.11b. In: Proceedings of Infocom '03, San Francisco, USA (2003)
16. Holma, H., Toskala, A. (eds.): WCDMA for UMTS: radio access for third generation mobile communications. John Wiley & Sons, Chichester, England (2005)
17. Holma, H., Toskala, A. (eds.): HSDPA/HSUPA for UMTS: high speed radio access for mobile communications. John Wiley & Sons, Chichester, England (2006)
18. Koo, I., Furuskar, A., Zander, J., Kim, K.: Erlang capacity of multiaccess systems with service-based access selection. IEEE Communications Letters 8(11) (2004)
19. Koudouridis, G.P., Karimi, H.R., Dimou, K.: Switched multi-radio transmission diversity in future access networks. In: Proceedings of VTC '05, Dallas, USA (2005)
20. Litjens, R., Roijers, F., van den Berg, J.L., Boucherie, R.J., Fleuren, M.J.: Analysis of flow transfer times in IEEE 802.11 wireless LANs. Annals of Telecommunications 59 (2004)
21. Miorandi, D., Kherani, A.A., Altman, E.: A queueing model for HTTP traffic over IEEE 802.11 WLANs. Computer Networks 50(1) (2006)
22. Prasad, N., Prasad, A.: WLAN systems and wireless IP for next generation communications. Artech House, Norwood, USA (2002)
23. Roijers, F., van den Berg, J.L., Fang, X.: Analytical modelling of TCP file transfer times over 802.11 wireless LANs. In: Proceedings of ITC 19, Beijing, China (2005)
24. Sakurai, T., Hanly, S.: Modelling TCP flows over a wireless LAN. In: Proceedings of European Wireless '05, Nicosia, Cyprus (2005)
25. Yee, J., Pezeshki-Esfahani, H.: Understanding wireless LAN performance trade-offs. Communication Systems Design (2002)
Analysis of a Cellular Network with User Redials and Automatic Handover Retrials Jose Manuel Gimenez-Guzman, Ma Jose Domenech-Benlloch, Vicent Pla, Vicente Casares-Giner, and Jorge Martinez-Bauset Dept. of Communications, Universitat Politècnica de València, UPV ETSIT Camí de Vera s/n, 46022, València, Spain {jogiguz,mdoben}@doctor.upv.es, {vpla,vcasares,jmartinez}@dcom.upv.es
Abstract. In cellular networks, repeated attempts occur as a result of user behavior but also as automatic retries of blocked requests. Both phenomena play an important role in the system performance and should therefore not be ignored in its analysis. On the other hand, an exact Markovian model analysis of such systems has proven to be infeasible and resorting to approximate techniques is mandatory. We propose an approximate methodology which substantially improves the accuracy of existing methods while keeping the computation time at a reasonable value. A numerical evaluation of the model is carried out to investigate the impact on performance of the parameters related to the retry phenomena. As a result, some useful guidelines for setting up the automatic retries are provided. Finally, we also show how our model can be used to obtain a tight performance approximation in the case where reattempts have a deterministic nature.
1 Introduction
In the POTS the phenomenon of repeated attempts due to user behavior, and its analysis, has been studied at least since the early 70s [1]. In modern cellular networks, network-driven retries of blocked handover requests (retrials) occur on top of the reattempts triggered by the user behavior during a fresh session setup (redials) [2,3]. There are important differences between redials and automatic retrials. Blocked handovers will be automatically retried until a reattempt succeeds or the user moves outside the handover area. In the former case the session will continue without the user noticing any disruption, while in the latter the session will be abruptly terminated. In contrast, the persistence of redials depends on the user's patience, and an eventual abandonment results in a session setup failure, which is less annoying than the abrupt termination of an ongoing session. Moreover, automatic retrials are rather deterministic in nature [2] while redials are affected by the randomness of human behavior. Thus, from a modeling perspective, both types of reattempts need to be considered separately, giving rise to two separate orbits of retrying customers. Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 210–222, 2007. © Springer-Verlag Berlin Heidelberg 2007
Even if only a single orbit, instead of two, is considered, the resulting model is of the type of the multiserver retrial queue, for which it is known that an analytical solution is not available and only numerical approximations can be obtained (see [3,4,5] and references therein). In particular, Marsan et al. [3] consider a system fairly similar to the one considered here, and propose an approximate technique for its analysis. In [6] a generalization of the approximate method in [3] was proposed for a system with only a single retrial orbit, showing a substantial improvement in the accuracy at the expense of only a marginal increase of the computational time. In this paper we extend the approximation technique of [6] to a system with two different retrial orbits (redials and retrials). The proposed method is employed to perform a numerical analysis of the system, focusing on how redials and retrials impact the system performance. As a result, some guidelines for setting up the automatic retries are provided. Additionally, we propose an accurate approximation method to analyze the performance of a system with deterministic retrials (i.e. the maximum number of retrials or the time between consecutive reattempts takes fixed values). To the best of our knowledge, previous performance analyses of cellular systems with retrials [3,4,6] assume that the maximum number of retrials is geometrically distributed and the time between consecutive reattempts is exponentially distributed. The rest of the paper is structured as follows. Section 2 describes the system under study, while Section 3 discusses the system model and the analysis methodology. In Section 4 the numerical analysis of the impact of retrials/redials is carried out. Final remarks and a summary of results are provided in Section 5.
2 System Description
We consider a cellular mobile network with a fixed channel allocation scheme, where each cell is served by a different base station and C denotes the number of resources in the cell. The physical meaning of a unit of resource depends on the specific technological implementation of the radio interface. Without loss of generality, we consider that each user occupies one resource unit. As shown in Fig. 1 there are two arrival streams: the first one represents new sessions and the second one handovers from adjacent cells. Both arrivals are considered Poisson processes with rates λ_n and λ_h respectively, with λ = λ_n + λ_h. For determining the value of λ_h we consider that the incoming handover stream is equal to the outgoing handover stream, due to the system homogeneity [7]. For the sake of mathematical tractability, the session duration and the cell residence time are exponentially distributed with rates μ_s and μ_r, respectively. Hence, the channel holding time is also exponentially distributed with rate μ = μ_r + μ_s, and the mean number of handovers per session when the number of resources is infinite is N_H = μ_r/μ_s. The FGC (Fractional Guard Channel) policy is characterized by only one parameter t (0 ≤ t ≤ C). New sessions are accepted with probability 1 when there are fewer than L = ⌊t⌋ resources being used and with probability f = t − L when there are exactly L resources in use. If there are more than L busy
Table 1. Transition rates

Transition (k, m, s) → (k + 1, m, s)
  Condition                        Rate
  0 ≤ k ≤ L − 1, m < Qn, s < Qh    λ
  0 ≤ k ≤ L − 1, m < Qn, s = Qh    λ + βh
  0 ≤ k ≤ L − 1, m = Qn, s < Qh    λ + βn
  0 ≤ k ≤ L − 1, m = Qn, s = Qh    λ + βn + βh
  k = L, m < Qn, s < Qh            λh + f λn
  k = L, m < Qn, s = Qh            λh + βh + f λn
  k = L, m = Qn, s < Qh            λh + f (βn + λn)
  k = L, m = Qn, s = Qh            λh + βh + f (βn + λn)
resources, new sessions are no longer accepted. Handovers are accepted while the system is not completely occupied. When an incoming new session is blocked, according to Fig. 1, it joins the redial orbit with probability (1 − P^1_{in}) or leaves the system with probability P^1_{in}. If a redial is not successful, the session returns to the redial orbit with probability (1 − P_{in}), redialing after an exponentially distributed time with rate μ_{red}. Redials are able to access the same resources as the new sessions. Similarly, P^1_{ih}, P_{ih} and μ_{ret} are the analogous parameters for the automatic retrials. Making P^1_{ih} = 0, at least one retrial will be performed. In that case, if the system were so loaded that the probability of a successful retrial could be considered negligible, the time elapsed from the first handover attempt until the system finally gives up and the session is dropped would be a sum of X iid exponential random variables of mean μ^{-1}_{ret}. In our model the discrete random variable X follows a geometric
Fig. 1. System model
distribution with mean 1/P_{ih}, hence the total time from the first attempt until abandonment is described by an exponential random variable of rate μ_{ret} P_{ih}. In the light of the above discussion, our model represents a situation in which the blocked handover requests will keep retrying while the user remains within the handover area, with the sojourn time in that area modeled as an exponential random variable of rate μ_{ret} P_{ih}. This assumption has been shown to have a low impact on the performance measures of interest [8].
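The claim that a geometric number X of exponential retry intervals yields an overall exponential abandonment time of rate μ_ret · P_ih is easy to verify by simulation. A quick Monte Carlo sketch (my own illustration, not part of the paper's model; the parameter values are arbitrary):

```python
import random

def time_to_abandonment(mu_ret, p_ih, rng):
    """Sum of X iid Exp(mu_ret) intervals, X geometric with success prob p_ih."""
    total = 0.0
    while True:
        total += rng.expovariate(mu_ret)   # one retry interval
        if rng.random() < p_ih:            # this reattempt is the last one
            return total

rng = random.Random(42)
mu_ret, p_ih = 2.0, 0.25
samples = [time_to_abandonment(mu_ret, p_ih, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
print(mean, 1.0 / (mu_ret * p_ih))  # empirical mean vs. theoretical 1/(mu_ret * p_ih)
```

The empirical mean should match 1/(μ_ret · P_ih) = 2.0 closely, in line with the exponential characterisation used above.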
3 System Model and Performance Analysis
The model considered can be represented as a tridimensional Continuous Time Markov Chain (CTMC) with state (k, m, s), where k is the number of sessions being served, m the number of sessions in the redial orbit and s the number of sessions in the retrial orbit. The main mathematical features of this queueing model are the fact of having two infinite dimensions (the state space of the model is {0, . . . , C} × Z+ × Z+) and the space-heterogeneity along them, produced by the retrial and redial rates, which respectively depend on the number of customers in the retrial and the redial orbits. It is known that the classical theory (see, e.g., [9]) is developed for random walks on the semi-strip {0, . . . , C} × Z+ with infinitesimal transitions subject to conditions of space-homogeneity. When the space-homogeneity condition does not hold, the problem of calculating the equilibrium distribution has not been addressed beyond approximate methods [10,11]. Indeed, even if we focus on the simpler case of multiserver retrial queues (with only one retrial orbit), one can emphasize
the absence of closed-form solutions for the main performance characteristics when C > 2 [12]. Since it is clear that in our case it is necessary to resort to approximate models and numerical methods of solution, in [6] we developed a generalization of the approximation method proposed in [3]. The new methodology is applied to both the retrial and the redial orbit, reducing the state space to a finite set by aggregating all states beyond a given occupancy of the orbits: Qn (Qh) defines the occupancy from which the states in the redial (retrial) orbit are aggregated. By increasing the values of Qn and/or Qh the state space considered in the approximation is enlarged and the accuracy of the solution improves, at the expense of a higher computational cost. Due to this aggregation, two new parameters are introduced for each orbit. The parameter Mn denotes the mean number of users in the redial orbit conditioned on those states where there are at least Qn users in the orbit, i.e. Mn = E(m | m ≥ Qn). The probability that after a successful redial the number of users in the redial orbit does not drop below Qn is represented by pn. For the retrial orbit the parameters Mh and ph are defined analogously. As a result of the aggregation, the state space of the approximate model is S = {(k, m, s) : 0 ≤ k ≤ C; 0 ≤ m ≤ Qn; 0 ≤ s ≤ Qh}, where states of the form (·, Qn, ·) represent the situation where at least Qn users are in the redial orbit. Likewise, the states of the form (·, ·, Qh) represent the situation where at least Qh users are in the retrial orbit. The transition rates for the approximate model are shown in Table 1. In order to compute the steady-state probabilities of the system (π(k, m, s)), the actual values of the parameters Mn, pn, Mh and ph should be known.
By balancing the probability fluxes across the vertical and horizontal cuts of the transition diagram, and equating the rate of blocked first attempts to the sum of the rates of successful and abandoning reattempts, the parameters above are expressed in terms of the steady-state probabilities:

ph = [ Σ_{m=0}^{Qn} π(C, m, Qh) ] / [ Σ_{m=0}^{Qn} ( π(C, m, Qh) + π(C, m, Qh − 1) ) ]    (1)

Mh = ( μret / ( λh (1 − Pih) ) ) [ Σ_{k=0}^{C−1} Σ_{m=0}^{Qn} π(k, m, Qh) + Pih Σ_{m=0}^{Qn} π(C, m, Qh) ]    (2)

pn = ζ1 / ζ2 ;    Mn = ( μred ζ3 ) / ( λn (1 − Pin) ζ2 )    (3)

where

ζ1 = Σ_{k=L+1}^{C} Σ_{s=0}^{Qh} π(k, Qn, s) + (1 − f) Σ_{s=0}^{Qh} π(L, Qn, s)

ζ2 = Σ_{k=L+1}^{C} Σ_{s=0}^{Qh} ( π(k, Qn − 1, s) + π(k, Qn, s) ) + (1 − f) Σ_{s=0}^{Qh} ( π(L, Qn − 1, s) + π(L, Qn, s) )

ζ3 = Σ_{k=0}^{L−1} Σ_{s=0}^{Qh} π(k, Qn, s) + f Σ_{s=0}^{Qh} π(L, Qn, s) + (1 − f) Pin Σ_{s=0}^{Qh} π(L, Qn, s) + Pin Σ_{k=L+1}^{C} Σ_{s=0}^{Qh} π(k, Qn, s)
The global balance equations, the normalization equation and Eqs. (1)–(3) form a system of simultaneous non-linear equations, which can be solved using, for instance, the iterative procedure sketched next: set pn = ph = 0, Mn = Qn and Mh = Qh and compute the steady-state probabilities using the algorithm defined in [13]; then compute Mn, pn, Mh, ph using Eqs. (1)–(3) and start again. In all of our numerical experiments we repeated the iterative procedure until the relative difference between two consecutive iterations was less than 10^−4 for all four parameters. The most common performance parameters used in cellular systems are the blocking probabilities of both new sessions (Pbn) and handovers (Pbh). Additionally, the probability of having a handover failure, denoted as the forced termination probability (Pft), is also used; it is given in terms of the non-service probability (P^h_ns), i.e. the probability that a handover request and all its subsequent reattempts are blocked. Moreover, we define the mean number of redials (retrials) per user as un (uh) and the mean number of users in the redial (retrial) orbit as Nred (Nret).
Pbn = Σ_{k=L+1}^{C} Σ_{m=0}^{Qn} Σ_{s=0}^{Qh} π(k, m, s) + (1 − f) Σ_{m=0}^{Qn} Σ_{s=0}^{Qh} π(L, m, s)

Pbh = Σ_{m=0}^{Qn} Σ_{s=0}^{Qh} π(C, m, s)

P^h_ns = ( μret Pih / λh ) [ Σ_{m=0}^{Qn} Σ_{s=0}^{Qh−1} s π(C, m, s) + Mh Σ_{m=0}^{Qn} π(C, m, Qh) ] ;    Pft = NH P^h_ns / ( 1 + NH P^h_ns )

un = ( μred (1 − Pin) / λn ) [ Σ_{k=0}^{C} Σ_{m=0}^{Qn−1} Σ_{s=0}^{Qh} m π(k, m, s) + Mn Σ_{k=0}^{C} Σ_{s=0}^{Qh} π(k, Qn, s) ]

uh = ( μret (1 − Pih) / λh ) [ Σ_{k=0}^{C} Σ_{m=0}^{Qn} Σ_{s=0}^{Qh−1} s π(k, m, s) + Mh Σ_{k=0}^{C} Σ_{m=0}^{Qn} π(k, m, Qh) ]

Nred = Σ_{k=0}^{C} Σ_{m=0}^{Qn−1} Σ_{s=0}^{Qh} m π(k, m, s) + Mn Σ_{k=0}^{C} Σ_{s=0}^{Qh} π(k, Qn, s)

Nret = Σ_{k=0}^{C} Σ_{m=0}^{Qn} Σ_{s=0}^{Qh−1} s π(k, m, s) + Mh Σ_{k=0}^{C} Σ_{m=0}^{Qn} π(k, m, Qh)
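The iterative procedure described above can be expressed as a generic fixed-point loop. In the sketch below, `update` is a hypothetical stand-in for one round of "solve the truncated CTMC with the algorithm of [13], then re-evaluate Eqs. (1)–(3)"; since the full solver is beyond the scope of a short example, a toy contraction is used to exercise the loop.

```python
def iterate_fixed_point(update, x0, tol=1e-4, max_iter=1000):
    """Repeat x <- update(x) until the relative change of every
    component drops below tol, as in the paper's stopping rule."""
    x = list(x0)
    for _ in range(max_iter):
        x_new = update(x)
        if all(abs(a - b) <= tol * max(abs(b), 1e-12) for a, b in zip(x_new, x)):
            return x_new
        x = x_new
    return x

# In the procedure of this section, x = (M_n, p_n, M_h, p_h) starts at
# (Q_n, 0, Q_h, 0), and `update` would solve the truncated CTMC and
# re-evaluate Eqs. (1)-(3).  Here a toy contraction stands in for it.
toy = iterate_fixed_point(lambda x: [0.5 * x[0] + 1.0], [8.0])
```

The relative-difference test with threshold 10^−4 mirrors the stopping criterion used in the numerical experiments.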
4 Results and Discussion
In this section a number of numerical examples are presented with the purpose of illustrating the capabilities and versatility of our model and the analysis methodology. The numerical analysis is also aimed at assessing the impact on performance of varying the values and/or distributions of the system parameters. For the numerical experiments a basic configuration is used and then the different parameters are varied; normally a single variation is introduced in each experiment. Thus, unless otherwise indicated, the values of the parameters are those of the basic configuration: C = 32, NH = μr/μs = 2, μ = μr + μs = 1, L = 31, Pih = Pin = 0.2, μred = 20, P¹ih = P¹in = 0, a sojourn rate in the handover area equal to 10μr, and then μret = 100/3.
4.1 Approximate Methodology
Here we evaluate the accuracy of the approximate analysis as a function of Qh and Qn. For a given performance indicator I and given values of Qh and Qn, the relative error introduced by the approximate model is estimated by ε_I(Qn, Qh) = |I(Qn + 1, Qh + 1)/I(Qn, Qh) − 1|. In Fig. 2 the relative error estimate is plotted as a function of Qh = Qn, taking Nred and Nret as performance indicators. As might be expected, except for a very short transient phase, the value of ε_I(Qn, Qh) decreases when the values of Qh and Qn increase, and a higher load (given by λn) results in poorer accuracy. The curves also show that good accuracy can be achieved with relatively low values of Qh and Qn, as observed in all the numerical examples we have carried out. Moreover, in all the numerical results shown hereafter, the values of Qh and Qn have been chosen so that ε_Nred(Qn, Qh) < 10^−4 and ε_Nret(Qn, Qh) < 10^−4.
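The truncation-selection rule just described can be sketched as a small search loop. Here `indicator` is a hypothetical callable standing in for the CTMC solver, mapping a truncation level (with Qn = Qh = Q) to a performance measure such as Nred; the toy indicator below is illustrative only.

```python
def choose_truncation(indicator, tol=1e-4, q_max=64):
    """Pick the smallest truncation level Q (used for both Q_n and Q_h)
    such that the relative-error estimate
        eps_I(Q) = | I(Q + 1) / I(Q) - 1 |
    falls below tol."""
    for q in range(1, q_max):
        eps = abs(indicator(q + 1) / indicator(q) - 1.0)
        if eps < tol:
            return q, eps
    raise RuntimeError("no truncation level below tolerance")

# Toy indicator converging geometrically in Q, standing in for the
# approximate-model solver.
q, eps = choose_truncation(lambda q: 1.0 + 2.0 ** (-q))
```

With the toy indicator the rule settles on the first level whose one-step relative change is below 10^−4, the same threshold used for the results shown hereafter.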
4.2 Redimensioning with Redials
Due to human behavior, users normally redial if a previous attempt has been blocked. Network operators, however, do not consider redials as such, simply because they are not able to distinguish between first attempts and redials; therefore every incoming session is regarded as a first attempt. Without that distinction, resource over-provisioning can occur, because for each user requesting a session whose first attempt is blocked, several new session requests are actually accounted (one per attempt). In order to evaluate the magnitude of this over-provisioning, the following experiment was carried out. We start from a basic situation in which the QoS objectives (Pbn ≤ 0.05 and Pft ≤ 0.005) are fulfilled and consider several values of load growth. For each value of the load increment, the amount of resources (C) is redimensioned in order to meet the QoS objectives. The redimensioning process is done using the complete model and a simplified model where redials are considered as fresh new calls, i.e. λ′n = λn + μred Nred. Figure 3 shows a sample of results from the redimensioning process, which reveal that ignoring the existence of redials can produce significant over-provisioning.
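The redimensioning loop can be sketched as below. `blocking` is a hypothetical stand-in for the analytical model (complete or simplified), mapping a capacity and an offered load to the pair (Pbn, Pft); the toy blocking curve is purely illustrative.

```python
def redimension(blocking, c_start, load, pbn_max=0.05, pft_max=0.005, c_max=512):
    """Smallest capacity C >= c_start meeting the QoS targets
    Pbn <= 0.05 and Pft <= 0.005 for a given offered load."""
    for c in range(c_start, c_max):
        pbn, pft = blocking(c, load)
        if pbn <= pbn_max and pft <= pft_max:
            return c
    raise RuntimeError("capacity bound exceeded")

# Toy blocking curve as a placeholder for the CTMC model: blocking
# shrinks linearly as capacity approaches the offered load.
toy = lambda c, a: (max(0.0, 1.0 - c / a), 0.1 * max(0.0, 1.0 - c / a))
c_needed = redimension(toy, c_start=32, load=64.0)
```

Running the same loop once with the complete model and once with the simplified model (λ′n = λn + μred Nred) gives the two capacity curves compared in Fig. 3.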
Fig. 2. Accuracy of the approximate methodology: relative error estimate versus Qn, Qh for Nred and Nret, at λn = 8 and λn = 12 (plot omitted)

Fig. 3. Resource redimensioning with and without considering redials: required capacity C versus load increase (%) for the complete and simplified models; (a) Pin = 0.1, (b) Pin = 0 (plots omitted)
4.3 Impact of Automatic Retrial Configuration
If the network operator enables the automatic retrial option, blocked handover attempts will be automatically retried while the user remains within the handoff area. We consider a fixed mean sojourn time in the handover area (μr = 20/3) and study the impact of varying the retrial rate (μret). Note that when varying μret while μr is kept constant, the value of Pih is varied accordingly through their relationship, μr = μret Pih. Figure 4 shows that a higher value of μret results in a lower forced termination probability but also in a higher mean number of retrials per session. While the former is a positive effect, the latter is not, as it entails an increased signaling load. In order to gain further insight into the tradeoff between Pft and uh, we define the overall cost function CT = βλn Pft + λh uh. The choice of the value for β may depend on many factors and a suitable value can vary widely from one situation to another; thus we have used a wide range of values,
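The tradeoff can be explored with a simple scan over candidate retrial rates, computing CT = βλn Pft + λh uh for each. The (Pft, uh) values below are illustrative placeholders, not the paper's results; in practice they would come from the analytical model.

```python
def total_cost(beta, lam_n, pft, lam_h, uh):
    """Overall cost C_T = beta * lambda_n * P_ft + lambda_h * u_h,
    trading forced terminations against retrial signalling load."""
    return beta * lam_n * pft + lam_h * uh

def best_mu_ret(candidates, beta=10.0, lam_n=12.0, lam_h=24.0):
    """Scan candidate retrial rates; `candidates` maps mu_ret to the
    (P_ft, u_h) pair produced by the model."""
    return min(candidates,
               key=lambda mu: total_cost(beta, lam_n,
                                         candidates[mu][0], lam_h,
                                         candidates[mu][1]))

# Illustrative numbers only: P_ft decreases and u_h grows with mu_ret,
# so the cost has an interior minimum.
toy = {50: (0.12, 0.4), 100: (0.06, 0.6), 200: (0.04, 1.2), 400: (0.03, 2.4)}
mu_opt = best_mu_ret(toy)
```

Repeating the scan for several β values reproduces the kind of optimal-configuration-point analysis shown in Fig. 5(a).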
Fig. 4. Performance parameters for different retrial configurations: Pft and uh versus λn for (μret, Pih) ∈ {(50, 0.4), (100, 0.2), (200, 0.1), (400, 0.05), (800, 0.025)} (plots omitted)
Fig. 5. Cost function, λn = 12: (a) increment CT(μret) − CT(1) as μret varies, for β = {2, 5, 10, 15, 20, 50, 100} and Γ = 1; (b) absolute value of CT versus μret for Γ = {1, 2, 5, 10} and β = 10 (plots omitted)
β = {2, 5, 10, 15, 20, 50, 100}. We also explored the effect of varying the mean sojourn time in the handover area, 1/μr (actually a version normalized with respect to 1/(Cμ) has been used, Γ = Cμ/μr). The shape of the cost curves in Fig. 5(a) shows the existence of an optimal configuration point. Both the relevance of the optimal configuration point and the value of the retrial rate at which it is attained increase when the weight factor β is increased. Moreover, Fig. 5(b) shows that the optimal value of μret is rather insensitive to the mean value of the sojourn time in the handover area.
4.4 Distribution of the Maximum Number and Time Between Reattempts
In real systems (e.g. GSM) the time between retrials, as well as the maximum number of retrials per request, takes a deterministic value instead of a stochastic one [2]. In our model, however, in order to keep the mathematical analysis tractable, we used an exponentially distributed time between retrials and a geometric distribution for the maximum number of reattempts. Here we validate these two assumptions with the help of a discrete-event simulation model. In order to simplify the simulations we set λh = 2λn instead of computing the equilibrium value of λh.

Time distribution between redials/retrials: We analyze the values of Pbn and Pbh when the distribution of the time between redials, retrials, or both is switched from exponential to deterministic, keeping its mean value constant. From the results in Fig. 6, and others not shown here due to lack of space, we conclude that assuming an exponential distribution for the time between redials and/or retrials has a negligible impact on all the performance parameters of interest.

Fig. 6. Distribution of the time between reattempts: impact on Pbn and Pbh versus λn. Legend: XY, X (Y) ≡ distribution for redials (retrials); M ≡ exponential, D ≡ deterministic (plots omitted)

Fig. 7. Analytical approximation of a deterministic maximum number of retrials; d = 5, P¹ih = 0: Pbh and uh versus λn for the geometric (analytic) and deterministic (simulation) cases (plots omitted)
Distribution of the maximum number of reattempts: We compare a geometric distribution (after each unsuccessful attempt the user decides to abandon the system with probability Pi) with a deterministic distribution (the user leaves the system after d unsuccessful attempts). For these two options to be comparable, the mean number of reattempts must be the same in both cases. Note that this is not the same as both distributions having the same mean, as the distributions refer to the maximum number of reattempts and not to the actual number of reattempts. While the following discussion deals only with retrials, it can easily be extended to redials as well. Let q denote the blocking probability for retrials (note that in general q ≠ Pbh); the average number of retrials is

uh^Geo = Σ_{n≥1} n Pbh (1 − Pih) ((1 − Pih) q)^{n−1} (1 − (1 − Pih) q) = (1 − Pih) Pbh / ( 1 − (1 − Pih) q )    (4)

uh^D = (1 − q) Pbh [ 1 + 2q + 3q² + · · · + (d − 1) q^{d−2} ] + d Pbh q^{d−1} = Pbh (1 − q^d) / (1 − q)    (5)

for the geometric and deterministic case, respectively. If we assume that both q and Pbh take approximately the same value in both cases, by equating the right-hand sides of (4) and (5) we obtain

Pih = ( (1 − q) / ( q (1 − q^d) ) ) ( q^d − Pih )    (6)

For a given value of d, by using the expressions for Pbh and uh and Eqs. (4) and (6), the value of Pih that yields uh^Geo = uh^D can be iteratively computed. The results shown in Fig. 7, and similar ones not shown here due to lack of space, demonstrate that, using the adjusting procedure described above, our model can provide an excellent approximation for the performance analysis of a system in which the maximum number of retrials is a fixed number.
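The equivalence expressed by Eqs. (4)–(6) can be checked numerically. One small observation used below: Eq. (6) is linear in Pih, so for fixed q and d it can be solved in closed form rather than iteratively; the parameter values are arbitrary.

```python
def u_geo(p_b, p_i, q):
    """Eq. (4): mean number of retrials under a geometric maximum."""
    return (1.0 - p_i) * p_b / (1.0 - (1.0 - p_i) * q)

def u_det(p_b, q, d):
    """Eq. (5): mean number of retrials under a deterministic maximum d."""
    return p_b * (1.0 - q ** d) / (1.0 - q)

def p_i_equivalent(q, d):
    """Eq. (6): P_ih = c * (q**d - P_ih) with c = (1-q)/(q(1-q**d));
    being linear in P_ih it solves directly as P_ih = c*q**d / (1 + c)."""
    c = (1.0 - q) / (q * (1.0 - q ** d))
    return c * q ** d / (1.0 + c)

q, d, p_b = 0.3, 5, 0.2
p_i = p_i_equivalent(q, d)
gap = abs(u_geo(p_b, p_i, q) - u_det(p_b, q, d))  # should be ~0
```

With the abandonment probability adjusted this way, the geometric model reproduces the mean retrial count of the deterministic-maximum system exactly, which is the basis of the approximation validated in Fig. 7.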
5 Conclusions
In cellular networks, repeated attempts occur due to user redials when session establishments are blocked, and also due to automatic retries when a handover fails. Both phenomena play an important role in system performance and should not be ignored. However, the main feature of the Markovian model describing such a complex system is the space-heterogeneity along two infinite dimensions. For this reason, we developed an approximate methodology that aggregates users in the retrial/redial orbit beyond a given occupancy. Our proposal achieves a higher accuracy than other techniques while keeping computation time negligible from a human point of view. A numerical evaluation of the system has been performed in order to evaluate the impact of the reattempt phenomena on system performance. We have
studied the effect of automatic retrials for handovers while the user remains in the handover area, giving network operators some guidelines for configuring this behaviour optimally. Finally, we have shown how our model can be used to obtain a tight performance approximation when the time between reattempts and the maximum number of reattempts are deterministic. Results of this approximate method are compared against those obtained by simulation, concluding that the proposed method is very accurate.
Acknowledgements

This work was supported by the Spanish Government (30% PGE) and the European Commission (70% FEDER) through projects TSI2005-07520-C03-03 and TEC2004-06437-C05-01, and by the Cátedra Telefónica de Internet y Banda Ancha (e-BA) of the Universidad Politécnica de Valencia. Besides, M. Jose Domenech-Benlloch was supported by the Spanish Ministry of Education and Science under contract AP-2004-3332.
References

1. Jonin, G., Sedol, J.: Telephone systems with repeated calls. In: Proceedings of the 6th International Teletraffic Congress ITC'6, pp. 435.1–435.5 (1970)
2. Onur, E., Deliç, H., Ersoy, C., Çağlayan, M.U.: Measurement-based replanning of cell capacities in GSM networks. Computer Networks 39, 749–767 (2002)
3. Marsan, M.A., De Carolis, G.M., Leonardi, E., Lo Cigno, R., Meo, M.: Efficient estimation of call blocking probabilities in cellular mobile telephony networks with customer retrials. IEEE Journal on Selected Areas in Communications 19(2), 332–346 (2001)
4. Tran-Gia, P., Mandjes, M.: Modeling of customer retrial phenomenon. IEEE Journal on Selected Areas in Communications 15(8), 1406–1414 (1997)
5. Chakravarthy, S.R., Krishnamoorthy, A., Joshua, V.: Analysis of a multi-server retrial queue with search of customers from the orbit. Performance Evaluation 63(8), 776–798 (2006)
6. Doménech-Benlloch, M.J., Giménez-Guzmán, J.M., Martínez-Bauset, J., Casares-Giner, V.: Efficient and accurate methodology for solving multiserver retrial systems. IEE Electronics Letters 41(17), 967–969 (2005)
7. Marsan, M.A., Carolis, G.D., Leonardi, E., Cigno, R.L., Meo, M.: How many cells should be considered to accurately predict the performance of cellular networks? In: Proceedings European Wireless (1999)
8. Pla, V., Casares, V.: Effect of the handoff area sojourn time distribution on the performance of cellular networks. In: Proceedings of IEEE MWCN, pp. 401–405 (2002)
9. Neuts, M.: Matrix-geometric Solutions in Stochastic Models: An Algorithmic Approach. The Johns Hopkins University Press, Baltimore (1981)
10. Bright, L., Taylor, P.G.: Calculating the equilibrium distribution of level dependent quasi-birth-and-death processes. Communications in Statistics – Stochastic Models 11(3), 497–525 (1995)
11. Latouche, G., Ramaswami, V.: Introduction to Matrix Analytic Methods in Stochastic Modeling. ASA-SIAM (1999)
12. Artalejo, J.R., Pozo, M.: Numerical calculation of the stationary distribution of the main multiserver retrial queue. Annals of Operations Research 116(1-4), 41–56 (2002)
13. Servi, L.D.: Algorithmic solutions to two-dimensional birth-death processes with application to capacity planning. Telecommunication Systems 21(2-4), 205–212 (2002)
Stochastic Optimization Algorithm Based Dynamic Resource Assignment for 3G Systems

Mustafa Karakoc¹ and Adnan Kavak²

¹ Kocaeli University, Dept. of Electronics and Computer Edu., 41380, Kocaeli, Turkey
² Kocaeli University, Dept. of Computer Engineering, 41040, Kocaeli, Turkey
{mkarakoc, akavak}@kou.edu.tr
Abstract. Orthogonal variable spreading factor (OVSF) codes are widely used to provide variable data rates for supporting different bandwidth requirements in wideband code division multiple access (WCDMA) systems. Many works in the literature have intensively investigated optimal dynamic code assignment schemes for OVSF codes. Unlike earlier studies, which assign OVSF codes using conventional (CCA) or dynamic (DCA) code allocation schemes, in this paper stochastic optimization methods, namely the genetic algorithm (GA) and simulated annealing (SA), are applied, in which the population is adaptively constructed according to the existing traffic density in the OVSF code tree. Simulation results show that GA and SA provide a reduced code blocking probability and improved spectral efficiency when compared to the CCA and DCA schemes. It is also seen that the computational complexity of GA and SA is higher than that of CCA and DCA.
1 Introduction

Wireless bandwidth is a precious resource, and effective management of this resource is an important issue. Services provided in second generation (2G) systems are typically limited to low-bit-rate data. In 2G CDMA systems, each user is assigned a constant Walsh code. Beyond these 2G services, higher-rate services, such as file transfer and quality-of-service guaranteed multimedia applications, are provided by third generation (3G) systems [1]. In order to meet the demands of mixed traffic applications, 3G systems support variable data rates for different users. WCDMA is the most popular radio access technology, proposed by the 3rd Generation Partnership Project (3GPP). In WCDMA systems, orthogonal variable spreading factor (OVSF) codes are generated in the form of a tree structure [2] to facilitate variable rate data transmission. In a WCDMA system, two operations are applied to user data. The first one is channelization, which transforms every bit into a code sequence; the length of the code sequence per data bit is called the spreading factor (SF), which is typically a power of two. The second operation is scrambling, which applies a scrambling code to the spread signal. Scrambling codes are used to separate transmissions from different sources. OVSF codes preserve the orthogonality between channels of different rates and spreading factors. Channelization codes in the OVSF code tree have a unique description as CSF,k, where SF is the spreading factor of the code and k

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 223–234, 2007. © Springer-Verlag Berlin Heidelberg 2007
224
M. Karakoc and A. Kavak
is the code number, 1 ≤ k ≤ SF. All codes in the same layer are orthogonal to each other, while codes in different layers are orthogonal if they do not have an ancestor–descendant relationship. The data rate is doubled whenever we go one level up in the tree. Two users should not be given two codes that are not orthogonal. In an OVSF WCDMA system, each base station (BS) manages a code tree for downlink transmission. Since the number of orthogonal codes is limited, BSs are responsible for utilizing their code trees efficiently to increase the performance of the system. There are several techniques, mainly classified as conventional code allocation (CCA) [3] and dynamic code allocation (DCA) schemes [4]. In the conventional code assignment (CCA) scheme, an OVSF code is assigned to a transmission request only when there is an available one that can fulfill the requested data rate while orthogonality is maintained. Code blocking is the major problem with this scheme. A code replacement scheme, in contrast, searches for a way to relocate codes when a newly arriving call finds no proper place to accommodate it. This can reduce code blocking, but incurs code reassignment cost. Dynamic code assignment schemes [4] [6] [7] are used to address this problem, and many researchers have focused on dynamic utilization of OVSF codes. Existing results [4] [5] [6] [7] are reviewed as follows. Tseng et al. proposed single-code and multi-code placement and replacement schemes in WCDMA systems [5] [7]. The algorithm for single-code placement/replacement in [7] is simple, but it possibly incurs many fragmented codes, which produce a code blocking problem. The multi-OVSF-code placement and replacement scheme [5] can actually reduce the code blocking problem by using a code-separation operation. Minn et al.
[4] developed a dynamic assignment of OVSF codes to provide an optimal dynamic code assignment (DCA) scheme, which reassigns codes with minimum cost. An alternative scheme was presented by Rouskas et al. [6] for OVSF code assignment and reassignment at the forward link of WCDMA 3G systems, which behaves similarly to the crowded-first and most-user-first schemes of [5] [7]. In this paper, we focus on stochastic search techniques, namely the Genetic Algorithm (GA) and Simulated Annealing (SA), for dynamic allocation of the OVSF code tree with a random initial population. GA [8] is generally applied in wireless communications for optimizing and designing antenna arrays [9] or for multiuser detection [10]. An OVSF code allocation strategy using a GA was proposed by Cinteza et al. [11]; they used a binary representation of a chromosome and investigated a fixed traffic density, managing new code requests arriving at an OVSF code tree already containing active codes by means of the GA. SA, first proposed by Kirkpatrick et al. [12], is based on the analogy between the process of finding the best solution of a combinatorial optimization problem and the annealing process. SA has also been applied in wireless communications for multiuser detection [13] and for optimizing and designing antenna arrays [14]. The remainder of this paper proceeds as follows. In the next section the GA/SA-based dynamic OVSF code allocation strategy is given. Simulation parameters and computer simulations are presented in Section III. Finally, conclusions are given in Section IV.
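As a concrete illustration of the CCA scheme described above, the sketch below indexes the OVSF tree as a binary heap (root 1; layer SF occupies indices SF..2SF−1, matching the code numbering used later in Fig. 2) and assigns the leftmost code whose ancestors and descendants are all free. This is an illustrative reading of conventional assignment, not the exact implementation of [3].

```python
def ancestors(k):
    """Heap indices of all ancestors of code k."""
    a = []
    while k > 1:
        k //= 2
        a.append(k)
    return a

def descendants(k, size):
    """Heap indices of all descendants of code k within the tree."""
    d, frontier = [], [k]
    while frontier:
        nxt = []
        for n in frontier:
            for c in (2 * n, 2 * n + 1):
                if c <= size:
                    d.append(c)
                    nxt.append(c)
        frontier = nxt
    return d

def cca_assign(busy, sf, max_sf=16):
    """Conventional (leftmost) code assignment: a code is assignable
    iff neither it, nor an ancestor, nor a descendant is busy."""
    size = 2 * max_sf - 1
    for k in range(sf, 2 * sf):
        if k not in busy and not busy.intersection(ancestors(k)) \
                and not busy.intersection(descendants(k, size)):
            busy.add(k)
            return k
    return None  # code blocking: no orthogonal code of this rate
```

For example, with the SF-8 code 8 busy, the next SF-8 request gets code 9; with both SF-2 codes busy, any SF-4 request is blocked.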
2 Dynamic OVSF Code Allocation Using GA and SA

Code blocking is the major problem during OVSF code assignment, which limits system performance. When code blocking occurs, in order to find an OVSF code for the call-requesting user, reallocation
Stochastic Optimization Algorithm Based Dynamic Resource Assignment
225
Fig. 1. Dynamic OVSF code allocation flow chart using GA / SA
Chini = [ 6  9  14  16  21 ]  with rates  [ 4R  2R  2R  R  R ]
TP(1) = [ 14  21  6  16  9 ]  with rates  [ 2R  R  4R  R  2R ]
P(1)  = [ 5  8  13  18  24 ]  with rates  [ 4R  2R  2R  R  R ]

Fig. 2. Sample OVSF code tree and construction of the initial population from the code tree (tree diagram omitted)
of assigned OVSF codes in the code tree can help to find a suitable code. This section discusses the stochastic search techniques, GA and SA, adapted to the OVSF code assignment strategy. The basic flowchart of the GA/SA-based allocation scheme is given in Figure 1.
In the idle state, no execution is required for resource assignment. A call is initiated with the call processor's signaling to the resource manager to allocate resources for a traffic channel. First, the availability of capacity (the total rate of unused codes) in the code tree is checked to determine whether the requested call rate can be supported by the system. If there is enough capacity, then the availability of an OVSF code of the requested rate is checked among the unused codes in the relevant layer. If a call cannot be assigned a code due to unavailability of a code with the requested rate (or because none of the supporting codes for this rate is orthogonal to the assigned codes), the GA/SA block is executed. This block starts with an initial chromosome for both GA and SA. For a clear understanding of the tree structure used by the GA/SA, a sample OVSF code tree is shown in Figure 2. The OVSF code tree which is input to the GA/SA block is named the initial chromosome (Chini), and this chromosome is represented with the index information belonging to the active users. In other words, the index numbers of the occupied branches of the existing OVSF code tree form a chromosome in the population.

2.1 GA Based Dynamic OVSF Code Assignment Scheme

GA, one of the optimization and global search methods, is based on Darwin's theory of evolution and simulated natural selection [8]. GA was developed further by Holland in the 1970s. It is applied effectively to solve various combinatorial optimization problems and works with probabilistic rules. Selection, crossover and mutation are the best-known genetic operators.
Fig. 3. GA block
A detailed presentation of the GA block diagram is shown in Figure 3. The GA adaptively defines the size of the initial population P of n chromosomes according to the current traffic density. Each chromosome in the initial population has different code tree index numbers, but the number of data bit rates and the total amount of data bit rates of each chromosome are identical to those of the initial chromosome. According to Figure 2, the initial chromosome is represented as Chini = [6 9 14 16 21] and the data bit rates of the users in this chromosome are [4R 2R 2R R R]. Each active user's index number in the chromosome is called a gene, and a gene is represented by an integer
number. In the case of the OVSF code tree reassignment scheme, the size of the population depends on the traffic density, which is determined by

n = SF − Σ_{i=1}^{U} S(i)    (1)
where U is the total number of active users and S(i) is the data rate of the ith active user, i = 1,…,U. In the above figure, U and S are 5 and 10R, respectively; therefore the number of chromosomes in the initial population is obtained as n = 6 (16 − 10). Each of the n chromosomes comprises coded information of an OVSF code tree, obtained using a random permutation, that fully describes a potential solution to the optimization problem and expresses a different OVSF code tree; nevertheless, the number of users and each user's data bit rate remain the same as in Chini. Figure 2 also shows how the 1st chromosome is obtained from Chini. The temporary chromosome TP(1), which is derived from Chini by random permutation, is sequentially assigned to an empty OVSF code tree from the 1st gene to the Uth gene. It is important to respect the orthogonality principle while assigning codes in the OVSF code tree. The resulting index numbers compose a new chromosome P(1). The process of obtaining P(1) is as follows: for each gene of TP(1), the corresponding gene in P(1) is selected as the leftmost possible OVSF code that has the same rate as this gene in TP(1). For instance, the first gene, numbered 14 in TP(1), has rate 2R; hence the leftmost possible gene with rate 2R is the OVSF code numbered 8 in P(1). For the gene numbered 21 with rate R in TP(1), we obtain the OVSF code numbered 18, and so on. After obtaining each corresponding gene for P(1), we list the genes in P(1) from highest rate to lowest rate. This process is repeated n times to fill P. Although each chromosome has different index numbers, the number of data bit rates and the total amount of data bit rates are the same as in the initial chromosome. P thus contains several different possible results for a given problem. Clearly, the number of iterations needed to reach the optimal solution depends on the population size (n), the users' data bit rates (S(i)), and their locations in the code tree.
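The construction of one population member from a shuffled user list can be sketched as follows, with the code tree indexed as a binary heap (layer for rate r·R holds indices SF..2SF−1 with SF = 16/r, matching the numbering of Fig. 2). This is a simplified reading of the procedure, not the authors' exact implementation.

```python
import random

def conflict_free(k, busy, size=31):
    """True iff heap-indexed code k, its ancestors and its descendants
    are all free, i.e. k is orthogonal to every busy code."""
    a, n = [], k
    while n > 1:
        n //= 2
        a.append(n)
    d, frontier = [], [k]
    while frontier:
        frontier = [c for p in frontier for c in (2 * p, 2 * p + 1) if c <= size]
        d += frontier
    return k not in busy and not busy.intersection(a) and not busy.intersection(d)

def build_chromosome(rates, max_sf=16):
    """One population member: shuffle the users (the temporary
    chromosome TP(j)), then place each on the leftmost conflict-free
    code of its layer and list genes from highest to lowest rate."""
    order = rates[:]
    random.shuffle(order)
    busy, genes = set(), []
    for r in order:
        sf = max_sf // r
        k = next(i for i in range(sf, 2 * sf) if conflict_free(i, busy))
        busy.add(k)
        genes.append((r, k))
    return [k for r, k in sorted(genes, key=lambda g: -g[0])]

random.seed(1)
chrom = build_chromosome([4, 2, 2, 1, 1])  # the rates of Chini
```

By construction each chromosome is orthogonality-feasible and carries exactly the same rate multiset as Chini, as required.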
A fitness value for each chromosome of the population is evaluated according to a fitness function, defined specifically for the OVSF code assignment–reassignment problem. The fitness value of the jth chromosome, f(j), is the amount of displacement of each individual in P(j) with respect to Chini, defined by

f(j) = Σ_{i=1}^{U} |Chini(i) − P(j, i)| × S(i)    (2)
where j is the chromosome number, j = 1,…,n. Based on the fitness values of the chromosomes in the population, the selection operator creates a new population of n chromosomes which, on average, have better fitness values than those in the original population. While this can be accomplished by many different techniques, this study mainly uses the roulette wheel. In the roulette wheel, the probability Pr(j) of each chromosome is inversely proportional to its fitness value f(j): the jth chromosome (Pselected(j)) is selected if its fitness value is minimum, i.e. if Pr(j) is maximum, and each available chromosome in the population is chosen with probability proportional to this weight.
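Eq. (2) and the roulette-wheel step can be sketched as below. The example numbers are the Chini and P(1) chromosomes of Fig. 2, and the inverse-fitness weighting is one plausible reading of "inversely proportional"; the small guard constant is an assumption to handle a fitness of zero.

```python
def fitness(ch_ini, chrom, rates):
    """Eq. (2): cost of turning Ch_ini into `chrom`, weighting the
    index displacement of each user by its data rate S(i)."""
    return sum(abs(a - b) * s for a, b, s in zip(ch_ini, chrom, rates))

def roulette_pick(population, fits, r):
    """Roulette-wheel selection with probability inversely proportional
    to fitness; r in [0, 1) plays the role of the random spin."""
    weights = [1.0 / (f + 1e-9) for f in fits]  # guard against f == 0
    total, acc = sum(weights), 0.0
    for chrom, w in zip(population, weights):
        acc += w / total
        if r < acc:
            return chrom
    return population[-1]

ch_ini, rates = [6, 9, 14, 16, 21], [4, 2, 2, 1, 1]
f0 = fitness(ch_ini, [5, 8, 13, 18, 24], rates)
```

For the P(1) of Fig. 2 the displacements are 1, 1, 1, 2 and 3, giving f = 1·4 + 1·2 + 1·2 + 2·1 + 3·1 = 13.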
In order to produce better traits, the chromosomes in the new population should be hybridized using the crossover operation. Pairs of chromosomes are selected from the population subject to the crossover rate (pc). The number of chromosomes (nj) taking part in the crossover operation is given by

nj = n × pc    (3)

If nj is odd, then (nj − 1)/2 randomly selected chromosome pairs are used for crossover; otherwise nj/2 pairs are used. The chromosome pairs used in the crossover operation are randomly chosen. In this work we consider the single-point crossover technique, in which for each pair to be crossed a random integer l is chosen as the crossover point. In this technique, as depicted in Figure 4, the first parent's head part up to the lth gene is associated with the second parent's tail part from the (l+1)st gene to the tth gene, where t is the length of each chromosome.
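On the gene-list representation, the single-point crossover of Fig. 4 and the pair count of Eq. (3) can be sketched as below. Crossed offspring may violate the orthogonality constraint on the code tree and would need re-validation; that step is omitted here.

```python
def single_point_crossover(parent_a, parent_b, l):
    """Join the head of one parent (genes 1..l) with the tail of the
    other (genes l+1..t), producing two offspring."""
    return parent_a[:l] + parent_b[l:], parent_b[:l] + parent_a[l:]

def crossover_pair_count(n, p_c):
    """Eq. (3): n_j = n * p_c chromosomes undergo crossover; they form
    n_j // 2 pairs, i.e. (n_j - 1)/2 pairs when n_j is odd."""
    return round(n * p_c) // 2

a, b = single_point_crossover([5, 8, 13, 18, 24], [6, 9, 14, 16, 21], 2)
```

With crossover point l = 2, the heads [5, 8] and [6, 9] are exchanged with the opposite tails.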
Fig. 4. Single point crossover operation (OVSF code-tree diagrams omitted)

Fig. 5. Swap mutation operation (OVSF code-tree diagrams omitted)
Finally, the mutation operation is applied to the population produced by the crossover operation to preserve genetic diversity by perturbing chromosomes randomly. In this operation, if a randomly obtained number between 0 and 1 is smaller than the mutation rate (pm), the mutation process is started; otherwise a new random number is drawn for the next chromosome. In the mutation process, two randomly selected genes (l, k) in each chromosome are exchanged, as shown in Figure 5. This operation, used in this study, is referred to as the swap mutation technique. The entire process of selection, crossover and mutation is continued until the optimization criterion is achieved. Although the fitness values can change due to the changing traffic density, the GA runs continuously to optimize the OVSF code tree using the fitness function according to the most recent traffic density.

2.2 SA Based Dynamic OVSF Code Assignment Scheme

SA, first developed by Kirkpatrick et al., is a local search algorithm. It is based on the analogy between the process of finding a possible best solution of a combinatorial optimization problem and the annealing process of a solid to its minimum-energy state in statistical physics. Figure 6 shows a detailed presentation of the SA block diagram. The searching process in SA starts with the initial chromosome. A neighborhood of this solution is generated using a neighborhood move rule. Then the cost of this possible solution (= chromosome) is obtained with

Δf = f_i − f_{i−1}    (4)
where Δf represents the change between the costs of the two solutions; f_i and f_{i−1} are the fitness values of the neighborhood solution and the current solution, respectively, calculated with the fitness function given in Eq. 2. The neighborhood solution is obtained with a neighborhood move, an operator used to produce a solution close to the current one in the search space. The swapping move is used in this paper; it works in the same way as the swap mutation operation in GA. If Δf < 0, the current solution is replaced with the generated neighborhood solution. Otherwise (Δf > 0), the current solution is replaced with the generated neighborhood solution only with a certain probability, namely when

exp(−Δf / T) > R  (5)
where T is the temperature, a positive control parameter, and R is a random number varying from 0 to 1. The algorithm keeps applying the neighborhood move operator to obtain candidate solutions with better fitness values until the inner loop criterion is met. After each inner loop, the temperature is decreased according to the Lundy & Mees cooling schedule. The performance of SA depends on the cooling schedule operator. In the Lundy & Mees schedule, the relationship between T_{k+1} and T_k is

T_{k+1} = \frac{T_k}{1 + \beta T_k}, \qquad \beta = \frac{T_i - T_f}{M T_i T_f}  (6)

where β > 0 is the coefficient relating the two temperatures T_{k+1} and T_k. In this study, the initial temperature (T_i) is 0.9, the final temperature (T_f) is 0.1, and M is the number of
230
M. Karakoc and A. Kavak
Fig. 6. SA block diagram
outer loop iterations in the algorithm. Then the inner loop is checked. The inner loop criterion decides how many candidate solutions are produced at each temperature; in this study, it is set to 5. Then the optimization criterion is checked. The optimization criterion check blocks in the flow chart are used to control the algorithm: if the criterion is satisfied, the algorithm is finalized and the requested data bit rate is assigned to the new user. This process is repeated until the requested data bit rate is assigned to the new user or the outer loop criterion is met. The outer loop criterion, which is used to stop the searching process, is set to 1000 iterations. If the requested data bit rate cannot be assigned by the time the outer loop criterion is met, the call is blocked.
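The SA search described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function and parameter names are assumptions, the swap neighborhood move is the same operation as the GA's swap mutation, and a best-so-far solution is tracked for convenience.

```python
import math
import random

def lundy_mees_sa(init, fitness, Ti=0.9, Tf=0.1, M=1000, inner=5, rng=random):
    """SA sketch: swap neighborhood move, acceptance test exp(-df/T) > R
    (Eq. 5), and the Lundy & Mees cooling schedule (Eq. 6)."""
    beta = (Ti - Tf) / (M * Ti * Tf)              # coefficient from Eq. (6)
    current = list(init)
    f_cur = fitness(current)
    best, f_best = list(current), f_cur           # best-so-far bookkeeping
    T = Ti
    for _ in range(M):                            # outer loop (1000 in the paper)
        for _ in range(inner):                    # inner loop criterion (5)
            cand = list(current)
            l, k = rng.sample(range(len(cand)), 2)  # swap move (as in GA mutation)
            cand[l], cand[k] = cand[k], cand[l]
            df = fitness(cand) - f_cur            # cost change, Eq. (4)
            if df < 0 or math.exp(-df / T) > rng.random():  # acceptance, Eq. (5)
                current, f_cur = cand, f_cur + df
                if f_cur < f_best:
                    best, f_best = list(current), f_cur
        T = T / (1 + beta * T)                    # Lundy & Mees cooling
    return best, f_best
```

Because worse moves are accepted with probability exp(−Δf/T), the search can escape local minima early (T near T_i) and becomes nearly greedy as T approaches T_f.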
3 Computer Simulations

Simulations are implemented to evaluate and compare the performance of the CCA, DCA, GA based DCA, and SA based DCA schemes. For the performance evaluation, firstly, the call blocking probability, which is the ratio of the number of blocked calls to the number of incoming call requests, is investigated under different call patterns with predefined probabilities. Then, the spectral efficiency of the system is considered, which is
calculated as the ratio of the data rates of served calls to the total requested data rate. Finally, the computational complexity of each algorithm is calculated in terms of the number of multiplication and addition operations per iteration. The following parameters are used in the simulations, classified as OVSF-concerned, GA-concerned, and SA-concerned parameters. As the OVSF-concerned parameters, the call arrival process is a Poisson process with mean arrival rate λ varied from 4 to 64 calls/unit. Call duration is exponentially distributed with a mean value of 0.25 units of time. The maximum spreading factor SF is 256. Possible OVSF code rates are generated using a uniform distribution between R and SF×R. The mean call arrival rate, call duration, code rate distribution, and SF are input parameters for our simulations. A single simulation is run until 1000 incoming calls are generated. Some calls leave the system according to the exponential call duration. The active (served) calls on OVSF codes, the GA parameters, and the numbers of assigned, blocked, and reassigned users and their data rates are stored while the simulation is running. For the same input parameters, the simulations are repeated 10 times and the results of these 10 runs are averaged. Regarding the GA-concerned parameters, a chromosome is represented by an integer number. The population size depends on the traffic density; in other words, the number of users in the system and their data bit rates determine the population size.
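The traffic setup above (Poisson arrivals, exponential call durations, uniformly drawn OVSF code rates) can be sketched as follows. The function name is illustrative, and restricting the code rates to the power-of-two multiples of R between R and SF×R is an assumption about how the uniform rate distribution is sampled.

```python
import random

def generate_calls(n_calls=1000, lam=16.0, mean_dur=0.25, SF=256, rng=random):
    """Traffic generator sketch: Poisson call arrivals (rate lam),
    exponentially distributed durations (mean mean_dur), and a code rate
    drawn uniformly from the power-of-two multiples of R (R = 1 here)."""
    rates = [2 ** i for i in range(SF.bit_length())]  # R, 2R, 4R, ..., SF*R
    t, calls = 0.0, []
    for _ in range(n_calls):
        t += rng.expovariate(lam)                 # exponential inter-arrival time
        duration = rng.expovariate(1.0 / mean_dur)  # mean 0.25 time units
        calls.append((t, duration, rng.choice(rates)))
    return calls
```

A single simulation run of the kind described in the text would feed 1000 such calls into the code tree and record assignments, blocks, and reassignments.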
Fig. 7. Blocking probability at different traffic loads and different call patterns, SF: 256
Roulette wheel, single point, and swap operators are used as the selection, crossover, and mutation techniques. The crossover rate pc is 0.8, while the mutation rate pm is 0.2. In SA, the initial and final temperatures are 0.9 and 0.1, respectively. The swap neighborhood move and the Lundy & Mees cooling schedule are used. The inner loop criterion is set to 5, while the outer loop criterion is 1000. Figure 7 shows the simulation results for the blocking probability under different call patterns. The performance of the system for different call patterns is important; here, predetermined call rates with known probabilities are applied to the system. For this purpose, assume that call requests of R, 2R, 4R, and 8R arrive at the system with probabilities PR, P2R, P4R, and P8R, respectively; this case is denoted by (PR:P2R:P4R:P8R). For instance, the call pattern (4:4:1:1) means that the percentages of call rates R, 2R, 4R, and 8R are 40%, 40%, 10%, and 10%, respectively. In the call pattern (1:1:1:1), the call rates R, 2R, 4R, and 8R have equal probability, 25% each. From the figure, the best performance for all methods is obtained with the call pattern (4:4:1:1). When the algorithms are compared, GA clearly results in the smallest blocking probability among the four methods. These results are expected because all code assignment algorithms can reduce the code blocking probability for low rate transmission requests. The call pattern (1:1:4:4) therefore does not exhibit obvious improvements, because data requests of 4R and 8R are not always accepted by a loaded code tree. Figure 8 shows the spectral efficiency of the GA, SA, DCA, and CCA methods at different traffic loads. The spectral efficiency of the resource is inversely proportional to the traffic load in the system. It is clearly seen that GA and SA use the given spectrum effectively. Numerically, at traffic load 10, the spectral efficiencies of GA, SA, DCA, and CCA are 25.9%, 25.6%, 22.2%, and 12.6%, respectively.
It is clear that code reassignment incurs a cost on the system. Therefore, we derived the number of addition and multiplication operations per iteration for each algorithm:
CCA = σ(U, SF) + U + SF + 1  (7)

DCA = 2σ(U, SF) + 2S(i) + 1  (8)

GA = 3σ(U, SF) + n(U² + 4U + 5) + 3 + j²  (9)

SA = σ(U, SF) + 2U + (nU + 2) + j(U + 1)²  (10)
Eqs. 7, 8, 9, and 10 give the computational loads of CCA, DCA, GA, and SA, respectively. It is observed that GA has the most complex structure among all the algorithms: numerically, its computational load is approximately 2.6 times that of CCA, 1.6 times that of DCA, and 1.15 times that of SA. The σ(U, SF) term is the primitive operation count:
σ(U, SF) = U + SF + SF log₂ SF
(11)
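The operation-count formulas of Eqs. (7)-(11) can be evaluated numerically, as in the sketch below. The sample argument values are assumptions for illustration only; the paper does not list the exact U, SF, S(i), n, and j values behind its 2.6x/1.6x/1.15x figures, and the grouping of the exponent in Eq. (10) follows our reading of the garbled layout.

```python
import math

def sigma(U, SF):
    """Primitive operation count, Eq. (11)."""
    return U + SF + SF * math.log2(SF)

def complexity(U, SF, S_i, n, j):
    """Per-iteration operation counts from Eqs. (7)-(10)."""
    s = sigma(U, SF)
    return {
        "CCA": s + U + SF + 1,                           # Eq. (7)
        "DCA": 2 * s + 2 * S_i + 1,                      # Eq. (8)
        "GA":  3 * s + n * (U**2 + 4*U + 5) + 3 + j**2,  # Eq. (9)
        "SA":  s + 2*U + (n*U + 2) + j * (U + 1)**2,     # Eq. (10)
    }
```

With any plausible parameter choice, the ordering GA > DCA > CCA reported in the text is reproduced, since GA pays the σ(U, SF) cost three times plus a quadratic population term.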
Fig. 8. Spectral efficiency at different traffic loads, SF: 256
4 Conclusion

This work investigates Genetic Algorithm (GA) and Simulated Annealing (SA) based OVSF code assignment in order to answer the following questions: what can be done to increase the number of active users in the system, and how can high data bit rate requests be assigned when enough capacity is provided by the system. For this purpose, the performances of GA and SA are evaluated under different comparison parameters. The simulation results show that GA and SA, as compared to CCA and DCA, provide the smallest blocking probability and the largest spectral efficiency in the system. In addition, the influence of different call patterns (different probabilities) for various traffic rates (R:2R:4R:8R = 4:4:1:1, 4:1:1:4, 1:1:1:1, and 1:1:4:4) is also tested; the 4:4:1:1 pattern yields the best results. Regarding computational complexity, this study shows that higher system performance comes at the cost of higher computational load.
References

1. Holma, H., Toskala, A.: WCDMA for UMTS. Wiley (2000)
2. Adachi, F., Sawahashi, M., Okawa, K.: Tree structured generation of orthogonal spreading codes with different lengths for forward link of DS-CDMA mobile. IEE Electronics Letters 33, 27–28 (1997)
3. Okawa, K., Adachi, F.: Orthogonal Forward Link Using Orthogonal Multi Spreading Factor Codes for Coherent DS-CDMA Mobile Radio. IEICE Transactions on Communications E81-B(4), 778–779 (1998)
4. Minn, T., Siu, K.: Dynamic Assignment of Orthogonal Variable-Spreading-Factor Codes in W-CDMA. IEEE Journal on Selected Areas in Communications 18(8), 1429–1440 (2000)
5. Chao, C.M., Tseng, Y.C., Wang, L.C.: Reducing Internal and External Fragmentations of OVSF Codes in WCDMA Systems with Multiple Codes. In: Proc. IEEE Wireless Communications and Networking Conf., vol. 1, pp. 693–698 (2003)
6. Rouskas, A.N., Skoutas, D.N.: OVSF Codes Assignment and Reassignment at the Forward Link of W-CDMA 3G Systems. In: Proc. IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, vol. 5, pp. 2404–2408 (2002)
7. Tseng, Y.C., Chao, C.M.: Code Placement and Replacement Strategies for Wideband CDMA OVSF Code Tree Management. IEEE Transactions on Mobile Computing 1(4), 293–302 (2002)
8. Goldberg, D.E.: Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, Reading (1989)
9. Manetta, L., Ollino, L., Schillaci, M.: Use of an Evolutionary Tool for Antenna Array Synthesis. In: Rothlauf, F., Branke, J., Cagnoni, S., Corne, D.W., Drechsler, R., Jin, Y., Machado, P., Marchiori, E., Romero, J., Smith, G.D., Squillero, G. (eds.) Applications of Evolutionary Computing. LNCS, vol. 3449, pp. 245–253. Springer, Heidelberg (2005)
10. Shayesteh, M.G., Menhaj, M.B., Nobary, B.G.: A New Modified Genetic Algorithm for Multiuser Detection in DS/CDMA Systems. In: Reusch, B. (ed.) Computational Intelligence. Theory and Applications. LNCS, vol. 2206, p. 608. Springer, Heidelberg (2001)
11. Cinteza, M., Radulescu, T., Marghescu, I.: Orthogonal Variable Spreading Factor Code Allocation Strategy Using Genetic Algorithms. COST 290 Technical Document, Traffic and QoS Management in Wireless Multimedia Networks (Wi-QoST) (2005)
12. Kirkpatrick, S., Gelatt Jr., C.D., Vecchi, M.P.: Optimization by Simulated Annealing. Science 220, 671–680 (1983)
13. Yoon, S.H., Rao, S.S.: Annealed neural network based multiuser detector in code division multiple access communications. IEE Proceedings Communications 147(1), 57–62 (2000)
14. Osseiran, A., Logothetis, A.: A method for designing fixed multibeam antenna arrays in WCDMA systems. IEEE Antennas and Wireless Propagation Letters 5, 41–44 (2006)
Adaptive Resource Reservation for Efficient Resource Utilization in the Wireless Multimedia Network Seungwoo Jeon, Hanjin Lee, and Hyunsoo Yoon Dept. of Electrical Engineering and Computer Science, KAIST, Republic of Korea {swjeon, hjlee, hyoon}@nslab.kaist.ac.kr
Abstract. With the growth of wireless networks and the demand for a variety of services, handoff resource reservation for Quality of Service (QoS) will be a significant issue in future wireless networks. On the other hand, the efficiency of resource usage also needs to be considered, to prevent the waste of network resources due to reservation. Therefore, this paper suggests an efficient resource reservation scheme that supports QoS provision and efficient resource utilization simultaneously. In the proposed scheme, handoff probability estimation with a filtering algorithm is used to calculate the optimal amount of resources to be reserved in the neighbor cells. Thus, mobile units with the proposed scheme can predict the amount of resources required in the future and reserve resources effectively. The performance simulation shows that the proposed scheme achieves a good balance of QoS provision and resource utilization in the wireless network.

Keywords: wireless network, handoff, resource reservation.
1 Introduction
The wireless network has grown substantially and has become a necessary part of our life. Unlike past mobile communication, the demand for a variety of services is expected to increase, and Quality of Service (QoS) will be emphasized more strictly in future wireless networks [6]. In terms of QoS, handoff resource reservation in the cells that are likely to be visited is a promising scheme for future wireless networks. There are many situations in which mobile units pass through cells and have to carry out handoffs to maintain their communication with others. Many studies indicate that frequent handoffs are closely related to critical degradation of QoS, by increasing call drop and block problems due to lack of resources [7]. Therefore, related issues have been studied actively in recent years. There are several works on resource reservation for supporting QoS provision. Some schemes are based on guard channel allocation, in which a certain amount of resources is always allocated for handoff calls [4][5]. These are intended to prevent handoff call drops, but the resource usage would not

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 235–247, 2007.
© Springer-Verlag Berlin Heidelberg 2007
be maintained efficiently because of inflexible or false allocation. Unlike those schemes, [1] suggested the shadow cluster scheme, which reserves resources in a cluster of nearby cells (the shadow) according to the probability of call arrival. Furthermore, [2] proposed a call admission control based on resource reservation in the most likely cell cluster (MLC), using mobility information and the cell residence time of the mobile unit. However, even though these suggestions can be effective for guaranteeing QoS for certain mobile units, they may provoke excessive resource reservation, which can create unfavorable problems for others. Generally, the support of QoS and efficient resource usage are thought of as opposed to each other, but both have to be ensured simultaneously. Therefore, in this paper, we propose a novel scheme for resource reservation which aims at not only the guarantee of QoS for mobile units with a variety of services but also the effective usage of network resources. To fulfil this purpose, our scheme predicts the near-optimal amount of resources in neighbor cells based on handoff probability estimation. In normal situations, this is an intuitive and rational strategy that is able to balance these conflicting requirements. The remainder of this paper is organized as follows: In Section 2, the principle of the proposed adaptive resource reservation scheme is presented. Section 3 discusses the stepwise procedure of the scheme in the wireless network. The performance evaluation of the proposed scheme is presented in Section 4. Finally, Section 5 summarizes the overall conclusion and remarks on future work.
2 The Adaptive Resource Reservation Scheme

2.1 The Concept of the Adaptive Resource Reservation Scheme
In our scheme, the handoff probability per cell is defined as the possibility that the mobile unit reaches that region in case of handoff, and its precise calculation is directly related to determining the amount of resources to be reserved in the next cell. Therefore, all neighbor cells adjacent to the current one are considered, and the estimated values for them are used for resource reservation. Considering a mobile unit using a certain service s, the amount of resources that has to be reserved in cell i at time t can be written as follows:

B^s_i(t) = P_i(t) B^s_total  (1)
where B^s_total is the amount of resources required to maintain service s and P_i(t) is the handoff probability for cell i predicted at time t. Note that P_i(t) takes values between 0 and 1. Although the handoff probability estimation can normally be derived from the mobile unit's currently observable characteristics, such as the signal-to-noise ratio (SNR) and mobility, we assume that it also has significant relevance to the previous handoff probability estimate. Therefore, both of these two factors need to be considered simultaneously, but the weight of each has to change depending on the location of the mobile unit.
Fig. 1. The mobile unit's relative location: (a) far from the cell boundary, (b) close to the boundary
Let us consider the different mobile unit locations shown in Figure 1. If the mobile unit is relatively far from the cell boundary, as in (a), the handoff probability should be determined largely by current observation rather than by previous estimation. In contrast, the closer the mobile unit approaches the boundary, as in (b), the more the handoff probability can be affected by past estimation, because the chance of a dramatic change in mobility is gradually reduced and the previously predicted probability becomes much closer to the next one as the unit moves toward the neighbor cells [3]. For this reason, we adopt the discrete Kalman filter as the key mechanism in the proposed scheme. Because the Kalman filter is an effective way to predict the next value with a recursive mechanism [9], our scheme can adjust the weights of the previous estimate and the current observation of the handoff probability by using that mechanism together with the mobile unit's location information. Therefore, the near-optimal amount of resources which the mobile unit will require can be predicted in advance. In addition, the current estimate of the handoff probability per neighbor cell is composed of two parts in our scheme, reflecting the mobile unit's various properties: (i) the possibility of handoff occurrence in the current cell, and (ii) the probability that the mobile unit reaches each particular neighbor cell, according to its mobility, in case of handoff. Since these values are regarded as probabilistically independent, the handoff probability can be simply induced by multiplying them.

2.2 The Probability of Handoff Occurrence
In terms of mobility, it is generally assumed that a handoff occurs when the mobile unit crosses the boundary of the current cell. It can also be seen as a result of the weakening of the received signal strength from the current base station, but these two explanations can be understood as the same, because the designation of the cell region is based on signal strength measurement.
From this consideration, the proposed scheme uses a feasible estimation function for the handoff occurrence probability. It is based on the relation between the service duration and the cell residence time, because they are closely related to handoff. Let the probability of handoff occurrence in the future, estimated at current time h, be written as F(h). Note that the anticipated handoff initiation time, denoted by k(h), is calculated every pre-defined time period from call initiation or from entering the current cell. Normally, we can expect that a handoff occurs when the cell residence time t_c is shorter than the service duration time t_s, i.e., when the mobile unit is still in the middle of a service. Therefore, F(h) can be expressed as follows:

F(h) = P(t_s > t_c | t_s > k(h))  (2)
Because it’s generally acceptable that call duration and cell residence time are modeled as exponentially distributed random variables, let tc and ts follow 1 1 exponential distributions fc (τ ) and fs (τ ) with mean and respectively. For μ λ the simplicity of calculation, the intersection of P (ts > tc ) and P (ts > k(h)) can be abbreviated as the probability that the k(h) is larger than tc from considering empirical handoff call mobility pattern. Therefore, the above formula can be rewritten like follows. P (ts > tc , ts > k(h)) P (k(h) ≥ tc ) = P (ts > k(h)) 1 − P (ts ≤ k(h)) k(h) k(h) −μτ fc (τ )dτ μe dτ 0 0 = = k(h) k(h) −λτ 1− 0 fs (τ )dτ 1− 0 λe dτ
F (h) =
(3)
Next, the differentiation of both sides gives the f (h), the probability density function of F (h). k(h) −μτ μe dτ d d F (h) = { 0 k(h) } dk(h) dk(h) 1 − −λτ λe dτ 0 k(h) −λτ k(h) −μτ μe−μk(h) (1 − 0 λe dτ ) + λe−λk(h) ( 0 μe dτ ) = k(h) (1 − 0 λe−λτ dτ )2
f (h) =
(4)
= (μ − λ)e−(μ−λ)k(h) + λeλk(h) The second term can be simply neglected from above assumption that ts is much longer than k(h). So, (3) and (4) can be simply expressed as follows. (μ − λ)e−(μ−λ)k(h) if μ > λ −(μ−λ)k(h) f (h) = (μ − λ)e (5) 0 otherwise
k(h)
f (τ )dτ
F (h) = 0
1 − e−(μ−λ)k(h) 0
if μ > λ otherwise
(6)
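A small numerical sketch of Eqs. (5) and (6) follows; the function name and the sample values of μ, λ, and k(h) in the test are assumptions for illustration.

```python
import math

def handoff_occurrence_prob(mu, lam, k_h):
    """F(h) from Eq. (6): an exponential CDF in the anticipated crossing
    time k(h) with rate (mu - lam); zero when mu <= lam, i.e. when the
    cell residence time is expected to outlast the service."""
    if mu <= lam:
        return 0.0
    return 1.0 - math.exp(-(mu - lam) * k_h)
```

As expected from the text, the probability grows toward 1 as the anticipated crossing time k(h) shrinks relative to the remaining service duration (i.e., as (μ − λ)k(h) grows).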
From (5) and (6), the handoff occurrence probability is determined by the relation between the service duration and the cell residence time. Note that it also follows an exponential distribution whose rate is the difference between the expected cell passing rate (μ) and the service on/off rate (λ) of the mobile unit. Assume that the mean service duration time is already known, and that the mean cell residence time is defined as the expected duration for a mobile unit to remain in the cell. Then 1/μ can be calculated every time period as follows:

\frac{1}{\mu} = \frac{1}{h} \sum_{j=0}^{h} \frac{2R}{V(j)}  (7)
where R is the cell radius and V(j) denotes the mobile unit's speed at time j. Note that the averaged value of V(j) is used so as not to reflect abrupt momentary changes. Finally, because k(h) is the current cell crossing time and can have a different value for each neighbor cell according to the mobile unit's location, it is rewritten as k^i(h) for each cell i and defined as follows:

k^i(h) = \frac{d^i(h)}{V(h)}  (8)
where d^i(h) denotes the distance between the mobile unit's current position at time h and the boundary between neighbor cell i and the current cell. Therefore, considering each neighbor cell adjacent to the current one, the probability of handoff occurrence for neighbor cell i is represented as F^i(h).

2.3 The Crossing Probability per Neighbor Cell
Generally, the association between the direction of the mobile unit and resource reservation needs to be investigated for efficiency. However, except in some cases such as highways and railroads, it is not easy to predict the direction precisely. In the wireless network, it is a largely feasible assumption that the next cell will be the one from which the mobile unit currently receives the strongest signal. This is adopted in our scheme as well, but the crossing probabilities of the other cells also need to be considered simultaneously, in case of a dramatic change of mobility. From the above consideration, the proposed scheme uses the following simple estimation function for the crossing probability per neighbor cell:

M^n(h) = \begin{cases} 1 & \text{for the presumed next cell } n \\ \max\left(0,\; \frac{1}{2} - \frac{1}{2}\cos(\pi - \psi_{mn}(h))\right) & \text{for other cells} \end{cases}  (9)

where M^n(h) denotes the crossing probability for cell n when the mobile unit begins handoff from the current cell m, and ψ_mn(h) is the difference between the mobile unit's current direction and the straight path between the neighbor cell n and
Fig. 2. The mobile unit's trajectory and straight path between cells
cell m at time h, as shown in Figure 2. Note that the absolute value of ψ_mn(h) ranges between 0 and π. From this, even if a cell is not currently presumed to be the next one, its crossing probability can remain high when the value of ψ_mn(h) is not large. So, by considering the signal strength and the direction difference simultaneously, small changes or false observations of signal strength and mobility can be effectively controlled, and the chance of dramatic variation in the crossing probability is noticeably reduced.

2.4 Handoff Probability Prediction with the Filtering Algorithm
In this section, the synthetic estimation of the handoff probability is derived from the previous ones. As mentioned before, the overall probability can be calculated based on the discrete Kalman filter algorithm with location information. The filtering procedure for handoff probability estimation is depicted in Figure 3 and consists of two phases: (i) with the estimate from the previous period and the current observation for handoff, predict the handoff probability per neighbor cell at the next period; (ii) update and refine the estimate with measurement information to arrive at more accurate values.
Fig. 3. The filtering for handoff probability estimation
Assume that the period of estimation is pre-defined. Then the actual handoff probability vector X_n at period n can be represented as follows:

X_n = (p(1), p(2), ..., p(N))^T  (10)
where p(i) is the handoff probability for neighbor cell i and N denotes the number of neighbor cells. The current observation of the handoff probability at period n can be written as U_n, whose elements are the products of the two probabilities F^i(n) and M^i(n) described in the previous sections:

U_n = (F^1(n)M^1(n), F^2(n)M^2(n), ..., F^N(n)M^N(n))^T  (11)
Throughout this procedure, the process and observation errors are considered. In the normal filtering situation, they are defined as zero mean Gaussian noise with covariance matrices Q and V, which are assumed to be σ²_Q I_N and σ²_V I_N in the proposed scheme. Note that I_N denotes the identity matrix of size N, and σ_Q and σ_V are the scalar standard deviations of the process and observation errors, respectively. In the update stage of the filtering process, Z_n, the measurement of X_n, which is calculated every period, can be written as follows:

Z_n = H_n X_n + v_n = (z(1), z(2), ..., z(N))^T  (12)
where H_n is the observation model that maps the true value into the measurement, defined as I_N in our scheme. The term v_n denotes the observation error in the mapping process, as mentioned above, and the measurement for cell i, z(i), can be simply estimated as follows:

z(i) = \frac{RSS(i)}{\max(RSS(1), RSS(2), ..., RSS(N))}  (13)
where RSS(i) denotes the received signal strength from cell i, which can be obtained every period. As assumed above, because the handoff probability per neighbor cell is significantly related to the received signal strength from the cells, z(i) can be expressed as the ratio of the RSS value to the maximum one. From the above considerations, the linear filter algorithm with mean square error (MSE) matrices can be obtained. The resulting algorithm can be easily understood and implemented in the system.

The predict stage:

1) Prediction of the state estimate

X̂_{n|n−1} = A_n X̂_{n−1|n−1} + B_n U_n  (14)

where X̂_{n|n−1} is the a priori estimate of X_n given no measurement information, and X̂_{n|n} is the a posteriori estimate of X_n after the update stage. The N × N coefficient matrices A_n and B_n are defined as follows:

A_n + B_n = I_N  (15)
242
S. Jeon, H. Lee, and H. Yoon
where the elements of B_n are defined as follows:

B_n(i,j) = \begin{cases} \min\left(1,\; \log_2\left(1 + \frac{d^i(n)}{R}\right)\right) & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}  (16)
From (16), the elements of A_n gradually increase as the mobile unit approaches each cell boundary, in contrast to those of B_n.

2) Prediction of the MSE covariance matrix

P_{n|n−1} = A_n P_{n−1|n−1} A_n^T + B_n + Q  (17)
where the error covariance matrices P_{n|n−1} and P_{n|n} denote cov(X_n − X̂_{n|n−1}) and cov(X_n − X̂_{n|n}), respectively.

The update stage:

1) Optimal Kalman gain matrix

K_n = P_{n|n−1} H_n^T (H_n P_{n|n−1} H_n^T + V)^{−1}  (18)
where K_n is the N × N Kalman gain matrix at period n.

2) Update of the state estimate

X̂_{n|n} = X̂_{n|n−1} + K_n (Z_n − H_n X̂_{n|n−1})  (19)
where the measurement residual Z_n − H_n X̂_{n|n−1} represents the discrepancy between the estimated value and the actual measurement.

3) Update of the MSE covariance matrix

P_{n|n} = (I_N − K_n H_n) P_{n|n−1}  (20)
This filtering process continues recursively until the actual handoff begins, and the estimate changes adaptively during it. Therefore, the mobile unit can reserve and adjust resources in the neighbor cells in proportion to the elements of X̂_{n|n}. With this, the estimation result is robust to minor mobility changes and can reflect the actual value near-optimally.
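One predict/update cycle of the filter above (Eqs. 14-20, with H_n = I_N as stated in the text and A_n = I_N − B_n from Eq. 15) can be sketched as follows. This is an illustrative sketch assuming numpy; the input values used in the test are invented for demonstration.

```python
import numpy as np

def predict_update(x_prev, P_prev, u_n, z_n, B_n, Q, V):
    """One cycle of the handoff-probability Kalman filter (Eqs. 14-20)."""
    N = len(x_prev)
    I = np.eye(N)
    A_n = I - B_n                               # Eq. (15): A_n + B_n = I_N
    x_pred = A_n @ x_prev + B_n @ u_n           # Eq. (14): state prediction
    P_pred = A_n @ P_prev @ A_n.T + B_n + Q     # Eq. (17): MSE prediction
    K = P_pred @ np.linalg.inv(P_pred + V)      # Eq. (18) with H_n = I_N
    x_new = x_pred + K @ (z_n - x_pred)         # Eq. (19): measurement update
    P_new = (I - K) @ P_pred                    # Eq. (20): MSE update
    return x_new, P_new
```

As the unit nears a boundary, the diagonal of B_n shrinks (Eq. 16), so A_n dominates and the prediction leans on the previous estimate, exactly the location-dependent weighting described in Section 2.1.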
3 The Adaptive Resource Reservation Procedure
From the above algorithm, the overall resource reservation procedure is obtained. Note that in our scheme, the reservation applies only to real time services. Mobile units with non-real time services do not reserve resources, because those services are generally assumed to be less sensitive to delay or
variation in throughput. Thus, resources for non-real time services are allocated opportunistically, according to the amount of free resources in the next cell, when the handoff begins. The overall resource reservation procedure can be represented as follows:

1. When a mobile unit with a real-time service reaches the pre-defined resource reservation boundary, it requests resource reservation in the neighbor cells from the current base station.
2. The base station calculates the handoff probability per cell and requests resource reservation from the neighbor cells with that information.
3. The base station in the current cell and those in the neighbor cells periodically exchange information about resource reservation and adjust the amount of resources reserved for the mobile unit.
4. When the mobile unit reaches the handoff area, the target cell is identified and the handoff process begins. If there are enough resources reserved, or sufficient free resources in the target cell, the mobile unit can maintain its service, and the resources reserved in the other neighbor cells are immediately deallocated.
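Steps 2 and 4 of the procedure can be sketched as follows; the function names and the dictionary representation of per-cell reservations are illustrative assumptions, not the paper's interface.

```python
def reserve_resources(handoff_probs, b_total):
    """Step 2: turn the estimated handoff probability per neighbor cell
    into a reservation request, B_i(t) = P_i(t) * B_total (Eq. 1)."""
    return {cell: p * b_total for cell, p in handoff_probs.items()}

def complete_handoff(reservations, target_cell):
    """Step 4: keep the target cell's reservation, release the rest."""
    kept = reservations.get(target_cell, 0.0)
    released = {c: r for c, r in reservations.items() if c != target_cell}
    return kept, released
```

Because the per-cell probabilities are refreshed every estimation period, step 3 amounts to calling reserve_resources again with the latest filter output and exchanging the deltas between base stations.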
4 The Performance Evaluation
We simulated our scheme in terms of QoS provision and resource utilization at an acceptable level. The handoff call drop and new call block probabilities, which are closely related to QoS, are investigated throughout this evaluation. The service class in the simulation is composed of three real time services and one file transfer protocol (FTP) service, listed in Table 1, and the simulation parameters are summarized in Table 2. Note that the call request per cell denotes the number of new originating calls in a cell.

Table 1. The service types

Service              | Data rate          | Mean duration
---------------------|--------------------|--------------
Video on demand      | 1 ∼ 6 Mbps         | 10 min
Videophone           | 256 kbps ∼ 2 Mbps  | 5 min
Telephony            | 64 ∼ 128 kbps      | 3 min
FTP (non real-time)  | 1 ∼ 10 Mbps        | 10 min
In this simulation, the direction (φ) and speed (v) of the mobile unit are based on [8] and newly designed as follows:

v_new = \begin{cases} \min\{\max[v_{old} + \Delta v,\; 0],\; v_{max}\} & \text{where } P_v \le 0.9 \\ 0 & \text{otherwise} \end{cases}  (21)

φ_new = \begin{cases} φ_{old} & \text{where } P_d \le 0.8 \\ φ_{old} + \Delta φ & \text{otherwise} \end{cases}  (22)
Table 2. The simulation parameters
Parameter                           | Value
------------------------------------|------------------------------------------------
Cell bandwidth                      | 30 Mbps
R                                   | 1000 m
N                                   | 6
Number of total cells               | 7 (wrap-around)
Base station transmission power     | 45 dBm
Shadowing                           | Lognormal with deviation 5 dB
Noise power spectral density        | −174 dBm/Hz
Call request per cell               | Poisson distribution (mean between 0.3 ∼ 1/sec)
σV, σQ                              | 0.1
Resource reservation boundary (SNR) | 9 dB
Handoff hysteresis (SNR)            | 2 dB
Estimation time period              | 1 sec
Table 3. The mobility parameters

Mobility              | vmax    | Δv           | Δφ
----------------------|---------|--------------|------------
Pedestrian            | 3 km/h  | −1 ∼ 1 km/h  | −90° ∼ 90°
Normal speed vehicle  | 60 km/h | −3 ∼ 3 km/h  | −45° ∼ 45°
High speed vehicle    | 90 km/h | −5 ∼ 5 km/h  | −30° ∼ 30°
Fig. 4. The handoff drop probability for real time services
where v_old, φ_old and v_new, φ_new denote the old and updated values of the speed and direction, respectively. Δv and Δφ denote the per-period changes (accelerations) of v and φ, and v_max denotes the maximum speed of the mobile unit. P_v and P_d are uniform random values between 0 and 1, chosen every period.
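A sketch of the mobility update of (21)–(22) (our own illustration; the function and parameter names are invented, and Δv, Δφ are drawn uniformly from the per-group ranges of Table 3):

```python
import random

def update_mobility(v_old, phi_old, v_max, dv_max, dphi_max, rng=random):
    """One periodic update of a mobile unit's speed and direction.

    Eq. (21): with probability 0.9 the new speed is the old speed plus a
    uniform increment, clipped to [0, v_max]; otherwise the unit pauses.
    Eq. (22): with probability 0.8 the direction is kept; otherwise a
    uniform increment is added.
    """
    dv = rng.uniform(-dv_max, dv_max)
    dphi = rng.uniform(-dphi_max, dphi_max)
    v_new = min(max(v_old + dv, 0.0), v_max) if rng.random() <= 0.9 else 0.0
    phi_new = phi_old if rng.random() <= 0.8 else phi_old + dphi
    return v_new, phi_new

# Pedestrian row of Table 3: v_max = 3 km/h, dv in [-1, 1] km/h, dphi in [-90°, 90°]
rng = random.Random(7)
trace = [(1.0, 0.0)]
for _ in range(1000):
    v, phi = trace[-1]
    trace.append(update_mobility(v, phi, 3.0, 1.0, 90.0, rng))
```

By construction of the clipping in (21), every generated speed stays within [0, v_max].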
Adaptive Resource Reservation for Efficient Resource Utilization
[Figure: Prob. of handoff call drop vs. call requests/sec (0.3–1.0) for ARR, No Reserv. and Guard Ch.]
Fig. 5. The handoff drop probability for non-real time services

[Figure: Prob. of new call block vs. call requests/sec (0.3–1.0) for ARR, No reserv. and Guard Ch.]
Fig. 6. The new call block probability
Furthermore, the mobile units' mobilities are classified into three groups, listed in Table 3, to emulate realistic wireless network circumstances. We compared the handoff drop and new call block probabilities of the proposed scheme with those of two other schemes commonly referenced in this context: no reservation and a fixed guard channel. The fixed guard channel scheme is set to reserve 30% of the total resources exclusively for real-time call handoffs. The results are averaged over 20 simulation runs. Figures 4 and 5 show the handoff call drop probability as a function of the mean call requests per second in a cell. Figure 4 shows that the real-time call drop probability of the proposed scheme (ARR) is clearly lower than those of no reservation (No Reserv) and the fixed guard channel scheme (Guard Ch). For the non-real time service, the handoff call drop probability of the proposed scheme is almost identical to those of the others, as shown in Figure 5. This is because the
resource reservation is only allowed for handoff calls with real-time services. However, as mentioned before, this is expected to cause few QoS-related problems. Note that the probability in the guard channel scheme is noticeably higher than the others, presumably a result of the exclusive resource reservation for handoff calls with real-time services. Figure 6 shows the new call block probability as a function of the call requests per second. The block probability of the proposed scheme remains low, almost identical to or only slightly higher than no reservation. In principle, the proposed scheme inevitably takes away some chances for new calls to use sufficient resources. However, because the resource reservation for handoff calls is adjusted periodically, most new calls can obtain an acceptable amount of resources for communication and QoS degradation is largely prevented. In contrast to our scheme, the guard channel scheme has the highest call block probability, which largely matches our anticipation.
5
Conclusion and Future Work
In future wireless networks, the guarantee of QoS as well as resource utilization efficiency are expected to be important issues. To handle these requirements in balance, we propose a new resource reservation scheme based on handoff probability estimation with a filtering algorithm. It works effectively by periodically updating reservations so that redundant resources are not allocated until the handoff begins. The simulation results show that QoS guarantees and efficient resource usage can both be achieved acceptably with the proposed scheme. We conclude that our scheme is a good alternative for handoff resource reservation in wireless networks with limited resources and a variety of services. Our future work includes comparing our scheme with other proposals and evaluating it in more realistic scenarios. We anticipate that this follow-up work will provide more details about the suggested scheme.
References

1. Levine, D.A., Akyildiz, I.F., Naghshineh, M.: A resource estimation and call admission algorithm for wireless multimedia networks using the shadow cluster concept. IEEE/ACM Trans. Networking 5 (1997)
2. Aljadhai, A.R., Znati, T.F.: A framework for call admission control and QoS support in wireless environments. In: IEEE Infocom (1999)
3. Liu, T., Bahl, P., Chlamtac, I.: Mobility modeling, location tracking, and trajectory prediction in wireless ATM networks. IEEE J. Select. Areas Commun. 16 (1998)
4. Hong, D., Rappaport, S.S.: Traffic model and performance analysis for cellular mobile radio telephone systems with prioritized and nonprioritized handoff procedures. IEEE Trans. Veh. Technol. 35 (1986)
5. Ma, Y., Han, J.J., Trivedi, K.S.: Call admission control for reducing dropped calls in code division multiple access (CDMA) cellular systems. In: IEEE Infocom (2000)
6. Huang, L., Kumar, S., Kuo, C.-C.J.: Adaptive resource allocation for multimedia QoS management in wireless networks. IEEE Trans. Veh. Technol. 53 (2004)
7. Diederich, J., Zitterbart, M.: Handoff prioritization schemes using early blocking. IEEE Communications Surveys 7 (2005)
8. Ye, J., Hou, J., Papavassiliou, S.: A comprehensive resource management framework for next generation wireless networks. IEEE Trans. Mobile Computing 1 (2002)
9. Welch, G., Bishop, G.: An introduction to the Kalman filter. In: ACM SIGGRAPH 2001 (2001)
A Discrete-Time Queueing Model with a Batch Server Operating Under the Minimum Batch Size Rule

Dieter Claeys, Joris Walraevens, Koenraad Laevens, and Herwig Bruneel

SMACS Research Group, Department of Telecommunications and Information Processing, Ghent University, Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium
{dclaeys,jw,kl,hb}@telin.ugent.be
Abstract. In telecommunications networks, usually an aggregation of information units (a batch) is transmitted instead of individual information units. In order to obtain performance measures for such networks, we analyze a discrete-time queueing model with a batch server operating under the minimum batch size (MBS) service policy. Specifically, we calculate the steady-state probability generating function (PGF) of the system contents at the beginning of an arbitrary slot. This PGF enables us to derive some important performance measures. Furthermore, we investigate, through some numerical examples, the influence of some parameters on the optimal choice of the MBS. In this paper, we focus on the influence of the load and the distribution of the service times. Keywords: queueing model, batch server, service policies, minimum batch size, probability generating function.
1
Introduction
In telecommunications networks, usually some sort of aggregation of information units takes place at the sender’s side. This is mainly done for efficiency reasons, since only one header per aggregated batch has to be constructed instead of one header per single information unit. Optical switched networks apply this method (see e.g. Qiao[10]). At the edges, IP packets with the same destination and QoS requirements are aggregated into optical bursts which are injected into the network. The number of information units waiting to be aggregated and transmitted or the time that information units are delayed are some important performance characteristics of the aggregation process. In order to obtain expressions for these performance measures, we have developed a queueing model with a server which can process several information units simultaneously, i.e. a batch server. Note that in the remainder of the paper we use the more generic term
The second author is a Postdoctoral Fellow with the Fund for Scientific Research, Flanders (F.W.O.-Vlaanderen), Belgium.
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 248–259, 2007. © Springer-Verlag Berlin Heidelberg 2007
customers instead of information units. The maximum number of customers that can be served simultaneously is called the server capacity. A traditional server has a capacity equal to one, while a batch server has a capacity larger than one. Many other applications of batch servers exist (e.g. in transportation and distribution logistics), so it is no surprise that batch-service models have been studied in the past by many researchers (e.g. Chaudhry[3], Dümmler[4], Gupta[6], Kim[7], Neuts[9]). The service policy is an important aspect of batch servers. This is a rule that determines how many customers have to be in the system before the server starts processing. For example, with an immediate-batch service policy the server starts serving from the moment that at least one customer is in the system. Notably, in this case, additional customers cannot be added to the batch in service, even though there might be free capacity. When long service times are likely, it may seem desirable always to serve at full capacity instead, i.e. to adopt a full-batch service policy. A drawback of this policy is that customers might have to wait a long time until the system contains enough customers for the server to process at full capacity, e.g. when the arrival rate is low. Neuts[9] presented a general service policy called the minimum batch size (MBS) policy. The MBS is a threshold that defines the minimum number of customers that have to be in the system before the server is allowed to start processing. Note that an immediate-batch service policy corresponds to an MBS equal to one, while the MBS is equal to the server capacity in case of a full-batch service policy. Below, we first describe the model. Then, in section 3, we calculate the steady-state probability generating function (PGF) of the system contents at the beginning of an arbitrary slot. We also obtain the probability mass function of the number of customers in a served batch and the mean delay of the customers. In section 4, we investigate through some numerical examples how the optimal MBS is influenced by the load and the distribution of the service times. Conclusions are drawn in section 5.
2
Model Description
We consider the discrete-time queueing model depicted in Fig. 1. The model consists of a queue of infinite length and a batch server. The batch server has a capacity of c, which means that maximum c customers can be served simultaneously. The MBS, the minimum number of customers that have to be present in the system before the server starts processing is denoted by l (1 ≤ l ≤ c). The transmission is synchronised, i.e. service cycles start and end at slot boundaries and thus last an integer number of slots. We assume that the lengths of consecutive service cycles, as well as the numbers of new customers arriving during consecutive slots, can both be represented by a set of independent and identically distributed (IID) random variables (RV’s), with PGF’s S(z) and A(z) respectively. Due to the momentgenerating property of PGF’s, their mean values are equal to S (1) and A (1) (we use primes to indicate derivatives). We also assume
[Figure: a queue fed by arrivals with PGF A(z), in front of a batch server with c service positions and service-time PGF S(z).]
Fig. 1. Model of the system, containing a queue and a batch server
that the service times are independent of the number of served customers. The queueing discipline is irrelevant for the analysis in this paper. Our model is thus a batch-arrival batch-service queueing model, denoted by M^X|GI^{l,c}|1. The equilibrium condition of this queueing model requires that the load ρ = A′(1)S′(1)/c < 1.
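The slot dynamics of this model can be made concrete with a small simulation sketch (our own reading of the model, restricted to deterministic service times; the function name and slot-boundary conventions are ours):

```python
def simulate_mbs(arrivals, c, l, service_time):
    """Track (queue, batch in service, remaining service slots) slot by slot.

    At each slot boundary: a finished batch leaves, and a new batch of
    min(queue, c) customers starts only if at least l customers are present
    (the minimum batch size rule).
    """
    q, w, h = 0, 0, 0
    for a in arrivals:
        q += a                  # customers arriving during the slot
        if h > 0:
            h -= 1              # service cycle progresses
        if h == 0:
            w = 0               # any finished batch departs
            if q >= l:          # MBS rule
                w = min(q, c)
                q -= w
                h = service_time
    return q, w, h

# With c = 2, l = 2 and 2-slot services: 3 arrivals -> a batch of 2 starts,
# and the single remaining customer keeps waiting for a partner.
print(simulate_mbs([3, 0, 0], c=2, l=2, service_time=2))   # -> (1, 0, 0)
```

The left-over customer illustrates the trade-off analyzed below: with l > 1, lone customers may wait arbitrarily long at low arrival rates.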
3
Analysis
We use a PGF-based approach to analyze the system contents. In the course of this paper, we denote the mass function of an RV X by x(n), so x(n) = Pr[X = n]; X(z) is the corresponding PGF. For example, A denotes the number of arrivals during an arbitrary slot; the corresponding mass function and PGF are denoted by a(n) and A(z) respectively. We denote the number of customers in the queue (server) at the beginning of slot k by Q_k (W_k). Let H_k be the remaining number of slots of the service cycle at the beginning of slot k. If the server is idle during slot k (i.e. W_k = 0), then H_k = 0 and Q_k < l. H_k = 1 means that slot k is the last slot of the service cycle. If the system contains fewer than l customers at the end of that service cycle, the server remains idle until the beginning of the first slot where the system contains at least l customers. Q, W and H are the steady-state RV's corresponding to Q_k, W_k and H_k respectively. V_k(z,x,y) is the joint PGF of Q_k, W_k and H_k, i.e. $V_k(z,x,y) = E\bigl[z^{Q_k} x^{W_k} y^{H_k}\bigr]$. The corresponding steady-state PGF is denoted by V(z,x,y). We first calculate V(z,x,y); from this expression we then obtain the steady-state PGF $U(z) = E\bigl[z^{Q+W}\bigr]$ of the number of customers in the system at the beginning of an arbitrary slot by putting x = z and y = 1 in V(z,x,y). We can summarize the behavior of the system from slot k to slot k+1 in two cases, depending on H_k:
– H_k ≤ 1, i.e. the server is idle during slot k (H_k = 0) or slot k is the final slot of the service cycle (H_k = 1). This case can be separated into three subcases:
  • Q_k + A_k < l ⇒ (Q_{k+1} = Q_k + A_k, W_{k+1} = 0, H_{k+1} = 0),
  • l ≤ Q_k + A_k < c ⇒ (Q_{k+1} = 0, W_{k+1} = Q_k + A_k, H_{k+1} = S_{k+1}),
  • c ≤ Q_k + A_k ⇒ (Q_{k+1} = Q_k + A_k − c, W_{k+1} = c, H_{k+1} = S_{k+1}),
with A_k the number of customer arrivals during slot k and S_{k+1} the length of the service cycle that starts at slot k+1. Note that the PGF of A_k is A(z), since the numbers of customer arrivals in consecutive slots are IID. Analogously, S(z) is the PGF corresponding to S_{k+1}.
– H_k > 1, i.e. the server is serving a batch during slot k and the service cycle is not finished at the end of slot k. This implies that (Q_{k+1} = Q_k + A_k, W_{k+1} = W_k, H_{k+1} = H_k − 1).
This enables us to express V_{k+1}(z,x,y) as a function of V_k(z,x,y):

$$
\begin{aligned}
V_{k+1}(z,x,y) ={}& E\bigl[z^{Q_k+A_k}\{H_k \le 1,\, Q_k+A_k < l\}\bigr] + S(y)\,E\bigl[x^{Q_k+A_k}\{H_k \le 1,\, l \le Q_k+A_k < c\}\bigr] \\
&+ S(y)\frac{x^c}{z^c}\,E\bigl[z^{Q_k+A_k}\{H_k \le 1,\, Q_k+A_k \ge c\}\bigr] \\
&+ \frac{A(z)}{y}\Bigl(V_k(z,x,y) - \sum_{n=0}^{l-1}\Pr[Q_k=n,\,H_k=0]\,z^n - y\,E\bigl[z^{Q_k}x^{W_k}\{H_k=1\}\bigr]\Bigr), \qquad (1)
\end{aligned}
$$
where $E\bigl[z^X\{Y=i\}\bigr] = E\bigl[z^X \mid Y=i\bigr]\Pr[Y=i]$. Let us introduce the following notations:
– $q_0(n) = \lim_{k\to\infty}\Pr[Q_k=n,\,H_k=0]$,
– $R(z,x) = \lim_{k\to\infty}E\bigl[z^{Q_k}x^{W_k}\{H_k=1\}\bigr]$,
– $e(n) = \lim_{k\to\infty}\Pr[Q_k+A_k=n,\,H_k\le 1]$. The corresponding PGF is equal to $A(z)\Bigl(\sum_{n=0}^{l-1}q_0(n)z^n + R(z,1)\Bigr)$.
In order to obtain the steady-state PGF V(z,x,y), the limit k → ∞ is taken in (1), which means that any RV X_k evolves to X if the system is stable (i.e. if the load ρ < 1). We obtain in the steady state:

$$
\begin{aligned}
V(z,x,y) ={}& \bigl(1 - S(y)x^c z^{-c}\bigr)\sum_{n=0}^{l-1} e(n)z^n + S(y)\sum_{n=l}^{c-1} e(n)\bigl(x^n - x^c z^{n-c}\bigr) \\
&+ S(y)x^c z^{-c} A(z)\Bigl(\sum_{n=0}^{l-1} q_0(n)z^n + R(z,1)\Bigr) \\
&+ \frac{A(z)}{y}\Bigl(V(z,x,y) - \sum_{n=0}^{l-1} q_0(n)z^n - y\,R(z,x)\Bigr). \qquad (2)
\end{aligned}
$$
Replacing y by 0 in (2) yields:

$$\sum_{n=0}^{l-1} q_0(n)z^n = \sum_{n=0}^{l-1} e(n)z^n, \qquad (3)$$
since $V(z,x,0) = \sum_{n=0}^{l-1}q_0(n)z^n$, $\left.\frac{\partial V(z,x,y)}{\partial y}\right|_{y=0} = R(z,x)$ and S(0) = 0. Combining (2) and (3) results in:

$$
\begin{aligned}
\Bigl(1 - \frac{A(z)}{y}\Bigr)V(z,x,y) ={}& \Bigl(1 - \frac{A(z)}{y}\Bigr)\sum_{n=0}^{l-1} q_0(n)z^n - A(z)R(z,x) \\
&+ S(y)x^c z^{-c}\bigl(A(z)-1\bigr)\sum_{n=0}^{l-1} q_0(n)z^n \\
&+ S(y)x^c z^{-c} A(z)R(z,1) + S(y)\sum_{n=l}^{c-1} e(n)\bigl(x^n - x^c z^{n-c}\bigr). \qquad (4)
\end{aligned}
$$
Although an analysis of V(z,x,y) is possible, we are particularly interested in an expression for U(z), the steady-state PGF of the number of customers in the system at the beginning of an arbitrary slot; we concentrate on this PGF in the remainder. Since the system contents equal the sum of the queue contents and the number of customers in the server, we obtain U(z) from V(z,x,y) by putting x = z and y = 1 in (4):

$$\bigl(1 - A(z)\bigr)U(z) = -A(z)R(z,z) + A(z)R(z,1). \qquad (5)$$
At this point, R(z,z) and R(z,1) still have to be determined. The first step in the further calculations is to substitute y by A(z) in (4):

$$
\begin{aligned}
A(z)R(z,x) ={}& S(A(z))x^c z^{-c}\bigl(A(z)-1\bigr)\sum_{n=0}^{l-1} q_0(n)z^n + S(A(z))x^c z^{-c} A(z)R(z,1) \\
&+ S(A(z))\sum_{n=l}^{c-1} e(n)\bigl(x^n - x^c z^{n-c}\bigr). \qquad (6)
\end{aligned}
$$
Then the following expression for R(z,z) is found by substituting x by z in (6):

$$A(z)R(z,z) = S(A(z))\Bigl(\bigl(A(z)-1\bigr)\sum_{n=0}^{l-1} q_0(n)z^n + A(z)R(z,1)\Bigr). \qquad (7)$$
Analogously, we find an expression for R(z,1) by substituting x by 1 in (6):

$$A(z)R(z,1) = \frac{S(A(z))\Bigl(\bigl(A(z)-1\bigr)\sum_{n=0}^{l-1} q_0(n)z^n + \sum_{n=l}^{c-1} e(n)\bigl(z^c - z^n\bigr)\Bigr)}{z^c - S(A(z))}. \qquad (8)$$
By combining (5), (7) and (8), we obtain for U(z):

$$U(z) = \frac{S(A(z))}{z^c - S(A(z))}\Bigl(\bigl(z^c - 1\bigr)\sum_{n=0}^{l-1} q_0(n)z^n + S^{*}(A(z))\sum_{n=l}^{c-1} u_n\bigl(z^c - z^n\bigr)\Bigr) \qquad (9)$$

where we have introduced $u_n = S'(1)e(n)$ and where $S^{*}(z) = \frac{S(z)-1}{S'(1)(z-1)}$ is the PGF of the position of an arbitrary slot in the service cycle; the first slot of a service cycle is assumed to have position 0. This expression still contains c unknown constants (q_0(n), 0 ≤ n ≤ l−1, and u_n, l ≤ n ≤ c−1). They can be determined by solving a set of c linear equations. By means of Rouché's theorem, we can prove that the denominator has c zeroes (z_0 = 1, z_1, …, z_{c−1}) inside and on the complex unit disk |z| ≤ 1. Due to the analytic nature of PGF's, these also have to be zeroes of the numerator. Let us define $T(z) = (z^c-1)\sum_{n=0}^{l-1} q_0(n)z^n + S^{*}(A(z))\sum_{n=l}^{c-1} u_n(z^c-z^n)$. Our set of linear equations then consists of the normalization condition U(1) = 1 and the c−1 equations T(z_i) = 0, 1 ≤ i ≤ c−1.

Special Cases

The steady-state PGF of the system contents in case of an immediate-batch service policy is obtained by substituting l by 1:

$$U(z) = \frac{S(A(z))}{z^c - S(A(z))}\Bigl(q_0(0)\bigl(z^c - 1\bigr) + S^{*}(A(z))\sum_{n=1}^{c-1} u_n\bigl(z^c - z^n\bigr)\Bigr),$$

where the unknowns u_n have to be determined as mentioned above. Analogously, for the full-batch service policy we find by substituting l with c:

$$U(z) = \frac{\bigl(z^c - 1\bigr)S(A(z))}{z^c - S(A(z))}\sum_{n=0}^{c-1} q_0(n)z^n.$$
Since $\sum_{n=0}^{c-1} q_0(n)z^n$ is a polynomial of degree c−1, it can be substituted (except for some special cases treated below in the remark) by $K\prod_{i=1}^{c-1}\frac{z-z_i}{1-z_i}$, where the z_i's are the c−1 zeroes of z^c − S(A(z)) different from z_0 = 1. If we invoke the normalization condition, we obtain that K = 1 − ρ, so we have:

$$U(z) = \frac{(1-\rho)\bigl(z^c - 1\bigr)S(A(z))}{z^c - S(A(z))}\prod_{i=1}^{c-1}\frac{z - z_i}{1 - z_i}.$$
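As a numerical sketch of how the full-batch expression above might be evaluated (our own addition, not from the paper): for Poisson arrivals, A(z) = e^{λ(z−1)}, and deterministic m-slot services, S(z) = z^m, the zeroes of z^c − S(A(z)) on the unit disk can be located with the classical fixed-point iteration z ← S(A(z))^{1/c}·e^{2πik/c}, k = 0, …, c−1, which converges when ρ < 1; k = 0 yields the zero z_0 = 1, which is excluded from the product.

```python
import cmath

def full_batch_pgf(c, lam, m):
    """U(z) for the full-batch policy (l = c), Poisson(lam) arrivals per slot
    and deterministic m-slot service cycles; sketch of the product form."""
    SA = lambda z: cmath.exp(lam * m * (z - 1.0))   # S(A(z)) with S(z) = z^m
    rho = lam * m / c
    assert rho < 1.0, "stability condition"
    roots = []                                       # the c-1 zeroes != 1
    for k in range(1, c):
        z = 0.0
        for _ in range(500):                         # fixed-point iteration
            z = SA(z) ** (1.0 / c) * cmath.exp(2j * cmath.pi * k / c)
        roots.append(z)

    def U(z):
        prod = 1.0
        for zi in roots:
            prod *= (z - zi) / (1.0 - zi)
        return (1 - rho) * (z ** c - 1) * SA(z) / (z ** c - SA(z)) * prod

    return U, roots
```

For instance, with c = 4, λ = 0.5 and m = 2 (so ρ = 0.25), each computed root z_i satisfies z_i^c ≈ S(A(z_i)), and U(z) → 1 as z → 1, as the normalization condition requires.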
Remark 1. The previous substitution for a full-batch service policy is not always allowed, namely when some of the z_i's are zeroes of z^c − 1. In that case z − z_i is already a factor of z^c − 1 in the numerator and not necessarily of $\sum_{n=0}^{c-1}q_0(n)z^n$ anymore. The initial system contents then has to be taken into account. To make this clearer, consider the following example: a system that initially contained two customers, with a server capacity c of 10 and
that new customers arrive in batches whose sizes are multiples of 5, i.e. $A(z) = \sum_{j=0}^{\infty} a_{5j}z^{5j}$. The system can then contain 2, 7, 12, 17, 22, … customers, and the same holds for the queue contents. Since q_0(n) = lim_{k→∞} Pr[Q_k = n, H_k = 0], we have that q_0(i) = 0 if i ∈ {0, 1, 3, 4, 5, 6, 8, 9}; only q_0(2) and q_0(7) are non-zero. Their exact values can be determined using the normalization condition and the zeroes of z^c − S(A(z)) that are not zeroes of z^c − 1.

Performance Measures

From the PGF of the system contents, the moments of the system contents can be calculated by using the moment-generating property of PGF's. Also, the mean delay (i.e. the number of slots a customer sojourns in the queue and server, excluding the slot in which the customer arrived) can be found by using the discretised version of Little's theorem (see e.g. Fiems[5]). In the next section, we will use the mean delay to optimize l. Finally, the probability mass function can be approximated by using the dominant-pole approximation (see e.g. Bruneel[2]) or by inverting the PGF numerically (see e.g. Abate[1]). Fig. 2 shows the probability that the system contains more than n customers versus the value of n. It has been verified that these approximations are nearly identical to results obtained by simulation. We further see that the slopes of the tail probabilities in a linear-log plot are independent of the MBS l. This is a consequence of the fact that the dominant pole, a zero of z^c − S(A(z)), is independent of l. These results are important since they provide a good approximation for the loss ratio of the system when the queue has a finite capacity of n + c customers.

Remark 2. From (4), it is also possible to derive other performance measures, e.g. the mean number of customers in a served batch. This could be a good indication of whether the server is working near full capacity or not. If we substitute z by 1 in (4) and multiply this by y, we have:
[Figure: linear-log plot of Pr[U > n] versus n (20–100) for l = 1, 5, 10.]
Fig. 2. Pr[U > n] in terms of n; c = 10, ρ = 0.7, α = 0.9
$$(y-1)\,E\bigl[x^{W}y^{H}\bigr] = (y-1)\sum_{n=0}^{l-1} q_0(n) - y\,R(1,x) + S(y)\,y\Bigl(x^c\Pr[H=1] + \sum_{n=l}^{c-1} e(n)\bigl(x^n - x^c\bigr)\Bigr). \qquad (10)$$
Hereby, we have used the fact that R(1,1) = Pr[H = 1]. Let us denote the steady-state PGF of the number of customers in a served batch by B(x). Replacing y by 1 and using R(1,x) = B(x) Pr[H = 1] in (10) implies:

$$B(x) = x^c + \frac{1}{\Pr[H=1]}\sum_{n=l}^{c-1} e(n)\bigl(x^n - x^c\bigr). \qquad (11)$$
From (11), we easily obtain the corresponding probabilities b(n):

$$b(n) = \begin{cases} \dfrac{e(n)}{\Pr[H=1]} & \text{if } l \le n \le c-1 \\[4pt] 1 - \sum_{n=l}^{c-1}\dfrac{e(n)}{\Pr[H=1]} & \text{if } n = c \\[4pt] 0 & \text{else.} \end{cases}$$

We still need to calculate Pr[H = 1]. To this end, we replace z by 1 in (8); we obtain:
$$\Pr[H=1] = \frac{A'(1)\sum_{n=0}^{l-1} q_0(n) + \sum_{n=l}^{c-1} e(n)(c-n)}{c - S'(1)A'(1)}, \qquad (12)$$
where q_0(n) and e(n) have already been calculated numerically. In Table 1, some probabilities b(n) are shown for a system with c = 10, l = 5, and parameter α = 0.9 of the geometric service distribution. The probabilities are calculated for several loads. We observe that the higher the load, the fuller the batches. Other performance measures, such as the mean number of customers in the queue while a non-full batch is served, can be calculated as well.

Table 1. b(n) for a system with c = 10, l = 5 and α = 0.9

n    b(n), load = 0.1   b(n), load = 0.5   b(n), load = 0.9
5    0.93891            0.46668            0.07885
6    0.05269            0.15156            0.04244
7    0.00502            0.06565            0.02498
8    0.00175            0.04553            0.01936
9    0.00084            0.03781            0.01779
10   0.00079            0.23277            0.81659
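The trend in Table 1 — fuller batches at higher load — can be cross-checked with a small Monte-Carlo sketch (our own illustration; function names are invented, with Poisson arrivals per slot and geometric service-cycle lengths of mean 1/(1 − α), matching S(z) = (1 − α)z/(1 − αz)):

```python
import math
import random

def full_batch_fraction(lam, alpha, c, l, slots, seed=1):
    """Estimate b(c): the fraction of served batches that are completely full."""
    rng = random.Random(seed)

    def poisson(mean):
        # Knuth's multiplication method, adequate for small means
        limit, k, p = math.exp(-mean), 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                return k
            k += 1

    q, h, sizes = 0, 0, []
    for _ in range(slots):
        q += poisson(lam)
        if h > 0:
            h -= 1
        if h == 0 and q >= l:            # MBS rule: start only with >= l waiting
            w = min(q, c)
            q -= w
            sizes.append(w)
            h = 1
            while rng.random() < alpha:  # geometric cycle length, mean 1/(1-alpha)
                h += 1
    return sizes.count(c) / len(sizes)

# alpha = 0.9 gives a mean service of 10 slots; with c = 10 the load equals lam
low = full_batch_fraction(0.1, 0.9, c=10, l=5, slots=20000)
high = full_batch_fraction(0.9, 0.9, c=10, l=5, slots=20000)
```

As in the b(10) row of Table 1, `high` exceeds `low` by a wide margin.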
4
Optimization of MBS
In this section, we investigate through some numerical examples how an appropriate choice of the MBS l is influenced by the distribution of the service times and by the load. Note that other parameters can also have an influence, but we do not investigate them in this paper. We assume a Poisson distribution for the number of arrivals, i.e. the probability of having n arrivals during a slot is

$$e^{-A'(1)}\,\frac{\bigl(A'(1)\bigr)^n}{n!}.$$

We consider three cases for the distribution of the service times:
– S(z) = z: service cycles last exactly one slot.
– $S(z) = \frac{(1-\alpha)z}{1-\alpha z}$: the lengths of service cycles have a geometric distribution (with mean $1/(1-\alpha)$ slots).
– S(z) = z^m, m > 1: service cycles last exactly m slots.

We define the optimal MBS as the MBS that minimizes the mean system contents and hence the mean delay of the customers.

4.1
Single-Slot Service Times
Fig. 3 shows the mean system contents (left pane) and the mean delay (right pane) in terms of the load for a system with a server capacity c of 10. In the figure we plotted the curves for three MBS's. The figure reflects that the system performs best with MBS l equal to one, which corresponds to an immediate-batch service policy. This is trivial, since waiting is not useful in this case: the service cycle is finished at the end of the slot in which it started, so the best policy is to serve every slot. In this case, we obtain for U(z) with l = 1 the same expression as for a system with c servers of capacity one. In Fig. 3, we also observe that, for very low load ρ and l > 1, the mean system contents does not converge to zero, although, for load equal to zero, the mean system contents is zero. Hence, we have a discontinuity at ρ = 0. It can intuitively be shown that for an MBS l > 1 and ρ → 0, the mean system contents is about (l − 1)/2. This is also observed from the figure. If we investigate the mean delay of the system, we again remark a special situation for ρ → 0 and l > 1: the mean delay goes to infinity. This is caused by the long time customers might have to wait to form a batch of at least l customers when the load is low. If the load increases, the mean delay diminishes. For still higher loads the queueing effect becomes dominant, leading to an increasing mean delay.

4.2
Geometric Service Times
In Fig. 4, the mean delay is plotted versus the load. We assume a server capacity c equal to 10. The mean service time is equal to 5 slots (i.e. α = 0.8) in the left part, and 10 slots (i.e. α = 0.9) in the right part. The figure illustrates that l = 1 is not always the best choice in this case. We observe that a larger MBS is preferable for a higher load. So, there are transition points in the load from where the system performs better with a certain MBS i (≤ c) than with MBS i − 1. From these loads onwards, the positive effect of waiting longer to provide a
[Figure: two panels, mean system contents (left) and mean delay (right) versus load (0–1), curves for l = 1, 5, 10.]
Fig. 3. Server capacity c = 10; service cycles last exactly 1 slot; left part: mean system contents; right part: mean delay 40
[Figure: two panels, mean delay versus load, curves for l = 1, 5, 10.]
Fig. 4. Mean delay versus the load for several MBS’s with server capacity c = 10; α = 0.8 in the left part and α = 0.9 in the right part
better utilisation of the server outweighs the negative effect that customers might have to wait a long time before l customers are present. We remark that the profit of waiting is larger for larger mean service times; the transition points then also appear at a lower load.

4.3
Deterministic Service Times of m Slots
In this last example, the service times are deterministically equal to m slots. In Fig. 5, we plotted the mean delay as a function of the load. The service time m is equal to 5 slots in the left part and 10 slots in the right part. These are the same mean service times as in Fig. 4. Though it appears as if there are no transition points in this figure, there in fact are: they appear at a very high load
[Figure: two panels, mean delay versus load (0–1), curves for l = 1, 5, 10.]
Fig. 5. Mean delay versus the load for several MBS’s with server capacity c = 10; m = 5 in the left part and m = 10 in the right part
and the performance gain is negligible. This difference with respect to geometric service times is caused by the variance of the service times. It is well known that in many queueing systems a bigger variance of the service times causes a higher mean system contents. Apparently, the lower the MBS l, the more this effect plays a role.
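The service-time PGFs used in these examples can be sanity-checked numerically; the sketch below (our own) recovers the mean S′(1) by a one-sided finite difference at z = 1:

```python
def pgf_mean(S, eps=1e-6):
    """Approximate S'(1), the mean of the distribution with PGF S."""
    return (S(1.0) - S(1.0 - eps)) / eps

alpha = 0.8
geometric = lambda z: (1 - alpha) * z / (1 - alpha * z)   # S(z) = (1-a)z/(1-az)
deterministic = lambda z: z ** 5                          # S(z) = z^m with m = 5

# Both choices used above (alpha = 0.8 and m = 5) have a mean of 5 slots
print(round(pgf_mean(geometric), 3), round(pgf_mean(deterministic), 3))  # -> 5.0 5.0
```

For the geometric case this confirms S′(1) = 1/(1 − α), the correspondence between α and the mean service times quoted for Figs. 4 and 5.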
5
Conclusion
In this paper, we have studied a system with a batch server operating under the MBS service policy. Specifically, we have calculated the steady-state PGF of the system contents at the beginning of an arbitrary slot for the M^X|GI^{l,c}|1 queueing model. From this PGF we obtained the required performance measures. We have also analyzed the number of customers in a served batch. Furthermore, by means of some case studies in section 4, we have investigated the influence of the load and of the distribution of the service times on the optimal choice of the MBS l. This has enabled us to conclude:
– If the service times are equal to one slot, then l = 1 is the optimal MBS.
– If the service times can last longer than one slot, the load influences the optimal MBS. There are then transition points in the load from where the system performs better with a certain MBS i (≤ c) than with MBS equal to i − 1.
Obviously, other parameters also influence the optimal MBS, e.g. the distribution of the number of arrivals during a slot. This is a topic for future research. Furthermore, we also plan to incorporate the influence of the number of customers in a served batch on the service times in a future analysis. This will obviously complicate the analysis.
References

1. Abate, J.: Numerical inversion of probability generating functions. Operations Research Letters 12(4), 245–251 (1992)
2. Bruneel, H., Steyaert, B., Desmet, E., Petit, G.H.: Analytic derivation of tail probabilities for queue lengths and waiting times in ATM multiserver queues. European Journal of Operational Research 76, 563–572 (1994)
3. Chaudhry, M.L., Templeton, J.G.C.: A First Course in Bulk Queues. Wiley, New York (1983)
4. Dümmler, M.A., Schömig, A.K.: Using discrete-time analysis in the performance evaluation of manufacturing systems. In: 1999 International Conference on Semiconductor Manufacturing Operational Modeling and Simulation (SMOMS '99), San Francisco, California (1999)
5. Fiems, D., Bruneel, H.: A note on the discretization of Little's result. Operations Research Letters 30, 17–18 (2002)
6. Gupta, U.C., Goswami, V.: Performance analysis of finite buffer discrete-time queue with bulk service. Computers & Operations Research 29, 1331–1341 (2002)
7. Kim, N.K., Chaudhry, M.L.: Equivalences of batch-service queues and multi-server queues and their complete simple solutions in terms of roots. Stochastic Analysis and Applications 24, 753–766 (2006)
8. Kleinrock, L.: Queueing Systems, Volume I: Theory. Wiley Interscience, New York (1975)
9. Neuts, M.F.: A general class of bulk queues with Poisson input. Ann. Math. Stat. 38, 759–770 (1967)
10. Qiao, C.M., Yoo, M.S.: Optical burst switching (OBS) – a new paradigm for an optical Internet. Journal of High Speed Networks 8(1), 69–84 (1999)
Derivatives of Blocking Probabilities for Multi-service Loss Systems and Their Applications V.B. Iversen1 and S.N. Stepanov2 COM · DTU, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
[email protected] 2 Sistema Telecom 125047, Moscow, 1st Tverskay-Yamskaya 5, Russia
[email protected] 1
Abstract. Derivatives of blocking probabilities of multi-service loss networks are important for traffic engineering. An explicit formula for the derivatives of blocking probabilities with respect to offered traffic is obtained, expressed through stationary global state probabilities. The approach is based on the convolution algorithm. It allows us to find expressions for the derivatives in a much easier way than known so far. It is briefly shown how derivatives can be applied for the approximate evaluation of performance measures and for studying the error in performance-measure estimation caused by small changes of offered traffic. The results can also be applied to network optimization. Keywords: Derivatives, blocking probabilities, multi-service loss networks, approximate evaluation.
1
Introduction
The derivatives of blocking probabilities of multi-service loss systems are important for solving optimization problems, but obtaining them in a suitable form is a complicated task. There are only few results in this field [1]–[6]. Complete solutions exist for models whose performance measures are expressed through the Erlang-B function [1,3]. The derivative of the Erlang-B function E_{n,A} with respect to the intensity of offered traffic A is:

$$\frac{dE_{n,A}}{dA} = \frac{n - A\bigl(1 - E_{n,A}\bigr)}{A}\cdot E_{n,A}. \qquad (1)$$

Here n is the capacity of the link expressed in basic bandwidth units (BBU). The requests for bandwidth arrive according to Poisson processes. In (1) the value of the derivative of the Erlang-B formula is expressed through the value of the Erlang-B function itself. This property simplifies the estimation of derivatives: after calculating E_{n,A} we can easily find the derivative of E_{n,A} with respect to the offered traffic.
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 260–268, 2007. © Springer-Verlag Berlin Heidelberg 2007
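As a sketch (ours, not from this paper), E_{n,A} can be computed with the standard stable recursion E_0 = 1, E_k = A·E_{k−1}/(k + A·E_{k−1}), after which formula (1) gives the derivative directly; a finite difference serves as a cross-check:

```python
def erlang_b(n, A):
    """Erlang-B blocking probability via the standard stable recursion."""
    E = 1.0
    for k in range(1, n + 1):
        E = A * E / (k + A * E)
    return E

def erlang_b_derivative(n, A):
    """dE_{n,A}/dA by formula (1): (n - A(1 - E)) / A * E."""
    E = erlang_b(n, A)
    return (n - A * (1.0 - E)) / A * E

# n = 1 has the closed form E = A/(1+A), so dE/dA = 1/(1+A)^2
print(erlang_b(1, 1.0), erlang_b_derivative(1, 1.0))   # -> 0.5 0.25
```

The derivative thus comes almost for free once E_{n,A} is available, which is exactly the property the text highlights.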
For multi-rate systems the situation is more complicated because (i) we do not have explicit expressions for the performance measures through the values of the input parameters and (ii) we may have an arbitrary number of input flows. Results similar to (1) are obtained in [4], where, for a single-link multi-rate model, simple expressions are found for the derivative of the blocking probability of flow number i with respect to the traffic intensity of the i'th flow. These derivatives are expressed through the global state probabilities, i.e. the distribution of the total number of occupied basic bandwidth units. The global state probabilities, together with the performance measures, are obtained when the convolution algorithm [10] is carried out, so in this case the values of the derivatives can be found together with the main stationary performance measures of the model. Unfortunately, obtaining these particular derivatives is not sufficient for studying the estimation error of performance measures caused by small changes of the offered traffic, nor for solving other similar problems related to the practical usage of derivatives. To do so we need to know the derivatives of the blocking probabilities with respect to the intensities of all traffic streams offered to the link. This problem was studied in general form in [2,6], but without presenting an algorithm for obtaining the values of the derivatives; those results have a certain theoretical value, but it is difficult to apply them to practical engineering problems. In [5] a computational scheme is presented, for a multi-rate one-link model, for the estimation of the derivative of any blocking probability with respect to the intensity of any stream, but realizing the suggested approach requires much more complex calculations than in the Erlang case. In this paper, based on the ideas of [4], we present a new algorithm for finding the derivatives of blocking probabilities with respect to the intensity of any traffic stream of multi-rate models.
The complexity of the problem is reduced to the same complexity as for the classical single traffic stream. We illustrate the realization of the suggested approach by the example of a one-link model with full accessibility and an arbitrary number of traffic streams, but the proposed method can also be applied to any model whose performance measures are estimated by means of the convolution algorithm [10]. The paper is organized as follows. In Section 2 the model is described. The convolution algorithm is formulated in Section 3. In Section 4 we derive the basic formulæ for the derivatives of blocking probabilities with respect to the intensity of any traffic stream. In Section 5 we present possible applications of the results to the approximate evaluation of performance measures and to studying the estimation error of performance measures caused by small variations of the offered traffic.
2 Model Description
Let us consider a single-link traffic model, where the link transmission capacity is represented by n basic bandwidth units, and let us suppose that we have K incoming Poisson flows of calls with traffic intensities Ak , k = 1, 2, . . . , K. A call of k’th flow uses dk bandwidth units for the time of connection. It is supposed
that calls from the k'th stream are blocked when more than n − d_k bandwidth units are occupied. Without loss of generality we shall assume that all holding times are exponentially distributed with mean value chosen as the time unit; it is known that the model considered is insensitive to the distribution of the holding time, and each flow may furthermore have an individual mean holding time. Let i_k(t) denote the number of calls of the k'th flow served at time t. The model is described by a K-dimensional Markovian process of the type r(t) = {i_1(t), i_2(t), . . . , i_K(t)} with state space S consisting of vectors (i_1, i_2, . . . , i_K), where i_k is the number of calls of the k'th flow being served by the link under stationary conditions. The state space S is defined as follows:

    (i_1, i_2, . . . , i_K) ∈ S,    i_k ≥ 0, k = 1, 2, . . . , K,    Σ_{k=1}^{K} i_k d_k ≤ n .
Let P(i_1, i_2, . . . , i_K) denote the unnormalised values of the stationary probabilities of r(t). After normalisation, the value p(i_1, i_2, . . . , i_K) denotes the mean proportion of time when exactly {i_1, i_2, . . . , i_K} connections of the different types are established. For state (i_1, i_2, . . . , i_K), let i denote the total number of occupied bandwidth units, i = i_1 d_1 + i_2 d_2 + · · · + i_K d_K. The process of transmission of requests for bandwidth of the k'th flow is described by the blocking probability π_k, k = 1, 2, . . . , K, and by m_k, the mean number of bandwidth units occupied by calls of the k'th flow. Their formal definitions through the values of the state probabilities are as follows (here and further, summations are over all states (i_1, . . . , i_K) ∈ S satisfying the formulated condition, and lower-case characters denote the normalised values of the state probabilities):

    π_k = Σ_{i + d_k > n} p(i_1, . . . , i_K),    m_k = Σ_{(i_1,...,i_K) ∈ S} p(i_1, . . . , i_K) i_k d_k .    (2)
Because m_k = A_k d_k (1 − π_k) we only consider the problem of estimating π_k in the following. There are many algorithms for the estimation of π_k. All of them are based on the product form relation valid for P(i_1, . . . , i_K):

    P(i_1, i_2, . . . , i_K) = P(0, . . . , 0) · (A_1^{i_1}/i_1!) · (A_2^{i_2}/i_2!) · . . . · (A_K^{i_K}/i_K!),    (i_1, i_2, . . . , i_K) ∈ S.    (3)
The best calculation scheme for the model introduced is the recurrence algorithm first obtained in [7] and later also derived in [8,9]. This algorithm exploits the fact that the performance measures (2) can be found if, for the process r(t), we know the probabilities p(i) of being in a state where exactly i bandwidth units are occupied:

    p(i) = Σ_{i_1 d_1 + · · · + i_K d_K = i} p(i_1, i_2, . . . , i_K) .
The corresponding formulæ are as follows:

    π_k = Σ_{i=n−d_k+1}^{n} p(i),    k = 1, 2, . . . , K.    (4)
The unnormalised values of P(i) are found by the recurrence:

    P(i) = (1/i) Σ_{k=1}^{K} A_k d_k P(i − d_k) I(i − d_k ≥ 0),    i = 1, 2, . . . , n,    (5)

where we usually let P(0) = 1, and the function I(·) equals one if the formulated condition is fulfilled and otherwise equals zero. A numerically better and more stable procedure is to normalize the state probabilities during each iteration. The alternative approach for the estimation of π_k, k = 1, 2, . . . , K, is based on the convolution algorithm [10]. This scheme makes it very easy to find, in one run, the values of the individual blocking probabilities π_k, k = 1, 2, . . . , K, and the values of the derivatives ∂π_i/∂A_j, i = 1, . . . , K, j = 1, . . . , K. We will show this in Section 4, but first we give a short description of the convolution algorithm.
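The recursion (5) and the blocking formula (4) translate directly into code. A minimal Python sketch (illustrative, not the authors' implementation; for large n one would normalize during each iteration, as the text notes):

```python
def global_state_probabilities(n, A, d):
    """Normalised global state probabilities p(0..n) for a full-access link
    of n BBUs, via recursion (5): P(i) = (1/i) * sum_k A_k d_k P(i - d_k)."""
    P = [0.0] * (n + 1)
    P[0] = 1.0                       # P(0) = 1 by convention
    for i in range(1, n + 1):
        P[i] = sum(Ak * dk * P[i - dk]
                   for Ak, dk in zip(A, d) if i - dk >= 0) / i
    norm = sum(P)                    # normalize once at the end
    return [x / norm for x in P]

def blocking_probabilities(n, A, d):
    """Individual blocking probabilities (4): pi_k = sum_{i=n-d_k+1}^{n} p(i)."""
    p = global_state_probabilities(n, A, d)
    return [sum(p[n - dk + 1:]) for dk in d]
```

For a single stream with d_1 = 1 this reduces to the Erlang-B probability, which gives a quick sanity check of the code.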
3 Convolution Algorithm
For two vectors of the same dimension, x = {x(0), x(1), . . . , x(a)} and y = {y(0), y(1), . . . , y(a)}, we define the convolution operator that, applied to x and y, gives the vector z with components {z(0), z(1), . . . , z(b)} as follows:

    z(i) = x(0) y(i) + x(1) y(i−1) + . . . + x(i−1) y(1) + x(i) y(0),    i = 0, 1, . . . , b,

where b ≤ a. In the following, the term convolution means usage of the convolution operator defined in this way. Because it is known that the solution of the system of state equations has the product form (3), it can be found by means of an algorithm that we shall refer to as the convolution algorithm [10]. It consists of the following three steps:

1. For the k'th stream (k = 1, 2, . . . , K) calculate its individual unnormalized state probabilities {P_k(0), P_k(1), . . . , P_k(n)} as if it were the only traffic stream offered to the n bandwidth units. For the determination of P_k(r) we use the product form expression (3) considered for only one incoming flow, number k. We have the following relation:

       P_k(r) = A_k^i / i!   for r = i d_k, i = 0, 1, . . . , ℓ_k,
       P_k(r) = 0            otherwise.                              (6)

   In (6) ℓ_k is the maximum number of calls of the k'th stream that can be served simultaneously; clearly, ℓ_k is the integer part of n/d_k. Let us call {P_k(0), P_k(1), . . . , P_k(n)} the individual state distribution for call stream number k.

2. In any fixed order, make successive convolutions of all K individual state distributions. Let P(r) be the vector obtained after convolving all K individual distributions. Here r, r = 0, 1, . . . , n, is the total number of bandwidth units occupied by all calls. Let P_{K\k}(r) be the vector obtained after convolving
all K individual distributions except stream number k. Here r = 0, 1, . . . , n is the total number of bandwidth units occupied by all calls except those of stream number k.

3. If we convolve the vector P_{K\k}(r), r = 0, 1, . . . , n, with the vector P_k(ℓ), ℓ = 0, 1, . . . , n, of individual state probabilities for stream number k, we obtain after normalization the system state distribution p(r), r = 0, 1, . . . , n, and the individual performance measures π_k, m_k of the last stream, having number k.

This algorithm is an alternative to the calculation scheme (5). For this particular case it is not as efficient as (5), but it is much more general with respect to the number of models to which the convolution algorithm can be applied. The performance measures for all streams can be found by performing the above-mentioned steps for each stream, putting it at the end of the convolution procedure. Let us denote by N_m the computational effort required to find the performance measures for all streams; we measure N_m by the number of multiplications. Let us denote by N_c the number of convolutions and by N_cm the mean number of multiplications in performing one convolution. Then it is clear that N_m = N_c N_cm. For the convolution algorithm N_c = K(K − 1), so the required computational effort is N_m = K(K − 1) N_cm. In [11] it is shown how to decrease the computational effort in the implementation of the convolution algorithm by decreasing both N_c and N_cm. The total number of convolutions can be made equal to N_c = 4K − 6 by storing some of the intermediate results, and N_cm can be decreased by truncation of the used state space.
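The three steps above can be sketched as follows (a straightforward, unoptimized Python rendering of our own; the intermediate-result storage and truncation savings of [11] are not implemented):

```python
import math

def individual_distribution(n, Ak, dk):
    """Individual state distribution (6) for one stream:
    P_k(i*dk) = Ak^i / i!, zero elsewhere."""
    P = [0.0] * (n + 1)
    for i in range(n // dk + 1):          # ell_k = floor(n / dk)
        P[i * dk] = Ak ** i / math.factorial(i)
    return P

def convolve(x, y):
    """Convolution operator, truncated at the link capacity (b = n)."""
    n = len(x) - 1
    return [sum(x[j] * y[i - j] for j in range(i + 1)) for i in range(n + 1)]

def blocking_by_convolution(n, A, d):
    """Blocking probability of each stream: convolve all other streams,
    then add stream k last and read off its blocking states."""
    pis = []
    for k in range(len(A)):
        rest = [1.0] + [0.0] * n          # neutral element of convolution
        for j in range(len(A)):
            if j != k:
                rest = convolve(rest, individual_distribution(n, A[j], d[j]))
        full = convolve(rest, individual_distribution(n, A[k], d[k]))
        norm = sum(full)
        pis.append(sum(full[n - d[k] + 1:]) / norm)
    return pis
```

For a single-link full-access model this agrees with the recursion (5), which provides a convenient cross-check of both schemes.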
4 Algorithm for Derivatives of Individual Blocking Probabilities
The auxiliary expressions obtained during the realization of the convolution algorithm result in a very simple scheme for the determination of the derivatives of the individual blocking probabilities π_k, k = 1, 2, . . . , K. Let us show this by the example of the model considered. Let us denote by N = Σ_{r=0}^{n} P(r) the normalizing constant. Then, in accordance with the definition, the derivative of π_k with respect to A_j can be found through the following expressions:

    ∂π_k/∂A_j = ∂/∂A_j [ (P(n) + P(n−1) + . . . + P(n−d_k+1)) / N ]    (7)
              = (1/N^2) × [ N × ∂/∂A_j (P(n) + P(n−1) + . . . + P(n−d_k+1))
                            − (P(n) + P(n−1) + . . . + P(n−d_k+1)) × ∂N/∂A_j ] .

Then, in accordance with the definition of the convolution algorithm, we have for r = 0, 1, . . . , n:

    P(r) = P_{K\j}(r) P_j(0) + P_{K\j}(r−1) P_j(1) + . . . + P_{K\j}(0) P_j(r) .    (8)
Because P_{K\j}(r) does not depend on A_j, and due to the definition (6) of P_j(r), its derivative has the form:

    ∂P_j(r)/∂A_j = P_j(r−d_j),    r = d_j, d_j + 1, . . . , n .    (9)

From (8) and (9) we have the following relation for the derivative of the unnormalised stationary probability P(r) with respect to the intensity of the j'th traffic stream:

    ∂P(r)/∂A_j = P_{K\j}(r−d_j) P_j(0) + P_{K\j}(r−d_j−1) P_j(1) + . . . + P_{K\j}(0) P_j(r−d_j)
               = P(r−d_j) .    (10)

Using (10) we finally obtain from (7):

    ∂π_k/∂A_j = p(n−d_j) + p(n−d_j−1) + . . . + p(n−d_j−d_k+1)    (11)
                − {p(n−d_j) + p(n−d_j−1) + . . . + p(0)} · {p(n) + p(n−1) + . . . + p(n−d_k+1)}
              = p(n−d_j) + p(n−d_j−1) + . . . + p(n−d_j−d_k+1) − (1−π_j) π_k .

The usage of (11) allows us to obtain the value of the derivative of any individual blocking probability with respect to the intensity of any traffic stream served by the link in a single run of the recursion (5) used for the estimation of the individual blocking probabilities.
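Formula (11) can be checked against a numerical derivative. A small self-contained Python sketch (ours, for illustration; the recursion (5) is repeated so the block runs on its own):

```python
def state_probs(n, A, d):
    """Normalised global state probabilities p(0..n) via recursion (5)."""
    P = [0.0] * (n + 1)
    P[0] = 1.0
    for i in range(1, n + 1):
        P[i] = sum(a * dd * P[i - dd] for a, dd in zip(A, d) if i >= dd) / i
    s = sum(P)
    return [x / s for x in P]

def blocking_derivative(n, A, d, k, j):
    """d(pi_k)/d(A_j) from (11):
    p(n-d_j) + ... + p(n-d_j-d_k+1)  -  (1 - pi_j) * pi_k."""
    p = state_probs(n, A, d)
    pi_j = sum(p[n - d[j] + 1:])
    pi_k = sum(p[n - d[k] + 1:])
    head = sum(p[max(0, n - d[j] - d[k] + 1): n - d[j] + 1])
    return head - (1.0 - pi_j) * pi_k
```

The derivative is exact (not an approximation), so a central finite difference of the blocking probability computed from (5) agrees with `blocking_derivative` up to discretization noise.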
5 Examples of Usage of Derivatives for Multi-service Loss Systems
Let us suppose that the performance of a multi-service link is described by a function f(π_1, π_2, . . . , π_K), where π_k, k = 1, 2, . . . , K, are the individual blocking probabilities for stream number k, each depending on the intensities of the arriving calls A_1, A_2, . . . , A_K, so we can write π_k = π_k(A_1, A_2, . . . , A_K). Examples of specific types of the function f(·) are the following:

1. The individual blocking probability for stream number k:

    f(π_1, π_2, . . . , π_K) = π_k .    (12)

2. The mean number of resource units occupied by calls of the k'th flow:

    f(π_1, π_2, . . . , π_K) = A_k d_k (1 − π_k) .    (13)

3. The average revenue from the calls being served:

    f(π_1, π_2, . . . , π_K) = Σ_{k=1}^{K} c_k A_k (1 − π_k),    (14)

where c_k is the rate at which class-k calls generate revenue.
Because we know the derivatives of the individual blocking probabilities with respect to the intensity of any traffic stream served by the link, we can study how changes in the arrival rates affect the link performance described by the function f(·). Let us for simplicity choose the type of f(·) defined by (12):

    π_k(A_1 + ΔA_1, A_2 + ΔA_2, . . . , A_K + ΔA_K) ≈ π_k(A_1, A_2, . . . , A_K)    (15)
        + (∂π_k/∂A_1) ΔA_1 + (∂π_k/∂A_2) ΔA_2 + . . . + (∂π_k/∂A_K) ΔA_K .
The above result is of great practical importance. It solves at least three problems:

1. Discrete-event simulation. Suppose we want to forecast blocking probabilities, but we only have a rough idea of what the offered load (A_1, A_2, . . . , A_K) is going to be. Then we need to simulate the system not only for the values (A_1, A_2, . . . , A_K) but also at the perturbed loads (A_1 + ΔA_1, A_2, . . . , A_K), (A_1, A_2 + ΔA_2, . . . , A_K), and so on. This requires K + 1 simulation runs and much computer time. An alternative approach is to simulate π_k(A_1, A_2, . . . , A_K) once and then to use relation (15).

2. Approximate calculation. If we calculate the exact value of the probability π_k(A_1, A_2, . . . , A_K) for given input values (A_1, A_2, . . . , A_K), then it is possible to use the linear function (15) for the approximate calculation of blocking probabilities in some neighbourhood of the values (A_1, A_2, . . . , A_K), determined by the expression (A_1 ± ΔA_1, A_2 ± ΔA_2, . . . , A_K ± ΔA_K). Because the function π_k(A_1, A_2, . . . , A_K) changes smoothly with respect to the components (A_1, A_2, . . . , A_K), good accuracy of estimation can be attained for comparatively large values of ΔA_k, k = 1, 2, . . . , K.

3. Error of measurements. Very often, in solving dimensioning problems, we use as input parameters the values of the traffic intensities (A_1, A_2, . . . , A_K), found with some error depending on the statistical procedures used and the confidence level. So instead of (A_1, A_2, . . . , A_K) we need to use (A_1 ± ΔA_1, A_2 ± ΔA_2, . . . , A_K ± ΔA_K). For this case relation (15) allows us to find the error of performance measure estimation caused by the measurement error of the model's input parameters.

Let us consider a numerical example that, for the model studied, shows the accuracy of the approximate estimation of the blocking probabilities with the help of derivatives. Figure 1 shows the exact value of π_k(A_1 + ΔA_1, A_2 + ΔA_2, . . . , A_K + ΔA_K) for k = 3, K = 5 and its approximation found by (15). The values of the other parameters are as follows: n = 500, A_1 = 100, d_1 = 1, A_2 = 50, d_2 = 2, A_3 = 25, d_3 = 4, A_4 = 20, d_4 = 5, A_5 = 10, d_5 = 10, with ΔA_k defined by the relation:
[Figure: exact and approximate blocking probability plotted against x from 0.900 to 1.100; vertical axis "Blocking" from 0 to 0.14; curves labelled "Exact value of blocking" and "Approximate value of blocking".]

Fig. 1. The approximate and exact value of blocking
ΔA_k = A_k(x − 1). The value of x in the figure varies from 0.9 to 1 (negative ΔA_k) and from 1 to 1.1 (positive ΔA_k). The results presented show good accuracy of the approximation, especially with increasing blocking probabilities.
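An experiment in the spirit of Figure 1 can be reproduced with a few lines of code (our own sketch, using the paper's parameter set but a single scaling point x = 1.05 instead of the full sweep; function names are ours):

```python
def state_probs(n, A, d):
    """Normalised global state probabilities p(0..n) via recursion (5)."""
    P = [0.0] * (n + 1)
    P[0] = 1.0
    for i in range(1, n + 1):
        P[i] = sum(a * dd * P[i - dd] for a, dd in zip(A, d) if i >= dd) / i
    s = sum(P)
    return [x / s for x in P]

def pi_k(n, A, d, k):
    """Exact blocking probability of stream k, formula (4)."""
    p = state_probs(n, A, d)
    return sum(p[n - d[k] + 1:])

def d_pi(n, A, d, k, j):
    """Derivative (11) of pi_k with respect to A_j."""
    p = state_probs(n, A, d)
    head = sum(p[max(0, n - d[j] - d[k] + 1): n - d[j] + 1])
    pij = sum(p[n - d[j] + 1:])
    return head - (1.0 - pij) * sum(p[n - d[k] + 1:])

def pi_k_linear(n, A, d, k, dA):
    """First-order approximation (15) of pi_k at the perturbed load A + dA."""
    return pi_k(n, A, d, k) + sum(d_pi(n, A, d, k, j) * dA[j]
                                  for j in range(len(A)))
```

With n = 500, A = (100, 50, 25, 20, 10), d = (1, 2, 4, 5, 10) and ΔA_k = A_k(x − 1), the linear approximation tracks the exact value closely for x near 1, in line with Fig. 1.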
6 Conclusion
In this paper we have presented a new algorithm for finding the derivatives of blocking probabilities with respect to the intensity of any traffic stream of multi-rate models. The complexity of the problem is reduced to the same complexity as for the classical single traffic stream. The results obtained give explicit expressions for the derivatives of blocking probabilities through the stationary probabilities of the total number of occupied bandwidth units. We demonstrate the realization of the suggested approach by the example of a one-link model with an arbitrary number of traffic streams and full accessibility of the calls to the link, but the proposed method can be realized for any model whose performance measures are estimated according to the convolution algorithm [10]. It is briefly shown how derivatives can be applied for the approximate evaluation of performance measures and for studying the error of performance measure estimation caused by small changes of the offered traffic. These results can also be applied for solving economical problems and for network optimization.
References

1. Akimaru, H., Takahashi, H.: Asymptotic Expansion for Erlang Loss Function and its Derivatives. IEEE Transactions on Communications COM-29(9), 1257–1260 (1981)
2. Virtamo, J.: Reciprocity of Blocking Probabilities in Multiservice Loss Systems. IEEE Transactions on Communications 36(10), 1257–1260 (1988)
3. Esteves, J.S., Craveirinha, J., Cardoso, D.: Computing Erlang-B Function Derivatives in the Number of Servers. Communications in Statistics – Stochastic Models 11(2), 311–331 (1995)
4. Iversen, V.B.: Derivatives of Blocking Probabilities of Multi-Service Loss Systems. In: NTS-12, 12th Nordic Teletraffic Seminar, Otnäs, Esbo, Finland, August 22–24 (1995)
5. Nilsson, A.A., Perry, M., Gersht, A., Iversen, V.B.: On Multi-rate Erlang-B Computation. In: ITC-16, 16th International Teletraffic Congress, Edinburgh, Scotland (1999)
6. Ross, K.W.: Multiservice Loss Models for Broadband Telecommunication Networks. Springer, Heidelberg (1995)
7. Fortet, R., Grandjean, Ch.: Congestion in a Loss System when some Calls want Several Devices Simultaneously. Electrical Communications 39, 513–526 (1964). Paper presented at ITC-4, Fourth International Teletraffic Congress, London, UK, 15–21 July 1964
8. Kaufman, J.S.: Blocking in a Shared Resource Environment. IEEE Transactions on Communications COM-29, 1474–1481 (1981)
9. Roberts, J.W.: A Service System with Heterogeneous User Requirements – Applications to Multi-Service Telecommunication Systems. In: Pujolle, G. (ed.) Performance of Data Communication Systems and their Applications, pp. 423–431. North-Holland, Amsterdam (1981)
10. Iversen, V.B.: The Exact Evaluation of Multi-Service Loss Systems with Access Control. NTS-7, Lund, Sweden; and Teleteknik 31(2), 56–61 (1987)
11. Iversen, V.B., Stepanov, S.N.: The Usage of Convolution Algorithm with Truncation for Estimation of Individual Blocking Probabilities in Circuit Switched Telecommunication Networks. In: Proceedings of the 15th International Teletraffic Congress (ITC-15), Washington, USA, pp. 1327–1336 (1997)
Rare Events of Gaussian Processes: A Performance Comparison Between Bridge Monte-Carlo and Importance Sampling

Stefano Giordano¹, Massimiliano Gubinelli², and Michele Pagano¹

¹ Università di Pisa, Dipartimento di Ingegneria dell'Informazione, Via Caruso 16, I-56126 Pisa, Italy
{s.giordano,m.pagano}@iet.unipi.it
² Université de Paris-Sud, Equipe de probabilités, statistique et modélisation, Bâtiment 425, F-91405 Orsay Cedex, France
[email protected]
Abstract. A goal of modern broadband networks is their ability to provide stringent QoS guarantees to different classes of users. This feature is often related to events with a small probability of occurring, but with severe consequences when they do occur. In this paper we focus on overflow probability estimation and analyze the performance of Bridge Monte-Carlo (BMC), an alternative to Importance Sampling (IS), for the Monte-Carlo estimation of rare events involving Gaussian processes. After a short description of the BMC estimator, we prove that the proposed approach has clear advantages over the widespread single-twist IS in terms of variance reduction. Finally, to better highlight the theoretical results, we present some simulation outcomes for a single server queue fed by fractional Brownian motion, the canonical model in the framework of long range dependent traffic. Keywords: Rare Event Simulation, Gaussian Processes, Importance Sampling, Most Likely Path, Bridge Monte-Carlo.
1 Introduction
In the framework of teletraffic engineering, many challenging issues have recently arisen as a consequence of the evolution of network architectures and services. First of all, the last decade was marked by the search for global network architectures, which should handle heterogeneous applications and (sometimes) very stringent Quality of Service (QoS) guarantees. On the other hand, the growing interest in new, sophisticated traffic models able to take into account the Long Range Dependent (LRD) nature of real traffic has had a deep negative impact on analytical tractability, making simulation a more and more relevant tool for performance estimation. Indeed, even in the case of a simple single server queue, only a few asymptotic results for specific input traffic models are known [1,2].

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 269–280, 2007.
© Springer-Verlag Berlin Heidelberg 2007
A primary QoS parameter is the loss probability, whose typical values can be very small and therefore difficult to estimate through standard Monte-Carlo (MC) simulation, since long run times are required to achieve accurate results. Simple queuing models are often considered in the literature in order to test the efficiency of new speed-up techniques and then generalize the results to more complex scenarios. In the framework of LRD traffic, the usual benchmark is represented by the estimation of the overflow probability (an upper bound for the loss probability in the corresponding finite-buffer system) for a lossless single server queue. By virtue of central-limit-type arguments (superposition of a large number of independent sources), Gaussian processes have been used quite often to capture in a parsimonious and flexible way the long-memory property of actual traffic flows; in particular, fractional Brownian motion (fBm) has become the canonical model with a LRD correlation structure [3]. To tackle the limits of traditional MC simulation in dealing with rare events, Importance Sampling (IS) techniques can be applied. Unfortunately, the goodness of an IS-based algorithm strongly depends on the choice of a proper change of measure to reduce the variance of the estimate. Recent works [4,5] show that single-twist IS (which consists of a change of measure chosen within the class of pdfs that differ from the original one only by a shift in the mean value) cannot even be asymptotically efficient if the input is fBm with Hurst parameter H ≠ 0.5. As a matter of fact, asymptotic optimality can be achieved [5,6] by the use of more refined IS techniques, but at the cost of a higher computational complexity. In [7] we introduced Bridge Monte-Carlo (BMC), an alternative strategy for the efficient MC estimation of the overflow probability, which exploits the Gaussian nature of the input process.
The BMC estimator has the same computational complexity as single-twist IS and does not rely on a change of measure; hence, its applicability is quite general, since the only assumption is knowledge of the correlation structure of the incoming traffic flow. The aim of this work is to evaluate the performance of the BMC estimator (in terms of its variance) and compare it with the more traditional single-twist IS. The analytical results are then verified through discrete event simulations, considering as input to the queue fBm traces with different values of H.
2 Setting of the Problem
The reference model we consider in the paper is a single server queue with infinite buffer and deterministic service rate. In particular, we are interested in the evaluation of the overflow probability, i.e., the probability that the steady-state queue length Q exceeds a given threshold b. By Lindley's recursion, the latter is given by

    IP(Q ≥ b) = IP( sup_{t∈I} (X_t − ϕ_t) ≥ 0 )    (1)

where ϕ_t = b + tμ, μ is the difference between the mean values of the service and arrival rates, and {X_t}_{t∈I} is a Gaussian noise, with covariance Γ_ts = E[X_t X_s],
t, s ∈ I, modeling the fluctuation of the input traffic. In the following, we will assume that Γ_tt > 0 for any t ∈ I except a point t_0 ∈ I for which Γ_{t_0 t_0} = 0. For instance, if the input is modelled by fBm, it is easy to see that the previous hypotheses are fulfilled (with t_0 = 0) since, in that case, Γ_tt = σ_0^2 t^{2H}. In general the set I can be a finite subset of ZZ or a whole bounded interval of IR. In the first case the process X is just a random vector in X = IR^{|I|} (|I| is the cardinality of I), which we will consider a Banach space using the Euclidean norm. In the second case, it can usually be assumed that the process X belongs to the Banach space X of continuous functions from I to IR endowed with the supremum norm. Since for simulation purposes a finite-dimensional I is enough, we prefer to restrict our discussion to the first framework, which is free of some technicalities that would prevent a clear exposition of the novel methodology, although most of the arguments work in (almost) the same way both in the discrete and in the continuous settings. Let us introduce a metric | · |_H on X associated to the finite-dimensional Gaussian process X and defined as

    |ρ|_H^2 = ⟨ρ, ρ⟩_H = ⟨ρ, Γ^{-1}ρ⟩ = Σ_{t,s∈I} [Γ^{-1}]_ts ρ_t ρ_s    (2)

where ⟨·, ·⟩ is the Euclidean scalar product of X and Γ^{-1} is the inverse of the covariance matrix Γ. We will denote by H the so-called reproducing kernel Hilbert space of X, i.e., the set of elements ρ ∈ X for which |ρ|_H < +∞. In order to compare the behavior of the estimators when the probability of interest is small, we will introduce a small parameter ε in equation (1) and consider the probabilities p_ε defined as

    p_ε = IP( sup_{t∈I} (εX_t − ϕ_t) ≥ 0 ) .    (3)

For instance, in the many-sources regime, i.e., when n i.i.d. Gaussian sources are aggregated and the queuing resources (buffer size and service rate) are scaled with n, buffer overflow (over level nb) becomes a rare event when n → ∞ as a consequence of statistical multiplexing and, as can be easily checked by direct computation, the corresponding overflow probability is given by equation (3) with ε = 1/√n.
3 Definition of IS Estimators
A trivial approach to the estimation of the probability p_ε is to draw i.i.d. samples X^{(1)}, . . . , X^{(N)} from X and consider the MC estimator

    p̂_ε = (1/N) Σ_{i=1}^{N} 1_{A_ε}(X^{(i)})

where A_ε is the event

    A_ε = {x ∈ X : sup_{t∈I} [εx_t − ϕ_t] ≥ 0} .    (4)
However, if p_ε → 0 the number N of samples needed to obtain a reliable estimate grows roughly as p_ε^{-1}: the estimation of very small probabilities (e.g. p_ε ≈ 10^{-10} or smaller) becomes impossible or computationally heavy. Importance Sampling (IS) is a popular technique devised to build unbiased estimators not suffering from the smallness of p_ε. This is achieved by changing the law of the process so as to favor the occurrence of the target rare event and taking this change into account by reweighting the estimation according to the likelihood ratio, which, in measure-theoretic terms, is the Radon-Nikodym derivative of the original law with respect to the new one [8]. The efficiency of an IS-based algorithm depends on the choice of a proper change of measure to reduce the variance of the estimate. It is well known that the optimal change of measure (zero-variance pdf) involves the knowledge of the probability we want to estimate and therefore cannot be practically adopted. The issue is commonly tackled by minimizing some sub-optimal criteria instead. To this aim, we consider the class of IS estimators constructed by shifting the process X by a constant path η ∈ H (single-twist estimator). In the case of Gaussian processes, the likelihood ratio L_η(x) becomes

    L_η(x) ≜ (dγ/dλ_η)(x) = exp( −⟨η, x⟩_H + (1/2)|η|_H^2 )    (5)

where γ and λ_η are the laws (on X) of {X_t}_{t∈I} and {X_t + η_t}_{t∈I} respectively. Hence, the overflow probability can be rewritten as

    p_ε = E[1_{A_ε}(X)] = ∫_X 1_{A_ε}(x) dγ(x)
        = ∫_X 1_{A_ε}(x) L_{ε^{-1}η}(x) dλ_{ε^{-1}η}(x)
        = E[1_{A_ε}(X + ε^{-1}η) L_{ε^{-1}η}(X + ε^{-1}η)]

and the single-twist IS estimator is defined to be

    p̂^N_{twist,ε,η} ≜ (1/N) Σ_{i=1}^{N} 1_{A_ε}(X^{(i)} + ε^{-1}η) L_{ε^{-1}η}(X^{(i)} + ε^{-1}η)    (6)
where η is the twist; its choice determines the performance of the estimator. Indeed, the variance of the single-twist IS estimator is controlled by the quantity

    σ^2_{twist,ε,η} = Var[1_{A_ε}(X + ε^{-1}η) L_{ε^{-1}η}(X + ε^{-1}η)]
                    = E[(1_{A_ε}(X + ε^{-1}η) L_{ε^{-1}η}(X + ε^{-1}η))^2] − p_ε^2
                    = E[1_{A_ε}(X + ε^{-1}η) L_{2ε^{-1}η}(X + 2ε^{-1}η)] e^{|η|_H^2/ε^2} − p_ε^2
                    = I_twist − p_ε^2

where

    I_twist ≜ E[1_{A_ε}(X + ε^{-1}η) L_{2ε^{-1}η}(X + 2ε^{-1}η)] e^{|η|_H^2/ε^2} .    (7)
It is worth noticing that, as proved in [4,5], the single-twist IS estimator cannot be asymptotically efficient if the input is fBm with H ≠ 0.5. In any case, according to Large Deviation arguments, the natural choice (see, for instance, [5,9,10]) for η is represented by the most-likely path to overflow

    ρ*_t = ϕ_{t*} Γ_{t t*} / Γ_{t* t*}    (8)

where t* is a most-likely time, i.e., a time which satisfies

    inf_{t∈I} ϕ_t^2 / Γ_tt = ϕ_{t*}^2 / Γ_{t* t*} .

4 The BMC Approach
The Bridge Monte-Carlo (BMC) method is based on the idea of expressing the overflow probability as the expectation of a function of the bridge Y of the Gaussian input process X, i.e., the process obtained by conditioning X to reach a certain level at some prefixed time t̄ ∈ I:

    Y_t = X_t − ψ_t X_t̄ ,    where    ψ_t ≜ Γ_{t t̄} / Γ_{t̄ t̄} .    (9)

By the properties of Gaussian processes, the joint process (X, Y) is still Gaussian and the process Y is independent of X_t̄ since

    E[X_t̄ Y_t] = Γ_{t t̄} − (Γ_{t t̄} / Γ_{t̄ t̄}) Γ_{t̄ t̄} = 0 .

Moreover, its covariance is a simple function of the covariance of the original process X:

    Γ̄_ts ≜ E[Y_t Y_s] = Γ_ts − Γ_{t t̄} Γ_{s t̄} / Γ_{t̄ t̄} .    (10)

Finally, it is relevant to point out that the computational effort to simulate Y is equal to that of X. Since X_t = Y_t + ψ_t X_t̄ for any t ∈ I, we can express the probability p_ε of the event of interest A_ε as follows:

    p_ε = IP( sup_{t∈I} [εX_t − ϕ_t] ≥ 0 ) = IP( sup_{t∈I} [εY_t + εψ_t X_t̄ − ϕ_t] ≥ 0 )
        = IP( inf_{t∈I} ψ_t^{-1}[ϕ_t − εY_t] ≤ εX_t̄ ) = E[ IP( inf_{t∈I} ψ_t^{-1}[ϕ_t − εY_t] ≤ εX_t̄ | Y ) ]
        = E[ Φ( Ȳ_ε / (ε √Γ_{t̄ t̄}) ) ]    (11)
where

    Ȳ_ε ≜ inf_{t∈I} (ϕ_t − εY_t) / ψ_t

and

    Φ(x) ≜ ∫_x^{+∞} ( e^{−y^2/2} / √(2π) ) dy .
Given an i.i.d. sequence {Y^{(i)}, i = 1, . . . , N} distributed as Y, we introduce the Bridge Monte-Carlo (BMC) estimator for p_ε as follows:

    p̂^N_{bmc,ε,t̄} ≜ (1/N) Σ_{i=1}^{N} Φ( Ȳ_ε^{(i)} / (ε √Γ_{t̄ t̄}) )

with

    Ȳ_ε^{(i)} ≜ inf_{t∈I} (ϕ_t − εY_t^{(i)}) / ψ_t .
It is easy to check that the BMC estimator is unbiased, i.e.

    E[ p̂^N_{bmc,ε,t̄} ] = p_ε ,

and that its variance is given by Var[ p̂^N_{bmc,ε,t̄} ] = σ^2_{bmc,ε,t̄} / N, where

    σ^2_{bmc,ε,t̄} = Var[ Φ( Ȳ_ε / (ε √Γ_{t̄ t̄}) ) ] = E[ Φ( Ȳ_ε / (ε √Γ_{t̄ t̄}) )^2 ] − p_ε^2 .
To heuristically justify the efficiency of BMC from a computational perspective, we point out that if the basic MC method can be seen as a numerical scheme to perform integration in a large number of variables, then BMC is a hybrid method in that it performs one of these integrations exactly exploiting the properties of Gaussian processes, while the remaining integrations are still performed using a MC scheme (the bridge Y lives in a smaller space than the original process X, since one of the coordinates is zero by definition). When it comes to rare event estimation, it happens that in the full space of the process the characteristic function of the rare event has support on a region with small probability and this renders direct MC estimation ineffective. However, BMC smoothes out the function to be integrated allowing a more efficient estimation by the MC part.
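In the discrete-time setting the BMC estimator is only a few lines of numpy. The sketch below is our own (not the authors' code): it takes ε = 1, fBm input, and the bridge point t̄ chosen as the discrete most-likely time; bridge samples are obtained by simulating X via a Cholesky factorization and subtracting ψ X_t̄, and the estimate is the average of Φ(Ȳ^{(i)}/√Γ_{t̄t̄}).

```python
import math
import numpy as np

def fbm_cov(T, H):
    """Covariance of discrete fBm at times 1..T:
    Gamma_ts = 0.5 * (t^2H + s^2H - |t-s|^2H)."""
    t = np.arange(1, T + 1, dtype=float)
    return 0.5 * (t[:, None]**(2*H) + t[None, :]**(2*H)
                  - np.abs(t[:, None] - t[None, :])**(2*H))

def bmc_overflow(T, H, b, mu, N=20000, seed=0):
    """BMC estimate of IP(sup_t (X_t - b - mu*t) >= 0) for fBm input (eps = 1)."""
    rng = np.random.default_rng(seed)
    G = fbm_cov(T, H)
    t = np.arange(1, T + 1, dtype=float)
    phi = b + mu * t
    tbar = int(np.argmin(phi**2 / np.diag(G)))      # most-likely time index
    psi = G[:, tbar] / G[tbar, tbar]                # psi_t = Gamma_{t,tbar}/Gamma_{tbar,tbar}
    L = np.linalg.cholesky(G + 1e-12 * np.eye(T))
    X = rng.standard_normal((N, T)) @ L.T           # N fBm sample paths
    Y = X - np.outer(X[:, tbar], psi)               # bridge, independent of X_tbar
    Ybar = np.min((phi - Y) / psi, axis=1)          # inf_t (phi_t - Y_t) / psi_t
    z = Ybar / math.sqrt(G[tbar, tbar])
    # Phi is the standard Gaussian tail: Phi(x) = 0.5 * erfc(x / sqrt(2))
    return float(np.mean([0.5 * math.erfc(v / math.sqrt(2.0)) for v in z]))
```

For fBm the covariance Γ_{t t̄} is positive for all t, t̄ ≥ 1, so the division by ψ_t is safe; with scipy available, the tail function could equivalently be `scipy.stats.norm.sf`. For moderately rare events the estimate agrees with direct MC on the same model, with visibly smaller variance.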
Rare Events of Gaussian Processes     275

5 Comparison Between IS and BMC Estimators
The main result of the paper is represented by the following theorem, which basically states that, for any choice of the rarity parameter $\varepsilon$, BMC performs better than single-twist IS, even when the change of measure is based on the most likely path $\rho^*$.

Theorem 1. For any $\varepsilon > 0$ and any twist $\eta$ of the form $\eta_t = \lambda\psi_t$, $t \in I$, with $\lambda \in \mathbb{R}$, we have
$$\sigma^2_{bmc,\varepsilon,\bar t} \le \sigma^2_{twist,\varepsilon,\lambda\psi} . \qquad (12)$$

Proof. Since the mean values of the two estimators are the same (both estimators are unbiased), it is enough to prove that
$$I_{bmc} \stackrel{\Delta}{=} E\bigg[\Phi\bigg(\frac{\bar Y_\varepsilon}{\varepsilon\sqrt{\Gamma_{\bar t\bar t}}}\bigg)^2\bigg] \le I_{twist} .$$
We can rewrite the l.h.s. as
$$I_{bmc} = E\Bigg[\bigg(\int_{-\infty}^{+\infty} 1_{\bar Y_\varepsilon \le \varepsilon y}\,\frac{e^{-y^2/(2\Gamma_{\bar t\bar t})}}{\sqrt{2\pi\Gamma_{\bar t\bar t}}}\,dy\bigg)^2\Bigg]$$
and then perform the change of variables $z = y - \alpha$ (for an arbitrary $\alpha \in \mathbb{R}$):
$$I_{bmc} = E\Bigg[\bigg(\int_{-\infty}^{+\infty} 1_{\bar Y_\varepsilon - \varepsilon\alpha \le \varepsilon z}\,\frac{e^{-z^2/(2\Gamma_{\bar t\bar t})}}{\sqrt{2\pi\Gamma_{\bar t\bar t}}}\,e^{-\alpha z/\Gamma_{\bar t\bar t} - \alpha^2/(2\Gamma_{\bar t\bar t})}\,dz\bigg)^2\Bigg]$$
which, by Jensen's inequality, is bounded so that
$$I_{bmc} \le E\Bigg[\int_{-\infty}^{+\infty} \frac{e^{-z^2/(2\Gamma_{\bar t\bar t})}}{\sqrt{2\pi\Gamma_{\bar t\bar t}}}\,1_{\bar Y_\varepsilon - \varepsilon\alpha \le \varepsilon z}\,e^{-2\alpha z/\Gamma_{\bar t\bar t} - \alpha^2/\Gamma_{\bar t\bar t}}\,dz\Bigg] .$$
Now we want to rewrite the exponent $-2\alpha z/\Gamma_{\bar t\bar t} - \alpha^2/\Gamma_{\bar t\bar t}$ in a different way, taking into account that $\langle Y, \psi\rangle_H = 0$ almost surely and that $|\psi|^2_H = \Gamma_{\bar t\bar t}^{-1}$ (see Lemma 1 below). Hence
$$-2\alpha z/\Gamma_{\bar t\bar t} - \alpha^2/\Gamma_{\bar t\bar t} = -2\langle Y + z\psi,\,\alpha\psi\rangle_H - |\alpha\psi|^2_H$$
and, setting $\eta = \varepsilon\alpha\psi$ (so that $\lambda = \varepsilon\alpha$ is an arbitrary real number), we get
$$I_{bmc} \le E\Bigg[\int_{-\infty}^{+\infty} \frac{e^{-z^2/(2\Gamma_{\bar t\bar t})}}{\sqrt{2\pi\Gamma_{\bar t\bar t}}}\,1_{\bar Y_\varepsilon - \varepsilon\alpha \le \varepsilon z}\,e^{-2\varepsilon^{-1}\langle Y + z\psi,\,\eta\rangle_H - \varepsilon^{-2}|\eta|^2_H}\,dz\Bigg] .$$
Now recall the definition of $\bar Y_\varepsilon$ and note that the event $\{\bar Y_\varepsilon - \varepsilon\alpha \le \varepsilon z\}$ is equivalent to the event
$$\Big\{\sup_{t}\big[\varepsilon(z+\alpha)\psi_t + \varepsilon Y_t - \varphi_t\big] \ge 0\Big\} = \big\{z\psi_\cdot + Y_\cdot + \varepsilon^{-1}\eta_\cdot \in A_\varepsilon\big\},$$
so that
$$I_{bmc} \le E\Bigg[\int_{-\infty}^{+\infty} \frac{e^{-z^2/(2\Gamma_{\bar t\bar t})}}{\sqrt{2\pi\Gamma_{\bar t\bar t}}}\,1_{z\psi_\cdot + Y_\cdot + \varepsilon^{-1}\eta_\cdot \in A_\varepsilon}\,e^{-2\varepsilon^{-1}\langle Y + z\psi,\,\eta\rangle_H - \varepsilon^{-2}|\eta|^2_H}\,dz\Bigg] .$$
Finally, since $Y + z\psi$ has the same distribution as the original process $X$ (since $Y$ is its bridge and $z$ is an independent Gaussian random variable with the "right" covariance), we have
$$I_{bmc} \le E\Big[1_{X_\cdot + \varepsilon^{-1}\eta_\cdot \in A_\varepsilon}\,e^{-2\varepsilon^{-1}\langle X,\eta\rangle_H - \varepsilon^{-2}|\eta|^2_H}\Big] = E\Big[1_{A_\varepsilon}(X + \varepsilon^{-1}\eta)\,L^2_{\varepsilon^{-1}\eta}(X + \varepsilon^{-1}\eta)\Big] = I_{twist}$$
where we took into account the definition (5) of the likelihood ratio for Gaussian processes. This proves our claim.

Here we prove the auxiliary results needed in the above proof.

Lemma 1. We have $\langle Y, \psi\rangle_H = 0$ almost surely and $\langle\psi, \psi\rangle_H = \Gamma_{\bar t\bar t}^{-1}$.

Proof. Let us first prove that $\langle\psi,\psi\rangle_H = \Gamma_{\bar t\bar t}^{-1}$. By definition, the scalar product on $H$ is such that $\langle\Gamma_{\cdot a}, \Gamma_{\cdot b}\rangle_H = \Gamma_{ab}$; since $\psi_t = \Gamma_{t\bar t}/\Gamma_{\bar t\bar t}$, we have the claim:
$$\langle\psi,\psi\rangle_H = \Gamma_{\bar t\bar t}^{-2}\,\langle\Gamma_{\cdot\bar t},\,\Gamma_{\cdot\bar t}\rangle_H = \Gamma_{\bar t\bar t}^{-1} .$$
To prove the second statement, recall that $Y$ is a centered Gaussian process with covariance $\bar\Gamma$ given by equation (10); then the random variable $A = \langle Y, \psi\rangle_H$ is still Gaussian with mean zero. We will prove that its variance is also zero, so that we can conclude that $A = 0$ almost surely. Let us compute $E[A^2]$:
$$E[A^2] = E\Big[\sum_{t,s}[\Gamma^{-1}]_{ts}\,Y_t\,\psi_s \sum_{t',s'}[\Gamma^{-1}]_{t's'}\,Y_{t'}\,\psi_{s'}\Big]$$
$$= \sum_{t,s,t',s'}[\Gamma^{-1}]_{ts}\,\psi_s\,\psi_{s'}\,[\Gamma^{-1}]_{t's'}\,E[Y_t Y_{t'}]$$
$$= \sum_{t,s,t',s'}[\Gamma^{-1}]_{ts}\,\psi_s\,\psi_{s'}\,[\Gamma^{-1}]_{t's'}\,\bar\Gamma_{tt'}$$
$$= \sum_{t,s,t',s'}[\Gamma^{-1}]_{ts}\,\psi_s\,\psi_{s'}\,[\Gamma^{-1}]_{t's'}\,\big(\Gamma_{tt'} - \Gamma_{\bar t\bar t}\,\psi_t\,\psi_{t'}\big)$$
$$= \sum_{s,s'}[\Gamma^{-1}\Gamma\,\Gamma^{-1}]_{ss'}\,\psi_s\,\psi_{s'} - \Gamma_{\bar t\bar t}\,\langle\psi,\psi\rangle_H^2$$
$$= \langle\psi,\psi\rangle_H - \Gamma_{\bar t\bar t}\,\langle\psi,\psi\rangle_H^2 = 0 \qquad (13)$$
by the previous result on the $H$-norm of $\psi$.
The Theorem is quite general: inequality (12) holds for any choice of the conditioning point $\bar t$ and any twist $\eta$ of the form
$$\eta_t = \lambda\psi_t = \lambda\,\frac{\Gamma_{t\bar t}}{\Gamma_{\bar t\bar t}} \qquad (\lambda \in \mathbb{R}) .$$
In particular, if the most likely time to overflow $t^*$ is chosen as conditioning point (i.e., $\bar t = t^*$), then equation (8) implies that $\eta_t \sim \rho^*_t$. Since $\lambda$ is an arbitrary constant, from the previous Theorem it follows that the BMC estimator (with $\bar t = t^*$) has a lower variance than the single-twist IS estimator with $\eta = \rho^*$.
6 Simulation Results
In this section we present some of the results obtained by applying the techniques described above to the simulation of a single-server queue. As a reference scenario, we consider the many-sources regime, where $n$ i.i.d. Gaussian sources are multiplexed together (i.e., the aggregate traffic has covariance $n\Gamma_{ts}$) and the queuing resources are scaled with $n$. We recall that in this case the overflow probability is still given by equation (3), i.e.:
$$p_n \stackrel{\Delta}{=} \mathbb{P}\big(Q^{(n)} \ge nb\big) = \mathbb{P}\Big(\sup_t\big(X_t/\sqrt{n} - \varphi_t\big) \ge 0\Big)$$
where $\varepsilon = 1/\sqrt{n}$ and $X$ is a centered Gaussian process with covariance $\Gamma_{ts}$.

In our simulations, we consider a queue with $\mu = 0.1$ and $b = 0.3$, fed by fBm traces (with $\sigma_0 = 1$) for different values of the Hurst parameter. In all cases the trace length is $3\cdot t^*$ (with preliminary sets of simulations, we verified that longer traces do not significantly affect the results) and the estimates are averaged over $N = 10^6$ random samples. In the performance comparison, for IS we consider two different changes of measure, corresponding to the most likely path $\rho^*$ and the widely used (see, for instance, [11]) linear path, denoted in the following as IS-MLP and IS respectively.

Figure 1 shows that the overflow probability (for the sake of brevity, only the plots for $H = 0.7$ are shown) decays exponentially, as expected, and the slope of the plot (in logarithmic scale) is in accordance with the Large Deviation limit [5,9]:
$$-\lim_{n\to\infty}\frac{1}{n}\log p_n = \inf_{t\in I}\frac{\varphi_t^2}{2\,\Gamma_{tt}} = \frac{\varphi_{t^*}^2}{2\,\Gamma_{t^*t^*}} \stackrel{\Delta}{=} \frac{1}{2}\,|\rho^*|^2_H . \qquad (14)$$
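The infimum in (14) can be located numerically. The sketch below assumes the parametrization $\varphi_t = b + \mu t$ and $\Gamma_{tt} = \sigma_0^2\,t^{2H}$ (our reading of the fBm setup; the function name and the grid are ours):

```python
import numpy as np

def decay_rate(b, mu, H, sigma0=1.0):
    """Most likely time to overflow t* and the large-deviations decay rate (14),
    assuming phi_t = b + mu*t and Gamma_tt = sigma0^2 * t^(2H)."""
    t = np.linspace(1e-3, 100.0, 500_000)
    rate = (b + mu * t) ** 2 / (2.0 * sigma0 ** 2 * t ** (2 * H))
    i = int(np.argmin(rate))
    return float(t[i]), float(rate[i])
```

For fBm the minimizer also has the closed form $t^* = (b/\mu)\,H/(1-H)$; with the simulated values $b = 0.3$, $\mu = 0.1$ and $H = 0.7$ this gives $t^* = 7$, which the numerical search reproduces.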
The estimates obtained with IS-MLP and BMC (with $\bar t = t^*$) are quite close, while, as expected, naive IS (i.e., IS with linear path) is less efficient. To confirm the correctness of the previous Theorem, we present the behavior of the Relative Error (i.e., the estimated standard deviation of the estimator divided by $p_\varepsilon$) for $H = 0.5$ (standard Brownian motion – see Figure 2) as well as $H = 0.7$ (LRD traffic – see Figure 3). In both cases, the relative position of the IS-MLP and BMC curves agrees with the theoretical results, while naive IS introduces bigger errors for $H = 0.7$. Heuristically, this is explained by the fact that naive IS does not use the additional information about the most probable way in which overflow is reached. It is worth noticing that for $H = 0.5$, IS and IS-MLP are equivalent: indeed, in the case of standard Brownian motion, the most likely path is linear.

Fig. 1. Overflow Probability estimates for H = 0.7 (curves: BMC, IS-MLP, IS)

Fig. 2. Relative Error estimates for H = 0.5 (curves: BMC, IS = IS-MLP, BMC-1/2MLT, BMC-2MLT)
Fig. 3. Relative Error estimates for H = 0.7 (curves: BMC, IS-MLP, IS, BMC-1/2MLT, BMC-2MLT)
Finally, Figures 2 and 3 also compare the performance of BMC for different choices of the conditioning point: $\bar t = t^*$ (BMC), $\bar t = \frac{1}{2}t^*$ (BMC-1/2MLT) and $\bar t = 2t^*$ (BMC-2MLT). As expected, in accordance with the Large Deviation interpretation, the most likely time to overflow $t^*$ appears to be the best choice for $\bar t$. For a wrong choice of the conditioning point, BMC estimators may behave worse than IS with MLP; this result is quite intuitive and not in contrast with Section 5. Indeed, when $\bar t \ne t^*$, $\psi_t$ is not proportional to $\rho^*_t$ and the Theorem only refers to changes of measure of the form $\eta_t = \lambda\psi_t$.
7 Conclusions
It is well known that the single-twist IS approach is not the best technique to tackle the problem of rare events for Gaussian traffic. This was the main motivation that led us to propose the Bridge Monte-Carlo (BMC) estimator, a novel approach that exploits the Gaussian nature of the input process and relies on the properties of bridges. The computational cost of BMC is comparable to that of single-twist IS (the same number of random samples is generated during the simulation runs), and in this paper we proved that BMC performs better than IS in terms of estimator variance. The theoretical results are also confirmed by various sets of simulations, which also highlight that naive IS (with linear path to overflow) has the worst performance among the analyzed speed-up techniques. Finally, it is important to point out that the principle underlying the BMC method can be applied to any Gaussian process (only the knowledge of the
covariance function is required in order to estimate $\bar Y_\varepsilon$) and in a wide variety of network systems (e.g. tandem queues, schedulers, etc.), and could be generalized with more than one conditioning point or with a dynamic choice of the parameters.

Acknowledgments. The authors would like to acknowledge support from the Network of Excellence EuroFGI (Design and Engineering of the Future Generation Internet).
References

1. Dębicki, K., Mandjes, M.R.H.: Exact overflow asymptotics for queues with many Gaussian inputs. Technical Report PNA-R0209, CWI, The Netherlands (2002)
2. Narayan, O.: Exact asymptotic queue length distribution for fractional Brownian traffic. Advances in Performance Analysis 1(1), 39–63 (1998)
3. Norros, I.: Studies on a model for connectionless traffic, based on fractional Brownian motion. In: Conference on Applied Probability in Engineering, Computer and Communications Sciences, Paris, INRIA/ORSA/TIMS/SMAI (1993)
4. Baldi, P., Pacchiarotti, B.: Importance Sampling for the Ruin Problem for General Gaussian Processes. Technical report (2004)
5. Dieker, A.B., Mandjes, M.R.H.: Fast simulation of overflow probabilities in a queue with Gaussian input. ACM Trans. Model. Comput. Simul. 16(2), 119–151 (2006)
6. Dupuis, P., Wang, H.: Importance Sampling, Large Deviations and Differential Games. Technical report, Lefschetz Center for Dynamical Systems, Brown University (2002)
7. Giordano, S., Gubinelli, M., Pagano, M.: Bridge Monte-Carlo: a novel approach to rare events of Gaussian processes. In: Proc. of the 5th St. Petersburg Workshop on Simulation, St. Petersburg, Russia, pp. 281–286 (2005)
8. Heidelberger, P.: Fast Simulation of Rare Events in Queueing and Reliability Models. Performance Evaluation of Computer and Communication Systems 729, 165–202 (1993)
9. Addie, R., Mannersalo, P., Norros, I.: Most Probable Paths and performance formulae for buffers with Gaussian input traffic. European Transactions on Telecommunications 13 (2002)
10. O'Connell, N., Procissi, G.: On the Build-Up of Large Queues in a Queueing Model with Fractional Brownian Motion Input. Technical Report HPL-BRIMS-98-18, BRIMS, HP Labs, Bristol (U.K.) (1998)
11. Huang, C., Devetsikiotis, M., Lambadaris, I., Kaye, A.: Modeling and simulation of self-similar variable bit rate compressed video: a unified approach. In: Proc. of SIGCOMM '95, Cambridge, US, pp. 114–125 (1995)
A Forwarding Spurring Protocol for Multihop Ad Hoc Networks (FURIES)

Helena Rifà-Pous and Jordi Herrera-Joancomartí

Universitat Oberta de Catalunya
Rb. del Poble Nou, 156, 08018 Barcelona
{hrifa,jordiherrera}@uoc.edu
Abstract. The functioning of an ad hoc network is based on the supportive contributions of all of its members. Nodes behave as routers and take part in route discovery and maintenance. In this paper, a forwarding protocol is presented that stimulates node cooperation to improve the throughput of the network. The incentive mechanism is provided through a micropayment protocol that deals with the cooperation range of the users. The most cooperative nodes intrinsically benefit from the best routes and availability, and they take precedence over selfish ones.

Keywords: Multihop ad hoc networks, cooperation, forwarding, payment.
1 Introduction
The functioning of an ad hoc network is based on the supportive contributions of all of its members. Nodes cooperate to form a communication infrastructure that extends the wireless transmission range of every terminal without using any dedicated network device. To ensure and spur the cooperative behavior of ad hoc network members, an incentive mechanism is required that regulates the resources spent on and given to the community.

Protocols to stimulate cooperation can be divided into two groups: reputation-based and credit-based¹. The former treat packet forwarding as an obligation and isolate and punish those nodes that do not behave as expected, while the latter consider it a service that can be valued and charged. Reputation-based schemes define a method for keeping track of nodes' actions in order to classify nodes as reliable or unreliable [2,3,4,5]. The main problem of this approach is distinguishing misbehaving nodes from those that cannot retransmit packets due to energy constraints, channel fading or simply natural disconnections. The assumption that a node shall always forward all the packets it receives is too strong for a network formed of, among others, small and handheld

¹ For a detailed comparison of different cooperative protocols we refer to [1], which summarizes the most relevant proposals.
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 281–293, 2007.
© Springer-Verlag Berlin Heidelberg 2007
282
H. Rifà-Pous and J. Herrera-Joancomartí
devices. On the other hand, nodes at some strategic points of the network will receive more transmission requests than those on the periphery, and it would be unfair to punish them if they cannot carry all the transport.

In credit-based schemes, virtual currency is introduced to stimulate each node to behave cooperatively. Nodes that generate traffic have to pay those that help forwarding the data. In this category, a distinction can be made regarding the nature of the payment: money-based schemes and token-based schemes. Money-based schemes [6,7,8] use money as the payment token. The drawback of such currency models is that the costs of managing financial information carry a considerable legal and administrative overhead. Furthermore, the minimization of selfish nodes is not guaranteed, since users without economic concerns can behave selfishly in the net and pay whatever is needed to have their packets transmitted. Token-based schemes generally require the nodes to keep a balanced number of packets transmitted and relayed [9,10]. Nodes increase the number of stored tokens when they forward packets, and decrease them proportionally to the number of hops when sending messages. A node has to forward packets until it earns enough to send its own, so this kind of protocol can sometimes limit the capacity of the network if the average token level is too low. On the other hand, if it is too high, tokens no longer provide an incentive to cooperate and the mechanism no longer fulfills its purposes.

Present research in credit-based mechanisms is basically focused on how much a node should be paid for forwarding messages. One research direction is finding a fair incentive algorithm that rewards the users for the resources used in the forwarding connection [10,11,12]. The circumstances and resources employed by the relaying parties (battery level, transmission energy, position within the network topology, mobility, bandwidth, etc.) are considered to calculate the cost of a certain path. Although theoretically this kind of algorithm is very attractive, it is too complex for mobile ad hoc networks. The real cost of a transmission changes for every transferred packet, so the overhead involved in sending a message is barely affordable. Overly demanding protocols may provoke a contrary effect on the users, making them unwilling to participate in the network.

In this paper, we present a Forwarding Spurring Protocol for Multihop Ad Hoc Networks (FURIES), a simple credit-based scheme that provides incentives to selfish mobile nodes to cooperate. The proposed protocol seeks to foster the traffic through a fair protocol but, instead of trying to pay for the resources spent in a connection, it rewards constantly collaborative nodes with a high quality of service. An evaluation of the system through a simulation analysis is also presented.

The contributions of the proposal are the following. Unlike previous approaches, which try to spur the system through a payment model that rewards the nodes based on their utility function, FURIES uses a payment protocol to categorize nodes' behavior. Users are prone to collaborate in order to obtain a better quality of service. One of the novelties of this protocol with respect to the
previous ones proposed in the literature is that it introduces an incentive factor to reward the forwarding of packets from highly ranked users.

The paper is organized as follows. In Section 2 we introduce the protocol and give an overview of the proposed architecture. Section 3 describes the protocol details and analyzes some interesting aspects of spurring traffic in multihop networks. Section 4 evaluates the solution based on simulation results. Finally, we conclude the paper in Section 5.
2 FURIES General Description
We present in this section the general description of our Forwarding Spurring Protocol for Multihop Ad Hoc Networks (FURIES). FURIES is a credit-based protocol that combines properties of both credit-based and reputation-based incentive models. On the one hand, it uses payment mechanisms to charge/reward the forwarding of packets through the net. On the other hand, it manages user reputation status to separate reliable from unreliable nodes. Packets of both high- and low-reputation nodes can be sent; however, nodes with higher reputation take preference in getting their data forwarded, i.e. they enjoy a better quality of service.

The interchange currency used in the FURIES payment protocol is not money but credit to transmit data. The unit of credit is a token that represents 1 packet of 2346 bytes². Credit tokens exchanged in a transmission session are used to state the reputation of a user and categorize its involvement in the net. Nodes that generate traffic lose tokens and reputation, while the ones that forward it gain them. However, payments and collections are not balanced. The cost of sending a packet depends on the hop distance to the destination. On the other hand, the reward is based on the credit level of the sender, that is, its participation status. Thus, nodes earn more credits for forwarding packets of highly reputed and credited users.

2.1 FURIES Entities
In this paper we consider a user who wants to connect to another one who is not in his transmission range, so a multihop route has to be established. We assume a routing protocol that provides information on available routes. Opposed to other credit-based protocols for ad hoc networks, FURIES does not require that the source node know the complete path to the destination, but only the hop distance. FURIES will stimulate the transmission through the discovered routing paths.

Credit-based schemes require the use of tamper-proof hardware or a trusted third party (TTP) to manage the tokens. We make use of a TTP to securely store the credit account of nodes and give memory to the system; that is, credits earned or spent in a session are taken into consideration beyond the lifetime of a particular ad hoc network.

² The maximum size of an IP packet over an 802.11 network is 2346 bytes [13].
FURIES architecture is composed of the following entities:

– Certification Authorities (CA), which issue identity certificates for the participants of ad hoc networks. The recognized CAs are those accepted by the Internet community that follow certain established security policies.
– Reputation Authority (RpA), a TTP that is used to manage the users' credit accounts. Such information is contained in a reputation certificate, implemented as an attribute certificate according to the X.509 standard.

All users in our model are registered with a well-known CA that issues them a certificate binding their identity to their public key. With this certificate, users can register with the RpA, which will manage their credits. The RpA is an independent entity not related to any specific CA. It can deal with CAs of different providers as long as it accepts their certification policies. Moreover, the RpA does not need to be centrally controlled, but can be a distributed entity under the control of a world-wide community.

Reputation certificates are used to classify users and fix the rewarding credits of a forwarding. For this reason it is important that these certificates hold updated information at any time. Therefore, reputation certificates are short-lived certificates, with a validity that we fix at 10 days.

2.2 Incentive Factor IF
The FURIES protocol introduces an Incentive Factor (IF) element to prioritize the forwarding of packets from collaborative nodes and thus provide them a good quality of service. Nodes do not need to pay more to receive a better service; the incentives a router receives to forward a packet are intrinsically stated in the protocol based on the profile of each payer.

The incentive factor modulates the credits (c) that an intermediate node has to receive for its job, such that c = IF · d, where IF is the incentive factor of the sender node and d is the number of transmitted packets.

We have designed the incentive factor as a function of the credits such that it asymptotically tends to 0 when the credit balance of a user grows in negative values, and increases polynomially otherwise. Since the amount of data transmitted in ad hoc networks can range from a few Kb, when the devices are very small and limited, up to hundreds of Mb, when the net has access to the Internet, the gradient of the incentive factor function is bigger for values around 0 (see Figure 1(a)). This makes it possible to clearly distinguish between selfish and unselfish nodes. The IF function on the credits is the following:

IF(c) = A · |c|^(sgn(c)/B)

Through simulations we have heuristically approximated the two values A and B, resulting in A = 1/2 and B = 10 (see Figure 1(b)):

IF(c) = 1/2 · |c|^(sgn(c)/10),   −10^9 < c < 10^9    (1)
Fig. 1. Incentive factor function: (a) IF for low-range credits; (b) IF function (log axis)
The charges and rewards of a transmission are not balanced, so we have limited the range of the accumulated credits to [−10^9, 10^9] in order to avoid the saturation of a node in an extreme position. When the credit balance of a user is 0, its incentive factor is A = 1/2, which is lower than 1. This discourages users from indiscriminately registering themselves under a new identity to reset their record. The neutral incentive factor (IF = 1), that is, when a forwarder receives the same amount of credits for a carried packet as it would have to pay if it initiated a transaction, is reached when the accumulated credit of a user is about 10^3 packets, which is a little more than 2.2 MBytes of data.
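Equation (1) is straightforward to implement. A minimal sketch (our code; the function name is hypothetical), including the clamp to the credit range discussed above:

```python
def incentive_factor(c: float) -> float:
    """IF(c) = 1/2 * |c|^(sgn(c)/10), equation (1), with the accumulated
    credit c clamped to the range [-1e9, 1e9] used by FURIES."""
    c = max(-1e9, min(1e9, c))
    sign = (c > 0) - (c < 0)          # sgn(c); note 0.0**0.0 == 1.0, so IF(0) = 1/2
    return 0.5 * abs(c) ** (sign / 10.0)
```

For instance, `incentive_factor(2 ** 10)` evaluates to the neutral value 1 (since 1024^(1/10) = 2), matching the "little more than 2.2 MBytes" neutral point mentioned above.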
3 FURIES Credit-Based Protocol
FURIES stimulates cooperation through a credit mechanism that regulates nodes' transmissions based on their reputation. In this section we detail this mechanism, which can be divided into three phases:

– Initialization
– Contract establishment and communication, driven by a micropayment scheme
– Charging and rewarding

3.1 Initialization
In order to initiate a transmission in a multihop network, a user needs to hold a reputation certificate that states its forwarding parameters. In particular, the reputation certificate sets two main attributes:

– Credits (c): accumulated user credits at the time of certificate generation.
– Incentive Factor (IF): the result of applying the IF equation (1) to c.
When a user first requests a certificate from an RpA, he is issued a certificate with c = 0 and IF = 1. His IF will remain 1 until the user starts transmitting data or his accumulated credit corresponds to an incentive factor greater than 1. We give new users an IF of 1 so as not to prejudice their first transactions. At the same time, we spur users to first give resources to the net and then take the profit.

3.2 Micropayment Scheme
The micropayment scheme we use in this paper is highly inspired by PayWord [14], a light protocol that allows offline verification of the payment proofs. The micropayment protocol is divided into two parts: Contract Establishment and Data Transmission. Figure 2 depicts all the steps.

Contract Establishment

When node A0 wants to send data to node An, assuming the path will go through nodes A1, · · · , An−1:

1. A0 generates payment tokens in the following way: node A0 generates a long fresh chain of paywords w0, w1, ..., wm by choosing w0 at random and by applying a hash function h iteratively such that wj = h(wj−1) for j = 1, 2, · · · , m, where m is the maximum number of possible payments during the session.
2. A0 prepares a contract offer. The offer includes the sender and receiver identifiers, IA0 and IAn, the serial number of the sender reputation certificate, SNA0, its validity period V, the number of hops of the route n, and the top hash chain value wm:

   Offer = {IA0, IAn, SNA0, V, n, wm}

3. Node A0 sends a forwarding request towards An that contains its digital signature on the contract offer together with its reputation certificate RCertA0:

   ReqA0 = {SA0[Offer], RCertA0}

4. The request is read by the intermediate nodes of the path (A1, · · · , An−1). If they are not interested in forwarding the packet, because for them the expense is not worthwhile, they send a reject response to A0. Otherwise, they enclose in the request a signed attachment with information about their identity:

   ReqAi = {ReqAi−1, SAi[ReqAi−1], IAi}    for i = 1, · · · , n − 1

   After forwarding an offer request, a node Ai waits (n − i)·timeout seconds for a response, either positive or negative, from Ai+1. If none arrives, it sends a break-up-chain message to A0.
5. Node An receives the request of transmission from user A0 along with the information of the relaying parties Ai for i = 1, · · · , n − 1. An verifies the signatures and checks that the number of hops stated in the contract offer is at most n.
Fig. 2. Micropayment protocol (contract establishment, steps 1–7, between A0, A1, ..., An−1, An; data transmission, steps 8–9)
6. If all data is correct and node An accepts the transmission from A0, it generates a contract with the data of the received offer and an appendix with the list of recruited routing nodes, and signs the overall information. It sends the contract to node A0 using the same bidirectional path as the one used in the reception:

   RepAn = Contr = SAn[ReqAn−1]

7. All routing nodes keep a copy of the contract.

Data Transmission

At the end of the contract set-up phase, data transmission can be started.

8. If A0 wants to send d packets of data to An, it will transmit to A1 the information along with a payment check. The payment check consists of the next d hash values of the chain; in fact, presenting the highest hash is enough. For instance, for the first d packets A0 has to send the chain value wm−d:

   info = {packets, wm−d}

9. A1 verifies the payment, checking that wm = h^d(wm−d), where d is obtained from the number of transmitted packets. A1 keeps a copy of the wm−d value and forwards the info to the next node. Such operation is performed at each intermediate node Ai, for i = 1, · · · , n − 1.
10. Finally, An obtains the packet info.

3.3 Charging and Rewarding Model
Charging and rewarding is performed using a protocol between the routing nodes involved in the transmission and the reputation authority, RpA. This phase must be executed any time after the data transmission session and within the validity period of the contract, when the nodes have an online connection with the RpA.

It is important to notice that the possession of a payment proof by a node Ai does not entail that the corresponding node Ai has forwarded the data, just
that it has received it. However, if Ai has the payment proof, it is clear that Aj for 0 ≤ j ≤ i − 1 indeed forwarded the packets. For that reason, when a routing node Ai with i ≠ n reports a payment proof to the RpA, it only receives half of the full router rate, while the lower nodes of the path can be completely rewarded. Only when the destination node of a packet, An, sends payment proofs to the RpA is it evidenced that the data has been delivered, and then all intermediate nodes are rewarded. In order to stimulate destination parties to send the proofs, they are also rewarded with 1 credit for each packet they demonstrate they have received.

The detailed protocol is the following:

1. When node Ai wants to get paid for its forwarding services, it sends to the RpA the forwarding contract, Contr, and the payment proof wk, where k = m − d, m being the maximum number of packets that can be transmitted within that session and d the number of forwarded packets.
2. The RpA verifies that h^d(wk) = wm, which ensures that the token is valid. The RpA obtains the value wm from Contr, where the value is signed by the sender node A0 and hence assumed authentic.
3. Then, the RpA executes the following procedure:
   – If no proof wk has been previously presented by any node, then the RpA adds (IFA0 · d) credits to each node Aj for 1 ≤ j < i and (1/2 · IFA0 · d) credits to Ai, in case i ≠ n. If i = n (i.e. the reporter is the destination node), then An is rewarded with d credits. In any case, the RpA also deducts (i · d) credits from A0's credits.
   – If Aj, for some 1 ≤ j < i, has already presented the proof wk to the RpA, then the RpA adds (1/2 · IFA0 · d) credits to Aj, (IFA0 · d) credits to each node Ak for j + 1 ≤ k < i, and (1/2 · IFA0 · d) credits to Ai, in case i ≠ n. If i = n, then An is rewarded with d credits. In any case, the RpA also deducts ((i − j) · d) credits from A0's credits.
   – If Aj, for some i < j ≤ n, has already presented the proof wk to the RpA, then the RpA informs Ai that it has already been rewarded for this operation.

Since the incentive factor of a node can change within short periods of time, the rewarding IFA0 to be used in step 3 is the one stated in the reputation certificate whose serial number SNA0 appears in the forwarding contract Contr. However, when the transmission path is short (n < 5), the rewarding IFA0 cannot exceed 1. This prevents fake nodes from creating looping traffic between them in order to increase their credit. Then,

   Rewarding IFA0 = 1, if n < 5 and IFA0 > 1; IFA0, otherwise.

It has to be noted that the charging and rewarding model we propose is unbalanced; hence, it faces a problem of credit saturation when all nodes reach the maximum credit level. This congestion would lead the system to work as if it were a plain model that can neither prioritize transmission packets to provide a quality
Fig. 3. Illustration of the payment charges (A0, with IF = 0.8, pays 5 credits/packet over the long route A0–A1–A2–A3–A4–A5, whose relays earn 0.8 credit/packet each and whose destination A5 earns 1 credit/packet; node thresholds range from 0.5 to 1)
of service, nor offer any real incentive to the routing nodes to spur the data forwarding. To avoid such a case, the RpA maintains a sliding window for each user that inspects the accumulated amount of data forwarded by that user during the last 30 days. If the result does not exceed 1% of his forwarding credit, the credit will be reduced by 1% for every day that passes in these conditions.

Figure 3 illustrates the charging and rewarding model with an example. Nodes only transmit packets from initiators whose incentive factor is greater than a threshold. Node A0 has two connection routes to node A5; however, it cannot use the shortest one to send data to A5, because its reputation value is not high enough to encourage the intermediate nodes of this path to forward its packets. Nodes on the shortest path are centrally located in the network, receive a lot of forwarding requests, and only relay packets of users who are very collaborative and have a high reputation level. As a result, A0 has to select the longer path for the transmission, which is more expensive (it costs 5 credits/packet instead of 3 credits/packet) but offers the required availability.
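The hash-chain payment proofs of Section 3.2 and their verification by the RpA (step 2 above) can be sketched as follows. The paper does not fix a concrete hash function, so SHA-256 is assumed here, and the helper names are ours:

```python
import hashlib

def h(x: bytes) -> bytes:
    # the iterated hash function; SHA-256 is our assumption
    return hashlib.sha256(x).digest()

def make_chain(w0: bytes, m: int) -> list:
    """Payword chain w_0, ..., w_m with w_j = h(w_{j-1}).
    The sender commits to the top value w_m in the contract offer."""
    chain = [w0]
    for _ in range(m):
        chain.append(h(chain[-1]))
    return chain

def verify_proof(w_m: bytes, proof: bytes, d: int) -> bool:
    """RpA/relay check: proof = w_{m-d} pays for d packets iff h^d(proof) == w_m."""
    x = proof
    for _ in range(d):
        x = h(x)
    return x == w_m
```

Revealing `w_{m-d}` thus proves payment for d packets with a single value, since anyone holding the signed `w_m` can re-hash d times, while inverting h to forge a deeper value is infeasible.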
4 Evaluation
Simulations of FURIES were conducted to evaluate the general characteristics of the protocol and provide a proof of concept. We used a self-developed application that considers network-layer factors and allows us to make qualitative appraisals. However, we do not model the problems of the physical and link layers, so quantitative performance figures cannot be directly extracted from the tests. We simulated two different payment models: a plain payment protocol without incentives, such as [9] (that is, sending one packet through 3 hops costs 3 credits, and each intermediate node gets 1 credit), and the proposed FURIES protocol with the incentive factor defined in Section 2.2. The simulated networks are composed of 100 nodes that move randomly in a square area of 1000 m²; the transmission range is 70 m. Each node starts, on average, 2 transmissions a day, with message sizes uniformly distributed from 1 Kb to 10 Mb. The application is run over a simulated period of one year, and 100 simulation runs have been performed. Table 1 compares the results of a population attempting to send data through a multihop network, giving the mean and variance over the 100 simulations. We have
H. Rifà-Pous and J. Herrera-Joancomartí

Table 1. Forwarding Simulation: Plain protocol vs. FURIES
                                                    Plain protocol            FURIES
Ratio of accepted transmissions                     E(X) = 69%, σ² = 0.95     E(X) = 83%, σ² = 1.82
Reputation level of accepted senders vs. average    E(X) = 0%,  σ² = 4.64     E(X) = 8%,  σ² = 0.64
Reputation level of rejected senders vs. average    E(X) = 0%,  σ² = 5.23     E(X) = −12%, σ² = 2.59
modeled the users' willingness to forward packets based on their available resources (i.e., battery level) and the profit they can make from the action. Relaying parties do not transmit if the battery level is below 20%. Between 20% and 50%, our assumption is that they will relay packets if they obtain a credit rate above the cost price, in particular a benefit of more than 30%. If the remaining battery is above 50%, users will transmit if the reward is at least 90% of what they offer. Regardless of the battery level, we also assume that nodes with a negative credit balance will accept any forwarding request. When a forwarding request is rejected, the initiator has to search for another routing path; it tries up to five times. First of all, it has to be noted that the number of accepted transmissions in FURIES is greater than in the plain payment protocol. This is one of the goals of incentive protocols, and FURIES achieves it. By offering appropriate incentives (a good reputation status that, as we show in the next point, provides quality of service), FURIES can exploit the maximum forwarding capacity of nodes and thus improve the overall throughput of the network. The service of forwarding packets is rewarded with credits, and the accumulation of credits increases the reputation status. The second and third rows of Table 1 show that in plain mode the reputation level of nodes whose packets are accepted or rejected is not relevant, since its average is the same as that of the rest of the population. That is, in spite of its accumulated credits, the traffic of any node can be blocked. In FURIES, by contrast, accepted traffic comes from nodes that hold a better profile (8% better than the average), and rejected traffic comes from nodes that tend to behave more selfishly (their reputation is 12% worse than the average). Hence the connectivity of cooperative nodes takes priority, and such users receive a better quality of service.
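The behavioral model assumed in the simulation can be sketched as follows (a minimal illustration; the function name, the ordering of the checks, and letting negative-balance nodes override the battery rule are our assumptions):

```python
def accepts_forwarding(battery: float, reward: float, cost: float,
                       credit_balance: float) -> bool:
    """Simulated willingness of a relay node to forward a packet.

    Rules from the evaluation setup:
    - nodes with a negative credit balance accept any request;
    - below 20% battery, nodes refuse;
    - between 20% and 50% battery, nodes require a benefit above 30%;
    - above 50% battery, a reward of at least 90% of the offer suffices.
    """
    if credit_balance < 0:
        return True
    if battery < 0.20:
        return False
    if battery < 0.50:
        return reward > 1.30 * cost
    return reward >= 0.90 * cost
```

Under this model, a rejected request forces the initiator to try another routing path, up to five times.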
FURIES spurs cooperation but does not enforce it. There are multiple reasons why a node may be unable to collaborate at a given moment (lack of resources, bandwidth, etc.). What is not acceptable is continuous selfish behavior, and that is what is penalized. Moreover, when users enter the FURIES system, they start with a negative reputation level in order to prevent Sybil attacks that would allow unfair exploitation of the system. In general, the advantage of FURIES over other credit-based mechanisms [8,6,7] is that credits have a double use: they are the exchange currency of the payment protocol and, moreover, the hook that attracts nodes to relay the packets of certain users. The accumulation of credits is rewarded, and because credits cannot be obtained by external means, users have to provide resources to the network if they want to benefit from its services.
[Figure 4 plots packets delivered (%, left axis, 0–100) and credit relative to the average credit (%, right axis, −20 to +20) against x = Threshold/Router_IF (0.0–1.0), with series for throughput, accepted/average credit, rejected/average credit, and the credit trend.]
Fig. 4. Forwarding response of an ad hoc network
The evolution of an ad hoc network depends on the behavior of each of its members and how they react to the proposed incentives. We ran a simulation of FURIES to analyze the performance of the network relative to the threshold used to trigger the forwarding service. We assume nodes always refuse to forward when their battery level is below 20% of its capacity. Otherwise, they accept the transmission if the incentive factor of the initiator stated in its reputation certificate, IFA0, exceeds their own IF by a certain factor x, that is, IFA0 ≥ IFAi · x, where Ai is the forwarding node. Figure 4 shows the results of the simulation as a function of the parameter x, the quotient between the triggering threshold and the incentive factor of the forwarding node. The background columns of the figure depict the percentage of packets accepted for transmission. The throughput of the network is nearly constant whatever the threshold. However, when we harden the condition and require IFA0 to equal the IFAi of the forwarding node (x = 1), the throughput drops to 58%, and incrementing the threshold further would drive it toward 0%. From this result it may seem that the best choice of x is a low one. However, for very low values of x we cannot offer quality of service: the probability of getting a packet rejected is nearly the same for all kinds of nodes. It is worth noting the lines in the figure that show the relation between the reputation level of the nodes whose packets are accepted or rejected and the average level. The farther apart these lines are, the better the quality of service offered, because the reputation of a node then has more influence on the forwarding acceptance decision. Moreover, the figure depicts with black arrows the credit storage trend of a group of nodes whose initial credit level was 0.
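The forwarding trigger used in this simulation can be sketched as follows (names are ours; the rule itself, IFA0 ≥ IFAi · x with a 20% battery cut-off, is the one stated above):

```python
def relay_accepts(if_initiator: float, if_relay: float, x: float,
                  battery: float) -> bool:
    """Forwarding trigger of relay Ai for initiator A0 in the Figure 4
    simulation: refuse below 20% battery, otherwise forward iff
    IF_A0 >= IF_Ai * x."""
    if battery < 0.20:
        return False
    return if_initiator >= if_relay * x
```

Sweeping x from 0 to 1 with this rule is what produces the throughput and credit curves discussed around Figure 4.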
When x is low, the credit storage of the group tends to decrease, so in the long run nodes will not have credit to transmit. Therefore, there is a trade-off in getting the best results. Setting low thresholds increases performance in the short term, but the network becomes unhealthy: fewer credits, no quality of service, and, in the end, less motivation
to do the forwarding. On the other hand, high thresholds can reduce the throughput of the network. Consequently, there is no fixed optimum threshold; it depends on the resources of the node, its eagerness to transmit and hence its need to obtain credits, etc. The threshold is a variable that has to be adjusted in each case to obtain the expected reactions. However, the adjustment can be done automatically to meet the requirements of a specific environment.
5 Conclusions
The work presented in this paper describes a simple solution to stimulate cooperation in multihop networks and provides a proof of concept. We have analyzed the protocol and, by means of simulation, evaluated the functionality of the system based on its configurable parameters. The results show that FURIES fulfills its objectives: it improves the throughput of the network and reinforces quality of service for collaborative nodes. As future work, we plan to study the performance of the protocol in real environments, evaluate its overhead in terms of energy consumption and delay, and compare it quantitatively and qualitatively with other incentive mechanisms.
Acknowledgement The work described in this paper has been supported by the Spanish MCYT with a grant for the project PROPRIETAS-WIRELESS SEG2004-04352-C04-04.
References
1. Marias, G.F., Georgiadis, P., Flitzanis, D., Mandalas, K.: Cooperation enforcement schemes for MANETs: A survey. Wirel. Commun. Mob. Comput. 6, 319–332 (2006)
2. Buchegger, S., Boudec, J.L.: Nodes bearing grudges: Towards routing security, fairness, and robustness in mobile ad hoc networks. In: Euromicro Workshop on Parallel, Distributed and Network-based Processing (PDP) (2002)
3. Michiardi, P., Molva, R.: CORE: A COllaborative REputation mechanism to enforce node cooperation in Mobile Ad Hoc Networks. Institut Eurecom, RR-02-062 (2001)
4. Rebahi, Y., Mujica, V., Simons, C., Sisalem, D.: SAFE: Securing packet Forwarding in ad hoc networks. In: Workshop on Applications and Services in Wireless Networks (2005)
5. He, Q., Wu, D., Khosla, P.: SORI: A Secure and Objective Reputation-based Incentive Scheme for Ad-hoc Networks. In: IEEE Wireless Communications and Networking Conference (2004)
6. Buttyán, L., Hubaux, J.: Nuglets: a virtual currency to stimulate cooperation in self-organized ad hoc networks. Tech. Rep. DSC (2001)
7. Anderegg, L., Eidenbenz, S.: Ad hoc-VCG: a truthful and cost-efficient routing protocol for mobile ad hoc networks with selfish agents. In: Mobile Computing and Networking (MobiCom), pp. 245–259. ACM Press, New York (2003)
8. Zhong, S., Chen, J., Yang, Y.R.: Sprite: A simple, cheat-proof, credit-based system for mobile ad hoc networks. In: IEEE INFOCOM, vol. 3, pp. 1987–1997. IEEE Computer Society Press, Los Alamitos (2003)
9. Buttyán, L., Hubaux, J.: Stimulating cooperation in self-organizing mobile ad hoc networks. Tech. Rep. DSC (2002)
10. Crowcroft, J., Gibbens, R., Kelly, F., Ostring, S.: Modelling incentives for collaboration in mobile ad hoc networks. Perform. Eval. 57, 427–439 (2004)
11. Ileri, O., Mau, S.C., Mandayam, N.: Pricing for enabling forwarding in self-configuring ad hoc networks. IEEE J. Sel. Areas Commun. 23, 151–162 (2005)
12. Yoo, Y., Ahn, S., Agrawal, D.: A credit-payment scheme for packet forwarding fairness in mobile ad hoc networks. In: IEEE International Conference on Communications (ICC) 5, 3005–3009 (2005)
13. Congdon, P., Aboba, B., Smith, A., Zorn, G., Roese, J.: IEEE 802.1X Remote Authentication Dial In User Service (RADIUS) Usage Guidelines. RFC 3580 (2003)
14. Rivest, R.L., Shamir, A.: PayWord and MicroMint: Two Simple Micropayment Schemes. In: Security Protocols Workshop, pp. 69–87 (1996)
15. Tewari, H., O'Mahony, D.: Multiparty micropayments for ad hoc networks. In: IEEE Wireless Communications and Networking Conference (WCNC) 3, 2033–2040 (2003)
Direct Conversion Transceivers as a Promising Solution for Building Future Ad-Hoc Networks

Oleg Panfilov¹, Antonio Turgeon¹, Ron Hickling¹, and Lloyd Linder²

¹ Technoconcepts, 6060 Sepulveda Blvd., Van Nuys, CA 91411, USA
² Consultant
{panfilov, tony, ronh, llinder}@technoconcepts.com
Abstract. A potential solution for building ad-hoc networks is described. It is based on Technoconcepts' TSR chipset, which provides direct conversion of RF signals to baseband. Such RF/DTM chips convert the received signals into digital form immediately after the antenna, making it possible to perform all required signal processing in digital form and thus allowing adaptive selection of frequency bands free of interference. This flexibility provides the most favorable conditions for quality communications. The offered solution has its own set of challenges; the paper describes the major ones and potential ways of addressing them. An example of a possible ad-hoc network architecture based on the RF/DTM chips is presented. Keywords: Direct RF conversion, dynamic spectrum allocation, opportunistic networks, network connectivity, frequency agility, protocol independence.
1 Introduction

Ad-hoc wireless networks (AWN) show great promise in meeting the stringent communication requirements for system self-organization/self-configuration in the very dynamic and unpredictable communication environment typical of mission-critical applications. Operation in such an environment, with its random pattern of interference, demands adaptive principles for the allocation of system resources in general and frequency bands in particular. Adding to the seriousness of the situation is the acute shortage of available frequencies, which slows the convergence of a viable mix of voice, data, and video. The major culprit in the current spectrum shortage is the static nature of frequency allocation. Dynamic spectrum allocation (DSA) is a relatively new concept, taking advantage of the architectural features of software defined radio (SDR), that allows clear, unjammed reception of a radio service when the pre-determined allocated frequency band contains interference from another, unwanted service. DSA would substantially enhance quality of service by eliminating or substantially minimizing connection downtime due to an unfavorable RF environment. It could be the sought-after solution for mission-critical applications, although any kind of communication system would undoubtedly benefit from it. The necessity of reviewing the current state of affairs in spectrum utilization was emphasized in [1], [2]. The suggestion is to rely more on market forces rather than on

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 294–305, 2007. © Springer-Verlag Berlin Heidelberg 2007
administrative regulations. Dynamic channel frequency assignment is proposed as the main mechanism to be implemented in the design of the physical and link layers of the OSI network model. This idea has attracted a lot of attention as a solution to the current spectrum shortage. For example, adaptive, dynamic mechanisms of spectrum reuse based on smart radio are considered in [4], [5] as a viable alternative to the current static frequency allocation. The importance of adaptive principles in DSA, including priority classes of different services for individual or multiple operators in a multi-vendor environment, is described in [6]–[21]. This paper leverages the ideas presented in [4]–[21] and provides the next step in the implementation of DSA. The main focus here is the practical realization of spectrum management principles by utilizing the direct conversion RF/DTM chips. These chips are able to provide universal network access across a broad range of frequencies covering the major wireless communication protocols, including CDMA, GSM, WiMAX, and WiFi, as well as future wireless standards. The emphasis is on using dynamically downloadable software to operate in a frequency-agile environment based on dynamic frequency management. A range of possible scenarios implementing RF/DTM chips illustrates the benefits of frequency agility and protocol independence in solving DSA problems. The solution of one problem, in particular dynamic spectrum allocation, brings other problems that require adequate attention. This paper shows the benefits and challenges of broadband frequency-agile chips in light of the specifics of operating in broad frequency ranges. It is necessary to cancel powerful interferers at the receiver front end even when these interferers are outside the spectrum of the desired signals, as can be seen in Figure 1.
The spectral occupancy measurements for Figure 1 were performed in New York City during the 2004 Republican Convention, when traffic intensity was substantially above the average level. It can be seen from Figure 1 that a lot of interference is picked up by the broadband receiver.
[Figure 1 annotations: a continuous distribution peaked at low values corresponds to many handsets at many distances; large amplitude concentrations correspond to a small number of base stations.]
Fig. 1. Amplitude histogram of PCS band (courtesy of [3])
Proper addressing of in-band and out-of-band interference will preserve the dynamic range of the receiver front end and avoid operation in the nonlinear region, with all the resultant negative consequences. It has to be noted that since interference is by nature unpredictable, it has to be cancelled or avoided by adaptive means; the approach to this solution will be described. Dynamic spectrum allocation can be viewed as an area where new solutions can be provided to an industry that is based on static spectrum allocation principles. Such outdated allocation principles result in gross inefficiencies in the use of allocated frequencies on one side and an acute shortage of these frequencies on the other. A successful solution to the spectrum availability problem would create a highly sought-after win-win situation for both parties: customers as well as service providers. A significant amount of system simulation work has been done to look at how this concept could be implemented at the system level. What is lacking, and is the focus of this paper, is the analysis of circuit-level architectures to support the concepts of DSA from an integrated-circuit standpoint, in order to understand the practical limitations of implementing the concepts in hardware, as well as to perform architectural trade-off studies to determine solutions that overcome these limitations. The main body of the paper has the following structure. Section 2 describes the main DSA challenges. Section 3 shows the main architectural features and technical realization specifics of ad-hoc networks incorporating DSA. Section 4 is devoted to the existing TSR chip test results; the results show the level of the chip's maturity, the areas where it can be utilized right away, and what kind of improvements are expected in future chip generations. Section 5 summarizes the obtained results.
2 Main DSA Challenges

The main DSA implementation challenges are easy to understand from an analysis of system operation. Figure 2 shows the block diagram of the network operating environment of an ad-hoc network system implementing the DSA concept. Here, a multitude of simultaneously operating wireless networks, including wide area networks (WANs), metropolitan area networks (MANs), and global area networks (GANs) using satellites at different elevations, with shared pre-defined frequency bands, have to coexist with ad-hoc networks set up to deal with an emergency situation. The illustration in Figure 2 corresponds to setting up an ad-hoc network to deal with a large forest fire, coordinating the operation of first-response teams from the fire, ambulance, and police departments. Two-way communications between each element and the DSA controller are shown in purple. This is an example of centralized control of frequency allocation. It assumes analysis of the interference level at each site and transmission of that information to a DSA controller. The controller analyzes data from all participating nodes and sends allocated frequencies to each node. As soon as the interference environment changes, the DSA controller sends an updated frequency distribution based on the newly available frequencies. To operate in such a dynamic environment, the individual nodes of an ad-hoc network have to be able to receive and transmit in a broad range of
Fig. 2. Network operating environment (an ad-hoc network using DSA control coexists with multitude of traditional networks)
frequencies, in addition to being able to scan the entire operational frequency range sequentially with a separate set of transceivers to provide the DSA controller with data on the spectrum distribution of interference. Figure 3 shows ad-hoc network operation, separately illustrating the interaction of its components. The practical implementation requires coordination within the network between the node receivers and the DSA controller, as shown in Figure 3. Here the DSA controller monitors the
Fig. 3. An extended view of an ad-hoc network in action
operational environment through back channels with the ad-hoc network nodes, where each node is equipped with a scanning RF receiver measuring the local interference levels.
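A toy version of the centralized assignment step might look like the following (purely illustrative, with names of our own choosing; a real controller would also have to avoid assigning the same band to interfering neighbors):

```python
def assign_frequencies(reports: dict) -> dict:
    """Centralized DSA assignment sketch.

    `reports` maps each node id to a dict of {band: measured interference
    level}. The controller returns, for each node, the band with the
    lowest reported interference. When the interference environment
    changes, nodes send new reports and this step is simply re-run.
    """
    return {node: min(levels, key=levels.get)
            for node, levels in reports.items()}
```

For example, a node reporting heavy interference in band 1 and little in band 2 would be assigned band 2 until its next report says otherwise.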
3 Main System Architectural Features

The DSA concept is based on determining a frequency band free of interference and using it for the current communication session. In addition, it must support real-time mission-critical services across ad-hoc wireless networks, involving traditional quality of service (QoS) mechanisms for dynamic allocation of RF spectrum among multiple users. With that in mind, a partial set of requirements for the implementation of DSA in ad-hoc networks is given below:
• Operational frequency band: It has to be sufficient to include the major frequency bands of the most popular communication standards.
• Low switching time: Low latency in switching from one spectrum band to another is important for providing uninterruptible service. It has to be less than 150 ms to satisfy QoS requirements.
• Low bit error rate (BER): A low BER is important to maintain the required QoS.
• Spurious-free dynamic range (SFDR): A high system SFDR allows operation across a broad dynamic range of input signals without the negative impact of nonlinear distortion and compression of the receiver front end.
• Power efficiency: It is important to prolong the battery life of each device, thus extending the lifetime of the entire system.
Direct Conversion Transceivers as a Promising Solution
299
• Self-organization: Procedures involved in self-organization include spectrum occupancy analysis within the targeted range of frequencies, selection of a few recommended frequencies for potential users, and sending these recommendations to the participating nodes. The actual implementation can be based on centralized or distributed approaches, using dedicated transceivers and specialized signal processing for the spectrum analysis and allocation functions. Both options have issues.
• Scalability: This requirement refers to the ability of a system to support a variable number of nodes without affecting the performance of each of them. It covers the number of nodes, the traffic load, and some mobility aspects.
• Security: Security is always a critical aspect in the deployment of a wireless network, since the broadcast nature of wireless signals makes the network vulnerable to attacks at various protocol layers. Resilience to physical-layer jamming is crucial for SDR. Adaptive interference cancellation may be recommended to enhance system security and the availability of its services.
• Multicasting: Efficient multicast should be supported by the ad-hoc network. For example, in two-way or many-to-one communication, many services such as voice and multimedia require simultaneous processing of different signals. These signals come from geographically distributed sources that may have variable signal power spectral density at a receiver input. The frequency allocation to each of the originating sources has to take into account the interference level at the destination's location.

As a further extension of the DSA concept, it is conceivable that multiple services could share frequency bands, or use frequency bands that they typically do not have, by taking advantage of this intelligent network.
DSA provides a new way of using communications transceivers to optimally avoid jamming and interference in different frequency bands and services. By doing so, the receiver is not taxed from a dynamic-range perspective. Consequently, through the use of digital programmability, the DC power of the receiver can be optimized for the normal operating scenario: in the absence of interference, the receiver block requirements can be relaxed and the DC power of the receiver reduced, enhancing its efficiency. As with all new concepts, the DSA implementation will evolve architecturally. The initial applications of DSA may be expected in systems that can afford to dissipate more power while the ideas are refined. Eventually, the concept will be applied to consumer/hand-held applications, and the flexibility of digital programmability, as well as software license upgradeability, will open up an entirely new way of looking at wireless communications. Putting it all together, a DSA-controlled system might look like the one shown in Figure 4, which presents the DSA controller from Figure 3 in more detail. There are two signal paths: one for transmission and the other for reception. Figure 4 shows a conceptual block diagram of a DSA system. The system includes two transceivers and baseband DSA controllers. A baseband controller provides intelligent digital signal processing. Initially, the receiver on transceiver #1 is in SCAN mode, with the wide-band input filter setting selected. The entire frequency range is down-converted with the receiver in a high-dynamic-range mode. The
Fig. 4. Detailed Block Diagram of a Conceptual Implementation of the DSA architecture
frequency band of interest is probed for the signals of interest as well as for interference. If there are interferers, the transmitter sends information to the receiver in transceiver #2 to move frequency bands. The transmitter of transceiver #2, as well as the filter bank of the receiver on transceiver #1, is digitally programmed to the new frequency band, and the signal is received on a narrow-bandwidth channel. If there are no interferers, the filter bank on transceiver #1 is set to the proper frequency band, and the signal is received on a narrow-bandwidth channel. For reception on transceiver #2, the opposite holds. Additional transceivers can be added to create a mesh network, and the baseband processors must keep track of the services, time-slotting, frequency bands, and hand-shaking that must occur for the network to work properly, without self-jamming. Aside from the TechnoConcepts radios, most of the equipment is "off the shelf". The system controller function is provided by a standard personal computer. Ethernet is used for all inter-equipment data transfer; therefore standard routers form the core of the interoperability switch and the channel bank MUX and controller. For transmission, the baseband signal is routed to the proper radio by the channel bank multiplexer. An appropriate TX module (a radio exciter) is software-defined to operate on the appropriate channel using the appropriate protocol. The output of the exciter is routed to the appropriate PA/channel through the transmit cross-point switch. This switch permits any TX module to be used as an exciter for any PA/channel combination. Customized PA modules are used for each channel to match the PA requirements to the licensed power and mode(s) available to the channel. Duplicate hot-standby PAs automatically replace the main PA in case of failure, and alternate TX modules can serve as hot standbys, resulting in complete redundancy for the entire radio.
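The scan-and-switch decision described above can be sketched as follows (an illustrative model under our own naming; the paper does not specify the algorithm beyond this behavior):

```python
def choose_band(scan: dict, current_band: int, threshold: float) -> int:
    """Decision made after transceiver #1's wide-band scan.

    `scan` maps each candidate band to its measured interference level.
    If the current band is clean enough, keep it (no retuning latency);
    otherwise instruct transceiver #2 to hop to the quietest band.
    The `threshold` parameter is an assumption of this sketch.
    """
    if scan.get(current_band, float("inf")) <= threshold:
        return current_band
    return min(scan, key=scan.get)
```

Note that keeping the current band whenever it is acceptable helps meet the sub-150 ms switching-latency requirement by avoiding unnecessary retunes.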
For reception, the channel signal is first routed through a receive cross-point switch, permitting any RX module to be assigned to any channel. The chosen RX module is software-defined to operate on the desired channel with the appropriate protocol. As in the TX case, any RX module can be assigned to any channel, providing maximum reliability, flexibility, and redundancy. The demodulated baseband signal is then routed through the channel bank to the desired location. These locations can include another transmitter (for simple repeater operation), a trunk controller (for a trunking configuration), a pager server (for a pager configuration), and/or the dispatcher
via the microwave link. Of course, multiple transmitters, including other base stations, can be fed simultaneously to configure a simulcast system, through appropriate processing and time delays. Figure 5 provides block diagrams of the TX and RX modules. These modules are based upon TechnoConcepts' proprietary RX and TX silicon-germanium integrated circuits. These "chips" bridge the radio-frequency domain with the digital processing domain: the receiver chip provides a one-way conversion from a radio signal to digital signals, and the transmitter chip provides a one-way conversion from digital signals to a radio signal. The baseband processors are digital integrated circuits that receive or transmit the desired radio message using the proper channel and the proper protocol. The configuration and control processors choose the proper RX and/or TX module, set it to the proper channel, tell it to use a chosen protocol, and route the appropriate baseband signals. The channel bank multiplexer and controller routes the baseband signals to and from the RF Gateway system, defines system configurations (interoperability), and provides interoperability with other systems. Figure 5 consists of multiple transceivers of the kind shown in Figure 4. This network is controlled by the baseband DSA controller network, and the hand-shaking procedure between the transceivers must be accomplished intelligently in order to avoid self-jamming.
Fig. 5. Details of the DSA network
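The any-module-to-any-channel routing with hot standby can be modeled minimally as follows (class and method names are ours; this only illustrates the redundancy behavior, not the actual switch hardware):

```python
class CrossPointSwitch:
    """Minimal model of the TX/RX cross-point routing.

    Any module may serve any channel; if the module serving a channel
    fails, the first idle module takes over, giving the redundancy
    described in the text.
    """

    def __init__(self, modules):
        self.modules = list(modules)   # modules currently in service
        self.assignment = {}           # channel -> module

    def assign(self, channel):
        """Route the first idle module to `channel`."""
        for m in self.modules:
            if m not in self.assignment.values():
                self.assignment[channel] = m
                return m
        raise RuntimeError("no idle module available")

    def fail_over(self, channel):
        """Take the failed module out of service and re-assign the channel."""
        failed = self.assignment.pop(channel)
        self.modules.remove(failed)
        return self.assign(channel)
```

With two TX modules, a failure of the active exciter on a channel is handled by routing the standby module to that channel through the same switch.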
4 Test Results

In support of DSA, the TSR receiver and transmitter ICs have been developed by Technoconcepts. The receiver IC has a wide-band RF front end, allowing the digitization of a wide frequency range. The test set-up for the TSR measurements includes two external synthesizers: one for the RF input frequency and one for the RF clock source. A simplified version of the test set-up is shown in Figure 6.
302
O. Panfilov et al.
Fig. 6. Block-diagram of system test
Additionally, band-pass filters are included at the outputs of the synthesizers to filter out the wide-band noise and distortion of the signal sources. The ADC produces a serial bit stream, which is de-multiplexed on-chip. The de-multiplexed digital data then goes into an on-chip decimating filter block, whose output is the digital output of the chip. This data is transferred from the receiver evaluation board to a data capture board, which interfaces to a PC, and a Fast Fourier Transform (FFT) is performed on the digital data. The digital output bits of the TSR chip are a decimated, filtered version of the on-chip ADC's digital output. The spurious-free dynamic range (SFDR) and signal-to-noise ratio (SNR) over a frequency range of 850 MHz to 6 GHz are obtained from the ADC FFT plots and are shown in Figures 7 and 8.
[Figure 7 plots the measured receiver SFDR (dBc, approximately 30–75) versus ADC output power (−45 to −5 dBFS) for receiver input frequencies of 0.85, 1.75, 2.75, 3.5, and 6 GHz.]
Fig. 7. Existing TSR SFDR as a function of ADC output power
[Figure 8 plots the measured receiver SNR in a 5 MHz bandwidth (dBc, approximately 15–60) versus ADC output power (−45 to 0 dBFS) for receiver input frequencies of 0.85, 1.75, 2.75, 3.5, and 6 GHz.]
Fig. 8. Existing TSR SNR as a function of RF input frequency
the receiver’s SFDR, versus ADC output power, for a few typical frequencies within the entire operational range are shown on Figure 7. Frequencies selected include the end points of operational range as well as a few ones in proximity of popular communication standards implementation. It can be seen from Figure 7 that SFDR, as expected, degrades over frequency, however, the receiver demonstrates frequency agility, and the ability to receive and digitize a very wide band of frequencies. At lower RF input frequencies, the receiver’s measured SFDR is on the order of 70 dB. Figure 8 shows measurements of the receiver’s SNR in a 5 MHz bandwidth versus the ADC amplitude for several input RF frequencies. The SNR performance of the receiver supports a number of wireless applications, and thus demonstrates the frequency agility and flexibility of the receiver. For DSA applications, in order to encompass a wide frequency range, wide band receivers such as the TSR receiver IC will be needed. These receivers will be used in conjunction with intelligent / switch-able front end filter banks, in order to take advantage of the DSA architecture.
5 Conclusion

This paper has described a new approach to addressing the implementation specifics of ad-hoc networks (AHN) destined to operate in the very dynamic and unpredictable communication environments typical of mission-critical applications. The proposed approach is based on using direct conversion RF/DTM chips, taking advantage of the architectural features of software defined radio (SDR), to allow for clear, un-jammed reception of a radio service when the pre-determined allocated frequency band
304
O. Panfilov et al.
contains interference from another, unwanted service. RF/DTM chips convert the received signals into digital form immediately after the antenna, making it possible to perform all required signal processing in the digital domain and thus allowing adaptive selection of frequency bands free of interference. The advantages and challenges associated with applying dynamic spectrum allocation (DSA) to ad-hoc networks have been presented. Notional concepts have illustrated how an RF/DTM transceiver can be used to implement DSA for a wireless system. All the analysis was done under the assumption that the conventional regulatory constraints on the current static spectrum distribution are either relaxed or removed. A mesh network of transceivers operating on the DSA principle has been discussed. There are architectural trade-offs and system issues that need to be developed further for practical implementations of the DSA concept. To that end, Technoconcepts has built and demonstrated a wide-band receiver front-end IC that can support further development of DSA. The SNR and SFDR of the IC have been measured over a broad range of RF input frequencies, verifying that it is feasible to support the DSA concept from a hardware standpoint.
Location Tracking for Wireless Sensor Networks

Kil-Woong Jang
Division of Nano Data System, Korea Maritime University
1 YeongDo-Gu Dongsam-Dong, Busan, Korea
[email protected]
Abstract. In location tracking, there is a trade-off between the data accuracy of mobile targets and the energy efficiency of sensor nodes. If the number of nodes used to track a mobile target is increased, the level of data accuracy increases, but so does the energy consumption of the nodes. In this paper, we propose a new location tracking scheme that considers these two factors. The proposed scheme is designed to track mobile targets in a cluster-based sensor network. In order to increase the energy efficiency of the sensor nodes, a portion of the nodes that detect the mobile target is selected to track the target using a backoff procedure. In addition, we address data accuracy by controlling the number of tracking nodes through the transmission range of the nodes. We perform a simulation to evaluate the performance of the proposed scheme over sensor networks, comparing it with the general approach. The simulation results show that the proposed scheme performs excellently over a broad range of parameters. Keywords: wireless sensor networks, location tracking, backoff procedure, moving objects, wireless networks.
1 Introduction
Sensor networks are an emerging technical challenge in ubiquitous networking. Using many sensor nodes with sensing, wireless communication and computation functions, a wide range of monitoring applications, covering temperature, pressure, noise and so on, has been studied in the literature [1]. Sensor networks consist of a large number of nodes which are placed very close to each other and form a multi-hop wireless topology. The nodes' batteries are constrained, and cannot be recharged or replaced after deployment. In sensor networks, the low power consumption requirement is therefore the most important constraint on the nodes. Detecting and tracking a mobile target is one of the challenging application areas in sensor networks [2-5]. In general, the nodes surrounding a mobile target detect and track the target, and collaborate among themselves to aggregate data regarding the target. The aggregated data can then be forwarded to a sink or a base station. A scheme for detecting and tracking a mobile target should provide reliable data about the target and forward that data to the sinks or base stations in a fast and energy-efficient way.

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 306–315, 2007. © Springer-Verlag Berlin Heidelberg 2007
In this paper, we propose a new scheme to detect and track a mobile target that differs from earlier work in how it balances energy efficiency and data accuracy. The proposed scheme is designed for a cluster-based network. The head of each cluster collects data about the mobile target from the nodes and aggregates the data to obtain more accurate information about the target. In general, all active nodes surrounding the mobile target can participate in tracking it. In the proposed scheme, however, only a portion of the active nodes track the target, so energy efficiency can be increased. To select the nodes that participate in tracking the mobile target, the proposed scheme makes use of a backoff procedure. A node that detects the mobile target broadcasts a message to the other nodes within its transmission range, using the backoff procedure. The number of nodes required to track the target can vary depending on the transmission range. In the performance evaluation, we evaluate the data accuracy and energy efficiency of the proposed scheme under a variety of parameters.
2 Related Work
Several approaches have been studied to detect and track a mobile target in sensor networks. Yang et al. [2] studied the problem of tracking moving objects using distributed wireless sensor networks in which sensors are deployed randomly. They proposed an energy-efficient tracking algorithm, called Predict-and-Mesh, that is well suited for pervasively monitoring various kinds of objects with random movement patterns. Predict-and-Mesh is a distributed algorithm consisting of two prediction models, n-step prediction and collaborative prediction, and a prediction failure recovery process called mesh. Zhang et al. [3] proposed a dynamic convoy tree-based collaboration (DCTC) framework to detect and track the mobile target and monitor its surrounding area. DCTC relies on a tree structure, which includes nodes around the mobile target, and the tree is dynamically configured to add and prune selected nodes as the target moves. As the configuration of the tree changes dynamically, they proposed two tree expansion and pruning schemes and two tree reconfiguration schemes, which they compared and evaluated under different node densities in terms of coverage and energy consumption. Halkidi et al. [4] presented a distributed mechanism to track moving objects efficiently and accurately with a network of sensors. Their mechanism provides set-up and cooperation of the sensors within the network, while providing fault-tolerant characteristics through replication. They provide an algorithm for predicting, with high probability, the future location of an object based on past observations by many sensors.
3 The Proposed Scheme

3.1 Assumptions
We present some assumptions before describing how the proposed scheme operates. Every node has capabilities for sensing, communication and processing.
Fig. 1. An example of selection process for the proposed scheme. A circle node represents a sensor node, and a rectangle node represents a mobile target. A dotted circle represents the maximum transmission range of a node.
In addition, all nodes have the same amount of energy, and communication and processing are assumed to be error-free. Every node knows its own position a priori, either through manual configuration or by using various other techniques [7].

3.2 Scheme Description
We describe two processes for the proposed scheme: selection and release. In the proposed scheme, all nodes periodically alternate between active and sleep status. The nodes with active status are capable of detecting and tracking a mobile target. Whenever a mobile target is detected by the nodes surrounding it, each of these nodes determines whether it will track the target using the backoff procedure. The backoff procedure used in the proposed scheme is slightly different from that of IEEE 802.11 [8], and works as follows. Each node selects a random backoff time, and the node that selects the shortest time transmits a DETECT message. The node that transmits the message to its neighbors starts to track the mobile target. Upon receiving the message, the neighbors stop their backoff procedure, do not wait for the remaining backoff time, and do not reply with any message. To briefly describe the selection process presented so far, we use the example shown in Fig. 1. Suppose five nodes detect the mobile target simultaneously and trigger the backoff procedure. If node A has the shortest backoff time, node A sends a DETECT message to its neighbors when its backoff timer reaches zero. In Fig. 1, nodes C and E receive the message from node A and stop their backoff procedure. However, as nodes B and D exist out of
Fig. 2. Collision examples of selection process for the proposed scheme. An arrow between nodes represents that the source sends a control message to the destination.
the transmission range of node A, they do not receive the message from node A. Thus, nodes B and D continue to progress through the backoff procedure. If node D then has the shortest backoff time, node D is also selected to track the target. If two or more nodes transmit the DETECT message simultaneously because they have the same backoff time, a collision occurs. We consider some examples of collisions. Suppose nodes A and E have the same backoff time; a collision then occurs as shown in Fig. 2(a). Nodes A and E start to track the mobile target after sending their DETECT messages. Due to the collision, no nodes receive the messages, and the others continue through the backoff procedure. In this situation, we can consider two cases. The first case is that node C has the shortest remaining backoff time. Node C then sends a DETECT message to its neighbors and starts to track the mobile target. On receiving the message, nodes A and E continue to track the mobile target, and nodes B and D finish the backoff procedure. Finally, nodes A, C and E track the mobile target. The second case is that node B or D has the shortest remaining backoff time. If it is node D, node D sends a DETECT message to its neighbors. Nodes A and E continue to track the target, since node C, which is the intermediate node between node A (or E) and node D, does not reply to the message. Therefore, three nodes, A, E and D, track the mobile target. Another example is as follows. Consider that nodes A and D have the same backoff time, as shown in Fig. 2(b). When node A sends a DETECT message to its neighbors, nodes C and E receive it. At the same time, when node D sends the message to its neighbors, nodes B and C receive it. Here, as node C concurrently receives the messages from nodes A and D, a collision occurs at node C. Therefore, since node C cannot recognize either message, it continues to progress through the backoff procedure.
Fig. 3. An example of release process for the proposed scheme. A dashed circle represents the maximum sensing range of a node.

When the backoff time of node C reaches zero, node C sends a DETECT message to its neighbors. Finally, nodes A, C and D track the target. As the target moves, it will eventually be located outside the sensing range of the tracking nodes. In such a case, the tracking nodes send a RELEASE message to their neighbors. Of the nodes receiving the message, those that can detect the target trigger the backoff procedure. As mentioned above, the node with the shortest backoff time is selected as the tracking node. To illustrate the release process of the proposed scheme, we consider the example shown in Fig. 3. Node A is tracking the target, and nodes B and C have only detected it. As the target moves, it leaves the sensing range of node A. Node A sends a RELEASE message to its neighbors, since it can no longer track the target. Upon receiving the message, nodes B and C start the backoff procedure, and the node with the shorter backoff time may track the target. Nodes that are not neighbors of node A but can detect the target also trigger the backoff procedure. In Fig. 3, node D is not a neighbor of node A and can detect the target. Nodes B and C start the backoff procedure only after they receive the message from node A, whereas node D starts the backoff procedure as soon as it detects the target. Therefore, node D starts the backoff procedure earlier than nodes B and C, and has a higher probability of being selected as the tracking node. Consequently, the proposed scheme does not need to change tracking nodes frequently when the target moves in a uniform direction. If the RELEASE message is lost, only the nodes outside the transmission range of node A (like node D in Fig. 3) start the backoff procedure. Our location tracking scheme is designed for a cluster-based sensor network [6]. As the head of each cluster aggregates the data from the tracking nodes, the proposed
scheme can increase the level of data accuracy and, by reducing the size of the data report, decrease the energy required to send data from the head to the sink.
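The selection and release logic described above can be illustrated with a small simulation. This is a sketch under simplifying assumptions (slotted backoff, ideal sensing, and a collision whenever two in-range nodes fire in the same slot); the function name, node representation and slot count are ours, not from the paper.

```python
import random
from math import dist

def select_trackers(nodes, tx_range, max_slot=32):
    """Pick tracking nodes via the backoff procedure.

    nodes: dict mapping node id -> (x, y) position of a detecting node.
    Each node draws a random backoff slot; the earliest unsuppressed node
    sends DETECT and becomes a tracker, silencing neighbours within
    tx_range. Simultaneous senders collide: a listener is suppressed only
    if it hears exactly one DETECT, matching the Fig. 2 examples.
    """
    backoff = {n: random.randrange(max_slot) for n in nodes}
    trackers, suppressed = set(), set()
    for slot in sorted(set(backoff.values())):
        firing = [n for n, b in backoff.items()
                  if b == slot and n not in suppressed and n not in trackers]
        trackers.update(firing)
        for m in nodes:
            if m in trackers or m in suppressed:
                continue
            heard = [n for n in firing if dist(nodes[n], nodes[m]) <= tx_range]
            if len(heard) == 1:   # two overlapping DETECTs collide at m
                suppressed.add(m)
    return trackers
```

Every node either becomes a tracker or is suppressed by a tracker within its transmission range, so each non-tracking node has a tracking neighbour — which is what lets the cluster head collect reports from a small covering subset.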
4 Performance Evaluation
We carried out a computer simulation to evaluate the performance of the proposed scheme. In this section, we describe the performance metrics, the simulation environment and the simulation results.

4.1 Performance Metrics
In order to evaluate the performance of the proposed scheme, the performance metrics of interest are:
– the number of tracking nodes, and
– total energy consumption.

4.2 Simulation Environment
The network model for the simulation consists of randomly placed nodes in a constant square area. We assume that the nodes are distributed randomly over a 200 × 200 m² area, which we divide into clusters of 50 × 50 m². In order to measure the energy dissipation of the nodes, we use the radio model developed in [5]. In this model, nodes have transmitter circuitry consisting of transmit electronics and a transmit amplifier. Let Ee be the energy dissipated in the transmit and receive electronics and Ea the energy dissipated in the transmit amplifier. We assume that Ee = 50 nJ/bit and Ea = 100 pJ/bit/m². We also assume that the energy loss depends on the distance between source and destination. Therefore, the transmit energy, Et, dissipated to send a k-bit data packet to a destination at distance d is as follows:

Et = Ee × k + Ea × k × d²   (1)
The value of the total energy, E, dissipated to transmit data between the tracking node and the head, is as follows:

E = Es + Et   (2)
Here Es is defined as the energy dissipated to detect and track the mobile target. However, Es is very small compared to Et, so we do not include it when calculating E in this paper. The proposed scheme needs additional energy, Ec, to carry out the selection and release processes for the tracking nodes. Therefore, E for the proposed scheme is as follows:

E = Es + Et + Ec   (3)
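Equations (1)–(3) map directly to code. A minimal sketch follows; the constant and function names are ours, and Es and Ec are left as parameters since the paper treats Es as negligible and Ec as per-scheme protocol overhead.

```python
E_E = 50e-9    # J/bit: Ee, transmit/receive electronics
E_A = 100e-12  # J/bit/m^2: Ea, transmit amplifier

def transmit_energy(k_bits, d_m):
    """Eq. (1): Et = Ee*k + Ea*k*d^2 for a k-bit packet sent over d metres."""
    return E_E * k_bits + E_A * k_bits * d_m ** 2

def total_energy(k_bits, d_m, e_sense=0.0, e_control=0.0):
    """Eq. (2) with e_control=0, or Eq. (3) for the proposed scheme."""
    return e_sense + transmit_energy(k_bits, d_m) + e_control
```

For example, a 40-byte (320-bit) sensing report sent over 25 m dissipates Et = 16 µJ in the electronics plus 20 µJ in the amplifier, i.e. 36 µJ in total.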
The simulation parameters used to simulate the proposed scheme are listed in Table 1.
Table 1. Simulation Parameters

Parameters                                           Values
Network size (m²)                                    200 × 200
Number of nodes                                      100 – 700
Transmission range (m)                               6 – 20
Sensing range (m)                                    10
Simulation time (s)                                  1000
Speed of a mobile target (m/s)                       2
Size of a control message (DETECT, RELEASE) (byte)   10
Size of a sensing report: Rs (byte)                  40 or 80
4.3 Simulation Results
In this section, to evaluate the proposed scheme, we compare it with a different location tracking scheme in terms of the average number of tracking nodes and the total energy consumption. The compared scheme makes all nodes that can detect the mobile target track it; hereafter we call this the normal scheme. When comparing the proposed scheme and the normal scheme, both are run in the same cluster-based sensor networks. We first experiment with various transmission ranges for the sensor nodes, as shown in Figures 4 and 5. In Fig. 4, we plot the total energy consumption, and in Fig. 5 the average number of tracking nodes, for the two schemes. In these figures, we denote the proposed scheme as "proposed" and the normal scheme as "normal". We also experiment with two different sizes of sensing reports, Rs, for the two schemes: 40 and 80 bytes, respectively. Fig. 4 shows that the energy efficiency of the proposed scheme is approximately 3 times greater than that of the normal scheme, because the proposed scheme utilizes fewer nodes. In the normal scheme, when the size of the data report doubles, the energy consumption almost doubles as well. In the proposed scheme, however, doubling the size of the data report increases the energy consumption by less than a factor of two, because the total includes the energy consumed by the selection and release processes. When using the normal scheme, the average number of nodes required to track a target is about 5.5, as shown in Fig. 5; that is, there are about 5.5 nodes surrounding the target. The figure shows that the proposed scheme tracks the target using 2 – 4 times fewer nodes than the normal scheme. By using fewer nodes to track the target, the proposed scheme has the advantage of increased energy efficiency. However, as the number of tracking nodes decreases, the data accuracy can decrease. The figure also shows that as the transmission range of the nodes increases, the average number of tracking nodes decreases accordingly, because the messages of the backoff procedure reach farther with a longer transmission range. In particular, when the transmission range is twice the sensing range, only a single node tracks the target.
Fig. 4. Total energy consumption under various transmission ranges (total energy consumption in J versus transmission range in m, for the normal and proposed schemes with Rs = 40 and Rs = 80)

Fig. 5. Average number of tracking nodes under various transmission ranges (average number of tracking nodes versus transmission range in m, for the normal and proposed schemes with Rs = 40 and Rs = 80)
We next evaluate the schemes under various numbers of deployed nodes, as shown in Figures 6 and 7. In Fig. 6, we plot the total energy consumption, and in Fig. 7 the average number of tracking nodes, for the two schemes. Although the number of deployed nodes increases, the energy consumption of the proposed scheme increases only slightly, because the number of tracking nodes grows slowly. The energy dissipated in the normal scheme, however, increases in direct proportion to the number of deployed nodes. Moreover, when the size of the data report is doubled, the energy consumption of the normal scheme roughly doubles, while that of the proposed scheme increases only slowly. Fig. 7 shows that the proposed scheme uses fewer tracking nodes than the normal scheme. In particular, as the number of deployed nodes increases, the gap between the numbers of tracking nodes widens accordingly. When many nodes are deployed in the network, even though only a portion of the nodes surrounding the target track it, we can see that the energy efficiency is increased and
Fig. 6. Total energy consumption under the various numbers of deployed nodes (total energy consumption in J versus the number of deployed nodes, for the normal and proposed schemes with Rs = 40 and Rs = 80)

Fig. 7. Average number of tracking nodes under the various numbers of deployed nodes (average number of tracking nodes versus the number of deployed nodes, for the normal and proposed schemes with Rs = 40 and Rs = 80)
the data accuracy is decreased only slightly, as shown in the above figures. In Figures 5 and 7, the plots overlap because the average number of tracking nodes of the normal and proposed schemes is the same irrespective of Rs. The simulation results show that the proposed scheme achieves high energy efficiency by using only a portion of the nodes surrounding the target when many nodes are deployed in the network. In an environment requiring high data accuracy, the proposed scheme can increase the data accuracy by increasing the number of tracking nodes, which is achieved by reducing the transmission range of the nodes.
5 Conclusions

In this paper, we presented a new location tracking scheme that uses the energy of sensor nodes efficiently. The proposed scheme is designed to track mobile targets in a cluster-based sensor network. In order to increase the
energy efficiency of the sensor nodes, a portion of the nodes that detect the mobile target is selected to track the target using the backoff procedure. Moreover, we address data accuracy by controlling the number of tracking nodes through the transmission range of the nodes. Using simulation, we evaluated the performance of the proposed scheme in terms of the energy dissipated in the network and the data accuracy. The simulation results demonstrated that the proposed scheme outperforms a general scheme over various parameter ranges.
References
1. Akyildiz, I., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: a survey. Computer Networks 38, 393–422 (2002)
2. Yang, L., Feng, C., Rozenblit, J.W., Qiao, H.: Adaptive tracking in distributed wireless sensor networks. In: 13th Annual IEEE International Symposium and Workshop on Engineering of Computer Based Systems (2006)
3. Zhang, W., Cao, G.: DCTC: Dynamic Convoy Tree-Based Collaboration for Target Tracking in Sensor Networks. IEEE Transactions on Wireless Communications 5, 1689–1701 (2004)
4. Halkidi, M., Papadopoulos, D., Kalogeraki, V., Gunopulos, D.: Resilient and energy efficient tracking in sensor networks. International Journal of Wireless and Mobile Computing 1(2), 87–100 (2006)
5. Kung, H.T., Vlah, D.: Efficient Location Tracking Using Sensor Networks. In: Proceedings of IEEE WCNC (2003)
6. Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H.: An Application-Specific Protocol Architecture for Wireless Microsensor Networks. IEEE Transactions on Wireless Communications 1(4), 660–669 (2002)
7. Zou, Y., Chakrabarty, K.: Sensor Deployment and Target Localization Based on Virtual Forces. In: INFOCOM (2003)
8. IEEE Standard 802.11 for Wireless Medium Access Control and Physical Layer Specifications (August 1999)
An Incentive-Based Forwarding Protocol for Mobile Ad Hoc Networks with Anonymous Packets

Jerzy Konorski
Gdansk University of Technology
ul. Narutowicza 11/12, 80-952 Gdansk, Poland
[email protected]
Abstract. A mobile ad hoc network (MANET) station acts both as a source packet generator and a transit packet forwarder. With selfish stations and the absence of administrative cooperation enforcement, the lack of forwarding incentives has long been recognized as a serious design problem in MANETs. Reputation systems discourage selfishness by having past cooperation increase the present source packet throughput. We describe a simple watchdog-controlled first-hand reputation system and point to a form of selfishness not addressed by existing research, arising from packet anonymity. If the watchdog at a station cannot tell a nearby station's source packets from transit packets, that station is tempted to admit more source packet traffic than a fair local admittance control (LAC) scheme permits. We analyze a related noncooperative LAC game and characterize three types of its Nash equilibria. Next we propose a simple packet forwarding protocol named Decline and Force (D&F), and use an approximate performance model to show that, when properly configured, D&F leads to a fair and efficient game outcome.
1 Introduction

A mobile ad-hoc network (MANET) uses a wireless medium to interconnect a number of stations, some pairs of which are adjacent, i.e., can directly receive each other's transmissions. MANET stations should act both as terminals (admit source packets and absorb destination packets) and packet forwarders (transmit transit packets on behalf of nonadjacent stations); this enables multihop source-to-destination routes. However, forwarding in MANETs needs incentives. Firstly, stations belonging to different owners need not be concerned about global connectivity. Secondly, forwarding transit packets is doubly unprofitable: it consumes energy and delays source packets. Finally, MANETs allow station anonymity, and so refusal to forward by an "energy stingy" and/or "bandwidth greedy" station may meet with little punishment. One can envisage various kinds of selfish (as distinct from cooperative and malicious) forwarding behavior; [17] presents a taxonomy. Various schemes have been proposed to enforce cooperative forwarding behavior. Micropayment schemes make it necessary for a station to forward transit packets in order to earn a virtual currency with which to pay other stations for forwarding its source packets. Honest credit clearance necessitates tamper-proof cryptographic modules at each station [6, 7] or secure communication protocols [2, 21]. Typically,

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 316–329, 2007. © Springer-Verlag Berlin Heidelberg 2007
a station is required to forward at least as many transit packets as it transmits source packets; on the other hand, there will never be enough incentives to forward more than that [19]. To overcome this difficulty, a market-based approach [11, 13, 15, 20] has each station define a payoff it wants to maximize, while the underlying pricing scheme ensures that cooperative forwarding yields greater payoffs. A rational outcome of the stations' behavior, in the form of a Nash equilibrium [12], can then be predicted in a game-theoretic framework. Reputation systems such as CORE [18], CONFIDANT [5], or OCEAN [2] can in essence be considered a variety of market-based solutions. Here, part of the payoff is a reputation a station owes to past forwarding cooperation. Verification of packet forwarding can be provided by a watchdog (WD) [16], a mechanism that promiscuously senses an adjacent station's transmissions and compares them with copies of packets sent to, and supposed to be forwarded by, that station. First-hand WD-based reputation measures can be disseminated to produce indirect reputation measures. The success of a market-based or reputation system depends on how adequately the defined payoff reflects the stations' preferences (e.g., source packet throughput, energy expenditure, or the ratio or a linear combination of the two). In this paper we describe a WD-controlled first-hand reputation system, in which the stations are primarily after source packet throughput. In Sec. 2 we point to a difficulty not addressed by previous research, arising from packet anonymity. In short, the WD cannot tell an adjacent station's source packets from transit packets. This gives rise to selfish manipulation of a local admittance control (LAC) mechanism. In Sec. 3 we introduce a noncooperative LAC game and in Sec. 4 characterize its reachable Nash equilibria. Next, in Sec. 5 we propose a simple packet forwarding protocol named Decline and Force (D&F).
The idea is to give a station under heavy transit traffic the right to decline to receive packets, as well as give adjacent stations the right to nevertheless force packets into that station if they are under heavy transit traffic themselves. Using an approximate performance model we show in Sec. 6 that, when properly configured, D&F may lead to a fair and efficient game outcome. Sec. 7 concludes the paper.
2 Forwarding Model and Selfish Behavior

A MANET is modeled as a collection of N stations, each of which has a permanent identity announced to the other stations, implements agreed-upon MAC and multihop routing protocols, is equipped with a WD, and may agree with each destination upon a full-packet encryption scheme. The latter feature allows packet anonymity in that no station other than the destination can determine a packet's source station. This protects user data and prevents traffic analysis attacks. Fig. 1 illustrates anonymous packet forwarding with WDs, assuming AODV-type routing [4] (DSR-type routing modified similarly as in [8] can also apply). In particular:
• A pair of adjacent stations, n and m, establish a neighborhood relationship (by exchanging their identities and routing information).
• Station n transmits a packet to next-hop neighbor station m, appending to it n, m, and the destination station identity d (the source station identity, in this case n, is encrypted along with the packet body).
J. Konorski
• Station m checks that it has a neighborhood relationship with n. If m = d, the packet is decrypted; otherwise n is replaced by m and m is replaced by next-hop neighbor station l.
• If m ≠ d, station n performs a WD check: it compares a sensed packet from station m with a retained packet copy; if no match is found within a predefined deadline B, the WD check is failed. To factor out MAC delays we express B as the number of packets transmitted by station m; this amounts to a station buffer limit of B packets.
• Based on the statistics of failed WD checks, station n may recognize station m as selfish and punish it by terminating the neighborhood relationship.
Fig. 1. Anonymous packet forwarding with WDs
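The WD check described above can be sketched in code. The following is a minimal, illustrative model (class and method names are ours, not from the paper): the watching station retains copies of packets handed to a neighbor and declares a failed check if a copy is not heard being forwarded within B subsequent transmissions of that neighbor.

```python
from collections import deque

class Watchdog:
    """Per-neighbor WD: retain copies of packets handed to a neighbor and
    declare a failed check if a copy is not heard being forwarded within
    B subsequent transmissions of that neighbor."""

    def __init__(self, B):
        self.B = B              # deadline, counted in the neighbor's transmissions
        self.pending = deque()  # (packet body, neighbor transmissions seen so far)
        self.packets_seen = 0   # transmissions sensed from the neighbor
        self.failed_checks = 0

    def packet_handed(self, body):
        # A transit packet was sent to the neighbor; retain a copy.
        self.pending.append((body, self.packets_seen))

    def transmission_sensed(self, body):
        # The neighbor transmitted; try to match it against retained copies.
        # Due to packet anonymity (Sec. 2) the WD cannot tell whether an
        # unmatched transmission was a source or a transit packet.
        self.packets_seen += 1
        for item in list(self.pending):
            if item[0] == body:
                self.pending.remove(item)   # forwarded copy found: check passed
                break
        # Any copy outstanding for B sensed transmissions fails its WD check.
        while self.pending and self.packets_seen - self.pending[0][1] >= self.B:
            self.pending.popleft()
            self.failed_checks += 1
```

Note how the anonymity issue of Sec. 2 shows up here: the watchdog only matches retained copies against whatever the neighbor transmits, so it has no way of classifying the neighbor's other transmissions.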
The backlog limit B mandates local admittance control (LAC) to prevent buffer overflow, and confers the right to legitimately decline to receive a transit packet by announcing a current backlog of B (via a FULL primitive). LAC settings should ensure equal admittance rates at all stations. However, since packet anonymity blurs the distinction between source and transit packets sensed by the WD, nothing stops a station from unrestrained admission of source packets and issuing of FULL primitives. Such behavior is undetectable (hence unpunished) and selfish, as it yields a larger-than-fair source packet throughput. A possible remedy consists in (i) immediate termination of a neighborhood relationship if a failed WD check occurs without a prior FULL primitive, (ii) imposing a public-knowledge tolerable rate V* of FULL primitives that keeps a neighborhood relationship alive, and (iii) forcing a station to receive high enough transit traffic, while at the same time giving it the right to legitimately decline to receive transit packets under heavy transit traffic.
3 LAC Game

In the sequel we focus on a simple Drop-and-Throttle LAC mechanism [14], which admits a source packet only if the current backlog is below a. The LAC threshold a is set autonomously by each station (the smaller a, the less source traffic is admitted). Consider an N-player noncooperative LAC game where station n's feasible actions correspond to LAC thresholds an (1 ≤ an ≤ B − 1). A LAC profile has the form (an a−n), where a−n = (am, m ≠ n) is the opponent profile. Station n's payoff involves:
• source throughput S[an a−n] (the source packet admittance rate), and
• a fullness measure V[an a−n] (the proportion of time with backlog B).
We take V[an a−n] ≥ V* as the condition of termination of all neighborhood relationships involving station n. Note that V[an a−n] thus measures station n's first-hand reputation. Let 1C be the indicator function (1 if C is true and 0 otherwise) and define station n's payoff as:
payoffn[an a−n] = S[an a−n] · 1{V[an a−n] < V*}    (1)
Both S[an a−n] and V[an a−n] are determined by the employed packet forwarding protocol. However, any plausible protocol creates a conflict between increasing S[an a−n] and increasing V[an a−n], i.e., unrestrained admission of source packets does not pay:

Assumption 1. (i) S[an a−n] increases in an and decreases in any am (where m ≠ n), and (ii) V[an a−n] increases in any am.

A LAC profile will be called fair if payoffn[an° a−n°] is the same for n = 1,…,N and unfair otherwise; it will be called efficient if payoffn[an° a−n°] > 0 for n = 1,…,N and inefficient otherwise (in the latter case, some station(s) find all their neighborhood relationships terminated). According to game theory [12], selfish stations reach a Nash equilibrium (NE), where no station has an incentive to change its LAC threshold unilaterally.
Definition 1. A NE is a LAC profile [an° a−n°] such that for n = 1,…,N and any an,

payoffn[an° a−n°] ≥ payoffn[an a−n°].    (2)
We now formalize reachability of a NE. To this end, we model dynamic scenarios of the LAC game, where each station n adjusts an to seek a maximum payoff.

Assumption 2. The LAC threshold adjustment mechanism is (i) gradual − a change of an by ±j causes a payoff change at any other station equivalent to that of j consecutive changes of an by ±1, and (ii) prompt − any other station becomes aware of, and can react to, each of these j changes before the next one takes effect.

Part (i) reflects updating of S[an a−n] and V[an a−n] via a low-pass filter, hence without abrupt changes. Part (ii) is an idealization that limits our interest to unit changes of LAC thresholds. We thus assume that the stations move sequentially, each changing its LAC threshold by ±1, which immediately yields station payoffs corresponding to the new LAC profile. Moreover, the stations act in quasi-unison, none lagging behind more than one move. We model the LAC game in the extensive form to account for the order and timing of the stations' moves (a move consists in an adjustment of the LAC threshold). Given past play we only specify a set of stations on move, rather than a single station; yet owing to Assumption 2 no information sets [12] need be specified.
Definition 2. A noncooperative LAC game is a quadruple ({1,…,N}, A, payoff, move), where A is the set of feasible LAC thresholds, payoff: AN → RN, and move: Π → 2{1,…,N} determines the order of moves; Π is the set of feasible play paths and 2{1,…,N} is the powerset of {1,…,N}. A feasible play path has the form πk = ((a0, M0),…,(ak, Mk)), where ak = (a1k,…,aNk) ∈ AN and ∅ ≠ Mk ⊆ move(πk−1); ank ≠ ank−1 implies that n ∈ Mk and ank = ank−1 ± 1. By convention, move(π0) = {1,…,N}.

A play path specifies the LAC profiles in successive "rounds" of moves along with the sets M0, M1, … of stations that moved. These are arbitrary subsets of the sets move(π0), move(π1), … of stations on move. The timing of moves within Mk is not specified − the stations may move simultaneously, one by one, or subset by subset. Let vn(k) be the number of times station n has changed its LAC threshold on the play path πk. The quasi-unison specification is:

move(πk) = {n = 1,…,N | vn(k) = min1≤m≤N vm(k)}.    (3)
Definition 3. A best-reply strategy of station n is a function σnBR: Π → A with

σnBR(πk) ∈ arg max{a: |a − ank| ≤ 1} payoffn[a a−nk],  if n ∈ move(πk) ∩ Mk+1;
σnBR(πk) = ank,  otherwise,    (4)
where k = 0,1,… That is, each station seeks a best move given the past play path and the current opponent profile. Fig. 2 shows a 5-station LAC game scenario under best-reply strategies, indicating upon each "round" the sets of stations on move and those actually moving, as well as the current LAC profile; it is assumed that V[2 (2…2)] < V* and V[3 (2…2)] ≥ V*.
Fig. 2. LAC game scenario under best-reply strategies
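The quasi-unison best-reply play of Assumption 2 and Eqs. (3)-(4) can be simulated as below. This is a sketch under our own simplifications (deterministic tie-breaking in favor of staying put, a fixed round budget), again using an illustrative toy payoff rather than the D&F-induced one:

```python
def best_reply_dynamics(payoff, a0, B, rounds=20):
    """Quasi-unison best-reply play (Assumption 2, Eqs. (3)-(4)): in each
    round, every station with the minimal move count so far may adjust its
    LAC threshold by at most +/-1 toward a best reply; ties keep the
    current threshold."""
    profile = list(a0)
    moves = [0] * len(profile)          # v_n(k): number of moves so far
    for _ in range(rounds):
        vmin = min(moves)
        on_move = [n for n, v in enumerate(moves) if v == vmin]   # Eq. (3)
        changed = False
        for n in on_move:
            an = profile[n]
            # Candidates within +/-1 of an; current threshold listed first
            # so that max() keeps it on a payoff tie.
            candidates = [x for x in (an, an - 1, an + 1) if 1 <= x <= B - 1]
            best = max(candidates,
                       key=lambda x: payoff(n, tuple(profile[:n] + [x] + profile[n+1:])))
            if best != an:
                profile[n] = best
                moves[n] += 1
                changed = True
        if not changed:                  # no station wants to move: stop
            break
    return tuple(profile)
```

Under the toy payoff used earlier (own threshold as payoff, zero once a capacity of 6 is exceeded), three stations starting at (1, 1, 1) climb in quasi-unison and settle at the symmetric profile (2, 2, 2), mirroring the qualitative behavior of Fig. 2.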
Definition 4. Given a0, a reachable NE is a LAC profile [a1°,…,aN°] such that (i) the set Π contains a play path π = ((a0, M0),…,(ak, Mk), (ak+1, Mk+1),…) with ak = [an° a−n°] for some finite k, and (ii) an° ∈ arg max1≤a≤B−1 payoffn[a a−nk] for n = 1,…,N. Condition (ii) implies that a NE reached in the kth "round" persists later on; indeed, by (4) and Assumption 1, σnBR(πk′) ≡ an° for all k′ ≥ k. E.g., (2,…,2) is a NE in Fig. 2.
4 Reachability of Nash Equilibria

Suppose an efficient LAC profile a0 = [a0 (a0 ... a0)] initially prevails, e.g., is negotiated at network setup. (We continue to indicate an arbitrarily chosen station's LAC threshold and the opponent profile.) If the MANET topology and traffic pattern are symmetric then a0 is also fair. For readability we further write payoff[an a−n] instead of payoffn[an a−n]. Hence, payoff[a + 1 (a ... a)] denotes the payoff to a station whose LAC threshold is a + 1, all the other stations setting a. The proposition to be presented shortly states a necessary and sufficient condition for any reachable NE to be fair and efficient; moreover, its proof helps categorize any other reachable Nash equilibria if the condition is not fulfilled. In particular, the reachable outcomes of the game can be characterized as follows:

• each station receives a positive payoff and has no incentive to change its LAC threshold (a fair and efficient NE), or
• each station receives a zero payoff and finds no incentive to change its LAC threshold as all neighborhood relationships are terminated (a fair and inefficient NE), or
• a timing game (a "war of preemption" or a "war of attrition") arises after some play path, leading to an unfair NE.¹
Proposition 1. Fix V* such that V[a (a ... a)] < V* and V[a + 1 (a + 1 ... a + 1)] ≥ V* for some a, i.e., payoff[a (a ... a)] > 0 and payoff[a + 1 (a + 1 ... a + 1)] = 0. Under best-reply strategies, [a (a ... a)] is the only (and fair and efficient) NE reachable from a0 if and only if

payoff[a + 1 (a ... a)] = 0.    (5)
Proof: Let us show first that [a (a ...a)] is reachable from the initial LAC profile a0. If payoff[a0 + 1 (a0 + 1 ... a0 + 1)] = 0 then the assertion is immediate, so assume the opposite: payoff[a0 + 1 (a0 + 1 ... a0 + 1)] > 0. By Assumption 1, payoff[a0 + 1 (a0 + 1 ... a0 + 1 a0 ... a0)] > payoff[a0 + 1 (a0 + 1 ... a0 + 1)] > 0.
¹ In a timing game (see [12] for suggestive examples), a player moves at most once and initially all players have incentives to move. In a "war of preemption" moving early yields higher payoffs than moving late or not at all, whereas in a "war of attrition" the converse is true.
Consequently, payoff[a0 + 1 (a0 + 1 ... a0 + 1 a0 ... a0)] > payoff[a0 (a0 + 1 ... a0 + 1 a0 ... a0)] regardless of the number of (a0 + 1) entries in the opponent profile. I.e., regardless of how many stations have already increased their LAC thresholds, any other one has an incentive to do so. This implies that any play path starting at a0 and conforming to (3) and (4) contains an ak = [a0 + 1 (a0 + 1 ... a0 + 1)]. Fig. 3a illustrates this type of scenario using a conceptual payoff vs. LAC threshold plot: with each station finding its payoff higher upon increasing its LAC threshold, all of them end up at a symmetric and higher LAC profile. Continuing along any play path we arrive at al = [a (a ... a)] with move(πl) = {1,…,N} such that payoff[a (a ... a)] > 0 and payoff[a + 1 (a + 1 ... a + 1)] = 0.

Assume that (5) holds, implying payoff[a (a ... a)] > payoff[a + 1 (a ... a)]. Using Assumption 1, we also find that payoff[a (a ... a)] > payoff[a − 1 (a ... a)], as illustrated in Fig. 3b. Thus for any n, σnBR(πl) ≡ a and so [a (a ... a)] is a fair and efficient NE, which proves the "if" part.

To prove the "only if" part we assume payoff[a + 1 (a ... a)] > 0 and give examples of play paths conforming to (3) and (4) that lead to inefficient and/or unfair NE. Since now payoff[a (a ... a)] < payoff[a + 1 (a ... a)], each station n ∈ move(πl) has an incentive to set a + 1, i.e., σnBR(πl) = a + 1. Imagine a continuation of the play path πl with Ml+1 = {1}, Ml+2 = {2}, …, Ml+N = {N} (clearly, it conforms to the quasi-unison constraint (3)). One concludes that a "war of preemption" results, in which some stations will have set a + 1 and some will not. Indeed, σ1BR(πl) = a + 1, that is, station 1 finds payoff[a (a ... a)] < payoff[a + 1 (a ... a)] (recall that payoff[a + 1 (a ... a)] > 0). However, σNBR(πl+N−1) = a, i.e., station N, the last that contemplates setting a + 1, will find no incentive to do so since payoff[a (a + 1 ... a + 1)] ≥ payoff[a + 1 (a + 1 ... a + 1)] = 0. In the meantime, a number of stations besides station 1 (say 2 through l′) may have set a + 1, namely those that found payoff[a (a + 1 ... a + 1 a ... a)] < payoff[a + 1 (a + 1 ... a + 1 a ... a)]. The resulting LAC profile al+l′ = [a + 1 a + 1 ... a + 1 a ... a] (with l′ entries equal to (a + 1)) is an unfair and efficient NE if payoff[a + 2 (a + 1 ... a + 1 a ... a)] = 0; otherwise the play path continues to eventually reach an unfair and efficient NE where some or all of the l′ stations will have set a + 2 or larger. Either way, stations 1 through l′ receive a larger payoff than the others. In Fig. 3c, the upward arrow symbolizes some stations setting a higher LAC threshold upon finding that payoff[a + 1 (a ... a)] > payoff[a (a ... a)], whereas the other arrow symbolizes the rest finding payoff[a (a + 1 ... a + 1)] ≥ payoff[a + 1 (a + 1 ... a + 1)] and so staying at a.

Still assuming payoff[a + 1 (a ... a)] > 0, consider another scenario starting with the same play path πl as before. Let Ml+1 = {1,…,N}, i.e., all the stations set a + 1 almost simultaneously (in the same "round" of moves), producing al+1 = [a + 1 (a + 1 ... a + 1)]. Since payoff[a + 1 (a + 1 ... a + 1)] is zero, so is payoff[a + 2 (a + 1 ... a + 1)] (by Assumption 1); hence no station will find an incentive to further increase its LAC threshold. However, decreasing it may be worthwhile. Consider two cases:
1) payoff[a (a + 1 ... a + 1)] = 0. No station has an incentive to change its LAC threshold, therefore [a + 1 (a + 1 ... a + 1)] is a fair and inefficient NE with all neighborhood relationships terminated. This type of scenario is illustrated in Fig. 3d; it is similar to that in Fig. 3a except that the resulting payoffs are zero.

2) payoff[a (a + 1 ... a + 1)] > 0. Now each station has an incentive to decrease its LAC threshold, i.e., set a. We can continue along the play path assuming Ml+2 = {1}, Ml+3 = {2}, …, Ml+N+1 = {N} and reason similarly as before to conclude that a "war of attrition" results, in which some stations will have decreased their LAC thresholds and some will not. Indeed, the first station to set a (station 1) does so because payoff[a (a + 1 ... a + 1)] > payoff[a + 1 (a + 1 ... a + 1)] = 0, whereas the last one (station N) finds no incentive to do so because payoff[a (a ... a)] < payoff[a + 1 (a ... a)]. Hence an unfair and efficient NE is reached: those stations that have set a receive payoff[a (a + 1 ... a + 1 a ... a)], whereas the other stations receive a higher payoff[a + 1 (a + 1 ... a + 1 a ... a)]. This scenario is illustrated in Fig. 3e: initial incentives to move upwards bring all stations' payoffs to zero; subsequent incentives (or lack thereof) make some of them stay at a + 1, whereas the others go back to a for a higher payoff.

Of the above outcomes of the LAC game, a fair and efficient NE is the only desirable one. Proposition 1 is mainly of cautionary value, as it shows that the other outcomes are reachable too, even though the MANET topology, traffic pattern, and initial LAC profile are symmetric.
Fig. 3. LAC game payoff vs. LAC threshold; see text for explanation of scenarios a through e
5 D&F Protocol

A packet forwarding protocol should give each station n under heavy transit traffic the right to legitimately decline to receive packets, and each neighbor station the right to legitimately force packets into station n. In the Decline and Force (D&F) protocol presented below, executing these rights is discretionary and linked to local conditions. A station operates in the NORMAL, CONGESTED, or FULL mode, depending on whether the danger of a failed WD check is perceived as remote, incipient, or immediate, respectively. A CONGESTED station legitimately declines to receive transit
packets, which a CONGESTED or FULL neighbor station can disregard and force a transit packet transmission; a FULL station receives no transit traffic. Thus D&F assists a CONGESTED station in reducing inbound transit traffic, and prevents failed WD checks at a FULL station. (A destination packet can always be received as it is not subject to a WD check.) It is therefore vital that the current mode be announced to all neighbor stations via mode primitives, either broadcast in special control packets or piggybacked on data packets. The proportion of time a station remains FULL, i.e., the fullness measure V, is thus known to each neighbor station (no record of the CONGESTED mode has to be kept). Observe that having forwarded a transit packet (and thus passed the WD check at a neighbor station), a station has no incentive to attempt a retransmission.² Consequently, D&F must stipulate that (i) every transmitted packet be received (no retransmissions) and (ii) every received transit packet be forwarded (no failed WD checks). Condition (i) implies that a NORMAL station must know about a neighbor station's mode change to CONGESTED or FULL (and a CONGESTED station about a neighbor station's mode change to FULL) in time to suspend an intended packet transmission; accordingly, expedited transmission of CONGESTED and FULL primitives is prescribed. Condition (ii) can be enforced by imposing severe punishment for a failed WD check; D&F prescribes immediate termination of the relevant neighborhood relationship even if V < V*. Both prescriptions can be replaced by some tolerance levels to account for transmission errors and misinterpretation of the mode primitives (to keep the description simple we assume zero tolerance). Any systematic violation of D&F rules is counterproductive.
Indeed,

• Failure of station n to announce a NORMAL to CONGESTED change forgoes the right to legitimately decline to receive packets; ultimately, to avoid failed WD checks, station n would have to announce FULL mode, worsening its fullness measure V. A similar failure in the case of a CONGESTED to FULL change directly leads to failed WD checks and the termination of station n's neighborhood relationships. Conversely, failure to act upon a FULL to CONGESTED change unnecessarily worsens V, whereas a similar failure upon a CONGESTED to NORMAL change drives the neighbor stations CONGESTED and enables them to force traffic into station n, soon bringing about FULL mode.
• Illegitimately declining to receive a packet from, or forcing a packet into, a neighbor station terminates the neighborhood relationship upon a failed WD check. Both these violations are counterproductive, for obviously a station wants to maintain all neighborhood relationships for which V < V* and none for which V ≥ V*.
• Finally, failure to acknowledge a sensed packet (as if the packet was not sensed or was received in error) causes no retransmission or a failed WD check. A possible remedy consists in treating such events as failed WD checks if their frequency distinctly exceeds the estimated statistics of channel errors.

Being free to set the mode on its own, each station pursues a maximum payoff (1). Using an approximate analysis we will show that the principle of mode setting decides the type of reachable NE and that for some V* it may cause the LAC game to reach a fair and efficient NE.

² See [20] for a (costly) alternative method to include retransmissions in a reputation measure.
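Using the backlog-based mode rule adopted later in Sec. 6 (NORMAL if x < e, CONGESTED if e ≤ x < B, FULL if x = B), the D&F decline/force rules can be summarized in code. This is a sketch with illustrative function names:

```python
def mode(x, e, B):
    """D&F mode from current backlog x, D&F threshold e, buffer limit B
    (the backlog-based rule of Sec. 6)."""
    if x == B:
        return "FULL"
    return "CONGESTED" if x >= e else "NORMAL"

def may_transmit_transit(sender_mode, receiver_mode):
    """D&F reception rule: a FULL station receives no transit traffic;
    a CONGESTED station declines, but a CONGESTED or FULL sender may
    disregard the decline and force the packet; NORMAL always receives."""
    if receiver_mode == "FULL":
        return False
    if receiver_mode == "CONGESTED":
        return sender_mode in ("CONGESTED", "FULL")
    return True
```

Note that destination packets are exempt from this rule, since they are not subject to a WD check.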
6 LAC Game Payoffs Under D&F

We have seen that the outcome of the LAC game is determined by the payoffs for LAC threshold profiles of the form [a′ (a ... a)] with |a′ − a| ≤ 1. These will now be approximately calculated for a symmetric topology and traffic pattern, where each station has M neighbor stations, and the average source-to-destination route length is H hops. Let the station mode be solely determined by its current backlog x (x ≤ B) and a D&F threshold e: NORMAL if x < e, CONGESTED if e ≤ x < B, and FULL if x = B. Furthermore, (A1) transit packets follow a stationary Poisson process with exponentially distributed transmission times whose mean 1/μ includes MAC delays, (A2) a station can simultaneously transmit and sense a packet, (A3) mode primitives are issued and transmitted with negligible delay, (A4) the rate of packet corruption due to channel errors is negligible, and (A5) a station admits as many source packets as its LAC threshold permits. Even though the model and the principle of mode setting are simplistic, our objective is to discuss reachability of NE rather than accurate performance measures. (A1) and (A2) enable a birth and death approximation [10] and by (A3) the birth and death coefficients only depend on the current modes at neighbor stations. (A2) implies separate channels for transmission and reception; we stick to this assumption to avoid shifting the focus to multiple access. (A5) reflects that selfishness is a concern primarily under heavy traffic.

Consider a symmetric LAC threshold profile [a (a ... a)]. Following the "isolated node" approach [1], we focus upon a station n, where the backlog follows a birth and death process with steady-state probability distribution (pX(x), a ≤ x ≤ B). (Because of (A5), it never drops below a.) Let the current backlog at each of the M neighbor stations be drawn from the steady-state probability distribution (pY(x), a ≤ x ≤ B).
This determines the birth and death coefficients at station n as explained below. We thus seek a dependence of the form pX(⋅) = f[pY(⋅)]; by the model symmetry this becomes a fixpoint-type relationship p(⋅) = f[p(⋅)] to be solved iteratively for p(⋅). Denote the birth and death coefficients conditional on x by αx and βx (βa, αe−1, and αB−1 signify admission of a source packet, and mode change to CONGESTED and FULL, respectively). Recalling D&F operation we express βx through pY(⋅):

βx = μ·[1 − PY(e)·(1 − 1/H)],  if x < e;
βx = μ·[1 − pY(B)·(1 − 1/H)],  if x ≥ e,    (6)
where PY(e) = Σe≤y≤B pY(y) is the probability that a neighbor station is CONGESTED or FULL, and 1/H is the fraction of packets whose destination is next-hop. To approximate αx, let βC = Σe≤x≤B pY(x)·βx and βN = Σa≤x<e pY(x)·βx be the average death coefficients at a CONGESTED and a NORMAL neighbor station, respectively (by putting βx we take advantage of the model symmetry). The flow of packets into CONGESTED station n only consists of forced and destination packets,
whereas NORMAL station n receives all packets for which it is the next-hop station (1/M of the traffic from each of M neighbor stations). Since the fraction 1/H of the traffic received at station n are destination packets, which do not contribute to αx, we have for x = 0,…,B − 1:

αx = (1 − 1/H)·βC,  if x ≥ e;
αx = (1 − 1/H)·(βC + βN),  if x < e.    (7)
Standard calculation [10] now yields:

pX(x) = C · (αa·αa+1·…·αx−1)/(βa+1·βa+2·…·βx),    (8)
where C is a normalization constant. Further calculation is performed iteratively: assuming some initial pY(⋅), calculate pX(⋅) and in the next iteration substitute it for pY(⋅), until pX(⋅) ≈ pY(⋅) ≈ p(⋅). The fullness measure follows directly, and for the source throughput we use the observation that a source packet is admitted each time a death occurs at backlog a. Hence,

S[a (a ... a)] = βa·p(a)  and  V[a (a ... a)] = p(B).    (9)
To calculate payoff[a′ (a ... a)], where a′ = a ± 1, calculate p′X(x) by putting a := a′ in (8), with the αx and βx obtained by substituting PY(e) = Σe≤y≤B p(y), and next reapply (9), i.e., take S[a′ (a ... a)] = βa′·p′X(a′) and V[a′ (a ... a)] = p′X(B). In doing so, we assume that the interaction of the other stations with a single station setting a different LAC threshold does not affect their birth and death processes significantly.

Numerical experiments confirm the validity of Assumption 1 under D&F. Sample results are depicted in Fig. 4 for various LAC and D&F thresholds (source throughput is normalized with respect to the maximum attainable value Mμ; discrete points are joined by solid lines for readability). In view of (A5) it is logical to consider only a < e, as otherwise there would be no packet transmissions other than forced. Taking e = 4 yields a fair and efficient NE at a = 3, with normalized payoff S[3 (3 ... 3)] = 39.5%; this coincides with the highest fair payoff attainable in a cooperative setting. Taking e = 3 leads to zero payoffs and the termination of all neighborhood relationships (fair payoffs of 39.5% could be attained in a cooperative setting), while e = 5 leads to a timing game (cf. Fig. 3c,e) with unfair payoffs ranging from 29.4% to 50.4% (fair payoffs of 39.3% could be attained if all stations were cooperative and stuck to a = 3).

The outcome of the LAC game varies with e and V*. E.g., if we increase V* twofold (V* = 0.01) then e = 5 and e = 7 yield fair and efficient Nash equilibria and no e yields a timing game. One may look for V* yielding a fair and efficient NE. This brings into consideration a quality we refer to as robustness. Let a fair and efficient NE occur at [a (a ... a)]. Then V[a (a ... a)] < V* and V[a + 1 (a ... a)] ≥ V* (cf. Fig. 3b). Since V is a statistical average and V* is typically low in magnitude, the
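The fixpoint iteration behind Eqs. (6)-(9) can be prototyped as follows. This is a sketch under the section's assumptions (symmetric profile [a (a ... a)], a < e, μ normalized to 1); the fixed iteration count stands in for a proper convergence test, and the function name is ours:

```python
def steady_state(a, e, B, H, mu=1.0, iters=200):
    """Fixpoint iteration p := f[p] for the backlog distribution on
    {a, ..., B}, using the birth/death coefficients of Eqs. (6)-(8)."""
    states = list(range(a, B + 1))
    p = {x: 1.0 / len(states) for x in states}   # initial guess for p_Y
    beta = {}
    for _ in range(iters):
        # Eq. (6): death coefficients, driven by the neighbors' modes
        PY_e = sum(p[y] for y in states if y >= e)   # P(CONGESTED or FULL)
        beta = {x: mu * (1.0 - (PY_e if x < e else p[B]) * (1.0 - 1.0 / H))
                for x in states}
        # Average death coefficients at CONGESTED / NORMAL neighbors
        beta_C = sum(p[x] * beta[x] for x in states if x >= e)
        beta_N = sum(p[x] * beta[x] for x in states if x < e)
        # Eq. (7): birth coefficients
        alpha = {x: (1.0 - 1.0 / H) * (beta_C if x >= e else beta_C + beta_N)
                 for x in range(a, B)}
        # Eq. (8): birth-death steady state, then normalize
        q = {a: 1.0}
        for x in range(a + 1, B + 1):
            q[x] = q[x - 1] * alpha[x - 1] / beta[x]
        Z = sum(q.values())
        p = {x: q[x] / Z for x in states}
    S = beta[a] * p[a]   # Eq. (9): source throughput
    V = p[B]             # Eq. (9): fullness measure
    return p, S, V
```

Running it with the paper's sample parameters (a = 3, e = 4, B = 8, H = 3) produces a normalized distribution and positive S and V; we make no claim that this simplified prototype reproduces the exact percentages reported for Fig. 4.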
Fig. 4. LAC game payoffs for various e; M = 3, H = 3, B = 8, V* = 0.005
Fig. 5. Robustness for various types of NE; top left: fair and efficient, top right: inefficient, bottom: timing game; circled: suggested range of V*
values of V[a (a ... a)], V*, and V[a + 1 (a ... a)] ought to be far enough apart for each station to credibly perceive the payoff it is receiving and avoid accidental departure from the NE. To assess statistical credibility, the distance between V[a (a ... a)] and V* should be expressed as a multiple of the standard deviation of the sample V[a (a ... a)]; the latter is roughly proportional to the square root of V[a (a ... a)]. A similar conclusion applies to the distance between V* and V[a + 1 (a ... a)]. Therefore, the lesser of the two relative distances measures the robustness r of the NE:
r = min{ (V* − V[a (a ... a)])/√V[a (a ... a)] , (V[a + 1 (a ... a)] − V*)/√V[a + 1 (a ... a)] },    (10)
and the interesting range of V* is where r > 0. For an inefficient NE the three values to be kept far enough apart are V[a + 1 (a ... a)], V*, and V[a (a + 1 ... a + 1)], and for the timing game, V[a + 1 (a ... a)], V*, and V[a (a + 1 ... a + 1)], the rest of the argument being similar. Fig. 5 plots r versus V* for the three types of NE. Note that high and wide-ranged robustness is only desirable in the case of a fair and efficient NE. The plots therefore indicate the preferable V*, namely those for which an e* exists yielding high r in the top left plot and no high r in the other two (a suggested range of V* is circled, e* = 6).
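The robustness measure is straightforward to evaluate from simulated or measured fullness values. The sketch below assumes the square-root scaling of the denominators suggested by the standard-deviation argument above (argument names are ours):

```python
import math

def robustness(V_lo, V_star, V_hi):
    """Eq. (10): the lesser of the two distances of V* from the fullness
    measures just below (V_lo) and just above (V_hi) the NE threshold,
    each scaled by the approximate standard deviation ~ sqrt(V)."""
    return min((V_star - V_lo) / math.sqrt(V_lo),
               (V_hi - V_star) / math.sqrt(V_hi))
```

A positive r means V* separates the two fullness measures with some statistical margin; r ≤ 0 means V* fails to separate them at all, so the NE cannot be credibly perceived.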
7 Conclusion

MANET stations are understandably reluctant to forward transit packets, given that it consumes the energy and bandwidth needed by source packets. One way of incentivizing packet forwarding is to define a WD-controlled reputation system where a station should not decline to receive transit packets too often lest its neighborhood relationships be terminated. However, stations under heavy transit traffic should be exempted from this rule, which selfish stations may abuse by setting their LAC thresholds so as to admit larger-than-fair source packet traffic. With packet anonymity allowed, source and transit packets are indistinguishable to a neighbor station's WD, hence the selfishness cannot be detected or punished.

Using simple MANET and game models we have characterized reachable Nash equilibria of the resulting LAC game. In a symmetric MANET, the desirable type of NE is a fair and efficient one, where each station gets the same source packet throughput and no neighborhood relationship is terminated. In this context, a class of packet forwarding protocols worth studying is D&F, where each station is given the right to decline to receive transit packets, as well as the right to force transit packets into a neighbor station under some conditions. While in principle these conditions can be quite sophisticated, decided autonomously, and possibly time-varying, we have demonstrated that the type of NE can be controlled even with a single decision parameter (the D&F threshold). Research challenges for the future involve D&F optimization under more general principles of mode setting, and for more realistic MANET, traffic, and LAC game models. These are being studied in both simulation and real-world environments.
Acknowledgment

This work was supported by the Ministry of Education and Science, Poland, under Grant 1599/T11/2005/29.
References

1. Agnew, G.B., Mark, J.W.: Performance Modeling for Communication Networks at a Switching Node. IEEE Trans. Comm. COM-32, 902–910 (1984)
2. Bansal, S., Baker, M.: Observation-based Cooperation Enforcement in Ad hoc Networks (2003), http://arxiv.org/abs/cs.NI/0307012
3. Ben Salem, N., Buttyan, L., Hubaux, J.-P., Jakobsson, M.: A Charging and Rewarding Scheme for Packet Forwarding in Multi-hop Cellular Networks. In: Proc. ACM Symposium on Mobile Ad Hoc Networking and Computing MobiHoc'03, Annapolis, MD (2003)
4. Broch, J., Johnson, D., Maltz, D.: The Dynamic Source Routing Protocol for Mobile Ad Hoc Networks. IETF Internet Draft (1998)
5. Buchegger, S., Le Boudec, J.-Y.: Performance Analysis of the CONFIDANT Protocol (Cooperation Of Nodes Fairness In Dynamic Ad-hoc NeTworks). Tech. Rep. IC/2002/01, Swiss Federal Institute of Technology (2002)
6. Buttyan, L., Hubaux, J.-P.: Nuglets: A Virtual Currency to Stimulate Cooperation in Self-Organized Mobile Ad-Hoc Networks. Tech. Rep. DSC/2001/001, Swiss Federal Institute of Technology (2001)
7. Buttyan, L., Hubaux, J.-P.: Stimulating Cooperation in Self-Organizing Mobile Ad Hoc Networks. J. Mobile Networks and Applications 8, 579–592 (2003)
8. Cheng, Y., Agrawal, D.P.: Distributed Anonymous Secure Routing Protocol in Wireless Mobile Ad Hoc Networks, http://www.ececs.uc.edu/cdmc/OPNETWORK_Yi.pdf
9. Felegyhazi, M., Hubaux, J.-P., Buttyan, L.: Nash Equilibria of Packet Forwarding Strategies in Wireless Ad Hoc Networks. IEEE Trans. Mobile Computing 5, 1–14 (2006)
10. Feller, W.: An Introduction to Probability Theory and its Applications. J. Wiley and Sons, New York (1966)
11. Fratkin, E., Vijayaraghavan, V., Liu, Y., Gutierez, D., Li, T.M., Baker, M.: Participation Incentives for Ad Hoc Networks, www.stanford.edu/#y1314/adhoc
12. Fudenberg, D., Tirole, J.: Game Theory. MIT Press, Cambridge (1991)
13. Ileri, O., Mau, S.-C., Mandayam, N.B.: Pricing for Enabling Forwarding in Self-Configuring Ad Hoc Networks. IEEE J. Selected Areas Commun. 23, 151–162 (2005)
14. Kamoun, F.: A Drop-and-Throttle Flow Control Policy for Computer Networks. IEEE Trans. Comm. COM-29, 444–452 (1981)
15. Marbach, P.: Cooperation in Wireless Ad Hoc Networks: A Market-Based Approach. IEEE/ACM Trans. Networking 13, 1325–1338 (2005)
16. Marti, S., Giuli, T.J., Lai, K., Baker, M.: Mitigating Routing Misbehavior in Mobile Ad Hoc Networks. In: Proc. 6th Annual Conf. on Mobile Computing and Networking MobiCom 2000, Boston, MA (2000)
17. Michiardi, P., Molva, R.: Making Greed Work in Mobile Ad Hoc Networks. Res. Rep. RR-02-069, Institut Eurecom, Sophia-Antipolis (2002)
18. Michiardi, P., Molva, R.: CORE: A Collaborative Reputation Mechanism to Enforce Node Cooperation in Mobile Ad hoc Networks. In: Proc. 6th IFIP Conf. on Security Comm. and Multimedia (CMS 2002), Portoroz, Slovenia (2002)
19. Srinivasan, V., Nuggehalli, P., Chiasserini, C.F., Rao, R.R.: An Analytical Approach to the Study of Cooperation in Wireless Ad Hoc Networks. IEEE Trans. Wireless Commun. 4, 722–733 (2005)
20. Wang, Y., Giruka, V.C., Singhal, M.: A Fair Distributed Solution for Selfish Nodes Problem in Wireless Ad Hoc Networks. In: Nikolaidis, I., Barbeau, M., Kranakis, E. (eds.) ADHOC-NOW 2004. LNCS, vol. 3158. Springer, Heidelberg (2004)
21. Zhong, S., Chen, J., Yang, Y.R.: Sprite: A Simple, Cheat-Proof, Credit-Based System for Mobile Ad-Hoc Networks. In: Proc. INFOCOM 2003, San Francisco (2003)
Providing Seamless Mobility Using the FOCALE Autonomic Architecture

John Strassner, Dave Raymer, and Srini Samudrala

Motorola Labs, 1301 East Algonquin Road, MS IL02-2240, Schaumburg, IL 60196, USA
{john.strassner, david.raymer, srini.samudrala}@motorola.com
Abstract. Existing wireless networks have little in common, as they are designed around vendor-specific devices that use specific radio access technologies to provide particular functionality. Next generation networks seek to integrate wired and wireless networks in order to provide seamless services to the end user. Seamless Mobility is an experiential architecture, predicated on providing mechanisms that enable a user to accomplish his or her tasks without regard to technology, type of media, or device. This paper examines how autonomic mechanisms can satisfy some of the challenges in realizing seamless mobility solutions.

Keywords: autonomic communications, autonomic networking, network management, Seamless Mobility.
1 Introduction

Current voice and data communications networks are difficult to manage, as exemplified by the stovepipe systems that are common in Operational and Business Support Systems [1]. This is due to the desire to incorporate best-of-breed functionality, which prohibits the sharing and reuse of common data [2], resulting in the inability to manage the increase in operational, system, and business complexity. For example, management of wireless operations requires the analysis and interpretation of diverse management data from different sources to provide a machine-interpretable view of system quality as perceived by the end user [3][4]. Management and optimization of wireless systems require mechanisms that are mostly specific to a particular type of radio access technology (RAT). Unfortunately, current RATs use a set of non-compatible standards and vendor-specific functionality. This is exacerbated by current trends, such as network convergence (which combines different types of wired and wireless networks), as well as future multi-access-mode devices [5] and cognitive networks [6], in which the type of network access can be dynamically defined. The vision of Seamless Mobility [7] is even more ambitious: the ability for the user to get and use data independent of access mode, device, and media. Note that while handover between multiple network technologies is challenging, the hard part of Seamless Mobility is maintaining session continuity. This paper describes ongoing research in supporting the needs of Seamless Mobility through the use of a novel autonomic networking architecture called FOCALE. The organization of this paper is as follows. Section 2 describes the vision of Seamless

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 330–341, 2007. © Springer-Verlag Berlin Heidelberg 2007
Mobility. Section 3 defines autonomic networking, and how it differs from autonomic computing. Section 4 describes the FOCALE architecture in detail, and Section 5 shows how FOCALE meets the needs of wired and wireless network management. Section 6 describes how FOCALE can be used to implement Seamless Mobility. Section 7 concludes the paper.
2 The Vision of Seamless Mobility

Businesses are gaining competitive advantage through innovative applications that empower their increasingly mobile employees and customers, and end users are now starting to expect a world of easy, uninterrupted access to information, entertainment, and communication across diverse environments, devices, and networks. Businesses want anywhere, anytime communications to provide enhanced productivity to their workforce. Consumers are equally eager for personalized services that make it easy to access and share digital content when, where, and how they want it. Network operators seeking an edge in a changing marketplace are exploring new approaches to delivering this content in a personalized, timely, and cost-effective manner.

Seamless Mobility, and its vision of seamless service delivery, requires significant changes to existing wired and wireless network management systems. For example, when handover from one wireless system to another wired or wireless system is performed, a "seam", or discontinuity, is created that interrupts the continuity of the application experience. Motorola's vision of Seamless Mobility is to provide simple, uninterrupted access to any type of information desired at any time, independent of place, network, and device. Seamless Mobility depends on the underlying business models (the user's willingness to pay for it); this is why it revolves around an experiential architecture that captures the current context of what the user is doing, so that the services the user desires can be optimized. In earlier work, we developed a novel context model, part of our FOCALE [12] architecture, that provides a first step toward solving this difficult problem; this will be explained in more detail in Section 5.
3 Salient Features of Autonomic Networking

The purpose of autonomic computing and autonomic networking is to manage complexity. The name "autonomic" was chosen to reflect the function of the autonomic nervous system in the human body. By automating more manual functions (e.g., treating them as functions under involuntary control), additional resources (human and otherwise) are made available to manage higher-level processes.

Most current autonomic computing architectures look like that shown in Figure 1 [13][14], and focus on providing self-* functionality to IT components and systems. They consist of a Managed Element instrumented with Sensors and Effectors to send data to and receive commands from an Autonomic Element, which is part of an Autonomic Manager. The Autonomic Manager provides its own set of Sensors and Effectors to communicate with other Autonomic Managers. The Autonomic Element implements a simple control loop: it uses sensors to retrieve data, which is then
analyzed to determine if any correction to the Managed Resource being monitored is needed (e.g., to correct "non-optimal", "failed" or "error" states). If so, then those corrections are planned, and appropriate actions are executed using effectors that translate commands back to a form that the Managed Resource can understand.

Fig. 1. Autonomic Building Block (a Managed Element, instrumented with Sensors and Effectors, governed by an Autonomic Element whose Monitor, Analyze, Plan, and Execute functions operate over shared Knowledge)

The motivation behind autonomic networking is twofold: (1) to perform manual, time-consuming network tasks (such as configuration management) on behalf of the network administrator, and (2) to pre- and post-process data to enable the system and the administrator to work together to perform higher-level cognitive functions, such as planning and optimization of the network.

The FOCALE autonomic networking architecture is specifically designed for use with heterogeneous wired and wireless networks. While it builds on autonomic computing concepts, it introduces several novel changes into an autonomic architecture in order to cope with the problems described in Section 1, which are summarized below:

- Inability to relate current business needs to the delivery of network services and resources
- Multiple control mechanisms, each having different side effects and resource requirements, are applied to the same "path"
- Heterogeneous information sources, each with their own organization, structure, and semantics for their management information, must be correlated to infer problems and solutions
- Heterogeneous languages per network element, meaning that correlation between commands issued and data found must be deduced, not looked up

In current environments, user needs and environmental conditions can change without warning. Therefore, the system, its environment, and the needs of its users must be continually analyzed with respect to business objectives. FOCALE uses inferencing to instruct the management plane to coordinate the (re)configuration of its control loops in order to protect the current business objectives. In addition, FOCALE uses modeling and ontologies to develop representations of Managed Resources that are inherently extensible, so that the knowledge of a Managed Resource can be updated over time. This work is novel.

The second difference is a result of converged networks. The control mechanisms of wired and wireless networks are very different. Autonomic computing efforts to date do not consider this; rather, they assume a set of largely homogeneous computing resources. FOCALE addresses this through a novel knowledge fusion approach (see Section 4.3) that relates diverse functionality in different managed entities to each other. The last two points are usually not found in autonomic computing applications, which typically use the same sensors and effectors to get data and send commands. FOCALE solves this by using a model-based translation layer, which is described in more detail in Section 5.
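The monitor/analyze/plan/execute structure of the autonomic building block can be sketched as a small control loop. All class and function names, and the toy knowledge base below, are illustrative assumptions, not part of any published FOCALE code.

```python
# Minimal sketch of the autonomic building block (MAPE-style loop).
# All names and the toy knowledge base are illustrative.

class AutonomicElement:
    def __init__(self, sensor, effector, knowledge):
        self.sensor = sensor        # reads data from the Managed Resource
        self.effector = effector    # writes commands back to it
        self.knowledge = knowledge  # shared knowledge base

    def monitor(self):
        return self.sensor()

    def analyze(self, data):
        # Flag "non-optimal", "failed" or "error" states.
        return [s for s in data if s in self.knowledge["bad_states"]]

    def plan(self, problems):
        # Map each detected problem to a corrective action.
        return [self.knowledge["corrections"][p] for p in problems]

    def execute(self, actions):
        for action in actions:
            self.effector(action)

    def run_once(self):
        problems = self.analyze(self.monitor())
        if problems:
            self.execute(self.plan(problems))
        return problems


# Toy usage: a resource reporting one failed state triggers one correction.
applied = []
element = AutonomicElement(
    sensor=lambda: ["ok", "failed"],
    effector=applied.append,
    knowledge={"bad_states": {"failed"}, "corrections": {"failed": "restart"}},
)
element.run_once()  # applied now holds ["restart"]
```

In a real system the sensor and effector would wrap management protocols rather than in-memory lambdas; the point is only the separation of the four loop functions around shared knowledge.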
4 The FOCALE Architecture

FOCALE stands for Foundation – Observation – Comparison – Action – Learn – rEason, the six key principles required to support autonomic networking. These principles manage complexity while enabling the system to adjust to the changing demands of its users and environmental conditions. Basic operation is as follows. Assume that behavior can be defined using a set of state machines, and that the configuration of each device is determined from this information. FOCALE is a closed-loop control system, in which the current state of the managed element is calculated from sensed data and compared to the desired state defined in the state machines. Any variance from the desired state is analyzed to ensure that business goals and objectives are still being met. If they are, the system keeps monitoring state (though it may need to change what is being monitored); if they are not, the system executes a set of configuration changes to fix the problem(s). Equally important, the results of these changes are observed to ensure that the system reacted as expected.

However, since networks are complex, highly interconnected systems, FOCALE introduces six novel modifications to current autonomic control loops: (1) the use of multiple control loops; (2) a combination of information models, data models, and ontologies is used to develop state machines for orchestrating behavior; (3) diverse information is normalized for gathering vendor-specific management data and issuing vendor-specific commands; (4) the functions of the control loop are changed based on context, policy, and the semantics of the data as well as the current management operations being processed; (5) reasoning mechanisms are used to generate hypotheses as to why the actual state of the managed element is not equal to its desired state; and (6) learning mechanisms are used to update the knowledge base.
4.1 FOCALE's Use and Adaptation of Multiple Control Loops

Figure 2 shows FOCALE's two types of control loops. The desired states of the Managed Resource are predefined in the appropriate state machines using business goals [15][16][17]. In our case, we use Key Performance and Quality Indicators (KPIs and KQIs) of Service Level Agreements (SLAs) to define these business goals. These are modeled in DEN-ng [20], which enables each to be strongly related to the others. This is one example of translating business needs into network functionality. The top control loop (maintenance) is used when no anomalies are found (i.e., when either the current state is equal to the desired state, or when the state of the managed element is moving towards its intended goal). The bottom (adjustment) control loop is used when one or more reconfiguration actions must be performed.

The use of multiple control loops (Figure 2 shows two for simplicity) enables FOCALE to provide more flexible management, and is fundamental to overcoming the limitations of using a single static control loop with fixed functionality. Since FOCALE is designed to adapt its governance model according to context (see Section 4.4), FOCALE associates a given set of policies with each particular context. This set of policies determines the specific functionality that can be provided by the system, and defines the functionality and operation of each component of each control loop (note that this adaptation of system functionality could not be managed using a static, non-changing control loop, as is done in the current state of the art). In addition,
the process controlling reconfiguration must be able to have its functionality adapted to suit the vendor-specific needs of the different devices being adapted. For example, even a standard protocol like BGP cannot share the same configuration among different vendors, due to vendor-specific implementation differences as well as different functionality that may or may not be part of a standard.

Fig. 2. Two of FOCALE's Control Loops (Loop 1, maintenance: gather sensor data and compare the actual state to the desired state; on a match, keep monitoring. Loop 2, adjustment: on a mismatch, define new device configuration(s) and apply them to the Managed Resource)

As will be seen in Section 4.7, different components of FOCALE can each alter the function of the control loops according to the set of policies active for a particular context. Another use of multiple control loops is to protect the sets of business goals and objectives of different constituencies (e.g., business users vs. programmers vs. architects vs. network operators). Each of these constituencies has different concepts, vocabularies, and understandings of the same function. Thus, the implementation of a "common" objective for these constituencies is different and sometimes in conflict. Therefore, FOCALE uses multiple control loops: having a single control loop protect all of these objectives is simply not feasible.

The reconfiguration process uses dynamic code generation based on models and ontologies [15][16][17][18]. This forms a novel control loop: context changes policies, which in turn change functionality through dynamic code generation that is orchestrated through a set of state machines. The policies are tightly linked to the model, which enables the model to be used both as a generic specification of functionality and as an instance model. More specifically, sensor data is used to populate the state machines that in turn specify the operation of each entity that the autonomic system is governing. The management information that the autonomic system is monitoring signals any context changes, which in turn adjust the set of policies being used to govern the system, which in turn supplies new information to the state machines. The state machines define the (re)configuration commands required to achieve a particular state or set of states.
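The chain described here (context selects the working set of policies, which in turn configures the control loop) can be sketched in a few lines. The contexts, policy attributes, and function names below are invented for illustration and are not FOCALE APIs.

```python
# Sketch of context-driven loop adaptation: each context selects a working
# set of policies, which determines how the control loop behaves.
# Contexts and policy attributes are invented for illustration.

POLICIES_BY_CONTEXT = {
    "normal":   {"loop": "maintenance", "monitor_interval_s": 60},
    "degraded": {"loop": "adjustment",  "monitor_interval_s": 5},
}

def configure_loop(context):
    """Return the control-loop configuration active for this context."""
    return POLICIES_BY_CONTEXT[context]

def step(context, actual_state, desired_state):
    """One pass of the loop: stay in maintenance or switch to adjustment."""
    cfg = configure_loop(context)
    if actual_state == desired_state:
        return ("keep-monitoring", cfg)
    # Variance from the desired state: the "degraded" context activates
    # a different policy set, reconfiguring the loop itself.
    return ("reconfigure", configure_loop("degraded"))

decision, cfg = step("normal", actual_state="down", desired_state="up")
```

The key design point mirrored here is that the *loop configuration itself* is data selected by context, rather than fixed code, which is what a single static control loop cannot express.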
4.2 FOCALE's Behavioral Orchestration

FOCALE uses information and data modeling to capture knowledge relating to network capabilities, environmental constraints, and business rules. Unlike other approaches, we combine the knowledge from these models with a different type of knowledge – information from a set of ontologies; this produces an augmented set of data structures that, together with machine-based learning techniques, can be used to reason about this knowledge. Figure 3 shows how each activity in a business model can be represented by a set of classes that are then related to one or more Finite State Machines (FSMs). As the system changes, code is dynamically generated according to the appropriate FSM(s) to protect business goals. Knowledge embedded within
Fig. 3. The FOCALE Approach (Business Process Model → DEN-ng Model → Finite State Machine → Code Generation → FOCALE Architecture)
system models will be used by policy management systems to automatically configure network elements in response to changing business goals and/or environmental changes. Policies help standardize how configuration changes are applied in relation to context and business goals. An important benefit of this approach is that it explicitly relates commands and sensor data to each other, thereby simplifying the management task. This enables a closed control loop to be formed, one in which the commands issued can be related to management data that can be queried for, thereby verifying that the issued commands had the desired effect.
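The closed verification loop just described, in which each issued command is explicitly related to the management data that can confirm its effect, might look like this in miniature. The command string, variable name, and helper functions are hypothetical.

```python
# Sketch of command/data correlation: each command is paired with the
# management data that should confirm its effect, so the system can
# verify the command after issuing it. All names are hypothetical.

COMMAND_TO_EXPECTED_DATA = {
    "set-bandwidth 10M": ("ifSpeed", 10_000_000),
}

def issue_and_verify(command, send, query):
    """Issue a command, then query the related management data to verify it."""
    send(command)                           # push the command to the device
    variable, expected = COMMAND_TO_EXPECTED_DATA[command]
    return query(variable) == expected      # e.g. an SNMP GET in practice

# Toy usage with fake transport functions.
sent = []
ok = issue_and_verify("set-bandwidth 10M",
                      send=sent.append,
                      query=lambda variable: 10_000_000)
```

The table is the essential part: because the model relates each command to queryable data, verification needs no device-specific logic at this layer.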
4.3 FOCALE's Management Data and Command Normalization

Networks use vendor-specific devices that have varying functionality, as well as ones that implement the same functionality in different ways. This is why management standards such as SNMP are, in and of themselves, not enough to adequately manage networks. FOCALE associates one or more ontologies with its DEN-ng-based [20] data and information models. This enables ontologies to represent relationships and semantics that cannot be represented using UML. For example, UML cannot represent the relationship "is similar to" because it does not define logic mechanisms to enable this comparison. Note that this relationship is critical for heterogeneous end-to-end management, since different devices have different languages, programming models, and side effects [1], and administrators need to ensure that the same relative commands are given to devices having different languages. This is why we combine UML models with ontological data to synthesize new semantic capabilities.

The autonomic manager uses ontologies to analyze sensed data to determine the current state of the managed entities being monitored. Often, this task requires inferring knowledge from incomplete facts. For example, consider the receipt of an SNMP alarm. The alarm in and of itself does not provide the business information that the system needs. Which customers are affected by the alarm? Which SLAs of which customers are affected? FOCALE tries to determine this, without human intervention, by examining its libraries of model and ontology data. Once an SLA is identified, it can be linked to business information, which in turn can assign the priority of solving this problem. FOCALE uses a process known as semantic similarity matching [21] to establish additional semantic relationships between sensed data and known facts.
This is required because, in this example, an SLA is not directly related in the model to an SNMP alarm. Inferencing is used to establish semantic relationships between the fact that an SNMP alarm was received and other facts that can be used to determine which SLAs and which customers could be affected by that SNMP alarm. Note that without the use of coherent information
and data models, these facts could not be established; without augmenting this knowledge with ontological data, this inferencing could not be accomplished.

4.4 FOCALE's Context-Driven Policy Management

Figure 4 shows a simplified form of the DEN-ng context model, which relates Context to Management Information to Policy [21], and works as follows. Context determines the working set of Policies that can be invoked at any given time; this working set defines the set of Profiles and Roles that can be assigned, which in turn defines the functionality that can be invoked or provided. Significantly, this model also defines the set of management information that is used to determine how the Managed Element is operating. Note that this proactive definition of how to determine whether a component or function is operating correctly is very important to the central concept of governance. Managed Entity Roles are used to describe the state of the Managed Entity, and are then linked to both Policy and Context by the four aggregations shown. Specifically, Policy is used to define which management information will be collected and examined; this management information in turn affects policy. Context defines the management information to monitor, and the values of these management data affect context, respectively.

Fig. 4. Simplified DEN-ng Context Model

Our context-aware architecture, which controls our autonomic manager, is shown in Figure 5. This architecture enables the type of algorithm, the function, and even the type of data to use to be changed as a function of context. This is facilitated by detecting context changes, which alter the active policies that are being used at any given time.

Fig. 5. Simplified FOCALE Architecture (a Context Manager and Policy Manager control the Autonomic Manager, whose components – Analyze Data and Events, Determine Actual State, Ontological Comparison, Reasoning and Learning, Define New Device Configuration(s) – interact with the Managed Resource through Model-Based Translation)

Current systems that use policy (regardless of whether it is part of an autonomic system) use it in a static way, causing three serious problems: (1) it is impossible for pre-defined policies to anticipate all conditions
DescribedByEntityData
0..n
0..n
ManagementInfo
DescribedByMgmtInfo
0..n
0..n
0..n
Entity
0..1
ManagedEntity
Tak esOnManagedEntityRoles
0..n
GovernsManagementInfo
0..n
0..n
Policy
0..n
0..n
SelectsPolicies
0..n
0..n 0..n
0..n 0..n
0..n
ManagementInfoUsesPolicy
ContextDependsOnMgmtInfo
ManagedEntityRole
0..n
0..n
1
0..n
GovernsManagedEntityRoles
0..n
ManagedEntityRoleUsesPolicy
0..n
PolicyResultAffectsContext
0..n ContextData
0..n
0..n
MgmtInfoAltersContext
ContextDependsOnManagedEntityRoles ManagedEntityRoleAltersContext
0..n
HasContexts
0..1 ContextData Composite
ContextData Atomic
YES
Match?
NO
that can affect a managed resource, let alone a managed system; (2) management systems are static, in that they are designed to manage known resources and services: if a new resource or service is dynamically composed, how can a static management system manage it, and how can pre-defined static policies be applicable?; and (3) if the underlying context or business objectives change, existing policies may no longer be relevant. Therefore, FOCALE enables context changes to vary the policies used, which in turn change the functions of the control loop.

The policies used in our project follow a standard event-condition-action model: events are generated when new wireless system data is available; these events trigger the evaluation of the conditions of one or more policy rules. If the conditions are matched, then one or more policy actions are executed. (Note that this is a very simplified description; for more details, including more granular execution options, please see [20].) In our prototype system, this results in a Causal Analysis, which classifies the reason for the KPI or KQI violation and defines actions to fix the problem. A separate set of policy rules is then used to implement the actions; this allows humans to examine the proposed operation of the system until they gain the confidence needed to let the policies run on their own. While this is possibly not needed in autonomic computing, it is definitely needed in autonomic networking, due to the tremendous complexity of networks.

4.5 FOCALE's Use of Machine Learning and Reasoning

Machine learning and reasoning are provided by the "Ontological Comparison" and "Machine Learning and Reasoning" functions in Figure 5. The former implements semantic similarity matching as previously described, and is used by other components to find equivalent semantic terms in the analysis, learning, and reasoning functions; the latter implements reasoning and learning algorithms.
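The idea behind semantic similarity matching can be illustrated with a deliberately simple token-overlap (Jaccard) score. The actual mechanism in [21] is ontology-based and far richer; the concept strings and threshold below are invented.

```python
# Toy illustration of semantic similarity matching: relate a sensed term
# to the closest known concept. This token-overlap score only conveys
# the idea; FOCALE's real mechanism [21] works over ontologies.

def jaccard(a, b):
    """Similarity of two term strings based on shared word tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def best_match(sensed_term, known_concepts, threshold=0.3):
    """Return the most similar known concept, or None below the threshold."""
    scored = [(jaccard(sensed_term, concept), concept)
              for concept in known_concepts]
    score, concept = max(scored)
    return concept if score >= threshold else None

concepts = ["snmp link down alarm", "sla violation", "interface utilization"]
match = best_match("link down alarm", concepts)  # "snmp link down alarm"
```

A matched concept is what lets inferencing proceed from a raw alarm toward related facts (affected SLAs, affected customers) that the model alone does not connect directly.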
A bus enables the reasoning and learning components to "watch" the current operation being performed, and to act in addition to, or instead of, that operation. For example, machine learning can examine the operations being performed, and note the effectiveness of the actions taken given the context, policies, and input data used. This can be used to build a knowledge base that helps guide future decisions. Similarly, an abductive reasoning algorithm can be used to generate hypotheses as to the root cause of sensed problems, which the autonomic manager then tries to verify by using the models and ontologies to query the system and the environment, gathering more data to support each hypothesis. The combination of these learning and reasoning algorithms enables the autonomic system to adapt to changing business goals, user needs, and environmental conditions.

Machine learning enables the incorporation of new and learned behavior and data. Information modeling facilitates machine learning by providing a general-to-specific ordering of functionality, as well as details regarding aggregation and connectivity, within the system. Machine learning can be applied to temporal, spatial, and hierarchical system aspects, allowing for learned behavior about different system "cuts" or cross-sections.

Hypothesis formation is a mapping of the data to be explained into the set of all possible hypotheses, rank-ordered by plausibility [22]. We define the hypothesis space by using a combination of the information model, the results of the topology discovery
process, and axiomatic knowledge. If the hypothesis space is too large, then falsification techniques can be used to provide a "critic" function [22], which helps reject some hypotheses and reduce the cardinality of the hypothesis space. Two examples of a critic function are (1) incorporating upper and lower boundary conditions on capacities and qualities directly into the information model to use for comparison purposes, and (2) the use of ontological relationships, such as "never-has-a", "is-not-a-kind-of", and especially "does-not-cause" relationships.

One or more machine learning algorithms may be employed to gain experience from the environment, and to aid the reasoning process. We expand the traditional definition of machine learning [23] to include notions of identifying specific values (statistics), identification of specific attribute-value pairs (traditional "data mining"), and finally the identification of attributes and processes linked to events (our new definition of "machine learning"). Clearly, techniques such as candidate elimination and decision trees, coupled with notions of positive and negative examples, may be employed to help define those attributes of interest surrounding an anomalous event. However, these techniques tell us nothing about the cause behind the event (how the attributes might be linked), nor about its sequel effects and consequences. Furthermore, they convey no understanding. Hence, our machine learning approach combines modeled data with the knowledge of subject matter experts to define a set of axioms and theories. We use machine learning to maintain and repair established theories, as well as to find successively minimal descriptions of those theories upon encountering future examples of the behavior they describe. Finite state machines are a way of encoding behavior, and these may be considered a form of causal structure.
The transition probabilities between states need to be maintained for any managed entity whose behavior varies with context. Machine learning and statistics are critical in refining transition probabilities and in maintenance/repair activities, as well as in finding behavioral cues by linking state changes with the stimulus/response pairs that describe behavior.
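Maintaining transition probabilities from observed stimulus/response pairs can be sketched as follows. The states and stimuli are invented, and real FOCALE learning is considerably more sophisticated than this counting scheme.

```python
# Sketch of a finite state machine whose transition probabilities are
# refined from observed behavior. States and stimuli are invented.

from collections import defaultdict

class ProbabilisticFSM:
    def __init__(self):
        # counts[(state, stimulus)][next_state] -> observation count
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, state, stimulus, next_state):
        """Record one observed stimulus/response transition."""
        self.counts[(state, stimulus)][next_state] += 1

    def probability(self, state, stimulus, next_state):
        """Empirical probability of the transition, 0.0 if never observed."""
        total = sum(self.counts[(state, stimulus)].values())
        if total == 0:
            return 0.0
        return self.counts[(state, stimulus)][next_state] / total

fsm = ProbabilisticFSM()
fsm.observe("idle", "load-spike", "congested")
fsm.observe("idle", "load-spike", "congested")
fsm.observe("idle", "load-spike", "idle")
p = fsm.probability("idle", "load-spike", "congested")  # 2/3
```

Each new observation refines the estimate, which is the maintenance/repair activity described above in its simplest form.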
5 Applying FOCALE to Wired/Wireless Network Management

Sensor data from the Managed Element is analyzed to determine if the current state of the Managed Element is equal to its desired state. If it is, the process repeats. If it is not, then the autonomic manager examines the sensor data. If the autonomic manager already understands the data, then it continues to execute the processes that it was performing. Otherwise, the autonomic manager examines the models and ontologies to develop knowledge about the received data (the remaining steps are complex and beyond the scope of this paper; indeed, this first step can often be enough). This knowledge is then fed to a set of machine-based learning and reasoning algorithms that reason about the received data. For example, if the data represents a problem, then the algorithms try to determine its root cause; once a cause is found, actions are issued, which are translated into vendor-specific commands by the model-based translation functions and applied to the appropriate Managed Elements. Note that this may include Managed Elements that were not the cause of the problem, or that were not being monitored. The cycle then repeats itself, except that in general the monitoring points will have changed, to ensure that the reconfiguration commands had their desired effect.
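The model-based translation step can be illustrated in miniature: one vendor-neutral action is mapped into each device's own command syntax. Both vendors and both command formats below are invented for illustration.

```python
# Sketch of model-based translation: a vendor-neutral action is turned
# into vendor-specific commands. Vendors and syntaxes are invented.

TRANSLATORS = {
    "vendorA": lambda action: f"confA {action['op']} {action['target']}",
    "vendorB": lambda action: f"cfg-{action['op']}::{action['target']}",
}

def translate(action, device_vendor):
    """Translate a neutral action into the device's own command language."""
    return TRANSLATORS[device_vendor](action)

action = {"op": "set-qos", "target": "eth0"}
cmd_a = translate(action, "vendorA")   # "confA set-qos eth0"
cmd_b = translate(action, "vendorB")   # "cfg-set-qos::eth0"
```

The autonomic logic above this layer reasons only about the neutral action; the per-vendor differences live entirely in the translation table, which is the normalization described in Section 4.3.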
6 Using FOCALE to Implement Seamless Mobility

One of the many challenges of Seamless Mobility is providing seamless management, not only across wired and wireless domains, but also across multiple domains of various RATs. Each of these radio access technologies has its own set of specifications, provides different functionality, and needs different governance and management mechanisms. For example, an SLA for a customer will have different specifications depending on the type of network the customer is currently using. A customer could expect better voice service quality on a GSM network than on a Wi-Fi network using VoIP; this knowledge is easily encoded into ontologies, while the detailed settings for network configuration are encoded in our data models. Furthermore, the SLAs in this example will be different for each network type and RAT; they could also be affected by the vendor-specific functionality of the devices being used.

Context-driven policy management determines which policies are to be used to govern which functionality of which type of network. Furthermore, other business-driven policies can also be used to control the overall experience in a Seamless Mobility environment. For example, if a user with a dual-mode phone enters an area where a Wi-Fi connection is available, the decision to switch from a cellular call to a Wi-Fi VoIP call should not be based only on the power availability of the network, but on a number of other factors that are context-specific. For example, suppose a customer has three profiles in a Seamless Mobility environment: (1) a Business Profile, in which their devices must use their employer's network services; (2) a Home Profile, in which the customer wants to minimize cost (since these are services that the customer is paying for); and (3) an Entertainment Profile, which is currently designed to maximize the video-on-demand experience.
Simple business policies for this scenario could look as follows:

IF Business Profile is Enabled THEN maximize Security
ELSE IF Home Profile is Enabled THEN minimize Cost
ELSE maximize Quality of Service for the Video on Demand application

In this scenario, the phone, being a device controlled by Seamless Mobility policies, will act differently depending on the context (i.e., the particular profile that is currently active). For example, if the Business Profile is active, the phone will strive to maximize security, and hence hand over to the most secure network among its choices (or even alert the user that no such network service exists!). In contrast, if that same device is in Home Profile mode, its prime directive is to minimize cost, and hence it will try to hand over any calls received to the cheaper Wi-Fi network to save money for the customer. However, this solution will not be blindly followed, since the Wi-Fi network may not be able to provide the SLA requirements of the customer. Hence, the autonomic manager may determine that the call should not be switched, in order to fulfill the customer's SLA requirements (or, alternatively, alert the user to the possible degradation in quality and ask whether that is acceptable). Note that in both of these cases, we strive to avoid making "black or white" decisions. In other words, just because a policy says "hand over to the cheapest network" does not always mean that this should be done if it violates some other user metric. It is this ability to reason that makes autonomics so valuable for Seamless Mobility.
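The profile-driven policies above can be illustrated with a small selection function. The profile names, network attributes, and scores below are hypothetical; as discussed above, a real deployment would also verify SLA constraints before switching rather than blindly applying the policy.

```python
# A minimal sketch of the profile-driven policies above. Profile names,
# network attributes, and scores are hypothetical illustrations.

def pick_network(active_profile, candidates):
    """Rank candidate networks by the goal of the active profile."""
    if active_profile == "business":
        key = lambda n: -n["security"]      # THEN maximize Security
    elif active_profile == "home":
        key = lambda n: n["cost"]           # ELSE IF ... minimize Cost
    else:
        key = lambda n: -n["video_qos"]     # ELSE maximize VoD quality
    return min(candidates, key=key)["name"]

nets = [
    {"name": "corp_vpn", "security": 9, "cost": 5, "video_qos": 4},
    {"name": "home_wifi", "security": 4, "cost": 1, "video_qos": 6},
    {"name": "cellular", "security": 6, "cost": 8, "video_qos": 8},
]
```

For instance, with these hypothetical scores the Business Profile selects `corp_vpn`, while the Home Profile selects `home_wifi`.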
340
J. Strassner, D. Raymer, and S. Samudrala
(Similarly, the ability of the phone to learn user preferences in such situations and act more intelligently on behalf of the user is also very valuable, but is beyond the scope of this paper).
7 Conclusions

This paper has described the application of the novel FOCALE autonomic networking architecture to realize Seamless Mobility solutions. The rationale behind FOCALE was explained by examining current problems in network management, along with the challenges of next generation networks and applications such as Seamless Mobility, and how autonomic principles can be used to meet these challenges. Against this background, autonomic networking was differentiated from autonomic computing. FOCALE builds on these differences and introduces six novel enhancements to current autonomic architectures: the use of multiple control loops, the use of models with ontologies to develop state machines for orchestrating behavior, the normalization and correlation of diverse data, the ability of control loop components to change their operation according to context, and the incorporation of reasoning and learning mechanisms. Behavioral orchestration is achieved by deducing the current state of a Managed Resource through analyzing and reasoning about sensed management data. This is done using a novel set of control loops that are reconfigured to meet the changing context and/or data being examined. Machine learning is facilitated by our use of models and ontologies; machine reasoning is used for hypothesis generation and theory maintenance. Finally, we explained how FOCALE could be used to realize Seamless Mobility. Handovers are analyzed using the model-based translation functions; models and ontologies enable us to gather and reason about sensed data, and ensure that the correct commands for heterogeneous network devices are issued. Future work will expand on the types of management data that are analyzed, as well as evaluate different types of machine learning and reasoning algorithms for different situations in Seamless Mobility applications.
Evaluation of Joint Admission Control and VoIP Codec Selection Policies in Generic Multirate Wireless Networks

B. Bellalta, C. Macian, A. Sfairopoulou, and C. Cano

Network Technologies and Strategies (NeTS) Research Group, Departament de Tecnologies de la Informació i Comunicació, Universitat Pompeu Fabra, Passeig de Circumval·lació 8, 08003 Barcelona, Spain
{boris.bellalta, carlos.macian, anna.sfairopoulou, cristina.cano}@upf.edu
Abstract. Multirate wireless networks share a common problem for the transmission of VoIP traffic: a rate change by some of the flows affects the transmission of all the others, causing an unacceptable voice quality degradation. In this work, an admission control algorithm is combined with a codec selection block to mitigate this negative effect. The centralized admission control is able to block or drop calls in order to maintain system stability despite rate changes. Moreover, the integrated codec selection block is able to select the most adequate VoIP codec for each incoming or already active call, based on the channel rate used and according to a number of optimization policies. Several such policies are designed and evaluated in this paper. Results show that this combined adaptive solution provides a beneficial trade-off among the performances of the different codecs in terms of MOS, blocking and dropping probability, and cell resource usage. Furthermore, a number of interesting singularities deriving from the multirate nature of the network are identified.

Keywords: VoIP codecs, Mobile Networks, Multiple channel transmission rates, WLANs.
1 Introduction
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 342–355, 2007. © Springer-Verlag Berlin Heidelberg 2007

All kinds of wireless channels suffer from high error rates due to variable fading, noise and interference, which hinder correct data transmission (for example, between a mobile node (MN) and an access point (AP)). Moreover, the channel profile also changes during a communication session due to the MN mobility pattern, the distance d between the two communicating nodes being one of the key parameters: the farther apart, the lower the quality. In order to allow for larger coverage areas (larger values of d) with adequate communication conditions, the system itself increases the data protection by increasing the channel coding rate (longer redundancy patterns) and using more robust modulations. Hence, longer messages have to be transmitted for the same amount of user data, while using a less efficient modulation, which results in
longer delays to transmit the same amount of information. As a consequence, the available bit rate for user data is greatly reduced. Such systems are referred to as multirate systems: systems which dynamically choose the transmission rate (code and modulation) to guarantee a lower error rate in the transmissions [1]. WLANs are but one example of such systems, using a multirate scheme in order to achieve large coverage areas (a radius of about 100 meters). MNs near the AP see good channel conditions and can therefore use high transmission rates (low-protection codes and efficient modulations) to obtain a high throughput. However, MNs far from the AP observe bad channel profiles and require lower transmission rates. For example, IEEE 802.11b [2] defines four transmission rates: 11, 5.5, 2 and 1 Mbps, which are obtained from the use of a punctured convolutional code and four different modulations [2]. As shown in [3], and as a consequence of the CSMA/CA random access protocol, 802.11b MNs using low transmission rates degrade the performance of MNs using higher transmission rates, as the latter have to wait for the long transmission delays of the slow ones. For best-effort (elastic) data transmission, the use of multiple rates does not have a critical impact on the user or network performance: the effect is to lower the overall network throughput, which translates simply into larger delays for transmitting or receiving the desired data (for example, downloading a web page). For inelastic and/or real-time traffic such as VoIP, however, the variable channel data rate can be very problematic. To understand this, consider that n VoIP calls are active and that their total bandwidth requirement at time t, B(n, t), is lower than the channel capacity, B(n, t) < C(t). At time t + 1, a single MN suffers a reduction in its transmission rate, which results in C(t + 1) < C(t) and B(n, t + 1) > C(t + 1).
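The feasibility condition above can be illustrated numerically. The values below are hypothetical (a 1 Mbps channel and two-way 128 Kbps calls, i.e., a 64 Kbps codec in each direction); the point is only how a single rate reduction inflates a call's share of the channel.

```python
# Numerical illustration of the feasibility condition discussed above.
# Values are hypothetical: a 1 Mbps channel and two-way 128 Kbps calls.

def total_demand(calls, capacity):
    """Aggregate demand B(n, t) of n calls on a multirate channel.

    Each call is (two_way_kbps, rate_kbps); a call transmitting at rate
    R on a channel of capacity C occupies the excess factor C / R.
    """
    return sum(b * (capacity / r) for b, r in calls)

C = 1000.0                                   # channel capacity, Kbps
calls = [(128, C), (128, C)]                 # both MNs at full rate
assert total_demand(calls, C) <= C           # feasible: B(n, t) < C(t)

# One MN falls to rate C/11: its share of the channel grows 11-fold and
# the aggregate demand now exceeds the capacity, B(n, t+1) > C(t+1).
calls = [(128, C), (128, C / 11)]
assert total_demand(calls, C) > C
```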
This situation degrades the quality of all n VoIP calls, as they are not able to achieve the necessary (constant) throughput for the codec in use, while packet losses and delay increase. Possible solutions are to drop one or more calls (actually, dropping the call which has changed to a lower transmission rate would be enough) or to reduce B(n, t + 1) until at least B(n, t + 1) = C(t + 1). One way to do this is to use lower-bandwidth VoIP codecs, which can be achieved by switching to another codec [4,5,6], using AMR (Adaptive Multi-Rate) codecs [7,8], changing the codec parameters (different packetization intervals), or by choosing the codec and packetization interval that provide the most appropriate combination [9]. In most of the works mentioned above, the authors use threshold metrics to invoke the adaptation procedure when the quality degradation becomes unacceptable. The most commonly used metrics are the packet loss ratio and the end-to-end delay, since they are the critical parameters in a VoIP transmission. Of these works, only [6] and [9] address the specific problems of a multirate environment, while the others use the adaptive mechanism to deal mostly with bad channel conditions. Also, most of the above-mentioned works focus only on one part of the problem, either addressing the new calls arriving to the network with an admission control mechanism, or the ongoing calls with the codec adaptation procedure. Only [5] and [9] provide a combined mechanism using a codec adaptation module working together with admission control. Still,
both solutions are somewhat limited: the former proposes changing the codec of all calls when congestion is detected, while the latter proposes changing only the codec of the node that suffers a rate change, which, depending on the case, might not be enough to correct the congestion. As for the works based on AMR codecs, their working principle could be useful for our purposes if the AMR codecs reacted to network congestion rather than to bad channel conditions (bit channel errors or measured SNR values), since in our scenario a MN may suffer from the extra bandwidth required by other MNs with low transmission rates even while observing very good channel conditions itself. In any case, a decision block is required that selects the most suitable VoIP codec rate. In this paper, a combined call admission control (CAC) scheme and dynamic VoIP codec rate selection module is presented to alleviate the effects of the multirate issue in a generic wireless network. The CAC is not only able to block (incoming) or drop (active) VoIP calls, but also includes a policy to select the most adequate VoIP codec rate, in order to avoid having to block or drop calls by adapting them to the current cell situation. Different adaptation policies are proposed and evaluated, which try to optimize the network usage according to different criteria: call quality, number of simultaneous active calls, fairness, etc. These are simple yet representative policies, which provide a basic insight into the intrinsic trade-offs found in the most common adaptation policies when applied to multirate environments. Only general and simple assumptions about the channel conditions are made to evaluate the policies, in order to make the results qualitatively valid for a broad range of wireless technologies.
Results show the relation between the different metrics of interest for designing a successful VoIP service: offered load, blocking/dropping probabilities, speech quality and number of simultaneous calls.
2 A Multirate Scenario
Throughout the paper, a single wireless channel (cell) of capacity C is ideally shared (statistically multiplexed) among n Mobile Nodes (MNs) and a single Access Point (AP). An example of the considered scenario is depicted in Figure 1, where MNs use different VoIP codecs at different transmission rates. Each node tries to use the bandwidth it requires, without any fixed channel division into time slots, frequency sub-bands or codes, which allows each node to access the complete channel capacity. No assumptions are made about the MAC protocol employed, except that all MNs share a common wireless channel with equal access probability. Consequently, if the bandwidth required by all active nodes is higher than the channel capacity, all of them will see their communications degraded, as they will all suffer packet losses and/or higher packet delays (overloaded MAC transmission queues). Due to the channel conditions, a node i (including the AP) will transmit on the channel using the transmission rate Ri, occupying a relative bandwidth Bi* = αi·Bi, where αi = C/Ri is the bandwidth excess factor due to using lower
Fig. 1. Sketch of the considered scenario (MNs using G.711, G.726, G.728 and G.729 VoIP codecs at transmission rates from 1 to 11 Mbps, all sharing the channel through a single AP that aggregates 208 Kbps across the four codecs)
transmission rates than the channel capacity C, and Bi is the bandwidth required by node i. The set of channel transmission rates is R = [R1, R2, ..., RN], here R = [C, C/2, C/5.5, C/11], where the channel capacity is C = 1 Mbps.

2.1 VoIP Codec-Rate Adaptation
Each VoIP call comprises one downlink flow and one uplink flow. Since all flows go through the AP, the AP requires the same amount of bandwidth as all MNs together. Thus, VoIP call i consumes a channel capacity equal to 2·Bi(va), where Bi(va) is the one-way throughput of a flow using the specified VoIP codec-rate va for MNi.¹ It is also assumed that the same transmission rate Ri is used for the uplink and downlink directions (see Figure 2), and the relative capacity required when transmitting at rate Ri is Bi*(va, Ri) = αi·Bi(va). Figure 2 shows how MNs using low transmission rates require a relative capacity Bi*(va, Ri) which is higher than that of MNs using higher transmission rates (see call 2, which uses the same VoIP codec as call 1, with R1 > R2). To mitigate this effect, MNs using low transmission rates could use low-rate VoIP codecs in order to keep their Bi*(va, Ri) low.

¹ For the sake of generality, and since it does not affect the validity of our results, only the nominal codec rate is used, obviating all protocol-dependent overheads.
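Under these definitions, the relative capacity of one call can be computed directly. The sketch below assumes the parameter sets used in this paper (C = 1 Mbps, codec rates [64, 32, 16, 8] Kbps).

```python
# Relative capacity of one two-way call under the definitions above:
# B*_i(v_a, R_i) = alpha_i * B_i(v_a) with alpha_i = C / R_i, and the
# call consumes 2 * B*_i because both its uplink and downlink flows
# cross the AP. Parameter values follow the sets used in this paper.

C = 1000.0                              # channel capacity, Kbps

def call_capacity(codec_kbps, rate_kbps):
    alpha = C / rate_kbps               # bandwidth excess factor
    return 2 * alpha * codec_kbps       # uplink + downlink

# At full rate a 64 Kbps call costs 128 Kbps of channel capacity; at the
# slowest rate (alpha = 11) the same codec costs 1408 Kbps, exceeding
# the whole channel, while the 8 Kbps codec keeps the slow call at a
# feasible 176 Kbps.
assert call_capacity(64, C) == 128.0
assert abs(call_capacity(64, C / 11) - 1408.0) < 1e-6
assert abs(call_capacity(8, C / 11) - 176.0) < 1e-6
```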
Fig. 2. Example with three calls sharing the channel (calls 1 and 2 use a 64 Kbps codec at rates R1 and R2 respectively, call 3 an 8 Kbps codec at R2; each call occupies an uplink and a downlink share of the channel capacity C)
It is assumed that the considered VoIP codec-rates provide different voice-quality results, this quality being proportional to the required bandwidth.² Therefore, considering the set of V VoIP codecs, an integer value ranging from 1 (lowest bandwidth codec) to N (highest bandwidth codec) is assigned to each VoIP codec as a measure of its voice quality. The set of VoIP codec rates is V = [v1, v2, ..., vN]. Four VoIP codec rates have been considered, with V = [64, 32, 16, 8] Kbps and quality Q = [1, 2, ..., N].

2.2 Admission Control
The cell state is governed by the admission control entity. It decides whether to accept or reject new calls based on the current system state and the information carried in the admission request transmitted by the MN or arriving from the fixed network. Moreover, it is able to drop active calls when the system state becomes unstable due to the behavior of the already accepted calls, for example due to a transmission rate change.³ A VoIP codec selection module is co-located with the admission control. It suggests the VoIP codec to be used following a given policy. For example, a valid policy would be "always use the G.711 VoIP codec, independently of the considered rate, for it gives the best voice quality". The set of policies used in this work is introduced in the next section. A block scheme of the considered admission control is shown in Figure 3. For evaluation purposes, it is considered that VoIP calls arrive to the system from an assumed infinite population with an arrival rate λ (calls/second) following a Poisson process. The call duration has a mean of 1/μ seconds and

² It is theoretically possible to find a codec with higher bandwidth requirements and a lower achieved MOS than a contender; indeed, such codecs have been proposed in practice. However, once a better performing option is found, the older codec is abandoned. Hence, in our argumentation, we assume an ordered list of codec-rates along a bandwidth-MOS axis.
³ In general, for any change to a faster rate than the one in use, the resulting system state is always feasible. Therefore, dropping calls only occurs when a MN using a faster rate changes to a slower one.
Fig. 3. Call Admission Control scheme with VoIP codec selection (block diagram: new call requests arrive at rate λ and rate changes occur at rate γ(n); the codec selection policy consults the cell state information, i.e., bandwidth used, number of active calls n, and the codec and rate of each call; calls complete at rate μ(n)-d(n) and are dropped at rate d(n))
follows an exponential distribution. A new call arrives to the system using a transmission rate Ri picked from the set of existing rates R. The probability of picking a given rate is uniformly distributed among all rates (i.e., all rates are equally likely). Once a call is admitted, it suffers rate changes at exponentially distributed intervals with mean 1/γ seconds (the overall cell rate-change rate is nγ). Notice that calls depart the system either upon completion, at rate μ(n), or because they are dropped, which occurs at rate d(n). How the codec negotiation and management takes place is outside the scope of this paper, for it is protocol- and technology-specific. However, as an example, the SIP [10] protocol provides the possibility of updating the codec used during an active connection by means of an additional INVITE message. Obviously, in-call signalling to adaptively select the proper codec (from the set of codecs available to the two end points) without extra signalling overhead would be the desirable solution.
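The arrival and departure dynamics above can be sketched as a small discrete-event simulation. This is a minimal stand-in, not the simulator used in this paper: admission here is a fixed capacity budget, and the rate-change process (γ) is omitted for brevity.

```python
# A minimal discrete-event sketch of the call dynamics described above:
# Poisson arrivals at rate lam and exponential holding times with mean
# 1/mu. Simplified stand-in only: admission is a fixed capacity budget
# and rate changes (the gamma process) are omitted.
import heapq
import random

def simulate(lam, mu, horizon, budget, demand, seed=1):
    """Return (accepted, blocked) call counts over `horizon` seconds."""
    random.seed(seed)
    events = [(random.expovariate(lam), "arrival")]
    used = 0.0
    accepted = blocked = 0
    while events:
        t, kind = heapq.heappop(events)
        if t > horizon:
            break
        if kind == "arrival":
            # schedule the next arrival (Poisson process)
            heapq.heappush(events, (t + random.expovariate(lam), "arrival"))
            if used + demand <= budget:          # stand-in admission test
                accepted += 1
                used += demand
                heapq.heappush(events,
                               (t + random.expovariate(mu), "departure"))
            else:
                blocked += 1
        else:                                    # departure frees capacity
            used -= demand
    return accepted, blocked

# Offered load lam/mu = 15 Erlangs on a 1000 Kbps budget with 128 Kbps
# per call: heavy blocking is expected.
accepted, blocked = simulate(0.125, 1.0 / 120, 10000.0, 1000.0, 128.0)
```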
3 VoIP Codec-Rate Selection Policies
To mitigate the multirate problem, several codec selection policies can be implemented, depending on the criterion to be optimized. A number of trade-offs exist among such criteria as call quality, maximum number of simultaneous active calls, resource usage efficiency, dropping and blocking probabilities, signalling overhead, complexity, and the time needed to regain system stability after a rate change. In this section, a set of simple policies is proposed and later evaluated. The chosen policies are valuable because they provide an upper and a lower bound on the call quality versus call quantity trade-off, arguably one of the most critical aspects when tuning any
admission control scheme for VoIP. Moreover, the policies have been chosen to exemplify the most common approaches for codec adaptation found in the literature, albeit in a very simple form. As such, they are both representative of more complex schemes and yet intuitively easy to grasp. Furthermore, they isolate the most common adaptation mechanisms, so as to identify their individual effects, which in more sophisticated policies are usually found combined in refined but complex ways. The set of considered policies is the following:

P1: Always Best Quality (ABQ). The highest bandwidth VoIP codec-rate is used for all calls, without consideration of the transmission rate. This policy results in high VoIP speech quality but suffers from higher blocking and dropping probabilities. As such, it provides the upper bound on call quality, and it is expected to show the worst blocking and dropping probabilities.

P2: Always Maximum Calls (AMC). The lowest bandwidth VoIP codec-rate is used for all calls, without consideration of the transmission rate. Being the opposite of the previous policy, AMC provides the lower bound on quality but an upper bound on the number of simultaneous active calls, and it is expected to provide the lower bound on blocking and dropping probabilities.

P3: Constant Relative Capacity Consumption (CRCC). A specific VoIP codec-rate is assigned to each transmission rate, so as to equalize the capacity consumption across transmission rates, i.e.: Bi*(va, Ra) ≈ Bi*(vb, Rb), ∀a, b, where a and b belong to the set of codec-rate pairs. Therefore, a MN using transmission rate Ra will use the VoIP codec va. Notice that this is a "fair" policy, as all MNs obtain a voice quality proportional to their transmission rate. The typical operation mode of AMR codecs matches this policy.

P4: Capacity Before Quality for the target call (CBQ-1).
Based on P3, a MN perceiving a change to a transmission rate Ra is allowed to choose a VoIP codec equal to or lower than the codec va prescribed by P3. This policy allows each VoIP call to proceed with a worse voice quality while reducing its blocking and dropping probabilities, as it is able to check whether lower codec-rates would be better suited to the network conditions. This would allow, for example, an incoming call to be accepted in an otherwise saturated network by accepting an "unfairly" low-rate codec for its transmission rate. It is expected to provide a higher number of simultaneous calls than P3, but with lower speech quality.

P5: Capacity Before Quality with Iterative Capacity Reduction (CBQ-n-IC↓). Based on P4, this policy also contemplates reducing the VoIP codec-rate of other active calls.⁴ Different criteria can be used to select which calls have to suffer a VoIP codec degradation despite experiencing higher transmission rates.

⁴ Again, it is not considered how this policy would be implemented in practice. However, intuitively, a centralized cell optimization instance, which could be placed jointly with the AP and/or a SIP proxy, could be a good candidate.
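Policies P1-P4 can be written as small codec-selection functions. The P3 pairing below (aligning the sorted rate and codec lists) is one concrete way to roughly equalize the relative capacity across rates for this paper's parameter sets; the `fits` predicate in P4 is a hypothetical stand-in for the admission controller's cell-state check.

```python
# Sketch of policies P1-P4 as codec-selection functions. The P3 pairing
# and the `fits` predicate are illustrative choices, not prescribed by
# the paper.

C = 1000.0                               # channel capacity, Kbps
RATES = [C, C / 2, C / 5.5, C / 11]      # fastest to slowest
CODECS = [64, 32, 16, 8]                 # Kbps, best to lowest quality

def p1_abq(rate):
    return CODECS[0]                     # P1: always best quality

def p2_amc(rate):
    return CODECS[-1]                    # P2: always maximum calls

def p3_crcc(rate):
    return CODECS[RATES.index(rate)]     # P3: fixed rate/codec pairing

def p4_cbq1(rate, fits):
    # P4: start from the P3 codec, fall back to any lower one that fits
    for codec in CODECS[RATES.index(rate):]:
        if fits(codec):
            return codec
    return None                          # no codec fits: block/drop
```

With these sets, the P3 pairing yields relative one-way capacities of 64, 64, 88 and 88 Kbps across the four rates, which is the approximate equalization the policy aims for.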
The algorithm followed by this policy is:
1. Select the lowest codec-rate for the target VoIP call.
2. Compute the bandwidth required to accept (or, for already active calls, not to drop) the target VoIP call i; this required bandwidth is labelled RBi.
3. Compute the amount of bandwidth that can be re-allocated from other calls and assigned to call i, ABi. If ABi ≥ RBi, a feasible solution exists and the process is initiated. Otherwise, block (drop) the target call i.
4. Starting with the active calls with the lowest transmission rate, decrease each call to a lower codec-rate until RBi is achieved. If all calls have suffered a codec-rate reduction and RBi has not been achieved, re-start the process.⁵

The rationale behind starting with low-rate calls is that a codec-rate change from 16 to 8 Kbps in an active VoIP call using an excess bandwidth proportional to α = 11 reduces the consumed relative capacity from 176 Kbps to 88 Kbps, a capacity saving of 88 Kbps, while the same change in a call using an excess bandwidth proportional to α = 1 reduces the consumed capacity by only 8 Kbps. Hence, it is more effective to begin with lower-rate calls.

P6: Capacity Before Quality with Iterative Capacity Reuse (CBQ-n-IC↑). This is an extension of P5. Each time a call arrives, ends or is dropped, the available/released capacity is assigned (if possible) to other VoIP calls in order to increase their VoIP codec-rate and hence the quality of the communication. No unused capacity is kept as long as better quality for active calls can be achieved, but preference is given to a higher number of simultaneous calls, as in P5.
1. Compute the capacity freed by a leaving or rejected VoIP call i, or available when a new VoIP call arrives to the system, FBi.
2. Starting with the active call with the lowest transmission rate, increase each call to a higher codec until FBi is exhausted. If FBi is not exhausted after upgrading all calls by one codec, re-start the process.⁶

The rationale for starting with the lower codec-rates is to mitigate the impact of the initial P5 policy, as it allows increasing the quality of the VoIP calls that have probably suffered most from previously lowering their VoIP codec. However, it would be more efficient to start with high transmission rate calls, as an increment of a single codec implies a lower relative capacity consumption; this would provide the most efficient resource usage in all cases. The performance of each policy will also be linked to the available sets of codec-rates and transmission rates. A higher density of codec-rates (i.e., smaller rate changes between two codecs) would allow for a finer tuning of the network resources.

⁵ If a call is decreased by more than one codec in the same step, it is considered to be a single change.
⁶ If a call is increased by more than one codec in the same step, it is considered to be a single change.
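The four P5 steps can be sketched as follows. This is an illustrative reading of the algorithm, assuming the paper's codec set and a 1 Mbps channel; the call representation (dicts with a codec and a transmission rate, both in Kbps) and the small numeric tolerance are implementation choices, not part of the paper.

```python
# An illustrative reading of the four P5 steps above, assuming the
# paper's codec set and a 1 Mbps channel.

C = 1000.0                               # channel capacity, Kbps
CODECS = [64, 32, 16, 8]                 # Kbps, best to lowest quality
EPS = 1e-9                               # float tolerance

def rel_cap(call):
    """Two-way relative capacity 2 * alpha_i * B_i of one call."""
    return 2 * (C / call["rate"]) * call["codec"]

def admit_p5(target, active):
    """Admit `target` under P5, degrading other calls if needed."""
    target = dict(target, codec=CODECS[-1])            # step 1
    free = C - sum(rel_cap(c) for c in active)
    needed = rel_cap(target) - free                    # step 2: shortfall RB_i
    max_saving = sum(rel_cap(c) - 2 * (C / c["rate"]) * CODECS[-1]
                     for c in active)                  # step 3: AB_i
    if needed > max_saving:
        return False                                   # block/drop target
    while needed > EPS:                                # step 4
        progressed = False
        for c in sorted(active, key=lambda c: c["rate"]):  # slowest first
            if needed <= EPS:
                break
            i = CODECS.index(c["codec"])
            if i + 1 < len(CODECS):                    # one codec step down
                needed -= 2 * (C / c["rate"]) * (CODECS[i] - CODECS[i + 1])
                c["codec"] = CODECS[i + 1]
                progressed = True
        if not progressed:
            return False                               # defensive guard
    active.append(target)
    return True
```

For example, with one active 64 Kbps call at rate C/5.5 (704 Kbps of relative capacity) and one at full rate (128 Kbps), a new call at rate C/11 only fits after the slow call is degraded to 32 Kbps, which is exactly where step 4 starts.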
4 Performance Results
The joint call admission control and VoIP codec-rate selection mechanism are evaluated using a simulation model of the described system. The simulator is built upon the Component Oriented Simulation Toolkit (COST) simulation libraries [11], allowing for a modular implementation of the proposed scheme (see Figure 3). The considered parameters are shown in Table 1.

Table 1. Simulation Parameters

  Parameter                      | Scenario 1            | Scenario 2
  -------------------------------+-----------------------+-----------
  Channel Capacity (C), Mbps     | 1                     | 1
  Available Channel Rates, Mbps  | [C, C/2, C/5.5, C/11] | same
  VoIP Codec Rates, Kbps         | [64, 32, 16, 8]       | same
  1/μ, seconds/call              | 120                   | 120
  λ, calls/second                | A·μ                   | A·μ
  Traffic Load (A), Erlangs      | variable              | 15
  γ, rate changes/second         | 1/60                  | variable
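The load parameters in Table 1 are tied together by the Erlang relation A = λ/μ; for instance, Scenario 2's fixed load of 15 Erlangs with 120-second calls implies λ = 0.125 calls/second:

```python
# The load relation behind Table 1: the offered traffic A (in Erlangs)
# equals lambda / mu, so the arrival rate for a target load is
# lambda = A * mu.

mean_holding = 120.0            # 1/mu: mean call duration, seconds
mu = 1.0 / mean_holding         # call completion rate

A = 15.0                        # offered load in Erlangs (Scenario 2)
lam = A * mu                    # arrival rate, calls/second

assert abs(lam - 0.125) < 1e-12
assert abs(lam / mu - A) < 1e-9   # consistency: A = lambda / mu
```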
In Figure 4 the blocking (a,c) and dropping (b,d) probabilities are shown. Figure 5 shows the average number of simultaneous active calls (a,c) and the average bandwidth used (b,d). Finally, Figure 6 shows the voice quality index (a,c) and the average number of codec-rate changes per call (b,d), for the different policies. P1 and P2 are the reference policies, as they always select a fixed codec independently of the system state and the transmission rate used by the requesting call. P1 uses the highest bandwidth codec, which results in the lowest average number of active calls. This low value is due to the high blocking (dropping) probabilities that new (already active) VoIP calls suffer, which, as expected, increase with the traffic load. Transmission rate changes cause a very high dropping probability for active calls (higher than with the other policies), as most changes to a lower transmission rate imply dropping the call (for example, note that a 128 Kbps call using R1 which changes to R4 is always dropped, as it then requires more bandwidth than the channel capacity). Nevertheless, all calls that depart from the system satisfactorily have perceived the best voice quality. The opposite case is P2, which uses the lowest bandwidth codec-rate. P2 thus allows the highest average number of simultaneous calls among all policies, due to its lower blocking probability. However, the high acceptance probability makes the system more vulnerable to call drops. Hence, P2 only shows the lowest dropping values at very low load conditions. Obviously, it provides the lowest VoIP voice quality. Neither P1 nor P2 implies any codec change during the duration of a call.
However, a performance lower than intuitively expected is observed, showing a low number
Fig. 4. VoIP call Blocking and Dropping Probabilities; Scenario 1 (a-b), Scenario 2 (c-d) (curves for policies P1-P6; Scenario 1 panels plot probability vs. traffic load in Erlangs, Scenario 2 panels vs. 1/γ)
of simultaneous active calls (higher only than P1). This is due to the high blocking probability, which is higher than that of P1 for high offered load values, owing to P3's low dropping probability at those points. Thus, P3 ensures that an accepted call has a higher probability of finishing satisfactorily, as it presents a lower dropping probability, especially at high offered VoIP traffic loads. This is because the relative capacity used by P3 for an accepted call, Bi*(va, Ra), remains more or less constant in spite of rate changes. An interesting observation is that the voice quality index for P3 increases with the traffic load. The reason is simple: the calls dropped are those which change from a high to a low transmission rate, precisely the ones whose voice quality would have degraded due to the codec change. P4 is based on P3 but adapts further to the system state by reducing its own voice codec-rate below the one allowed by P3, if necessary. This results in a lower blocking probability and a higher number of active calls. However, since the number of accepted calls is higher, the dropping probability is also slightly higher than for P3. In terms of voice quality, P4 shows a degradation compared to P3, due to its use of lower VoIP codec-rates. In any case, the performance of P3 is closer to P4 than to P5 and P6, meaning that adapting only a single VoIP call cannot improve the overall system performance significantly.
352    B. Bellalta et al.

[Figure: four panels plotting the average number of active calls and the average bandwidth used (Mbps) for policies P1–P6; panels (a)–(b) against the traffic load (Erlangs), panels (c)–(d) against 1/γ.]
Fig. 5. Number of active calls and bandwidth used; Scenario 1 (a-b), Scenario 2 (c-d)
With 1/γ = 60 seconds, non-dropped calls suffer on average 4 transmission rate changes during their lifetime (notice that as the dropping probability increases, the average number of transmission rate changes per call falls for both P3 and P4, since some of the calls are evicted before exhausting their lifetime). P4 shows a lower number of codec changes, especially at high traffic loads. The reason is quite simple: before the call is started (at the admission control phase), a lower codec-rate than in the P3 case is selected. Hence, that call ends up using the same low codec for the whole call duration. P5 is based on P4 but is also able to reduce the VoIP codec-rates of the other active calls, starting with those which use the lowest transmission rates. For example, a new call request using R1 could be accepted if and only if it uses the lowest bandwidth codec v4 and another call commits to reduce its codec to a lower one. As expected, P5 shows the closest performance to P2, but provides a better speech quality under all load conditions. Finally, P6 tries to improve the low voice quality achieved with P5 by sharing among the remaining calls the available (free) bandwidth that is released when a call leaves the system or is dropped. Moreover, when a new call arrives at the system, the highest possible codec is allocated to it, independently of its transmission rate. At very low offered traffic, P6 even provides a better speech quality
Evaluation of Joint Admission Control and VoIP Codec Selection Policies    353

[Figure: four panels plotting the voice quality index (policies P1–P6) and the average number of codec changes per call (policies P3–P6); panels (a)–(b) against the traffic load (Erlangs), panels (c)–(d) against 1/γ.]
Fig. 6. Voice quality indicator and number of codec changes; Scenario 1 (a-b), Scenario 2 (c-d)
than P3, and always remains higher than P5 for any offered load. However, it is still considerably worse than all other policies except P2. As shown, P6 provides the best bandwidth usage, as it is able to allocate more than 95% of the bandwidth to active calls. P4, P5 and P6 require frequent codec-rate changes, which could prove impractical in real systems, especially due to the signalling overhead involved in re-negotiating the codecs for every call, which would need an intense and repeated control packet interchange (with its associated additional delay and jitter), as well as an additional processing burden for the involved parties. A possible solution would be to include the selection of the new codecs in-band with the data flow. RTP, for example, includes a field in its header for indicating the codec format transported [12]. Assuming a solution along this or similar lines, P6 provides a very good trade-off for all metrics across all load conditions. Notice that different rate values scale proportionally for all policies and do not change their relative ordering. Therefore, the selected policy will perform as expected, compared with the others, for any value of γ. A counterintuitive result is that, by reducing the transmission rate change frequency, policies P4, P5 and P6 reduce their voice
quality index. This is due to the fact that, for low 1/γ values, the VoIP codec-rate reduction is driven more by the bandwidth allocation for new requests than by the need to avoid dropping an already active call.

4.1 Is There a Right Policy?
No policy can be considered optimal for all cases simultaneously. Therefore, the policy selection has to be based on balancing several metrics depending on the scenario. From the set of policies, two of them seem particularly attractive. The first is P4, as it offers the best trade-off between all considered variables (number of calls, blocking and dropping probability, speech quality) and requires the lowest number of VoIP codec changes. Due to the robustness of this policy and its simplicity (very low signalling traffic), it could be applied to scenarios which require a fast response, such as a scenario with emergency calls. The second is P6, which always provides the best channel utilization and the maximum number of simultaneous calls (the same as P5 but with better speech quality) and, for low load conditions, the best speech quality compared with all other adaptive policies. However, a much heavier signalling load and processing power burden is necessary. Therefore, it could be used in lightly loaded scenarios, switching to P5 or P4 when the call arrival rate increases.
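The switching strategy suggested above can be sketched as a trivial selector. The arrival-rate thresholds and the function name are hypothetical illustrations, not values from the paper.

```python
# Hypothetical policy selector following the discussion above: P6 in
# lightly loaded conditions, P5 at moderate load, P4 under heavy load.
# The thresholds (calls per unit time) are invented for illustration.
def select_policy(arrival_rate, low=0.5, high=2.0):
    if arrival_rate < low:
        return "P6"   # best utilization and speech quality at low load
    if arrival_rate < high:
        return "P5"   # near-maximum call capacity, moderate signalling
    return "P4"       # robust, minimal signalling under heavy load
```

An operator would tune the two thresholds against the measured call arrival rate; the point of the sketch is only that the switch itself is a cheap, local decision.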
5 Conclusions
A proper selection of the VoIP codecs allows the system to react to the random variations in capacity caused by transmission rate changes, which are motivated by the wireless channel and/or user mobility. In this paper, a simple set of decision policies is presented, based on the ratio between the VoIP codec bandwidth requirement and the channel rate. Numerical results show that adaptive schemes provide a trade-off among the different expected characteristics of the set of codecs, such as the speech quality index, the average number of active calls and the achieved blocking and dropping probabilities. Merging all these metrics into a single indicator is a difficult task, since the impact of increasing or decreasing one of these variables on the overall user / operator perception is hard to quantify. This will be considered in further research.
References

1. Lacage, M., Manshaei, M.H., Turletti, T.: IEEE 802.11 Rate Adaptation: A Practical Approach. In: ACM International Symposium on Modeling, Analysis, and Simulation of Wireless and Mobile Systems (MSWiM), Venice, Italy, ACM Press, New York (2004)
2. IEEE Std 802.11. Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. ANSI/IEEE Std 802.11 (1999 edn.) (Revised 2003)
3. Heusse, M., Rousseau, F., Berger-Sabbatel, G., Duda, A.: Performance Anomaly of 802.11b. In: IEEE INFOCOM 2003, San Francisco, USA, IEEE Computer Society Press, Los Alamitos (2003)
4. Leng Ng, S., Hoh, S., Singh, D.: Effectiveness of Adaptive Codec Switching VoIP Application over Heterogeneous Networks. In: 2nd Int. Mobility Conference, Guangzhou, China (2005)
5. Toshihiko, T., Tadashi, I.: Wireless LAN Resource Management Mechanism Guaranteeing Minimum Available Bandwidth for Real-time Communication. In: IEEE WCNC 2005, New Orleans, USA, IEEE Computer Society Press, Los Alamitos (2005)
6. Sfairopoulou, A., Macian, C., Bellalta, B.: QoS adaptation in SIP-based VoIP calls in multi-rate 802.11 environments. In: ISWCS 2006, Valencia (2006)
7. Servetti, A., Martin, J.C.D.: Adaptive interactive speech transmission over 802.11 wireless LANs. In: Proc. IEEE Int. Workshop on DSP in Mobile and Vehicular Systems, Nagoya, Japan, April 2003, IEEE Computer Society Press, Los Alamitos (2003)
8. Matta, J., Pépin, C., Lashkari, K., Jain, R.: A Source and Channel Rate Adaptation Algorithm for AMR in VoIP Using the E-model. In: 13th NOSSDAV, Monterey, CA, USA (2003)
9. McGovern, P., Murphy, S., Murphy, L.: Addressing the Link Adaptation Problem for VoWLAN using Codec Adaptation. In: IEEE Globecom 2006 - Wireless Communications and Networking, San Francisco, CA, IEEE Computer Society Press, Los Alamitos (2006)
10. Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., Schooler, E.: RFC 3261: SIP: Session Initiation Protocol. Internet RFCs (2002)
11. Gilbert (Gang) Chen: Component Oriented Simulation Toolkit (2004), http://www.cs.rpi.edu/cheng3/
12. Schulzrinne, H., Casner, S., Frederick, R., Jacobson, V.: RFC 1889: RTP: A Transport Protocol for Real-Time Applications. Internet RFCs (1996)
A Novel Inter-LMD Handoff Mechanism for Network-Based Localized Mobility Management

Joong-Hee Lee, Jong-Hyouk Lee, and Tai-Myoung Chung

Internet Management Technology Laboratory, Electrical and Computer Engineering, Sungkyunkwan University, 300 Cheoncheon-dong, Jangan-gu, Suwon-si, Gyeonggi-do, 440-746, Korea
{jhlee00, jhlee, tmchung}@imtl.skku.ac.kr
Abstract. Network-based Localized Mobility Management (NetLMM) is an outstanding candidate solution for mobility management controlled by the network. In NetLMM, mobile nodes (MNs) can be provided with mobility services without installing any mobility-support stack. However, there is a restriction: the MN is able to have mobility only within a single localized mobility domain (LMD). In this paper, we propose a novel Inter-LMD handoff mechanism in order to eliminate this shortcoming of the current NetLMM protocol. The proposed Inter-LMD handoff mechanism enables the MN to hand off across LMDs even if the MN does not have the functionality of Mobile IPv6 (MIPv6). According to the performance evaluation, the proposed Inter-LMD handoff mechanism has approximately 5.8% more overhead than the current Inter-LMD handoff of MIPv6-capable devices, while the current NetLMM protocol does not support the handoff of MIPv6-incapable devices at all.
1 Introduction
MIPv6 is the basic mobility management protocol [1], and the IETF has been working on improving MIPv6. As a result of this effort, improved mobility management protocols such as Fast Mobile IPv6 (FMIPv6) and Hierarchical Mobile IPv6 (HMIPv6) have been introduced to the Internet community [2,3]. These protocols improve mobility management from the host's perspective. In host-based mobility management protocols, a mobile node (MN) must have a host software stack and play an important role in managing its mobility. Note that the MN is quite likely to be a hand-held device with low battery capacity and low computing power. Specialized and complex security transactions are also required between the MN and the network, because the MN has to act as a part of the mobility management [4]. Hence, the IETF has been interested in a solution for NetLMM in order to minimize the load of the mobility operation on the MN [5]. In a network-based mobility management protocol, the MN can be provided with continuity of access to the Internet without the functionality of
Corresponding author.
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 356–366, 2007. c Springer-Verlag Berlin Heidelberg 2007
MIPv6 or any other host-based mobility management protocol such as FMIPv6 and HMIPv6. This means that the MN is able to have mobility with only the capability of wireless access, such as the IEEE 802.11 series. A protocol called "A protocol for Network-based Localized Mobility Management (NetLMM)" is being developed as a candidate solution for the network-based mobility management protocol by the NetLMM working group [6,7]. The NetLMM protocol is based on the concept of Proxy Mobile IPv6 (PMIP) and employs a PMIP client [5], and it uses entities called Local Mobility Anchor (LMA), Mobile Access Gateway (MAG), and PMIP client to support mobility for a MN. There is no signaling additional to the basic MIPv6 specified in [1]. Whenever the MN changes the MAG to which it is attached, the PMIP client simply generates a binding update message (BU) and sends it to the LMA, which acts as the home agent (HA) of the MN. This message generated by the PMIP client is called a Proxy Binding Update message (PBU). As a response to the PBU, the LMA sends a Proxy Binding Acknowledgment message (PBAck) back to the PMIP client, and then the PMIP client sends the PBAck back to the MAG. With this operation, briefly explained above, the MN is able to maintain access to the Internet regardless of movements within a single LMD. The MN does not need any kind of mobility management protocol, and only needs the ability for wireless access. The handoff within a single LMD is called Intra-LMD handoff. However, the MN must have the functionality of a Mobile IP client in its IPv6 stack if it wants to hand off to another LMD, which is called Inter-LMD handoff [6]. We argue that this is contrary to the goal of NetLMM, because it forces the MN to be a part of the operation for mobility management. In this paper, we propose a mechanism which supports Inter-LMD handoff without any additional requirements for the MN.
The MN does not need to have the functionality of MIPv6 or any other mobility management protocol. The entities in the network, such as the LMA, MAG, and PMIP client, exchange messages which have already been introduced in [1], [5], and [6]. With the proposed mechanism, the MN is able to have global mobility while still retaining all of the benefits of Network-based Localized Mobility Management. The rest of this paper is organized as follows. In Section 2, we introduce the details of NetLMM, Intra-LMD handoff, and Inter-LMD handoff. The details of the proposed mechanism are then explained in Section 3. In Section 4, we evaluate the proposed mechanism and discuss its benefits. In Section 5, we conclude the paper.
2 Network-Based Localized Mobility Management

2.1 Protocol Overview
The NetLMM protocol is the result of an effort toward localized mobility management controlled by the network. The NetLMM protocol is designed to sufficiently support mobility for MNs within an administrative network. In the NetLMM protocol, there are several entities that provide mobility for a MN. A
LMA acts as the standard MIPv6 Home Agent (HA) described in [1]. A MAG is the current attachment point of the MN. The LMA sends the MAG data traffic toward the MN using the address called the Proxy Care-of Address (pCoA). The pCoA is the address of the MAG to which the MN is currently attached. If the MN hands off to another MAG, the designated PMIP client sends the LMA a PBU containing the new pCoA of the MN. By receiving the PBU, the LMA always knows the current location of the MN, so that it is able to tunnel the data traffic to the appropriate MAG. The MN has only one address in a single domain, even if it hands off between several MAGs. In a single domain, the MN does not have to send a BU or change its address. That is the reason why the MN does not need the functionality of MIPv6.
Fig. 1. Message flow of MN’s initial attachment to LMA
The procedure of the initial attachment of a MN to a LMD is represented in Fig. 1. As can be seen in Fig. 1, a Proxy Home Address (pHoA) is assigned to the MN after it attaches to MAG1 at layer 2 and receives a router advertisement message from MAG1. MAG1 sends the PMIP client a trigger containing the MN ID and the pHoA of the MN. With this information, the PMIP client sends the PBU to the LMA. After the PBU message is confirmed by the LMA, the LMA returns the PBAck and begins to act as a standard HA for the MN. With the LMA and the pHoA, the MN is ready to communicate and connect to the Internet. Communication sessions are established through the pHoA of the MN after this initial attachment to the LMA.
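The attachment signaling of Fig. 1 can be mimicked with a toy model. The class names, fields, and message tuples below are invented for illustration and do not follow the actual NetLMM message encoding.

```python
# Toy model of the NetLMM initial attachment (Fig. 1): the MAG raises a
# trigger, the PMIP client sends a PBU to the LMA, the LMA answers with
# a PBAck and installs a binding. All names are illustrative only.

class LMA:
    def __init__(self):
        self.binding_cache = {}        # pHoA -> pCoA (address of serving MAG)

    def proxy_binding_update(self, phoa, pcoa):
        # Install (or refresh) the proxy binding and acknowledge it.
        self.binding_cache[phoa] = pcoa
        return ("PBAck", phoa)

class PMIPClient:
    def __init__(self, lma):
        self.lma = lma

    def on_trigger(self, mn_id, phoa, mag_addr):
        # On a trigger from a MAG, register the MN's pHoA with the LMA,
        # using the MAG's address as the proxy care-of address (pCoA).
        return self.lma.proxy_binding_update(phoa, mag_addr)

lma = LMA()
client = PMIPClient(lma)
ack = client.on_trigger(mn_id="MN1", phoa="pHoA::1", mag_addr="MAG1")
# After the PBAck, the LMA tunnels traffic addressed to pHoA::1 to MAG1.
```

The MN itself appears nowhere in this exchange, which is exactly the point of NetLMM: all mobility signaling is confined to the network entities.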
If the MN is a MIPv6-capable device, and if the MN has already established sessions with some correspondent nodes via the HoA using standard MIPv6 before the initial attachment to the LMA, the MN must send a BU containing the pHoA as the Care-of Address (CoA) to the GMA. The BU procedure toward the GMA is the same procedure described in "Mobility Support in IPv6" [1], because the GMA is the entity that acts as the HA providing global mobility for the MN. The MN must be a device having the functionality of MIPv6 to perform the binding update to the GMA and acquire global mobility. Note that the LMA is able to provide only local mobility to the MN, which means that if the MN moves to a MAG controlled by another LMA, the MN will be configured with a new pHoA and lose the sessions established via the old pHoA.

2.2 Intra-LMD Handoff
Fig. 2 represents the procedure for Intra-LMD handoff. With this procedure, the MN is able to hand off between MAGs without losing its connection to the Internet within a single domain. After the MN changes its layer 2 attachment to MAG2, which is another MAG in the same LMD, MAG1 transfers the context of the MN and data packets to MAG2 using the Context Transfer Protocol (CXTP) [8]. Then, MAG2 sends the trigger to the PMIP client. The PMIP client performs the proxy binding update with the LMA using the PBU and the PBAck. With this procedure, the LMA is able to intercept the data packets toward the MN and forward them to the appropriate MAG, which is the current attachment point of the MN.
Fig. 2. Message flow of Intra-LMD handoff
2.3 Inter-LMD Handoff with the Current Protocol
If the MN wants to hand off across LMDs with the current protocol, the MN must have the functionality of MIPv6 or another mobility management protocol.
Because MAGs located in different LMDs advertise different prefixes, which represent their own LMDs, the MN has to configure a new address with a new prefix when it changes its attachment to a MAG collocated with another LMA. The session cannot be maintained with the old pHoA in this scenario, so the standard BU message has to be sent to the GMA by the MN to maintain the sessions via the old pHoA. Therefore, the MN must be a Mobile IPv6-capable device, even though this is contrary to the goals of NetLMM. The procedure for Inter-LMD handoff is exactly the same as the initial attachment to the LMA depicted in Fig. 1.
3 Novel Inter-LMD Handoff Mechanism
In this section, we propose a novel Inter-LMD handoff mechanism. With the proposed mechanism, a MN does not have to be a MIPv6-capable device, and it has mobility without any restriction. Let us consider a simple topology to simplify the explanation. The topology is depicted in Fig. 3.
Fig. 3. Simple topology for NetLMM
The MN is a device communicating in LMD1. Before the communication is over, the MN changes its attachment to MAG3. MAG3 does not advertise the prefix of LMA1. Hence, the MN configures a New pHoA (NpHoA) after the L2 attachment to MAG3. After the MN is configured with the NpHoA, under the current protocol there is no way to maintain the sessions via the Old pHoA (OpHoA) obtained in LMD1 when the MN is not a MIPv6-capable device. However, the basic purpose of the NetLMM protocol is to provide mobility for MNs that are incapable of any other mobility management protocol such as MIPv6. If the MIPv6-incapable MN is able to hand off to another LMD, the MN can have global mobility. Therefore, the novel Inter-LMD handoff is mandatory for the NetLMM protocol as a mobility management solution.
Fig. 4. Message flow of Inter-LMD handoff
To keep the sessions maintained even if the MN hands off to another LMD, we develop the mechanism represented in Fig. 4. When the MN hands off to MAG3 at layer 2, MAG2 sends the information of the MN to MAG3 using CXTP [8]. The information of the MN consists of its MN ID, the OpHoA, and the address of LMA1. After the end of the CXTP exchange, MAG3 sends the trigger message to PMIP client2. The information contained in the trigger has a subtle difference from the original trigger of the NetLMM protocol: the trigger used in the Inter-LMD handoff contains the OpHoA as well as the MN ID and the NpHoA. The OpHoA is used to send PBU1 to LMA1, and the NpHoA is used to send PBU2 to LMA2. PMIP client2 can recognize that the trigger indicates an Inter-LMD handoff, so PMIP client2 sends PBU1, containing the (OpHoA, NpHoA) pair, to LMA1, and PBU2, containing the (NpHoA, pCoA) pair, to LMA2. After receiving the PBAcks from LMA1 and LMA2 respectively, the data traffic toward the OpHoA is delivered via LMA1. The PBU and PBAck basically have to be protected by IP security (IPsec) [1,9]. However, it is likely that the Security Association (SA) between PMIP client2 and LMA1 has not been established. Before sending PBU1, the SA between PMIP client2 and LMA1 should be dynamically established using a key exchange protocol such as IKEv2 [10]. Then, the packets that arrive at LMA1 are forwarded to LMA2, because LMA1 has received PBU1 from PMIP client2, so that LMA1 recognizes the NpHoA as the pCoA of the MN. The packets delivered to LMA2 are forwarded to MAG3, because LMA2 has received PBU2 from
PMIP client2. With this procedure, the MN is able to maintain the connectivity of the sessions established with both the OpHoA and the NpHoA, and the handoff is transparent to the transport layer. The security considerations for CXTP, such as key establishment, can be addressed based on public keys or an AAA protocol [8], but the details of the security considerations in CXTP are out of the scope of this paper.
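The dual proxy binding at the heart of the proposal can likewise be sketched as a toy model. All object and field names below are illustrative assumptions, not the protocol's wire format.

```python
# Sketch of the proposed Inter-LMD handoff (Fig. 4): on a handoff trigger
# carrying both OpHoA and NpHoA, PMIP client2 sends PBU1 (OpHoA -> NpHoA)
# to LMA1 and PBU2 (NpHoA -> pCoA) to LMA2. Names are illustrative only.

class LMA:
    def __init__(self, name):
        self.name = name
        self.bindings = {}             # address -> forwarding target

    def pbu(self, addr, target):
        # Install the proxy binding and acknowledge it.
        self.bindings[addr] = target
        return ("PBAck", self.name, addr)

def inter_lmd_trigger(lma1, lma2, mn_ctx):
    """mn_ctx carries the (OpHoA, LMA1 address) pair transferred via CXTP
    plus the newly configured NpHoA and the new MAG's address (pCoA)."""
    ack1 = lma1.pbu(mn_ctx["OpHoA"], mn_ctx["NpHoA"])   # LMA1 forwards to LMA2
    ack2 = lma2.pbu(mn_ctx["NpHoA"], mn_ctx["pCoA"])    # LMA2 forwards to MAG3
    return ack1, ack2

lma1, lma2 = LMA("LMA1"), LMA("LMA2")
ctx = {"OpHoA": "OpHoA::1", "NpHoA": "NpHoA::1", "pCoA": "MAG3"}
inter_lmd_trigger(lma1, lma2, ctx)
# Packets addressed to OpHoA now travel LMA1 -> LMA2 -> MAG3 -> MN.
```

Chaining the two bindings is what makes the handoff transparent: the correspondent keeps sending to the OpHoA, and only the network-side forwarding state changes.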
4 Performance Evaluation and Discussion
In this section, we evaluate the performance of the proposed Inter-LMD handoff mechanism explained in Section 3. Before we do so, the fact that the current NetLMM protocol cannot provide mobility for MIPv6-incapable devices across LMDs should not be overlooked. Note that we assume the MN has no functionality of Mobile IPv6. We evaluate the performance of the proposed Inter-LMD handoff mechanism based on the concept of signaling cost introduced in [11], and then we discuss the result of the performance evaluation.

4.1 Performance of the Handoff Signaling
– CL2. The total cost to change the L2 attachment.
– CCXTP. The total cost to exchange the MN context using CXTP.
– Cacq. The total cost to acquire an IP address after the L2 attachment.
– Tmp. The transmission cost of the handoff signaling between the MAG and the PMIP client.
– Tpl. The transmission cost of the handoff signaling between the PMIP client and the LMA.
– Tpl1. The transmission cost of the handoff signaling between the PMIP client and the LMA1.
Fig. 5. Signaling of the current Inter-LMD handoff
Fig. 6. Signaling of the proposed Inter-LMD handoff
– Tpl2. The transmission cost of the handoff signaling between the PMIP client and the LMA2.
– Tmg. The transmission cost of the handoff signaling between the MN and the GMA.
– PMN. The processing cost of the handoff signaling at the MN.
– PGMA. The processing cost of the handoff signaling at the GMA.
– PMAG. The processing cost of the handoff signaling at the MAG.
– PPMIP. The processing cost of the handoff signaling at the PMIP client.
– PLMA. The processing cost of the handoff signaling at the LMA.

According to the message flows illustrated in Fig. 5 and Fig. 6, the signaling costs of the current Inter-LMD handoff and the proposed Inter-LMD handoff can be calculated as follows. CCh and CPh denote the cost of the current Inter-LMD handoff and the proposed Inter-LMD handoff, respectively.

CCh = CL2 + Cacq + 2Tmp + 2Tpl + 2Tmg + 2PPMIP + PLMA + PMAG + PMN + PGMA    (1)

CPh = CL2 + Cacq + CCXTP + 2Tmp + max(2Tpl1, 2Tpl2) + 3PPMIP + 2PLMA + PMAG    (2)
The transmission cost can be assumed to be proportional to the distance between the source and the destination, with proportionality constant δU [11]. Hence, Tmp, Tpl, Tpl1, and Tpl2 can be expressed as lmp·δU, lpl·δU, lpl1·δU, and lpl2·δU respectively, where lmp, lpl, lpl1, and lpl2 are the average distances between the MAG and the PMIP client, between the PMIP client and the LMA,
between the PMIP client and the LMA1, and between the PMIP client and the LMA2, respectively. We also assume that the transmission cost over a wireless link is ρ times higher than the cost over a unit of wired link, since the cost over a wireless link is in general higher than the cost over a wired link. Since Tmg is the cost of the BU and BA for the MN, Tmg consists of a unit of wireless link, from the MN to the MAG, and a wired link, from the MAG to the GMA. Hence, Tmg can be expressed as ρδU + lmg·δU, where lmg is the average distance between the MAG and the GMA. Therefore, we can rewrite Eq. (1) and Eq. (2) as:
CCh = CL2 + Cacq + 2(lmp + lpl + lmg + ρ)δU + 2PPMIP + PLMA + PMAG + PMN + PGMA    (3)

CPh = CL2 + Cacq + CCXTP + 2(lmp + max(lpl1, lpl2))δU + 3PPMIP + 2PLMA + PMAG    (4)

4.2 Performance of the Packet Delivery
In the proposed Inter-LMD handoff mechanism, a GMA is not necessary, whereas the GMA is an indispensable entity in the current Inter-LMD handoff. Because of this difference, we assume that the LMA to which the MN first attached acts as the GMA in the current protocol, in order to simplify the evaluation.
Fig. 7. The Packet Delivery of Current protocol
Fig. 8. The Packet Delivery of Proposed protocol
As can be seen in Fig. 7 and Fig. 8, the packet delivery paths of the current protocol and of the proposed protocol are exactly the same if the GMA in Fig. 7 and the LMA1 in Fig. 8 are the same entity. The LMA1 is the LMA of the previous LMD, from which the MN hands off. Therefore, we do not evaluate the cost of the packet delivery.

4.3 Numerical Results and Discussion
We compare the costs of the handoff signaling based on Eq. (3) and Eq. (4) in this subsection. For simplicity, we assume that all processing costs in Eq. (3) and Eq. (4) are equal (i.e., P = PPMIP =
Table 1. The parameter values for performance analysis

Parameter:  L    P    δU    ρ    CL2   Cacq   CCXTP
Value:      15   10   0.1   10   5     10     12
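Eqs. (3) and (4) can be evaluated directly with the values of Table 1. The sketch below computes the two per-handoff signaling costs; any scaling by the movement rate across LMDs, as plotted in Fig. 9, is deliberately left out, so these are raw per-handoff figures under the stated simplifying assumptions.

```python
# Evaluate Eq. (3) and Eq. (4) with the parameter values of Table 1.
L, P, delta_U, rho = 15, 10, 0.1, 10
C_L2, C_acq, C_CXTP = 5, 10, 12

# All average distances are assumed equal to L, all processing costs to P,
# as in the simplifications stated above.
l_mp = l_pl = l_pl1 = l_pl2 = l_mg = L

# Eq. (3): current Inter-LMD handoff (MIPv6-capable MN, BU to the GMA).
C_Ch = (C_L2 + C_acq
        + 2 * (l_mp + l_pl + l_mg + rho) * delta_U
        + 6 * P)   # 2*P_PMIP + P_LMA + P_MAG + P_MN + P_GMA

# Eq. (4): proposed Inter-LMD handoff (context transfer, two PBUs).
C_Ph = (C_L2 + C_acq + C_CXTP
        + 2 * (l_mp + max(l_pl1, l_pl2)) * delta_U
        + 6 * P)   # 3*P_PMIP + 2*P_LMA + P_MAG

print(C_Ch, C_Ph)  # per-handoff signaling costs of the two mechanisms
```

The proposed handoff comes out slightly costlier per handoff, consistent with the trend reported in Fig. 9: the extra CXTP cost is partly offset by the proposed mechanism avoiding the round trip to the (possibly distant) GMA.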
[Figure: plot titled "Performance Comparison" of the signaling cost (y-axis, 0–100) versus the movement rate across LMDs (x-axis) for the current handoff and the proposed handoff.]
Fig. 9. The comparative result for the evaluation
PLMA = PMAG = PMN = PGMA), and that the average distances are also the same (i.e., L = lpl = lpl1 = lpl2 = lmp = lmg). For the analysis of the performance, we also assume the parameter values in Table 1. The result of the comparison of the two handoff mechanisms is represented in Fig. 9. The cost of the proposed Inter-LMD handoff mechanism is approximately 5.8% higher than that of the current mechanism. The performance difference comes from the cost of CXTP in the proposed Inter-LMD handoff mechanism. However, we assumed that the average distance between the MAG and the GMA is the same as the average distance between the other entities, even though the GMA is not necessary in the proposed Inter-LMD mechanism. It is quite likely that the GMA is a much more distant entity than the others, because every entity except the GMA is in the same or a nearby administrative domain. We should also remember the assumptions of the mechanisms: in the proposed mechanism, the MN does not need the functionality of MIPv6, whereas the MN in the current mechanism cannot hand off across LMDs without the standard MIPv6 functionality, which can be a burden to the MN as explained in [4].
5 Conclusion
In the current NetLMM protocol, it is impossible to support mobility for a MN which wants to hand off across LMDs. Hence, the MN must have the functionality of MIPv6 in order to hand off across LMDs, even if the MN adopts the NetLMM protocol for its mobility management. Thus, we propose a novel Inter-LMD handoff mechanism. In the proposed mechanism, the MN does not have to be a MIPv6-capable device to be provided with global mobility, which means the MN is able to hand off across LMDs. As a result of the performance evaluation, the cost of the proposed Inter-LMD handoff signaling is similar to that of the current mechanism, even though the MN does not need the functionality of MIPv6. As future work, we will try to decrease the cost of the handoff signaling and develop a scalable key distribution mechanism for MAGs in NetLMM.
Acknowledgement This research has been supported by a grant of the Korea Health 21 R&D Project, Ministry of Health & Welfare, Republic of Korea (02-PJ3-PG6-EV080001).
References

1. Johnson, D., Perkins, C., Arkko, J.: Mobility Support in IPv6. RFC 3775 (June 2004)
2. Koodli, R. (ed.): Fast Handovers for Mobile IPv6. RFC 4068 (July 2005)
3. Soliman, H., Castelluccia, C., El Malki, K., Bellier, L.: Hierarchical Mobile IPv6 Mobility Management (HMIPv6). RFC 4140 (August 2005)
4. Kempf, J. (ed.): Goals for Network-based Localized Mobility Management (NETLMM). draft-ietf-netlmm-nohost-req-05 (October 2006)
5. Gundavelli, S., Leung, K., Devarapalli, V., Chowdhury, K., Patil, B.: Proxy Mobile IPv6. draft-sgundave-mip6-proxymip6-02 (March 2007)
6. Bedekar, A., Singh, A., Kumar, V., Kalyanasundaram, S.: A Protocol for Network-based Localized Mobility Management. draft-singh-netlmm-protocol-02 (March 2007)
7. NetLMM WG web site (accessed March 2007), http://www.ietf.org/html.charters/netlmm-charter.html
8. Loughney, J., Nakhjiri, M., Perkins, C., Koodli, R. (eds.): Context Transfer Protocol (CXTP). RFC 4067 (July 2005)
9. Arkko, J., Devarapalli, V., Dupont, F.: Using IPsec to Protect Mobile IPv6 Signaling Between Mobile Nodes and Home Agents. RFC 3776 (June 2004)
10. Kaufman, C. (ed.): Internet Key Exchange (IKEv2) Protocol. RFC 4306 (December 2005)
11. Xie, J., Akyildiz, I.F.: A Novel Distributed Dynamic Location Management Scheme for Minimizing Signaling Costs in Mobile IP. IEEE Transactions on Mobile Computing 1(3) (2002)
Improvement of Link Cache Performance in Dynamic Source Routing (DSR) Protocol by Using Active Packets

Dimitri Marandin

Technische Universität Dresden, Chair for Telecommunications, Georg-Schumann-Str. 9, 01062 Dresden, Germany
[email protected]
Abstract. Dynamic Source Routing (DSR) is an efficient on-demand routing protocol for ad hoc networks, in which only needed routes are found and maintained. The route discovery/setup phase becomes the dominant factor for applications with short-lived, small-transfer traffic (a single packet or a short stream of packets per transaction) between the source and the destination: resource discovery, text messaging, object storage/retrieval, queries and short transactions. Route caching is helpful to avoid the need for discovering a route, or to shorten the route discovery delay, before each data packet is sent. The goal of this work is to develop a caching strategy that permits nodes to update their caches quickly in order to minimize the end-to-end delay for short-lived traffic. The proposed approach consists of an Active Packet that travels through the nodes of the network twice. During the first pass, it visits the nodes, collecting fresh network topology information. When the first pass is finished, a second one is started to validate and update the caches of the nodes with this newly obtained information. This mechanism removes invalid cached links and caches valid links based on the collected topology information. The correct information in the caches makes it possible to speed up Route Discovery or even to avoid it altogether. Keywords: mobile ad hoc networks, Dynamic Source Routing (DSR), link cache.
1 Introduction

Mobile ad hoc networks are an active research topic in wireless communications. This technology makes it possible for network nodes to communicate with each other using wireless transceivers (perhaps along multihop paths) without the need for a fixed infrastructure or centralized administration [1]. This is a unique characteristic of ad hoc networks compared with more conventional wireless networks, such as cellular networks and WLANs, in which nodes (such as mobile phone users) communicate with each other through base stations. Since the nodes in ad hoc networks use wireless technology, the topology of the network created by the nodes can change when the nodes move. A routing protocol that can manage these topology changes to make the network reconfigurable is necessary. Routing in ad hoc networks has been a dynamically growing research area in recent years. Many routing protocols for multihop ad hoc networks have been developed,

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 367–378, 2007. © Springer-Verlag Berlin Heidelberg 2007
368
D. Marandin
beginning with straightforward modifications of Internet protocols and extending to complicated multilevel hierarchical proposals. The design of routing protocols is one of the most significant challenges in ad hoc networks and is critical for the basic network operations. In [1] it is shown that on-demand routing protocols perform better than table-driven ones in mobile ad hoc networks. In an on-demand mechanism, a node attempts to find a route only when it has data packets to send to a destination. To avoid the cost of finding a route for each data packet, nodes maintain discovered routes in a cache. Because of the potential mobility of nodes in an ad hoc network, cached routes can become stale. Thus, a good caching strategy is needed to keep node caches up to date. In this paper we focus on the Dynamic Source Routing protocol (DSR) [2], an on-demand routing protocol. We investigate and develop a caching strategy for it that permits nodes to adapt their caches efficiently to network changes. This paper is structured as follows. Section 2 gives an overview of the DSR protocol. Section 3 describes the cache structure used and the problem statement. Section 4 discusses related work. Section 5 describes our approach to improving DSR with the use of Active Packets. Section 6 presents an evaluation of our approach, and, finally, Section 7 presents our conclusions.
2 DSR: Dynamic Source Routing The Dynamic Source Routing protocol (DSR) [2] is a simple but effective on-demand protocol used in ad hoc networks. DSR has two basic mechanisms [2]: Route Discovery and Route Maintenance. Route Discovery is the mechanism used at the source of the packets to find a route to the destination. When a source generates a data packet for a destination, it floods the network to find a route to that destination: it broadcasts a ROUTE REQUEST (RREQ) packet. When this RREQ is received at an intermediate node, the node adds its address to the source route contained in the RREQ packet and re-broadcasts it. When the RREQ is received by the destination, the destination adds its address to the source route and unicasts a ROUTE REPLY (RREP) to the originator of the RREQ. To reach the source of the packet, the destination reverses the route contained in the RREQ packet. Each intermediate node on the source route forwards this RREP. When the source receives the RREP, it can extract the source route from the packet and send data packets (DP) to the destination using this route. Route Maintenance is the mechanism that detects link failures and repairs them. Each node on the source route has to detect whether the packet has been received by the next hop. When the Route Maintenance mechanism detects that a link is broken, a ROUTE ERROR (RERR) packet is sent to the source of the packet. The source then has to use another route to send packets to the destination, or start Route Discovery again.
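As a toy illustration (the function name and example topology are ours, not from the DSR specification), Route Discovery can be sketched as a breadth-first flood in which each forwarded RREQ carries the source route accumulated so far, and the first copy to reach the destination yields the route returned in the RREP:

```python
from collections import deque

def route_discovery(topology, source, destination):
    """Simulate DSR Route Discovery: flood RREQs over the adjacency
    map `topology` and return the source route carried back by the RREP."""
    # Each queue entry is the source route accumulated in a RREQ copy.
    queue = deque([[source]])
    seen = {source}  # nodes drop duplicate RREQs they have already re-broadcast
    while queue:
        route = queue.popleft()
        node = route[-1]
        if node == destination:
            # The destination reverses this route to unicast the RREP back.
            return route
        for neighbour in topology[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(route + [neighbour])
    return None  # destination unreachable

# Example topology: an A-B-F-G-H chain plus a side branch at C.
topology = {'A': ['B'], 'B': ['A', 'C', 'F'], 'C': ['B'],
            'F': ['B', 'G'], 'G': ['F', 'H'], 'H': ['G']}
print(route_discovery(topology, 'A', 'H'))  # ['A', 'B', 'F', 'G', 'H']
```

The `seen` set models the duplicate suppression every DSR node performs when it re-broadcasts a RREQ only once.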
3 Caching Any on-demand routing protocol must maintain some type of route cache in order to avoid the need to discover a route every time before sending each data packet. A
Improvement of Link Cache Performance in Dynamic Source Routing (DSR) Protocol
369
route discovery is an expensive operation, due to the flooding of the network, and it causes delay before the first data packet can be sent. After the source discovers a route, it has to store the route in some cache for transmitting the following packets. Thus, caching is an essential component of on-demand routing protocols for wireless ad hoc networks. DSR uses the route cache even more often, using it not only to cache routes for the purpose of originating packets, but also to allow nodes to answer Route Requests targeted at other nodes [10]. Route caching is the main approach to reducing the flooding overhead by avoiding route discovery as much as possible, so that non-optimal but available routes are preferred to the effort of finding the current optimal route. The use of a cache introduces the problem of managing it properly: a good caching strategy that updates the caches of nodes to the new topology is needed. For the development of a caching strategy in on-demand routing protocols, the cache structure used is very important. Two types of cache structure can be used: i) path cache, in which a node caches the complete path (a sequence of links); ii) link cache, in which a node caches each link separately, adding it to a graph of links. Fig. 1 shows the differences between the two structures. In the case of the path cache structure (Fig. 1(a)), when node A adds a new route A-B-F-G-H to its cache, it has to add the whole path as an independent entry in its cache. In the case of the link cache structure (Fig. 1(b)), when node A adds the new route A-B-F-G-H to its cache, it only has to add the link B-F, since the other links already exist in its cache. It has been shown that a link cache outperforms a path cache [3], because a link cache can make better use of the cached network state information: it deletes only a broken link when that link causes a path to break, instead of deleting the whole path as a path cache does.
For all these reasons, we use a link cache in our approach. For a link cache, a reasonable choice is to permit the cache to store all links that are learnt, because there is a predetermined maximum of N² links in a network of N nodes.
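The link-cache behaviour described above can be illustrated with a minimal sketch (the class and method names are ours). Adding a route inserts only the links not yet in the graph, and a broken link is removed individually without discarding whole paths:

```python
class LinkCache:
    """Minimal link cache: each bidirectional link is stored once,
    so adding a route only inserts the links not yet known."""
    def __init__(self):
        self.links = set()

    def add_route(self, route):
        added = []
        for a, b in zip(route, route[1:]):
            link = frozenset((a, b))
            if link not in self.links:
                self.links.add(link)
                added.append((a, b))
        return added  # only the genuinely new links

    def remove_link(self, a, b):
        # A broken link is deleted individually; every cached path
        # that does not use it remains usable.
        self.links.discard(frozenset((a, b)))

cache = LinkCache()
cache.add_route(['A', 'B', 'C', 'D'])
cache.add_route(['A', 'B', 'E', 'F', 'G', 'H'])
print(cache.add_route(['A', 'B', 'F', 'G', 'H']))  # only ('B', 'F') is new
```

This reproduces the Fig. 1(b) example: adding route A-B-F-G-H contributes only the single new link B-F.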
Fig. 1. Path and Link cache structure for node A
Each node maintains its own cache. A node learns routes when it forwards a data packet or a route reply. To keep the protocol independent of the MAC protocol used, all cached links have to be bidirectional. So, when a route reply message is forwarded by a node, its cache stores the links between the originator of the route reply and that node. They are cached because both directions have been tested: from the current node to the destination by the route discovery, and from the destination to the current node by the route reply. The links from the originator of a
route request to the node forwarding a route reply might be unidirectional, so they are not cached. These links of the path will not be cached until a data packet is sent.
4 Related Work Simulation studies in [1],[11],[12],[13],[14] have demonstrated the efficiency of route caching in on-demand routing protocols. However, [15] has shown that the high delay and low throughput of DSR are mostly due to the aggressive use of caching and the lack of any technique to remove stale routes or to decide about the freshness of routes when many routes are available. Replies from caches reduce the delay of route discovery by preventing the RREQ storm from reaching every node of the network. Cached replies stop the flooding early, reducing the overhead. However, without an effective caching strategy, the information maintained in caches may be stale, and replies from node caches may then contain invalid links. The undesirable effects of invalid routes, when they are used by a source node to send data packets, can be summarized as follows:
• Packet losses, increased packet delivery latency and routing overhead. These problems can be substantial. When mobility, traffic load, or network size increases, more routes become stale and negatively affect more traffic flows. When replying to route requests from node caches is allowed, stale routes are rapidly propagated to other nodes, worsening the situation. The route discovery overhead increases because the source node has to initiate more route discovery attempts.
• Degradation of TCP performance. Stale routes severely degrade TCP performance [3]. Since TCP cannot distinguish packet losses due to route failures from those due to congestion, it incorrectly starts its congestion control mechanisms, reducing throughput.
• Increased energy consumption at source nodes and intermediate nodes. If stale routing information is not removed quickly from the cache, TCP retransmits lost packets again over invalid routes.
• More time is required to find a new route to the destination node.
The main cause of stale routes is mobility.
Another cause of the stale cache problem in DSR [4] is incomplete error notification. When a link failure is detected, a node returns a RERR to the source of the data packet that could not be delivered. As a result, only the nodes belonging to the source route of this packet remove the invalid links from their caches. With the optimization "Gratuitous ROUTE ERRORS" [5], a RREQ packet piggybacks the RERR information, but because of replies from caches, the RREQ flooding does not reach every node of the network. On the other hand, nodes must maintain some type of route cache, because the route discovery/setup phase becomes the dominant factor for applications with short-lived small-transfer traffic (a single packet or a short stream of packets per transaction) between the source and the destination: resource discovery, text messaging, object storage/retrieval, queries and short transactions. Thus, caching is an essential component of on-demand routing protocols for wireless ad hoc networks.
Hu and Johnson [14] suggested several adaptive link timeout mechanisms. For a link cache, assuming the cache capacity is limited, as is typically the case, the setting of the cache timeout is critical, as it can have a significant effect on the performance of the routing protocol in terms of packet delivery ratio, routing overhead, etc. But only a few studies have been done on how to adjust it. In [16], a static caching scheme was considered in which a fixed cache timeout is assigned to all links. After a link has stayed in the cache for this period of time, it is deleted. The disadvantage of this approach is that it cannot adapt to network changes. If the timeout is relatively short compared to the actual link lifetime, the link cache works poorly because of the large number of route requests and the unnecessary overhead. Alternatively, if the timeout lasts too long, the link cache works badly as well, since the number of route errors may grow due to failed links remaining in the cache. For these reasons, assigning a timeout close to a link's lifetime can improve performance. As the real lifetime of a link strongly depends on network parameters, for example node mobility and node density, adaptive caching schemes are required to obtain good performance. But if the cache timeout in an adaptive scheme does not work correctly in some scenarios, its performance can be even poorer than that of static caching. [6] proposed an active network method to solve the problem of stale caches. In this approach, an active packet roams around the network and gathers information about the network topology. Nodes check the payload of this active packet once it is received and update their route caches. Therefore, the cache miss rate is smaller and the route discovery flooding is reduced. With active packets, not only are the routes in use updated, but possible routes for future use are also added to the cache.
Therefore, route request flooding both for new routes and due to stale routes is mostly avoided. But in [6] the update phase is similar to the phase of gathering information and takes the same amount of time. In contrast, our approach employs a quick update mechanism that updates the caches by broadcast; additionally, some information at the nodes is already updated during the phase of collecting information. In [6] the active packet is generated periodically by a randomly chosen node, but how this random node is selected, and how it is guaranteed that only one node generates an active packet in the specified period of time, is not described. As a consequence, multiple active packets can be created simultaneously, which increases overhead drastically without any benefit and decreases performance. In our approach, the last node in the phase of collecting information is responsible for generating the next active packet. In this way, we avoid having several active packets in the same network segment. We also complement the approach with mechanisms for active packet regeneration when network segmentation occurs and the active packet cannot reach part of the network.
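The static scheme of [16] discussed above can be made concrete with a small sketch (the function and parameter names are ours); it shows why one fixed timeout cannot track link lifetimes:

```python
def purge_expired(link_cache, now, timeout):
    """Static caching scheme as in [16]: every link is assigned the same
    fixed timeout and is deleted once it has stayed in the cache that long."""
    return {link: added for link, added in link_cache.items()
            if now - added < timeout}

# link -> time at which the link was cached
cache = {('A', 'B'): 0.0, ('B', 'F'): 8.0}
print(purge_expired(cache, now=10.0, timeout=5.0))  # {('B', 'F'): 8.0}
```

With `timeout=5.0`, the still-valid link A-B is evicted simply because it is old, while a link that broke one second after being cached would survive for its full timeout: both failure modes described above.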
5 Active Packets Approach 5.1 Overview The suggested improvement is based on the introduction of a new packet into DSR. This network layer packet will henceforth be referred to as an Active Packet (AP). It visits each node of the network twice. Fig. 2 depicts a rough outline of the general functioning of the mechanism that guides the AP through the mobile network.
The first time the packet arrives at a node, topology data is collected by the packet. If it is the first visit, the left branch of the diagram is followed and the neighbour discovery procedure is started: the node must send messages to find out who its neighbours are at this precise moment. Once this is known, the node fills a connection matrix, stored in the AP in compressed form, with this data and forwards the AP to a randomly chosen unvisited neighbour. The unvisited neighbours can be identified by means of the connection matrix. From a logical point of view, a connection matrix is an array of lists; each list is associated with a visited node and contains the addresses of that node's neighbours. After the neighbour discovery process, a visited node adds its address, with the list of its neighbouring nodes, to the connection matrix. When transmitted, the connection matrix is efficiently encoded. If there are no more unvisited neighbouring nodes, the 2nd AP visit starts: the field "Enable 2nd visit" is set to "1" and the AP is broadcast to the nodes visited during the 1st AP visit. When a node receives the AP in its 2nd visit, the connection matrix in the AP is used to update and validate the link caches of the nodes of the network. The update is done in two phases. First, the cached links that (according to the connection matrix contained in the AP) no longer exist, that is, broken links, are deleted from the nodes' caches. Secondly, new links that are stored in the matrix but of whose existence the cache had no knowledge are added. This helps to improve the cache
Fig. 2. Simplified flow chart of the reception of an AP
performance of the protocol by keeping the entries in the caches from becoming stale (the portion of successful cache hits should be higher). It is also expected that the extra signalling traffic generated by the APs introduced by the modification is compensated by the avoided route discovery flooding. As soon as a node receives an AP broadcast from its neighbour in the 2nd visit phase, it examines whether the AP has already been received in the 2nd visit phase. If so, the node drops the AP; if not, the cache of the node is updated by the AP, and the AP is then broadcast again if necessary. There is a timer that determines when an AP ought to be created. This timer runs at every node in the network and starts when a node joins the network or receives an active packet. When no active packets are received during a predefined timeout, the node selects a random back-off time after which it is responsible for generating an AP. If the node receives an AP during the back-off period, it cancels the new AP generation and restarts the timer. 5.2 Format of Active Packet First of all, we must define the format of the AP. To remain compatible with nodes that use the standard DSR protocol, the DSR header is left unchanged; the AP information is included in an extra header after the standard DSR header. All the necessary information is compact enough to be carried in this header. The header number 253 is used, which is reserved by the Internet Assigned Numbers Authority (IANA) for testing. Thus, the value 253 in the next header field of the DSR header indicates a following AP header. The resulting frame is presented in Fig. 3.
Fig. 3. MAC frame with the AP header
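The two-phase cache update performed during the second visit (Sect. 5.1) can be sketched as follows; the connection matrix is represented here as a plain adjacency dict, and all names are ours. Note that a cached link is only deleted when both of its endpoints were visited, since the matrix says nothing about links between unvisited nodes:

```python
def update_cache(link_cache, connection_matrix):
    """Second AP visit: validate a node's link cache (a set of
    frozenset links) against the topology collected in the first visit."""
    visited = set(connection_matrix)
    valid = {frozenset((a, b))
             for a, nbrs in connection_matrix.items() for b in nbrs}
    # Phase 1: delete cached links the AP has shown to be broken.
    stale = {l for l in link_cache if l <= visited and l not in valid}
    # Phase 2: add links the cache did not yet know about.
    return (link_cache - stale) | valid

# B-C is now broken; node X was never visited by the AP.
cache = {frozenset('AB'), frozenset('BC'), frozenset('CX')}
matrix = {'A': ['B'], 'B': ['A', 'F'], 'C': [], 'F': ['B']}
print(sorted(sorted(l) for l in update_cache(cache, matrix)))
# [['A', 'B'], ['B', 'F'], ['C', 'X']]
```

B-C is removed as stale, B-F is learnt from the matrix, and C-X survives because the AP collected no evidence about X.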
Nodes that are not AP-compatible but run the original DSR protocol simply ignore the AP header, while nodes that are AP-capable can extract and process the enclosed information. We define two AP header formats for the two visit phases of the AP. For the first visit phase, the AP header contains the following information:
• Enable 2nd visit (1 bit): determines whether the AP is in its 1st visit phase of collecting information (when the value of this field is false, or "0") or in its 2nd visit phase of updating the caches (when the value of this field is true, or "1").
• Backtracking (BT) variable (4 bytes): the identifier of the node to which the packet must be routed back (in the case that the previous node has already been visited but still has unvisited neighbours).
• GenNodeID (4 bytes) and AP_ID (1 byte): the node that generated the AP and the unique identifier of the AP. This pair is unique for each AP. AP_ID must be set to a new value, different from those used for other APs recently initiated by this node. For example, each node may maintain a single counter for generating a new AP_ID value for each AP it initiates. When a backtracking
374
D. Marandin
variable must be stored at a node, it is saved together with this pair of fields. The pair makes it possible to know later to which AP a backtracking variable belongs.
• Connection matrix: a current view of the topology after the 1st AP visit. It contains the information about the direct links discovered during the 1st AP visit. The AP collects information only about bidirectional links in the network.
For the second visit phase, the AP header contains the following information:
• Enable 2nd visit (1 bit): determines whether the AP is in its 1st visit phase of collecting information (value "0") or in its 2nd visit phase of updating the caches (value "1").
• GenNodeID (4 bytes) and AP_ID (1 byte): the node that generated the AP and the unique identifier of the AP. When a node broadcasts the AP in the 2nd visit, it stores this pair in its ACTIVE_PACKET table; the pair is used to prevent broadcasting the packet multiple times during the 2nd visit. AP_ID must be set to a new value, different from those used for other APs recently initiated by this node; for example, each node may maintain a single counter for generating a new AP_ID value for each AP it initiates.
• List of visited nodes (variable length): a list of the addresses of the nodes that have already been visited in the 2nd phase.
• Connection matrix (variable length): a current view of the topology after the 1st AP visit.
It is imperative to somehow reduce the amount of data sent. From a logical point of view, a connection matrix is an array of lists; each list is associated with a visited node and contains the addresses of that node's neighbours. After the neighbour discovery process, a visited node adds its address, with the list of its neighbouring nodes, to the connection matrix. When transmitted, the connection matrix is efficiently encoded and sent in compressed form.
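A possible packing of the second-visit header is sketched below. The paper does not specify the exact wire layout, so the field order, the whole-byte flag, and the length-prefixed encoding of the visited list are our assumptions, used only to illustrate the field sizes listed above:

```python
import struct

def pack_ap_header(enable_2nd, gen_node_id, ap_id, visited, matrix_blob):
    """Sketch of a second-visit AP header: flag (carried in one byte),
    GenNodeID (4 bytes), AP_ID (1 byte), a length-prefixed list of
    visited node IDs, and the compressed connection matrix bytes."""
    fixed = struct.pack('!BIB', 1 if enable_2nd else 0, gen_node_id, ap_id)
    visited_part = struct.pack('!B', len(visited)) + bytes(visited)
    return fixed + visited_part + matrix_blob

hdr = pack_ap_header(True, 0x0A000001, 7, [1, 2, 5], b'\x00')
print(len(hdr))  # 6 fixed bytes + 1 length byte + 3 visited IDs + 1 matrix byte = 11
```

A real implementation would pack the 1-bit flag into a flags byte shared with other fields and use the compressed matrix encoding described above; the sketch only shows how compact the fixed part of the header can be.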
There are many other techniques that can be used to reduce the size of the packet. It is also important to be aware that the size of the AP is dynamic: the more nodes are visited, the larger the packet becomes. This can cause considerable overhead and slow down the functioning of the whole network, which is one of the reasons for limiting the number of nodes that the AP visits before it starts updating the caches. 5.3 Initialization In normal operation, the node that starts the 2nd visit phase is responsible for generating the next AP after the timeout ACTIVE_PACKET_CREATION. This timeout defines the frequency of updates and must be set depending on the degree of mobility. For exceptional situations (AP loss during the 1st visit phase, the initial period after network formation, or a network partition), a timer ACTIVE_PACKET_REGENERATION determines when an AP ought to be created. This timer runs at every node in the network and starts when a node joins the network. The timer is restarted when an AP is received. When no APs are received during the ACTIVE_PACKET_REGENERATION timeout, the node selects a random back-off time after which it is responsible for generating an AP. If during
the back-off period the node receives an AP, it cancels the new AP generation and restarts the timer. The timer ACTIVE_PACKET_REGENERATION is introduced to generate an AP at the very beginning, after network formation, or in the case of a network partition. The most important problem with active packets is that, because of the mobile nature of this kind of network, the packets are prone to getting lost.
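The ACTIVE_PACKET_REGENERATION logic can be sketched as an event-driven timer. The class, timeout and back-off values below are illustrative assumptions, not taken from the paper:

```python
import random

class APRegenerationTimer:
    """Sketch of the regeneration logic: if no AP is seen for
    REGEN_TIMEOUT seconds, pick a random back-off; generate a new AP
    only if the back-off also expires without an AP arriving."""
    REGEN_TIMEOUT = 30.0  # illustrative value, not from the paper
    MAX_BACKOFF = 5.0     # illustrative value, not from the paper

    def __init__(self, now):
        self.deadline = now + self.REGEN_TIMEOUT
        self.backoff_deadline = None

    def on_ap_received(self, now):
        # Any received AP cancels a pending generation and restarts the timer.
        self.backoff_deadline = None
        self.deadline = now + self.REGEN_TIMEOUT

    def tick(self, now):
        """Returns True when this node must generate a new AP."""
        if self.backoff_deadline is not None:
            if now >= self.backoff_deadline:
                self.backoff_deadline = None
                self.deadline = now + self.REGEN_TIMEOUT
                return True
        elif now >= self.deadline:
            self.backoff_deadline = now + random.uniform(0, self.MAX_BACKOFF)
        return False

timer = APRegenerationTimer(now=0.0)
timer.tick(31.0)            # timeout expired: a random back-off starts
timer.on_ap_received(32.0)  # an AP arrives during back-off: generation cancelled
print(timer.tick(40.0))     # False - the timer was restarted at t=32
```

The random back-off is what keeps all nodes that detect the missing AP from generating replacements simultaneously: the first node whose back-off expires floods its AP, and every other node cancels on reception.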
6 Performance Evaluation We compare our algorithm, called DSR-AP (DSR with Active Packets), with the normal operation of DSR, called DSR. The network simulator ns-2 [7] is used, including the Monarch Project's [8] wireless and mobile extensions. The signal propagation model used is a two-ray ground reflection model. The network interface uses the IEEE 802.11 DCF MAC protocol. The mobility model used is the modified version of random waypoint [9], in which the steady-state speed distribution is applied to the first trip to eliminate any speed decay. Node speed is randomly chosen from 1 m/s to 19 m/s, as proposed in [9]. The communication model is Constant Bit Rate (CBR) with four packets per second and a packet size of 64 bytes, in order to factor out the effect of congestion [1]. At every instant of the simulation, three simultaneous short-lived flows between random nodes exist. Each flow lasts 10 seconds. The simulation time is 2000 s. As new flows are started every 10 seconds, requests to the route cache are frequent, which allows the performance of DSR-AP to be evaluated.
Fig. 4. Performance evaluation
It can be observed (Fig. 4) that the throughput of DSR increases when the pause time is increased, but the throughput of DSR-AP does not change. This means that the AP approach strongly reduces the impact of mobility on performance. When mobility is higher, more links are broken and more packets are dropped. DSR-AP achieves its better performance by increasing the portion of good cache hits. The highest improvement takes place at a pause time of 0. At pause time 0, the nodes move a lot and the network topology changes quickly; for this reason the cache information becomes obsolete and contains a lot of invalid entries. But thanks to the quick propagation of the AP, in DSR-AP the cache is updated in time. At a pause time of 600 seconds, the network does not change quickly and DSR shows better performance, but still below that of DSR-AP. Generally, as the pause time increases, the packet delivery ratio constantly increases as well. Regardless of the degree of mobility, the packet delivery ratio of DSR-AP stays almost constant at 99%, which indicates that the AP approach successfully updates node caches and significantly reduces the chance of packet loss due to a route failure. When a route failure occurs, the up-to-date node cache helps in switching to a new valid route, and unnecessary packet losses are avoided. In the standard DSR, if an intermediate node cannot locate an alternate route in its cache, the data packet is dropped. The proposed approach reduces the average end-to-end delay as well. Since the AP avoids performing new unnecessary route discoveries, some delay time is saved. The requested routes are mostly available in the caches, and the cache information collected through the APs proves to be correct, which results in insignificant or zero route discovery delay. The improvement is more noticeable for high mobility.
Normalized routing overhead (the ratio of the amount of routing overhead, comprising the routing packets, to the amount of transferred data) decreases when the nodes move less (higher pause times), because fewer links are broken and, therefore, less routing traffic is needed to repair or discover routes. DSR produces lower control overhead, but at the cost of a low packet delivery ratio and a high end-to-end delay. In low-traffic conditions like those in our scenario, the per-packet overhead is high. If the data rate were higher, the normalized overhead of DSR-AP would decrease at all pause times. Also, if the number of connections were to increase, the normalized overhead of DSR would grow, as more route discoveries must be performed (since fewer routes are available in the caches), while the overhead of DSR-AP would stay stable. Improving the caching system performance results in reduced route discovery control traffic. The improvement is obtained because the AP helps to avoid or speed up many route discoveries by maintaining more valid entries in the node caches. Although the AP overhead is high, it poses no problem for the high capacity of modern WLANs. The increased control overhead does not cause network congestion, significantly increased average packet latency, or packet loss. On the contrary, the proposed proactive link cache maintenance can significantly improve overall routing performance.
7 Conclusions The route discovery/setup phase becomes the dominant factor for applications with short-lived small-transfer traffic (a single packet or a short stream of packets per
transaction) between the source and the destination: resource discovery, text messaging, object storage/retrieval, queries and short transactions. The proposed approach to shortening it consists of an Active Packet (AP) that travels through the nodes of the network twice. During the first travel, it visits the nodes and collects fresh network topology information. When the first visit is finished, a second one is started to validate and update the route caches of the nodes with this newly obtained information. This mechanism removes invalid cached links and stores the currently existing links based on the collected topology information. The valid and complete information in the caches allows route discovery to be sped up or even avoided. It is concluded that the Active Packet approach achieved its objective, systematically improving to some extent all the metrics under analysis, except overhead. The improvements are in most cases more significant under high node mobility (smaller values of the pause time).
References [1] Broch, J., Maltz, D.A., Johnson, D., Hu, Y.-C., Jetcheva, J.: A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols. In: Proceedings of the 4th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom), Dallas, Texas, pp. 85–97 (1998) [2] Johnson, D., Maltz, D., Hu, Y.-C.: The Dynamic Source Routing Protocol for Mobile Ad Hoc Networks. IETF Internet Draft (July 2004), http://www.ietf.org/internet-drafts/draft-ietf-manet-dsr-10.txt [3] Holland, G., Vaidya, N.: Analysis of TCP performance over mobile ad hoc networks. In: Proc. 5th ACM/IEEE MOBICOM, pp. 219–230 (1999) [4] Marina, M., Das, S.: Performance of routing caching strategies in Dynamic Source Routing. In: Proc. 2nd WNMC, pp. 425–432 (2001) [5] Maltz, D., Broch, J., Jetcheva, J., Johnson, D.: The effects of on-demand behavior in routing protocols for multi-hop wireless ad hoc networks. IEEE J. on Selected Areas in Communications 17(8), 1439–1453 (1999) [6] He, Y., Raghavendra, C.S., Berson, S., Braden, B.: Active Packets Improve Dynamic Source Routing for Ad-hoc Networks. In: Proceedings of OpenArch 2002 (June 2002) [7] Fall, K., Varadhan, K. (eds.): ns Notes and Documentation. The VINT Project, UC Berkeley, LBL, USC/ISI, and Xerox PARC (1997) [8] The Monarch Project: Mobile networking architectures. http://www.monarch.cs.rice.edu/ [9] Yoon, J., Liu, M., Noble, B.: Random Waypoint Considered Harmful. Electrical Engineering and Computer Science Department, University of Michigan [10] Maltz, D., Broch, J., Jetcheva, J., Johnson, D.: The effects of on-demand behavior in routing protocols for multi-hop wireless ad hoc networks. IEEE J. on Selected Areas in Communications 17(8), 1439–1453 (1999) [11] Das, S.R., Perkins, C.E., Royer, E.M.: Performance Comparison of Two On-demand Routing Protocols for Ad Hoc Networks. In: Proceedings of the IEEE Conference on Computer Communications (INFOCOM), Tel Aviv, Israel, pp. 3–12 (March 2000) [12] Johansson, P., Larsson, T., Hedman, N., Mielczarek, B., Degermark, M.: Scenario-based Performance Analysis of Routing Protocols for Mobile Ad-Hoc Networks. In: Proceedings of the 5th ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom), Seattle, WA, pp. 195–206 (August 1999)
[13] Holland, G., Vaidya, N.: Analysis of TCP performance over mobile ad hoc networks. In: Proc. 5th ACM/IEEE MOBICOM, pp. 219–230 (August 1999) [14] Hu, Y.-C., Johnson, D.B.: Caching strategies in on-demand routing protocols for wireless ad hoc networks. In: ACM/IEEE MOBICOM, pp. 231–242 (2000) [15] Perkins, C., Royer, E., Das, S., Marina, M.: Performance comparison of two on-demand routing protocols for ad hoc networks. IEEE Personal Communications 8(1), 16–28 (2001) [16] Marina, M.K., Das, S.R.: Performance of Route Cache Strategies in Dynamic Source Routing. In: Proc. 2nd Wireless Networking and Mobile Computing (WNMC) (April 2001)
tinyLUNAR: One-Byte Multihop Communications Through Hybrid Routing in Wireless Sensor Networks Evgeny Osipov, LTU Luleå University of Technology, Department of Computer Science and Electrical Engineering, Campus Porsön, S-971 87 Luleå, Sweden
Abstract. In this paper we consider the problem of implementing a hybrid routing protocol for wireless sensor networks that natively supports data-centric, geographic-based and address-centric communication paradigms. We demonstrate the feasibility of such a protocol by presenting tinyLUNAR, a reactive routing scheme originally developed for mobile wireless ad hoc networks and adapted to the specifics of sensor networks. In addition to the support for several communication paradigms, tinyLUNAR implements highly efficient multihop forwarding using only a one-byte field that can be directly encoded in the standard IEEE 802.15.4 MAC header.
1 Introduction
Over recent years, wireless sensor networks (WSNs) have emerged as a unique networking environment with respect to routing, among other aspects. Firstly, WSNs inherit the need for routing on geographic coordinates from their closest "relative", mobile ad hoc networks (MANETs), due to the spatial distribution of nodes. Secondly, differently from MANETs and the Internet in general, where communications are purely address-centric, communications in sensor networks are heavily data-centric. With this type of communication, a set of nodes satisfying certain attributes (e.g., particular readings of on-board sensors larger than a pre-defined threshold in a specific geographic region) should report the information, mainly in a connection-less manner. However, while data-centric communications dominate in sensor networks, there are numerous applications that still require address-centric communications (e.g., sending an alarm message to a base station with a pre-defined unique ID). Finally, being built of devices severely constrained in energy and computing resources, WSNs place serious performance requirements on routing protocols and data forwarding.
The work described in this paper is based on results of the IST FP6 STREP UbiSec&Sens (www.ist-ubisecsens.org). A significant part of this work was performed in the Department of Wireless Networks at RWTH Aachen University while the author was working there.
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 379–392, 2007. © Springer-Verlag Berlin Heidelberg 2007
The motivation behind this work is straightforward. During the last several years the area of WSN routing has blossomed with a variety of protocols that separately support the data-centric, geographic-based, and address-centric communication paradigms. A good overview of the existing approaches is presented in [1]. One general conclusion, however, is that none of the existing protocols supports all three communication styles. This leads to a situation where, in complex WSN applications such as large-scale surveillance in homeland security scenarios, where all types of communications are present, at least three software instances for routing are needed. While not arguing against the development of specialized protocols, in this work we want to demonstrate the feasibility of designing an efficient hybrid protocol that natively supports multiple communication paradigms. In this paper we present tinyLUNAR, an adaptation of the Lightweight Underlay Ad hoc Routing protocol [18] to the specifics of the WSN environment. The major contribution delivered by our protocol is the implementation of multihop forwarding using a one-byte field that can be encoded directly in the IEEE 802.15.4 MAC header. Secondly, tinyLUNAR supports multiple data communication types in one package, exposes flexible interfaces to application-level programmers, and delivers competitive performance in comparison to the existing protocols. The protocol is implemented under TinyOS v.2.x¹ and is currently the default routing scheme for the secure distributed data storage middleware tinyPEDS [3]. To the best of our knowledge, there are currently no competitors to tinyLUNAR in terms of its existing functionality and potential capabilities. The paper is structured as follows. In Section 2 we present the considered network model, overview routing principles in wireless sensor networks, and formulate the design requirements for tinyLUNAR. We outline our solution and present the background material in Section 3.
The details of tinyLUNAR operations follow in Section 4. After overviewing the related work in Section 5, we discuss future developments of the protocol in Section 6. We conclude the paper in Section 7.
2 Problem Statement: Design Objectives for tinyLUNAR
2.1 Networking Model of WSN
We design tinyLUNAR for a network formed by wireless sensor nodes arbitrarily deployed in a monitored area. We do not place any assumptions either on the scale of the network or on its topological structure. The network can be hierarchical, deploying a certain cluster formation scheme, or flat. We consider a relatively static network with low or no node mobility. The low mobility is implicitly present in the form of changes in the position of a cluster head node and eventual node failures. Note that we do not place any specific assumptions on the structure of node IDs either. Moreover, we consider a general case where each sensor node maintains a set of identities, including a) MAC addresses; b) position information (geographic coordinates, relative position); c) functional roles (cluster head, actuator);
TinyOS web portal. Online. Available: http://www.tinyos.net
d) descriptions of on-board sensors (temperature, humidity); etc. We also do not assume the availability of a centralized directory service for address/name resolution. The target application of the sensor network requires both data-centric and address-centric communications. The network supports anycast, unicast, convergecast, and multicast traffic flows.

2.2 Routing in Wireless Sensor Networks
There are two routing approaches applicable to wireless sensor networks. In centralized routing schemes [11,2] the forwarding map is computed at a base station based on link and neighbor information gathered from the network. While accurate computation of optimal paths can be achieved in this case, centralized schemes are known for poor scalability, as they require periodic gathering of topology information from all network nodes. On the other hand, in the distributed approaches [5,13,14] the routes are computed cooperatively by all network nodes exchanging network state and control information. In this paper we concentrate on the decentralized routing approaches only. The distributed routing approaches fall into two global categories: proactive and reactive schemes. The proactive approach is inspired by routing experience in the wireline Internet. The routing topology is created prior to data transmissions from mobile nodes. The routing information is then dynamically updated according to changes in the network topology. In contrast, the reactive routing approach assumes no existing routing state in the network prior to data transmission from a particular station. Upon arrival of a first data packet the node enters a route discovery phase in which it announces the request for the particular destination address to the network. In reactive routing the routing information is maintained in the network only for the period of activity of the particular session. The major representative of proactive routing for WSNs is DSDV [14]. For reactive routing these are adapted versions of DSR [8] and AODV [13]. A special subclass of reactive routing schemes is constituted by self-routable protocols. In these approaches the forwarding state is not maintained by intermediate relay nodes. Instead, the forwarding decision is taken individually for every packet. Examples of self-routable schemes are geographic routing [10,9] and flooding.

2.3 Design Objectives for tinyLUNAR
The protocols referenced in the previous section represent only a small share of the approaches developed for sensor networks during the last decade. While each scheme targets a specific need in WSNs, one conclusion is clear: currently there is no solution supporting several communication types in one package. The major design objective for tinyLUNAR is the capability to support as many types of traffic flows as possible for both data-centric and address-centric communications. Of course, functional universality comes at the price of increased complexity. With tinyLUNAR we want to understand how complex a universal protocol can be. At the same time we are not arguing against the development of specialized routing protocols. They are obviously best suited for specially engineered and rather narrow-purpose
WSNs. However, we foresee that in large-scale distributed networks with rich functionality a single implementation of routing supporting several connection types is more efficient than several separate single-purpose schemes.
3 From LUNAR to tinyLUNAR: Solution Outline
The original Lightweight Underlay Ad hoc Routing protocol [18] is a reactive protocol that uses a simple mechanism of flooding and limited re-broadcasting (by default the range is limited to three hops) to establish a label-switching virtual circuit in the forwarding plane. While keeping the core logic of its predecessor, the operation of tinyLUNAR differs in the following ways. Firstly, tinyLUNAR does not interpret data packets as connection initiation events. Instead, the protocol exposes well-defined interfaces that allow upper-layer programmers to configure the characteristics of the path. With these interfaces it is possible to parametrically specify a) various classes of destination identifiers; b) the type of the desired communications; and c) the propagation behavior for the route request messages. Secondly, in the data forwarding plane we change the dimension of, and impose a certain structure on, the selector² field. The size of the selector field in tinyLUNAR is 1 byte. This potentially allows implementing multihop forwarding using only the type field of the standard IEEE 802.15.4 MAC header, without consuming payload space. Finally, we scale the forced path re-establishment mechanisms to satisfy the bandwidth and energy limitations of wireless sensor networks. While the periodic route rediscovery is still present in tinyLUNAR, the duration of the period is tailored to the specific application that uses the protocol in a particular sensor network. For example, if tinyLUNAR is used in a hierarchical network to build routes toward a cluster head then, naturally, the period is synchronized with the cluster head re-election mechanism.

3.1 Packet Forwarding via Label Switching
Label switching is a technique for overcoming the inefficiency of traditional layer-3 hop-by-hop routing. In the Internet the label switching (or virtual circuit) model is used, amongst others, in MPLS [16]. A simplified example of multihop data forwarding using label switching is illustrated in Figure 1. Assume each application running at a particular node is assigned an ID number which allows deterministic internal multiplexing of data packets between different applications³. Suppose also that our application needs a bidirectional unicast path for communication. In this setting the application with AppID = 120 at nodes with addresses X and Z communicates through a node with address Y. The figure shows the content of the forwarding table
2 Traversing the network, LUNAR's route request messages obtain a special forwarding label (called a selector) in each router along a path to a destination.
3 Further on, we use the terms application, component, and interface in the sense defined by TinyOS.
Fig. 1. Packet forwarding via label switching (forwarding tables at nodes X, Y, and Z; data packet format: MAC header, link label header, payload)
(FT) in each node. The label switching paradigm consists of two phases: path establishment and data forwarding. Upon establishing a connection, node X locally generates an incoming label and creates an entry in its FT. This number will be used by the next-hop node to send packets to X on the backward path. In the created entry node X sets the outgoing label to the application ID 120 as the destination for incoming packets. The empty address field in the first FT entry of X in our example indicates that the packet must be multiplexed locally to the application. Finally, node X signals the generated label to the next-hop node in a path establishment message. During the forward propagation of this message node Y locally generates its incoming label (in our case 3). As the outgoing label it specifies the incoming label received from node X and also records the address of the node from which the request was received. In its turn, node Z figures out that it is the destination node and performs the following actions. Firstly, it creates a backward label (in our case 4), which will be used by the local application to send packets to node X. The outgoing label is set to the incoming label generated by node Y; the address of Y is recorded in the corresponding field. Secondly, the local application (AppID = 120) is signaled that for sending packets to node X it needs to indicate 4 as the entry point to the virtual circuit between Z and X. Finally, node Z initiates building of the forward path from X to itself by the same procedure described for node X. The path establishment message returns to node X using the established backward path from Z to Y and from Y to X. The result of the path establishment procedure is two numbers available to our communicating applications indicating the entry points of the established virtual circuits (2 at node X and 4 at node Z).
The data forwarding, which may commence immediately upon successful path establishment, is straightforward. Each packet is assigned an outgoing label as shown in the figure. In packets sent to the network it is the outgoing label recorded in the FT for a specific incoming label. Normally, the incoming labels are directly mapped onto FT indices inside the next-hop node. Originally, this was meant to allow the label lookup and label switching procedures to be placed directly within the switching fabric.
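As an illustration, the two phases can be modeled in a few lines of Python (a sketch, not tinyLUNAR code; the Node class and helper names are invented for this example, and FT list indices are used directly as incoming labels, following the direct-mapping optimization mentioned above):

```python
# Simplified model of label-switching path establishment and forwarding.
class Node:
    def __init__(self):
        # FT: list of (out_label, next_hop); the list index is the incoming label.
        self.ft = []

    def add_entry(self, out_label, next_hop):
        """Create an FT entry and return its index as the new incoming label."""
        self.ft.append((out_label, next_hop))
        return len(self.ft) - 1

def deliver(nodes, node_addr, label):
    """Follow labels hop by hop until a local-delivery entry is reached."""
    hops = []
    while True:
        out_label, next_hop = nodes[node_addr].ft[label]
        if next_hop is None:              # empty address field: local delivery
            return hops, out_label        # here out_label is the AppID
        hops.append(next_hop)
        node_addr, label = next_hop, out_label

# Backward path Z -> Y -> X toward application 120, as in the X-Y-Z example
# (labels here are 0-based list indices rather than the values 2, 3, 4 of Fig. 1).
nodes = {a: Node() for a in "XYZ"}
in_x = nodes["X"].add_entry(120, None)    # X: deliver locally to AppID 120
in_y = nodes["Y"].add_entry(in_x, "X")    # Y: switch to X's incoming label
in_z = nodes["Z"].add_entry(in_y, "Y")    # Z: entry point the application uses

hops, app = deliver(nodes, "Z", in_z)
print(hops, app)   # packet traverses Y then X and is delivered to AppID 120
```

Each relay consults only its own table, so the lookup cost is independent of path length and network size.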
Fig. 2. Position of tinyLUNAR in the TinyOS software architecture (networking components with concast/multicast logic and RTAction handlers sit above tinyLUNAR, which sits above the ActiveMessage component, the MAC, and the radio)
This would enable the label switching technique to perform much faster than the traditional routing method, where each packet is examined by the CPU before a forwarding decision is made⁴.
4 tinyLUNAR: Protocol Description
Overall, the design of tinyLUNAR is partially influenced by the specifics of the TinyOS software architecture. These design steps, however, do not affect the applicability of the protocol to the general class of embedded operating systems. As is the case with its predecessor, we position tinyLUNAR directly above the link layer⁵, as illustrated in Figure 2. It intercepts all broadcast and unicast packets addressed to a node and is responsible for their forwarding to the remote destination and for local demultiplexing between the communicating components. All networking components communicate directly through tinyLUNAR. Note that by networking components we mean pieces of software with self-consistent functionality, e.g. aggregator node election, multicast/convergecast handlers, etc. Each local component executed in a node is assigned a locally unique ID. The networking components can be of two types depending on the type of communications: a) connection-less and b) connection-oriented (uni- or bidirectional).

4.1 Modified Selector Structure
The selector field, appended to each outgoing data packet and to all tinyLUNAR control routing messages, is a one-byte integer; its format is shown in Figure 3(a).
4 One general comment is needed on the uniqueness of the labels. Since incoming labels are assigned internally by the relay node, the values are obviously only locally unique. The determinism in multihop forwarding between several neighbors is achieved by the per-link uniqueness introduced by MAC and link-layer addresses. In wireless sensor networks, achieving global uniqueness of MAC-level IDs is challenging. However, for correct functionality of label switching forwarding it is sufficient to have two-hop MAC uniqueness. An example of a distributed address auto-configuration protocol which achieves the needed uniqueness level is [15].
5 In TinyOS, the ActiveMessage component situated directly above the MAC layer can be regarded as the link-layer component.
Fig. 3. Format of the selector field and the structure of the forwarding table in tinyLUNAR: (a) format of the selector field (1-bit APP/FWD flag followed by a 7-bit NetPTR); (b) structure of the forwarding table (Index, DstAddr (2 B), NetPTR (1 B), AppID (1 B), ConnID (1 B), Flags (1 B), Signature (2 B))
The first bit of the selector field splits the value space into two parts: the application part and the forwarding part. When the APP/FWD bit is zero, tinyLUNAR treats the following seven bits, denoted as NetPTR, as the ID of the internal application or service component to which the packet shall be multiplexed. When the APP/FWD bit is set, the NetPTR part is interpreted as the incoming label, as explained in Section 3.1. The rationale behind restricting the selector field to one byte is simple. We want to minimize the overhead due to multihop forwarding. In addition, we want to maximally re-use the standard fields of the IEEE 802.15.4 MAC header. Our intention is to encode the selector value in the type field of the latter.

4.2 Generation of Incoming Labels
The format of the forwarding table is shown in Figure 3(b). The purpose of the DstAddr, NetPTR, and AppID fields is essentially the same as in our example in Section 3.1. The ConnID field is used by connection-oriented application components to differentiate between incoming (outgoing) connections. The Flags field is used internally for FT management purposes; Signature is used to detect duplicates of route request messages, and its computation is described below. The procedure for local generation of incoming labels is as follows. As described in Section 3.1, the major objective of label switching forwarding is to make the route lookup procedure fast. Recall also that many embedded operating systems, including TinyOS, lack support for dynamic memory allocation. In tinyLUNAR we statically allocate the memory for the forwarding table⁶. The incoming label in our protocol is the index of the FT entry. In this way we completely avoid implementing a sophisticated search algorithm. Upon reception of a packet with a selector belonging to the forwarding space, we find the outgoing label and the MAC address of the next hop by directly indexing the forwarding table with the value extracted from the NetPTR part. Recall that the selector field must be present in all control messages of tinyLUNAR. In order for the routing layer to direct these packets to the corresponding processing blocks, we assign the selector values for route request (RREQ) and route reply (RREP) messages in the application space. The two values are the same for all RREQs and RREPs generated by any node.
6 The number of entries in the FT is an adjustable parameter of tinyLUNAR; it is bounded by 128 entries (the maximal number of incoming labels that can be encoded in the 7 bits of the selector).
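The one-bit split of the selector and the direct-indexing lookup can be sketched as follows (an illustrative Python model; the placement of the APP/FWD flag in the most significant bit and all constant names are assumptions made for this example, not the tinyLUNAR implementation):

```python
# Sketch of the 1-byte selector: 1-bit APP/FWD flag + 7-bit NetPTR.
FWD_FLAG = 0x80      # assumed position of the APP/FWD bit
NETPTR_MASK = 0x7F   # 7 bits -> at most 128 incoming labels or component IDs

def make_selector(netptr, forwarding):
    """Pack a NetPTR value and the APP/FWD flag into one byte."""
    assert 0 <= netptr <= NETPTR_MASK
    return (FWD_FLAG if forwarding else 0) | netptr

def dispatch(selector, ft, apps):
    """Either index the FT directly (forwarding space) or multiplex to a
    local component (application space)."""
    netptr = selector & NETPTR_MASK
    if selector & FWD_FLAG:
        return ("forward", ft[netptr])    # O(1): incoming label == FT index
    return ("local", apps[netptr])

ft = {3: ("out_label=5", "next_hop=Y")}           # statically sized FT, sketched
apps = {42: "RREQ handler", 43: "RREP handler"}   # application-space selectors

print(dispatch(make_selector(3, forwarding=True), ft, apps))
print(dispatch(make_selector(42, forwarding=False), ft, apps))
```

Because the label is the table index, a relay performs no search at all; the same byte doubles as a component ID when the flag is clear, which is how RREQ and RREP selectors are dispatched.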
Fig. 4. Path establishment and data forwarding phases of tinyLUNAR (time sequence of RREQ flooding, RREP, and DATA between nodes N0, N1, and N2; the RREQ carries the fields TTL, AppID, num_IDs, DST_ID_SET, match_action, not_match_action, replyto_sel, replyto_addr, and Signature; the RREP carries AppID, replyto_sel, replyto_addr, and replyfor_sel; a DATA packet carries the forwarding selector in the 1-byte type field of the standard MAC header, followed by the payload)
4.3 Path Establishment in tinyLUNAR
The path establishment and data forwarding phases of tinyLUNAR are schematically shown on a time sequence graph in Figure 4. TinyLUNAR is a reactive routing protocol in which route establishment is initiated upon a demand from the application. Normally, in all reactive routing schemes the indication of such a demand is the first data packet arriving from an upper-layer component. Recall, however, that our major design objective is to create a flexible routing protocol supporting different types of addressing and communications. We foresee that guessing the routing properties from the content of data packets is difficult. Instead, we decided to expose a set of intuitive interfaces to upper-layer programmers to allow them to actively configure the characteristics of the desired routes. An intuitive abstraction for the route request procedure (RREQ) from the programmer's point of view is an “if” statement, formed by the source node, that travels through the network. The intermediate nodes check whether the identity condition matches their own identity. If yes, the node performs a matching action; otherwise a not-matching action is invoked. The identity condition is a set of tuples (ID CLASS, VALUE) that describes an individual node or a set
Fig. 5. Interfaces exposed by tinyLUNAR to actively control the characteristics of the path, and examples of ID classes, matching and not-matching actions. The interface sequence RREQinit(); appendRREQcondition(ID_CLASS, VALUE); ...; addRREQaction(MATCH_ACTION, ACTION); addRREQaction(NOT_MATCH_ACTION, ACTION); RREQfini() corresponds to the abstraction IF (identity conditions) DO match action ELSE DO not-match action ENDIF.
Identity conditions (256 ID classes are possible): ID_CLASS_GEOGRAPHIC (geographic coordinates); ID_CLASS_ROLE (functional role played by the node, e.g. cluster head); ID_CLASS_ADDRESS (unique address, e.g. MAC).
Match actions (256 MATCH and NOT_MATCH actions are possible): RT_ACTION_ROUTE_UNIDIR (establish a one-way unicast route); RT_ACTION_ROUTE_BIDIR (establish a bidirectional unicast route); RT_ACTION_SUBSCRIBE (establish a multicast tree); RT_ACTION_REPORT (establish a convergecast tree, directed-diffusion-like).
Not-match actions: RT_ACTION_REBCAST (re-broadcast to all; simple flooding); RT_ACTION_REBCAST_GEO (geo-based re-broadcast).
of nodes. The match action allows a programmer to specify the behavior of the destination node(s). With the not-match action the programmer may coordinate the propagation of the RREQ message through the network. Figure 5 illustrates the above concepts. The following piece of NesC code from a networking component shows an example of forming a RREQ message for a bidirectional path to a node which is a cluster head in a neighboring cluster; the RREQ should be propagated using the shortest geographic distance.

dstregion = getDSTregion();
if (call tinyLUNAR.RREQinit() == SUCCESS) {
  // First condition should always be "OR"
  call tinyLUNAR.appendRREQcondition(OR_CLAUSE, ID_CLASS_GEOGRAPHIC,
      CONDITION_TYPE_EQ, (void*)dstregion);
  // Subsequent conditions could be any
  call tinyLUNAR.appendRREQcondition(AND_CLAUSE, ID_CLASS_ROLE,
      CONDITION_TYPE_EQ, (void*)ROLE_CLUSTER_HEAD);
  call tinyLUNAR.addRREQaction(MATCH_ACTION, RT_ACTION_ROUTE_BIDIR);
  call tinyLUNAR.addRREQaction(NOT_MATCH_ACTION, RT_ACTION_REBCAST_GEO);
  call tinyLUNAR.finiRREQpacket(APP_ID);
}
When called through the RREQfini() interface, tinyLUNAR creates an entry for incoming route reply or data messages, then forms and sends the RREQ
message. In the first vacant entry of the forwarding table, the AppID field is set to the ID of the component that initiates the route discovery. The index of this entry becomes the incoming label. The entry with index 1 in Figure 3(b) is an example of an entry created through the RREQfini() interface. The format of the RREQ message is shown in Figure 4. There AppID is the ID of the communicating component; TTL is the maximum hop count that the RREQ is allowed to traverse; num_IDs is the number of items in the following identity set (DST_ID_SET); the match_action and not_match_action fields are set by using the addRREQaction() interface; replyto_sel is the locally generated incoming label; and replyto_addr is the MAC address of the node. The Signature field is a CRC number computed over all fields of the RREQ message. The completed message is marked with the RREQ selector and is sent to the broadcast MAC address. The path establishment procedure that follows, with respect to setting up the label switching path, obeys the logic presented in Section 3.1. The entry with index 2 in Figure 3(b) is an example of an entry created at a relay node after processing and re-broadcasting the RREQ message, and entry 3 is created at the destination node before sending the RREP message. Further on we only highlight the differences in the path establishment procedure of tinyLUNAR: the identity checking, the decision on RREQ propagation, and the reaction of destination nodes to received RREQ messages. The identity checking procedure, while more complex than conventional matching of a single destination address, is rather straightforward and follows a similar logic. With tinyLUNAR a node has a choice of the particular pattern by which to propagate the route request when its identity does not match the one in the RREQ message. In the simplest case it is a “blind” re-broadcast. However, the node may drop the unmatched request if it does not satisfy the propagation parameters.
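The Signature-based duplicate suppression during flooding can be sketched as follows (an illustrative Python model; the paper specifies a 2-byte CRC over all RREQ fields but not the polynomial, so CRC-16/CCITT is assumed here, and the field serialization is invented for the example):

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16/CCITT-FALSE; stands in for the unspecified 2-byte CRC."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021 if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

def rreq_signature(ttl, app_id, dst_id_set, replyto_sel, replyto_addr):
    """Signature over the RREQ fields (field list follows Fig. 4)."""
    fields = (bytes([ttl, app_id, len(dst_id_set), replyto_sel])
              + replyto_addr.to_bytes(2, "big")
              + b"".join(v.encode() for v in dst_id_set))
    return crc16_ccitt(fields)

seen = set()   # per-node cache of recently seen RREQ signatures

def accept_rreq(sig):
    """Process each flooded RREQ once; drop rebroadcast duplicates."""
    if sig in seen:
        return False
    seen.add(sig)
    return True

sig = rreq_signature(3, 120, ["ROLE=CLUSTER_HEAD"], 2, 0xABCD)
print(accept_rreq(sig))  # first copy is processed
print(accept_rreq(sig))  # duplicate arriving over another path is dropped
```

Since every relay recomputes nothing and only compares the 2-byte signature, duplicate filtering stays cheap even under heavy flooding.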
For example, if one of the target identities is a geographic coordinate and the not-match action is geographic forwarding, then the node will re-broadcast the request only if it is within a certain proximity metric to the target region. As for the matching action, the destination node will issue the route reply message only if the specified type of communication requires joining a multicast group, sending acknowledgments (reliable routing), or a bidirectional unicast path. However, when the match action is REPORT, the matching node may start sending data immediately upon completed processing of the RREQ message. In the forwarding plane tinyLUNAR provides a connection-less service. When upper-layer components require connection-oriented communication, this can be achieved by using port numbers in the protocol-specific header. TinyLUNAR supports connection-oriented components by interfacing the ConnID field in the forwarding table to such components. It remains to say that, as is the case with LUNAR, our protocol adopts the simplest route recovery procedure: it re-builds paths periodically from scratch. In tinyLUNAR the route expiration time is tailored to the specifics of relatively static sensor networks. We keep a route active for several minutes; moreover, the timer is shifted every time a data packet arrives on this route. In hierarchical
Table 1. Memory footprint of tinyLUNAR and tinyAODV

Item                  tinyLUNAR   tinyAODV
RAM  FIB (B)          56          133 (incl. cache)
RAM  Supporting (B)   714         204
RAM  Total (B)        770         337
ROM  Total (B)        1134        2760 (AODV Core and AODV fwd components)
networks we foresee that the route expiration timer is synchronized with the cluster head election protocol.

4.4 Implementation Details and Memory Footprint
We implemented tinyLUNAR in the TinyOS v 2.x operating system. Note that currently TinyOS uses the type field of the IEEE 802.15.4 MAC header to indicate the communicating applications. In order to minimize changes to the core parts of the operating system, we decided to encode the selector in the payload part of the packet for the proof-of-concept implementation. The memory footprint of the tinyLUNAR component and the reference numbers from the implementation of tinyAODV (the number of supported FT entries in both protocols is 7) are shown in Table 1. Note that tinyLUNAR consumes less than half the ROM of its counterpart. With respect to RAM consumption the total number for tinyLUNAR is higher; however, the size of the FT in memory is less than half. The remaining RAM overhead in tinyLUNAR comes from the universality of the protocol. An additional gain in RAM can be achieved by further careful optimization of the implementation.
5 Related Work
The major representatives of routing protocols for wireless sensor networks were overviewed in Section 2. In this section we discuss the relation of tinyLUNAR to the existing approaches. The universality of the protocol, through supporting both data-centric and address-centric communications, uniquely positions our approach in the domain of WSN routing. The ports of almost all address-centric protocols from MANETs, such as DSDV and AODV, remain address-centric in WSNs. One exception to this rule could be a possible modification of DSR, where the forwarding plane (source routing) is separated from the routing plane (reactive route request). With relatively easy modifications, the addressing concept presented here can be adapted to work with DSR as well. However, one clear advantage of tinyLUNAR is its ability to conduct multihop communications using only one byte of overhead. Obviously, the source routing of DSR quickly becomes less efficient on paths longer than one hop. As for data-centric routing, the parametrized specification of the destination node(s) in tinyLUNAR allows implementing address-centric communications as a special case of data-centric ones. In principle, any existing data-centric
routing scheme can be adapted according to the principles described in this paper. To the best of our knowledge, however, there have been no such attempts. Furthermore, in the data-centric domain the routing issues are normally hidden behind a specific data-centric application. Typical examples of this are TinyDB [12] and Directed Diffusion [6], where the authors mainly focus on a systematic way of querying information from the network and not on the routing issues.
6 Discussion and Future Developments

6.1 Addressing and Routing for WSN
While tinyLUNAR allows address-centric connections, a general question, however, is to what extent this communication paradigm is suitable for wireless sensor networks. In the original Internet architecture, addresses indicate the location of the data, while names (e.g. URLs, email, etc.) are used to describe the communicating parties [17]. In the context of the Internet, however, addresses were mainly introduced to hierarchically organize the network and to perform efficient route lookup based on fixed-length bit sequences. Another property of addresses that justifies their usage in the Internet is centralized control over their spatial distribution. The presence of names and addresses implies a two-stage destination resolution process: firstly the name is mapped to a destination address using a name resolution service, and then the address is mapped to the forwarding path using a routing protocol. In the general class of randomly deployed, large-scale wireless sensor networks, however, control over global address distribution and the subsequent centralized topological organization of addresses is a rather infeasible task. In this case the Internet's purely address-based routing approach appears redundant for WSNs. This also limits the usability of ports of the address-centric MANET routing protocols, as an additional bandwidth- and energy-consuming directory service is required. TinyLUNAR, on the contrary, follows an alternative communication model, routing by name, which appeared in the early days of the Internet [4] and has recently been revived and proven successful in the context of address self-configuration in MANETs [7].

6.2 Future Development
In general, the active message communication paradigm is a very powerful tool which increases the intelligence of the network. We conjecture that using this mechanism only for internal multiplexing between communication components, as currently implemented in TinyOS, is suboptimal. The selectors used in tinyLUNAR play a twofold role: firstly, they are used as forwarding labels; secondly, inside the end nodes they also indicate the application to which the packet must be multiplexed. Thus, having multiple functional roles, tinyLUNAR selectors semantically better reflect the meaning of active message tags. In our future work we intend to move the tinyLUNAR functionality as close to the MAC layer as possible. In the case of TinyOS this would require modification of the ActiveMessage component.
The current implementation of tinyLUNAR includes a limited set of route request and reply actions. In particular, the RREQ propagation handler based on geographic coordinates, support for in-network processing, and the building of multicast and convergecast trees remain to be implemented. We leave these issues for future work. We also consider inserting some connection-oriented functionality into tinyLUNAR by encoding a limited number of ports in the selector field. By this we intend to further reduce the communication overhead for a selected class of connection-oriented applications.
7 Conclusions
In this paper we presented tinyLUNAR, a reactive routing protocol for wireless sensor networks. TinyLUNAR features the simplicity of its predecessor, originally developed for mobile ad hoc networks. We showed that multihop forwarding in wireless sensor networks is feasible to implement using only a one-byte field of the IEEE 802.15.4 MAC header by adopting the label switching forwarding paradigm. The interfaces exposed by tinyLUNAR to upper-layer programmers allow flexible configuration of the protocol's behavior. One distinct feature which makes our protocol unique is its ability to build routes to parametrically specified destinations. With this property tinyLUNAR is capable of establishing routes for both data-centric and address-centric communications.
On the Optimality and the Stability of Backoff Protocols

Andrey Lukyanenko
University of Kuopio
[email protected]
Abstract. In this paper, we analyze backoff protocols, such as the one used in Ethernet. We examine a general backoff function (GBF) rather than just the binary exponential backoff (BEB) used by Ethernet. Under some mild assumptions we find stability and optimality conditions for a wide class of backoff protocols with a GBF. In particular, it is proved that the maximal throughput rate over the class of backoff protocols with $N$ stations is $\left(1-\frac{1}{N}\right)^{N-1}$ and the optimal average service time for any station is $ES = N/\left(1-\frac{1}{N}\right)^{N-1}$, or about $Ne$ for large $N$. The reasons for the instability of the BEB protocol (for a large enough input rate) are explained.

Keywords: Ethernet, backoff protocol, contention resolution, stability, optimality, queueing theory.
1 Introduction
Ethernet was developed in 1973 by Bob Metcalfe and David Boggs at the Xerox Palo Alto Research Center. Nowadays, it is the most popular local area network technology due to its ease of maintenance and low cost. The principle of Ethernet is that all stations are connected to the same shared medium through transceivers. Whenever a single station wants to send a message, it simply broadcasts it to the medium. When at some moment of time there are two or more messages in the medium, they interfere, and none of them can be received correctly by any station. To deal with such collisions, a resolution protocol was developed (see [7]). It has the following mechanisms:
1. Carrier detection. This mechanism lets stations know when the network is carrying a message. If a station senses that there is a phase-encoded (continuous) signal in the network, it defers its own transmission until the channel becomes empty.
2. Interference detection. Each station listens to the channel. While it sends a message, it continuously compares the signal that has just been sent with the signal in the network at the same moment of time. If these signals have different values, the message is considered corrupted. Here we introduce the round trip time: the time during which a signal propagates from one end of the network to the other and back.

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 393–408, 2007.
© Springer-Verlag Berlin Heidelberg 2007
3. Packet error detection. This mechanism uses checksums to detect corrupted messages. Every message with a wrong checksum is discarded.
4. Truncated packet filtering. This mechanism reduces the load on the system when a message is already known to be corrupted (detected during the round trip time) by filtering such messages at the hardware level.
5. Collision consensus enforcement. When a station sees that its message is corrupted, it jams the whole network by sending a special "jam" signal. This mechanism ensures that no station will consider the corrupted message in the network to be a good one.
Due to these mechanisms, a message is not sent when it is known that there is information in the medium, and if a collision happens, it can be clearly sensed. But there is still a possibility that one station decides that the medium is empty while another has already started a transmission; there will then be interference at some point of the network. A probabilistic protocol helps us avoid this problem. Our work examines a general type of such probabilistic protocol.

1.1 Protocol
Let there be N stations, and let every station have a queue of messages to send. These stations are connected to a shared medium, where collisions may happen from time to time. To deal with such collisions, the backoff protocol from the Aloha network was adopted (see [2,3]). If a collision occurs in the backoff protocol, the next retransmission is attempted in one of the next W moments, where W is a time window of a certain size and the retransmission moment within the window is chosen uniformly. A time slot (or just a slot) is a time equal to the round trip time. We can vary this time in our model in order to bring the model closer to the real protocol, where there is no time synchronization. If a collision does not happen during the first time slot (for a message that requires more than one time slot to be transmitted), it will not happen at all, due to the carrier detection mechanism. Due to this behavior, we can consider that the transmission of one message takes only one time slot in our model. The segment of time slots [1 . . . W] is called a contention window. The idea behind the contention window is that we select a time slot for transmission uniformly within it. The main goal of this principle is to reduce the load on the network, and hence to increase the probability of successfully resending a message during one of the next time slots within the contention window. Backoff protocols are acknowledgement-based protocols. This means that they store information about their own transmissions, namely the number of consecutive unsuccessful transmissions up to the present moment. It is called a backoff counter, and denoted by $b_i$ for station $i$. At first, the counter is equal to 0, and it increases after every unsuccessful attempt. The counter returns back to 0 after a successful transmission, and the message is removed from the top of the corresponding queue.
The counter is not increased endlessly; at some moment it is stopped, we decide that we cannot successfully send the current message, and we discard it. In Ethernet the upper bound for the backoff counter is 16.
In general, in any backoff protocol, the contention window changes with the change of the backoff counter. The probability of sending at a time slot of the contention window ($W_{b_i}$ for station $i$) is a function of the backoff counter ($b_i$ for station $i$), and we call this probability the backoff function. We consider $f(b_i)$ as a probability, but not necessarily $\sum_{b_i} f(b_i) = 1$. At any moment of time the probability $f(b_i)$ defines a uniform distribution for the next attempt to transmit when the backoff counter $b_i$ is known. Since $f(b_i) \le 1$, we can set the contention window size as $W_i = f^{-1}(i)W_0 \ge 1$, where $W_0$ is the minimal contention window size and $f^{-1}(i) \overset{def}{=} \frac{1}{f(i)}$. For $W_0$ we use the value 1 by default, unless stated otherwise. Note that the function $f^{-1}(i)$ does not necessarily give integer values; for that case we define below a slightly modified uniform distribution more precisely.

We need to retransmit a message after every collision, or discard it. First of all, we increase the backoff counter, which represents the load of the system. If we know the backoff function for this counter, we can determine the contention window size. Then we take a random value from the contention window, representing a delay measured in slots. This is the time that must elapse before the next transmission attempt. We call this random value the backoff time; it is uniformly distributed over the contention window. As an example, the backoff protocol in Ethernet is called BEB, with $f(b_i) = 2^{-b_i}$ for $b_i \le 10$ and $f^{-1}(b_i) = 1024$ for $b_i > 10$. As we mentioned before, after $M = 16$ attempts we discard the packet.
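As an illustration of the schedule just described, here is a minimal sketch (Python; the helper names and the default $W_0 = 1$ are illustrative choices, not part of the protocol specification):

```python
import random

# Sketch of the Ethernet BEB schedule described above:
# f(b) = 2**-b for b <= 10, window frozen at 1024*W0 slots for b > 10,
# and the packet is discarded after M = 16 attempts.
W0 = 1    # minimal contention window size (the text's default)
M = 16    # retry limit

def f_inv(b):
    """Reciprocal backoff function f^{-1}(b) = 1/f(b)."""
    return 2 ** min(b, 10)

def backoff_delay(b, rng=random):
    """Backoff time: a uniform slot in the window {1, ..., f^{-1}(b)*W0}."""
    return rng.randint(1, f_inv(b) * W0)

windows = [f_inv(b) * W0 for b in range(M)]  # 1, 2, 4, ..., 1024, 1024, ...
```

After each collision the counter b would be incremented (up to M, after which the message is dropped), and the next attempt scheduled `backoff_delay(b)` slots ahead.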
1.2 Related Work
The BEB protocol has over 30 years of research history. The results obtained appear to contradict each other: some authors say that the protocol is stable, some say it is not. The result of an analysis depends greatly on the mathematical model used, i.e., on how the Ethernet protocol is approximated mathematically. Here we mention some of the most interesting research outcomes. Kelly [8] (later with MacPhee [9]) showed that the BEB protocol is unstable for $\lambda > 0.693$ and strongly stable for $\lambda < 0.567$ in the infinite model ($N = \infty$). The same author also notes that "the expected number of successful transmissions is finite for any acknowledgement based scheme with slower than exponential backoff". Then Aldous [1], with almost the same model, found that all acknowledgement-based protocols are unstable; he used a queue-free infinite model. However, later Håstad et al. [6] discovered that the finite model (i.e., the model with a finite number of stations) with queues is stable for polynomial backoff protocols, while the BEB protocol is unstable for
$$\lambda \ge \lambda_0 + \frac{1}{4N-2}, \quad \text{with } \lambda_0 \approx 0.567.$$
Additionally, the definition of stability in [6] differs from that of the first authors. They define stability as the finiteness of the expected time to return to zero (which is also
called positive recurrence [6]) and the finiteness of the expected average total queue size, while the first two authors talk about stability in terms of the throughput rate. Several other results about the stability of Ethernet can be found in the literature, see [10,11,12,13,14]. However, we are mostly interested in the works of Bianchi [5] and Kwak et al. [4]; the models proposed by these authors seem to be the most reasonable. In [5] the throughput rate of a wireless local area network is analyzed for a protocol which is close to the Ethernet protocol. A similar model was considered in [4], where some results on the expected delay time and throughput are obtained for an exponential backoff protocol of the Ethernet type with a general factor $r$ (in the BEB protocol the factor $r$ equals 2).
2 Analysis
Our analysis is based on the work of Kwak et al. [4] and Bianchi [5]. We use their model and now present some assumptions adopted from [4,5].

2.1 Model with Unbounded Backoff Counter
We make the following assumptions:
– Our first assumption is that the system is in a steady state. This is reasonable for a large enough number of stations N: we assume that a large number of stations makes the network operate uniformly. In other words, if the number of stations is large, a single station does not have a great effect on the performance of the system. On the other hand, for a small number of stations this assumption may be far from reality (it is possible, for example, that one station captures the channel, which is called the capture effect). By this assumption, at any moment of time we have the same probability $p_c$ of collision for a message sent to the medium.
– The second assumption is that all stations are identical, so the performance of every station is the same.
– The third assumption is that the model is under the saturation condition; hence, there are always messages waiting in the input. Without the saturation assumption the system might show "better" results, but this assumption lets us understand the worst case.
– The last assumption is that time is divided into time slots of equal length. During every time slot we can send one message, and the propagation time of the message is assumed to be equal to the time slot. Every message is synchronized to the time slot bounds. We know that if a collision has not happened during the first time slot of some large message (large meaning that its transmission duration is longer than the time slot duration), then it will not happen in the remaining time slots of this message with high probability.
When a new packet is to be sent for the first time, the (initial) contention window size is $W_0$; the time within which the transmission
Fig. 1. State model (states $i = 0, 1, 2, \dots$; each collision, with probability $p_c$, advances the station to the next state, and a success, with probability $1 - p_c$, returns it to state 0)
will be tried is delayed by an amount of time which is uniformly distributed over the set $\{1, \dots, W_0\}$. Every time we have a collision we increase this delay set according to a backoff function $f(i)$, where $0 < f(i) < 1$ for $i > 0$ and we assume $f(0) = 1$. After the $i$-th collision the delay is distributed over $\{1, \dots, f^{-1}(i)W_0\}$. The initial value $W_0$ can be interpreted as the multiplier of the function $f^{-1}(i)$ (we always treat it as a multiplier). After a successful transmission the delay is again distributed over $\{1, \dots, W_0\}$.

In our model, the backoff counter specifies the state of each station. Let $D_i$ be the time of staying in state $i$, called the delay; thus we have the following formula for $D_i$:
$$Pr\{D_i = k\} = \frac{1}{X_i} - \frac{Y_i}{X_i(X_i+1)} = \frac{X_i + 1 - Y_i}{X_i(X_i+1)}, \quad k = 1, \dots, X_i, \qquad Pr\{D_i = X_i + 1\} = \frac{Y_i}{X_i+1}, \qquad (1)$$
where $X_i = \lfloor f^{-1}(i)W_0 \rfloor$ and $Y_i = f^{-1}(i)W_0 - X_i$. The construction above helps to deal with a continuous backoff function, and it is applicable if
$$f^{-1}(i)W_0 \ge 1 \quad \text{for all } i. \qquad (2)$$
If $f^{-1}(i)W_0$ is an integer, then (1) reduces to the uniform distribution
$$Pr\{D_i = k\} = \frac{1}{f^{-1}(i)W_0}, \quad k = 1, \dots, f^{-1}(i)W_0.$$
Definition (1) is almost the same as in [4]; now $X_i$ and $Y_i$ are the integer and fractional parts not only of $r^i W_0$ but of $f^{-1}(i)W_0$ in general (in [4], $f(i) = \frac{1}{r^i}$). Now we know how long we stay in state $i$. The next thing we should do is to find the probability $P_i$ of succeeding in state $i$. The state model remains the same as in [4] (see Figure 1); hence the probability $P_i$ is
$$P_i = (1 - p_c)p_c^i, \qquad (3)$$
where the collision probability $p_c$ is determined below.

2.2 System Load
Let $ED_i$ be the expected delay for state $i$. It then follows from (1) that
$$ED_i = \frac{W_i + 1}{2}, \qquad (4)$$
where $W_i = f^{-1}(i)W_0$. We know that we enter state $i$ with probability $P_i$ and stay in $i$ for $ED_i$ time on average. Thus, we can find the probability $\gamma_i$ of being in state $i$ at any instant. It corresponds to the fraction of time that the system spends in this state in the steady-state model:
$$\gamma_i = \frac{ED_i P_i}{\sum_{j=0}^{\infty} ED_j P_j} = \frac{(W_i+1)(1-p_c)p_c^i}{\sum_{j=0}^{\infty}(W_j+1)(1-p_c)p_c^j} = \frac{(W_i+1)(1-p_c)p_c^i}{W_0(1-p_c)\sum_{j=0}^{\infty} f^{-1}(j)p_c^j + 1}. \qquad (5)$$
In general we cannot find the exact value of $\sum_{j=0}^{\infty} f^{-1}(j)p_c^j$, and furthermore we cannot expect that this series even converges for $0 < p_c < 1$. Let us define a new function
$$F(z) \overset{def}{=} \sum_{j=0}^{\infty} f^{-1}(j)z^j. \qquad (6)$$
Denote by $\xi = \xi(p_c)$ the (random) number of successive collisions before the successful transfer; then
$$E f^{-1}(\xi) = \sum_{i=0}^{\infty} f^{-1}(i)P\{\xi = i\} = (1-p_c)F(p_c).$$
Note that we cannot consider $F(p_c)$ as a generating function, because of the dependence between $p_c$ and the set of backoff functions $\{f(i), i \ge 0\}$. Substituting (6) into (5), we obtain a compact form of the equation for $\gamma_i$:
$$\gamma_i = \frac{(W_i+1)(1-p_c)p_c^i}{W_0(1-p_c)F(p_c)+1}. \qquad (7)$$
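The fractional-window distribution (1) and the expectation (4) above can be sanity-checked numerically. A short sketch (Python; the non-integer window w = 5.7 is an arbitrary illustrative value):

```python
import math

# The distribution (1) must sum to 1 and have mean (W_i + 1)/2,
# where W_i = f^{-1}(i) * W_0 may be non-integer.
def delay_pmf(w):
    """Return {k: Pr{D = k}} per (1) for a window w >= 1."""
    X = math.floor(w)   # integer part of the window
    Y = w - X           # fractional part
    pmf = {k: (X + 1 - Y) / (X * (X + 1)) for k in range(1, X + 1)}
    pmf[X + 1] = Y / (X + 1)
    return pmf

w = 5.7
pmf = delay_pmf(w)
total = sum(pmf.values())                   # should be 1
mean = sum(k * p for k, p in pmf.items())   # should be (w + 1)/2, as in (4)
```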
It follows from [4] that the probability of being in state $i$ with the backoff timer equal to zero (the station is transmitting in state $i$) is exactly $\frac{\gamma_i}{ED_i}$; hence the transmission probability $p_t$ at any instant is (see (4))
$$p_t = \sum_{i=0}^{\infty} \frac{\gamma_i}{ED_i} = \sum_{i=0}^{\infty} \frac{2(1-p_c)p_c^i}{W_0(1-p_c)F(p_c)+1}.$$
This immediately implies
$$p_t = \frac{2}{W_0(1-p_c)F(p_c)+1}. \qquad (8)$$
Another dependence between $p_t$ and $p_c$ can be taken from [5]:
$$p_c = P\{\text{collision}\} = 1 - P\{\text{no transmissions from the other } N-1 \text{ stations}\} = 1 - (1-p_t)^{N-1}. \qquad (9)$$
So we obtain another equation connecting $p_t$ and $p_c$:
$$p_t = 1 - (1-p_c)^{\frac{1}{N-1}}. \qquad (10)$$
Combining (8) and (10) implies
$$\frac{2}{W_0(1-p_c)F(p_c)+1} = 1 - (1-p_c)^{\frac{1}{N-1}}. \qquad (11)$$
Note that the right-hand side of (11) is 0 when $p_c = 0$, is 1 when $p_c = 1$, and monotonically increases with $p_c$. Let
$$G(p_c) \overset{def}{=} \frac{2}{W_0(1-p_c)F(p_c)+1}. \qquad (12)$$
Putting $p_c = 0$ in (12) and taking into account (2), we obtain
$$0 < G(0) = \frac{2}{W_0 f^{-1}(0)+1} = \frac{2}{W_0+1} \le 1.$$
To have a unique solution $p_c$ of (11), the monotone decrease of the function $G(p_c)$ (for $0 < p_c < 1$) is sufficient. To check this, we calculate the derivative:
$$G'(p_c) = \left(\frac{2}{W_0(1-p_c)F(p_c)+1}\right)'_{p_c} = -\frac{2}{\left(W_0(1-p_c)F(p_c)+1\right)^2}\left(-W_0 F(p_c) + W_0(1-p_c)F'_{p_c}(p_c)\right).$$
As we can see, only the rightmost parenthesized part of the equation above determines the sign of the derivative $G'(p_c)$. Recall that $F(p_c) \overset{def}{=} \sum_{j=0}^{\infty} f^{-1}(j)p_c^j$. Thus we have
$$W_0\left[-F(p_c) + (1-p_c)F'(p_c)\right] = W_0\left[-\sum_{j=0}^{\infty} f^{-1}(j)p_c^j + (1-p_c)\sum_{j=0}^{\infty}(j+1)f^{-1}(j+1)p_c^j\right]$$
$$= W_0\left[-\sum_{j=0}^{\infty} f^{-1}(j)p_c^j + \sum_{j=0}^{\infty}(j+1)f^{-1}(j+1)p_c^j - \sum_{j=0}^{\infty}(j+1)f^{-1}(j+1)p_c^{j+1}\right]$$
$$= W_0\left[-\sum_{j=0}^{\infty} f^{-1}(j)p_c^j + \sum_{j=0}^{\infty}(j+1)f^{-1}(j+1)p_c^j - \sum_{j=0}^{\infty} j f^{-1}(j)p_c^j\right]$$
$$= W_0 \sum_{j=0}^{\infty}(j+1)\left(f^{-1}(j+1) - f^{-1}(j)\right)p_c^j.$$
Fig. 2. Intersection points for the equation $F(x) = L(x)$, where $F(x)$ is shown for particular cases: $F_Q(x) = \frac{1+x}{(1-x)^3}$ for the quadratic polynomial backoff function, $F_L(x) = \frac{1}{(1-x)^2}$ for the linear function, and $F_E(x) = \frac{1}{1-2x}$ for the BEB protocol.

From the last equations we see that the condition $f^{-1}(i+1) \ge f^{-1}(i)$ for every $i$ is enough to make $G$ non-increasing. Hence, if in addition $f^{-1}(k+1) > f^{-1}(k)$ for at least one $k$, then we have only one intersection. Note that if $W_0 > 1$ then the condition $f^{-1}(i+1) \ge f^{-1}(i)$ alone is needed. This is a sufficient condition for a unique solution $p_c$ satisfying (11). Note that if $f(i) = d$ for all $i$ (the Aloha protocol), then the function $G$ is a horizontal line. Now we solve equation (11) for $F(p_c)$:
$$F(p_c) = \frac{1 + (1-p_c)^{\frac{1}{N-1}}}{W_0(1-p_c)\left(1 - (1-p_c)^{\frac{1}{N-1}}\right)}. \qquad (13)$$
For $z \in (0,1)$, introduce the function
$$L(z) \overset{def}{=} \frac{1 + (1-z)^{\frac{1}{N-1}}}{W_0(1-z)\left(1 - (1-z)^{\frac{1}{N-1}}\right)}. \qquad (14)$$
Thus, the solution of the equation $F(p_c) = L(p_c)$ gives us the value of $p_c$ (see Figure 2). We can see that the faster we increase the resolution window, the smaller the probability of collision. By these graphs, the BEB protocol seems better than, for example, a polynomial one, since the contention window of exponential backoff grows faster than that of polynomial backoff. Later we will show that with a small number of collisions (a big contention window) the channel becomes more and more loaded (more packets wait in the queue, which is called instability in [6]).
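The intersection can also be located numerically by bisection on $F(z) - L(z)$; a sketch for the linear backoff function $F_L(x) = 1/(1-x)^2$ of Figure 2 ($N = 50$ and $W_0 = 1$ are illustrative choices):

```python
# Solve F(p_c) = L(p_c) from (13)/(14) for the linear backoff function.
N, W0 = 50, 1

def F_linear(x):
    """F_L(x) = sum_{j>=0} (j+1) x^j = 1/(1-x)^2."""
    return 1.0 / (1.0 - x) ** 2

def L(z):
    u = (1.0 - z) ** (1.0 / (N - 1))
    return (1.0 + u) / (W0 * (1.0 - z) * (1.0 - u))

def solve_pc(F, lo=1e-9, hi=1.0 - 1e-9):
    """Bisection: F - L is negative near 0 and positive near 1."""
    g = lambda z: F(z) - L(z)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

pc = solve_pc(F_linear)
```

Uniqueness of the root found this way is guaranteed by the monotonicity argument above, since $f^{-1}(j+1) = j + 2 > j + 1 = f^{-1}(j)$ here.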
2.3 Expected Transmission Time
Next, we should find the average time for a message to leave the system successfully. In this model we have no other choice, as we do not discard messages. For this reason we introduce a new random variable. Let $N_R$ be the random variable giving the state at which a node transmits successfully; in other words, it is the number of collisions before a successful transmission. Obviously, the probability of transmitting successfully exactly in the $i$-th state is
$$P_i = P(N_R = i) = (1-p_c)p_c^i, \quad i \ge 0. \qquad (15)$$
Hence, the average number of collisions before transmission is
$$E[N_R] = \sum_{i=0}^{\infty} i P_i = \frac{p_c}{1-p_c}.$$
Let $S$ be the service time of a message, that is, the time from the first attempt to transmit until the instant when the message leaves the system (in other words, the complete transmission time of a message). Now we can compute the average service time $ES$ of a message at the top of the queue. Because the variables $N_R$ and $D_i$ are independent, we can use the known property of conditional expectation to obtain
$$ES = E\left[\sum_{i=0}^{N_R} D_i\right] = E_{N_R}\left[E\left[\sum_{i=0}^{N_R} D_i \,\Big|\, N_R\right]\right] = E\left[\sum_{i=0}^{N_R} \frac{W_i+1}{2}\right] = \frac{W_0}{2} E\left[\sum_{i=0}^{N_R} f^{-1}(i)\right] + \frac{E[N_R]+1}{2}. \qquad (16)$$
The first term in the sum is
$$E\left[\sum_{i=0}^{N_R} f^{-1}(i)\right] = E\left[\sum_{i=0}^{\infty} f^{-1}(i)\,\mathbb{1}\{N_R \ge i\}\right] = \sum_{i=0}^{\infty} f^{-1}(i)P(N_R \ge i),$$
where $\mathbb{1}(\cdot)$ is the indicator function. Recall that $P(N_R \ge i) = p_c^i$. Then (16) becomes
$$ES = \frac{1}{2}\left(W_0 \sum_{i=0}^{\infty} f^{-1}(i)p_c^i + \frac{1}{1-p_c}\right).$$
Thus, we have finally
$$ES = \frac{1}{2}\left(W_0 F(p_c) + \frac{1}{1-p_c}\right). \qquad (17)$$
Now we insert (13) into (17) and obtain the following expected delay of the message at the top of the queue:
$$ES = \frac{1}{(1-p_c)\left(1-(1-p_c)^{\frac{1}{N-1}}\right)}. \qquad (18)$$
By easy algebra, (18) attains its minimum at
$$p_c^* = 1 - \left(1 - \frac{1}{N}\right)^{N-1}. \qquad (19)$$
Recall the well-known limit
$$\left(1 - \frac{1}{N}\right)^{N} \xrightarrow{N \to \infty} e^{-1}. \qquad (20)$$
Hence,
$$p_c^* = 1 - \left(1 - \frac{1}{N}\right)^{N-1} \xrightarrow{N \to \infty} 1 - e^{-1}, \qquad (21)$$
and this simple expression gives us the optimal collision probability $p_c^*$ as the number of stations $N$ tends to infinity.

2.4 Stability Condition
Let $\lambda$ be the incoming rate for the system. We assume that an incoming message uniformly chooses a station in the network; hence, the incoming rate for a single station is $\frac{\lambda}{N}$. The condition that the queue of the station shrinks on average over time is $ES < \frac{N}{\lambda}$. This gives the following stability condition for the protocol:
$$\lambda < N(1-p_c)\left(1-(1-p_c)^{\frac{1}{N-1}}\right). \qquad (22)$$

2.5 Optimality Condition
Now we can clearly state what conditions should be met to obtain the best possible protocol (over the class of backoff protocols). The optimal value (19) of the collision probability, $p_c^* = 1 - \left(1-\frac{1}{N}\right)^{N-1}$, tends to $1-e^{-1}$ as the number of stations $N$ tends to infinity. Also, for this (optimal) value the maximal attainable throughput of the system is
$$\lambda^* = \sup\left\{\lambda : \lambda < \left(1-\frac{1}{N}\right)^{N-1}\right\}, \qquad (23)$$
which tends to $e^{-1}$ as $N$ tends to infinity. It then follows from (18) and (19) that the optimal point $p_c = p_c^*$ gives the following minimal average service time:
$$ES = \frac{N}{\left(1-\frac{1}{N}\right)^{N-1}}. \qquad (24)$$
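Equations (18), (19) and (24) can be cross-checked numerically; a sketch (Python; N = 100 is an illustrative choice):

```python
import math

# ES(p_c) from (18) should attain its minimum at p_c* from (19), with
# minimal value N/(1 - 1/N)^(N-1) from (24); p_c* approaches 1 - 1/e.
N = 100

def ES(pc):
    u = (1.0 - pc) ** (1.0 / (N - 1))
    return 1.0 / ((1.0 - pc) * (1.0 - u))

pc_star = 1.0 - (1.0 - 1.0 / N) ** (N - 1)      # (19)
es_star = N / (1.0 - 1.0 / N) ** (N - 1)        # (24)

pc_min = min((i / 10000.0 for i in range(1, 10000)), key=ES)  # grid search
limit_gap = abs(pc_star - (1.0 - math.exp(-1.0)))             # (21)
```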
The expectation (24) tends to infinity as $N$ tends to infinity. Note that for an individual station this tendency means instability if we have an infinite number of stations. These results agree with previous results of other authors on the stability of the infinite model. In spite of the tendency to infinity for individual stations, the whole network has a finite expected service time when $N \to \infty$. For us, it is most significant to know the service time of every station for some fixed finite parameter $N$ (the number of stations), by which we can tune a real network for the best performance. Now we see from (13) that, to achieve the optimal point, the backoff parameters must satisfy the following condition:
$$F(p_c^*) = \frac{2N-1}{W_0\left(1-\frac{1}{N}\right)^{N-1}}. \qquad (25)$$

Fig. 3. State model without saturation condition (a new state $i = -1$, which is left with probability 1, precedes state 0)
It means that if we can find such a protocol (i.e., a set of backoff functions $\{f^{-1}(i)\}$) that (25) holds, then the protocol is the best over the class of backoff protocols in the sense of minimizing the average transmission time of a message.

2.6 Eliminating the Saturation Condition
We extend our analysis by omitting the saturation condition. To do so, we change the model as follows (see Figure 3). In the new model almost everything remains the same, except that we add a new state $-1$ representing incoming messages. We assume that there is always some incoming rate ($\lambda > 0$); hence messages keep arriving at the incoming queue, and the only difference is that we may need to wait for them for a separate delay time. Let this time be the following random variable:
$$D_{-1} \overset{def}{=} \begin{cases} 0, & \text{if } q > 0, \\ \tau, & \text{if } q = 0, \end{cases}$$
where $\tau$ is a random variable representing the time between incoming messages ($E[\tau] = \frac{1}{\lambda}$) and $q$ is the number of waiting messages in the queue. We can write the time delay for the incoming queue as $D_{-1} = \tau \cdot \mathbb{1}\{q = 0\}$. As we can see, there is a tight dependence between $\tau$ and the queue size $q$. The technique of negative drift can help us analyze the stability of this model. This technique says that if, outside some finite set, the expected change of the queue size has a negative tendency, then the system is positive recurrent (see [15]). (Another condition is also required, to exclude infinite jumps.) To use this technique we define a random process $X(t) = \{q(t), b(t), l(t)\}$ for some
station, where $q(t)$ is the number of waiting messages in the queue, $b(t)$ is the value of the backoff counter, and $l(t)$ is the number of time slots remaining until the end of the current contention window at moment $t$ (note that $b(t) = 0$ or $l(t) = 0$ if $q(t) = 0$ at that moment). For us, recurrence is enough for the system to be stable. Let us define the finite set outside which we need negative drift as the set of states in which the station has at least one waiting message. Under this condition the station becomes saturated, and our model becomes identical to the already studied saturation model. The condition of negative drift is hence identical to inequality (22).

Fig. 4. State model with bounded counter (after a collision in state $M$ the message is discarded and the station returns to state 0 with probability 1)

2.7 Model with Bounded Backoff Counter
We can easily extend the previous results to the model with an upper bound on the backoff counter (see Figure 4). In this model, if the backoff counter exceeds some value $M$, the message is discarded and we take a new one from the queue. The probability of discarding a message is $P\{\text{discard}\} = p_c^{M+1}$. Now we recalculate the values for the new model; some of them do not depend on the number of states and hence remain the same. One of the unchanged values is the delay time $D_i$. The probability of entering state $i$ has to be modified; now we find it as
$$P_i = \frac{(1-p_c)p_c^i}{1-p_c^{M+1}}. \qquad (26)$$
Hence the probability of being in state $i$ at any moment of time is
$$\gamma_i = \frac{P_i ED_i}{\sum_{j=0}^{M} P_j ED_j} = \frac{(W_i+1)(1-p_c)p_c^i}{\sum_{j=0}^{M}(W_j+1)(1-p_c)p_c^j} = \frac{(W_i+1)(1-p_c)p_c^i}{W_0(1-p_c)\sum_{j=0}^{M} f^{-1}(j)p_c^j + 1 - p_c^{M+1}}. \qquad (27)$$
Using the same arguments we find that
$$p_t = \sum_{i=0}^{M} \frac{\gamma_i}{ED_i} = \sum_{i=0}^{M} \frac{2(1-p_c)p_c^i}{W_0(1-p_c)F_M(p_c) + 1 - p_c^{M+1}} = \frac{2(1-p_c^{M+1})}{W_0(1-p_c)F_M(p_c) + 1 - p_c^{M+1}}, \qquad (28)$$
where $F_M(p_c) = \sum_{i=0}^{M} f^{-1}(i)p_c^i$. Note that equation (10) is independent of the upper bound on the backoff counter; hence we can use it here. Combining (28) and (10), we obtain the solution for $F_M(p_c)$:
$$F_M(p_c) = \frac{\left(1-p_c^{M+1}\right)\left(1+(1-p_c)^{\frac{1}{N-1}}\right)}{W_0(1-p_c)\left(1-(1-p_c)^{\frac{1}{N-1}}\right)}. \qquad (29)$$
Applying almost the same formula for the service time (now we have a finite sum instead of an infinite one), we have
$$ES = E\left[\sum_{i=0}^{\min\{M,N_R\}} D_i\right] = E_{N_R}\left[E\left[\sum_{i=0}^{\min\{M,N_R\}} D_i \,\Big|\, N_R\right]\right] = E\left[\sum_{i=0}^{\min\{M,N_R\}} \frac{W_i+1}{2}\right] = \frac{W_0}{2} E\left[\sum_{i=0}^{\min\{M,N_R\}} f^{-1}(i)\right] + \frac{E[\min\{M,N_R\}]+1}{2}, \qquad (30)$$
where a similar computation for the last component is possible by virtue of the same random variable $N_R$. For the first term,
$$E\left[\sum_{i=0}^{\min\{M,N_R\}} f^{-1}(i)\right] = E\left[\sum_{i=0}^{M} f^{-1}(i)\,\mathbb{1}\{N_R \ge i\}\right] = \sum_{i=0}^{M} f^{-1}(i)P(N_R \ge i) = \sum_{i=0}^{M} f^{-1}(i)p_c^i = F_M(p_c).$$
Combining the last equations (and using $E[\min\{M,N_R\}]+1 = \sum_{i=0}^{M} p_c^i = \frac{1-p_c^{M+1}}{1-p_c}$), we have
$$ES = \frac{1}{2}\left(W_0 F_M(p_c) + \frac{1-p_c^{M+1}}{1-p_c}\right). \qquad (31)$$
But we have already found $F_M(p_c)$ in (29); substituting it into (31) gives
$$ES = \frac{1-p_c^{M+1}}{(1-p_c)\left(1-(1-p_c)^{\frac{1}{N-1}}\right)}, \qquad (32)$$
which is equal to (18) when $M = \infty$. The negative drift condition for the bounded model becomes
$$\lambda < \frac{N(1-p_c)\left(1-(1-p_c)^{\frac{1}{N-1}}\right)}{1-p_c^{M+1}}. \qquad (33)$$
When $M = \infty$, (33) gives (22).
Fig. 5. Collision probability for Ethernet, where $L(x)$ is plotted for 11, 51, 101, 501 and 1001 stations; $F_E^{16}(x)$ is the function for Ethernet, and $F_{2.1}^{16}$, $F_{2.4}^{16}$ are exponential backoff functions (with parameter $a = 2.1$ and $a = 2.4$, respectively) with at most 16 attempts to transmit (as for Ethernet)
3 Application to the Ethernet Case
Now we apply these results to the real Ethernet protocol (see Section 1.1). In addition, we present two exponential protocols that seem to show better performance in the mathematical model. We probe these protocols in the cases of 11, 51, 101, 501 and 1001 stations. At the end we give an outline of the estimated performance for these cases. In the introduction we said that Ethernet is a bounded BEB protocol with $M = 16$. Hence, for the Ethernet policy we have
$$F_E^{16}(p_c) = \sum_{i=0}^{10} 2^i p_c^i + \sum_{i=11}^{16} 2^{10} p_c^i = \frac{1-(2p_c)^{11}}{1-2p_c} + 2^{10} p_c^{11}\,\frac{1-p_c^6}{1-p_c}.$$
Additionally, we consider another (exponential) set of backoff functions:
$$F_a^M(x) = \sum_{i=0}^{M} a^i x^i = \frac{1-(ax)^{M+1}}{1-ax}.$$
In particular, we are interested in the functions $F_a^{16}(x)$ (specifically $F_{2.1}^{16}(x)$ and $F_{2.4}^{16}(x)$); see Figure 5 for the behavior of these functions. In the table below we give comparative data on the behavior of the network depending on the protocol and the number of stations. The leftmost column shows the number of stations, the next column shows the point of intersection $p_c$, the following column shows the normalized average service time $ES/N$, and the last column shows the probability of discarding. Every cell has three numbers, separated by '/'; these are, correspondingly, the data for $F_{2.4}^{16}(x)$, $F_{2.1}^{16}(x)$ and $F_E^{16}(x)$.
N    | pc                 | ES/N               | P{discard}
11   | 0.48 / 0.54 / 0.62 | 2.77 / 2.64 / 2.59 | 3·10^-6 / 3·10^-5 / 3·10^-4
51   | 0.54 / 0.62 / 0.74 | 2.76 / 2.69 / 2.83 | 3·10^-5 / 3·10^-4 / 6·10^-3
101  | 0.57 / 0.65 / 0.80 | 2.74 / 2.71 / 3.02 | 6·10^-5 / 7·10^-4 / 0.022
501  | 0.64 / 0.73 / 0.94 | 2.72 / 2.82 / 3.86 | 5·10^-4 / 5·10^-3 / 0.349
1001 | 0.67 / 0.77 / 0.99 | 2.73 / 2.93 / 3.52 | 1.2·10^-3 / 0.012 / 0.809
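The table entries can be reproduced by solving the bounded fixed-point equation $F_M(p_c) = L_M(p_c)$, i.e. (28) combined with (10), and then evaluating (32) and $P\{\text{discard}\} = p_c^{M+1}$. A sketch for the Ethernet column with N = 101 (the computed values should land near the table's 0.80, 3.02 and 0.022, up to rounding):

```python
# Solve the Ethernet case F_E^16 numerically for N = 101 stations.
N, W0, M = 101, 1, 16

def F_E16(x):
    """Truncated BEB series: f^{-1}(i) = 2**min(i, 10), i = 0..16."""
    return sum(2 ** min(i, 10) * x ** i for i in range(M + 1))

def L_M(z):
    """Bounded analogue of L(z), from (29)."""
    u = (1.0 - z) ** (1.0 / (N - 1))
    return (1.0 - z ** (M + 1)) * (1.0 + u) / (W0 * (1.0 - z) * (1.0 - u))

def bisect(g, lo, hi, it=200):
    for _ in range(it):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

pc = bisect(lambda z: F_E16(z) - L_M(z), 1e-9, 1.0 - 1e-9)
u = (1.0 - pc) ** (1.0 / (N - 1))
ES_per_N = (1.0 - pc ** (M + 1)) / ((1.0 - pc) * (1.0 - u)) / N  # (32)
p_discard = pc ** (M + 1)
```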
From the table we derive that the Ethernet protocol is better to use for a small number of stations (around 10), while the protocol $F_{2.1}^{16}(x)$ is better for 51 or 101 stations, and $F_{2.4}^{16}(x)$ is a good choice for 501 or 1001 active stations. The existing Ethernet protocol is best avoided if the number of stations is greater than 500, because the probability that a message will be discarded is then high. In a real network, where some stations may be almost "silent", the number of active stations is much lower than the actual number of stations in the network.
4 Conclusions
We have found stability conditions for the steady-state models of backoff protocols. These conditions were obtained both for the bounded and for the unbounded retry limit models. Consequently, we can analytically measure the throughput of the system and the service time. We have also found optimality conditions. The question of optimality in general is still open, but for our model (unbounded retry limit) we prove that an exponential function is the best choice for the backoff function. In the paper we present graphs that show the correlation between the level function L(x) and the "extended" backoff function F(x). Moreover, the graph in Figure 2 shows why sub-linear and super-exponential functions are not good choices. Finally, we show the "connection" between the stability of bounded backoff and the successful throughput rate (1 − P{discard}). In this paper we present analytical solutions, but some questions remain open: for example, the optimality (for different M) of the general bounded backoff (particularly Ethernet) and the appropriateness of the steady-state assumption. A simulation would help to answer the last question, but at present there are good reasons for supposing that this assumption is appropriate.
Acknowledgment. I would like to thank Prof. Martti Penttonen, who introduced me to the problem arising in Ethernet and helped me a lot with editing this article. I thank Prof. Evsey Morozov for many valuable comments which have improved the presentation of the paper, and for drawing my attention to the principle of negative drift and to the excellent book of Meyn and Tweedie.
408
A. Lukyanenko
References

1. Aldous, D.J.: Ultimate Instability of Exponential Back-Off Protocol for Acknowledgement-Based Transmission Control of Random Access Communication Channels. IEEE Trans. on Information Theory 33(2), 219–223 (1987)
2. Abramson, N.: The ALOHA system – another alternative for computer communications. AFIPS 37, 281–285 (1970)
3. Abramson, N.: Development of the ALOHANET. IEEE Trans. on Inform. Theory 31(2), 119–123 (1985)
4. Kwak, B., Song, N., Miller, L.E.: Performance analysis of exponential backoff. IEEE/ACM Trans. Netw. 13(2), 343–355 (2005)
5. Bianchi, G.: Performance Analysis of the IEEE 802.11 Distributed Coordination Function. IEEE J. on Sel. Areas in Commun. 18(3), 535–547 (2000)
6. Håstad, J., Leighton, T., Rogoff, B.: Analysis of backoff protocols for multiple access channels. SIAM J. Comput. 25(4), 740–774 (1996)
7. Metcalfe, R., Boggs, D.: Ethernet: Distributed Packet Switching for Local Computer Networks
8. Kelly, F.P.: Stochastic models of computer communication systems. J. Roy. Statist. Soc. B 47, 379–395 (1985)
9. Kelly, F.P., MacPhee, I.M.: The number of packets transmitted by collision detect random access schemes. Annals of Prob. 15, 1557–1568 (1987)
10. Goodman, J., Greenberg, A.G., Madras, N., March, P.: Stability of binary exponential backoff. J. of the ACM 35(3), 579–602 (1988)
11. Fayolle, G., Flajolet, P., Hofri, M.: On a functional equation arising in the analysis of a protocol for a multi-access broadcast channel. Adv. Appl. Prob. 18, 441–472 (1986)
12. Rosenkrantz, W.A.: Some theorems on the instability of the exponential back-off protocol. Performance '84, 199–205 (1985)
13. Goldberg, L.A., MacKenzie, P.: Analysis of Practical Backoff Protocols for Contention Resolution with Multiple Servers. J. of Comp. and System Sciences 58, 232–258 (1999)
14. Goldberg, L.A., MacKenzie, P., Paterson, M., Srinivasan, A.: Contention Resolution with Constant Expected Delay. Journal of the ACM 47(6), 1048–1096 (2000)
15. Meyn, S.P., Tweedie, R.L.: Markov Chains and Stochastic Stability. Springer, London (1993)
Maximum Frame Size in Large Layer 2 Networks

Karel Slavicek
CESNET and Masaryk University, Botanicka 68a, 60200 Brno, Czech Republic
[email protected]
Abstract. The Ethernet protocol, originally designed for local area networks, has become very popular and is, in practice, the only protocol used in local and metropolitan networks. Currently both the IETF and the IEEE are working on improvements to this protocol to make it more attractive for use in WAN networks. These modifications may add additional fields to the Ethernet header, so that the effective maximum transportable data unit is decreased. This raises the problem of how to inform upper layers about the maximum data unit that the Ethernet layer can transport. The problem is not addressed in the IETF and IEEE standards under preparation. This paper points out this problem and proposes some possible solutions.
1 Introduction

Ethernet is the grand unifying technology that enables communication via the grand unifying network protocol, IP. Today's applications (based on IP) are passed seamlessly through a complex Ethernet system across carrier networks, enterprise networks, and consumer networks. Since its origin more than 30 years ago, Ethernet has evolved to meet the increasing demands of IP networks. Due to its low implementation cost, relative simplicity, and easy maintenance, Ethernet has grown to the point where almost all IP traffic starts and ends on an Ethernet connection. Ethernet has evolved beyond offering fast and reliable local area networks; it is now used for access, metropolitan, and wide area networks. The next step on this road is carrier-class networks. Let us consider why Ethernet is preparing to colonise carrier backbone networks just now. Ethernet has grown to SONET/SDH speeds and comes rather close to SONET/SDH transport properties. From the experimental Ethernet of around 1976, through the first standardised Ethernet in 1982, to the IEEE standard in 1985, the world of networking protocols was very diverse and it was difficult to predict which protocol would survive. In 1995 Fast Ethernet was standardised, and we can say that it defeated its main data network competitors, Token Ring and 100VG-AnyLAN. However, the carriers' transport protocols, SONET and SDH, at that time offered more bandwidth, better resiliency, and quality of service. It took a rather long road through Gigabit Ethernet (standardised in 1998) to ten-gigabit Ethernet (standardised in 2002). Ten-gigabit Ethernet provides the
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 409–418, 2007. © Springer-Verlag Berlin Heidelberg 2007
410
K. Slavicek
same speed as most carrier backbone networks, and it will probably keep up with the speed of SONET and SDH for the foreseeable future. Up to Gigabit Ethernet it is reasonable to transport Ethernet frames inside SONET/SDH concatenated VC-4 containers. Ten-gigabit Ethernet would consume a whole STM-64 frame or a quarter of an STM-256; the STM-256 is not so widely deployed because of stronger constraints on the fibre optic infrastructure. The next bandwidth step in Ethernet, 100 gigabit, will probably come earlier than the next bandwidth step in SONET/SDH. Almost all of today's data traffic starts and ends on an Ethernet connection, and Ethernet technology is very price-competitive with legacy SONET/SDH technologies. For all the above reasons, the deployment of Ethernet in provider backbone networks is a very straightforward step.
2 Carrier Grade Ethernet

There are more reasons why service providers are thinking about Ethernet. With the development of the Internet and data networking in general, customers appeared who wanted to interconnect their remote branches with a central office and/or among themselves. Such interconnections were implemented as leased layer 2 lines. Network operators usually offered leased lines on top of an SDH or PDH transport network; a typical router at that time offered many serial interfaces of type V.35, X.21, or G.703. As customers wanted more bandwidth, Ethernet was a natural replacement for serial lines. A natural property of Ethernet is the ability to connect a large number of hosts via a uniform layer 2 environment. This is very different from former networks offering a set of point-to-point connections; SONET/SDH networks are not able to simply simulate an Ethernet environment for more than two hosts. Another way service providers may offer interconnection of remote customer facilities is to use layer 3 protocols, typically MPLS, and build a virtually dedicated infrastructure for each customer. A customer's private layer 3 routes are stored in Virtual Route Forwarding (VRF) tables. If the customer requests a layer 2 service, the situation is rather complicated even with MPLS. Originally only point-to-point Ethernet-over-MPLS tunnels were defined; even the first IETF standards [9], [10] describing layer 2 transport over MPLS spoke about pseudo wires. Nevertheless, a protocol for point-to-multipoint layer 2 connections over MPLS has been developed during the past few years. This protocol is called VPLS and is specified in [12] and [13]. The VPLS protocol is rather complicated and uses many layers of encapsulation: the customer's Ethernet frame is encapsulated into an MPLS frame, and the MPLS frame is once more encapsulated into another layer 2 frame. The provider's MPLS network has to emulate an Ethernet switching environment.
MPLS PE routers have to learn the MAC addresses carried over the VPLS network and distribute this information to the other MPLS PE routers serving the given VPLS network. In other words, MPLS PE routers have to maintain the customers' layer 2 forwarding database. The VPLS protocol is therefore implemented only in very expensive and complicated "carrier-grade" networking equipment. On the other hand, legacy carrier protocols (SONET/SDH and PDH) offer some properties which are not yet standardised for Ethernet. There are two mostly independent working groups standardising these properties for Ethernet.
One of these groups is IEEE 802.1. The IEEE starts from the original Ethernet specification and tries to address network operators' needs without employing layer 3 protocols. The second is the IETF TRILL group. The IETF has extensive experience and expertise in the Internet protocol; it places more emphasis on layer 2 / layer 3 cooperation and does not shy away from using layer 3 protocols where that is useful. A third group working on Ethernet protocol improvements is ITU-T Study Group 15. The work of the ITU-T concentrates mainly on Ethernet management and is outside the scope of this paper.
3 IEEE Approach

In the IEEE approach we can see the endeavour for a truly homogeneous set of networking protocols which can seamlessly carry data across local, metropolitan, and provider backbone networks. The basis for scaling Ethernet networks is the IEEE 802.1Q Virtual Local Area Network standard, which allows us to construct logical layer 2 networks on a single physical network infrastructure. Of course, this standard was designed for a simple enterprise LAN environment and therefore has scalability limitations: the VLAN ID is only 12 bits, so we can use at most 4096 VLANs. This number of VLANs is enough for an enterprise but may not be enough for larger metropolitan or even wide area networks. For this reason the Q-in-Q concept was introduced: the IEEE 802.1ad standard defines "stacking" of two .1Q labels. Packets coming from the customer's network into the provider's equipment are labelled with this new label. The IEEE 802.1ad frame is shown in Fig. 1.

Fig. 1. IEEE 802.1ad frame format: Dest. Addr. | Src. Addr. | T | Service VL-ID | T | Cust VL-ID | T/L | User Data | FCS
The main benefit of this approach is the independence of the customer's and the provider's addressing schemes (VLAN numbering). This is the first prerequisite for Ethernet being usable in large networks. However, the IEEE 802.1ad standard does not solve the problem of the limited number of VLANs usable in the provider backbone; for really large network operators this may still be limiting. Moreover, all the operator's devices carrying customer data have to learn the MAC addresses of the customers' devices, which may lead to a very large forwarding information base in the provider's core networking equipment. Inside the provider backbone some sort of aggregation is obviously needed. For this case the IEEE is working on a new standard, 802.1ah or MAC-in-MAC. This protocol is to be used on provider backbone bridges. It simply encapsulates the original 802.1ad (or 802.1Q or "legacy Ethernet") frame into a new Ethernet frame; this way, provider bridges need to know only a limited number of MAC addresses. Moreover, the 802.1ah header provides two new tags: the so-called B-tag or backbone tunnel tag, and the I-tag or extended services tag. This enables enough service instances across large backbone networks. The frame format of 802.1ah is shown in Fig. 2.
Fig. 2. IEEE 802.1ah frame format: Dest. Addr. | Src. Addr. | B-TAG | I-TAG | Original 802.1ad frame | FCS
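To see how each encapsulation layer shrinks the payload available to upper layers, consider a rough overhead calculation (a sketch; the 4- and 8-byte tag sizes follow 802.1Q/802.1ad, while the 802.1ah figure is an illustrative assumption, not taken from the standard):

```python
# Per-encapsulation overhead in bytes relative to an untagged Ethernet
# frame. 802.1Q and 802.1ad follow the standards (one and two 4-byte
# tags); the 802.1ah value is an illustrative assumption covering the
# extra backbone MAC addresses, B-TAG and I-TAG.
OVERHEAD = {
    "untagged": 0,
    "802.1Q":   4,       # one VLAN tag
    "802.1ad":  8,       # two stacked VLAN tags (Q-in-Q)
    "802.1ah":  8 + 22,  # Q-in-Q plus backbone encapsulation (assumed size)
}

def effective_payload(link_payload_mtu, encapsulation):
    """Payload left for upper layers once the tagging overhead is taken
    out of a link carrying link_payload_mtu bytes after the basic header."""
    return link_payload_mtu - OVERHEAD[encapsulation]

print(effective_payload(1500, "802.1ad"))  # 1492
print(effective_payload(1500, "802.1ah"))  # 1470
```

This is exactly the effect the paper is concerned with: the link still carries 1500-byte payloads, but the upper layers see less.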
Another problem of Ethernet is the low efficiency of underlying physical link utilisation. The root of the problem is that an Ethernet frame does not have a hop count, time-to-live, or similar field by which it would be possible to identify frames wandering in a circle. Instead, Ethernet bridges use a spanning tree algorithm and construct a logical tree topology on top of the general physical one to avoid packets circulating in the network. The IEEE 802.1aq group is working on a new protocol called shortest path bridging, which should improve physical topology utilisation.
4 IETF Approach

The IETF approach defines a new type of device called an RBridge. We can cite the definition of this device from [3]: "RBridges are link layer (L2) devices that use routing protocols as a control plane. They combine several of the benefits of the link layer with network layer routing benefits. RBridges use existing link state routing (without requiring configuration) to improve RBridge to RBridge aggregate throughput. RBridges also provide support for IP multicast and IP address resolution optimizations. They are intended to be applicable to similar L2 network sizes as conventional bridges and are intended to be backward compatible with those bridges as both ingress/egress and transit. They also support VLANs (although this generally requires configuration) and otherwise attempt to retain as much 'plug and play' as is already available in existing bridges." The IETF TRILL group tries to support increased RBridge-to-RBridge bandwidth and to keep the layer 2 network as configuration-free as current layer 2 networks, while remaining compatible with existing bridges and hubs. Of course, the configuration-free requirement is (as in current layer 2 networks) realistic only in small networks; in large layer 2 networks it is theoretically possible but practically very unstable, as we can confirm from practical experience. RBridges use the Intermediate System to Intermediate System (IS-IS) routing protocol to discover RBridge peers, determine the RBridge link topology, advertise layer 2 reachability information, and establish layer 2 delivery over shortest paths. Forwarding information is derived from a combination of learning attached MAC addresses and path computation using the link-state routing protocol. RBridges have some characteristics of both bridges and routers. A unicast Ethernet frame is forwarded toward the RBridge advertising its destination MAC address; a distribution tree is used for forwarding broadcast, multicast, and unknown unicast traffic.
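The forwarding rule just described, known unicast toward the advertising RBridge and everything else onto the distribution tree, can be sketched as follows (names and data structures are illustrative, not taken from the TRILL drafts):

```python
# Sketch of the RBridge forwarding decision: known unicast MAC addresses
# are forwarded toward the RBridge that advertised them; broadcast,
# multicast, and unknown unicast frames go on the distribution tree.
BROADCAST = "ff:ff:ff:ff:ff:ff"

def forward(frame_dst, mac_table, tree_ports):
    """Return the set of next hops for a frame addressed to frame_dst."""
    if frame_dst != BROADCAST and frame_dst in mac_table:
        return {mac_table[frame_dst]}   # unicast toward the egress RBridge
    return set(tree_ports)              # flood on the distribution tree

table = {"00:11:22:33:44:55": "rbridge-3"}
print(forward("00:11:22:33:44:55", table, ["rb1", "rb2"]))  # {'rbridge-3'}
print(forward(BROADCAST, table, ["rb1", "rb2"]))            # {'rb1', 'rb2'}
```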
This solution has no impact on the Internet network layer architecture. However, it is designed to cooperate with the IPv4 ARP and IPv6 ND protocols. More precisely, RBridges should cache ARP/ND responses and may, instead of forwarding an ARP/ND request, send the ARP/ND response on behalf of the target if they know the proper IP-to-MAC address mapping. Let us cite from the RBridge protocol definition [1]:
“RBridges SHOULD implement an ‘optimized ARP/ND response’. When the target's location is assumed to be known by the first RBridge, it need not flood the query. Alternative behaviors of the first Designated RBridge that receives the ARP/ND query would be to: 1. send a response directly to the querier, with the layer 2 address of the target, as believed by the RBridge; 2. encapsulate the ARP/ND query to the target's Designated RBridge, and have the Designated RBridge at the target forward the query to the target (this behavior has the advantage that a response to the query is authoritative; if the query does not reach the target, then the querier does not get a response); 3. block ARP/ND queries that occur for some time after a query to the same target has been launched, and then respond to the querier when the response to the recently-launched query to that target is received. The reason not to do the most optimized behavior all the time is for timeliness of detecting a stale cache. Also, in the case of secure neighbor discovery (SEND) [RFC3971], cryptography might prevent behavior 1, since the RBridge would not be able to sign the response with the target's private key.”

This fact can be used for distributing information about the MTU. The TRILL concept encapsulates the transported layer 2 data frame behind a TRILL header; in this respect the TRILL approach is very similar to the IEEE 802.1ah one. The overall RBridge architecture is shown in Fig. 3 and the structure of the TRILL frame in Fig. 4.

Fig. 3. The overall RBridge architecture: on each side of the RBridge relay, the stack is Higher Layers / TRILL Layer / Data Link Layer / Physical Layer
Fig. 4. Structure of a TRILL frame: Outer Ethernet header | TRILL header | Inner (original) Ethernet header | Ethernet payload | New (outer) FCS
5 The MTU Problem

The IP protocol was designed to be used over a variety of transmission links. Although the maximum length of an IP packet is 64 kB, most transmission lines enforce a smaller packet length. The maximum length of a packet that can be transported on a given line is
called the MTU. If an application sends an IP packet larger than the MTU, some network device or devices must fragment this packet into smaller units; the IP header provides sufficient mechanisms to deal with fragmented packets. Packet fragmentation has many disadvantages. Firewalls that filter or manipulate packets based on higher layers may have trouble processing fragmented packets; if the fragments arrive at the firewall out of order, the non-initial fragments may be dropped, so that the original packet cannot be reassembled by the receiver. And of course reassembling packets (even without firewalls) at the receiving host, in an environment where fragments may arrive out of order and may be dropped (e.g. due to congestion), can consume a meaningful share of CPU time at the receiving host. For these reasons many applications, especially TCP-based ones, try to avoid fragmentation by setting the DF (don't fragment) bit in the IP header and splitting the data into segments of the proper size at the sending host. As the network protocol stack grows (Ethernet header, IP header, GRE header, IPSEC header, ...), the real amount of data transported inside one packet decreases. Until now, all necessary information about the MTU was known at layer 3, and at layer 3 there is a mechanism for signalling the required MTU to the sender: the ICMP unreachable message. The situation is much more difficult when we use advanced layer 2 protocols like TRILL or 802.1ah. These protocols may decrease the effective MTU somewhere on a path where only layer 2 devices are used. These devices have no means of informing the sender about the MTU; moreover, they are not intended to deal with layer 3 protocols, so they cannot fragment IP packets and simply truncate or drop them. The situation is easy to see in Fig. 5. Until now, when, for example, a
metropolitan network operator uses the 802.1ad protocol, the MTU problem is solved by simply increasing the Ethernet MTU of the bridges involved. Even though this solution does not strictly conform to the IEEE standards (802.3as, which specifies frames larger than 1500 bytes, was released only in September 2006), it is commonly used. The problem is that this solution is bounded to a single network
Fig. 5. The MTU problem in a large-scale multiprovider Ethernet network (L3 devices with MTU 1500 and MTU 2000 interconnected through an L2 transport network with MTU 2000)
operator and is not prepared for a large Ethernet network, which may eventually traverse several operators, some of which may be unable or unwilling to carry larger frames. In this new Ethernet environment it would be much better to be able to signal the MTU to the upper layers. As we mentioned above, the TRILL protocol is very suitable for this. We are working on an "IEEE-like" solution as well, but this work still needs some effort and will be presented in a future paper.
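The cost of fragmentation discussed in the previous section can be made concrete with a small sketch of IPv4 fragmentation arithmetic (assuming a 20-byte header with no options; each non-final fragment must carry a multiple of 8 payload bytes):

```python
# Minimal sketch of IPv4 fragmentation arithmetic: every fragment repeats
# the 20-byte header, and each fragment except the last carries a payload
# that is a multiple of 8 bytes.
IP_HEADER = 20  # bytes, assuming no IP options

def fragment_sizes(total_len, mtu):
    """Payload size of each fragment when an IP packet of total_len bytes
    must cross a link with the given MTU."""
    payload = total_len - IP_HEADER
    per_frag = (mtu - IP_HEADER) // 8 * 8   # largest multiple of 8 that fits
    sizes = []
    while payload > per_frag:
        sizes.append(per_frag)
        payload -= per_frag
    sizes.append(payload)
    return sizes

print(fragment_sizes(2000, 1500))  # [1480, 500]
```

A 2000-byte packet crossing a 1500-byte link thus becomes two packets and 40 bytes of header instead of 20, which is precisely the overhead the DF bit and MTU signalling try to avoid.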
6 Possible Solutions for the MTU Problem

RBridges exchange reachability and link-state information via the IS-IS protocol. This protocol can very simply be modified to carry the MTU of each link. In this way all RBridges may know, at almost no extra cost, the MTU of all links in addition to the current layer 2 topology. An RBridge can then offer a rather smart solution to the problem of how to pass the MTU size from a layer 2 device to the layer 3 network. Of course, there are two trivial solutions. The first is to reduce the MTU on the outgoing interface of the customer's boundary router. This solution is always applicable; its main disadvantage is higher protocol overhead and suboptimal utilisation of the WAN connectivity. The second is to ask all service providers along the path used for the long-distance layer 2 line to increase the MTU. This solution is not always applicable, because the customer does not always know all the carriers and carriers' carriers participating in the transport service. There are two possible solutions which may lead to optimal bandwidth utilisation (that is, use of the maximum available MTU) and which at the same time require minimal (if any) changes to existing protocols. The first is to use the same mechanism that is used in Path MTU discovery, i.e. ICMP unreachable messages. The second is to modify the ARP protocol a little.

6.1 Gratuitous ICMP Unreachable of Type Fragmentation Needed

This method is so simple that there is almost nothing to explain. The idea can be easily demonstrated by the following example. Suppose an ingress RBridge has received an IP Ethernet frame that is larger than the MTU of the line which should be used to send the packet towards its destination. This RBridge simply discards the frame and sends the originator an ICMP unreachable message of type Fragmentation needed, putting the proper MTU into the message, just as a router would do.
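The message the RBridge would emit can be built in a few lines. The sketch below constructs an ICMP Destination Unreachable message with code 4 ("fragmentation needed and DF set") carrying the Next-Hop MTU field of RFC 1191; the function names are ours, and sending the packet with a borrowed source address is left out:

```python
import struct

def checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    s = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    s = (s >> 16) + (s & 0xFFFF)
    s += s >> 16
    return ~s & 0xFFFF

def icmp_frag_needed(next_hop_mtu: int, original_datagram: bytes) -> bytes:
    """ICMP Destination Unreachable, code 4, with RFC 1191 Next-Hop MTU."""
    # type = 3, code = 4, checksum placeholder, 2 unused bytes, next-hop MTU
    header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    body = original_datagram[:28]   # original IP header + first 8 data bytes
    msg = header + body
    return struct.pack("!BBH", 3, 4, checksum(msg)) + msg[4:]

pkt = icmp_frag_needed(1492, b"\x45" + b"\x00" * 27)
assert (pkt[0], pkt[1]) == (3, 4)
assert struct.unpack("!H", pkt[6:8])[0] == 1492
assert checksum(pkt) == 0   # a correctly checksummed message sums to zero
```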
There is, of course, a question about the source IP address of this ICMP message, because RBridges need not have any IP address, and if they do have one, the address may be in a different VLAN than the sender of the large data frame. Even if the RBridge has an IP address in the same VLAN as the IP Ethernet frame that caused the ICMP Fragmentation needed message, there may be a problem with firewalling or filtering on some node on the path from the RBridge to the originator of the frame. Here the solution is a little bit tricky: the RBridge will borrow the target IP address of the packet being dropped. That is, from the point of view of the originator of the oversize packet, it will look as if the target host responded with this ICMP message. This solution is very easy to
implement, but the fact that it needs to borrow someone else's IP address is not very nice (however, it is similar to the way RBridges deal with ARP). The main advantage of this approach is full compatibility with existing protocols: it does not introduce any modification of existing protocol stacks and is fully transparent to today's IP stack implementations.

6.2 Modification of ARP/ND

A slightly nicer way, which on the other hand needs some modification of existing protocols, is to use ARP for sending the MTU information to layer 3. Because RBridges have all the necessary information and may moreover respond to ARP requests, we can include the MTU information in the ARP packet. The original ARP frame is shown in Fig. 6 and the modified one in Fig. 7. The idea is very simple: the ARP protocol uses a 16-bit opcode field for only two possible values, request and response. We can use some bits of this field to indicate that the ARP packet uses a new frame type, i.e. that it contains one more field carrying the MTU. The situation is much easier in the case of IPv6: the IPv6 ND (Neighbour Discovery) protocol already has an option field, and one of the defined options is the MTU. The proposed modification of ARP is in some sense a retrofitting of IPv6 properties into IPv4. The big disadvantage of this method is the modification of the existing ARP protocol, which requires modification of the IP stack of equipment communicating directly with RBridges. Of course, the backward compatibility problem can be solved very easily.

Fig. 6. Structure of an ARP frame:
  hardware address space        16 bits
  protocol address space        16 bits
  length of hardware address     8 bits
  length of protocol address     8 bits
  opcode                        16 bits (1 = request, 2 = reply)
  hardware address of sender     n bytes
  protocol address of sender     m bytes
  hardware address of target     n bytes
  protocol address of target     m bytes
Fig. 7. Proposed enhancement of the ARP frame:
  hardware address space        16 bits
  protocol address space        16 bits
  length of hardware address     8 bits
  length of protocol address     8 bits
  opcode                        16 bits (0x0101 = MTU request, 0x0102 = MTU reply)
  hardware address of sender     n bytes
  protocol address of sender     m bytes
  hardware address of target     n bytes
  protocol address of target     m bytes
  maximum hardware frame size   16 bits
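The proposed frame of Fig. 7 is easy to serialise. The sketch below packs an "MTU reply"; note that the 0x0102 opcode and the trailing 16-bit MTU field are this paper's proposal, not part of standard ARP (RFC 826):

```python
import struct

def build_mtu_arp_reply(sender_mac, sender_ip, target_mac, target_ip, mtu):
    """Pack the extended ARP reply of Fig. 7 (Ethernet hardware, IPv4
    protocol). The 0x0102 opcode and the MTU field are the proposal here."""
    return (
        struct.pack("!HHBBH", 1, 0x0800, 6, 4, 0x0102)  # hw/proto spaces, lengths, opcode
        + sender_mac + sender_ip + target_mac + target_ip
        + struct.pack("!H", mtu)                         # proposed 16-bit MTU field
    )

frame = build_mtu_arp_reply(b"\x00\x11\x22\x33\x44\x55", b"\x0a\x00\x00\x01",
                            b"\x00\xaa\xbb\xcc\xdd\xee", b"\x0a\x00\x00\x02", 1800)
assert len(frame) == 30           # standard 28-byte ARP body + 2-byte MTU field
assert frame[6:8] == b"\x01\x02"  # proposed "MTU reply" opcode
```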
As a response to an ARP request, the RBridge may send a pair of ARP responses: the first with the MTU option and the second without it. In this way, hosts with the original implementation of ARP will ignore the first ARP response and learn the MAC-to-IP address mapping from the second. Of course, the MTU problem remains unsolved for such hosts.

6.3 Comparison of the Gratuitous ICMP Fragmentation Needed and Modified ARP Approaches

The common problem of both approaches is how to signal to a host that the MTU for some destinations has increased. As an example, consider the simple network in Figure 5. Now we introduce a new connection with a higher MTU, as in Figure 8, and let the newly added path be the preferred one.
Fig. 8. The MTU may increase under some circumstances: a second L2 transport network with MTU 1800 is added in parallel to the original L2 transport network with MTU 2000 (L3 devices with MTU 1500 and MTU 2000)
In this situation we can use an MTU of 1800 instead of the original 1500. The problem is how to inform the transmitter that a larger MTU is available. In the gratuitous ICMP Fragmentation needed approach there is no way to propagate a larger MTU. The situation is a little better with the modified ARP approach: a gratuitous ARP with the new MTU option can be sent. Of course, this solves the problem only partially. The RBridge can send a gratuitous ARP only to hosts that are already in its ARP cache, and this approach works only for end nodes: if the L3 device in Figure 8 is a router, it has probably already propagated the smaller MTU to the true originator of the traffic. Current IP protocols have the capability only to decrease the MTU; the reverse problem is not solved.
7 Conclusion

The intention of this paper was to point out the problem of MTU signalling between layer 2 (Ethernet) and layer 3 (IPv4 or IPv6). This problem may become a real issue in next-generation Ethernet deployment in provider networks. Both proposed solutions are straightforward and easy enhancements of existing protocols, and each has both positive and negative properties. The proposed solutions are not meant as the optimal method of layer 2 to layer 3 MTU signalling, but as an entry point for a discussion of this problem, which should end with a common consensus, possibly on yet another solution.
References

1. Perlman, R., Gai, S., Dutt, D.G.: RBridges: Base Protocol Specification. draft-ietf-trill-rbridge-protocol-03.txt
2. Touch, J., Perlman, R.: Transparent Interconnection of Lots of Links (TRILL): Problem and Applicability Statement. draft-ietf-trill-prob-01.txt
3. Gray, E.: The Architecture of an RBridge Solution to TRILL. draft-ietf-trill-rbridge-arch-02.txt
4. Gray, E.: TRILL Routing Requirements in Support of RBridges. draft-ietf-trill-routing-reqs-02.txt
5. Plummer, D.: An Ethernet Address Resolution Protocol – or – Converting Network Protocol Addresses to 48.bit Ethernet Address for Transmission on Ethernet Hardware. RFC 826 (November 1982), http://www.ietf.org/rfc/rfc826.txt
6. Narten, T., Nordmark, E., Simpson, W.: Neighbor Discovery for IP Version 6 (IPv6). RFC 2461 (Standards Track) (December 1998), http://www.ietf.org/rfc/rfc2461.txt
7. Callon, R.: Use of OSI IS-IS for Routing in TCP/IP and Dual Environments. RFC 1195 (December 1990), http://www.ietf.org/rfc/rfc1195.txt
8. IEEE Standard for Local and Metropolitan Area Networks: Virtual Bridged Local Area Networks. 802.1Q-2005 (May 19, 2006)
9. Bryant, S., Pate, P. (eds.): Pseudo Wire Emulation Edge-to-Edge (PWE3) Architecture. RFC 3985 (March 2005), http://www.ietf.org/rfc/rfc3985.txt
10. Martini, L., El-Aawar, N., Heron, G., Rosen, E., Tappan, D., Smith, T.: Pseudowire Setup and Maintenance Using the Label Distribution Protocol (LDP). RFC 4447 (April 2006), http://www.ietf.org/rfc/rfc4447.txt
11. ANSI/IEEE Standard 802.1Q-2005, IEEE Standards for Local and Metropolitan Area Networks: Virtual Bridged Local Area Networks (2005)
12. Andersson, L., Rosen, E.: Framework for Layer 2 Virtual Private Networks (L2VPNs). RFC 4664 (September 2006), http://www.ietf.org/rfc/rfc4664.txt
13. Lasserre, M., Kompella, V.: Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling. RFC 4762 (January 2007), http://www.ietf.org/rfc/rfc4762.txt
Analysis of Medium Access Delay and Packet Overflow Probability in IEEE 802.11 Networks

Gang Uk Hwang
Department of Mathematical Sciences and Telecommunication Engineering Program, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon, 305-701, Republic of Korea
[email protected]
http://queue.kaist.ac.kr/~guhwang
Abstract. In this paper, we first analyze the medium access delay of a packet in a terminal in a saturated IEEE 802.11 network. In our analysis, we use renewal theory to analyze the detailed packet transmission processes of terminals in the network, such as backoff counter freezing. Using this detailed analysis, we analyze the packet transmission process of a tagged terminal and the background traffic generated for it by the non-tagged terminals, and derive the Laplace transform of the medium access delay of a packet under the saturated condition. Next, based on this analysis, we propose a mathematical model to analyze the packet overflow probability of an unsaturated terminal. We also provide numerical and simulation results to validate our analysis and investigate the characteristics of the system performance.

Keywords: IEEE 802.11 WLAN, Distributed Coordination Function, Medium Access Delay, Performance Evaluation, Packet Overflow Probability
1 Introduction
During the past few years, standards for WLANs (Wireless Local Area Networks) have been proposed to satisfy the demand for wireless services, and the IEEE 802.11 MAC (Medium Access Control) protocols [1] are the de facto standards for WLANs and the most widely used nowadays. In IEEE 802.11, the main mechanism to access the channel is the DCF (Distributed Coordination Function), a random access scheme based on CSMA/CA (Carrier Sense Multiple Access with Collision Avoidance). The DCF has two access techniques for packet transmission: the default, called the basic access mechanism, and an optional four-way handshake scheme, called the RTS/CTS mechanism. Both mechanisms use a backoff counter and a backoff stage to determine packet transmission times [14,16]. The RTS/CTS mechanism involves the transmission of the RTS (Request-To-Send) and CTS (Clear-To-Send) control frames prior to the transmission of the
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 419–430, 2007. © Springer-Verlag Berlin Heidelberg 2007
420
G.U. Hwang
data frame. A successful exchange of RTS and CTS frames attempts to reserve the channel for the time duration needed to transmit the data frame under consideration. The rules for the transmission of an RTS frame are the same as those for a data frame under the basic access scheme. Hence, the analysis of the basic access mechanism is the same as that of the RTS/CTS mechanism except that the packet transmission times differ between the two mechanisms. We consider the RTS/CTS mechanism only in this paper. Many works dealing with the performance analysis of the IEEE 802.11 DCF can be found in the literature. They have focused primarily on its throughput and capacity [2,3,4], adaptive backoff schemes [5,6,7], and statistical analysis such as the packet service time (or packet delay) of a terminal and related queueing analysis [8,9,10,11,12]. Regarding the analysis of the medium access delay (or the packet service time), which is defined as the time needed for a packet to be successfully transmitted after it is first positioned in the transmission buffer of the terminal, Carvalho and Garcia-Luna-Aceves [8,9] introduced an analytical model to characterize the service time of a terminal in saturated IEEE 802.11 ad hoc networks. However, their analysis in [8] and [9] is carried out in terms of first and second order statistics only. Özdemir and McDonald [13] derived the packet service time distribution of a terminal. Tickoo and Sikdar [11,12] computed the service time distributions by explicitly modelling the impact of the network load on the loss rates and thus the delays. In this paper, we consider a network of N identical IEEE 802.11 DCF (Distributed Coordination Function) terminals with the RTS/CTS mechanism, each of which is assumed to be saturated. For performance analysis, we propose a simple and efficient mathematical model to derive the Laplace transform of the medium access delay of a packet in a terminal.
In our analysis, we first use the renewal theory to analyze the detailed procedure of backoff counter freezing as in [16]. We then consider the packet transmission process of a tagged terminal and the background traffic for the tagged terminal which is generated by non-tagged terminals. Finally, we derive the Laplace transform of the medium access delay of a packet in the tagged terminal. In addition to the saturated analysis, we consider an unsaturated terminal in the IEEE 802.11 network where the other N − 1 terminals in the network are assumed to be saturated. Based on our analytic results in the saturated analysis, we propose a mathematical model to analyze the packet overflow probability of an unsaturated terminal. Note that our approach is totally different from those in previous studies including Bianchi [14], and provides a simple and efficient model to analyze the system performance as shown later, which is the main contribution of this paper. The organization of this paper is as follows: In section 2, we first develop an analytic model to consider the details of the packet transmission process in a terminal. Then, we consider a tagged terminal, introduce the background traffic of the tagged terminal and analyze the medium access delay of a packet in the tagged terminal through the Laplace transform. In section 3, we consider an unsaturated terminal and analyze the packet overflow probability based on the
Analysis of Medium Access Delay and Packet Overflow Probability
421
results obtained in section 2. In section 4, we provide numerical examples to validate our analysis and investigate the characteristics of the system performance. In section 5, we give our conclusions.
2
Medium Access Delay
In this section, we analyze the medium access delay of a packet in a terminal of a saturated IEEE 802.11 network, which is defined as the time needed for the packet to be successfully transmitted after it is first positioned in the transmission buffer of the terminal. We assume that there are N identical terminals in the IEEE 802.11 network. We tag an arbitrary terminal in the network, called the tagged terminal, and consider the medium access delay of a packet of the tagged terminal. For analysis, we consider the packet transmission processes of the other N − 1 terminals as the background traffic of the tagged terminal. So, if there is a packet transmission in the background traffic at a packet transmission epoch of the tagged terminal, the transmitted packet of the tagged terminal experiences a collision. On the other hand, if there is no packet transmission in the background traffic at a packet transmission epoch of the tagged terminal, the packet of the tagged terminal is successfully transmitted. We first analyze the packet transmission process of an arbitrary terminal, then the packet transmission process in the background traffic, and finally the medium access delay of the tagged terminal.

2.1 The Packet Transmission Process of a Terminal
For analysis, we focus on embedded time points where the backoff counter value of the terminal of interest changes (i.e., the backoff counter value is decremented by 1 or a new backoff counter value is selected). A time interval between two consecutive embedded time points is called a virtual slot in the analysis. We assume that the packet transmission process of a terminal is well approximated by a renewal process. Let R denote the inter-renewal time, i.e., the number of virtual slots between two consecutive packet transmissions of an arbitrary terminal in steady state. To obtain the distribution of the random variable R, we first compute the steady state probability π_k, 0 ≤ k ≤ m, that the terminal is in backoff stage k just after a packet transmission. Here, m denotes the maximum backoff stage. To do this, we assume that a packet collision occurs independently at each packet transmission with a fixed probability p as in [14] in steady state. Then, the steady state probability π_k, 0 ≤ k ≤ m, is given by [16]

    π_k = p^k (1 − p),  if 0 ≤ k ≤ m − 1,
    π_k = p^m,          if k = m.                                          (1)

Next, let BC denote the backoff counter value selected by the terminal after the packet transmission in steady state. When the backoff stage after the packet
transmission is k, the tagged terminal selects a new backoff counter value uniformly from the window [0, CW_k − 1], where the window size CW_k at backoff stage k is CW_k = 2^k CW_0, 1 ≤ k ≤ m, and CW_0 is the initial window size at backoff stage 0. Hence, the steady state probability r_k, 0 ≤ k ≤ CW_m − 1, that the backoff counter BC is equal to k is given as follows [16]:

    r_k ≜ P{BC = k} = Σ_{j=l}^{m} π_j / CW_j,   CW_{l−1} ≤ k ≤ CW_l − 1,   (2)

for l = 0, 1, 2, · · · , m, where CW_{−1} = 0. Since R includes the virtual slot containing a packet transmission by its definition, we have R = BC + 1, from which the distribution of R is given by

    P{R = k} = P{BC = k − 1},   1 ≤ k ≤ CW_m.                              (3)
Now, let τ denote the steady state probability that an arbitrary terminal transmits a packet in an arbitrary virtual slot. By the definition of τ and the random variable R, it follows that

    τ = 1 / E[R],                                                          (4)

where E[R] denotes the expectation of the random variable R and is given by

    E[R] = Σ_{j=0}^{m} P{BS = j} E[R | BS = j]
         = Σ_{j=0}^{m} π_j (1/CW_j) Σ_{k=1}^{CW_j} k
         = Σ_{j=0}^{m} π_j (CW_j + 1)/2,                                   (5)

where BS denotes the backoff stage after an arbitrary packet transmission in steady state. In addition, from the definitions of p and τ we have

    p = 1 − (1 − τ)^{N−1}.                                                 (6)
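The recursive computation of τ from eqs. (1) and (4)–(6), described in the next paragraph, can be sketched in code as follows. This is a minimal illustration, not the authors' implementation; all function names are ours, and a damped update is used only to keep the iteration stable.

```python
def stage_probs(p, m):
    """pi_k from eq. (1): probability of being in backoff stage k
    just after a packet transmission."""
    return [(p ** k) * (1 - p) for k in range(m)] + [p ** m]

def mean_inter_renewal(p, m, cw0):
    """E[R] from eq. (5): sum over stages of pi_j * (CW_j + 1) / 2."""
    return sum(pi * ((2 ** j) * cw0 + 1) / 2
               for j, pi in enumerate(stage_probs(p, m)))

def solve_tau(n_terminals, m=5, cw0=32, tol=1e-10):
    """Fixed-point iteration over eqs. (1) and (4)-(6)."""
    tau = 0.1  # arbitrary initial value in (0, 1)
    for _ in range(100000):
        p = 1 - (1 - tau) ** (n_terminals - 1)       # eq. (6)
        tau_new = 1 / mean_inter_renewal(p, m, cw0)  # eqs. (4)-(5)
        if abs(tau_new - tau) < tol:
            return tau_new
        tau = 0.5 * (tau + tau_new)  # damped update for stable convergence
    raise RuntimeError("fixed-point iteration did not converge")
```

For N = 50 terminals with CW_0 = 32 and m = 5 (the parameters used later in section 4), the iteration converges to a value of τ of about 0.015, the order of magnitude familiar from Bianchi-type models.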
Therefore, starting from (1) and (6) with an initial value of τ, we obtain an updated value of τ from (4) and (5). If the updated value of τ is not sufficiently close to the initial value, we repeat the same procedure with the updated value of τ as the new initial value. This recursive procedure is iterated until the value of τ converges. Note that our recent paper [16] shows that the value of τ obtained from the above procedure is exactly the same as that obtained by Bianchi's model [14]. For later use, we introduce a random variable R^(e) which denotes the remaining number of virtual slots until an arbitrary terminal performs its next packet transmission, observed at an arbitrary virtual slot boundary in steady state. By renewal theory [15], it can be shown that

    P{R^(e) = k} = P{R ≥ k} / E[R],   1 ≤ k ≤ CW_m,
where P{R ≥ k} is obtained from (2) and (3). Since all terminals operate identically, the probabilistic characteristics of the terminals are all identical; that is, the above results for R and R^(e) can be used for the packet transmission process of any terminal in the network.

2.2 The Background Traffic Analysis
Now we construct the packet transmission process of the background traffic based on our analysis in section 2.1. To do this, we consider actual slot boundaries where the backoff counter value of the tagged terminal is either changed or frozen. A time interval between two consecutive actual slot boundaries is called an actual slot in the analysis. Note that the actual slot is different from the physical slot in the IEEE 802.11 DCF standard. In addition, the actual slot is also different from the virtual slot. The difference between actual and virtual slots occurs when there is a packet transmission of a non-tagged terminal in the virtual slot. In this case the virtual slot containing a packet transmission of a non-tagged terminal is divided into two actual slots, one of which is the packet transmission time of the non-tagged terminal and the other of which is the physical slot during which the backoff counter value of the tagged terminal is still frozen. Let H_bg be the number of actual slots between two consecutive packet transmissions in the background traffic. To obtain the distribution function of H_bg in steady state, we assume that at least one packet transmission occurs in the background traffic at an arbitrary actual slot boundary, called the initial boundary, in steady state. We condition on the number N0 of non-tagged terminals in the background traffic transmitting packets simultaneously at the initial boundary. Since there are N − 1 non-tagged terminals, the probability mass function of N0 is given as follows:

    P{N0 = k} = C(N−1, k) τ^k (1 − τ)^{N−1−k} / Σ_{l=1}^{N−1} C(N−1, l) τ^l (1 − τ)^{N−1−l},   1 ≤ k ≤ N − 1.   (7)

Here, the denominator is the probability that there is at least one non-tagged terminal transmitting a packet at the initial boundary in steady state. Now, to compute the distribution of H_bg, we should consider the backoff freezing procedure for non-transmitting terminals in the background traffic at the initial boundary.
That is, a non-transmitting terminal in the background traffic with a positive backoff counter value, upon detecting a packet transmission, needs one more actual slot to decrease its backoff counter value after the actual slot containing the packet transmission [1]. Then, if we consider a terminal transmitting a packet at the initial boundary in steady state, the next transmission time for that terminal is given by R. On the other hand, if we consider a terminal transmitting no packet at the initial boundary in steady state, the next transmission time is given by R^(e) + 1. Here, one actual slot is added due to the backoff freezing procedure for non-transmitting terminals explained above. Hence, the distribution of H_bg is given as follows: For 1 ≤ k ≤ CW_m,

    P{H_bg ≤ k} = Σ_{j=1}^{N−1} P{N0 = j} [1 − (P{R > k})^j (P{R^(e) > k − 1})^{N−1−j}],   (8)

where P{N0 = j} is given in (7). Note that (P{R > k})^j (P{R^(e) > k − 1})^{N−1−j} is the conditional probability that there will be no packet transmission in the background traffic during [0, k], given that there are j terminals transmitting packets in the background traffic at the initial boundary. For later use, we need to consider the conditional probability q_bg that a collision occurs in the background traffic, given that there is a packet transmission in the background traffic. Since there are N − 1 terminals in the background traffic, we have

    q_bg = [1 − (1 − τ)^{N−1} − (N − 1) τ (1 − τ)^{N−2}] / [1 − (1 − τ)^{N−1}].
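Eqs. (7) and (8) lend themselves to direct numerical evaluation. The sketch below builds the distributions of R and R^(e) from eqs. (1)–(3) and then evaluates the CDF of H_bg; it is a minimal illustration under the same assumptions as the analysis, and the function names are ours, not the paper's.

```python
from math import comb

def r_pmfs(p, m=5, cw0=32):
    """P{R = k} and P{R^(e) = k} for 1 <= k <= CW_m, from eqs. (1)-(3)."""
    cws = [cw0 * 2 ** j for j in range(m + 1)]
    pis = [(p ** j) * (1 - p) for j in range(m)] + [p ** m]
    cwm = cws[-1]
    # r_k = P{BC = k}, eq. (2): sum over stages whose window contains k
    r = [sum(pis[j] / cws[j] for j in range(m + 1) if k < cws[j])
         for k in range(cwm)]
    pr = [0.0] + r                              # pr[k] = P{R = k}, eq. (3)
    er = sum(k * pr[k] for k in range(1, cwm + 1))            # E[R]
    tails = [sum(pr[k:]) for k in range(cwm + 2)]             # P{R >= k}
    pre = [0.0] + [tails[k] / er for k in range(1, cwm + 1)]  # P{R^(e) = k}
    return pr, pre

def hbg_cdf(tau, p, n, k, m=5, cw0=32):
    """P{Hbg <= k}, eq. (8), using P{N0 = j} from eq. (7)."""
    pr, pre = r_pmfs(p, m, cw0)
    tail_r = sum(pr[k + 1:])    # P{R > k}
    tail_re = sum(pre[k:])      # P{R^(e) > k - 1}
    n1 = n - 1
    denom = 1 - (1 - tau) ** n1
    total = 0.0
    for j in range(1, n1 + 1):
        pn0 = comb(n1, j) * tau ** j * (1 - tau) ** (n1 - j) / denom
        total += pn0 * (1 - tail_r ** j * tail_re ** (n1 - j))
    return total
```

As a sanity check, the CDF is increasing in k and reaches 1 at k = CW_m, since P{R > CW_m} = 0.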
We then define the packet transmission time T in the background traffic by

    T = (1 − q_bg) T_s + q_bg T_c.                                          (9)

Here, T_s and T_c denote the times taken by a successful transmission and a collided transmission, respectively. Since we consider the RTS/CTS mechanism, T_s and T_c are given by

    T_s = RTS + SIFS + CTS + SIFS + Packet + SIFS + ACK + DIFS,
    T_c = RTS + DIFS,

where RTS, CTS, Packet and ACK are the respective transmission times of the RTS, CTS, data packet and ACK frames, and SIFS and DIFS denote the Short Inter-Frame Space and the DCF Inter-Frame Space, respectively. In practice, there is a correlation between the background packet transmission process and the packet transmission process of the tagged terminal due to packet collisions. However, to keep the analysis as simple as possible, we assume from now on that the two packet transmission processes are independent.

2.3 Medium Access Delay Analysis
In this subsection, we analyze the medium access delay of the tagged terminal based on our results in subsection 2.2. We start with the maximum backoff stage m. Let D_m be the time needed for a packet in the tagged terminal to be successfully transmitted, given that the starting backoff stage of the packet is stage m at an arbitrary time. To compute D_m, which is in fact the sojourn time in stage m during the medium access delay, let D_m(i) denote the time needed for a packet to be successfully transmitted, given that the starting backoff stage is m and the value of the starting backoff counter is i (0 ≤ i ≤ CW_m − 1) at an arbitrary time, and let E_m(i) denote the remaining time needed for a packet to be successfully transmitted, given that the backoff stage is m and a backoff counter freezing occurs at an arbitrary time with the backoff counter value i (1 ≤ i ≤ CW_m − 1). Note that, since we have a backoff counter freezing for E_m(i), i should be in 1 ≤ i ≤ CW_m − 1, while since there is no such condition for D_m(i), i can take
any value in [0, CW_m − 1]. Now we consider the packet transmission processes of the background traffic as well as the tagged terminal. The fact that the starting stage is m implies that there was a collision, and consequently the next packet transmission time in the background traffic is H_bg, obtained in (8). So, we have the following cases to compute D_m(i): For H_bg = j,

- When 1 ≤ j ≤ i, there occurs a packet transmission in the background traffic before the packet transmission of the tagged terminal. In this case, the packet transmission in the background traffic is either a successful one or a collided one. Then, we have D_m(i) =_d (j − 1)σ + T + E_m(i − j + 1).
- When j = i + 1, there occur simultaneous packet transmissions in the background traffic as well as by the tagged terminal. In this case, we have a collided packet transmission. Since the backoff stage is the maximum stage m, we have D_m(i) =_d iσ + T_c + D_m.
- When i + 2 ≤ j ≤ CW_m, there occurs a packet transmission of the tagged terminal before the packet transmission in the background traffic. In this case, the packet transmission of the tagged terminal is successful. Then, we have D_m(i) =_d iσ + T_s.

Here, T is given by (9), σ denotes the physical slot time in the IEEE 802.11 DCF, and =_d means that both sides are equal in distribution. For later use we compute the Laplace transform E[e^{−sD_m(i)}] of D_m(i) as follows:

    E[e^{−sD_m(i)}] = Σ_{j=1}^{i} P{H_bg = j} E[e^{−s[(j−1)σ + T + E_m(i−j+1)]}]
                    + P{H_bg = i + 1} E[e^{−s[iσ + T_c + D_m]}]
                    + Σ_{j=i+2}^{CW_m} P{H_bg = j} E[e^{−s[iσ + T_s]}].      (10)

Here, P{H_bg = j} is computed from (8), and the Laplace transforms E[e^{−sE_m(i)}] and E[e^{−sD_m}] of E_m(i) and D_m, respectively, will be computed soon. By similar arguments, the Laplace transform E[e^{−sE_m(i)}] of E_m(i) is given by

    E[e^{−sE_m(i)}] = Σ_{j=1}^{i} P{H_bg = j} E[e^{−s[(j−1)σ + T + E_m(i−j+1)]}]
                    + P{H_bg = i + 1} E[e^{−s[iσ + T_c + D_m]}]
                    + Σ_{j=i+2}^{CW_m} P{H_bg = j} E[e^{−s[iσ + T_s]}].      (11)
For backoff stage n (0 ≤ n ≤ m − 1), we define Dn , Dn (i) and En (i) similarly as for backoff stage m. Let Dn (0 ≤ n ≤ m − 1) be the time needed for a packet in the tagged terminal to be successfully transmitted, given that the starting backoff stage of the packet is stage n. Since the starting backoff stage after a successful packet transmission is always stage 0, the medium access delay of a packet is, in fact, D0 by our definition. Let Dn (i) (0 ≤ i ≤ CWn − 1) and En (i) (1 ≤ i ≤ CWn − 1) be the random variables for stage n, (0 ≤ n ≤ m − 1) corresponding to Dm (i) and Em (i),
respectively. Then, by similar arguments as above (to save space we omit the detailed derivation), the corresponding Laplace transforms E[e^{−sD_n(i)}] and E[e^{−sE_n(i)}] can be obtained similarly as given in (10) and (11), respectively. When the starting backoff stage is 0 at an arbitrary time, which implies that the tagged terminal has just transmitted a packet successfully, none of the non-tagged terminals in the background traffic is transmitting a packet. Hence, in this case the next packet transmission time in the background traffic is the remaining transmission time, denoted by H_bg^(e), which is given by

    P{H_bg^(e) ≤ k} = 1 − (P{R^(e) > k − 1})^{N−1}.

Hence, the computation of D_0(i), 0 ≤ i ≤ CW_0 − 1, is the same as above except that H_bg^(e) is used instead of H_bg. However, since there is a packet transmission in the background traffic when we compute E_0(i), 1 ≤ i ≤ CW_0 − 1, we use H_bg in the computation of E_0(i). Hence, the corresponding Laplace transforms are given by

    E[e^{−sD_0(i)}] = Σ_{j=1}^{i} P{H_bg^(e) = j} E[e^{−s[(j−1)σ + T + E_0(i−j+1)]}]
                    + P{H_bg^(e) = i + 1} E[e^{−s[iσ + T_c + D_1]}]
                    + Σ_{j=i+2}^{CW_m} P{H_bg^(e) = j} E[e^{−s[iσ + T_s]}],   (12)

    E[e^{−sE_0(i)}] = Σ_{j=1}^{i} P{H_bg = j} E[e^{−s[(j−1)σ + T + E_0(i−j+1)]}]
                    + P{H_bg = i + 1} E[e^{−s[iσ + T_c + D_1]}]
                    + Σ_{j=i+2}^{CW_m} P{H_bg = j} E[e^{−s[iσ + T_s]}].       (13)

Now using the above equations we can compute the Laplace transforms of D_n, 0 ≤ n ≤ m. Since the initial backoff counter in backoff stage n is selected uniformly in the window [0, CW_n − 1], it follows that

    E[e^{−sD_n}] = Σ_{i=0}^{CW_n − 1} (1/CW_n) E[e^{−sD_n(i)}],   0 ≤ n ≤ m.
Note that the Laplace transform of D_0 is, in fact, the Laplace transform of the medium access delay. Hence, the expectation and variance of the medium access delay can be easily obtained by differentiating the Laplace transform of D_0 once and twice, respectively, and evaluating the derivatives at s = 0. In the numerical studies, we compute the expectation and variance of the medium access delay.
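As a rough cross-check of the analytical moments, the medium access delay can also be estimated by a small Monte Carlo simulation. The sketch below uses a simplified per-slot independence assumption (the background transmits in each backoff slot with probability p and then occupies the average time T of eq. (9)) rather than the exact renewal-based background process of section 2.2, so it only approximates E[D0]. All names are ours, and the parameter values are those listed in section 4.

```python
import random

# Parameters of section 4 (times in seconds)
SIGMA, SIFS, DIFS = 20e-6, 10e-6, 50e-6
RATE = 1e6
RTS, CTS, ACK = 352 / RATE, 304 / RATE, 304 / RATE  # frames incl. PHY header
PACKET = (192 + 272 + 8000) / RATE                  # PHY + MAC + payload
TS = RTS + SIFS + CTS + SIFS + PACKET + SIFS + ACK + DIFS  # success
TC = RTS + DIFS                                            # collision

def access_delay_sample(tau, n, m=5, cw0=32):
    """One Monte Carlo sample of the medium access delay of a tagged packet."""
    p = 1 - (1 - tau) ** (n - 1)                        # eq. (6)
    qbg = 1 - (n - 1) * tau * (1 - tau) ** (n - 2) / p  # collision given bg tx
    t_bg = (1 - qbg) * TS + qbg * TC                    # eq. (9)
    delay, stage = 0.0, 0
    while True:
        for _ in range(random.randrange(cw0 * 2 ** stage)):
            if random.random() < p:  # background transmission: counter frozen
                delay += t_bg
            delay += SIGMA           # one backoff slot elapses
        if random.random() < p:      # tagged transmission collides
            delay += TC
            stage = min(stage + 1, m)
        else:
            return delay + TS        # successful transmission
```

Averaging a few thousand samples for N = 50 with τ of about 0.015 gives a mean of the magnitude reported later in Fig. 1(a); the match is only approximate because of the simplifications above.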
3
Analysis of an Unsaturated Terminal
In this section, we evaluate the system performance of an unsaturated terminal in the IEEE 802.11 wireless network consisting of N terminals. Since the performance of a terminal in the network largely depends on the packet transmission processes of the other terminals in the network, we consider the worst case scenario where N − 1 terminals are assumed to be saturated and the terminal of
interest, called the tagged terminal, is unsaturated. Since the network contains an unsaturated terminal, it is called the unsaturated network for simplicity from now on. Due to the saturation assumption on the other terminals in the worst case scenario, the resulting performance of the tagged terminal is the worst case performance. Note that the analysis of the worst case performance is meaningful because the quality of service that a terminal (or a user) perceives largely depends on the worst case performance. To analyze the performance of the tagged terminal, we first compute the service capacity (or the service rate) of the tagged terminal in our worst case scenario. In section 2, we obtained the expected medium access delay E[D_0] of a packet. Noting that the medium access delay is in fact the service time of a packet in a terminal, we see that the service rate of the tagged terminal is 1/E[D_0]. From now on we assume that λ < 1/E[D_0] for the stability of the tagged terminal. Next, we propose a mathematical model to evaluate the performance of the tagged terminal in the worst case scenario. The proposed model is based on the effective bandwidth theory [17]. The merit of using the effective bandwidth theory is that we can consider the arrival process and the service process separately, and accordingly we do not need to consider the arrival process in the analysis of the service process. That is, even though we consider the performance of the tagged terminal in the unsaturated network, with the help of the effective bandwidth theory we can use the packet transmission process of the tagged terminal under the saturated condition as the packet service process of the tagged terminal in the analysis. We will verify this in the numerical studies later. Let N(t) be the number of successfully transmitted packets of the tagged terminal under the saturated condition during the time interval [0, t]. We first compute the M.G.F. (Moment Generating Function) E[e^{θN(t)}] of N(t).
For convenience, let μ and σ² denote the expectation and variance of the successful packet transmission time of the tagged terminal, which are obtained in section 2. From renewal theory [15], for sufficiently large t it can be shown that N(t) is well approximated by a Normal distribution with mean t/μ and variance (σ²/μ³)t. Then, it follows that

    E[e^{θN(t)}] ≈ e^{(t/μ)θ + (σ²t/(2μ³))θ²}.

Then the EBF (Effective Bandwidth Function) ξ_S(θ) of the packet service process of the tagged terminal [17] is given by

    ξ_S(θ) = lim_{t→∞} −(1/(tθ)) log E[e^{−θN(t)}] ≈ 1/μ − (σ²/(2μ³)) θ.
Similarly, we define the EBF ξ_A(θ) of the arrival process by ξ_A(θ) = Λ_A(θ)/θ, where A(t) denotes the number of packets arriving at the tagged terminal during the time interval [0, t] and Λ_A(θ) = lim_{t→∞} t^{−1} log E[e^{θA(t)}] [17]. Then, it can be shown [17] that the number Q of packets in the tagged terminal in steady state satisfies Q =_d sup_{t≥0} {A(t) − N(t)}, and that the overflow probability of Q is given by
    P{Q > x} ≈ P{Q > 0} e^{−θ* x},                                         (14)

where θ* is the unique real solution of the equation ξ_A(θ) − ξ_S(θ) = 0. In the numerical studies, we will use equation (14) to examine the overflow probability of the tagged terminal and compare the result with simulation results.
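Solving ξ_A(θ) − ξ_S(θ) = 0 for θ* is a one-dimensional root-finding problem, since ξ_A is increasing and ξ_S is decreasing in θ. The sketch below uses bisection; for illustration only, the arrival EBF is that of a plain Poisson process of rate lam (the ON-OFF source used later in section 4 would need its own Λ_A), and the numerical values of μ and σ² in the usage example are merely of the order of those in Fig. 1. All names are ours.

```python
import math

def xi_s(theta, mu, sigma2):
    """Service EBF from the Normal approximation: 1/mu - sigma2*theta/(2*mu^3)."""
    return 1.0 / mu - sigma2 * theta / (2.0 * mu ** 3)

def xi_a_poisson(theta, lam):
    """Arrival EBF of a Poisson process of rate lam: lam*(e^theta - 1)/theta."""
    return lam * math.expm1(theta) / theta

def theta_star(mu, sigma2, lam, lo=1e-9, hi=10.0, iters=200):
    """Bisection for the root of xi_A(theta) - xi_S(theta), increasing in theta."""
    f = lambda th: xi_a_poisson(th, lam) - xi_s(th, mu, sigma2)
    assert f(lo) < 0 < f(hi), "requires lam < 1/mu (stability)"
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

def overflow_prob(x, mu, sigma2, lam, p_q_positive=1.0):
    """Eq. (14): P{Q > x} ~ P{Q > 0} * exp(-theta* x)."""
    return p_q_positive * math.exp(-theta_star(mu, sigma2, lam) * x)
```

In practice P{Q > 0} would also be estimated; taking it as 1 gives a conservative bound on the overflow probability.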
4
Numerical Studies
In this section, we first provide numerical results based on our analysis in subsection 2.3 to investigate the medium access delay of the IEEE 802.11 DCF. We also simulate the IEEE 802.11 network using a simulator we developed, written in the C++ programming language. We compare the simulation results with our numerical results to validate our analysis. In all simulations as well as numerical studies, we use the following system parameters:

– Payload Size: 8000 bits
– Phy Header (including preamble): 192 bits
– Mac Header (including CRC bits): 272 bits
– RTS Frame: Phy Header + 160 bits
– CTS Frame: Phy Header + 112 bits
– ACK Frame: Phy Header + 112 bits
– Data Rate: 1e6 bits/sec
– Time Slot σ = 20e-6 sec, SIFS = 10e-6 sec, DIFS = 50e-6 sec
– CW0 = 32, m = 5
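With these parameters, the RTS/CTS cycle durations T_s and T_c used in the analysis work out as follows (a worked computation; all frame times include the PHY header):

```python
RATE = 1e6                          # data rate, bits/sec
SIFS, DIFS = 10e-6, 50e-6           # inter-frame spaces, sec
RTS = (192 + 160) / RATE            # 352 bits
CTS = (192 + 112) / RATE            # 304 bits
ACK = (192 + 112) / RATE            # 304 bits
PACKET = (192 + 272 + 8000) / RATE  # PHY + MAC headers + payload

T_S = RTS + SIFS + CTS + SIFS + PACKET + SIFS + ACK + DIFS
T_C = RTS + DIFS

print(T_S)  # approx. 9.504e-3 s per successful RTS/CTS exchange
print(T_C)  # approx. 4.02e-4 s per collided RTS
```

The large gap between T_S and T_C is the reason the RTS/CTS mechanism keeps collision costs small relative to successful transmissions.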
The results are given in Fig. 1. From Fig. 1(a), we see that our analytic results are well matched with the simulation results. In addition, the expectation of the medium access delay increases linearly in the number of terminals. From Fig. 1(b), our analytic results follow the simulation results well, even though they tend to slightly underestimate the variance of the medium access delay. We think this underestimation is due to the independence assumption
Fig. 1. The Expectation and Variance of the Medium Access Delay of a Packet (in seconds): (a) expectation and (b) variance vs. the number of terminals
Fig. 2. The Overflow Probability of an Unsaturated Terminal: log10(P{Q > x}) vs. the buffer size x, for both simulation and analysis
on the packet transmission processes of the background traffic and the tagged terminal in our analysis. In Fig. 2, we plot the overflow probabilities obtained from (14) as well as by simulation. In this study, the number of nodes is 50 and the packet arrival process is an ON-OFF process: packet arrivals during ON periods follow a Poisson process, and there is no packet arrival during OFF periods. The transition rate from the ON state (resp. OFF state) to the OFF state (resp. ON state) is 70000 (resp. 25000) 1/second. The Poisson arrival rate during ON periods is obtained from the constraint that the average arrival rate is 1.5. As shown in Fig. 2, our analytic results are well matched with the simulation results, from which we can verify the usefulness of the proposed model in this paper.
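For completeness, the effective bandwidth of such an ON-OFF source can be computed from the classical result that, for Markov-modulated Poisson arrivals, Λ_A(θ) is the largest eigenvalue of Q + (e^θ − 1) diag(λ), where Q is the generator of the modulating chain [17]. The sketch below applies this to the 2-state source above; the ON rate is derived from the stated average rate of 1.5 (which we read as packets per unit time), and all variable names are ours.

```python
import math

R_ON_OFF = 70000.0                       # ON -> OFF transition rate (1/s)
R_OFF_ON = 25000.0                       # OFF -> ON transition rate (1/s)
P_ON = R_OFF_ON / (R_ON_OFF + R_OFF_ON)  # stationary probability of ON
LAM_AVG = 1.5                            # stated average arrival rate
LAM_ON = LAM_AVG / P_ON                  # Poisson rate during ON periods

def lambda_a(theta):
    """Largest eigenvalue of Q + (e^theta - 1)*diag([LAM_ON, 0]) for the
    2-state chain with generator Q = [[-R_ON_OFF, R_ON_OFF],
                                      [R_OFF_ON, -R_OFF_ON]]."""
    z = math.expm1(theta)
    a, b = -R_ON_OFF + z * LAM_ON, R_ON_OFF
    c, d = R_OFF_ON, -R_OFF_ON
    tr, det = a + d, a * d - b * c
    return 0.5 * (tr + math.sqrt(tr * tr - 4.0 * det))

def xi_a(theta):
    """EBF of the ON-OFF arrival process: Lambda_A(theta) / theta."""
    return lambda_a(theta) / theta
```

As θ → 0, ξ_A(θ) tends to the average rate 1.5, and it increases with θ toward the peak rate LAM_ON, as an effective bandwidth should.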
5
Conclusions
In this paper, we analyzed the medium access delay of a packet in the saturated IEEE 802.11 wireless network. In our analysis, we considered the detailed packet transmission processes of terminals in the network and derived the Laplace transform of the medium access delay of a packet. Based on the analysis of the medium access delay under the saturated condition, we proposed a mathematical model to analyze the packet overflow probability of an unsaturated terminal. We also provided numerical and simulation results to validate our analysis and to investigate the characteristics of the system performance.
References

1. IEEE LAN MAN Standard, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, ANSI/IEEE Std 802.11, 1999 edn.
2. Chhaya, H., Gupta, S.: Performance modeling of asynchronous data transfer methods of IEEE 802.11 MAC protocols. Wireless Networks 3, 217–234 (1997)
3. Foh, C.H., Tantra, J.W.: Comments on IEEE 802.11 saturation throughput analysis with freezing of backoff counters. IEEE Communications Letters 9(2), 130–132 (2005)
4. Tay, Y., Chua, K.: A capacity analysis for the IEEE 802.11 MAC protocol. Wireless Networks 7(2), 159–171 (2001)
5. Cali, F., Conti, M., Gregori, E.: IEEE 802.11 protocol: design and performance evaluation of an adaptive backoff mechanism. IEEE Journal on Selected Areas in Communications 18(9), 1774–1786 (2000)
6. Wang, C., Li, B., Li, L.: A new collision resolution mechanism to enhance the performance of IEEE 802.11 DCF. IEEE Transactions on Vehicular Technology 53(4), 1235–1246 (2004)
7. Weinmiller, J., Woesner, H., Ebert, J.-P., Wolisz, A.: Modified backoff algorithms for DFWMAC's distributed coordination function. In: Proceedings of the 2nd ITG Fachtagung Mobile Kommunikation, Neu-Ulm, Germany, pp. 363–370 (September 1995)
8. Carvalho, M.M., Garcia-Luna-Aceves, J.J.: Delay analysis of IEEE 802.11 in single-hop networks. In: Proceedings of the 11th IEEE International Conference on Network Protocols, Atlanta, USA (November 2003)
9. Carvalho, M.M., Garcia-Luna-Aceves, J.J.: Modeling single-hop wireless networks under Rician fading channels. In: Proceedings of WCNC 2004 (March 2004)
10. Tickoo, O., Sikdar, B.: On the impact of IEEE 802.11 MAC on traffic characteristics. IEEE Journal on Selected Areas in Communications 21(2), 189–203 (2003)
11. Tickoo, O., Sikdar, B.: Queueing analysis and delay mitigation in IEEE 802.11 random access MAC based wireless networks. In: Proceedings of IEEE INFOCOM, Hong Kong, China, pp. 1404–1413 (March 2004)
12. Tickoo, O., Sikdar, B.: A queueing model for finite load IEEE 802.11 random access MAC. In: Proceedings of IEEE International Conference on Communications, vol. 1, pp. 175–179 (June 2004)
13. Özdemir, M., McDonald, A.B.: A queueing theoretic model for IEEE 802.11 DCF using RTS/CTS. In: The 13th IEEE Workshop on Local and Metropolitan Area Networks, pp. 33–38 (April 2004)
14. Bianchi, G.: Performance analysis of the IEEE 802.11 distributed coordination function. IEEE Journal on Selected Areas in Communications 18(3), 535–547 (March 2000)
15. Ross, S.M.: Stochastic Processes, 2nd edn. John Wiley & Sons, Chichester (1996)
16. Hwang, G.U., Lee, Y., Chung, M.Y.: A new analytic method for the IEEE 802.11 distributed coordination function. In revision for IEICE Transactions on Communications (March 2007)
17. Chang, C.-S.: Performance Guarantees in Communication Networks. Springer, Heidelberg (2000)
Communications Challenges in the Celtic-BOSS Project

Gábor Jeney¹, Catherine Lamy-Bergot², Xavier Desurmont³, Rafael Lopez da Silva⁴, Rodrigo Álvarez García-Sanchidrián⁵, Michel Bonte⁶, Marion Berbineau⁷, Márton Csapodi⁸, Olivier Cantineau⁹, Naceur Malouch¹⁰, David Sanz¹¹, and Jean-Luc Bruyelle¹²

1 Budapest University of Technology and Economics, Hungary, [email protected]
2 THALES Communications, France, [email protected]
3 Multitel Asbl, Belgium, [email protected]
4 Telefónica Investigación y Desarrollo, Spain, [email protected]
5 Ingeniería y Economía del Transporte, S.A., Spain, [email protected]
6 ALSTOM-TRANSPORT, France, [email protected]
7 Institut National de Recherche sur les Transports et leur Sécurité, France, [email protected]
8 EGROUP-Services Ltd, Hungary, [email protected]
9 BARCO-SILEX, Belgium, [email protected]
10 University Pierre and Marie Curie, France, [email protected]
11 Société Nationale des Chemins de fer Français, France, [email protected]
12 Université Catholique de Leuven, [email protected]
Abstract. The BOSS project [1] aims at developing an innovative and bandwidth-efficient communication system to transmit large data rate communications between public transport vehicles and the wayside, to answer the increasing need from public transport operators for new and/or enhanced on-board functionality and services, such as passenger security, and operational services such as remote diagnostics or predictive maintenance. As a matter of fact, security functions, traditionally covered in stations by means of video-surveillance, are clearly lacking on-board trains, due to the absence of efficient transmission means from the train to a supervising control centre. Similarly, diagnostic or maintenance issues are generally handled when the train arrives in stations or during maintenance stops, which prevents proactive actions from being carried out. The aim of the project is to circumvent these limitations and offer a system-level solution. This article focuses on the communication system challenges.

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 431–442, 2007.
© Springer-Verlag Berlin Heidelberg 2007
432
G. Jeney et al.

1
Introduction
The purpose of the BOSS project is to design, develop and validate an efficient railway communication system, both on-board trains and between on-board and wayside, with a guaranteed QoS (Quality of Service) level to support the high demands for services such as security on-board trains and predictive maintenance, as well as to provide Internet access or other information services to travellers. The principle of the overall architecture proposed to meet this goal is given in Figure 1. Railway communications can be considered highly challenging because of the specific environment (safety, speeds up to 350 km/h, tunnels, low cost, etc.). Therefore, the results from the BOSS project will also serve other application areas, such as road applications. The work in the BOSS project will focus on providing an optimised communication system relying on an IPv6 (Internet Protocol version 6) architecture for both on-board and wayside communications, offering a QoS level high enough to support both the demands for better passenger security on-board trains and the high throughputs with lower QoS requirements for travellers. The BOSS project will lead to the availability of validated new and/or enhanced passenger services, such as passenger security, predictive maintenance and Internet access on board public transport vehicles such as trains, through an efficient interconnection between land and on-board communication systems with a guaranteed level of QoS. Moreover, in order to validate the design but also to integrate real requirements, the use of realistic specifications and the adaptation of simulations based on real-life inputs will be validated within the BOSS project through the implementation and testing of two different transmitting
Fig. 1. The BOSS project architecture
Communications Challenges in the Celtic-BOSS Project
433
techniques (typically among WLAN, WiMAX, UMTS or similar systems). This will demonstrate the wireless connection capability both for surveillance needs inside the carriages and from the train to the surveillance centre as well as for Internet access within the train. This technology platform will allow to demonstrate the feasibility of transport on-board surveillance, leading to the increase of the level of security services and comfort feeling in Public Transport for European citizens. It should be noted that the reason why video applications are a key issue in the project is that the bandwidth they require is far from negligible, which will have a deep impact on the choice of wireless accesses and design of end-to-end QoS solutions, but also handover capabilities. Furthermore, video streams have to be transmitted simultaneously from several trains to the control centre when alarms are set up. The problem of radio resources management and multi user accesses are also an important issue of the BOSS project. To reach these goals, the following axes will be followed and investigated: – establish interconnectivity between internal wired, internal wireless (e.g. WLAN) and external wireless (e.g. WiMAX, UMTS, 3GPP+, 802.22, . . . ) systems, with handover issues on external wireless links, coverage and multi user issues, – ensure guaranteed QoS, including while performing handovers, to allow for an end-to-end QoS over the whole system, manage the different QoS on the different data links, – develop robust and efficient video coding tools that can be embedded in handheld devices and respect low delay requirements (for mobile supervisors use), – propose video analysis solutions for security issues, – provide audio processing tools to complement video and improve robustness of alarms generation. This document focuses on the first two areas. Since the project was started in October 2006, and it is 2.5 years long, there are no specific results yet, which are worth mentioning. 
The primary aim of this publication is to obtain feedback from the scientific community on the approach the project has chosen to follow.
2 State-of-the-Art Situation

2.1 State-of-the-Art on Dual Mobility for Transmission with QoS
Mobility over IP networks. Mobile IPv6 (MIPv6) provides layer-3 mobility that is transparent to the higher layers (for example TCP), so a mobile node remains reachable at its home address without needing to be connected to its home network. The transition, or handover, between networks is transparent to the higher layers, and the connectivity loss produced during the handover is due to the exchange of the corresponding signalling messages. Every Mobile Node (MN) has a local address (or Home Address), which represents its original network address. This address remains the same regardless of the mobile node's position: when moving to another network, the node still keeps its home address. While the mobile node stays in its original network, packets sent to it are routed normally, as if the node were not mobile. The prefix of this address is the same as the prefix of the network where the node originated. When a mobile node moves to a different network, it obtains a guest address (Care-of Address, CoA) belonging to the address space of the visited network. The mobile node can acquire its care-of address through conventional IPv6 mechanisms, such as stateless or stateful auto-configuration. From then on, it can also be reached at this new address (in addition to the home address). After obtaining the new address, the mobile node contacts a specific router in its home network (the Home Agent, HA) and, during the registration process, registers its current CoA. Afterwards, when a packet is sent to the mobile node's home address, the Home Agent will intercept it and tunnel it to the mobile node's CoA. With this mechanism, packets reach the mobile node at any location, because the CoA belongs to the address space of the subnet where it is connected.

Handovers over wireless systems. Handover refers to the process of changing access points during communication. It can be divided into two separate categories, horizontal handover (HH) and vertical handover (VH). Horizontal handover means changing access points inside the same network, i.e. the user changes its geographic position and a new access point is assigned to maintain the communication link. For example, a horizontal handover happens when a mobile service subscriber exits a cell and enters another. Vertical handover means changing between access networks of different technologies which are available at the same geographic location, without disturbing the communication. For instance, if both GSM/GPRS and UMTS networks are available, the user switches from GSM/GPRS to UMTS, typically in the hope of higher bandwidth.
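As a toy illustration of such a vertical handover decision, the sketch below ranks the technologies visible at the current location and switches only when a preferred one appears. The ranking is a hypothetical preference for this example, not a policy taken from the BOSS project.

```python
# Hypothetical per-technology preference (e.g. by expected bandwidth);
# illustrative only, not a BOSS-specified ranking.
PREFERENCE = {"WLAN": 3, "WiMAX": 2, "UMTS": 1, "GSM/GPRS": 0}

def select_access(current, available):
    """Return the preferred reachable technology. Returning a technology
    different from `current` corresponds to a vertical handover."""
    reachable = [t for t in available if t in PREFERENCE]
    if not reachable:
        return current  # nothing visible: stay on the current network
    best = max(reachable, key=PREFERENCE.get)
    return best if PREFERENCE[best] > PREFERENCE.get(current, -1) else current

# A node on GSM/GPRS entering joint GSM/GPRS + UMTS coverage switches to UMTS.
print(select_access("GSM/GPRS", ["GSM/GPRS", "UMTS"]))  # UMTS
```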
Another goal is to implement IP-based vertical handover protocols. With such protocols, transparent mobility management can be realised (switching between access networks, roaming). An important application of such protocols is supporting the multimedia streaming services now spreading in mobile communications. The most serious problem is the IP address change caused by the handover. During the handover the mobile client roams to another access technology, and to another service provider at the same time; after the handover a new IP address is assigned to it. The objective of the protocol is to hide the change of address from the upper layers, both at the mobile device and at the communication partner. Since we focus on TCP/UDP traffic, the protocol must handle the socket pair on both sides to maintain the connection state.

End-to-end QoS over heterogeneous networks. Special care will be taken to address end-to-end service continuity and QoS support across wired and ad hoc wireless networks. Providing Quality of Service to the network, especially in the case of several hops in a heterogeneous and dynamic environment, is a challenging task. Static internet Quality of Service (QoS) solutions are difficult to implement in mobile ad hoc networks. The IntServ/RSVP QoS architecture model is not suitable due to the limited resources (e.g. the amount of state information increases dramatically with the number of flows, and the nodes must perform admission control, classification and scheduling functions). The DiffServ architecture would seem a better solution. However, DiffServ is defined for a fixed infrastructure in which there are boundary DiffServ routers that perform QoS functions, and a Service Level Agreement (SLA) that defines the kind of contract between the Internet Service Provider (ISP) and the client. In a mobile ad hoc network it may be difficult to identify where the boundaries are: a node should be able to work both as a boundary and as an interior router, which complicates the tasks of the nodes. Furthermore, the very concept of an SLA is also difficult to define in such environments. There are several proposals based on QoS-aware routing and a QoS-aware MAC layer. However, QoS architectures and protocols still need to be studied further, especially with respect to node mobility, network diameter and processing power. Transport protocols face major challenges for the end-to-end support of QoS in wired/wireless heterogeneous networks. The main transport protocol, TCP (Transmission Control Protocol), has been designed to provide reliable end-to-end data delivery. Its mechanisms have been designed and tuned for wired networks, ignoring the specificities of the wireless medium, such as high bit error rates, frequent loss of connectivity, and power constraints. Streaming applications, especially for audio and video, share a preference for timeliness over reliability. These applications tend to use RTP (Real-time Transport Protocol) in place of TCP, to avoid the built-in congestion control mechanisms of TCP that lead to increased delay in the delivery of packets.
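DiffServ relies on per-packet codepoint markings. As a minimal, generic illustration (not part of the BOSS design), a sender can mark its own IPv6 traffic by setting the Traffic Class field; the EF codepoint used here is a common choice for delay-sensitive flows, not a value prescribed by the project, and the socket option is as available on Linux.

```python
import socket

# EF (Expedited Forwarding) is DSCP 46; the DSCP occupies the upper
# 6 bits of the 8-bit Traffic Class / TOS field.
EF_DSCP = 46
TRAFFIC_CLASS = EF_DSCP << 2  # 0xB8 = 184

# Mark an IPv6 UDP socket so DiffServ routers can prioritise the flow.
sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_TCLASS, TRAFFIC_CLASS)
# sock.sendto(b"video payload", ("2001:db8::1", 5004))  # packets now carry EF
```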
The coexistence of congestion-controlled traffic (TCP) and non-congestion-controlled traffic (RTP/UDP) on the same network induces a lack of fairness between the different sessions and thus poses severe challenges to the overall QoS architecture. The project will study this issue, review the work carried out in the IETF DCCP (Datagram Congestion Control Protocol) working group, and propose specific congestion control mechanisms adapted to the BOSS heterogeneous wired/wireless mesh network. The BOSS project will study and propose an end-to-end QoS framework that can apply both to fixed and ad hoc network extensions. The key features of this framework will comprise the definition of a QoS policy for the ad hoc and public safety world, in compliance with management and accounting rules, the identification, signalling and marking of flows to render a coordinated end-to-end service, and the development of techniques for prioritising flows in each individual sub-network.

QoS over IP networks. In itself, QoS is an intuitive concept, defined by ITU-T Rec. E.800 as “the collective effect of the service performance which determines the degree of satisfaction of a user of the service”, or “a measure of how good a service is, as presented to the user; it is expressed in user-understandable language and manifests itself in a number of parameters, all of which have either subjective or objective values.” Even though these definitions are quite simple and comprehensive at a global level, it is generally complex to derive real measures that reflect specific network requirements or constraints. Furthermore, realising applications conforming to subjective parameters can be extremely difficult due to contrasting user needs. For these reasons, standards organisations and network researchers have spent considerable effort on mapping the end-user perspective to specific network requirements. The results essentially report the subjective opinion of a properly selected set of test users regarding their satisfaction with the service (for example, watching a video or listening to recorded audio), which in turn depends on several aspects, in particular network-related ones in a telecommunications scenario. ITU-T has defined in Rec. P.800 a Mean Opinion Score (MOS) scale, which can be used to map the subjective opinion score to a qualitative value.

New wireless solutions in mobility mode. The current state of mobile communication and consumer electronics can be characterised by the convergence of devices and by the growing need for connecting those devices. In this respect, simplicity and security are the two primary goals. Cumbersome network settings can possibly be dealt with in the computer world, but certainly not in the mobile and consumer electronics world. The main driver for creating the Near Field Communication Interface and Protocol (NFCIP-1) was to enable users to create a connection between two devices without any special knowledge about the “network”, while any NFC-compliant device can be connected securely. The concept is strikingly simple: in order to make two devices communicate, bring them together or make them touch. As the two devices identify each other, they exchange their configuration data via NFC, and then they can set up and continue communication either with NFC or via other (longer-range and faster) communication channels (such as Bluetooth or WiFi).
In Fall 2002, Sony and Philips reached an agreement on the development of Near Field Communication (NFC) technology. In order to promote NFC worldwide, the two companies submitted the draft specifications to ECMA International, the organisation responsible for standardising information and communication systems. After the development of open technical specifications, NFCIP-1 was approved as ECMA-340, and subsequently submitted by ECMA International to ISO/IEC, where it received approval as ISO/IEC IS 18092. The NFCIP-2 standard allows interoperation with RFID, Proximity Card (ISO/IEC 14443:2001, Identification cards – Contactless integrated circuit(s) cards – Proximity cards) and Vicinity Card (ISO/IEC 15693:2001, Identification cards – Contactless integrated circuit(s) cards – Vicinity cards) devices and readers by defining a mechanism for selecting one of the three operation modes as part of establishing the connection.

Communication systems for video surveillance tools. The combination of video surveillance and communication systems has been tested for metro applications in the PRISMATICA [14], STATUE [15] and ESCORT [16] projects, and a first-generation system is already deployed on RATP line 14 and in the Singapore underground [17], in specific tunnel environments. In the case of urban buses, the RATP systems AIGLE and ALTAÏR were a first step, based on low-capacity professional radio. Projects such as TESS [13], SECURBUS in Belfort [18], LOREIV in Marseilles [19] and EVAS can also be mentioned. In the TESS project, a DVB-T link using the Worldspace geostationary satellite was successfully tested for transmitting passenger information in urban buses. WiMAX has been tested in Lille, and innovative tools for audio event detection have been proposed to design a new bimodal surveillance system adapted to an embedded context. Some of the building blocks developed there will be re-used in the BOSS project in the context of the railway environment. Existing cellular systems allowing multimedia communication, such as GSM, GPRS, EDGE and UMTS, offer poor bandwidth efficiency in the downlink direction. Even if the problem of high communication costs in nominal traffic mode were solved, it would be impossible for these systems to cope with the flexibility and high traffic demand of crisis situations.

2.2 State-of-the-Art on Wireless Mobile Transmission Links
With today's offerings, high-data-rate mobile access is not realistic, particularly from the train to the ground, due to the poor spectral efficiency of existing systems. In the French national project TESS [13], the inadequacy of existing cellular systems was demonstrated for data transmission between a moving urban bus and a control centre. Recently, many new signal processing schemes have been proposed and are being studied to increase the wireless link capacity while taking mobility issues into account. The main topics of interest are MIMO/MTMR links, adaptive links, cross-layer optimisations, interference cancellation, etc. Some of these techniques are proposed as options in the standards and need to be evaluated to verify that they will deliver the expected capacity and robustness. Emerging standards such as 802.16e, 3GPP+, 802.20 and 802.22 aim at providing both mobility and high throughputs. To do so, the signal processing schemes that are generally foreseen are based on diversity techniques (MIMO), new coding and modulation schemes and adaptivity, as presented above. These techniques thus need to be qualified in order to evaluate their impact on system performance. Convergence between WMAN and 3GPP+ is also foreseen, which could lead to lower-cost deployments. For the application of these wireless links to the project objective, we can identify two major issues: the security application will mainly require a large uplink (from the train to the ground), whereas the travellers' needs would rather call for an ADSL-like service with a large downlink throughput and a smaller uplink. In this scope, two main directions will be followed: increasing the throughput and increasing the coverage. In both cases, multiple-element antennas are appropriate, as the capacity gain may be used to operate at lower SNR and/or to increase the throughput.
For a capacity increase to be achievable using point-to-point links, it is necessary to have sufficient propagation diversity in the MIMO channel. Notice that the antennas could be distributed along the train. For a capacity increase in a multipoint-to-point link, the capacity increase is obtained by establishing multiple links with either the train or the access points (the links sharing the same bandwidth and transmitting simultaneously). A harsh issue in this specific context is the short coherence time of the propagation channel. Indeed, due to the speeds of the trains and the specific propagation conditions, the propagation channels will vary far more drastically than those experienced in today's standards developments, which focus on pedestrian applications. Efficient capacity-increasing schemes rely on knowledge of the propagation channel, requiring fast tracking/prediction and robust communication links.
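The coherence-time concern can be quantified with a standard rule of thumb. The 2.5 GHz carrier and the T_c ≈ 9/(16π·f_d) approximation below are illustrative assumptions for this sketch, not project figures:

```python
import math

C = 3e8  # speed of light, m/s

def coherence_time(speed_kmh, carrier_hz):
    """Rule-of-thumb channel coherence time T_c ~ 9 / (16*pi*f_d),
    with maximum Doppler shift f_d = v * f_c / c."""
    f_d = (speed_kmh / 3.6) * carrier_hz / C  # max Doppler shift, Hz
    return 9 / (16 * math.pi * f_d)           # seconds

# Pedestrian (3 km/h) vs. train (350 km/h), both at an assumed 2.5 GHz carrier:
print(f"pedestrian: {coherence_time(3, 2.5e9) * 1e3:.1f} ms")   # ~26 ms
print(f"train     : {coherence_time(350, 2.5e9) * 1e3:.3f} ms") # ~0.22 ms
```

The channel thus decorrelates two orders of magnitude faster at train speed, which is why schemes tuned for pedestrian scenarios need fast tracking or prediction here.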
3 Relevance to Market Needs and Expected Impact
The BOSS project will provide the communications systems needed to develop new telecommunications applications in the domain of public transport. The major impact will be the societal benefit to European end-users, both in terms of confidence in the security of the transport system and in terms of perceived comfort and transport efficiency, especially during off-peak hours.

3.1 Technological Innovation and Strategic Relevance
The objectives presented in Section 1 will be reached through the development of a dedicated architecture relying on the interconnection of the indoor world (the train) and the outdoor world via a common and unique IP gateway. As illustrated by Figure 1, the IP gateway will interconnect a set of both wired and wireless accesses:

– Cameras and microphones dedicated to bimodal surveillance, aiming at ensuring a high security level for passengers and employees by alerting on abnormal events;
– Sensors dedicated to maintenance needs, aiming at providing alarms and machine state information to any supervisor able to manage them;
– Mobile units with wireless access via an indoor wireless network (e.g. WLAN), carried by on-board supervisors (e.g. controllers), aiming at ensuring a second level of safety/security by being able to react to alarms/alerts and also to raise such alarms/alerts;
– Wireless access via an outdoor network (e.g. UMTS/WiMAX) between the train and the ground control centre, where in-depth event detection or maintenance analysis can be carried out, and where specific security/safety actions can be launched (e.g. calling the police to the next train stop).

New mobility approaches. Due to its target use case, the BOSS project considers not one level of mobility, but two. The first corresponds to mobility inside the train itself, which can be viewed as a traditional setting for the mobile user within the train reference frame. The second level takes into account the mobility of the train itself in the terrestrial reference frame. As such, the BOSS project targets a dual mobility mode with a guaranteed QoS level.
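The two mobility levels can be sketched with a minimal data model. The addresses and field names below are assumptions made for this illustration, not taken from the BOSS specifications: a passenger device keeps an on-board care-of address behind the gateway, while the gateway keeps its own care-of address in whatever outdoor network the train currently crosses, so reaching the device means resolving both levels.

```python
# Outdoor mobility level: the train gateway in the terrestrial reference frame.
train_gateway = {"home": "2001:db8:1::1", "coa": "2001:db8:abcd::1"}
# On-board mobility level: a device moving inside the train reference frame.
passenger_dev = {"home": "2001:db8:1::77", "coa_onboard": "fd00:0:0:1::77"}

def route_to_device(dev, gw):
    """Return the address chain a downlink packet follows: first the
    gateway's current outdoor CoA, then the device's on-board address."""
    return [gw["coa"], dev["coa_onboard"]]

print(route_to_device(passenger_dev, train_gateway))
```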
Example scenario for the BOSS project demonstration. Based on the BOSS IP gateway architecture presented in Figure 1 and on the partners' experience in the domain, a first example scenario is described hereafter, in order to illustrate the type of application that the BOSS architecture could offer. This scenario corresponds to the transmission of an alert, for example due to a passenger's health problem, to the distributed control centre for immediate action.

Communication from/to the train and the distributed control centre (see Figure 2). A train (especially a suburban train) will cross several types of wireless networks, from GSM-R/GPRS/UMTS/WiMAX to WLAN, and soon future systems such as 802.16e (mobile WiMAX) and 802.22 WRAN (Wireless Regional Area Network). Based on its IP mobility capability, the BOSS architecture will select the adequate outdoor wireless link for transmission of the data to the control centre. In practice, the IP gateway sees the outdoor wireless link as just another link in its network, and the control centre as another node in the global IP network. In one or several hops, the data (alert, context, etc., as well as other data if necessary) are then transmitted to the control centre for analysis or action.

Abnormal event detection (see Figure 3). A camera installed in the passenger car, connected to the IP gateway by a wired link, films the scene, which is then analysed by event detection tools to generate an automatic alert in case an abnormal event is detected; alternatively, a manual alert is transmitted over a wireless link to the IP gateway by a mobile supervisor (e.g. a controller). The health problem is detected here. In the same way, a set of microphones is connected to the IP gateway and captures the sound environment inside the vehicle. Audio signals are then analysed to automatically detect certain events. In overcrowded environments where occlusions appear, visual analysis alone is not always sufficient to reliably understand passengers' activity. Sound can provide salient information to resolve ambiguities and to improve the detection/identification rate.

Fig. 2. Scenario example: communication from/to train and distributed control centre

Fig. 3. Example: video surveillance and abnormal event detection (health problem)

Decision and feedback. Once the alert has been given, the control centre will launch the adapted procedure. If available, a local team will be able to take over immediately, assembling the appropriate medical team and sending it to the next train station for immediate care of the passenger. In the case of an automatic alert, the control centre could also decide to alert the controller via the return channel, to direct him/her to the sick passenger and help direct the medical team. This feedback capability will also be used by the event detection systems to adapt their detection process and sensitivity according to relevance feedback from surveillance operators.
4 Conclusions
This article introduces the Celtic project BOSS. The BOSS project intends to develop a communication system relying on an IP gateway inside the train that will enable communications both inside the train, within carriages and for mobile passengers and controllers, and outside the train, which is mobile in the terrestrial reference frame, with a link towards wireless base stations (e.g. WiMAX, DVB). The BOSS project will consequently work on a dual mobility level and will work to guarantee a differentiated Quality of Service for the different targeted services. The project partners will also work on the adaptation of video surveillance applications, in particular via

– making the existing tools more robust and developing behaviour analysis algorithms, to ensure that passenger security is handled in the best possible way,
– adding audio processing tools to increase the confidence in a generated alarm in situations where video analysis alone is not sufficient.
As an enhanced level of railway passenger security services is highly demanding in terms of bandwidth, this application represents a good case study to validate the BOSS concepts. Moreover, taking advantage of the bandwidth made available both downlink and uplink, wireless communication solutions of great interest to travellers, such as video on demand, internet access and travel information services, will be integrated in the global BOSS framework via an adapted level of service management. The BOSS project, with its IP gateway, aims to offer the mobile train the possibility of informing a control centre of both security- and train-operation-related issues, and consequently to greatly increase user protection, while also demonstrating the possibility of simultaneously offering video on demand, on-board information and telecommunication services to travellers. Validation will be performed through on-line tests on a train in revenue service.
Acknowledgement. The editors of this document thank all the BOSS project members whose contributions made this publication possible.
References

1. http://www.celtic-boss.org
2. Haritaoglu, I., Harwood, D., Davis, L.: W4: real-time surveillance of people and their activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 809–830 (2000)
3. Oliver, N., Rosario, B., Pentland, A.: A Bayesian computer vision system for modelling human interactions. IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 831–843 (2000)
4. Johnson, N., Hogg, D.: Learning recognition. Image and Vision Computing 14, 609–615
5. Hongeng, S., Brémond, F., Nevatia, R.: Representation and optimal recognition of human activities. In: IEEE Proceedings of Computer Vision and Pattern Recognition (2000)
6. Vu, T., Brémond, F., Thonnat, M.: Automatic video interpretation: a novel algorithm for temporal scenario recognition. In: Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI 2003) (2003)
7. Desurmont, X., Chaudy, C., Bastide, A., Parisot, C., Delaigle, J.F., Macq, B.: Image analysis architectures and techniques for intelligent systems. In: IEE Proc. on Vision, Image and Signal Processing, Special Issue on Intelligent Distributed Surveillance Systems (2005)
8. Huang, T., Russell, S.: Object identification in a Bayesian context. In: Proceedings of International Joint Conference on Artificial Intelligence, Nagoya, Aichi, Japan, August 23–29, pp. 1276–1283 (1997)
9. Kettnaker, V., Zabih, R.: Bayesian multi-camera surveillance. In: IEEE Conference on Computer Vision and Pattern Recognition, Fort Collins, Colorado, June 23–25, pp. 253–259 (1999)
10. Javed, O., Rasheed, Z., Shafique, K., Shah, M.: Tracking across multiple cameras with disjoint views. In: Proceedings of the Ninth IEEE International Conference on Computer Vision, Nice, France, pp. 952–957 (2003)
11. Pedersini, F., Sarti, A., Tubaro, S.: Multi-camera systems. IEEE Signal Processing Magazine 16(3), 55–65 (1999)
12. Wren, C.R., Rao, S.G.: Self-configuring lightweight sensor networks for ubiquitous computing. In: Proceedings of the International Conference on Ubiquitous Computing, Seattle, WA, USA, October 12–15, pp. 205–206 (2003)
13. Final report for the TESS project: Les systèmes de communications satellites ou terrestres pour les flottes d'autobus urbains (Satellite or terrestrial communication systems for urban bus fleets) (March 2003)
14. Final report, 5th PCRD project PRISMATICA
15. Final report, PREDIT2 – STATUE project
16. Final report, ESCORT – Enhanced diversity and Space Coding for underground metrO and Railway Transmission, IST 1999–20006 project (December 2002)
17. INRETS synthesis document: Synthèse INRETS No. 40, Nov. 2001, Les systèmes de télécommunication existants ou émergents et leur utilisation dans le domaine des transports guidés (Existing or emerging telecommunication systems and their use in guided transports)
18. SECURBUS – Actes de la journée sécurité dans les transports (Proceedings of the Day on Security in Transport Means), Belfort, France (January 2002)
19. Communication avec les mobiles: application au trafic et aux transports routiers (Communication with mobiles: application to traffic and road transportation), Collection du CERTU (March 2001) ISBN 2-11-090861-0, ISSN 0247-1159
20. UITP (International Association of Public Transport), http://www.uitp.com
21. Vacher, M., Istrate, D., Serignat, J.F.: Sound detection through transient models using wavelet coefficient trees. In: Proc. CSIMTA, Cherbourg, France (2004)
22. Khoudour, L., Aubert, D., Bruyelle, J.L., Leclerq, T., Flancquart, A.: A distributed multi-sensor surveillance system for public transport applications. In: Intelligent Distributed Video Surveillance Systems, ch. 7, IEEE, Los Alamitos (to appear)
23. Bruyelle, J.L., Khoudour, L., Velastin, S.A.: A multi-sensor surveillance system for public transport environments. In: 5th International Conference on Methods and Techniques in Behavioral, The Netherlands (2005)
24. Sun, J., Velastin, S.A., Vicencio-Silva, M.A., Lo, B., Khoudour, L.: An intelligent distributed surveillance system for public transport. In: European Workshop on the Integration of Knowledge, Semantics and Digital Media Technology, London, UK (2004)
Performance Analysis of the REAchability Protocol for IPv6 Multihoming

Antonio de la Oliva1, Marcelo Bagnulo2, Alberto García-Martínez1, and Ignacio Soto1

1 Universidad Carlos III de Madrid
2 Huawei Lab at Universidad Carlos III de Madrid
{aoliva,marcelo,alberto,isoto}@it.uc3m.es
Abstract. There is ongoing work in the IETF aimed at providing support for different flavors of multihoming configurations, such as SHIM6 for multihomed sites, multiple-CoA support in MIP for multihomed mobile nodes, and HIP for multihomed nodes and sites. A critical aspect for all the resulting multihoming protocols is to detect failures and gain information related to the paths available between two hosts. The Failure Detection and Locator Path Exploration Protocol (in short, the REAchability Protocol, REAP), being defined in the SHIM6 WG of the IETF, is a good candidate to be included as a reachability detection component in protocols requiring this functionality. A performance study is carried out by combining analytical estimations and simulations to evaluate its behavior and tune its main parameters.

Keywords: multihoming, failure detection, SHIM6, REAP.
1 Introduction
So far, IPv4 has failed to provide a scalable solution to preserve established communications for arbitrarily small sites connected to the Internet through different providers after an outage occurs. This is because the current IPv4 multihoming solution, based on the injection of BGP [BGPMULT] routes in order to make a prefix reachable through different paths, would collapse if the number of managed routing entries were to increase to accommodate small sites and even hosts. In particular, this restriction prevents end hosts equipped with different access interfaces, such as IEEE 802.11, UMTS, etc., connected to different providers, from benefiting from fault tolerance, traffic engineering, etc. On the other hand, the huge address space provided by IPv6 has enabled the configuration of public addresses from each of the providers of an end host. A step further is being taken in the SHIM6 WG of the IETF to develop a framework to manage, in an end-to-end fashion, the use of the different addresses in a communication held by two hosts. To achieve this, a SHIM6 layer is included
This work was supported by IST FP6 Project OneLab and by the Spanish MEC through Project CAPITAL (TEC2004-05622-C04-03).
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 443–454, 2007. c Springer-Verlag Berlin Heidelberg 2007
A. de la Oliva et al.
inside the IP layer to ensure that the same IP address pair is provided to the upper layers to identify a given communication, while the packets flowing in the network can use different IP addresses (locators) to enforce different paths. The SHIM6 [SHIM] [SHIMAPP] layers of two communicating nodes that want to benefit from the multihoming capabilities first execute a four-way handshake to exchange, in a secure way, the relevant information for managing the IP addresses that play the roles of identifiers and locators. Once this exchange has been performed, both SHIM6 layers use the REAP (REAchability Protocol) protocol to detect failures in the currently used path in a timely manner and, once a failure is detected, to select a new path through which the communication can be continued. Note that while the REAP protocol is being defined as a component of the SHIM6 multihoming solution, it is envisioned that such a protocol could become a generic component in different scenarios in which end-to-end path validation is of paramount concern, such as HIP [HIP] or Mobile IPv6 with registration of multiple CoAs [MONAMI]. It is straightforward to conclude from the framework presented above that the REAP protocol determines the performance that upper layers perceive when an outage occurs in the communication path. The path failure detection function of the REAP protocol relies on timers driven by the inspection of upper-layer traffic, and by specific Keep Alive probe packets when upper-layer traffic is too sparse. The current specification of the REAP protocol lacks experimental support to properly configure the timers of the protocol and to fully understand the interaction with transport protocols such as UDP or TCP. In this paper we simulate the protocol with the OPNET tool1 and we analyze the configuration of these timers and their impact on different types of applications.
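The identifier/locator split described above can be sketched with a minimal data structure. This is an illustration of the shim's role only; the class name, the simple list of candidate pairs, and the switching policy are our assumptions, not the SHIM6 wire format or state machine.

```python
class Shim6Context:
    """Toy model of a shim context: upper layers always see the same
    identifier pair, while packets on the wire may carry any of the
    negotiated locator pairs."""

    def __init__(self, local_id, peer_id, locator_pairs):
        self.ids = (local_id, peer_id)       # stable pair shown to transport
        self.locators = list(locator_pairs)  # candidate (src, dst) paths
        self.current = 0                     # index of the pair in use

    def outgoing(self, payload):
        """Rewrite the identifier pair into the current locator pair."""
        src, dst = self.locators[self.current]
        return (src, dst, payload)

    def switch_path(self):
        """After a failure is declared, move to the next candidate pair."""
        self.current = (self.current + 1) % len(self.locators)

ctx = Shim6Context("2001:db8:a::1", "2001:db8:b::1",
                   [("2001:db8:a::1", "2001:db8:b::1"),
                    ("2001:db8:c::1", "2001:db8:d::1")])
ctx.switch_path()                 # failure on the first path
print(ctx.outgoing(b"data")[:2])  # packets now carry the second locator pair
print(ctx.ids)                    # ...while upper layers still see the same pair
```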
The remainder of the paper is organized as follows: a description of the REAP protocol is given in Section 2. The scenario used in the simulation, as well as the details of the simulation environment, are presented in Section 3. Section 4 presents the results obtained regarding the behavior of the UDP (Section 4.1) and TCP (Section 4.2) protocols, providing an analysis of several application types. Finally, in Section 5 we provide the conclusions of our work.
2 Failure Detection and Path Exploration in the SHIM6 Architecture
The SHIM6 architecture is currently being defined by the IETF to provide multihoming between hosts with multiple provider-independent addresses. The architecture defines a shim sublayer, placed in the IP layer, which is responsible for ensuring that the same local and remote addresses are provided to the upper layers for the peers involved in a given communication, while at the same time different addresses can be used to allow the usage of different paths. As a consequence, two roles are assumed by the IP addresses: the term identifier is used for addresses passed to transport and application layers, and the term locator is reserved for the actual addresses used for IP forwarding. SHIM6 defines
¹ OPNET University Program, http://www.opnet.com/services/university/
two components to manage the identifier/locator relationship in two communicating peers: the secure exchange between the peers of information related to identifiers and locators, performed by the SHIM6 protocol [SHIM], and the identification of communication failures and the exploration of alternative paths.

Failure detection does not need specific tools if traffic is flowing between two hosts. On the other hand, when a node has no packets to send, it is irrelevant to the node whether the locator pair is properly working or not, since it has no information to transmit. Thus, the potentially relevant failure situation occurs when a node is sending packets but is not receiving incoming packets. Such a situation does not necessarily imply a failure, since a unidirectional flow may be being received, but it is indistinguishable from a failure without additional tests. In this case, the node needs to perform an explicit exchange of probe packets to discover whether the current locator pair is properly working. This exchange is described in the REAchability Protocol (REAP) [REAP] specification. The REAP protocol relies on two timers, the Keep Alive Timer and the Send Timer, and a probe message, namely the Keepalive message. The Keep Alive Timer TKA is started each time a node receives a data packet from its peer, and stopped and reset each time the node sends a packet to the peer. When the Keep Alive Timer expires, a Keepalive message is sent to the peer. The Send Timer TSend, defined roughly as three times the Keep Alive Timer plus a deviation to accommodate the Round Trip Time, is started each time the node sends a packet and stopped each time the node receives a packet from the peer. If no answer (either a Keepalive or a data packet) is received within the Send Timeout period, a failure is assumed and a locator path exploration is started.
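The timer rules described above can be captured in a minimal sketch. This is an illustrative model, not the full REAP state machine: the class name, the event-driven structure, and the deterministic `tick` polling are assumptions made for clarity; the timeout values are the ones suggested by the specification.

```python
# Minimal sketch of the REAP timer rules (illustrative, not the full protocol).
# TKA: started on packet reception, cancelled/reset when we send.
# TSend: started when we send, cancelled when we receive.

TKA = 3.0      # Keep Alive Timeout (seconds), value suggested by the spec
TSEND = 10.0   # Send Timeout (seconds), value suggested by the spec

class ReapEndpoint:
    def __init__(self):
        self.tka_deadline = None    # when a Keepalive must be emitted
        self.tsend_deadline = None  # when a failure is declared
        self.failed = False

    def on_send(self, now):
        # Sending a payload packet stops/resets TKA and starts TSend.
        self.tka_deadline = None
        if self.tsend_deadline is None:
            self.tsend_deadline = now + TSEND

    def on_receive(self, now):
        # Receiving any packet stops TSend and starts TKA.
        self.tsend_deadline = None
        self.tka_deadline = now + TKA

    def tick(self, now):
        """Return 'keepalive' when a Keepalive is due, 'failure' on TSend expiry."""
        if self.tsend_deadline is not None and now >= self.tsend_deadline:
            self.failed = True
            return "failure"          # start locator-pair exploration
        if self.tka_deadline is not None and now >= self.tka_deadline:
            self.tka_deadline = None  # not restarted until the next reception
            return "keepalive"
        return None
```

For example, an endpoint that sends a packet and then hears nothing back declares a failure TSEND seconds later, while an endpoint that only receives traffic emits a Keepalive TKA seconds after each reception.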
Consequently, the Send timer reflects the requirement that when a node sends a payload packet there should be some return traffic within Send Timeout seconds. On the other hand, the Keepalive timer reflects the requirement that when a node receives a payload packet there should be a similar response towards the peer within Keepalive seconds (if no traffic is interchanged, there is no Keepalive signaling). As a consequence, there is a tight relationship between the values of the timers defined by the REAP protocol and the time required by REAP to detect a failure. The current specifications suggest a value of 3 seconds for the Keepalive Timer and of 10 seconds for the Send Timer, although these values are supported by neither analytical studies nor experimental data.

Once a node detects a failure, it starts the path exploration mechanism. A Probe message is sent to test the current locator pair, and if no response is obtained during a period of time called the Retransmission Timer TRTx, the nodes start sending Probes testing the rest of the available address pairs, using all possible source/destination address pairs. Currently, a sequential algorithm is defined to drive the exploration of locator pairs, a behavior that will be assumed for the rest of the paper. So far the REAP specification process has focused on functionality, without paying much attention to performance metrics in common operating conditions, such as the time required to detect a failure and recover from it. An experimental analysis would provide relevant guidelines to tune the main parameters that define the REAP behavior when combined with different types of applications. In particular,
the interaction with applications using TCP should be considered, in order to characterize the interactions between REAP and the flow and congestion control mechanisms provided by this protocol. When UDP transport is considered, the resulting behavior is driven mainly by the application protocol; in this case relevant applications should be analyzed. In the next sections we perform simulations that aim to provide valuable information related to REAP timer configuration for applications using UDP and TCP.
3 Simulation Setup
In this section we present the scenario used to test the path failure detection functionality of the REAP protocol. Figure 1 shows two nodes, Node A and Node B, each one with two interfaces and an IPv6 address configured on each interface. All simulations have been performed by establishing a communication through the pair (IPA1, IPB1). All traffic exchanged between these IP addresses goes through Clouds 1 and 2. At a certain time, the link connecting Clouds 1 and 2 fails; this is detected by REAP and, after a path exploration, the communication is continued using the IP pair (IPA2, IPB2). The tests performed involve the TCP
Fig. 1. Simulated Scenario
and UDP protocols. The TCP tests, designed to evaluate the TCP behavior in cases with high and low data rates, are performed using an FTP file download application and a Telnet application. The traffic used to evaluate UDP behavior corresponds to either a Voice over IP (VoIP) application showing a bidirectional packet exchange or a unidirectional voice flow. Note that unidirectional flows result in an increased exchange of REAP-specific packets.

For TCP, the Windows XP model defined in OPNET has been used. For UDP, a VoIP conversation using the G.729 codec with a compression delay of 0.02 seconds has been simulated. The RTT in both paths is the same; it has been implemented as a normal distribution with a mean of 80 ms and a variance of 20 ms. The failure event occurs at a time defined by a uniform distribution between 75 and 125 seconds. All simulations have been run for 250 seconds, and the presented results are the average
of 45 samples. The real values are within ±10% (in the worst case) of the estimated values with a confidence interval of 95%.
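The random elements of the setup above can be sketched as follows. This is a hedged reconstruction of the stated parameters, not the OPNET configuration itself; the function names are illustrative, and the paper's "20 ms variance" is passed here directly as the spread parameter of the normal distribution.

```python
import random

def rtt_sample():
    # RTT modelled as a normal distribution with a mean of 80 ms;
    # the paper states "20 ms variance", used here as the spread parameter.
    # Clamped at zero since a sampled RTT cannot be negative.
    return max(0.0, random.gauss(0.080, 0.020))

def failure_time():
    # The link failure occurs uniformly between 75 and 125 s into the run.
    return random.uniform(75.0, 125.0)

SIM_DURATION = 250.0   # seconds per simulation run
RUNS = 45              # samples averaged for each reported point
```

Averaging over 45 independent runs with randomized failure instants decorrelates the failure from the periodic timer phases, which is what makes the ±10% confidence bound meaningful.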
4 Analysis of the Results
In order to find the values of the REAP timers that optimize the behavior of TCP and UDP when a path failure occurs, several measurements have been performed. The main metric used throughout the analysis is the Application Recovery Time. This metric is defined as the difference in time between the last packet arriving through the old IP locators (addresses) and the first packet arriving through the new ones. This metric accurately measures the time to recover from a path failure when there is continuous traffic. The analysis is structured in the following items:

– UDP behavior: To fully understand the behavior of applications using UDP, two types of traffic have been considered: bidirectional traffic (a VoIP conversation) and unidirectional traffic (audio streaming).

– TCP behavior: TCP incorporates several characteristics, such as congestion control and reliability, that determine the resulting performance when a valid path is provided as a result of the REAP operation. To understand the behavior of applications using TCP, two traffic types have been considered: an FTP download from a server and a Telnet session showing a sparse traffic exchange. With these two traffic types, the behavior of applications with high traffic demands and applications with low traffic profiles is covered.
4.1 UDP Behavior
Consider that a failure occurs. When the TSend timer expires, the node tries to reach the peer by sending a Probe to the IP address that is currently in use. This Probe expires after TRTx seconds. At this time, a second Probe is sent to the secondary IP address. The path exploration mechanism finalizes after the exchange of 3 Probes per peer, directed to the secondary IP addresses. The time required to finalize the path exploration mechanism is almost constant (there is some variation due to the RTT variance), with a value of 0.7 seconds². Figure 2 shows the Recovery Time for different TSend (TKA = TSend/3) values and for two types of UDP applications, Voice over IP (VoIP) and a unidirectional VoIP flow. The results follow the expected behavior, the relation between the Recovery Time and TSend being linear. This relation was expected to be linear since UDP is not reactive to path conditions and, once the path is restored, traffic is immediately sent through the new IP locators. Note that in figure 2 the Recovery Time of the unidirectional traffic is lower than that of the bidirectional one. The difference between them can be quantified and, on average, it is approximately
² This value is obtained from experimental results, although it can be computed as 0.5 s + 3·RTT.
Fig. 2. UDP Recovery Time
equal to TKA/2. This behavior is due to the fact that when there is only unidirectional traffic, Keepalive messages are always exchanged on a regular basis. When a failure occurs, the last Keepalive was probably received at the peer side some time before the failure. Thus, the TSend timer was started when the first packet after the reception of the Keepalive was sent, which is probably some time before the failure. On the other hand, if there is continuous traffic in both directions, the TSend timer is probably started closer to the time of the failure (the last time a packet was sent after the reception of a packet from the other side).

Keepalive signaling in the unidirectional case. The worst-case scenario with respect to the signaling overhead introduced by REAP is the unidirectional communication traffic case. If the traffic is unidirectional, Keepalive messages are exchanged on a regular basis to keep the relationship updated. Once a packet is received, the TKA timer is started; after this time period without sending any packet, a Keepalive message is sent. The timer is not set again until a new packet arrives, hence the number of Keepalive messages sent depends on the transmission rate of the source. If we call δ the time between two consecutive packets sent by the source, δ is an upper bound on the time between sending a Keepalive message and starting the Keep Alive timer again. Finally, the formula providing the number of Keepalive messages sent per second is 1/(TKA + δ).
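The overhead formula above is easy to evaluate numerically. The sketch below simply encodes 1/(TKA + δ); the function name is illustrative.

```python
def keepalive_rate(t_ka, delta):
    """Keepalive messages per second for a unidirectional flow.

    t_ka:  Keep Alive Timeout in seconds.
    delta: inter-packet gap of the unidirectional source in seconds
           (upper bound on the delay before TKA is restarted).
    """
    return 1.0 / (t_ka + delta)

# G.729 VoIP with 20 ms packetisation and the suggested TKA = 3 s:
rate = keepalive_rate(3.0, 0.02)   # roughly one Keepalive every 3 seconds
```

Since δ is small compared with TKA for a voice source, the signaling rate is dominated by the Keep Alive Timeout itself: enlarging TKA reduces overhead but, as shown in Figure 2, linearly increases the Recovery Time.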
4.2 TCP Behavior
FTP Application. Figure 3 shows the Recovery Time achieved while varying the TSend timer. Note that the results for TCP traffic are not linear with
Fig. 3. TCP Recovery Time
the TSend parameter, as occurred with the UDP values (figure 2). This behavior is due to the mechanisms implemented in TCP for congestion detection and avoidance, and in particular depends on the retransmission timeout of TCP. TCP interprets a retransmission timeout as an indication of congestion in the network, so it uses an exponential back-off to retransmit the packets in order to avoid increasing the congestion. This mechanism affects the Recovery Time: although the path has been reestablished, the packets will not be retransmitted until the retransmission timeout expires. A detailed explanation of this behavior is given in figure 4(a), which presents, for a given experiment (TSend = 10 s), the Retransmission Timeout, the Congestion Window and the traffic sent through both paths. Traffic starts being sent through the primary path, until the link fails. At this moment the congestion window decreases and the retransmission timer increases. When the path exploration mechanism ends, the retransmission timer, set to 8 seconds, has not yet expired. When it expires, packets are sent according to the slow start policy of TCP. Figure 5 shows the difference in time between the arrival of a packet in a connection with a failure and the arrival of the same packet if no failure in the path occurs. As can be observed, packets suffer a large delay when the link fails (this delay is equivalent to the time needed to discover the failure and complete the path exploration mechanism), and the delay then remains roughly constant. This effect is due to the increase in the congestion window after the communication is recovered: packets start to be sent faster until the congestion window reaches its maximum, after which packets are sent at a constant rate; this behavior can be observed in figures 4(a) and 5. Following the explanation presented above, we argue that the stair-shaped graph in figure 3 is caused by the back-off mechanism of the retransmission timer of TCP.
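The exponential back-off that drives this stair shape can be sketched as follows. This is a simplified model, not any particular TCP stack: the initial RTO of 1 s, the 60 s cap, and the function names are assumptions for illustration.

```python
from itertools import accumulate

def rto_schedule(initial_rto, max_rto=60.0, n=8):
    """Successive TCP retransmission timeouts under exponential back-off:
    each timeout doubles the previous one, capped at max_rto."""
    rtos, rto = [], initial_rto
    for _ in range(n):
        rtos.append(rto)
        rto = min(2 * rto, max_rto)
    return rtos

# Retransmission instants relative to the first loss, for a 1 s initial RTO.
# Even when REAP restores the path between two instants, the next
# retransmission waits for the following instant, producing the
# stair-shaped Recovery Time of figure 3.
instants = list(accumulate(rto_schedule(1.0)))
```

The gap between consecutive instants doubles each time, so the penalty for a recovery landing just after a retransmission grows with TSend; resetting the retransmission timer (as proposed below in the paper) removes exactly this quantization.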
Figure 6 presents the back-off mechanism used by TCP to set the
(a) Normal TCP operation
(b) TCP operation resetting the retransmission timeout

Fig. 4. TCP behavior explanation
retransmission timer. As the number of retransmissions increases, the retransmission timer doubles its value. We argue that, as TSend varies, the instant in time when the path is recovered falls on one of the steps presented in figure 6; this is the cause of the differences in time presented in figure 3. To improve
Fig. 5. Difference in time between packets in a communication with path failure and without path failure
Fig. 6. TCP Retransmission Timeout
the performance, we propose to reset the retransmission timer of TCP after a new path is chosen for the communication (figure 4(b)). Notice that this is not only more efficient, but also appropriate, as the retransmission timer value depends on the properties of the path. In the simulator, we implemented a
Fig. 7. UDP vs TCP (resetting the retransmission timer) Recovery Time
hook in the TCP stack to allow REAP to reset the retransmission timer, forcing TCP to start retransmitting packets immediately. The experimental results of this proposal are presented in figure 7, along with the previous results for UDP bidirectional VoIP traffic and TCP traffic to ease the comparison. As expected, the relation between the TCP Recovery Time and TSend is linear; moreover, the modified TCP behavior is very similar to that of UDP.

Telnet Application. To conclude the TCP study, an application with low data traffic has been analyzed. The chosen application is Telnet, in which long periods of time without packet transmission usually occur. The design of REAP tries to minimize signaling overhead, so no Keepalive probing is performed when no upper layer traffic is exchanged; due to this behavior, REAP will notice the failure a time defined by TSend after the first packet sent after the failure. This behavior is shown in figure 8. The Application Recovery Time metric refers to the time elapsed between the first packet (using the old locators) sent after the failure and the first packet sent using the new locators. In this time period, the failure discovery and path exploration mechanisms are performed. The recovery procedure start time offset depends on the time between the path failure and the time when the application first tries to send a packet, but this offset is not important, since the application is not affected by the failure while it is not trying to exchange any traffic. Figure 8 presents a trend similar to that of figure 3, included in the figure for comparison purposes. The impact on the Application Recovery Time of resetting the retransmission timer of TCP is also shown in figure 8. As can be seen, the effect is similar to the one presented in figure 7, namely a noticeable decrease in the Recovery Time of the application.
Fig. 8. Recovery time in a telnet application
TCP Reset Time. One of the most important requirements for the REAP protocol to work in a TCP environment is to handle the recovery before the TCP session expires. In order to measure how many address pairs may be checked in the path exploration phase before TCP resets the connection, several tests have been done, using the default TCP configuration of a Microsoft Windows Server 2003 machine³. The TCP stack implemented on it resets the TCP connection if a limit of 53 retransmissions is reached. The time between the failure detection and the reset of the connection is 75 seconds on average.

The REAP specification sets a back-off mechanism to handle the retransmissions in the path exploration protocol. This mechanism follows Tout = 2^n, where n is the retransmission count. This exponential back-off starts when the number of addresses probed is higher than 4; for the first 4 retransmissions a Tout of 0.5 seconds is used. The exponential back-off is limited to 60 seconds, where it reaches its maximum, the Tout for the rest of the retransmissions being 60 seconds.

Taking the back-off mechanism into account, the number of possible IP address pairs explored is 10. It is worth noticing that the first try of the REAP protocol is to check the current IP pair being used. As the previous results prove, the REAP protocol can check a large number of IP address pairs before the TCP session expires, providing a mechanism to restore the communication.
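The count of 10 explorable address pairs can be reproduced from the probe back-off schedule. The sketch below assumes the exponential phase starts at 2^1 after the four fast 0.5 s probes, which is the interpretation consistent with the paper's figure of 10 pairs within the ~75 s window; the function name is illustrative.

```python
def probes_within(deadline, fast_probes=4, fast_tout=0.5, max_tout=60.0):
    """Number of REAP Probe messages sent before `deadline` seconds.

    The first `fast_probes` retransmissions use a fixed 0.5 s timeout;
    afterwards the timeout grows exponentially (2, 4, 8, ... seconds),
    capped at `max_tout`.
    """
    t, count, n = 0.0, 0, 0
    while t <= deadline:
        count += 1
        n += 1
        if n <= fast_probes:
            t += fast_tout
        else:
            t += min(2.0 ** (n - fast_probes), max_tout)
    return count

# With the ~75 s window before the Windows Server 2003 stack resets the
# TCP connection, about 10 address pairs can be explored:
pairs = probes_within(75.0)   # → 10
```

Probe instants fall at 0, 0.5, 1, 1.5, 2, 4, 8, 16, 32 and 64 seconds, i.e. the tenth probe still fits inside the 75-second budget, matching the figure stated above.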
5 Conclusion and Future Work
This paper presents details of the appropriate timer configuration of the REAP protocol, as well as the effect of the protocol configuration on the two main
³ http://technet2.microsoft.com/WindowsServer/
transport protocols, TCP and UDP. We also present a possible modification to the TCP stack which enables TCP to take advantage of the REAP protocol, providing a faster Recovery Time. The results show a clear increase in the performance of the failure detection over TCP, making it comparable to UDP. As future work, there are several issues to be analyzed, such as the aggressiveness of the path exploration mechanism (for example, exploring alternative paths in parallel instead of serially) or a timer configuration based on the RTT, which would decrease the Recovery Time while minimizing false positives in failure detection. We are also interested in extending the information provided by the REAP protocol, going from path availability to gathering some characteristics of the path (the RTT being the simplest example), and finally in studying the combination of REAP with other protocols and scenarios, for example Mobile IP with multiple registrations and mobility.
References

[REAP] Arkko, J., van Beijnum, I.: Failure Detection and Locator Pair Exploration Protocol for IPv6 Multihoming. IETF draft, draft-ietf-shim6-failure-detection-06 (September 2006)
[LOCSEL] Bagnulo, M.: Default Locator-pair Selection Algorithm for the SHIM6 Protocol. IETF draft, draft-ietf-shim6-locator-pair-selection-01 (October 2006)
[HIP] Moskowitz, R., Nikander, P.: Host Identity Protocol (HIP) Architecture. RFC 4423
[MIP] Johnson, D., Perkins, C., Arkko, J.: Mobility Support in IPv6. RFC 3775
[SHIM] Nordmark, E., Bagnulo, M.: Level 3 Multihoming Shim Protocol. IETF draft, draft-ietf-shim6-proto-06 (November 2006)
[SHIMAPP] Abley, J., Bagnulo, M.: Applicability Statement for the Level 3 Multihoming Shim Protocol (shim6). IETF draft, draft-ietf-shim6-applicability-02 (October 2006)
[BGPMULT] van Beijnum, I.: BGP: Building Reliable Networks with the Border Gateway Protocol. O'Reilly (2002)
[MONAMI] Wakikawa, R., Ernst, T., Nagami, K.: Multiple Care-of Addresses Registration. IETF draft, draft-ietf-monami6-multiplecoa-01 (October 2006)
Controlling Incoming Connections Using Certificates and Distributed Hash Tables

Dmitrij Lagutin and Hannu H. Kari

Laboratory for Theoretical Computer Science, Helsinki University of Technology, P.O. Box 5400, FI-02015 TKK, Finland
[email protected],
[email protected]
Abstract. The current architecture of the Internet, where anyone can send anything to anybody, presents many problems. The recipient of the connection might be using a mobile access network, and thus unwanted incoming connections could produce a high cost to the recipient. In addition, denial of service attacks are easy to launch. As a solution to this problem, we propose the Recipient Controlled Session Management Protocol, where all incoming connections are denied by default and the recipient of the connection can choose, using certificates, which incoming connections are allowed. The recipient can also revoke the rights for making an incoming connection at any time.

Index terms: Session management, rights delegation, rights management, certificates, DoS countermeasures.
1 Introduction

In the current Internet architecture, the initiator of the communication controls the connection. This is fine when connecting to Internet servers, but this policy might cause problems when connecting to a private user directly. There are many reasons why the other endpoint, the recipient, would want to control which incoming connections will be allowed. The recipient might be using a wireless access network that has a limited bandwidth. The recipient might even have to pay for all network traffic, including the incoming traffic. In addition, the recipient might be in a situation where he does not want to be disturbed by unnecessary connections, like sales calls. In this case, only really important connections from a limited set of initiators should go through.

Many access networks' firewalls block all incoming connections unless the recipient has initiated the connection first; however, this policy is too restrictive, and the recipient should have the option of receiving direct incoming connections from trusted initiators. Naturally, unwanted incoming connections could also be blocked using a personal firewall, but in that case they would still consume network resources. Thus, it is better to block unwanted connections already at the access network level, before they even reach the destination. Blocking unwanted connections would also make it much harder to launch denial of service and distributed denial of service attacks against the network or the recipient.

The structure of this paper is as follows: in Chapter 2 we go through related work and the requirements of our system. Chapter 3 introduces the Recipient Controlled Session

Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 455–467, 2007. © Springer-Verlag Berlin Heidelberg 2007
Management Protocol. Chapter 4 describes how packet security can be handled under the proposed system, Chapter 5 contains a comparison with other similar solutions, and Chapter 6 contains the conclusions and discusses future work.
2 Requirements and Related Work

We challenge the traditional freedom of the Internet (i.e. "anyone can send anything to anybody") by setting requirements for the initiator. The initiator must have a direct or indirect permission from the recipient in advance in order to bypass a gatekeeper that protects the recipient against garbage from the Internet. The basic requirement of the proposed system is that unauthorized incoming connection attempts are denied before they reach the destination. The blocking is done by a gatekeeper (i.e. the firewall of the access network). Detailed requirements for the system are listed in Table 1.

Table 1. Requirements for limiting incoming connections

Mandatory requirements:

R1. Blocking the unauthorized incoming traffic before the destination. The unauthorized incoming traffic should not disturb the recipient or cause any (monetary, energy or other) costs to the recipient or the access network of the recipient. This requirement is important especially in mobile networks, where the available bandwidth might be quite low and where the recipient may have to pay also for incoming traffic.

R2. Rights for making an incoming connection should be revocable. The recipient must be able to revoke rights from the initiator at any given time.

R3. The system must support mobility and change of the initiator's IP address. The initiator might use different networks, like fixed LAN, wireless LAN and 3G, and might change between them frequently. A change of the initiator's network or IP address should not require a full renegotiation of rights.

R4. Verification of data packets. There should be a mechanism to verify that the incoming data is really coming from a trusted initiator and is not forged or duplicated by a malicious party.

R5. Authentication of the initiator. The recipient must be certain that the initiator is really the authorized initiator and not some malicious party.

R6. Rights for making an incoming connection should be delegatable. If the recipient allows delegation, the initiator must be able to delegate rights to other parties.

R7. Resilience of the system. The system should not have a single point of failure.

Optional requirements:

R8. Good performance, low signaling overhead. Rights management should not consume an excessive amount of any network resource.

R9. The system must support quality-of-service issues (future enhancement). The right for sending the incoming data may contain bandwidth or other quality-of-service limitations.
Controlling Incoming Connections Using Certificates and Distributed Hash Tables
457
Previous research in this field has mainly concentrated on the prevention of denial of service attacks. Anderson et al. [2] describe a system of capabilities to prevent denial of service attacks. The idea behind their system is that the initiator of the connection receives capability tokens beforehand from the recipient, and these tokens allow the initiator to send data to the recipient for a limited time. In this approach, there exists a network of Request-To-Send (RTS) servers coupled with verification points (VPs). The initiator sends a token request through the RTS servers, which forward the request to the recipient, who allows or denies the request. In the former case, the capability token is returned to the initiator through the same RTS servers and the data connection can be established. The RTS servers along the path take note of the token and pass token information to the coupled verification points. When the VPs receive traffic from the initiator, they check whether the token included in the data packets is valid and allow only data packets with valid tokens to go through. This kind of solution limits incoming connections, but it does not satisfy all our requirements, especially R1, R3, R4 and R5. Under this scheme, token requests are always forwarded to the recipient, and thus even denied requests consume resources within the recipient's access network. Additionally, when the recipient receives a token request from the initiator, the recipient will only know the initiator's IP address; thus the recipient cannot be completely sure who the initiator actually is. Since the recipient sends capability tokens unencrypted to the initiator, a malicious party can snoop the tokens and thus gain the recipient's rights to itself. Finally, the recipient or the recipient's network cannot guarantee that data packets sent with valid tokens are indeed coming from the original trusted initiator and are not duplicated. Yaar et al.
[20] introduce the Stateless Internet Flow Filter (SIFF) to prevent distributed denial of service attacks. The aim of SIFF is to separate traffic into privileged and unprivileged traffic. The privileged traffic is treated with a higher priority by routers, while the unprivileged traffic receives a low-priority treatment, making it impossible to launch denial of service attacks using only unprivileged connections. In order to establish a privileged connection to the recipient, the initiator must first obtain a "capability" from the recipient. During the capability negotiation phase, routers along the path between the initiator and the recipient take note of the capabilities and later check that the data packets carry the correct capability within them. The capability is bound to the initiator's IP address and is valid for a limited time, after which the recipient must renew it. While SIFF reduces the risk of denial of service attacks, it does not completely satisfy our requirements, especially R1, R3, R4, R5 and R6. For example, binding capabilities to a certain IP address presents some problems.

The Host Identity Indirection Infrastructure (Hi3) [14] combines the Host Identity Protocol [13] with the Secure Internet Indirection Infrastructure (Secure-i3) [1]. Under the Host Identity Protocol (HIP), separate host identifiers are introduced to describe the identity of the host, and the IP addresses are only used for determining the topological location of the host. The Secure-i3 is based on the Internet Indirection Infrastructure (i3) [18]. The aim is to use a separate overlay network for connections. The initiator sends data to a certain identifier and the overlay network forwards the data to the corresponding recipient based on the identifier. This way the initiator does not have to know the IP address of the recipient beforehand. Under Hi3, the HIP base
exchange is handled by the overlay network, while the actual data transfer is done between hosts using HIP. This way the recipient is not bothered by connection attempts; only after the base exchange has been completed successfully with the overlay network can the initiator start the actual data connection to the recipient. In order to provide more protection against denial of service attacks, the SPI multiplexed NAT (SPINAT) [21] can be used together with Hi3. A SPINAT proxy is placed near the recipient, and the initiator makes a connection to the SPINAT proxy, which forwards the connection to the recipient. Thus, the initiator does not know the actual IP address of the recipient, which makes it harder to launch denial of service attacks against the recipient. Such a system, however, would not satisfy requirement R6 concerning the delegability of rights.

It would also be possible to implement a similar system for limiting incoming connections based on the SIP [16] protocol. With this approach, the recipient and the initiator would register themselves with a SIP proxy, and the SIP proxy would be used for the negotiation of rights before the actual data connection. The initiator would contact the SIP proxy and ask for permission to send data to the recipient. The SIP proxy would then ask for permission from the recipient, and if this permission is granted, the proxy would send the recipient's IP address to the initiator and the data connection could be established. A SIP-based system would not fully satisfy our requirements, especially R1, R2, R4 and R5. The biggest drawback of this kind of system is that after the initiator gains the recipient's IP address, it is impossible for the recipient to stop the initiator from sending data.

Distributed Hash Tables (DHTs) [6] provide an effective solution for storing large amounts of data in a distributed way. The DHT network is fault tolerant: if some node exits the DHT network, other nodes can take over its duties.
There exist several DHT protocols, like Tapestry [7] and Chord [19]. Many DHT protocols achieve O(log n) lookup latency, where n is the number of nodes in the DHT network; some protocols achieve even lower latency using a larger number of links to other nodes.
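The logarithmic lookup cost can be made concrete with a small calculation. This is a hedged illustration of the Chord-style bound (each hop at least halves the remaining identifier-space distance), not an implementation of any specific DHT; the function name is illustrative.

```python
import math

def chord_lookup_hops(n):
    """Upper bound on lookup hops in a Chord-style DHT with n nodes:
    each hop at least halves the remaining distance in the identifier
    space, giving O(log n) routing."""
    return math.ceil(math.log2(n)) if n > 1 else 0

# Even a network of a million session controllers resolves a home
# proxy in at most ~20 overlay hops:
hops = chord_lookup_hops(1_000_000)   # → 20
```

This bound is what makes a DHT of session controllers practical for the protocol of the next section: certificate requests can be routed to the recipient's home proxy in a handful of hops regardless of network size.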
3 Recipient Controlled Session Management Protocol

The aim of the Recipient Controlled Session Management Protocol is to allow a recipient to control incoming connections using certificates. Only explicitly allowed incoming connections will reach the recipient. The general architecture of the system is presented in Figure 1. To better understand the architecture depicted in Figure 1 and the use cases presented below, the main concepts (the initiator, the recipient, the session controller and the firewall) are explained here. The initiator is the party that initiates the connection. The recipient is the party at the other end of the connection; in our model the recipient grants explicit rights to initiators to allow incoming connections. The session controller (also called the "proxy") is an entity that the recipient trusts; it can issue certificates for making incoming connections to trusted initiators. If the recipient changes access networks, the session controller keeps track of the recipient's current IP address or addresses, much like the home agent in Mobile IPv6 [8]. The services of the
Controlling Incoming Connections Using Certificates and Distributed Hash Tables
Fig. 1. A general architecture of the Recipient Controlled Session Management Protocol
session controller can be offered to the recipient by, e.g., the recipient's operator¹. In the proposed architecture, session controllers form a Distributed Hash Table network to eliminate a single point of failure. When the initiator requests certificates, it contacts the closest proxy in the DHT network, and this proxy forwards the request through the DHT network to the recipient's home proxy; thus the initiator does not need to know which proxy is the home proxy of the recipient. This home proxy is trusted by the recipient and provides certificates to trusted initiators. The firewall denotes the firewall in the recipient's access network. The duty of the firewall is to block incoming traffic into the access network that is not allowed or desired; thus the firewall must take note of the certificates that pass through it. The firewall also grants certificates that allow nodes to use the firewall's access network. The objective is to create a connection from the initiator to the recipient. To create this connection, the initiator needs to receive the necessary certificates for making an incoming connection from the proxy that acts as a session controller; otherwise the firewall in the recipient's access network will not allow the incoming connection to the recipient. Two conditions must be satisfied before the initiator can receive certificates from the proxy. First, the recipient must grant a certificate to the initiator which shows that the initiator is trusted by the recipient. In addition, the recipient must certify the home proxy so that the home proxy has the right to grant the necessary certificates to trusted initiators. The structure of the certificates used with the Recipient Controlled Session Management Protocol is explained briefly here. The certificate contains the public keys of the issuer and the subject of the certificate.
In addition, the certificate contains a validity period and information about the rights and delegatable rights provided by the certificate. Rights include the right to create the incoming connection, the right for
¹ The proposed architecture does not care who controls the proxy or whether there are several proxies in the network. The recipient might buy proxy services from an operator or run its own proxy. The important thing is that the recipient must trust that the proxy will give permissions to initiate connections and will disclose the recipient's location (IP addresses) only to authorized initiators.
D. Lagutin and H.H. Kari
Fig. 2. Controlling incoming connections
session initialization management, the right to request a new certificate, the right to send data to the network and the right to delegate rights to other parties. Finally, there is the issuer's signature over the certificate. Below we show different cases of how the Recipient Controlled Session Management Protocol can be used. The first example describes how the protocol can be used to limit incoming connections to the recipient. The second example shows how revocation of rights works with the Recipient Controlled Session Management Protocol.

3.1 Controlling Incoming Connections

The basic case where a mobile user, the recipient, wants to restrict incoming connections is illustrated in Figure 2.

The recipient authorizes the proxy
1. The recipient gives a certificate C1 to the proxy, which means that the proxy can disclose the recipient's location to certain trusted initiators and can also authorize them to initiate connections to the recipient. Following traditional certificate formats, in this certificate the recipient is the issuer, the proxy is the subject, and the certificate is signed by the recipient. The certificate is valid from time T1 until T2.

The recipient authorizes the initiator
In steps 2 and 3 the recipient authorizes the initiator. This is a necessary step, since without the recipient's authorization the proxy will not trust the initiator.
2. The initiator delivers its public key to the recipient; this can be done offline.
3. If the recipient trusts the initiator, it creates a certificate C2 that allows the initiator to request the recipient's current location (IP address) and the necessary certificates from the proxy (the right-for-session-initialization-management bit is set to one in the C2 certificate). This certificate is sent to the initiator. Steps 2-3 can be carried out offline.

The recipient changes the access network
Steps 4 and 5 describe mobility management.
4. When the recipient changes the access network, as part of the network's AAA procedure the recipient gets a certificate Cf from the network (denoted in the figure as "firewall") that allows the recipient to use the new access network.
5. The recipient sends a location update together with this new Cf certificate to the proxy.

Establishment of the data connection
Steps 6-9 describe the establishment of the data connection between the initiator and the recipient.
6. The initiator contacts the proxy: it sends the C2 certificate to the proxy and requests the IP address of the recipient together with all required certificates. The recipient might change networks frequently, so the initiator must retrieve the recipient's latest IP address from the proxy.
7. Based on the C2 certificate, the proxy knows that the initiator is authorized to get the recipient's latest IP address and to receive a certificate for sending data to the recipient. As a result, the proxy creates a new certificate C3 that allows the initiator to make the incoming connection to the recipient. Certificates Cf, C1 and C3, together with the recipient's IP address, are then sent to the initiator.
8. The initiator first sends a control message, using e.g. ICMP, with certificates Cf, C1 and C3 to the recipient. This allows the firewall of the recipient's network to check that the recipient is a legitimate node in the access network and that the recipient is willing to receive incoming traffic from the initiator.
The Cf certificate tells the firewall that the recipient is authorized by the access network and is a valid entity. The C1 and C3 certificates together create a certificate chain, recipient->proxy->initiator, which denotes that the initiator has the right to initiate the connection to the recipient.
9. The data connection can now be established.

Notes: The C3 certificate that allows the initiator to create an incoming connection is given through the proxy because that way the recipient can easily revoke it. For example, if after step 2 the recipient no longer trusts the initiator, the recipient can notify the proxy that the C2 certificate is revoked, and thus the initiator will not be able to receive the C3 certificate from the proxy. Had the recipient sent the C3 certificate directly to the initiator, revocation of C3 would be much more difficult. If the recipient uses several network interfaces simultaneously, the proxy can return several different IP addresses and associated Cf certificates to the initiator. In addition to permitting certain initiators to send data to it, the recipient can also specify more general policies to the proxy, like "now I do not want to be disturbed, do not allow anybody to contact me" or "anybody can contact me, even without a valid C2 certificate".
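The chain check the firewall performs in step 8 can be sketched as follows. The dictionary-based certificate representation and the function name are our own illustrative choices (the paper does not define a concrete format), and real signature verification is elided; only the issuer/subject linking, validity window, and rights check are modeled.

```python
def chain_authorizes(certs, recipient_key, sender_key, right, now):
    """Walk a certificate chain (e.g. C1: recipient->proxy,
    C3: proxy->initiator) and check that every link is inside its
    validity window, carries the needed right, and that issuer and
    subject keys line up from the recipient down to the sender."""
    expected_issuer = recipient_key
    for cert in certs:
        if cert["issuer"] != expected_issuer:
            return False  # broken link in the chain
        if not (cert["valid_from"] <= now <= cert["valid_until"]):
            return False  # expired or not yet valid
        if right not in cert["rights"]:
            return False  # the needed right was never granted
        expected_issuer = cert["subject"]
    return expected_issuer == sender_key  # chain must end at the sender

# C1: recipient authorizes proxy; C3: proxy authorizes initiator.
c1 = {"issuer": "R", "subject": "P", "valid_from": 0, "valid_until": 100,
      "rights": {"incoming"}}
c3 = {"issuer": "P", "subject": "I", "valid_from": 0, "valid_until": 100,
      "rights": {"incoming"}}
allowed = chain_authorizes([c1, c3], "R", "I", "incoming", now=50)  # True
```

The same walk generalizes to the longer delegation chain recipient->proxy->initiator->initiator2 discussed later in the paper.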
3.2 Revocation of Rights

This use case is a continuation of the previous one; it describes a situation where the recipient wants to revoke the initiator's right to send data. The revocation is done by notifying the proxy, the firewall and the initiator. The proxy is responsible for giving the initiator the recipient's IP address with the necessary certificates in exchange for the C2 certificate. The recipient can send a new version of the C2 certificate (called C2N and C2N2 in this example) to the proxy. The proxy keeps only the newest version of this kind of certificate in its database, and this certificate always overrides the certificate received from the initiator. Thus, this method can be used to revoke and reissue C2.
Fig. 3. Revocation of rights for making an incoming connection
The use case, illustrated in Figure 3, demonstrates the situation where the initiator has already got a C3 certificate and is sending data, but then starts misbehaving. As a result, the recipient revokes the initiator's rights; the firewall in the recipient's access network notices the revocation and stops the traffic coming from the initiator. Later in the example, the recipient reissues the rights for making an incoming connection to the initiator for another time period.

The recipient wants to revoke rights
1. The recipient creates a C2N certificate, which is similar to the C2 certificate in the previous use case but with all rights cleared. This C2N certificate is then sent to the proxy, where it overrides any previous similar certificate. If the initiator requests the recipient's IP address or a C3 certificate from the proxy (step 6 in the previous use case), the proxy denies the request, because its database contains a valid C2N certificate whose rights are set to zero. The C2N certificate is also sent to the initiator to notify it that it should not send any more data; at the same time the C2N certificate passes the firewall, which takes note of it. Thus, if the initiator tries to send data directly to the recipient, the firewall will block the data flow.
The recipient wants to give rights back for a different time period
2. A new certificate, C2N2, is created; this certificate has the right-for-session-initialization-management bit set to one. It is sent to both the proxy and the initiator, and it will also pass the firewall, which takes note of it.

Establishment of the data connection
3. Now the initiator can request the recipient's IP address with the necessary certificates from the proxy. If the initiator makes a request to the proxy using the original C2 certificate outside the validity time of the C2N2 certificate, the request will be denied, since the proxy has the C2N2 certificate in its database and this certificate automatically overrides other certificates.
4. Just like in the previous case, the proxy sends the recipient's IP address with the necessary certificates to the initiator.
5. Similarly, the initiator sends a control message to the recipient containing the necessary certificates. Certificates Cf, C1 and C3 show that the initiator has valid rights for making an incoming connection to the recipient, and the C2N2 certificate does not refute this right.
6. The data connection can now be established.
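The override rule used in this use case (the proxy keeps only the newest recipient-issued version of the certificate, and that version trumps whatever the initiator presents) can be sketched as below. The class and field names, including the `serial` field used to order certificate versions, are illustrative assumptions, not part of the protocol's defined format.

```python
class ProxyStore:
    """Keep only the newest recipient-issued version of each
    (issuer, subject) certificate; that version overrides whatever
    certificate the initiator presents with its request."""

    def __init__(self):
        self._latest = {}

    def update(self, cert):
        """Store a certificate update, keeping only the newest version."""
        key = (cert["issuer"], cert["subject"])
        cur = self._latest.get(key)
        if cur is None or cert["serial"] > cur["serial"]:
            self._latest[key] = cert  # newer version overrides older ones

    def request_allowed(self, presented):
        """Decide a request using the stored override, if one exists."""
        key = (presented["issuer"], presented["subject"])
        effective = self._latest.get(key, presented)
        return "session_init" in effective["rights"]

store = ProxyStore()
c2 = {"issuer": "R", "subject": "I", "serial": 1, "rights": {"session_init"}}
c2n = {"issuer": "R", "subject": "I", "serial": 2, "rights": set()}  # revocation
store.update(c2n)
decision = store.request_allowed(c2)  # False: the newer C2N overrides C2
```

Reissuing rights (C2N2) is just another `update` with a yet newer version whose `session_init` right is set again.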
4 Providing Packet Level Security

The certificate-based solution presented above does not satisfy our requirements completely by itself. Especially during the negotiation phase, and once the data connection is established, the sender of the data must be verified at the packet level to ensure that the data is really sent by the trusted initiator. Packet Level Authentication (PLA) [3][10][15] can be used to accomplish this task. The idea behind PLA is that every packet is cryptographically signed by its sender. Thus the authenticity and integrity of the packet can be verified by any node in the network. PLA adds an additional header to the standard IPv6 packet; this header includes the sender's public key, the sender's signature over the whole packet, a sequence number, and a timestamp. A certificate given by a trusted third party is also included in the PLA header; this certificate guarantees that the sender is a valid, well-behaving entity. The trusted third party can be, for example, an operator or a state authority. PLA uses elliptic curve cryptography (ECC) [9][12] with 160-bit keys and 320-bit signatures, so using PLA does not create significant bandwidth overhead. Elliptic curve cryptography is computationally intensive, but hardware accelerators can be used to speed up the signing and verification tasks. Such accelerators achieve good performance [11][17] with relatively low power consumption [4][5]. PLA provides benefits in the following steps of the original example (Figure 2). In step 6, when the initiator requests the recipient's IP address and certificates from the proxy, the proxy can check whether the public key of the packet received from the initiator matches the public key in the C2 certificate's subject field. If they match, then the initiator really is the recipient's trustee and the proxy can trust the initiator.
If they do not match, some malicious entity has intercepted the C2 certificate and is trying to obtain rights for itself with the intercepted certificate; in this case the request is naturally denied.
A similar check must be made in step 9: the firewall must check that the sender of the data really is the entity authorized by the C3 certificate to make an incoming connection. PLA is also required in the revocation-of-rights example (Figure 3). In steps 1 and 2 the proxy must check that the certificate updates that override previously issued certificates really come from the recipient. Thus other parties are not able to send certificate updates to the proxy in the recipient's name.
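The two-part check described above (the packet's sender key must match the certificate's subject key, and the signature over the packet must verify) can be sketched as follows. Note that `toy_sign` is a deliberately fake, hash-based stand-in for the ECC signature PLA actually uses, and in this toy scheme the same byte string doubles as private and public key; only the control flow of the check is meant to be illustrative.

```python
import hashlib

def toy_sign(packet: bytes, key: bytes) -> bytes:
    """Stand-in for an ECC signature (NOT real cryptography)."""
    return hashlib.sha256(key + packet).digest()

def pla_check(header: dict, payload: bytes, cert_subject_key: bytes) -> bool:
    """Sketch of the verification: (1) the sender key in the PLA
    header must match the certificate's subject key, (2) the
    signature over the packet must verify against that key."""
    if header["sender_key"] != cert_subject_key:
        return False  # intercepted certificate: keys do not match
    expected = toy_sign(payload, header["sender_key"])
    return header["signature"] == expected

key = b"initiator-key"
pkt = b"hello"
hdr = {"sender_key": key, "signature": toy_sign(pkt, key),
       "seq": 1, "timestamp": 0}
ok = pla_check(hdr, pkt, cert_subject_key=key)        # True
stolen = pla_check(hdr, pkt, cert_subject_key=b"xy")  # False: wrong subject
```

Because any node can run `pla_check`, a firewall on the path can drop forged or replayed traffic before it consumes resources in the recipient's access network, which is exactly the property the comparison in Section 5 relies on.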
5 Analysis and Comparison with Other Solutions

A summary of the comparison with other solutions is presented in Table 2. The first requirement, R1, is blocking unauthorized incoming traffic before the destination. Our proposed system and Hi3 together with SPINAT satisfy this requirement; with the other approaches, malicious parties can freely send data to the recipient if the recipient's IP address is known. Requirement R2 is revocability of rights. The SIFF and Capability Tokens approaches do not support revocation directly, but similar results can be achieved using short-time rights: if the initiator misbehaves, the right to send data is simply not renewed. With the hypothetical SIP-based system, the initiator can freely send data to the recipient after receiving the recipient's IP address. With our system it is possible to revoke rights using new certificates that override existing ones. Requirement R3 is mobility. SIFF does not support mobility, since capabilities are bound to the initiator's IP address; when the initiator's IP address changes, the capabilities are no longer valid. Capability tokens include the path between the recipient and the initiator, so if the recipient changes the network, the capability token is no longer valid either. Requirement R4 is verifiability of data packets. Only our system and Hi3 guarantee the authenticity and integrity of the data. However, with Hi3 only the recipient can validate the integrity, while with our system any node on the path can perform an integrity check on data packets. Thus, with Hi3 forged or duplicated packets will reach the recipient and consume resources in the recipient's access network. With the other systems, data is sent unsigned and can thus easily be forged or duplicated. Requirement R5 is authentication of the initiator. Our system and Hi3 use public keys to check the initiator before granting rights.
The SIFF and Capability Tokens approaches have no means to check the initiator beforehand; with these approaches the recipient knows only the IP address of the initiator. Requirement R6 is delegability of rights. Our system fully supports this requirement: if the recipient allows delegation, the initiator can create a new certificate in which it authorizes another party. This certificate can be combined with an existing certificate chain, recipient->proxy->initiator, to create a new chain, recipient->proxy->initiator->initiator2; thus initiator2 is also allowed to create an incoming connection to the recipient. With Capability tokens it is possible to give the token, and thus delegate rights, to another party located within the same subnet, but this approach does not work in a generic way. With the SIP-based system, delegation of rights could also be implemented using the proxy.
The final mandatory requirement, R7, is the resilience of the system: the system should not have a single point of failure. A hypothetical SIP-based system with a centralized proxy would not satisfy this requirement. Our proposed system satisfies it because a DHT network is used for the proxies. Performance (R8) is quite good with all approaches; none of them consumes an excessive amount of network resources.

Table 2. Comparison between different approaches regarding our requirements

| Requirement | Capability Tokens | SIFF | SIP-based system | Our system (RCSMP + PLA) | Hi3 + SPINAT |
|---|---|---|---|---|---|
| Mandatory R1 | No | No | No | Yes | Yes |
| R2 | Using short-time rights | Using short-time rights | No | Yes | Yes |
| R3 | No | No | Yes | Yes | Yes |
| R4 | No | No | No | Yes | Yes |
| R5 | No | No | No | Yes | Yes |
| R6 | Within the same subnet | No | Yes | Yes | No |
| R7 | Yes | Yes | No | Yes | Yes |
| Optional R8 | Yes | Yes | Yes | Yes | Yes |
| Future enhancements R9 | Yes | No | No | With extensions | No |
The final requirement, R9, is support for quality of service. Capability tokens have some support for this; e.g., the number of packets that the initiator is allowed to send is included in the capability. Our system can be extended to satisfy this requirement: the certificate given to the initiator can include, e.g., the priority level of the traffic and the maximum number of bytes and packets that the initiator is allowed to send using the certificate. Overall, the biggest problem with most other approaches is that traffic always reaches the recipient initially and thus consumes resources within the recipient's network. With our approach, unauthorized traffic to the recipient is blocked; connection attempts go through the proxy first, and only authorized initiators can send packets directly to the recipient. In addition, when the recipient changes the access network, it needs to send only one message to the proxy, regardless of the number of initiators that want to contact the recipient.
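As a sketch of what such a QoS extension might look like: the paper only suggests priority and byte/packet quotas as possible certificate fields, so the class and field names below are hypothetical, as is the firewall-side enforcement shown.

```python
class QuotaCertificate:
    """Hypothetical R9 extension: a certificate carrying a traffic
    priority and send quotas (all fields are illustrative)."""

    def __init__(self, priority: int, max_bytes: int, max_packets: int):
        self.priority = priority
        self.max_bytes = max_bytes
        self.max_packets = max_packets
        self.sent_bytes = 0
        self.sent_packets = 0

    def allow_packet(self, size: int) -> bool:
        """Firewall-side check: admit the packet only while the
        certificate's byte and packet quotas are not exhausted."""
        if (self.sent_bytes + size > self.max_bytes
                or self.sent_packets + 1 > self.max_packets):
            return False
        self.sent_bytes += size
        self.sent_packets += 1
        return True

cert = QuotaCertificate(priority=1, max_bytes=3000, max_packets=2)
first = cert.allow_packet(1500)   # True
second = cert.allow_packet(1500)  # True
third = cert.allow_packet(1500)   # False: quota exhausted
```

Since any on-path node can verify PLA-signed packets, the same node could also meter them against such quotas, which is why this extension fits the proposed architecture naturally.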
6 Conclusions and Future Work

We have presented the Recipient Controlled Session Management Protocol, which uses certificates and distributed hash tables as a novel way to control incoming connections. Its aim is to block unauthorized incoming connections in a robust way before they reach their destination.
Using certificates for rights management is a better approach than binding rights to a certain IP address. The initiator might use different networks, such as fixed LAN, wireless LAN and 3G, and might change networks rapidly. If the right to make an incoming connection were bound to a certain IP address, the initiator would need to renegotiate that right each time its access network changed. This problem does not exist when using certificates: as long as the initiator possesses a valid certificate, it can make an incoming connection to the given recipient regardless of the initiator's IP address. Secondly, IP addresses can be forged by a malicious entity, which would allow that entity to hijack rights given to others. Finally, the IP address is not directly related to the identity of the initiator; thus, when the recipient grants rights to a certain IP address, it cannot be completely sure to whom it is actually granting rights. The certificate approach does not have this problem, since certificates contain the initiator's public key. Security at the packet level is essential to satisfy our requirements; otherwise the recipient and the firewall in the recipient's access network cannot be sure that packets are indeed coming from the trusted initiator and have not been tampered with. Using a DHT network for the proxies eliminates a single point of failure from the system: if some proxy goes offline, other proxies can take over its duties. The presented system can be improved and extended in many ways to provide more complex services. For example, the recipient could mark trusted initiators with priority levels and provide this information to the proxy. The recipient could also notify the proxy of his state (busy, available, in a meeting, etc.) and the proxy could decide whether to grant rights to initiators based on this state.
Thus, if the recipient is very busy, the proxy would grant rights for making incoming connections only to trusted initiators with a high priority level. The recipient's state could also contain information on whether he is at work or at home; if the recipient is at home, the proxy could block incoming connections from work-related initiators. Since certifying each initiator separately may be cumbersome and time consuming for the recipient, the proposed system could be modified to allow certificates for making incoming connections to be given at a higher level. For example, two companies could give the necessary certificates to each other, and afterwards employees of those companies could contact each other without additional certificates.
References
1. Adkins, D., Lakshminarayanan, K., Perrig, A., Stoica, I.: Towards a more functional and secure network infrastructure. Technical Report UCB/CSD-03-1232, Computer Science Division (EECS), University of California, Berkeley, USA (2003)
2. Anderson, T., Roscoe, T., Wetherall, D.: Preventing Internet Denial-of-Service with Capabilities. In: ACM SIGCOMM Computer Communications Review, pp. 39–44 (2004)
3. Candolin, C.: Securing Military Decision Making in a Network-centric Environment. Doctoral dissertation, Espoo (2005)
4. Gaubatz, G., Kaps, J., Öztürk, E., Sunar, B.: State of the Art in Ultra-Low Power Public Key Cryptography for Wireless Sensor Networks. In: Proceedings of the Third International Conference on Pervasive Computing and Communications Workshops, Hawaii, USA (March 2005)
5. Goodman, J., Chandrakasan, A.: An Energy-Efficient Reconfigurable Public-Key Cryptography Processor. IEEE Journal of Solid-State Circuits 36(11), 1808–1820 (2001)
6. Gribble, S.D., Brewer, E.A., Hellerstein, J.M., Culler, D.: Scalable, Distributed Data Structures for Internet Service Construction. In: Proceedings of the 4th Symposium on Operating System Design and Implementation (OSDI 2000), pp. 319–332 (2000)
7. Hildrum, K., Kubiatowicz, J.D., Rao, S., Zhao, B.Y.: Distributed Object Location in a Dynamic Network. In: Proceedings of the 14th ACM Symposium on Parallel Algorithms and Architectures (SPAA 2002), pp. 41–52 (2002)
8. Johnson, D., Perkins, C., Arkko, J.: Mobility Support in IPv6. The Internet Society, Network Working Group, Request for Comments: 3775 (2004)
9. Koblitz, N.: Elliptic Curve Cryptosystems. Mathematics of Computation 48, 203–209 (1987)
10. Lunberg, J.: Packet level authentication protocol implementation. In: Military Ad Hoc Networks, vol. 1(19), Helsinki (2004)
11. Lutz, J., Hasan, A.: High Performance FPGA based Elliptic Curve Cryptographic Co-Processor. In: Proceedings of the International Conference on Information Technology: Coding and Computing, ITCC 2004, Las Vegas, USA (April 2004)
12. Miller, V.: Use of Elliptic Curves in Cryptography. In: Williams, H.C. (ed.) CRYPTO 1985. LNCS, vol. 218, Springer, Heidelberg (1986)
13. Moskowitz, R., Nikander, P.: Host Identity Protocol. Internet draft, work in progress (June 2006)
14. Nikander, P., Arkko, J., Ohlman, B.: Host Identity Indirection Infrastructure (Hi3). In: Proceedings of the Second Swedish National Computer Networking Workshop, Karlstad, Sweden (November 2004)
15. Packet level authentication [online] [Accessed 10 October 2006], Available from: http://www.tcs.hut.fi/Software/PLA/
16. Rosenberg, J., et al.: SIP: Session Initiation Protocol. The Internet Society, Network Working Group, Request for Comments: 3261 (2002)
17.
Satoh, A., Takano, K.: A Scalable Dual-Field Elliptic Curve Cryptographic Processor. IEEE Transactions on Computers 52(4), 449–460 (2003)
18. Stoica, I., Adkins, D., Zhuang, S., Shenker, S., Surana, S.: Internet Indirection Infrastructure. In: Proceedings of ACM SIGCOMM 2002, Pittsburgh, USA (August 2002)
19. Stoica, I., Morris, R., Karger, D., Kaashoek, M.F., Balakrishnan, H.: Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. In: Proceedings of ACM SIGCOMM 2001, pp. 149–160 (2001)
20. Yaar, A., Perrig, A., Song, D.: SIFF: A Stateless Internet Flow Filter to Mitigate DDoS Flooding Attacks. In: Proceedings of the 2004 IEEE Symposium on Security and Privacy, Oakland, USA (May 2004)
21. Ylitalo, J., Nikander, P.: BLIND: A Complete Identity Protection Framework for Endpoints. In: Proceedings of the Twelfth International Workshop on Security Protocols, Cambridge, UK (April 2004)
Design and Implementation of an Open Source IMS Enabled Conferencing Architecture

A. Buono², T. Castaldi¹, L. Miniero¹, and S. P. Romano¹

¹ University of Napoli Federico II, Via Claudio 21, 80125 Napoli, Italy
² CRIAI Consortium, P.le E. Fermi 1, 80055 Portici (NA), Italy
Abstract. In this paper we embrace an engineering approach to service delivery over the Internet by presenting an actual implementation of a conferencing framework compliant with the IP Multimedia Core Network Subsystem (IMS) specification. The architecture we describe has been conceived from the outset by taking into account ongoing standardization efforts within the various active international bodies. In its current state, it is capable of providing video conferencing facilities with session management capabilities and floor control. The system presented is intended to serve as a running experimental testbed useful for protocol testing, as well as for field trials and experimentation. It is first described from a high-level design perspective and subsequently analyzed in further detail by highlighting the most notable implementation choices. A mapping between the actual system components and the corresponding IMS logical functions is provided, and a discussion is conducted concerning those parts of the system which somehow depart from the IMS paradigm. This will, on the one hand, help the reader figure out potential discrepancies between our solution and the IMS model; on the other hand, it will open space for discussion around some important open issues on which the international research community has yet to reach a rough consensus.

Keywords: IP Multimedia Subsystem, Centralized Conferencing, Floor Control.
1 Introduction
The IP Multimedia Subsystem (IMS) architecture is currently being standardized by the 3rd Generation Partnership Project (3GPP) and aims to provide a common service delivery mechanism capable of significantly reducing the development cycle associated with service creation across both wireline and wireless networks. IMS's main objective resides in trying to reduce both capital and operational expenditures (i.e., CAPEX and OPEX) for service providers, while at the same time providing operational flexibility and simplicity. The envisaged portfolio of IMS services includes advanced IP-based applications like Voice over IP (VoIP), online gaming, videoconferencing, and content sharing. All such services are to be provided on a single, integrated infrastructure capable of offering seamless switching between different services. It is worth noting that IMS
Y. Koucheryavy, J. Harju, and A. Sayenko (Eds.): NEW2AN 2007, LNCS 4712, pp. 468–479, 2007. © Springer-Verlag Berlin Heidelberg 2007
is conceived as an access-agnostic platform. This requirement clearly imposes a careful study of the core IMS components, such as the Call/Session Control Function (CSCF), the Home Subscriber Server (HSS), the Media Resource Function (MRF) and the Application Server (AS), which must be scalable and able to provide advanced features, like five-nines reliability. In the scenario depicted above, the need arises for an architecture which effectively tackles many complex issues, including implementation. Indeed, although early IMS trials and deployments are underway, various challenges still have to be faced, related to both the infrastructure and the service level. The goal of this paper is to provide a contribution to the solution of some of the above-mentioned challenges, with special regard to the need for actual implementations of IMS architectures and services. More precisely, we herein present an implementation of a conferencing framework compliant with the IP Multimedia Core Network Subsystem specification, currently capable of providing video conferencing facilities in conjunction with session management capabilities and floor control. The overall architecture is first described from a high-level design view; then, we delve into the implementation details. The paper is structured as follows. Section 2 helps position our work by providing useful information about the reference context, as well as about the motivations behind our contribution. An IMS-compliant architecture for moderated video conferences is depicted in Section 3. Implementation details are illustrated in Section 4, whereas in Section 5 we deal with related work. Finally, Section 6 provides some concluding remarks, together with information about our future work.
2 Context and Motivation
The Session Initiation Protocol (SIP) [1] provides users with the capability to initiate, manage, and terminate communication sessions in an IP network. SIP already allows calls among multiple parties. However, conferencing represents a more sophisticated service than multi-party calls among multiple users. Indeed, conferencing applies to any kind of media stream by which users may want to communicate: this includes, for example, audio and video media streams, as well as conferences based on instant messaging or even gaming. The conferencing service provides the means for a user to create, manage, terminate, join and leave conferences. This service also provides the network with the ability to deliver information about these conferences to the involved parties. The standardization process associated with centralized conferencing over IP is still at an early stage within the different communities involved in the development of a standard conference system. The Internet Engineering Task Force (IETF) is an open international community concerned with the evolution of the Internet architecture and protocols. The main working groups within the IETF involved in the multimedia conferencing standardization effort are Session Initiation Proposal Investigation (SIPPING) and Centralized Conferencing (XCON). The SIPPING working group has
A. Buono et al.
developed a framework for multi-party conferencing with SIP [2]. This framework is based on the Conferencing Requirements document [3], which defines a general architectural model, presents terminology, and explains how SIP is involved in a tightly coupled conference. On the other hand, the goal of the XCON working group is to define both a reference framework and a data model [4] for tightly coupled conference scenarios envisaging the presence of a centralized management entity, called focus. A focus is a logical entity which maintains a call signaling interface between each participating client and the so-called conference object, which represents a conference at a certain stage (e.g., description upon conference creation, reservation, activation, etc.). Thus, the focus acts as an endpoint for each of the supported signaling protocols and is responsible for all primary conference membership operations. At present, XCON has specified the so-called Binary Floor Control Protocol (BFCP) [5]. BFCP enables applications to provide users with coordinated (shared or exclusive) access to resources like the right to send media over a particular media stream. The 3rd Generation Partnership Project (3GPP) is a collaboration agreement among a number of regional standards bodies. The scope of 3GPP is to develop Technical Specifications for a third-generation mobile system based on GSM. 3GPP has specified the requirements and defined the overall architecture [6] for tightly coupled conferencing. The mentioned document actually represents a sort of umbrella specification within the IP Multimedia Core Network Subsystem (IMS), trying to harmonize the combined use of existing standard protocols, such as the Session Initiation Protocol (SIP), SIP Events, the Session Description Protocol (SDP) and the Binary Floor Control Protocol (BFCP).
The Open Mobile Alliance (OMA) is the leading industry forum for developing market-driven, interoperable mobile service enablers on top of the extensive 3GPP IMS architecture. The OMA conferencing solution builds on these service enablers. At present, OMA has standardized a conference model for Instant Messaging [7], as well as a conference model for the Push to Talk service [8]. Our conferencing framework is based on the architecture for the 3GPP conference service and is fully compliant with the associated requirements.

2.1 The IMS Architecture
Fig. 1 shows the architecture for the 3GPP IMS conferencing service and the interfaces among the different entities involved. For the sake of clarity, we do not explain all the IMS entities and interfaces; we rather focus on those which are relevant to our project. The User Equipment (UE) implements the role of a conference participant and may also support the floor participant or floor chair role (the difference between such roles will be clarified in section 4). The UE might be located either in the Visited or in the Home Network (HN). In any case, it can find the P-CSCF via the CSCF discovery procedure. Once done with the discovery phase, the UE sends SIP requests to the Proxy-Call Session Control Function (P-CSCF). The P-CSCF in turn forwards such messages to the Serving-CSCF (S-CSCF). In order to properly handle any UE request,
Design and Implementation of an Open Source IMS
Fig. 1. The IMS Architecture
the S-CSCF performs both registration and session control procedures (making use of both subscriber and service data stored in the Home Subscriber Server – HSS). It also uses SIP to communicate with the Application Servers (AS). An AS is a SIP entity hosting and executing services (in our scenario, the AS clearly hosts the conferencing service). The IP Multimedia Service Control (ISC) interface is used to send and receive SIP messages between the S-CSCF and the AS. The two main procedures of the ISC are: (i) routing the initial SIP request to the AS; (ii) initiating a SIP request from the AS on behalf of a user. For the initiating request, the SIP AS and the OSA SCS (Open Service Access – Service Capability Server) need either to access the user's data or to know an S-CSCF to rely upon for such a task. As we already mentioned, such information is stored in the HSS, so the AS and the OSA SCS can communicate with it via the Sh interface. In a SIP-based conferencing scenario, the MRFC (Media Resource Function Controller) regards the MRFP (Media Resource Function Processor) as a mixer. In fact, the MRFP hosts both the mixer and the floor control server. When the MRFC needs to control media streams (creating a conference, handling or manipulating a floor, etc.) it uses the Mp interface, which is fully compliant with the H.248 protocol standard. The MRFC is needed to support bearer-related services, such as conferencing. The focus, conference policy server and media policy server are co-located in an AS/MRFC component in the 3GPP framework. The S-CSCF communicates with the MRFC via Mr, a SIP-based interface. In this scenario the AS/MRFC implements the role of a conference focus and a conference notification service. The MRFC may support the floor control server role, the floor chair role or the floor participant role.
3 An IMS-Compliant Video Conferencing Architecture
From an architectural perspective, our effort was first to identify and locate the IMS logical elements needed to properly handle an advanced conferencing scenario, and subsequently to find out how such elements could be replaced with real-world components. The following sections provide more information about these two steps. Starting from a bird's eye view of the IMS architecture as shown in Fig. 1, we can clearly identify several elements defining the behaviors needed in a conferencing scenario. The very first mandatory element is the User Equipment (UE), which has to be both SIP compliant and XCON enabled in order to correctly support conferencing. According to the flow of messages, we need the P-CSCF, which behaves like a proxy: it accepts requests and forwards them to the S-CSCF. Hence, the S-CSCF and the HSS are the next selected elements, which perform a number of important tasks, such as checking users' access and authorization rights, handling session control for the registered endpoint sessions and interacting with service platforms for the support of services. Once done with the control elements needed to build the signaling plane of a conferencing scenario, we can now focus on floor management, streaming and control. To accomplish this task we selected the following elements. The SIP-AS is the SIP Application Server as defined in [9] and is in charge of managing conferences (e.g. creating, modifying, deleting them). Besides, it is responsible for floor control, by managing access rights to shared resources in our conferencing framework. The MRFC, in turn, controls the media stream resources in the MRFP, interprets information coming from an AS and the S-CSCF, and controls the MRFP accordingly. The MRFP provides the resources to be controlled by the MRFC, as well as additional functionality like mixing of incoming media streams (in our case, audio and video streams) and media stream processing (e.g. audio transcoding, media analysis). The MGCF performs the interworking with the PSTN, while controlling the MGW for the required media conversions. Finally, the MGW helps perform the interworking with the PSTN, at the same time controlling and reserving the resources required by the media streams. According to the identified requirements, we can replace the IMS elements with several real-world components. In our architecture, some of these components have been provided by the open source community. Some other entities have been either developed from scratch or based on open source components that have been appropriately extended in order to meet our architecture requirements. As described in Fig. 2, we replaced the UE with a SIP client called Minisip (http://www.minisip.org/), which we made capable of handling BFCP protocol messages. We also replaced the P-CSCF with a fully compliant SIP proxy server called OpenSER (http://www.openser.org/). The S-CSCF and HSS elements have been realized by exploiting an open source SIP server called Asterisk (http://www.asterisk.org). Asterisk actually provided us with many required
Fig. 2. IMS Elements Mapping
IMS functions. In fact, the role of the SIP-AS is played in our architecture by an enhanced version of an Asterisk component called MeetMe, capable of managing conferences. Furthermore, the roles of the MRFC and MRFP components are played by a couple of ad-hoc modified Asterisk modules providing media management, streaming and floor control. Finally, we replaced the MGCF and MGW components with a native Asterisk component performing the interworking with the PSTN, including all the related activities. Based on the above considerations, in the next section we delve into the details of the implementation of our architecture.
4 CONFIANCE: An Open Source Implementation of the Conferencing Architecture
In this section we present an actual implementation of an open platform for the support of IP-based conferencing scenarios. This platform, which we called CONFIANCE (CONFerencing IMS-enabled Architecture for Next-generation Communication Experience), has been realized in the framework of a collaboration between the University of Napoli and Ericsson's Nomadic Lab in Helsinki, and it tries to take into account the most recent proposals under development inside the various standardization communities. More precisely, starting from the IMS-compliant design described in the previous sections, we implemented a video conferencing service which provides advanced capabilities, like moderated access to conference resources. In order to accomplish this task, we took inspiration from the ongoing work inside the IETF XCON working group. As stated in section 2, the XCON framework defines a set of conferencing protocols, which are complementary to the call signaling protocols, for building
advanced conferencing applications. Among them, the so-called BFCP (Binary Floor Control Protocol) deserves special attention. BFCP enables conferencing applications to provide users with coordinated (shared or exclusive) access to available conference resources. By coordinated access we mean the capability to manage the access to a set of shared resources, such as the right to send media over a particular media stream. Each shared resource or set of resources is associated with a so-called floor, which is defined as a permission to temporarily access or manipulate the set of resources in question. A logical entity called chair is made responsible for one or more such floors. Its main task is managing requests for the floors it is assigned to. The clients of a conference can make floor requests on a transaction-by-transaction basis to the Floor Control Server, thus asking for permission to access a specific set of resources. The server forwards incoming requests to the chair, asking her/him for a decision about them. BFCP can be used not only to make requests, but also to ask for information about specific floors, existing requests, and requests made by other participants in a conference. A third-party floor request mechanism is also offered, enabling requests whose beneficiary is a conference user other than the original requester. Chairs are also offered more complex functionality, e.g. the ability to actively revoke a floor from a participant who may be abusing it. Notice that while BFCP offers a mechanism for coordinated access to resources, the policies a Floor Control Server follows to grant or deny floors are outside its specification. To make available an XCON-compliant architecture, we had to work both on the client and on the server side, as well as on the communication protocols between them. On the client side we implemented the two roles envisaged in the architecture, namely the simple participant and the chair.
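The chair-mediated request flow just described can be sketched in a few lines. The following Python model is purely illustrative (the actual implementation is in C, inside Asterisk and Minisip); the class and method names are our own, not BFCP terminology:

```python
from enum import Enum

class Status(Enum):
    PENDING = "Pending"    # forwarded to the chair, awaiting a decision
    GRANTED = "Granted"
    DENIED = "Denied"
    REVOKED = "Revoked"    # the chair actively withdraws a granted floor

class FloorControlServer:
    """Toy model of the chair-mediated floor request flow (names illustrative)."""

    def __init__(self, floor_id, chair_id):
        self.floor_id = floor_id
        self.chair_id = chair_id
        self.requests = {}          # request_id -> [beneficiary, Status]
        self._next_id = 0

    def request_floor(self, requester, beneficiary=None):
        # A third-party request names a beneficiary other than the requester.
        self._next_id += 1
        self.requests[self._next_id] = [beneficiary or requester, Status.PENDING]
        return self._next_id        # the server now asks the chair for a decision

    def chair_decision(self, chair_id, request_id, grant):
        if chair_id != self.chair_id:
            raise PermissionError("only the assigned chair decides on this floor")
        self.requests[request_id][1] = Status.GRANTED if grant else Status.DENIED

    def revoke(self, chair_id, request_id):
        if chair_id != self.chair_id:
            raise PermissionError("only the assigned chair may revoke")
        self.requests[request_id][1] = Status.REVOKED
```

Note how the server itself never decides: it only records the chair's verdict, mirroring the separation between the floor control mechanism and the granting policy.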
On the server side, we implemented the role of the focus, as defined in [4]. Finally, coming to the communication protocols, besides implementing BFCP as it is currently specified in the IETF [5], we also designed and realized a brand new conferencing control protocol, whose features will be briefly described in the following. Specifically, BFCP has been implemented as a library integrated into both the client and server entities of the architecture. Instead, due to the current lack of agreement in the XCON WG about a specific conference control protocol, we went for a temporary alternative solution capable of offering the basic functionality that the architecture is supposed to provide. We then specified and implemented a text-based protocol, called XCON Scheduler. Clients can use this protocol to dynamically manage conference creation as well as conference information.

4.1 Server Side Components
On the server side, we adopted Asterisk, an open source PBX which is gaining more and more popularity. Asterisk has been conceived as a modular software component, which can quite easily be modified in order to introduce new functionality as needed. Indeed, we added the following functionality to Asterisk:
– XCON-related identifiers, needed to manage conferences;
– a Floor Control Server (FCS), by means of a library implementing the server-side behaviour of the BFCP;
– a Scheduler Server, the server-side component implementing the conference scheduling protocol;
– a Notification Service, to enable the interception of asynchronous events.

These components have been realized as extensions to an already available conferencing facility, called MeetMe, which allows users to access conferences by simply calling a predefined conference room, associated with a standard extension of Asterisk's dialplan. By interacting with clients through the dynamic exchange of conference scheduling messages (as defined in the Scheduler component), the enhanced MeetMe module allows for dynamic conference management in a user-friendly fashion. Required changes in the dialplan, as well as dynamic reloading of the needed Asterisk modules, have been achieved by appropriately modifying the already available MeetMe module. To allow video conferencing functionality, which is lacking in the base MeetMe module, we added a BFCP-moderated video switching feature to MeetMe. Work is currently being done on video mixing and transcoding functionality as well. Since the XCON framework defines new identifiers (e.g. Conference URI [10] and User ID [11]), as does the BFCP specification, the existing MeetMe data model has been enriched with the new required information. Coming to BFCP, this has actually required the greatest development effort, since we had to implement the entire protocol from scratch (at the time of this writing BFCP has only recently become a completely specified RFC, and nonetheless work on its refinement is still in full swing inside the IETF). BFCP has been realized as a library, which is loaded at run time by the Asterisk server and is called whenever the need arises to deal with BFCP messages (creation, parsing, etc.).
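To give an idea of what the parsing side of such a library deals with, the following sketch decodes the fixed 12-octet COMMON-HEADER that, per RFC 4582, prefixes every BFCP message. The real library is written in C inside Asterisk; this Python version only illustrates the field layout:

```python
import struct
from collections import namedtuple

BfcpHeader = namedtuple("BfcpHeader",
                        "version primitive payload_length conference_id "
                        "transaction_id user_id")

def parse_common_header(data: bytes) -> BfcpHeader:
    """Decode the 12-octet BFCP COMMON-HEADER (RFC 4582):
    Ver(3 bits) | Reserved(5) | Primitive(8) | Payload Length(16) |
    Conference ID(32) | Transaction ID(16) | User ID(16)."""
    if len(data) < 12:
        raise ValueError("truncated BFCP header")
    first, primitive, length, conf_id, tid, uid = struct.unpack("!BBHIHH", data[:12])
    version = first >> 5          # top 3 bits; the low 5 bits are reserved
    return BfcpHeader(version, primitive, length, conf_id, tid, uid)

# Build a FloorRequest header (primitive 1) for conference 4711, user 99:
raw = struct.pack("!BBHIHH", 1 << 5, 1, 0, 4711, 1, 99)
```

After this fixed header, the payload carries a sequence of TLV attributes (floor IDs, request status, etc.), which the library parses in the same big-endian fashion.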
In fact, Asterisk acts as the Floor Control Server of the architecture (see Fig. 3). It comes into play every time a request is generated by a participant asking for the right to access a specific resource (e.g. audio or video). As highlighted in the picture, the FCS itself does not take any decision; it rather forwards floor requests to the appropriate floor chair, who makes a decision that is eventually notified to the originating client (and potentially to other interested parties, e.g. other participants involved in the same conference as the originating client). When a chair is missing or no chair is assigned to a floor, the FCS takes automatic decisions according to a previously specified policy (i.e. automatically accept or deny new requests for the floor). As a transport method for BFCP messages, support for both TCP/BFCP and TCP/TLS/BFCP (as specified in [5]) has been implemented. Since a UE, in order to take advantage of the BFCP functionality, needs to know all the BFCP-related information involved in a conference she/he will be participating in, the focus needs a way to transmit this data to her/him. Apart from any out-of-band mechanism that could be exploited, the IETF specifies a
Fig. 3. The BFCP protocol in action
way [12] to let SDP (Session Description Protocol) transport this information. This functionality has been implemented in the module as well. Then, in order to provide users with the capability of dynamically managing conferences, we specified and implemented a brand new protocol. The Scheduler has been conceived as a text-based protocol, mainly used by clients either to create a new conference instance, or to ask the server for the list of available conferences (i.e. either ongoing conferences or conferences scheduled for the near future). Starting from this protocol, we implemented a Web Services-enabled wrapper around its functionality, and a proxy client that allows clients using simple browsers to access and manage conference information. This is the same approach the XCON WG has recently started investigating for a new candidate conference control protocol, the Centralized Conferencing Manipulation Protocol [13], with which our approach is fully compliant as far as the currently provided functionality is concerned. Finally, we implemented a Notification Service by exploiting both existing solutions and custom-implemented modules. As an existing solution, the Asterisk Manager Interface (AMI) has been used to notify authorized listeners about all relevant conference-related events. However, this approach only allows active notifications towards a passive listener. To overcome this limitation, and allow a bidirectional event notification mechanism with an external entity, we implemented a brand new protocol, which we called Dispatcher. This protocol is the basis for work we are carrying out in order to improve the scalability of centralized conferencing frameworks, and as such it is only mentioned here. It is presented in detail in [14].
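Since the exact XCON Scheduler wire syntax is not detailed here, the sketch below only conveys the flavor of such a line-oriented, text-based protocol; the CREATE/LIST verbs and the reply format are invented for illustration, not taken from our specification:

```python
class SchedulerServer:
    """Illustrative handler for a line-oriented conference scheduling
    protocol. Verbs and replies are hypothetical."""

    def __init__(self):
        self.conferences = {}      # conf_id -> description
        self._next = 1000

    def handle(self, line: str) -> str:
        verb, _, arg = line.strip().partition(" ")
        if verb == "CREATE":
            conf_id = str(self._next)
            self._next += 1
            self.conferences[conf_id] = arg     # arg = conference description
            return f"OK {conf_id}"
        if verb == "LIST":
            # Return the identifiers of all known (ongoing or scheduled) conferences.
            return "OK " + ",".join(self.conferences) if self.conferences else "OK"
        return "ERR unknown verb"
```

A Web Services wrapper, as described above, would simply expose `handle`-like operations (create, list) as SOAP/HTTP endpoints on top of the same state.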
4.2 Client Side Components
On the client side, we decided to adopt Minisip as a starting point. Minisip is an open source soft-phone supporting the SIP protocol and making use of the
GTK+ framework for its graphical interfaces. Most of the available configuration widgets used in Minisip have been appropriately modified in order to enable support for XCON and BFCP settings. Furthermore, brand new widgets have been created in order to include the required client-side functionality, related to both conference scheduling and BFCP. As to conference scheduling, a widget has been implemented which provides users with the needed conference scheduling functionality: through such a widget it becomes easy to create a new conference, retrieve the list of active XCON conferences, or join an existing XCON conference. Coming to BFCP, new classes implementing the client-side BFCP behavior have been added to Minisip. Associated with such classes, new widgets have been provided which enable users to: (i) send BFCP messages to the BFCP server; (ii) interactively build BFCP floor requests, either in participant or in chair (i.e. with enhanced floor management functionality) mode; (iii) keep an up-to-date log of the BFCP messages exchanged with the server. Finally, with respect to the role of the chair, we implemented ad-hoc interfaces used to either manage floor requests issued by conference participants, or build so-called third-party floor requests, i.e. requests generated by the chair on behalf of a different participant¹. Since we also added to Minisip support for the encapsulation of BFCP information in SDP bodies, BFCP is automatically exploited whenever a SIP INVITE or re-INVITE contains BFCP-related data. Besides, the appropriate transport method for the BFCP communication with the FCS (i.e. TCP/BFCP or TCP/TLS/BFCP) is automatically chosen according to this SDP negotiation.
5 Related Work
The architecture we presented in this paper focuses on two main aspects: (i) compatibility with the IMS framework; (ii) capability to offer advanced functionality such as floor control, conference scheduling and management, etc. While there is a rich literature on each of the above points when considered alone, to the best of our knowledge no integrated effort has been made to date to provide a real architecture that is both IMS compliant and capable of offering floor management functionality. This is mainly due to the fact that no agreed-upon solution has so far been designated in the various international standardization fora with respect to some crucial points, such as the choice of the most suitable conferencing control protocol, as well as its integration in the general IMS architecture. Interestingly enough, a few works have already proposed to make a further step ahead, by moving from a centralized to a distributed perspective. This is the case, for example, of [15], where the authors propose a model trying to extend the XCON approach to a distributed scenario. While this is currently out of the scope of the IETF, it does represent one of our primary goals for the near future, as will be explained in the next section. On the IMS side, some efforts have already been devoted to the realization of IMS-compliant testbeds, as in
¹ Notice that such functionality is particularly interesting since it enables the chair to allow non-XCON-enabled clients to take part in an XCON-enabled conference.
the case of [16], where the authors propose a testbed for multimedia services support based on the IMS specification. Finally, several other works can be found in the literature, though based on superseded models such as those defined in the IETF SIPPING working group. This is the case, e.g., of [17] and [18].
6 Conclusions and Future Work
In this paper we presented an actual implementation of an IMS-compliant architecture aimed at offering a video conferencing service with enhanced functionality, such as conference scheduling facilities and conference moderation. The system we developed is based on open source components, which have been appropriately modified in order to introduce support for the new required protocols and mechanisms. We are currently done with the main implementation effort, related to a tightly coupled, centralized conferencing model. However, many challenging issues still have to be faced. In particular, we have already defined an architecture capable of realizing a distributed conferencing system with strong reliability and scalability properties. Starting from the available centralized conferencing system, we have defined the overall architecture for distributed conferencing in terms of framework, data model and protocol definitions. The framework under definition has been called DCON, standing for Distributed Conferencing, but at the same time explicitly recalling the already standardized XCON model. Indeed, DCON will be implemented as a large-scale evolution of the XCON framework. We propose to deploy our architecture on top of a two-layer network topology. The top layer is represented by an overlay network in which each node plays the role of the focus element of an XCON "island". The lower layer, in turn, is characterized by a star topology (in which the central hub is represented by the focus element) and is fully compliant with the XCON specification. In the DCON scenario, communication among different islands (i.e. among the focus elements managing different islands) becomes of paramount importance, since it enables the sharing of information about the state of the available conferences, as well as about the participants involved in a distributed conference.
To this purpose, we are investigating the possibility of adopting the so-called S2S (Server-to-Server) module of the XMPP (Extensible Messaging and Presence Protocol). XMPP has been standardized by the IETF as a candidate protocol to support instant messaging, e-presence and generic request-response services, and it looks to us like the ideal communication means among DCON focus entities. A prototype of the platform is already available (http://dcon.sf.net/) and currently provides distributed videoconferencing functionality.
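The island/overlay interplay can be illustrated with a toy model: each focus announces locally created conferences to its overlay peers, so every island learns where a given conference lives. All names below are illustrative; the real prototype uses XMPP S2S for this exchange:

```python
class Focus:
    """Illustrative DCON island focus: it manages its local (XCON)
    conferences and learns about remote ones from its overlay peers."""

    def __init__(self, island: str):
        self.island = island
        self.local = {}            # conf_id -> set of participants
        self.remote = {}           # conf_id -> owning island
        self.peers = []            # other foci in the overlay

    def connect(self, other: "Focus"):
        # Symmetric overlay link between two island foci.
        self.peers.append(other)
        other.peers.append(self)

    def create_conference(self, conf_id: str):
        self.local[conf_id] = set()
        # Announce the new conference to every peer (XMPP S2S in the text).
        for peer in self.peers:
            peer.remote[conf_id] = self.island
```

Each focus still runs a standard XCON star locally; only conference state, not media, crosses island boundaries in this sketch.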
Acknowledgments. This work has been carried out with the financial support of the European projects NetQoS, OneLab and Content. Such projects are partially funded by the EU as part of the IST Programme, within the Sixth Framework Programme.
References
1. Rosenberg, J., Schulzrinne, H., Camarillo, G., et al.: SIP: Session Initiation Protocol. RFC 3261 (June 2002)
2. Rosenberg, J.: A Framework for Conferencing with the Session Initiation Protocol (SIP). RFC 4353 (February 2006)
3. Levin, O., Even, R.: High-Level Requirements for Tightly Coupled SIP Conferencing. RFC 4245 (November 2005)
4. Barnes, M., Boulton, C., Levin, O.: A Framework and Data Model for Centralized Conferencing. draft-ietf-xcon-framework-07 (January 2007)
5. Camarillo, G., Ott, J., Drage, K.: The Binary Floor Control Protocol (BFCP). RFC 4582 (November 2006)
6. 3GPP: Conferencing using the IP Multimedia (IM) Core Network (CN) subsystem; Stage 3. Technical report, 3GPP (March 2006)
7. OMA: Instant Messaging using SIMPLE Architecture. Technical report, OMA
8. OMA: Push to talk over Cellular (PoC) – Architecture. Technical report, OMA
9. 3GPP: IP Multimedia Subsystem; Stage 2. Technical report, 3GPP (June 2006)
10. Boulton, C., Barnes, M.: A Universal Resource Identifier (URI) for Centralized Conferencing (XCON). draft-boulton-xcon-uri-01 (February 2007)
11. Boulton, C., Barnes, M.: A User Identifier for Centralized Conferencing (XCON). draft-boulton-xcon-userid-01 (February 2007)
12. Camarillo, G.: Session Description Protocol (SDP) Format for Binary Floor Control Protocol (BFCP) Streams. RFC 4583 (November 2006)
13. Barnes, M., Boulton, C., Schulzrinne, H.: Centralized Conferencing Manipulation Protocol. draft-barnes-xcon-ccmp-02 (January 2007)
14. Buono, A., Loreto, S., Miniero, L., Romano, S.P.: A Distributed IMS Enabled Conferencing Architecture on Top of a Standard Centralized Conferencing Framework. IEEE Communications Magazine 45(3) (2007)
15. Cho, Y., Jeong, M., Nah, J., Lee, W., Park, J.: Policy-Based Distributed Management Architecture for Large-Scale Enterprise Conferencing Service Using SIP. IEEE Journal on Selected Areas in Communications 23, 1934–1949 (2005)
16. Magedanz, T., Witaszek, D., Knuettel, K.: The IMS Playground @ Fokus – An Open Testbed for Next Generation Network Multimedia Services. In: Proceedings of the First International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities (TRIDENTCOM 2005) (2005)
17. Yang, Z., Huadong, M., Zhang, J.: A Dynamic Scalable Service Model for SIP-based Video Conference. In: Proceedings of the 9th International Conference on Computer Supported Cooperative Work in Design
18. Singh, A., Mahadevan, P., Acharya, A., Shae, Z.: Design and Implementation of SIP Network and Client Services. In: Proceedings of the 13th International Conference on Computer Communication and Networks (ICCCN), Chicago, IL (2004)