PERFORMANCE MODELS AND RISK MANAGEMENT IN COMMUNICATIONS SYSTEMS
Springer Optimization and Its Applications VOLUME 46 Managing Editor Panos M. Pardalos (University of Florida) Editor–Combinatorial Optimization Ding-Zhu Du (University of Texas at Dallas) Advisory Board J. Birge (University of Chicago) C.A. Floudas (Princeton University) F. Giannessi (University of Pisa) H.D. Sherali (Virginia Polytechnic and State University) T. Terlaky (McMaster University) Y. Ye (Stanford University)
Aims and Scope Optimization has been expanding in all directions at an astonishing rate during the last few decades. New algorithmic and theoretical techniques have been developed, the diffusion into other disciplines has proceeded at a rapid pace, and our knowledge of all aspects of the field has grown even more profound. At the same time, one of the most striking trends in optimization is the constantly increasing emphasis on the interdisciplinary nature of the field. Optimization has been a basic tool in all areas of applied mathematics, engineering, medicine, economics and other sciences. The series Springer Optimization and Its Applications publishes undergraduate and graduate textbooks, monographs and state-of-the-art expository works that focus on algorithms for solving optimization problems and also study applications involving such problems. Some of the topics covered include nonlinear optimization (convex and nonconvex), network flow problems, stochastic optimization, optimal control, discrete optimization, multiobjective programming, description of software packages, approximation techniques and heuristic approaches.
For other titles published in this series, go to http://www.springer.com/series/7393
PERFORMANCE MODELS AND RISK MANAGEMENT IN COMMUNICATIONS SYSTEMS
By
NALÂN GÜLPINAR Warwick Business School Coventry, UK PETER HARRISON Imperial College London, UK BERÇ RÜSTEM Imperial College London, UK
Editors Nalân Gülpınar Warwick Business School The University of Warwick Coventry, CV4 7AL, UK
[email protected]
Peter Harrison Department of Computing Imperial College London London, SW7 2BZ, UK
[email protected]
Berç Rüstem Department of Computing Imperial College London London, SW7 2BZ, UK
[email protected]
ISSN 1931-6828 ISBN 978-1-4419-0533-8 e-ISBN 978-1-4419-0534-5 DOI 10.1007/978-1-4419-0534-5 Springer New York Dordrecht Heidelberg London Mathematics Subject Classification (2010): 90B15, 90B18, 90C15, 90C90, 91A40, 93E03 Library of Congress Control Number: 2010937634 c Springer Science+Business Media, LLC 2011 All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
The computer and telecommunication sectors are important drivers of dynamic global economic development. The design and deployment of future networks are subject to uncertainties such as capacity prices, demand and supply for services, and shared infrastructure. Moreover, recent trends in these sectors have led to a considerable increase in the level of uncertainty. Optimal operator policies and the design of complex network functionalities require decision support methodologies that take uncertainty into account in order to improve performance, cost-effectiveness, risk, and security and to ensure robustness. Stochastic modeling, decision making, and game-theoretic techniques help ensure the optimum end-to-end performance of general network systems. Optimal system design unifying performance modeling and decision making provides a generic approach, and real-time optimal decision making is intended to improve efficiency. Risk management injects robustness and ensures that the effects of uncertainty are taken into account. The achievement of best performance may conflict with the minimization of the associated risk. Robustness in view of traffic variations and changes in network capacity or topology is important for both network operators and end users. A robust network allows operators to hedge against uncertainty and hence to save costs and yield performance benefits to end users. Robustness can be achieved by introducing diversity at the transport level and by using flow or congestion control. This book considers recent developments in the design, operation, and management of telecommunication and computer network systems in performance engineering and addresses issues of uncertainty, robustness, and risk. The book consists of 10 chapters that provide a reference tool for scientists and engineers in telecommunication and computer networks. Moreover, it is intended to motivate a new wave of research at the interface of telecommunications and operations research. Coventry, UK London, UK London, UK
Nalân Gülpınar Peter Harrison Berç Rüstem
v
Contents
Distributed and Robust Rate Control for Communication Networks . . . . . . . . . 1
Tansu Alpcan
Of Threats and Costs: A Game-Theoretic Approach to Security Risk Management . . . . . . . . . 33
Patrick Maillé, Peter Reichl, and Bruno Tuffin
Computationally Supported Quantitative Risk Management for Information Systems . . . . . . . . . 55
Denis Trček
Cardinality-Constrained Critical Node Detection Problem . . . . . . . . . 79
Ashwin Arulselvan, Clayton W. Commander, Oleg Shylo, and Panos M. Pardalos
Reliability-Based Routing Algorithms for Energy-Aware Communication in Wireless Sensor Networks . . . . . . . . . 93
Janos Levendovszky, Andras Olah, Gergely Treplan, and Long Tran-Thanh
Opportunistic Scheduling with Deadline Constraints in Wireless Networks . . . . . . . . . 127
David I Shuman and Mingyan Liu
A Hybrid Polyhedral Uncertainty Model for the Robust Network Loading Problem . . . . . . . . . 157
Ayşegül Altın, Hande Yaman, and Mustafa Ç. Pınar
Analytical Modelling of IEEE 802.11e Enhanced Distributed Channel Access Protocol in Wireless LANs . . . . . . . . . 173
Jia Hu, Geyong Min, Mike E. Woodward, and Weijia Jia
Dynamic Overlay Single-Domain Contracting for End-to-End Contract Switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Murat Yüksel, Aparna Gupta, and Koushik Kar Modelling a Grid Market Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 Fernando Martínez Ortuño, Uli Harder, and Peter Harrison
Contributors
Tansu Alpcan Deutsche Telekom Laboratories, Technical University of Berlin, Ernst-Reuter-Platz 7, Berlin 10587 Germany,
[email protected] Ay¸segül Altın Department of Industrial Engineering, TOBB University of Economics and Technology, Sö˘gütözü 06560 Ankara, Turkey,
[email protected] Ashwin Arulselvan Center for Discrete Mathematics and Applications, Warwick Business School, University of Warwick, Coventry, UK,
[email protected] Clayton W. Commander Air Force Research Laboratory, Munitions Directorate, and Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA,
[email protected] Aparna Gupta Rensselaer Polytechnic Institute, Troy, NY 12180, USA,
[email protected] Uli Harder Department of Computing, Imperial College London, Huxley Building, 180 Queens Gate, London SW7 2AZ, UK,
[email protected] Peter Harrison Department of Computing, Imperial College London, Huxley Building, 180 Queens Gate, London SW7 2AZ, UK,
[email protected] Jia Hu Department of Computing, School of Informatics, University of Bradford, Bradford, BD7 1DP, UK,
[email protected] Weijia Jia Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave, Hong Kong,
[email protected] Koushik Kar Rensselaer Polytechnic Institute, Troy, NY 12180, USA,
[email protected] Janos Levendovszky Budapest University of Technology and Economics, Department of Telecommunications, H-1117 Magyar tud. krt. 2, Budapest, Hungary,
[email protected] Mingyan Liu Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, MI 48109, USA,
[email protected]
Patrick Maillé Institut Telecom; Telecom Bretagne, 2 rue de la Châtaigneraie CS 17607, 35576 Cesson-Sévigné Cedex, France,
[email protected] Geyong Min Department of Computing, School of Informatics, University of Bradford, Bradford, BD7 1DP, UK,
[email protected] Andras Olah Faculty of Information Technology, Peter Pazmany Catholic University, H-1083 Práter u. 50/A Budapest, Hungary,
[email protected] Fernando Martínez Ortuño Department of Computing, Imperial College London, Huxley Building, 180 Queens Gate, London SW7 2AZ, UK,
[email protected] Panos M. Pardalos Center for Applied Optimization, Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA,
[email protected] Mustafa Ç. Pınar Department of Industrial Engineering, Bilkent University, 06800 Ankara, Turkey,
[email protected] Peter Reichl Telecommunications Research Center Vienna (ftw.), Donau-City-Str. 1, 1220 Wien, Austria,
[email protected] David I Shuman Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, MI 48109, USA,
[email protected] Oleg Shylo Center for Applied Optimization, Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA,
[email protected] Long Tran-Thanh Budapest University of Technology and Economics, Department of Telecommunications, z, H-1117 Magyar tud. krt. 2, Budapest, Hungary,
[email protected] Denis Trˇcek Faculty of Computer and Information Science, Laboratory of E-media , University of Ljubljana, Tržaška cesta 25, 1000 Ljubljana, Slovenia,
[email protected] Gergely Treplan Faculty of Information Technology, Peter Pazmany Catholic University, H-1083 Práter u. 50/A Budapest, Hungary,
[email protected] Bruno Tuffin INRIA Rennes – Bretagne Atlantique, Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France,
[email protected] Mike E. Woodward Department of Computing, School of Informatics, University of Bradford, Bradford, BD7 1DP, UK,
[email protected] Hande Yaman Department of Industrial Engineering, Bilkent University, 06800 Ankara, Turkey,
[email protected] Murat Yüksel University of Nevada - Reno, Reno, NV 89557, USA,
[email protected]
Distributed and Robust Rate Control for Communication Networks Tansu Alpcan
1 Introduction
Wired and wireless communication networks are a ubiquitous and indispensable part of modern society. They serve a variety of purposes and applications for their end users. Hence, networks exhibit heterogeneous characteristics in terms of their access infrastructure (e.g., wired vs. wireless), protocols, and capacity. Moreover, contemporary networks such as the Internet are heavily decentralized in both their administration and resources. The end users of communication networks are also diverse and run a variety of applications ranging from multimedia (VoIP, video) to gaming and data communications. As a result of the networks' distributed nature, users often have little information about the network topology and characteristics. Regardless, they can behave selfishly in their demands for bandwidth. Given these characteristics of contemporary networks and their users, a fundamental research question is how to ensure efficient, fair, and incentive-compatible allocation of network bandwidth among users. Complicating the problem further, these objectives have to be achieved through distributed algorithms while ensuring robustness with respect to information delays and capacity changes. This research challenge can be quite open-ended due to the multifaceted nature of the underlying problems.
1.1 Summary and Contributions This chapter presents three control and game-theoretic approaches that address the described rate control problem from different perspectives. The objective here is to investigate the underlying mathematical principles of the problem and solution concepts rather than discussing possible implementation scenarios. However, it is Tansu Alpcan Deutsche Telekom Laboratories, Technical University of Berlin, Ernst-Reuter-Platz 7, Berlin 10587, Germany e-mail:
[email protected]
hoped that the rigorous mathematical analysis presented will be useful as a basis for engineering future rate control schemes. A noncooperative rate control game is presented in Section 3. Adopting a utilitybased approach, user preferences are captured by a fairly general class of cost functions [4]. Based on their own utility functions and external prices, the users (players) of this game use a standard gradient algorithm to update their flow rates iteratively over time, resulting in an end-to-end congestion control scheme. The game admits a unique Nash equilibrium under a sufficient condition, where no user has an incentive to deviate from it. Furthermore, a mild symmetricity assumption and a sufficient condition on maximum delay ensure its global stability with respect to the gradient algorithm for general network topologies and under fixed heterogeneous delays. The upper bound on communication delays given in the sufficient condition is inversely proportional to the square root of the number of users sharing a link multiplied by the cube of a gain constant. Section 4 studies a primal–dual rate control scheme to solve a global optimization problem, where each user’s cost function is composed of a pricing function proportional to the queueing delay experienced by the user, and a fairly general utility function which captures the user’s demand for bandwidth [5]. The global objective is to maximize the sum of user utilities under fixed capacity constraints. Using a network model based on fluid approximations and through a realistic modeling of queues, the existence of a unique equilibrium is established, at which the global optimum is achieved. The scheme is globally asymptotically stable for a general network topology. Furthermore, sufficient conditions for system stability are derived when there is a bottleneck link shared by multiple users experiencing non-negligible communication delays. A robust flow control framework is introduced in Section 5. It is based on an H∞ -optimal control formulation for allocating rates to devices on a network with heterogeneous time-varying characteristics [6]. H∞ methods are used in control theory to synthesize controllers achieving robust performance or stabilization. Here, H∞ analysis and design allow for the coupling between different devices to be relaxed by treating the dynamics for each device independently from others. Thus, the resulting distributed end-to-end rate control scheme relies on minimum information and achieves fair and robust rate allocation for the devices. In the fixed capacity case, it is shown that the equilibrium point of the system ensures full capacity usage by the users. The formulations presented in Sections 3, 4, and 5 are, on the one hand, closely related to each other. Each approach mainly shares the same common network model, which will be discussed in Section 2. Furthermore, they are totally distributed, end-to-end schemes with little information exchange overhead. All of the schemes are robust with respect to information delays in the system and their stability properties are analyzed rigorously. On the other hand, each approach brings the problem of rate allocation a different perspective. The rate control game of Section 3 focuses mainly on incentive compatibility and adopts Nash equilibrium as the preferred solution concept. The primal–dual scheme of Section 4 extends the basic fluid network model by taking into account the queue dynamics and is built upon available information to users for
decision making. The robust rate control scheme of Section 5 emphasizes robustness with respect to capacity changes and delays. It also differs from the previous two, which share a utility-based approach, by focusing on fully efficient usage of the network capacity. The remainder of the chapter is organized as follows. A brief overview of the relevant literature is given next. The network model and its underlying assumptions are presented in Section 2. Section 3 studies a rate control game along with an equilibrium and stability analysis under information delays. In Section 4, a primal–dual scheme is investigated. Section 5 presents a robust control framework and its analysis. The chapter concludes with remarks in Section 6.
1.2 Related Work
The research community has intensively studied the challenging problem of rate and congestion control in recent years. Consequently, a rich literature, which investigates the problem using a variety of approaches, has emerged. While far from a comprehensive survey, a small subset of the existing body of literature on the subject is summarized here as a reference. After the introduction of the congestion control algorithm for the transmission control protocol (TCP) [18], the research community has focused on modeling and analysis of rate control algorithms. Based on an earlier work by Kelly [20], Kelly et al. [21] have presented the first comprehensive mathematical model and posed the underlying resource allocation problem as one of constrained optimization. The primal and dual algorithms that they have introduced are based on user utility and link pricing (explicit congestion feedback) functions, where the sum of user utilities is maximized within the capacity (bandwidth) constraints of the links. They have also introduced the concept of proportional fairness, which is a relaxed version of max–min fairness [31], as a resource allocation criterion among users. Subsequent studies [14, 23, 24, 26, 29] have investigated variations and generalizations of the distributed congestion control framework of [20, 21]. Low and Lapsley [26] have analyzed the convergence of distributed synchronous and asynchronous discrete algorithms, which solve a similar optimization problem. Mo and Walrand [29] have generalized proportional fairness and have proposed a fair end-to-end window-based congestion control scheme, which is similar to the primal algorithm. The main difference of this window-based algorithm from the primal algorithm is that it does not need explicit congestion feedback from the routers. Instead it makes use of measured queueing delay as implicit congestion feedback. La and Anantharam [24] have considered a system model similar to the one proposed in [29], with a window-based control scheme and static modeling of link buffers. They have investigated convergence properties of the proposed charge-sensitive congestion control scheme, which utilizes a static pricing scheme based on link queueing delays. In addition, they have established stability of the algorithm at a single bottleneck node. Kunniyur and Srikant [23] have examined the question of how to provide congestion feedback from the network to the user. They have proposed an explicit congestion notification (ECN) marking scheme combined with dynamic
adaptive virtual queues and have shown using a timescale decomposition that the system is semi-globally stable in the no-delay case. In developing rate control mechanisms for the Internet, game theory provides a natural framework. Users on the network can be modeled as players in a rate control game where they choose their strategies or, in this case, flow rates. Players are selfish in terms of their demands for network resources and have no specific information on other users’ strategies. A user’s demand for bandwidth is captured in a utility function which may not be bounded. To compensate for this, one can devise a pricing function, proportional to the bandwidth usage of a user, as a disincentive to him to have excessive demand for bandwidth. This way, the network resources are preserved and an incentive is provided for the user to implement end-to-end congestion control [16]. A useful concept in such a noncooperative congestion control game is Nash equilibrium [10] where each player minimizes its own cost (or maximizes payoff) given all other players’ strategies. There is, consequently, a rich literature on game-theoretic analysis of flow control problems utilizing both cooperative [34] and noncooperative [1–3, 5, 7, 8, 30] frameworks. Robustness of distributed rate control algorithms with respect to delays in the network have been investigated by many studies [19, 27, 32]. Johari and Tan [19] have analyzed the local stability of a delayed system where the end user implements the primal algorithm. They have considered a single link accessed by a single user, as well as its multiple user extension under the assumption of symmetric delays. In both cases, they have provided sufficient conditions for local stability of the underlying system of equations. Massoulie [27] has extended these local stability results to general network topologies and heterogeneous delays. In another study, Vinnicombe [32] has also provided sufficient conditions for local stability of a user rate control scheme which is a generalization of the same algorithm. Elwalid [15] has considered stability of a linear class of algorithms where the source rate varies in proportion to the difference between the buffer content and the target value. Deb and Srikant [14], on the other hand, have focused on the case of single user and a single resource and investigated sufficient conditions for global stability of various nonlinear congestion control schemes under fixed information delays. Liu et al. [25] have extended the framework of Kelly and Coworkers [20, 21] by introducing a primal–dual algorithm which has dynamic adaptations at both ends (users and links) and have given a condition for its local stability under delay using the generalized Nyquist criterion. Wen and Arcak [33] have used a passivity framework to unify some of the stability results on primal and dual algorithms without delay, have introduced and analyzed a larger class of such algorithms for stability, and have shown robustness to variations due to delay.
2 Network Model A general network model is considered which is based on fluid approximations. Fluid models are widely used in addressing a variety of network control problems, such as congestion control [1, 7, 29], routing [7, 30], and pricing [11, 21, 34]. The
topology of the network is characterized by a connected graph consisting of a set of nodes N = {1, . . . , N} and a set of links L = {1, . . . , L}. Each link l ∈ L has a fixed capacity C_l > 0 and is associated with a buffer of size b_l ≥ 0. There are M active users sharing the network, M = {1, . . . , M}. For simplicity, each user is associated with a (unique) connection. Hence, the ith (i ∈ M) user corresponds to a unique connection between a source and a destination node, s_i, d_i ∈ N. The ith user sends its nonnegative flow, x_i ≥ 0, over its route (path) R_i, which is a subset of L. An upper bound, x_{i,max}, is imposed on the ith user's flow rate, which may be due to a user (device)-specific physical limitation. Define a routing matrix A := [(A_{l,i})] of ones and zeros, as in [21], which describes the relation between the set of routes R = {1, . . . , M} associated with the users (connections), i ∈ M, and the links l ∈ L:
$$A_{l,i} = \begin{cases} 1, & \text{if source } i \text{ uses link } l,\\ 0, & \text{if source } i \text{ does not use link } l. \end{cases} \qquad (1)$$
It is assumed here, without any loss of generality, that no rows or columns in A are identically zero. Using this routing matrix A, the capacity constraints of the links are given by Ax ≤ C, where x is the (M × 1) flow rate vector of the users and C is the (L × 1) link capacity vector. The flow rate vector x is said to be feasible if it is nonnegative and satisfies this constraint. Let x_{−i} be the flow rate vector of all users except the ith one. For a given fixed, feasible x_{−i}, there exists a strict finite upper bound m_i(x_{−i}) on the flow rate of the ith user, x_i, based on the capacity constraints of the links:
$$m_i(\mathbf{x}_{-i}) = \min_{l \in R_i}\Big(C_l - \sum_{j \ne i} A_{l,j}\, x_j\Big) \ge 0.$$
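The routing matrix and the capacity constraints above have a compact matrix form that is easy to experiment with. The following sketch is a hypothetical illustration (the two-link, three-user topology and all numerical values are invented for the example); it builds A, checks feasibility Ax ≤ C, and evaluates the bound m_i(x_{−i}).

```python
import numpy as np

# Hypothetical topology: 2 links, 3 users. User 1 uses link 1,
# user 2 uses link 2, user 3 uses both links.
A = np.array([[1, 0, 1],      # A[l, i] = 1 if user i's route contains link l
              [0, 1, 1]])
C = np.array([10.0, 8.0])     # link capacities C_l
x = np.array([4.0, 3.0, 2.0])  # candidate flow rates x_i

def is_feasible(A, C, x):
    """Feasibility: x >= 0 and A x <= C component-wise."""
    return bool(np.all(x >= 0) and np.all(A @ x <= C))

def m_i(A, C, x, i):
    """Upper bound m_i(x_{-i}) on user i's rate given the other users' rates."""
    others = x.copy()
    others[i] = 0.0
    residual = C - A @ others          # remaining capacity on every link
    on_route = A[:, i] > 0             # links on user i's route
    return residual[on_route].min()

print(is_feasible(A, C, x))            # True for this example
print([m_i(A, C, x, i) for i in range(3)])
```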
2.1 Model Assumptions
Simplifying assumptions are necessary to develop a mathematically tractable model. The assumptions of this chapter are shared by the majority of the literature on the subject, including the works cited in Section 1. Furthermore, the analytical results obtained based on these assumptions have been verified many times via realistic packet-level simulations in the literature. The main assumptions on the network model are summarized as follows:
1. The network model is based on fluid approximations, where individual packets are replaced with flows. Fluid models are widely used in addressing a variety of network control problems, such as congestion control [1, 7, 29], routing [7, 30], and pricing [11, 21, 34].
2. For simplicity, each user is associated with a unique connection and a corresponding fixed route (path).
3. The routing matrix A is assumed to be of full row rank, as non-bottleneck links have no effect on the equilibrium point due to zero queueing delay on those links.
4. Bandwidth is the main network resource considered.
5. Information delays are assumed to be fixed for tractability of analysis. The links are associated with first-in first-out (FIFO) finite queues (buffers) with drop-tail packet dropping policies.
Additional assumptions, being part of the specific optimization or game formulations, are explicitly introduced and discussed in their respective sections.
3 Rate Control Game
A noncooperative rate control game is played among the M users on the general network model described in the previous section. The game is noncooperative as the users are assumed to be selfish in terms of their demand for bandwidth and have no means of communicating with each other about their preferences. Hence, each user tries to optimize his usage of the network independently by minimizing his own specific cost function J_i. This cost function is defined on the compact set of feasible rates of the users, X := {x ∈ R^M : x ≥ 0, Ax ≤ C}. The cost function J_i not only models the user preferences but also includes a feedback term capturing the current network state. Thus, the ith user minimizes his cost, J_i, by adjusting his flow rate 0 ≤ x_i ≤ m_i(x_{−i}), given the fixed, feasible flow rates of all other users on his path, {x_j : j ∈ (R_j ∩ R_i)}. The cost function of the ith user, J_i, is defined as the difference between a user-specific pricing function, P_i, and a utility function, U_i. It is smooth, i.e., at least twice continuously differentiable in all its arguments. The pricing function P_i depends on the current state of the network and can be interpreted as the price a user pays for using the network resources. The utility function U_i is defined to be increasing and concave, in accordance with elastic traffic as well as with the economic principle of the law of diminishing returns. The utility of each user depends only on its own flow rate. Thus, the cost function is defined as
$$J_i(\mathbf{x};\, C, A) = P_i(\mathbf{x};\, C, A) - U_i(x_i). \qquad (2)$$
Here, the pricing function P_i of user i does not necessarily depend on the flow rates of all other users; it can be structured to depend only on the flow rates of the users sharing the same links on the path of the ith user. The rate control game defined above uses "pricing" as a way to enforce a more favorable outcome for the system and its users. If there is no pricing scheme, then the increasing and concave user utilities result in a solution where each user sends
with the maximum possible rate. In practice, this would lead to a congestion collapse or a "tragedy of the commons." To remedy this, the pricing function P_i, proportional to the bandwidth usage of a user, is utilized as an incentive for the user to curb excessive demand. Thus, the network resources are preserved, and an incentive is provided for the user to implement end-to-end congestion control. On the other hand, the analysis in this section focuses on mathematical principles rather than architectural concerns or possible implementations of such pricing schemes.
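As a concrete illustration of the cost structure (2), the sketch below pairs a logarithmic utility with a quadratic link price. Both functional forms, the weights, and the topology are assumptions made only for this example; the chapter itself requires only the properties A1–A4 stated below.

```python
import numpy as np

def utility(x_i, w_i=1.0):
    """Increasing, strictly concave utility U_i (logarithmic, as an example)."""
    return w_i * np.log(1.0 + x_i)

def link_price(aggregate_flow, kappa=0.1):
    """Convex, non-decreasing link price P_l as a function of aggregate flow."""
    return kappa * aggregate_flow ** 2

def cost(i, x, A, w, kappa=0.1):
    """J_i(x) = sum of link prices on user i's path minus U_i(x_i), cf. (2)."""
    price = sum(link_price(A[l] @ x, kappa)
                for l in range(A.shape[0]) if A[l, i] > 0)
    return price - utility(x[i], w[i])

A = np.array([[1, 0, 1], [0, 1, 1]])   # hypothetical two-link, three-user topology
w = np.array([1.0, 2.0, 1.5])          # made-up utility weights
x = np.array([1.0, 1.0, 1.0])
print([round(cost(i, x, A, w), 3) for i in range(3)])
```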
3.1 Nash Equilibrium as a Unique Solution
The defined rate control game may admit a (unique) Nash equilibrium (NE) as a solution. In this context, Nash equilibrium is defined as a set of flow rates, x∗ (and corresponding costs J∗), with the property that no user can benefit by modifying its flow while the other players keep their flows fixed. Furthermore, if the Nash equilibrium, x∗, meets the capacity constraints (e.g., Ax∗ ≤ C) as well as the positivity constraint (x∗ ≥ 0) with strict inequality, then it is an inner solution.

Definition 1 (Nash Equilibrium) The user flow rate vector x∗ is in Nash equilibrium when x_i∗ of any ith user is the solution to the following optimization problem, given that all users on its path have equilibrium flow rates x∗_{−i}:
$$\min_{0 \le x_i \le m_i(\mathbf{x}^*_{-i})} J_i(x_i, \mathbf{x}^*_{-i}, C, A), \qquad (3)$$
where x_{−i} denotes the collection {x_j : j ∈ R_j ∩ R_i}_{j=1,...,M}. The assumptions on the user cost functions are next formalized:

A1. P_i(x) is jointly continuous in all its arguments and twice continuously differentiable, non-decreasing, and convex in x_i, i.e.,
$$\frac{\partial P_i(\mathbf{x})}{\partial x_i} \ge 0, \qquad \frac{\partial^2 P_i(\mathbf{x})}{\partial x_i^2} \ge 0. \qquad (4)$$

A2. U_i(x_i) is jointly continuous in all its arguments and twice continuously differentiable, non-decreasing, and strictly concave in x_i, i.e.,
$$\frac{\partial U_i(x_i)}{\partial x_i} \ge 0, \qquad \frac{\partial^2 U_i(x_i)}{\partial x_i^2} < 0 \quad \forall x_i.$$
Moreover, the optimal solution (Nash equilibrium) is an inner one, 0 < Σ_j A_{l,j} x_j∗ < C_l, ∀l, under the additional assumption:

A3. The ith user's cost function has the following properties: at x_i = 0,
$$\frac{\partial J_i(\mathbf{x} : x_i = 0)}{\partial x_i} < 0 \quad \forall \mathbf{x},$$
and at x_i = m_i(x_{−i}),
$$\frac{\partial J_i(\mathbf{x} : x_i = m_i(\mathbf{x}_{-i}))}{\partial x_i} > 0 \quad \forall \mathbf{x}.$$
Theorem 1 establishes that the congestion control game admits a unique NE under the following further assumption:

A4. The price function P_i(x) of the ith user is defined as the sum of link price functions on its path,
$$P_i = \sum_{l \in R_i} P_l\Big(\sum_{j : l \in R_j} x_j\Big),$$
where P_l is defined as a function of the aggregate flow on link l and satisfies (4) with i replaced by l.

Theorem 1 Under A1–A4, the network game admits a unique inner Nash equilibrium.

Proof Earlier versions of this proof are in [3, 4]. Let X := {x ∈ R^M : Ax ≤ C, x ≥ 0} be the set of feasible flow rate vectors (or strategy space) of the users. The flow rate of a generic ith user is nonnegative and bounded above by the minimum link capacity on its route, 0 ≤ x_i < min_{l∈R_i} C_l. The set X is clearly closed and bounded, hence compact. First, X is shown to have a nonempty interior and to be convex. Define the following flow rate vector: x_max := (1/M) min_l C_l. Clearly, x_max ∈ X is feasible and positive as C_l > 0, ∀l. Hence, there exists at least one positive and feasible flow rate vector in the set X, which is an interior point. Thus, the set X has a nonempty interior. Let x^1, x^2 ∈ X be two feasible flow rate vectors and 0 < λ < 1 be a real number. For any x^λ := λx^1 + (1 − λ)x^2, it follows that Ax^λ = A(λx^1 + (1 − λ)x^2) ≤ C. Furthermore, x^λ ≥ 0 by definition. Hence, x^λ is feasible and is in X for any 0 < λ < 1. Thus, the set X is convex. By a standard theorem of game theory (Theorem 4.4, p. 176 in [10]), the network game admits an NE. Next, uniqueness of the NE is shown. Differentiating (2) with respect to x_i and using assumptions A1 and A2 results in
$$f_i(\mathbf{x}) := \frac{\partial J_i(\mathbf{x})}{\partial x_i} = \frac{\partial P_i(\mathbf{x})}{\partial x_i} - \frac{\partial U_i(x_i)}{\partial x_i}. \qquad (5)$$
As a simplification of notation, C and A are suppressed as arguments of the functions for the rest of this proof. Differentiating J_i(x) twice with respect to x_i yields
$$\frac{\partial f_i(\mathbf{x})}{\partial x_i} = \frac{\partial^2 J_i(\mathbf{x})}{\partial x_i^2} = \frac{\partial^2 P_i(\mathbf{x})}{\partial x_i^2} - \frac{\partial^2 U_i(x_i)}{\partial x_i^2} > 0.$$
Hence, J_i is unimodal and has a unique minimum. Based on A3, f_i(x) attains the value zero at some x_i with m_i(x_{−i}) > x_i > 0, given a fixed feasible x_{−i}. Thus, the optimization problem (3) admits a unique positive solution. To preserve notation, let ∂²J_i(x)/∂x_i² be denoted by B_i. Further introduce, for i, j ∈ M, j ≠ i,
$$\frac{\partial^2 J_i(\mathbf{x})}{\partial x_i\,\partial x_j} = \frac{\partial^2 P_i(\mathbf{x})}{\partial x_i\,\partial x_j} =: A_{i,j},$$
with both B_i and A_{i,j} defined on the space where x is nonnegative and bounded by the link capacities. Suppose that there are two Nash equilibria, represented by two flow vectors x^0 and x^1, with elements x_i^0 and x_i^1, respectively. Define the pseudogradient vector:
$$g(\mathbf{x}) := \left[\nabla_{x_1} J_1(\mathbf{x})^T \cdots \nabla_{x_M} J_M(\mathbf{x})^T\right]^T. \qquad (6)$$
As the Nash equilibrium is necessarily an inner solution, it follows from the first-order optimality condition that g(x^0) = 0 and g(x^1) = 0. Define the flow vector x(θ) as a convex combination of the two equilibrium points x^0 and x^1: x(θ) = θx^1 + (1 − θ)x^0, where 0 < θ < 1. Differentiating g(x(θ)) with respect to θ,
$$\frac{dg(\mathbf{x}(\theta))}{d\theta} = G(\mathbf{x}(\theta))\,\frac{d\mathbf{x}(\theta)}{d\theta} = G(\mathbf{x}(\theta))\,(\mathbf{x}^1 - \mathbf{x}^0), \qquad (7)$$
where G(x) is the Jacobian of g(x) with respect to x:
$$G(\mathbf{x}) := \begin{pmatrix} B_1 & A_{12} & \cdots & A_{1M}\\ \vdots & & \ddots & \vdots\\ A_{M1} & A_{M2} & \cdots & B_M \end{pmatrix}_{M\times M}.$$
Additionally note that, by assumption A4,
$$\sum_{l\in(R_i\cap R_j)} \frac{\partial^2 J_l(\mathbf{x})}{\partial x_i\,\partial x_j} = \sum_{l\in(R_i\cap R_j)} \frac{\partial^2 J_l(\mathbf{x})}{\partial x_j\,\partial x_i} \;\;\Rightarrow\;\; A_{i,j} = A_{j,i}, \quad i, j \in \mathcal{M}. \qquad (8)$$
Hence, G(x) is symmetric. Integrating (7) over θ,
$$0 = g(\mathbf{x}^1) - g(\mathbf{x}^0) = \left[\int_0^1 G(\mathbf{x}(\theta))\,d\theta\right](\mathbf{x}^1 - \mathbf{x}^0), \qquad (9)$$
where (x^1 − x^0) is a constant flow vector. Let B̄_i(x) = ∫₀¹ B_i(x(θ)) dθ and Ā_{ij}(x) = ∫₀¹ A_{ij}(x(θ)) dθ. In view of A2 and A4, B_i(x) > A_{ij}(x) > 0, ∀i, j, for any x(θ); thus, B̄_i(x) > Ā_{ij}(x) > 0. In order to simplify the notation, define the matrix Ḡ(x^1, x^0) := ∫₀¹ G(x(θ)) dθ, which can be shown to be full rank for any fixed x. Rewriting (9) as 0 = Ḡ · [x^1 − x^0], since Ḡ is full rank, it readily follows that x^1 − x^0 = 0. Therefore, the NE is unique. Under A3, the NE has to be an inner solution, as the following argument shows. First, x ≥ 0 with x_i = 0 for at least one i cannot be an equilibrium point, since user i can decrease its cost by increasing its flow rate. Similarly, the boundary points {x ∈ R^M : Ax ≤ C, x ≥ 0, with (Ax)_l = C_l for at least one link l} cannot constitute an NE, as users whose flows pass through the link can decrease their flow rates under A3. Thus, under A1–A4, the network game admits a unique inner NE.
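For intuition, the inner Nash equilibrium guaranteed by Theorem 1 can be computed numerically from the first-order conditions f_i(x) = 0 in (5). The sketch below does this for a single link shared by three users, using an illustrative logarithmic utility and quadratic link price; all parameter values (and the resulting numbers) are assumptions made for the example, not results taken from the chapter.

```python
import numpy as np

w = np.array([1.0, 2.0, 1.5])      # utility weights (hypothetical), U_i = w_i*log(1+x_i)
kappa, C = 0.1, 10.0               # link price coefficient and capacity (hypothetical)

def f(x):
    """f_i(x) = dP/dx_i - dU_i/dx_i for P(x) = kappa*(sum x)^2."""
    total = x.sum()
    return 2.0 * kappa * total - w / (1.0 + x)

# Damped iteration on the first-order conditions (equivalent to the gradient play).
x = np.full(3, 1.0)
for _ in range(5000):
    x = np.clip(x - 0.05 * f(x), 0.0, None)

print(np.round(x, 3), "sum =", round(x.sum(), 3), "feasible:", x.sum() <= C)
```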
3.2 Stability Analysis
Consider a simple dynamic model of the defined rate control game where each user changes his flow rate in proportion with the gradient of his cost function with respect to his flow rate. Note that this corresponds to the well-known steepest descent algorithm in nonlinear programming [12]. Hence, the user update algorithm is
$$\dot x_i(t) = \frac{dx_i(t)}{dt} = -\frac{\partial J_i(\mathbf{x}(t))}{\partial x_i} = \frac{dU_i(x_i)}{dx_i} - \sum_{l\in R_i} f_l\Big(\sum_{j\in\mathcal{M}_l} x_j\Big) =: \theta_i(\mathbf{x}), \qquad (10)$$
for all i = 1, . . . , M, where M_l (M_l) is the set (number) of users whose flows pass through the link l ∈ R_i; t is the time variable, which is dropped in the second expression for a more compact notation; and f_l is defined as f_l(·) := ∂P_l(·)/∂x_i. By assumption A4, the partial derivative of f_l with respect to x_i, ∂f_l(·)/∂x_i, is non-negative. Furthermore, since P_l(x) is convex and jointly continuous in x_i for all i whose flows pass through the link l, on the compact set of feasible flow rate vectors, X := {x ∈ R^M : Ax ≤ C, x ≥ 0}, the derivative ∂f_l(·)/∂x_i can be bounded above by a constant α_l > 0. Hence,
$$0 \le \frac{\partial f_l(\bar x_l)}{\partial x_i} \le \alpha_l, \qquad (11)$$
where x̄_l = Σ_{i∈M_l} x_i.
It is now shown that the system defined by (10) is asymptotically stable on the set X, which is invariant by assumption A3 under the gradient update algorithm (10). In order to see the invariance of X, each boundary of X is investigated separately. If x_i = 0 for some i ∈ M, then ẋ_i > 0 follows from (10) under assumption A3 due to the gradient descent algorithm of user i. Hence, the system trajectory moves toward the interior of X. Likewise, in the case of x̄_l = C_l for some l ∈ L, it follows from (10) and assumption A3 that ẋ_i < 0 ∀i ∈ M_l, and hence the trajectory remains inside the set X. The unique inner NE, x∗ (see Theorem 1), of the rate control game constitutes the equilibrium state of the dynamical system (10) in X. Around this equilibrium, define a candidate Lyapunov function V : R^M → R_+ as
$$V(\mathbf{x}) := \frac{1}{2}\sum_{i=1}^{M}\theta_i^2(\mathbf{x}),$$
which is in fact restricted to the domain X. Further let Θ := [θ_1, . . . , θ_M]. Taking the derivative of V with respect to t on the trajectories generated by (10), one obtains
$$\dot V(\mathbf{x}) = \sum_{i=1}^{M}\frac{d^2U_i(x_i)}{dx_i^2}\,\theta_i^2(\mathbf{x}) - \Theta^T(\mathbf{x})\,A^T K A\,\Theta(\mathbf{x}),$$
where A is the routing matrix and K is a diagonal matrix defined as
$$K := \mathrm{diag}\left[\frac{\partial f_1(\mathbf{x})}{\partial x},\, \frac{\partial f_2(\mathbf{x})}{\partial x},\, \ldots,\, \frac{\partial f_L(\mathbf{x})}{\partial x}\right].$$
Since A^T K A is non-negative definite and d²U_i/dx_i² is uniformly negative definite, V(x) is strictly decreasing, V̇(x) < 0, on the trajectories of (10). Thus, the system is asymptotically stable on the invariant set X by Lyapunov's stability theorem (see Theorem 3.1 in [22]).

Theorem 2 Assume A1–A4 hold. Then, the unique inner Nash equilibrium of the network game is globally stable on the compact set of feasible flow rate vectors, X := {x ∈ R^M : Ax ≤ C, x ≥ 0}, under the gradient algorithm given by
$$\dot x_i = -\frac{\partial J_i(\mathbf{x})}{\partial x_i}, \qquad i = 1, \ldots, M.$$
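A discretized (Euler) version of the gradient dynamics (10) can be simulated directly. The sketch below does this for a hypothetical two-link, three-user topology with illustrative logarithmic utilities and linear marginal link prices; the step size, horizon, and all numbers are assumptions made for the example only.

```python
import numpy as np

A = np.array([[1, 0, 1], [0, 1, 1]])          # routing matrix (hypothetical)
C = np.array([10.0, 8.0])                     # link capacities
w = np.array([1.0, 2.0, 1.5])                 # utility weights, U_i = w_i*log(1+x_i)
kappa = 0.1                                   # price coefficient

def theta(x):
    """Right-hand side of (10): dU_i/dx_i minus the sum of marginal prices on the route."""
    marginal_price = 2.0 * kappa * (A @ x)    # f_l evaluated at the aggregate link flows
    return w / (1.0 + x) - A.T @ marginal_price

x = np.full(3, 0.5)
dt = 0.05
for _ in range(4000):                         # forward-Euler integration of (10)
    x = np.clip(x + dt * theta(x), 0.0, None)

print(np.round(x, 3), "feasible:", bool(np.all(A @ x <= C)))
```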
3.3 Stability under Information Delays Whether the user rate control (gradient) algorithm (10) is robust with respect to communication delays is an important question. This section investigates the rate
control scheme under bounded and heterogeneous communication delays. The distributed rate control algorithm under communication delays is defined as
$$\dot x_i(t) = \frac{dU_i(x_i(t))}{dx_i} - \sum_{l\in R_i} f_l\Big(\sum_{j\in\mathcal{M}_l} x_j(t - r_{li} - r_{lj})\Big), \qquad (12)$$
where r_{li} and r_{lj} are fixed communication delays between the lth link and the ith and jth users, respectively. It is implicitly assumed here that queueing delays are negligible compared to the fixed propagation delays in the system.

3.3.1 Notation and Definitions
Notice again that f_l is defined as f_l(·) := ∂P_l(·)/∂x_i and the pricing function of the ith user is defined in accordance with assumption A4 as
$$P_i = \sum_{l\in R_i} P_l\Big(\sum_{j\in\mathcal{M}_l} x_j\Big),$$
where R_i is the path (route) of user i and P_l is the pricing function at link l ∈ L. The notation is simplified by defining
$$\bar x_l^i(t-r) := \sum_{j\in\mathcal{M}_l} x_j(t - r_{li} - r_{lj}).$$
In addition, let q be an upper bound on the maximum round-trip time (RTT) in the system,
$$q := 2\max_i \sum_{l\in R_i}\big(r_{li} - r_{(l-1)i}\big),$$
where r_{0i} = 0 ∀i. Finally, define x_t := {x(t + s), −q ≤ s ≤ 0}, and by a slight abuse of notation let θ_i(x_t) denote the right-hand side of (12). Let φ_i ∈ C([−r_i, 0], R) be a feasible flow rate function (initial condition) for the ith user's dynamics (12) at time t = 0, where C is the set of continuous functions. In addition, let x(φ)(t) be the solution of (12) through φ for t ≥ 0, and ẋ(φ_i)(t) be its derivative. In order to simplify the notation, x(φ) and x, as well as θ(φ) and θ, and their respective derivatives will be used interchangeably for the remainder of the chapter. Finally, a continuously differentiable and positive function V : C^M → R_+ is defined as
$$V(\mathbf{x}_t(\phi)) := \frac{1}{2}\sum_{i=1}^{M}\theta_i^2(\mathbf{x}_t(\phi)) = \frac{1}{2}\,\Theta^T(\mathbf{x}_t(\phi))\,\Theta(\mathbf{x}_t(\phi)).$$
This constitutes the basis for the following candidate Lyapunov functional V̄ : R_+ × C^M → R_+,
$$\bar V(t;\phi) := \sup_{t-2q \le s \le t} V(\mathbf{x}_s(\phi)),$$
where V(x_s) = 0 for s ∈ [−2q, −q], without any loss of generality.

3.3.2 Stability Analysis Under Delays
In order to establish global stability under delays, it is shown that the Lyapunov functional V̄(t; φ) is non-increasing. Furthermore, the stability theory for autonomous systems of [17] is utilized to generalize the scalar analysis of [35], and also of Chapter 5.4 of [17], to the multidimensional (multi-user) case. Let the upper right-hand derivatives of V̄(t; φ) and V(x_t(φ)) along x_t(φ) be denoted by dV̄(t; φ)/dt and V̇(x_t(φ)), respectively. In order for V̄(t; φ) to be non-increasing, i.e., dV̄(t; φ)/dt ≤ 0, the set
$$\Phi = \{\phi \in C : \bar V(t;\phi) = V(\mathbf{x}_t(\phi));\;\; \dot V(\mathbf{x}_t(\phi)) > 0 \;\;\forall t \ge 0\} \qquad (13)$$
has to be empty. This is established in the following lemma.

Lemma 1 The set Φ, defined in (13), is empty if the following condition is satisfied:
$$q \le \frac{2\sqrt{\bar d}}{\sqrt{M}\, b^{3/2}}, \quad\text{where}\quad b := \max_i\Big(-\min_{x_i\in X}\frac{d^2U_i(x_i)}{dx_i^2} + \sum_{l\in R_i} M_l\,\alpha_l\Big) \;\;\text{and}\;\; \bar d := \min_i\min_{x_i\in X}\left|\frac{d^2U_i(x_i)}{dx_i^2}\right|.$$

Proof To see this, consider the case when the set Φ is not empty. Then, by definition, there exists a time t and an h > 0 such that V(x_{t+h}(φ)) > V(x_t(φ)), and hence V̄(t; φ) cannot be non-increasing. It is now shown that the set Φ is indeed empty. Assume otherwise. Then, for any given t, there exists an ε > 0 such that
$$\bar V(t;\phi) = V(\mathbf{x}_t(\phi)) = \sum_{i=1}^{M}\theta_i^2(\mathbf{x}_t(\phi)) = \varepsilon \qquad (14)$$
and
$$V(\mathbf{x}_s(\phi)) = \sum_{i=1}^{M}\theta_i^2(\mathbf{x}_s) \le \varepsilon, \quad s \in [t-2q,\, t].$$
Thus, the following bound on θ_i, and thus on ẋ_i, follows immediately:
$$|\theta_i(\mathbf{x}_s)| = |\dot x_i(s)| \le \sqrt{\varepsilon}, \quad s \in [t-2q,\, t]. \qquad (15)$$
Taking the derivative of ẋ_i(t) with respect to t,
$$\ddot x_i(t) = \frac{\partial \dot x_i(t)}{\partial t} = \dot\theta_i(\mathbf{x}_t) = \frac{d^2U_i(x_i)}{dx_i^2}\,\dot x_i(t) - \sum_{l\in R_i}\frac{\partial f_l(\bar x_l^i(t-r))}{\partial \bar x_l^i}\sum_{j\in\mathcal{M}_l}\dot x_j(t - r_i - r_j). \qquad (16)$$
Let d_i := −min_{x_i∈X} d²U_i(x_i)/dx_i² > 0. Using (15) and (16), it is possible to bound θ̇_i(x_s) and ẍ_i(s) on s ∈ [t − q, t] with
$$|\dot\theta_i(\mathbf{x}_s)| = |\ddot x_i(s)| \le d_i\,|\dot x_i(s)| + \sum_{l\in R_i}\frac{\partial f_l(\bar x_l^i(s-r))}{\partial \bar x_l^i}\,\big|\dot{\bar x}_l^i(s-r)\big| \le \Big(d_i + \sum_{l\in R_i} M_l\,\alpha_l\Big)\sqrt{\varepsilon}. \qquad (17)$$
To simplify the notation, define y_i := d_i + Σ_{l∈R_i} M_l α_l. Hence, the following bound on θ_i(x_s), s ∈ [t − q, t], is obtained:
$$\theta_i(\mathbf{x}_t) - q\,y_i\sqrt{\varepsilon} \le \theta_i(\mathbf{x}_s) \le \theta_i(\mathbf{x}_t) + q\,y_i\sqrt{\varepsilon}. \qquad (18)$$
Subsequently, it is shown that V(x_t(φ)) is non-increasing, and a contradiction is obtained to the initial hypothesis that the set Φ is not empty. Assume that ∂f_l(x̄_l^i(t − r))/∂x̄_l^i = ∂f_l(x̄_l^j(t − r))/∂x̄_l^j, ∀i, j ∈ M_l, ∀t, for each link l. This assumption holds, for example, when f_l is linear in its argument. Let B be defined in such a way that B^T B := A^T K A, where the positive diagonal matrix K is defined in Section 3.2. Also define the positive diagonal matrix
$$D(\mathbf{x}) := \mathrm{diag}\,\big[\,|D_1(x_1)|,\, |D_2(x_2)|,\, \ldots,\, |D_M(x_M)|\,\big],$$
where D_i(x) := d²U_i(x_i)/dx_i². Then, using (18),
$$\dot V(\mathbf{x}_t) = \sum_{i=1}^{M} D_i(x_i)\,\theta_i^2(\mathbf{x}_t) - \sum_{i=1}^{M}\theta_i(\mathbf{x}_t)\sum_{l\in R_i}\frac{\partial f_l(\bar x_l^i(t-r))}{\partial \bar x_l^i}\sum_{j\in\mathcal{M}_l}\theta_j(\mathbf{x}_{t-r_{li}-r_{lj}}) \le -\Theta^T D\,\Theta - \Theta^T B^T B\,\Theta + q\sqrt{\varepsilon}\,\big|\Theta^T B^T B\,y\big|, \qquad (19)$$
where everything is evaluated at t. Now, for any fixed trajectory generated by (12), and for a frozen time t, a sufficient condition for V̇(x_t) ≤ 0 is
$$q\sqrt{\varepsilon} \le \frac{\|B\Theta\|^2 + \|\sqrt{D}\,\Theta\|^2}{\|B\Theta\|\;\|B y\|},$$
where ‖·‖ is the Euclidean norm. Let k := ‖BΘ‖/‖By‖ > 0. Rewriting the sufficient condition, one obtains
$$q\sqrt{\varepsilon} \le k + \frac{\mu}{k},$$
where μ := ‖√D Θ‖²/‖By‖² > 0. The following worst-case bound on q can be derived by a simple minimization:
$$q\sqrt{\varepsilon} \le 2\sqrt{\mu}. \qquad (20)$$
Next, a lower bound on μ is derived. From (14), it follows that ‖√D Θ(x_t)‖² ≥ d̄ ε, where d̄ := min_i min_{x_i∈X} |d²U_i(x_i)/dx_i²| and √D is the unique positive definite matrix whose square is D. Furthermore,
$$\|B y\|^2 \le \sum_{i=1}^{M} y_i \sum_{l\in R_i} \alpha_l \sum_{j\in\mathcal{M}_l} y_j.$$
Define also the following upper bound on y_i:
$$b := \max_i \Big(d_i + \sum_{l\in R_i} M_l\,\alpha_l\Big).$$
Since d_i > 0, one obtains ‖By‖² ≤ M b³, and hence
$$\mu \ge \frac{\bar d\,\varepsilon}{M b^3}.$$
Thus, from (20) a sufficient condition for V(x_t) to be non-increasing is
$$q \le \frac{2\sqrt{\bar d}}{\sqrt{M}\, b^{3/2}}, \qquad (21)$$
which now holds for all t ≥ 0.
Based on Lemma 1, V̄(t; φ) is non-increasing, i.e., dV̄(t; φ)/dt ≤ 0. Then, using Definition 3.1 and Theorem 3.1 of [17], global asymptotic stability of system (12) is established. Let S := {φ ∈ C : dV̄(t; φ)/dt = V̇(x_t(φ)) = 0}. From (12) and (19) it follows that S̄ := {φ ∈ C : φ(τ) = x∗, −q ≤ τ ≤ 0} ⊂ S, as Θ(x_τ) = ẋ(τ) = 0 ⇔ x_τ = x∗ ⇒ V̇(x_τ) = 0.
Hence, S̄ is the largest invariant set in S, and for any trajectory of the system that belongs identically to S, we have x_τ = x∗. In other words, the only solution that can stay identically in S is the unique equilibrium of the system. This then leads to the following theorem:

Theorem 3 Assume that
$$\frac{\partial f_l(\bar x_l^i(s-r))}{\partial \bar x_l^i} = \frac{\partial f_l(\bar x_l^j(s-r))}{\partial \bar x_l^j} \quad \forall i, j \in \mathcal{M}_l,\;\forall t.$$
Then, the unique Nash equilibrium of the network game is globally asymptotically stable on the compact set of feasible flow rate vectors, X := {x ∈ R^M : Ax ≤ C, x ≥ 0}, under the gradient algorithm
$$\dot x_i(t) = \frac{dU_i(x_i(t))}{dx_i} - \sum_{l\in R_i} f_l\Big(\sum_{j\in\mathcal{M}_l} x_j(t - r_{li} - r_{lj})\Big),$$
in the presence of fixed heterogeneous delays, r_{li} ≥ 0, for all users i = 1, . . . , M and links l ∈ L, if the following condition is satisfied:
$$q \le \frac{2\sqrt{\bar d}}{\sqrt{M}\, b^{3/2}}, \quad\text{where}\quad b := \max_i\Big(-\min_{x_i\in X}\frac{d^2U_i(x_i)}{dx_i^2} + \sum_{l\in R_i} M_l\,\alpha_l\Big) \;\;\text{and}\;\; \bar d := \min_i\min_{x_i\in X}\left|\frac{d^2U_i(x_i)}{dx_i^2}\right|.$$
If the user reaction function is scaled by a user-independent gain constant, λ, then the ith user's response is given by
$$\dot x_i = -\lambda\,\frac{\partial J_i(\mathbf{x}(t))}{\partial x_i},$$
and the sufficient condition for global stability turns out to be
$$q \le \frac{2\sqrt{\bar d}}{\sqrt{M}\,\lambda^{3/2}\, b^{3/2}}.$$
Notice that, for any λ < 1, the upper bound on maximum RTT, q, is relaxed proportionally with λ3/2 . The upper bound on communication delays given in the sufficient condition of Theorem 3 is inversely proportional to the square root of the number of users multiplied by the cube of a gain constant. This structure is actually similar to those of local stability results reported in other studies [19, 27, 32]. The analysis above indicates a fundamental trade-off between the responsiveness of the users gradient rate control algorithm and the stability properties of the system under communication delays.
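The sufficient condition of Theorem 3 is straightforward to evaluate once b and d̄ are known. The sketch below does this for an illustrative single-bottleneck scenario with logarithmic utilities; every parameter value, and the resulting numbers, are assumptions made for the example rather than results from the chapter.

```python
import numpy as np

# Hypothetical single bottleneck link shared by M users.
M = 10                      # users on the link (so M_l = M)
alpha_l = 0.2               # assumed upper bound on df_l/dx_i, cf. (11)
w = 1.0                     # utility weight in U_i(x_i) = w*log(1 + x_i)
x_max = 5.0                 # largest feasible rate considered for any user

# |d^2 U_i/dx_i^2| = w/(1+x)^2 ranges over [w/(1+x_max)^2, w] on X.
d_bar = w / (1.0 + x_max) ** 2        # smallest curvature magnitude (d-bar)
d_i = w                               # -min of d^2U/dx^2, attained at x = 0
b = d_i + M * alpha_l                 # b as defined in Lemma 1 / Theorem 3

q_bound = 2.0 * np.sqrt(d_bar) / (np.sqrt(M) * b ** 1.5)
print(f"sufficient delay bound: q <= {q_bound:.4f} (model time units)")

lam = 0.5                             # with gain lambda < 1 the bound relaxes by lambda^{-3/2}
print(f"with gain {lam}: q <= {q_bound / lam ** 1.5:.4f}")
```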
4 Primal–Dual Rate Control The distributed structure of the Internet makes it difficult, if not impossible, for users to obtain detailed real time information on the state of the network. Therefore, users are bound to use indirect aggregate metrics that are available to them, such as packet drop rate or variations in the average round trip time (RTT) of packets in order to infer the current situation in the network. Packet drops, for example, are currently used by most widely deployed versions of TCP as an indication of congestion. An approach similar to the one discussed in this section has been suggested in a version of TCP, known as TCP Vegas [13]. Although TCP Vegas is more efficient than a widely used version of TCP, TCP Reno [28], the suggested improvements are empirical and based on experimental studies. This section presents and analyzes a primal–dual rate control scheme based on variations in the RTT a user experiences based on [5]. Although users are associated with cost functions in a way similar to the game in Section 3, the formulation here is not a proper game as the users ignore their own effects on the outcome when making decisions. Consequently, the solution here is different from the concept of Nash equilibrium. The equilibrium solution discussed in this section maximizes the sum of user utilities under capacity constraints. The result immediately follows from a Lagrangian analysis and the concept of shadow prices. Furthermore, the solution becomes proportionally fair under logarithmic user utilities. A detailed analysis can be found in [20, 21].
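The congestion signal used in this section — queueing delay inferred from RTT variation — can be tracked entirely at the end host. A minimal sketch of such an estimator is given below; the smoothing factor and the RTT samples are invented for illustration, and this is only the measurement idea behind the scheme, not the chapter's algorithm itself.

```python
class QueueingDelayEstimator:
    """Track the base RTT (propagation + processing) and infer queueing delay D_i."""

    def __init__(self, smoothing=0.1):
        self.base_rtt = float("inf")   # smallest RTT observed so far
        self.srtt = None               # exponentially smoothed RTT
        self.smoothing = smoothing

    def update(self, rtt_sample):
        self.base_rtt = min(self.base_rtt, rtt_sample)
        if self.srtt is None:
            self.srtt = rtt_sample
        else:
            self.srtt += self.smoothing * (rtt_sample - self.srtt)
        # Queueing delay = smoothed RTT minus base RTT, never negative.
        return max(self.srtt - self.base_rtt, 0.0)

est = QueueingDelayEstimator()
for rtt in [100.0, 102.0, 110.0, 130.0, 125.0]:   # RTT samples in ms (made up)
    print(round(est.update(rtt), 2))
```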
4.1 Extended Network Model
An important indication of congestion for Internet-style networks is the variation in queueing delay, d, which is defined as the difference between the actual delay experienced by a packet, d^a, and the fixed propagation delay of the connection, d^p. If the incoming flow rate to a router l exceeds its capacity, packets are queued (generally on a first-come first-served basis) in the buffer of the link, with b_{l,max} being the maximum buffer size. Furthermore, if the buffer of the link is full, incoming packets have to be dropped. Let the total flow on link l be given by x̄_l := Σ_{i:l∈R_i} x_i. Thus, the buffer level at link l evolves in accordance with
$$\frac{\partial b_l(t)}{\partial t} = \begin{cases} [\bar x_l - C_l]^- & \text{if } b_l(t) = b_{l,\max},\\ \bar x_l - C_l & \text{if } 0 < b_l(t) < b_{l,\max},\\ [\bar x_l - C_l]^+ & \text{if } b_l(t) = 0, \end{cases} \qquad (22)$$
where [.]^+ represents the function max(., 0) and [.]^- represents the function min(., 0). An increase in the buffers leads to an increase in the RTT of packets. Hence, the RTT on a congested path is larger than the base RTT, which is defined as the sum of propagation and processing delays on the path of a packet. The queueing delay at the lth link, d_l, is a nonlinear function of the excess flow on that link, given by
$$\dot d_l(\mathbf{x}, t) = \begin{cases} \frac{1}{C_l}\,[\bar x_l - C_l]^- & \text{if } d_l(t) = d_{l,\max},\\ \frac{1}{C_l}\,(\bar x_l - C_l) & \text{if } 0 < d_l(t) < d_{l,\max},\\ \frac{1}{C_l}\,[\bar x_l - C_l]^+ & \text{if } d_l(t) = 0, \end{cases} \qquad (23)$$
in accordance with the buffer model described in (22), with d_{l,max} := b_{l,max}/C_l being the maximum possible queueing delay. Here, ḋ_l denotes ∂d_l(t)/∂t. Thus, the total queueing delay a user experiences, D_i, is the sum of queueing delays on its path, namely D_i(x, t) = Σ_{l∈R_i} d_l(x, t), i ∈ M, which we henceforth write as D_i(t), i ∈ M.

4.1.1 Assumptions
Additional assumptions of the extended model presented are as follows:
1. The effect of individual packet losses on the flow rates is ignored. This approximation is reasonable as one of the main goals of the developed rate control scheme is to minimize or totally eliminate packet losses.
2. The utility function U_i(x_i) of the ith user is assumed to be strictly increasing and concave in x_i.
3. The effect of a user i on the delay, Di (t), she/he experiences is ignored. This assumption can be justified for networks with a large number of users, where the effect of each user is vanishingly small. Furthermore, from a practical point of view, it is extremely difficult, if not impossible, for a user to estimate its own effect on queueing delay.
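A discretized version of the queue model (22)–(23) is easy to simulate. The sketch below integrates the queueing delay of a single link under a piecewise-constant aggregate input; the time step, capacity, buffer size, and input rates are illustrative assumptions only.

```python
def step_queueing_delay(d_l, x_bar, C_l, d_l_max, dt):
    """One Euler step of (23): d_l' = (x_bar - C_l)/C_l, clipped at the boundaries."""
    rate = (x_bar - C_l) / C_l
    if d_l >= d_l_max:
        rate = min(rate, 0.0)          # full buffer: delay can only decrease
    elif d_l <= 0.0:
        rate = max(rate, 0.0)          # empty buffer: delay can only increase
    d_l += dt * rate
    return min(max(d_l, 0.0), d_l_max)

C_l, b_l_max = 10.0, 50.0              # capacity (pkts/s) and buffer (pkts), made up
d_l_max = b_l_max / C_l                # maximum queueing delay, cf. Section 4.1
d_l, dt = 0.0, 0.01

for t in range(1000):
    x_bar = 12.0 if t < 500 else 8.0   # overload first, then underload
    d_l = step_queueing_delay(d_l, x_bar, C_l, d_l_max, dt)

print(round(d_l, 3), "of max", d_l_max)
```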
4.2 Equilibrium Solution
As in Section 3, define a cost function for each user as the difference between pricing and utility functions. However, here the pricing function of the ith user is linear in x_i for each fixed total queueing delay D_i of the user and is linear in D_i for a fixed x_i, i.e., it is a bilinear function of x_i and D_i. The utility function U_i(x_i) is assumed to be strictly increasing, differentiable, and strictly concave in a similar way, and it basically describes the user's demand for bandwidth. Accordingly, variations in RTT are utilized as the basis for the rate control algorithm. The cost (objective) function for the ith user at time t is thus given by
$$J_i(\mathbf{x}, t) = \alpha_i D_i(t)\, x_i - U_i(x_i), \qquad (24)$$
which the user wishes to minimize. In accordance with this objective, again a gradient-based dynamic model is considered where each user changes its flow rate in proportion with the gradient of its cost function with respect to its flow rate, ẋ_i = −∂J_i(x)/∂x_i. Taking into consideration also the boundary effects, the rate control algorithm for the ith user is
$$\dot x_i = \begin{cases} \left[\frac{dU_i(x_i)}{dx_i} - \alpha_i D_i(t)\right]^- & \text{if } x_i = x_{i,\max},\\ \frac{dU_i(x_i)}{dx_i} - \alpha_i D_i(t) & \text{if } 0 < x_i < x_{i,\max},\\ \left[\frac{dU_i(x_i)}{dx_i} - \alpha_i D_i(t)\right]^+ & \text{if } x_i = 0. \end{cases} \qquad (25)$$
Then, for a general network topology with multiple links, the generalized system is described by
$$\dot x_i(t) = \frac{dU_i(x_i)}{dx_i} - \alpha_i D_i(t), \quad i = 1, \ldots, M, \qquad \dot d_l(t) = \frac{\bar x_l}{C_l} - 1, \quad l = 1, \ldots, L, \qquad (26)$$
with the boundary behavior given by (23) and (25). Define the feasible set Ω (as before) as
$$\Omega = \{(\mathbf{x}, \mathbf{d}) \in \mathbb{R}^{M+L} : 0 \le x_i \le x_{i,\max} \text{ and } 0 \le d_l \le d_{l,\max},\;\forall i, l\},$$
where d_{l,max} and x_{i,max} are upper bounds on d_l and x_i, respectively. Define d_max := [d_{1,max}, . . . , d_{L,max}]. Existence and uniqueness of an inner equilibrium on the set Ω are now investigated under the assumption of x_{i,max} > C_l, ∀l. Toward this end, assume that A is a full row rank matrix with M ≥ L, without any loss of generality. This is motivated by the fact that non-bottleneck links on the network have no effect on the equilibrium point and can safely be left out.

Theorem 4 Let 0 ≤ α_{i,min} ≤ α_i ≤ α_{i,max}, ∀i ∈ M, where the elements of the vector α_max are arbitrarily large, and let A be of full row rank. Given X, if α_min and d_max satisfy
$$0 < \max_{\mathbf{x}\in X} \mathbf{d}(\alpha_{\min}, \mathbf{x}) < \mathbf{d}_{\max},$$
where d(α, x) is defined in (30), then system (26) has a unique equilibrium point, (x∗, d∗), which is in the interior of the set Ω.

Proof Supposing that (26) admits an inner equilibrium and setting ẋ_i(t) and ḋ_l(t) equal to zero for all l and i, one obtains
$$A\mathbf{x} = C, \qquad (27)$$
$$\mathbf{f}(\alpha, \mathbf{x}) = A^T \mathbf{d}, \qquad (28)$$
where d := [d_1, . . . , d_L]^T is the delay vector at the links, C is the capacity vector introduced earlier, and the nonlinear vector function f is defined as
$$\mathbf{f}(\alpha, \mathbf{x}) := \left[\frac{1}{\alpha_1}\frac{dU_1}{dx_1},\, \ldots,\, \frac{1}{\alpha_M}\frac{dU_M}{dx_M}\right]^T. \qquad (29)$$
Define X := {x ∈ R^M : Ax = C} as the set of flows x which satisfy (27). Multiplying (28) from the left by A yields A f(α, x∗) = AA^T d. Since A is of full row rank, the square matrix AA^T is full rank, and hence invertible. Thus, for a given flow vector x and pricing vector α,
$$\mathbf{d}(\alpha, \mathbf{x}) = (AA^T)^{-1} A\,\mathbf{f}(\alpha, \mathbf{x}) \qquad (30)$$
is unique. From the definition of f, d(α, x) is a linear combination of the terms (dU_i(x_i)/dx_i)/α_i and, hence, strictly decreasing in α. Since the set X is compact, the continuous function d(α, x) admits a maximum value on the set X for a given α. Therefore, for each ε > 0 one can choose the elements of α_max sufficiently large such that
$$0 < \max_{\mathbf{x}\in X} \mathbf{d}(\alpha_{\max}, \mathbf{x}) < \varepsilon.$$
In addition, given X and d_max, one can find α_min such that
$$0 < \max_{\mathbf{x}\in X} \mathbf{d}(\alpha_{\min}, \mathbf{x}) < \mathbf{d}_{\max}. \qquad (31)$$
Hence, there is at least one inner equilibrium solution, (x∗, d∗), on the set Ω, which satisfies (27) and (28). The uniqueness of the equilibrium is established next. Suppose that there are two different equilibrium points, (x∗₁, d∗₁) and (x∗₂, d∗₂). Then, from (27) it follows that A(x∗₁ − x∗₂) = 0 ⇔ (x∗₁ − x∗₂)^T A^T = 0. Similarly, from (28) follows f(α, x∗₁) − f(α, x∗₂) = A^T(d∗₁ − d∗₂). Multiplying this with (x∗₁ − x∗₂)^T from the left, one obtains (x∗₁ − x∗₂)^T [f(α, x∗₁) − f(α, x∗₂)] = 0, which can be rewritten as
$$\sum_{i=1}^{M}\big(x^*_{1i} - x^*_{2i}\big)\,\frac{1}{\alpha_i}\left(\frac{dU_i(x^*_{1i})}{dx_i} - \frac{dU_i(x^*_{2i})}{dx_i}\right) = 0.$$
Since the U_i's are strictly concave, each term (say the ith one) in the summation is negative whenever x∗_{1i} ≠ x∗_{2i}, with equality holding only if x∗_{1i} = x∗_{2i}. Hence, the point x∗ has to be unique, that is, x∗ = x∗₁ = x∗₂. From this, and (26), it immediately follows that D_i, i = 1, . . . , M, are unique. This does not, however, immediately imply that d_l, l = 1, . . . , L, are also unique, which in fact may not be the case if A is not of full row rank. The uniqueness of the d_l's, however, follows from (30), where a unique d∗ is obtained for the given equilibrium flow vector x∗: d∗ = (AA^T)^{-1} A f(α, x∗). Thus, (x∗, d∗), following from (27) and (28), constitutes a unique inner equilibrium point on the set Ω.
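The primal–dual dynamics (26) can be simulated in a few lines. The sketch below uses a single bottleneck link, logarithmic utilities, and arbitrary pricing weights α_i — all of these are assumptions made for illustration — and checks that the rates settle at an allocation that fully uses the link, consistent with the equilibrium characterized above.

```python
import numpy as np

M, C = 3, 10.0                       # users and bottleneck capacity (hypothetical)
w = np.array([1.0, 2.0, 1.5])        # utility weights, U_i = w_i*log(1 + x_i)
alpha = np.array([1.0, 1.0, 1.0])    # per-user pricing coefficients
x = np.full(M, 1.0)                  # initial rates
d = 0.0                              # queueing delay at the bottleneck
dt = 0.01

for _ in range(20000):
    x_dot = w / (1.0 + x) - alpha * d        # user updates, cf. (26)
    d_dot = x.sum() / C - 1.0                # link (queueing delay) update, cf. (26)
    x = np.clip(x + dt * x_dot, 0.0, None)
    d = max(d + dt * d_dot, 0.0)

print("rates:", np.round(x, 3), "sum:", round(x.sum(), 3), "delay:", round(d, 4))
```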
4.3 Stability Analysis The rate control scheme and accompanying system described by (26) is first shown to be globally asymptotically stable under a general network topology in the ideal
case. Subsequently, the global stability of the system is investigated under arbitrary information delays, denoted by r , for a general network with a single bottleneck node and multiple users. The case of multiple users on a general network topology with multiple links is omitted since the problem in that case is quite intractable under arbitrary information delays. 4.3.1 Instantaneous Information Case The stability of the system below can easily be established under the assumption that users have instantaneous information about the network state. Alternatively, this case can be motivated by assuming that information delays are negligible in terms of their effects to the rate control algorithm. Defining the delays at links, dl , and user flow rates, xi , around the equilibrium as d˜l := dl − dl∗ and x˜i := xi − xi∗ , respectively, for all l and i, one obtains the following system inside the set Ω and around the equilibrium: x˜˙i (t) = gi (x˜i ) − αi D˜ i (t) , i = 1, . . . , M , 1 x˜i , l = 1, . . . , L , d˙˜l (t) = Cl
(32)
i:l∈Ri
where D˜ i =
l∈Ri
d˜l and gi (x˜i ) is defined as gi (x˜i ) :=
dUi (xi ) dUi (xi∗ ) − . d xi d xi
Define next a positive definite Lyapunov function
\[
V(\tilde{\mathbf{x}}, \tilde{\mathbf{d}}) = \sum_{i=1}^{M} \frac{1}{\alpha_i} (\tilde{x}_i)^2 + \sum_{l=1}^{L} C_l (\tilde{d}_l)^2. \tag{33}
\]
The time derivative of $V(\tilde{\mathbf{x}}, \tilde{\mathbf{d}})$ along the system trajectories is given by
\[
\dot{V}(\tilde{\mathbf{x}}, \tilde{\mathbf{d}}) = \sum_{i=1}^{M} \frac{2}{\alpha_i}\, g_i(\tilde{x}_i)\, \tilde{x}_i \leq 0,
\]
where the inequality follows because $g_i(\tilde{x}_i)\, \tilde{x}_i \leq 0\ \forall i$. Thus, $\dot{V}(\tilde{\mathbf{x}}, \tilde{\mathbf{d}})$ is negative semidefinite. Let $S := \{(\tilde{\mathbf{x}}, \tilde{\mathbf{d}}) \in \mathbb{R}^{M+L} : \dot{V}(\tilde{\mathbf{x}}, \tilde{\mathbf{d}}) = 0\}$. It follows as before that $S = \{(\tilde{\mathbf{x}}, \tilde{\mathbf{d}}) \in \mathbb{R}^{M+L} : \tilde{\mathbf{x}} = 0\}$. Hence, for any trajectory of the system that belongs identically to the set $S$, we have $\tilde{\mathbf{x}} = 0$. It follows directly from (32) and the fact that $g_i(0) = 0\ \forall i$ that
\[
\tilde{\mathbf{x}} = 0 \;\Rightarrow\; \dot{\tilde{\mathbf{x}}} = 0 \;\Rightarrow\; \tilde{D}_i = 0\ \forall i \;\Rightarrow\; \tilde{d}_l = 0\ \forall l,
\]
where the last implication is due to the fact that $\tilde{\mathbf{D}} = A^T \tilde{\mathbf{d}}$ and the matrix $A$ is of full row rank. Therefore, the only solution that can stay identically in $S$ is the zero solution, which corresponds to the unique inner equilibrium of the original system. As a result, system (32) is globally stable under the assumption of instantaneous information.

4.3.2 Information Delay Case

The preceding analysis is generalized to account for information delays in the system by introducing user-specific maximum propagation delays $r = [r_1, \ldots, r_M]$ between a bottleneck link and the users. The system is assumed to have a unique inner equilibrium point $(\mathbf{x}^*, d^*)$ as characterized in Section 4.2. Modifying the system equations around this equilibrium point by introducing the associated maximum propagation delays, one obtains
\[
\dot{\tilde{x}}_i(t) = g_i(\tilde{x}_i(t)) - \alpha_i \tilde{d}(t - r_i), \quad i = 1, \ldots, M,
\qquad
\dot{\tilde{d}}(t) = \frac{1}{C} \sum_{i=1}^{M} \tilde{x}_i(t - r_i).
\tag{34}
\]
Then, the $i$th user's rate control algorithm is
\[
\dot{\tilde{x}}_i(t - r_i) = g_i(\tilde{x}_i(t - r_i)) - \alpha_i \tilde{d}(t) + \frac{\alpha_i}{C} \int_{-2 r_i}^{0} \sum_{j=1}^{M} \tilde{x}_j(t + s - r_j)\, ds.
\]
Define again a positive definite Lyapunov function
\[
V(\tilde{\mathbf{x}}, \tilde{d}) = \sum_{i=1}^{M} \frac{1}{\alpha_i} \left(\tilde{x}_i(t - r_i)\right)^2 + C \left(\tilde{d}(t)\right)^2 + \frac{M}{C} \sum_{i=1}^{M} \int_{-2 r_i}^{0} \int_{t+s}^{t} \tilde{x}_i^2(u - r_i)\, du\, ds. \tag{35}
\]
Taking the derivative of $V$ along the system trajectories yields
\[
\dot{V}(\tilde{\mathbf{x}}, \tilde{d}) = \sum_{i=1}^{M} \frac{2}{\alpha_i}\, g_i(\tilde{x}_i(t - r_i))\, \tilde{x}_i(t - r_i)
+ \frac{1}{C} \sum_{i=1}^{M} \sum_{j=1}^{M} \int_{-2 r_i}^{0} 2\, \tilde{x}_i(t - r_i)\, \tilde{x}_j(t + s - r_j)\, ds
+ \frac{M}{C} \sum_{i=1}^{M} \int_{-2 r_i}^{0} \left[ \tilde{x}_i^2(t - r_i) - \tilde{x}_i^2(t + s - r_i) \right] ds.
\]
This derivative $\dot{V}$ is bounded from above by
\[
\dot{V}(\tilde{\mathbf{x}}, \tilde{d}) \leq \sum_{i=1}^{M} \left[ \frac{2}{\alpha_i}\, g_i(\tilde{x}_i(t - r_i))\, \tilde{x}_i(t - r_i) + \frac{4 M r_i}{C}\, \tilde{x}_i^2(t - r_i) \right].
\]
Hence, it can be made negative semi-definite by imposing a condition on the maximum delay in the system, $r_{\max} := \max_i r_i$. Let $S := \{(\tilde{\mathbf{x}}, \tilde{d}) \in \tilde{\Omega} : \dot{V}(\tilde{\mathbf{x}}, \tilde{d}) = 0\}$. It follows as before that $S = \{(\tilde{\mathbf{x}}, \tilde{d}) \in \tilde{\Omega} : \tilde{\mathbf{x}} = 0\}$. Therefore, for any trajectory of the system that belongs identically to the set $S$, $\tilde{\mathbf{x}} = 0$. It also follows directly from (34), and the fact that $g_i(0) = 0\ \forall i$, that $\tilde{\mathbf{x}} = 0 \Rightarrow \dot{\tilde{\mathbf{x}}} = 0 \Rightarrow \tilde{d} = 0$, where the fact that the matrix $A$ is of full row rank is used. Consequently, the only solution that can stay identically in $S$ is the zero solution, which corresponds to the unique equilibrium of the original system. As a result, system (34) is asymptotically stable by LaSalle's invariance theorem [22] if the maximum delay in the system, $r_{\max}$, satisfies the condition
\[
r_{\max} < \frac{k_{\min}\, C}{2\, \alpha_{\max}\, M}, \tag{36}
\]
where $\alpha_{\max}$ and $k_{\min}$ are defined as
\[
\alpha_{\max} := \max_i \alpha_i, \qquad
k_{\min} := \min_i\ \inf_{-x_i^* \le \tilde{x}_i \le x_{i,\max}} \left( -\frac{g_i(\tilde{x}_i)}{\tilde{x}_i} \right). \tag{37}
\]
The following theorem summarizes this result:

Theorem 5 Let the conditions in Theorem 4 hold such that the system
\[
\dot{x}_i(t) = \frac{dU_i(x_i(t))}{dx_i} - \alpha_i\, d(\mathbf{x}, t - r), \quad i = 1, \ldots, M,
\qquad
\dot{d}(t) = \frac{1}{C} \sum_{i=1}^{M} x_i(t - r_i) - 1,
\]
admits a unique inner equilibrium point $(\mathbf{x}^*, d^*)$. This system is globally asymptotically stable if the maximum delay, $r_{\max}$, in the system satisfies the condition
\[
r_{\max} < \frac{k_{\min}\, C}{2\, \alpha_{\max}\, M},
\]
where $\alpha_{\max}$ and $k_{\min}$ are defined in (37).

Notice that the bound on the maximum delay required for the stability of the system is affected by, among other things, the maximum pricing parameter and the capacity per user $C/M$. Since the link capacity $C$ will be provisioned in the network design stage according to the expected maximum number of users, the proposed algorithm is in practice scalable for the given capacity per user.
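To make the role of the bound in Theorem 5 concrete, the sketch below evaluates $r_{\max} < k_{\min} C / (2 \alpha_{\max} M)$ for logarithmic utilities, using the form of (37) reconstructed above; the capacity, number of users, pricing parameters and rate bounds are illustrative assumptions.

```python
import numpy as np

C = 100.0                              # link capacity (assumed)
M = 20                                 # number of users (assumed)
alphas = np.linspace(0.5, 2.0, M)      # pricing parameters alpha_i (assumed)
x_star = np.full(M, C / M)             # equilibrium rates, roughly C/M
x_max = 2.0 * x_star                   # upper bound on user rates (assumed)

# For U_i = log, g_i(x~) = 1/(x* + x~) - 1/x*; k_min follows (37) as
# min_i inf_{-x_i* <= x~ <= x_i,max} ( -g_i(x~)/x~ ), evaluated on a grid.
def k_i(xs, xm, n_grid=1000):
    xt = np.linspace(-0.999 * xs, xm, n_grid)
    xt = xt[np.abs(xt) > 1e-9]         # skip the removable point x~ = 0
    g = 1.0 / (xs + xt) - 1.0 / xs
    return np.min(-g / xt)

k_min = min(k_i(xs, xm) for xs, xm in zip(x_star, x_max))
alpha_max = alphas.max()
r_max_bound = k_min * C / (2.0 * alpha_max * M)
print(f"k_min ~ {k_min:.4g}; stability requires r_max < {r_max_bound:.4g}")
```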
5 Robust Rate Control

The link capacities $C$ on a network fluctuate due to short-lived background traffic as well as due to the inherent characteristics of the network, e.g., as a result of fading in the case of wireless networks. Relaxing the assumption on the knowledge of the available bandwidth $B$ at a bottleneck link, it is possible to define a function of it, $w(B)$, simply as an input to end users instead of attempting to model it explicitly. Then, define a system from the perspective of a user $i \in M$ which keeps track of the available bandwidth on a bottleneck link shared by $M - 1$ others. The system state $s_i$ reflects, from the perspective of device $i$, roughly the bandwidth availability on its path. Then, the system equation for user $i$ is
\[
\dot{s}_i = a s_i + b u_i + w, \tag{38}
\]
where $u_i$ represents the control action of the user. The parameters $a < 0$ and $b < 0$ adjust the memory horizon (the smaller $|a|$, the longer the memory) and the "expected" effectiveness of control actions, respectively, on the system state $s_i$. User $i$ bases its control actions on its state, which not only takes as input the current available bandwidth but also accumulates the past values to some extent. It is also possible to interpret system (38) as a low-pass filter with input $w$ and output $s$. Based on the discussion above, the following rate control scheme, which is approximately proportional to the control actions, is proposed:
\[
\dot{x}_i = -\phi x_i + u_i, \tag{39}
\]
where $\phi > 0$ is sufficiently small. Although this rate update scheme seems disconnected from the system in (38), it is not, as we show in the next section. As a result of $w$ being a function of the available bandwidth $B = C - \sum_{k=1}^{M} x_k$, which in turn is a function of the link capacity and the aggregate user rates, systems (38) and (39) are connected via a feedback loop. For simplicity, the coefficient of $u_i$ is chosen to be 1 in (39). Since a rate update of a device will have the inverse effect on the available bandwidth, the parameter $b$ in (38) is naturally picked to be negative. Notice that this is a "bandwidth probing scheme" in a sense similar to the additive-increase multiplicative-decrease (AIMD) feature of the well-known transmission control protocol (TCP). However, in this case the user decides on the rate control action by solving an optimization problem. The objectives of this problem include full bandwidth utilization while preventing excessive rate fluctuations leading to instabilities and jitter.
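The feedback loop between (38) and (39) can be illustrated with a simple Euler simulation; the sketch below uses the linear feedback $u = \theta s$ introduced in Section 5.1 and sets $w$ to the instantaneous available bandwidth $C - \sum_k x_k$. All numerical values are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

# Illustrative parameters (assumptions): a, b < 0, theta > 0, phi > 0 small.
a, b, theta, phi = -1.0, -0.5, 0.8, 0.05
C, M, dt, steps = 100.0, 10, 0.01, 5000

s = np.zeros(M)
x = np.zeros(M)
for _ in range(steps):
    w = C - x.sum()                 # available bandwidth seen by the users
    u = theta * s                   # linear feedback control of Section 5.1
    s += dt * (a * s + b * u + w)   # state update, eq. (38)
    x += dt * (-phi * x + u)        # rate update, eq. (39)

print(f"total rate = {x.sum():.2f} out of capacity C = {C}")
```

With a small $\phi$ the total rate settles close to the capacity, in line with the equilibrium analysis of Section 5.1.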
5.1 Equilibrium and Stability Analysis for Fixed Capacity In order to compute the control actions u given the state s, consider a linear feedback control scheme of the general form u = θ s, where θ is a positive constant. An equilibrium and stability analysis of system (38) and (39) is conducted under this general
class of linear feedback controllers for a single bottleneck link of fixed capacity $C$ shared by $M$ users. The analysis of this special fixed-capacity case provides valuable insights into the original problem. By ignoring the noise in the system, make the simplifying assumption $w := C - \sum_{i=1}^{M} x_i$. Then,
\[
\dot{s}_i = a s_i + b\theta s_i + C - \sum_{k=1}^{M} x_k, \qquad
\dot{x}_i = -\phi x_i + \theta s_i, \quad i = 1, \ldots, M. \tag{40}
\]
At the equilibrium – which is unique and is asymptotically stable – $\dot{s}_i = \dot{x}_i = 0\ \forall i$. Solving for the equilibrium values of $s_i$ and $x_i$ for all $i$, denoted by $s_i^*$ and $x_i^*$, respectively, one obtains
\[
x_i^* = \frac{C\theta}{\theta M - (a + b\theta)\phi}
\qquad \text{and} \qquad
s_i^* = \frac{C\phi}{\theta M - (a + b\theta)\phi},
\]
which are unique, under the negativity of $a$ and $b$ and positivity of $\theta$, as long as $\phi > 0$. As $\phi \to 0^+$, it follows that $\sum_i x_i \to C$. Thus, as $\phi$ approaches zero from the positive side, linear feedback controllers of the form $u = \theta s$ with $\theta > 0$ ensure maximum network usage when the capacity $C$ is fixed and there is no noise. Notice that the equilibrium rate $x^*$ is on the order of $C/M$ and usually much larger than zero, which constitutes a physical boundary due to the nonnegativity constraint. It is next proven that the linear system (40) is stable and asymptotically converges to the equilibrium point whenever $\phi > 0$. Toward this end, sum the rates $x_i$ in (40) to obtain
\[
\dot{s}_i = -\mu s_i - \bar{x} + C, \quad i = 1, \ldots, M, \qquad
\dot{\bar{x}} = -\phi \bar{x} + \theta \sum_{i=1}^{M} s_i, \tag{41}
\]
where $\bar{x} := \sum_{i=1}^{M} x_i$ and $\mu := -(a + b\theta) > 0$. We can rewrite (41) in the matrix form as
\[
\dot{\mathbf{y}} = F \mathbf{y} + [C \cdots C\ 0]^T,
\]
where $\mathbf{y} := [s_1 \cdots s_M\ \bar{x}]^T$. Then, it is straightforward to show that the characteristic function of the $(M + 1)$-dimensional square matrix $F$ has the form
\[
\det(\lambda I - F) = (\lambda + \mu)^{M-1} \left[ \lambda^2 + (\mu + \phi)\lambda + \mu\phi + M\theta \right] = 0.
\]
Notice that $F$ has $M - 1$ repeated negative eigenvalues at $\lambda = -\mu$ and two additional eigenvalues at
\[
\lambda_{1,2} = -\frac{1}{2}(\mu + \phi) \pm \frac{1}{2}\sqrt{(\mu - \phi)^2 - 4M\theta}.
\]
If $(\mu - \phi)^2 < 4M\theta$, then both of these eigenvalues are complex with negative real parts. Otherwise, we have $\mu + \phi > |\mu - \phi|$ and both eigenvalues are negative and real. Therefore, all eigenvalues of $F$ always have negative real parts and the linear system (41) is stable. It immediately follows that $s_i$ is always finite and converges to the equilibrium, and from the second equation of (40), $x_i$ has to be finite and converges for all $i$. Thus, the original system (40) is stable.
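The closed-form equilibrium of (40) and the spectrum of $F$ can be checked numerically, as in the sketch below; the parameter values are illustrative assumptions.

```python
import numpy as np

# Illustrative parameters (assumptions, not from the chapter).
a, b, theta, phi, C, M = -1.0, -0.5, 0.8, 0.05, 100.0, 10
mu = -(a + b * theta)                  # mu := -(a + b*theta) > 0

# Closed-form equilibrium of (40).
den = theta * M - (a + b * theta) * phi
x_eq = C * theta / den
s_eq = C * phi / den
print(f"x_i* = {x_eq:.3f}, s_i* = {s_eq:.3f}, M*x_i* = {M * x_eq:.2f} (C = {C})")

# Matrix F of the reduced system (41): state vector y = [s_1, ..., s_M, x_bar].
F = np.zeros((M + 1, M + 1))
F[:M, :M] = -mu * np.eye(M)
F[:M, M] = -1.0                        # each ds_i/dt contains the term -x_bar
F[M, :M] = theta                       # dx_bar/dt = -phi*x_bar + theta*sum(s_i)
F[M, M] = -phi

eig = np.linalg.eigvals(F)
print("max real part of eigenvalues:", eig.real.max())   # negative => stable
```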
5.2 H∞-Optimal Rate Control

Having obtained the equilibrium state of (40) and shown its asymptotic stability for fixed capacity, the system is now analyzed for robustness. First, rewrite system (40) around the equilibrium point $(s_i^*, x_i^*)\ \forall i$ to obtain
\[
\dot{\tilde{s}}_i = a \tilde{s}_i + b \tilde{u}_i + w, \qquad
\dot{\tilde{x}}_i = -\phi \tilde{x}_i + \theta \tilde{s}_i, \quad i = 1, \ldots, M, \tag{42}
\]
where $\tilde{s}_i := s_i - s_i^*$, $\tilde{x}_i := x_i - x_i^*$, and $\tilde{u}_i := u_i - u_i^*$. Then, reformulate the rate control objectives described earlier as a disturbance rejection problem around the equilibrium. Subsequently, H∞ optimal control theory allows for removing all the simplifying assumptions of the previous section on $w$ and for solving the problem in the most general case. By viewing the disturbance (here the available bandwidth) as an intelligent maximizing opponent in a dynamic zero-sum game who plays with knowledge of the minimizer's control action, the system is evaluated under the worst possible conditions (in terms of capacity usage). Then, users determine the control actions that will minimize costs or achieve the objectives defined under these worst circumstances [9], resulting in a robust linear feedback rate control scheme. Notice that a timescale separation is assumed to exist between the variations in capacity $C(t)$ and the rate updates $x(t)$. With a sufficiently high update frequency each device can track the variations of the equilibrium point caused by the random capacity fluctuations. The robustness properties of the H∞-optimal controller also play a positive role here. System (38) can be classified as continuous time with perfect state measurements, since the state $\tilde{s}_i$ is an internal variable of user $i$. Next, an H∞-optimal control analysis and design is provided by taking this into account. First, introduce the controlled output, $z_i(t)$, as a two-dimensional vector:
\[
z_i(t) := [h\, \tilde{s}_i(t) \quad g\, \tilde{u}_i(t)]^T, \tag{43}
\]
where $g$ and $h$ are positive preference parameters. The cost of a user, which captures the objectives defined above for the purpose of the H∞ analysis, is the ratio of the $L_2$-norm of $z_i$ to that of $w$:
\[
L_i(\tilde{x}_i, \tilde{u}_i, w) = \frac{\|z_i\|}{\|w\|}, \tag{44}
\]
where $\|z_i\|^2 := \int_0^\infty |z_i|^2\, d\tau$ and a similar definition applies to $\|w\|^2$. Although it is a ratio, $L_i$ is referred to as the (user) cost in the rest of the analysis. It captures the proportional changes in $z_i$ due to changes in $w$. If $w$ is very large, the user cost $L_i$ should be low even if $z_i$ is large as well. A large $z_i$ indicates that the state $|\tilde{s}_i|$ and/or the control $|u_i|$ have high values, reflecting and reacting to the situation, respectively. However, they should not grow unbounded, which is ensured by a low cost $L_i$. For the rest of the analysis, the subscript $i$ denoting user $i$ is dropped for ease of notation.

H∞-optimal control theory guarantees that a performance factor will be met. This factor $\gamma$, also known as the H∞ norm, can be thought of as the worst possible value for the cost $L$. It is bounded below by
\[
\gamma^* := \inf_{\tilde{u}}\, \sup_{w}\, L(\tilde{u}, w), \tag{45}
\]
which is the lowest possible value for the parameter $\gamma$. It can also be interpreted as the optimal performance level in this H∞ context. In order to solve for the optimal controller $\mu(\tilde{s})$, a corresponding (soft-constrained) differential game is defined, which is parametrized by $\gamma$:
\[
J_\gamma(\tilde{u}, w) = \|z\|^2 - \gamma^2 \|w\|^2. \tag{46}
\]
The environment is assumed to maximize this cost function (as part of the worst-case analysis) while the objective of the user is to minimize it. The optimal control action $\tilde{u} = \mu_\gamma(\tilde{s})$ can be determined from this differential game formulation for any $\gamma > \gamma^*$. This controller is expressed in terms of a relevant solution, $\sigma_\gamma$, of a related game algebraic Riccati equation (GARE) [9]:
\[
2a\sigma - \left( \frac{b^2}{g^2} - \frac{1}{\gamma^2} \right) \sigma^2 + h^2 = 0. \tag{47}
\]
By the general theory [12], the relevant solution of the GARE is the "minimal" one among its multiple nonnegative definite solutions. However, in this case, since the GARE is scalar and the system is open-loop stable (that is, $a < 0$), the GARE (which is a quadratic equation) admits a unique positive solution for all $\gamma > \gamma^*$, and the value of $\gamma^*$ can be computed explicitly in terms of the other parameters. Solving for the roots of (47):
\[
\sigma_\gamma = \frac{-a \pm \sqrt{a^2 - \lambda h^2}}{\lambda},
\qquad \text{where } \lambda := \frac{1}{\gamma^2} - \frac{b^2}{g^2}.
\]
The parameter $\lambda$ could be both positive and negative, depending on the value of $\gamma$, but for $\gamma$ close in value to $\gamma^*$ it will be positive. Further, $\gamma^*$ is the smallest value of $\gamma$ for which the GARE has a real solution. Hence,
\[
\gamma^* = \left( \sqrt{\frac{a^2}{h^2} + \frac{b^2}{g^2}}\, \right)^{-1}.
\]
Finally, a controller that guarantees a given performance bound $\gamma > \gamma^*$ is
\[
u_\gamma = \mu_\gamma(s) = -\frac{b\, \sigma_\gamma}{g^2}\, s. \tag{48}
\]
This is a stabilizing linear feedback controller operating on the device system state $s$, where the gain can be calculated off-line using only the linear quadratic system model and for the given system and cost parameters. It is important to note that although the analysis and controller design are conducted around the equilibrium point, the users do not have to compute the actual equilibrium values. In other words, (48) can be equivalently written in terms of $\tilde{u}_\gamma$ and $\tilde{x}$. In practice, the H∞-optimal rate control scheme is implemented as follows: each user $i$ keeps track of the measured available bandwidth $B_i$ on the network via the state equation (38), which takes the respective $w_i(B_i)$ as input. Then, the linear feedback control $u_i$ is computed using (48) for a given set of system $(a, b)$ and preference $(h, g)$ parameters. Finally, each user updates its flow rate using (39) on each network. A discretized version of the algorithm is summarized in Fig. 1.
Input: Available bandwidth B measurements on the network;
Parameters: System parameters (a, b) and user preferences (h, g);
Output: Feedback control u and flow rate x for each user;
Measure current available bandwidth (w);
Update s using (38);
Compute u using (48);
Update flow rate x using (39);
Fig. 1 H∞ -optimal rate control scheme from a user perspective
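A minimal sketch of the discretized scheme of Fig. 1, assuming a simple Euler discretization, synthetic bandwidth measurements, and the reconstructed expressions for $\gamma^*$ and $\sigma_\gamma$ given above; the parameter values $(a, b)$, $(h, g)$, $\phi$ and the chosen $\gamma$ are illustrative assumptions.

```python
import math
import random

# Illustrative parameters (assumptions): system (a, b) and preferences (h, g).
a, b = -1.0, -0.5
h, g = 1.0, 1.0
phi, dt = 0.05, 0.01

# Performance level gamma > gamma*, using the closed-form gamma* derived above.
gamma_star = 1.0 / math.sqrt(a**2 / h**2 + b**2 / g**2)
gamma = 1.5 * gamma_star

lam = 1.0 / gamma**2 - b**2 / g**2
sigma = (-a - math.sqrt(a**2 - lam * h**2)) / lam   # smaller ("minimal") GARE root
gain = -b * sigma / g**2                            # controller (48): u = gain * s

s = x = 0.0
for _ in range(1000):
    w = 10.0 + random.uniform(-2.0, 2.0)   # synthetic bandwidth measurement
    u = gain * s                           # compute u using (48)
    s += dt * (a * s + b * u + w)          # update s using (38)
    x += dt * (-phi * x + u)               # update flow rate x using (39)

print(f"gamma* = {gamma_star:.3f}, feedback gain = {gain:.3f}, rate x = {x:.2f}")
```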
6 Discussion

The multi-faceted and complex nature of the rate allocation and control problem allows for a variety of approaches and diverse solutions. The objectives of efficiency, fairness, and incentive compatibility can be formulated in a variety of ways and can even be conflicting in many cases. Furthermore, the network models used for analysis may abstract different aspects of the underlying complex networks. It is, therefore, not surprising to observe the existence of a huge literature on the topic.

Section 3 has presented a game-theoretic framework that addresses the cases where users on the network are selfish and noncooperative. In other words, users do not follow a cooperative protocol out of goodwill as in TCP and decrease their flow rates voluntarily when there is congestion, but want as much bandwidth as they can get in all situations. To prevent undesirable outcomes such as congestion collapse, pricing is proposed as an enforcement mechanism. The solution concept adopted is the Nash equilibrium, from which no user has an incentive to deviate.

Section 4 has studied a primal–dual algorithm to solve a distributed optimization (maximization) of a global objective function, which is defined as the sum of user utilities, under capacity constraints. Its solution can be interpreted as a social optimum where the resulting flow rates are also proportionally fair for logarithmic user utilities. Here, the users are assumed to be cooperative in the sense that they follow a primal–dual distributed algorithm and ignore their own effect on the outcome when making decisions. Although this approach is mathematically similar to the one in Section 3, the solution point is not a proper Nash equilibrium and, hence, the users have to be cooperative to achieve it.

Unlike the previous two sections, Section 5 has focused on robustness with respect to parameter changes in the problem instead of optimization. A well-known method from control theory, H∞-optimal control, is used to design a distributed rate control scheme, where each user measures the current system state and acts independently from the others. One of the main aspects of the resulting algorithm is its adaptive nature with respect to variations in the total available bandwidth. Such variations in capacity are the rule rather than the exception in wired networks, due to short-lived or unresponsive flows, and in wireless networks, due to channel fading (fluctuation) effects.

While the formulations in Sections 3, 4, and 5 share some common aspects, such as the underlying network model, they differ from each other in terms of their emphasis. The rate control game of Section 3 focuses mainly on incentive compatibility and adopts the Nash equilibrium as the preferred solution concept. The primal–dual scheme of Section 4 solves a global optimization problem defined as the maximization of the sum of user utilities. It also extends the basic fluid network model by taking into account the queue dynamics and builds upon the information available to users for decision making. The robust rate control scheme of Section 5 emphasizes robustness with respect to capacity changes and delays. It also differs from the previous two, which share a utility-based approach, by focusing mainly on fully efficient usage of the network capacity rather than on user utilities.
In conclusion, the three control and game-theoretic formulations presented provide diverse insights into the problem through rigorous mathematical analysis, instead of focusing on a single specific system with a certain set of preferences and assumptions. Although implementation and architectural aspects have not been discussed here, it is hoped that the sound mathematical principles derived will be useful as a basis for engineering future rate control schemes.

Acknowledgments The author thanks Tamer Başar and Jatinder Singh for their contributions and Çiğdem Şengül for her insightful comments.
References

1. Alpcan T, Başar T (2000) A variable rate model with QoS guarantees for real-time internet traffic. In: Proceedings of the SPIE International Symposium on Information Technologies, Boston, MA, vol 2411, pp 243–245
2. Alpcan T, Başar T (2002) A game-theoretic framework for congestion control in general topology networks. In: Proceedings of the 41st IEEE Conference on Decision and Control, Las Vegas, NV, pp 1218–1224
3. Alpcan T, Başar T (2003) Global stability analysis of an end-to-end congestion control scheme for general topology networks with delay. In: Proceedings of the 42nd IEEE Conference on Decision and Control, Maui, HI, pp 1092–1097
4. Alpcan T, Başar T (2004) Global stability analysis of an end-to-end congestion control scheme for general topology networks with delay. Elektrik 12(3):139–150
5. Alpcan T, Başar T (2005) A utility-based congestion control scheme for Internet-style networks with delay. IEEE Trans Netw 13(6):1261–1274
6. Alpcan T, Singh JP, Başar T (2009) Robust rate control for heterogeneous network access in multi-homed environments. IEEE Trans Mobile Comput 8(1):41–51
7. Altman E, Başar T (1998) Multi-user rate-based flow control. IEEE Trans Commun 46(7):940–949
8. Altman E, Başar T, Jimenez T, Shimkin N (2002) Competitive routing in networks with polynomial costs. IEEE Trans Automat Contr 47(1):92–96
9. Başar T, Bernhard P (1995) H∞-optimal control and related minimax design problems: A dynamic game approach, 2nd edn. Birkhäuser, Boston, MA
10. Başar T, Olsder GJ (1999) Dynamic noncooperative game theory, 2nd edn. SIAM, Philadelphia, PA
11. Başar T, Srikant R (2002) Revenue-maximizing pricing and capacity expansion in a many-users regime. In: Proceedings of IEEE INFOCOM, New York
12. Bertsekas D, Gallager R (1992) Data networks, 2nd edn. Prentice Hall, Upper Saddle River, NJ
13. Brakmo LS, Peterson LL (1995) TCP Vegas: End to end congestion avoidance on a global internet. IEEE J Select Area Commun 13(8):1465–1480. Available via citeseer.nj.nec.com/brakmo95tcp.html
14. Deb S, Srikant R (2003) Global stability of congestion controllers for the internet. IEEE Trans Automat Contr 48(6):1055–1060
15. Elwalid A (1995) Analysis of adaptive rate-based congestion control for high-speed wide-area networks. In: Proceedings of the IEEE International Conference on Communications (ICC), vol 3, Seattle, WA, pp 1948–1953
16. Floyd S, Fall K (1999) Promoting the use of end-to-end congestion control in the internet. IEEE/ACM Trans Netw 7(4):458–472. Available via citeseer.nj.nec.com/article/floyd99promoting.html
17. Hale JK, Lunel SMV (1993) Introduction to functional differential equations. Applied mathematical sciences, vol 99. Springer, New York, NY
18. Jacobson V (1988) Congestion avoidance and control. In: Proceedings of the Symposium on Communications Architectures and Protocols (SIGCOMM), Stanford, CA, pp 314–329. Available via citeseer.ist.psu.edu/jacobson88congestion.html
19. Johari R, Tan D (2001) End-to-end congestion control for the Internet: Delays and stability. IEEE/ACM Trans Netw 9(6):818–832
20. Kelly FP (1997) Charging and rate control for elastic traffic. Eur Trans Telecomm 8:33–37
21. Kelly F, Maulloo A, Tan D (1998) Rate control in communication networks: Shadow prices, proportional fairness and stability. J Oper Res Soc 49:237–252
22. Khalil HK (1996) Nonlinear systems, 2nd edn. Prentice Hall, Upper Saddle River, NJ
23. Kunniyur S, Srikant R (2002) A time-scale decomposition approach to adaptive explicit congestion notification (ECN) marking. IEEE Trans Automat Contr 47(6):882–894
24. La RJ, Anantharam V (2000) Charge-sensitive TCP and rate control in the internet. In: Proceedings of IEEE INFOCOM, pp 1166–1175. Available via citeseer.nj.nec.com/320096.html
25. Liu S, Başar T, Srikant R (2003) Controlling the Internet: A survey and some new results. In: Proceedings of the 42nd IEEE Conference on Decision and Control, Maui, HI
26. Low SH, Lapsley DE (1999) Optimization flow control, I: Basic algorithm and convergence. IEEE/ACM Trans Netw 7(6):861–874
27. Massoulie L (2002) Stability of distributed congestion control with heterogeneous feedback delays. IEEE Trans Automat Contr 47(6):895–902
28. Mo J, La RJ, Anantharam V, Walrand JC (1999) Analysis and comparison of TCP Reno and Vegas. In: Proceedings of IEEE INFOCOM, pp 1556–1563. Available via citeseer.nj.nec.com/331728.html
29. Mo J, Walrand J (2000) Fair end-to-end window-based congestion control. IEEE/ACM Trans Netw 8:556–567
30. Orda A, Rom R, Shimkin N (1993) Competitive routing in multiuser communication networks. IEEE/ACM Trans Netw 1:510–521
31. Srikant R (2004) The mathematics of internet congestion control. Systems & Control: Foundations & Applications. Birkhäuser, Boston, MA
32. Vinnicombe G (2002) On the stability of networks operating TCP-like congestion control. In: Proceedings of the 15th IFAC World Congress on Automatic Control, Barcelona, Spain
33. Wen J, Arcak M (2003) A unifying passivity framework for network flow control. In: Proceedings of IEEE INFOCOM, San Francisco, CA
34. Yaiche H, Mazumdar RR, Rosenberg C (2000) A game theoretic framework for bandwidth allocation and pricing in broadband networks. IEEE/ACM Trans Netw 8:667–678
35. Yorke JA (1970) Asymptotic stability for one dimensional differential-delay equations. J Differ Equ 7:189–202
Of Threats and Costs: A Game-Theoretic Approach to Security Risk Management Patrick Maillé, Peter Reichl, and Bruno Tuffin
1 Introduction Telecommunication networks are becoming ubiquitous in our society, the most obvious example being the success of the Internet. One of the main reasons of this success is scalability, which means that a huge network can be managed properly at no – or no significant – additional cost compared to a small one. The key issue here is decentralization of decisions over all nodes of the network. On the other hand, it is often assumed that nodes cooperate by properly using the designed protocols, but playing with their parameters could improve one node’s position in the network, at the expense of the others. For this reason, non-cooperative game theory has recently come into play in the telecommunication community to analyse selfish behaviour and try to design mechanisms with appropriate incentives. This chapter focuses on a specific aspect of telecommunication networks: their security. Network security mechanisms aim at protecting against “natural” failures and voluntary attacks. While the former risk is related to reliability issues and can be estimated through analytic or simulation methods, the latter implies that the actions of the attacker be foreseen and countered. The choice of a security mechanism therefore depends on the defender’s knowledge of the possible attacks. On the other hand, an attacker will take into account the target’s defense strategies when determining its own attack type. Each actor therefore considers the actions of the others when Patrick Maillé Institut Telecom; Telecom Bretagne, 2 rue de la Châtaigneraie CS 17607, 35576 Cesson-Sévigné Cedex, France e-mail:
[email protected] Peter Reichl Telecommunications Research Center Vienna (ftw.), Donau-City-Str. 1, 1220 Wien, Austria e-mail:
[email protected] Bruno Tuffin INRIA Rennes – Bretagne Atlantique, Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France e-mail:
[email protected]
striving to optimize her own objective. Such interactions between actors with conflicting interests are typically the object of non-cooperative game theory. The goal of a security mechanism is to provide the highest possible level of protection; therefore, one might think that defenders should simply choose the most complete available protection. However, an improvement in the security level often has a counterpart cost in terms of bandwidth, computational power or money that decreases the overall performance or benefit from the service. Some real-time applications, for example, cannot use the most secure mechanisms because of delay constraints and, in general, choosing a defense strategy, i.e. an appropriate security mechanism for a given service, implies a trade-off between the costs of the mechanism and the incurred risks [12]. Likewise, the best strategy for an attacker might not be to develop all known attacks, because of the cost of running those attacks, and because this would increase the likelihood of being detected. It therefore appears that the “security game” played among attackers and defenders is not trivial in general since no dominant strategies can be exhibited. There are indeed several kinds of situations in the network security domain where actors have conflicting interests and must deploy complex strategies in order to reach the most profitable outcome. Examples of such situations include worm propagation, creation of trust networks, intrusion detection scenario learning, and reputation mechanisms. Due to the nature and complexity of the interactions among actors, game theory is particularly well suited to analyze all those cases. Again, the use of economical concepts – and particularly game theory – to study telecommunication networks has encountered a soaring interest for the last 15 years. Several kinds of applications of game theory have yielded important evolutions in different fields, including network routing, resource sharing and flow control, power control in wireless networks, pricing, as well as incentives to cooperate in ad hoc or peer-to-peer networks. Since the fundamental nature of telecommunication networks implies the fact that several agents share a set of resources and the actions of each one may affect the others, it seems natural that game theory is perfectly adapted to depict the network externalities and help predict the outcome of the interaction among self-interested users. Another relevant aspect mixing network security and game theory is the economic relationship between users and service providers. Indeed, providers have to define the right level of security to attract and sufficiently protect customers, but the introduction of security often implies a reduction of quality of service (QoS) and therefore potentially also of demand. This trade-off has to be analysed through a proper cost model. As a result, security can therefore be an important parameter in revenue management, especially in an environment where providers compete for customers. This chapter focuses on the game-theoretic aspects of network security issues for a broad range of scenarios, which are placed right at the cornerstone between telecommunications, economics and applied mathematics. 
This interdisciplinarity is particularly critical during the modelling of user preferences in network security situations: the challenge is then to convert certain technical factors such as security mechanisms, strategies and protocols and their consequences in terms of performance into the economic concept of utility. Likewise, studying the modelled situations as non-cooperative games involves competences in several fields of applied mathematics and game theory, such as optimization theory, the theory of repeated games, Markov decision processes and eventually agent-based simulation.

The remainder of the chapter is organized as follows. In Section 2, we review the basic notions of game theory, by using some simple and illustrative examples from security. Specific security games and solutions are then described in Section 3. The implications and consequences on the economic interactions between users and providers are described in Section 4. We finally conclude and describe the main challenges to be addressed in the future in Section 5.
2 A Game-Theoretic Perspective on Network Security This section presents the fundamental concepts of game theory that will be useful in our security context. The very general principles of game theory presented in Section 2.1 allow us to define a large number of game types, with specific forms or rules. In Section 3.1 we continue with discussing related work which focuses on the simplest (non-trivial) game models, where each user has just a finite number of choices and the game is played only once. Going one step further, we then introduce and discuss, respectively, three types of more complex games that have received specific attention in the context of security, namely repeated games, stochastic games and Bayesian games, in Sections 2.2, 2.3 and 2.4.
2.1 Fundamental concepts Game theory is a mathematical framework which allows modelling conflict and cooperation between two or more separate parties, the players. Players are assumed to behave rationally, i.e. they are triggered by the selfish incentive of maximizing their individual benefit, which is usually expressed in terms of a utility function. During the game, which follows certain rules, players can choose and implement a strategy from a set of different behavioural options, the so-called strategy space, in order to maximize the payoff they are receiving as an outcome of the game. Hence, formally a game is described by the number n of players, their strategy spaces and their payoff functions, Si and u i , respectively, for each player i (1 ≤ i ≤ n): G = {n; S1 , S2 , . . . , Sn ; u 1 , u 2 , . . . , u n }.
(1)
Based on that description, game-theoretic analysis attempts to understand the probable behaviour of the players, regarding their strategy choice, and thus to determine the presumable outcome of the game. In some cases, this works relatively straightforwardly, for instance, if each of the players can identify a "dominant strategy", i.e. a strategy with which this player is better off independently of the behaviour of his opponents and which directly leads to an equilibrium situation. A much broader
equilibrium concept, the so-called Nash equilibrium, is achieved if an operation point is reached where each player is giving her best response facing her opponents’ strategies, i.e. for none of the players there is a unilateral incentive to change her strategy, given that the strategies chosen by all opponents are fixed. Formally, if s = (s1 , . . . , sn ) is the profile of strategies with si ∈ Si and if s−i stands for the profile of strategies excluding player i, a Nash equilibrium is a profile s (with s = (si ; s−i ) ∀i) such that ∀1 ≤ i ≤ n, u i (s) ≥ u i (t; s−i )
∀t ∈ Si .
In other words, the strategy of each player $i$ is a best reply to the strategies of the others. Note that individual elements of the strategy space $S_i$ are called pure strategies, whereas a mixed strategy can be described as a linear combination of two or more pure strategies, with weights summing up to 1, which may be interpreted as the probability distribution $\pi_i = (\pi_{i,t})_{t \in S_i}$ for player $i$ choosing randomly among the pure strategies involved. The goal for each player $i$ is then to determine the probability distribution maximizing the expected utility
\[
\sum_{s_j \in S_j,\ j = 1, \ldots, n} u_i(s_1, \ldots, s_n) \prod_{k=1}^{n} \pi_{k, s_k}.
\]
A Nash equilibrium in mixed strategies is then a set of probability distributions $(\pi_i)_{1 \le i \le n}$ such that $\forall i$ and any other probability vector $\tilde{\pi}_i = (\tilde{\pi}_{i,t})_{t \in S_i}$,
\[
\sum_{s_j \in S_j,\ j = 1, \ldots, n} u_i(s_1, \ldots, s_n) \prod_{k=1}^{n} \pi_{k, s_k}
\;\ge\;
\sum_{t \in S_i}\, \sum_{s_j \in S_j,\ j \ne i} u_i(t; s_{-i})\, \tilde{\pi}_{i,t} \prod_{k \ne i} \pi_{k, s_k}.
\]
Whereas the existence of a pure strategy Nash equilibrium cannot be guaranteed, it can be demonstrated that any static game with a finite number of players and finite strategy spaces has at least one Nash equilibrium in mixed strategies, i.e. a profile of distributions such that the choice of each player maximizes its expected payoff or utility. Within the taxonomy of games, the distinction between static and dynamic games is worth mentioning: whereas in static games, all players simultaneously choose a strategy without further knowledge about their opponents’ decisions, dynamic games are characterized by a sequence of moves, where each player gives an answer based on the entire history of the game. Repeated games are a specific class of dynamic games and basically represent a sequence of static games. More details on those games are given in Section 2.2. Based on these introductory remarks, we will discuss more specific gametheoretic issues at various places in the rest of the chapter; for a detailed comprehensive introduction we kindly refer to standard textbooks like [10, 21]. Similarly, for applications in telecommunication networks, involving a lot of participants with different interests, the reader is advised to consult [3] and references therein for an overview of the type of problems that can be modelled in this way, including routing, resource allocation and queueing management. Game theory has indeed appeared as
a promising tool to study interactions in that context. A last related issue is network pricing, which can fruitfully be studied using game theory tools (see [8]). In the rest of this section, we will introduce the different kinds of security-related games in more detail and discuss specific examples.
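As a concrete counterpart to definition (1) and the best-reply characterization above, the sketch below stores a finite two-player game as payoff arrays and enumerates its pure-strategy Nash equilibria; the payoff numbers are arbitrary illustrative values.

```python
import numpy as np
from itertools import product

# A finite two-player game G = {2; S1, S2; u1, u2}: payoffs indexed by (s1, s2).
# The numbers are arbitrary illustrative values, not from the chapter.
u1 = np.array([[ 2, -1],
               [ 0,  0]])
u2 = np.array([[-3,  0],
               [ 1,  0]])

def pure_nash(u1, u2):
    """Return all pure-strategy profiles where no player gains by deviating."""
    eqs = []
    for s1, s2 in product(range(u1.shape[0]), range(u1.shape[1])):
        best1 = u1[s1, s2] >= u1[:, s2].max()   # player 1 has no better row
        best2 = u2[s1, s2] >= u2[s1, :].max()   # player 2 has no better column
        if best1 and best2:
            eqs.append((s1, s2))
    return eqs

print("pure-strategy Nash equilibria:", pure_nash(u1, u2))
```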
2.2 Repeated Games

Repeated games are a simple way to include the time aspect into game theory. Based on a classical one-shot game, we assume that the same game is played repeatedly for a number $T$ (possibly infinite or random) of periods. Traditionally, it is assumed that players value the present more than future periods, which is modelled by considering the discounted sum $\sum_{t=1}^{T} \delta^{t-1} u_i(s_t)$ as the overall payoff function of each player $i$, where $\delta \in [0, 1]$ is called the discount factor, $s_t$ is the (possibly mixed) strategy profile played at period $t$ and $u_i$ the corresponding utility for player $i$. In repeated games, a strategy for a user describes the action choice she should make depending on the whole history of past actions. Repeated games thus allow us to model some kind of reputation effects, where the past actions of a player can be sanctioned or rewarded by her opponents. Such games have interesting properties, such as the fact that the set of equilibria can become quite large (this result is known as the "Folk Theorem" [21]).
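A small sketch of the discounted overall payoff used above: given one player's stream of per-period utilities, it accumulates $\sum_t \delta^{t-1} u_i(s_t)$; the utility stream and discount factors are arbitrary illustrative values.

```python
def discounted_payoff(stage_utilities, delta):
    """Overall repeated-game payoff: sum_t delta^(t-1) * u_i(s_t)."""
    return sum(delta ** (t - 1) * u for t, u in enumerate(stage_utilities, start=1))

# Arbitrary utility stream (illustrative values): cooperation yields 1 per period,
# a one-shot deviation at period 4 yields 3, and 0 afterwards.
stream = [1, 1, 1, 3, 0, 0, 0, 0]
for delta in (0.3, 0.9):
    print(delta, round(discounted_payoff(stream, delta), 3))
```

A patient player ($\delta$ close to 1) weighs the later periods almost as much as the first one, which is the intuition behind the reputation effects mentioned above.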
2.3 Stochastic Games

Like repeated games, stochastic games also have a form of memory, but in a more complex fashion. The game is still played over (discretized) time, but memory is represented by a state, which in the context of network security can, for example, describe the current status of the data (not compromised, compromised, stolen), the type of application used, and the ongoing attacks and activated countermeasures. The state space is traditionally assumed to be finite, where each state corresponds to
• some payoff value for each player,
• an action set for each player, and
• some transition probability value for each state that depends on the actions taken by the players at the current period.
The value corresponding to a state is the probability that the system will be in that state in the next period. As for repeated games, one has to define an overall payoff function for each player, which is classically chosen as a discounted sum of the per-period payoffs. Stochastic games are hard to study analytically, since the number of states increases exponentially with the number of players and of strategy choices. Therefore, the equilibria of such games are often computed numerically. On the other hand, those games allow us to model some quite rich scenarios. They are therefore
well suited for some types of attacks like intrusion, which usually implement a sequence of attack steps before reaching their goal.
2.4 Bayesian Games Bayesian games are characterized by incomplete information about the opponents (e.g. their payoff functions). In a security context, this is used to model the difference between malicious attackers and non-malicious ordinary users who are accessing the system regularly. To model this, we assume that in the game the system is facing a subject, where the subject can be of one out of several types (for instance, malicious or not in the most simple case, but there can be more types, such as malicious users with different interests), and users with different available security services. This type (and the corresponding payoff function) is assumed to be private knowledge to the subject, whereas the system only can have a certain belief on that, e.g., a probability distribution between malicious/non-malicious users. In the course of the game, players can update their initial beliefs because of the actions observed according to Bayes’ rule. The so-called signalling games are a particularly interesting example of Bayesian games. Here, there is an informed player (agent) knowing the type of the opponent (principal). The principal is unaware of the agent’s type and has to start the game with an initial belief. During the game, however, the principal is able to update her initial belief based on signals originating from actions of the agent, until the principal eventually manages to deduce the type of her opponent. This type of games requires an extension to the concept of an equilibrium: Thus, a Bayesian Nash equilibrium is defined as a strategy profile together with a probability distribution characterizing the belief about the types of the opponents, which maximizes the expected payoff assuming that strategies and beliefs of the other players are fixed. Note that, similar to normal-form games, the existence of one or more Bayesian Nash equilibria can be proved when the numbers of pure strategies are finite, since Bayesian strategies can be interpreted as mixed strategies.
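To illustrate the belief update in such Bayesian games, the sketch below applies Bayes' rule to a defender's prior over two subject types (malicious or regular) after repeated suspicious observations; the likelihood values are illustrative assumptions.

```python
def bayes_update(prior_malicious, p_obs_given_malicious, p_obs_given_regular):
    """Posterior probability that the subject is malicious after one observation."""
    joint_mal = prior_malicious * p_obs_given_malicious
    joint_reg = (1.0 - prior_malicious) * p_obs_given_regular
    return joint_mal / (joint_mal + joint_reg)

# Illustrative assumption: the observed action (e.g. a port scan) is far more
# likely to come from a malicious user than from a regular one.
belief = 0.1                       # initial belief that the subject is malicious
for _ in range(3):                 # three suspicious observations in a row
    belief = bayes_update(belief, p_obs_given_malicious=0.6, p_obs_given_regular=0.05)
    print(round(belief, 3))
```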
3 A Closer Look into Specific Security Games In this section, we present some security games/contexts that have been introduced in the literature. As we shall see, several types of interactions can be modelled as a non-cooperative game, depending on the aggregation level, services, timescale and types of attacks considered. We are going to present specific games illustrating the most significant interactions when dealing with security. The most basic kind of game between an attacker and a defender is analysed in Section 3.1, where each player has the choice between two strategies: doing nothing or launching a costly attack or detection procedure. The Nash equilibrium (in mixed strategies) is determined. A more complicated game is then described in Section 3.2, where more actions are possible for each player and the information available to players may be incomplete. The interactions between the attacker and the defender can then be
studied as a repeated Bayesian game, whose study allows us to identify the most relevant parameters in the attack and defense strategies. In the same family of games, one can try to incentivize the defenders in a network to participate to a common defense strategy, for the best of the whole network. Such games are described in Section 3.3. While those three sections are for direct interactions among defenders and attackers, no information about the network topology is used either in player payoffs or in their strategies. Section 3.4 considers games on networks, where attacks and defenses have to be placed appropriately on the different links. Likewise, attacks can be performed by worms, for which the trade-off is between a discrete and a fast propagation to maximize the dissemination. Such worm propagation games are presented in Section 3.5. Section 3.6 highlights the problem (not fully modelled yet) of interdomain incentives for confidentiality when an intermediate node is supposed to forward the traffic of his neighbours. Note that all the games presented in the previous sections have applications in security modelling. Repeated game models have been used in other works, for instance, in [1] to represent the interactions of nodes in a wireless mobile ad hoc network (MANET). Here the attacks considered are only passive attacks, which consist of some nodes refusing to act as a relay for the communication flows of the others. The repeated feature of the game allows us to build mechanisms that sanction non-contributing nodes, in order to create incentives for collaboration. Likewise, in [28] the fact that the game has a form of memory is used to detect and isolate malicious nodes among a whole set of wireless ad hoc nodes. Similarly, the stochastic game approach is used in [17] to model attacks directed against specific applications, against communication capacities, or against databases. Numerical studies lead to some mixed Nash equilibria, i.e. at some state(s) players should choose their action according to a given probability distribution. References [25, 29] apply stochastic game models as well, to study other specific attack scenarios, and numerically compute mixed Nash strategies. The Bayesian approach is also considered in security games, for example, in [23]. There, the authors consider two types of attackers, namely “normal users” whose behaviour is only driven by selfishness and “malicious nodes” that intend to maximize the damage done. The defender has an a priori belief about the probability of its opponent being of each type and updates that probability as it observes the attacker moves. That belief is also used to determine the probability of using the detection mechanisms.
3.1 Play Detection/Attack or Not: Mixed Nash Equilibria In [2], Alpcan and Ba¸sar1 introduce a model where the attacker has two choices, i.e. launching an attack or doing nothing, while the defender’s choices are to trigger or not its (costly) detection scheme. The authors observe that for reasonable values
1 In [2], the authors actually first define a cooperative game where nodes in a sensor network should collaborate to improve intrusion detection, but we do not describe this model since this chapter focuses on noncooperative games.
of the payoffs for each outcome, the game has no Nash equilibrium in pure strategies. This is easy to see: if the attacker always attacks, then the victim always defends, so the attacker would be better off not attacking; on the contrary, if the attacker never attacks, then defending is only costly for the victim, which should thus never defend, which precisely makes attacks profitable to the attacker. A convenient way to visualize this is to represent player utilities depending on their actions in a matrix where player 1 (defender) actions correspond to rows, player 2 (attacker) actions to columns, and the terms in the matrix are written in the form $(u_1, u_2)$ with $u_i$ the utility (payoff) of player $i$. An example of this so-called normal form representation is given in Fig. 1. In that example, $a, b, c, \alpha, \gamma$ are all positive numbers. We assume here that triggering the detection mechanism (resp. launching the attack) is costly to the defender (resp. the attacker), whereas doing nothing has no cost. In general, the cost for the defender of missing an attack is much larger than the cost of running the detection scheme, i.e. $c \gg b$. We now see that as soon as $c > 2b$ there exists a unique Nash equilibrium in mixed strategies: we denote by $\pi_{\mathrm{def}}$ the probability of the defender triggering the detection scheme and by $\pi_{\mathrm{att}}$ the probability of the attacker launching the attack. To have a Nash equilibrium with $0 < \pi_{\mathrm{def}} < 1$, i.e. positive probability for both possible choices, the utilities are
Fig. 1 A two-player attacker–defender game in normal form: the defender chooses a line and the attacker a column
\[
u_{\mathrm{def}}(\pi_{\mathrm{def}}, \pi_{\mathrm{att}}) = a\,\pi_{\mathrm{def}}\pi_{\mathrm{att}} - b\,\pi_{\mathrm{def}}(1 - \pi_{\mathrm{att}}) - c\,\pi_{\mathrm{att}}(1 - \pi_{\mathrm{def}}),
\qquad
u_{\mathrm{att}}(\pi_{\mathrm{def}}, \pi_{\mathrm{att}}) = -\alpha\,\pi_{\mathrm{def}}\pi_{\mathrm{att}} + \gamma\,\pi_{\mathrm{att}}(1 - \pi_{\mathrm{def}}).
\]
Computing the conditions $\partial u_{\mathrm{def}}/\partial \pi_{\mathrm{def}} = 0$ and $\partial u_{\mathrm{att}}/\partial \pi_{\mathrm{att}} = 0$ gives, respectively, $\pi_{\mathrm{att}} = \frac{b}{a+b+c}$ and $\pi_{\mathrm{def}} = \frac{\gamma}{\alpha+\gamma}$. Note that another interesting view/interpretation of the relations obtained from $\partial u_{\mathrm{def}}/\partial \pi_{\mathrm{def}} = 0$ and $\partial u_{\mathrm{att}}/\partial \pi_{\mathrm{att}} = 0$ is that the defender should be indifferent between triggering the detection scheme or not, in terms of expected payoff (otherwise he would simply choose the best strategy). This also gives
\[
\pi_{\mathrm{att}}\, a - (1 - \pi_{\mathrm{att}})\, b = -\pi_{\mathrm{att}}\, c + (1 - \pi_{\mathrm{att}}) \times 0, \tag{2}
\]
where the left-hand (resp. right-hand) side of (2) is the defender's expected payoff if he triggers the detection scheme (resp. does nothing). Similarly, from the attacker's side we get
\[
-\pi_{\mathrm{def}}\, \alpha + (1 - \pi_{\mathrm{def}})\, \gamma = \pi_{\mathrm{def}} \times 0 + (1 - \pi_{\mathrm{def}}) \times 0.
\]
The (existing and unique) Nash equilibrium of the game, therefore, corresponds to
• the defender choosing to trigger the detection mechanism or do nothing with respective probabilities $\frac{\gamma}{\alpha+\gamma}$ and $\frac{\alpha}{\alpha+\gamma}$, and
• the attacker choosing to launch the attack or do nothing with respective probabilities $\frac{b}{a+b+c}$ and $\frac{a+c}{a+b+c}$.
Interestingly, in such games the mixed strategy choice of each player is made such that its opponent has no preference among its possible actions, so that it can also choose a mixed strategy. Alpcan and Başar then extend that kind of model by considering several types of attacks, where the corresponding defense type has to be chosen by the defender to detect the intrusion. Another interesting extension proposed in [2] consists in considering that, before choosing its defense strategy, the defender has an imperfect knowledge of the type of attack chosen by the attacker: the set of possible attacks is partitioned into sets and the defender knows to which set the attack (if any) belongs. Some payoffs for each player correspond to each situation in terms of attack presence, attack type, defense trigger and defense type. Jormakka and Mölsä [14] present some very similar but concrete situations of network security (to be more precise: information warfare) games, with specific numerical values, that also lead to simple strategy sets. The specific examples introduced allow us to exhibit some particular outcomes and phenomena. The so-called evildoer game has the same form as the basic model of [2] (see Fig. 1), i.e. it has two players – an attacker and a victim – with two possible choices each and no pure Nash equilibrium. The conclusions for that game also hold for another interpretation of the game, called the vandal game in [14]: here Jormakka and Mölsä do not consider a defense strategy, but only the fact that the victim will simply not use the service (say, a network), and thus not suffer from the attack. Then the same reasoning as in Fig. 1 is valid: since the attacker's objective is to maximize the victim's harm, it should not always attack (but only with some probability), because then the victim would simply avoid the service. The same kind of attacker–defender game is studied in [5]. The number of strategies for each player is larger than 2: several attack and countermeasure types are considered. Moreover, the actual payoffs corresponding to some given strategic choices are not deterministic, since attacks are supposed to succeed with a probability that depends on the activated countermeasures. Nevertheless, players are assumed risk-neutral, i.e. only sensitive to payoff expectations, so introducing success probabilities does not change the game type. The modelling effort made in [5] to quantify the payoffs for each player is worth mentioning:
• The attacker is assumed to be sensitive to a return-on-attack criterion that involves some financial equivalents of the value of a successful attack, the costs of building and launching it, and its success probability.
• The defender acts so as to maximize some return on investment that is calculated based on the monetary cost of the countermeasures, the value of the good to protect, the potential impact of an attack and the attack success probability.
All the games mentioned above have no pure Nash equilibrium, thus only mixed strategies lead to equilibria: players would then randomize their action choice according to a specific probability distribution, as we did for the example of Fig. 1. The game presented in [22, 23] has the same features as [2, 14], but introduces an interesting refinement: the defender might not know what kind of attacker he is facing. More precisely, the "attacker" can either be a regular network user that simply may not want to offer some service (the passive attack discussed above) or a badly intentioned actor that possibly launches active attacks. The defender has an a priori knowledge of the probability of the attacker being of one type or the other and may update those probability values based on some observations of the attacker's actions or messages, according to Bayes' rule. Again, the resulting Bayesian game applied to intrusion detection does not exhibit any pure Nash equilibrium.
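The mixed equilibrium of the detection game of Section 3.1 can be reproduced numerically. The sketch below encodes the payoff matrix implied by the expected-utility expressions $u_{\mathrm{def}}$ and $u_{\mathrm{att}}$ given above (the figure itself is not reproduced here, so this matrix is an inference from those expressions), uses illustrative numerical values for $a, b, c, \alpha, \gamma$, and checks that the closed-form probabilities make each player indifferent between its two actions.

```python
import numpy as np

# Payoff matrix inferred from u_def and u_att of Section 3.1:
# rows = defender (defend, idle), columns = attacker (attack, idle).
a, b, c, alpha, gamma = 5.0, 1.0, 8.0, 4.0, 3.0            # illustrative values
U_def = np.array([[ a,    -b],
                  [-c,    0.0]])
U_att = np.array([[-alpha, 0.0],
                  [ gamma, 0.0]])

# Closed-form mixed equilibrium derived above.
p_att = b / (a + b + c)            # attacker's probability of attacking
p_def = gamma / (alpha + gamma)    # defender's probability of defending

# Indifference check: each player's two actions yield the same expected payoff.
q_att = np.array([p_att, 1 - p_att])
q_def = np.array([p_def, 1 - p_def])
print("defender row payoffs   :", U_def @ q_att)   # both entries should coincide
print("attacker column payoffs:", q_def @ U_att)   # both entries should coincide
```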
3.2 Incentive-Based Attacker Modelling In their fundamental paper [16], Liu et al. introduce a systematic method to model attacker intent, objectives and strategies (AIOS), based on combining the incentives of an attacker as well as his cost into a single utility function. Moreover, they propose a game-theoretic formulation of AIOS in order to capture also the relationship to the objectives and strategies of the defender and allow for inferring AIOS automatically. To this end, Liu et al. [16] start from the basic assumptions that security attacks are usually intentional (i.e., planned), that both attacker and defender only possess incomplete information about their respective opponent and that the success of an attack is always relative to the protection level of the attacked system (and vice versa). The attacker intents can vary widely, but may be subsumed under the notion of an incentive which is assumed to be quantifiable. Typical examples are, to be quantified with the same units, the amount of profit earned, the amount of terror or damages caused, directly or due to no-show of users because of the threat. Together with certain constraints like attack cost or risk of detection, the resulting utility function describes the objective of the attacker and is supposed to be maximized by the attacker. Modelling attacker strategies is considered to be more sophisticated, as they have to account for a sequence of potentially very different actions which determine a series of battles between attacker and system. This may lead to extraordinarily complex strategy spaces, and also comparing different attack strategies is far from being trivial, as the efficiency in terms of system security degradation strongly depends on countermeasures performed by the system. The formalization of these AIOS models starts from perceiving the attacker and likewise the environment (comprising the non-malicious users) as peers of the
system under attack. The system is separated into a production-oriented service part and a security-related protection part and is assumed to actively take defense actions. Then, attacks are described as games between rational attackers and defenders whose Nash equilibria allow us to infer attacker strategies, whereas deriving the intention and objectives of the attacker is based on detecting strategic patterns which are matched against insights gained during a learning phase. Together with related accuracy and sensitivity analyses, this is supposed to significantly advance the risk assessment of security attacks. Eventually, this general approach leads to a more fine-grained taxonomy of AIOS models along two orthogonal dimensions, i.e. the correlation among attack actions and the accuracy of intrusion detection. Whereas a low correlation of attack actions suggests the application of Bayesian repeated games, high correlation leads to (potentially multi-stage) dynamic game models. The paper concludes with an instructive case study modelling attacker strategies for a distributed denial-of-service (DDOS) attack on a system which is countered by the popular pushback mechanism, i.e. by identifying and rate limiting those packet flows that cause the DDOS attack. To this end, user traffic is classified as either good (non-malicious), bad (malicious) or poor (non-malicious, but with the same properties as malicious traffic). Assuming a unique attacker together with multiple legitimate users, in the corresponding repeated Bayesian game (see Section 2.4) the system is uncertain about the type of each user and may only resort to a respective probability distribution. The action space of the attacker consists of several DDOS attacks, the action space of the legitimate user includes a variety of network applications and services and the action space of the system is determined by the potential defense postures of each router (specified by a large set of characteristic parameters like congestion checking time, target drop rate, rate-limit time, maximum session number). As far as the utility functions are concerned, the attacker’s utility depends on the impact of the attack on both the system and the legitimate users, whereas the utility of the non-malicious users boils down to the relative availability of the system. Finally, the utility function of the system is determined by the trade-off between the absolute impact of the DDOS attack and its relative impact on the system availability. As this game is way too complex for an analytic treatment, Liu et al. [16] present extensive simulations based on ns-2 where a total of 11 defense strategies are investigated. Legitimate user traffic is based on real-world Internet traces, whereas the attacker’s action space is determined by the number of “zombies” (i.e. hosts controlled by the attacker) as well as varying attack traffic patterns and total volumes. For the resulting 64 different possible attack strategies, the corresponding average payoffs for attacker, legitimate users and system are calculated and analysed. Whereas some resulting insights are widely consistent with the existing mainstream opinion, e.g. on the impact of total zombie number or drop rate preferences, also some surprising consequences may be drawn: For instance, neither rate nor pattern of the attacking traffic is of significant relevance for the attacker’s payoff function, but only the number of zombies and the properties of the traffic aggregate matters. 
Similarly, the simulation results allow a clear identification of the relevant defense
parameters of the system. Finally, a total of 42 different Nash equilibria have been calculated; they allow further inferences about the attacker's strategies, for instance with respect to traffic patterns or the optimal ratio between bad and poor traffic, and even lead to bounds on the attacking capacity of the attacker (i.e. the worst-case damage caused) and the assurance capacity of the defending system (i.e. the resilience against DDoS), which are of central relevance for any risk assessment purposes.
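Although the equilibria in [16] are obtained from large-scale ns-2 simulations, the underlying check over a finite payoff matrix is simple to state. The following sketch enumerates the pure-strategy Nash equilibria of a small two-player game; the payoff values are purely illustrative and are not taken from the case study.

```python
import numpy as np

def pure_nash_equilibria(A, B):
    """Return all pure-strategy Nash equilibria of a two-player game.

    A[i, j] is the row player's (attacker's) payoff and B[i, j] the
    column player's (defender's) payoff when row i meets column j.
    """
    equilibria = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            row_best = A[i, j] >= A[:, j].max()   # attacker cannot gain by deviating
            col_best = B[i, j] >= B[i, :].max()   # defender cannot gain by deviating
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

# Illustrative 3x3 payoffs: rows = attack strategies, columns = defense postures.
attacker = np.array([[ 2, -1,  0],
                     [ 3,  1, -2],
                     [ 1,  0,  1]])
defender = np.array([[-2,  1,  0],
                     [-3,  2,  1],
                     [-1,  0, -1]])

print(pure_nash_equilibria(attacker, defender))
```

In the actual case study the payoff entries would come from the simulated average payoffs, and mixed equilibria would additionally have to be considered.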
3.3 Passive Attacks in Collaborative Networks: Enforcing Cooperation on Defenders

As previously mentioned, a passive attack is the action of a network participant refusing to provide some service. In peer-to-peer file sharing networks, a passive attack would consist of offering no files to the community. Likewise, in wireless ad hoc networks, a node refusing to transfer packets is considered as making a passive attack. It is true that those passive, free-riding attacks are not motivated by a desire to harm a machine, a network or a system, but rather simply by user selfishness. However, it is also reasonable to consider those noncooperative behaviours as attacks, since the system no longer works if too many participants do not contribute to it. More directly, a node which does not participate in the collective security by refusing to provide useful information can be considered as a passive attacker. In that context, the objective is to incentivize players to contribute to the service, either through sanctions or through rewards. Some appropriate mechanism thus has to be defined, such that rational and selfish players are better off participating in the service provision. This implies, for example, building reputation scores based on past behaviour and using those scores to possibly exclude misbehaving nodes from the system [19]. The goal is to prevent passive DoS attacks that consist of simply not participating in security (placing this in our general context rather than specifically in the MANETs of [19]). If non-participating nodes are isolated from the network, a reputation mechanism can, at a low cost, enforce participation. Formally, assume we have $N$ nodes (players) and that the utility of node $i$ ($1 \le i \le N$) depends on both its payoff $y_i$ and its relative share $\sigma_i = y_i / \sum_{j=1}^{N} y_j$, and is given by $\alpha_i u(y_i) + \beta_i r(\sigma_i)$, with $u()$ differentiable, strictly increasing and concave, and $r()$ differentiable, concave and maximized at $1/N$. The weights $\alpha_i, \beta_i \ge 0$ characterize node $i$. If $k$ nodes cooperate, this induces a (network) benefit $B(k)$ (increasing and concave) and a cost $C(k)$ (such that $kC(k)$ is increasing) for implementing the procedures; thus, a payoff $y_k = B(k) - C(k)$. Reputation is included in the functions $B()$ and $C()$. Conditions for a Nash equilibrium to occur can be derived. Under proper conditions on $B()$ and $C()$, it can be ensured that at least half of the nodes will cooperate.
Those attacks are also addressed in [1], where the proposed mechanism is evaluated using a repeated game model.
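To make the incentive condition above concrete, the following toy sketch evaluates when cooperating beats free-riding for the utility form $\alpha_i u(y_i) + \beta_i r(\sigma_i)$. The functional forms of $B()$, $C()$, $u()$ and $r()$, the weights, and the assumption that an isolated free-rider ends up with zero payoff are all invented for illustration and are not taken from [19] or [1].

```python
import numpy as np

# Illustrative functional forms (assumptions of this sketch): B(k) increasing and
# concave, C(k) such that k*C(k) is increasing, u() increasing and concave,
# r() concave and maximized at 1/N.
N = 20
B = lambda k: 10.0 * np.log1p(k)
C = lambda k: 2.0 + 0.1 * k
u = lambda y: np.sqrt(max(y, 0.0))
r = lambda s: -(s - 1.0 / N) ** 2
alpha, beta = 1.0, 5.0

def utility_cooperate(k):
    """Utility of a cooperating node when k nodes (including itself) cooperate."""
    y = B(k) - C(k)          # payoff y_k = B(k) - C(k)
    sigma = 1.0 / k          # equal relative share among the k cooperators
    return alpha * u(y) + beta * r(sigma)

def utility_defect():
    """Simplifying assumption: a detected free-rider is isolated by the reputation
    mechanism and ends up with zero payoff and zero relative share."""
    return alpha * u(0.0) + beta * r(0.0)

# Smallest number of cooperators for which cooperating beats free-riding,
# i.e., from which cooperation is self-enforcing in this toy instance.
for k in range(1, N + 1):
    if utility_cooperate(k) >= utility_defect():
        print("cooperation becomes a best response from k =", k)
        break
```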
3.4 Routing Problems or "Cat-and-Mouse" Games

We now describe some intrusion detection games played on a physical network, where strategies involve some routing decisions. Since the paradigm and modelling are quite different, we present them in a separate section devoted to "security routing games". In this section, we consider games that are played over the links (or node interfaces) of a network. The strategy sets are either a single link in the network (chosen to carry out an attack, or to place an attack detection device) or a whole routing strategy (choice of flow or attack spreading among different available paths). In those games, the attacker can try to intercept normal traffic (he is then the cat), or to reach a destination while avoiding detection (he is then the mouse). Likewise, according to the considered service, the network manager chooses to place specific detection mechanisms to protect important links and/or provide a higher security in general. Those interactions can be modelled as two-player zero-sum games, i.e. games where the gain of one player is necessarily the loss of the other one: if player 1 gets $U_1$ then $U_2 = -U_1$. The (possibly mixed) Nash equilibria for that game are such that the corresponding utilities $(U_1^N, U_2^N)$ verify
$$U_1^N = -U_2^N = \max_{s_1}\min_{s_2} U_1(s_1, s_2) = \min_{s_2}\max_{s_1} U_1(s_1, s_2),$$
where $s_i$, $i = 1, 2$, is a mixed strategy for player $i$, i.e. a probability distribution over the strategy set of player $i$. Kodialam and Lakshman [15] consider such a game between an attacker trying not to be detected and an active defender. The attacker is located at some point $a$ of the network and his target location is denoted by $t$. The goal of the attacker is to select a path to send his malicious packet so as to minimize the detection probability. To do so, he might choose some highly loaded links in order to become less detectable. (The background traffic on each link $e$ is denoted by $f_e$.) On the other hand, the defender's objective is to select which links to scan so as to maximize that detection probability (subject to a constraint $B$ on the total number of scanned bits per time unit). The authors then prove that the Nash equilibrium value of the detection probability is $B/M(f)$, where $M(f)$ is the maximum possible flow from $a$ to $t$ on a network in which each link $e$ has capacity $f_e$. The player strategies at Nash equilibrium are also derived:
• If $m_i$ denotes the flow on the $i$th path from $a$ to $t$ for the maximum flow mentioned above, the attacker chooses to use that path with probability $m_i/M(f)$.
• The defender selects a minimum cut of that maximum flow, which is therefore made of links $e$ where the maximum flow equals the capacity $f_e$. The defender then chooses to scan each of those links $e$ with probability $B f_e/M(f)$.
The model and results are also extended to the case where the attacker can choose among several points from which to originate the packet, and to the case of several potential targets. A model with the roles somewhat reversed is also of interest. In [6], Bohacek et al. consider a user wishing to send some flow from one point to another, through a network with vulnerable links: if the attacker decides to attack a link (e.g. for eavesdropping) used by a user packet, then there is a probability $p$ that the packet gets intercepted. The strategies are thus as follows:
• The attacker has to spread his scanning effort among the links.
• The defender has to choose routes for his flow. He actually uses stochastic routing, i.e. determines a distribution over the next-hop possibilities for each node (avoiding cycle possibilities).
Two different types of games are studied:
1. Online games: The attacker can scan one physical interface at each node and therefore chooses a probability distribution over the interfaces, for each node. For each link, the transfer delay $\tau$ is augmented by $T$ if the packet is intercepted. The objective of the attacker is to maximize the total expected transfer time. The authors then express the Nash equilibrium as a saddle point and show that the corresponding equilibrium strategies can be computed in a distributed way.
2. Off-line games: Now the attacker only chooses one link to perform his attack. A strategy for the attacker is therefore a probability distribution over all links. The attacker's objective is to maximize the probability of intercepting user packets, which is to be minimized by the defender (zero-sum game). To include path lengths in the players' objectives, the authors add a penalty related to the path length. More precisely, they define the variable $\chi_\varepsilon$ as being 0 if the packet does not get intercepted, and $(1+\varepsilon)^{t-1}$ if it is intercepted during the $t$th hop. The attacker's (resp. defender's) objective is to maximize (resp. minimize) the expected value of $\chi_\varepsilon$. The authors show how the saddle point can be computed, using the solution of a flow maximization problem in a network where the capacity constraint on each link is given by its interception probability $p$. This solution is interestingly similar to the one obtained in [15] with the roles reversed. Note that the $\varepsilon$ parameter tunes the system according to the user's preference for short paths with respect to security. In particular, if $\varepsilon$ is small then the defender will fully exploit the path diversity in the network by spreading his flow along all paths, whereas he concentrates on shortest paths when $\varepsilon$ increases.
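Both equilibria above rest on a maximum-flow/minimum-cut computation. The following sketch reproduces the Kodialam–Lakshman characterization on a toy topology using networkx; the graph, the background traffic values $f_e$ (used as capacities) and the scanning budget $B$ are invented for illustration.

```python
import networkx as nx

# Toy instance of the game in [15]: the background traffic f_e on each link is used
# as its capacity, and M(f) is the maximum flow from the attacker location a to the
# target t. The attacker's equilibrium strategy spreads traffic over a path
# decomposition of this maximum flow with probabilities m_i / M(f).
G = nx.DiGraph()
for x, y, f_e in [("a", "u", 4.0), ("a", "v", 3.0), ("u", "t", 2.0),
                  ("u", "v", 2.0), ("v", "t", 5.0)]:
    G.add_edge(x, y, capacity=f_e)

B = 1.0                                              # scanning budget per time unit
M, _ = nx.maximum_flow(G, "a", "t")                  # M(f)
print("equilibrium detection probability:", B / M)   # B / M(f)

# Defender: scan each link e of a minimum cut with probability B * f_e / M(f).
_, (S, T) = nx.minimum_cut(G, "a", "t")
for x, y in G.edges:
    if x in S and y in T:
        f_e = G[x][y]["capacity"]
        print(f"scan link ({x},{y}) with probability {B * f_e / M:.3f}")
```

On this toy instance the scanning probabilities over the chosen cut sum to $B$, and the equilibrium detection probability equals $B/M(f)$ as stated above.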
3.5 Worm Propagation Games

Another important domain where it is believed that game theory could be applied is the case of worm propagation [11]. Network worms are autonomous intrusion agents that have caused tremendous financial losses to users due to their propagation through the Internet. The first major worm was the Morris worm in 1988, which crippled a substantial proportion of the Internet [26]. As another example, the Slammer SQL worm infected over 90% of the vulnerable hosts within just 10 min [20]. Security managers create patches, but in general those need to be developed manually and require some time: first to identify the problem, then to check that the patch does not have side effects, and finally to distribute it. For this reason, worm containment procedures are being developed. We focus here on scanning worms, for which an infected node scans the address space at a given rate and infects the nodes it manages to locate. Indeed, to propagate, a worm tries out many IP addresses, sending itself to each in the hope of infecting the corresponding host. Since those IP addresses are somewhat randomly chosen, many of the ones tried do not respond. The approach for representing and analysing worm propagation is characterized by fluid models, which can adequately represent a large population of vulnerable hosts. For a given population size $N$, assuming that once infected, a node remains infected forever, the evolution of the number of infected nodes at time $t$, $I_t$, follows in its simplest form (this equation being potentially different depending on the kind of worm) the (epidemiological) differential equation
$$\frac{dI_t}{dt} = \beta I_t (N - I_t).$$
In this equation, $\beta$ is a parameter representing the rate of infection of vulnerable nodes by a given infected node. The rate depends not only on the number of infected nodes (which send the worm), but also on the number of remaining nodes to be infected (which become fewer and hence less likely to be reached). Some variations of this equation will, for instance, describe whether or not worms are sent to uniformly chosen IP addresses or "closely chosen" ones. This kind of equation is typical of a worm's propagation when the effects of human counteractions and network congestion are negligible. We then experience a slow-start phase, due to few nodes sending the worm, and a slow-finish phase, because at the end the remaining uninfected nodes are very few. In the slow-start phase (the one of interest for detectors, since we are interested in finding times such that $I_t/N$ reaches, say, 5%), $\frac{dI_t}{dt} \approx \beta N I_t$, whose solution is $I_t = I_0 e^{\beta N t}$. Countermeasures affect the rate at which nodes are infected. This is represented, for instance, in [11] by a reduction factor $\theta_t$ which affects the scanning rate of a host that has been infected for $t$ time units. In the slow-start phase the equation now becomes
$$\frac{dI_t}{dt} = \beta N \left( I_0 \theta_t + \int_0^t \frac{dI_s}{ds}\, \theta_{t-s}\, ds \right),$$
whose solution satisfies $I_t = I_0 + \int_0^t I_{t-s}\, \beta N \theta_s\, ds$. The epidemic will spread or die out exponentially fast depending on whether $\beta N \int_0^\infty \theta_s\, ds$ is larger or smaller than 1. The interplay between worm strategies and detection/containment techniques can then be described as a game, the worm trying to infect the network as much as possible, while the network tries to slow it down. An important characteristic is that we are in the presence of a Stackelberg game with the worm as leader, playing its strategy first, to which the detection and containment technique responds (the follower). This situation makes the worm powerful in the sense that, taking into account the optimal strategies of defenders, it can decide the strategy that optimizes its own interest, i.e. the infection rate. The typical goal of such an analysis is to prevent global spread before patches are developed and distributed (i.e. within a given fixed number of days). In [11], the strategy of the detector is to choose the best quarantining strategy, while the worm chooses a scanning rate. Quarantining means that after some time $\tau$, an infected host's connection attempts are blocked. We then have $\theta_t = P[\tau \ge t]$. There are also throttling mechanisms reducing the rate at which a node makes new connections when considered suspicious. For Williamson's throttle, connection requests are processed at rate $c$ connections per second. If the rate of generating non-wormy connection attempts is $w$, the slow-down factor is $\theta_t = c/(\beta + w)$. The payoff is the speed of spread (the growth exponent of the epidemic), which is to be maximized by the worm and minimized by the detector. The number of unsuccessful scans can therefore be used to detect worms, as was suggested by Ganesh et al. [11], who study the game played between the worm designer setting the scanning rate and the worm detector setting the detection threshold for considering a host as infected. Detection is performed through a CUSUM (cumulative sum) test, minimizing the time between infection and detection for a given false-positive rate. It declares a node infected at time $R$ if the log-likelihood ratio of being infected to being uninfected over some interval of length $k$ in the past exceeds a threshold $c$:
$$R = \inf\Big\{ n : \max_{1 \le k \le n} \sum_{i=k}^{n} \ln\big( f_1(t_i)/f_0(t_i) \big) \ge c \Big\},$$
where the $t_i$ are the inter-failure times and $f_0$ (resp. $f_1$) is the density, assumed here exponential, under normal (resp. infected) conditions. A detector can be designed to restrict the growth rate to no more than a value $\nu$, while simultaneously ensuring that the false alarm probability over a specified time window $T$ does not exceed a specified threshold. Interestingly, in [11] the optimal detector is such that the worm growth exponent is insensitive to the scanning rate. As a consequence, the leader, the worm, does not actually have a significant influence. This paper rightfully stresses the importance of game theory for worm containment and of pursuing work in that direction, where the impact should be important. Several directions can be explored to study this kind of game, extending the set of available strategies of the attacker and/or the defender. For example, one could imagine that a properly chosen proportion of the IP addresses used by the worm are chosen from the host's recently contacted ones, in order to be detected later while still replicating at the same speed. Some other, more complicated strategies, involving scanning rates that
change over time, could also be considered. It is important to note that scanning worms are not the only kind of worm; there exist many other types. For instance, routing worms use BGP routing tables to scan only the Internet-routable address space, which allows them to propagate three times faster than a traditional worm and enables selective attacks. In a similar way to what was done in [11] for scanning worms, it is clearly of interest to investigate the games that can, or more exactly need to, be introduced between worm mechanisms and detection procedures for each specific type of worm, and to design more efficient reaction and defense strategies. Note that another potential level of game has been introduced in [27]. Instead of looking at the game between a given worm and security tools, we can also look, at a larger timescale, at the race between worm writers and security managers. Indeed, when a worm is circumvented, a new one generally appears, requiring a new fight. Such a game is therefore one of survival, where the goal is to stay in the game. In [27], a parallel is made with biological nature, and evolutionary game theory is described as the appropriate tool.
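Returning to the epidemic dynamics at the start of this section, the following sketch integrates the basic equation $dI_t/dt = \beta I_t(N - I_t)$ numerically and adds a crude quarantining countermeasure in which a host stops scanning $\tau$ time units after infection. All parameter values are illustrative and are not taken from [11] or [20].

```python
import numpy as np

# Euler integration of the epidemic equation dI/dt = beta * I * (N - I), plus a
# quarantined variant where only hosts infected less than tau time units ago scan
# (i.e. theta_t = 1 for t < tau and 0 afterwards). Parameters are illustrative.
N, beta, I0 = 100_000, 2e-6, 10
dt, T = 0.01, 60.0
steps = int(T / dt)

def simulate(tau=None):
    infections = np.zeros(steps)          # newly infected hosts per step
    infections[0] = I0
    I = np.zeros(steps)                   # cumulative infected hosts
    I[0] = I0
    for n in range(1, steps):
        if tau is None:
            active = I[n - 1]                                  # everyone keeps scanning
        else:
            active = infections[max(0, n - int(tau / dt)):n].sum()
        new = beta * active * (N - I[n - 1]) * dt
        infections[n] = new
        I[n] = I[n - 1] + new
    return I

for tau in (None, 7.0, 3.0):
    I = simulate(tau)
    label = "no quarantine" if tau is None else f"quarantine after {tau}"
    print(f"{label:>20}: infected fraction at t={T} -> {I[-1] / N:.4f}")
```

With these invented values the uncontained worm saturates the population, while a sufficiently aggressive quarantine keeps the infected fraction negligible, in line with the threshold condition above.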
3.6 Security/Confidentiality Issues in Interdomain and Ad Hoc Networks

Another issue, brought up by Chandramouli [7], is the inter-domain and ad hoc network case. A user/domain expects its traffic to arrive at its destination with an appropriate level of security/confidentiality. But this traffic often needs to be forwarded by other providers/nodes that could behave maliciously. How can one create incentives for proper behaviour in this case? This kind of problem has been extensively studied in the literature in terms of economic incentives to indeed forward traffic (see, for instance, [4, 9, 13]), but little of that work relates to security/confidentiality incentives. It is suggested in [7], though not yet worked out, to play a repeated game such that, if a node defects in providing the expected security/confidentiality, its own traffic is also insecurely forwarded as a sanction, for at least a fixed amount of time. If the sanction is long enough, this should prevent the nodes from misbehaving.
4 Economics of Security

Besides the direct game-theoretic modelling of interactions between malicious attacks and protection strategies, it has to be emphasized that security brings new economic issues to network service providers, because of its growing importance for companies and users. This has to be analysed mathematically and, again, can be treated by game theory, not only to represent the business interactions between a provider and its customers, but also to represent the competition among providers for customers who have to choose between different offers. The growth of a network such as the Internet has had a positive externality from a business point of view, but also has a negative externality where security is concerned. As pointed out in
[18], "businesses have a strong incentive to seek profit from users (consumers) while cooperating – and competing – in the provision of privacy and security." This common-sense statement nevertheless has to be verified. Security can be provided at the network layer (with protocols such as Secure Sockets Layer (SSL)) or at the application layer, but can be limited by government public policy [18]. Note that security and economics also bring up the problem of secure payment, which will not be dealt with here [24].
4.1 Model Based on Risk Percentage

We could, for instance, assume that a provider offers different initial security levels (or classes) $\ell \in \{1, \ldots, L\}$, to which an intrusion risk $r_\ell$ and a price $p_\ell$ are associated, with $r_{\ell_1} < r_{\ell_2}$ and $p_{\ell_1} > p_{\ell_2}$ for $\ell_1 < \ell_2$. Security levels may correspond to various options concerning the availability of hardware or software security. Demand splits among the different classes, but an important characteristic of security attacks is that the larger the number of customers in a class, the more likely new attacks become (according to Metcalfe's law or a power law), therefore decreasing the actual security level. Assuming non-atomic users, demand can then be characterized by a so-called Wardrop equilibrium, i.e. a combination of price and actual security risk which is the same for all classes having positive demand (otherwise users would have an interest in switching), while some classes have a null demand because they are too expensive for the proposed level. A typical situation for this kind of model is virus scanning software, where different products can have different efficiencies but are also sold at different prices. If many users are known to use a particular piece of software, then attacks will basically concentrate on this population in order to reach more people. Two situations can be considered: first, the case where all the levels are managed by a single provider (a monopoly), which then tries to maximize its revenue by playing with prices, and second, the case where each security level is handled by an independent provider, and providers compete for customers at a higher level by playing on prices (an oligopoly). Typical game-theoretic analyses of security management offers can be built in this way.
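A minimal numerical sketch of such a Wardrop equilibrium follows. The prices, base risks, the linear dependence of the effective risk on class demand, and the users' monetary valuation of risk are all assumptions made for illustration; the simple rebalancing loop is just one way to reach the equal-cost split.

```python
import numpy as np

# Toy Wardrop-equilibrium computation for the security-class model above.
# Classes have base risk r0 and price p; the *effective* risk of a class is assumed
# to grow linearly with the demand it attracts (an assumption of this sketch).
# Non-atomic users split total demand D so that the generalized cost
# p + gamma * effective_risk is equal on all classes with positive demand.
p     = np.array([5.0, 3.0, 1.0])     # prices: safer classes cost more
r0    = np.array([0.1, 0.3, 0.6])     # base intrusion risks
gamma = 10.0                          # users' monetary valuation of risk
D     = 4.0                           # total demand

def cost(x):
    return p + gamma * r0 * (1.0 + x)   # generalized cost per class given split x

x = np.full(3, D / 3)                    # start from an even split
for _ in range(5000):                    # simple replicator-style rebalancing
    c = cost(x)
    avg = np.dot(x, c) / D
    x = np.maximum(x + 0.01 * x * (avg - c), 0.0)
    x *= D / x.sum()                     # keep total demand constant

print("equilibrium split:", np.round(x, 2))
print("costs of used classes:", np.round(cost(x), 2))
```

With these invented values the demand concentrates on the safest class, and the generalized costs of all classes with positive demand equalize, as required by the equilibrium definition.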
4.2 Coalitions

In the case of competing security service providers, the question of cooperation is probably more relevant than in many other fields. Indeed, due to the interactions among users, low security provided by a competitor also induces a risk for a provider's own customers, and therefore a lower actual security level. Coalition formation can thus become efficient for providers, in terms of reputation and revenue. It is therefore very interesting to model and investigate the incentives for forming such coalitions, and whether or not full cooperation is the best solution for all providers. Such studies
would then involve tools from cooperative game theory to study the sustainability of coalitions and the effect of revenue sharing on the sustainable coalitions.
5 Conclusions

Whereas it has become clear that the modelling and analysis of telecommunication network security through non-cooperative game theory is of paramount importance, this approach is nevertheless still in its infancy and has indeed attracted interest only recently. As one of the key issues, we have identified the understanding of the interactions between malicious users (attackers) and the end users or network manager who expect a secure connection. Different such types of interactions have been introduced and discussed in this chapter, dealing, for instance, with intrusion detection, denial-of-service attacks or worm propagation, to mention just a few examples. In any case, the ultimate goal is to understand the equilibrium situation and therefore to try to design schemes or strategies to drive this equilibrium towards the most secure situation. This can be done by the introduction of proper incentives, for instance. We have also highlighted that security additionally brings economic issues for the providers, in their relationships with users when proposing the most profitable contracts, as well as in the competition between providers; those relationships can again be analysed within the framework of non-cooperative game theory. Note that most of the games presented here and in the literature are of a rather basic form, mainly due to the novelty of the issue. We have therefore pointed out that a lot of work remains to be done to represent practical scenarios as closely as possible. On the other hand, the constant evolution of networking technologies requires adapting the presented issues and raises new challenges to be tackled. Thus, summarizing what has been said so far, we consider this game-theoretic perspective on various security issues to be of significant interest for the research community as well as of key practical importance for future industrial applications, and we sincerely hope that the presented survey will further stimulate research in this emerging field.
Acknowledgments The authors acknowledge the support of European initiative COST IS0605, Econ@tel. Part of this work has been supported by the Austrian government and the city of Vienna in the framework of the COMET competence centre program and by the French research agency through the FLUOR project.
References
1. Agah A, Das SK (2007) Preventing DoS attacks in wireless sensor networks: A repeated game theory approach. Int J Netw Secur 5(2):145–153 2. Alpcan T, Başar T (2003) A game theoretic approach to decision and analysis in network intrusion detection. In: Proceedings of the 42nd Conference on Decision and Control, Maui, HI 3. Altman E, Boulogne T, El-Azouzi R, Jiménez T, Wynter L (2006) A survey on networking games in telecommunications. Comput Oper Res 33(2)
4. Anderegg L, Eidenbenz S (2003) Ad hoc-VCG: A truthful and cost-efficient routing protocol for mobile ad hoc networks with selfish agents. In: Proceedings of the 9th Annual International Conference on Mobile Computing and Networking (MobiCom 2003), San Diego, CA, USA, pp 245–259 5. Bistarelli S, Dall’Aglio M, Peretti P (2006) Strategic games on defense trees. In: Proceedings of the 4th International Workshop on Formal Aspects in Security and Trust (FAST’06), LNCS 4691, Hamilton, Ontario, Canada, pp 1–15 6. Bohacek N, Hespanha JP, Lee J, Lim C, Obraczka K (2007) Game theoretic stochastic routing for fault tolerance and security in computer networks. IEEE Trans Parallel Distrib Syst 18(9):1227–1240 7. Chandramouli R (2007) Economics of security: Research challenges. In: Proceedings of the 16th International Conference on Computer Communications and Networks (ICCCN’2007), Hawaii, USA 8. Courcoubetis C, Weber R (2003) Pricing communication networks—economics, technology and modelling. Wiley, Chichester 9. Feigenbaum J, Papadimitriou C, Sami R, Shenker S (2002) A BGP-based mechanism for lowest-cost routing. In: Proceedings of the 21st ACM Symposium on Principles of Distributed Computing, Monterey, California, USA, pp 173–182 10. Fudenberg D, Tirole J (1991) Game theory. MIT, Cambridge, MA 11. Ganesh A, Gunawardena D, Jey P, Massoulié L, Scott J (2006) Efficient quarantining of scanning worms: Optimal detection and co-ordination. In: Proceedings of IEEE INFOCOM 2006, Barcelona, Spain 12. Gordon LA, Loeb MP (2002) The economics of information security investment. ACM Trans Inf Syst Secur 5(4):438–457 13. Hershberger J, Suri S (2001) Vickrey prices and shortest paths: What is an edge worth? In: Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science, Las Vegas, Nevada, USA, pp 252–259 14. Jormakka J, Mölsä J (2005) Modelling information warfare as a game. J Inf Warf 4(2):12–25 15. Kodialam M, Lakshman TV (2003) Detecting network intrusions via sampling: A game theoretic approach. In: Proceedings of IEEE INFOCOM, San Francisco, CA, USA 16. Liu P, Zang W, Yu M (2005) Incentive-based modeling and inference of attacker intent, objectives, and strategies. ACM Trans Inf Syst Secur 8(1):78–118. doi: http://doi.acm.org/10.1145/1053283.1053288 17. Lye KW, Wing JM (2005) Game strategies in network security. Int J Netw Secur 4(1–2):71–86 18. McKnight L, Solomon R, Reagle J, Carver D, Johnson C, Gerovac B, Gingold D (1997) Information security for internet commerce. In: McKnight LW, Bailey JP (eds) Internet economics. MIT, Cambridge, MA, pp 435–452 19. Michiardi P, Molva R (2002) Game theoretic analysis of security in mobile ad hoc networks. Tech. Rep. RR-02–070, Institut Eurécom 20. Moore D, Paxson V, Savage S, Shannon C, Staniford S, Weaver N (2003) Inside the slammer worm. IEEE Secur Priv 1(4):33–39 21. Osborne MJ, Rubinstein A (1994) A course in game theory. MIT, Cambridge, MA 22. Patcha A, Park JM (2004) A game theoretic approach to modeling intrusion detection in mobile ad hoc networks. In: Proceedings of IEEE Workshop on Information Assurance and Security, West Point, NY, USA, pp 30–34 23. Patcha A, Park JM (2006) A game theoretic formulation for intrusion detection in mobile ad hoc networks. Int J Netw Secur 2(2):131–137 24. Racz P, Stiller B (2006) A service model and architecture in support of ip service accounting. 
In: Management of integrated end-to-end communications and services, Proceedings of the 10th IEEE/IFIP Network Operations and Management Symposium, NOMS 2006, Vancouver, Canada, April 3–7, 2006. IEEE, pp 1–12 25. Sallhammar K, Helvik BE, Knapskog SJ (2006) A game-theoretic approach to stochastic security and dependability evaluation. In: Proceedings of the 2nd IEEE Intl Symposium on Dependable, Autonomic and Secure Computing (DASC), Indianapolis, IN, USA
26. Seeley D (1989) A tour of the worm. In: Proceedings of the Winter USENIX Conference, San Diego, California, USA 27. Somayaji A (2004) How to win an evolutionary arms race. IEEE Secur Priv 2(6):70–72 28. Theodorakopoulos G, Baras JS (2008) Game theoretic modeling of malicious users in collaborative networks. IEEE J Select Areas Commun 26(7):1317–1327 29. Wang H, Liang Y, Liu X (2008) Stochastic game theoretic method of quantification for network situational awareness. In: Proceedings of the International Conference on Internet Computing in Science and Engineering (ICICSE), Harbin, Heilongjiang, China, pp 312–316
Computationally Supported Quantitative Risk Management for Information Systems

Denis Trček
1 Introduction

Security of information systems (IS) has become a well-established discipline, especially during the last two decades. Despite this, the area is still in its infancy as regards the measurement and quantitative assessment of its phenomena. Risk management is at the core of IS security. Although the general field of risk management research and its application has a long and proven record, its application to contemporary IS has come up against many obstacles. The reasons are manifold. First, current networked information systems constitute some of the most complex systems ever created by humans, and they have penetrated all areas of our lives. According to the Internet Systems Consortium, the number of hosts on the Internet in July 2006 was 439,286,364 [17]. In addition, a typical host that uses, e.g., the MS Windows operating system is running a few thousand COM (component object model) elements. The number of possible local interactions is, therefore, already extremely high, not to mention possible interactions on an Internet scale. Further, a growing number of these components are becoming mobile, which means that their place of origin differs from the place of execution. Above all, at the core of our problem area is the fact that the technology is changing so rapidly that decades-long historical statistical data are almost impossible to obtain. And these data form the basis for traditional risk management approaches. Further, the security area lacks well-defined metrics that would address business questions such as "How safe is my IS?" and "How much safer is my IS than my competitor's IS?" This fact further complicates the situation. As a result, it is general practice to base risk management in IS on qualitative methodologies. However, quantitative methodologies should remain our priority. The reason is straightforward – risk management is about financial investment in security safeguards
taking place under uncertainty. And financial assets are tangible resources; therefore, their spending should be tangibly justified as far as possible. This chapter presents new research steps and application approaches focused on the development of computational tools for the quantitative support of risk management in IS. Initially focusing on metrics issues, the chapter continues with the development of a risk management model that is intended to support decision making under uncertainty. Before describing the details of the model, some basics of system dynamics are given, because its concepts have been used to develop the model. Next, a complete technological architecture is presented that enables reactive, active, and proactive risk management in contemporary IS. Before the discussion section there is a section that demonstrates how the presented model can serve as the basis for a simulation environment, i.e., a "business flight simulator" for risk management in IS. The chapter ends with the conclusion, followed by an extensive list of references. Finally, the complete program listing of the simulation model is appended.
2 The Basic Definitions

Before going into details, the basic definitions of terms and concepts that will be used in the rest of this chapter should be given. The following normative references that are most relevant for risk management in IS will be taken into account: ISO 7498-2 [11], ISO 27005 [16], NIST SP 800-39 [22], and the related HIPAA standard [10]. Security was first addressed in the ISO 7498-2 [11] standard that was, and still is, very important for IS security, so its definitions will be given first. According to ISO 7498-2, security means the minimization of the vulnerability of assets and resources. Vulnerability means any weakness(es) that can be exploited to cause damage to computing systems themselves or to data that reside on these systems. To provide protection of assets and resources, security mechanisms are deployed that can be of a cryptographic or non-cryptographic nature and include symmetric and asymmetric cryptographic algorithms, strong one-way hash functions, random number generators, traffic padding, and physical protection. Security services enable the following:
• Authentication that provides the means to assure an entity that the peer communicating entity is the one claimed.
• Confidentiality of data that provides the means to prevent unauthorized disclosure of data.
• Integrity that ensures that any modification, insertion, or deletion of data is detected.
• Non-repudiation that ensures that the origin or delivery of a message cannot be denied.
• Access control that ensures the authorized use of resources.
• Auditing that is based on logging of suspicious activities and that provides means for analysis of successful breaches and evidence in case of legal disputes.
These definitions of security are good and they have served their purpose well for a decade, and still do. In the meantime, however, certain new elements have become important, one notable example being denial of service (DoS) attacks. Further, the focus has shifted from technical and technological issues to human factors and organizational issues. This situation had therefore to be reflected appropriately in the definitions of security. New definitions have emerged, most notably in the BS 7799 [2] standard and its successor ISO 27002 [14]; in parallel, these issues were similarly addressed in [13]. In these standards security is defined as the preservation of confidentiality, integrity, and availability of services. Integrity is here meant to include also a complete recovery, while availability means that authorized users have access to information when necessary. We can now focus on risk management, i.e., coordinated activities to direct and control an organization in response to risk in the IS area. Assets and threats are at the center of risk management. Because of the various weaknesses of assets, threats can successfully exploit these vulnerabilities by interacting with assets. The longer this interaction (i.e., exposure to threats) the more likely is the exploit that causes damage to the asset. And the expected damage is what constitutes a risk. To reduce or avoid a risk, safeguards are implemented that can be technological, organizational, and also legal. By their application, singly or together, the exposure time to threats, the vulnerability of a resource, or the threat or its probability can be reduced. The selection of safeguards is based on the total process of risk analysis and risk assessment that has to be consonant with an organization’s security policy. This policy gives the overall direction for risk management, as formally expressed by management. The basic variables in risk management and their relationships are given in Fig. 1. It follows clearly that a risk emerges as a product of threat probability, asset vulnerability, exposure period, and, of course, asset value. This risk leads to appropriate counter-measures (implementation of safeguards) that diminish asset vulnerability, threat probability, and/or time of exposure. The exact value of safeguards is directed by the organization’s attitude to risks, as implemented in a security policy. The reader should note that in most cases risks are not completely eliminated, and some residual risk remains.
3 Current Risk Management Practices

Risk management in IS covers the procedures for the identification, control, and reduction (or eventual elimination) of those events that could endanger the information resources. How much risk the organization is willing to take depends on its strategic orientation toward risk, which is detailed in its security policy. The most basic approach to risk management starts with identifying a set of assets $A = \{a_1, a_2, \ldots, a_n\}$ and a set of threats $T = \{t_1, t_2, \ldots, t_m\}$. Next, a Cartesian product is formed: $A \times T = \{(a_1, t_1), (a_2, t_1), \ldots, (a_n, t_m)\}$.
Fig. 1 The basic risk management elements and their relationships
The value of each asset $v(a_n)$ is determined and, for each threat, the probability $E_{a_n}(t_m)$ of interaction with this asset during a certain period is assessed. Interaction as such is not harmful – the problem is the vulnerability $V_{t_m}(a_n)$ of an asset, where $V_{t_m}(a_n) \in [0, 1]$. Taking this into account, an appropriate risk estimate is obtained as $R(a_n, t_m) = v(a_n) \cdot E_{a_n}(t_m) \cdot V_{t_m}(a_n)$. The real problem with this procedure is obtaining (exact) quantitative values for the above variables. As noted earlier, one basic issue is that the technological landscape changes so rapidly that decades-old statistical aggregates are not available. In addition, a significant proportion of an organization's assets are intangible assets, such as information and goodwill, so that how to identify and value all the data, ranging from system logs to databases, remains a difficult issue [6]. To make things worse, the most important assets are employees. Due to the specifics of these kinds of assets their valuation is very hard (none of them are recorded and valued in balance sheets, for example). And finally, whatever the resources, giving the exact value of their vulnerability (in order to consequently derive the likelihood of risk) is beyond our ability due to the number of these resources in the IS. The most feasible possibility is an approach at the level of aggregates, which is what will be considered next. The above facts lead, not surprisingly, to the view that the logical – and unavoidable – alternative to quantitative IS risk management is a qualitative approach. Here, assets, threats, and vulnerabilities are each categorized into certain classes. By using tables, such as the one below, risks are assessed and estimated, and priorities are set. For example, if the estimated threat is of low frequency and the corresponding level of vulnerability is high, and if the value of an asset is also high, then the risk is described by the value "5."
Using a descriptive, qualitative approach significantly facilitates the risk management processes. This is also a legitimate approach according to standards, such as [3, 12]. However, qualitative risk management approaches have significant shortcomings and suffer from the following two major disadvantages [4] (Table 1): • reversed rankings, i.e., assigning higher qualitative risk ratings to situations that have lower quantitative risks; • uninformative ratings, i.e., frequently assigning the most severe qualitative risk label (such as “high”) to situations with arbitrarily small quantitative risks and assigning the same ratings to risks that differ by many orders of magnitude.
Table 1 Risk management in information systems – a typical qualitative approach

  Threat frequency        Low (L)         High (H)
  Vulnerability level     L       H       L       H
  Asset value
    Marginal              0       0       1       1
    Low                   0       1       2       3
    Medium                2       3       4       5
    High                  4       5       6       7
    Extreme               6       7       8       9
Therefore, the value of information that qualitative approaches provide for improving decision making can be (close to) zero in the case of many small risks and a few large ones, where qualitative ratings often do not distinguish the large risks from the small. This further justifies the view that quantitative treatment always has to be the preferred option. Only a limited arsenal of quantitative methodologies exists that can be used for our purpose, in very specific niches. For example, nonparametric methods have been proposed [25] as a basis for the analysis of failure times, in order to derive probability distributions of system failures (these failures are the consequence of successful breaches of security services). This basis is improved by correlating system survival times with the use of certain design enhancements and other threat countermeasures. A specific risk analysis has been presented for the field of intellectual property rights [1].
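As a small numerical illustration of the basic quantitative estimate $R(a_n, t_m) = v(a_n) \cdot E_{a_n}(t_m) \cdot V_{t_m}(a_n)$ introduced above, the following sketch evaluates it over a toy Cartesian product of assets and threats; all asset values, interaction probabilities, and vulnerabilities are invented.

```python
# Toy evaluation of R(a, t) = v(a) * E_a(t) * V_t(a) over assets x threats.
assets = {"web server": 50_000, "customer DB": 200_000}   # v(a), monetary units
threats = ["DoS", "SQL injection"]
E = {("web server", "DoS"): 0.30, ("web server", "SQL injection"): 0.10,
     ("customer DB", "DoS"): 0.05, ("customer DB", "SQL injection"): 0.20}  # E_a(t)
V = {("web server", "DoS"): 0.6, ("web server", "SQL injection"): 0.4,
     ("customer DB", "DoS"): 0.2, ("customer DB", "SQL injection"): 0.7}    # V_t(a) in [0, 1]

risks = {(a, t): assets[a] * E[(a, t)] * V[(a, t)] for a in assets for t in threats}
for (a, t), r in sorted(risks.items(), key=lambda kv: -kv[1]):
    print(f"{a:>12} / {t:<14} expected damage = {r:>9.0f}")
```

Ranking the pairs by expected damage in this way is exactly the prioritization that qualitative binning tends to blur.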
4 Metrics Issues The very first activity for successful risk management is the collection of relevant data. These data should include new threats, detected and/or identified vulnerabilities, exposure times, and the available remedies (safeguards). Further, to ensure not only a reactive but also an active approach to risk management, this collection and dissemination of data should be in real time, which requires automation of the
acquisition and distribution processes. These two approaches can also be used for pro-active treatment of risk management, as will be discussed below. It needs to be emphasized that, although security in IS has been an important issue for a few decades, there is a lack of appropriate metrics. So not only cryptography but IS security in general is a combination of science and art. Fortunately, the situation in this field has started to change recently and some important advances have been achieved. The first such advances are constituted by two databases, the MITRE Corporation Common Vulnerabilities and Exposures [21] and the U.S. National Vulnerability Database [23]. These are closely related efforts in which online acquisition and distribution of related data have been enabled by the security content automation protocol, SCAP [20]. The main procedure with the first of the databases is as follows:
• The basis is the vulnerability ID, which is an 11-digit number, in which the first three digits are assigned as a candidate value (CAN), the next four denote the year of assignment, and the last four denote the serial number of the vulnerability (exposure) in that year.
• Once a vulnerability is identified in this way, the CAN value is converted to a common vulnerability and exposure, CVE.
The data contained in this database are in one of two states (both, of course, relating to publicly known vulnerabilities):
• In the first state there are weaknesses with no available patch, and
• in the second state there are those vulnerabilities for which a publicly available patch exists.
This is the basis for the metric called daily vulnerability exposure, or DVE [15], which is obtained as follows:
$$\mathrm{DVE}_S(\text{date}) = \sum_{\text{vuln} \in S} \big[ (\text{date}_{\text{known}} < \text{date}) \wedge (\text{date}_{\text{patched}} > \text{date}) \big].$$
The DVE means that, on any given day, the software S would be exposed to attack (i.e., would be vulnerable) if the vulnerability had been publicly disclosed prior to that day but a patch was not available until after that day. In Figs. 2 and 3 the DVEs for Apache web servers and Mozilla Firefox web browsers are given; they are based on real data, but simplified to make the presentation in the rest of the chapter clearer (the diagrams do not show daily but monthly vulnerabilities, and it is assumed that during the month in which a vulnerability was recorded, a patch was not available). Another useful metric for our purpose has been proposed by Hariri et al., called the vulnerability index or VI [9]. This index is based on categorical assessments of the state of a system, be it a router, a server, or a client, which can be normal, uncertain, or vulnerable. Each device in a network has an agent that measures the impact factors in real time and sends its reports to a vulnerability analysis engine, which computes the system impact metric.
Fig. 2 DVE for Apache server (all versions)
Fig. 3 DVE for Mozilla Firefox browser (all versions)
More precisely, the calculation of VI goes as follows. For each kind of system component, an impact factor, CIF, is calculated for a given fault scenario (FS). CIF is the ratio of two differences – the first between the normal and faulty operation parameter value and the second between the normal and acceptable threshold value of this operation parameter. For example, for a client this may be a transfer rate (TR). Having a CIF for a component in a given FS, the system impact factor, SIF, can be obtained, which identifies how a fault affects the whole (sub)network. For a given fault a SIF is obtained by evaluating the weighted impact factors of all network components. Thus, the percentage of components operating in vulnerable states (i.e., where CIF exceeds the normal operational threshold d) in
relation to the total number of components gives the SIF. In the case of clients and servers, CIF is obtained as follows:
$$\mathrm{CIF}(\text{client}, FS_k) = \frac{TR_{\text{norm}} - TR_{\text{fault}}}{|TR_{\text{norm}} - TR_{\text{min}}|}, \qquad \mathrm{CIF}(\text{server}, FS_k) = \frac{TR_{\text{norm}} - TR_{\text{fault}}}{|TR_{\text{norm}} - TR_{\text{min}}|}.$$
In the above equations $k$ denotes a certain fault scenario. SIFs for clients and servers are obtained as follows:
$$\mathrm{SIF}_{\text{client}}(FS_k) = \frac{\sum_{\forall j,\, \mathrm{CIF}_j > d} \mathrm{COS}_j}{\text{totalNumberOfClients}}, \qquad \mathrm{SIF}_{\text{server}}(FS_k) = \frac{\sum_{\forall j,\, \mathrm{CIF}_j > d} \mathrm{COS}_j}{\text{totalNumberOfServers}}.$$
In the last two equations, the component operating state (COS for short) is a binary variable that equals 1 when the component operates in an abnormal state (i.e., when $\mathrm{CIF}_i > d$) and 0 when it operates in a normal state (i.e., when $\mathrm{CIF}_i < d$). In the present case these calculations are applied to servers and routers (which are also two generic system groups according to Hariri) and, in this way, the complete SIF is obtained. To enable a demonstration of the generic model in the next section, certain SIFs have to be used, and we have chosen a hypothetical network that consists of Apache servers and Mozilla Firefox browsers. For such a network, concrete SIF values can be obtained on the basis of data provided by the US National Vulnerability Database (see Fig. 4).
Fig. 4 Hypothetical SIFs for Apache (all versions, solid line) and Mozilla Firefox browser (all versions, dashed line)
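The two metrics introduced in this section are simple to compute once the raw data are available. The following sketch does so on invented data: the vulnerability disclosure/patch dates and the transfer-rate figures are illustrative only and are not taken from the databases mentioned above.

```python
from datetime import date

# DVE_S(d): number of publicly known but still unpatched vulnerabilities of a
# software package S on day d (invented disclosure/patch dates).
vulns = [
    (date(2006, 3, 1), date(2006, 4, 15)),
    (date(2006, 3, 20), date(2006, 3, 25)),
    (date(2006, 4, 2), date(2006, 6, 1)),
]

def dve(day):
    return sum(1 for known, patched in vulns if known < day < patched)

print("DVE on 2006-04-05:", dve(date(2006, 4, 5)))      # -> 2

# SIF: fraction of components whose impact factor CIF exceeds the threshold d,
# with CIF computed from normal/faulty/minimal transfer rates (illustrative values).
def cif(tr_norm, tr_fault, tr_min):
    return (tr_norm - tr_fault) / abs(tr_norm - tr_min)

clients = [(100.0, 95.0, 20.0), (100.0, 40.0, 20.0), (100.0, 10.0, 20.0)]  # (norm, fault, min)
d = 0.5
cos = [1 if cif(*c) > d else 0 for c in clients]          # component operating states
print("SIF_client:", sum(cos) / len(clients))
```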
At the end of this section, graph-based methods, which are often used in IS vulnerability analysis, need to be mentioned. One well-established technique in this field, suggested by Schneier [26], is called attack trees. Attack trees are constructed such that the root node of the tree is the principal goal of an attacker. This goal is further refined to obtain the first-level nodes, which are refined in turn until leaves are reached – these cannot be further refined. During the construction process, attention is paid to conjunctive and disjunctive sub-goals (goals of the first kind must all be fulfilled, while for the second kind only one of them is needed to proceed toward the final goal). After the tree is successfully constructed, various metrics can be applied, e.g., costs (the cost for each leaf node is determined). Based on this, the value of the non-leaf nodes is calculated, until we get to the root node. The path with the minimal cost is assumed to be the one chosen by an attacker.
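A minimal sketch of this bottom-up cost calculation follows; the tree structure, the AND/OR labelling, and the leaf costs are invented for illustration.

```python
# Attack-tree cost metric: OR nodes take the cheapest child, AND nodes sum their
# children, and the minimal attacker cost propagates up to the root.
def min_cost(node):
    """Return the minimal attacker cost of achieving `node`'s goal."""
    kind, children = node.get("kind"), node.get("children", [])
    if not children:                        # leaf: cost assigned directly
        return node["cost"]
    child_costs = [min_cost(c) for c in children]
    return sum(child_costs) if kind == "AND" else min(child_costs)

# Root goal: obtain administrator access (illustrative structure).
tree = {"kind": "OR", "children": [
    {"kind": "AND", "children": [           # social engineering path
        {"cost": 20},                        # craft phishing mail
        {"cost": 60},                        # bypass mail filtering
    ]},
    {"kind": "OR", "children": [             # technical path
        {"cost": 100},                       # exploit unpatched service
        {"cost": 150},                       # brute-force credentials
    ]},
]}

print("cheapest attack costs", min_cost(tree))   # 80, via the social engineering path
```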
5 Computerized Risk Management Architecture

Based on the apparatus developed so far, an architecture will be presented in this section that is intended to support reactive, active, and also pro-active risk management in IS. Due to the fact that the implementation of this architecture is still an ongoing process, the presentation will be based on a system dynamics model. Such a model can also further improve risk management in this area through the development of so-called business flight simulators. But first, the computerized risk management supporting solution will be presented. The basis for this solution is MITRE's initiative called "Making Security Measurable and Manageable" [19]. This initiative has the following structural elements: standardized enumerations of common concepts that need to be shared, languages for encoding and communicating information on common concepts, facilities for the exchange of the above contents (repositories and protocols), and adoption of the above elements by the (whole) community. Our approach follows the above structure and builds on it as follows (see Fig. 5). The US National Vulnerability Database serves as a data feed for vulnerabilities, and the architecture communicates with the database by using the SCAP protocol. Another data feed (for active risk assessment) is provided through the SIF infrastructure (where agents monitor the status of servers and clients). Finally, these two transactional databases serve for the derivation of the third, pro-active data feed. The above operational environment can also be simulated well by using system dynamics, as will be the case in the rest of this section. But for this purpose a short overview of system dynamics has to be given first. The IS security area consists of two main ingredients – information technology and the human factor. It is obvious that such systems are complex, and that general and elegant analytical solutions are the exception. We therefore have to rely on computer simulations. In searching for an appropriate methodology, the following requirements have to be met [28]:
1. The methodology must support the modeling of the information systems' main characteristics, which are mostly non-linear and very dynamic. Further, the complex interplay between the human factor and technology, constituted by numerous feedback loops, has to be supported.
2. Multidisciplinary and/or interdisciplinary research approaches have to be supported, especially IT, management, psychology, and sociology. This should be done in a way that experts from various fields, with different professional cultures, will be able to participate. Thus, this method needs to allow effective representation and communication of the structures that are the subject of research in IS security.
3. People apprehend purely numerical representations only with difficulty; the whole process and the relationships that are relevant to risk management in information systems are therefore blurred by such representations. To enable the big picture to be grasped, appropriate graphical support is needed.
4. Although qualitative models have scientific merit, support for quantitative modeling remains a priority wherever possible.
5. Due to the peculiarities of risk management of information systems, an approach at the level of aggregates is also beneficial (Fig. 5).
Fig. 5 Reactive (left), active (right), and pro-active (center) risk management feed-pipes
System dynamics is one such methodology that meets the above requirements. Jay Forrester developed it in the early sixties [5], and some attempts to use it to improve the security of information systems have already been made [7, 8]. The use of system dynamics with the focus on risk management was proposed in [27], and this is the basis for this work.
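To give a flavour of the kind of computation a system dynamics tool performs, the following sketch uses simple Euler integration to produce two of the elementary feedback behaviours described below: exponential growth from a reinforcing loop, and goal-seeking from a balancing loop. It is only an illustration; it is not the chapter's simulation model, whose listing is given in the appendix, and all parameter values are invented.

```python
import numpy as np

# Euler integration of two basic stock-and-flow structures.
dt, steps = 0.1, 300                                   # 30 time units in total
growth_rate, adjustment_time, target = 0.05, 10.0, 100.0

stock_pos = np.empty(steps); stock_pos[0] = 1.0        # reinforcing (positive) loop
stock_neg = np.empty(steps); stock_neg[0] = 0.0        # balancing (negative) loop
for n in range(1, steps):
    stock_pos[n] = stock_pos[n - 1] + dt * growth_rate * stock_pos[n - 1]
    stock_neg[n] = stock_neg[n - 1] + dt * (target - stock_neg[n - 1]) / adjustment_time

print("reinforcing loop after 30 time units:", round(stock_pos[-1], 2))
print("balancing loop after 30 time units:  ", round(stock_neg[-1], 2))
```

Combining such loops, together with delays, produces the oscillations, S-shaped growth and overshoot behaviours listed below.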
The central structures of system dynamics are causal loop diagrams, which consist of causal, or feedback, loops, which may be positive (reinforcing) or negative (balancing, stabilizing). Causal links are first set up to denote relations between the variables. If a link has positive polarity, increasing the driving variable increases the targeted variable; if it has negative polarity, increasing the driving variable decreases the targeted variable. Variables can be physical or intangible (e.g., beliefs), and stocks or flows. Causal diagrams provide a qualitative insight into the structure and functioning of systems and serve as the basis for quantitative models. Quantitative models are qualitative models that are backed by formulae which quantify variables and their relationships. System dynamics is based on the premise that all forms of behavior, including those that are referred to as "chaotic," result from the following basic building blocks:
• exponential growth – this kind of behavior is the result of a positive feedback loop;
• goal-seeking behavior – this behavior is the result of a negative feedback loop;
• oscillations – these occur when delays are present in the loop(s);
• S-shaped growth – this results from the interaction of a positive feedback loop with a negative feedback loop, where the positive loop dominates at the beginning and the behavior then stabilizes through later dominance of the negative feedback loop;
• S-shaped growth with overshoot – this results from S-shaped growth, but with a delay starting to take effect in the negative feedback loop;
• growth and collapse – this is again the result of an initially growing positive feedback loop, but with another, negative feedback loop that drives the system back to its starting point.
Using the above basic structures, more complex feedback loop diagrams are generated that realistically model real-life phenomena. An important point that relates to issues of interdisciplinary research in this area needs to be considered here. Management science frequently deals with complex systems, such as those in which the human factor is involved. In these areas, purely analytical solutions are largely beyond our reach. But because of all the computing power available, quantitative approaches based on computer simulations can be used. These can complement the most commonly used approaches, which are based on case studies and which are most frequently used for treating complex systems. One main disadvantage of case studies is that they can only be used for the analysis of phenomena specific to each case. They also lack deductive power. The extended generic model from Fig. 1 is presented in Fig. 6. From the human resources point of view, the basic variables are risk (R) and risk perception (RP). Risk perception is the key ingredient for decision-making processes concerning safeguards investments (SI). Risk perception is an accumulator variable, since mental anchoring is a known phenomenon and the mind therefore only gradually adjusts to actual changes observed in reality. In our demonstration model this decision-making process is modeled by exponential smoothing (i.e., mental estimates are modeled so that they fit exponential curves).
Fig. 6 Causal loops diagram of risk management
The appropriateness of such an approach has been proven many times, and a reference work is that of Makridakis [18]. Some delay definitely takes place before safeguard investments are realized. This delay includes not only that resulting from the decision-making process but also the delay due to operations in an organization that have to be executed before a safeguard is in place. The adjustment speed is dictated by the adjustment time (TA). Before going further it is necessary to add that the US National Vulnerability Database treats vulnerability and threat probability from our model in Fig. 1 as an aggregate value, commonly referred to as vulnerability. Therefore, the negative loop in our initial risk management model has been removed, and only the one with vulnerability remains in the model. More precisely, in line with safeguard investments, threat probability (TP) is reduced (or neutralized) and is modeled by the compensated threat probability (CTP), which then drives the actual threat probability (ATP). Further, actual threat probability is basically threat probability (TP), which is a function of the probability function PF, i.e., TP = f(PF). And this is the core "engine" that implicitly includes positive loops and is the initial driver of the whole model – PF is driven by the real data that are obtained from the US National Vulnerability Database (see the architecture in Fig. 5). The upper loop contains the exposure rate (ER) and, together with TP, CTP, and asset value (AV), constitutes the risk (R). AV (which in our case includes all assets of a certain kind, i.e., Apache servers) diminishes as dictated by its amortization rate
(AR), and it is an accumulator. Note that in contrast to RP, which is an information accumulator, AV is a material accumulator. There are two other variables in the model that are necessary for scaling (i.e., for tuning the model to a particular application environment, e.g., ensuring that values for probability belong to the interval [0,1]) and for dimensional consistency – these are exposure normalization (EN) and probability normalization (PN). Finally, there is residual risk (RR), which is the difference between R and SI. More often than not, when applying risk prevention measures, the risk is only reduced, and the part of the risk which remains is referred to as residual risk (how much residual risk an organization is willing to take is a matter of security policy). The cautious reader may have noticed that there is no explicit reinforcing loop that would drive the whole system from its stable state defined by the upper loop (along R, RP, SI, and ER) and the lower loop (along R, RP, SI, CTP, and ATP). The generator (positive loop) is implicitly hidden behind the threat probability variable. Once a new threat appears, or an asset’s vulnerability is discovered, this variable jumps to a high value. The basic settings for the model will be as follows (the complete listing is given in the appendix): the asset value is set to 100 of given monetary units, with the daily amortization rate set to 0.1. Such a large amortization rate is taken intentionally to see how a diminishing value of an asset influences other variables, because the whole simulation time-span is set to 1 year. TP values are those given in Fig. 3 (see the solid line in Fig. 3). Threat probability is thus defined as a lookup function and it actually represents the SIF of Apache servers. The “tuning” variables of the model are set as follows: exposure normalization is set to 20, mental adjustment time to 10, probability normalization to 50, while expStep and investment delay are set to 1. The function that is used to model investment delay is a simple delay function. In reality, however, this part of the decision-making process involves many stages for many risks (these stages may include ordering, implementing, and configuring safeguards). If mixing can be assumed to be sufficiently close to ideal, the more appropriate function would be, e.g., a third-order delay function. Now let us run the model. As already stated, the asset value is assumed to be amortized relatively rapidly, which is intentional, in order to show how fast the system (its behavior) adapts to risk neutralization without being affected by the actual value of an asset (which is correct). With the threat probability defined as the SIF for Apache servers, the risk behaves as shown in Fig. 7. The risk is the driver behind the adaptation rate of mentally perceived risk, and it is to be expected that this rate undergoes a kind of over-shoot and/or under-shoot behavior before it succeeds in driving the risk to its equilibrium value (Fig. 8). The risk perception variable exhibits a similar behavior to that of the safeguards investment (and consequently to that of compensated threat probability and exposure rate). The reason is straightforward – the delay function used is a very simple one (fixed delay) with a low delay value. If only the delay is increased so as to
Fig. 7 The dynamics of asset value (its rapid depreciation is intentional to show its influence on the system behavior) and the dynamics of risk in the basic setting of the model
significantly exceed the adjustment time (e.g., by setting it to 21 days), a new pattern (a spike) is introduced into the system behavior at about time = 30 days. Assuming a third-order delay, with investment delay being set to 3, the “anomaly” can be seen to occur earlier, which is as expected – the behavior of a third-order delay in this region does not differ significantly from that of an ordinary fixed delay, so the behavior pattern is almost the same (Fig. 9). All the above figures show a regime that is more or less expected. However, the whole model does not behave so “innocently” in all regimes of operation. In reality, the value of an asset will not diminish so rapidly, and the asset will not be written off in inventory listings in just 180 days. Further, real data supplied through the US Nat’l Vulnerability Database can produce a very “bumpy” shape. Suppose
Fig. 8 The dynamics of mental adaptation rate and risk perception as a consequence of this adaptation rate
we continue the analysis with the data used so far. Further, let us shorten TA to 5 (which means quicker reactions of the human decision maker, taking place before the whole situation settles down). Put another way, the decision maker is too reactive to changes in the system (before they stabilize at a certain point) and constantly intervenes in the system. In this case quite confusing patterns can emerge. In addition, let expStep be set to 0.3, while significant operational delays in the implementation of safeguards are captured by setting investDelay to 15. Figure 10 shows that AR starts oscillating and RP becomes very bumpy. Consequently, and most importantly, these changes (which are mostly related to the decision-making process) result in the final consequences shown in Fig. 11. It is clearly visible that the whole system is more endangered than it was in the first case – the actual threat probability is actually increased.
Fig. 9 The dynamics of the actual implementation of selected safeguards
And finally, the global variables for the model are set as follows: a time step of 0.0078125 day and a simulation duration of 1 year; the complete listing can be found in the appendix.
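Such a run can also be reproduced outside a dedicated system dynamics package. The following C++ sketch performs a simple Euler integration of the equations listed in the appendix; it is only an illustration (the daily trace printing, the ring-buffer implementation of the fixed delay, and the use of the appendix parameter values such as amortizationRate = 0.03 are our own choices), not a substitute for the Vensim model.

#include <cmath>
#include <cstdio>
#include <vector>

// Piecewise-linear lookup mirroring probabilityFunction in the appendix.
double probabilityFunction(double t) {
    const double xs[] = {0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 350, 366};
    const double ys[] = {0.6, 0.7, 0.4, 0.5, 0.7, 0.5, 0.4, 0.6, 0.9, 0.1, 0.1, 0.0, 0.2, 0.2};
    const int n = sizeof(xs) / sizeof(xs[0]);
    if (t <= xs[0]) return ys[0];
    for (int i = 1; i < n; ++i)
        if (t <= xs[i]) {
            double w = (t - xs[i - 1]) / (xs[i] - xs[i - 1]);
            return ys[i - 1] + w * (ys[i] - ys[i - 1]);
        }
    return ys[n - 1];
}

int main() {
    // Global settings and "tuning" variables (see appendix).
    const double dt = 0.0078125, finalTime = 366.0;
    const double TA = 10.0, amortizationRate = 0.03;
    const double exposureNormalization = 20.0, probabilityNormalization = 50.0;
    const double expStep = 1.0, investDelay = 1.0;

    // Stocks (accumulators) and the FIFO buffer implementing DELAY FIXED.
    double assetValue = 100.0, riskPerception = 10.0;
    std::vector<double> delayBuf((size_t)(investDelay / dt), 0.0);
    size_t head = 0;

    for (double t = 0.0; t <= finalTime; t += dt) {
        // Auxiliary (rate) variables, one per appendix equation.
        double safeguardsInvestments = delayBuf[head];            // delayed risk perception
        double compensatedTP = safeguardsInvestments / probabilityNormalization;
        double actualTP = probabilityFunction(t) * compensatedTP;
        double exposureRate = safeguardsInvestments / exposureNormalization;
        double risk = assetValue * actualTP * exposureRate / expStep;
        double adaptationRate = (risk - riskPerception) / TA;
        double amortization = assetValue * amortizationRate;

        if (std::fmod(t, 1.0) < dt)                               // one trace line per day
            std::printf("%6.1f %8.3f %8.3f %8.3f\n", t, assetValue, risk, riskPerception);

        // Euler integration of the two stocks and update of the delay line.
        assetValue -= amortization * dt;
        riskPerception += adaptationRate * dt;
        delayBuf[head] = riskPerception;                          // re-emerges investDelay later
        head = (head + 1) % delayBuf.size();
    }
    return 0;
}

Shortening TA, enlarging investDelay, or replacing the fixed delay by a third-order delay reproduces the alternative regimes discussed above.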
6 Discussion Quite a few new steps toward quantifiable reactive, active, and pro-active risk management in IS have been presented in this chapter. These provide a concrete methodology for implementing a quantitative risk management solution through a top-down approach with the following elements: 1. The necessary definitions in the area of risk management are presented and the core variables of IS risk management are identified – assets with their values and vulnerabilities, threats, exposure time of assets to threats, the risks that emerge, implemented safeguards, and the remaining residual risks. The gap between risks and residual risks is filled by the mental processes of decision makers (risk awareness). 2. The system dynamics-based conceptual model of IS risk management is built on the basis of these variables. It provides the core variables and their interrelationships. 3. This conceptual model has to be further refined to be computationally implementable and adapted to a particular IS. This requires the introduction of additional variables that also provide the dimensional consistency necessary to execute the model in concrete simulation environments.
Fig. 10 Modified decision-making process and its most important variables
4. The model thus obtained needs real quantitative data to provide a tangible metric as output (i.e., quantified risk). These data are acquired through the US National Vulnerability Database and the local area network. 5. To cover the proactive part of risk management, the above-mentioned data are used in a local engine that deploys known forecasting methodologies, and its output is used as the third stream of quantitative data for the developed concrete simulation model. Some issues still remain open and require further research. The main open issue concerns the suitability of forecasting methods to be applied to metrics defined in this chapter for supporting pro-active risk management. These methods include
Fig. 11 Increased actual threat probability and risk as a result of modified decision-making procedures
moving averages, exponential smoothing, extrapolation, linear prediction, trend estimation, and growth curves:

• With moving averages, short-term disturbances (randomness) are filtered out to make long-term trends more visible.
• Exponential smoothing is similar to moving averages, the main difference being that past values are not all equally weighted: more recent observations are given higher weights and those further back in the past smaller weights. More precisely, this method assigns exponentially decreasing weights with increasing age of the observation.
• Extrapolation provides new data points on the basis of a discrete set of known data points. Prior knowledge of the process that created the existing data points is necessary.
• Linear prediction produces future values of a discrete-time series by assuming that these future values are a linear function of previous values.
• Trend estimation is about deriving a straight line that best fits the existing points according to the least-squares criterion. In the
Computationally Supported Quantitative Risk Management
73
case that one variable is an independent variable, the method is actually linear regression. In the case of non-linear fitting, non-linear regression is obtained. Taking into account the nature of the above methods, let us briefly discuss their appropriateness for application to SIF. Taking SIF for probability, forecasting cannot be applied by using moving averages – the basic idea behind moving averages is to filter out noise, but SIF does not contain (significant) noise. Exponential smoothing makes no sense as long as we cannot spot any exponential law behind the vulnerabilities. With linear prediction, the main problem is that our systems are highly non-linear; therefore this method can be applied only in limited regimes that can be treated as linear. Finally, extrapolation requires knowledge of the complete process in the background, which will be feasible only in certain cases. Trend estimation is similar, with the more relaxed requirement that only the nature of a phenomenon is needed, not its exact mathematical representation. Further research in this area will be needed to determine the most suitable methods for the newly introduced metrics. One possible approach is through understanding the process by which vulnerabilities are eliminated. The following questions have to be answered to model it properly: 1. What is the probability distribution of newly discovered flaws in an average product? 2. What is the probability distribution of patching times, assuming that the same manpower has to be dedicated to each newly discovered vulnerability? Once a weakness is discovered, it is recorded and its neutralization efforts start. The complexity of a particular patching process defines how long its elimination takes. The same questions as above have to be applied to this patching process. The above questions should also be addressed to define the nature of vulnerabilities that are a result of mis-configurations of IT resources. Finally, an area that also needs significant additional research is human factor modeling, be it related to risk awareness or to threat generation. This is a very complex area that has to cover many interrelated factors, such as personal perception of risks, organizational elements like security policy, and education.
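To make the weighting difference between the first two forecasting methods above concrete, the following short C++ fragment contrasts a k-point moving average with single exponential smoothing; the window length, the smoothing constant, and the sample series are illustrative choices only and are not taken from this chapter.

#include <cstddef>
#include <cstdio>
#include <vector>

// k-point moving average: each of the last k observations has weight 1/k.
double movingAverage(const std::vector<double>& x, std::size_t k) {
    std::size_t n = x.size(), m = (k < n) ? k : n;
    double s = 0.0;
    for (std::size_t i = n - m; i < n; ++i) s += x[i];
    return s / m;
}

// Single exponential smoothing: the weight of an observation is
// alpha * (1 - alpha)^age, i.e., it decreases exponentially with age.
double exponentialSmoothing(const std::vector<double>& x, double alpha) {
    double level = x.front();
    for (std::size_t i = 1; i < x.size(); ++i)
        level = alpha * x[i] + (1.0 - alpha) * level;
    return level;   // one-step-ahead forecast
}

int main() {
    std::vector<double> series = {0.6, 0.7, 0.4, 0.5, 0.7, 0.5, 0.4, 0.6, 0.9, 0.1};
    std::printf("moving average (k = 3): %.3f\n", movingAverage(series, 3));
    std::printf("exponential smoothing (alpha = 0.3): %.3f\n", exponentialSmoothing(series, 0.3));
    return 0;
}

With alpha = 0.3, the latest observation receives weight 0.3, the one before it 0.3 x 0.7, and so on, so the influence of an observation decays geometrically with its age.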
7 Conclusions We are witnessing the strong penetration of networked information systems into all areas of our lives. Their security is therefore of paramount importance, and risk management is at the center of this security. However, the increasing complexity of information systems (complex networking, the extensive number of existing and emerging services, exponentially increasing amounts of data, strong involvement of the human factor, and an almost countless number of possible interactions) results in a situation where traditional risk management techniques are no longer adequate. Current techniques are mostly of a qualitative nature, while quantitative methods are increasingly preferred on the scientific agenda.
The latest advancements in overcoming these problems have been presented and two promising metric methodologies have been deployed (DVI and SIF). Both are based on (or can be tied to) the US National Vulnerability Database. These metric methodologies provide the basis for developing an IT architecture that supports reactive, active, and pro-active risk management. The architecture is based on a generic risk management model for IT environments, which, in turn, is based on system dynamics. The main intention behind using system dynamics is a clear presentation of the main risk management elements and of their relationships, which have to be properly understood for the risk management architecture. However, the model can also serve as a starting point for complete and detailed development in the system dynamics domain to support the study of such systems. In this chapter, the detailed development of this system dynamics model has served for demonstration and security awareness purposes and for the understanding of the behavior of such systems. Last but not least, the question of how the introduced metrics can be handled has been analyzed by applying established forecasting methodologies. Identifying the most suitable of these methodologies is now one of the main issues that need to be addressed. Minor issues are related to integration with other available systems (e.g., [24]) to further support computerized reactive, active, and pro-active risk management.
Appendix

This appendix provides the complete listing of the model presented in Section 5 of this chapter (the listing is for the Vensim™ package produced by Ventana Systems):

(01) actualThreatProbability = threatProbability * compensatedThreatProbability
     Units: Dmnl
(02) adaptationRate = (risk - riskPerception)/TA
     Units: (euro/Day)/Day
(03) amortization = assetValue * amortizationRate
     Units: euro/Day
(04) amortizationRate = 0.03
     Units: 1/Day [0.01,1,0.01]
(05) assetValue = INTEG(-amortization, 100)
     Units: euro
     The initial value of the asset is 100.
(06) compensatedThreatProbability = safeguardsInvestments/probabilityNormalization
     Units: Dmnl [0,1,0.1]
(07) exposureNormalization = 20
     Units: euro/Day [1,100]
(08) exposureRate = safeguardsInvestments/exposureNormalization
     Units: Dmnl [0,1]
(09) expStep = 1
     Units: Day [0.1,100]
(10) FINAL TIME = 366
     Units: Day
     The final time for the simulation.
(11) INITIAL TIME = 0
     Units: Day
     The initial time for the simulation.
(12) investDelay = 1
     Units: Day [0.1,21,0.1]
(13) probabilityFunction([(0,0)-(366,1)], (0,0.6),(31,0.7),(60,0.4),(91,0.5),(121,0.7),(152,0.5),(182,0.4),(213,0.6),(244,0.9),(274,0.1),(305,0.1),(335,0),(335,0),(350,0.2),(366,0.2))
     Units: Dmnl
(14) probabilityNormalization = 50
     Units: euro/Day [1,100]
(15) residual risk = risk - safeguardsInvestments
     Units: euro/Day
(16) risk = assetValue * actualThreatProbability * exposureRate/expStep
     Units: euro/Day
(17) riskPerception = INTEG(adaptationRate, 10)
     Units: euro/Day
(18) safeguardsInvestments = DELAY FIXED(riskPerception, investDelay, 0)
     Units: euro/Day
(19) SAVEPER = TIME STEP
     Units: Day [0,?]
     The frequency with which output is stored.
(20) TA = 10
     Units: Day [0.1,31,1]
(21) threatProbability = probabilityFunction(Time)
     Units: Dmnl
(22) TIME STEP = 0.0078125
     Units: Day [0,?]
     The time step for the simulation.
Acknowledgments The author acknowledges the support of the Slovenian Research Agency ARRS for this research through program P2-0359, and of the EU Commission through the SEMPOC research grants JLS/2008/CIPS/024 and ABAC 30-CE-0221852/00-43. This research is also partially a result of collaboration within the COST Econ@TEL project. The author would also like to thank the anonymous reviewers who provided constructive comments on the first version of this chapter. Last but not least, special thanks go to Prof. Dr. R. Pain – he knows why.
References 1. Andrijcic E, Horowitz B (2006) A macro-economic framework for evaluation of cyber security risks related to protection of intellectual property. Risk Anal 26(4):907. 2. British Standards Institute (1995) Code of practice for information security management, BS 7799, London. 3. COBIT (1998) COBIT overview. Information Systems Audit and Control Foundation, Rolling Meadows, IL, USA. 4. Cox LA, Babayev D, Huber W (2005) Some limitations of qualitative risk rating systems. Risk Anal 25(3):651. 5. Forrester J (1961) Industrial dynamics. MIT, Cambridge. 6. Gerber M, Von Solms R (2005) Management of risk in the information age. Comput Secur 24(1):16–30. 7. Gonzalez JJ (ed) (2003) From modeling to managing security – a system dynamics approach. Høyskoleforlaget AS, Kristiansand. 8. Gonzalez JJ, Sawicka A (2002) A framework for human factors in information security. In: Proceedings of the WSEAS Conference on Security, HW/SW Codesign, E-Commerce and Computer Networks, Rio de Janeiro. 9. Hariri S, Qu G, Dharmagadda T, Ramkishore M, Raghavendra CS (2003) Impact analysis of faults and attacks in large-scale networks. IEEE Secur Priv September/October, IEEE, 49–54. 10. HIPAA (2005) Basics of risk analysis and risk management, US Dept. of Health & Human Services, Washington, DC. 11. International Standards Organization (1989) Information processing systems – open systems interconnection – basic reference model – part 2: Security architecture, ISO 7498–2:1989, Geneva. 12. International Standards Organization (2000) IT – code of practice for information security management. ISO 17799, Geneva. 13. International Standards Organization (2004) IT – management of information and communications technology security, part 1: concepts and models for information and communications technology security management. ISO/IEC standard 13335–1, Geneva. 14. International Standards Organization (2005) IT – security techniques – code of practice for information security management, ISO/IEC 27002, Geneva. 15. Jones JR (2007) Estimating software vulnerabilities, IEEE Security & Privacy, July and August, IEEE, pp 28–32.
16. International Standards Organization (2008) Information security risk management, ISO/IEC 27005, Geneva. 17. Internet Systems Consortium (2006) ISC domain survey: number of internet hosts. http://www.isc.org/index.pl?/ops/ds/host-count-history.php. Last accessed on 27th of October 2009. 18. Makridakis S, Andersen A, Carbone R, Fildes R, Hibon M, Lewandowski R, Newton J, Parzen E, Winkler R (1984) The forecasting accuracy of major time series methods. Wiley, New York, NY. 19. Martin AR (2008) Making security measurable and manageable. In: Proceedings of MILCOM, November 17–19, San Diego, CA, IEEE, Los Alamitos, pp 1–9. 20. Mell P, Quinn S, Banghart J, Waltermire D (2008) Security content automation protocol (SCAP), v 1.1, NIST Interagency Report 7511 (Draft), Gaithersburg. 21. MITRE Corp. (2009) Common vulnerabilities and exposures, MITRE, Washington, DC, http://cve.mitre.org/. Last accessed on 6th September 2010. 22. NIST (2007) Managing risk from information systems, NIST SP 800–39 Draft, US Dept. of Commerce, Washington, DC. 23. NIST (2009) US National Vulnerability Database, NIST, Washington, DC, http://nvd.nist.gov/ 24. Raghu TS, Chen H (2007) Cyberinfrastructure for homeland security: Advances in information sharing, data mining, and collaboration systems. Decis Support Syst (online). 25. Ryan JJCH, Ryan DJ (2005) Proportional hazards in information security. Risk Anal 25(1):141. 26. Schneier B (1999) Attack trees. Dr Dobbs J 12, pp 21–29. 27. Trček D (2005) Managing information systems security and privacy. Springer, Heidelberg/New York, NY. 28. Trček D (2006) Security models: Refocusing on the human factor. IEEE Comput 39(11):103–104.
Cardinality-Constrained Critical Node Detection Problem Ashwin Arulselvan, Clayton W. Commander, Oleg Shylo, and Panos M. Pardalos
1 Introduction In this chapter, we study the cardinality-constrained critical node problem, in which the objective is to minimize the number of nodes deleted in order to obtain a node-deleted subgraph in which the size of the biggest component is smaller than a given input. These nodes will then be characterized as the set of important nodes, as they are essential in maintaining the overall connectivity of the graph. Studies carried out in this line include those by Bavelas [3] and Freeman [7], which emphasize node centrality and prestige, both of which are usually functions of a node's degree. However, they lacked applications to problems which emphasized network fragmentation and connectivity. We can apply the CC-CNP to the problem of jamming wired telecommunication networks by identifying the critical nodes and suppressing the communication on these nodes. Like the CNP, the CC-CNP can also be applied to the study of covert terrorist networks, where a certain number of individuals have to be identified whose deletion would result in the desired breakdown of communication between Ashwin Arulselvan Center for Discrete Mathematics and Applications, Warwick Business School, University of Warwick, Coventry, UK e-mail:
[email protected] Clayton W. Commander Air Force Research Laboratory, Munitions Directorate, and Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA e-mail:
[email protected] Oleg Shylo Center for Applied Optimization, Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA e-mail:
[email protected] Panos M. Pardalos Center for Applied Optimization, Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA e-mail:
[email protected] Date: Feb 2009. This project was partially funded by Air Force Research Laboratory.
individuals in the network [12]. Likewise, in order to stop the spreading of a virus over a telecommunication network, one can identify the critical nodes of the graph and take them off-line. The CC-CNP also finds applications in network immunization [5, 21], where mass vaccination is an expensive process and only a specific number of people, modeled as nodes of a graph, can be vaccinated. The immunized nodes cannot propagate the virus, and the goal is to identify the individuals to be vaccinated in order to reduce the overall transmissibility of the virus. There are several vaccination strategies in the literature (see, e.g., [5, 21]) offering control of epidemic outbreaks; however, none of the proposed strategies is optimal. Deletion of central nodes may not guarantee a fragmentation of the network or even disconnectivity, in which case disease transmission cannot be prevented. Of course, owing to its dynamic nature, the relationships between people represented by edges in the social network are transient; there is a constant rewiring between nodes, and alternate relationships could be established in the future. The proposed critical node technique helps achieve maximum prevention of disease transmission over an instance of the dynamic network. Borgatti [4] has studied a similar problem, focusing on node detection resulting in maximum network disconnectivity. Other studies in the area of node detection, such as centrality [3, 7], focus on the prominence of and reachability to and from the central nodes. However, little emphasis is placed on the importance of their role in the network connectivity and diameter. Perhaps one reason for this is that all of the aforementioned references relied on simulation to conduct their studies. Although the simulations have been successful, a mathematical formulation is essential for providing insight and helping to reveal some of the fundamental properties of the problem [16]. In the next section, we present a mathematical model based on integer linear programming which provides optimal solutions for the critical node problem. We organize this chapter by first formally defining the problem and discussing its computational complexity. Next, we provide an integer programming (IP) formulation for the corresponding optimization problem. In Section 3 we introduce a heuristic to quickly provide solutions to large-scale instances of the problem. We present a computational study in Section 4, in which we compare the performance of the heuristic against the optimal solutions which were determined using a commercial software package. Some concluding remarks are given in Section 5.
2 Problem Formulations Denote a graph G = (V, E) as a pair consisting of a set of vertices V and a set of edges E. All graphs in this chapter are assumed to be undirected and unweighted. For a subset W ⊆ V, let G(W) denote the subgraph induced by W on G. A set of vertices I ⊆ V is called an independent or stable set if for every i, j ∈ I, (i, j) ∉ E. That is, the graph G(I) induced by I is edgeless. An independent set is maximal if it is not a subset of any larger independent set (i.e., it is maximal by inclusion) and maximum if there are no larger independent sets in the graph.
2.1 Critical Node Problem

The formal definition of the problem is given by

CRITICAL NODE PROBLEM (CNP)
INPUT: An undirected graph G = (V, E) and an integer k.
OUTPUT: A* = arg min_{A⊆V} { Σ_{i,j ∈ V\A} u_ij(G(V \ A)) : |A| ≤ k }, where
u_ij := 1, if i and j are in the same component of G(V \ A); 0, otherwise.

The objective is to find a subset A ⊆ V of nodes such that |A| ≤ k, whose deletion results in the minimum value of Σ u_ij in the vertex-deleted subgraph G(V \ A). A minimum cohesion between the nodes ensues as a result of the deletion of these nodes. While maximizing the number of connected components, the objective function implicitly minimizes the difference in the sizes of the components. Maximum pairwise disconnectivity between the nodes is an alternative interpretation of the objective function. We refer to [2] for a detailed account of the critical node detection problem, where an integer programming formulation is provided together with a heuristic and the complexity of the problem is discussed.
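Since u_ij = 1 exactly when i and j lie in the same component, the objective value can be evaluated from the component sizes of G(V \ A) alone: a component with s vertices contributes s(s - 1)/2 connected pairs. A small C++ helper illustrating this is given below; the function name and the assumption that the component sizes are already available (e.g., from a depth-first search) are ours, not part of the original formulation.

#include <cstdint>
#include <vector>

// Pairwise connectivity of a vertex-deleted subgraph, given the sizes of its
// connected components: a component of size s contributes s*(s-1)/2 pairs.
std::uint64_t pairwiseConnectivity(const std::vector<std::uint64_t>& componentSizes) {
    std::uint64_t pairs = 0;
    for (std::uint64_t s : componentSizes) pairs += s * (s - 1) / 2;
    return pairs;
}

For example, if deleting A splits a 10-vertex graph into components of sizes 4, 3, and 3, the objective value is 6 + 3 + 3 = 12 connected pairs.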
2.2 Cardinality-Constrained Problem

We now provide the formulation for a slightly modified version of the CNP based on constraining the connectivity index of the nodes in the graph. Given a graph G = (V, E), the connectivity index of a node is defined as the number of nodes reachable from that vertex. To constrain the network connectivity in optimization models, we can impose constraints on the connectivity indices. This leads to a cardinality-constrained version of the CNP which we aptly refer to as the cardinality-constrained critical node detection problem (CC-CNP). The objective is to detect a set of nodes A ⊆ V such that the connectivity index of every node in the vertex-deleted subgraph G(V \ A) is at most some threshold value, say L. Using the same definition of the variables as in [2], we can formulate the CC-CNP as the following integer linear programming problem. We have the decision variables u : V × V → {0, 1} defined by

u_ij := 1, if node i and node j are in the same component; 0, otherwise,

and v : V → {0, 1} defined by

v_i := 1, if node i is deleted in the optimal solution; 0, otherwise.

(CC-CNP-1)

Minimize  Σ_{i∈V} v_i                                        (1)
s.t.  u_ij + v_i + v_j ≥ 1        ∀(i, j) ∈ E,               (2)
      u_ij + u_jk + u_ki ≠ 2      ∀(i, j, k) ∈ V,            (3)
      Σ_{j∈V, j≠i} u_ij ≤ L       ∀i ∈ V,                    (4)
      u_ij ∈ {0, 1}               ∀i, j ∈ V,                 (5)
      v_i ∈ {0, 1}                ∀i ∈ V.                    (6)
In the above formulation, L is the maximum allowable connectivity index for any node in V. Theorem 1 CC-CNP-1 is a correct formulation for the cardinality-constrained critical node detection problem. Proof First, we see that the objective function given clearly minimizes the number of nodes deleted. Constraints (2) and (3) follow exactly as in the CNP formulation [2]. Constraint (2) indicates that for every edge (i, j) either node i is deleted, or node j is deleted, or i and j are in the same component. Constraint set (3) models the transitive property of connectivity of nodes, i.e., if i is connected to j and j is connected to k, then i is connected to k. The only difference from the CNP is the constraint on the connectivity index of each node. This is accomplished by constraint (4). Finally, constraints (5) and (6) define the domains of the decision variables, and we have the proof. The proof of NP-completeness is obtained from the result proved by Krishnamoorthy and Deo [13] for a class of node deletion problems, where we minimize the number of nodes to be deleted so that the node-deleted subgraph satisfies some desired property. The CC-CNP requires the size of the largest connected component of the node-deleted subgraph to be less than some value provided as input. Lemma 1 (Krishnamoorthy and Deo [13]) Let π be a specified graph property that is determined by its components, and suppose there is a graph F with a node “s” such that the following hold: (1) the graph F and the subgraph resulting after deleting node “s” from F satisfy π, (2) if a node x is added to the graph F and nodes x and “s” are joined by an edge, then the resulting graph is a forbidden-induced subgraph for property π; then the node-cover problem is polynomially transformable to the node deletion problem for property π.
Let us consider the graphs with the property that the size of any connected component in the graph is less than or equal to L, where L > 0. Then CC-CNP is a node deletion problem [13] for this property. Additionally, the property satisfies the conditions stated in Lemma 1 (one can take any connected component of size L as the graph F). Thus we have a polynomial time reduction from the node cover problem, which is well known to be NP-complete [8]. A precise generalization of the property is provided in [11], where it is characterized as hereditary and non-trivial. A property π is hereditary if, whenever a graph satisfies π, every node- and edge-induced subgraph of the graph also satisfies π; it is non-trivial if there are infinitely many graphs that satisfy the property. The node deletion problem for hereditary properties was later proved to be max-SNP hard, as the transformation provided was approximation preserving [14]. This implies that there is no polynomial time approximation scheme for the problem unless P=NP.
3 Heuristics for the Cardinality-Constrained Critical Node Problem 3.1 Combinatorial Algorithm for CC-CNP (ComAlg) We make a subtle modification to the heuristic for the CNP [2] to create a heuristic for the CC-CNP. To do this, notice that now we are only concerned with the connectivity indices of the nodes. Stated differently, we are only concerned with the sizes of the components in the vertex-deleted subgraph. Unlike the CNP, there is no limit on the number of critical nodes we choose, so long as the connectivity constraints are satisfied. Pseudo-code for the proposed algorithm is provided in Fig. 1. The heuristic starts off by identifying a maximal independent set (MIS), which requires a random initial node i as a seed. We could run the algorithm for some pre-specified number (θ) of iterations, each corresponding to a distinct initial node used as the seed for generating the maximal independent set. The test runs presented here considered all possible nodes as the initial node, so the outermost while loop performs |V| iterations, each iteration corresponding to a maximal independent set with a distinct node from the set V as the initial seed. Then, the boolean variable OPT is set to FALSE. Finally, in line 4, a variable NoAdd is initialized to 0. This variable determines when to exit the inner while loop in lines 5–17. After this loop is entered, the procedure iterates through the vertices and determines which can be added back to the graph while still maintaining feasibility. M_i is the set of all connected components in the node-induced subgraph G(MIS ∪ {i}) and |s_k| is the size of the kth component. If vertex i can be added, MIS is augmented to include i in step 8; otherwise NoAdd is incremented. If NoAdd ever equals |V| − |MIS|, then no nodes can be returned to the graph and OPT is set to TRUE. The loop is then exited and the algorithm returns the set of nodes to be deleted, i.e., V \ MIS. We later enhance the critical node heuristic with a local search procedure which randomly selects two nodes not in the MIS (i.e., deleted nodes) and one node in the MIS, and makes a
Fig. 1 Heuristic for the cardinality-constrained critical node problem
swap if the size of the largest connected component remains less than L, as this provides an improvement in the objective function (Figs. 2 and 3).
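Both the add-back step of the heuristic and the swap test of the local search reduce to the same primitive: recomputing the connected component sizes of the currently induced subgraph and comparing the largest one with L. The following C++ sketch shows one possible O(|V| + |E|) implementation of this check; the graph representation and the function names are our own and are not taken from Fig. 1.

#include <algorithm>
#include <stack>
#include <vector>

// Sizes of the connected components of the subgraph induced by the vertices
// with kept[v] == true; adj is an adjacency list over vertices 0..n-1.
std::vector<int> componentSizes(const std::vector<std::vector<int>>& adj,
                                const std::vector<bool>& kept) {
    int n = (int)adj.size();
    std::vector<bool> visited(n, false);
    std::vector<int> sizes;
    for (int s = 0; s < n; ++s) {
        if (!kept[s] || visited[s]) continue;
        int size = 0;
        std::stack<int> st;
        st.push(s);
        visited[s] = true;
        while (!st.empty()) {                       // iterative DFS, O(|V| + |E|)
            int v = st.top(); st.pop(); ++size;
            for (int w : adj[v])
                if (kept[w] && !visited[w]) { visited[w] = true; st.push(w); }
        }
        sizes.push_back(size);
    }
    return sizes;
}

// True if restoring vertex i keeps every component size at most L.
bool canAddBack(const std::vector<std::vector<int>>& adj,
                std::vector<bool> kept, int i, int L) {
    kept[i] = true;                                 // tentative restoration
    std::vector<int> sizes = componentSizes(adj, kept);
    return sizes.empty() || *std::max_element(sizes.begin(), sizes.end()) <= L;
}

The same routine can later be reused to decide the feasibility of a GA chromosome, since feasibility there is again a bound on the size of the largest component.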
3.2 Genetic Algorithm for the CC-CNP Genetic algorithms (GAs) mimic the biological process of evolution. In this section, we describe the implementation of a GA for the CC-CNP. Recall the general structure of a GA as outlined in Fig. 4. When designing a genetic algorithm for an optimization problem, one must provide a means to encode the population, define the crossover operator, and define the mutation operator, which allows for random changes in offspring to help prevent the algorithm from converging prematurely. For our implementation, we use binary vectors as an encoding scheme for individuals within the population of solutions. When the population is generated (Fig. 4, line 1), a random deviate from a distribution which is uniform on (0, 1) ⊂ R is generated for each node. If the deviate exceeds some specified value, a parameter chosen based on empirical observation after extensive initial testing, the
Fig. 2 Local search algorithm
Fig. 3 Local search enhancement of the cardinality-constrained node problem
corresponding allele is assigned value 1, indicating this node should be deleted. Otherwise, the allele is given a 0, implying it is not deleted. In order to evaluate the fitness of the population, per line 2, we must determine whether each individual solution is feasible or not. Determining feasibility is a relatively straightforward task and can be accomplished in O(|V| + |E|) time using a depth-first search [1]. In order to evolve the population over successive generations, we use a reproduction scheme in which the parents chosen to produce the offspring are selected using the binary tournament method [15, 20]. Using this method, two chromosomes are chosen at random from the population and the one having the best fitness, i.e.,
Fig. 4 Pseudo-code for a generic genetic algorithm
the lowest objective function value, is kept as a parent. The process is then repeated to select the second parent. The two parents are then combined using a crossover operator to produce an offspring [10]. To breed new solutions, we implement a strategy known as parameterized uniform crossover [19]. This method works as follows. After the selection of the parents, refer to the parent having the best fitness as MOM. For each of the nodes (alleles), a biased coin is tossed. If the result is heads, then the allele from the MOM chromosome is chosen. Otherwise, the allele from the least fit parent, call it DAD, is selected. The probability that the coin lands on heads is known as CrossProb and is determined empirically. Figure 5 provides an example of a potential crossover when the number of nodes is 5 and CrossProb = 0.65.
Fig. 5 An example of the crossover operation. In this case, CrossProb = 0.65
After the child is produced, the mutation operator is applied. Mutation is a randomizing agent which helps prevent the GA from converging prematurely and helps it escape from local optima. This process works by flipping a biased coin for each allele of the chromosome. The probability of the coin landing heads, known as the mutation rate (MutRate), is typically a very small user-defined value. If the result is heads, then the value of the corresponding allele is reversed. For our implementation, MutRate = 0.03. After the crossover and mutation operators create the new offspring, it replaces a current member of the population using the so-called steady-state model [6, 10, 15].
Using this methodology, the child replaces the least fit member of the population, provided that a clone of the child is not already a member of the population. This method ensures that the worst element of the population improves monotonically in every generation. In the subsequent iteration, the child becomes eligible to be a parent and the process repeats. Though the GA does converge in probability to the optimal solution, it is common to stop the procedure after some “terminating condition” (see Fig. 4, line 3) is satisfied. This condition could be one of several things, including a maximum running time, a target objective value, or a limit on the number of generations. For our implementation, we use the latter option and the best solution after MaxGen generations is returned. We set MaxGen to 1000 for our computational experiments. The test runs for the genetic algorithm were conducted 100 times for each instance, and the average and standard deviation are reported in the results.
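The reproduction step described above can be sketched in a few lines; in the C++ fragment below the random-number plumbing and identifier names are ours, while CrossProb = 0.65 and MutRate = 0.03 follow the values quoted in the text.

#include <cstddef>
#include <cstdio>
#include <random>
#include <vector>

// Parameterized uniform crossover followed by mutation for binary chromosomes.
// mom is assumed to be the fitter parent: each allele is taken from mom with
// probability crossProb, otherwise from dad, and is then flipped with
// probability mutRate.
std::vector<int> breed(const std::vector<int>& mom, const std::vector<int>& dad,
                       double crossProb, double mutRate, std::mt19937& rng) {
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    std::vector<int> child(mom.size());
    for (std::size_t i = 0; i < mom.size(); ++i) {
        child[i] = (coin(rng) < crossProb) ? mom[i] : dad[i];   // biased coin toss
        if (coin(rng) < mutRate) child[i] = 1 - child[i];       // mutation
    }
    return child;
}

int main() {
    std::mt19937 rng(42);
    std::vector<int> mom = {1, 0, 1, 0, 0}, dad = {0, 1, 1, 1, 0};
    for (int allele : breed(mom, dad, 0.65, 0.03, rng)) std::printf("%d ", allele);
    std::printf("\n");   // a 1 in position i means "node i is deleted"
    return 0;
}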
4 Computational Results All of the proposed heuristics were implemented in the C++ programming language and compiled using GNU g++ version 3.4.4 with optimization flag -O2. They were tested on a PC equipped with a 1700 MHz Intel Pentium M processor and 1.0 GB of RAM operating under the Microsoft Windows XP Professional environment.
4.1 CC-CNP Results We present the results of the two algorithms developed for the CC-CNP, namely the combinatorial algorithm and the genetic algorithm. We tested the IP model and both heuristics on the terrorist network [12] and a set of randomly generated graphs. For each computer-generated instance, we report solutions for different values of L, the connectivity index threshold. Finally, we have implemented the integer programming model for the CC-CNP using CPLEX™. Table 1 presents computational results of the IP model and heuristic solutions when tested on the terrorist network data. Notice that for all values of L with known optimal solutions, the genetic algorithm and the combinatorial algorithm with local search (ComAlg + LS) computed optimal solutions. We now consider the performance of the algorithms when tested on the randomly generated data sets containing up to 50 nodes taken from [2]. The results are shown in Table 2. For these relatively small instances, we were able to compute the optimal solutions using CPLEX. For each instance, we provide solutions for different values of L, the maximum connectivity index. Notice that for these problems, the genetic algorithm computed solutions close to optimal for each instance within a fraction of the time required by CPLEX. The combinatorial heuristic found optimal solutions for all but three cases, requiring approximately half of the time of the GA.
Table 1 Results of IP model and heuristics on terrorist network data from [12]

Max conn.   IP model                Genetic alg                            ComAlg                 ComAlg + LS
index (L)   Obj val   Comp time (s) Avg obj val  Std dev  Avg comp time (s) Obj val  Comp time (s) Obj val  Comp time (s)
3           21        188.98        21.16        0.366    0.603             22       0.01          21       0.1
4           17        886.09        17.15        0.574    1.455             19       0.01          17       0.45
5           15        30051.09      15.15        0.357    1.05              20       0.18          15       1.331
8           –         –             12.81        0.417    1.368             14       0.05          13       0.07
10          –         –             11.83        0.499    0.451             12       0.07          11       0.05
Table 2 Results of the IP model and genetic algorithm and the combinatorial heuristic on randomly generated scale-free graphs

                 Max conn.   IP model                Genetic alg                             ComAlg + LS
Nodes   Arcs     index (L)   Obj val   Comp time (s) Avg obj val  Std dev  Avg comp time (s) Obj val  Comp time (s)
20      45       2           9         0.04          9            0        0.035             9        0.03
20      45       4           6         0.13          6.99         0.099    0.0255            6        0.862
20      45       8           5         0.39          5.92         0.271    0.0248            5        1.482
25      60       2           11        0.07          11.9         0.3      0.087             11       0.08
25      60       4           9         14.1          9.98         0.14     0.077             10       0.01
25      60       8           7         26.64         7.93         0.255    0.058             8        0.06
30      50       2           11        0.07          11.04        0.196    0.119             11       0.01
30      50       4           8         0.1           8            0        0.056             8        0
30      50       8           6         1152.15       6            0        0.066             6        0
30      75       4           10        18.77         10.07        0.255    0.276             10       0.02
30      75       6           9         442.41        9.83         0.377    0.308             9        0.04
30      75       10          7         64.94         7.22         0.414    0.363             8        0
35      60       2           12        0.13          12.09        0.38     0.308             12       0.14
35      60       4           8         29.89         8.01         0.1      0.191             8        0
35      60       6           7         31.61         7            0        0.143             7        0.01
40      70       2           15        0.17          15.12        0.381    0.267             15       0.101
40      70       4           11        341.97        11.1         0.3      0.401             11       0
40      70       6           8         78.94         9.03         0.386    0.443             8        0.04
45      80       2           16        0.24          16           0        0.232             16       0.1
45      80       4           11        48.17         11           0        0.2244            11       0.02
45      80       6           8         118.23        8.18         0.384    0.4601            8        0.071
50      135      2           19        0.36          19.1         0.33     0.372             19       0.05
50      135      4           15        165.18        15.5         0.5      0.692             15       0.291
50      135      6           14        5722.88       14.82        0.46     0.608             14       0.03
Table 3 presents the solutions for the random instances from 75 to 150 nodes. In order to demonstrate the robustness of the heuristics, we provide solutions for different values of L for each instance. In this table, we provide the results for the genetic algorithm and the combinatorial heuristic with and without the local search enhancement. CPLEX was unable to compute optimal solutions within reasonable time limits for any of the instances represented in this table. We see from this table that the solution quality of the GA is fairly comparable to that of the ComAlg + LS, which requires more computation time than the
Table 3 Comparative results of the genetic algorithm and the combinatorial heuristic when tested on the larger random graphs. Due to the complexity, we were unable to compute the corresponding optimal solutions

                  Max conn.   Genetic algorithm                            ComAlg          ComAlg + LS
Nodes   Arcs      index (L)   Avg obj val  Std dev  Avg comp time (s)      Comp time (s)   Obj val  Comp time (s)
75      140       5           17.75        0.75     1.633                  0               18       1.502
75      140       8           14.08        0.306    1.306                  0.02            14       1.181
75      140       10          13           0        0.754                  0.12            12       3.364
75      210       5           23.95        0.727    1.757                  0.01            23       18.476
75      210       8           21.82        0.698    1.6185                 0.01            22       2.934
75      210       10          20.87        0.483    1.5568                 0.09            20       21.17
75      280       5           31.54        0.727    1.96                   0.101           31       3.144
75      280       8           29.45        0.572    1.89                   0.05            29       3.746
75      280       10          28.61        0.691    1.782                  0.13            28       4.787
100     194       5           22.29        0.516    2.388                  0.02            22       2.774
100     194       10          17.54        0.818    2.861                  0.241           17       6.499
100     194       15          15.43        0.667    3.121                  0.021           15       0.44
100     285       5           33.53        1.081    3.102                  0.02            33       1.262
100     285       10          28.97        0.921    3.393                  0.05            28       11.076
100     285       15          26.95        0.931    3.764                  0.16            27       1.142
100     380       5           41.41        1.01     3.292                  0.051           42       5.739
100     380       10          36.88        1.022    3.105                  0.02            37       3.866
100     380       15          34.67        0.86     3.851                  0.39            36       3.034
125     240       5           30.84        1.347    4.3549                 0.251           31       1.472
125     240       10          25.03        1.099    4.849                  0.07            24       1.993
125     240       15          23.05        1.169    4.873                  0.18            22       9.233
150     290       5           32.7         2.032    6.334                  0.421           30       5.798
150     290       10          27.31        1.154    5.894                  0.2             25       5.107
150     290       15          24.46        1.12     6.155                  1.101           23       19.889
150     435       5           49.44        1.402    6.358                  0.06            49       6.459
150     435       10          41.79        1.251    6.346                  0.44            41       5.518
150     435       15          38.59        1.25     7.326                  0.07            38       13.699
GA on average. The combinatorial algorithm without the local search procedure produces solutions which are arguably reasonable, given that the required computation time is comparatively small, and it could serve as a procedure to provide an initial feasible solution for exact algorithms.
5 Concluding Remarks In this chapter, we proposed several methods of jamming communication networks based on the detection of the critical nodes. Critical nodes are those vertices whose deletion results in the maximum network disconnectivity. In general, the problem of detecting critical nodes has a wide variety of applications from jamming communication networks and other anti-terrorism applications to epidemiology and transportation science [2].
In particular, we examined the cardinality-constrained CNP (CC-CNP). In the CC-CNP we determine the minimum number of nodes to be deleted in order to obtain a node-deleted subgraph satisfying the property that every node is connected to at most L nodes. The proposed problem was modeled as an integer linear programming problem. Then we discussed some complexity results for the problem. Furthermore, we proposed several heuristics for efficiently computing quality solutions to large-scale instances. The heuristic proposed for the CC-CNP was a combinatorial algorithm which exploited properties of the graph in order to compute basic feasible solutions. The method was further intensified by the application of a local search mechanism. We also provided a genetic algorithm for the problem. By using the integer programming formulation we were able to determine the precision of our heuristics by comparing their solutions and computation times for several networks. We also conclude with a few words on the possibility of future expansion of this work. A heuristic exploration of cutting plane algorithms on the IP formulation would be an interesting alternative. Other heuristic approaches worthy of investigation include hybridizing the genetic algorithm with the addition of a local search or path-relinking enhancement procedure [9]. Finally, the local search used in the combinatorial algorithm was a simple two-exchange method, which was the cause of a significant slow down in computation, as noted in Table 3. A more sophisticated local search, such as a modification of the one proposed by Resende and Werneck [17, 18], should be a major focus of attention. Furthermore, it would be interesting to study the weighted version of the problem to see how weights added to the nodes affect the solutions. For example, it is rational to perceive applications containing weighted networks in which the cost of deleting one node is different from another. Also, pertaining to applications outside the scope of jamming networks, a study of epidemic threshold variation with respect to the heuristic results will help determine the impacts on contagion suppression in biological and social networks.
References 1. Ahuja RK, Magnanti TL, Orlin JB (1993) Network flows: Theory, algorithms, and applications. Prentice-Hall, Englewood Cliffs, NJ 2. Arulselvan A, Commander CW, Elefteriadou L, Pardalos PM (2009) Detecting critical nodes in sparse graphs. Comput Oper Res, 36(7):2193–2200 3. Bavelas A (1948) A mathematical model for group structure. Hum Org 7:16–30 4. Borgatti SP (2006) Identifying sets of key players in a network. Comput Math Org Theory 12(1):21–34 5. Cohen R, Erez K, ben Avraham D, Havlin S (2000) Efficient immunization strategies for computer networks and populations. Phys Rev Lett 85:4626 6. Coley DA (1999) An introduction to genetic algorithms for scientists and engineers. World Scientific, Singapore 7. Freeman LC (1979) Centrality in social networks I: Conceptual clarification. Social Netw 1:215–239
8. Garey MR, Johnson DS (1979) Computers and intractability: A guide to the theory of NP-completeness. W.H. Freeman, New York 9. Glover F, Laguna M, Martí R (2000) Fundamentals of scatter search and path-relinking. Contr Cybernet 39:653–684 10. Harper PR, de Senna V, Vieira IT, Shahani AK (2005) A genetic algorithm for the project assignment problem. Comput Oper Res 32:1255–1265 11. Krebs V (2002) Uncloaking terrorist networks. First Monday 7(4) http://www.firstmonday.dk/issue74/krebs/index.html 12. Krishnamoorthy MS, Deo N (1979) Node-deletion NP-complete problems. SIAM J Comput 8(4):619–625 13. Lewis J, Yannakakis M (1980) The node-deletion problem for hereditary properties is NP-complete. J Comput Syst Sci 20(2):219–230 14. Lund C, Yannakakis M (1993) The approximation of maximum subgraph problems. In: ICALP ’93: Proceedings of the 20th International Colloquium on Automata, Languages and Programming, London, UK. Springer, Berlin, pp 40–51 15. Mitchell M (1996) An introduction to genetic algorithms. MIT, Cambridge, MA 16. Oliveira CAS, Pardalos PM, Querido TM (2004) Integer formulations for the message scheduling problem on controller area networks. In: Grundel D, Murphey R, Pardalos P (eds) Theory and algorithms for cooperative systems. World Scientific, Singapore, pp 353–365 17. Resende MGC, Werneck RF (2006) A hybrid multistart heuristic for the uncapacitated facility location problem. Eur J Oper Res 174:54–68 18. Resende MGC, Werneck RF (2007) A fast swap-based local search procedure for location problems. Ann Oper Res. doi: 10.1007/s10479–006–0154–0 19. Spears WM, DeJong KA (1991) On the virtues of parameterized uniform crossover. In: Proceedings of the 4th International Conference on Genetic Algorithms, San Diego, CA, USA, pp 230–236 20. Wilson JM (1997) A genetic algorithm for the generalised assignment problem. J Oper Res Soc 48:804–809 21. Zhou T, Fu Z-Q, Wang B-H (2006) Epidemic dynamics on complex networks. Progr Nat Sci 16(5):452–457
Reliability-Based Routing Algorithms for Energy-Aware Communication in Wireless Sensor Networks Janos Levendovszky, Andras Olah, Gergely Treplan, and Long Tran-Thanh
1 Introduction Due to the recent advances in electronics and wireless communication, the development of low-cost, low-energy, multi-functional sensors has received increasing attention. These sensors are compact in size and, besides sensing, they also have some limited signal processing and communication capabilities [13]. However, these limitations in size and energy make WSNs different from other wireless and ad hoc networks. As a result, new protocols must be developed with a special focus on power efficiency in order to increase the lifetime of the network, which is crucial in applications where recharging of the nodes is out of reach (e.g., military field observations, living habitat monitoring; for more details see [6, 33]). This chapter addresses reliable packet transmission in WSNs when packets are to be received by the base station (BS) with a given reliability (i.e., keeping the probability of packet loss under a given threshold). Since the success of every individual packet transmission depends on the distance and the transmission energy, the probability of correct reception will diminish exponentially with respect to the number of hops in the case of multihop packet transfers. In order to guarantee energy
Janos Levendovszky Budapest University of Technology and Economics, Department of Telecommunications, H-1117 Magyar tud. krt. 2, Budapest, Hungary e-mail:
[email protected] Andras Olah Faculty of Information Technology, Peter Pazmany Catholic University, H-1083 Práter u. 50/A, Budapest, Hungary e-mail:
[email protected] Gergely Treplan Faculty of Information Technology, Peter Pazmany Catholic University, H-1083 Práter u. 50/A, Budapest, Hungary e-mail:
[email protected] Long Tran-Thanh Budapest University of Technology and Economics, Department of Telecommunications, H-1117 Magyar tud. krt. 2, Budapest, Hungary e-mail:
[email protected]
balancing, new protocols are proposed which ensure minimal power consumption subject to the constraint of achieving a given level of reliability. Traditional routing protocols for wireless sensor networks (Directed Diffusion [12], LEACH [11], PEDAP-PA [28], PEGASIS [7]) aim to reduce energy consumption in order to prolong the network lifetime. However, they do not focus on reliable packet forwarding, even though satisfying a predefined delivery criterion is a crucial problem in many applications. Thus, our concern is to develop polynomial complexity algorithms to find optimal paths satisfying the reliability constraint and having maximum lifespan at the same time. First, a randomized packet forwarding mechanism is introduced, termed the random shortcut protocol. In this case the packet travels to the BS on a given path; however, each node can decide randomly whether to forward the packet to the next node on the path or to send it directly to the BS (shortcutting it to the BS). In this way, a node may use larger transmission energy in order to reach the BS directly, but, on the other hand, the nodes closer to the BS may get relieved from carrying (receiving and sending) an aggregated traffic load. The performance of the new protocol is optimized by using large deviation theory. This method optimizes energy balancing over a given path. The second group of new methods focuses on optimal path selection, which either minimizes the overall energy (the sum of the transmission energies used by the nodes along the path) needed to get the packet from the source to the BS or maximizes the remaining energy of the bottleneck node of the path. Both objectives are satisfied subject to a reliability constraint, keeping the probability of packet loss under a certain threshold. As a result of extensive simulations, the lifespans achieved by the new protocols are evaluated and compared to those of traditional protocols. This comparison clearly demonstrates the advantage of the new protocols. The chapter is structured as follows:

• In Section 2, the description of the WSN is given and the current routing protocols are reviewed (PEDAP, PEGASIS, LEACH, Directed Diffusion (DD), etc.).
• In Section 3, the model used for the analysis and the new protocols are discussed.
• In Section 4, packet forwarding schemes over a 1D chain are investigated in the following order: (i) new estimates are developed for the lifetime of WSN by large deviation theory and (ii) the performance of random shortcut protocols is optimized by an extended version of the Chernoff bound, followed by a numerical analysis.
• In Section 5, two novel polynomial complexity routing algorithms are proposed for energy balancing: (i) the first one (OERA) minimizes the overall energy needed for transferring a packet to the BS subject to fulfilling the reliability criterion and (ii) the second one (BERA) maximizes the minimum energy which remains after transferring the packet to the BS and fulfills the reliability criterion at the same time.
• In Section 6, possible implementations of the new protocols are described.
• In Section 7, some concluding remarks are given.
2 The Description of WSN and Some Current Routing Protocols In order to develop novel routing algorithms, a brief description of the operation of the sensors and the sensor network is needed.
2.1 Operation of the Sensor Node The architecture of a sensor is depicted in Fig. 1. The processing unit contains a CPU, a storage unit, and a memory controller, whereas the sensing unit contains the sensors and the A/D converters. The transceiver unit ensures the communication with other sensors. The power unit, which is in most cases a non-rechargeable battery (alkaline or lithium), provides the energy for the node and contains the circuit to stabilize the voltage and monitor the state of the energy supply. Other peripherals, such as actuators and a location finding system, can also be connected to the sensor. The power consumption in the active mode of operation is in the range of milliwatts, but when sleeping it is only in the range of microwatts. In the next figure, the structure and the reference parameters of the Mica2 node are indicated [13].
Fig. 1 Structure of the Mica2 wireless node and packet forwarding in WSN
Using the transceivers, the nodes are able to communicate with each other and with the base station (BS) and transmit the collected data. Communication basically entails three tasks: (i) sending the collected data in the form of packets; (ii) retransmitting packets sent by other nodes; and (iii) sending information about the state of the node.
2.2 Operation of the Sensor Network The WSN contains nodes which communicate with each other by radio. Information is sent in the form of packets, and the task of the routing protocol is to ensure reliable packet transfer to the BS. We assume the following properties: • There is only a single, stationary BS at a fixed location (in certain applications there can be more than one BS, possibly mobile).
96
J. Levendovszky et al.
• The BS is energy abundant as it can be recharged or connected to an energy supply network, and, as a result, the BS is able to reach each node of the WSN (even the furthest ones) by radio communication. • The nodes are also assumed to be stationary. • There may be some nodes which do not have enough power to reach the BS directly; hence, multihop packet transfer is in use, where packet forwarding is determined by an addressing mechanism. • The medium access control (MAC) interface discovers the neighbors and measures the LQI (link quality indicator) to the neighbors. • The direction of communication is node-to-BS (the data acquired by the nodes must be collected by the BS). • If necessary, the nodes can organize themselves into a hierarchy where a node at a given level of the hierarchy receives packets from nodes at a lower level of the hierarchy. • The reporting model is either query driven (the BS requests data from a certain node) or event driven (the node sends a packet if an event occurs, e.g., the measured temperature exceeds a certain threshold). • The radio propagation is described by the Rayleigh model, implying that the reliability of each packet hop of correct packet reception over $ # (the probability a single hop) is P (r ) = exp
−d α Θσz2 g
, where Θ is the sensitivity threshold, α
is the exponent of fading attenuation, and σz2 denotes the energy of noise. One must note that this formula connects the reliability of packet transfer P (r ) over distance d with the required energy g (for the sake of notational simplicity this relationship will be denoted by P (r ) = Ψ (g)). Based on the discussion above, the WSN is perceived as a graph G(V, E, d) depicted in Fig. 1, where V represents the set of wireless sensors, E represents the $ # (r )
radio links between nodes i and j operating with reliability Pi j = exp
−diαj Θσz2 gi j
,
and the distance di j from node i to j is covered by a transmission energy gi j .
2.3 Routing Protocols for WSNs In this section, an overview of existing routing protocols is given. The main routing algorithms, such as DD, PEDAP, LEACH, and PEGASIS, are briefly described in order to compare them with the proposed new methods in Sections 4 and 5, respectively. Routing protocols can be divided into three classes based on the structure of the network organization [17]: (i) flat (there is no hierarchy among the nodes); (ii) hierarchical (nodes belong to different classes of importance); and (iii) location based (nodes are classified based on their location). Furthermore, the routing can be adaptive if the system parameters (e.g., topology and link qualities) are controlled by the available energy levels. If the path has been assigned prior to the packet generation
Reliability-Based Routing Algorithms
97
then we speak of proactive protocol, whereas if the path is identified after the packet generation then we speak of reactive routing [3]. Routing algorithms combining the two strategies are called hybrid methods. In the case of static solutions the path to the BS is determined by routing tables. In cooperative routing the data collected by a group of modes are aggregated by a central node. Another definition of cooperation is related to multipath delivery to the BS. Routing algorithms can also be grouped on the basis of mode of operation, e.g., QoS based, query based, multipath based [5, 18]. The different protocols are described in Table 1. In the third column of the table, the sources of computation and communication overheads are indicated in the case of the different protocols. From column six, one can also see how the protocols handle the different link metrics (lifetime, reliability, delay, etc.).
Protocol Directed Diffusion [12] PEDAP [28]
Table 1 Comparison of different protocols Computation and communication Reporting Data Classification overhead model aggregation QoS Flat/proactive
Flat/reactive
Too much overhead
Event/query driven
Setting up Event/query spanning tree driven Geographic Location/proactive Complicated Event/query adaptive sleeping driven fidelity [33] strategy GBR [16] Flat/reactive Same as DD Event/query driven SAR [25] Location/proactive Multipath com- Time/query munication driven MCFA [34] Flat/proactive setting up cost Query driven field and one hop broadcasting LEACH [11] Hierarchical/proactive Setting up and Event driven maintaining cluster PEGASIS [7] Hierarchical/proactive Chain setup Event driven
Yes
No
Yes
No/lifetime
No
No
Yes
No/lifetime
Yes
Yes/reliability
No
No
Yes
No
No
No/lifetime
DD is one of the most important data-centric routing protocols [12]. It uses attribute-value pairs for the data and queries the sensors. Data aggregation in the network is achieved by solving a minimum Steiner tree problem. Low-energy adaptive clustering hierarchy (LEACH) [11] protocol is a hierarchical routing algorithm which groups the nodes into clusters based on their energy levels. Each cluster elects a clusterhead by random selection, which collects the packets in the cluster and then sends them to the BS. Data fusion and aggregation take place in the cluster. This protocol can considerably increase the lifetime. On the other hand, the dynamic selection of clusterheads and the need for announcing the results periodically entail a huge overhead.
98
J. Levendovszky et al.
The power-efficient data gathering and aggregation protocol (PEDAP) [28] calculates a minimum spanning tree by the Prim algorithm having the BS as the root node. The cost of a link in the case of a k-bit packet is gi j (k) = 2 · E elec · k + E amp · k · di2j , where E elec is the transceiver consumption and E amp denotes the transmitter energy, while di j is the distance between nodes i and j. The power-efficient gathering in sensor information systems (PEGASIS) [7] protocol will determine a chain of nodes instead of clusters. Each node has only two neighbors. The path is constructed by a greedy algorithm. At the end of the path is the furthest node from the BS. Its neighbor is going to be the next element in the path and these steps are repeated until we get to the BS. Each node in the path aggregates the information obtained from its neighbor and forward it toward an elected leader in the chain which communicates the BS directly. The leader is elected randomly in each round. The following metrics are used most frequently to evaluate routing performance [19]: • Energy efficiency/lifetime: The lifetime can either be defined as the time till the first node goes flat or as the time till a κ portion of the nodes go flat. Furthermore, the performance of a protocol can be quantified by the overall remaining energy when the network goes flat. If this remaining energy is high then the protocol did not perform well, as a lot of energy is wasted. • Reliability: It determines the probability of successful reception of the packets at the BS. • Latency (or delay): The time interval between sensing and data collection at the BS. In the case of multihop routing it is often associated with the number of hops needed to reach the BS. A couple of applications of the routing protocols to real-life problems are indicated in Table 2. These examples can help to find the protocol which is appropriate for a given application. As can be seen, there are several routing protocols which are concerned with energy balancing and prolonging the longevity of WSNs. On the other hand, these protocols do not satisfy any reliability constraints as packet losses are not taken into account.
3 The Model In this section, the model of WSN we use for introducing the novel protocols is summarized. As mentioned before, WSN is perceived as a graph G(V, E, d) depicted in Fig. 1, where V represents the set of wireless sensors and di is the distance between nodes i and i − 1. According to the Rayleigh fading model, the energy needed for transmitting the packet to distance d with the probability of correct reception P (r) is given as
Health
Human monitoring [24] Object tracking [23]
Smart kindergarten [26] Cold chain [21]
Mantis [2]
SSIM (artificial retina) [24]
Commercial
Home/office entertainment Education
Military
Environmental monitoring Environmental monitoring Health
Great Duck Island [4]
PODS in Hawaii [32]
Type
Application
Video, identification Reliable, data aggregation
Data archiving, long lifetime Energy efficiency Real-time, complex processing Quality, security, alerts Collaborative processing, real time Intensive communication
Requirements
Minimal, continuously
Large, continuously
Large, location awareness with high frequency near an object Large, hybrid with high frequency
Moderate, periodic
Minimal periodic, every 5–10 min Large, query-driven, infrequently Large, continuously, in high frequency
Data amount and frequency
Flat
Flat
Flat
Location based
Flat
Hierarchical
Flat
Hierarchical
Topology
Table 2 Routing protocols and particular applications for WSNs
55
Tens, indoor
1 node per input device
Several nodes per human 200
SAR (MCFA)
PEGASIS, GEAR (GRAB) PEGASIS
PEGASIS, SAR, GBR GAF
LEACH
DD
GAF, LEACH
32 nodes in 1 km2 30–50 nodes in 0.51 km2 100 sensors per retina
Routing protocols
Scale and density
Reliability-Based Routing Algorithms 99
100
J. Levendovszky et al.
P
(r )
−d α Θσ Z2 = exp g
% (1)
(for further explanation, see [10]). For the sake of notational simplicity, this relationship will be denoted by P (r) = Ψ (g). Furthermore, when there is a transmission (r) from i to i − 1, then the reliability of this transmission is denoted by Pi = Ψ (gi ), where gi denotes the transmission energy on node i. It must be noted that the node transmission energy can adaptively be changed. As a result, when developing optimal protocols, we focus on finding the optimal transmission energies as well as finding the optimal path to ensure reliable packet transfer from the source node to the BS. In this way, one can achieve that reliable communication is carried out with minimum energy which, in turn, increases the longevity of WSN. From a 2D graph we often change to a 1D model, i.e., after the routing protocol has found the path to the BS the nodes participating in the packet transfer can be regarded as a 1D chain labeled by i = 1, . . . , N and depicted in Fig. 2.
Fig. 2 One-dimensional chain topology of WSN packet forwarding
In the forthcoming discussion, we will first use the 1D model, as our aim is to develop optimal packet forwarding mechanisms by using randomized transmission strategies. Then we will return to the 2D model to find optimal paths over random graphs which will ensure reliable packet transfer to the BS.
4 Lifespan Estimation and Optimal Packet Forwarding Mechanisms Over a 1D Chain by Random Shortcut and Large Deviation Theory In this section, first we will develop a technique for lifespan estimation by large deviation theory and then the random shortcut protocol is optimized. The 1D model (shown in Fig. 2) is characterized as follows: • The topology is uniquely defined by a distance vector d = (d1 , . . . , d N ) where di , i = 1, . . . , N denotes the distance between nodes i and i − 1, respectively. • The initial battery power on each node is the same and denoted by C. • The packets are generated in discrete time instant and the corresponding time variable is denoted by k.
Reliability-Based Routing Algorithms
101
• We assume that each node generates packets subject to an on/off model, i.e., packet generation occurs with probability P (yi = 1) = pi , whereas the node does not generate packet with probability P (yi = 0) = 1 − pi . • The traffic state of the network is represented by an N -dimensional binary vector y ∈ {0, 1} N and the corresponding probability of a traffic state is given as p (y) = N ! y pi i (1 − pi )1−yi assuming independence among the sensed quantities. i=1
As a result, the path is fully characterized by vectors g and p. The following packet forwarding mechanisms are investigated: 1. Chain protocol: Each node transmits packet to its neighbor lying closer to the BS. In this way, each node consumes minimal energy being engaged with short-range energy transmission. However, as each packet will go through the node being closest to the BS, the lifetime of this node is likely to determine the longevity of the whole network. 2. Random shortcut protocol: Upon receiving or generating a packet node i can choose to forward the packet to its neighboring node being closer to the base station (labeled as i − 1) with probability (1 − ai ), or directly send the packet to the BS with probability ai . In this way paths are randomized and not every packet will go through the node closest to the BS. As a result, a better energy balancing is expected. 3. Single-hop protocol: Each node sends its packet directly to the BS. In this case, the farthest node is likely to determine the longevity of the network.
4.1 Lifespan Estimation of 1D Chain by Using the Chernoff Bound The following results have been partly published in [15] by the authors. Let us assume that the chain protocol is in effect. The energy consumed by sending a packet generated on node i to the BS is given as
Gi =
i
gj
(2)
j=1
and the average energy consumption up to time instant K is given as K N 1 yi (k)G i , N k=1
(3)
i=1
where y (k) ∈ {0, 1} N is the traffic state of the network at time k. The lifespan of node denoted by K˜ is defined as
102
J. Levendovszky et al.
⎛ ⎞ K˜ N 1 K˜ : P ⎝ yi (k)G i < C ⎠ = e−γ , N k=1
(4)
i=1
where γ is a given reliability parameter. By using the complementary probability ⎞ ⎛ K˜ N 1 yi (k)G i > C ⎠ = 1 − e−γ P⎝ N k=1
(5)
i=1
lifetime evaluation is cast as a tail estimation problem, where bounds like the Chernoff inequality [8] can be used as follows: ⎞ ⎛ ) & N K˜ N ( ' 1 s N C . yi (k)G i > C ⎠ ≤ exp μi s , G i − P⎝ N K˜ k=1 i=1 i=1
(6)
Here μi (s, G i ) := log E esyi G i = log 1 − pi + pi esG i is the logmoment generating function and s : arg min K˜
s
N i=1
% sNC . μi (s, G i ) − K˜
By using the estimation above, one obtains N
ei=1
' ( μi s ,G i − s N˜ C K
= 1 − e−γ
(7)
and the lifespan of the simple chain protocol can finally be estimated by the following formula: sˆNC . −γ i=1 μi sˆ , G i − log 1 − e
K˜ = N
(8)
4.2 Optimization of the Random Shortcut Method by the Extended Version of the Chernoff Bound If the random shortcut protocol is in effect, then the packet generated by node i will travel in the chain down to node i − li which then sends it directly to the BS [15]. This shortcutting node is selected by random trials as follows: basically, at each node in the chain there random generator ξ j ∈ {0, 1} generating values is a binary with probabilities P ξ j = 0 = 1 − a j and P ξ j = 1 = a j . When a packet is sent by node i to the BS then a binary vector is generated as a result of the random trials
Reliability-Based Routing Algorithms
103
on each node. If the first nonzero component of this binary vector is i − li then node i − li will shortcut the packet to the BS. Let the number of hops in the chain be denoted by λi . Its distribution is given as i "
P (λi = li ) = ai−li
1 − aj ,
(9)
j=i−li +1
where a0 = 1. Traveling down from node i to node i − li and then getting shortcut to the BS, the packet consumes an overall Vi := ij=i−li +1 g j + γi−li energy. Here g j is the energy needed to send the packet to the neighboring node and γi−li is the shortcut energy from node i − li (i.e., the energy required to transmit the packet from node i − li directly to the BS). As a result, the average energy consumption is given as ⎛ K N 1 yi ⎝ N k=1
i=1
⎞
i
g j + γi−λi ⎠.
(10)
j=i−λi +1
Thus the lifespan (similar to (4)) is defined as follows: ⎛ K˜ N 1 K˜ : P ⎝ yi ⎝ N ⎛
k=1
i=1
i
⎞
⎞
g j + γi−λi ⎠ > C ⎠ = 1 − e−γ .
(11)
j=i−λi +1
The probability in (11) can be rewritten as ⎞ ⎛ ⎞ ⎛ K˜ N i 1 yi ⎝ g j + γi−λi ⎠ > C ⎠ P⎝ N k=1 i=1 j=i−λi +1 ⎛ ⎞ ⎞ ⎛ ˜ K N i 1 = ... P⎝ yi ⎝ g j +γi−λi ⎠ > C |λ1 =l1 , . . . , λ N =l N ⎠· N l1
lN
k=1
i=1
j=i−λi +1
· P (λ1 = l1 , . . . , λ N = l N ) = ⎛ ⎛ K˜ N i 1 = ... P⎝ yi ⎝ N l1
lN
k=1
i=1
⎞
⎞
g j + γi−li ⎠ > C ⎠
j=i−li +1
N "
P (λi = li ).
i=1
Expression ⎛
⎛ K˜ N i 1 ⎝ ⎝ P yi N k=1
i=1
j=i−li +1
⎞
⎞
g j + γi−li ⎠ > C |λ1 = l1 , . . . , λ N = l N ⎠
104
J. Levendovszky et al.
can be upper bounded by the Chernoff bound as ⎛ ⎛ N i K˜ 1 P⎝ yi ⎝ N k=1
⎞
⎞
g j + γi−li ⎠ > C |λ1 = l1 , . . . , λ N = l N ⎠ ≤
j=i−li +1
i=1
N
≤ ei=1
μi (s,Vi )− sNC ˜ K
,
where '
( ' ( μi (s, Vi ) := log E esyi Vi = log 1 − pi + pi esVi and N "
P (λi = li ) =
i=1
N "
⎛ ⎝ai−li
⎞ 1 − a j ⎠,
i " j=i−li +1
i=1
thus we obtain ⎛ ⎛ K˜ N 1 P⎝ yi ⎝ N k=1
...
l1
i=1
N
e
sNC i=1 μi (s,Vi )− K˜
lN
e
− sNC K˜
...
l1 − sNC K˜
N " l N i=1
N "
⎞
⎛
⎝ai−li ⎛
eμi (s,Vi ) ⎝ai−li
⎛
⎞
g j + γi−λi ⎠ > C ⎠ ≤
j=i−λi +1
N " i=1
e
i
⎛
⎝eμi (s,Vi ) ⎝ai−li
i " j=i−li +1 i "
⎞ 1 − aj ⎠ = ⎞ 1 − aj ⎠ =
j=i−li +1 i "
⎞⎞ 1 − a j ⎠⎠.
j=i−li +1
i=1 li
Introducing the extended logarithmic moment generation function as ⎛ ⎛ ⎝eμi (s,Vi ) ai−li βi (s, Vi ) := log ⎝ li
one can write
i " j=i−li +1
⎞⎞ 1 − a j ⎠⎠
(12)
Reliability-Based Routing Algorithms
105
⎛ ⎛ K˜ N 1 P⎝ yi ⎝ N k=1
i=1
≤e
− sNC ˜
⎞
i
⎞
g j + γi−λi ⎠ > C ⎠ ≤
j=i−λi +1
N "
K
eβi (s,Vi ) = e
N
sNC i=1 βi (s,Vi )− K˜
.
(13)
i=1
Comparing the bound with 1 − e−γ , we obtain N
i=1 βi
e where sˆ : arg min s
N
(sˆ,Vi )− sˆ NK˜C = 1 − e−γ ,
(s, Vi ) −
i=1 βi
sNC . K˜
(14)
The lifespan is the solution of the follow-
ing equation: K˜ :
N i=1
sˆ N C + log 1 − e−γ . βi sˆ , Vi = K˜
(15)
As it can be seen, the equation above determines the lifespan as a function of vector a; the components of which represent the probabilities of shortcut on a given node. This relationship is denoted by K˜ = Ψ (a). Using (12) the protocol optimization can take place by searching in the space of a-vectors to maximize Ψ (a). This can be done by gradient descent given as follows: # ai (n + 1) = ai (n) − Δsgn
$ Ψ (a (n)) − Ψ (a (n − 1)) , ai (n) − ai (n − 1)
i = 1, . . . , N , (16)
where a (n) is the probability vector at iteration n and Δ is the learning rate of the gradient descent. As a result, the protocol optimization is carried out as in Fig. 3. In the case of single-hop protocol we have ai = 1, i = 1, . . . , N . Here, we obtain ⎛
⎞ K˜ N 1 K˜ : P ⎝ yi (k) γi < C ⎠ = e−γ , N k=1
(17)
i=1
which yields the following lifespan: sˆ N C , −γ i=1 μi sˆ , γi + log 1 − e
K˜ = N
(18)
where μi (s, γi ) := log E esyi γi = log (1 − pi + pi esγi ) is the log moment generating function. The lifespan estimation (5) has been defined on the basis of the overall energy consumption of a packet. Taking into account that one may want to maximize the
106
J. Levendovszky et al.
Fig. 3 Algorithm to optimize the free parameters of the random shortcut protocol
minimum remaining energy after the packet transfer instead of minimizing the overall energy, this definition can be modified as follows: ⎧ ⎛ ⎫ ⎞ K˜ i ⎨ ⎬ K˜ : min P ⎝ ϑi (k) > c⎠ = e−γ , ⎭ i ⎩
(19)
k=1
where ϑi (k) denotes the average energy consumption on node i at time instant k and K˜ i denotes the lifespan of node i. The method discussed above can be extended to estimate the lifespan of any arbitrary protocol [14].
4.3 Performance Analysis and Numerical Results for Random Shortcut Protocols In this section, a detailed performance analysis of the chain, the shortcut and the single-hop protocols are given. The aim is to evaluate the lifespan of a sensor network containing N nodes placed in an equidistant manner. Figure 4 shows how the estimated lifespan changes as the function of the number of nodes (N ) in the case of the three methods described above. The distance between the base station and the farthest node was 20 m, the initial battery power was C = 10∧ 5, and the reliability parameter was set as (1 − e−γ ) = 0.95, while the probabilities of shortcut were uniform ai = a = 0.2. One can see that there is a maximum lifespan in the cases of chain and random shortcut protocols with respect to the number of nodes. Figure 4 shows that when the network is sparse then both methods result in relatively closer lifespan, while departing from the optimal number of nodes (either decreasing or increasing the number of nodes), the random shortcut model definitely gives much higher (it is more than
Reliability-Based Routing Algorithms
107
Fig. 4 Estimated lifespan as a function of the number of nodes
37% in the case of N = 7) lifespan. The figure also demonstrates that in the case of relatively low node density, the chain protocol yields higher longevity than the single-hop protocol. On the other hand, the random shortcut protocol always results in longer lifespan than any of the other two protocols. Figure 5 demonstrates the accuracy of lifespan estimation at different protocols. The settings were the same as earlier and the number of nodes was N = 7. From the figure one can see that the lifespan can be sharply estimated by the Chernoff bound (estimation error < 2%). Thus, based on this estimation the system parameters can be optimized accordingly in an off-line manner. The methods treated
Fig. 5 Lifespan and estimated lifespan values achieved by different protocols
108
J. Levendovszky et al.
in this section have provided increased lifespan but reliable packet delivery has not yet taken into account. This problem is going to be addressed in the forthcoming sections.
5 Reliability-Based Routing with Energy Balancing In this section, we develop new algorithms for optimal path selection over a random graph subject to the constraint that the probability of successful packet reception at the BS must exceed a given threshold. Packet forwarding to the BS is carried out over a path characterized by a set of indices = {i 1 , i 2 , . . . , i L }, where the indices identify the nodes which participate in the packet transfer (see Fig. 6).
Fig. 6 Packet forwarding over path from source node to the BS in WSN
In our notation, node i 1 is the sender node and i L+1 denotes the-BS. The transmis. sion energies used by the nodes contained by path are given as gi1 i2 , . . . , gi L B S . Thus, the overall energy required by getting the packet from sender node i 1 to the L gil il+1 . BS over the path = {i 1 , i 2 , . . . , i L } is given as l=1 If the probability of successfully forwarding the packet from node i 1 to node i L+1 reception of the is denoted by Pil il+1 = Ψ gil il+1 then the probability 'of successful ( !L packet at BS is P(correct reception at BS) = l=1 Ψ gil il+1 . Here, it is noteworthy to mention again that the transmission energies on the nodes can adaptively be changed to achieve reliable packet transfer at the cost of minimum energy. Hence reliable routing poses the following constrained optimization problems: 1. Minimizing the overall energy: In this case, our objective is to find the optimal path and the corresponding optimal transmission energies which minimize the overall energy needed for a packet to get to the BS subject
Reliability-Based Routing Algorithms
109
to the -reliability constraint. .More precisely, we seek the optimal path opt = i 1,opt , i 2,opt , . . 0 . , i L ,opt and the optimal transmission energies opt = / L opt opt opt gil il+1 subject to gi1 i2 , gi 2 i3 , . . . , gi L i L+1 for which opt , opt : min, l=1
P(correct reception at BS) =
L "
( ' opt Ψ gil il+1 ≥ 1 − ε.
(20)
l=1
The algorithm solving this problem will be referred to as Overall Energy Reliability Algorithm (OERA) [29]. 2. Maximizing the remaining energy of the bottleneck node: In this case, our objective is to find the optimal path and the corresponding optimal transmission energies which ensure that after the packet has been forwarded to the BS, the minimum remaining energy (the energy in the so-called bottleneck node) is maximized. More precisely, let ci (k), i ∈ V denote the energy state of node i at time instant k whereas c (k) is the energy state of the - network at time instant . k. Our objective is to find the optimal path / opt = i 1,opt , i 2,opt , . . . ,0i L ,opt and opt
opt
opt
the corresponding optimal energies opt = gi 1 i2 , gi2 i3 , . . . , gi L i L+1 for which max, minl cil (k + 1), where cil (k + 1) := cil (k) − gil il+1 and guarantee that P(correct reception at BS) =
L "
Ψ gil il+1 ≥ 1 − ε.
(21)
l=1
The algorithm solving this problem will be referred to as Bottleneck Energy Reliability Algorithm (BERA). We generally solve these problems in two phases. In the first phase, we assume that the path = {i1 , i 2 , . . . , i L } over which the packet is forwarded from node i 1 to 0 / opt opt opt the BS is given and only the transmission energies opt = gi 1 i2 , gi2 i3 , . . . , gi L i L+1 are to be optimized. In the second phase, we determine opt , the optimal packet forwarding path that guarantees the 1 − ε reliability and minimizes the overall consumption of the packet transfers. It can be proven that carrying out these two steps separately, one can find the optimum.
5.1 Optimization of the Overall Energy Consumption – The OERA Algorithm In this section, we will demonstrate that the optimal path solving problem (20) can be given in polynomial time which gives rise to the OERA algorithm [29]. As was mentioned before, let us first assume that the path is already given. In this case, our goal is to determine the optimal transmission energies opt =
110
/
opt
J. Levendovszky et al. opt
opt
gi1 i2 , gi2 i3 , . . . , gi L i L+1 (20). We state the following:
0 for which
L
l=1 gil il+1
is minimal subject to constraint
Theorem 1 Assuming that the packet transmission path = {i 1 , i 2 , . . . ., i L } from L gil il+1 can node i 1 to the BS is given, under the reliability parameter (1 − ε), l=1 only be minimal if √ √ √ √ gil il+1 = ( w1 + w2 + · · · + w L ) · wl ,
(22)
where wl :=
diαl il+1 Θσ Z2 − ln(1 − ε)
.
(23)
The proof of this theorem can be found in Appendix A. Thus, in the case of a given path = {i 1 , i 2 , . . . , i L } the optimal transmission energies which yield maximal lifespan are obtained from (22). Consequently, the overall energy consumption to get the packet to the BS along the path is given as L
√ √ √ gil ,il+1 = ( w1 + w2 + · · · + w L )2 .
(24)
i=1
Based on (24), the energy consumption of a packet transfer is E () =
L √
wi1 ,i2
)2 & L √ √ √ √ + wi2 ,i3 + · · · + wi L−1 ,i L wik ,ik+1 = wik ,ik+1 ,
k=1
k=1
(25) where wik ,ik+1 =
diαk ,ik+1 Θσ Z2 − ln(1 − ε)
.
(26)
We are seeking opt for which opt : min E().
(27)
√ √ As wik ,ik+1 is positive, minimizing (25) is equivalent to minimizing ( wi1 ,i2 + √ wi2 ,i3 + · · · + wi L−1 ,i L ). Hence problem (27) reduces to opt : min E () ∼ min
L √ l=1
wil ,il+1 .
(28)
Reliability-Based Routing Algorithms
111
√ It is, however, a shortest path problem if the value wu,v is assigned to the edge connecting the nodes (u, v), where wu,v is defined by (26). In this way, (28) can be solved by the Bellman Ford algorithm in polynomial time.
5.2 Reliable Packet Transfer by Maximizing the Remaining Energy of the Bottleneck Node . In this case, we seek opt = i 1,opt , i 2,opt , . . . , i L ,opt and the corresponding optimal 0 / opt opt opt energies opt = gi1 i2 , gi 2 i3 , . . . , gi L i for which max, minl cil (k + 1) subject L+1 to the condition that the packets arrive at the BS with a given reliability, as indicated below P(correct reception at BS) =
L "
Ψ gil il+1 ≥ 1 − ε.
(29)
l=1
We will demonstrate that this problem can also be solved in polynomial time by using the BERA algorithm. Let us again first assume that the packet forwarding path is already given. Then we state the following: Theorem 2 Assuming that the packet transmission path = {i 1 , i 2 , . . . , i L } from node i 1 to the BS is given, under the reliability parameter (1 − ε), then minil cil (k) − gil il+1 can only be maximal if the residual energy of each node is the same, expressed as cil (k) − gil il+1 = A, and A satisfies the following equation: "
Ψ cil − A = 1 − ε.
(30)
il ∈
The proof of this theorem can be found in Appendix B1. It is easy to note that the left-hand side of (30) is monotone decreasing with respect to parameter A. Thus (30) will have a unique solution over the interval 0, mini j ci j . If there is no solution then there is no such energy set opt = / 0 opt opt opt gi1 i2 , gi2 i3 , . . . , gi L i which could fulfill the reliability constraint. Due to its L+1 monotonicity, one can develop fast methods to solve (30), like the Newton–Raphson algorithm. Having A at hand, we can search for the most reliable the maximiza path when tion of reliability is equivalent to the minimization of − il ∈ log Ψ cil (k) − A . This formula reduces the search for the most reliable pathinto a shortest path optimization problem where the weight − log Ψ cil (k) − A is assigned to each link. The task opt : min − log Ψ dil il+1 , cil − A (31)
il ∈
112
J. Levendovszky et al.
can be solved in polynomial time by the Bellman–Ford algorithm. (Note that − log Ψ dil il+1 , cil − A ≥ 0). By applying the Rayleigh model (as described by (1)) and with setting A = 0, one obtains opt : min
il ∈
%) & α −dil il+1 Θσ Z2 ∼ − log Ψ cil ∼ min − log exp cil il ∈
∼ min
diαl il+1 Θσ Z2 il ∈
cil
. (32)
In this special case, expression (32) is equivalent to the optimization problem solved by the PEDAP-PA algorithm [28]. On the other hand, it is easy to see that the solution of (31) depends on the value of A. Furthermore, the optimal value of A depends on the path itself. Therefore, let us solve (31) and (30) recursively, one after another. This implies that we search for the most reliable path and then for the path found we make sure that the reliability constraint holds obtaining the proper value of A belonging to the given reliability parameter. This algorithm will have a fix point and will stop when there are no changes in the obtained paths any longer. The convergence to the optimal solution is stated by the following theorem: Theorem 3 Let A(k) indicates the series obtained by recursively solving (31) and (30) one after another. A(k) is monotonically increasing and will converge to the fix point of (31) and (30). Furthermore (31) and (30) have a unique fix point. Hence the algorithm described above and depicted by Algorithm 1 converges to the global optimum. The proof of this theorem can be found in Appendix B2. Algorithm 1 This algorithm calculates variables [A, path], where OptResEnergies is an equation solver which solves equation (30). DIJKSTRA is the well-known minimal path selection algorithm which solves the optimization task indicated with (31). The initial path is a one-hop path between source and BS. Require: ci > 0, ∀i Ensure: [A, path] A←0 path ← [S OU RC E, B S] while path = path old do path old ← path El j ← − log Ψ dl j , cl (k) − A , ∀l j path ← DIJKSTRA(E) A ← O pt Res Energies( path) end while
Furthermore, it can be shown that the speed of convergence is O M N 2 , where M is the upper bound on number of time the decision operation is performed, while O N 2 is the complexity of the Dijkstra algorithm. Note that the M is independent of the network size. Hence the convergence speed is still O N 2 .
Reliability-Based Routing Algorithms
113
5.3 Performance Analysis and Numerical Results In this section, the performance of the two new reliability-based routing algorithms (OERA and BERA) are analyzed and compared with the standard WSN routing algorithms. However, we also run simulations to analyze the network behavior when some nodes had already died and in this case the lifespan is defined as the time being to the latest death. We actually define earliest death as the time for the first node going dead and we define latest death as the time till the last node goes flat. In each time instant a new packet has been generated randomly by one of the nodes still operational. The fading model was the well-known Rayleigh fading with the exponent of fading attenuation α = 3, sensitivity threshold Θ = 50, and energy of noise σz2 = 10−4 . The methods were tested on multiple networks with size N =5, 10, 20, 50, and 100 nodes, respectively. The nodes had been distributed randomly or deterministically according to a uniform distribution over an area of 100 m2 . The BS was placed in a corner. The test topologies of the network is indicated in Fig. 7.
Fig. 7 Topologies: (a) Random topology with 100 sensor nodes and (b) grid topology with 100 sensor nodes
Among the traditional methods the first one tested is the single-hop protocol in which every node transmits directly to the BS; thus, the reliability can be easily ensured. In the case of LEACH, the number of hops is 2; hence, the reliability criteria can also be ensured. On the other hand, for reversed path forwarding algorithms (e.g., PEGASIS, PEDAP-PA, DD) reliability cannot be ensured directly. Therefore, the traditional protocols had to be modified in order to guarantee the reliability of delivery. Hence, if the length of the selected path was M, in order to ensure (1 − ε) level of reliability the nodes participating in the transmission must transmit with, such that (1 − ε1 ) M = (1 − ε).
(33)
In Fig. 8a, one can see that the newly proposed BERA and OERA algorithms outperform the traditional protocols, yielding longer lifespan (the longevity is twice or
114
J. Levendovszky et al.
Fig. 8 Comparing the lifespan of the BERA and OERA algorithms with lifespan achieved by traditional protocols: (a) Time to the earliest death and (b) time to the latest death
three times longer compared to the single-hop protocol). On the other hand, single hop is better in the sense of last node dying, since it is the nearest node to the BS (Fig. 8b). The probability of successful packet transfer to BS has also been evaluated in the case of all protocols, where parameter ε was set as ε = 0.05 . The results depicted in Fig. 8 are made from multiple test, where 20 different random networks were tested and averaged. In the next simulations the lifespan was defined as the longevity of the longest lasting node. Figure 9 indicates the percentage of the operational node as a function of time.
Fig. 9 The percentage of the operational node as a function of time
One can see that in the case of BERA algorithm the nodes go flat more or less at the same time. BERA is the best in the load balancing effect, as using energy awareness. OERA makes a good compromise between the earliest and latest death of the network, i.e., the latest death is occurring much later than in the case of BERA; however, the first death occurs earlier. Here first and last death are consid-
Reliability-Based Routing Algorithms
115
ered timewise. In the case of BERA the longevity of the first node going flat has been significantly improved compared with the traditional methods. Figure 10 depicts the average number of packets transferred to the BS with respect to the size of the network when all protocols started to run with the same initial energy. The packets were generated subject to uniform distribution in the case of each protocol. The optimal strategy can be selected easily if we know the network density. If this density is low PEGASIS is good choice, on the other hand BERA guarantees the highest throughput (the maximum number of delivered packets) if the density is high. From Fig. 10, it can be seen that the performance of some protocols depends on the number of nodes. Furthermore, DD underperforms the rest of the protocols due to the fact that they minimize the total energy consumption instead of maximizing the longevity of node which goes flat earliest. One may also observe that the new BERA protocol performs rather well, implying that more packets can be transferred with a given level of initial energy than in the case of the classical protocols. The simulation results are made from a multiple test (running the simulations several times and averaging the results).
Fig. 10 Average number of delivered packets using a given energy level
Figure 11 depicts the lifespan with respect to the reliability parameter. Analyzing the different protocols, one can see that the new BERA protocol can achieve higher lifespan in the case of all ε. One can also observe that decreasing ε will increase the lifespan exponentially. In this example, algorithms perform very differently, but PEDAP-PA has proven to be a good strategy for this specific topology. Figure 12 demonstrates the load-balancing capabilities of the different protocols in the case of a 20-node network. The results were obtained from a single test, as we have analyzed the battery reduction of all nodes as a function of time. As can be seen, the single hop and DD do not enforce load balancing at all, while LEACH, PEDAP-PA, and the new protocols achieve good load balancing. One can see that the new BERA protocol provides the best load balancing. If longevity is defined as the death of the last node, BERA still performs well, as the nodes will die more or less at the same time. The PEGASIS protocol does not perform well because
116
J. Levendovszky et al.
Fig. 11 Lifespan of the network depends on the required service
satisfying the reliability criterion in the case of a large number of nodes in a chain topology needs a large amount of energy consumption. So far, we have adopted the Rayleigh model for performance analysis, which is valid if the node antennas do not “see” each other. This typically occurs in indoor applications. If the antennas can see each other then the Rice fading [1, 32] describes the radio channel better. Figure 13 indicates, how the lifespan increases (in the case of BERA) when the amplitude of the dominant wave grows and it also demonstrates the effect of errors in the fading parameter estimation on the lifespan. The Rician factor F means the ratio of the LOS signal energy and the non-LOS signal energy [23, 28]. As can be seen, by underestimating the fading parameters the lifespan will increase but it puts the reliability in jeopardy. In the case of overestimation the reliability criterion is satisfied but with excessive energy consumption (too large transmission energies are selected) and the lifespan is decreased. The expected value and the standard deviation of the delay in the case of the different protocols are depicted in Fig. 14b. From the figure it can be inferred that with increasing lifespan the latency is also going to be increased. In Fig. 14a one can see the dependency of the size of the network and delay. In the case of PEGASIS, the packet is forwarded along a very long chain, which may result in extremely long delays. The long chain is also disadvantageous with respect to the transmission energies as the reliability can only be maintained by high transmission energies in the case of several hops. In Tables 3, 4, and 5, respectively the investigated protocols are compared with each other. The values are always normalized by the lifespan provided by the best protocol, e.g., 79% for PEDAP-PA with N = 50 implies that the lifespan of PEDAP-PA is 79% of the lifespan of BERA for the multiple test. From Table 3 one
Reliability-Based Routing Algorithms
117
Fig. 12 Illustration of the load balancing performance of different protocols
Fig. 13 The impact of fading parameter estimation on the lifespan and reliability (the reliability parameter was set as ε = 0.05, while the exponent of fading attenuation was set as α = 3)
118
J. Levendovszky et al.
Fig. 14 Average delay of the different protocols: (a) delay of the analyzed protocols and (b) delay in the function of network size
can see that the single-hop protocol performs rather poorly, due to the fact that the node closest to BS carries a high load and dies prematurely. However, in Tables 4 and 5 the single-hop protocol exhibits increasing performance as κ increases (for the definition of κ, see Section 2.3 on page 8). In the case of a low node number the PEGASIS algorithm proves to be the best. However, when the number of nodes is increasing then the lifespan of PEGASIS deteriorates fast. LEACH can only be used in large networks as it requires the clustering of nodes. Analyzing Tables 4 and 5 one can see that OERA performs the best in the case of choosing κ = 0.2 or 0.4 because it minimizes the overall energy. One can clearly see that the traditional protocols are less efficient than the new ones. Furthermore, the traditional protocols do not guarantee reliability, either.
N 100 50 20 10 5
N 100 50 20 10 5
PEDAP-PA (%) 67 79 82 91 88
PEDAP-PA (%) 60 65 83 72 71
Table 3 Comparison of lifespans till the first node dies Single PEGASIS LEACH DD hop (%) (%) (%) (%) 8 16 24 42 68
10 3 6 38 108
19 10 10 19 28
20 17 20 26 55
Table 4 Comparison of lifespans with parameter κ = 0.2 Single PEGASIS LEACH DD hop (%) (%) (%) (%) 25 28 32 52 42
9 10 25 39 63
20 15 23 53 24
44 45 53 47 72
OERA (%)
BERA (%)
42 47 48 67 82
100 100 100 100 100
OERA (%)
BERA (%)
100 100 100 100 76
94 89 92 86 100
Reliability-Based Routing Algorithms
N 100 50 20 10 5
PEDAP-PA (%) 58 64 67 70 60
119
Table 5 Comparison of lifespans with parameter κ = 0.4 Single PEGASIS LEACH DD hop (%) (%) (%) (%) 42 55 62 57 70
10 21 35 70 100
22 30 34 45 35
41 47 51 55 60
OERA (%)
BERA (%)
100 100 100 100 75
74 73 69 75 53
Based on the numerical results, as a summary one can conclude that the new BERA protocol can outperform the traditional ones and it can be applied in any application when longevity and reliability are of major concerns. Moreover, one can see that increasing κ the OERA algorithm will also perform well as it minimizes the overall energy without balancing.
6 Protocol Implementation In this section we summarize the implementation of the novel algorithms. At first we describe the special protocol stack which must be implemented to use the new routing protocols. Second, we demonstrate how to develop distributed implementations.
6.1 Protocol Stack Assumptions The proposed novel reliability-based routing algorithms (OERA and BERA) can directly be implemented as routing protocols if the protocol layers are operating as follows: • Application layer: The BS collects the measurements made by the nodes and determines the reliability and other QoS parameters which is then handed over to the network layer. • Network layer: When requesting data from node i the routing protocol determines the optimal path and transmission energies based on OERA or BERA algorithm taking into account the network topology, energy levels, and QoS parameters. As the information will be conveyed to node i via the nodes contained by the optimal path, each node in the path is informed about its role and the transmission power can be set accordingly. • Data link layer: It implements a time division multiple access (TDMA) protocol with synchronous access, where each node uses a different time slot for access but most of the time the nodes are switched off (in this way the resources are not misused and collision is avoided). • Physical layer: It can operate on any platform (e.g., Mica2, BTmote, Skymote,TI ez430 [9, 21, 31]).
120
J. Levendovszky et al.
The two requirements for the operation of the WSN indicated above are (i) the knowledge of the current state of the network as far as the available node powers and fading state are concerned and (ii) time synchronization for TDMA. These conditions can be ensured by running a “maintenance protocol” periodically, which estimates the fading parameters and ensures the clock synchrony. As the data traffic is known on the BS the available node energies can be evaluated there.
6.2 Distributed Implementations for Novel Packet Forwarding and Routing Algorithms In this section distributed versions of the aforementioned algorithms are developed based on the ad hoc on-demand distance vector (AODV) protocol [19]. AODV is a proactive table-driven routing protocol for mobile ad hoc networks (MANET), implying that every node maintains a routing table: • Random shortcut protocol: It is easy to see that the novel random shortcut packet forwarding method can be combined with an AODV protocol implementation. Decision algorithm must only be modified by using a Bernoulli random number generator, which can overwrite the next destination address. • Overall energy routing algorithm: In the case of fading-aware routing with minimal overall energy consumption, the metric of the links is given by expression (26). Then, we can run the algorithm described in Section 5.1, by using AODV for each of the steps. • Bottleneck energy routing algorithm: In the case of BERA, let the metric now be defined as given in (31). In this way, we select the route which has maximum reliability. Unfortunately the nodes are not aware of Aopt which is ci (k + 1) if the solution is optimal. Theorem 3 states that finding the most reliable route in a case where the residual energy is optimal, we achieved the global optimum. As we cannot run the AODV protocol recursively due to its huge signaling overhead, one may run the algorithm only once with an estimated Aopt . Let this estimation ˆ We can be sure that be denoted by A. min ci < Aopt < max ci − gmax
(34)
holds. This algorithm runs as follows: according to (34) we can choose Aˆ = max ci − gmax which is propagated in the network and each node calculates the corresponding transmission energy according to (30). Then we run AODV with the calculated link measures (if some of the nodes do not have the necessary ˆ then the connecting edges will be set amount of residual energy dictated by A, zeros; hence, they will not be part of the path). Aˆ is very close to A opt because gmax is much smaller than ci . If we change Aˆ all the time then the routing tables are updated continuously, which would present an intolerable overhead. Hence it is advised to update the routing tables only with a given frequency, which may
Reliability-Based Routing Algorithms
121
ease the routing complexity and decrease the amount of signaling. This frequency parameter can also be optimized with respect to the dynamics and the size of the network; however, this problem is not the subject of the present discussion.
7 Conclusion In this chapter, optimal packet forwarding algorithms and reliability-based energyaware routing for WSN have been studied. First, the statistical analysis of the socalled random shortcut protocol was given by large deviation theory and the gain in the achieved lifespan was demonstrated. Then we have proposed two novel routing algorithms (OERA and BERA), which are capable of providing reliability-based routing in polynomial time. With the help of the new protocols the probability of packet loss can be kept under a predefined threshold, while the transmission energies are minimized along the paths. It has also been shown that these novel algorithms outperform the traditional routing algorithms with respect to both longevity and reliability. Furthermore, the new protocols can be implemented in a distributed fashion. As a result, they can be applied to applications where lifespan and reliable communication to the BS are of major concerns.
Appendix A – Proof of Theorem 1 As the reliability of packet transfer is L ! l=1
⇒
Ψ gil il+1 = exp L l=1
−diα i Θσ Z2 l l+1 gil il+1
L
−diα i
l=1
Θσ Z2
l l+1 gil il+1
≥ (1 − ε) (35)
≥ ln(1 − ε).
Using the definition of wil ,il+1 in (23), we can reformulate (35) as L −wil il+1 l=1
gil il+1
≥ −1, wil il+1 > 0, gil il+1 > 0.
Hence, we have the following constraint optimization (CO) problem: Let
f (G) =
L l=1
and
gil il+1
(36)
122
J. Levendovszky et al.
g(G) =
L −ail il+1 l=1
gil il+1
+ 1,
-
. where G = gi0 ,i1 , gi1 ,i 2 , . . . , gi L ,i L+1 . The CO is G opt : min f G s.t. g G ≥ 0.
(37)
G
Let L(G, λ) = f (G)−λg(G) be the Lagrangian function of the problem. Therefore, its Lagrange dual problem can be written as the following: max L(G, λ)s.t. λ ≥ 0; G,λ
∂L ∂G
= 0.
After solving (38), we have the following solution: gil il+1 = L √ k=1 wi k ,i k+1 . Thus the optimal solution is gil il+1
(38) λwil ,il+1 and
λopt =
) & L √ √ = wik ,ik+1 · wil ,il+1 . k=1
Appendix B1 – Proof of Theorem 2 First, we show that the solution of optimization task defined in Section 5 must fulfill "
Ψ gil il+1 = 1 − ε.
(39)
il ∈
Let us assume that (39) does not hold, then because of (29) the next expression holds "
Ψ gil il+1 > 1 − ε.
(40)
il ∈
1 for which gˆil i < gil i , il = arg min(ci j − gi j i ) In this case, there exists a G l+1 l+1 j+1 ij
and (39) is satisfied. This will yield a better solution; thus, the path in (40) is not needed. Theorem 2 states that if G is a solution of (21) then cil (k + 1) = A, for ∀il . Let us assume that it does not hold, meaning that cil (k + 1) = Ail is a better solution implying A < A il Then the values can be arranged as follows:
∀il .
(41)
Reliability-Based Routing Algorithms
123
Ai1 < Ai2 ≤ · · · ≤ Ai L
(42)
as the remaining energies are different. However, if (41) and (42) are true then "
Ψ dil il+1 , gil il+1 < 1 − ε
(43)
il ∈
as Ψ (.) is monotone decreasing with respect to the remaining energy and A is the solution of (30) which contradicts (29).
Appendix B2 – Proof of Theorem 3 First, we show that Ak is monotone increasing. Let us denote the solution of (30) as a function F (.) and similarly the solution of (31) is represented by a function G (.), namely A = F (), = G (A), respectively. Furthermore, let us introduce H (, A) =
"
Ψ dil il+1 , cil − A ,
(44)
il ∈
where dil il+1 argument represents the distance between the nodes. Then Ak = F (k )
(45)
gives us the path k with an optimal Ak selection which satisfies the (1−ε) criterion. Then we select a new path as follows: k+1 = G ( Ak )
(46)
is more reliable, since function G (.) seeks the most reliable path based on A, thus k+1 will be more reliable than k : "
Ψ dil il+1 , cil − A > 1 − ε.
(47)
il ∈k+1
If there is no more reliable route, then we get stuck in a fix point. The monotonicity of Ψ (.) and (47) implies that Ak+1 = F (k+1 )
(48)
will give a solution where Ak+1 > Ak . Second, it will be shown that if A = F (G ( A))
(49)
124
J. Levendovszky et al.
is a fix point then ∃A∗ > A, A∗ = F G A∗ .
(50)
In other words there exists only one fix point of Algorithm 1. It is trivial that H (G ( A) , A) = 1 − ε,
(51)
because of (49). Since G (.) selects the most reliable path it holds that ∀ = G (A) ,
H (, A) ≤ 1 − ε.
(52)
Let us assume indirectly that there exists A∗ > A fix point. In this case, ∀ = G ( A) ,
H , A∗ < 1 − ε.
(53)
But A∗ cannot be a fix point because (30) always ensures that the chosen strategy fulfills the QoS requirement.
References 1. Abdi A, Tepedelenlioglu C, Member S, Member S, Kave M, Giannakis G (2001) On the estimation of the k parameter for the rice fading distribution. IEEE Commun Lett 5:92–94 2. Abrach H, Bhatti S, Carlson J, Dai H, Rose J, Sheth A, Shucker B, Deng J, Han R (2003) Mantis: System support for multimodal networks of in-situ sensors. In: 2nd ACM International Workshop on Wireless Sensor Networks and Applications (WSNA), San Diego, CA, USA, pp 50–59 3. Akkaya K, Younis M (2005) A survey on routing protocols for wireless sensor networks. Elsevier Ad Hoc Netw J 3:325–349. http://citeseerx.ist.psu.edu/viewdoc/summary? doi=10.1.1.85.3616 4. Akyildiz IF, Su W, Sankarasubramaniam Y, Cayirci E (2002) Wireless sensor networks: A survey. Comput Netw 38:393–422 5. Al-Karaki JN, Kamal AE (2004) Routing techniques in wireless sensor networks: A survey. IEEE Wireless Commun 11(6):6–28. doi 10.1109/MWC.2004.1368893. http://dx.doi.org/10.1109/MWC.2004.1368893 6. Arampatzis T, Lygeros J, Manesis S (2005) A survey of applications of wireless sensors and wireless sensor networks, pp 719–724. http://ieeexplore.ieee.org/xpls/abs all.jsp?arnumber=1467103 7. Cauligi CA, Lindsey S, Raghavendra CS, Raghavendra CS Pegasis: Power-efficient gathering in sensor information systems, IEEE Aerospace Conf. Proc., 2002, vol. 3, 9–16, pp 1125–30 8. Chernoff H (1952) A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann Math Stat 23:493–509 9. Fournel N, Fraboulet A, Chelius G, Fleury E, Allard B, Brevet O (2007) Worldsens: From lab to sensor network application development and deployment. In: IPSN ’07: Proceedings of the 6th international conference on Information Processing in Sensor Networks. ACM, New York, NY, USA, pp 551–552. doi http://doi.acm.org/10.1145/1236360.1236435
Reliability-Based Routing Algorithms
125
10. Haenggi M (2005) On routing in random Rayleigh fading networks. IEEE Trans Wireless Commun 4(4):1553–1562. Available at http://www.nd.edu/ mhaenggi/pubs/routing.pdf 11. Heinzelman WR, Ch A, Balakrishnan H (2000) Energy-efficient communication protocol for wireless microsensor networks, Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, 4-7 Jan. 2000, vol.2, pp 10 12. Intanagonwiwat C, Govindan R, Estrin D, Heidemann J, Silva F (2003) Directed diffusion for wireless sensor networking. IEEE/ACM Trans Netw 11(1):2–16. doi http://dx.doi.org/10.1109/TNET.2002.808417 13. Kuorilehto M, Hännikäinen M, Hämäläinen TD (2005) A survey of application distribution in wireless sensor networks. EURASIP J Wirel Commun Netw 2005(5):774–788. doi: http://dx.doi.org/10.1155/WCN.2005.774 14. Levendovszky J, Bojárszky A, Karlócai B, Oláh A (2008) Energy balancing by combinatorial optimization for wireless sensor networks. WTOC 7(2):27–32 15. Levendovszky J, Kiss G, Tran-Thanh L (2009) Energy balancing by combinatorial optimization for wireless sensor networks. In: Performance Modelling and Analysis of Heterogeneous Networks. River Publishers, Aalborg, Denmark, pp 169–182 16. Mani CS, Srivastava MB (2001) Energy efficient routing in wireless sensor networks. In: MILCOM Proceedings on Communications for Network-Centric Operations: Creating the Information Force, Washington, D.C., USA, pp 357–361 17. N. Narasimha Datta and K. Gopinath (2005) Indian Institute of Science: A survey of routing algorithms for wireless sensor networks. J. Indian Inst. Sci., Nov-Dec. 2006, 86, 569–598 18. Pereira PR, Grilo A, Rocha F, Nunes MS, Casaca A, Chaudet C, Almström P, Johansson M (2007) End-to-end reliability in wireless sensor networks: Survey and research challenges. In: Pereira PR (ed) EuroFGI workshop on IP QoS and Traffic Control. EuroFGI 2007 Workshop on IP QoS and Traffic Control, Lisbon, Portugal 19. Perkins C, Royer E (1997) Ad-hoc on-demand distance vector routing. In: Proceedings of the 2nd IEEE Workshop on Mobile Computing Systems and Applications, New Orleans, LA, USA, pp 90–100 20. Polastre J, Szewczyk R, Culler D (2005) Telos: enabling ultra-low power wireless research. In: IPSN’05: Proceedings of the 4th International Symposium on Information Processing in Sensor Networks, IEEE, Piscataway, NJ, USA. http://portal.acm.org/citation.cfm?id=1147685.1147744 21. Ramamurthy H, Prabhu BS, Gadh R, Madni AM (2007) Wireless industrial monitoring and control using a smart sensor platform. IEEE Sensors J 7(5):611–618. doi 10.1109/JSEN.2007.894135. URL http://dx.doi.org/10.1109/JSEN.2007.894135 22. Rappaport T (2001) Wireless communications: principles and practice. Prentice Hall PTR, Upper Saddle River, NJ 23. Rmer K (2004) Tracking real-world phenomena with smart dust. In: EWSN 2004, Springer, Berlin, Germany, pp 28–43 24. Schwiebert L, Gupta SK, Weinmann J (2001) Research challenges in wireless networks of biomedical sensors. In: MobiCom’01: Proceedings of the 7th Annual International Conference on Mobile Computing and Networking. ACM, New York, NY, USA, pp 151–165. doi http://doi.acm.org/10.1145/381677.381692 25. Sohrabi K, Gao J, Ailawadhi V, Pottie GJ (2000) Protocols for self-organization of a wireless sensor network. IEEE Pers Commun 7:16–27 26. Srivastava M, Muntz R, Potkonjak M (2001) Smart kindergarten: Sensor-based wireless networks for smart developmental problem-solving environments. 
In: MobiCom’01: Proceedings of the 7th Annual International Conference on Mobile Computing and Networking, ACM, New York, NY, USA, pp. 132–138. doi http://doi.acm.org/10.1145/381677.381690 27. Stüber GL (2001) Principles of mobile communication, 2nd edn. Kluwer, Norwell, MA 28. Tan HO, Körpeo˘glu I (2003) Power efficient data gathering and aggregation in wireless sensor networks. ACM SIGMOD Rec 32(4):66–71. doi http://doi.acm.org/10.1145/959060.959072 29. Tran-Thanh L, Levendovszky J (2009) A novel reliability based routing protocol for power aware communications in wireless sensor networks. In: WCNC’09: Proceedings of the 2009
126
30.
31.
32.
33.
34.
J. Levendovszky et al. IEEE conference on Wireless Communications & Networking Conference, IEEE, Piscataway, NJ, USA, pp 2308–2313 Vieira MAM, da Silva Jr, DC, Coelho Jr, CJN, da Mata JM (2003) Survey on wireless sensor network devices. In: Proceedings of the 9th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA’03), Lisbon, Portugal Wyne S, Santos T, Tufvesson F, Molisch AF (2007) Channel measurements of an indoor office scenario for wireless sensor applications. In: GLOBECOM, 2007, Washington, D.C., USA, pp 3831–3836. IEEE. http://dblp.uni-trier.de/db/conf/globecom/globecom2007.html Ning Xu. A survey of Sensor Network Applications. IEEE. Communications Magazine, pp 102–114, 2002. Proceedings of the Electronics, Robotics and Automotive Mechanics Conference, Washington, DC, USA Xu Y, Heidemann J, Estrin D (2001) Geography-informed energy conservation for ad hoc routing. In: MobiCom’01: Proceedings of the 7th Annual International Conference on Mobile Computing and Networking, ACM, New York, NY, USA, pp 70–84. doi http://doi.acm.org/10.1145/381677.381685 Ye NF, Chen FYA (2001) A scalable solution to minimum cost forwarding in large sensor, 10th International Conference on Computer Communications and Networks Proceedings, 2001, Scottsdale, Arizona USA, pp 304–309
Opportunistic Scheduling with Deadline Constraints in Wireless Networks David I Shuman and Mingyan Liu
1 Introduction

We consider a single source transmitting data to one or more users over a wireless channel. It is highly desirable to operate such a wireless system in an energy-efficient manner. When the sender is a mobile device relying on battery power, this is obvious. However, even when the sender is a base station that is not power constrained, it is still desirable to conserve energy in order to limit potential interference to other base stations and their associated mobiles. Due to random fading, wireless channel conditions vary with time and from user to user. The key realization from a transmission scheduling perspective is that these channel variations are not a drawback, but rather a feature to be beneficially exploited. Namely, transmitting more data when the channel between the sender and receiver is in a "good" state and less data when the channel is in a "bad" state increases system throughput and reduces total energy consumption. Doing so is commonly referred to as opportunistic scheduling. In this chapter, we are particularly interested in delay-sensitive applications, such as multimedia streaming, voice over Internet protocol (VoIP), and data upload or transfer with a time restriction. In such applications, packets are often subject to hard deadline constraints, after which time their transmission provides limited or no benefit. Our objective is to review energy-efficient opportunistic transmission scheduling policies that also comply with deadline constraints. Specifically, we want to provide an intuitive understanding of how the deadline constraints affect the scheduler's optimal behavior.

David I Shuman, Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, MI 48109, USA, e-mail: [email protected]
Mingyan Liu, Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, MI 48109, USA, e-mail: [email protected]
The remainder of the chapter is organized as follows. In the next section, we introduce the concept of opportunistic scheduling and provide some examples. In Section 3, we review some common modeling issues in opportunistic scheduling problems in wireless networks. We formulate three related opportunistic scheduling problems with deadline constraints, and then elucidate the role of the deadline constraints in Section 4. In Section 5, we relate the models of Section 4 to models from inventory theory. Section 6 concludes the chapter.
2 Opportunistic Scheduling

We motivate opportunistic scheduling with a few simple example problems.

Example 1 Consider a channel that can be in one of M channel conditions, with probabilities p^1, p^2, ..., p^M, respectively. Associated with each channel condition is a known convex, increasing, differentiable power-rate function, f^1(z), f^2(z), ..., f^M(z), respectively, describing the power required to transmit at data rate z (or equivalently the energy required to transmit z packets in a discrete time slot or time unit). The objective is to minimize the average power consumed over an infinite horizon, subject to a minimum average rate constraint, R̄. This problem reduces to the following convex optimization problem:

$$
\min_{(z^1, z^2, \ldots, z^M) \in \mathbb{R}_+^M} \; \sum_{i=1}^{M} p^i \, f^i(z^i)
\quad \text{s.t.} \quad \sum_{i=1}^{M} p^i z^i \ge \bar{R},
\tag{1}
$$

where z^i represents the number of packets transmitted per time slot when the channel is in condition i. The solution to (1) is found by reducing (in the same manner as [11, Example 5.2, p. 245]) the Karush–Kuhn–Tucker (KKT) conditions to

$$
z^{i*} \ge 0, \quad z^{i*} \cdot p^i \left( \nu^* + f^{i\prime}(z^{i*}) \right) = 0, \quad \text{and} \quad \nu^* + f^{i\prime}(z^{i*}) \ge 0 \quad \forall i \in \{1, 2, \ldots, M\}, \quad \text{and} \quad \mathbf{p}^T \mathbf{z}^* = \bar{R},
$$

where ν* is the Lagrange multiplier associated with the rate constraint. Graphically, the so-called inverse water-filling solution is found by fixing the slope of a tangent line and setting the number of packets to be transmitted under condition i to be a z^i such that f^{i\prime}(z^i) is equal to the slope, or zero if f^{i\prime}(z^i) is greater than the slope for all z^i ≥ 0. This process is continuously repeated as the slope of the tangent line is gradually increased until \sum_{i=1}^{M} p^i z^i = \bar{R}. The resulting optimal solution z* has the property that for every channel condition i, the optimum number of packets z^{i*} is either equal to zero or satisfies f^{i\prime}(z^{i*}) = −ν*, where −ν* is the slope of the final tangent line. See Fig. 1 for a diagram of this solution.
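To make the tangent-line search concrete, here is a minimal numerical sketch (our own illustration, not taken from the chapter) that solves an instance of (1) by bisecting on the common tangent slope. The quadratic power-rate functions f^i(z) = a_i z^2 + b_i z, the probabilities, and the rate target are made-up example values.

```python
import numpy as np

# Hypothetical example data: channel-state probabilities and quadratic
# power-rate curves f_i(z) = a_i * z**2 + b_i * z (convex, increasing).
p = np.array([0.3, 0.5, 0.2])
a = np.array([0.5, 1.0, 2.0])
b = np.array([0.1, 0.2, 0.4])
R_bar = 2.0  # minimum average rate constraint

def z_of_slope(slope):
    """Packets sent in condition i when the tangent slope is `slope`:
    solve f_i'(z) = 2*a_i*z + b_i = slope, clipped at zero
    (the inverse water-filling rule)."""
    return np.maximum((slope - b) / (2.0 * a), 0.0)

# Bisection on the tangent slope until the average-rate constraint is met.
lo, hi = 0.0, 1e6
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if p @ z_of_slope(mid) < R_bar:
        lo = mid          # slope too small: average rate not yet reached
    else:
        hi = mid
z_star = z_of_slope(hi)
print("optimal packets per condition:", z_star)
print("average rate:", p @ z_star,
      " average power:", p @ (a * z_star**2 + b * z_star))
```

The loop mirrors the graphical procedure in the text: raising the slope raises every z^i simultaneously until the average-rate constraint binds.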
Fig. 1 Pictorial representation of the solution to (1). The vector z* of the optimal number of packets to transmit under each channel condition has the property that f^{i\prime}(z^{i*}) is the same for all channel conditions i such that z^{i*} > 0
Example 2 Next, we consider the same infinite horizon average cost problem as (1), with the additional stipulations that (i) the power-rate function in each channel condition is linear, with slope φ^i, and (ii) there is a power constraint P in each slot. In other words,

$$
f^i(z^i) =
\begin{cases}
\varphi^i \cdot z^i, & \text{if } z^i \le \frac{P}{\varphi^i} \\
\infty, & \text{if } z^i > \frac{P}{\varphi^i}
\end{cases}.
$$

We assume without loss of generality that φ^1 ≤ φ^2 ≤ ··· ≤ φ^M (i.e., φ^1 is the slope of the power-rate function under the best channel condition and φ^M is the slope under the worst condition). With these assumptions, the problem becomes

$$
\min_{(z^1, z^2, \ldots, z^M) \in \mathbb{R}_+^M} \; \sum_{i=1}^{M} p^i \varphi^i z^i
\quad \text{s.t.} \quad \sum_{i=1}^{M} p^i z^i \ge \bar{R}
\quad \text{and} \quad z^i \le \frac{P}{\varphi^i} \;\; \forall i \in \{1, 2, \ldots, M\},
\tag{2}
$$

where z^i represents the number of packets transmitted per time slot when the channel is in condition i. The solution to (2) is found by defining

$$
j^* := \min \left\{ j \in \{1, 2, \ldots, M\} : \sum_{m=1}^{j} p^m \cdot \frac{P}{\varphi^m} \ge \bar{R} \right\}.
$$

Then the optimal amount of data to send under each channel condition is given by

$$
z^{m*} :=
\begin{cases}
\frac{P}{\varphi^m}, & \text{if } m < j^* \\[4pt]
\dfrac{\bar{R} - \sum_{m=1}^{j^*-1} p^m \cdot \frac{P}{\varphi^m}}{p^{j^*}}, & \text{if } m = j^* \\[4pt]
0, & \text{if } m > j^*.
\end{cases}
\tag{3}
$$
See Fig. 2 for a diagram of this solution.
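As a complementary illustration, the following short Python sketch (ours, not from the original text) evaluates the threshold rule (3) for made-up state probabilities, slopes, power limit, and rate target.

```python
import numpy as np

# Hypothetical example data; states are assumed already sorted so that
# phi[0] <= phi[1] <= ... (best channel condition first).
p = np.array([0.2, 0.3, 0.5])      # state probabilities
phi = np.array([0.5, 1.0, 2.0])    # power per packet in each state
P = 4.0                            # per-slot power constraint
R_bar = 2.5                        # minimum average rate

cap = P / phi                      # max packets transmittable in each state
cum = np.cumsum(p * cap)           # average rate if states 1..j are fully used
if cum[-1] < R_bar:
    raise ValueError("rate target infeasible under the power constraint")
j_star = int(np.argmax(cum >= R_bar))   # first index meeting the target (0-based)

z = np.zeros_like(phi)
z[:j_star] = cap[:j_star]               # m < j*: transmit at full power
z[j_star] = (R_bar - (p[:j_star] * cap[:j_star]).sum()) / p[j_star]   # m = j*
# m > j*: transmit nothing (entries already zero)
print("z* =", z, " average rate =", p @ z)
```

The snippet simply finds the first prefix of (sorted) channel conditions whose full-power usage meets the rate target and fills the marginal condition j* only partially, exactly as in (3).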
Fig. 2 Pictorial representation of the solution to (2). Each plot represents the power-rate curve under a different channel condition. The full power available is used for transmission when the channel is in its best condition(s), and no packets are transmitted when the channel is in its worst condition(s)
Examples 1 and 2 illustrate the main idea of exploiting the temporal variation of the channel via opportunistic scheduling; namely, we can reduce energy consumption by sending more data when the channel is in a "good" state, and less data when the channel is in a "bad" state. Much of the challenge for the scheduler lies in determining how good or bad a channel condition is, and how much data to send accordingly. In Examples 1 and 2, the scheduler was scheduling packets for a single receiver, but it is often the case in wireless communication networks that a single source sends data to multiple users over a shared channel. Such a downlink system model is shown in Fig. 3. In this situation, the scheduler can exploit both the temporal variation and the spatial variation of the channel by sending data to the receivers with the best conditions in each slot. The benefit from doing so is commonly referred to as the multiuser diversity gain [77]. It was introduced in the context of the analogous uplink problem where multiple sources transmit to a single destination (e.g., the base station) [45].
Fig. 3 Multiuser downlink system model. A single source transmits data to multiple users over a shared wireless channel
3 Modeling Issues and Literature Review There is a wide range of literature on opportunistic scheduling problems in wireless communications. This section is by no means intended to be an exhaustive survey of problems that have been examined, but rather an introduction to some of the most common modeling issues. For more complete surveys of opportunistic scheduling studies in wireless networks, see [50, 52].
3.1 Wireless Channel

Modeling the wireless channel deserves an entire book in its own right. For a good introduction to the topic, see [74]. Here, we restrict attention to modeling the wireless channel at the simplest level required for opportunistic scheduling problems, without considering any specific modulation or coding schemes. In this context, the condition of the time-varying wireless channel is usually modeled as either (i) independently and identically distributed (IID) over time; or (ii) a discrete-time Markov process. In the case of multiple receivers, as shown in Fig. 3, the channels between the sender and each receiver may or may not be correlated in each time slot. For a detailed introduction to modeling fading channels as Markov processes, see [63]. In general, the transmitter can reliably send data across the channel at a higher rate by increasing transmission power. For each possible channel condition, there is a corresponding power-rate curve that describes how much power is required to transmit at a given rate. In the low signal-to-noise ratio (SNR) regime, this power-rate curve is commonly taken to be linear and strictly increasing. In the high SNR regime, the power-rate curve is commonly taken to be convex and strictly increasing [74, Section 5.2]. For a justification of the convex assumption, see [76]. Specific convex power-rate curves that have been considered in the literature include (i) c(z, s) = (2^z − 1)/α_1(s) (see, e.g., [47]), motivated by the capacity of a discrete-time additive white Gaussian noise (AWGN) channel, and (ii) c(z, s) = z^μ/α_2(s) (see, e.g., [48]), where in both cases, c(z, s) is the power required to transmit at rate z under channel condition s, μ is a fixed parameter, and the α_i's are parameters that may depend on the channel condition.
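For intuition, and as our own addition (assuming α_1(s) is interpreted as the received SNR per unit power under condition s), the AWGN-motivated curve follows from inverting the capacity expression:

$$
z = \log_2\!\left(1 + \alpha_1(s)\, c\right) \;\;\Longrightarrow\;\; c(z, s) = \frac{2^{z} - 1}{\alpha_1(s)}.
$$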
3.2 Channel State Information In this chapter, we assume that, through a feedback channel, the transmission scheduler learns perfectly (and for free) the state of the channel between the sender and each receiver at the beginning of every time slot. Thus, its scheduling decisions are based on all past and current states of the channel(s), but none of the future channel realizations. This set of assumptions is commonly referred to as causal or full channel state information. Some papers such as [17, 75] also refer to problems resulting from this assumption on the scheduler’s information as online scheduling problems, to differentiate them from offline scheduling problems, where the scheduler learns all future channel realizations at the beginning of the time horizon. For a recent survey of research on systems with limited feedback, which may cause the channel state information to be outdated or suffering from errors, see [53]. References [26, 29, 64, 70, 82] also discuss ways to deal with restrictions on the timing and amount of feedback in an opportunistic scheduling context. A second relaxation of the perfect channel state information assumption is to force the scheduler to decide whether or not to attain channel state information at some cost, which represents the time and energy consumed in learning the channel
state. The process of learning the channel is often referred to as probing, and [12– 14, 32, 33, 37, 57, 62] are examples of studies that examine the best joint strategies for probing and transmission in different contexts.
3.3 Data The simplest and often most tractable way to model the data is that the sender has an infinite backlog of data to send to each receiver. Analysis under this assumption gives a bound on the maximum achievable performance of a system in terms of throughput. Alternatively, one can assume that data arrive to the sender’s buffer over time and explicitly model the arrival process. The arrival process may be deterministic (often the case in offline scheduling problems, where the scheduler is assumed to learn the times of all future arrivals at the beginning of the horizon), an IID sequence of random variables (as in [18]), a Poisson process (as in [3]), a discretetime Markov process (as in [2]), or just about any other stochastic process appropriate for a given application. With an arriving packet model, the scheduler’s control policies often depend on both the current queue length of packets backlogged at the sender and the statistics of future arrivals. It may also be the case, as in [3], that the sender’s buffer to store arriving packets is of finite length. If so, the scheduler must take care to avoid having to drop packets due to buffer overflow. Finally, the opportunistic scheduling literature is divided on the treatment of a “packet.” Some studies take the data to be some integer number of packets that cannot be split, while others consider a fluid packet model that allows packets to be split, with the receiver reassembling fractional packets.
3.4 Performance Objectives Broadly speaking, opportunistic scheduling problems in wireless networks focus on the trade-offs between energy efficiency, throughput, and delay. With some exceptions (e.g., [6]), delay is usually modeled as a QoS constraint (a maximum acceptable level of delay), rather than a quantity to be directly minimized. In many opportunistic scheduling problems, delay is not even considered, with the justification that some applications are not delay sensitive. We discuss delay further in the next section. Thus, the two most basic setups are (i) to maximize throughput, subject to a constraint on the maximum average or total energy expended, and (ii) to minimize energy consumption, subject to a constraint on the minimum average or total throughput (as in Examples 1 and 2). These two problems are dual to each other, and many similar techniques can therefore be used to solve both problems. Examples of studies that solve both problems for a similar setup and relate their solutions are [25, 48].
3.5 Resource and Quality-of-Service Constraints In this section, we provide a brief introduction to some common resource and QoS constraints. 3.5.1 Transmission Power Due to hardware and/or regulatory constraints, a limit is often placed on the sender’s transmission power in each slot. Some models allow the sender to transmit to multiple users in a slot, with the total transmission power not exceeding a limit, while others only allow the sender to transmit data to a single user in each slot. This power constraint is often left out of problems where the power-rate curve is strictly convex, as the increasing marginal power required to increase the transmission rate prevents the scheduler from wanting to increase transmission power too much. However, the absence of a power constraint in a problem with a linear power-rate curve would often result in the scheduler wanting to increase transmission power well beyond a reasonable limit in order to send a large amount of data when the channel condition is very good (see, e.g., [25]). 3.5.2 Delay Delay is an important QoS constraint in many applications. Different notions of delay have been incorporated into opportunistic scheduling problems. One proxy for delay is the stability of all of the sender’s queues for arriving packets awaiting transmission. The motivation for this criterion is that if none of these queues blows up, then the delay is not “too bad.” With stability as an objective, it is common to restrict attention to throughput optimal policies, which are scheduling policies that ensure the sender’s queues are stable, as long as this is possible for the given arrival process and channel model. References [2, 58, 65, 72] present such throughput optimal scheduling algorithms and examine conditions guaranteeing stabilizability in different settings. When an arriving packet model is used for the data, then one can also define end-to-end delay as the time between a packet’s arrival at the sender’s buffer and its decoding by the receiver. A number of opportunistic scheduling studies have considered the average end-to-end delay of all packets over a long horizon. For instance, [1, 6, 7, 9, 18, 20, 30, 31, 43, 44, 61, 78] all consider average delay, either as a constraint or by incorporating it directly into the objective function to be minimized. However, the average delay criterion allows for the possibility of long delays (albeit with small probability); thus, for many delay-sensitive applications, strict end-to-end delay is often a more appropriate consideration for studies with arriving packet models. References [12, 16, 54, 61] consider strict constraints on the end-to-end delay of each packet. A strict constraint on the end-to-end delay of each packet is one particular form of a deadline constraint, as each arriving packet has a deadline by which it must be transmitted (which happens to be a fixed number of slots after its arrival). This
notion can be generalized to impose individual deadlines on each packet, whether or not the packets are arriving over time or are all in the sender’s buffer from the beginning, as with the case of infinite backlog. Studies that impose such individual packet deadlines include [17, 68]. In [24, 25, 46–49, 71, 75], the individual deadlines coincide, so that all the packets must be received by a common deadline (usually the end of the time horizon under consideration). We further examine the role of these deadline constraints in Section 4. 3.5.3 Fairness If, in the multiuser setting shown in Fig. 3, the scheduler only considers total throughput and energy consumption across all users, it may often be the case that it ends up transmitting to a single user or to the same small group of users in every slot. This can happen, for instance, if a base station requires less power to send data to a nearby receiver, even when the nearby receiver’s channel is in its worst possible condition and a farther away receiver’s channel is in its best possible condition. Thus, fairness constraints are often imposed to ensure that the transmitter sends packets to all receivers. A number of different fairness conditions have been examined in the literature. For example, [5, 51] consider temporal fairness, where the scheduler must transmit to each receiver for some minimum fraction of the time over the long run. Under the proportional fairness considered by [2, 35, 77], the scheduler considers the current channel conditions relative to the average channel condition of each receiver. Reference [51] considers a more general utilitarian fairness, where the focus is on system performance from the receiver’s perspective, rather than on resources consumed by each user. The authors of [10] incorporate fairness directly into the objective function by setting relative throughput target values for each receiver and maximizing the minimum relative long-run average throughput.
4 The Role of Deadline Constraints In this section, we elucidate the role of deadline constraints by examining a series of three related problems. The overarching goal in all three problems is to do energyefficient transmission scheduling, subject to deadline constraints.
4.1 Problem Formulations In all three problems, we consider a single source transmitting data to a single user/receiver over a wireless channel. Time evolution is modeled in discrete steps, indexed backwards by n = N , N −1, . . . , 1, with n representing the number of slots remaining in the time horizon. N is the length of the time horizon, and slot n refers to the time interval [n, n − 1).
The wireless channel condition is time varying. Adopting a block fading model, we assume that the slot duration is within the channel coherence time such that the channel condition within a single slot is constant. The user's channel condition in slot n is modeled as a random variable, S_n. We take the channel condition to be independent and identically distributed (IID) from slot to slot, and denote its sample space by S. At the beginning of each time slot, the transmitter or scheduler learns the channel's state through a feedback channel. It then allocates some amount of power (possibly zero) for transmission. If the channel condition is in state s, then the transmission of z data packets incurs an energy cost of c(z, s). Note that we allow data packets to be split, so the number of packets sent in a given slot can be any nonnegative real number. We also assume the channel condition evolution is independent of any of the transmitter's scheduling decisions. Our primary objective in deriving a good transmission policy is to minimize energy consumption. However, in doing so, we must also meet the deadline constraint(s), and possibly a power constraint in each slot. Thus, all three problems we discuss in this section can be formulated as Markov decision processes (MDPs) with the following common form:

$$
\min_{\pi \in \Pi} \; \mathbb{E}^{\pi} \left[ \sum_{n=1}^{N} c(Z_n, S_n) \,\Big|\, \mathcal{F}_N \right]
\quad \text{s.t.} \quad \text{Per-Slot Power Constraints and Deadline Constraint(s)},
\tag{4}
$$

where F_N denotes all information available at the beginning of the time horizon, and Z_n = π_n(Z_N, Z_{N−1}, ..., Z_{n+1}, S_N, S_{N−1}, ..., S_n) is the number of packets the scheduler decides to transmit in slot n. The sequence π = (π_N, π_{N−1}, ..., π_1) is called a control law, control policy, or scheduling policy, and Π denotes the set of all randomized and deterministic control laws (see, e.g., [34, Definition 2.2.3, p. 15]). Next, we specify the precise variant of (4) for each of the three problems.

4.1.1 Single Deadline Constraint, Linear Power-Rate Curves, and a Power Constraint in Each Slot

The first problem we consider features linear power-rate curves, a power constraint in each slot, and a single deadline constraint. For each possible channel condition s, there exists a constant c_s such that c(z, s) = c_s · z. The maximum transmission power in any given slot is denoted by P, and the total number of packets that need to be transmitted by the end of the horizon is denoted by d_total. In order to ensure that it is always possible to satisfy the deadline constraint, we assume that N · P/c_{s_worst} ≥ d_total, or, equivalently, c_{s_worst} ≤ N · P/d_total, where c_{s_worst} is the energy cost per packet transmitted under the worst possible channel condition. Thus, even if the channel is in the worst possible condition for the entire duration of the time horizon, it is
still possible to send d_total packets by transmitting at full power in every slot. The general formulation (4) becomes

$$
\min_{\pi \in \Pi} \; \mathbb{E}^{\pi} \left[ \sum_{n=1}^{N} c_{S_n} \cdot Z_n \,\Big|\, \mathcal{F}_N \right]
\quad \text{s.t.} \quad c_{S_n} \cdot Z_n \le P \;\; \text{w.p.1} \;\; \forall n \in \{1, 2, \ldots, N\}
\quad \text{and} \quad \sum_{n=1}^{N} Z_n \ge d_{\mathrm{total}} \;\; \text{w.p.1}.
$$
We refer to this problem as problem (P1). It was introduced and analyzed by Fu et al. [24, 25, Section III-D].
4.1.2 Strict Underflow Constraints, Linear Power-Rate Curves, and a Power Constraint in Each Slot

The second problem we consider features exactly the same per-slot power constraint and linear power-rate curves as problem (P1); however, the deadline constraints come in the form of strict underflow constraints. Namely, the single receiver maintains a buffer to store received packets, as shown in Fig. 3 (here, M = 1). Following transmission in every slot, d packets are removed from the receiver buffer. Strict underflow constraints are imposed so that the transmitter must send enough packets in every slot to guarantee that the receiver buffer contains at least d packets following transmission. The primary motivating application for this problem is wireless media streaming, where the packets removed from the receiver buffer in each slot are decoded and used for playout. Underflow is undesirable as it may lead to jitter and disruptions to the user playout. These strict underflow constraints can also be interpreted as multiple deadline constraints: the source must transmit at least d packets by the end of the first slot, 2d packets by the end of the second slot, and so forth. As with problem (P1), we need to make an assumption to ensure that it is always possible to satisfy the deadline constraints; namely, we assume that P/c_{s_worst} ≥ d, so that if the receiver buffer is empty at the beginning of the slot, the source can still send the required d packets, even if the channel is in the worst possible condition. For this problem, the general formulation (4) becomes

$$
\min_{\pi \in \Pi} \; \mathbb{E}^{\pi} \left[ \sum_{n=1}^{N} c_{S_n} \cdot Z_n \,\Big|\, \mathcal{F}_N \right]
\quad \text{s.t.} \quad c_{S_n} \cdot Z_n \le P \;\; \text{w.p.1} \;\; \forall n \in \{1, 2, \ldots, N\}
\quad \text{and} \quad \sum_{n=k}^{N} Z_n \ge (N - k + 1) \cdot d \;\; \text{w.p.1} \;\; \forall k \in \{1, 2, \ldots, N\}.
$$
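To make the multiple-deadline reading of these constraints concrete, here is a tiny Python check (our own illustration, with made-up numbers) that verifies whether a candidate transmission schedule keeps the receiver buffer from underflowing:

```python
def satisfies_underflow(z, d, x0=0.0):
    """Check the strict underflow constraints of (P2) for a transmission
    schedule z (indexed forward in time), buffer drain d per slot, and
    initial receiver-buffer level x0."""
    x = x0
    for z_n in z:
        x += z_n          # packets delivered this slot
        if x < d:         # must cover the playout requirement
            return False
        x -= d            # d packets removed for playout
    return True

# Made-up example: d = 2 packets per slot over 4 slots.
print(satisfies_underflow([2, 3, 1, 2], d=2))   # True
print(satisfies_underflow([2, 1, 1, 2], d=2))   # False: underflow in slot 2
```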
We refer to this problem as problem (P2). It was introduced and analyzed by Shuman et al. [66–68].

4.1.3 Single Deadline Constraint and Convex Monomial Power-Rate Curves

The third problem we consider features the same single deadline constraint as problem (P1); however, there is no per-slot power constraint imposed, and the energy cost from transmission is taken to be a convex monomial function of the number of packets sent. Namely, for every channel condition s, there exists a constant k_s such that c(z, s) = z^μ/k_s, where μ > 1 is the fixed monomial order of the cost function. As mentioned in Section 3.1, such a power-rate curve may be more appropriate in the high SNR regime. The general formulation (4) becomes

$$
\min_{\pi \in \Pi} \; \mathbb{E}^{\pi} \left[ \sum_{n=1}^{N} \frac{(Z_n)^{\mu}}{k_{S_n}} \,\Big|\, \mathcal{F}_N \right]
\quad \text{s.t.} \quad \sum_{n=1}^{N} Z_n \ge d_{\mathrm{total}} \;\; \text{w.p.1}.
$$
We refer to this problem as problem (P3). It was introduced and analyzed by Lee and Jindal [48].
4.2 Structures of the Optimal Policies

In this section, we present the structures of the optimal policies for each of the three problems as straightforwardly as possible, without changing drastically the original presentations. All three problems can be solved using standard dynamic programming (see, e.g., [8, 34]), and the structures of the optimal policies follow from properties of the value functions or expected costs-to-go. For problem (P1), Fu et al. [25] take the information state at time n to be the pair (Q_n, S_n), where Q_n represents the number of packets remaining to be transmitted at time n, and S_n denotes the channel condition in slot n. The dynamics of packets remaining to be transmitted are Q_{n−1} = Q_n − Z_n, as Z_n packets are transmitted during slot n. The dynamic programming equations for this problem are given by

$$
V_n(q, s) = \min_{0 \le z \le \min\{q,\, \frac{P}{c_s}\}} \left\{ c_s \cdot z + \mathbb{E}\left[ V_{n-1}(q - z, S_{n-1}) \right] \right\} \tag{5}
$$

$$
\phantom{V_n(q, s)} = c_s \cdot q + \min_{\max\{0,\, q - \frac{P}{c_s}\} \le u \le q} \left\{ -c_s \cdot u + \mathbb{E}\left[ V_{n-1}(u, S_{n-1}) \right] \right\}, \quad n = N, N-1, \ldots, 1 \tag{6}
$$

$$
V_0(q, s) = \begin{cases} 0, & \text{if } q = 0 \\ \infty, & \text{if } q > 0 \end{cases} \quad \forall s \in \mathcal{S}.
$$
Here, the transition from (5) to (6) is done by a change of variable in the action space from Z_n to U_n, where U_n = Q_n − Z_n. The controlled random variable U_n represents the number of packets remaining to be transmitted after transmission takes place in the nth slot. The restrictions on the action space, max{0, q − P/c_s} ≤ u ≤ q, ensure the following: (i) a nonnegative number of packets is transmitted; (ii) no more than d_total packets are transmitted over the course of the horizon; and (iii) the power constraint is satisfied. For problem (P2), Shuman et al. [68] take the information state at time n to be the pair (X_n, S_n), where X_n represents the number of packets in the receiver buffer at time n, and S_n denotes the channel condition in slot n. The simple dynamics of the receiver buffer are X_{n−1} = X_n + Z_n − d, as Z_n packets are transmitted during slot n, and d packets are removed from the buffer after transmission. The dynamic programming equations for this problem are given by

$$
V_n(x, s) = \min_{\max\{0,\, d - x\} \le z \le \frac{P}{c_s}} \left\{ c_s \cdot z + \mathbb{E}\left[ V_{n-1}(x + z - d, S_{n-1}) \right] \right\} \tag{7}
$$

$$
\phantom{V_n(x, s)} = -c_s \cdot x + \min_{\max\{x,\, d\} \le y \le x + \frac{P}{c_s}} \left\{ c_s \cdot y + \mathbb{E}\left[ V_{n-1}(y - d, S_{n-1}) \right] \right\}, \quad n = N, N-1, \ldots, 1 \tag{8}
$$

$$
V_0(x, s) = 0 \quad \forall x \in \mathbb{R}_+, \; \forall s \in \mathcal{S}.
$$
Here, the transition from (7) to (8) is done by a change of variable in the action space from Z_n to Y_n, where Y_n = X_n + Z_n. The controlled random variable Y_n represents the queue length of the receiver buffer after transmission takes place in the nth slot, but before playout takes place (i.e., before d packets are removed from the buffer). The restrictions on the action space, max{x, d} ≤ y ≤ x + P/c_s, ensure the following: (i) a nonnegative number of packets is transmitted; (ii) there are at least d packets in the receiver buffer following transmission, in order to satisfy the underflow constraint; and (iii) the power constraint is satisfied. Note that the dynamic programming equations (6) and (8) have the following common form:

$$
V_n(x, s) = f(x, s) + \min_{w_1(x, s) \le a \le w_2(x, s)} \left\{ h_1(a) + \mathbb{E}\left[ V_{n-1}(h_2(a), s) \right] \right\}, \quad n = N, N-1, \ldots, 1, \tag{9}
$$

where (x, s) is the current state and a represents the action. The key realizations for both problems are (i) h_1(a) is convex in a and h_2(a) is an affine function of a; and (ii) for any fixed s, f(x, s), w_1(x, s), and V_{n−1}(x, s) are all convex in x, and w_2(x, s) is concave in x. These functional properties can be shown inductively using the following key lemma, which is due to Karush [40], and presented in [60, pp. 237–238].
Lemma 1 (Karush [40]) Suppose that f : ℝ → ℝ and that f is convex on ℝ. For v ≤ w, define g(v, w) := min_{z ∈ [v, w]} f(z). Then it follows that

(a) g can be expressed as g(v, w) = F(v) + G(w), where F is convex nondecreasing and G is convex nonincreasing on ℝ.
(b) Suppose that S is a minimizer of f over ℝ. Then g can be expressed as

$$
g(v, w) = \begin{cases} f(v), & \text{if } S \le v \\ f(S), & \text{if } v \le S \le w \\ f(w), & \text{if } w \le S \end{cases}.
$$

Using Lemma 1, we can write V_n(x, s) = f(x, s) + F(w_1(x, s)) + G(w_2(x, s)), which is convex in x for a fixed s, because F(w_1(x, s)) is the composition of a convex nondecreasing function with a convex function and G(w_2(x, s)) is the composition of a convex nonincreasing function with a concave function (see, e.g., [11, Section 3.2] for the relevant results on convexity-preserving operations). Furthermore, by Lemma 1, if, for a fixed s, β_n(s) is a global minimizer of h_1(a) + E[V_{n−1}(h_2(a), s)] over all a, then the optimal action has the form

$$
a_n^*(x, s) := \begin{cases} w_1(x, s), & \text{if } \beta_n(s) < w_1(x, s) \\ \beta_n(s), & \text{if } w_1(x, s) \le \beta_n(s) \le w_2(x, s) \\ w_2(x, s), & \text{if } w_2(x, s) < \beta_n(s) \end{cases}. \tag{10}
$$
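In other words, (10) is a projection of the state-independent target β_n(s) onto the feasible action window. A one-line sketch (ours, not from the chapter) makes this explicit:

```python
def modified_base_stock(beta, w1, w2):
    """Rule (10) as a projection: move to the unconstrained minimizer beta
    if the feasible window [w1, w2] allows it; otherwise move to the
    nearest endpoint of the window."""
    return min(max(beta, w1), w2)

# Made-up example: feasible window [3, 8], target level beta = 10 -> action 8.
print(modified_base_stock(10.0, 3.0, 8.0))
```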
The optimal transmission policy in (10) is referred to as a modified base-stock policy. Applying this line of analysis to the dynamic program (6) for problem (P1), we see that when the channel condition is in state s at time n, and there are q packets remaining to be transmitted by the deadline, the optimal action is given by

$$
u_n^*(q, s) := \begin{cases} \max\left\{0,\, q - \frac{P}{c_s}\right\}, & \text{if } \beta_n(s) < \max\left\{0,\, q - \frac{P}{c_s}\right\} \\ \beta_n(s), & \text{if } \max\left\{0,\, q - \frac{P}{c_s}\right\} \le \beta_n(s) \le q \\ q, & \text{if } q < \beta_n(s) \end{cases} \tag{11}
$$

for some sequence of critical numbers {β_n(s)}_{s∈S}. Changing variables back to the original action variable Z_n and noting that β_n(s) ≥ 0 for all n and s, (11) is equivalent to

$$
z_n^*(q, s) := \begin{cases} \frac{P}{c_s}, & \text{if } \beta_n(s) + \frac{P}{c_s} < q \\ q - \beta_n(s), & \text{if } \beta_n(s) \le q \le \beta_n(s) + \frac{P}{c_s} \\ 0, & \text{if } q < \beta_n(s) \end{cases}. \tag{12}
$$

See Fig. 4 for diagrams of this optimal policy. When the number of possible channel conditions is finite (i.e., the sample space S of each random variable S_n is finite) and for every s ∈ S,
Fig. 4 Optimal action for problem (P1) in slot n when the state is (q, s), the number of packets remaining to be transmitted before transmission and the current channel condition. (a) z ∗ , the optimal transmission quantity. (b) u ∗ , the optimal number of packets remaining to be transmitted after transmission in slot n
$$
c_s = \frac{c_{s_{\mathrm{worst}}}}{\hat{l}} \quad \text{for some } \hat{l} \in \mathbb{N}, \tag{13}
$$
then the critical numbers {β_n(s)}_{s∈S} can be calculated recursively. For further details on the calculation of these critical numbers, see [25]. A similar line of analysis for problem (P2) leads to the following structure of the optimal control action with n slots remaining:

$$
y_n^*(x, s) := \begin{cases} x, & \text{if } x \ge b_n(s) \\ b_n(s), & \text{if } b_n(s) - \frac{P}{c_s} \le x < b_n(s) \\ x + \frac{P}{c_s}, & \text{if } x < b_n(s) - \frac{P}{c_s} \end{cases}. \tag{14}
$$
Furthermore, for a fixed s, bn (s) is nondecreasing in n, and for a fixed n, bn (s) is nonincreasing in cs ; i.e., for arbitrary s 1 , s 2 ∈ S with cs 1 ≤ cs 2 , bn (s 1 ) ≥ bn (s 2 ). At time n, for each possible channel condition realization s, the critical number bn (s) describes the ideal number of packets to have in the user’s buffer after transmission in the nth slot. If that number of packets is already in the buffer, then it is optimal to not transmit any packets; if there are fewer than ideal and the available power is enough to transmit the difference, then it is optimal to do so; and if there are fewer than ideal and the available power is not enough to transmit the difference, then the sender should use the maximum power to transmit. See Fig. 5 for diagrams of this optimal policy. When the number of possible channel conditions is finite and for every s ∈ S, P = l · d for some l ∈ IN , cs
(15)
then the critical numbers {βn (s)}s∈S can be calculated recursively. Condition (15) says that the maximum number of packets that can be transmitted in any slot covers
Opportunistic Scheduling in Wireless Networks
141
Fig. 5 Optimal action for problem (P2) in slot n when the state is (x, s). (a) the optimal transmission quantity. (b) the resulting number of packets available for playout in slot n
exactly the playout requirements of some integer number of slots. For further details on the calculation of these critical numbers, see [68]. Like problem (P1), Lee and Jindal [48] take the information state for problem (P3) to be the pair (Q n , Sn ), where Q n represents the number of packets remaining to be transmitted at time n, and Sn denotes the channel condition in slot n. The dynamics of packets remaining to be transmitted are once again Q n−1 = Q n − Z n . The dynamic programming equations for problem (P3) are given by #
$ zμ Vn (q, s) = min + IE Vn−1 (q − z, Sn−1 ) , z≥0 ks n = N , N − 1, . . . , 1 # 0, if q = 0 V0 (q, s) = . ∞, if q > 0
(16)
The key idea of Lee and Jindal is to show inductively that IE Vn−1 (q − z, Sn−1 ) = ξn−1,μ · (q − z)μ for some constant ξn−1,μ that depends on the time n − 1 and the known monomial order μ. Therefore, z n∗ (q, s) = argmin z≥0
#
zμ + ξn−1,μ · (q − z)μ ks
$ .
(17)
Differentiating the inner term of the right-hand side of (17) with respect to z and setting it equal to zero yields z n∗ (q, s) = λn,μ (s) · q, for some λn,μ (s) ∈ [0, 1]. Thus, with n slots remaining in the time horizon, the optimal control action is to send a fraction, λn,μ (s), of the remaining packets to be sent. Here, the fraction to send depends on the time remaining in the horizon, n; the current condition of the channel, s; and the parameter representing the monomial order of the cost function, μ. The fractions λn,μ (s) can be computed recursively. Note that plugging the optimal z n∗ (q, s) back into (16) yields
142
D.I Shuman and M. Liu
μ (λn,μ (S) · q)μ = ξn,μ · q μ IE Vn (q, S) = IE + ξn−1,μ · q − λn,μ (S) · q kS forsome constant ξn,μ , completing the induction step on the form of IE Vn (q, S) . Finally, Lee and Jindal also show that for each fixed channel state s, the fraction λn,μ (s) is decreasing in n. In other words, the scheduler is more selective or opportunistic when the deadline is far away, as it sends a lower fraction of the remaining packets than it would under the same state closer to the deadline. This makes intuitive sense as it has more opportunities to wait for a very good channel realization when the deadline is farther away.
4.3 Comparison of the Problems In this section, we provide further intuition behind the role of deadlines by comparing the above problems. First, we show that problems (P1) and (P2) are equivalent when a certain technical condition holds. Next, we examine how the extra deadline constraints in problem (P2) affect the optimal scheduling policy, as compared with problem (P1). We finish with some conclusions on the role of deadline constraints. 4.3.1 A Sufficient Condition for the Equivalence of Problems (P1) and (P2) In this section, we transform the dynamic programs (6) and (8) to find a condition under which problems (P1) and (P2) are equivalent. In problem (P1), there is just a single deadline constraint; however, because the terminal cost is set to ∞ if all the data are not transmitted by the deadline, the scheduler must transmit enough data in each slot so that it can still complete the job if the channel is in the worst possible condition in all subsequent slots. Thus, the scheduler can leave no more than cs P packets for the final slot, no more than worst
2 · cs P packets for the last two slots, and so forth. So there are in fact implicit worst constraints on how much data can remain to be transmitted at the end of each slot. If we make these implicit constraints explicit, then the dynamic program (6) becomes Vn (q, s) = cs · q +
min/ 0 / max 0,q− cPs ≤u≤min q,(n−1)· cs
P worst
0
. −cs · u + IE Vn−1 (u, Sn−1 ) , n = N , N − 1, . . . , 1
V0 (q, s) = 0
∀q ∈ [0, dtotal ], ∀s ∈ S.
Next, we change the state space from total packets remaining to be transmitted to total packets transmitted since the beginning of the horizon (Tn = dtotal − Q n ), and we change the action space from total packets remaining after transmission in the
nth slot to total packets sent after transmission in the nth slot (A_n = d_total − U_n). The resulting dynamic program is

$$
V_n(t, s) = -c_s \cdot t + \min_{\max\{t,\; d_{\mathrm{total}} - (n-1) \cdot \frac{P}{c_{s_{\mathrm{worst}}}}\} \le a \le \min\{t + \frac{P}{c_s},\; d_{\mathrm{total}}\}} \left\{ c_s \cdot a + \mathbb{E}\left[ V_{n-1}(a, S_{n-1}) \right] \right\}, \quad n = N, N-1, \ldots, 1 \tag{18}
$$

$$
V_0(t, s) = 0 \quad \forall t \in [0, d_{\mathrm{total}}], \; \forall s \in \mathcal{S}.
$$
In problem (P2), it is never optimal to fill the buffer beyond n · d at time n. This is easily shown through a simple interchange argument, and we can therefore impose this as an explicit constraint. We can also do similar changes of variables as above to change the state space from packets in the receiver's buffer to total packets transmitted since the beginning of the horizon (T_n = X_n + (N − n) · d), and to change the action space from packets in the receiver's buffer following transmission to total packets sent after transmission in the nth slot (A_n = Y_n + (N − n) · d). With these changes of variables, the dynamic program (8) becomes

$$
V_n(t, s) = -c_s \cdot t + \min_{\max\{t,\; (N-n+1) \cdot d\} \le a \le \min\{t + \frac{P}{c_s},\; N \cdot d\}} \left\{ c_s \cdot a + \mathbb{E}\left[ V_{n-1}(a, S_{n-1}) \right] \right\}, \quad n = N, N-1, \ldots, 1 \tag{19}
$$

$$
V_0(t, s) = 0 \quad \forall t \in [0, N \cdot d], \; \forall s \in \mathcal{S}.
$$
The dynamic programs (18) and (19) associated with problems (P1) and (P2), respectively, become identical when the following two conditions are satisfied:

(C1) d_total = N · d (i.e., the total number of packets to send over the horizon of N slots is the same for both problems).
(C2) P/c_{s_worst} = d (i.e., the maximum number of packets that can be transmitted under the worst channel condition is equal to the number of packets removed from the receiver's buffer at the end of each slot in problem (P2)).

Furthermore, if condition (C2) is not satisfied, then P/c_{s_worst} > d, because we require that P/c_{s_worst} ≥ d for problem (P2) to be well defined. Thus, when condition (C1) is satisfied, but condition (C2) is not satisfied, the action space at time n and state (t, s) in (18) contains the action space at the same time and state in (19). This is because the explicit deadline constraints resulting from the strict underflow constraints in problem (P2) are more restrictive than the implicit deadline constraints in problem (P1).
D.I Shuman and M. Liu
4.3.2 Inverse Water-Filling Interpretations In this section, we interpret problems (P1) and (P2) within the context of the inverse water-filling procedure introduced in Section 2. The aim is to show how the extra deadline constraints in problem (P2) affect the optimal scheduling policy. We again start with problem (P1), which features a single deadline constraint. If, at the beginning of the horizon, the scheduler happens to know the realizations of all future channel conditions, s N , s N −1 , . . . , s1 , then problem (P1) reduces to the following convex optimization problem: (
min
N )
N z N ,z N −1 ,...,z 1 ∈IR+
s.t. and
n=1 csn
N
n=1 z n z n ≤ cPs n
· zn (20)
≥ dtotal ∀n ∈ {1, 2, . . . , N } .
It should be clear that (20) is essentially the same problem as (2), and the solution can be found by scheduling data transmission during the slot with the best condition until all the data are sent or the power limit is reached, and then scheduling data transmission during the slot with the second best condition until all the data are sent or the power limit is reached, and so forth. See Fig. 6 for a diagram of this solution. P
P
z6
0 0
2d
4d
P
z5
0 0
2d
4d
c s5
P
z4
0 0
2d
4d
c s4
P
z3
0 0
2d
4d
c s3
P
z2
0 0
2d
4d
cs2
z1
0 0
2d
4d
c s1
Fig. 6 Pictorial representation of the solution to problem (P1) in the somewhat unrealistic case that all future channel conditions are known at the beginning of the horizon. Packets are scheduled in slots in ascending order of csn , until all the data are transmitted or the power constraint for the slot is reached. In the example shown, the time horizon to send the data is N = 6, the total number of data packets to be sent is 6d, and the power constraint in each slot is P = 4d. One optimal policy is to transmit 4d packets in slot 2, which has the best channel condition, and the remaining 2d packets in slot 5, which has the second best channel condition. This policy results in a total cost of 2P
If we are focused on finding the optimal amount to transmit in the current slot, we can also aggregate the power-rate functions of all future slots, by reordering them according to the strength of the channel, as shown in Fig. 7. The aggregate power-rate curve shown is defined by c˜ N −1 (˜z , s N −1 , s N −2 , . . . , s1 ) :=
N −1
n=1 csn · z n (z N −1 ,...,z1 )∈IR+N N −1 s.t. n=1 z n = z˜ and z n ≤ cPs ∀n ∈ {1, 2, . . . , N − 1} , n (21)
min
Opportunistic Scheduling in Wireless Networks
145
5P
where z˜ is the aggregate number of packets to be transmitted in slots N − 1, N − 2, . . . , 1. The optimal number of packets/to transmit in the current slot 0is then determined as follows. Define γ N := min z˜ 0 : ψ˜ N −1 (˜z ) ≥ cs N , ∀˜z > z˜ 0 , where ψ˜ N −1 (·) is the slope from above of the aggregate power-rate curve, c˜ N −1 (·, s N −1 , s N −2 , . . . , s1 ), shown in Fig. 7. Then the optimal number of packets to transmit in slot N is given by z ∗N
#
$ P = min , max {dtotal − γ N , 0} . cs N
(22)
This policy says that if the current per packet energy cost from transmission is greater than the slope of the aggregate curve at all points up to dtotal , then it is optimal to not transmit any packets in the current slot. Otherwise, the optimal number of packets to transmit in the current slot N is the minimum of the maximum number of
146
D.I Shuman and M. Liu
packets that can be transmitted under the current channel condition and the number of packets that would otherwise be transmitted in worse channel conditions in future slots. Now, as Fu et al. explain in [24, 25, Section III-D], in the more realistic case that the channel condition in slot n is not learned until the beginning of the nth slot, a very similar aggregate method can be used as long as the number of possible channel conditions is finite and condition (13) is satisfied. In this situation, however, the slopes of the piecewise-linear aggregate power-rate function for future slots are not defined in terms of the actual channel conditions of future slots (which are not available), but rather by a series of thresholds that only depend on the statistics of future channel conditions. Condition (13) ensures that the slopes of this aggregate expected power-rate curve only change at integer multiples of cs P . The form of the worst optimal policy at time N is the same as (22), with dtotal being the number of packets remaining to transmit at time N . Because the slopes of the aggregate expected power-rate curve only change at integer multiples of cs P , we have worst
# γ N ∈ 0,
P csworst
,2 ·
P csworst
, . . . , (N − 1) ·
P csworst
$ .
We now return to the wireless streaming model considered in problem (P2), with d packets removed from the receiver’s buffer at the end of every slot. Let us once again begin by considering the unrealistic case that the scheduler knows all future channel conditions at the beginning of the horizon. The optimal solution can be found by using the same basic inverse water-filling type principle of transmitting as much as possible in the slot with the best channel condition, and then the second best, and so forth; however, due to the additional underflow constraints, one needs to solve N sequential problems of this form. The first problem is the trivial problem of sending d packets in the first slot, [N , N − 1). The second problem is to send 2d packets in the first two slots. If the power limit in the first slot has not been reached after allocating the initial d packets there, then the scheduler may choose to send the second batch of d packets in either the first or second slot, according to their respective channel conditions. For each sequential problem, whatever packets have been allocated in the previous problem must be “carried over” to the subsequent problem, where there is one additional time slot available and the next d packets are allocated. The solution to the Nth problem represents the optimal allocation. See Fig. 8 for a diagram of this solution. Comparing Fig. 8 to Fig. 6, we see that when N · d = dtotal and the known sequence of channel conditions is the same for both problems, the additional underflow constraints cause more data to be scheduled in earlier time slots with worse channel conditions. When all future channel conditions are known ahead of time, as in Fig. 8, we can also use the same aggregation technique from above to represent problems 2 to N as comparisons between the current channel condition and the aggregate of the future channel conditions. Furthermore, when the future channel conditions are not known ahead of time and condition (15) is satisfied, we can once again define the aggregate expected power-rate function for future slots in terms of a series of
Opportunistic Scheduling in Wireless Networks
147
Fig. 8 Pictorial representation of the solution to problem (P2) in the somewhat unrealistic case that all future channel conditions are known at the beginning of the horizon. In the example shown, the time horizon is N = 6, d packets are removed from the receiver’s buffer at the end of every slot, and the power constraint in each slot is P = 4d. To satisfy the underflow constraints, six sequential problems are considered, with an additional d packets allocated in each problem. Packets allocated in one problem are “carried over” to all subsequent problems and shown in solid black filling. The optimal policy, given by the solution to problem 6, is to transmit d packets in slots 6 and 3, and 2d packets in slots 5 and 2. This policy results in a total cost of 3P
thresholds that only depend on the statistics of future channel conditions. Due to the underflow constraints, however, these thresholds are computed differently than those in problem (P1). The net result for this more realistic case is the same as the case when all future channel conditions are known – the additional underflow constraints make it optimal to send more data in earlier time slots with worse channel conditions.
4.4 Extensions and Other Energy-Minimizing Transmission Scheduling Studies Featuring Strict Deadline Constraints We have presented these three deadline problems in their most basic form in order to enable comparisons, but they can also be extended in a number of different ways. For instance, the structures of the optimal policies for problems (P1) and (P2), presented in Section 4.2, also hold for a Markovian channel. For problem (P2), a
148
D.I Shuman and M. Liu
modified base-stock policy is also optimal if the sequence of packet removals from the receiver buffer is nonstationary or if the optimization criterion is the infinite horizon discounted expected cost. When the linear power-rate curves of problem (P2) are generalized to piecewise-linear convex power-rate curves, a finite generalized base-stock policy, which is discussed further in Section 5, is optimal. Perhaps most interesting from a wireless networking standpoint and most difficult from a mathematical standpoint is the extension to the case of a single source transmitting data to multiple receivers over a shared channel. In [71], Tarello et al. extend problem (P1) (without the power constraint) to the case of multiple identical receivers and assume that the source can only transmit to one user in each slot. The extension of problem (P2) to the case of multiple receivers is discussed in [66, 68]. In addition to the three problems discussed above and their extensions, there have been a few other studies of energy-minimizing transmission scheduling that feature a time-varying wireless channel and strict deadline constraints. In [46, 47, 49], Lee and Jindal consider the same setup as problem (P3), except that the convex powerz z or e α−1 , which are based on the Gausrate curves are of the form c(z, s) = 2 α−1 s s sian noise channel capacity. The earlier models of Zafer and Modiano in [79, 80] also include essentially the same setup as problem (P3), with the exception that the underlying timescale is continuous rather than discrete. Using continuous-time stochastic control theory, they also reach the key conclusion that the optimal number of packets to transmit under convex monomial power-rate curves is the product of the number of packets remaining to be sent and an “urgency” fraction that depends on the current channel condition and the time remaining until the end of the horizon. Chen et al. [16, 17] and Uysal-Biyikoglu and El Gamal [75] consider packets arriving at different times, analyze offline scheduling problems, and use the properties of the optimal offline scheduler to develop heuristics for online (or causal) scheduling problems. An overview of the models considered in each of these studies is provided in Table 1. Additionally, Luna et al. [54] consider an energy minimization problem subject to end-to-end delay constraints, where the scheduler must select various source coding parameters in addition to the transmission powers. Finally, there is a sizeable literature on energy-efficient transmission scheduling studies such as [76] that feature a time-invariant or static channel and strict deadline constraints; however, we do not discuss these studies further, so as to keep the focus on the opportunistic scheduling behavior resulting from the time-varying wireless channel.
4.5 Summary Takeaways on the Role of Deadline Constraints As mentioned earlier, the main idea of opportunistic scheduling is to reduce energy consumption by sending more data when the channel is in a “good” state and less data when the channel is in a “bad” state. However, deadline constraints may force the sender to transmit data when the channel is in a relatively poor state. One strategy when faced with such deadline constraints would be to deal with them as they come, by always sending just enough packets when the channel is “bad” to ensure the deadline can be met, and holding out for the best channel conditions to send
Table 1 Overview of models for energy-minimizing transmission scheduling that feature a time-varying wireless channel and strict deadline constraints

Study: Fu et al. [24, 25]. Time: Discrete. Receivers: 1. Data: Infinite backlog. Deadline constraints: Single deadline. Scheduler's information: Non-causal; causal. Power-rate curves: Convex; linear with power constraint.
Study: Shuman et al. [66–68]. Time: Discrete. Receivers: Multiple (focus on 1, 2). Data: Infinite backlog. Deadline constraints: Multiple deadlines (underflow constraints). Scheduler's information: Causal. Power-rate curves: Linear and piecewise-linear convex with power constraint.
Study: Lee and Jindal [48]. Time: Discrete. Receivers: 1. Data: Infinite backlog. Deadline constraints: Single deadline. Scheduler's information: Causal. Power-rate curves: Convex monomial.
Study: Lee and Jindal [46, 47, 49]. Time: Discrete. Receivers: 1. Data: Infinite backlog. Deadline constraints: Single deadline. Scheduler's information: Causal. Power-rate curves: Convex (Gaussian noise channel capacity).
Study: Zafer and Modiano [79, 80]. Time: Continuous. Receivers: 1. Data: Infinite backlog; random packet arrivals. Deadline constraints: Single deadline; multiple variable deadlines. Scheduler's information: Causal. Power-rate curves: Convex; convex monomial.
Study: Chen et al. [16, 17]. Time: Discrete. Receivers: 1. Data: Packet arrivals. Deadline constraints: Individual packet deadlines. Scheduler's information: Non-causal; causal. Power-rate curves: Convex.
Study: Uysal-Biyikoglu and El Gamal [75]. Time: Discrete. Receivers: Multiple (focus on 2). Data: Packet arrivals. Deadline constraints: Single deadline. Scheduler's information: Non-causal; causal. Power-rate curves: Convex.
Study: Tarello et al. [71]. Time: Discrete. Receivers: Multiple. Data: Infinite backlog. Deadline constraints: Single deadline. Scheduler's information: Non-causal; causal. Power-rate curves: Linear; convex.
We use the term “infinite backlog” to include problems where there are a finite number of packets to be sent, all of which are queued at the beginning of the time horizon. “Non-causal” refers to the offline scheduling situation where the transmission scheduler has knowledge of future channel states and packet arrival times
a lot of data. Yet, a key conclusion from the analysis of the three problems we presented in this section is that it is better to anticipate the need to comply with these constraints in future slots by sending more packets (than one would without the deadlines) under “medium” channel conditions in earlier slots. In some sense, doing so is a way to manage the risk of being stuck sending a large amount of data over a poor channel to meet an imminent deadline constraint. We also saw that the extent to which the scheduler should plan for the deadline by sending data under such “medium” channel conditions depends on the time remaining until the
deadline(s), and on how many deadlines it must meet; namely, the closer the deadlines and the more deadlines it faces, the less opportunistic the scheduler can afford to be. So perhaps the essence of opportunistic scheduling with deadline constraints is that the scheduler should be opportunistic, but not too opportunistic.
5 Relation to Work in Inventory Theory

The models outlined in Section 4.1 correspond closely to models used in inventory theory. Borrowing that field's terminology, our abstractions are multi-period, single-echelon, single-item, discrete-time inventory models with random ordering costs, a budget constraint, and deterministic demands. The item corresponds to the stream of data packets, the random ordering costs to the random channel conditions, the budget constraint to the power available in each time slot (there is no such budget constraint for problem (P3)), and the deterministic demands to the packet deadline constraints. In problems (P1) and (P2), the random ordering costs are linear, and in problem (P3), they are convex monomial. In problem (P2), the deterministic demands are stationary (d packets are removed from the inventory at the end of every time slot); in problems (P1) and (P3), the deterministic demand sequence is nonstationary, and equal to {0, 0, . . . , 0, d_total}. To the best of our knowledge, the particular problems introduced in Section 4.1 have not been studied in the context of inventory theory, but similar problems have been examined. References [22, 27, 28, 38, 41, 42, 55, 56] all consider single-item inventory models with random ordering prices (linear ordering costs). The key result for the case of deterministic demand of a single item with no resource constraint is that the optimal policy is a base-stock policy with different target stock levels for each price. Specifically, at each time, for each possible ordering price (translates into channel condition in our context), there exists a critical number such that the optimal policy is to fill the inventory (receiver buffer) up to that critical number if the current level is lower than the critical number, and not to order (transmit) anything if the current level is above the critical number. Of the prior work, Kingsman [41, 42] is the only author to consider a resource constraint, and he imposes a maximum on the number of items that may be ordered in each slot. The resource constraint in problems (P1) and (P2) is of a different nature in that we limit the amount of power available in each slot. This is equivalent to a limit on the per-slot budget (regardless of the stochastic price realization), rather than a limit on the number of items that can be ordered. Of the related work on inventory models with deterministic ordering prices and stochastic demand, [23, 73] are the most relevant; in those studies, however, the resource constraint also amounts to a limit on the number of items that can be ordered in each slot and is constant over time. References [4, 69, 81] consider single-item inventory models with deterministic piecewise-linear convex ordering costs and stochastic demand. The key result in this setup is that the optimal inventory level after ordering is a piecewise-linear nondecreasing function of the current inventory
level (i.e., there are a finite number of target stock levels), and the optimal ordering quantity is a piecewise-linear nonincreasing function of the current inventory level. Porteus [59] refers to policies of this form as finite generalized base-stock policies, to distinguish them from the superclass of generalized base-stock policies, which are optimal when the deterministic ordering costs are convex (but not necessarily piecewise-linear), as first studied in [39]. Under a generalized base-stock policy, the optimal inventory level after ordering is a nondecreasing function of the current inventory level, and the optimal ordering quantity is a nonincreasing function of the current inventory level. Finally, [15, 19, 21, 36] consider multi-item discrete-time inventory systems under deterministic ordering costs, stochastic demand, and resource constraints. These studies are more relevant for the multiple-receiver extensions discussed in Section 4.4. We have found some of the techniques used to solve these related inventory models to be quite informative in examining the opportunistic scheduling problems arising in wireless communications. For a more in-depth comparison of problem (P2) to the related inventory theory literature, see [70]. For more background on common models and techniques used in inventory theory, see [60, 83].
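To make the base-stock structure concrete, the following short Python sketch implements a price-dependent base-stock rule of the kind described above; the channel states and target levels are hypothetical placeholders rather than values derived from problems (P1), (P2), or (P3).

# A minimal sketch of a price-dependent base-stock (target stock) policy: for
# each possible ordering price (channel condition) there is a critical number,
# and we order up to it whenever the current inventory (receiver buffer) is
# below it.  The target levels below are illustrative placeholders only.

def base_stock_order(current_level, price, targets):
    """Return the order (transmission) quantity under a base-stock policy.

    `targets` maps each price/channel state to its critical number; better
    prices (cheaper channels) typically have higher targets.
    """
    target = targets[price]
    return max(0, target - current_level)

# Example: three hypothetical channel states with placeholder target levels.
targets = {"good": 12, "medium": 7, "poor": 2}
print(base_stock_order(current_level=5, price="medium", targets=targets))  # orders 2
print(base_stock_order(current_level=9, price="poor", targets=targets))    # orders 0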
6 Conclusion

In this chapter, we introduced opportunistic scheduling problems in wireless networks and specifically focused on the role of deadline constraints in this class of problems. We presented three opportunistic scheduling problems with deadline constraints, along with their solutions and outlines of the techniques used to analyze the problems. The roots of some of these techniques lie in inventory theory studies developed three to four decades earlier. By comparing the problems to each other and interpreting their solutions with water-filling-type principles, we were able to better understand the effect of the deadline constraints on the optimal scheduling policies. In particular, we concluded that the scheduler must anticipate the impending deadlines and adjust its behavior in earlier time slots to manage the risk of being stuck sending a large amount of data over a poor channel just before the deadline.
References

1. Agarwal M, Borkar VS, Karandikar A (2008) Structural properties of optimal transmission policies over a randomly varying channel. IEEE Trans Autom Control 53(6):1476–1491 2. Andrews M, Kumaran K, Ramanan K, Stolyar A, Vijayakumar R, Whiting P (2004) Scheduling in a queueing system with asynchronously varying service rates. Probab Eng Inform Sci 18:191–217 3. Ata B (2005) Dynamic power control in a wireless static channel subject to a quality-of-service constraint. Oper Res 53(5):842–851 4. Bensoussan A, Crouhy M, Proth J-M (1983) Mathematical theory of production planning. Elsevier Science, Amsterdam
5. Berggren F, Jäntti R (2004) Asymptotically fair transmission scheduling over fading channels. IEEE Trans Wireless Commun 3(1):326–336 6. Berry RA, Gallager RG (2002) Communication over fading channels with delay constraints, IEEE Trans Inform Theory 48(5):1135–1149 7. Berry RA, Yeh EM (2004) Cross-layer wireless resource allocation. IEEE Signal Process Mag 21(5):59–69 8. Bertsekas DP, Shreve SE (1996) Stochastic optimal control: the discrete-time case. Athena Scientific 9. Bhorkar A, Karandikar A, Borkar VS (2006) Power optimal opportunistic scheduling. In: Proceedings of the IEEE global telecommunications conference (GLOBECOM), San Francisco, CA 10. Borst S, Whiting P (2003) Dynamic channel-sensitive scheduling algorithms for wireless data throughput optimization. IEEE Trans Veh Technol 52(3):569–586 11. Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge 12. Chang N, Liu M (2007) Optimal channel probing and transmission scheduling for opportunistic spectrum access. In: Proceedings of the ACM international conference on mobile computing and networking (MobiCom’07), Montreal, Canada, pp. 27–38 13. Chang NB, Liu M (2009) Optimal channel probing and transmission scheduling for opportunistic spectrum access. IEEE/ACM Trans Netw 17(6):1805–1818 14. Chaporkar A, Proutiere P (2008) Optimal joint probing and transmission strategy for maximizing throughput in wireless systems. IEEE J Select Areas Commun 26(8):1546–1555 15. Chen SX (2004) The optimality of hedging point policies for stochastic two product flexible manufacturing systems. Oper Res 52(2):312–322 16. Chen W, Mitra U, Neely MJ (2007) Energy-efficient scheduling with individual delay constraints over a fading channel. In: Proceedings of the international symposium on modeling and optimization in mobile, ad hoc, and wireless networks, Limassol, Cyprus 17. Chen W, Mitra U, Neely MJ (2009) Energy-efficient scheduling with individual delay constraints over a fading channel. Wireless Netw 15(5):601–618 18. Collins BE, Cruz RL (1999) Transmission policies for time varying channels with average delay constraints. In: Proceedings of the Allerton conference on communication, control, and computing, Monticello, IL 19. DeCroix GA, Arreola-Risa A (1998) Optimal production and inventory policy for multiple products under resource constraints. Manage Sci 44(7):950–961 20. Djonin DV, Krishnamurthy V (2005) Structural results on the optimal transmission scheduling policies and costs for correlated sources and channels. In: Proceedings of the IEEE conference on decision and control, Seville, Spain, pp. 3231–3236 21. Evans R (1967) Inventory control of a multiproduct system with a limited production resource. Naval Res Logist Quart 14(2):173–184 22. Fabian T, Fisher JL, Sasieni MW, Yardeni A (1959) Purchasing raw material on a fluctuating market. Oper Res 7(1):107–122 23. Federgruen A, Zipkin P (1986) An inventory model with limited production capacity and uncertain demands II. The discounted-cost criterion. Math Oper Res 11(2):208–215 24. Fu A, Modiano E, Tsitsiklis J (2003) Optimal energy allocation for delay-constrained data transmission over a time-varying channel. In: Proceedings of the IEEE INFOCOM, San Francisco, CA, vol 2, pp. 1095–1105 25. Fu A, Modiano E, Tsitsiklis JN (2006) Optimal transmission scheduling over a fading channel with energy and deadline constraints. IEEE Trans Wireless Commun 5(3):630–641 26. Gesbert D, Alouini M-S (2004) How much feedback is multi-user diversity really worth? 
In: Proceedings of the IEEE international conference on communications, Paris, France, vol 1, pp. 234–238 27. Golabi K (1982) A single-item inventory model with stochastic prices. In: Proceedings of the second international symposium on inventories, Budapest, Hungary, pp. 687–697
28. Golabi K (1985) Optimal inventory policies when ordering prices are random. Oper Res 33(3): 575–588 29. Gopalan A, Caramanis C, Shakkottai S (2007) On wireless scheduling with partial channelstate information. In: Proceedings of the 45th Allerton conference on communication, control, and computing, Urbana, IL 30. Goyal M, Kumar A, Sharma V (2003) Power constrained and delay optimal policies for scheduling transmission over a fading channel. In: Proceedings of the IEEE INFOCOM, San Francisco, CA, pp. 311–320 31. Goyal M, Kumar A, Sharma V (2008) Optimal cross-layer scheduling of transmissions over a fading multiaccess channel. IEEE Trans Inform Theory 54(8):3518–3537 32. Guha S, Munagala K, Sarkar S (2006) Jointly optimal transmission and probing strategies for multichannel wireless systems. In: Proceedings of the conference on information sciences and systems, Princeton, NJ 33. Guha S, Munagala K, Sarkar S (2006) Optimizing transmission rate in wireless channels using adaptive probes. In: Proceedings of the ACM sigmetrics/performance conference, Saint-Malo, France, June 2006 34. Hernández-Lerma O, Lasserre JB (1996) Discrete-time Markov control processes. Springer, New York, NY 35. Holtzman JM (2000) CDMA forward link waterfilling power control. In: Proceedings of the IEEE vehicular technology conference, Tokyo, Japan, vol 3, pp. 1663–1667 36. Janakiraman G, Nagarajan M, Veeraraghavan S (2009) Simple policies for managing flexible capacity. Manuscript 37. Ji Z, Yang Y, Zhou J, Takai M, Bagrodia R (2004) Exploiting medium access diversity in rate adaptive wireless LANs. In: Proceedings of MOBICOM, Philadelphia, PA, pp. 345–359 38. Kalymon B (1971) Stochastic prices in a single-item inventory purchasing model. Oper Res 19(6):1434–1458 39. Karlin S (1958) Optimal inventory policy for the Arrow-Harris-Marschak dynamic model. In: Arrow KJ, S. Karlin, H. Scarf (eds) Studies in the mathematical theory of inventory and production, Stanford University Press, Stanford, CA, pp. 135–154 40. Karush W (1959) A theorem in convex programming. Naval Res Logist Quart 6(3):245–260 41. Kingsman BG (1969) Commodity purchasing. Oper Res Quart 20:59–80 42. Kingsman BG (1969) Commodity purchasing in uncertain fluctuating price markets. PhD thesis, University of Lancaster 43. Kittipiyakul S, Javidi T (2007) Resource allocation in OFDMA with time-varying channel and bursty arrivals. IEEE Commun Lett 11(9):1708–710 44. Kittipiyakul S, Javidi T (2009) Delay-optimal server allocation in multiqueue multi-server systems with time-varying connectivities. IEEE Trans Inform Theory, vol. 55, May 2009, pp. 2319–2333 45. Knopp R, Humblet PA (1995) Information capacity and power control in single-cell multiuser communications. In: Proceedings of the international conference on communications, Seattle, WA, vol 1, pp. 331–335 46. Lee J, Jindal N (2008) Energy-efficient scheduling of delay constrained traffic over fading channels. In: Proceedings of the IEEE international symposium on information theory, Toronto, Canada 47. Lee J, Jindal N (2009) Energy-efficient scheduling of delay constrained traffic over fading channels. IEEE Trans Wireless Commun 8(4):1866–1875 48. Lee J, Jindal N (2009) Delay constrained scheduling over fading channels: Optimal policies for monomial energy-cost functions. In: Proceedings of the IEEE international conference on communications, Dresden, Germany 49. Lee J, Jindal N (2009) Asymptotically optimal policies for hard-deadline scheduling over fading channels. 
IEEE Trans Inform Theory, submitted 50. Liu X, Chong EKP, Shroff NB (2003) Optimal opportunistic scheduling in wireless networks. In: Proceedings of the vehicular technology conference, Orlando, FL, vol 3, pp. 1417–1421 51. Liu X, Chong EKP, Shroff NB (2003) A framework for opportunistic scheduling in wireless networks. Comput Netw 41(4):451–474
52. Liu X, Shroff NB, Chong EKP (2004) Opportunistic scheduling: An illustration of cross-layer design. Telecommun Rev 14(6):947–959 53. Love DJ, Heath RW Jr, Lau VKN, Gesbert D, Rao BD, Andrews M (2008) An overview of limited feedback in wireless communication systems. IEEE J Select Areas Commun 26(8): 1341–1365 54. Luna CE, Eisenberg Y, Berry R, Pappas TN, Katsaggelos AK (2003) Joint source coding and data rate adaptation for energy efficient wireless video streaming. IEEE J Select Areas Commun 21(10):1710–1720 55. Magirou VF (1982) Stockpiling under price uncertainty and storage capacity constraints. Eur J Oper Res 11:233–246 56. Magirou VF (1987) Comments on ‘On optimal inventory policies when ordering prices are random’ by Kamal Golabi. Oper Res 35(6):930–931 57. Neely MJ (2009) Max weight learning algorithms with application to scheduling in unknown environments. In: Proceedings of the information theory and applications workshop, La Jolla, CA 58. Neely MJ, Modiano E, Rohrs CE (2003) Dynamic power allocation and routing for time varying wireless networks. In: Proceedings of the IEEE INFOCOM, San Francisco, CA, vol 1, pp. 745–755 59. Porteus EL (1990) Stochastic inventory theory. In: Heyman DP, Sobel MJ (eds) Stochastic models. Elsevier Science, Amsterdam, pp. 605–652 60. Porteus EL (2002) Foundations of stochastic inventory theory. Stanford University Press, Stanford, CA 61. Rajan D, Sabharwal A, Aazhang B (2004) Delay-bounded packet scheduling of bursty traffic over wireless channels. IEEE Trans Inform Theory 50(1):125–144 62. Sabharwal A, Khoshnevis A, Knightly E (2007) Opportunistic spectral usage: Bounds and a multi-band CSMA/CA protocol. IEEE/ACM Trans Netw 15(3):533–545 63. Sadeghi P, Kennedy RA, Rapajic PB, Shams R (2008) Finite-state Markov modeling of fading channels – a survey of principles and applications. IEEE Signal Process Mag 25(5):57–80 64. Sanayei S, Nosratinia A (2007) Opportunistic beamforming with limited feedback. IEEE Trans Wireless Commun 6(8):2765–2771 65. Shakkottai S, Srikant R, Stolyar A (2004) Pathwise optimality of the exponential scheduling rule for wireless channels. Adv Appl Probab 36(4):1021–1045 66. Shuman DI (2010) From sleeping to stockpiling: Energy conservation via stochastic scheduling in wireless networks. PhD thesis, University of Michigan, Ann Arbor 67. Shuman DI, Liu M (2008) Energy-efficient transmission scheduling for wireless media streaming with strict underflow constraints. In: Proceedings of the international symposium on modeling and optimization in mobile, ad hoc, and wireless networks, Berlin, Germany, pp. 354–359 68. Shuman DI, Liu M, Wu OQ (2010) Energy-efficient transmission scheduling with strict underflow constraints. IEEE Trans Inform Theory, forthcoming 69. Sobel MJ (1970) Making short-run changes in production when the employment level is fixed. Oper Res 18(1):35–51 70. Srivastava R, Koksal CE (2010) Energy optimal transmission scheduling in wireless sensor networks. IEEE Trans Wireless Commun 9(5):1550–1560 71. Tarello A, Sun J, Zafer M, Modiano E (2008) Minimum energy transmission scheduling subject to deadline constraints. ACM Wireless Netw 14(5):633–645 72. Tassiulas A, Ephremides L (1993) Dynamic server allocation to parallel queues with randomly varying connectivity. IEEE Trans Inform Theory 39(2):466–478 73. Tayur S (1993) Computing the optimal policy for capacitated inventory models. Commun Statist Stochastic Models 9(4):585–598 74. Tse D, Viswanath P (2005) Fundamentals of wireless communication. 
Cambridge University Press, Cambridge
75. Uysal-Biyikoglu E, El Gamal A (2004) On adaptive transmission for energy efficiency in wireless data networks. IEEE Trans Inform Theory 50(12):3081–3094 76. Uysal-Biyikoglu E, Prabhakar B, El Gamal A (2002) Energy-efficient packet transmission over a wireless link. IEEE/ACM Trans Netw 10(4):487–499 77. Viswanath P, Tse DNC, Laroia R (2002) Opportunistic beamforming using dumb antennas. IEEE Trans Inform Theory 48(6):1277–1294 78. Wang H (2003) Opportunistic transmission of wireless data over fading channels under energy and delay constraints. PhD thesis, Rutgers University 79. Zafer M, Modiano E (2005) Continuous-time optimal rate control for delay constrained data transmission. In: Proceedings of the 43rd Allerton conference on communication, control, and computing, Urbana, IL 80. Zafer M, Modiano E (2007) Delay-constrained energy efficient data transmission over a wireless fading channel. In: Proceedings of the information theory and applications workshop, La Jolla, CA 81. Zahrn FC (2009) Studies of inventory control and capacity planning with multiple sources. PhD thesis, Georgia Institute of Technology 82. Zhang D, Wasserman KM (2002) Transmission schemes for time-varying wireless channels with partial state observations. In: Proceedings of the IEEE INFOCOM, New York, NY, vol 2, pp. 467–476 83. Zipkin PH (2000) Foundations of Inventory Management. McGraw-Hill, New York, NY
A Hybrid Polyhedral Uncertainty Model for the Robust Network Loading Problem

Ayşegül Altın, Hande Yaman, and Mustafa Ç. Pınar
1 Introduction

For a given undirected graph G, the network loading problem (NLP) deals with the design of a least cost network by allocating discrete units of capacitated facilities on the links of G so as to support expected pairwise demands between some endpoints of G. For a telephone company, the problem would be to lease digital facilities for the exclusive use of a customer where there is a set of alternative technologies with different transmission capacities. For example, DS0 is the basis for digital multiplex transmission with a signalling rate of 64 kbit/s. Then, DS1 and DS3 correspond to 24 and 672 DS0s in terms of transmission capacities, respectively. The cost of this private service is the total leasing cost of these facilities, which is determined in a way to offer significant economies of scale. In other words, the least costly combination of these facilities would be devoted to a single customer to ensure communication between its sites. Then the customer would pay just a fixed amount for leasing these facilities and would not make any additional payment in proportion to the amount of traffic its sites exchange with each other. The structure of the leasing cost, which offers economies of scale, complicates the problem [31].

Although the traditional approach is to assume that the customer would be able to provide accurate estimates for point-to-point demands, this is not very likely to happen in real life. Hence, we relax this assumption and study the robust NLP to obtain designs flexible enough to accommodate foreseen fluctuations in demand.
Ayşegül Altın, Department of Industrial Engineering, TOBB University of Economics and Technology, Söğütözü 06560 Ankara, Turkey, e-mail: [email protected]
Hande Yaman, Department of Industrial Engineering, Bilkent University, 06800 Ankara, Turkey, e-mail: [email protected]
Mustafa Ç. Pınar, Department of Industrial Engineering, Bilkent University, 06800 Ankara, Turkey, e-mail: [email protected]
Our aim is to design least-cost networks which remain operational for any feasible realization in a prescribed demand polyhedron. Efforts for incorporating uncertainty in data can be divided into two main categories. The first one is stochastic optimization (SO), where data are represented as random variables with a known distribution and decisions are taken based on expectations. However, it is computationally quite challenging to quantify these expectations. Moreover, given that limited or no historical data are available most of the time, it is not realistic to assume that exact distributions can be obtained reliably. Besides, SO yields decisions that might become infeasible with some probability. This latter issue might lead to undesirable situations, especially when such a tolerance is not acceptable. Alternatively, in robust optimization (RO) data are represented as uncertainty sets like polyhedral sets, and the best decision is the one with the best worst-case performance, i.e., the one that handles the worst-case scenario in the uncertainty set in the most efficient way. Besides, RO is computationally tractable for some polyhedral or conic quadratic uncertainty sets and for many classes of optimization problems [36]. The interested reader can refer to [4, 6, 9–14, 33, 34, 40] for several examples of RO models and methodology.

In network design, the most common component subject to uncertainty is the traffic matrix, i.e., the traffic demand between some node pairs in the network. In this chapter, we study the robust network loading problem under polyhedral uncertainty.

A few polyhedral demand models have found acceptance in the telecommunications network design community. The initial efforts belong to Duffield et al. [18] and Fingerhut et al. [20], who propose the so-called hose model independently, for the design of virtual private networks (VPNs) and broadband networks, respectively. The hose model has quickly become popular since it handles complicated communication requests efficiently and scales well as network sizes continue to grow. This is mainly because it does not require any estimate for pairwise demands but defines the set of feasible demand realizations via bandwidth capacities of some endpoints called terminals in the customer network. Later, Bertsimas and Sim [13, 14] introduce an alternative demand definition where they consider lower and upper bounds on the uncertain coefficients and allow at most a fixed number of coefficients to assume their worst possible values. They suggest using this quota as a measure for the trade-off between the conservatism and the cost of the final design. Their model, which we will refer to as the BS model in the rest of the chapter, has also gained significant acceptance in several applications of network design. In our network design context, the Bertsimas–Sim uncertainty definition amounts to specifying lower and upper bounds on the point-to-point communication demands and to allowing at most a fixed number of pairs to exchange their maximum permissible amount of traffic. Finally, Ben-Ameur and Kerivin [8] propose to use a rather general demand polyhedron, which could be constructed by describing the available information for the specific network using a finite number of linear inequalities. The common point for all these demand models is that, no matter which definition is used, the concern is to determine the least costly link capacity configuration that would remain operational under the worst case.
We will refer to the feasible demand realization which leads
to the most costly capacity configuration for the optimal routing as the worst-case scenario throughout this chapter.

Against this background, our main contribution is to introduce a new demand model called the hybrid model, which specifies lower and upper bounds on the point-to-point communication demands as well as bandwidth capacities as in the hose model. In other words, the hybrid model aims to make the hose model more accurate by incorporating additional information on pairwise demands. The advantage of extra information in the form of lower and upper bounds is to avoid redundant conservatism.

NLP is an important problem, which can be applied to different contexts like private network design or capacity expansion in telecommunications, supply chain capacity planning in logistics, or truck loading in transportation. Existing studies on the deterministic problem can be grouped under several classes. One source of variety is the number of facility alternatives. Single-facility [5, 16, 30, 32, 35] and two-facility [17, 21, 31] problems are the most common types. On the other hand, NLP with flow costs [17, 21, 35] and without flow costs [7, 16, 29–31] are also widely studied. Although static routing is always used, multi-path routing [5, 7, 16, 17, 21, 29–31, 35] and single-path routing [5, 15, 22] lead to a technical classification of the corresponding literature.

Although the deterministic NLP is widely studied, the Rob-NLP literature is rather limited. Karaşan et al. [27] study DWDM network design under demand uncertainty with an emphasis on modelling, whereas Altın et al. [3] provide a compact MIP formulation for Rob-NLP and a detailed polyhedral analysis of the problem for the hose model as well as an efficient branch-and-cut algorithm. Atamtürk and Zhang [6] study the two-stage robust NLP where the capacity is reserved on network links before observing the demands and the routing decision is made afterwards in the second stage. Furthermore, Mudchanatongsuk et al. [33] study an approximation to the robust capacity expansion problem with recourse, where the routing of demands (recourse variables) is limited to a linear function of demand uncertainty. They consider transportation cost and demand uncertainties with binary capacity design variables and show that their approximate solutions reduce the worst-case cost but incur sub-optimality in several instances.

In addition to introducing the hybrid model, we initially give a compact mixed integer programming model of Rob-NLP with an arbitrary demand polyhedron [3]. Then we focus on the hybrid model and discuss two alternative MIP models for the corresponding Rob-NLP. Next, we compare them in terms of the computational performance using an off-the-shelf MIP solver and mention the differences in terms of the polyhedral structures of their feasible sets. Moreover, we provide an experimental economic analysis of the impact of robustness on the design cost. Finally, we compare the final design cost for the hybrid model with those for the hose and BS models.

In Section 2 we first define our problem briefly. Then in Section 2.1, we introduce the hybrid model as a new, general-purpose demand uncertainty definition and present alternative MIP models. We present results of computational tests in
Section 3. Then, we conclude the chapter with Section 4, where we summarize our results and mention some future research directions.
2 Problem Definition

For a given undirected graph G = (V, E) with the set of nodes V and the set of edges E, we want to design a private network among the set of customer sites W ⊆ V. Let Q be the set of commodities where each commodity q ∈ Q corresponds to a potential communication demand from the origin site o(q) ∈ W to the destination site t(q) ∈ W \ {o(q)}. In this chapter, our main concern is to allocate a discrete number of facilities with different capacities on the edges of G to design a least-cost network viable for any demand realization in a prescribed polyhedral set. Let L be the set of facility types, C^l be the capacity of each type l facility, p_e^l be the cost of using one unit of type l facility on link e, and y_e^l be the number of type l ∈ L facilities reserved on link e. Moreover, d_q is the estimated demand from node o(q) to node t(q), whereas f_{hk}^q is the fraction of d_q routed on the edge {h, k} ∈ E in the direction from h to k. Throughout the chapter, we will sometimes use {h, k} in place of edge e if we need to mention the end points of e. Using this notation, the MIP formulation for the traditional NLP is as follows:

min \sum_{e ∈ E} \sum_{l ∈ L} p_e^l y_e^l   (1)

s.t. \sum_{k:{h,k} ∈ E} ( f_{hk}^q − f_{kh}^q ) = { 1 if h = o(q); −1 if h = t(q); 0 otherwise }   ∀ h ∈ V, q ∈ Q,   (2)

\sum_{q ∈ Q} ( f_{hk}^q + f_{kh}^q ) d_q ≤ \sum_{l ∈ L} C^l y_e^l   ∀ e = {h, k} ∈ E,   (3)

y_e^l ≥ 0 and integer   ∀ l ∈ L, e ∈ E,   (4)

f_{hk}^q, f_{kh}^q ≥ 0   ∀ {h, k} ∈ E, q ∈ Q.   (5)
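As a purely illustrative rendering of (1), (2), (3), (4), and (5), the following Python sketch builds the nominal NLP for a hypothetical three-node instance, assuming the open-source PuLP modelling library; the graph, demands, and facility data are invented for illustration only.

# A minimal sketch of the nominal NLP (1)-(5) on a toy triangle network,
# assuming the PuLP modelling library (pip install pulp).  All data below are
# hypothetical; only the model structure follows the formulation in the text.
import pulp

V = [0, 1, 2]
E = [(0, 1), (1, 2), (0, 2)]                  # undirected edges {h,k}
Q = {"q1": (0, 2, 5.0), "q2": (1, 2, 3.0)}    # commodity: (o(q), t(q), d_q)
L = {"l1": (4.0, 10.0), "l2": (12.0, 25.0)}   # facility type: (C^l, cost per unit on any edge)

prob = pulp.LpProblem("nominal_NLP", pulp.LpMinimize)
y = {(e, l): pulp.LpVariable(f"y_{e[0]}_{e[1]}_{l}", lowBound=0, cat="Integer")
     for e in E for l in L}
f = {(e, q, d): pulp.LpVariable(f"f_{e[0]}_{e[1]}_{q}_{d}", lowBound=0)
     for e in E for q in Q for d in ("hk", "kh")}

# Objective (1): total facility installation cost.
prob += pulp.lpSum(L[l][1] * y[e, l] for e in E for l in L)

# Flow conservation (2) for every node and commodity (flows are fractions of d_q).
for q, (o, t, dq) in Q.items():
    for h in V:
        out_flow = pulp.lpSum(f[e, q, "hk"] if e[0] == h else f[e, q, "kh"]
                              for e in E if h in e)
        in_flow = pulp.lpSum(f[e, q, "kh"] if e[0] == h else f[e, q, "hk"]
                             for e in E if h in e)
        rhs = 1 if h == o else (-1 if h == t else 0)
        prob += out_flow - in_flow == rhs

# Capacity constraints (3): routed demand on each edge fits installed capacity.
for e in E:
    prob += (pulp.lpSum((f[e, q, "hk"] + f[e, q, "kh"]) * Q[q][2] for q in Q)
             <= pulp.lpSum(L[l][0] * y[e, l] for l in L))

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], pulp.value(prob.objective))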
The main motivation of the current work is to incorporate some robustness and flexibility into the capacity configuration decision. Accordingly, we consider the possibility of changes in demand expectations and determine the least-cost design based on a polyhedral set of admissible demands rather than a single matrix of average estimates. Let D = { d ∈ ℝ^{|Q|} : \sum_{q ∈ Q} a_z^q d_q ≤ α_z ∀ z = 1, ..., H, d_q ≥ 0 ∀ q ∈ Q } be the polyhedron containing non-simultaneous demand matrices, which are admissible given the available information about the network. The most significant impact of such an extension is observed in constraint (3), which has to be replaced with

\sum_{q ∈ Q} ( f_{hk}^q + f_{kh}^q ) d_q ≤ \sum_{l ∈ L} C^l y_e^l   ∀ d ∈ D, e = {h, k} ∈ E,   (6)
since we want the final capacity configuration to support any feasible realization d ∈ D. However, this leads to a semi-infinite optimization problem since we need one constraint for each of the infinitely many feasible demand matrices in D. To overcome this difficulty, we use a common method in robust optimization [1, 11, 13] and obtain a compact MIP formulation of the problem. This method, which was also used in [3], is now briefly summarized for the sake of completeness. First, observe that any one of the infinitely many non-simultaneous feasible communication requests d ∈ D would be routed safely along each link if the capacity of each link is sufficient to route the most capacity consuming, i.e., the worst-case, admissible traffic requests. As a result, we can model our problem using the following semi-infinite MIP formulation (NLP_pol):

min \sum_{e ∈ E} \sum_{l ∈ L} p_e^l y_e^l

s.t. (2), (4), (5)

\max_{d ∈ D} \sum_{q ∈ Q} ( f_{hk}^q + f_{kh}^q ) d_q ≤ \sum_{l ∈ L} C^l y_e^l   ∀ e = {h, k} ∈ E,   (7)

where we replace (3) with (7) to ensure (6). Notice that given a routing f, we can obtain the worst-case capacity requirement for each link e ∈ E by solving the linear programming problem on the left-hand side of (7). Hence for each link e = {h, k} ∈ E, we can apply a duality-based transformation to the maximization problem in (7) and reduce NLP_pol to the following compact MIP formulation (NLP_D):

min \sum_{e ∈ E} \sum_{l ∈ L} p_e^l y_e^l   (8)

s.t. (2), (4), (5)

\sum_{z=1}^{H} α_z λ_e^z ≤ \sum_{l ∈ L} C^l y_e^l   ∀ e ∈ E,   (9)

f_{hk}^q + f_{kh}^q ≤ \sum_{z=1}^{H} a_z^q λ_e^z   ∀ e = {h, k} ∈ E, q ∈ Q,   (10)

λ_e^z ≥ 0   ∀ z = 1, ..., H, e ∈ E,   (11)
where λ ∈ ℝ^{H|E|} are the dual variables used in the transformation. The interested reader can refer to Altın et al. [2] for a more detailed discussion of this approach. The above duality transformation motivates two important contributions. First, we obtain a compact MIP formulation, which we can solve for small-to-medium-sized instances using off-the-shelf MIP solvers. Moreover, we get rid of the bundle constraints (3), which complicate the polyhedral studies on traditional NLP. In Altın et al. [3], we benefit from this single-commodity decomposition property and provide a thorough polyhedral analysis for the so-called symmetric hose model of demand uncertainty. In the next section, we will introduce the hybrid model as
a new general-purpose polyhedral demand definition and study Rob-NLP for this uncertainty set.
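The essential step behind (7), (9), (10), and (11) is that, for a fixed routing f, the left-hand side of (7) is a linear programme over D whose dual supplies the compact constraints. The sketch below, again assuming PuLP and hypothetical toy data, solves such an inner maximization for a single link together with its dual and checks that the two optimal values coincide.

# A small numerical check of the duality step behind (7)-(11), assuming PuLP.
# The polyhedron D = {d >= 0 : sum_q a_z^q d_q <= alpha_z, z = 1..H} and the
# routing fractions on one link are hypothetical toy data.
import pulp

Q = ["q1", "q2", "q3"]
a = {1: {"q1": 1.0, "q2": 1.0, "q3": 0.0},    # a_z^q coefficients
     2: {"q1": 0.0, "q2": 1.0, "q3": 1.0}}
alpha = {1: 8.0, 2: 6.0}
f_e = {"q1": 0.7, "q2": 1.0, "q3": 0.4}       # f_hk^q + f_kh^q on link e

# Primal: worst-case load on link e, i.e. the maximization on the left of (7).
primal = pulp.LpProblem("worst_case_load", pulp.LpMaximize)
d = {q: pulp.LpVariable(f"d_{q}", lowBound=0) for q in Q}
primal += pulp.lpSum(f_e[q] * d[q] for q in Q)
for z in alpha:
    primal += pulp.lpSum(a[z][q] * d[q] for q in Q) <= alpha[z]
primal.solve(pulp.PULP_CBC_CMD(msg=False))

# Dual: the expression appearing in (9) with variables lambda_e^z.
dual = pulp.LpProblem("dual_load", pulp.LpMinimize)
lam = {z: pulp.LpVariable(f"lam_{z}", lowBound=0) for z in alpha}
dual += pulp.lpSum(alpha[z] * lam[z] for z in alpha)
for q in Q:                                    # corresponds to constraint (10)
    dual += pulp.lpSum(a[z][q] * lam[z] for z in alpha) >= f_e[q]
dual.solve(pulp.PULP_CBC_CMD(msg=False))

print(pulp.value(primal.objective), pulp.value(dual.objective))  # equal by LP duality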
2.1 The Hybrid Model

Due to the dynamic nature of the current business environment, the variety of communication needs keeps increasing and hence it gets harder to accurately estimate point-to-point demands. Therefore, designing networks that are flexible enough to handle multiple demand scenarios efficiently so as to improve network availability has become a crucial issue. Although considering a finite number of potential scenarios is a well-known approach in stochastic optimization, Duffield et al. [18] and Fingerhut et al. [20] introduced the hose model as a first effort to use polyhedral demand uncertainty sets in the telecommunications context. In its most general form, which is called the asymmetric hose model, outflow (b_s^+) and inflow (b_s^-) capacities are set for each customer site s ∈ W as

\sum_{q ∈ Q: o(q) = s} d_q ≤ b_s^+   ∀ s ∈ W,   (12)

\sum_{q ∈ Q: t(q) = s} d_q ≤ b_s^-   ∀ s ∈ W.   (13)

Then the corresponding polyhedron is D_Asym = { d ∈ ℝ^{|Q|} : (12), (13), d_q ≥ 0 ∀ q ∈ Q }. There is also the symmetric hose model, with a single capacity b_s for the total flow that can be incident to node s ∈ W, and a sum-symmetric hose model with \sum_{s ∈ W} b_s^+ = \sum_{s ∈ W} b_s^-. The hose model has several strengths. To name a few, the transmission capacities can be estimated more reliably and easily than individual point-to-point demands, especially if a sufficient amount of statistical information is not available. Moreover, it offers resource-sharing flexibility and hence improved link utilization due to multiplexing. Basically, the size of an access link can be smaller if we use the hose model rather than point-to-point lines with fixed resource sharing. These and several other competitive advantages helped the hose model to prevail within the telecommunications community [1, 3, 19, 23–26, 28, 39].

On the other hand, Bertsimas and Sim [13, 14] proposed the BS model or the restricted interval model, which defines an applicable interval for each pairwise demand such that at most a fixed number of demands can take their highest values simultaneously. For our problem, this implies that d_q ∈ [d̄_q, d̄_q + d̂_q] for all q ∈ Q and at most Γ of these demands differ from d̄_q ≥ 0 at the same time. If we define each demand q ∈ Q as d_q = d̄_q + d̂_q β_q with β_q ∈ {0, 1}, then the BS model requires \sum_{q ∈ Q} β_q ≤ Γ. Bertsimas and Sim [14] use Γ to control the conservatism of the final design.
In this section, we introduce the hybrid model. Although we study a private network design problem, the hybrid model can certainly be used in any context where parameter uncertainty is a point at issue. We call this new model hybrid since a demand matrix d ∈ ℝ^{|Q|} has to satisfy the slightly modified symmetric hose constraint

\sum_{q ∈ Q: o(q) = s ∨ t(q) = s} d_q ≤ b_s   ∀ s ∈ W,   (14)

as well as the interval restrictions

d_q ≤ u_q   ∀ q ∈ Q,   (15)

d̄_q ≤ d_q   ∀ q ∈ Q   (16)

to be admissible. As a result, the corresponding demand polyhedron for the hybrid model is D_hyb = { d ∈ ℝ^{|Q|} : (14), (15), (16) }. We should remark here that (14) should not be considered as analogous to the conservatism level restriction in the BS model. Actually, the conservatism dimension is not articulated explicitly in the hybrid model, where the main purpose is to incorporate more information into the hose definition so as to avoid overly conservative designs taking care of unlikely worst-case demand realizations. Hence, notice that the hybrid model is a hybrid of the hose model and the interval uncertainty model but not the BS model. Finally, we are interested in the case where D_hyb ≠ ∅ and hence \sum_{q ∈ Q: o(q) = s ∨ t(q) = s} d̄_q ≤ b_s for all s ∈ W, to have a meaningful design problem.

Let D_sym = { d ∈ ℝ^{|Q|} : (14), d_q ≥ 0 ∀ q ∈ Q }. Notice that D_hyb = D_sym if d̄_q = 0 and u_q ≥ min{ b_{o(q)}, b_{t(q)} } for all q ∈ Q, whereas D_hyb ⊆ D_sym otherwise. This is because we can avoid over-conservative designs and hence redundant investment by using additional information and our provisions about the specific topology.
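Computationally, membership in D_hyb is a simple feasibility check of (14), (15), and (16); the following short sketch illustrates it on hypothetical data.

# A minimal membership test for the hybrid demand polyhedron D_hyb defined by
# (14), (15), and (16).  All sites, bounds, and demands are hypothetical.

def in_D_hyb(d, commodities, b, d_lower, d_upper, tol=1e-9):
    """d maps commodity -> demand; commodities maps commodity -> (o(q), t(q))."""
    # Interval restrictions (15) and (16): d_lower[q] <= d[q] <= d_upper[q].
    for q, val in d.items():
        if val < d_lower[q] - tol or val > d_upper[q] + tol:
            return False
    # Modified symmetric hose constraint (14): total traffic incident to each
    # site s (as origin or destination) may not exceed b[s].
    for s in b:
        incident = sum(val for q, val in d.items() if s in commodities[q])
        if incident > b[s] + tol:
            return False
    return True

commodities = {"q1": ("A", "B"), "q2": ("A", "C"), "q3": ("B", "C")}
b = {"A": 10.0, "B": 7.0, "C": 9.0}
d_lower = {"q1": 1.0, "q2": 0.5, "q3": 1.0}
d_upper = {"q1": 6.0, "q2": 4.0, "q3": 5.0}

print(in_D_hyb({"q1": 5.0, "q2": 3.0, "q3": 2.0}, commodities, b, d_lower, d_upper))  # True
print(in_D_hyb({"q1": 6.0, "q2": 4.0, "q3": 2.0}, commodities, b, d_lower, d_upper))  # False: site B carries 8 > b_B = 7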
2.2 Robust NLP with the Hybrid Model

In this section, we will briefly discuss two alternative MIP models for Rob-NLP with the hybrid model of demand uncertainty. The first formulation follows directly from the discussions in Section 2. On the other hand, we slightly modify our D_hyb so as to express it in terms of deviations from the nominal values to obtain the second formulation. First, notice that NLP_D reduces to the following compact MIP formulation (NLPhyb) for the hybrid model:
min \sum_{e ∈ E} \sum_{l ∈ L} p_e^l y_e^l

s.t. (2), (4), (5)

\sum_{s ∈ W} b_s w_e^s + \sum_{q ∈ Q} ( u_q λ_e^q − d̄_q μ_e^q ) ≤ \sum_{l ∈ L} C^l y_e^l   ∀ e ∈ E,   (17)

w_e^{o(q)} + w_e^{t(q)} + λ_e^q − μ_e^q ≥ f_{hk}^q + f_{kh}^q   ∀ q ∈ Q, e = {h, k} ∈ E,   (18)

λ_e^q, μ_e^q ≥ 0   ∀ q ∈ Q, e ∈ E,   (19)

w_e^s ≥ 0   ∀ s ∈ W, e ∈ E,   (20)

where w, λ, and μ are the dual variables used in the duality transformation corresponding to constraints (14), (15), and (16), respectively.
2.3 Alternative Flow Formulation

Given the hybrid model, we know that the best-case scenario would be the one where all demands are at their lower bounds. Then the total design cost would increase as the deviation from this best case increases. Consequently, we can restate the hybrid model in terms of the deviations from lower bounds, which requires us to modify NLP_pol by replacing the link capacity constraints (7) with
\sum_{q ∈ Q} ( f_{hk}^q + f_{kh}^q ) d̄_q + \max_{d̂ ∈ D̂_hyb} \sum_{q ∈ Q} ( f_{hk}^q + f_{kh}^q ) d̂_q ≤ \sum_{l ∈ L} C^l y_e^l   ∀ e = {h, k} ∈ E,   (21)

where D̂_hyb = { d̂ ∈ ℝ^{|Q|} : 0 ≤ d̂_q ≤ Δ_q ∀ q ∈ Q; \sum_{q ∈ Q: o(q) = s ∨ t(q) = s} d̂_q ≤ ḃ_s ∀ s ∈ W } such that Δ_q = u_q − d̄_q for all q ∈ Q and ḃ_s = b_s − \sum_{q ∈ Q: o(q) = s ∨ t(q) = s} d̄_q for all s ∈ W. This observation leads to the following result.

Proposition 1 NLP_D reduces to the following compact linear MIP formulation (NLPalt) for the hybrid model:

min \sum_{e ∈ E} \sum_{l ∈ L} p_e^l y_e^l

s.t. (2), (4), (5)

\sum_{q ∈ Q} ( f_{hk}^q + f_{kh}^q ) d̄_q + \sum_{s ∈ W} ḃ_s ν_e^s + \sum_{q ∈ Q} Δ_q η_e^q ≤ \sum_{l ∈ L} C^l y_e^l   ∀ e = {h, k} ∈ E,

ν_e^{o(q)} + ν_e^{t(q)} + η_e^q ≥ f_{hk}^q + f_{kh}^q   ∀ q ∈ Q, e = {h, k} ∈ E,

ν_e^{o(q)}, ν_e^{t(q)}, η_e^q ≥ 0   ∀ q ∈ Q, e ∈ E.

Proof For link e = {h, k} ∈ E, the capacity assignment y = (y_e^1, ..., y_e^{|L|}) should be sufficient to route the worst-case demand as in (21). Then, we can model the maximization problem on the left-hand side of (21) as

max \sum_{q ∈ Q} ( f_{hk}^q + f_{kh}^q ) d̂_q   (22)

s.t. \sum_{q ∈ Q: o(q) = s ∨ t(q) = s} d̂_q ≤ ḃ_s   ∀ s ∈ W,   (23)

d̂_q ≤ Δ_q   ∀ q ∈ Q,   (24)

d̂_q ≥ 0   ∀ q ∈ Q.   (25)

Notice that for a given routing f, this is a linear programming problem. Since it is feasible and bounded, we can apply a duality transformation similar to Soyster [38]. So, we associate the dual variables ν_e^s and η_e^q with (23) and (24), respectively, and obtain the equivalent dual formulation

min \sum_{s ∈ W} ḃ_s ν_e^s + \sum_{q ∈ Q} Δ_q η_e^q   (26)

s.t. ν_e^{o(q)} + ν_e^{t(q)} + η_e^q ≥ f_{hk}^q + f_{kh}^q   ∀ q ∈ Q,   (27)

ν_e^s, η_e^q ≥ 0   ∀ s ∈ W, q ∈ Q.   (28)

Next, we can complete the proof by replacing the maximization problem in (21) with its equivalent (26), (27), (28) and removing the min operator, since the facility capacities C^l and the reservation costs p_e^l are nonnegative for all l ∈ L.

We show in Section 3 that off-the-shelf MIP solvers can handle NLPalt better than NLPhyb in some instances.
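The deviation-based restatement can also be checked numerically: for a fixed routing on one link, the worst-case load computed directly over D_hyb equals the nominal load at d̄ plus the worst case over the shifted set D̂_hyb, as in (21). The sketch below performs this check with PuLP on hypothetical data.

# A toy check of the equivalence used in (21): the worst-case load over D_hyb
# equals the nominal load at d_bar plus the worst case over the shifted set.
# PuLP is assumed; the instance data are hypothetical.
import pulp

W = ["A", "B", "C"]
commodities = {"q1": ("A", "B"), "q2": ("A", "C"), "q3": ("B", "C")}
b = {"A": 10.0, "B": 7.0, "C": 9.0}
d_bar = {"q1": 1.0, "q2": 0.5, "q3": 1.0}
u = {"q1": 6.0, "q2": 4.0, "q3": 5.0}
f_e = {"q1": 0.8, "q2": 0.3, "q3": 1.0}      # f_hk^q + f_kh^q on the link

def worst_case(lower, upper, cap):
    m = pulp.LpProblem("wc", pulp.LpMaximize)
    d = {q: pulp.LpVariable(q, lowBound=lower[q], upBound=upper[q]) for q in f_e}
    m += pulp.lpSum(f_e[q] * d[q] for q in f_e)
    for s in W:
        m += pulp.lpSum(d[q] for q in f_e if s in commodities[q]) <= cap[s]
    m.solve(pulp.PULP_CBC_CMD(msg=False))
    return pulp.value(m.objective)

# Direct worst case over D_hyb.
direct = worst_case(d_bar, u, b)

# Shifted data: Delta_q = u_q - d_bar_q and b_dot_s = b_s - incident d_bar sum.
delta = {q: u[q] - d_bar[q] for q in f_e}
b_dot = {s: b[s] - sum(d_bar[q] for q in f_e if s in commodities[q]) for s in W}
shifted = sum(f_e[q] * d_bar[q] for q in f_e) + worst_case(
    {q: 0.0 for q in f_e}, delta, b_dot)

print(direct, shifted)   # the two values coincide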
3 Experimental Results

In this section, we focus on the single-facility multi-commodity problem where just one type of facility with C units of capacity is available. We perform our analysis in two stages. First, we compare the performance of ILOG Cplex for the two compact MIP formulations NLPhyb and NLPalt in terms of the solution times and bounds they provide at the end of a 2-h time limit. The instances polska, dfn, newyork, france, janos, atlanta, tai, nobel-eu, pioro, and sun are from the SNDLIB web site [37], whereas the remaining ones were used in Altın et al. [1] for a virtual private network design problem. For the SNDLIB instances [37], we have the average demand estimates d̃_q. In order to generate the bandwidth values as well as the lower and upper bounds on pairwise communication demands, we have used the following relations:
• b_s = \sum_{q ∈ Q: o(q) = s ∨ t(q) = s} d̃_q;
• d̄_q = d̃_q / (1 + p);
• u_q = (1 + p) d̃_q.

In our tests, we choose p = 0.2. This parameter can be determined based on the available information about the demand pattern, past experience, etc. We should note that D_hyb would not get smaller as p increases and hence the optimal design would never get less conservative. By defining b_s, u_q, and d̄_q as functions of p, we can interpret the trade-off between the conservatism of a design and its cost. The interested reader can refer to Altın et al. [3] for an analogous parametric analysis of the symmetric hose model. We have used AMPL to model the formulations and the Cplex 11.0 MIP solver for the numerical tests, and set a 2-h solution time limit for all instances.
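A small sketch of this instance generation, following the relations listed above, is given below; the commodity data are hypothetical.

# A sketch of the bound generation described above: bandwidths and demand
# bounds are derived from average demand estimates d_tilde_q with a spread
# parameter p.  The commodity data below are hypothetical.

def generate_bounds(d_tilde, commodities, sites, p=0.2):
    b = {s: sum(val for q, val in d_tilde.items() if s in commodities[q])
         for s in sites}                                          # b_s
    d_bar = {q: val / (1.0 + p) for q, val in d_tilde.items()}    # lower bounds
    u = {q: (1.0 + p) * val for q, val in d_tilde.items()}        # upper bounds
    return b, d_bar, u

commodities = {"q1": ("A", "B"), "q2": ("A", "C")}
d_tilde = {"q1": 5.0, "q2": 3.0}
print(generate_bounds(d_tilde, commodities, sites=["A", "B", "C"], p=0.2))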
We present the results of the initial comparison of the two MIP models in Table 1, where we provide the following information:

• z_hyb: best total design cost for NLPhyb at termination,
• t_hyb: solution time in CPU seconds for NLPhyb,
• G_hyb: the gap at termination for NLPhyb,
• #hyb: number of B&C nodes for NLPhyb, which is 0 if no branching takes place,
• z_alt: best total design cost for NLPalt at termination,
• t_alt: solution time in CPU seconds for NLPalt,
• G_alt: the gap at termination for NLPalt,
• #alt: number of B&C nodes for NLPalt, which is 0 if no branching takes place,
• ∗ indicates the best upper bound at termination,
• INF means that we have no integer solution at termination,
• '–' under the z columns shows that even the LP relaxation cannot be solved in the 2-h time limit.
We could solve both NLPhyb and NLPalt for 7 out of 17 instances to optimality within the 2-h time limit. In addition to that, Cplex could also solve NLPalt for polska with C = 1000. For the same instance, Cplex could reduce the integrality gap to 2.16% with NLPhyb. We will analyze our results in two stages. Initially, in Fig. 1, we show the reduction in solution times when NLPalt rather than NLPhyb is solved for the first seven instances in Table 1. We measure this improvement as (t_hyb − t_alt)/t_hyb × 100, and thus positive values show the instances that are easier to solve using the alternative formulation. We see that, except for bhv6c and pdh, NLPalt is easier to solve for Cplex. On the other hand, for the remaining 10 instances, we see that Cplex achieved better upper bounds with NLPalt in 6 cases. Figure 2 displays the termination gaps for both models. Note that for newyork, nobel-eu, and sun, Cplex could not solve even the LP relaxation of NLPhyb, whereas we have some upper bounds for NLPalt. We let gap_hyb = 105% in Fig. 2 for these three instances just for convenience. On the other hand, we see that the upper bounds on total design cost at termination are smaller with NLPhyb for tai and janos, whereas there is a tie for polska (C = 155) and dfn.
Table 1 A comparison of the projected formulation and the alternative flow formulation

Instance   (|V|,|E|,|W|,C)    z_hyb            t_hyb (G_hyb)   #(hyb)   z_alt              t_alt (G_alt)
metro      (11,42,5,24)       768              23.02           428      768                13.28
nsf1b      (14,21,10,24)      86,600           246             2293     86,600             134.95
at-cep1    (15,22,6,24)       47,840           1.61            66       47,840             1.41
pacbell    (15,21,7,24)       10,410           49.1            1216     10,410             28.06
bhv6c      (27,39,15,24)      810,368          669.09          13,148   810,368            725.19
bhvdc      (29,36,13,24)      952,664          657.09          1210     952,664            149.52
pdh        (11,34,6,480)      2,467,983        318.41          4796     2,467,983          827.58
polska     (12,18,12,155)     44,253∗          (0.77%)         25,871   44,253∗            (1.14%)
polska     (12,18,12,1000)    7478∗            (2.16%)         12,742   7478               4591.14
dfn        (11,47,11,155)     51,572∗          (3.85%)         4993     51,572∗            (7.32%)
newyork    (16,49,16,1000)    –                INF             0        1,318,400∗         (54.10%)
france     (25,45,14,2500)    21,600∗          (3.22%)         915      22,600∗            (9.40%)
atlanta    (15,22,15,1000)    458,020,000∗     (0.11%)         16,023   458,040,000∗       (0.36%)
tai        (24,51,19,504k)    28,702,323.54∗   (20.37%)        140      27,611,428.86∗     (17.85%)
janos      (26,42,26,64)      1,289,931,888    (99.8%)         0        1,289,911,204∗     (99.8%)
nobel-eu   (28,41,28,20)      –                INF             0        14,718,917,910∗    (99.97%)
sun        (27,51,24,40)      –                INF             0        62,938,898.76∗     (99.99%)
Fig. 1 Reduction in solution times when we solve NLPalt rather than NLPhyb with Cplex
Based on the overall comparison of the two models that we show in Fig. 3, we can say that Cplex can solve NLPalt more efficiently, since solution times, termination gaps, and upper bounds are better with NLPalt. On the other hand, we expect that better use can be made of NLPhyb so as to develop efficient solution tools such as a branch-and-cut algorithm. We expect NLPhyb to be more advantageous than NLPalt since the latter does not have the nice single-commodity decomposition property.
Fig. 2 Comparison of termination gaps when we solve NLPalt and NLPhyb with Cplex
Fig. 3 A general comparison of solving NLPalt and NLPhyb with Cplex
Next, we display how the design cost changes according to the demand uncertainty model in Fig. 4. To this end, we consider three models: the interval uncertainty model with D_int = { d ∈ ℝ^{|Q|} : d̄_q ≤ d_q ≤ u_q ∀ q ∈ Q }, the symmetric hose model with D_hose = { d ∈ ℝ^{|Q|} : (14), d_q ≥ 0 ∀ q ∈ Q }, and the hybrid model. Notice that the interval model is a special case of the BS model with Γ = |Q|, and thus the corresponding worst case would be d_q^{worst} = u_q for all q ∈ Q. We consider six instances, which we could solve to optimality in reasonable times for all demand models. We should remark here that we had to terminate the test for the bhvdc and bhv6c instances under the interval uncertainty model after 60,000 CPU seconds with 0.21% and 0.3% gaps, since the best solutions had not changed for a long while and the gaps are relatively small. Let z_det be the total design cost for the deterministic case if we consider the best-case scenario with d_q = d̄_q for all q ∈ Q. Then, for the three demand models, we show the percentage increase in design cost, which is (z_hyb − z_det)/z_det × 100 for the hybrid model and analogous for the other models, in Fig. 4.
Fig. 4 Increase in design cost with respect to the deterministic case for different demand models
For each instance, such an increase can be interpreted as the cost of robustness, or the price that we should be ready to pay so as to have a more flexible network and hence increased service availability. We see that as we shift from the interval model to the hose model and then to the hybrid model, the total design cost decreases significantly; namely, the average increase rates are 44.89%, 29.33%, and 18.31% for these six instances with the three models, respectively. Given that these instances are constructed using the same parameters (b, d̄, u) ∈ ℝ_+^{|W|+2|Q|}, we can interpret this decreasing trend in cost as a consequence of using more informative demand uncertainty sets and hence being protected against practically and technically more realistic worst-case scenarios.

Our worst-case definition over a polyhedron is clearly quite different from simply determining the worst-case scenario a priori. Since we exploit the hybrid model information, we can avoid over-conservative designs. Suppose that we had not done so and instead determined a worst case that can happen using the available information (b, d̄, u) ∈ ℝ_+^{|W|+2|Q|}. Obviously, the safest approach would be to set d_q = min{ b_{o(q)}, b_{t(q)}, u_q } for all q ∈ Q and then solve the nominal problem (1), (2), (3), (4), and (5) to get the optimal design cost z_worst. When we compare the design cost z_hyb with z_worst for the six instances we have mentioned above, we see that the design costs are reduced by 18.27% on average. On the other hand, the average savings is around 10.61% for the hose model. Figure 5 displays the percentage of savings in cost for each instance with both models.

We also compare the design costs for the hybrid model and the BS model for Γ = ⌈0.1|Q|⌉ and Γ = ⌈0.15|Q|⌉. We show the relative savings the hybrid model provides in Fig. 6. We see that when Γ = ⌈0.1|Q|⌉, using the hybrid model yields a less costly design for all instances except nsf1b, where it is only 0.12% worse. On average, the hybrid model provides 6.53% and 10.69% savings, respectively, for these six instances, and the difference increases rapidly as Γ grows larger.
Fig. 5 Reduction in design cost with respect to the worst-case scenario determined without exploiting the demand model
Fig. 6 Savings in design cost by using the hybrid model rather than the BS model (panels for Γ = ⌈0.1|Q|⌉ and Γ = ⌈0.15|Q|⌉)
Finally, we consider the metro, at-cep1, and pacbell instances so as to compare the robust designs for the BS model (Γ = ⌈0.15|Q|⌉) and the hybrid model in terms of their routing performances. For this purpose, we first generate 20 demand matrices ḋ^1, ḋ^2, ..., ḋ^{20} for each instance, where the demand ḋ_q^j for each commodity q ∈ Q is normally distributed with mean d̃_q and standard deviation 0.5 d̃_q. Then, given the optimal capacity configurations y(BS) and y(hyb), we determine the maximum total flow F^j(BS) and F^j(hyb) that we can route for the demand matrix ḋ^j for all j = 1, ..., 20 by solving a linear programming problem. For each demand matrix, we calculate the fraction of demand routed for both models as F^j(BS)/\sum_{q ∈ Q} ḋ_q^j and F^j(hyb)/\sum_{q ∈ Q} ḋ_q^j, respectively. Finally, we take the average over the 20 demand matrices to evaluate the two robust designs. We present our test results in Table 2, where R_hyb and R_BS are the average routing rates for the hybrid and BS models, respectively, whereas Δcost shows the percentage increase in design cost if the BS model rather than the hybrid model is used. We see that the average routing rates are quite close for metro and at-cep1, whereas they are equal for pacbell. On the other hand, y(BS) is clearly more costly than y(hyb) in all instances. Hence, the hybrid model can be said to provide almost the same level of availability at a much lower cost.
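The sampling step of this evaluation can be sketched as follows, assuming numpy; the average demands are hypothetical, negative draws are truncated at zero (the text does not specify how they are handled), and the routing LP that computes F^j(BS) and F^j(hyb) is omitted.

# A sketch of the demand sampling used in the routing-rate comparison above,
# assuming numpy.  Negative draws are simply truncated at zero (an assumption),
# and the evaluation LP for a fixed capacity configuration is omitted.
import numpy as np

rng = np.random.default_rng(0)
d_tilde = {"q1": 5.0, "q2": 3.0, "q3": 4.0}   # hypothetical average demands

def sample_demand_matrices(d_tilde, n=20):
    matrices = []
    for _ in range(n):
        d = {q: max(0.0, rng.normal(loc=m, scale=0.5 * m)) for q, m in d_tilde.items()}
        matrices.append(d)
    return matrices

demand_matrices = sample_demand_matrices(d_tilde)
# Given routed totals F_j from the evaluation LP, the routed fraction would be
# F_j / sum(d.values()) for each sampled matrix d, averaged over the 20 samples.
print(len(demand_matrices), demand_matrices[0])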
Table 2 Routing rate and cost comparison between the hybrid model and the BS model

Instance   R_hyb (%)   R_BS (%)   Δcost (%)
metro      95.7        97.3       9.4
at-cep1    96.6        97.7       14.1
pacbell    99.9        99.9       7.4
4 Conclusion

In this chapter, we introduced the hybrid model as a new demand uncertainty definition. It inherits the strengths of the two well-known and frequently used demand models: it is easy to specify like the hose model and it avoids over-conservatism like the BS model. We provided two compact MIP formulations, i.e., NLPhyb and NLPalt, for robust NLP under the hybrid model and compared them in terms of their computational performances. Finally, we discussed how the optimal design cost changes for different demand models. When compared with the interval model, the hose model, and the BS model, we observed that the hybrid model provides significant cost savings by exploiting additional information to exclude overly pessimistic worst-case scenarios. Our test results are encouraging for undertaking further studies on robust network design problems.
References

1. Altın A, Amaldi A, Belotti P, Pınar MÇ (2007) Provisioning virtual private networks under traffic uncertainty. Networks 49(1):100–115 2. Altın A, Belotti P, Pınar MÇ (2010) OSPF routing with optimal oblivious performance ratio under polyhedral demand uncertainty. Optimiz Eng, 11(3), pp 395–422 3. Altın A, Yaman H, Pınar MÇ (to appear) The robust network loading problem under hose demand uncertainty: formulation, polyhedral analysis, and computations. INFORMS J Comput 4. Atamtürk A (2006) Strong reformulations of robust mixed 0–1 programming. Math Program 108:235–250 5. Atamtürk A, Rajan D (2002) On splittable and unsplittable capacitated network design arc-set polyhedra. Math Program 92:315–333 6. Atamtürk A, Zhang M (2007) Two-stage robust network flow and design under demand uncertainty. Oper Res 55:662–673 7. Avella P, Mattia S, Sassano A (2007) Metric inequalities and the network loading problem. Discrete Optimiz 4:103–114 8. Ben-Ameur W, Kerivin H (2005) Routing of uncertain demands. Optimiz Eng 3:283–313 9. Ben-Tal A, Goryashko A, Guslitzer E, Nemirovski A (2004) Adjustable robust solutions of uncertain linear programs. Math Program Ser A 99:351–376 10. Ben-Tal A, Nemirovski A (1998) Robust convex optimization. Math Oper Res 23(4):769–805 11. Ben-Tal A, Nemirovski A (1999) Robust solutions of uncertain linear programs. Oper Res Lett 25:1–13 12. Ben-Tal A, Nemirovski A (2008) Selected topics in robust convex optimization. Math Program 112(1):125–158 13. Bertsimas D, Sim M (2003) Robust discrete optimization and network flows. Math Program Ser B 98:49–71 14. Bertsimas D, Sim M (2004) The price of robustness. Oper Res 52:35–53
15. Berger D, Gendron B, Potvin J, Raghavan S, Soriano P (2000) Tabu search for a network loading problem with multiple facilities. J Heurist 6:253–267 16. Bienstock D, Chopra S, Günlük O, Tsai C-Y (1998) Minimum cost capacity installation for multi-commodity network flows. Math Program 81:177–199 17. Bienstock D, Günlük O (1996) Capacitated network design – polyhedral structure and computation. INFORMS J Comput 8:243–259 18. Duffield N, Goyal P, Greenberg A, Mishra P, Ramakrishnan K, van der Merive JE (1999) A flexible model for resource management in virtual private networks. In: Proceedings of ACM SIGCOMM, pp 95–108, Massachusetts, USA. 19. Erlebach T, Rúegg M (2004) Optimal bandwidth reservation in hose-model vpns with multipath routing. Proc IEEE Infocom, 4:2275–2282 20. Fingerhut JA, Suri S, Turner JS (1997) Designing least-cost nonblocking broadband networks. J Algorithm, 24(2):287–309 21. Günlük O (1999) A Branch-and-Cut algorithm for capacitated network design problems. Math Program 86:17–39 22. Günlük O, Brochmuller B, Wolsey L (2004) Designing private line networks – polyhedral analysis and computation. Trans Oper Res 16:7–24. Math Program Ser. A (2002) 92:335–358 23. Gupta A, Kleinberg J, Kumar A, Rastogi R, Yener B (2001) Provisioning a virtual private network: a network design problem for multicommodity flow. In: Proceedings of ACM symposium on theory of computing (STOC), Crete, Greece, pp 389–398 24. Gupta A, Kumar A, Roughgarden T (2003) Simpler and better approximation algorithms for network design. In: Proceedings of the ACM symposium on theory of computing (STOC), pp 365–372, San Diego, CA. 25. Italiano G, Rastogi R, Yener B (2002) Restoration algorithms for virtual private networks in the hose model. In: IEEE INFOCOM, pp 131–139. 26. A. Jüttner, Szabó I, Szentesi Á (2003) On bandwidth efficiency of the hose resource management in virtual private networks. In: IEEE INFOCOM, 1:386–395. 27. Kara¸san O, Pınar MÇ, Yaman H (2006) Robust DWDM routing and provisioning under polyhedral demand uncertainty. Technical report, Bilkent University 28. Kumar A, Rastogi R, Silberschatz A, Yener B (2001) Algorithms for provisioning virtual private networks in the hose model. In: SIGCOMM’01, August 27–31, 2001, San Diego, CA, USA 29. Magnanti TL, Mirchandani P (1993) Shortest paths single origin-destination network design, and associated polyhedra. Networks 23:103–121 30. Magnanti TL, Mirchandani P, Vachani R (1993) The convex hull of two core capacitated network design problems. Math Program 60:233–250 31. Magnanti TL, Mirchandani P, Vachani R (1995) Modeling and solving the two-facility capacitated network loading problem. Oper Res 43(1):142–157 32. Mirchandani P (2000) Projections of the capacitated network loading problem. Eur J Oper Res 122:534–560 33. Mudchanatongsuk S, Ordoñez F, Liu J (2008) Robust solutions for network design under transportation cost and demand uncertainty. J Oper Res Soc 59(5):652–662 34. Ordoñez F, Zhao J (2007) Robust capacity expansion of network flows. Networks 50(2): 136–145 35. Rardin RL, Wolsey LA (1993) Valid inequalities and projecting the multicommodity extended formulation for uncapacitated fixed charge network flow problems. Eur J Oper Res 71:95–109 36. Sim M (2009) Distributionally robust optimization: a marriage of robust optimization and stochastic optimization. In: Third nordic optimization symposium, Stockholm, Sweden. 37. http://sndlib.zib.de/home.action. 38. 
Soyster AL (1973) Convex programming with set-inclusive constraints and applications to inexact linear programming. Oper Res 21:1154–1157 39. Swamy C, Kumar A (2002) Primal-dual algorithms for connected facility location problems. In: Proceedings of the international workshop on approximation algorithms for combinatorial optimization (APPROX), Lecture Notes in Computer Science series, 2462:256–270 40. Yaman H, Kara¸san OE, Pınar MÇ (2007) Restricted robust uniform matroid maximization under interval uncertainty. Math Program 110(2):431–441
Analytical Modelling of IEEE 802.11e Enhanced Distributed Channel Access Protocol in Wireless LANs

Jia Hu, Geyong Min, Mike E. Woodward, and Weijia Jia
Jia Hu, Department of Computing, School of Informatics, University of Bradford, Bradford, BD7 1DP, UK, e-mail: [email protected]
Geyong Min, Department of Computing, School of Informatics, University of Bradford, Bradford, BD7 1DP, UK, e-mail: [email protected]
Mike E. Woodward, Department of Computing, School of Informatics, University of Bradford, Bradford, BD7 1DP, UK, e-mail: [email protected]
Weijia Jia, Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave, Hong Kong, e-mail: [email protected]

1 Introduction

The IEEE 802.11-based wireless local area networks (WLANs) have experienced impressive commercial success owing to their low cost and easy deployment [8]. The basic medium access control (MAC) protocol of the IEEE 802.11 standard is the distributed coordination function (DCF) [8], which is based on the carrier sense multiple access with collision avoidance (CSMA/CA) protocol and the binary exponential backoff (BEB) mechanism. The DCF is designed for best-effort traffic only and does not provide any support for priorities or differentiated quality of service (QoS). However, due to the rapid growth of wireless multimedia applications, such as voice-over-IP (VoIP) and video conferencing, there is an ever-increasing demand for provisioning of differentiated QoS in WLANs. To support MAC-level QoS, an enhanced version of the IEEE 802.11 MAC protocol, namely IEEE 802.11e [9], has been standardized. This protocol employs a channel access function called the hybrid coordination function (HCF) [9], which comprises the contention-based enhanced distributed channel access (EDCA) and the centrally controlled hybrid coordinated channel access (HCCA). The EDCA is
the fundamental and mandatory mechanism of 802.11e, whereas HCCA is optional and requires complex scheduling algorithms for resource allocation. This chapter focuses on the analysis of the EDCA, which is an extension of the DCF for the provisioning of QoS. The EDCA classifies traffic flows into four access categories (ACs), each of which is associated with a separate transmission queue and behaves independently. These ACs are differentiated through adjusting the parameters of the arbitration inter-frame space (AIFS), contention window (CW), and transmission opportunity (TXOP) limit [9]. The AIFS and CW schemes control the channel access time, while the TXOP scheme controls the channel occupation time after accessing the channel.

Analytical performance evaluation of the DCF and EDCA has been studied extensively in recent years [1–3, 5–7, 10, 12–36]. Since traffic loads are unsaturated in realistic network environments, the development of analytical models for the DCF and EDCA under unsaturated conditions is an important and open research issue. To obtain a thorough understanding of the performance of EDCA, we propose a comprehensive analytical model that incorporates the three QoS differentiation schemes, AIFS, CW, and TXOP, simultaneously in IEEE 802.11e WLANs under unsaturated traffic loads. First, we develop a novel three-dimensional Markov chain to analyse the backoff process of each AC. Afterwards, the transmission queue at each AC is modelled as a bulk service queueing system with finite capacity to address the difficulties of queueing analysis arising from the TXOP burst transmission scheme. The accuracy of the proposed model is validated by comparing the analytical results to those obtained from extensive NS-2 simulation experiments.

The rest of the chapter is organized as follows. Section 2 introduces the MAC protocols. Section 3 provides a detailed survey of the related work on modelling of the DCF and EDCA. Section 4 presents the derivation of the analytical model for EDCA. After validating the accuracy of the model in Section 5, we conclude the chapter in Section 6.
2 Medium Access Control

2.1 Distributed Coordination Function (DCF)

The DCF is the fundamental channel access scheme in the IEEE 802.11 MAC protocol [8]. A station with backlogged frames first senses the channel before transmission. If the channel is detected idle for a distributed inter-frame space (DIFS), the station transmits the frame. Otherwise, the station defers until the channel is detected idle for a DIFS and then starts a backoff procedure by generating a random backoff counter. The value of the backoff counter is uniformly chosen between zero and W − 1, where the contention window W is initially set to Wmin and doubled after each unsuccessful transmission until it reaches a maximum value Wmax. It is reset to Wmin after a successful transmission or when the number of unsuccessful transmission attempts reaches the retry limit. The backoff counter is decreased by 1 for each time slot when the channel is idle, halted when
the channel is busy, and resumed when the channel becomes idle again for a DIFS. A station transmits a frame when its backoff counter reaches zero. Upon the successful reception of the frame, the receiver sends back an ACK frame immediately after a short inter-frame space (SIFS). If the station does not receive the ACK within a timeout interval [8], it retransmits the frame. Each station maintains a retry counter that is increased by 1 after each retransmission. The frame is discarded after an unsuccessful transmission if the retry counter reaches the retry limit. The above-mentioned procedure is referred to as the basic access method. The hidden terminal problem [18] occurs when a station is unable to detect a potential competitor for the channel because they are not within the hearing range of each other. To combat the hidden terminal problem, DCF also defines an optional four-way handshake scheme whereby the source and destination exchange request-to-send (RTS) and clear-to-send (CTS) messages before the transmission of the actual data frame.
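The backoff rules described above can be made concrete with a short simulation sketch. The following Python fragment is a minimal sketch in which the window sizes, retry limit, and fixed collision probability are illustrative assumptions rather than values taken from the standard; it only reproduces the contention-window doubling and reset behaviour of a single DCF station.

```python
import random

def dcf_backoff_counters(w_min=32, w_max=1024, retry_limit=7, p_collision=0.3, frames=5, seed=1):
    """Illustrative DCF binary exponential backoff: record the backoff counter
    drawn for each (re)transmission attempt of each frame."""
    rng = random.Random(seed)
    history = []
    for _ in range(frames):
        cw, retries = w_min, 0
        while True:
            counter = rng.randrange(cw)        # uniform in [0, W-1]
            history.append((retries, cw, counter))
            if rng.random() >= p_collision:    # transmission succeeded
                break
            retries += 1
            if retries > retry_limit:          # frame dropped, CW reset for next frame
                break
            cw = min(2 * cw, w_max)            # double CW after an unsuccessful attempt
    return history

for stage, cw, counter in dcf_backoff_counters():
    print(f"stage={stage} CW={cw} counter={counter}")
```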
2.2 Enhanced Distributed Channel Access (EDCA)

The EDCA was designed to enhance the performance of the DCF and to provide differentiated QoS [9]. As shown in Fig. 1, traffic of different classes is assigned to one of four ACs, each of which is associated with a separate transmission queue and behaves independently of the others in each station. The QoS of these ACs is differentiated through assigning different EDCA parameters, including AIFS values, CW sizes, and TXOP limits. Specifically, a smaller AIFS/CW results in a larger probability of winning the contention for the channel. On the other hand, the larger the TXOP limit, the longer the channel holding time of the station winning the contention.

Fig. 1 The IEEE 802.11e MAC with four ACs

Fig. 2 The timing diagram of the EDCA channel access

The operation of channel access in EDCA is similar to the DCF, as shown in Fig. 2. Before generating the backoff counter, the EDCA function must sense the channel to be idle for an AIFS instead of a DIFS as in the DCF. The AIFS for a given AC is defined as

$$\mathrm{AIFS}[\mathrm{AC}] = \mathrm{SIFS} + \mathrm{AIFSN}[\mathrm{AC}] \times a\mathit{SlotTime},$$

where AIFSN[AC] (AIFSN[AC] ≥ 2) is an integer value that determines AIFS[AC] and aSlotTime is the duration of a time slot. When the EDCA function generates the backoff counter before transmission, the AC-specific Wmin and Wmax are used to differentiate the CW sizes of the backoff counters. Upon winning the contention for the channel, the AC transmits the frames available in its queue consecutively, provided that the duration of the transmission does not exceed the specific TXOP limit [9]. Each frame is acknowledged by an ACK after an SIFS. The next frame is transmitted after a further SIFS upon receiving this ACK. If the transmission of any frame fails, the burst is terminated and the AC contends again for the channel to retransmit the failed frame. When the backoff counters of different ACs within a station decrease to zero simultaneously, the frame from the highest-priority AC among the contending ones is selected for transmission on the channel, while the others suffer a virtual collision and invoke the backoff procedure with a doubled CW value.

The backoff rule of EDCA is slightly different from that of DCF. In DCF, the backoff counter is frozen during the channel busy period, resumed after the channel is sensed idle for a DIFS, and decreased by 1 at the end of the first slot following the DIFS. In EDCA, however, the backoff counter is resumed one slot time before the end of the AIFS, which means that the backoff counter has already been decremented by 1 at the end of the AIFS. In addition, after the backoff counter decrements to zero, the AC has to wait for an extra slot before transmitting.
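As a small worked example of the AIFS relation above, the snippet below evaluates AIFS[AC] for a few AIFSN values using the SIFS and slot durations of the 802.11b parameter set used later in Table 1; the chosen AIFSN values are illustrative assumptions.

```python
SIFS_US = 10   # SIFS duration in microseconds (Table 1)
SLOT_US = 20   # physical slot time in microseconds (Table 1)

def aifs_us(aifsn):
    """AIFS[AC] = SIFS + AIFSN[AC] x aSlotTime, in microseconds."""
    assert aifsn >= 2, "the standard requires AIFSN[AC] >= 2"
    return SIFS_US + aifsn * SLOT_US

for aifsn in (2, 3, 6, 7):
    print(f"AIFSN={aifsn}: AIFS={aifs_us(aifsn)} us")
# AIFSN = 2 gives 50 us (equal to DIFS in 802.11b); larger AIFSN values defer longer.
```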
3 Related Work

3.1 Overview of Analytical Models of the DCF

Performance modelling of the DCF has attracted considerable research effort [1, 2, 13, 15, 17, 18, 20, 27, 31, 33, 34, 36]. For instance, Bianchi [1] proposed a bi-dimensional discrete-time Markov chain to derive expressions for the saturation throughput of the DCF, under the assumption that all stations have frames to transmit at all times. This simplifying assumption excludes any need to
consider queuing dynamics or traffic models for performance analysis. Many subsequent studies of the DCF have built upon Bianchi’s work. For instance, Ziouva and Antonakopoulos [36] have extended Bianchi’s model by taking account of the busy channel conditions for invoking the backoff procedure. Wu et al. [31] have modified Bianchi’s model to deal with the retry limit. Kumar et al. [13] have studied the fixed point formulation based on the analysis of Bianchi’s model and showed that the derivation of transmission probability can be significantly simplified by viewing the backoff procedure as a renewal process. However, realistic network conditions are non-saturated as only very few networks are in a situation where all nodes have frames to send all the time. Consequently, there are many research works devoted to developing analytical models for DCF under unsaturated working conditions [15, 18, 20, 27, 33, 34]. For example, Zhai et al. [33] showed that exponential distribution is a good approximation for the MAC service time distribution of the DCF and then presented an M/M/1/K queueing model to analyse the DCF under non-saturated conditions. Malone et al. [17] extended Bianchi’s model to a non-saturated environment in the presence of heterogeneous loads at the nodes, assuming that each MAC buffer has only one frame at most. Medepalli and Tobagi [18] presented a unified analytical model for the DCF where the transmission queue is modelled as an M/M/1 queuing system. Tickoo and Sikdar [27] proposed a discrete-time G/G/1 queue for modelling stations in the IEEE 802.11 WLAN.
3.2 Overview of Analytical Models of the EDCA

Significant research efforts have been devoted to developing analytical performance models for the AIFS and CW differentiation schemes defined in EDCA [3, 6, 7, 10, 12, 16, 22–26, 32, 35]. Most of these studies were based on extensions of Bianchi’s model [1] under the assumption of saturated traffic loads. For instance, Xiao [32] extended [1] to study the CW differentiation scheme of EDCA. Kong et al. [12] analysed the AIFS and CW differentiation by a three-dimensional Markov chain. Tao and Panwar [26] developed another three-dimensional Markov chain model where the third dimension represents the number of time slots after the end of the last AIFS period. Robinson and Randhawa [23] adopted a bi-dimensional Markov chain model where the collision probability is calculated as a weighted average of the collision probabilities in different contention zones during AIFS. Zhu and Chlamtac [35] proposed a Markov model of EDCA to calculate the saturation throughput and access delay. Huang and Liao [7] analysed the performance of saturation throughput and access delay, taking into account the virtual collisions among ACs inside each EDCA station. Tantra et al. [24] introduced a simple model which adopts two different types of Markov chains to model the various ACs in EDCA.

As another QoS scheme specified in the EDCA, the TXOP scheme has also drawn much research attention [5, 6, 10, 14, 19, 21, 27–29]. However, the existing analytical models were primarily focused on the saturation performance
[14, 21, 28, 29]. More specifically, Tinnirello and Choi [28] compared the system throughput of the TXOP scheme with different acknowledgement (ACK) policies. Vitsas et al. [29] presented an analytical model for the TXOP scheme to derive the throughput and average access delay. Li et al. [14] analysed the throughput of the TXOP scheme with the block ACK policy under noisy channel conditions. Peng et al. [21] evaluated the throughput of various ACs as a function of different TXOP limits.

The majority of existing models for the EDCA were derived based on the assumption of saturated working conditions. However, since traffic loads in realistic network environments are unsaturated, there are also a number of analytical models for the EDCA under unsaturated conditions [3, 5, 6, 10, 16, 19, 25, 27]. For example, Tantra et al. [25] introduced a Markovian model to compute the throughput and delay performance metrics of EDCA with the CW scheme, assuming that each station has a transmission queue size of one frame. Engelstad and Osterbo [3] analysed the end-to-end delay of EDCA with the AIFS and CW schemes by modelling each AC as an M/G/1 queue of infinite capacity. Liu and Niu [16] employed an M/M/1 queueing model to analyse the EDCA protocol, considering the AIFS and CW schemes; they also assumed an infinite capacity of the transmission queue. As for the unsaturated modelling of the TXOP scheme, Hu et al. [5] analysed and compared the performance of the TXOP scheme with different ACK policies under unsaturated traffic loads. Tickoo and Sikdar [27] extended the G/G/1 discrete-time queueing model for the DCF to analyse the TXOP scheme under non-saturated traffic loads.

Although the performance of the AIFS, CW, and TXOP schemes has been studied separately in non-saturated conditions, to the best of our knowledge, there have been very few analytical models [10] reported in the current literature for the combination of these three schemes in non-saturated conditions. Recently, Inan et al. [10] proposed an analytical model to incorporate these three schemes under unsaturated traffic loads. They modelled the unsaturated behaviour of EDCA through a three-dimensional Markov chain. Since the model introduces the third dimension of the Markov chain to denote the number of backlogged frames in the transmission queue, the complexity of the solution becomes very high with a large buffer size. Unlike the model in [10], we employ a method combining queueing theory and Markov analysis. Consequently, the proposed model can handle a large buffer size without heavily increasing the complexity of the solution.
4 Analytical Model

In this section, we propose a comprehensive analytical model of the EDCA under unsaturated traffic loads. We assume a network with n stations using the EDCA of IEEE 802.11e as the MAC protocol. We use the basic access scheme for channel contention; the analysis can be readily extended to the RTS/CTS scheme. The wireless channel is assumed to be ideal; thus, transmission failures are caused only by collisions and there is no hidden terminal problem.
The ACs from the lowest priority to the highest one are denoted by subscripts 0, 1, 2, . . ., N . The transmission queue at each AC is modelled as a bulk service queueing system where the arrival traffic follows a Poisson process with rate λv (frames/s, v = 0, 1, 2, . . ., N ). The service rate of the queueing system, μv , is derived by analysing the backoff and burst transmission procedures of AC v . As shown in Fig. 3, a novel three-dimensional Markov chain is introduced to model the backoff procedure.
Fig. 3 The three-dimensional Markov chain for modelling the backoff procedure of EDCA: (a) the three-dimensional Markov chain; (b) the sub-Markov chain for modelling the deferring period
4.1 Modelling of the Backoff Procedure

A discrete and integer timescale is adopted [1], where t and (t + 1) correspond to the starts of two consecutive time slots; a time slot is either the variable interval between the starts of two successive decrements of the backoff counter or the fixed interval specified in the protocol [8], namely the physical time slot. Let s(t) and b(t) denote the stochastic processes representing the backoff stage (i.e., the retry counter) and the backoff counter of a given AC at time t, respectively. The newly introduced dimension, c(t), denotes the number of remaining time slots needed to complete the deferring period in the AIFS of the AC (AIFS_v) after the minimum AIFS (AIFS_min). Since only the head-of-burst (HoB) frame needs to contend for the channel, the term frame refers to the HoB frame in this section unless otherwise specified.

Assuming that the collision probability of frames transmitted from AC_v, p_v, is independent of the number of retransmissions that a frame has suffered [1], the three-dimensional process {s(t), b(t), c(t)} can be modelled as a discrete-time Markov chain, as shown in Fig. 3a, with the dashed-line box shown in detail in the sub-Markov chain of Fig. 3b. The state transition probabilities in the three-dimensional Markov chain for AC_v are as follows:

$$P\{i,j,0 \mid i,j+1,0\} = p_{bv}, \quad 0 \le i \le m,\ 0 \le j \le W_{iv}-2, \qquad (1a)$$
$$P\{i,j,d_v \mid i,j,0\} = 1-p_{bv}, \quad 0 \le i \le m,\ 1 \le j \le W_{iv}-1, \qquad (1b)$$
$$P\{i,j,0 \mid i,j,1\} = p_{tv}, \quad 0 \le i \le m,\ 0 \le j \le W_{iv}-1, \qquad (1c)$$
$$P\{i,j,k \mid i,j,k+1\} = p_{tv}, \quad 1 \le k \le d_v-1, \qquad (1d)$$
$$P\{i,j,d_v \mid i,j,k\} = 1-p_{tv}, \quad 1 \le k \le d_v, \qquad (1e)$$
$$P\{i,j,d_v \mid i-1,0,0\} = p_v/W_{iv}, \quad 1 \le i \le m,\ 0 \le j \le W_{iv}-1, \qquad (1f)$$
$$P\{0,j,d_v \mid i,0,0\} = (1-p_v)/W_{0v}, \quad 0 \le i \le m-1,\ 0 \le j \le W_{0v}-1, \qquad (1g)$$
$$P\{0,j,d_v \mid m,0,0\} = 1/W_{0v}, \quad 0 \le j \le W_{0v}-1, \qquad (1h)$$

where m is the retry limit and W_{iv} is the contention window size after i retransmissions; p_{bv} is the probability that the channel is idle in a time slot after the AIFS period of AC_v; p_{tv} is the probability that the channel is idle in a time slot during the deferring period in the AIFS of AC_v after AIFS_min; and d_v denotes the difference in the number of time slots between AIFS_min and AIFS_v, i.e., d_v = AIFSN_v − AIFSN_min. These equations account, respectively, for the following: (1a) the backoff counter is decreased by 1 after an idle time slot; (1b) the backoff counter is frozen and the AC starts deferring; (1c) the backoff counter is activated and decreased by 1 after the AIFS period; (1d) the remaining number of time slots before activating the backoff counter is decreased by 1 if the channel is detected idle in a time slot; (1e) the AC has to go through the AIFS period again if the channel is sensed busy during the deferring procedure; (1f) the backoff stage increases after an unsuccessful transmission and the AC starts deferring before activating the backoff counter; (1g) after a successful transmission, the contention window is reset to Wmin; (1h) once
the backoff stage reaches the retry limit, the CW is reset to Wmin after the frame transmission.

Let b_{i,j,k} be the stationary distribution of the three-dimensional Markov chain. First, the steady-state probabilities b_{i,0,0} satisfy

$$b_{i,0,0} = p_v^i\, b_{0,0,0}, \qquad 0 \le i \le m. \qquad (2)$$

Because of the chain regularities, for each j ∈ [0, W_{iv} − 1] we have

$$b_{i,j,0} = b_{i,0,0}\,\frac{W_{iv}-j}{W_{iv}}, \qquad 0 \le i \le m. \qquad (3)$$
From the balance equations in the sub-Markov chain, we have the following relations:

$$b_{i,j,k} = b_{i,j,d_v}\, p_{tv}^{\,d_v-k}, \qquad 1 \le k \le d_v,$$
$$p_{tv}\, b_{i,j,d_v} = (1-p_{bv})\, b_{i,j,0} + \frac{p_v}{W_{iv}}\, b_{i-1,0,0} + \sum_{k=1}^{d_v-1} b_{i,j,k}\,(1-p_{tv}), \qquad 1 \le i \le m,\ 0 \le j \le W_{iv}-1, \qquad (4)$$
$$p_{tv}\, b_{0,j,d_v} = (1-p_{bv})\, b_{0,j,0} + \frac{(1-p_v)\sum_{i=0}^{m-1} b_{i,0,0} + b_{m,0,0}}{W_{0v}} + \sum_{k=1}^{d_v-1} b_{0,j,k}\,(1-p_{tv}), \qquad 0 \le j \le W_{0v}-1.$$
With (2), (3), and (4), b_{0,0,0} can finally be determined by imposing the normalization condition

$$1 = \sum_{i=0}^{m} \sum_{j=0}^{W_{iv}-1} b_{i,j,0} + \sum_{i=0}^{m} \sum_{j=0}^{W_{iv}-1} \sum_{k=1}^{d_v} b_{i,j,k}, \qquad (5)$$
from which

$$b_{0,0,0} = \left\{ \left[\sum_{i=0}^{m}\frac{W_{iv}-1}{2}\,p_v^i + \frac{1-p_v^{m+1}}{1-p_v}\right]\frac{(1-p_{bv})\,d_v}{(1-p_{tv})\,p_{tv}^{\,d_v}} + \sum_{i=0}^{m}\frac{W_{iv}-1}{2}\,p_v^i + \frac{1-p_v^{m+1}}{1-p_v} + \frac{d_v}{1-p_{tv}} \right\}^{-1}, \qquad (6)$$

where \(\sum_{i=0}^{m}\frac{W_{iv}-1}{2}\,p_v^i\) is given by
$$\frac{W_{0v}\left[1 - p_v - p_v(2p_v)^m + 2^m p_v^{m+1}(2p_v-1)\right]}{2(1-p_v)(1-2p_v)} - \frac{1-p_v^{m+1}}{2(1-p_v)}. \qquad (7)$$
For the ACs with the minimum AIFS period, b_{i,j,k} equals zero. Therefore, (6) reduces to

$$b_{0,0,0} = \left[ \sum_{i=0}^{m} \frac{W_{iv}-1}{2}\, p_v^i + \frac{1-p_v^{m+1}}{1-p_v} \right]^{-1}. \qquad (8)$$
The probability that AC_v transmits in a randomly chosen time slot, given that its transmission queue is non-empty, is denoted \(\tau_v'\) and can be written as

$$\tau_v' = b_{0,0,0} \sum_{i=0}^{m} p_v^i = \frac{b_{0,0,0}\,(1-p_v^{m+1})}{1-p_v}. \qquad (9)$$
Note that an AC can transmit only when there are pending frames in its transmission queue. Therefore, the transmission probability \(\tau_v\) of AC_v under unsaturated traffic loads can be derived as

$$\tau_v = \tau_v'\,(1-P_{0v}), \qquad (10)$$
where P_{0v} is the probability that the transmission queue of AC_v is empty, which will be derived in the following sections. Taking virtual collisions into account, an AC will only collide with frames from other stations or from the higher-priority ACs in the same station. Therefore, the collision probability p_v of AC_v is given by

$$p_v = 1 - \prod_{x=0}^{N} (1-\tau_x)^{\,n-1} \prod_{x>v}^{N} (1-\tau_x), \qquad (11)$$
where n is the number of stations. The probability p_{bv} that the channel is idle in a time slot after the AIFS period of AC_v is the probability that no station is transmitting in the given slot:

$$p_{bv} = \prod_{x=0}^{N} (1-\tau_x)^{\,n}. \qquad (12)$$
During the deferring period between AIFS_min and AIFS_v, the ACs with priorities lower than or equal to AC_v will not transmit in a time slot. As a result, the probability p_{tv} that the channel is sensed idle in a time slot during the deferring period between AIFS_min and AIFS_v is given by

$$p_{tv} = \prod_{x>v}^{N} (1-\tau_x)^{\,n}. \qquad (13)$$
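Equations (8)–(13) couple the transmission probabilities with the channel-state probabilities, so in practice they are solved numerically. The sketch below illustrates a plain fixed-point iteration for the simplest special case of a single saturated AC with the minimum AIFS (d_v = 0 and P_{0v} = 0), so that only (8), (9), and (11) are involved; the window parameters, damping factor, and iteration count are illustrative implementation choices made here, not part of the model.

```python
def solve_fixed_point(n=10, w0=32, m=5, iters=2000, damping=0.5):
    """Fixed point of tau = f(p), p = 1 - (1 - tau)^(n-1) for one saturated AC
    with the minimum AIFS (eqs. (8), (9), and (11) with a single class)."""
    def tau_of_p(p):
        geo = sum(p ** i for i in range(m + 1))                   # (1 - p^{m+1}) / (1 - p)
        win = sum((2 ** i * w0 - 1) / 2 * p ** i for i in range(m + 1))
        b000 = 1.0 / (win + geo)                                  # eq. (8)
        return b000 * geo                                         # eq. (9)

    tau = 0.1
    for _ in range(iters):
        p = 1.0 - (1.0 - tau) ** (n - 1)                          # eq. (11), one AC per station
        tau = damping * tau + (1 - damping) * tau_of_p(p)         # damped update
    return tau, p

tau, p = solve_fixed_point()
print(f"per-slot transmission probability tau = {tau:.4f}, collision probability p = {p:.4f}")
```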
4.2 Analysis of the Service Time

The service time of the queueing system is defined as the time interval from the instant that an HoB frame starts contending for the channel to the instant that the burst is acknowledged following a successful transmission, or to the instant that the HoB frame is dropped owing to transmission failures. The service time is composed of two components: the channel access delay and the burst transmission delay. The former is the time interval from the instant the frame reaches the head of the transmission queue until it wins the contention and is ready for transmission, or until it is dropped owing to transmission failures. The latter is the time needed to successfully transmit a burst (note that it equals zero if the HoB frame is dropped). Let E[S_{sv}], E[A_v], and E[B_{sv}] denote the means of the service time, channel access delay, and burst transmission delay, respectively, where v indicates that the burst is transmitted from AC_v and s denotes the number of frames transmitted in the burst. E[A_v] is given by

$$E[A_v] = T_{cv}\,\varphi_v + \sigma_v\,\delta_v, \qquad (14)$$
where T_{cv} is the average collision time, σ_v is the average length of a time slot, and φ_v and δ_v are the average number of collisions and the average number of backoff counter decrements before a successful transmission from AC_v, respectively. According to the Markov chain in Fig. 3, φ_v and δ_v can be computed as

$$\varphi_v = \sum_{i=0}^{m} i\,(1-p_v)\,p_v^i, \qquad (15)$$
$$\delta_v = \sum_{i=0}^{m} p_v^i\, W_{iv}/2, \qquad (16)$$
where p_v^i is the probability that the backoff counter reaches stage i, and W_{iv}/2 is the average value of the backoff counter generated in the ith backoff stage. Let P_T denote the probability that at least one AC transmits in a time slot; P_T is given by

$$P_T = 1 - \prod_{x=0}^{N} (1-\tau_x)^{\,n}. \qquad (17)$$
The probability P_{Sv} that an AC_v transmits successfully can be expressed as

$$P_{Sv} = n\,\tau_v \prod_{x=0}^{N} (1-\tau_x)^{\,n-1} \prod_{x>v}^{N} (1-\tau_x). \qquad (18)$$
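Equations (15)–(18) are closed-form in the collision and transmission probabilities and translate directly into a few helper functions; the numerical values used in the example call below are illustrative only.

```python
def phi_delta(p, windows):
    """Eqs. (15)-(16): mean number of collisions and of backoff decrements.
    `windows` lists W_{iv} for stages i = 0..m."""
    phi = sum(i * (1 - p) * p ** i for i, _ in enumerate(windows))
    delta = sum(p ** i * w / 2 for i, w in enumerate(windows))
    return phi, delta

def p_transmit_any(taus, n):
    """Eq. (17): probability that at least one AC transmits in a slot."""
    prod = 1.0
    for tau in taus:
        prod *= (1 - tau) ** n
    return 1 - prod

def p_success(v, taus, n):
    """Eq. (18): probability that AC_v transmits successfully in a slot.
    `taus` is indexed from the lowest-priority AC (0) to the highest (N)."""
    prod_all = 1.0
    for tau in taus:
        prod_all *= (1 - tau) ** (n - 1)
    prod_higher = 1.0
    for x in range(v + 1, len(taus)):
        prod_higher *= (1 - taus[x])
    return n * taus[v] * prod_all * prod_higher

taus = [0.01, 0.015, 0.02, 0.03]   # per-AC transmission probabilities, illustrative
print(phi_delta(0.2, [32 * 2 ** i for i in range(6)]))
print(p_transmit_any(taus, n=10), p_success(2, taus, n=10))
```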
Since the channel is idle with probability (1 − P_T), a successful transmission from an AC_x occurs with probability P_{Sx}, and a collision happens with probability (P_T − Σ_{x=0}^{N} P_{Sx}), the average length of a time slot, σ_v, can be calculated as

$$\sigma_v = (1-P_T)\,\sigma + \sum_{x=0}^{N} P_{Sx}\, T_{sx} + \left(P_T - \sum_{x=0}^{N} P_{Sx}\right) T_{cv} + E[X_v]\, P_T, \qquad (19)$$
where σ is the length of a physical time slot and E[X_v] is the total time spent deferring the AIFS period of AC_v. Recall that an AC has to go through the AIFS period again if the channel is sensed busy during the deferring procedure, as shown in the sub-Markov chain of Fig. 3; E[X_v] may therefore consist of several attempts to defer the AIFS period of AC_v and is given by

$$E[X_v] = \sum_{u=1}^{\infty} p_{tv}^{\,d_v}\left(1-p_{tv}^{\,d_v}\right)^{u-1} u\, T_{av}, \qquad (20)$$

where u is the number of attempts to defer the AIFS period of AC_v, \(p_{tv}^{\,d_v}\) is the probability that a deferring attempt is successful, and T_{av} is the average time spent on each attempt. T_{av} is given by
$$T_{av} = \sum_{x>v}^{N} P'_{Sx}\, T_{sx} + \left(P'_T - \sum_{x>v}^{N} P'_{Sx}\right) T_{cv} + \sum_{s=1}^{d_v-1} s\,\sigma\, p_{tv}^{\,s}, \qquad (21)$$
where the first and second terms correspond to the “frozen time” of the backoff counter of AC_v caused by transmissions from the higher-priority ACs with smaller AIFS, and the third term is the time spent on a failed attempt at down-counting the remaining time slots during the deferring period between AIFS_min and AIFS_v. Here P'_T and P'_{Sx} denote the probabilities that at least one AC transmits and that AC_x transmits successfully in a time slot, respectively, given that AC_v is deferring. P'_T and P'_{Sx} can be calculated as

$$P'_T = 1 - \prod_{x>v}^{N} (1-\tau_x)^{\,n}, \qquad (22)$$
$$P'_{Sx} = n\,\tau_x \prod_{y>v}^{N} (1-\tau_y)^{\,n-1} \prod_{y>\max\{x,v\}}^{N} (1-\tau_y). \qquad (23)$$
Note that only the HoB frame is involved in a collision; T_{cv} is therefore given by

$$T_{cv} = T_L + T_H + T_{\mathrm{SIFS}} + T_{\mathrm{ACK}} + \mathrm{AIFS}_v. \qquad (24)$$
On the other hand, the average time T_{sv} for a successful burst transmission from AC_v can be expressed as

$$T_{sv} = \sum_{s=1}^{F_v} \frac{E[B_{sv}]\, L_{sv}}{1-P_{0v}}, \qquad (25)$$

where F_v denotes the maximum number of frames that can be transmitted within a TXOP limit, the denominator (1 − P_{0v}) reflects the fact that a burst transmission is conditioned on there being at least one frame in the transmission queue, L_{sv} (1 ≤ s ≤ F_v) is the probability of having s frames transmitted within the burst, and E[B_{sv}] is the burst transmission delay, given by

$$E[B_{sv}] = \mathrm{AIFS}_v + s\,(T_L + T_H + 2T_{\mathrm{SIFS}} + T_{\mathrm{ACK}}) - T_{\mathrm{SIFS}}. \qquad (26)$$
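The timing quantities in (24) and (26) can be evaluated directly from the system parameters listed later in Table 1. In the sketch below, the split of the PHY header, MAC header, and ACK between the 11 Mbit/s data rate and the 1 Mbit/s basic rate is an assumption made for illustration; the chapter itself does not fix these details.

```python
# All times in microseconds; the rate split between data and basic rate is assumed.
T_PHY  = 192 / 1.0           # 192-bit PHY header at the 1 Mbit/s basic rate
T_L    = 8000 / 11.0         # 8000-bit payload at 11 Mbit/s
T_H    = T_PHY + 224 / 11.0  # PHY header plus 224-bit MAC header
T_ACK  = T_PHY + 112 / 11.0  # ACK frame (112 bits + PHY header)
T_SIFS = 10.0

def collision_time(aifs_us):
    """Eq. (24): time wasted by a collision of the head-of-burst frame."""
    return T_L + T_H + T_SIFS + T_ACK + aifs_us

def burst_delay(s, aifs_us):
    """Eq. (26): time to transmit a burst of s frames within one TXOP."""
    return aifs_us + s * (T_L + T_H + 2 * T_SIFS + T_ACK) - T_SIFS

aifs = 10 + 2 * 20  # AIFSN = 2 with the Table 1 SIFS and slot time
print(collision_time(aifs), burst_delay(4, aifs))
```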
4.3 Queueing Model

The transmission queue at AC_v can be modelled as an M/G^{[1,F_v]}/1/K queueing system [11], where the superscript [1, F_v] denotes that the number of frames transmitted within a burst ranges from 1 to F_v and K represents the buffer size. The server becomes busy when a frame reaches the head of the transmission queue. The server becomes free after a burst of frames is acknowledged by the destination following a successful transmission, or after the HoB frame is dropped owing to transmission failures. The service time depends on the number of frames transmitted within a burst and on the class of the transmitting AC. The service time of a burst of s frames transmitted from AC_v is modelled by an exponential distribution with mean E[S_{sv}]; the mean service rate is then μ_{sv} = 1/E[S_{sv}].

The state transition rate diagram of the queueing system at AC_v can be found in our previous work [5], where each state denotes the number of frames in the system. The transition rate matrix, G_v, of the Markov chain for AC_v can be obtained from the state transition rate diagram. The steady-state probability vector, P_v = (P_{rv}, r = 0, 1, . . . , K), of the Markov chain satisfies

$$\mathbf{P}_v \mathbf{G}_v = \mathbf{0} \quad \text{and} \quad \mathbf{P}_v \mathbf{e} = 1, \qquad (27)$$

where e is a unit column vector. Solving these equations yields the steady-state vector as [4]

$$\mathbf{P}_v = \mathbf{u}\,(\mathbf{I} - \boldsymbol{\Theta}_v + \mathbf{e}\mathbf{u})^{-1}, \qquad (28)$$

where \(\boldsymbol{\Theta}_v = \mathbf{I} + \mathbf{G}_v / \min_{\rho}\{|\mathbf{G}_v(\rho,\rho)|\}\), u is an arbitrary row vector of \(\boldsymbol{\Theta}_v\), and I denotes the unit matrix. After obtaining P_v, we have L_{sv} = P_{sv} for 1 ≤ s < F_v and \(L_{F_v v} = \sum_{r=F_v}^{K} P_{rv}\).
The end-to-end delay of a frame is the time interval from the instant that the frame enters the buffer of AC_v to the instant that the frame leaves the buffer. Its mean value, E[D_v], follows from Little’s law [11]:

$$E[D_v] = \frac{E[M_v]}{\lambda_v\,(1-P_{Kv})}, \qquad (29)$$

where \(E[M_v] = \sum_{r=0}^{K} r\, P_{rv}\) is the average number of frames in the queueing system and λ_v(1 − P_{Kv}) is the effective rate of the traffic entering the transmission queue, since arriving frames are dropped if they find the queue full. Given the loss probability P_{Kv}, the throughput Γ_v of AC_v can be computed as

$$\Gamma_v = \lambda_v\, E[P]\,(1-P_{Kv})\,(1-p_v^{m+1}), \qquad (30)$$

where E[P] is the frame payload length and \(p_v^{m+1}\) is the probability that a frame is discarded after (m + 1) transmission failures.
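One compact way to exercise (27)–(30) is to build a small transition rate matrix G_v for the bulk-service queue and solve it with the matrix identity in (28). The exact transition structure of the queue is given in [5] and is not reproduced in this chapter, so the generator assumed below (a burst of up to F_v frames leaving at rate μ_s = 1/E[S_{sv}]) and all numerical rates are illustrative stand-ins rather than the authors' exact construction.

```python
import numpy as np

def steady_state(G):
    """Solve P G = 0, P e = 1 via eq. (28): P = u (I - Theta + e u)^{-1}."""
    k1 = G.shape[0]
    theta = np.eye(k1) + G / np.abs(np.diag(G)).max()   # uniformized stochastic matrix
    u = theta[0:1, :]                                    # an arbitrary row of Theta
    e = np.ones((k1, 1))
    return (u @ np.linalg.inv(np.eye(k1) - theta + e @ u)).ravel()

def queue_metrics(lam, mu, F, K, payload_bits=8000, p_drop_retry=0.0):
    """Assumed bulk-service generator: state = frames in system; a burst of
    min(r, F) frames departs at rate mu[s]. Returns E[D] (eq. 29) and the
    throughput (eq. 30, with p_v^{m+1} passed in as p_drop_retry)."""
    G = np.zeros((K + 1, K + 1))
    for r in range(K + 1):
        if r < K:
            G[r, r + 1] += lam                 # Poisson arrival
        if r > 0:
            s = min(r, F)
            G[r, r - s] += mu[s]               # burst departure of s frames
        G[r, r] = -G[r].sum()
    P = steady_state(G)
    EM = sum(r * P[r] for r in range(K + 1))
    ED = EM / (lam * (1 - P[K]))                               # eq. (29)
    thr = lam * payload_bits * (1 - P[K]) * (1 - p_drop_retry)  # eq. (30), bits per second
    return ED, thr

mu = {s: 1.0 / (0.004 * s + 0.002) for s in range(1, 5)}  # 1/E[S_sv], illustrative values
print(queue_metrics(lam=150.0, mu=mu, F=4, K=50))
```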
5 Validation of the Analytical Model

To validate the accuracy of the proposed model, we compare the analytical performance results against those obtained from NS-2 simulation experiments based on the TKN implementation of the IEEE 802.11e EDCA [30]. We consider a WLAN with 10 stations located in a 100 m × 100 m rectangular grid, where all stations are within the sensing and transmission range of each other. Each station generates four ACs of traffic with identical arrival rates. The packet arrivals at each AC are characterized by a Poisson process. For the sake of clarity, the number of frames is adopted as the unit of the TXOP in this study. Each simulation is executed once with a 600 s simulation time, which is sufficiently long in that the simulation results do not change with any further increase of the simulation time. The system parameters used in the analytical model and simulations follow the IEEE 802.11b standard [8], using direct sequence spread spectrum (DSSS) as the physical-layer technology, and are summarized in Table 1.
Table 1 System parameters
Frame payload: 8000 bits; PHY header: 192 bits
MAC header: 224 bits; ACK: 112 bits + PHY header
Channel rate: 11 Mbit/s; Basic rate: 1 Mbit/s
Slot time: 20 µs; SIFS: 10 µs
Buffer size: 50 frames; Retry limit: 7
Table 2 EDCA parameters (AIFSN, CWmin, CWmax, TXOP)
Scenario 1: AC0 (6, 32, 1024, 1 frame); AC1 (2, 32, 512, 1 frame); AC2 (2, 16, 256, 4 frames); AC3 (2, 8, 128, 2 frames)
Scenario 2: AC0 (7, 64, 512, 1 frame); AC1 (4, 32, 512, 1 frame); AC2 (2, 16, 256, 2 frames); AC3 (2, 16, 256, 4 frames)

Fig. 4 Performance measures versus the offered load per AC in scenario 1: (a) throughput; (b) end-to-end delay; and (c) frame loss probability

To investigate the accuracy of the model under various working conditions, we consider two scenarios with different combinations of EDCA parameters, as shown in Table 2. Figures 4 and 5 depict the results of the throughput, end-to-end
delay, and frame loss probability versus the offered load per AC in scenarios 1 and 2, respectively. The close match between the analytical results and those obtained from the simulation experiments demonstrates that the proposed model produces accurate predictions of the performance of the EDCA protocol with the AIFS, CW, and TXOP schemes under any traffic load. Moreover, it is worth mentioning that the maximum throughputs of AC1 and AC0 are much larger than their saturation throughputs, which emphasizes the importance of analysing the EDCA protocol
under unsaturated traffic loads. We can also observe that the curves for AC1 and AC0 are close together but differ widely from those of AC3 and AC2. This is because the EDCA parameters (AIFS, CW, TXOP) of AC1 and AC0 are close to each other, while they are very different from those of AC3 and AC2.

Fig. 5 Performance measures versus the offered load per AC in scenario 2: (a) throughput; (b) end-to-end delay; and (c) frame loss probability
6 Conclusions

In this chapter, we have presented a detailed literature review of the existing analytical models of the IEEE 802.11 DCF and IEEE 802.11e EDCA protocols. We have then proposed a comprehensive analytical model that accommodates the QoS differentiation schemes, in terms of AIFS, CW, and TXOP, specified in the IEEE 802.11e EDCA protocol under unsaturated traffic loads. First, we develop a novel three-dimensional Markov chain to analyse the backoff procedure of each AC. Afterwards, to address the difficulties of queueing analysis arising from the TXOP scheme, the transmission queue at each AC is modelled as a bulk service queueing system. The QoS performance metrics, including throughput, end-to-end delay, and
frame loss probability have been derived and further validated through extensive NS-2 simulation experiments. The proposed analytical model is based on the assumption that each AC is under Poisson traffic. However, WLANs are currently integrating a diverse range of traffic sources, such as video, voice, and data, which significantly differ in their traffic patterns as well as QoS requirements. In future work, we intend to develop an analytical model for EDCA in WLANs with heterogeneous multimedia traffic. On the other hand, admission control is an important mechanism for the provisioning of QoS in WLANs. We plan to develop an efficient admission control scheme based on the proposed analytical model and the game-theoretical approach.
References 1. Bianchi G (2000) Performance analysis of the IEEE 802.11 distributed coordination function. IEEE J Select Areas Commun 18(3):535–547 2. Choi J, Yoo J, Kim CK (2008) A distributed fair scheduling scheme with a new analysis model in IEEE 802.11 wireless LANs. IEEE Trans Veh Technol 57(5):3083–3093 3. Engelstad PE, Osterbo ON (2006) Analysis of the total delay of IEEE 802.11e EDCA and 802.11 DCF. In: Proceedings of IEEE ICC’06, Istanbul, vol 2, pp 552–559 4. Fischer W, Meier-Hellstern K (1993) The Markov-modulated Poisson process (MMPP) cookbook. Perform Eval 18(2):149–171 5. Hu J, Min G, Woodward ME (2007) Analysis and comparison of burst transmission schemes in unsaturated 802.11e WLANs. In: Proceedings of IEEE GLOBECOM’07, Washington, pp 5133–5137 6. Hu J, Min G, Woodward ME, Jia W (2008) A comprehensive analytical model for IEEE 802.11e QoS differentiation schemes under unsaturated traffic loads. In: Proceedings of IEEE ICC’08, pp 241–245 7. Huang CL, Liao W (2007) Throughput and delay performance of IEEE 802.11e enhanced distributed channel access (EDCA) under saturation condition. IEEE Trans Wireless Commun 6(1):136–145 8. IEEE (1999) Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications. IEEE Standard 802.11 9. IEEE (2005) Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications: Medium Access Control (MAC) Quality of Service (QoS) enhancements. IEEE Standard 802.11e 10. Inan I, Keceli F, Ayanoglu E (2009) Analysis of the 802.11e enhanced distributed channel access function. IEEE Trans Commun 57(6):1753–1764 11. Kleinrock L (1975) Queueing systems: theory. Wiley 12. Kong Z, Tsang D, Bensaou B, Gao D (2004) Performance analysis of IEEE 802.11e contention-based channel access. IEEE J Select Areas Commun 22(10):2095–2106 13. Kumar A, Altman E, Miorandi D, Goyal M (2007) New insights from a fixed-point analysis of single cell IEEE 802.11 WLANs. IEEE/ACM Trans Netw 15(3):588–601 14. Li T, Ni Q, Xiao Y (2006) Investigation of the block ACK scheme in wireless ad-hoc networks. Wireless Commun Mobile Comput 6(6):877–888 15. Lin L, Fu H, Jia W (2005) An efficient admission control for IEEE 802.11 networks based on throughput analysis of (un)saturated channel. In: Proceedings of IEEE GLOBECOM’05, St. Louis, Missouri, vol 5, pp 3017–3021 16. Liu J, Niu Z (2007) Delay analysis of IEEE 802.11e EDCA under unsaturated conditions. In: Proceedings of IEEE WCNC’07, Hong Kong, pp 430–434 17. Malone D, Duffy K, Leith DJ (2007) Modeling the 802.11 distributed coordination function in nonsaturated heterogeneous conditions. IEEE/ACM Trans Netw 15(1):159–172
18. Medepalli K, Tobagi FA (2006) Towards performance modelling of IEEE 802.11 based wireless networks: A unified framework and its applications. In: Proceedings of IEEE INFOCOM’06, Barcelona 19. Min G, Hu J, Woodward ME (2008) A dynamic IEEE 802.11e TXOP scheme in WLANs under self-similar traffic: Performance enhancement and analysis. In: Proceedings of IEEE ICC’08, Beijing, pp 2632–2636 20. Ozdemir M, McDonald AB (2006) On the performance of ad hoc wireless LANs: A practical queuing theoretic model. Perform Eval 63(11):1127–1156 21. Peng F, Alnuweiri HM, Leung VCM (2006) Analysis of burst transmission in IEEE 802.11e wireless LANs. In: Proceedings of IEEE ICC’06, Istanbul, vol 2, pp 535–539 22. Ramaiyan V, Kumar A, Altman E (2008) Fixed point analysis of single cell IEEE 802.11e WLANs: Uniqueness and multistability. IEEE/ACM Trans Netw 16(5):1080–1093 23. Robinson JW, Randhawa TS (2004) Saturation throughput analysis of IEEE 802.11e enhanced distributed coordination function. IEEE J Select Areas Commun 22(5):917–928 24. Tantra JW, Foh CH, Mnaouer AB (2005) Throughput and delay analysis of the IEEE 802.11e EDCA saturation. In: Proc. IEEE ICC’05, Seoul, vol 5, pp 3450–3454 25. Tantra JW, Foh CH, Tinnirello I, Bianchi G (2006) Analysis of the IEEE 802.11e EDCA under statistical traffic. In: Proceedings of IEEE ICC’06, Istanbul, vol 2, pp 546–551 26. Tao Z, Panwar S (2006) Throughput and delay analysis for the IEEE 802.11e enhanced distributed channel access. IEEE Trans Commun 54(4):596–603 27. Tickoo O, Sikdar B (2008) Modeling queueing and channel access delay in unsaturated IEEE 802.11 random access MAC based wireless networks. IEEE/ACM Trans Netw 16(4):878–891 28. Tinnirello I, Choi S (2005) Efficiency analysis of burst transmission with block ACK in contention-based 802.11e WLANs. In: Proceedings of IEEE ICC’05, Seoul, vol 5, pp 3455–3460 29. Vitsas V, Chatzimisios P, Boucouvalas AC, Raptis P, Paparrizos K, Kleftouris D (2004) Enhancing performance of the IEEE 802.11 distributed coordination function via packet bursting. In: Proceedings of IEEE GLOBECOM’04, Dallas, Texas, pp 245–252 30. Sven W, Emmelmann M, Christian H, Adam W (2006) TKN EDCA model for ns-2. Technical University of Berlin, Technical Report TKN-06-003 31. Wu H, Peng Y, Long K, Cheng S, Ma J (2002) Performance of reliable transport protocol over IEEE 802.11 wireless LAN: analysis and enhancement. In: Proceedings of IEEE INFOCOM’02, New York, vol 2, pp 599–607 32. Xiao Y (2005) Performance analysis of priority schemes for IEEE 802.11 and IEEE 802.11e wireless LANs. IEEE Trans Wireless Commun 4(4):1506–1515 33. Zhai H, Kwon Y, Fang Y (2004) Performance analysis of IEEE 802.11 MAC protocols in wireless LANs. Wireless Commun Mobile Comput 4(8):917–931 34. Zhao Q, Tsang DHK, Sakurai T (2008) A simple model for nonsaturated IEEE 802.11 DCF Networks. IEEE Commun Lett 12(8): 563–565 35. Zhu H, Chlamtac I (2005) Performance analysis for IEEE 802.11e EDCF service differentiation. IEEE Trans Commun 4(4):1779–1788 36. Ziouva E, Antonakopoulos T (2002) CSMA/CA performance under high traffic conditions: Throughput and delay analysis. Comput Commun 25(3):313–321
Dynamic Overlay Single-Domain Contracting for End-to-End Contract Switching Murat Yüksel, Aparna Gupta, and Koushik Kar
Murat Yüksel, University of Nevada - Reno, Reno, NV 89557, USA, e-mail: [email protected]
Aparna Gupta, Rensselaer Polytechnic Institute, Troy, NY 12180, USA, e-mail: [email protected]
Koushik Kar, Rensselaer Polytechnic Institute, Troy, NY 12180, USA, e-mail: [email protected]

1 Introduction

The Internet’s simple best-effort packet-switched architecture lies at the core of its tremendous success and impact. Today, the Internet is firmly a commercial medium involving several competitive service providers and content providers. However, the current Internet architecture allows neither (i) users to indicate their value choices at sufficient granularity nor (ii) providers to manage the risks involved in investment in new innovative QoS technologies and in business relationships with other providers as well as users. Currently, users can only indicate their value choices at the access/link bandwidth level, not at the routing level. End-to-end QoS contracts are possible today via virtual private networks, but only as static and long-term contracts. Further, an enterprise that needs end-to-end capacity contracts between two arbitrary points on the Internet for a short period of time has no way of expressing its needs.

We envision an Internet architecture that allows flexible, fine-grained, dynamic contracting over multiple providers. With such capabilities, the Internet itself will be viewed as a “contract-switched” network beyond its current status as a “packet-switched” network. A contract-switched architecture will enable flexible and economically efficient management of risks and value flows in an Internet characterized by many tussle points [7] where competition for network resources takes place. Realization of such an architecture heavily depends on the ability to provision dynamic single-domain contracts, which can be priced dynamically or based on intra-domain congestion. Implementation of dynamic pricing still remains a
challenge, although several proposals have been made, e.g., [18, 25, 32]. Among many others, two major implementation obstacles can be identified: the need for timely feedback to users about the price, and the determination of congestion information in an efficient, low-overhead manner.

The first problem, timely feedback, is relatively hard to address in a wide area network such as the Internet. In [3], the authors showed that users do want feedback about the charging of the network service (such as the current price and a prediction of service quality in the near future). However, in [37], we illustrated that congestion control by pricing cannot be achieved if price changes are performed at a timescale larger than roughly 40 round-trip times (RTTs). This means that in order to achieve congestion control by pricing, service prices must be updated very frequently (i.e., every 2–3 s, since the RTT is expressed in terms of milliseconds in most cases in the Internet). In order to solve this timescale problem for dynamic pricing, we propose two solutions, which lead to two different pricing “architectures”:

• By placing intelligent intermediaries (i.e., software or hardware agents) between users and the provider. This way it is possible for the provider to update prices frequently at low timescales, since price negotiations are made with a software/hardware agent rather than a human. Since the provider does not employ any congestion control mechanism for its network and tries to control congestion by pricing alone, we call this pricing architecture pricing for congestion control (PFCC).

• By overlaying pricing on top of an underlying congestion control mechanism. This way it is possible to enforce tight control on congestion at small timescales, while performing pricing at timescales large enough for human involvement. The provider implements a congestion control mechanism to manage congestion in its network, so we call this pricing architecture pricing over congestion control (POCC).

The big picture of the two pricing architectures, PFCC and POCC, is shown in Fig. 1. We will describe PFCC and POCC later in Section 4.
Fig. 1 Different pricing architectures with/without edge-to-edge congestion control: (a) pricing for congestion control (PFCC) and (b) pricing over congestion control (POCC)
The second problem, congestion information, is also very hard to solve in a way that does not require a major upgrade of network routers. However, in diff-serv [2], it is possible to determine congestion information via good ingress–egress coordination. This flexible environment of diff-serv motivated us to develop a pricing framework on it.

The chapter is organized as follows. In the next section, we position our work and briefly survey relevant work in the area. Section 3 introduces our contract-switching paradigm. In Section 4, we present the PFCC and POCC pricing architectures motivated by the timescale issues mentioned above. In Section 5, we describe the properties of the distributed-DCC framework according to the PFCC and POCC architectures. Next, in Section 6, we define a pricing scheme, edge-to-edge pricing (EEP), which can be implemented in the distributed-DCC framework. We study the optimality of EEP for different forms of user utility functions and consider the effect of parameters such as the user's budget and elasticity. In Section 7, based on the descriptions of the distributed-DCC framework and the EEP scheme, we simulate distributed-DCC in the two architectures PFCC and POCC, and with the simulation results we compare distributed-DCC’s performance in the two architectures. We then extend the pricing formulations to an end-to-end level in Section 8. We conclude with a summary and discussion in Section 9.
2 Related Work

There have been several pricing proposals, which can be classified in many ways: static vs. dynamic, per-packet charging vs. per-contract charging, and charging a priori to service vs. a posteriori to service. Although there are opponents of dynamic pricing in the area (e.g., [21–23]), most of the proposals have been for dynamic pricing (specifically congestion pricing) of networks. Examples of dynamic pricing proposals are MacKie-Mason and Varian’s smart market [18], Gupta et al.’s priority pricing [11], Kelly et al.’s proportional fair pricing (PFP) [15], Semret et al.’s market pricing [24, 25], and Wang and Schulzrinne’s resource negotiation and pricing (RNAP) [32, 33]. Odlyzko’s Paris metro pricing (PMP) [20] is an example of a static pricing proposal. Clark’s expected capacity [5, 6] and Cocchi et al.’s edge pricing [8] allow both static and dynamic pricing. In terms of charging granularity, smart market, priority pricing, PFP, and edge pricing employ per-packet charging, while RNAP and expected capacity do not. Smart market is based primarily on imposing per-packet congestion prices. Since smart market performs pricing on a per-packet basis, it operates at the finest possible pricing granularity. This makes smart market capable of ideal congestion pricing. However, smart market is not deployable because of its per-packet granularity (i.e., excessive overhead) and its many requirements of routers (e.g., it requires all routers to be updated). In [35], we studied smart market and the difficulties of its implementation in more detail.
While smart market holds one extreme in terms of granularity, expected capacity holds the other. Expected capacity proposes to use long-term contracts, which can give a clearer performance expectation, for statistical capacity allocation and pricing. Prices are updated at the beginning of each long-term contract, which introduces little dynamism into prices. Our work, distributed-DCC, is a middle ground between smart market and expected capacity in terms of granularity. Distributed-DCC performs congestion pricing over short-term contracts, which allows more dynamism in prices while keeping the pricing overhead small. Another proposal in the area that mainly focused on implementation issues of congestion pricing on diff-serv is RNAP [32, 33]. Although RNAP provides a complete picture for the incorporation of admission control and congestion pricing, it has excessive implementation overhead since it requires all network routers to participate in the determination of congestion prices. This requires upgrades to all routers, similar to the case of smart market. We believe that pricing proposals that require upgrades to all routers will eventually fail in the implementation phase. This is because the Internet routers are owned by different entities who may or may not be willing to cooperate in the process of router upgrades. Our work solves this problem by requiring upgrades only at edge routers rather than at all routers.
3 Contract-Switching Paradigm

The essence of “contract switching” is to use contracts as the key building block for inter-domain networking. As shown in Fig. 2, this increases the flexibility of the inter-domain architecture by introducing more tussle points into the protocol design. In particular, this paradigm will allow much-needed revolutions in Internet protocol design: (i) the inclusion of economic tools in network layer functions such as inter-domain routing, whereas the current architecture allows only basic connectivity information exchange, and (ii) the management of risks involved in QoS technology investments and participation in end-to-end QoS contract offerings, by allowing ISPs to potentially apply financial engineering methods.
Fig. 2 Packet switching (a) introduced many more tussle points into the Internet architecture by breaking the end-to-end circuits of circuit switching into routable datagrams. Contract switching (b) introduces even more tussle points at the edge/peering points of domain boundaries by overlay contracts
In addition to these design opportunities, the contract-switching paradigm introduces several research challenges. As the key building block, intra-domain service abstractions call for design of (i) single-domain edge-to-edge QoS contracts with performance guarantees and (ii) nonlinear pricing schemes geared toward cost recovery. Moving one level up, composition of end-to-end inter-domain contracts poses a major research problem which we formulate as a “contract routing” problem by using single-domain contracts as “contract links.” Issues to be addressed include routing scalability, contract monitoring, and verification as the inter-domain context involves large-size effects and crossing trust boundaries. Several economic tools can be used to remedy pricing, risk sharing, and money-back problems of a contract-switched network provider (CSNP), which can operate as an overlay re-seller ISP (or an alliance of ISPs) that buys contract links and sells end-to-end QoS contracts. In addition to CSNPs, the contract-switching paradigm allows more distributed ways of composing end-to-end QoS contracts as we will detail later.
4 Single-Domain Pricing Architectures: PFCC vs. POCC

In our previous work [27], we presented a simple congestion-sensitive pricing “framework,” dynamic capacity contracting (DCC), for a single diff-serv domain. DCC treats each edge router as a station of a service provider or a station of a cooperating set of service providers. Users (i.e., individuals or other service providers) make short-term contracts with the stations for network service. During the contracts, the station receives congestion information about the network core at a timescale smaller than the contracts. The station then uses that congestion information to update the service price at the beginning of each contract. Several pricing “schemes” can be implemented in this framework. DCC models a short-term contract for a given traffic class as a tuple of the price per unit traffic volume P_v, the maximum volume V_max (the maximum number of bytes that can be sent during the contract), and the term of the contract T (the length of the contract):

$$\mathrm{Contract} = \langle P_v, V_{\max}, T \rangle. \qquad (1)$$

Fig. 3 DCC framework on diff-serv architecture (stations of the provider compute and advertise local prices for edge-to-edge contracts)

Figure 3 illustrates the big picture of the DCC framework. Customers can access the network core only by making contracts with the provider stations placed at the edge routers. The stations offer contracts (i.e., V_max and T) to the users. Access to these available contracts can be arranged in different ways, which we call the edge strategy. Two basic edge strategies are “bidding” (many users bid for an available contract) and “contracting” (users negotiate P_v with the provider for an available contract). Notice that, in the DCC framework, provider stations can implement dynamic pricing schemes. In particular, they can implement congestion-based pricing schemes if they have actual information about congestion in the network core. This congestion information can come from the interior routers or from the egress edge routers, depending on the congestion detection mechanism being used. DCC assumes that
the congestion detection mechanism is able to give congestion information at timescales (i.e., observation intervals) smaller than the contracts. However, in DCC, we assumed that all the provider stations advertise the same price value for the contracts, which is very costly to implement over a wide area network, simply because the price value cannot be communicated to all stations at the beginning of each contract. We relax this assumption by allowing the stations to calculate prices locally and to advertise prices different from those of the other stations. We call this new version of DCC distributed-DCC, and we introduce ways of managing the overall coordination of the stations. A fundamental difference between distributed-DCC and the well-known dynamic pricing proposals in the area (e.g., the proposals by Kelly et al. [15] and Low et al. [17]) lies in the manner of price calculation. In distributed-DCC, the prices are calculated on an edge-to-edge basis, while traditionally it has been proposed that prices are calculated at each local link and fed back to users. In distributed-DCC, the links on a flow’s route are essentially abstracted out by edge-to-edge capacity estimation, and the ingress node communicates with the corresponding egress node to observe congestion on the route. The ingress node then uses the estimated capacity and the observed congestion information in the price calculation. In Low et al.’s framework, by contrast, each link calculates its own price and sends it to the user, and the user pays the aggregate price. So, distributed-DCC is better in terms of implementation requirements, while Low et al.’s framework is better in terms of optimality. Distributed-DCC trades off some optimality in order to enable the implementation of dynamic pricing; the amount of lost optimality depends on the closed-loop edge-to-edge capacity estimation.
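The contract tuple in (1) maps naturally onto a small data structure, and the per-contract price update performed by a station can be sketched alongside it. The update rule and every numerical value below are illustrative assumptions; in particular, this is not the EEP scheme defined later in the chapter.

```python
from dataclasses import dataclass

@dataclass
class Contract:
    """Short-term contract <P_v, V_max, T> offered by an edge station."""
    price_per_unit: float   # P_v, price per unit traffic volume
    max_volume: int         # V_max, maximum bytes that may be sent during the term
    term_s: float           # T, length of the contract in seconds

def next_price(prev_price, congestion_level, sensitivity=0.5, floor=0.01):
    """Toy congestion-sensitive update: raise the price when the observed
    congestion level (0..1) is high, lower it when the core is lightly loaded."""
    return max(floor, prev_price * (1 + sensitivity * (congestion_level - 0.5)))

price = 1.0
for congestion in (0.2, 0.4, 0.8, 0.9):
    price = next_price(price, congestion)
    offer = Contract(price_per_unit=price, max_volume=50_000_000, term_s=30.0)
    print(offer)
```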
4.1 Pricing for Congestion Control (PFCC)

In this pricing architecture, the provider attempts to solve the congestion problem of its network by congestion pricing alone. In other words, the provider tries to control congestion of its network by changing service prices. The problem here is that the provider will have to change the price so frequently that human involvement in the price negotiations will not be possible. This problem can be solved by running intermediate software (or hardware) agents between end-users and the provider. The intermediate agent receives inputs from the end-user at large timescales and keeps negotiating with the provider at small timescales. Intermediate agents in the PFCC architecture are therefore crucial in terms of acceptability by users.

If the PFCC architecture is not employed (i.e., providers do not bother to employ congestion pricing), then congestion control is left to the end-user, as it is in the current Internet. Currently in the Internet, congestion control is totally left to end-users, and the common way of controlling congestion is TCP and its variants. However, this situation leaves the door open to non-cooperative users who do not employ congestion control algorithms, or who employ congestion control algorithms that violate fairness objectives. For example, by simple tricks, it is possible to make a TCP connection capture more of the available capacity than other TCP connections. The major problem with PFCC is that the development of user-friendly intermediate agents is heavily dependent on user opinion, and hence requires a significant amount of research. A study of determining user opinions is available in [3]. In this chapter, we do not focus on the development of intermediate agents.
4.2 Pricing Over Congestion Control (POCC) Another way of approaching the congestion control problem by pricing is to overlay pricing on top of congestion control. This means the provider undertakes the congestion control problem itself and employs an underlying congestion control mechanism for its network. This way it is possible to enforce tight control on congestion at small timescales, while maintaining human involvement in the price negotiations at large timescales. Figure 1 illustrates the difference between the POCC (with congestion control) and PFCC (without congestion control) architectures. So, assuming that there is an underlying congestion control scheme, the provider can set the parameters of that underlying scheme such that it leads to fairness and better control of congestion. The pricing scheme on top can determine user incentives and set the parameters of the underlying congestion control scheme accordingly. This way, it is possible to favor traffic flows with higher willingness to pay (i.e., budget) over the others. Furthermore, the pricing scheme also brings benefits such as an indirect control of user demand by price, which in turn helps the underlying congestion control scheme operate more smoothly. However, the overall system performance (e.g., fairness, utilization, throughput) will depend on the flexibility of the underlying congestion control mechanism. Since our main focus is to implement pricing in a diff-serv environment, we assume that the provider employs an edge-to-edge congestion control mechanism under the pricing protocol on top. Overlaying pricing on top of edge-to-edge congestion control in a diff-serv environment raises two major problems:
1. Parameter mapping: Since the pricing protocol wants to allocate network capacity according to user incentives (i.e., users with greater budgets should get more capacity), which change dynamically over time, it must be able to set the corresponding parameters of the underlying edge-to-edge congestion control mechanism so that capacity is allocated to the user flows according to their incentives. This raises the need for a method of mapping parameters of the pricing scheme to parameters of the underlying congestion control mechanism. Notice that this type of mapping requires the edge-to-edge congestion control mechanism to provide parameters that tune the rate given to edge-to-edge flows. 2. Edge queues: The underlying edge-to-edge congestion control scheme will not always admit all the traffic accepted by the pricing protocol, which will cause queues to build up at the network edges. So, management of these edge queues is necessary in the POCC architecture. Figure 1a and b compares the state of the edge queues in the two cases, with and without an underlying edge-to-edge congestion control scheme.
5 Distributed-DCC Framework The distributed-DCC framework is specifically designed for diff-serv environments, because the edge routers can perform the complex operations that are essential to implementing congestion pricing. Each edge router is treated as a station of the provider. Each station advertises locally computed prices, using information received from the other stations. The framework describes how to preserve coordination among the stations such that the stability and fairness of the overall network are preserved. We can summarize the essence of distributed-DCC in two items: • Since upgrading all routers is not practical, pricing should happen on an edge-to-edge basis, which only requires upgrades to edge routers. • The provider should employ short-term contracts in order to be able to change prices frequently enough for congestion pricing to be enabled. The distributed-DCC framework has three major components: the logical pricing server (LPS), ingress stations, and egress stations. Solid-lined arrows in the figure represent control information transmitted among the components. Basically, ingress stations negotiate with customers, observe customers' traffic, and estimate customers' demand. Ingress stations inform the corresponding egress stations about the observations and estimations for each edge-to-edge flow. Egress stations detect congestion by monitoring edge-to-edge traffic flows. Based on congestion detections, egress stations estimate the available capacity for each edge-to-edge flow and inform the LPS about these estimations.
The LPS receives capacity estimations from the egress stations and allocates the available network capacity to the edge-to-edge flows according to different criteria (such as fairness and price optimality).
5.1 Ingress Station i Figure 4 illustrates sub-components of ingress station i in the framework. Ingress i includes two sub-components: pricing scheme and budget estimator.
Fig. 4 Major functions of ingress i
Ingress station i keeps a "current" price vector p_i, where p_ij is the price for the flow from ingress i to egress j. So, the traffic using the flow from i to j is charged the price p_ij. The pricing scheme is the sub-component that calculates the price p_ij for each edge-to-edge flow starting at ingress i. It uses the allowed flow capacities c_ij and other local information (such as b_ij) in order to calculate the price p_ij. The station then uses p_ij in negotiations with customers. We will describe a simple pricing scheme, edge-to-edge pricing (EEP), later in Section 6. However, it is possible to implement several other pricing schemes using the information available at ingress i. Other than EEP, we implemented another pricing scheme, price discovery, which is available in [1]. Also, ingress i uses the total estimated network capacity C in calculating the Vmax contract parameter defined in (1). Admission control techniques can be used to identify the best value for Vmax. We use a simple method which does not put any restriction on Vmax, i.e., Vmax = C · T, where T is the contract length. The budget estimator is the sub-component that observes the demand for each edge-to-edge flow. We implicitly assume that the user's "budget" represents the user's demand (i.e.,
willingness to pay). So, the budget estimator estimates the budget b̂_ij of each edge-to-edge traffic flow.¹
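To make the ingress-side bookkeeping concrete, a minimal Python sketch is given below. The class and attribute names are illustrative (they do not come from the chapter); the price rule is the EEP formula p_ij = b̂_ij/c_ij introduced formally in Section 6, and the contract-volume rule is the unrestricted Vmax = C · T described above.

```python
class IngressStation:
    """Sketch of the per-ingress bookkeeping described above (names are illustrative)."""

    def __init__(self, contract_length_s):
        self.T = contract_length_s   # contract length T
        self.allowed_capacity = {}   # c_ij, received from the LPS, keyed by egress j
        self.budget_estimate = {}    # b_ij, estimated user budget per edge-to-edge flow

    def price(self, egress_j):
        """EEP price for the flow to egress j: p_ij = b_ij / c_ij (see Section 6)."""
        return self.budget_estimate[egress_j] / self.allowed_capacity[egress_j]

    def v_max(self, total_capacity_C):
        """Unrestricted maximum contract volume, Vmax = C * T, as used in the chapter."""
        return total_capacity_C * self.T
```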
5.2 Egress Station j Figure 5 illustrates sub-components of egress station j in the framework: congestion detector, congestion-based capacity estimator, flow cost analyzer, and fairness tuner.
Fig. 5 Major functions of egress j
The congestion detector implements an algorithm to detect congestion in the network core by observing the traffic arriving at egress j. Congestion detection can be done in several ways. We assume that interior routers mark data packets (i.e., set the ECN bit) if their local queue exceeds a threshold. The congestion detector generates a "congestion indication" if it observes a marked packet in the arriving traffic.
¹ Note that an edge-to-edge flow does not mean an individual user's flow. Rather, it is the traffic flow composed of the aggregate of all traffic going from one edge node to another edge node.
The congestion-based capacity estimator estimates the available capacity ĉ_ij for each edge-to-edge flow exiting at egress j. To calculate ĉ_ij, it uses congestion indications from the congestion detector and the actual output rates μ_ij of the flows. The crucial property of the congestion-based capacity estimator is that it estimates capacity in a congestion-based manner, i.e., it decreases the capacity estimate when there is a congestion indication and increases it when there is none. This makes the prices congestion sensitive, since the pricing scheme at the ingress calculates prices based on the estimated capacity. The flow cost analyzer determines the cost incurred by each traffic flow exiting at egress j; this cost can be defined in several ways, such as the number of links traversed by the flow, the number of bottlenecks traversed by the flow, or the amount of queuing delay caused by the flow. We assume that the number of bottlenecks is a good representation of the cost incurred by a flow. It is possible to define edge-to-edge algorithms that can effectively and accurately estimate the number of bottlenecks traversed by a flow [34]. The LPS, as will be described in the next section, allocates capacity to edge-to-edge flows based on their budgets. The flows with higher budgets are given more capacity than the others. So, egress j can penalize or favor a flow by decreasing or increasing its budget b̂_ij. The fairness tuner is the component that updates b̂_ij. It penalizes or favors the flow from ingress i by updating its estimated budget value, i.e., b_ij = f(b̂_ij, r̂_ij, <parameters>), where <parameters> are other optional parameters that may be used to decide how much to penalize or favor the flow. For example, if the flow from ingress i is passing through more congested areas than the other flows, the fairness tuner can penalize this flow by reducing its budget estimate b̂_ij. We will describe an algorithm for the fairness tuner in Section 5.2.1. Egress j sends the ĉ_ij's (calculated by the congestion-based capacity estimator) and the b_ij's (calculated by the fairness tuner) to the LPS.
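The chapter states only the direction of the capacity-estimator update (decrease on a congestion indication, increase otherwise), not the exact rule. The sketch below is therefore an assumption: a multiplicative decrease and a small proportional increase, using the values β = 0.95 and 0.0005 that appear later as simulation parameters.

```python
def update_capacity_estimate(c_hat, congested, beta=0.95, incr=0.0005):
    """One update step of the congestion-based capacity estimator.

    The decrease/increase rule (multiplicative back-off by beta, small
    proportional probe otherwise) and the parameter names are our assumptions;
    the chapter only specifies the direction of the update.
    """
    if congested:
        return beta * c_hat          # congestion indication: shrink the estimate
    return c_hat * (1.0 + incr)      # no congestion: probe for more capacity
```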
5.2.1 Fairness Tuner We examine the issues regarding fairness in two main cases. We first describe the two cases and then provide solutions within the distributed-DCC framework: • Single-bottleneck case: The pricing protocol should charge the same price (i.e., $/bandwidth) to the users of the same bottleneck. In this way, among the customers using the same bottleneck, the ones who have more budget will be given more rate (i.e., bandwidth/time) than the others. The intuition behind this reasoning is that the cost of providing capacity to each customer is the same. • Multi-bottleneck case: The pricing protocol should charge more to the customers whose traffic passes through more bottlenecks and causes higher costs to the provider. So, in addition to proportionality to customer budgets, we also want to allocate less rate to the customers whose flows pass through more bottlenecks than the other customers' flows.
For multi-bottleneck networks, two main types of fairness have been defined: max–min fairness [16] and proportional fairness [15]. In a max–min fair rate allocation, all flows get an equal share of the bottlenecks, while in a proportionally fair rate allocation flows get penalized according to the number of traversed bottlenecks. Depending on the cost structure and the users' utilities, the provider may want to choose max–min or proportional rate allocation. So, we would like to be able to tune the pricing protocol such that the fairness of its rate allocation is of the kind the provider wants. To achieve the fairness objectives itemized above, we introduce new parameters for tuning the rate allocation to flows. In order to penalize the flow from i to j, egress j can reduce b̂_ij while updating the flow's estimated budget. It uses the following formula to do so:

b_{ij} = f(\hat{b}_{ij}, r_{ij}(t), \alpha, r_{\min}) = \frac{\hat{b}_{ij}}{r_{\min} + (r_{ij}(t) - r_{\min})\,\alpha},

where r_ij(t) is the congestion cost caused by the flow from i to j, r_min is the minimum possible congestion cost for the flow, and α is the fairness coefficient. Instead of b̂_ij, egress j now sends b_ij to the LPS. When α is 0, the fairness tuner implements max–min fairness. As α gets larger, the flow gets penalized more and the rate allocation gets closer to proportional fairness. However, if α is too large, the rate allocation moves away from proportional fairness again. Let α* be the α value at which the rate allocation is proportionally fair. If the estimate r_ij(t) is exactly correct, then α* = 1; otherwise, α* depends on how accurate r_ij(t) is. Assume that each bottleneck has the same amount of congestion and capacity. Then, in order to calculate r_ij(t) and r_min, we can directly use the number of bottlenecks the flow from i to j passes through. In such a case, r_min will be 1 and r_ij(t) will be the number of bottlenecks the flow is passing through.
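A direct transcription of this update into Python might look as follows; the function and argument names are ours.

```python
def tune_budget(b_hat, r, alpha, r_min=1.0):
    """Fairness tuner: scale down a flow's estimated budget according to the
    congestion cost r it causes (here, the number of bottlenecks it traverses,
    so r_min = 1).  alpha = 0 gives max-min fairness; larger alpha moves the
    allocation toward proportional fairness."""
    return b_hat / (r_min + (r - r_min) * alpha)
```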
5.3 Logical Pricing Server (LPS) Figure 6 illustrates the basic functions of the LPS in the framework. The LPS receives information from the egresses and calculates the allowed capacity c_ij for each edge-to-edge flow. The communication between the LPS and the stations takes place at every LPS interval L. There is only one major sub-component in the LPS: the capacity allocator. The capacity allocator receives the ĉ_ij's, the b_ij's, and congestion indications from the egress stations. It calculates the allowed capacity c_ij for each flow. Calculation of the c_ij values is a complicated task which depends on the internal topology. In general, the flows should share the capacity of the same bottleneck in proportion to their budgets. Besides the functions of the capacity allocator, the LPS also calculates the total available network capacity C, which is necessary for determining the contract parameter Vmax at the ingresses. The LPS simply sums the ĉ_ij's to calculate C.
Fig. 6 Major functions of LPS
5.3.1 ETICA: Edge-to-Edge, Topology-Independent Capacity Allocation First, note that the LPS implements the ETICA algorithm as its capacity allocator (see Fig. 6). So, we will refer to the LPS throughout the description of ETICA below. At the LPS, we introduce a new piece of information about each edge-to-edge flow f_ij. A flow f_ij is congested if egress j has been receiving congestion indications from that flow recently (we will define "recent" below). Again at the LPS, let K_ij determine the state of f_ij. If K_ij > 0, the LPS determines f_ij to be congested; if not, it determines f_ij to be non-congested. At every LPS interval t, the LPS calculates K_ij as follows:

K_{ij}(t) = \begin{cases} k, & \text{congestion in } t-1 \\ K_{ij}(t-1) - 1, & \text{no congestion in } t-1 \end{cases}, \qquad K_{ij}(0) = 0, \qquad (2)

where k is a positive integer. Notice that the parameter k defines how long a flow stays in the "congested" state after the last congestion indication. In other words, k defines the timescale for deciding whether a congestion indication is "recent" or not. Based on these considerations in the ETICA algorithm, Fig. 7 illustrates the states of an edge-to-edge flow, given that the probability of receiving a congestion indication in the last LPS interval is p. Gray states are the states in which the flow is "congested," and the single white state is the "non-congested" state. Observe that the number of
Fig. 7 States of an edge-to-edge flow in ETICA algorithm: the states i > 0 are “congested” states and the state i = 0 is the “non-congested” state, represented with gray and white colors, respectively
congested states (i.e., gray states) is equal to k, which defines to what extent a congestion indication is "recent."² Given the above method for determining whether a flow is congested or not, we now describe the algorithm for allocating capacity to the flows. Let F be the set of all edge-to-edge flows in the diff-serv domain and F_c be the set of congested edge-to-edge flows. Let C_c be the sum of the ĉ_ij's where f_ij ∈ F_c. Further, let B_c be the sum of the b_ij's where f_ij ∈ F_c. Then, the LPS calculates the allowed capacity for f_ij as follows:

c_{ij} = \begin{cases} \dfrac{b_{ij}}{B_c}\, C_c, & K_{ij} > 0 \\ \hat{c}_{ij}, & \text{otherwise.} \end{cases}
The intuition is that if a flow is congested, then it must be competing with other congested flows. So, a congested flow is allowed a capacity in proportion to its budget relative to budgets of all congested flows. Since we assume no knowledge about the interior topology, we can approximate the situation by considering these congested flows as if they are passing through a single bottleneck. If knowledge about the interior topology is provided, one can easily develop better algorithms by sub-grouping the congested flows that are passing through the same bottleneck.
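The state update in (2) and the allocation rule above can be condensed into a short Python sketch; the data structures and function names are ours (the LPS bookkeeping around them is omitted), and the default k = 25 is the value used in the experiments of Section 7.1.

```python
def update_state(K_prev, congested_last_interval, k=25):
    """Equation (2): set K_ij to k after a congestion indication; otherwise
    decrement it by one per LPS interval (K_ij(0) = 0)."""
    return k if congested_last_interval else K_prev - 1

def allocate_capacity(K, b, c_hat):
    """ETICA allocation: congested flows (K_ij > 0) share the congested
    capacity C_c in proportion to their (tuned) budgets b_ij; non-congested
    flows simply keep their estimated capacity c_hat_ij.
    K, b, c_hat are dicts keyed by (ingress, egress) flow identifiers."""
    congested = [f for f in K if K[f] > 0]
    C_c = sum(c_hat[f] for f in congested)
    B_c = sum(b[f] for f in congested)
    c = {}
    for f in K:
        if K[f] > 0 and B_c > 0:          # the B_c > 0 guard is our addition
            c[f] = (b[f] / B_c) * C_c
        else:
            c[f] = c_hat[f]
    return c
```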
6 Single-Domain Edge-to-Edge Pricing Scheme (EEP) For flow f_ij, the distributed-DCC framework provides an allowed capacity c_ij and an estimate of the total user budget b̂_ij at ingress i. So, the provider station at ingress i can use these two pieces of information to calculate the price. We propose a simple price formula to balance supply and demand:
² Note that instead of setting K_ij to k at every congestion indication, more accurate methods can be used in order to represent the self-similar behavior of congestion epochs. For simplicity, we proceed with the method in (2).
\hat{p}_{ij} = \frac{\hat{b}_{ij}}{c_{ij}}. \qquad (3)

Here, b̂_ij represents the user demand and c_ij is the available supply. The main idea of EEP is to balance supply and demand by equating the price to the ratio of the users' budget (i.e., demand) B to the available capacity C. Based on that, we use the pricing formula

p = \frac{\hat{B}}{\hat{C}}, \qquad (4)
where B̂ is the users' estimated budget and Ĉ is the estimated available network capacity. The capacity estimation is performed based on the congestion level in the network, and this makes the EEP scheme a congestion-sensitive pricing scheme. We now formulate the problem of total user utility maximization for a multi-user multi-bottleneck network. Let F = {1, . . . , F} be the set of flows and L = {1, . . . , L} be the set of links in the network. Also, let L(f) be the set of links the flow f passes through and F(l) be the set of flows passing through the link l. Let c_l be the capacity of link l. Let λ be the vector of flow rates and λ_f be the rate of flow f. We can formulate the total user utility maximization problem as follows:

SYSTEM:
\max_{\lambda} \; \sum_{f} U_f(\lambda_f) \quad \text{subject to} \quad \sum_{f \in F(l)} \lambda_f \le c_l, \quad l = 1, \ldots, L. \qquad (5)
This problem can be divided into two separate problems by employing monetary exchange between the user flows and the network provider. Following Kelly's [14] methodology, we split the system problem into two. The first problem is solved at the user side: given the accumulation of link prices on flow f's route, p^f, what is the optimal sending rate in order to maximize surplus?

FLOW_f(p^f):
\max_{\lambda_f} \left\{ U_f(\lambda_f) - \sum_{l \in L(f)} p_l \lambda_f \right\} \quad \text{over } \lambda_f \ge 0. \qquad (6)
The second problem is solved at the provider's side: given the sending rates of the user flows (which depend on the link prices), what is the optimal price to advertise in order to maximize revenue?

NETWORK(λ(p^f)):
\max_{p} \; \sum_{f} \sum_{l \in L(f)} p_l \lambda_f \quad \text{subject to} \quad \sum_{f \in F(l)} \lambda_f \le c_l, \quad l = 1, \ldots, L, \quad \text{over } p \ge 0. \qquad (7)

Let the total price paid by flow f be p^f = \sum_{l \in L(f)} p_l. Then, the solution to FLOW_f(p^f) will be

\lambda_f(p^f) = {U_f'}^{-1}(p^f). \qquad (8)
When it comes to the NETWORK(λ(p^f)) problem, the solution depends on the user flows' utility functions, since their sending rates are based on their utility functions, as shown in the solution of FLOW_f(p^f). So, in the next sections we solve the NETWORK(λ(p^f)) problem for the cases of logarithmic and non-logarithmic utility functions. We model customer i's utility with the well-known function³ [15–17, 19]:

u_i(x) = w_i \log(x), \qquad (9)

where x is the bandwidth allocated to the customer and w_i is customer i's budget (or bandwidth sensitivity). Now, we set up a vectorized notation and then solve the revenue maximization problem NETWORK(λ(p^f)). Assume the network includes n flows and m links. Let λ be the row vector of the flow rates (λ_f for f ∈ F) and P be the column vector of the prices at each link (p_l for l ∈ L). Define the n × n matrix P* in which the diagonal element P*_jj is the aggregate price advertised to flow j (i.e., p^j = \sum_{l \in L(j)} p_l) and all the other elements are 0. Also, let A be the n × m routing matrix in which the element A_ij is 1 if the ith flow passes through the jth link and 0 otherwise, and let C be the column vector of link capacities (c_l for l ∈ L). Finally, define the n × n matrix λ̂ in which the diagonal element λ̂_jj is the rate of flow j (i.e., λ̂_jj = λ_j) and all the other elements are 0.
³ Wang and Schulzrinne introduced a more complex version in [33].
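For the logarithmic utility in (9), the user-side problem (6) has a simple closed-form solution. The short derivation below is ours, but it follows directly from (8) and (9).

```latex
% Per-flow demand under the logarithmic utility (9):
\max_{\lambda_f \ge 0}\; w_f \log \lambda_f - p^f \lambda_f
\quad\Longrightarrow\quad
\frac{w_f}{\lambda_f} - p^f = 0
\quad\Longrightarrow\quad
\lambda_f(p^f) = \frac{w_f}{p^f},
```

which is the per-flow form of the vectorized demand function (12) below.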
Given the above notation, the relationship between the link price vector P and the flow aggregate price matrix P* can be written as

A P = P^{*} e, \qquad (10)
\lambda = (\hat{\lambda} e)^{T} = e^{T} \hat{\lambda}, \qquad (11)

where e is the column unit vector. We use the utility function of (9) in our analysis. By plugging (9) into (8) we obtain the flows' demand function in vectorized notation:

\lambda(P^{*}) = W P^{*-1}, \qquad (12)

where W is the row vector of the weights w_i in the flows' utility function (9). Also, we can write the utility function (9) in vectorized notation as follows:

U(\lambda) = W \log(\hat{\lambda}). \qquad (13)

The revenue maximization of (7) can be re-written as follows:

\max_{P} \; R = \lambda A P \quad \text{subject to} \quad \lambda A \le C^{T}. \qquad (14)

So, we write the Lagrangian as follows:

L = \lambda A P + (C^{T} - \lambda A)\gamma, \qquad (15)

where γ is the column vector of the Lagrange multipliers for the link capacity constraints. Solving (15), we derive P:

P = (C^{T})^{-1} W e. \qquad (16)

Since P^{*} = (P^{*})^{T}, we can derive another solution:

P = A^{-1} W^{T} C^{-1} A^{T} e. \qquad (17)

Notice that the result in (16) holds for a single-bottleneck (i.e., single-link) network. In non-vectorized notation, this result translates to

p = \frac{\sum_{f \in F} w_f}{c}.
The result in (17) holds for a multi-bottleneck network. This result means that each link's optimal price depends on the routes of the flows passing through that link. More specifically, the optimal price for link l is the accumulation of the budgets of the flows passing through link l (i.e., W^T A^T in the formula) divided by the total capacity of the links traversed by the flows traversing link l (i.e., A^{-1} C^{-1} in the formula). In non-vectorized notation, the price of link l can be written as

p_l = \frac{\sum_{f \in F(l)} w_f}{\sum_{f \in F(l)} \sum_{k \in L(f)} c_k}.
Similar results can be found for non-logarithmic utility functions involving user’s utility-bandwidth elasticity [36].
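As a quick sanity check of the single-link result above (which has the same form as the EEP formula (3)), the following sketch verifies numerically that pricing a single link at p = Σ_f w_f / c makes the log-utility demands w_f / p exactly fill the link. The code and the particular numbers are ours (the budgets 30, 20, 10 mirror those used in the experiments of Section 7.1).

```python
# Single-link check: with utilities u_f(x) = w_f log(x), each flow demands
# lambda_f = w_f / p (see (8)-(9)); at p = sum(w) / c these demands sum to c.
w = [30.0, 20.0, 10.0]      # illustrative flow budgets
c = 10.0                    # illustrative link capacity (Mb/s)

p = sum(w) / c              # optimal single-link price from (16)
demands = [w_f / p for w_f in w]

print(p)                    # 6.0 ($/Mb)
print(demands)              # [5.0, 3.33..., 1.66...] -> sums to c = 10
assert abs(sum(demands) - c) < 1e-9
```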
7 Distributed-DCC: PFCC and POCC Architectures In order to adapt distributed-DCC to the PFCC architecture, the LPS must operate at very small timescales. In other words, the LPS interval must be small enough to maintain control over congestion, since PFCC assumes no underlying congestion control mechanism. This raises practical issues to be addressed. For instance, intermediate agents between customers and ingress stations must be implemented in order to maintain human involvement in the system. Further, scalability issues regarding the LPS must be solved, since the LPS must operate at very small timescales. Distributed-DCC operates on a per edge-to-edge flow basis, which means that flows are not tracked per connection: all the traffic going from edge router i to edge router j is counted as a single flow. This relieves the scalability problem for operations that happen on a per-flow basis. The number of flows in the system is n(n − 1), where n is the number of edge routers in the diff-serv domain. So, scalability in the number of flows is not a problem for the current Internet, since the number of edge routers in a single diff-serv domain is small. If it becomes too large in the future, aggregation techniques can be used to overcome this scalability issue, at the expense of some optimality. To adapt the distributed-DCC framework to the POCC architecture, an edge-to-edge congestion control mechanism is needed, for which we use Riviera [13] in our experiments. Riviera takes advantage of the two-way communication between ingress and egress edge routers in a diff-serv network. The ingress sends forward feedback to the egress in response to feedback from the egress, and the egress sends backward feedback to the ingress in response to feedback from the ingress; thus the ingress and egress of a traffic flow keep bouncing feedback to each other. Ignoring loss of data packets, the egress of a traffic flow measures the accumulation, a, caused by the flow by using the bounced feedback and RTT estimations. When a for a particular flow exceeds a threshold or falls below a threshold, the flow is identified as congested or not-congested,
respectively. The ingress node is informed about the congestion detection by backward feedback and uses the egress's explicit rate to adjust the sending rate. We now provide solutions to the problems defined in Section 4.2, for the case of overlaying distributed-DCC over Riviera (a short sketch follows the list): 1. Parameter mapping: For each edge-to-edge flow, the LPS can calculate the capacity share of that flow out of the total network capacity. Let γ_ij = c_ij/C be the fraction of network capacity that must be given to the flow from i to j. The LPS can convey the γ_ij's to the ingress stations, which multiply the increase parameter α_ij by γ_ij. Also, the LPS can communicate the γ_ij's to the egresses, which multiply max_thresh_ij and min_thresh_ij by γ_ij. 2. Edge queues: In distributed-DCC, ingress stations are informed by the LPS about the allocated capacity c_ij for each edge-to-edge flow. So, one intuitive way of making sure that the user will not contract for more than c_ij is to subtract from c_ij the capacity necessary to drain the already built-up edge queue, and then make contracts accordingly. In other words, the ingress station updates the allocated capacity for the flow from i to j by the formula c'_ij = c_ij − Q_ij/T and uses c'_ij for price calculation. Note that Q_ij is the edge queue length for the flow from i to j, and T is the length of the contract.
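The two fixes above can be sketched as follows; the function names, and the way Riviera's parameters are represented as per-flow dictionaries, are illustrative rather than taken from the Riviera specification.

```python
def map_riviera_parameters(c, C, alpha, max_thresh, min_thresh):
    """Parameter mapping: scale Riviera's per-flow increase parameter and
    accumulation thresholds by the flow's capacity share gamma_ij = c_ij / C.
    All arguments except C are dicts keyed by (ingress, egress) pairs."""
    gamma = {f: c[f] / C for f in c}
    return ({f: alpha[f] * gamma[f] for f in alpha},
            {f: max_thresh[f] * gamma[f] for f in max_thresh},
            {f: min_thresh[f] * gamma[f] for f in min_thresh})

def contracted_capacity(c_ij, queue_len_ij, contract_length_T):
    """Edge-queue correction: reserve enough of the allocated capacity to drain
    the already built-up edge queue, c'_ij = c_ij - Q_ij / T (the clamp at zero
    is our addition)."""
    return max(c_ij - queue_len_ij / contract_length_T, 0.0)
```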
7.1 Simulation Experiments and Results We now present ns-2 [29] simulation experiments for the two architectures, PFCC and POCC, on single-bottleneck and multi-bottleneck topologies. Our goal is to illustrate the fairness and stability properties of the two architectures, with comparisons between the two where possible. We simulate distributed-DCC's PFCC and POCC versions as described in Section 7, with the EEP pricing scheme at the ingress stations. The key performance metrics we extract from our simulations are as follows: • Steady-state properties of the PFCC and POCC architectures: queues, rate allocation • PFCC's fairness properties: provision of various kinds of fairness in rate allocation by changing the fairness coefficient α • Performance of distributed-DCC's capacity allocation algorithm ETICA in terms of adaptiveness The single-bottleneck topology has a bottleneck link, which is connected to n edge nodes at each side, where n is the number of users. The multi-bottleneck topology has n − 1 bottleneck links that are connected to each other serially. There are again n ingress and n egress edge nodes. Each ingress edge node is connected to the beginning of a bottleneck link, and each egress node is connected to the end of a bottleneck link. All bottleneck links have a capacity of 10 Mb/s and all other links have 15 Mb/s. The propagation delay on each link is 5 ms, and users send UDP traffic with an average packet size of 1000 B. To ease understanding of the
experiments, each user sends its traffic to a separate egress. For the multi-bottleneck topology, one user sends through all the bottlenecks (i.e., the long flow) while the others cross that user's long flow. The queues at the interior nodes (i.e., the nodes at the endpoints of bottleneck links) mark the packets when their local queue size exceeds 30 packets. In the multi-bottleneck topology they increment a header field instead of just marking. Figure 8a shows a single-bottleneck topology with n = 3. Figure 8b shows a multi-bottleneck topology with n = 4. The white nodes are edge nodes and the gray nodes are interior nodes. These figures also show the traffic flows of the users on the topology. Each user flow tries to maximize its total utility by contracting for b/p amount of capacity, where b is its budget and p is the price. The flows' budgets are randomized according to a truncated normal [30] distribution with a given mean value. This mean value is what we refer to as a flow's budget in our simulation experiments.
Fig. 8 (a) Single-bottleneck and (b) multi-bottleneck network for distributed-DCC experiments
Contracting takes place every 4 s, the observation interval is 0.8 s, and the LPS interval is 0.16 s. Ingresses send budget estimations to the corresponding egresses at every observation interval. The LPS sends information to the ingresses at every LPS interval. The parameter k is set to 25, which means a flow is determined to be non-congested at the earliest after 25 LPS intervals, equivalent to one contracting interval (see Section 5.3.1). The parameter δ is set to 1 packet (i.e., 1000 B), the initial value of ĉ_ij for each flow f_ij is set to 0.1 Mb/s, β is set to 0.95, and %r is set to 0.0005. Also note that, in the experiments, packet drops are not allowed in any network node, because we would like to see the performance of the schemes in terms of assured service. 7.1.1 Experiments on Single-Bottleneck Topology We run simulation experiments for PFCC and POCC on the single-bottleneck topology, which is represented in Fig. 8a. In this experiment, there are three users with budgets of 30, 20, and 10, respectively, for users 1, 2, and 3. Total simulation time is 15,000 s, and at the beginning only user 1 is active in the system. After 5000 s, user 2 becomes active. Again after 5000 s, at simulation time 10,000, user 3 becomes active. For POCC, there is an additional component in the simulation: the edge queues. The edge queues mark the packets when the queue size exceeds 200 packets. So, in order to manage the edge queues in this simulation experiment, we simultaneously employ both of the techniques described before.
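For reference, the parameter settings just listed can be collected in one place; the dictionary below merely restates them and is not part of the original simulation scripts.

```python
# Simulation parameters used in the distributed-DCC experiments (Section 7.1).
SIM_PARAMS = {
    "contract_interval_s": 4.0,
    "observation_interval_s": 0.8,
    "lps_interval_s": 0.16,
    "k": 25,                            # ETICA congestion-state parameter
    "delta_packets": 1,                 # 1 packet = 1000 B
    "initial_c_hat_mbps": 0.1,
    "beta": 0.95,
    "r_percent": 0.0005,
    "bottleneck_capacity_mbps": 10,
    "other_link_capacity_mbps": 15,
    "link_delay_ms": 5,
    "core_mark_threshold_packets": 30,
    "edge_mark_threshold_packets": 200,  # POCC edge queues only
}
```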
In terms of results, the volume given to each flow is very important. Figures 9a and 10a show the volumes given to each flow in PFCC and POCC, respectively. We see the flows are sharing the bottleneck capacity in proportion to their budgets. In comparison to POCC, PFCC allocates volume more smoothly but with the same proportionality to the flows. The noisy volume allocation in POCC is caused by coordination issues (i.e. parameter mapping, edge queues) investigated in Section 7.
Fig. 9 Results of single-bottleneck experiment for PFCC: (a) Flow rates; (b) price to flows; and (c) bottleneck queue
Figures 9b and 10b show the price advertised to the flows in PFCC and POCC, respectively. As new users join, the pricing scheme increases the price in order to balance supply and demand. Figures 9c and 10c show the bottleneck queue size in PFCC and POCC, respectively. Notice that the queue size peaks transiently at the times when new users become active. Otherwise, the queue size is controlled reasonably and the system is stable. In comparison to PFCC, POCC manages the bottleneck queue much better because of the tight control enforced by the underlying edge-to-edge congestion control algorithm, Riviera.
Fig. 10 Results of single-bottleneck experiment for POCC: (a) Flow rates; (b) prices; and (c) bottleneck queue
7.1.2 Experiments on Multi-bottleneck Topology On a multi-bottleneck network, we would like to illustrate two properties of PFCC: • Property 1: Provision of various kinds of fairness in rate allocation by changing the fairness coefficient α of the distributed-DCC framework (see Section 5.2.1) • Property 2: Performance of distributed-DCC's capacity allocation algorithm ETICA in terms of adaptiveness (see Section 5.3.1) Since Riviera does not currently provide a set of parameters for weighted allocation on a multi-bottleneck topology, we do not run any experiment for POCC on the multi-bottleneck topology. In order to illustrate Property 1, we run a series of experiments for PFCC with different α values. Recall that α is the fairness coefficient of distributed-DCC; higher α values imply more penalty to the flows that cause higher congestion costs. We use a larger version of the topology represented in Fig. 8b. In the multi-bottleneck topology there are 10 users and 9 bottleneck links. Total simulation time is 10,000 s. At the beginning, only the user with the long flow is active. All the other users have traffic flows crossing the long flow. After every 1000 s, one of these other users becomes active. So, as time passes, the number of bottlenecks in the system increases
since new users with crossing flows join in. Notice that the number of bottlenecks in the system is one less than the number of active user flows. We are interested in the volume given to the long flow, since it is the one that causes higher congestion costs than the other user flows. Figure 11a shows the average volume given to the long flow versus the number of bottlenecks in the system for different values of α. As expected, the long flow gets less and less capacity as α increases. When α is zero, the scheme achieves max–min fairness; as α increases, the scheme gets closer to proportional fairness. Also note that the other user flows get the rest of the bottleneck capacity, and hence utilize the bottlenecks.
Fig. 11 Results of PFCC experiments on multi-bottleneck topology: (a) Long flow rate; (b) price to long flow; and (c) flow rates
This variation in fairness is basically achieved by advertisement of different prices to the user flows according to the costs incurred by them. Figure 11b shows the average price that is advertised to the long flow as the number of bottlenecks in the system increases. We can see that the price advertised to the long flow increases as the number of bottlenecks increases. Finally, to illustrate property 2, we ran an experiment on the topology in Fig. 8b with small changes. We increased the capacity of the bottleneck at node D from
10 to 15 Mb/s. There are four flows and three bottlenecks in the network, as represented in Fig. 8b. Initially, all the flows have an equal budget of 10. Total simulation time is 30,000 s. Between times 10,000 and 20,000, the budget of flow 1 is temporarily increased to 20. The fairness coefficient α is set to 0. All the other parameters (e.g., marking thresholds, initial values) are exactly the same as in the single-bottleneck experiments of the previous section. Figure 11c shows the volumes given to each flow. Until time 10,000 s, flows 0, 1, and 2 share the bottleneck capacities equally, presenting a max–min fair allocation because α was set to 0. However, flow 3 gets more than the others because of the extra capacity at bottleneck node D. This flexibility is achieved by the freedom given to individual flows by the capacity allocation algorithm (see Section 5.3.1). Between times 10,000 and 20,000, flow 2 gets a step increase in its allocated volume because of the step increase in its budget. As a result, flow 0 gets a step decrease in its volume. Also, flows 2 and 3 adapt themselves to the new situation by attempting to utilize the extra capacity left over from the reduction in flow 0's volume. So, flows 2 and 3 get a step decrease in their volumes. After time 20,000, the flows return to their original volume allocations, illustrating the adaptiveness of the scheme.
8 Pricing Loss Guarantee in End-to-End Service As stated in Section 3, in the contract-switched paradigm, it is possible to offer QoS guarantees beyond the best-effort service. In this section, we develop pricing for loss guarantees offered along with the base bandwidth service, where the price of the additional loss guarantee is a component of the overall price. Provision of a loss-based QoS-guaranteed service is inherently risky due to uncertainties caused by competing traffic in the Internet. Future outcomes of a service may be in favor of or against the provider, i.e., the provider may or may not deliver the loss-based QoS as promised. Uncertainty in quality of service is not unique to Internet services [28]. For example, an express delivery company may not always deliver customers’ parcels intact and/or on time; and when losses or delays occur, certain remedy mechanisms, such as money back or insurance, are employed to compensate the customers. On the other hand, the provider needs to take into account such uncertainty when pricing its services. In other words, prices should be set such that the provider will be able to recuperate the possible expenses it will incur for attempting to deliver the QoS, as well as the pay-offs to customers when the promised quality of service is not delivered. When further improving QoS deterministically gets too costly, the provider may be better off using economic tools to manage risks in QoS delivery, rather than trying to eliminate them. We use option pricing techniques to evaluate the risky nature of the loss-guaranteed service. In particular, we consider pricing from the provider’s perspective and evaluate the monetary “reward” for the favorable risks to the provider, which then becomes an added component to the base price of the
contract. Pricing the risk appropriately lets the risk be fairly borne by the provider and the customer. Assuming all ISPs employ an option-based pricing scheme for their intra-domain loss-assured services (for details, see [10]), in this section we consider the price of end-to-end loss assurance. Here the risk underlying a loss guarantee refers to whether or not the service the ith ISP delivers to a customer satisfies the loss guarantee specified in the contract. The price is set as the monetary "reward" the ith ISP gets when it is able to meet the loss guarantee, S_i^U. The price depends on the loss behavior of a customer's traffic through the ISP's domain, which is dictated by the characteristics of the network and the traffic load. Option pricing techniques are used to value the service by utilizing the ISP's preferences for different loss outcomes, captured in a state-price density (SPD) [10]. Let l be the end-to-end loss rate of a customer's data, l_{i,contract} be the loss rate of the customer's data in the ith ISP's network, and l^N_{j,contract} be the loss rate of the customer's data at the jth transit node, respectively. If the loss rates l_{i,contract} and l^N_{j,contract} are less than 20%, the second-order terms in their product will be significant only at one lower digit of accuracy, and hence ignorable. If an ISP causes a steady loss rate higher than 20%, it practically does not make much sense to utilize its services for end-to-end loss guarantees. As loss rates are multiplicative, and all l_{i,contract}'s and l^N_{j,contract}'s can safely be assumed to be small, we have

l \approx \sum_{i \in V^N_{contract}} l_{i,contract} + \sum_{j \in E_{contract}} l^N_{j,contract},
where V^N_{contract} ⊆ V^N is the set of ISP contracts and E_{contract} ⊆ E is the set of transit nodes used in delivering a specific end-to-end contract, and V^N, E are all the ISPs and transit nodes in a provider's overlay. The intra-domain contract with each ISP is a guarantee that the loss rate within its domain does not exceed S^u_i, i ∈ V^N_{contract}. If the provider chooses a threshold value S^{u,N} as the "assurance" for loss at the transit nodes, the end-to-end loss guarantee can be approximately defined as

S^u \approx \sum_{i \in V^N_{contract}} S^u_i + S^{u,N}. \qquad (18)
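To illustrate why the additive approximation is safe for the small loss rates assumed here, the following sketch compares the exact end-to-end loss (one minus the product of per-segment delivery rates) with the additive approximation; the loss values are arbitrary.

```python
# Additive approximation of end-to-end loss for small per-segment loss rates.
segment_losses = [0.02, 0.01, 0.03, 0.005]   # illustrative ISP / transit-node loss rates

exact = 1.0
for l in segment_losses:
    exact *= (1.0 - l)          # probability of surviving every segment
exact = 1.0 - exact             # exact end-to-end loss rate

approx = sum(segment_losses)    # the approximation used in the chapter

print(exact, approx)            # ~0.0636 vs 0.065 -- the gap is second order
```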
To obtain the price of a contract, a payoff function Y_i is defined that measures an ISP's performance against the contract definition. For example, the payoff function of a sample contract (A), defined in terms of loss rates starting at t = 3 pm for a duration of one hour, may be given by

Y_i(l_i, S^u_i) = 1_{\{l_i \le S^u_i\}} \, |l_i - S^u_i|, \qquad (19)

where l_i is the loss outcome of the service. Therefore, in general a payoff function Y_i can be created that (1) captures whether the loss outcome is within the contracted
guarantee S^u_i and (2) how much better than S^u_i the provider is able to perform. The price of a contract, V_i, is then determined by the expectation of the payoff under a special probability measure Q, i.e.,

V_i = E_Q(Y_i). \qquad (20)
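A toy computation of (19)–(20) is sketched below; the discretized loss distribution standing in for the probability measure Q (a Beta-shaped density, as hinted by the "beta SPD" figure captions later in this section) and all numerical values are our assumptions.

```python
# Toy evaluation of the option-style contract price V_i = E_Q[Y_i] from (19)-(20).
S_u = 0.05                                   # contracted loss guarantee (5%)
losses = [i / 1000.0 for i in range(201)]    # candidate loss outcomes 0 .. 20%

# Assumed (unnormalized) Beta-like density over loss outcomes: mass at small losses.
weights = [((l + 1e-4) ** 0.5) * ((0.2 - l) ** 4.0) for l in losses]
total = sum(weights)
q = [w / total for w in weights]             # normalize to a probability measure

def payoff(l, s_u):
    """Equation (19): reward |l - S^u| only when the guarantee is met (l <= S^u)."""
    return abs(l - s_u) if l <= s_u else 0.0

V = sum(q_k * payoff(l_k, S_u) for q_k, l_k in zip(q, losses))
print(V)   # the contract price under this assumed loss distribution
```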
The probability measure Q captures the provider's preferences over loss outcomes and is termed the state-price density (SPD). Note that V_i and Y_i are time dependent, just as l_i is. Time is not explicitly indicated due to our assumption of identical time descriptions for all contracts. For the choice of payoff function given in (19), it can be shown that the price for a given l_i process, V_i(S_i^U) = V_i(S_i^U; l_i), has the following properties: (1) V_i(S_i^U) is a convex function, and (2) V_i(0) = 0, and V_i(S_i^U) is non-decreasing with S_i^U up to a certain value S̄_i^U, after which V_i(S_i^U) = 0. These properties are reasonable, since the risk being priced is defined by the outcomes of an ISP's performance against the contract without any account of the effort in providing the service. A lower S_i^U does not imply a greater effort on the provider's part, nor does a higher S_i^U imply less effort. A higher S_i^U, up to S̄_i^U, however, does imply that the provider gets a greater benefit from risk bearing. With an increasing threshold, greater loss outcomes will be in favor of the provider; hence, the provider's performance with respect to the contract specifications becomes better and therefore earns a higher reward. It should be noted that this is only part of the picture. When the provider violates the contract, it also incurs a penalty. Clearly, with an increasing S_i^U the benefits to the provider increase, but the penalty on violation must also simultaneously increase. The penalty determination, however, is beyond the scope of this chapter. Without needing to define a specific form for an ISP's utility function for losses, we stipulate a Q that captures the ISP's preference structure for loss outcomes by some general properties of the ISP's preferences for loss outcomes in its domain: (1) the ISP expects that losses in its domain are rare events during a contract term; (2) if losses do happen, they will more likely take small to moderate values; (3) the ISP will not be rewarded when large losses occur. In particular, we consider ISPs with two types of preference structures and the corresponding intra-domain prices: 1. An ISP has a strict preference for smaller losses over larger losses, i.e., the loss-free state is the most favorable outcome to the ISP. This results in performance-based prices: the price is higher when the network is at low utilization and the provider is capable of performing in better accordance with the contract. 2. We also consider an alternative preference structure where the loss outcome most desirable to the ISP is at a small but positive level. This will be the case if customers of the ISP's services can tolerate certain small losses, as the ISP is possibly able to accommodate more customers by allowing small losses to an individual customer's data. This results in congestion-sensitive prices: the price is higher when
the network is at high utilization, which discourages customers from buying loss assurances when the network is congested. Using the assumptions described above, we develop pricing strategies for end-to-end service with loss assurances. An end-to-end loss-assured service is characterized by its source and destination (s–d pair) location and the loss guarantee, along with other specifications of the service. Consider a contract for service between a certain s–d pair with an end-to-end loss guarantee S^u. In the following, an end-to-end contract (service) refers to a contract thus defined, and end-to-end pricing refers to the pricing of the end-to-end loss assurance, unless otherwise stated. The provider can acquire and concatenate intra-domain contracts in multiple ways to provide a service. Different customers' traffic between an s–d pair may be routed differently within the overlay constructed from acquired contracts; on a given path, the provider may purchase from the same ISP intra-domain contracts with different loss assurances to achieve the end-to-end assurance S^u, as long as (18) holds. The routing information and the ISP contract types used, however, are invisible to customers. For a price, a customer simply expects to receive the same service from a certain contract. Therefore, at a certain time, prices are determined entirely by the specifications of the contract – the s–d pair and the loss guarantee S^u. In particular, if end-to-end services can be created in an oligopolistic competition, i.e., there are alternatives for how end-to-end services can be created, no single provider exerts monopoly power in creating such services, and such services can be created dynamically and efficiently, then the following is true about the price of an end-to-end contract. Proposition 1 Under the assumption that the market for end-to-end services is competitive and efficient, the price of an end-to-end contract is the lowest price over all possible concatenations of intra-domain services that deliver the end-to-end contract, denoted by V*(S^u). Assume that there are R routes to construct a loss-guaranteed service between an s–d pair, where path r (r = 1, . . . , R) involves h_r ISPs. On path r, the provider purchases from each constituent ISP i a service contract with loss guarantee S^u_{i,r} at a price V_i(S^u_{i,r}). The provider assigns a price for the risk of loss at all h_r − 1 transit nodes on path r, using a price function V_N. In addition, assume that the provider assigns a threshold value S^{u,N}_r as an "assurance" for l^N_r, with {S^u_{i,r}, S^{u,N}_r} satisfying the condition of (18). V_N depends on the loss assurance S^{u,N}_r and other characteristics, Θ, of the provider. To solve the end-to-end pricing problem, we first define the price of a path. Definition 1 (Price of a path) The price of path r, V^r(S^u), r = 1, . . . , R, is defined as the price of the end-to-end loss assurance if path r were the only path to deliver the end-to-end contract. The following proposition is obtained by direct application of Proposition 1. Proposition 2 The price of a path r, V^r(S^u), for an end-to-end contract with loss assurance S^u is determined by the least costly concatenation of intra-domain
contracts on path r, to eliminate arbitrage between different concatenations of intra-domain contracts, i.e.,

V^r(S^u) = \min_{\{S^u_{i,r},\, S^{u,N}_r\}} \; \sum_{\text{ISP } i \text{ on path } r} V_i(S^u_{i,r}) + V_N(S^{u,N}_r). \qquad (21)
In our earlier work [12], we developed a categorization-based pricing scheme for end-to-end bandwidth, where prices are determined by the hop count h and the number of bottlenecks (ISPs and/or transit nodes) b on the most likely path between an s–d pair. We will utilize this bh classification for pricing end-to-end loss assurance, so that the two price components, for bandwidth and loss guarantee, are combined consistently to form the price of the contract. In addition, only simple cycle-free paths are considered, since the presence of cycles artificially inflates traffic in a network and thus increases the provider's cost in providing the service. By applying Propositions 1 and 2, the provider's pricing problem to determine V*(S^u) is defined as follows:

Problem 1 (End-to-end pricing problem)

\min_{r \,\in\, bh \text{ class}} \; \min_{\{S^u_{i,r},\, S^{u,N}_r\}} \; \sum_{\text{ISP } i \text{ on path } r} V_i(S^u_{i,r}) + V_N(S^{u,N}_r)

\text{s.t.} \quad \sum_{\text{ISP } i \text{ on path } r} S^u_{i,r} + S^{u,N}_r = S^u, \qquad (22)

S^u_{i,r},\, S^{u,N}_r \ge 0 \quad \forall\, i, r, \text{ ISP } i \text{ on path } r.
This most general formulation of the pricing problem, where each ISP has a unique price function, can be complex to solve. This is especially true since the provider needs to solve a problem of this form for each s–d pair in the overlay. Estimating the price functions Vi (Siu ) of all ISPs also adds to the complexity of the pricing model. Furthermore, prices need to be recomputed when the provider changes the configuration of its overlay. Therefore, simplifying assumptions will be necessary for solving the end-to-end pricing problem. In Section 8.1, we will conduct a numerical simulation analysis of the above pricing problem.
8.1 Numerical Evaluation We consider the price of an end-to-end loss assurance S u between an s–d pair in an overlay with N = 10 ISPs. We will study the solutions to the end-to-end pricing problems with loss-free and loss-prone transit nodes, respectively. For simplicity, we assume that the provider and all ISPs use congestion-based pricing. Instead of a fixed overlay configuration, we consider all possible combinations of the numbers
of ISPs (h_r) and of bottleneck ISPs (b_r) on a path r when h_r ≤ 10. The price functions of the underlying ISPs can be described by quadratic functions [10]: V_0(g) = c g², V_1(g) = c_t c g², c_t > 1. We use representative intra-domain price functions with c = 8667 and c_t = 1.074 (t = 3 pm). c_t indicates the difference between the price functions of congested and non-congested ISPs. The price function for transit nodes is also a quadratic function, V_N(g) = c_N g²; c_N is varied to examine the effect of the price of transit nodes on the end-to-end price. 8.1.1 Solution to Loss-Free Nodes Figure 12 shows V^r with different combinations of h_r and b_r on the path, assuming all transit nodes are loss free. We see that V^r decreases with h_r. At the same time, V^r increases with b_r, the number of bottleneck ISPs on the path, for a given choice of h_r; this is also shown in Fig. 13, which gives the relationship between the price of a path and the number of bottleneck ISPs on the path with h_r = 10. It is further observed that between h_r and b_r, h_r seems to have the more significant effect on V^r: a longer path involving more ISPs is less expensive than a shorter path, regardless of how many bottleneck ISPs are encountered on the path. Therefore, in this case the end-to-end price V* will be determined by the longest path that involves the fewest bottleneck ISPs. This resembles the high-diversification principle of financial portfolio theory. To obtain the end-to-end price V*, the provider will need to (i) search for the longest path between the s–d pair in a bh class and (ii) among such paths, search for the one involving the fewest bottleneck ISPs.
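The qualitative behaviour just described can be reproduced with a small numerical sketch. Assuming that the provider splits the loss guarantee S^u optimally across the ISPs on the path (the closed-form solution of the inner minimization in (21)–(22) for quadratic price functions and loss-free transit nodes), the path price is S^u² divided by the sum of the reciprocal coefficients; the code, the value of S^u, and this optimal-split assumption are ours.

```python
# Path price V^r for quadratic ISP price functions V_i(g) = coeff_i * g**2,
# loss-free transit nodes, and an optimal split of the end-to-end guarantee S_u:
# minimizing sum_i coeff_i * g_i**2 subject to sum_i g_i = S_u gives
# g_i proportional to 1/coeff_i and an optimal value of S_u**2 / sum_i (1/coeff_i).
c, c_t = 8667.0, 1.074          # representative coefficients from the text
S_u = 0.05                      # illustrative end-to-end loss guarantee

def path_price(h_r, b_r):
    """Price of a path with h_r ISPs, of which b_r are congested (bottleneck) ISPs."""
    inv_sum = (h_r - b_r) / c + b_r / (c_t * c)
    return S_u ** 2 / inv_sum

for h in (2, 5, 10):
    print(h, [round(path_price(h, b), 4) for b in range(0, h + 1, max(1, h // 2))])
# Prices fall as h_r grows (diversification) and rise with b_r, matching Figs. 12-13.
```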
8.1.2 Solution to Leaky Nodes We now study the effect of introducing risks at transit nodes on the end-to-end price V*. We assume the provider uses congestion-based pricing for the risk of loss,
Fig. 12 V r with ISP classification (beta SPD)
Fig. 13 V r with # of bottlenecks (beta SPD): loss-free transit nodes, h r = 10
producing a quadratic price function V_N(g; h) = c_{N,h} g², where g and h are the loss assurance and the number of ISPs on the path, respectively. Different price functions are studied by varying the coefficient c_{N,h}; in particular, we study the cases V_N(g; h) > V_1(g), V_1(g) > V_N(g; h) > V_0(g), and V_N(g; h) < V_0(g), ∀h, respectively. c_{N,h} also depends on h; we set c_{N,h} to decrease linearly by 5% for every additional transit node involved. Figure 14 shows V^r with different h_r, b_r combinations, where λ = 0.3 is the constraint on the proportion of the loss assurance that can be assigned to the transit nodes, S^{u,N}_r, and V_N(g; h) < V_1(g) (Fig. 14a) and V_N(g; h) > V_1(g) (Fig. 14b), respectively. The scenario with V_1(g) > V_N(g; h) > V_0(g) looks similar to the loss-free transit node case (Fig. 12) and is not shown here.
Fig. 14 V r with leaky transit nodes (beta SPD, λ = 0.3): (a) V1 (g) > V0 (g) > VN (g; h) and (b) VN (g; h) > V1 (g) > V0 (g)
Comparing Fig. 14 with Fig. 12, we can see that introducing risks at transit nodes decreases the overall price levels for V r , regardless of the relative relations between the price functions, although as expected, between these two scenarios the decrease in V r is more significant with a lower price function VN in Fig. 14a. Similar to the case of loss-free transit nodes, the primary effect on V r is of h r ; that is, the price of a longer path involving more ISPs and transit nodes is always lower. However, the relationship between V r and the number of bottlenecks on the path br becomes
Fig. 15 V r with # of bottlenecks (beta SPD): leaky transit nodes, h r = 10
irregular, as is also seen in Fig. 15. Therefore, to obtain the exact end-to-end price, the provider will need to search for all the longest paths in a bh class and choose V ∗ as the price of the least expensive path. In all scenarios considered above, the longest paths involving more ISP contracts will be preferred in constructing the end-to-end loss assurance. This resembles the diversification technique to reduce risk in risk management [4]. The provider benefits from allocating risk in the end-to-end loss assurance to more ISP domains.
9 Summary In this chapter, we describe a dynamic congestion-sensitive pricing framework that is easy to implement and yet provides great flexibility in rate allocation. The distributed-DCC framework presented here can be used to provide short-term contracts between the user and the service provider in a single diff-serv domain, which allows the flexibility of advertising dynamic prices. We observe that distributed-DCC can attain a wide range of fairness metrics through the effective use of the edge-to-edge pricing (EEP) scheme that we provide. We also introduce two broad pricing architectures based on the nature of the relationship between the pricing and congestion control mechanisms: pricing for congestion control (PFCC) and pricing over congestion control (POCC). We show how the distributed-DCC framework can be adapted to these two architectures and compare the resulting approaches through simulation. Our results demonstrate that POCC is better in terms of managing congestion in the network core, while PFCC achieves a wider range of fairness types in rate allocation. Since distributed-DCC is an edge-based scheme, it does not require upgrading all routers of the network, and thus existing tunneling techniques can be used to implement edge-to-edge closed-loop flows. An incremental deployment of distributed-DCC is possible by initially installing two distributed-DCC edge routers, followed by replacement of the others over time.
We also describe a framework for pricing loss guarantees in an end-to-end bandwidth service, constructed as an overlay of edge-to-edge services from many distributed-DCC domains. An option-based pricing approach is developed for end-to-end loss guarantees over and above the price for the basic bandwidth service. This provides a risk-sharing mechanism in the end-to-end service delivery. The price of the end-to-end contract is determined by the lowest price over all valid intra-domain contract concatenations. Based on certain simplifying homogeneity assumptions about the available intra-domain contracts, numerical studies show the importance of diversification in the path chosen for end-to-end service. Acknowledgments This work is supported in part by National Science Foundation awards 0721600, 0721609, and 0627039. The authors wish to thank Shivkumar Kalyanaraman for his mentoring and Lingyi Zhang for excellent research assistance.
References
1. Arora GS, Yuksel M, Kalyanaraman S, Ravichandran T, Gupta A (2002) Price discovery at network edges. In: Proceedings of the international symposium on performance evaluation of telecommunication systems (SPECTS), San Diego, CA, pp 395–402
2. Blake S et al (1998) An architecture for differentiated services. IETF RFC 2475, December 1998
3. Bouch A, Sasse MA (2001) Why value is everything?: a user-centered approach to Internet quality of service and pricing. In: Proceedings of IEEE/IFIP IWQoS, Karlsruhe, Germany
4. Chiu DM (1999) Some observations on fairness of bandwidth sharing. Tech. Rep. TR-99-80, Sun Microsystems Labs
5. Clark D (1997) Internet cost allocation and pricing. In: McKnight LW, Bailey JP (eds) MIT Press, Cambridge, MA
6. Clark D (1995) A model for cost allocation and pricing in the Internet. Tech. Rep., MIT Press, Cambridge, MA
7. Clark DD, Wroclawski J, Sollins KR, Braden R (2005) Tussle in cyberspace: defining tomorrow's Internet. IEEE/ACM Trans Netw 13(3):462–475
8. Cocchi R, Shenker S, Estrin D, Zhang L (1993) Pricing in computer networks: motivation, formulation and example. IEEE/ACM Trans Netw 1:614–627
9. Crouhy M, Galai D, Mark R (2001) Risk management. McGraw-Hill, New York, NY
10. Gupta A, Kalyanaraman S, Zhang L (2006) Pricing of risk for loss guaranteed intra-domain Internet service contracts. Comput Netw 50:2787–2804
11. Gupta A, Stahl DO, Whinston AB (1997) Priority pricing of integrated services networks. In: McKnight LW, Bailey JP (eds) MIT Press, Cambridge, MA
12. Gupta A, Zhang L (2008) Pricing for end-to-end assured bandwidth services. Int J Inform Technol Decision Making 7(2):361–389
13. Harrison D, Kalyanaraman S, Ramakrishnan S (2001) Overlay bandwidth services: basic framework and edge-to-edge closed-loop building block. In: Poster in SIGCOMM, San Diego, CA
14. Kelly FP (1997) Charging and rate control for elastic traffic. Eur Trans Telecommun 8:33–37
15. Kelly FP, Maulloo AK, Tan DKH (1998) Rate control in communication networks: shadow prices, proportional fairness and stability. J Oper Res Soc 49:237–252
16. Kunniyur S, Srikant R (2000) End-to-end congestion control: utility functions, random losses and ECN marks. In: Proceedings of conference on computer communications (INFOCOM), Tel Aviv, Israel
Dynamic Overlay Single-Domain Contracting
223
17. Low SH, Lapsley DE (1999) Optimization flow control – I: basic algorithm and convergence. IEEE/ACM Trans Netw 7(6):861–875 18. MacKie-Mason JK, Varian HR (1995) Pricing the Internet. In: Public Access to the Internet, Kahin B, Keller J (eds), Cambridge, MA: MIT Press, 269–314 19. Mo J, Walrand J (2000) Fair end-to-end window-based congestion control. IEEE/ACM Trans Netw 8(5):556–567 20. Odlyzko AM (1997) A modest proposal for preventing Internet congestion. Tech. Rep., AT & T Labs 21. Odlyzko AM (1998), The economics of the Internet: utility, utilization, pricing, and quality of service. Tech. Rep., AT & T Labs 22. Odlyzko AM (2000) Internet pricing and history of communications. Tech. Rep., AT & T Labs 23. Paschalidis IC, Tsitsiklis JN (2000) Congestion-dependent pricing of network services. IEEE/ACM Trans Netw 8(2):171–184 24. Semret N, Liao RR-F, Campbell AT, Lazar AA (1999) Market pricing of differentiated Internet services. In: Proceedings of IEEE/IFIP international workshop on quality of service (IWQoS), London, England, pp 184–193 25. Semret N, Liao RR-F, Campbell AT, Lazar AA (2000) Pricing, provisioning and peering: dynamic markets for differentiated internet services and implications for network interconnections. IEEE J Select Areas Commun 18(12): 2499–2513 26. Shenker S (1995) Fundamental design issues for the future Internet. IEEE J Select Areas Commun 13:1176–1188 27. Singh R, Yuksel M, Kalyanaraman S, Ravichandran T (2000) A comparative evaluation of Internet pricing models: Smart market and dynamic capacity contracting. In: Proceedings of workshop on information technologies and systems (WITS), Queensland, Australia 28. Teitelbaum B, Shalunov S (2003) What QoS research hasn’t understood about risk. In: Proceedings of the ACM SIGCOMM 2003 workshops 148–150, Karlsruhe, Germany 29. UCB/LBLN/VINT network simulator – ns (version 2) (1997) http://wwwmash.cs.berkeley.edu/ns 30. Varian HR (1999) Estimating the demand for bandwidth. In: MIT/Tufts Internet Service Quality Economics Workshop, Cambridge, MA 31. Varian HR (1999) Intermediate microeconomics: a modern approach. W. W. Norton and Company, New York, NY 32. Wang X, Schulzrinne H (2000) An integrated resource negotiation, pricing, and QoS adaptation framework for multimedia applications. IEEE J Select Areas Commun 18(12):2514–2529 33. Wang X, Schulzrinne H (2001) Pricing network resources for adaptive applications in a differentiated services network. In: Proceedings of INFOCOM, Shanghai, China, pp 943–952 34. Yuksel M (2002) Architectures for congestion-sensitive pricing of network services. PhD thesis, Rensselaer Polytechnic Institute, Troy, NY 35. Yuksel M, Kalyanaraman S (2002) A strategy for implementing the Smart Market pricing scheme on Diff-Serv. In: Proceedings of IEEE GLOBECOM, pages 1430–1434, Taipei, Taiwan. 36. Yuksel M, Kalyanaraman S (2003) Elasticity considerations for optimal pricing of networks. In: Proceedings of IEEE symposium on computer communications (ISCC), Antalya, Turkey, pp 163–168 37. Yuksel M, Kalyanaraman S (2005) Effect of pricing intervals on congestion-sensitivity of network service prices. Telecommun Syst 28(1):79–99
Modelling a Grid Market Economy
Fernando Martínez Ortuño, Uli Harder, and Peter Harrison
1 Introduction It has for some time been widely believed that computing services will, in due course, be provided in much the same way as telephone, electricity and other utilities are today. As a result, a market will develop with different Grid companies competing and cooperating to serve customers; there are already signs that this is beginning to happen. Companies will have to set prices to attract customers and make profits. Governments that choose to provide Grid services will similarly have to set prices, perhaps with a different aim, such as maximizing utilization. This has been highlighted as a key area of research in the EU report on next-generation Grids. The development of the Grid has come to a stage where companies are beginning to sell Grid services. For example, both IBM and SUN are selling Grid access by the hour, so a market already exists where companies compete for Grid customers. It will probably not take long until we see brokers (also called middlemen or suppliers to end-users in different environments) re-selling Grid services they have bought in bulk from providers. This would constitute the usual 3-tier set-up of markets seen, for instance, in the electricity market, where we have generators, suppliers and users. For the companies involved, who may wish to provide Grid access as an efficient alternative to buying computers, it is important to have a market model to be able to
test various pricing schemes. This will become even more evident once futures and options for Grid access are sold. Future customers of this Grid market are expected to be banks and insurance, engineering and games companies, as well as universities and other research organisations. Although a great deal of effort has been aimed at modelling financial markets, much less is known about modelling commodity markets. Our aim has been to produce a high-level market model for Grid computing and a future Grid-based market economy, rather than the traditional use of economic methods for the micromanagement of future Grids. The approach bears some similarities and analogies to current structures in place for the airline, electricity and telephone industries, but is specifically focused on the needs and characteristics unique to the Grid. We mainly discuss and elaborate on results that the authors have already published in workshops [16, 17, 21], adding new material to make the paper coherent and up to date. First, we review the use of peer-to-peer (P2P) technology for commercial Grid computing. Using economic incentives to facilitate fair use of computing power in a group is not new. Greenberger [15] discusses this as far back as 1966. He mentions various ways and methods to use economic ideas to make queueing systems fairer. In 1968 Sutherland describes how access to a PDP-1¹ was organised at Harvard using an auction-based system [31]. In a very simple auction, users bid for exclusive access to the PDP-1. Their budgets were replenished every day and left-overs could not be saved. The more senior the position held in the department, the larger the individual's budget. This system allowed fair access by not excluding anyone and also maximised the utilisation of the system by refilling the budgets at the end of each day. Later, for instance, Nielsen [23] and Cotton [9] discuss shared access to resources in a multi-user environment. In general there are two main reasons to charge for computer access:
• Fair access: Using real or fake money a group organises access to facilities by means of charging for their use.
• Profit: A company essentially hires out facilities to third parties for real money.
The first case is basically what Sutherland describes in a very simple setting where only one user has access at a time. The money does not need to be real in this case but can be tokens. Priorities can be implemented by giving different groups of people different budgets. In this case micromanagement of resources might also make sense. The second case is actually not too different from mobile phone use, for example. A company is unlikely to charge users for detailed use of facilities but rather for accessing services over a certain period of time. The concept of the Grid [14] renewed interest in using economic incentives to manage computer systems. The idea of the Grid is to provide access to computing power in the same way as electricity is offered to end-users. Authors have suggested
¹ An 18-bit computer made by Digital Equipment Corporation in the early 1960s, http://en.wikipedia.org/wiki/PDP-1
the use of economics in micro- and macro-management, and the methods tend to be either auctions or commodity markets; see [36], Regev and Nisan [26], Waldspurger et al. [32] and Buyya [6]. One of the major problems with any kind of scheduling or access-providing infrastructure is to avoid centralised services, since they are likely to become a bottleneck for a large number of users or transactions. An example of a centralised system is Tycoon [19], which has a central service locator and has experienced this problem. There is therefore now a trend to try to build future Grid architectures using P2P networks, which are de-centralised. Examples are the P-Grid, which is mainly a data Grid [1], and more generally Catallaxy [2]. Another approach is middleware for activating the global open Grid (MAGOG) [8, 27], which will be our main focus in this chapter. We present three different ways to model some aspects of the simplified MAGOG system. First, we describe an analytical model, which is essentially a mean field approximation of the system. Next we show how an agent-based simulation can be used to investigate the system. Agent-based simulations have in the past been used to model privatised electricity markets [3] and auctions like the Marseille fish market [18]. In fact our market of computing power is very similar to that of perishable commodities like fish, as it is difficult to store. In our simulation model, the agents have only limited knowledge of the entire system and, rather than being globally optimising agents, they are satisficing agents with bounded rationality [5, 30]. We find that the mean field approximation and simulation show good agreement. Lastly, we use Markov decision processes (MDP), introduced by Bellman, to investigate decision strategies and market behaviour; for a good introduction to the topic see [25]. We next present a short review of P2P networks, together with some related, basic graph theory, and then describe the salient features of MAGOG before introducing our simplifications for their modelling.
1.1 P2P Networks and Graphs In recent years P2P networks have become very popular for disseminating content amongst users. In particular, there are networks that do not need a central server, like Freenet. One of the most popular applications of P2P networking is Skype, which allows video calls via the Internet. Some of Skype's services have to be paid for, which requires minimal central bookkeeping. A very good review of P2P networking is [20]. Essentially P2P networks are about resource sharing, and in the next section we show how this can be extended to resources other than storage to allow Grid computing via P2P technology. Another interesting and important aspect of P2P networks is the way their nodes are connected. Graph theory is a suitable means to describe this precisely. A graph consists of nodes (vertices) which are connected by links (edges). Nodes can be characterised by their degree, which is the number of links they have with other nodes. In the case of a directed graph, where links have a sense of direction, one distinguishes between the in-degree and out-degree of a node. Graphs can either be connected
or unconnected, depending on whether there is a continuous path from each node to all other nodes. To compare different graphs, one can, for instance, measure their node degree distribution. Other measures include the diameter, which is the longest of all shortest paths between all pairs of nodes in the network. This gives an indication of the size of the network. There is also the clustering coefficient, which is roughly the ratio of the number of existing links between nodes to the theoretical maximum number of links. The two types of graphs we have used in this study are the Erdős–Rényi (ER) graphs [12] and Barabási–Albert (BA) [4] graphs. ER graphs are "random" and so are characterised by the fact that their node degree is a binomial random variable n, which tends to Poisson in large graphs:
P(n = k) ≈ K^k e^{−K} / k!,
where K is the average node degree. In contrast, the node degrees in a BA graph have a power law distribution: P(n = k) ≈ (k + c)^{−α}, where c is a constant. The power law is often referred to as scale free. BA graphs are small world graphs as their nodes are highly clustered. Small world networks were first classified by Watts and Strogatz [35] and a fairly concise review of complex networks can be found in [13]. The P2P network structure of the overlay network has been studied for Freenet, which is scale free [24], and for Gnutella, which is small world and scale free [34]. The evidence that scale-free networks are in common use and that BA graphs reflect this property well is the reason we have chosen BA networks in our simulations, in addition to ER graphs.
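The simulations later in this chapter build their overlay networks with the igraph package; the following is a minimal sketch, using the Python bindings of igraph (graph sizes and parameters are purely illustrative), of how ER and BA graphs can be generated and the measures just described computed.

```python
import igraph

N = 1000                                          # number of nodes (illustrative)
er = igraph.Graph.Erdos_Renyi(n=N, p=6.0 / N)     # ER graph, average degree about 6
ba = igraph.Graph.Barabasi(n=N, m=3)              # BA graph, power-law node degrees

for name, g in [("ER", er), ("BA", ba)]:
    print(name,
          "mean degree:", 2.0 * g.ecount() / N,   # each edge contributes to two degrees
          "diameter:", g.diameter(),
          "clustering:", round(g.transitivity_undirected(), 4))
```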
1.2 MAGOG The MAGOG system has not yet been implemented, and some of the details of its architecture will be omitted as they are not important for our investigation. In fact one can look at MAGOG as an example application of our analysis, which is abstract enough to cover other architectures with the same general structure. The architecture of MAGOG was first described in [27]. The global open Grid (GoG) is meant to be a P2P network of nodes that provide (seller) or require (buyer) computing services. Nodes can be any computing device that can install the MAGOG software. The inventors of MAGOG believe this will be possible on a large range of devices, from smartphones to supercomputers. It is envisaged that nodes send out messages (called Ask or Bid bees) over the network to advertise that they either need or can supply computing services at a certain price; this is referred to as double message flooding. Using a micropayment system, nodes will
be able to exchange real money for services. Each node has a list of nearest neighbours that will be seeded in an appropriate way when they (re-)join the system; therefore, nodes only have local knowledge of the system. Apart from the payment process there is no centralised point in the system. Therefore, the system should scale extremely well. Each node passes on messages from its neighbours to its neighbours until a counter attached to the message, the time to live (TTL), has been exceeded, at which point the message is discarded. This way messages can only travel a certain number of hops. Nodes also keep a copy of messages in a buffer (the pub) and try to match up suitable needs or services advertised in the messages. If a match can be made, the node stops forwarding a matched message. Similarly, a message advertising services or needs that have already been matched is discarded. The combination of pubs and double message flooding allows matches to be found between nodes up to twice the number of TTL hops apart. The design of MAGOG was inspired by Hayek's model of market economies, Catallaxy [7], which contrasts with the model of Walras [31] that assumes global knowledge of all agents in the market.
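As an illustration of the forwarding-and-matching behaviour just described, the following is a minimal sketch (class names and the matching rule are our own; duplicate suppression, pub-size limits and the payment step are omitted) of how a node might store messages in its pub, match opposite asks and bids, and forward unmatched messages while the TTL lasts.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    kind: str        # "ask" (offer of service) or "bid" (request for service)
    price: float
    origin: int      # id of the issuing node
    ttl: int         # remaining hops

@dataclass
class Node:
    node_id: int
    neighbours: list
    pub: list = field(default_factory=list)      # local buffer of unmatched messages

    def receive(self, msg, network):
        # try to match the incoming message against a stored one of the opposite kind
        for other in self.pub:
            if other.kind != msg.kind:
                bid, ask = (msg, other) if msg.kind == "bid" else (other, msg)
                if bid.price >= ask.price:
                    self.pub.remove(other)
                    return (bid.origin, ask.origin)     # deal found; stop forwarding
        self.pub.append(msg)
        if msg.ttl > 0:                                 # otherwise the message dies here
            for n in self.neighbours:
                network[n].receive(
                    Message(msg.kind, msg.price, msg.origin, msg.ttl - 1), network)
        return None
```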
2 Abstract MAGOG To be able to model MAGOG, we need a simplified abstraction of the system and so we specify the connections between and characteristics of (buying or selling) nodes in the system. In our initial model, the number of buyer and seller nodes is constant and nodes cannot change from being a buyer to a seller and vice versa. The nodes use their links bi-directionally for message transfer. The list of nearest neighbours is simply given by the nearest neighbours of the nodes of the chosen network. The pub of each node is finite and the older messages are discarded when the node runs out of pub memory. Buyer nodes have a finite budget which gets replenished after B time epochs. This stops the exponential price increases seen in [16]. There is also a smallest currency unit that can be used to pay for services, which prevents infinitesimally small prices. Nodes can only buy or sell one resource at a time. After a preset time, matched up nodes go back to an unsatisfied state. As in the real MAGOG, messages have a TTL and get discarded once they have hopped between TTL nodes, as do messages which already have been satisfied. We use a call auction algorithm for matching the orders in the pubs, similar to the one used to set the opening and closing prices in the stock market [11], as an approximation to the continuous trading that would take place in a real MAGOG deployment. All new unmatched messages are then forwarded to the nearest neighbours. Once matches have been found, nodes stay in the satisfied state for a while and after that start sending out messages advertising needs or services. In our simplified model the agents behave in a very simple way to change prices:
• Buyers will try to acquire the same service for less the next time and therefore bid p_b = (1 − Δ) p_b, with 0 < Δ < 1.
• Similarly, sellers will try to sell the same resource for more the next time and ask p_s = (1 + Δ) p_s, with 0 < Δ < 1.
Another source of price variation is starvation, when sellers or buyers cannot find a match. In this case, sellers reduce their price and buyers increase their bids according to the same ratios in the equations above after TTL epochs. The bid prices buyers can make cannot exceed their remaining budget and if a buyer runs out of money, he stops bidding until his funds have been replenished. Nodes that currently have a match continue to operate as usual with respect to forwarding messages and finding matches for other nodes.
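A minimal sketch of this price-update rule, covering both the matched case and the starvation case described above (function and parameter names are illustrative):

```python
def update_price(role, price, delta, matched):
    """One price-update step for a node.
    role: "buyer" or "seller"; matched: True if the node found a deal this round."""
    if role == "buyer":
        # a satisfied buyer tries to pay less next time; a starved buyer raises its bid
        return price * (1 - delta) if matched else price * (1 + delta)
    # a satisfied seller asks for more next time; a starved seller lowers its ask
    return price * (1 + delta) if matched else price * (1 - delta)
```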
2.1 An Analytical Approximation If we assume that the P2P network is fully connected, we can approximate the system as a single central market place, where the orders of all nodes get together and match according to price constraints and supply/demand. Under these conditions, the average price of all deals in the system after n time steps is given by
P(n) = P0 (δ+^b δ−^{1−b})^n,     (1)
where P0 is the initial price and δ± = 1 ± Δ. The price change Δ ranges between 0 and 1. The percentage of buyers b ranges from 0 to 1 inclusively. In order to derive expression (1), it is assumed that time is synchronous and discrete for all nodes in the system. At every time step, all nodes submit their orders for buying and selling with their respective bid and ask prices. All orders of a certain time step meet at the central market place, where deals are made. A deal is made between a buyer and a seller if the bid price of the buyer is greater than or equal to the ask price of the seller. The deal price for these two market participants is always the average between their ask and bid prices. The nodes that have found a deal in a certain time step change their ask/bid price in their own interest on the next time step: the bid price of the buyer for the next time step will be the one he has used in the present time step multiplied by δ−; the ask price of the seller for the next time step will be the one he has used in the present time step multiplied by δ+. The average deal price in the system at a certain time step is the average of all deal prices in that time step. Some nodes will not be able to find a deal in a particular time step. This may happen because their prices do not match or because there is a shortage of supply or demand (respectively for buyers and sellers). In this case, the unsatisfied nodes will change their respective bid/ask prices for the next time step against their own interest: the bid price of the buyer for the next time step will be the one he has used in the present time step multiplied by δ+; the ask price of the seller for the next time step will be the one he has used in the present time step multiplied by δ−.
We further assume that initially (first time step: n = 0) all buyers are bidding P0 and all sellers are asking P0. Following the evolution of time as explained above, one arrives at the expression for the average deal price in the system given by (1). Therefore, the price development in the system depends on the expression in the bracket of (1), which we define as F:
F(Δ, b) = δ+^b δ−^{1−b}.     (2)
Depending on the value of F, the deal price in the system will evolve towards one out of three possible values:
P(n) → ∞ if F > 1,   P(n) → P0 if F = 1,   P(n) → 0 if F < 1.     (3)
We can now calculate the critical value of b(Δ) for which F = 1:
b(Δ) = log(1 − Δ)/(log(1 − Δ) − log(1 + Δ)).     (4)
Equation (4) is plotted in Fig. 1 and represents the combinations of values of b and Δ (which make F = 1) for which the deal price in the system will evolve towards P0. A combination of b and Δ situated above the line plotted in Fig. 1 (F > 1) will make the deal price evolve towards infinity, whereas a pair of b and Δ situated below the line (F < 1) will make the deal price evolve towards 0. As one can appreciate from Fig. 1, the percentage of buyers has to be greater than 0.5 to achieve a stable price in this simple model, independently of the value of Δ.
Fig. 1 The percentage of buyers b for a stable deal price in the analytic solution
In order to have a realistic market model, both a simulation and a real system should avoid prices that evolve towards extreme values (0 and infinity), achieving a bounded price fluctuation with a certain stability. This may be achieved by a
dynamic system where the values of the price change Δ and the percentage of buyers b can vary.
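A short sketch of the quantities in (2) and (4) (variable names are ours); it reproduces, for instance, the fact that the critical percentage of buyers stays above 0.5 for every Δ:

```python
import math

def F(delta, b):
    """Bracketed factor of Eq. (1): F(delta, b) = (1 + delta)^b * (1 - delta)^(1 - b)."""
    return (1 + delta) ** b * (1 - delta) ** (1 - b)

def critical_b(delta):
    """Percentage of buyers for which F = 1, Eq. (4)."""
    return math.log(1 - delta) / (math.log(1 - delta) - math.log(1 + delta))

print(critical_b(0.1), critical_b(0.4), critical_b(0.9))   # all values lie above 0.5
print(F(0.4, critical_b(0.4)))                              # equals 1 up to rounding
```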
2.2 The Simulation Multi-agent simulations have become a fairly standard technique for analysing market behaviour. This method has been used in Arthur's influential paper [5] and later by, for instance, Nagel et al. for the investigation of the influence of timescale on the formation of markets [22]. It has also been used by Rosvall and Sneppen to see how information moves in a network with agents that only have local concepts of the network structure [28]. For our simulation we create the P2P overlay network for our MAGOG model with the igraph package [10]. We use ER and BA graphs to see whether different types of graphs have any influence on the behaviour of the system. Nodes are then randomly assigned to be either buyers or sellers with the specified probabilities. During the simulation we successively pick nodes randomly (without replacement) and update the state by dealing with the incoming and outgoing messages, trying to find matches for deals. For a simulation with N nodes one time epoch has elapsed after picking N nodes. In addition to the properties described earlier, the nodes in our simulation adapt their choice of Δ over time and are also allowed to hibernate rather than look for a new deal if they estimate that the percentage gain is disadvantageous to their own interests. Each node uses the messages it forwards to guess the percentage of buyers. If there is an excess of sellers, a seller node will hibernate rather than try to find a new buyer when its deal has come to an end. It will re-enter the market after a certain time called the hibernation period. Buyers act in the same way when the percentage indicates there are too many buyers in the market. This does not affect the node's role in forwarding messages and finding matches for deals in its pub. In a similar way, the nodes use the prices they see from deals they facilitate to find a new value of Δ for the price change. All types of nodes increase their individual Δ when the deal price in their pubs increases, and they decrease their Δ when the deal price in their pubs decreases. In particular, when the increase in the deal price in the pub of a node is lower than 30%, the node increases its Δ by 0.1, i.e. Δ = Δ + 0.1. When the increase in the deal price in the pub is between 30 and 60%, the node increases its Δ by 0.2, i.e. Δ = Δ + 0.2. If the increase in the deal price in the pub is greater than or equal to 60%, the node increases its Δ by 0.3, i.e. Δ = Δ + 0.3. In all cases, the individual Δ of the node is forced to remain lower than 1. Similarly, the node decreases its Δ by 0.1 when the decrease in the deal price of its pub is lower than 30%, i.e. Δ = Δ − 0.1; the node decreases its Δ by 0.2 when the reduction of the deal price is between 30 and 60%; and the node decreases its Δ by 0.3 when the decrease in the deal price in its pub is greater than or equal to 60%. In all cases, the individual Δ of the node is forced to remain greater than 0. This behaviour is based on the observation of Fig. 1 for a fixed percentage of buyers, i.e. tracing a horizontal line. For any horizontal line in Fig. 1, a larger Δ will
push the system towards the area where the price tends to 0, whereas a smaller Δ will push the system towards the area where the price tends to infinity. The hibernation mechanism implemented by the nodes in the simulation is a selfish behaviour that contributes to price stability, whereas the change of Δ that the nodes implement is a neutral behaviour that also contributes to price stability.
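The Δ-adaptation and hibernation rules just described can be sketched as follows (the clamping constants and the b = 0.5 hibernation threshold, used again in Section 2.4, are written out explicitly; the names are ours):

```python
def adapt_delta(delta, price_change_pct):
    """Adjust a node's individual Delta from the percentage change of the deal price
    seen in its pub (positive = price went up, negative = price went down)."""
    magnitude = abs(price_change_pct)
    step = 0.1 if magnitude < 30 else 0.2 if magnitude < 60 else 0.3
    if price_change_pct >= 0:
        return min(delta + step, 0.99)   # Delta must stay below 1 (clamp illustrative)
    return max(delta - step, 0.01)       # Delta must stay above 0 (clamp illustrative)

def should_hibernate(role, estimated_b):
    """Selfish hibernation rule: step out of the market when one's own side is in excess."""
    return estimated_b >= 0.5 if role == "buyer" else estimated_b < 0.5
```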
2.3 Comparison of the Analytical and Simulation Models We compared the simulation and analytical approximation for a variety of percentages of buyers. Here we present one example that demonstrates how closely the simulation coincides with the analytical approximation. We choose a mixture of 60% buyers and 40% sellers. This turns (1) into
P(n) = P0 [F(Δ, 0.6)]^n.     (5)
When we make n go to infinity, (5) provides the final price according to the theoretical model. This final price will depend on the value of F(Δ, 0.6) and, specifically, it will depend on whether F(Δ, 0.6) is equal to, greater than or lower than 1. We can see graphically the value of F(Δ, 0.6) for different values of Δ by plotting F(Δ, 0.6) as a function of Δ. For a clearer view, we decided to plot F^{10}(Δ, 0.6) instead of F(Δ, 0.6), since both expressions are greater than, equal to or lower than 1 for the same values of Δ. This is shown in Fig. 2.
Fig. 2 Plot of F^{10}(Δ, 0.6)
From Fig. 2 we can see that for Δ = 0.3894 the initial price will not change as F = 1, for Δ < 0.3894 the price will tend to infinity (since F > 1) and for Δ > 0.3894 the price will tend to zero (because F < 1). We compared this theoretical result against a simulation using a BA graph with 16,384 nodes and show the results in Table 1. The simulation achieves a large price, bounded only by the limit on the budgets and the time the simulation has run (and
therefore equivalent to the theoretical final price of infinity), for Δ ≤ 0.4; and a price of 1.0, the minimum asking price in the simulation (and therefore equivalent to the theoretical final price of 0), for Δ ≥ 0.5. The value of Δ for which the final price in the simulations reverts from tending to infinity to tending to zero is between 0.415 and 0.425. This fits fairly well with the prediction from the analytical model, taking into account the simplifications made to derive the latter. We also note that the final price in the simulations follows a similar pattern to the value of F^{10}(Δ, 0.6) (the greater the value of F^{10}(Δ, 0.6), the faster the final price in the simulations approaches infinity), up until Δ = 0.3, as can be seen from Table 1. In summary, the results from the analytical model and the simulation are not identical but are very similar.
2.4 A Further Comparison with the Analytic Model Inspired by Fig. 1 we investigate whether the system evolves to the line of stability if it has the chance to adjust the Δ for each node separately, and the nodes can go into hibernation, both by using local knowledge. We first note that the price evolution of the system² appears to be as stable as one could expect, as depicted in Fig. 3. This system was started off with a percentage of buyers b = 0.5 and all nodes started with Δ = 0.7. Buyer nodes went into hibernation for 150 epochs when they estimated b ≥ 0.5; sellers hibernated for the same length if they estimated b < 0.5. They estimated this percentage by looking at the messages they had forwarded from their pubs in each epoch. This ratio was continually recalculated every epoch by averaging the information of the new epoch with that of the previous epochs, up to the past 10. After 10 epochs had elapsed, all stored information about the ratio was erased and the cycle started again for the next 10 epochs. We have run simulations of this system with different values of the initial b and Δ, and all systems that achieved stable deal prices ended up having b = 0.79 and an average Δ of 0.8. As can be seen in Fig. 4 this is in line with the prediction of a stable model by the analytical approximation. We observed that initial combinations of b and Δ close to the F = 1 line in Fig. 4 lead to a stable price. Further away from the line the price ends up being 1 (limited by the minimum ask price) or very large (limited by the finite budget). This instability also arises with initial points of b > 0.8. It appears that these initial conditions are too extreme to be accommodated by any length of hibernation time or any Δ. The choice of the hibernation length seems to influence how far the initial condition can be chosen away from the stable line and still result in a stable system. In Fig. 4 we show a number of hibernation times for several initial choices of b and Δ that make the system achieve a final stable price. The hibernation time increases for initial points that are below the line for a stable deal
² In this model the TTL is 7 and the nodes' pubs have a size of 100. Buyers and sellers start bidding/asking with an initial price of 1200. Buyers have a limited budget of 10,000, which is re-filled after they have been picked 10 times. A node that has been picked four times re-enters the market. The diameter of the 512-node BA network is 9.
Table 1 Comparison of the simulation results with the analytical model for different values of Δ
Δ         F^{10}(Δ, 0.6)   vs. 1   Theoretical FP   FP in the simulations
0.01      1.0197           >1      ∞                3032.53 ± 11.37
0.05      1.0915           >1      ∞                31,197.39 ± 241.79
0.1       1.1623           >1      ∞                35,357.35 ± 211.37
0.2       1.2231           >1      ∞                37,684.15 ± 346.19
0.3       1.1589           >1      ∞                36,621.40 ± 521.83
0.4       0.9758           <1      0                5973.94 ± 681.17
0.5       0.7119           <1      0                1.00 ± 0.01
0.6       0.4295           <1      0                1.00 ± 0.00
0.7       0.1955           <1      0                1.00 ± 0.01
0.8       0.0544           <1      0                1.00 ± 0.00
0.9       0.0047           <1      0                1.00 ± 0.00
0.99      6.2104×10^−7     <1      0                1.00 ± 0.00
0.99999   6.3998×10^−19    <1      0                1.00 ± 0.00
FP stands for final price. The final price in the simulations is taken at epoch 3000. In the column of the final price in the simulations, the number on the left is the average of the final price in the simulations for 10 different seeds and the number on the right is two times the standard deviation for that sample.
price, and the further away from the line these points are, the larger the hibernation time is. The hibernation time decreases for points above the line and it gets even lower as the initial points move further up from the line. We think this is due to the fact that longer hibernation excludes too many sellers from the market, driving up the price; therefore, a higher hibernation time is needed for initial combinations of b and Δ that cause a sharper decrease in price according to the results of the analytical model.
Fig. 3 Average deal price evolution in the system for an initial 50% of buyers and an initial Δ = 0.7
2.5 Response Time Distributions An important metric of a computer system from a user’s perspective is its response time. In our system this could be defined as the time that elapses between sending
Fig. 4 Hibernation time that makes the dynamic simulation go to a final stable deal price, for several initial conditions of b and Δ and a BA network of 512 nodes. Each point with a hibernation time in the figure was run for 10 different seeds, all of them achieving a final stable price
out a message advertising or requesting services to finding a matching partner. By analysing a long-running simulation we present the response time distribution for sellers and buyers in Fig. 5. The simulation used in this case has the same parameters as the one plotted in Fig. 3, with the only exception of having an initial Δ = 0.5. We have looked at highly connected and less connected nodes and found that the response time distributions look very similar; rather than on the degree distribution, they appear to depend on the type of the node. In Fig. 5, we show typical distributions for both a highly connected and a less connected seller node and a highly connected and a less connected buyer node. There are 80% buyers and hence they have a longer mean response time. The response times appear to be more related to the type of the node, i.e. whether it is a buyer or seller node, rather than to its level of connectivity. In the tail, all of the response time distributions appear to approach a normal distribution. It is interesting that the response times do not reflect the power law of the node degree distribution at all. We would need to investigate in more detail what is causing this, as it suggests that the network topology is not important for this aspect of the system.
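The comparison shown in Fig. 5, a normalised histogram of measured response times overlaid with a normal density of the same mean and standard deviation, can be reproduced along the following lines (the synthetic sample below merely stands in for times collected from a simulation run):

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# placeholder sample; in practice these would be waiting times recorded in the simulation
response_times = np.random.gamma(shape=3.0, scale=9.0, size=5000)

mu, sigma = response_times.mean(), response_times.std()
x = np.linspace(response_times.min(), response_times.max(), 200)

plt.hist(response_times, bins=50, density=True, histtype="step", label="simulation")
plt.plot(x, stats.norm.pdf(x, mu, sigma), label="normal, same mean and std")
plt.xlabel("Waiting time in epochs to get service")
plt.ylabel("Probability density")
plt.legend()
plt.show()
```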
2.6 Performance of the P2P Network Compared to a Centralised System One of the main selling points of MAGOG, as with most P2P systems, is that it scales better than a centralised system. In order to provide a fair comparison of
Fig. 5 Normalised histograms for the response time of buyers (left) and sellers (right); highly connected (top) and less connected (bottom). The dotted lines are the real histograms, whereas the continuous lines are normal probability density functions of the same mean and standard deviation as the real distributions. The mean and standard deviation of each case are μ = 26.78 and σ = 15.93 (highly connected buyers); μ = 29.99 and σ = 14.80 (less connected buyers); μ = 6.73 and σ = 4.76 (highly connected sellers); μ = 6.69 and σ = 3.46 (less connected sellers)
two completely different architectures we compare here a MAGOG system with a centralised system of the same size, i.e. the same number of nodes. The centralised system is achieved by using a graph with a star topology. The de-centralised system has a BA graph structure. In the centralised system, the central node handles all the load of the system, whereas in the de-centralised system, the load is distributed among the buffers of all nodes in the network. We arrange it so that neither system loses any messages due to insufficient buffer space. We then measure the average number of messages per buffer, which is an indicator of the amount of overhead experienced in either system. We find that the centralised one has to handle a number of messages that is several orders of magnitude larger than the load handled by each of the distributed nodes in the de-centralised system. This is true for each network size that we investigated and is shown in Fig. 6, where we plot the total numbers of messages in the central node and the average number of messages in an individual buffer in the de-centralised system, per epoch. It becomes obvious that individual MAGOG nodes on a BA graph certainly deal with far fewer messages than the central node on a star graph. Therefore, it should take less time and computing power for individual nodes to make decisions in finding matches between nodes in the distributed case.
Fig. 6 Total number of messages in the central node of a star-shaped network and the average number of messages in a single MAGOG node in a distributed system
3 The Market Model In this section we model the market by making the assumption that the behaviour of the agents can be modelled by Markov chains. Assuming that decisions have Markovian character is fairly common in mathematical models of financial markets [25, 29]. The Markovian agents can either sell (−1), hold (0) or buy (+1) a resource. The notation will be shortened to {+, 0, −} sometimes. Changes in the states of the agent are memoryless and therefore the behaviour of agent i can be modelled by a discrete-time Markov chain with transition probability matrix Ti. The elements Ti(a, b) of this matrix are the probability of agent i changing from state a to state b, where a, b ∈ S = {−, 0, +}. Depending on those probabilities, an agent can be classed as a greedy buyer, a fearful seller or a neutral participant in the market. Each agent i has its own matrix Ti. The market can then be understood as the collection of all those matrices (Fig. 7). The entire state of the market and its evolution would of course be unfeasibly large to analyse. To simplify matters we introduce the concept of market pressure, which simply determines whether the market price is being pushed up or down. Essentially, this way we look at the price changes. As in any market, excess of demand will push the price up and excess of supply will force it down. The total market state space, including the contribution of each agent individually, has 3^N states, as each of the N agents can be in three different states {−, 0, +}. To simplify matters we sum over the states of all agents. The sum can range from −N (where every agent is in sell mode), through −N + 1, . . . , N − 1, up to N (where all agents are in buy mode) and therefore our aggregated market now has 2N + 1 states.
Fig. 7 The market is formed by its market participants. We therefore specify the behaviour of each individual agent Ai
3.1 The Transition Probability Matrix of the Market We write the one-step transition probability matrix modelling the market as (Fig. 8)
M = (m_{sd} | −N ≤ s, d ≤ N),     (6)
where m_{sd} is the probability of moving from state s to state d, and s, d ∈ {−N, . . . , 0, . . . , N}. The calculation of each element m_{sd} is computationally complex since the states s and d of the market may comprise any of several different combinations of local states of the individual agents. Each element must be calculated as a weighted sum of the probabilities of going from each of the initial combinations of agent-states, comprising the global state s, to any of the final combinations of agent-states that give global market state d. Given a heterogeneous market, let the Markov chain of agent k be irreducible and have equilibrium probability state vector π_k = (π_{k−}, π_{k0}, π_{k+}), with components for each of the states {−, 0, +}. Let the probability generating function (pgf) for agent k from state i ∈ {−1, 0, +1} be A_k(z; i) = T_k(i, −)z^{−1} + T_k(i, 0) + T_k(i, +)z. We now define the N-component vector random variable Y⃗_n to be the joint state of the agents just after the nth transition instant; the initial joint state is Y⃗_0. In the same way, let X_n = |Y⃗_n| be the corresponding global state just after the nth transition, where |v⃗| = Σ_{k=1}^{N} v_k is the sum of the elements of a vector v⃗. The probability m_{sd} of moving from state s to state d is then
m_{sd} = lim_{n→∞} P(X_n = d | X_{n−1} = s),
when the limit exists. Under the assumption that the agents' Markov chains are time homogeneous, all the probabilities P(X_n = d | X_{n−1} = s) = P(X_1 = d | X_0 = s). Therefore m_{sd} is the coefficient of z^d in the generating function
Fig. 8 We focus on the price variation of the market, instead of on the market price itself. At a particular time step t_i, the maximum price variation is limited by the number of agents in the market, N. This avoids the problem of having to analyse an infinite number of states for the market price (0 to +∞). Furthermore, the price variation can be both positive and negative, which eliminates the constraint of dealing exclusively with positive market price values
G(z; s) = lim_{n→∞} G_n(z; s) = G_1(z; s) = E[z^{X_1} | X_0 = s].
The following proposition and corollary determine these coefficients.
Proposition 1 Let π_{k ℓ_k} be the ℓ_k (ℓ_k ∈ {−1, 0, +1}) component of the equilibrium probability state vector of agent k, and A_k(z; ℓ_k) the probability generating function for row ℓ_k of agent k, as described above. Then, for a market of N agents, the probability generating function (pgf) of the market state, one epoch after the state being s, is
G(z; s) = [ Σ_{ℓ⃗ : |ℓ⃗| = s} Π_{k=1}^{N} π_{k ℓ_k} A_k(z; ℓ_k) ] / [ Σ_{ℓ⃗ : |ℓ⃗| = s} Π_{k=1}^{N} π_{k ℓ_k} ].
Proof
G_n(z; s) = E[ E[ z^{|Y⃗_n|} | Y⃗_{n−1}, X_{n−1} = s ] | X_{n−1} = s ]
          = E[ Π_{k=1}^{N} E[ z^{Y_{nk}} | Y⃗_{n−1}, X_{n−1} = s ] | X_{n−1} = s ]   (since the agents are independent)
          = E[ Π_{k=1}^{N} E[ z^{Y_{nk}} | Y_{n−1,k} ] | X_{n−1} = s ]              (by the Markov property)
          = E[ Π_{k=1}^{N} A_k(z; Y_{n−1,k}) | X_{n−1} = s ]
          = Σ_{ℓ⃗ : |ℓ⃗| = s} P(Y⃗_{n−1} = ℓ⃗ | X_{n−1} = s) Π_{k=1}^{N} A_k(z; ℓ_k).
The result now follows as n → ∞.
This pgf is a polynomial in z, whose coefficients give the probability of the market going from a given state s to each possible state d (destination); the coefficient of z^d is the probability of the market going from state s to state d in one step, i.e. m_{sd}. Notice that the sums of products over vectors ℓ⃗, the components of which sum to a given number s, are simply convolutions of N sequences. They are most conveniently computed by multiplying generating functions of the sequences and extracting the coefficients. For a system where all agents are identically specified, Proposition 1 simplifies greatly as follows.
Corollary 1 If all the agents are identical with local states u = −, 0, +, row-transition pgf A(z; u) and equilibrium probability vector π, then for s ≥ 0,
G(z; s) = [ Σ_{n=0}^{⌊(N−s)/2⌋} N!/(n!(N−s−2n)!(n+s)!) B_−(z)^n B_0(z)^{N−s−2n} B_+(z)^{n+s} ] / [ Σ_{n=0}^{⌊(N−s)/2⌋} N!/(n!(N−s−2n)!(n+s)!) π_−^n π_0^{N−s−2n} π_+^{n+s} ],
where B_u(z) = π_u A(z; u) for u = −, 0, +. For s < 0,
G(z; s) = [ Σ_{n=0}^{⌊(N+s)/2⌋} N!/(n!(N+s−2n)!(n−s)!) B_+(z)^n B_0(z)^{N+s−2n} B_−(z)^{n−s} ] / [ Σ_{n=0}^{⌊(N+s)/2⌋} N!/(n!(N+s−2n)!(n−s)!) π_+^n π_0^{N+s−2n} π_−^{n−s} ].
Proof Each term in the product in the numerator of the proposition is π_{ℓ_k} A(z; ℓ_k), where ℓ_k takes one of the three values −, 0, +. Let there be n_−, n_0, n_+ occurrences, respectively, where n_− + n_0 + n_+ = N. Moreover, to have global state s, we must have n_+ − n_− = s. The sum then simplifies to
Σ_{(n_−, n_0, n_+) : n_+ − n_− = s, n_− + n_0 + n_+ = N} N!/(n_−! n_0! n_+!) B_−(z)^{n_−} B_0(z)^{n_0} B_+(z)^{n_+}.
For s ≥ 0, n_+ must be at least s and n_− = n_+ − s, so that N = n_0 + 2n_+ − s. Since n_0 ≥ 0, 2n_+ ≤ N + s and so the range of n_+ is [s, ⌊(N + s)/2⌋], and for each n_+ the values of n_0 and n_− are fixed at n_− = n_+ − s and n_0 = N − 2n_+ + s. The result now follows by changing the summation variable n_+ to n = n_+ − s. For s < 0, we must have n_− ≥ −s and the analogous result follows by interchanging the roles of n_+ and n_−.
As already noted, the elements of M are defined by the coefficients of G(z; s), which are routine to compute by primitive operations in many mathematical software packages. The generating functions G(z; s) are quickly computed when all the agents are identical using Corollary 1. For the case where all agents are different, Proposition 1 must be used. This would require sums over state spaces of 3^N terms for each of the 2N + 1 values of its second argument, which is a large amount of computation even for a fairly small N. For systems at neither of these extremes, one can partition agents into groups of similar behaviour. For example, there may be 4 types of agents and 100 agents of each type, i.e. 400 agents in total, with a partition of 4 sets. Consider a partition of r agent types containing n_1, . . . , n_r agents. The ith set in the partition has 2n_i + 1 aggregate states (1 ≤ i ≤ r), so the total number of aggregate states is Π_{i=1}^{r} (2n_i + 1), which is 201^4 or 1,632,240,801. This is a large number but one for which it is perfectly feasible to derive the global state transition matrix M, with just 801 states. In the following, we shall deal with systems that consist of types of similar agents. To this end, we derive the following proposition. Let the equilibrium probability vector for the 2n_i + 1 aggregate (sub)states of the ith agent type be denoted φ⃗_i, defined by φ_{iv} = Σ_{ℓ⃗ : |ℓ⃗| = v} Π_{j=1}^{n_i} π_{k_j ℓ_{k_j}} for −n_i ≤ v ≤ n_i, where the sequence numbers of the agents of type i are here denoted as k_1, . . . , k_{n_i}. Now let the transition probabilities out of aggregate state v in type i have pgf C_i(z; v), computed using Corollary 1 with s = v, applied to states numbered k_1, . . . , k_{n_i} instead of 1, . . . , n_i.
Proposition 2 For a collection of agents partitioned into r types, as defined above, the pgf of the market from state s is
G(z; s) = [ Σ_{ℓ⃗ : |ℓ⃗| = s} Π_{k=1}^{r} φ_{k ℓ_k} C_k(z; ℓ_k) ] / [ Σ_{ℓ⃗ : |ℓ⃗| = s} Π_{k=1}^{r} φ_{k ℓ_k} ],
where ℓ_k ranges over [−n_k, n_k] for 1 ≤ k ≤ r (as opposed to [−1, 1] for 1 ≤ k ≤ N in Proposition 1).
Proof The proof is very similar to that of Proposition 1, but with the probabilities φ replacing π, the products being taken over agent types instead of individual agents and the sums being over vectors of aggregate type states instead of individual agent states. Similarly, C_k(z; ℓ_k), the pgf for a group of agents of the same type, replaces the pgf of a single agent used in Proposition 1.
3.2 An Example We now discuss a small example of a market consisting of two agents. The first is considered "neutral", with transition probability matrix
T_1 = ⎛ 0.3 0.4 0.3 ⎞
      ⎜ 0.3 0.4 0.3 ⎟ ,     (7)
      ⎝ 0.3 0.4 0.3 ⎠
where the first row contains the probabilities of going from state {−} to, respectively, and in order from left to right, states {−, 0, +}. The second and third rows specify the corresponding transition probabilities from states {0} and {+}, respectively. The second agent could be termed a "fearful seller", showing a slight tendency towards state −1 (selling). This is given by a state transition matrix such as
T_2 = ⎛ 0.4 0.3 0.3 ⎞
      ⎜ 0.4 0.2 0.4 ⎟ .     (8)
      ⎝ 0.3 0.4 0.3 ⎠
Those local agent transition matrices result in the following matrix for the global market:
M = ⎛ 0.120000 0.250000 0.330000 0.210000 0.090000 ⎞
    ⎜ 0.120000 0.238533 0.326178 0.213822 0.101467 ⎟
    ⎜ 0.111000 0.236000 0.329333 0.222667 0.101000 ⎟ .     (9)
    ⎜ 0.102222 0.231852 0.331852 0.231852 0.102222 ⎟
    ⎝ 0.090000 0.240000 0.340000 0.240000 0.090000 ⎠
The equilibrium probabilities for the market state are then
P = (0.110092, 0.237615, 0.330275, 0.222936, 0.099083)^T.     (10)
This demonstrates one obvious effect of adding a “fearful seller” to the “neutral” agent: the equilibrium probabilities of the lower numbered states ({−2, −1}) are greater than those of the higher numbered states ({1, 2}).
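A brute-force sketch of Proposition 1 (our own code, only practical for very small N since it enumerates all 3^N joint agent states) that reproduces the matrices of this example:

```python
import numpy as np
from itertools import product

def equilibrium(T):
    """Equilibrium (stationary) distribution of a row-stochastic matrix T."""
    vals, vecs = np.linalg.eig(T.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

def market_matrix(Ts):
    """Aggregate (2N+1)x(2N+1) market transition matrix of Proposition 1.
    Ts is a list of per-agent 3x3 matrices; agent states are ordered (-1, 0, +1)
    and market states run from -N to N."""
    N = len(Ts)
    pis = [equilibrium(T) for T in Ts]
    M = np.zeros((2 * N + 1, 2 * N + 1))
    for s in range(-N, N + 1):
        num = np.zeros(2 * N + 1)
        weight = 0.0
        for l in product((-1, 0, 1), repeat=N):      # all joint agent states with |l| = s
            if sum(l) != s:
                continue
            w = np.prod([pis[k][lk + 1] for k, lk in enumerate(l)])
            poly = np.array([1.0])                   # product of the row pgfs A_k(z; l_k),
            for k, lk in enumerate(l):               # computed by polynomial convolution
                poly = np.convolve(poly, Ts[k][lk + 1])
            num += w * poly                          # poly[j] is the coefficient of z^(j-N)
            weight += w
        M[s + N] = num / weight
    return M

T1 = np.array([[0.3, 0.4, 0.3]] * 3)                                 # neutral agent, (7)
T2 = np.array([[0.4, 0.3, 0.3], [0.4, 0.2, 0.4], [0.3, 0.4, 0.3]])   # fearful seller, (8)
M = market_matrix([T1, T2])
print(np.round(M, 6))                 # reproduces the matrix in (9)
print(np.round(equilibrium(M), 6))    # reproduces the equilibrium vector in (10)
```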
3.3 Simulation Model and Results In this section we compare the results of the market evolution derived from the previous section with results from a simulation of the market. To this end we compute the global steady-state probabilities of the market. The simulation in this section is a slight adaptation of the ones presented earlier. In particular, market participants submit in this case "market orders", with their direct intention of buying or selling at the current market price; whereas in Section 2, market participants submitted "limit orders", with the intention of buying or selling only at a specified price. We make the comparison between the analytical model and the simulation results with regard to their probability functions for two different scenarios. These correspond to two different kinds of networks. The first kind of network is fully connected, and we refer to this as the ideal simulation set-up. The second kind of network is a random network (BA), which might not be fully connected, and we call this the non-ideal simulation set-up. In the ideal simulation set-up, agents can exchange messages with all other nodes each time step. In this case the simulation results are identical within the errors of the simulation to the analytical model. This can be seen in Fig. 9 where we have a
Fig. 9 Comparison of the probability functions for the global market state, calculated analytically (solid line) and with the simulations (dashed line), for the ideal simulation set-up (fully connected network) and 128 market participants. The small vertical lines that cross the simulation plot correspond to 95% confidence intervals
Table 2 Summary of simulation results for the three non-ideal simulation set-ups and the three different-sized networks
Set-up                                N      h     μa        μs       σa²      σs²
First non-ideal simulation set-up     128    4.4   0         0        76.80    34.46
                                      512    6.6   0         0        307.20   43.56
                                      1024   6.9   0         0        614.40   72.25
Second non-ideal simulation set-up    128    4.4   −22.26    −20.50   85.17    40.96
                                      512    6.6   −89.04    −43.70   340.69   53.29
                                      1024   6.9   −178.08   −78.00   681.38   90.25
Third non-ideal simulation set-up     128    4.4   −11.13    −9.93    80.99    37.21
                                      512    6.6   −44.52    −18.30   323.94   48.02
                                      1024   6.9   −89.04    −33.60   647.89   79.21
N is the network size; h is the average number of hops between nodes in the network; μa is the mean obtained with the analytical model and μs is the mean obtained in the simulation; σa² is the variance of the analytical model and σs² is the variance of the simulation
market with 128 agents, who all have the same transition probability matrix given by (11). We have also run simulations with the non-ideal set-up on BA graphs with 128, 512 and 1024 nodes. The results are summarized in Table 2. For the 128 node simulation, we chose a pub size of 128 messages and a TTL of 7 and ran the simulations for 200,000 epochs. First we chose all agents to have the same neutral transition probability matrix
T_n = ⎛ 0.3 0.4 0.3 ⎞
      ⎜ 0.3 0.4 0.3 ⎟ .     (11)
      ⎝ 0.3 0.4 0.3 ⎠
In Fig. 10 we show the results of the comparison between the analytical centralised system and the distributed BA network of the simulation. The probability function of the analytical model and the relative frequency histogram of the simulation both appear close to normal, as one would expect from the central limit theorem. As one would also expect from the different network infrastructures, the results are different. Both distributions have a zero mean; however, the simulated one has a lower variance than the analytical. This might be due to the fact that past messages stored in the several pubs of the BA network tend to decrease the variance in the global state of the market. In the second non-ideal set-up, all 128 nodes have a tendency to sell:
T_s = ⎛ 0.6 0.2 0.2 ⎞
      ⎜ 0.4 0.2 0.4 ⎟ .     (12)
      ⎝ 0.2 0.6 0.2 ⎠
Fig. 10 Probability functions for the global market state, calculated analytically (solid line) and by simulation (dashed line), for the first non-ideal set-up of the simulations and a random network of 128 nodes
As can be seen in Fig. 11, again both models have a probability function/histogram that appears normal. However, the simulation has a negative mean that is smaller in absolute value than that of the analytical solution. This is because the market state obtained in the simulations tends to have a lower absolute value for a non-fully connected network, since some nodes might not receive copies of all messages.
Fig. 11 Probability functions for the global market state, calculated analytically (solid line) and by simulation (dashed line), for the second non-ideal set-up of the simulations
The third and last model is a network of 64 agents with a transition probability matrix given by (11) and 64 agents with a transition probability matrix given by (12). We expect this set-up to have a mean somewhere between 0 and that of the previous case. Figure 12 shows that this is the case and again the distributions appear normal. The discrepancy in the means and variances is again almost certainly due to the BA network.
Fig. 12 Probability functions for the global market state, calculated analytically (solid line) and with the simulations (dashed line), for the third non-ideal set-up of the simulations
In Table 2 we summarize the simulation results for different network sizes. As mentioned before, both the mean and the variance in the simulations tend to have a lower absolute value than in the analytical model. Also, the variance of the market state increases with the number of agents in the market. This is as expected from the central limit theorem. In summary the ideal simulations show very good agreement with the analytical model. In the cases where BA networks are used, there are slight discrepancies that can be explained by the inefficiencies caused by a non-fully connected network. In this non-ideal simulation set-up, both shifting factors (additive, e.g. communication times) and scaling factors (multiplicative) need to be found in order to have an equivalent result to the analytical centralised system. In the following section we make use of the full global transition probability matrix of the whole market rather than just the probabilities of the market state at equilibrium. We only looked at these probabilities here to be able to make a comparison between the simulations and the analytical model.
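The analytical columns of Table 2 can be checked directly: under the model's assumption that agents are independent and in equilibrium, the market-state mean and variance are simply sums of the per-agent means and variances. A short sketch (our own code, with illustrative names):

```python
import numpy as np

def stationary(T, power=200):
    """Stationary distribution of an ergodic 3-state chain, via a high matrix power."""
    return np.linalg.matrix_power(np.asarray(T, dtype=float), power)[0]

def market_mean_var(agent_matrices):
    """Analytical mean and variance of the aggregated market state (agent states -1, 0, +1),
    assuming the agents are independent and in equilibrium."""
    states = np.array([-1.0, 0.0, 1.0])
    mean = var = 0.0
    for T in agent_matrices:
        pi = stationary(T)
        m = pi @ states
        mean += m
        var += pi @ states**2 - m**2
    return mean, var

Tn = [[0.3, 0.4, 0.3]] * 3                                   # neutral agent, Eq. (11)
Ts = [[0.6, 0.2, 0.2], [0.4, 0.2, 0.4], [0.2, 0.6, 0.2]]     # selling agent, Eq. (12)
print(market_mean_var([Ts] * 128))               # approx (-22.26, 85.17), cf. Table 2
print(market_mean_var([Tn] * 64 + [Ts] * 64))    # approx (-11.13, 80.99), cf. Table 2
```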
3.4 Futures Trading of Computing Power Once a global computing power market has been established, we expect that a parallel market of derivatives of computing power will emerge, in a similar way to what happens in other perishable commodity markets, such as electricity. This is a natural development, since the impossibility of storing CPU cycles makes computing power a perishable commodity, but still allows its price to be decided for a fixed time in the future. Institutions and individuals connected to the global P2P market will trade future contracts of computing power in order to maximise the use of their resources, acquire resources when needed or for purposes of hedging and speculation (Fig. 13).
Fig. 13 For a non-ideal simulation set-up (peer to peer), both shifting and scaling factors need to be found in order to obtain the same result as in a fully connected network
It is for this reason that we explore in this section the trading of future contracts of computing power. In particular, we investigate the performance of a futures trader who operates in the Grid computing market. The trader makes decisions about buying, selling or holding future contracts, with the objective of maximizing his own profit. We model our problem as a Markov decision process (MDP). In our scenario, the set of decision epochs is discrete and infinite:
I ≡ ℕ \ {0},     (13)
i.e. decisions are made at every decision epoch, indefinitely. The state of the MDP is formed by two parts: the state of the market and the state of the trader. We assume one of the general hypotheses of perfect markets, where the decisions taken by a single individual cannot determine the behaviour of prices, and therefore a single agent is unable to manipulate the market. Specifically, we consider that the
market is modelled by a global Markov chain whose transition probability matrix (with the form specified in Section 3.1) determines the price evolution. Together with the price evolution, we also incorporate the trading volume, which provides information about the number of operations. Both the price evolution and the trading volume form the state of the market. The evolution over time of these two variables is usually indicated in parallel. Consequently the market evolves as indicated by a transition probability matrix that has the form of (6). The state of the Markov chain is an integer (from −N to N inclusively) that represents the variation in price with respect to the deal price of the previous time step or decision epoch. This variation can be understood to be relative (a percentage of variation) or absolute (points of variation in price). We define the trading volume to be the absolute value of this price variation, i.e. the volume is a natural number between 0 and N inclusively. The trading volume of a particular decision epoch gives the number of available future contracts that can be bought or sold at the current market price. This specification of volume implies that the higher the variation in price (up or down), the higher the trading volume will be. This correspondence between price variation and trading volume is not an arbitrary choice, and it can be observed in many graphs plotted, for instance, from stock market data (Fig. 14).
Fig. 14 Computing power is non-storable, and therefore non-tradeable. Future contracts allow trading the underlying computing power. With different delivery dates (t + T1 , t + T2 , etc.), future contracts extend the trading spectrum, allowing maximisation of the use of resources, as well as hedging and speculation
Therefore, as explained above, two variables, price variation (i) and trading volume (|i|), define the state of the market:

M_i = (i, |i|)   for i ∈ ℤ, −N ≤ i ≤ N.   (14)
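As a concrete illustration of (14), the market states can be enumerated directly; the sketch below is our own Python illustration (the helper name market_states is ours and is not part of the chapter), showing that the state space has 2N + 1 elements and that the volume is simply the absolute price variation.

```python
# Minimal sketch (our illustration): the market states M_i = (i, |i|) of (14).
# N is the number of agents defining the market; the price variation i runs
# from -N to N and the trading volume is its absolute value.

def market_states(N):
    """Return the 2N+1 market states as (price_variation, volume) tuples."""
    return [(i, abs(i)) for i in range(-N, N + 1)]

if __name__ == "__main__":
    print(market_states(2))
    # [(-2, 2), (-1, 1), (0, 0), (1, 1), (2, 2)] -- the five states of the example in Section 3.5.1
```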
The state of the market itself determines the first two variables of the state of the MDP (Fig. 15). The open position of the futures trader is the third variable that completes the definition of the state of the MDP. At every decision epoch, the trader can choose to buy, sell or hold, with the respective meaning of buying one future contract, selling one future contract or keeping his open position unchanged.

Fig. 15 Graphical representation of an arbitrary example of a Markov Decision Process. MDPs are used for decision making in sequential, uncertain environments. In this example, the states of the MDP are the large circles, the possible actions at each state for the decision maker are the small circles, and the arrows that come out of the small circles are the possible evolutions of the system after that particular action is taken; the probability of each of these possible evolutions happening is indicated on the arrows, together with the reward that the decision maker receives when this happens

Although the trader can only buy or sell one single future contract at every decision epoch, over time he is allowed to accumulate a total of N contracts. This means that the open position of the trader that results from his decisions will be an integer between −N and N inclusively:

T_pos = pos   for pos ∈ ℤ ∩ [−N, N].   (15)
Then the state space of the MDP, S, is given by

S_{i,pos} = (i, |i|, pos)   for i, pos ∈ ℤ ∩ [−N, N],   (16)
where i is the market price variation given by (6), |i| is the trading volume and pos is the open position of the trader. The MDP has a total of (2N + 1)² states, obtained from all possible combinations of the 2N + 1 values of the price variation and the 2N + 1 different open positions of the trader. Furthermore, the possible actions for the trader are

Ac_s = {−1, 0, 1}   for s ∈ S,   (17)
in all the states of the MDP, except in those states in which the trader has an open position of N, when his available actions will be {−1, 0}; and in those states in which the trader has an open position of −N, when his available actions will be {0, 1}.

Finally, in order to complete the specification of the MDP, we define a reward that is given to the trader, which will depend on the decisions he makes. The trader will try to maximise this reward to improve his performance in the market. This reward is specified in two parts. The first part of the reward is the profit/loss the
trader acquires from his open position and the price variation in the market. This reward is calculated as the product of the open position of the trader at the next decision epoch (which is given by considering both the current open position of the trader and the action he takes at the current decision epoch) by the price variation of the market at the next decision epoch. This market price variation is uncertain and will be given by the probabilities of the transition probability matrix in (6). In other words, the reward depends not only on the current state of the MDP and the action taken by the trader but also on the next state of the MDP. It is for this reason that we need to calculate the expected value of the reward in the current state of the system. Consequently, when the system is in state s and the trader chooses action a ∈ Ac_s, this first reward is

r_1(s, a) = Σ_{j∈S} r_1(s, a, j) p(j|s, a),   (18)
r_1(s, a, j) being the reward the trader receives when the system is in state s, the trader chooses action a ∈ Ac_s and the system evolves to state j at the next decision epoch. This expression is calculated as follows:

r_1(s, a, j) = i_j ∗ pos_j,   (19)
where i_j and pos_j are, respectively, the price variation of the market and the open position of the trader when the system is in state j. The other term in (18), p(j|s, a), is the probability of the system evolving from state s to state j when the trader chooses action a ∈ Ac_s. Once the trader has decided to do action a ∈ Ac_s, it is immediate to calculate, at the present time step or decision epoch, the trader's open position at the next decision epoch, since it is the direct result of his current open position plus his chosen action. Therefore, the conditional probability p(j|s, a) is obtained directly from the transition probability matrix of the market, given by (6).

On the other hand, the second part of the reward comes from the possibility for the trader to close his current open position, i.e. the possibility of liquidating his remaining future contracts, which depends on the available trading volume. This second reward is established as a penalty, and consequently it will be negative or, in the best case, zero. This penalty can be understood as a way of forcing the trader to have an open position that can be easily closed in the market, in order to avoid an important loss due to a drastic change in market conditions, as well as to be able to immediately liquidate a position that is no longer needed. At the current time step or decision epoch, the available trading volume at the next decision epoch is uncertain, which means that this second reward also depends on the next state of the system. The expected value of this second reward is therefore

r_2(s, a) = Σ_{j∈S} r_2(s, a, j) p(j|s, a),   (20)
which has the same structure as (18), but with

r_2(s, a, j) = −c ∗ max(|pos_j| − |i_j|, 0),   (21)
c ∈ ℝ⁺ being a penalty factor, and |pos_j| and |i_j| the absolute open position of the trader and the available trading volume, respectively, when the system is in state j. Considering the two parts of the reward, when the system is in state s and the trader chooses action a ∈ Ac_s, the total reward given to the trader is

r(s, a) = r_1(s, a) + r_2(s, a).   (22)
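The state space (16), the restricted action sets (17) and the two-part reward (18)–(22) can be assembled programmatically. The sketch below is our own Python illustration, not the authors' code; it assumes the market transition matrix of (6) is supplied as a (2N + 1) × (2N + 1) numpy array P indexed by price variation, and the function and variable names (mdp_spec, actions, reward) are ours.

```python
import numpy as np

def mdp_spec(N, P, c=0.1):
    """Assemble states, admissible actions and expected rewards for the trading MDP.

    P is assumed to be a (2N+1) x (2N+1) numpy array for the market chain of (6),
    indexed so that row/column k corresponds to price variation i = k - N.
    """
    variations = range(-N, N + 1)
    # States S_{i,pos} = (i, |i|, pos), giving (2N+1)^2 states in total.
    states = [(i, abs(i), pos) for i in variations for pos in variations]

    # Admissible actions (17): sell (-1), hold (0), buy (+1), restricted at +/-N.
    def actions(state):
        _, _, pos = state
        if pos == N:
            return [-1, 0]
        if pos == -N:
            return [0, 1]
        return [-1, 0, 1]

    # Expected reward r(s, a) = r1(s, a) + r2(s, a) from (18)-(22).
    def reward(state, a):
        i, _, pos = state
        pos_next = pos + a                       # open position at the next epoch
        total = 0.0
        for j in variations:                     # next-epoch price variation
            p = P[i + N, j + N]                  # p(j|s, a): the market moves independently of a
            r1 = j * pos_next                    # profit/loss term (19)
            r2 = -c * max(abs(pos_next) - abs(j), 0)   # liquidity penalty (21)
            total += p * (r1 + r2)
        return total

    return states, actions, reward
```

As a small worked check of the per-transition rewards: if the trader's position at the next epoch is +1 and the market then falls two points (j = −2, volume 2), the realised reward is r_1 = −2 and r_2 = −0.1 · max(1 − 2, 0) = 0, since the available volume covers the open position.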
3.5 An Optimal Trading Policy

This section focuses on finding an optimal policy for the MDP defined above, which will serve as an optimal trading strategy for the futures trader. We consider an infinite-horizon Markov decision process, and the expected total discounted reward optimality criterion is applied [25]. With this setting, the MDP does not have a final decision epoch, but continues to infinity. However, a discount factor λ (0 ≤ λ < 1) weights the rewards, making those further in the future less valuable. Then, when the system is in state s at the first decision epoch and a policy π is used, the expected total present value of the income stream obtained is

v_λ^π(s) = E_s^π { Σ_{t=1}^{∞} λ^{t−1} r(X_t, Y_t) },   (23)
r(X_t, Y_t) being the reward received when the action Y_t is used in the state X_t, and t is the decision epoch (t ∈ I, see (13)). The convergence of the series is guaranteed by using a discount factor λ and finite rewards [25]. We now focus on finding the particular policy π that maximizes (23). Among the different algorithms for finding an optimal policy, we use linear programming, due to its easy formulation. See [25] for the details of converting a discounted Markov decision problem into a linear programming problem. In particular, selecting positive scalars α(j), j ∈ S (where S is the state space of the MDP defined in (16)) with the condition Σ_{j∈S} α(j) = 1, the primal linear program consists of minimizing

Σ_{j∈S} α(j) v(j),   (24)

subject to

v(s) − Σ_{j∈S} λ p(j|s, a) v(j) ≥ r(s, a),   (25)

for a ∈ Ac_s and s ∈ S, and v(s) unconstrained for all s ∈ S. On the other hand, the dual linear program consists of maximizing

Σ_{s∈S} Σ_{a∈Ac_s} r(s, a) x(s, a),   (26)

subject to

Σ_{a∈Ac_j} x(j, a) − Σ_{s∈S} Σ_{a∈Ac_s} λ p(j|s, a) x(s, a) = α(j),   (27)

and x(s, a) ≥ 0 for a ∈ Ac_s and s ∈ S. We can solve the dual program to obtain the different x(s, a). We apply these values to

P{d_x(s) = a} = x(s, a) / Σ_{a′∈Ac_s} x(s, a′),   (28)
which gives, for a particular state s, different values of probabilities for different actions a. For each state s, the action a that gives the highest probability is chosen as the decision rule for that state. The policy is the set of decision rules for all the states of the MDP.

3.5.1 Example

We present in this section an example of an MDP as defined in Section 3.4. To model the evolution of the market, we take the transition probability matrix of the example in Section 3.2, i.e. (9). Therefore, the number of agents that defines the market is N = 2 and the market can be in one of the following five states:

M_i = (i, |i|)   for i ∈ ℤ, −2 ≤ i ≤ 2.   (29)
For each of the above market states, the trader can be in one of his five positions:

T_pos = pos   for pos ∈ ℤ, −2 ≤ pos ≤ 2,   (30)
and consequently the total number of states of the MDP is 25. The set of these states constitutes the state space S:

S_{i,pos} = (i, |i|, pos)   for i, pos ∈ ℤ ∩ [−2, 2],   (31)
where i is the market price variation, |i| is the trading volume and pos is the open position of the trader. For the trader, his possible actions are

Ac_s = {−1, 0, 1}   for s ∈ S,   (32)
with the exception of those states s where the open position of the trader is −2, for which the trader can only choose from {0, 1}; and those states s where the open position of the trader is 2, for which the trader can only choose from {−1, 0}.

We use (18), (20) and (22) for the application of rewards with a penalty factor of c = 0.1 and build the dual linear program as indicated in expressions (26) and (27). The dual is solved with GLPK (GNU Linear Programming Kit), using a discount factor λ = 0.95 and the same value for all the α(j). In particular, the standard linear programming solver of GLPK, glpsol, is used, and an optimal solution is found by the simplex method. The solutions to the 65 variables of the dual problem are shown in Table 3, where x(S_{i,pos}, a) is the variable associated with the trader choosing action a when the MDP state is S_{i,pos} = (i, |i|, pos).

Table 3 Solution to the dual linear problem

S_{i,pos}     a = −1       a = 0        a = +1
S_{0,−2}      –            0            88.9405
S_{0,−1}      88.9916      0            0
S_{0,0}       88.9916      0            0
S_{0,1}       0.108839     0            0
S_{0,2}       0.04         0            –
S_{1,−2}      –            101.02       0
S_{1,−1}      50.5129      0            0
S_{1,0}       0.133        0            0
S_{1,1}       0.115239     0            0
S_{1,2}       0.04         0            –
S_{2,−2}      –            93.9005      0
S_{2,−1}      22.0828      0            0
S_{2,0}       0.124293     0            0
S_{2,1}       0.112039     0            0
S_{2,2}       0.04         0            –
S_{−2,−2}     –            0            96.6009
S_{−2,−1}     74.8694      0            0
S_{−2,0}      0            0            0.11533
S_{−2,1}      0            0.159157     0
S_{−2,2}      0.04         0            –
S_{−1,−2}     –            118.988      0
S_{−1,−1}     73.6971      0            0
S_{−1,0}      0            0            0.133
S_{−1,1}      0.190716     0            0
S_{−1,2}      0.04         0            –

The numbers in the cells where S_{i,pos} and a intersect are the solutions for x(S_{i,pos}, a). When an intersection contains a dash (–), the combination is not possible because the trader cannot choose that action in that state
By applying (28), we finally obtain a decision rule for each state. This set of decision rules is the optimal policy, and it is shown in Table 4. As can be seen in Table 4, the sell action −1 dominates the trader's decisions, which is in agreement with the bear market given by expression (9).
Table 4 Optimal policy

S_{0,−2}     1        S_{2,−2}     0        S_{−1,−2}    0
S_{0,−1}    −1        S_{2,−1}    −1        S_{−1,−1}   −1
S_{0,0}     −1        S_{2,0}     −1        S_{−1,0}     1
S_{0,1}     −1        S_{2,1}     −1        S_{−1,1}    −1
S_{0,2}     −1        S_{2,2}     −1        S_{−1,2}    −1
S_{1,−2}     0        S_{−2,−2}    1
S_{1,−1}    −1        S_{−2,−1}   −1
S_{1,0}     −1        S_{−2,0}     1
S_{1,1}     −1        S_{−2,1}     0
S_{1,2}     −1        S_{−2,2}    −1

The trader's optimal action for each state of the MDP is specified
4 Conclusion and Future Work

In this chapter we have proposed and investigated a system that allows a Grid computing network to be built by allowing participants of a peer-to-peer network to buy and sell computing resources. The aim of our fairly abstract model was to establish whether a market would form without too much outside intervention when individual agents (nodes) in the system had only local knowledge. In our model of the system we deliberately chose to make agents simple by having only a few rules for price and demand change and also to have only one type of resource that can be traded. We did this in order to keep the number of parameters of our system in check.

Using a mean field approximation we have first shown that we can expect the price development of the system to be in different regions separated by a critical stable transition zone. This was backed up by a simulation of the system, which also showed that the price changes and the numbers of buyers and sellers in the system are important factors in whether a stable price development can be achieved. We used this insight to develop a simulation model where agents do not just adjust their prices according to demand and supply but also back off from demanding or supplying according to their local information about the state of the entire system. We also showed that this behaviour does not lead to unacceptable response times for nodes to find buyers or sellers. We can also confirm that the peer-to-peer system is likely to scale better than other, centralised systems.

We have also modelled the market of the system using Markov chains, which is a fairly common method used for financial markets. In particular, we showed how to calculate the one-step transition probability matrix for an aggregation of this market. The analytical results we obtained for the market's equilibrium state probabilities compared well against simulation for the ideal set-up. For the non-ideal set-ups,
further work will consist of finding both shifting and scaling factors in order to obtain an identical result to the analytical centralised system.

An important feature of any market for computing power will be trading in futures. Just as in financial markets, buyers and suppliers will be able to use derivatives as insurance policies against market fluctuations. Moreover, we have shown how potential traders can optimise their strategies using Markov decision processes. With these results, one can be fairly certain that buying and selling computing power in a similar way to the MAGOG paradigm will result in a stable market. Being able to buy and sell in a simple way on demand would greatly enhance the computing experience of the future.

There are many obvious extensions to this work. On the one hand, the economic model and the behaviour of agents should be made more realistic. One could also, for instance, add nodes that only trade in computing power and investigate the consequences for market behaviour. Also, if the model is to be practical, it must be extended to allow multiple resources to be traded and to accept more complex demands, so that it would matter to nodes how large the latency is between acquiring different resources they have bid for. In parallel computing applications, for instance, this would be crucial. However, increasing the complexity of the model would also be challenging when it comes to computing either analytical or simulation results. Finally, network latency should be represented explicitly, especially in the analytical models, where it is so far entirely absent. There are many such network sub-models in the literature and the challenge would be how to parameterise the demand (i.e. input parameters) on the network, based on the trading model.

Acknowledgments This work is partially supported by an EPSRC grant (EP/D061717/1).
References

1. Aberer K, Cudré-Mauroux P, Datta A, Despotovic Z, Hauswirth M, Punceva M, Schmidt R (2003) P-grid: a self-organizing structured p2p system. Sigmod Rec 32(3):29–33
2. Ardaiz O, Artigas P, Eymann T, Freitag F, Navarro L, Reinicke M (2006) The catallaxy approach for decentralized economic-based allocation in grid resource and service markets. Appl Intel 25(2):131–145
3. Bagnall AJ, Smith GD (2005) A multiagent model of the UK market in electricity generation. Evol Comput IEEE Trans 9(5):522–536
4. Barabasi AL, Albert R (1999) Emergence of scaling in random networks. Science 286:173
5. Brian Arthur W (1994) Inductive reasoning and bounded rationality (the El Farol Problem). Am Econ Rev (Papers and Proceedings) 84:406–411
6. Buyya R, Abramson D, Venugopal S (2005) The grid economy. Proc IEEE 93(3):698–714
7. Caldwell B (ed) (1999–2007) The collected works of F. A. Hayek. University of Chicago Press, Chicago, IL
8. Cohen J, Richardson C, Harder U, Martínez Ortuño F, Darlington J (2009) Node-level architecture design and simulation of the MAGOG grid middleware. In: AusGrid 2009, Wellington, New Zealand, vol 99, pp 57–66
9. Cotton IW (1975) Microeconomics and the market for computer services. Comput Surv 7(2):95–111
10. Csárdi G, Nepusz T (2006) The igraph software package for complex network research. InterJ Complex Syst 1695
11. Ellul A, Shin HS, Tonks I (2004) Opening and closing the market: evidence from the London stock exchange. Technical report, LSE Discussion Paper 506. http://eprints.lse.ac.uk/24753/
12. Erdős P, Rényi A (1959) On random graphs I. Publ Math (Debrecen) 6:290–297
13. Evans T. Complex networks. Contemp Phys 45:455–474
14. Foster I, Kesselman C (1999) Computational grids. The grid: blueprint for a new computing infrastructure (chapter 2). Morgan-Kaufman, USA. http://www.globus.org/research/papers/chapter2.pdf
15. Greenberger M (1966) The priority problem and computer time sharing. Manage Sci 12(11):888–906
16. Harder U, Martínez Ortuño F (2008) Simulation of a peer to peer market for grid computing. In: Al-Begain K, Heindl A, Telek M (eds) ASMTA 2008, LNCS, vol 5055. Springer, Berlin, pp 234–248
17. Harder U, Martínez Ortuño F (2009) A more realistic peer-to-peer grid market model. In: Bradley JT (ed) EPEW 2009, LNCS, vol 5652. Springer, Berlin, pp 149–154
18. Kirman AP, Vriend NJ (2000) Learning to be loyal. A study of the Marseille fish market. In: Delli Gatti D, Gallegati M, Kirman AP (eds) Interaction and market structure. Essays on heterogeneity in economics, vol 484. Springer, Berlin, pp 33–56
19. Lai K, Rasmusson L, Adar E, Zhang L, Huberman BA (2005) Tycoon: an implementation of a distributed, market-based resource allocation system. Multiagent Grid Syst 1(3):169–182
20. Lua K, Crowcroft J, Pias M, Sharma R, Lim S (2005) A survey and comparison of peer-to-peer overlay network schemes. IEEE Commun Surv Tutorials 7(2):72–93
21. Martínez Ortuño F, Harrison PG, Harder U (2010) A Markovian futures market for computing power. In: WOSP/SIPEW, San Jose, California, USA
22. Nagel K, Shubik M, Strauss M (2004) The importance of timescales: simple models for economic markets. Phys A 340(4):668–677
23. Nielsen NR (1970) The allocation of computing resources – is pricing the answer? Commun ACM 13(8):467–474
24. Oram A (ed) (2001) Peer-to-peer: harnessing the power of disruptive technologies, chapter 14 (Performance) by Theodore Hong. O'Reilly and Associates, Sebastopol
25. Puterman ML (1994) Markov decision processes. Discrete stochastic dynamic programming. Wiley InterScience, New York
26. Regev O, Nisan N (1998) The POPCORN market an online market for computational resources. In: Proceedings of the first international conference on Information and computation economies, Charleston, South Carolina, USA, pp 148–157. ISBN:1–58113–076–7
27. Richardson C (2007) Growing the global open grid: design brief and middleware architecture. Technical report, The Internet Centre, Imperial College, London
28. Rosvall M, Sneppen K (2006) Self-assembly of information in networks. Europhys Lett 74:1109
29. Shreve SE (2004) Stochastic calculus for finance: continuous-time models. Springer, Berlin
30. Simon A (ed) (1957) Models of man, chapter: A behavioral model of rational choice. Wiley, New York, NY
31. Sutherland IE (1968) A futures market in computer time. Commun ACM 11(6):449–451
32. Waldspurger CA, Hogg T, Huberman BA, Kephart JO, Scott Stornetta W (1992) Spawn: a distributed computational economy. Software Eng 18(2):103–117
33. Walras L (1954) Elements of pure economics. George Allen and Unwin, London
34. Wang F, Moreno Y, Sun Y (2006) Structure of peer-to-peer social networks. Phys Rev E 73:036123
35. Watts DJ, Strogatz SH (1998) Collective dynamics of small-world networks. Nature 393:440
36. Wolski R, Plank JS, Brevik J, Bryan T (2001) Analyzing market-based resource allocation strategies for the computational grid. Int J High Perf Comput Appl 15(3):258–281